
What is: Gradual Self-Training?

Source: Understanding Self-Training for Gradual Domain Adaptation
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

Gradual self-training is a method for semi-supervised domain adaptation. The goal is to adapt an initial classifier trained on a source domain given only unlabeled data that shifts gradually in distribution towards a target domain.
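Concretely, the setting can be written as a chain of distributions (the notation below, including the shift measure $d$ and the per-step bound $\epsilon$, is an assumed formalization for illustration rather than notation taken from the article):

$$P_0 \;\to\; P_1 \;\to\; \cdots \;\to\; P_T, \qquad d(P_t, P_{t+1}) \le \epsilon \ \text{ for all } t,$$

where labels are available only for the source domain $P_0$, each subsequent domain $P_t$ contributes only unlabeled examples, and consecutive domains are close under $d$, while the end-to-end shift from $P_0$ to the target $P_T$ may be large.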

This setting arises, for example, in applications ranging from sensor networks and self-driving car perception modules to brain-machine interfaces, where machine learning systems must adapt to data distributions that evolve over time.

The gradual self-training algorithm begins with a classifier $w_0$ trained on labeled examples from the source domain (Figure (a) in the source paper). For each successive domain $P_t$, the algorithm generates pseudolabels for unlabeled examples from that domain and then trains a regularized supervised classifier on the pseudolabeled examples. The intuition, visualized in the source paper's figure, is that after a single gradual shift most examples are pseudolabeled correctly, so self-training learns a good classifier on the shifted data, whereas the shift directly from the source to the target can be too large for self-training to correct.
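The loop above translates almost directly into code. The following is a minimal sketch, assuming scikit-learn's L2-regularized `LogisticRegression` as the supervised learner and placeholder variable names (`X_source`, `y_source`, `unlabeled_domains`) chosen for illustration; it is not the paper's reference implementation.

```python
# Minimal sketch of gradual self-training with a regularized linear classifier.
from sklearn.linear_model import LogisticRegression


def gradual_self_train(X_source, y_source, unlabeled_domains, C=1.0):
    """Adapt a classifier along a sequence of gradually shifting domains.

    X_source, y_source -- labeled examples from the source domain P_0
    unlabeled_domains  -- list [X_1, ..., X_T] of unlabeled example arrays,
                          one per intermediate/target domain P_1, ..., P_T
    C                  -- inverse L2-regularization strength of the learner
    """
    # Train the initial classifier w_0 on the labeled source data.
    model = LogisticRegression(C=C, max_iter=1000).fit(X_source, y_source)

    # For each successive domain P_t: pseudolabel its unlabeled examples with
    # the current classifier, then fit a fresh regularized classifier on them.
    for X_t in unlabeled_domains:
        pseudo_labels = model.predict(X_t)
        model = LogisticRegression(C=C, max_iter=1000).fit(X_t, pseudo_labels)

    # The final classifier has been adapted step by step toward P_T.
    return model
```

Skipping the intermediate domains would amount to pseudolabeling the target data directly with $w_0$, which is exactly the failure mode the paragraph above describes: when the source-to-target shift is large, too many pseudolabels are wrong for self-training to recover.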