Chuck Rosenberg, Martial Hebert, Henry Schneiderman
This paper presents a semi-supervised self-training approach for object detection systems. The goal is to reduce the effort required to prepare training data by using a small number of fully labeled examples and an additional set of unlabeled or weakly labeled examples. The approach is implemented as a wrapper around the training process of an existing object detector and is evaluated empirically. The key contributions of this study are to demonstrate that a model trained in this manner can achieve results comparable to a model trained traditionally using a much larger set of fully labeled data, and that a training data selection metric defined independently of the detector outperforms a selection metric based on detection confidence.
The paper introduces a generic detection setting in which subwindows of an image are classified as either object or clutter. A common approach to semi-supervised training in this setting is Expectation-Maximization (EM), but the authors instead use a self-training, or incremental training, approach. This involves initially training a model with fully labeled data, then using the model to estimate labels for weakly labeled data. A selection metric is then used to decide which of these newly labeled examples to add to the training set. The selection metric is crucial, as incorrect detections added to the training set can negatively impact the final model.
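The self-training loop described above can be sketched in a few lines. The following is a hedged, self-contained illustration, not the authors' implementation: the "detector" is a toy nearest-class-mean classifier over one-dimensional features, and the selection metric is the classifier's own confidence. The function names and data layout are assumptions made for the sketch.

```python
# Toy self-training loop: train on labeled data, label the weakly labeled
# pool, add the top-scoring examples, and repeat. The nearest-mean
# "detector" stands in for the paper's far more complex detector.

def train_means(examples):
    """Fit per-class means; examples are (feature, label) pairs."""
    sums, counts = {}, {}
    for x, y in examples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(means, x):
    """Return (label, confidence); confidence is negative distance to mean."""
    label = min(means, key=lambda y: abs(x - means[y]))
    return label, -abs(x - means[label])

def self_train(labeled, pool, n_iters=3, k=1):
    """Grow the training set by k confidently labeled examples per iteration."""
    training, pool = list(labeled), list(pool)
    for _ in range(n_iters):
        if not pool:
            break
        means = train_means(training)
        scored = [(x, *predict(means, x)) for x in pool]
        scored.sort(key=lambda t: t[2], reverse=True)   # most confident first
        chosen = scored[:k]
        training.extend((x, y) for x, y, _ in chosen)   # add estimated labels
        picked = {x for x, _, _ in chosen}
        pool = [x for x in pool if x not in picked]
    return train_means(training)
```

Starting from one labeled example per class, e.g. `self_train([(0.0, 'clutter'), (10.0, 'object')], [1.0, 9.0, 2.0])`, the loop absorbs the unlabeled pool one example at a time, illustrating both the promise of the method and its central risk: a confidently wrong label is added permanently.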
The paper evaluates two selection metrics: one based on the detector's confidence score and one based on a distance measure between image patches, defined independently of the detector. It also examines how the sizes of the labeled and weakly labeled sets affect detector performance. The experiments show that performance improves as weakly labeled data is added, and that the distance-based metric consistently outperforms the confidence-based metric.
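The two metric families can be contrasted with a small sketch. This is an illustration under stated assumptions, not the paper's exact formulation: the confidence metric simply reuses the detector's own score, while the distance metric here is a nearest-neighbor Euclidean distance from a candidate patch's feature vector to the labeled positives.

```python
# Two candidate-selection metrics for self-training.
# Confidence-based: trust the detector's own score (higher = better).
# Distance-based: measure similarity to labeled positives independently
# of the detector (lower = better). Both names are illustrative.

def confidence_score(detection_confidence):
    """Rank candidates by the detector's own confidence output."""
    return detection_confidence

def distance_score(candidate, labeled_patches):
    """Rank candidates by distance to the nearest labeled patch,
    computed independently of the detector's internal model."""
    def euclidean(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(euclidean(candidate, p) for p in labeled_patches)
```

The design point the paper makes is visible in the signatures: `confidence_score` depends only on the detector, so its errors are correlated with the detector's errors, whereas `distance_score` brings in an independent source of evidence, which is why it selects better examples in practice.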
The experiments show that the semi-supervised approach can be applied to an existing detector that was originally designed for supervised training. The results indicate that the semi-supervised approach can achieve performance comparable to traditional supervised training, even with a small initial set of labeled data. The paper also highlights the importance of choosing an appropriate selection metric for training, as it significantly affects the performance of the detector. The study concludes that the semi-supervised self-training approach is a promising method for improving object detection systems with limited labeled data.