2004 | João Gama¹,², Pedro Medas¹, Gladys Castillo¹,³, and Pedro Rodrigues¹
This paper presents a method for detecting changes in the probability distribution of the training examples, a crucial capability when learning in non-stationary environments. The method works by controlling the online error rate of the learning algorithm: while the distribution is stationary the error rate tends to decrease as more examples are seen, and when the distribution changes the error rate increases, so the algorithm detects change by monitoring that rate.

Two thresholds are defined on the monitored error: a warning level and a drift level. When the error reaches the warning level, the incoming examples begin to be stored; when it reaches the drift level, a new context is declared and a new model is learned using only the examples collected since the warning level was first reached. The detection mechanism is independent of the underlying learning algorithm.

The paper frames the problem in terms of contexts, a context being a set of examples over which the data-generating process is stationary. The aim is to detect the transitions between contexts and re-learn the model from the relevant examples. Existing approaches to concept drift are reviewed and classified into two categories: methods that adapt the learner at regular intervals regardless of whether change has occurred, and methods that first detect a concept change and only then adapt the learner; the proposed method belongs to the second category.

The method was evaluated on eight artificial datasets and one real-world dataset using three learning algorithms: a perceptron, a neural network, and a decision tree. The results show that the method detects drift effectively and adapts to the changed distribution, and the paper concludes that it is a simple and effective way to handle concept drift.
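The warning/drift mechanism can be sketched in code. The sketch below is a minimal, illustrative detector under the assumptions of the paper's DDM scheme: the error rate p after n examples is treated as binomial, with standard deviation s = sqrt(p(1-p)/n); the minimum of p + s is tracked, the warning level fires when p + s ≥ p_min + 2·s_min, and the drift level when p + s ≥ p_min + 3·s_min. The class name `DriftDetector` and the `min_samples` warm-up guard are our own choices, not part of the paper.

```python
import math

class DriftDetector:
    """Minimal sketch of an error-rate-based drift detector (DDM-style).

    Feed it one binary outcome per example (1 = the learner misclassified,
    0 = correct); it returns 'in-control', 'warning', or 'drift'.
    """

    def __init__(self, min_samples=30):
        # Warm-up guard (our own addition): skip threshold checks while
        # the error-rate estimate is still based on very few examples.
        self.min_samples = min_samples
        self.reset()

    def reset(self):
        """Start a new context, discarding previous statistics."""
        self.n = 0                    # examples seen in current context
        self.p = 0.0                  # running online error rate
        self.p_min = float("inf")     # error rate at the best point so far
        self.s_min = float("inf")     # its standard deviation

    def update(self, error):
        self.n += 1
        # Incremental update of the mean error rate.
        self.p += (error - self.p) / self.n
        s = math.sqrt(self.p * (1 - self.p) / self.n)
        if self.n < self.min_samples:
            return "in-control"
        # Track the point where p + s was minimal.
        if self.p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        if self.p + s >= self.p_min + 3 * self.s_min:
            self.reset()              # drift level: declare a new context
            return "drift"
        if self.p + s >= self.p_min + 2 * self.s_min:
            return "warning"          # warning level: start buffering examples
        return "in-control"
```

In a full learner, the caller would start buffering examples on `"warning"` and, on `"drift"`, retrain the model from that buffer, matching the relearning step described above.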