Data and concept drift in simple words

What is data drift?

Data drift occurs when the data a model sees in production stops looking like the data it was trained on. Imagine a model trained to recommend clothes by season that runs into an unexpected change, like a heatwave starting in May. The model keeps recommending spring clothes, even though the weather outside is hot. This leads to irrelevant recommendations, a poor user experience and lost sales opportunities.

MLOps pipelines check the live data fed to a model and compare it to the training data. Various tools exist for this, which we won’t get into now.
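Without going into specific tools, the comparison itself can be sketched in a few lines. The example below is only an illustration, not how any particular MLOps product works: it uses the Population Stability Index (PSI), one common way to compare two samples of a feature. The function name `psi`, the temperature numbers and the 0.1 / 0.25 thresholds are assumptions for this sketch (the thresholds are a widely used rule of thumb, not a standard).

```python
import math
import random

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a training sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical data: daily temperatures (in °C) seen at training time and in production.
rng = random.Random(42)
train_temps = [rng.gauss(15, 3) for _ in range(5000)]     # typical spring
normal_temps = [rng.gauss(15, 3) for _ in range(5000)]    # another typical spring
heatwave_temps = [rng.gauss(28, 3) for _ in range(5000)]  # heatwave in May

psi_normal = psi(train_temps, normal_temps)
psi_heatwave = psi(train_temps, heatwave_temps)

# Rule of thumb (assumption): below 0.1 is stable, above 0.25 is significant drift.
print(f"normal spring: PSI = {psi_normal:.3f}")
print(f"heatwave:      PSI = {psi_heatwave:.3f}")
```

A pipeline would run a check like this per feature on a schedule and raise an alert when the score crosses the chosen threshold.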

What is concept drift?

Concept drift is when the relationship between a model's inputs and its target changes over time. Essentially, the thing that the model is trying to predict changes.

Let me give you an example: credit card fraud detection. The model has learned to flag high-value transactions as fraudulent. Fraudsters adapt: over time, they make more frequent, smaller transactions to avoid detection. Now it is harder for the model to catch fraudulent transactions.

This is where MLOps practices help. There are industry-established best practices for tracking a model's performance and accuracy over time, so that a drop like this gets noticed.
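To make the idea of tracking accuracy concrete, here is a minimal sketch that replays the fraud example. Everything in it is made up for illustration: `AccuracyMonitor`, `rule_model`, the 1000 amount cutoff, the 0.95 baseline and the 0.05 tolerance are assumptions, and a real setup would use a trained model and real labeled outcomes (which often arrive with a delay).

```python
from collections import deque

class AccuracyMonitor:
    """Tracks accuracy over the most recent labeled predictions."""

    def __init__(self, baseline, window=100, tolerance=0.05):
        self.baseline = baseline            # accuracy measured at training time
        self.tolerance = tolerance          # how much of a drop we accept
        self.results = deque(maxlen=window) # keeps only the latest `window` results

    def record(self, predicted, actual):
        self.results.append(predicted == actual)

    def accuracy(self):
        return sum(self.results) / len(self.results) if self.results else None

    def degraded(self):
        # Only judge once the window is full, to avoid noisy early alerts.
        if len(self.results) < self.results.maxlen:
            return False
        return self.accuracy() < self.baseline - self.tolerance

def rule_model(amount):
    # Hypothetical stand-in for a trained model: flags high-value transactions.
    return amount > 1000

monitor = AccuracyMonitor(baseline=0.95, window=50)

# Phase 1: fraud is high-value, which matches what the model learned.
for i in range(50):
    is_fraud = i % 5 == 0
    amount = 5000 if is_fraud else 40
    monitor.record(rule_model(amount), is_fraud)
degraded_before = monitor.degraded()

# Phase 2: fraudsters switch to small transactions, so the relationship changes.
for i in range(50):
    is_fraud = i % 5 == 0
    amount = 90 if is_fraud else 40
    monitor.record(rule_model(amount), is_fraud)
degraded_after = monitor.degraded()

print("degraded before the shift:", degraded_before)
print("degraded after the shift: ", degraded_after)
```

The sliding window is a design choice: it lets recent behavior outweigh old results, so a drop shows up quickly instead of being averaged away by months of good predictions.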