Machine learning models don't improve on their own. Source: Shutterstock

Why businesses using machine learning should not ignore “concept drift”

BUSINESSES often think that machine learning (ML) models learn on their own and get better over time. That’s not true.

If organizations want to use the technology effectively in 2020, they need to understand why and what to do about it.

Business leaders have been told that they need a mountain of data to train any artificial intelligence (AI) or machine learning model. As a result, much of their effort over the past year has been focused on acquiring data.

However, once models are deployed, they stop evolving and fail to account for changes in the variables they were trained on. As a result, ML models slowly become inaccurate over time. This is known as “concept drift”, something that academia has been studying for quite a while but that businesses seem to have been ignoring.

“In the case of concept drift, our interpretation of the data changes with time even while the general distribution of the data does not,” said Phillips Strategy and Innovation Consultant (and former AI Advisor) Alexandre Gonfalonieri.

“This causes the end-user to interpret the model predictions as having deteriorated over time for the same/similar data. Both data and concept can simultaneously drift as well, further vexing the matters.”
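A minimal sketch of that idea, using made-up numbers rather than anything from Gonfalonieri: the inputs are drawn from the same distribution before and after the drift, but the rule that maps inputs to labels moves from a cut-off of 0.5 to a cut-off of 0.7. The names make_batch, accuracy and deployed_model, and both cut-off values, are hypothetical and chosen only for illustration.

```python
import random

def make_batch(n, threshold, rng):
    """Draw n inputs uniformly from [0, 1) and label them 1 when x exceeds threshold."""
    xs = [rng.random() for _ in range(n)]
    ys = [1 if x > threshold else 0 for x in xs]
    return xs, ys

def accuracy(predict, xs, ys):
    """Fraction of inputs for which the prediction matches the label."""
    correct = sum(1 for x, y in zip(xs, ys) if predict(x) == y)
    return correct / len(xs)

def deployed_model(x):
    """A frozen model: the rule it learned when the true cut-off was 0.5."""
    return 1 if x > 0.5 else 0

rng = random.Random(0)

# Before drift: the labelling rule matches what the model learned.
xs_before, ys_before = make_batch(5000, threshold=0.5, rng=rng)

# After drift: same input distribution, but the labelling rule has moved (assumed value).
xs_after, ys_after = make_batch(5000, threshold=0.7, rng=rng)

acc_before = accuracy(deployed_model, xs_before, ys_before)
acc_after = accuracy(deployed_model, xs_after, ys_after)

print(f"accuracy before drift: {acc_before:.3f}")
print(f"accuracy after drift:  {acc_after:.3f}")
```

In this toy setup the model is wrong for every input that falls between the old and the new cut-off, even though nothing about the incoming data looks different.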

In Gonfalonieri’s experience, models which are dependent on human behavior may be particularly prone to degradation.

According to the Phillips expert, business leaders need to encourage data scientists to predict how data is going to change over time and, once a model has passed the proof-of-concept stage, to formulate a plan for monitoring its performance and keeping it updated.

“Model monitoring is a continuous process. If you observe degraded model performance, then it’s time for restructuring the model design.

“The tricky part isn’t about refreshing the model and creating a retrained model but rather thinking of additional features that might improve the model’s performance and make it more solid and accurate.”
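One simple way to picture continuous monitoring is a rolling accuracy check over the most recent predictions whose true outcome is known. The sketch below is an illustration under stated assumptions, not a method described by Gonfalonieri: the AccuracyMonitor class, the baseline of 0.9, the window of 10 predictions and the tolerance of 0.05 are all hypothetical.

```python
from collections import deque

class AccuracyMonitor:
    """Tracks accuracy over the last `window` labelled predictions (hypothetical helper)."""

    def __init__(self, baseline, window=100, tolerance=0.05):
        self.baseline = baseline      # accuracy measured when the model was deployed
        self.tolerance = tolerance    # how far below the baseline is still acceptable
        self.outcomes = deque(maxlen=window)  # True for a correct prediction, False otherwise

    def record(self, prediction, actual):
        """Store whether one prediction matched the outcome that was later observed."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Accuracy over the current window, or None if nothing has been recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self):
        """True once a full window of outcomes falls below baseline minus tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return self.rolling_accuracy() < self.baseline - self.tolerance

# Usage with invented outcomes: ten correct predictions, then three wrong ones.
monitor = AccuracyMonitor(baseline=0.9, window=10, tolerance=0.05)
for _ in range(10):
    monitor.record(prediction=1, actual=1)
healthy = monitor.degraded()

for _ in range(3):
    monitor.record(prediction=1, actual=0)
needs_attention = monitor.degraded()

print(healthy, needs_attention, monitor.rolling_accuracy())
```

A check like this only says that performance has slipped. Deciding whether to retrain, add features or redesign the model remains a judgment call for the team.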

The reality is that there’s more to ML models than just creating and deploying them. Maintenance is critical.

Some models, like the ones that factor in human behavior, might need to be monitored closely and might require more frequent updates as compared to those that are more static.

The nature of the model and the frequency of the updates it requires can also help inform the update strategy.

Gonfalonieri identifies two kinds of updates — manual learning and continuous learning.

While manual learning might be time-consuming, the process might allow data scientists to discover a new algorithm or a different set of features that provide improved accuracy.

In contrast, continuous learning saves time because it is automated and ensures that the team is alerted to degradations in the model early on if intervention is required.
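As a rough sketch of what an automated loop of this kind might look like, the toy example below watches accuracy over a sliding window of recent labelled data, records an alert when it falls below a floor, and refits the model on the newest samples. Everything in it is an assumption made for illustration: the one-parameter “model” (a cut-off), the fit_threshold and run_stream functions, the drift from 0.5 to 0.7 at step 1,000, and the window, retrain size and accuracy floor.

```python
import random
from collections import deque

def fit_threshold(samples):
    """Toy 'training': choose the cut-off that misclassifies the fewest samples."""
    best_t, best_errors = 0.5, len(samples) + 1
    for t in sorted(x for x, _ in samples):
        errors = sum((1 if x > t else 0) != y for x, y in samples)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

def run_stream(n_steps=2000, drift_at=1000, window=200,
               retrain_size=50, min_accuracy=0.9, seed=0):
    """Simulate a data stream with one concept drift and automated retraining."""
    rng = random.Random(seed)
    threshold = 0.5                  # cut-off the deployed model starts with
    recent = deque(maxlen=window)    # most recent labelled (x, y) pairs
    alerts = []                      # steps at which the team would be notified

    for step in range(n_steps):
        true_threshold = 0.5 if step < drift_at else 0.7  # assumed drift
        x = rng.random()
        y = 1 if x > true_threshold else 0
        recent.append((x, y))

        if len(recent) == window:
            correct = sum((1 if xi > threshold else 0) == yi for xi, yi in recent)
            if correct / window < min_accuracy:
                alerts.append(step)                         # alert the team
                newest = list(recent)[-retrain_size:]       # favour the freshest data
                threshold = fit_threshold(newest)           # automated retrain
                recent.clear()                              # start a fresh window
    return threshold, alerts

final_threshold, alerts = run_stream()
print(f"alerts raised at steps: {alerts}")
print(f"cut-off after retraining: {final_threshold:.3f}")
```

Retraining only on the newest samples is a deliberate simplification here: older data in the window still reflects the previous concept and would pull the refitted cut-off back toward the stale value.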

Ultimately, Gonfalonieri’s message is that businesses need to start thinking beyond deploying AI and ML models and understand how their performance is impacted over time as a result of changes in the market, the business, customer behavior, and so on.

“You will need to invest in order to maintain the accuracy of the machine learning products and services that your customers use. This means that there’s a higher marginal cost to operating ML products compared to traditional software,” said Gonfalonieri.

Businesses that continue to neglect “concept drift” and don’t plan to monitor and update ML models might not be able to get the most out of their data.