Model-centric vs Data-centric View in the age of AI

Amine Hadj-Youcef, PhD
3 min read · Mar 26, 2021
https://www.youtube.com/watch?v=06-AZXmwHjo&t=1778s

A brief article commenting on Andrew Ng's talk and his view on MLOps

I totally agree with Andrew Ng's view on machine learning and MLOps. People in data science are on the verge of becoming magicians by focusing on a model-centric view! They mostly focus on training the latest sophisticated, complex models.

I like to think of it this way: they want more and better lemon juice, so they focus on building a sophisticated juicer! The solution more likely comes from having better-quality, juicy lemons together with a state-of-the-art juicer; hence, a data-centric view is key.

In fact, the word “data” can sometimes be misleading. What’s the point of having a BIG database that contains little, inconsistent, redundant, or noisy information? It is useless. It takes more space to store, consumes more computation (i.e. electric) power, and is difficult to handle and transfer…

Instead, it is key to have clean, noise-free, relevant, non-redundant information in the dataset, which is the job of data exploration and engineering.
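To make this concrete, here is a minimal sketch of what removing redundant and inconsistent entries could look like. It is only an illustration: the `clean_dataset` helper, the (text, label) record format, and the lemon examples are all assumptions of mine, not something taken from Andrew Ng's talk.

```python
from collections import defaultdict


def clean_dataset(records):
    """Drop redundant copies and set aside inputs with conflicting labels.

    `records` is a list of (text, label) pairs. This format is hypothetical
    and chosen only for illustration.
    """
    # Group labels by normalized text so near-identical copies collapse together.
    labels_by_text = defaultdict(set)
    for text, label in records:
        labels_by_text[text.strip().lower()].add(label)

    clean, conflicting = [], []
    for text, labels in labels_by_text.items():
        if len(labels) == 1:
            # One consistent label: keep a single copy of the example.
            clean.append((text, next(iter(labels))))
        else:
            # Same input, different labels: flag for manual review.
            conflicting.append((text, sorted(labels)))
    return clean, conflicting


records = [
    ("Fresh lemon", "good"),
    ("fresh lemon ", "good"),  # redundant copy
    ("Dry lemon", "bad"),
    ("dry lemon", "good"),     # inconsistent label
    ("Juicy lemon", "good"),
]
clean, conflicting = clean_dataset(records)
print(clean)        # [('fresh lemon', 'good'), ('juicy lemon', 'good')]
print(conflicting)  # [('dry lemon', ['bad', 'good'])]
```

Five records shrink to two clean ones plus one flagged for review. The conflicting example is set aside rather than silently dropped, since deciding which label is right is exactly the kind of work a data-centric view asks for.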

The AI community, researchers included, gets it wrong: ~99% of papers on arXiv focus on the model-centric view and only ~1% focus on a data-centric view.

Experiment 1
