Ivan Palomares Carrascosa builds a process:
Few data science projects are exempt from the necessity of cleaning data. Data cleaning encompasses the initial steps of preparing data. Its specific purpose is to retain only the relevant and useful information underlying the data, whether for later analysis, as input to an AI or machine learning model, and so on. Unifying or converting data types, dealing with missing values, eliminating noisy values stemming from erroneous measurements, and removing duplicates are some examples of typical processes within the data cleaning stage.
As you might think, the more complex the data, the more intricate, tedious, and time-consuming the data cleaning can become, especially when implementing it manually.
Ivan handles some of the most common types of data cleaning work and shows a simple way of implementing each.
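To make the four steps named in the excerpt concrete (unifying types, handling missing values, dropping noisy values, removing duplicates), here is a minimal sketch. It is not Ivan's code. It assumes pandas, and the column names (`city`, `temp_c`), the plausible temperature range, and the choice of median imputation are all hypothetical, picked only for illustration.

```python
import pandas as pd


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply four common cleaning steps to a small, hypothetical dataset."""
    out = df.copy()

    # 1. Unify types and formats: trim and normalize text, coerce numbers.
    #    Unparseable entries such as "n/a" become NaN.
    out["city"] = out["city"].str.strip().str.title()
    out["temp_c"] = pd.to_numeric(out["temp_c"], errors="coerce")

    # 2. Noisy values: treat readings outside an assumed plausible range
    #    (-50 to 60 degrees C) as missing.
    out["temp_c"] = out["temp_c"].where(out["temp_c"].between(-50, 60))

    # 3. Missing values: fill with the column median (one option of many).
    out["temp_c"] = out["temp_c"].fillna(out["temp_c"].median())

    # 4. Duplicates: drop rows that are identical after the steps above.
    return out.drop_duplicates().reset_index(drop=True)


# Hypothetical raw data with inconsistent text, a bad reading, and repeats.
raw = pd.DataFrame(
    {
        "city": [" madrid", "Madrid", "Lisbon", "Lisbon", "Paris"],
        "temp_c": ["21.5", "21.5", "n/a", "999", "18"],
    }
)

cleaned = clean(raw)
print(cleaned)
```

The order matters here: normalizing text and types first lets the duplicate check catch rows that only differed in formatting. Whether to impute, drop, or flag missing values depends on the dataset, so treat step 3 as a placeholder for that decision.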