Tomaz Kastrun continues an advent of Azure ML. Day 18 takes us through feature exploration:
Azure Machine Learning is also a great tool for ordinary statistical analysis, graph plotting, and everything that goes along with it.
Let’s get an open dataset that is available on the UCI Machine Learning Repository and import it into a pandas dataframe.
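As a rough sketch of that import-and-explore step: the excerpt does not name the dataset, so the snippet below uses a tiny inline CSV with made-up values as a stand-in. The column names (`age`, `income`, `label`) are hypothetical. With a real file, the URL or local path would go to `pd.read_csv` in place of the in-memory buffer.

```python
import io

import pandas as pd

# Stand-in for a downloaded file: a tiny inline CSV with made-up values.
# With a real dataset, pass its URL or local path to pd.read_csv instead.
raw_csv = io.StringIO(
    "age,income,label\n"
    "25,32000,0\n"
    "41,58000,1\n"
    "37,,1\n"
    "52,91000,0\n"
)

df = pd.read_csv(raw_csv)

# First look at the data: size, column types, missing values, summary stats.
print(df.shape)
print(df.dtypes)
print(df.isna().sum())
print(df.describe())
```

The empty `income` field in the third row is read as `NaN`, which is the kind of thing an exploration pass should surface before any modelling.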
Day 19 picks up with feature engineering:
Yesterday we showed that statistical analysis, with all its bells and whistles, can be done very simply in Azure Machine Learning. Today we will continue with feature engineering and modelling.
So, what is feature engineering? It is a general process that can involve both feature construction (adding new features from the existing data) and feature selection (choosing only the most important features to improve model performance and reduce data dimensionality). It also covers log-transformation, outlier removal, scaling (normalisation, standardisation), imputation, general transformations (and others, such as polynomial), variable creation, variable extraction, and so on.
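To make a few of those steps concrete, here is a minimal sketch in plain pandas and NumPy. It is not taken from the original post. The data is invented, the column names are hypothetical, and the particular choices (median imputation, the 1.5 × IQR rule for outliers) are common defaults picked for illustration. Feature selection is left out, since it needs a target variable.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only; column names are hypothetical.
df = pd.DataFrame({
    "income": [32000.0, 58000.0, np.nan, 91000.0, 45000.0, 1_000_000.0],
    "age": [25, 41, 37, 52, 29, 48],
})

# Imputation: fill the missing income with the column median.
df["income"] = df["income"].fillna(df["income"].median())

# Log-transformation: compress the long right tail of income.
df["log_income"] = np.log1p(df["income"])

# Outlier removal: keep rows within 1.5 * IQR of the quartiles.
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["income"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df = df[in_range].copy()

# Feature construction: a ratio feature and a polynomial (squared) term.
df["income_per_age"] = df["income"] / df["age"]
df["age_squared"] = df["age"] ** 2

# Scaling: min-max normalisation and z-score standardisation.
age_range = df["age"].max() - df["age"].min()
df["age_minmax"] = (df["age"] - df["age"].min()) / age_range
df["income_z"] = (df["income"] - df["income"].mean()) / df["income"].std()

print(df.round(3))
```

The order matters here: imputing before the outlier filter means the filled value takes part in the quartile calculation, and scaling after the filter keeps the extreme row from distorting the mean and range.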