Transform raw data into clean, usable, structured datasets so that machine learning models train faster and produce higher-quality results.
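As a rough illustration of what such a transformation might look like, the sketch below cleans a list of raw records using only the Python standard library. The function name `clean_records`, the field names, and the specific cleaning rules (trimming whitespace, dropping incomplete or unparseable rows, removing duplicates) are assumptions chosen for the example, not steps prescribed by the description above.

```python
from typing import Any, Dict, Iterable, List


def clean_records(
    raw_rows: Iterable[Dict[str, Any]],
    numeric_fields: List[str],
) -> List[Dict[str, Any]]:
    """Return cleaned rows (hypothetical helper, illustrative rules only)."""
    cleaned: List[Dict[str, Any]] = []
    seen = set()
    for row in raw_rows:
        # Normalize column names and trim whitespace from string values.
        normalized = {
            key.strip().lower(): (value.strip() if isinstance(value, str) else value)
            for key, value in row.items()
        }
        # Drop rows that contain missing values.
        if any(value is None or value == "" for value in normalized.values()):
            continue
        # Convert the declared numeric fields; skip rows that fail to parse.
        try:
            for field in numeric_fields:
                normalized[field] = float(normalized[field])
        except (KeyError, ValueError, TypeError):
            continue
        # Remove exact duplicates (after normalization).
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(normalized)
    return cleaned


# Made-up sample input for demonstration.
raw = [
    {" Age ": " 34 ", "City": "Paris "},
    {"Age": "34", "City": "Paris"},   # duplicate once normalized
    {"Age": "", "City": "Rome"},      # missing value
    {"Age": "abc", "City": "Oslo"},   # not a number
    {"Age": "29", "City": "Lima"},
]
result = clean_records(raw, numeric_fields=["age"])
```

In practice a library such as pandas is often used for this kind of work; the plain-Python version is shown only to keep the example self-contained.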