Date Cleaning with PySpark

Robert J. Blackburn needs to do some cleanup work:

The function will accept the dataframe and a list of columns to process. Because of syntax restrictions, the steps are broken out into multiple statements and a sub-function. Luckily, Spark’s lazy evaluation means the chained steps are still optimized as a single plan at runtime.

Click through for the function in question.