Aaron Davidson et al. announce a new version of Databricks MLflow:
When scoring Python models as Apache Spark UDFs, users can now filter UDF outputs by selecting from an expanded set of result types. For example, specifying a result type of pyspark.sql.types.DoubleType filters the UDF output and returns the first column that contains double precision scalar values. Specifying a result type of pyspark.sql.types.ArrayType(DoubleType) returns all columns that contain double precision scalar values. The example code below demonstrates result type selection using the result_type parameter, and a short example notebook illustrates a Spark model being logged and then loaded as a Spark UDF.
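The selection behavior described above can be sketched in plain Python. This is an illustration of the documented semantics only, not MLflow's implementation; in MLflow itself the choice is made via the result_type parameter of mlflow.pyfunc.spark_udf, which requires a live Spark session.

```python
# Sketch of the result-type selection semantics: a scalar result type
# picks the FIRST column of that type from the model's output, while an
# array-of result type collects EVERY column of that type.
# (select_columns is a hypothetical helper, not part of MLflow.)

def select_columns(output_columns, result_type):
    """output_columns: ordered list of (name, type_name) pairs.
    result_type: e.g. "double" (first match) or "array<double>" (all matches).
    """
    if result_type.startswith("array<") and result_type.endswith(">"):
        wanted = result_type[len("array<"):-1]
        # Array result type: keep all columns of the element type.
        return [name for name, t in output_columns if t == wanted]
    # Scalar result type: return only the first column of that type.
    for name, t in output_columns:
        if t == result_type:
            return [name]
    return []

columns = [("id", "long"), ("score", "double"), ("prob", "double")]
print(select_columns(columns, "double"))         # → ['score']
print(select_columns(columns, "array<double>"))  # → ['score', 'prob']
```

In MLflow itself the equivalent call would look something like mlflow.pyfunc.spark_udf(spark, model_uri, result_type=ArrayType(DoubleType())), applied to a DataFrame column selection.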
Read on for a pretty long list of updates.