No-Code ML On Cloudera Data Science Workbench

Tim Spann has a post covering ML on the Cloudera Data Science Workbench:

Using Cloudera Data Science Workbench with Apache NiFi, we can easily call functions within our deployed models from Apache NiFi as part of flows. I am working against CDSW on HDP (https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_hdp.html),  but it will work for all CDSW regardless of install type.
In my simple example, I built a Python model that uses TextBlob to run sentiment analysis against a passed-in sentence. It returns Sentiment Polarity and Subjectivity, which we can immediately act upon in our flow.
CDSW is extremely easy to work with and I was up and running in a few minutes. For my model, I created a python 3 script and a shell script for install details. Both of these artifacts are available here: https://github.com/tspannhw/nifi-cdsw.

The “no code” portion was less interesting to me than the scalable ML portion, as “no code” either drops into tedium or ends up being replaced by code.

Related Posts

The Zen Of Airflow

Bas Harenslak shows how you can think of The Zen of Python as it applies to Apache Airflow: Apache Airflow is a Python framework for programmatically creating workflows in DAGs, e.g. ETL processes, generating reports, and retraining models on a daily basis. This allows for concise and flexible scripts but can also be the downside of […]

Read More

Tips For Using PolyBase With Cloudera QuickStart VM

I have a post on using Cloudera’s QuickStart VM with PolyBase: Here’s something which tripped me up a little bit while connecting to Cloudera using SQL Server. The data node name, instead of being quickstart.cloudera like the host name, is actually localhost. You can change this in /etc/cloudera-scm-agent/config.ini. Because PolyBase needs to have direct access to the data nodes, […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Categories

February 2019
MTWTFSS
« Jan  
 123
45678910
11121314151617
18192021222324
25262728