Manoj Gautam shows how to perform a logistic regression with Apache Spark:

Since we are going to try algorithms like Logistic Regression, we will have to convert the categorical variables in the dataset into numeric variables. There are 2 ways we can do this.

- Category Indexing
- One-Hot Encoding
Here, we will use a combination of StringIndexer and OneHotEncoderEstimator to convert the categorical variables. The

`OneHotEncoderEstimator`

will return a SparseVector.

Click through for the code and explanation.

Kevin Feasel

2018-11-05

Data Science, Hadoop, Spark