Breeze: Mathematics In Scala

Nitin Aggarwal introduces the mathematics library behind Spark’s machine learning library, MLlib:

In simple terms, Breeze is a Scala library that extends the Scala collection library to provide support for vectors and matrices in addition to providing a whole bunch of functions that support their manipulation. We could safely compare Breeze to NumPy in Python terms. Breeze forms the foundation of MLlib—the Machine Learning library in Spark

Breeze comprises four libraries:

  • breeze-math: Numerics and Linear Algebra. Fast linear algebra backed by native libraries (via JBlas) where appropriate.

  • breeze-process: Tools for tokenizing, processing, and massaging data, especially textual data. Includes stemmers, tokenizers, and stop word filtering, among other features.

  • breeze-learn: Optimization and Machine Learning. Contains state-of-the-art routines for convex optimization, sampling distributions, several classifiers, and DSLs for Linear Programming and Belief Propagation.

  • breeze-viz: (Very alpha) Basic support for plotting, using JFreeChart.

Read on for samples and basic usage.

Related Posts

Batch Consumption from Kafka with Spark

Swapnil Chougule shares a few tips on performing batch processing of a Kafka topic using Apache Spark: Spark as a compute engine is very widely accepted by most industries. Most of the old data platforms based on MapReduce jobs have been migrated to Spark-based jobs, and some are in the phase of migration. In short, […]

Read More

Securely Accessing External Resources From Databricks AWS

Itai Weiss shows how you can securely hit external data sources when using Databricks for AWS: For security purposes, Databricks Apache Spark clusters are deployed in an isolated VPC dedicated to Databricks within the customer’s account. In order to run their data workloads, there is a need to have secure connectivity between the Databricks Spark […]

Read More

Categories

December 2017
MTWTFSS
« Nov Jan »
 123
45678910
11121314151617
18192021222324
25262728293031