SparkSession

Kevin Feasel

2016-08-16

Spark

Jules Damji shows off SparkSession:

Beyond a time-bounded interaction, SparkSession provides a single point of entry to interact with underlying Spark functionality and allows programming Spark with DataFrame and Dataset APIs. Most importantly, it curbs the number of concepts and constructs a developer has to juggle while interacting with Spark.

In this blog and its accompanying Databricks notebook, we will explore SparkSession functionality in Spark 2.0.

This looks to be an easier method for integrating various parts of Spark in one user session.  Read the whole thing.

Related Posts

Benchmarking Streaming Systems

Burak Yavuz shares a benchmark of Spark Streaming versus Flink and Kafka Streams: At Databricks, we used Databricks Notebooks and cluster management to set up a reproducible benchmarking harness that compares the performance of Apache Spark’s Structured Streaming, running on Databricks Unified Analytics Platform, against other open source streaming systems such as Apache Kafka Streams and Apache Flink. In particular, we used the following […]

Read More

Installing Zeppelin With Spark2 Support On HDP

Paul Hernandez shows how to install Apache Zeppelin 0.7.3 on Hortonworks Data Platform 2.5 in order to gain Spark2 support: As a recent client requirement I needed to propose a solution in order to add spark2 as interpreter to zeppelin in HDP (Hortonworks Data Platform) 2.5.3 The first hurdle is, HDP 2.5.3 comes with zeppelin […]

Read More

Categories

August 2016
MTWTFSS
« Jul Sep »
1234567
891011121314
15161718192021
22232425262728
293031