Press "Enter" to skip to content

Join Types in Spark SQL

Rituraj Khare makes some connections:

In Apache Spark, we can use the following types of joins in SQL:

Inner join: An inner join in Apache Spark is a type of join that returns only the rows that match a given predicate in both tables. To perform an inner join in Spark using Scala, we can use the join method on a DataFrame.
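As a minimal sketch of that idea, assuming a local SparkSession and two hypothetical DataFrames (employees and departments, here joined on a shared deptId column):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("JoinExample")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data for illustration
val employees = Seq((1, "Alice", 10), (2, "Bob", 20), (3, "Carol", 30))
  .toDF("id", "name", "deptId")
val departments = Seq((10, "Engineering"), (20, "Sales"))
  .toDF("deptId", "deptName")

// Inner join: only rows whose deptId appears in both DataFrames are kept,
// so Carol (deptId 30) drops out of the result
val joined = employees.join(departments, Seq("deptId"), "inner")
joined.show()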

The set of options is the same as you'd see in a relational database: inner, left outer, right outer, full outer, and cross. The examples here are in Scala, though they would apply just as easily to PySpark and, of course, to classic SQL statements.
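Using the same hypothetical employees and departments DataFrames as above, the other join types are selected with a string argument to the same join method (cross join has its own method):

val leftOuter  = employees.join(departments, Seq("deptId"), "left_outer")
val rightOuter = employees.join(departments, Seq("deptId"), "right_outer")
val fullOuter  = employees.join(departments, Seq("deptId"), "full_outer")
val crossed    = employees.crossJoin(departments) // Cartesian product, no key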