Rituraj Khare makes some connections:
In Apache Spark, we can use the following types of joins in SQL:
Inner join: An inner join in Apache Spark is a type of join that returns only the rows that match a given predicate in both tables. To perform an inner join in Spark using Scala, we can use the join method on a DataFrame.
The set of options is the same as you’d see in a relational database: inner, left outer, right outer, full outer, and cross. The examples here are in Scala, though they would apply just as easily to PySpark and, of course, to writing classic SQL statements.
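As a quick illustration, here is a minimal sketch of an inner join via the DataFrame join method; the table and column names (employees, departments, dept_id) are hypothetical stand-ins, not taken from the linked article:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("InnerJoinExample")
  .master("local[*]")
  .getOrCreate()

import spark.implicits._

// Hypothetical sample data
val employees = Seq(
  (1, "Alice", 10),
  (2, "Bob", 20),
  (3, "Carol", 30)
).toDF("emp_id", "name", "dept_id")

val departments = Seq(
  (10, "Engineering"),
  (20, "Finance")
).toDF("dept_id", "dept_name")

// Inner join: only rows whose dept_id appears in both DataFrames survive,
// so Carol (dept_id 30, with no matching department) drops out of the result.
val joined = employees.join(departments, Seq("dept_id"), "inner")

joined.show()
```

Swapping the "inner" argument for "left_outer", "right_outer", "full_outer", or "cross" gives you the other join types mentioned above.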