In Apache Spark, we can use the following types of joins in SQL:
Inner join: An inner join in Apache Spark returns only the pairs of rows from the two tables that satisfy the join predicate. To perform an inner join in Spark using Scala, we can use the join method on a DataFrame.
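As a minimal sketch of the inner join described above, assuming a local SparkSession and two small made-up DataFrames (the names employees, departments, dept_id and the helper innerJoin are illustrative and not from the original text), it might look like this:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object InnerJoinExample {
  // Hypothetical helper: builds two tiny tables and inner-joins them on dept_id.
  def innerJoin(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Made-up sample data; employee 3 points at a department that does not exist.
    val employees = Seq((1, "Ana", 10), (2, "Ben", 20), (3, "Cy", 99))
      .toDF("id", "name", "dept_id")
    val departments = Seq((10, "Engineering"), (20, "Sales"), (30, "Legal"))
      .toDF("dept_id", "dept_name")

    // Only rows whose dept_id appears in both tables survive the join.
    employees.join(
      departments,
      employees("dept_id") === departments("dept_id"),
      "inner"
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("inner-join-sketch")
      .master("local[*]") // assumption: running locally rather than on a cluster
      .getOrCreate()

    innerJoin(spark).show()
    spark.stop()
  }
}
```

Joining on an explicit column expression keeps both dept_id columns in the result. Passing Seq("dept_id") as the second argument instead keeps a single copy of the join column, which is often more convenient downstream.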
The set of options is the same as you’d see in a relational database: inner, left outer, right outer, full outer, and cross. The examples here are in Scala, though they would apply just as easily to PySpark and, of course, to classic SQL statements.
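To make the list of join types concrete, here is a rough sketch that runs each of them over the same two made-up tables and counts the resulting rows. The object name JoinTypesSketch, the helper rowCounts, and the sample data are assumptions for illustration. The join type strings and the crossJoin method come from Spark's Dataset API.

```scala
import org.apache.spark.sql.SparkSession

object JoinTypesSketch {
  // Hypothetical helper: row count produced by each join type on two tiny tables.
  def rowCounts(spark: SparkSession): Map[String, Long] = {
    import spark.implicits._

    // Made-up sample data: one employee has no department, one department has no employee.
    val employees = Seq((1, "Ana", 10), (2, "Ben", 20), (3, "Cy", 99))
      .toDF("id", "name", "dept_id")
    val departments = Seq((10, "Engineering"), (20, "Sales"), (30, "Legal"))
      .toDF("dept_id", "dept_name")

    // The third argument of join selects the join type by name.
    val keyed = Seq("inner", "left_outer", "right_outer", "full_outer").map { joinType =>
      joinType -> employees.join(departments, Seq("dept_id"), joinType).count()
    }.toMap

    // A cross join has no predicate, so it uses the dedicated crossJoin method.
    keyed + ("cross" -> employees.crossJoin(departments).count())
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("join-types-sketch")
      .master("local[*]") // assumption: running locally rather than on a cluster
      .getOrCreate()

    rowCounts(spark).foreach { case (joinType, n) => println(s"$joinType: $n rows") }
    spark.stop()
  }
}
```

With this sample data the outer joins keep the unmatched rows and fill the missing side with nulls, while the cross join pairs every employee with every department.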