Franco Cano shows us how we can convert a Resilient Distributed Dataset in Spark to a DataFrame:
Sometimes you need transform you RRDs to DataFrames because DataFrames have a lot optimization options.
Let’s see how this is done.
This works, but there is a performance cost to converting a large RDD to a DataFrame (or vice versa). With that in mind, sticking to one type when you can is typically better.