Spark DataFrameWriters

Miles Cole compares two generations of DataFrameWriter:

Most Spark developers learn to write data with df.write long before they ever encounter df.writeTo. It is simple, familiar, and everywhere: choose a format, pick a mode, add a few options, and save the result to a table or path. For years, that mental model worked well enough. Spark was often writing files first and tables second.

But modern lakehouse systems have changed the contract.
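
For readers who have not seen the two call styles side by side, here is a minimal sketch in PySpark. It is not taken from Miles's post: the function names, the table name, the parquet format, and the compression option are all hypothetical choices for illustration, and the second function assumes the target table already exists in a catalog that supports the V2 writer.

```python
def write_v1(df, table_name):
    """Classic DataFrameWriter (df.write): choose a format, pick a mode,
    add options, then save to a table."""
    (
        df.write
        .format("parquet")                 # illustrative format choice
        .mode("append")                    # save mode passed as a string
        .option("compression", "snappy")   # illustrative option
        .saveAsTable(table_name)
    )


def write_v2(df, table_name):
    """DataFrameWriterV2 (df.writeTo): name the table first, then call the
    operation you want as an explicit method."""
    # Assumes table_name already exists; append() adds rows to it.
    df.writeTo(table_name).append()
```

With a live SparkSession, either function would be called as `write_v1(df, "demo.events")` or `write_v2(df, "demo.events")`, where `demo.events` is a made-up table name. The visible difference is that the V1 writer expresses intent through a mode string, while the V2 writer exposes each operation (`create`, `replace`, `append`, and so on) as its own method.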

Read on to learn how, and which common problem DataFrameWriterV2 is there to solve.
