Ed Elliott shows how we can solve a challenging problem when newlines are in the wrong place:
So the first thing we need to do is to read in the whole file in one chunk; if we just do a standard read, the file will get broken into rows based on the newline character:
var file = spark.Read().Option("wholeFile", true).Text(@"C:\git\files\newline-as-data.txt");
This solution is a bit complex. As Ed points out, you’re better off reshaping the file before you try to process it. If it’s a structured file like the example Ed has, a regular expression can do the trick.
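To give a rough idea of what that reshaping might look like, here is a minimal sketch in plain C# with no Spark involved. It is not Ed's code, and the file layout is an assumption made for illustration: pipe-delimited records that each start with a numeric id, such as `42|`. The `NewlineRepair` class and `Reshape` method are hypothetical names. The regular expression replaces any newline that is not followed by the start of a new record with a space.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NewlineRepair
{
    // Matches a newline (LF or CRLF) that is NOT followed by the start of a
    // new record and is not the final newline in the text.
    // Assumption: every real record begins with digits and a pipe, e.g. "42|".
    private static readonly Regex StrayNewline =
        new Regex(@"\r?\n(?!\d+\||\z)", RegexOptions.Compiled);

    // Replaces stray newlines with a space so each record sits on one line.
    public static string Reshape(string raw) => StrayNewline.Replace(raw, " ");

    public static void Main()
    {
        // Hypothetical sample: the first record has a newline inside a field.
        string raw = "1|Alice|likes\nlong walks\n2|Bob|short note\n";

        // Prints two lines:
        // 1|Alice|likes long walks
        // 2|Bob|short note
        Console.Write(Reshape(raw));
    }
}
```

The obvious weak spot is a data line that happens to begin with digits and a pipe, which this pattern would mistake for a new record. For a real file you would want to tighten the record-start pattern to match whatever the format actually guarantees.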