Use Folders With Polybase

Andrew Peterson argues that you should use folders instead of individual files when creating external tables:

Why?
1) Add more files to the directory, and Polybase External table will automagically read them.
2) Do INSERTS and UPDATES from PolyBase back to your files in Hadoop.
( See PolyBase – Insert data into a Hadoop Hue Directory ,
PolyBase – Insert data into new Hadoop Directory ).
3) It’s cleaner.

This is good advice. Also, if you’re using some other process to load data—for example, a map-reduce job or Spark job—you might have many smaller file chunks based on what the reducers spit out. It’s not a bad idea to cat those file chunks together, but at least if you use a folder for your external data location, your downstream processes will still work as expected regardless of how the data is laid out.

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31