Eugene Meidinger has all the data he needs on his desktop:
The chart above shows the number of seconds it took to load X million rows of data from a given data source, according to a profiler trace and Phil Seamark’s Refresh visualizer. Parquet is a clear winner by far, with MS Access surprisingly coming in second. Sadly the 2 GB file limit stops Access from becoming the big data format of the future.
Part of the reason I wanted to do these tests is that people on Reddit often complain that their refresh is slow and their CPU is maxed out. This is almost always a sign that they are importing oodles and oodles of CSV files. I've recommended trying Parquet instead of CSV, but it's nice to have concrete proof that it's a better file source.
Read on for the chart. Also, don’t tell his accountants about the gaming laptop. It’s 100% for work purposes, just like my desktop PC. Only work, nothing else, IRS. The high-end GPU is for AI work. And the big screen is for doing big business.