Things To Know About Databricks UAP

Kara Annanie has five things you should know about the Databricks Unified Analytics Platform:

4. A Spark Dataframe is not the same as a Pandas/R Dataframe
Spark Dataframes are specifically designed to use distributed memory to perform operations across a cluster whereas Pandas/R Dataframes can only run on one computer. This means that you need to use a Spark Dataframe to realize the benefits of the cluster when coding in Python or R within Databricks.
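
To make the distinction concrete, here is a minimal sketch (not from Kara's post) that computes the same per-key average twice: once with pandas, which runs entirely in one machine's memory, and once with a Spark DataFrame, whose work is spread across the cluster. The function names, the column names `key` and `value`, and the sample rows are hypothetical; the sketch assumes a `spark` session is available, as it is in a Databricks notebook.

```python
def pandas_mean_by_key(rows):
    """Average `value` per `key` with pandas (single machine, in local memory)."""
    import pandas as pd  # imported lazily so the Spark half works without pandas

    pdf = pd.DataFrame(rows, columns=["key", "value"])
    return pdf.groupby("key")["value"].mean().to_dict()


def spark_mean_by_key(spark, rows):
    """Average `value` per `key` with a Spark DataFrame (distributed across executors).

    `spark` is assumed to be an existing SparkSession, such as the one a
    Databricks notebook provides.
    """
    from pyspark.sql import functions as F

    sdf = spark.createDataFrame(rows, ["key", "value"])
    result = sdf.groupBy("key").agg(F.avg("value").alias("mean_value"))
    # collect() brings the (small) aggregated result back to the driver.
    return {row["key"]: row["mean_value"] for row in result.collect()}


# Hypothetical sample data for illustration only.
sample_rows = [("a", 1.0), ("a", 3.0), ("b", 10.0)]
```

In a notebook, `spark_mean_by_key(spark, sample_rows)` would run on the cluster, while `pandas_mean_by_key(sample_rows)` stays on the driver node. Moving between the two is possible with `spark.createDataFrame(pandas_df)` and `spark_df.toPandas()`, but `toPandas()` pulls all of the data onto a single machine, so it is best reserved for results that are already small.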

This is a nice set of tips for people getting started with Spark and Databricks.
