Spark Security

Dave Wang discusses the Databricks Enterprise Security framework:

The Databricks just-in-time data platform takes a holistic approach to solving the enterprise security challenge by building all the facets of security — encryption, identity management, role-based access control, data governance, and compliance standards — natively into the data platform with DBES.

  • Encryption: Provides strong encryption at rest and inflight with best-in-class standards such as SSL and keys stored in AWS Key Management System (KMS).
  • Integrated Identity Management: Facilitates seamless integration with enterprise identity providers via SAML 2.0 and Active Directory.
  • Role-Based Access Control: Enables fine-grain management access to every component of the enterprise data infrastructure, including files, clusters, code, application deployments, dashboards, and reports.
  • Data Governance: Guarantees the ability to monitor and audit all actions taken in every aspect of the enterprise data infrastructure.
  • Compliance Standards: Achieves security compliance standards that exceed the high standards of FedRAMP as part of Databricks’ ongoing DBES strategy.

In short, DBES will provide holistic security in every aspect of the entire big d

As enterprises come to depend on technologies like Spark and Hadoop, they need to have techniques and technologies to ensure that data remains secure.  This is a sign of a maturing platform.

Related Posts

Understanding A Spark Streaming Workflow

Himanshu Gupta continues a series on structured streaming using Spark Streaming: Here we can clearly see that if new data is pushed to the source, Spark will run the “incremental” query that combines the previous running counts with the new data to compute updated counts. The “Input Table” here is the lines DataFrame which acts as a […]

Read More

Calculating TF-IDF Using Apache Spark

Arseniy Tashoyan shows us how to calculate Term Frequency-Inverse Document Frequency using Apache Spark: TF-IDF is used in a large variety of applications. Typical use cases include: Document search. Document tagging. Text preprocessing and feature vector engineering for Machine Learning algorithms. There is a vast number of resources on the web explaining the concept itself […]

Read More

Categories

June 2016
MTWTFSS
« May Jul »
 12345
6789101112
13141516171819
20212223242526
27282930