
Category: Cloud

Getting Started with Azure Databricks

Brad Llewellyn has a tutorial for Azure Databricks:

Databricks is a managed Spark framework, similar to what we saw with HDInsight in the previous post.  The major difference between the two technologies is that HDInsight is more of a managed provisioning service for Hadoop, while Databricks is more like a managed Spark platform.  In other words, HDInsight is a good choice if we need the ability to manage the cluster ourselves, but don’t want to deal with provisioning, while Databricks is a good choice when we simply want to have a Spark environment for running our code with little need for maintenance or management.

Azure Databricks is not a Microsoft product.  It is owned and managed by the company Databricks and available in Azure and AWS.  However, Databricks is a “first party offering” in Azure.  This means that Microsoft offers the same level of support, functionality and integration as it would with any of its own products.  You can read more about Azure Databricks herehereand here.

Click through for a demonstration of the product.
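
If you want a feel for what that first bit of code looks like once a cluster is up, here is a minimal PySpark sketch (not taken from Brad's post); the file path and column names are made up, and in a Databricks notebook the spark session already exists.

```python
# Minimal first job against a Databricks cluster. Inside a Databricks
# notebook, `spark` is already provided; building it here keeps the
# sketch self-contained. The path and columns below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("getting-started").getOrCreate()

df = spark.read.csv("/mnt/demo/sales.csv", header=True, inferSchema=True)

df.printSchema()  # confirm the inferred column types

(df.groupBy("region")
   .agg(F.sum("amount").alias("total_amount"))
   .show())
```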


Using the Cosmos DB Change Feed

Hasan Savran (who just became a Microsoft MVP, so congrats to him) takes us through the Cosmos DB Change Feed:

Azure Cosmos DB Change Feed exposes Cosmos DB logs outside of Cosmos DB. Cosmos DB notifies you immediately when there is any change in your database. It supports all inserts and updates; deletes will be available soon. You can always use soft deletes to catch delete events if you need to.

By knowing what has changed in your database, you can trigger all kinds of events and make your application very smart. SQL Server has similar functionality, but like many other features, log shipping is usually blocked by DBAs or company policies. In Cosmos DB, you don’t need to do anything to enable the Change Feed feature! It’s already enabled; all you need to do is configure it. The easiest way to catch Change Feed events is with Azure Functions.

When I hear someone describe the change feed, I immediately imagine it as a Kafka topic.
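
To make that concrete, the Azure Functions route Hasan mentions boils down to a Cosmos DB trigger handing your function batches of changed documents. Here is a hedged sketch using the Python programming model; the binding itself (database, container, connection setting) lives in function.json, and the field names are assumptions.

```python
import logging

import azure.functions as func


def main(documents: func.DocumentList) -> None:
    # Invoked by a cosmosDBTrigger binding (declared in function.json) each
    # time the monitored container's change feed has new inserts or updates.
    if not documents:
        return
    logging.info("Change feed delivered %d document(s)", len(documents))
    for doc in documents:
        # 'id' is always present on a Cosmos DB document; any other fields
        # depend entirely on your own schema.
        logging.info("Changed document id: %s", doc.get("id"))
```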


Developing Big Data Cluster Spark Jobs with IntelliJ

Jenny Jiang shows how we can use IntelliJ IDEA to develop Spark jobs against SQL Server Big Data Clusters:

We’re delighted to release the Azure Toolkit for IntelliJ support for SQL Server Big Data Cluster Spark job development and submission. For first-time Spark developers, it can often be hard to get started and build their first application, with long and tedious development cycles in the integrated development environment (IDE). This toolkit empowers new users to get started with Spark in just a few minutes. Experienced Spark developers also find it faster and easier to iterate their development cycle.

The toolkit extends IntelliJ support for the Spark job life cycle starting from creation, authoring, and debugging, through submission of jobs to SQL Server Big Data Clusters. It enables you to enjoy a native Scala and Java Spark application development experience and quickly start a project using built-in templates and sample code. The integration with SQL Server Big Data Cluster empowers you to quickly submit a job to the big data cluster as well as monitor its progress. The Spark console allows you to check schemas, preview data, and validate your code logic in a shell-like environment while you can develop Spark batch jobs within the same toolkit.

It looks pretty good from my vantage point.


Building the Right Architecture for the Job

Gogula Aryalingam takes us through an example where the newest and most expensive tools aren’t the best for the job at hand:

When Azure SQL Data Warehouse was chosen to implement a multi-dimensional data warehouse, it may have seemed like the ideal choice. Why? Because it was plain to see: keywords: “SQL”, “Warehouse”. However, no: SQL Data Warehouse is ideal only when you have data loads that are quite high, not when it is only several hundred gigabytes. Armed with a few more reasons as to why not (A good reference for choosing Azure SQL Data Warehouse), I confronted them. But the rebuke then was that they did get good enough performance, and that cost wasn’t a problem. Until, of course, a few months later, when complex queries started hitting the system and, despite being able to afford that cost, the value of paying that amount did not seem worth it.

Having a good architectural understanding of the Azure or AWS platform—even if you aren’t deeply familiar with all of the tools—can help avoid these types of problems.


Deploying Azure Databricks in a Custom VNET

Abhinav Garg and Anna Shrestinian explain how you can use VNET injection with Azure Databricks:

To make the above possible, we provide a Bring Your Own VNET (also called VNET Injection) feature, which allows customers to deploy the Azure Databricks clusters (data plane) in their own managed VNETs. Such workspaces could be deployed using the Azure Portal, or in an automated fashion using ARM Templates, which could be run using the Azure CLI, Azure PowerShell, the Azure Python SDK, etc.

With this capability, the Databricks workspace NSG is also managed by the customer. We manage a set of inbound and outbound NSG rules using a Network Intent Policy, as those are required for secure, bidirectional communication with the control/management plane. 

This is a good article if the defaults won’t get past corporate security.
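
As one illustration of the ARM template route, here is a hedged sketch of the “Azure Python SDK” option the authors mention, using azure-identity and azure-mgmt-resource. The subscription ID, resource group, and file names are placeholders, and older versions of the SDK call the method create_or_update rather than begin_create_or_update.

```python
import json

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "<subscription-id>"          # placeholder
RESOURCE_GROUP = "databricks-vnet-rg"          # hypothetical resource group
DEPLOYMENT_NAME = "databricks-vnet-injection"  # hypothetical deployment name

client = ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# The template and parameters files would be the VNET injection ARM template
# for the workspace; the file names here are made up.
with open("workspace-vnet-injection.json") as f:
    template = json.load(f)
with open("workspace-vnet-injection.parameters.json") as f:
    parameters = json.load(f)["parameters"]

poller = client.deployments.begin_create_or_update(
    RESOURCE_GROUP,
    DEPLOYMENT_NAME,
    {
        "properties": {
            "template": template,
            "parameters": parameters,
            "mode": "Incremental",
        }
    },
)
print(poller.result().properties.provisioning_state)
```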


Generating TPC-DS Data Sets with HDInsight

Chris Koester shows how you can generate artificial data sets in the TPC-DS format using HDInsight:

This post describes how to generate big datasets with Hive in HDInsight, specifically TPC-DS benchmarking datasets. There are many tools for generating sample data, and this one is particularly nice due to its familiarity and ability to generate massive datasets up to 100 terabytes in size. The intended purpose of TPC data is benchmarking, but big sample datasets are also very useful for learning big data tools, proofs of concept, testing, etc.

The TPC (Transaction Processing Performance Council) provides tools for generating the benchmarking data, but using them to generate big data is not trivial, and would take a very long time on modest hardware. Thankfully, someone has written a nice utility that uses Hive and Python to run the generator on a Hadoop cluster. While Hadoop clusters are not easy to set up, using a Hadoop cloud service like Azure HDInsight is remarkably easy. With HDInsight, you can use a powerful cluster of machines to generate the data quickly, and when you’re done you can delete the cluster, leaving the data in place.

Most of the instructions should carry over to on-prem or non-HDInsight Hadoop clusters, though there will be some changes to accommodate differences in HDInsight.
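
Assuming the utility registers the generated data as Hive tables (as the Hive-based generators typically do), a quick sanity check from a Spark session on the cluster might look like the sketch below; the database and table names are placeholders that depend on how you ran the generator.

```python
# Row-count sanity check against the generated TPC-DS tables. The database
# name below is hypothetical; adjust it to match your generator settings.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tpcds-sanity-check")
    .enableHiveSupport()   # read from the Hive metastore on the cluster
    .getOrCreate()
)

spark.sql("USE tpcds")
spark.sql("SHOW TABLES").show(truncate=False)
spark.sql("SELECT COUNT(*) AS row_count FROM store_sales").show()
```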


Databricks Dashboards

Megan Quinn takes us through building dashboards from Databricks notebooks:

The first step in any type of analysis is to understand the dataset itself. A Databricks dashboard can provide a concise format in which to present relevant information about the data to clients, as well as a quick reference for analysts when returning to a project.

To create this dashboard, a user can simply switch to Dashboard view instead of Code view under the View tab. The user can either click on an existing dashboard or create a new one. Creating a new dashboard will automatically display any of the visualizations present in the notebook. Customization of the dashboard is easily achieved by clicking on the chart icon in the top right corner of the desired command cells to add new elements.

This isn’t quite a step-by-step guide but does spur on ideas.
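
As one concrete example, any notebook cell whose output comes from display() can be pinned to a dashboard this way. Here is a minimal sketch; the sales table is made up, and display() is specific to Databricks notebooks.

```python
# Runs inside a Databricks notebook, where `spark` and `display()` are
# provided by the environment. The `sales` table is hypothetical.
daily_totals = spark.sql("""
    SELECT order_date,
           SUM(amount) AS total_sales
    FROM sales
    GROUP BY order_date
    ORDER BY order_date
""")

# Render the result, switch the cell output to a chart, then use the chart
# icon to add it to a dashboard as described above.
display(daily_totals)
```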


Putting TempDB Files On Azure IaaS D Drive

John McCormack tries out using the temporary drive on Azure VMs for tempdb:

Azure warns you not to store data on the D drive in Azure VMs, but following this advice could mean you are missing out on some very fast local storage. It’s good general advice because this local storage is not permanently attached to your instance, meaning you could lose data or log files if your VM is stopped and restarted. But what if you could afford to lose certain files? Say, files that are recreated during startup anyway.

TempDB is the ideal candidate for this. No other database is suitable! Putting the tempdb data and log files onto the D drive can be achieved quite easily with a little bit of effort. And you will most likely see a big improvement in tempdb read/write latency.

John ended up seeing much bigger gains than I did when I tried this, but with a difference that big, it’s definitely worth using the temporary drive for tempdb.


Sizing Azure SQL Database

Arun Sirpal takes us through finding the right size for Azure SQL Database:

Do you want to identify the correct Service Tier and Compute Size (once known as performance level) for your Azure SQL Database? How would you go about it? Would you use the DTU (Database Transaction Unit) calculator? What about the new pricing model, vCore? How would you translate your current on-premises workload to the cloud?

It can be a form of trial and error, especially if you are new to this, but I really do recommend trying out the PowerShell script that you can access once you have installed DMA (Database Migration Assistant).

Read on to see how to run this tool and potentially save some money.


Cleaning Up After Yourself in Azure Data Factory

Rayis Imayev shows how you can automatically delete old files in Azure Data Factory:

File management may not be at the top of my list of priorities during data integration projects. I assume that once I learn enough about the sourcing data systems and the target destination platform, I’m ready to design and build a data integration solution between two or more connecting points. Then a historical file management process becomes a necessity, or a need arises to log and remove some of the incorrectly loaded data files. Basically, a step in my data integration process to remove (or clean) such files would be helpful.

Click through to see how to do this.
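
Rayis does the housekeeping with Data Factory itself. Purely to illustrate the same retention idea outside of ADF, here is a hedged Python sketch that removes old blobs with the azure-storage-blob SDK; the connection string, container, prefix, and retention window are all placeholders.

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import BlobServiceClient

CONNECTION_STRING = "<storage-account-connection-string>"  # placeholder
CONTAINER = "staging"                                      # hypothetical container
RETENTION_DAYS = 30                                        # keep a month of files

cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)

container = BlobServiceClient.from_connection_string(
    CONNECTION_STRING
).get_container_client(CONTAINER)

# Walk the landing folder and delete anything older than the retention window.
for blob in container.list_blobs(name_starts_with="incoming/"):
    if blob.last_modified < cutoff:
        print(f"Deleting {blob.name} (last modified {blob.last_modified:%Y-%m-%d})")
        container.delete_blob(blob.name)
```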
