Press "Enter" to skip to content

Category: Cloud

Azure TCO Calculator

James Serra links to the Azure Total Cost of Ownership calculator:

For a long time clients would ask me how to determine the cost savings by migrating their applications and databases to Azure.  I never had a good answer until now: The Total Cost of Ownership (TCO) Calculator.  Now in preview, just provide a brief description of your on-premises environment to get an instant estimate of the cost savings you can realize by migrating your application workloads to Microsoft Azure

I think this is a useful start, though a big part of the value I see in Azure is moving certain things from IaaS to PaaS, like getting rid of the web servers and going to Azure Websites.

Comments closed

Apache Ranger On ElasticMapReduce

Varun Rao explains role-based access control using Apache Ranger on Amazon ElasticMapReduce:

Using the HUE SQL Editor, execute the following query.

These queries use external tables, and Hive leverages EMRFS to access the data stored in S3. Because HiveServer2 (where Hue is submitting these queries) is checking with Ranger to grant or deny before accessing any data in S3, you can create fine-grained SQL-based permissions for users even though there is a single EC2 role specified for the cluster (which is used by all requests the cluster makes to S3). For more information, see Additional Features of Hive on Amazon EMR.

If your job includes securing a Hadoop cluster, this is a nice read, even if you don’t use EMR.

Comments closed

Identity Not Found

Melissa Coates explains Azure Active Directory tenancy to solve an Azure Analysis Services error:

The Analysis Services product team explained to me that a a user from a tenant which has never provisioned Azure Analysis Services cannot be added to another tenant’s provisioned server. Put another way, our Corporate tenant had never provisioned AAS so the Development tenant could not do so via cross-tenant guest security.

One resolution for this is to provision an AAS server in a subscription associated with the Corporate tenant, and then immediately delete the service from the Corporate tenant. Doing that initial provisioning will do the magic behind the scenes and allow the tenant to be known to Azure Analysis Services. Then we can proceed to provision it for real in the Development tenant.

Read the whole thing.

Comments closed

Elastic Database Pools

Arun Sirpal describes Azure elastic database pools:

The key to using elastic database pools is that you must understand the characteristics of the databases involved and their utilisation patterns, if you do not understand this then the idea of using an elastic database pool may cause problems.

The maximum amount my pool has is 100 eDTUs, I know for a fact that the S2 databases will not be used at the same time, the other S0 databases might be used at the same time at the most 3 of them at the same time. Basically what I am saying here is that I know that when the databases concurrently peak I know that it will not go beyond the 100 eDTU limit.

One thing that Arun does not mention is the relative ease of interconnecting databases within a pool, so even if it doesn’t end up being cheaper on net, that might be a benefit worth having.

Comments closed

AWS Data Lake

Nick Corbett announces that Amazon is rolling out their own data lake solution:

Separating storage from processing can also help to reduce the cost of your data lake. Until you choose to analyze your data, you need to pay only for S3 storage. This model also makes it easier to attribute costs to individual projects. With the correct tagging policy in place, you can allocate the costs to each of your analytical projects based on the infrastructure that they consume. In turn, this makes it easy to work out which projects provide most value to your organization.

The data lake stores metadata in both DynamoDB and Amazon ES. DynamoDB is used as the system of record. Each change of metadata that you make is saved, so you have a complete audit trail of how your package has changed over time. You can see this on the data lake console by choosing History in the package view:

Having a competitor in the data lake space is a good thing for us, though based on this intro post, it seems that Amazon and Microsoft are taking different approaches to the data lake, where Microsoft wants you to stay in the data lake (e.g., writing U-SQL or Python statements to query the data lake) and Amazon wants you to shop the data lake and check out the specific S3 buckets and files for your own processing.

Comments closed

Python Support In Azure Data Lake

Saveen Reddy announces that Python is now a first-tier language in the Azure Data Lake:

This week, were are now making announcing even more support for Python. As of today Python is now a first-class language supported by our management SDKs. This enables you to develop applications or automate the Data Lake services. Check out or Getting Started articles that now include many python samples

Saveen has a Jupyter notebook which demonstrates Python in Azure Data Lake Store.

Comments closed

Azure Functions

Steph Locke has taken a shine to Azure Functions:

Azure Functions take care of all the hosting, all the retry logic, all the parallelisation, all the authentication gubbins, all the monitoring for you. The only bits of code you really have to write is the important stuff – the code that implements the business process. This makes a coding project go from >500 lines to <50, and it should be better quality too! This is super handy for data integration, and I would recommend it over and above Data Factory, unless you need to do some Hadoop stuff and maybe not even then.

The wag in me says that with F#, you could take it from 50 lines to 10…  Read the whole thing.

Comments closed

Sharing Power Query Queries

Chris Webb shows how to use Azure Data Catalog to share queries from Power Query:

While I’m really happy to have this functionality back, and I think a lot of people will find it useful, there’s still a lot of room for improvement. Some thoughts:

  • This really needs to extended to work with Power BI Desktop too. In fact, it’s such an obvious thing to do it must be happening soon…?

Given how quickly the Power BI team iterates, that’s probably the case.  Anyhow, read the whole thing.

Comments closed

Cases For Using Azure Analysis Services

Melissa Coates enumerates several reasons why you might want to use Azure Analysis Services:

Varying Levels of Peak Workloads

Let’s say during month-end close the reporting activity spikes much higher than the rest of a typical month. In this situation, it’s a shame to provision hardware that is underutilized a large percentage of the rest of the month. This type of scenario makes a scalable PaaS service more attractive than dedicated hardware. Do note that currently Azure SSAS scales compute, known as the QPU or Query Processing Unit level, along with max data size (which is different than some other Azure services which decouple those two).

Read on for more use cases.

Comments closed

Azure VM Auto-Shutdown

Dave Bermingham shows how to configure automatic shutdown of Azure VMs:

If you are like me, I try to make my Azure MSDN subscription credits stretch the entire month. I’m typically just building labs to try out new features or to demonstrate SQL Server Failover Clusters in Azure. A lot of the time I am testing some pretty large instance sizes with plenty of premium storage. As you can imagine, you can burn through $150 pretty quick with a few GS5 instances running.

I try to be mindful and shutdown or destroy instances once I am done with them, but occasionally I’ll get pulled away for other business, only to log in the next day and see my credit has expired because I forgot to turn off the VMs.

Click through for details, including a warning about storage.

Comments closed