Press "Enter" to skip to content

Category: Cloud

Free Trial Of Azure SQL Data Warehouse

James Serra notes that there is a free one-month trial of Azure SQL Data Warehouse:

You can use this one month free trial to do POCs and try out SQL DW up to 200 DWU and 2TB of data.  You must sign up by December 31st 2016.  Please note that once the one month free trial is over, you will start getting billed at general availability pricing rates.  For more information on the free trial, and to sign up, go here.

This is great because you can quickly run out of credits otherwise.

Comments closed

Test Connection With HDInsight

I have a post trying to test a connection using HDInsight:

WebHCat is a web-based REST API for HCatalog, a management layer for dealing with files in HDFS.  If you’re looking for configuration settings for WebHCat, you’ll want generally to look for “templeton” in config files, as Templeton was the project name before WebHCat.  In Ambari, you can go to the Hive configs and look at webhcat-site.xml for configuration settings.  For WebHCat, the default port in HDInsight is 30111, which you should find in the templeton.port configuration setting.

I don’t like the fact that WebHDFS is blocked, but at least WebHCat is functional.

Comments closed

Spark Clusters On Spot Pricing

Sameer Farooqui explains spot pricing with respect to AWS servers:

The idea behind Spot instances is to allow you to bid on spare Amazon EC2 compute capacity. You choose the max price you’re willing to pay per EC2 instance hour. If your bid meets or exceeds the Spot market price, you win the Spot instances. However, unlike traditional bidding, when your Spot instances start running, you pay the live Spot market price (not your bid amount). Spot prices fluctuate based on the supply and demand of available EC2 compute capacity and are specific to different regions and availability zones.

So, although you may have bid 0.55 cents per hour for a r3.2xlarge instance, you’ll end up paying only 0.10 cents an hour if that’s what the going rate is for the region and availability zone.

Databricks uses spot pricing for Community Edition clusters to control costs.  Click through for a very interesting discussion of spot pricing and how they take advantage of it.

Comments closed

Capturing SSAS Query Activity

Bill Anton explains why and how he captures query activity by user in SSAS:

In most environments, it is trivial to obtain the name of the user who ran each query… all you have to do was capture the [QueryEnd] event in a profiler/xevent trace and pull the information from the [NTUserName] field. However, in environments involving Power BI and the Enterprise On-Premise Data Gateway, there’s a bit more to it.

The main issue is how authentication is handled in this type of architecture. When working with Power BI reports connected to an on-premise data source via the On-Premise Data Gateway, the account of the user running the report is passed as the “EffectiveUsername”. The implication here is that the value shown in the [NTUserName] field of the xevent/profiler trace is going to be the Data Gateway account – NOT the account of the user who actually generated the activity.

Read on for the full answer.

Comments closed

Developing In The Cloud

Richie Rump has some nice pointers about developing for Azure or AWS:

Since we’re a bunch of data freaks, we wanted to make sure that our data and files are properly backed up. I set out to create a script that will backup DynamoDB to a file and copy the data in S3 to Azure. The reasoning for saving our backups into a different cloud provider is pretty straightforward. First, we wanted to keep the data in a separate cloud account from the application. We didn’t make the same mistakes that Code Spaces did. Secondly, I wanted to kick the tires of Azure a bit. Heck, why not?

I figure this script would take me a day to write and a morning to deploy. In the end it took four days to write and deploy. So here are some lessons that I learned the hard way from trying to bang out this backup code.

This is a must-read if you’re starting to look at using cloud providers for services.

Comments closed

Azure SQL DW Statistics

Emma Stewart looks at how statistics are created in Azure SQL Data Warehouse:

In Azure SQL Data Warehouse, statistics have to be created manually. On previous SQL Server projects, creating and maintaining statistics wasn’t something that we had to incorporate into our design (and really think about!) however with SQL DW we need to make sure we think about how to include it in our process in order to make sure we take advantage of the benefits of working with Azure DW.

The major selling point of Azure SQL Data Warehouse is that it is capable of processing huge volumes of data, one of the specific performance optimisations that has been made is the distributed query optimiser. Using the information obtained from the statistics (information on data size and distribution), the service is able to optimize queries by assessing the cost of specific distributed query operations. Therefore, since the query optimiser is cost-based, SQL DW will always choose the plan with the lowest cost.

Azure SQL Data Warehouse is a bit of a strange animal, with differences in statistics being one of the smaller changes versus “classic” SQL Server.

Comments closed

Details On Azure SSAS

James Serra breaks down what Azure Analysis Services has to offer:

  • Developers can use SQL Server Data Tools (SSDT) in Visual Studio for creating models and deploying them to the service.  Administrators can manage the models using SQL Server Management Studio (SSMS) and investigate issues using SQL Server Profiler

  • Business users can consume the models in any major BI tool.  Supported Microsoft tools include Power BI, Excel, and SQL Server Reporting Services.  Other MDX compliant BI tools can also be used, after downloading and installing the latest drivers

  • The service currently supports tabular models (compatibility level 1200 only).  Support for multidimensional models will be considered for a future release, based on customer demand

Between tabular-only support and the max size being 100 GB (if I’m reading this correctly), they’re not yet ready to push the product hard.  Given that it just came out, that makes sense, and hopefully the training wheels come off.

Comments closed

Cached Azure Analysis Services Logins

Chris Webb shows how to log into Azure Analysis Services from Management Studio as a different user:

When Azure Analysis Services was announced I had to try it out right away. Of course I didn’t read the instructions properly so when I tried to log in to my Azure Analysis Services instance from SQL Server Management Studio, like an idiot I logged in with the wrong username. The problem is that once you’ve done this, with current versions of SQL Server Management Studio there’s no way of logging out and logging in as a different user. Luckily Igor Uzhviev of Microsoft had a solution for me and I thought I’d share it for anyone else who’s made the same mistake. Here’s what you need to do:

This seems a bit much, but should just be a temporary workaround.

Comments closed

Polybase As Ersatz StretchDB

Ginger Grant has a great idea:

PolyBase, which was released with SQL Server 2016, provides another method to access live data either locally or in the cloud, very similar to the SQL Server Stretch database feature. Polybase can also provide the ability to provide a more cost-effective availability for cold data, streamlines on-premises data maintenance, and keeps data secure even during migration. Polybase differs from Stretch database in a few ways, as the SQL must be different, the speed is noticeably slower, and it is a lot less expensive. The cost is significantly less because storing data in a Azure blob store starts at 1 cent a month and Stretch database starts at $2.50 an hour. In this post,I will show how to take data which was archived due to the age of the data, which was created in 2012 and store it in an Azure Blob Storage file which will be available via Polybase when I needed.

The ideal scenario for this solution is extremely cold data which is nonetheless required as part of regulatory compliance, where having a query run for 3 hours once every six months or so is acceptable.

Comments closed

Migrating To Azure SQL Database

Tim Radney walks us through steps to migrate an on-prem database to Azure SQL Database:

When planning to migrate on premises databases to V12, the size of the database is a huge factor in how long the migration will take. The export of the database, the transfer of the data, and the import will all increase in proportion to the size of the database.

Another big factor in the restore/import time when moving your databases to V12 is the performance tier you are restoring too. The restore/import process requires a lot of horsepower, so to help expedite your migration, you should consider restoring to a higher performance tier. When the database is online, you can easily and quickly drop down to a lesser tier that meets your daily performance needs. Being able to change performance tiers with a few mouse clicks is one of the big benefits of Azure SQL Database.

There are some design considerations for moving to Azure SQL Database, and once those are covered, Tim’s article helps with the actual migration process.

Comments closed