Press "Enter" to skip to content

Category: Cloud

Polybase Setup Errors

Murshed Zaman on the Azure CAT team covers a number of Polybase configuration errors:

SSMS Error:

Any Select query fails with the following error.
Msg 106000, Level 16, State 1, Line 1
Java heap space

Possible Reason:

Illegal input may cause the java out of memory error.  In this particular case the file was not in UTF8 format. DMS tries to read the whole file as one row since it cannot decode the row delimiter and runs into Java heap space error.

Possible Solution:

Convert the file to UTF8 format since PolyBase currently requires UTF8 format for text delimited files.

I imagine that this page will get quite a few hits over the years, as there currently exists limited information on how to solve these issues if you run into them, and some of the error messages (especially the one quoted above) have nothing to do with root causes.

Comments closed

Netflix Billing Architecture

The Netflix tech blog discusses changing their billing infrastructure to be entirely in the cloud (AWS in this case):

Cleaning up Code: We started chipping away existing code into smaller, efficient modules and first moved some critical dependencies to run from the Cloud. We moved our tax solution to the Cloud first.

Next, we retired serving member billing history from giant tables that were part of  many different code paths. We built a new application to capture billing events, migrated only necessary data into our new Cassandra data store and started serving billing history, globally, from the Cloud.

We spent a good amount of time writing a data migration tool that would transform member  billing attributes spread across many tables in Oracle  into a much simpler Cassandra data structure.

We worked with our DVD engineering counterparts to further simplify our integration and got rid of obsolete code.

Purging Data: We took a hard look at every single table to ensure that we were migrating only what we needed and leaving everything else behind. Historical billing data is valuable to legal and customer service teams. Our goal was to migrate only necessary data into the Cloud. So, we worked with impacted teams  to find out what parts of historical data they really needed. We identified alternative data stores that could serve old data for these teams. After that, we started purging data that was obsolete and was not needed for any function.

All in all, a very interesting read on how to migrate large databases.  Even if you’re moving from one version of a product to another, some of these steps might prove very helpful in your environment.

Comments closed

Minimizing Cloud Costs

Kenneth Fisher looks at reducing the bottom line for cloud operations:

This got me thinking about ways to reduce/minimize costs. These are some general ideas since from what I can tell cloud billing is as complex as the tax codes and at that I have limited experience.

  • If you aren’t using your VM, shut it down. You can do this manually, or with apowershell script or even at the push of a button

  • Start small. Only create the machines you need and keep them to a minimum.

  • Starting small will lead to some bottle necks. Feel free to bounce up and down as you need. There are some restrictions (size etc) when you move downwards, so be careful. Again this can be done manually or with powershell. Let’s say you need to do a high volume load. Bump your service tier, then once you are done, bump it back down again.

  • And my personal favorite : Don’t install enterprise when you only need standard.

Doing business on Azure or AWS does require a bit of a shift in mindset.  Cloud costs are entirely variable—you control when services run; how much compute, storage, and bandwidth you want to use; and your SLA.  Choosing different spots on the continuum results in different pricing.  This has also helped the growth of technologies like Hadoop, in which you can separate compute from storage.  If I know that my cluster gets heavy usage during core business hours, light usage overnight, and no usage on the weekend, I can spin up and down nodes as necessary, and can even shut off clusters which don’t need to operate, and because I’m storing the data off of the cluster nodes (and on S3 or in Azure Data Lake Storage), data doesn’t become unavailable just because the primary compute process is unavailable—I could spin up another cluster or write a quick one-off data reader.

Comments closed

Tools For Cortana Intelligence Suite Development

Melissa Coates has a list of tools she uses when working with Cortana Intelligence Suite:

4. Azure SDK

The Azure SDK sets up lots of libraries; the main features we are looking for from the Azure SDK right away are (a) the ability to use the Cloud Explorer within Visual Studio, and (b) the ability to create ARM template projects for automated deployment purposes. In addition to the Server Explorer we get from Visual Studio, the Cloud Explorer from the SDK gives us another way to interact with our resources in Azure.

This is a nice tools checklist to compare against what you’re using.

Comments closed

Microsoft & FreeBSD

Serdar Yegulalp points out that a new Azure VM image for FreeBSD has Microsoft as the publisher:

The other question people are likely to ask is why, kernel contributions notwithstanding, is Microsoft listed as the publisher of the distro? The short answer: support.

According to Microsoft’s blog post, the FreeBSD Foundation is a community of mutually supportive users, “not a solution provider or an ISV with a support organization.” The kinds of customers who run FreeBSD on Azure want to have service-level agreements of some kind, and the FreeBSD Foundation isn’t in that line of work.

The upshot: If you have problems with FreeBSD on Azure, you can pick up the phone and get Microsoft to help out — but only if you’re running its version of FreeBSD.

To be honest, I don’t see this as a big deal.  I’m glad the image is there, but this hardly seems like a landmark change in anything to me.

Comments closed

Views Within Elastic Query

Grant Fritchey shows how to build Elastic Query views to obscure underlying table names:

Creating a view, or any other query, that joins across databases using Elastic Query works just fine. However, if you want to mask things using a view, you might need to get a little creative in how you implement Elastic Query. The good news is, Elastic Query is somewhat, shall we say, elastic in how you set it up. More so than it immediately appears.

Interesting.

Comments closed

Thoughts On Stretch Database

Kevin Hill looks at Stretch database:

  • Lowest performance rate is $1.25/hr or just under $1K/mo. Only goes up from there

  • “Stretch Database currently does not support stretching to another SQL Server. ” Azure only

  • Lame/minimal filters…you have to roll your own functions, and they must be deterministic…no “Getdate() – 30”. This GUI is only slightly better than the horrible nightmare that was Notification Services…

I see the negatives overwhelming the positives at this point.  You also can’t modify schema while Stretch is active.

Comments closed

Premium Storage On Azure SQL Data Warehouse

Kenneth Nielsen reports that Azure SQL Data Warehouse will now support premium storage:

Today Microsoft have announced that Azure SQL Datawarehouse will support Premium Storage, this will allow the customers to see greater performance and predictability on queries. As of today, all newly created SQL Datawarehouse will be created with Premium Storage, at least in regions where Premium Storage is available. In the remainder of the preview period, the billing will continue to be based on standard pricing.

If you have Azure SQL DW, check it out to see if Premium is a big net benefit to you, as it looks like the price is the same for the moment.

Comments closed

SQL Licenses On Azure

Kenneth Nielsen notes that you can now bring your own SQL Server licenses to Azure marketplace images:

A few days ago, we announced that Microsoft Enterprise customers is now allowed to bring their own SQL Licenses to Azure VMs. This means that if a customer already have a SQL License, this license can be used on SQL Server VM images from Marketplace.

This means that they do no longer need to build their own VM, but instead can just provision a server from the marketplace and use the existing license.

I like this, but I do wonder what percentage of people will use marketplace-created VMs instead of customizing their own builds.

Comments closed