Category: Cloud

Q: Is there any way to handle the execution of SSIS packages stored locally?

A: Azure Automation works on Azure resources. It cannot be used for executing local SSIS packages.

In some cases, you may still need a scheduling tool (which might be a VM with SQL Agent).

Comments closed

Learning Azure

Published 2017-04-04 by Kevin Feasel

Grant Fritchey notes that web searches won’t always take you to the latest version of documentation:

If you’re learning Azure and you research things using a search engine, then I strongly recommend you use the ability to limit your searches to the last year. Otherwise, you may be getting incomplete or incorrect data. At this precise moment, I’d say you need to limit your searches to Google (although I honestly hate recommending one of these tools over the other, let’s keep the competition fierce) because I was able to easily get the correct information within a couple of mouse clicks.

Grant’s post makes sense, and so does the search engine behavior: in Grant’s case, those older cmdlet documentation links have been around longer and older resources tend to have a larger number of relevant linkbacks and clicks. That’s also visible in SQL Server documentation, where sometimes you’ll land on the 2008R2 or 2012 version of documentation rather than 2016 or vNext.

Meanwhile, Victoria Holt has a bunch of resources for the Azure curious:

Here are a whole set of links to kick start your learning of Microsoft Azure services.

Introduction video

Changes to computer thinking – Stephen Fry explains cloud computing

That’s a good set of starting links.

Comments closed

Scalable Data Analytics

Published 2017-04-03 by Kevin Feasel

David Smith covers a recent Microsoft Data Science team talk at Strata:

The tutorial covers many different techniques for training predictive models at scale, and deploying the trained models as predictive engines within production environments. Among the technologies you’ll use are Microsoft R Server running on Spark, the SparkR package, the sparklyr package and H20 (via the rsparkling package). It also touches on some non-Spark methods, like the bigmemory and ff packages for R (and various other packages that make use of them), and using the foreach package for coarse-grained parallel computations. You’ll also learn how to create prediction engines from these trained models using the mrsdeploy package.

Check out the post as well as the tutorial David links.

Comments closed

Encrypting Kinesis Records

Published 2017-04-03 by Kevin Feasel

Temitayo Olajide shows how to use Amazon’s Key Management Service to encrypt and decrypt Kinesis messages:

In this post you build encryption and decryption into sample Kinesis producer and consumer applications using the Amazon Kinesis Producer Library (KPL), the Amazon Kinesis Consumer Library (KCL), AWS KMS, and the aws-encryption-sdk. The methods and the techniques used in this post to encrypt and decrypt Kinesis records can be easily replicated into your architecture. Some constraints:

AWS charges for the use of KMS API requests for encryption and decryption, for more information see AWS KMS Pricing.
You cannot use Amazon Kinesis Analytics to query Amazon Kinesis Streams with records encrypted by clients in this sample application.
If your application requires low latency processing, note that there will be a slight hit in latency.

Check it out, especially if you’re thinking about streaming sensitive data.

Comments closed

Introduction To Amazon Kinesis

Published 2017-04-03 by Kevin Feasel

Jen Underwood describes Amazon Kinesis:

Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. Amazon Kinesis is ideal for Internet of Things (IoT) use cases. It can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources, allowing you to easily write applications that process information in real-time, from sources such as web site click-streams, Raspberry Pi gadgets, devices, social media, operational logs, metering data and more.

With Amazon Kinesis, you can build real-time dashboards, capture exceptions, execute algorithms, and generate alerts. With point-and-click menus, you can ingest data, query it and then send output to a variety of destinations including but not limited to Amazon S3, Amazon EMR, Amazon DynamoDB, or Amazon Redshift.

Kinesis is powerful, especially if you’re already locked into the AWS platform. My preference is Apache Kafka, but Kinesis is definitely worth learning about.

Comments closed

doAzureParallel

Published 2017-03-31 by Kevin Feasel

JS Tan announces a new R package:

For users of the R language, scaling up their work to take advantage of cloud-based computing has generally been a complex undertaking. We are therefore excited to announce doAzureParallel, a lightweight R package built on Azure Batch that allows you to easily use Azure’s flexible compute resources right from your R session. The doAzureParallel package complements Microsoft R Server and provides the infrastructure you need to run massively parallel simulations on Azure directly from R.

The doAzureParallel package is a parallel backend for the popular foreach package, making it possible to execute multiple processes across a cluster of Azure virtual machines with just a few lines of R code. The package helps you create and manage the cluster in Azure, and register it as a parallel backend to be used with foreach.

It’s an interesting alternative to building beefy R servers.

Comments closed

Explaining DTUs

Published 2017-03-31 by Kevin Feasel

Andy Mallon explains what a Database Transaction Unit is:

I’d like to point out that the definition of a DTU is that it’s “a blended measure of CPU, memory, and data I/O and transaction log I/O…” None of the perfmon counters used by the DTU Calculator take memory into account, but it is clearly listed in the definition as being part of the calculation. This isn’t necessarily a problem, but it is evidence that the DTU Calculator isn’t going to be perfect.

I’ll upload some synthetic load into the DTU Calculator, and see if I can figure out how that black box works. In fact, I’ll fabricate the CSVs completely so that I can totally control the perfmon numbers that we load into the DTU Calculator. Let’s step through one metric at a time. For each metric, we’ll upload 25 minutes (1500 seconds–I like round numbers) worth of fabricated data, and see how that perfmon data is converted to DTUs.

Andy then goes on to show how the DTU Calculator estimates DTU usage given different resource patterns. It’s a very interesting process and Andy clarified it considerably.

Comments closed

Unavailable Azure VM Sizes

Published 2017-03-31 by Kevin Feasel

Melissa Coates gives a few of the major reasons why a particular Azure VM size may not be available when you go to resize your VM:

Just a quick tip about why you might notice some sizes are not available when you are attempting to change the size/scale level of an Azure virtual machine in the portal.

I wanted to change one of my Development VMs to a DS12_v2, but that choice wasn’t available:

It didn’t immediately dawn on me why it wasn’t available, so I thought I’d try PowerShell:

Read on for the solution, as well as a few other common causes.

Comments closed

Azure SQL Database Premium RS

Published 2017-03-29 by Kevin Feasel

Arun Sirpal describes a new pricing tier for Azure SQL Database:

What Microsoft classifies as IO intensive I am not so sure, personally I have not seen any sort of IOPS figure(s) for what we could expect from each service tier, it’s not like I can just run DiskSpeed and find out. Maybe the underlying storage for Premium RS databases is more geared to work with complex analytical queries, unfortunately I do not have the funds in my Azure account to start playing around with tests for Premium vs. Premium RS (I would love to).

Also and just as important, Premium RS databases run with fewer redundant copies than Premium or Standard databases, so if you get a service failure you may need to recover your database from a backup with up to a 5-minute lag. If you can tolerate 5 minute data loss and you are happy with a reduced number of redundant copies of your database then this is a serious option for you because the price is very different.

It’s a lot less expensive (just under 1/3 the cost of Premium in Arun’s example), so it could be worth checking out.

Comments closed

Azure Networking

Published 2017-03-28 by Kevin Feasel

Joshua Feierman has an article on how Azure Networking works, particularly from the viewpoint of a DBA:

The connecting thread between an Azure virtual machine and a virtual network is a Virtual Network Interface Card, or VNic for short. These are resources that are separate and distinct from the virtual machine and network itself, which can be assigned to a given virtual machine.

If you go to the “All Resources” screen and sort by the “Type” column, you will find a number of network interface resources.

There’s some good information in here.

Comments closed

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31