Press "Enter" to skip to content

Category: Cloud

Creating Azure VMs Using Powershell: Laying The Groundwork

Robert Cain has part one of a two-part series on creating VMs in Azure using Powershell:

Creating a virtual machine is great, but it won’t be of much use unless it can communicate outside of itself. That’s where virtual networking comes in. To setup a virtual network, often abbreviated vnet, you need to accomplish three things. First is the creation of the virtual network itself. After the network is created, you need to define a security group for it. In essence, the security group defines a firewall. In the process of creating it, the PSAzure module automatically creates firewall rules that allow HTTP and RDP (Remote Desktop Protocol) traffic through the firewall. There are functions in PSAzure to create security groups at a lower level, allowing one to create alternate rules. This example will demonstrate the most common options.

The final step is to create a virtual NIC, or Network Interface Card. The NIC will form the bridge between the virtual network and the virtual machine, much like a physical network card allows a physical computer to connect to a real network. First off, a few variables are assigned. These will hold names for the security group, network and subnet names. The network addresses for the main network and subnet are also placed into into variables. Finally, a name is assigned to the NIC.

Check it out, especially if you build a lot of VMs in Azure.

Comments closed

Jupyter Notebooks In Azure

Steve Jones looks at using Jupyter Notebooks in Azure:

There’s a new feature in Azure, and I stumbled on it when someone posted a link on Twitter. Apologies, I can’t remember who, but I did click on the Azure Notebooks link and was intrigued. I’ve gotten Jupyter notebooks running on my local laptop, but these are often just on one machine. Having a place to share a notebook in the cloud is cool.

Once I clicked on the link, I found these are both R and Python notebooks, as well as F#. These allow you to essentially build a page of code and share it. It’s kind of like a REPL, kind of like a story. It’s a neat way of working through a problem. I clicked the Get Started link to get going and was prompted for a User ID.

I’m a major fan of using notebooks for validating results as well as training people.

Comments closed

Uploading Data Sets To Azure ML From R

Leila Etaati continues her series on the Azure ML R package by showing how to upload a data set:

There is a function in AzureML package name “workspace” that creates a reference to an AzureML Studio workspace by getting the authentication token and workspace id as below:

to work with other AzureML packages you need to pass this object to them.

for instance for exploring the all experiments in Azure ML there is a function name “experiments” that gets the “ws” object as input to connect the desire azure ml environment and also a filter.

Click through for  more.

Comments closed

Cloud Savings And TCO

James Serra argues that moving to the cloud can be a net savings on cost:

I often tell clients that if you have your own on-premise data center, you are in the air conditioning business.  Wouldn’t you rather focus all your efforts on analyzing data?  You could also try to “save money” by doing your own accounting, but wouldn’t it make more sense to off-load that to an accounting company?  Why not also off-load the  costly, up-front investment of hardware, software, and other infrastructure, and the costs of maintaining, updating, and securing an on-premises system?

And when dealing with my favorite topic, data warehousing, a conventional on-premise data warehouse can cost millions of dollars in the following: licensing fees, hardware, and services; the time and expertise required to set up, manage, deploy, and tune the warehouse; and the costs to secure and back up the data.  All items that a cloud solution eliminates or greatly minimizes.

When estimating hardware costs for a data warehouse, consider the costs of servers, additional storage devices, firewalls, networking switches, data center space to house the hardware, a high-speed network (with redundancy) to access the data, and the power and redundant power supplies needed to keep the system up and running.  If your warehouse is mission critical then you need to also add the costs to configure a disaster recovery site, effectively doubling the cost.

I don’t think this story plays quite as well.  For small and mid-sized companies, yes, the cloud is often a net savings.  For companies whose products were designed to be cloud-first and take advantage of burstiness and spot markets, yes, you can drive cost savings that way.  But for most mid-to-large companies, I think the calculus shifts to where sometimes cloud options work better but often they don’t.  Need a few hundred SQL Server instances with microsecond-level latency running SQL Server Enterprise Edition 24/7?  That’s not going to be cheaper.

Comments closed

The Shuffling Operator And Azure SQL DW

Arun Sirpal is ready to deal:

For the purposes of this post the TSQL shown is elementary (don’t be surprised by that), the point is really about SHUFFLE. So, I select the estimated plan for the following code.

SELECT SOD.[SalesOrderID],SOD.[ProductID], SOH.[TotalDue]
FROM [SalesLT].[SalesOrderDetail] SOD
JOIN [SalesLT].[SalesOrderHeader] SOH ON
SOH.[SalesOrderID] = SOD.[SalesOrderID]
WHERE SOH.[TotalDue] > 1000

Shuffle me once, why not shuffle me twice. If you REALLY want to see the EXPLAIN command output, then it looks like this snippet below.

The DSQL operation clearly states SHUFFLE_MOVE. Why am I getting this? What does it mean?

Shuffling data isn’t the worst thing in the world, but it is a fairly expensive operation all things considered.  Ideally, your warehouse architecture limits the number of shuffle operations, but considering that you can only hash on one key, sometimes it’s inevitable.

Comments closed

Spatial Workaround In Azure SQL Data Warehouse

Rolf Tesmer has you covered if you want to perform spatial queries against data in Azure SQL Data Warehouse:

Recently we had a requirement to perform SQL Spatial functions on data that was stored in Azure SQL DW.  Seems simple enough as spatial has been in SQL for many years, but unfortunately, SQL Spatial functions are not natively supported in Azure SQL DW (yet)!

If interested – this is the link to the Azure Feedback feature request to make this available in Azure SQL DW – https://feedback.azure.com/forums/307516-sql-data-warehouse/suggestions/10508991-support-for-spatial-data-type

AND SO — to use spatial data in Azure SQL DW we need to look at alternative methods.  Luckily a recent new feature in Azure SQL DB  in the form of Elastic Query to Azure SQL DW now gives us the ability to perform these SQL Spatial functions on data within Azure SQL DW via a very simple method!

Check out that Azure Feedback item if you’d like to see native spatial support rather than using elastic query.  In the meantime, click through to see Rolf’s workaround.

Comments closed

Configuring Azure SQL Analytics

Esat Erkec has a guide showing how to configure and use Azure SQL Analytics on Azure SQL Database:

The most important and challenging responsibility of a database administrator is monitoring performance metrics. Because monitoring performance and troubleshooting performance issues are considered to be difficult. For this reason, we need diagnostic and monitoring tools to measure performance counters and metrics. For Azure SQL there is a tool which is named SQL Analytics. With this tool, we can measure and monitor Azure SQL databases and elastic pools. At the same time, we can create alerts for notifications. SQL Analytics offers performance metrics in graphical form. In this article, we will learn how to enable Azure SQL Analytics.

This is a long and screenshot-filled post, which is helpful if you’re getting started.

Comments closed

DTUs Or vCores For Azure SQL DB

Denny Cherry looks at a new Azure SQL Database announcement:

Today Microsoft has announced that there is a new way to buy Azure SQL DB. If DTUs aren’t making sense to you, you’ll be happy to know that you can now simply select how many vCores you want for your SQL DB workload.  Now this will still require that you have an understanding on your workload to use this new vCore based way to buy Azure SQL DB, but Cores are a concept that is easy for people to talk about and wrap their heads around. Now this new model is only in preview at the moment, but I’m guessing that it’ll be around for a while in preview, then it’ll go GA as this new model makes sense.

I definitely prefer this model, as IT departments already understand the idea, whereas DTUs were nebulous at best.

Comments closed

Changing The SQL Server Port On Azure Container Services

Andrew Pruski shows how to change off of the default port for SQL Server when running Azure Container Services:

So, how do you do it when running SQL Server in Azure Container Services?

Well there’s a couple of options available.

The first one is to change the port that SQL is listening on in the container, open that port on the container, and direct to that port from the service.

The second one is to leave SQL Server listening on the default port and direct a non-default port to port 1433 from the service.

Read on to see Andrew try out both of these methods.

Comments closed

Azure Analysis Services Parallelism And Scale

Teo Lachev has a quick note on Azure Analysis Services and parallelism:

Anyone who’s done parallel programming knows that it’s not easy. Safe access and locks need to be used to synchronize access and protect shared resources. The AS xVelocity engine (aka Vertipaq) is a multi-threaded application that accesses RAM as a shared resource. As each thread competes for a global lock for memory allocation/deallocation, scalability decreases. Because processor cores reserve memory in chunks, called cache lines, additional synchronization is required when threads access memory location that are close to each other. Before SQL Server 2016 SP1, this was a serious issue with Tabular. As we’ve learned, starting with SQL Server 2016 SP1, Microsoft switched to using the Intel Threading Building Clocks (TBB) C++ library. Specifically, Tabular now uses the TBB scalable memory allocator. Mind you that TBB is not CPU-specific extensions, so any modern Intel CPU should get these benefits. “Analysis Services SP1 uses an Intel TBB-based scalable allocator that provides separate memory pools for every core. As the number of cores increases, the system can scale almost linearly” and “The Intel TBB-based scalable allocator is also expected to help mitigate performance problems due to heap fragmentation that have been shown to occur with the Windows Heap”. Further, starting with SP1, Tabular is NUMA aware with 4-node NUMA system but I don’t know if the AAS VMs are preconfigured for NUMA.

Teo doesn’t have any firm conclusions at this point, but his initial testing looks positive.

Comments closed