Press "Enter" to skip to content

Category: Cloud

Querying Serverless SQL Pools from Spark Notebooks in Scala

Jovan Popovic shows off one integration point between the data services in Azure Synapse Analytics:

Azure Synapse Analytics provides multiple query runtimes that you can use to query in-database or external data. You have the choice to run T-SQL queries using a serverless Synapse SQL pool or to use notebooks in Apache Spark for Synapse Analytics to analyze your data.

You can also connect these runtimes and run the queries from Spark notebooks on a dedicated SQL pool.

In this post, you will see how to create Scala code in a Spark notebook that executes a T-SQL query on a serverless SQL pool.

Read on to see how. You can also query Spark pool and dedicated SQL pool tables from serverless SQL pools.
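
If you just want the shape of the technique before clicking through, here is a minimal Scala sketch of the same idea: open a JDBC connection from a Spark notebook to the serverless endpoint and run a T-SQL query against files in the data lake. The workspace name, credentials, and storage path are placeholders, and Jovan's post covers the details (including better authentication options than a plain SQL login).

```scala
import java.sql.DriverManager

// Hypothetical serverless endpoint and SQL login -- substitute your own workspace values.
val url = "jdbc:sqlserver://myworkspace-ondemand.sql.azuresynapse.net:1433;database=master;encrypt=true"
val connection = DriverManager.getConnection(url, "sqladminuser", sys.env("SQL_PASSWORD"))

try {
  // Any T-SQL that serverless SQL pools support can go here; OPENROWSET over
  // Parquet files in the lake is the typical case.
  val resultSet = connection.createStatement().executeQuery(
    """SELECT TOP 10 *
      |FROM OPENROWSET(
      |    BULK 'https://mystorage.dfs.core.windows.net/data/*.parquet',
      |    FORMAT = 'PARQUET') AS rows""".stripMargin)

  while (resultSet.next()) {
    println(resultSet.getString(1))
  }
} finally {
  connection.close()
}
```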


Automating Workflows in Azure DevOps with Logic Apps

Elie Bou Issa does some no-code automation:

Azure Logic Apps is a cloud service to help you schedule, automate, and orchestrate tasks and workflows between apps and across enterprises and organizations. A Logic App can be built using the Azure portal, or infrastructure as code.

By the end of this article, you will have a good understanding of leveraging a Logic App for Azure DevOps to automate the creation of work items, in addition to creating an automated approval-based workflow using Office 365.

Click through for the demo. This is useful on its own, especially with non-technical product managers, but you can extend the use of Logic Apps quite a bit and automate more work without writing much code.


Updating or Creating a Measure for Power BI PPU without Full Publication

Gilbert Quevauvilliers follows up:

Following on from my previous blog post How to complete granular deployment of Power BI Desktop changes to the Power BI Service (Using PPU), I want to also show how to update or create a measure in my dataset, where I can deploy this via ALM Toolkit.

This now saves me from doing the following tasks:

– Time taken to refresh the PBIX file so that the data is up to date.
– Re-uploading my PBIX.
– Re-creating the incremental refresh, if configured.
– Time and effort to upload and wait for dataset refresh.
– Quick updates to my dataset.

I will not have to worry about saving my PBIX file and, if configured, re-creating the incremental refresh. This saves me a lot of time and effort.

Read on to see how.


Using Notebooks in Azure Machine Learning Studio

Lina Kovacheva takes us through the process of working with notebooks in Azure Machine Learning Studio:

I discovered Jupyter notebooks not that long ago, but the more I use them the more I see how powerful they can be. For those of you who are not familiar with Jupyter Notebook: it is an open-source web application where you can combine code, output, visualizations and explanatory text all in one document, allowing you to write code that tells a story. Now that you have an idea of what a Jupyter notebook is, I will walk you through how you can use it in Azure Machine Learning Studio.

Click through for the process. One advantage to notebooks in an environment like Azure ML over Azure Data Studio is that you have a much wider variety of languages, although Azure Data Studio has a SQL Server kernel, which other platforms currently do not have.


HammerDB CLI for Oracle Running on Azure

Kellyn Pot’vin-Gorman goes through a rough experience:

Disclaimer: I’m not a big fan of benchmark data. I find it doesn’t provide us as much value in the real world as we’d like to think it does. As Cary Millsap says, “You can’t hardware your way out of a software problem,” and I find that many folks think that if they just get the fastest hardware, their software problems will go away, and this just isn’t true. Sooner or later, it’s going to catch up with you, and it rarely tells you what your real database workload needs to run most efficiently or what might be running in your database that could easily be optimized to save the business time and money.

The second issue is that when comparing different workloads or even worse, different platforms or applications, using the same configuration can be detrimental to the benchmarks collected, which is what we’ll discover in this post.

That said, Kellyn dives into the problem and documents several of the issues in building out this test.


Deploying an Azure Arc Enabled Data Services Controller

Chris Adkin continues a series:

If you have been following this series, you should have:

– a basic understanding of Terraform
– a Kubernetes cluster that you can connect to using kubectl
– a basic understanding of Kubernetes services
– a working MetalLB load balancer
– a basic understanding of how storage works in the world of Kubernetes
– a Kubernetes storage solution in the form of PX-Store; alternatively, you can use any solution (for the purposes of this series) that supports persistent volumes, though to use the backup solution in part 9 of the series you will need something that supports CSI

From here, Chris explains the importance of the data controller and then deploys one.


Auto-Failover Groups and Grace Periods

Taiob Ali clears up some misunderstanding:

The auto-failover groups feature for Azure SQL Database can be configured with an automatic failover policy. Azure triggers failover after the failure is detected and the grace period has expired. The grace period is determined by a setting called ‘GracePeriodWithDataLossHours’, which cannot be set to less than one hour. Why is it not allowed to set a time of less than an hour? Can your business tolerate the application being down for that period? Should you turn off automatic failover and set it to manual?

I noticed a lot of confusion around this setting, including my own. Some of the confusion is due to a lack of clarity in the documentation. I checked with the Microsoft Azure SQL team, and they are actively working on clarifying some of the questions I raised.

I want to thank Dimitri Furman and Roberto Bustos from the Azure SQL Team for clarifying some of my confusion that I will share here.

Read on for a Q&A style explanation of auto-failover and grace periods.


Retrieving Counts of Cosmos DB Collections

Manoj Pandey shows how you can retrieve counts of records in Cosmos DB using the .NET client:

Here in this post we will use C# .NET code (for beginners like me) to see how to:
1. Connect to a Cosmos DB instance
2. Get list of all Databases in a Cosmos DB
3. Iterate through all the Databases and get the list of all Collections (or Tables)
4. Get COUNT of all documents/items (or records) in these Collections

Click through to see how.
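
Manoj's walkthrough uses the .NET client, but the same four steps translate to other SDKs as well. As a rough sketch only, here is what it could look like from Scala with the azure-cosmos Java SDK, assuming key-based authentication and a made-up account endpoint; treat it as illustrative rather than a drop-in replacement for the post's C# code.

```scala
import com.azure.cosmos.CosmosClientBuilder
import com.azure.cosmos.models.CosmosQueryRequestOptions

// Hypothetical endpoint; the account key is read from an environment variable.
val client = new CosmosClientBuilder()
  .endpoint("https://myaccount.documents.azure.com:443/")
  .key(sys.env("COSMOS_KEY"))
  .buildClient()

// List every database in the account.
val databases = client.readAllDatabases().iterator()
while (databases.hasNext) {
  val database = client.getDatabase(databases.next().getId)

  // Iterate the containers (collections) in each database.
  val containers = database.readAllContainers().iterator()
  while (containers.hasNext) {
    val container = database.getContainer(containers.next().getId)

    // COUNT of all documents in the container.
    val count = container
      .queryItems("SELECT VALUE COUNT(1) FROM c", new CosmosQueryRequestOptions(), classOf[Integer])
      .iterator()
      .next()

    println(s"${database.getId}.${container.getId}: $count documents")
  }
}

client.close()
```

Using SELECT VALUE COUNT(1) pushes the counting to the service, so each container returns a single number rather than every document.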


Using Azure Functions to Tag Resources

Jess Pomfret shows off an interesting way of using Azure Functions to apply tags to resources:

In part one I discussed how useful Azure tags can be, specifically how adding a ‘dateCreated’ tag can help you keep track of your resources, and how to find resources with certain tags using PowerShell. Parts 2 and 3 are based around the fact that adding the ‘dateCreated’ tag is a great idea, but relying on a human to remember to add it is less than ideal. In part 2 we looked at using Azure Policy to automatically add the tag. Today’s post will cover another option using Azure Functions.

Azure Functions gives us a way of running serverless code, written in a number of different languages, triggered by specific events or timings. Looking through the documentation, there are many use cases, from processing files to analysing IoT workstreams. Our use case is to run a PowerShell script that tags any resources that are missing the ‘dateCreated’ tag.

Click through to see how.


Hyperparameter Tuning in Azure Machine Learning

Dinesh Asanka takes us through hyperparameter tuning with Azure Machine Learning’s designer:

In the above experiment, both the previous model and the model with TMH are included so that we can compare the two. The Tune Model Hyperparameters control is inserted between the Split Data and Score Model controls as shown. The TMH control has three inputs: the first needs the relevant technique, in this scenario the Two-Class Logistic Regression technique; the second needs the training data set; and the last needs the evaluation data set, for which the test data set can be used.

The Tune Model Hyperparameters control provides the best combination, and it is connected to the Score Model control. After the test data stream is connected to the Score Model, its output is connected to the second input of the Evaluate Model control so that the previous model and the tuned model can be compared.

I’m not sure if there’s something handled internally in the Tune Model Hyperparameters component, but based on the pipeline images, I’d actually want two separate Split components so that I ended up with something more like 50-20-30 for training, hyperparameter testing, and validation. The first two pipelines appear to be 70-30-0 instead, and so can give you a false sense of confidence in model quality.
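
For readers who think better in code than in designer pipelines, here is roughly what that three-way split looks like outside the designer, sketched with Spark ML's TrainValidationSplit rather than Azure ML; the DataFrame df, column names, and grid values are all hypothetical. The idea is to hold out a final test set first, then let the tuner carve a validation set out of the remainder.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit}

// df is a hypothetical DataFrame with "features" and "label" columns.
// Hold out 30% as a final test set that the tuner never sees.
val Array(trainAndTune, test) = df.randomSplit(Array(0.7, 0.3), seed = 42)

val lr = new LogisticRegression()
val grid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.01, 0.1, 1.0))
  .addGrid(lr.elasticNetParam, Array(0.0, 0.5, 1.0))
  .build()

// Within the remaining 70%, use 70/30 for training vs. hyperparameter
// validation -- roughly 50-20-30 of the original data overall.
val tuner = new TrainValidationSplit()
  .setEstimator(lr)
  .setEvaluator(new BinaryClassificationEvaluator())
  .setEstimatorParamMaps(grid)
  .setTrainRatio(0.7)

val model = tuner.fit(trainAndTune)

// Scoring the untouched test set gives an honest estimate of model quality.
val auc = new BinaryClassificationEvaluator().evaluate(model.transform(test))
println(s"Test AUC: $auc")
```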
