Oracle’s Automatic Workload Repository Explained

Kellyn Pot’vin-Gorman explains to us what the Automatic Workload Repository is:

The Automatic Workload Repository (AWR) has been around since Oracle 10g and requires the diagnostic and tuning management pack licensing to use all of its features in Oracle’s Enterprise Edition database. Versions before 10.2.0.4 had limited collections compared to the modern reporting schema, and every subsequent release of Oracle has added to its content, which explains the increase in the size and number of objects stored in the SYSAUX tablespace.

By default, and since version 11.2.0.4, the AWR retention is 8 days, with an automatic snapshot taken once per hour. It’s common for DBAs to raise this retention to at least 31 days to capture a month of workload information, and these snapshot identifiers can then be used to identify workload intervals for querying and reporting. Oracle can also be configured to shorten the interval between snapshots to change the granularity of the AWR reports, or, my preference, the DBA or a privileged user can take manual snapshots to mark the beginning or end of an important period.
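
If you want to tweak those settings yourself (licensing permitting), the knobs live in the DBMS_WORKLOAD_REPOSITORY package. A quick sketch; both retention and interval are in minutes:

-- Check the current snapshot interval and retention
SELECT snap_interval, retention FROM dba_hist_wr_control;

-- Raise retention to 31 days and tighten the interval to 30 minutes
BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    retention => 31 * 24 * 60,
    interval  => 30);
END;
/

-- Take a manual snapshot to mark the start or end of an interesting workload period
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;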

Kellyn goes into a good amount of detail in this post and, based on the title, promises at least a part 2. Though this could be a History of the World: Part I trick Kellyn is playing on us.

Metadata-Driven ADF Pipelines

Paul Andrew wraps up a series on metadata-driven processing of Azure Data Factory pipelines. Part 3 covers the ADF wrapper necessary for our custom pipelines:

Firstly, to help guide this post, below is a mock-up of our Data Factory pipelines and activities to show the end goal. Hopefully this view informs how things are going to be connected using what I call a pipeline hierarchy system and how they will work in the overall framework. For our metadata processing framework, we can make the following category distinctions about the activities represented:

Grandparent – This is the top level orchestration of our wider data platform solution. Here a scheduled trigger could be connected or processing in our solution grouped into natural areas. Technically this level isn’t required for our processing framework, but I’ve included it as good practice.
Parent – Our parent pipeline’s primary purpose (try saying that fast 3 times 🙂 ) is to handle the stages of our processing framework. The stages will then be passed off sequentially to our child pipeline using another Execute Pipeline activity.
Child – At this level in the framework, the child is hitting the Azure Function to call the lowest level executors, or the pipelines that we want to actually do the work in our data platform solution. In my previous post I added some example metadata to call pipelines named Stage X-X. These in turn relate to a set of empty placeholder pipelines that contained only Wait activities.

Part 4 puts it all together:

For the end-to-end run of the framework we have a few options to see progress once it’s been triggered. Before that, it’s worth pointing out that in the below I’ve used the sample metadata provided with the database scripts in GitHub. Then, for each execution pipeline, I’ve added a single Wait activity with a random time delay of a few seconds. The point here is to test the framework execution, not the pipelines being called. To further clarify, the Stage X-X pipeline names should be replaced with your actual pipeline names in your data platform solution.

Definitely worth the read.

Syncing Mobile Apps via Change Tracking

Davide Mauri shows how we can perform data synchronization using change tracking in Azure:

Sending data from the cloud to the app is way more tricky. You want to do it in the most efficient way, to spare bandwidth and device battery life, so you need a way to know what has changed since the last time that specific user and device synced. As data is surely stored in a database of some sort, you also need some efficient method on the database side to make sure you can quickly get everything that is new or changed and that is in the scope for that specific user/device. If your mobile application is successful, this means that you may literally have millions and millions of rows or documents to scan and check for changes.

Not an easy task: all hope is lost then? Just send back the whole data set and that’s it? Of course not! We don’t want to just be developers, but better developers, right?

Modern databases can help a lot in tackling this challenge. Azure SQL, for example, has a feature called Change Tracking that (guess what?) will take care of keeping track of changes for you.
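
If you haven’t seen Change Tracking before, the moving parts are small. Here’s a minimal T-SQL sketch (the table and column names are made up for illustration, not Davide’s):

-- Turn on change tracking for the database and for the table you want to sync
ALTER DATABASE CURRENT
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);

-- The table needs a primary key; CHANGETABLE exposes the key columns
ALTER TABLE dbo.ToDoItems
ENABLE CHANGE_TRACKING;

-- At sync time, the device sends the last version it saw and gets back only the deltas
DECLARE @last_sync_version bigint = 42;   -- stored per user/device on the client side

SELECT ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION, t.*
FROM CHANGETABLE(CHANGES dbo.ToDoItems, @last_sync_version) AS ct
LEFT JOIN dbo.ToDoItems AS t
    ON t.Id = ct.Id;          -- deleted rows come back with no matching base row

-- Hand this back to the device as its new high-water mark
SELECT CHANGE_TRACKING_CURRENT_VERSION();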

Davide includes a lot of detail and even a sample application on GitHub.

Troubleshooting Azure SQL DB Elastic Jobs

Kate Smith wraps up a series on elastic jobs in Azure SQL Database:

This error means that the Elastic Job Agent cannot connect to the target server(s) because the target has some firewall rules blocking the connection requests.  Indeed – it is required that every target in the target group allows connections from Azure Services in order for Elastic Jobs to work.  To fix this, I go to the target server in the Azure Portal and click on the “Firewalls and virtual networks” item under “Security”.  Next, I toggle the “Allow Azure services” from OFF to ON, and save my changes.  
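
If you’d rather script that fix than click through the portal, the “Allow Azure services” toggle corresponds to a special 0.0.0.0 firewall rule, which you can create from the master database of the target logical server. A sketch:

-- Run in master on the target logical server; the 0.0.0.0 - 0.0.0.0 rule is
-- what the portal's "Allow Azure services" toggle creates behind the scenes.
EXECUTE sp_set_firewall_rule
    @name = N'AllowAllWindowsAzureIps',
    @start_ip_address = '0.0.0.0',
    @end_ip_address = '0.0.0.0';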

This has been an interesting series to read through, even though I don’t do much at all with Azure SQL Database.

Backing Up Cosmos DB

Josh Smith shows us how to back up a Cosmos DB collection:

Now that ADF is a trusted service I wanted to document the state of my current solution since I’ve been able to dump the hack-y PowerShell script I put together. I haven’t been able to get the level of abstraction I’d really like to see but overall I think I’m pretty happy with the solution (and I still get to include a hack-y PowerShell script). My solution consists of

– a control pipeline,
– a notification pipeline and
– one pipeline for every Cosmos DB service I want to work with. (This is because I wasn’t able to figure out a way to abstract the data source connection for the Copy Data task.)

Read on for the solution.

Creating Triggers with Cosmos DB

Hasan Savran shows how you can create a trigger in Cosmos DB:

You have options if you need to use any type of trigger in Cosmos DB. There are two types of triggers in Cosmos DB. The first one, which I will cover here, is the regular triggers, which can be executed before (Pre-Triggers) or after (Post-Triggers) an operation. This type of trigger is written in JavaScript, and you need to register them to a collection just like stored procedures. The second type of trigger can be created by Azure Functions, and you can find more information about them in my older posts.

Pre-Triggers and Post-Triggers do not take any input parameters. Since Cosmos DB needs to do more work to execute triggers, you will end up with higher Request Units for your queries. They may be named triggers, but they do not get executed automatically with every operation; you need to call them programmatically if you want to run them. If a trigger throws an error for any reason, the transaction will roll back and the data will not be saved to the database.

Naturally, triggers are going to have a performance impact on your system regardless of the choice of data platform.

Running and Scheduling Azure SQL DB Elastic Jobs

Kate Smith continues a series on Azure SQL Database Elastic Jobs:

In previous posts, I have demonstrated how to create an Elastic Jobs Agent, setup credentials for Elastic Jobs, create a target group of servers/databases for the agent, and how to create and define an elastic job using both PowerShell and T-SQL.

In this post, I drill down into how to run an Elastic Job both in an ad-hoc fashion and how to schedule a job to run regularly. I do this both for PowerShell and for T-SQL.

The PowerShell version is a one-liner and the T-SQL version looks a good bit like it does with SQL Agent jobs.
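
For reference, the T-SQL side (run from the job database) looks roughly like this; the job name here is just a placeholder:

-- Kick off a job ad hoc
EXEC jobs.sp_start_job @job_name = N'MyElasticJob';

-- Or put it on a recurring schedule, e.g. every 24 hours
EXEC jobs.sp_update_job
    @job_name = N'MyElasticJob',
    @enabled = 1,
    @schedule_interval_type = N'Hours',
    @schedule_interval_count = 24;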

Building Metadata for an ADF Pipeline

Paul Andrew continues a series on Azure Data Factory and metadata-driven pipelines:

Welcome back friends to part 2 of this 4 part blog series. In this post we are going to deliver on some of the design points we covered in part 1 by building the database to house our processing framework metadata.

Let’s start with a nice new shiny Azure SQLDB database and schema. This can easily be scaled up as our calls from Data Factory increase and ultimately the solution we are using the framework for grows.

Soon we will get to see the Azure Data Factory power in action.
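
Paul’s actual schema is in his GitHub scripts, but to give a flavour of the idea, the metadata boils down to something like a table of stages and the worker pipelines to call in each one. This is purely an illustrative sketch, not his real DDL:

CREATE SCHEMA metadata;
GO

CREATE TABLE metadata.PipelineProcess
(
    ProcessId    int IDENTITY(1,1) PRIMARY KEY,
    StageNumber  int           NOT NULL,  -- stages are run sequentially by the parent pipeline
    PipelineName nvarchar(200) NOT NULL,  -- worker pipeline invoked by the child pipeline
    Enabled      bit           NOT NULL DEFAULT 1
);

-- Sample rows mirroring the Stage X-X placeholder pipelines from part 3
INSERT INTO metadata.PipelineProcess (StageNumber, PipelineName)
VALUES (1, N'Stage 1-1'), (1, N'Stage 1-2'), (2, N'Stage 2-1');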

Secure Azure Data Source Access from Databricks

Bhavin Kukadia, Abhinav Garg, and Michal Marusan show us the right way to access Azure data sources from Azure Databricks:

Enterprise Security is a core tenet of building software at both Databricks and Microsoft, and thus it’s considered a first-class citizen in Azure Databricks. In the context of this blog, secure connectivity refers to ensuring that traffic from Azure Databricks to Azure data services remains on the Azure network backbone, with the inherent ability to whitelist Azure Databricks as an allowed source. As a security best practice, we recommend a couple of options which customers could use to establish such a data access mechanism to Azure data services like Azure Blob Storage, Azure Data Lake Store Gen2, Azure Synapse Data Warehouse, Azure Cosmos DB, etc. Please read further for a discussion on Azure Private Link and Service Endpoints.

This is more about network configuration rather than things like “store your credentials and other secrets in Azure Key Vault,” which is also a good idea.

Creating the Elastic Job Agent for Azure SQL Database

Kate Smith continues a series on Elastic Jobs in Azure SQL Database:

There is no way to create the Elastic Job Agent in T-SQL. I have already shown how to do this in PowerShell. To do this in the Azure Portal, go to Home, click the box that says “+ Create a Resource”, then search in the box for Elastic Job Agent. Select that, and then follow the steps in the portal to create the agent.

After creating the agent, Kate then shows how to set up credentials, target groups, and jobs.
