Press "Enter" to skip to content

Category: Cloud

Azure Elastic Queries, Jobs, and Transactions

Steve Hughes walks us through three Elastic concepts in Azure:

Elastic queries allow developers to interact with data from multiple databases supported on the Azure SQL database platform including Synapse. Elastic queries are often referred to as Polybase which is currently implemented in SQL Server 2019 and Azure Synapse. The key difference is that elastic queries only allow you to interact with other Azure SQL Databases but not Hadoop or other database implementations (e.g. Teradata or Oracle). Part of the confusion comes from the fact that the implementation looks very similar. Both toolsets use external tables in SQL Server to interact with the connected data sources. However, Polybase requires additional components to run whereas elastic queries are ready to go without additional setup.

Read on for more information, including demos.

Comments closed

Tips for Using Azure Table Storage

Adrian Hills takes us through using Azure Table Storage:

Azure Table Storage is a NoSQL key-value PaaS data store that can be a great option for highly scalable, highly available systems. It supports storing petabytes of data and a flexible data schema, meaning different entities in the same table can have different schemas. References to NoSQL databases having “flexible schema” or being “schema-less” can give the impression that database schema design is a thing of the past and that you can bypass it and focus more on the application code. The reality is, even in this NoSQL world, schema design is very important and if you don’t give it due care and attention, then it can come back to bite you.

If you have a RDBMS background and are new to Azure Table Storage, it’s common to find yourself “thinking in SQL” and trying to solve database modeling requirements with a SQL approach before then trying to translate that to a key-value mindset. In this blog post, I’ll cover some of the fundamentals of Azure Table Storage and dive into some common questions you might find yourself asking about Azure Table Storage. Where code samples or references are applicable in this blog post, we’ll be focusing on .NET and using the Azure SDK (specifically relating to the Microsoft.Azure.Cosmos.Table nuget package).

Read on for the full story.

Comments closed

Self-Service with Azure Synapse Analytics

Paul Andrew lays out an interesting idea:

I’ve been playing around with Azure Synapse Analytics for a while now exploring the preview features and trying to find a meaningful use case for the ‘single pane of glass’ capabilities. In this post I’m exploring one possible option/idea for creating a very simple self service approach to dataset ingestion and consumption. Full disclosure, the below is far from technical perfection for lots of reasons, I mainly wanted to put something out there as an idea and use it to maybe start a conversation.

Click through to see Paul’s take on the matter.

Comments closed

The Raw Facts on Azure SQL DB Serverless

Taiob Ali gives us a briefing summary on Azure SQL Database Serverless:

Occasionally, load balancing automatically occurs if the machine cannot satisfy resource demand within a few minutes. For example, if the resource demand is 4 vCores, but only 2 vCores are available, it may take up to a few minutes to load balance before 4 vCores are provided. The database remains online during load balancing except for a brief period at the end of the operation when connections are dropped.

Click through for more points along these lines.

Comments closed

Azure Site-to-Site VPN Blocking Certain Traffic

Denny Cherry diagnoses a network configuration issue:

I ran across an interesting a couple of weeks ago when working with a client. The client has several subsidiaries each with their own vNet. The client had a site to site VPN been the Azure vNets. All traffic was successfully crossing the Azure Site to Site VPN as expected. The sticking point was that a software licensing server running in one of the subsidiaries Azure infrastructure configurations. The software licensing software simply wasn’t working.

Click through to learn why.

Comments closed

Optical Character Recognition with Tesseract and Databricks

Alex Aleksandrov takes a look at optical character recognition with the Tesseract library:

The topic of Optical Character Recognition (OCR) is not an unexplored field to the Adatis audience. Some Adati like Kalina Ivanova (link1link2) and Francesco Sbrescia (link3) have already explored this topic from the perspective of Azure Cognitive Services and Azure Data Lake. In my first blog, I would like to explore this topic from a different perspective: using Tesseract and Databricks.

Click through for instructions.

Comments closed

Indexing S3 Data with NiFi and CDP Data Hubs

Eva Nahari, et al, walk us through text indexing of S3 data with Solar, NiFi, and Cloudera Data Platform:

Data Discovery and Exploration (DDE) was recently released in tech preview in Cloudera Data Platform in public cloud. In this blog we will go through the process of indexing data from S3 into Solr in DDE with the help of NiFi in Data Flow. The scenario is the same as it was in the previous blog but the ingest pipeline differs. Spark as the ingest pipeline tool for Search (i.e. Solr) is most commonly used for batch indexing data residing in cloud storage, or if you want to do heavy transformations of the data as a pre-step before sending it to indexing for easy exploration. NiFi (as depicted in this blog) is used for real time and often voluminous incoming event streams that need to be explorable (e.g. logs, twitter feeds, file appends etc).

Our ambition is not to use any terminal or a single shell command to achieve this. We have a UI tool for every step we need to take. 

Click through to see how well they do at that.

Comments closed

Durable Azure Functions and Azure Data Factory

Rayis Imayev wants to use Azure Functions with Azure Data Factory:

Ok, here is my problem: I have an Azure Data Factory (ADF) workflow that includes an Azure Function call to perform external operations and returns output result, which in return is used further down my ADF pipeline. My ADF workflow (1) depends on the output result of the Azure Function call; (2) plus a time efficiency of the Azure Function call is another factor to consider, if its time execution hits 230 seconds or more, ADF Azure Function will fail with a time-out error message and my workflow is screwed.

This gave Rayis the impetus to try out durable functions. Read on to see how that worked out.

Comments closed