Press "Enter" to skip to content

Category: Cloud

Using Extended Events with AWS RDS

Grant Fritchey tries out extended events in Amazon’s RDS:

AWS has posted the documentation on what you have to do in order to enable the collection of Extended Events within RDS. Normally, I’d follow along with the documentation. However, I’m going to approach this like I knew that Extended Events support was there, but I wasn’t aware of the docs. So, I’m starting in SSMS and I’m just going to try plugging in the Extended Events GUI to see what happens. Further, I’m going to use the simplest method for launching Extended Events, XEvent Profiler.

Read on for Grant’s findings.
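
For context, the XEvent Profiler sessions Grant launches sit on top of ordinary Extended Events DDL, which you can also run yourself against an RDS instance. A minimal sketch, assuming Extended Events has been enabled for the instance; the session name here is illustrative:

-- Create a simple session with an in-memory target (avoids needing
-- file system access on the RDS host), then start it.
CREATE EVENT SESSION [rds_statement_watch] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
ADD TARGET package0.ring_buffer;
GO

ALTER EVENT SESSION [rds_statement_watch] ON SERVER STATE = START;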

Alerting on Azure Budget Thresholds

Daron Yondem makes a budget:

You can’t imagine how many of us forget to set up the proper alerting mechanisms for our cloud subscription consumption. Here is how to do it in Azure in under 2 minutes.

Read on for the answer. I do like Azure’s budgeting tools except for one big thing: you can’t set a cap. Alerting is great, but I want a “break glass in case of emergency” capability to stop spend altogether once you hit a certain point. I wouldn’t use it in production, but for personal or development accounts, that’s big. And you can do it, but only on a subscription that uses Azure credits; as soon as dollars are involved, there are no caps.

Securing Azure Storage

Craig Porteous continues a series on Azure Data Platform security:

This is the third in a series where I look at all of the resources common to a Data Lakehouse platform architecture and what you need to think about to get it past your security team.

Building upon Azure Databricks, I’ll move from the compute engine to our blob and data lake storage. Things are a little simpler to secure but the plethora of options available can have significant impacts on usability and cost so it’s important to understand the impact before baking them into your design.

Read on for some good advice around securing Azure storage accounts.

High Availability in SQL Managed Instance General Purpose Tier

Niko Neugebauer clears up what options you have for high availability in SQL MI’s General Purpose tier:

The two main requirements around high availability are commonly known as RTO and RPO.

RTO – stands for Recovery Time Objective and is the maximum allowable downtime when a failure occurs. In other words, how much time it takes for your databases to be up and running.

RPO – stands for Recovery Point Objective and is the maximum allowable data loss when a failure occurs. Of course, the ideal scenario is not to lose any data, but a more realistic (and also ideal) scenario is to not lose any committed data, also known as Zero Committed Data Loss.

With those definitions out of the way, read on to learn more.

S3 and Redshift Data Movement with Role Chaining

Sudipta Mitra, et al, talk AWS security:

This post presents an approach that you can apply at scale to achieve fine-grained access controls to resources in S3 buckets and Amazon Redshift schemas for tenants, including groups of users belonging to the same business unit down to the individual user level. This solution provides tenant isolation and data security. In this approach, we use the bridge model to store data and control access for each tenant at the individual schema level in the same Amazon Redshift database. We utilize ASSUMEROLE and role chaining to provide fine-grained access control when data is being copied and unloaded between Amazon Redshift and Amazon S3, so the data flows within each tenant’s namespace. Role chaining also streamlines the new tenant onboarding process.

Read on for an overview and tutorial.
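
As a sketch of the role-chaining syntax involved (all ARNs, schema, and group names below are placeholders): Redshift accepts a comma-separated chain of role ARNs, and GRANT ASSUMEROLE controls which users or groups may use a given chain.

-- Allow one tenant's group to use a specific role chain for COPY
GRANT ASSUMEROLE
ON 'arn:aws:iam::111111111111:role/redshift-role,arn:aws:iam::111111111111:role/tenant-a-role'
TO GROUP tenant_a
FOR COPY;

-- The same chain is passed to COPY, so the data flow stays within
-- the tenant's namespace
COPY tenant_a.orders
FROM 's3://example-bucket/tenant-a/orders/'
IAM_ROLE 'arn:aws:iam::111111111111:role/redshift-role,arn:aws:iam::111111111111:role/tenant-a-role';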

Persisting Data in Azure Redis Cache

Arun Sirpal feeds the mogwai after midnight:

I mentioned before that you could use the idea of data persistency to rebuild your data from total failure. There are two types: RDB and AOF.

RDB – persists a snapshot of your cache in a binary format. The snapshot is saved in an Azure Storage account.

AOF – saves every write operation to a log. The log is saved at least once per second into an Azure Storage account.

I’m a big proponent of using Redis as a caching service. I’m not a big proponent of using Redis as a persisted database, mostly because I’ve had a lot of bad experiences with persistent Redis…

Change Data Capture in Azure SQL Database

Abhiman Tiwari announces that CDC has gone GA:

CDC is now generally available on Azure SQL databases, enabling customers to track insert / update / delete data changes on their Azure SQL Database tables. On Azure SQL database, CDC offers a similar functionality to SQL Server and Azure SQL Managed Instance, providing a scheduler which automatically runs change capture and cleanup processes on the change tables. These capture and cleanup processes used to be run as SQL Server Agent jobs on SQL Server on premises and on Azure SQL Managed Instance, but now they run automatically through the scheduler in Azure SQL databases. Customers can still run scans and cleanup manually on demand.

Looks like it works pretty much the same as on-premises SQL Server, so it’s got that going for it.
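
The T-SQL surface is the same as on-premises; a minimal sketch, with illustrative schema and table names:

-- Enable CDC for the database, then for a specific table
EXEC sys.sp_cdc_enable_db;
GO

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL; -- NULL means access to change data is not gated by a database role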

Deploying an Azure Function via Azure DevOps

Koen Verbeeck wants to deploy a PowerShell-based Azure Function:

In the blog post Azure Function with PowerShell and the Power BI REST API I explained how you could create an Azure Function using the PowerShell scripting language. This Function connected with the Power BI REST API and retrieved the last refresh status of a dataset. Developing the Function is one thing, deploying it is another. In this blog post I’ll guide you through the setup of a build and release pipeline in Azure DevOps. As a prerequisite, the Azure Function and its dependencies (for example the requirements.psd1 file) are all checked into a Git repo. As a reminder, the folder structure looks like this:

Read on for the walkthrough.

Azure Databricks Security Considerations

Craig Porteous provides some advice on configuring Azure Databricks:

Azure Databricks is an analytics platform and often serves as the central compute component of a data platform, to process ETL/ELT data pipelines and data science workloads. As Databricks is a third-party platform-as-a-service offering, securing it works differently from most other first-party services in Azure; for example, we can’t use private endpoints. (More on these in the Azure Storage post)

The two main approaches to working with Databricks in our secure platform are VNet Peering and VNet Injection.

Click through to learn the difference between these two, as well as a few other factors to keep in mind as you’re deploying Databricks.
