Press "Enter" to skip to content

Author: Kevin Feasel

Microsoft Fabric Data Warehouse in a Database Project

Kevin Chant creates a database project:

In this post I want to cover how you can share a Microsoft Fabric Data Warehouse Database Project with the new target platform.

Which is now possible thanks to the latest Azure Data Studio Insiders update. You can view the ‘Add projects support for Fabric DW‘ pull request in the public azuredatastudio GitHub repository.

Kevin takes us through creating the database project in Azure Data Studio and then using Azure DevOps or Azure Data Studio to deploy it back out.

Comments closed

ML with Keras and TensorFlow over Streaming Kafka Data

Paul Brebner gives us a streaming scenario for model training:

One of the goals of incremental learning is to train a model continuously from streaming data. Incremental learning from streaming data means you don’t need all the data in memory at once, and the model is as up-to-date as possible, which can matter for real-time use cases. The third driver for incremental learning that I mentioned in the previous blog is when there is concept drift in the data itself—but we’ll ignore this aspect for the time being. 

In the last blog we demonstrated batch training with TensorFlow, and mentioned that TensorFlow, being a neural network framework, has the potential for incremental learning—just like animals and people do. In this blog, we will set ourselves the task of using TensorFlow to demonstrate incremental learning from the same static drone delivery data set of busy/not busy shops that we used in the last blog. 

Read on to see the code, results, and warnings.

Comments closed

An Overview of Flink SQL

Martijn Visser continues a series on Kafka and Flink:

In the first two parts of our Inside Flink blog series, we explored the benefits of stream processing with Flink and common Flink use cases for which teams are choosing to leverage the popular framework to unlock the full potential of streaming. Specifically, we broke down the key reasons why developers are choosing Apache Flink® as their stream processing framework, as well as the ways in which they are putting it into practice. These range from streaming data pipelines to train ML models, to real-time inventory management in retail and predictive maintenance in manufacturing.

Next, we’ll dive into Flink SQL, which is a powerful data processing engine that allows developers to process and analyze large volumes of data in real time. We’ll cover how Flink SQL relates to the other Flink APIs and showcase some of its built-in functions and operations with syntax examples.

I’m naturally predisposed to blog posts which validate Feasel’s Law, so of course I was going to pick this one to recommend.

Comments closed

Selective Fire for Trigger Execution

Erik Darling holds trigger fire until he sees the whites in their eyes:

I was helping a client with an issue recently where they wanted to allow certain admin users to override changes currently in a table, but not allow anyone else to make changes.

The thing is, users had to be allowed to make other changes to the table, so it wasn’t something that could be handled easily with security features.

The example I’m going to show here is simplified a bit to get the code across to you, so keep that in mind.

This is a somewhat wacky scenario, but Erik does get it working.

Comments closed

Where Git Repositories Store File Versions

Julia Evans digs into a folder:

Hello! I was talking to a friend about how git works today, and we got onto the topic – where does git store your files? We know that it’s in your .git directory, but where exactly in there are all the versions of your old files?

For example, this blog is in a git repository, and it contains a file called content/post/2019-06-28-brag-doc.markdown. Where is that in my .git folder? And where are the old versions of that file? Let’s investigate by writing some very short Python programs.

Read on to learn how you can parse it all out. And this is also reason number 3 why you don’t want to commit a large file to Git: even if you delete that file later, the contents will live in the .git folder forever, or at least until you take some manual action to excise it from Git’s history.

Comments closed

Service Level Agreements (RPO and RTO) and SQL Server

David Klee wants to know how much downtime is acceptable to you:

Database professionals of the world – I have a question. Has your organization defined service level agreements (SLAs) for your data estate? I’m talking specifically the Recovery Point Objective (RPO) and Recovery Time Objective (RTO), and to have these defined not in an arbitrary number of nines, but in minutes or hours. If these aren’t defined from above, your business continuity plan is doomed to fail.

Read on to learn what RPO and RTO mean, how to think in terms of RPO and RTO, and some of David’s recommendations.

Comments closed

Row Level Security Anti-Patterns and Alternatives

Ben Johnston tells us why we might not want to use row level security in SQL Server:

One of the primary reasons to implement RLS is to facilitate reporting and ease the administrative burden. This section covers some considerations for using RLS with the primary Microsoft reporting engines and gives you an idea of things to look for in your reporting engine. Some anti patterns and alternatives to RLS are also examined.

This article goes a long way toward explaining why I find row level security so rare in the wild and never implemented it myself: most databases I’ve worked with are either transactional or hybrid OLTP/OLAP, they’re mostly multi-tenant, and they’re accessed through service accounts. That’s just a no-go across the board.

Comments closed

An Overview of Postgres Data Types

Arindam Mondal categorizes various Postgres data types:

This article will show PostgreSQL Data Types with various examples.

Data Types are an important part of a database. It represents values associated with it. Choosing the right data type for a table is one of the most important tasks because it determines the kind of data we want to store in a table. While creating a table you must specify a data type for each column. A column can store a specific type of data, like integer, string, Boolean, floating points, and so on. In this article, we are going to discuss PostgreSQL data types.

The list is quite similar to what’s available in SQL Server, though there are a few differences, such as built-in support for storing network addresses.

Comments closed

Multi-Plot Graphs in R

Steven Sanderson needs more than one line:

Data visualization is a crucial aspect of data analysis. In R, the flexibility and power of its plotting capabilities allow you to create compelling visualizations. One common scenario is the need to display multiple plots on the same graph. In this blog post, we’ll explore three different approaches to achieve this using the same dataset. We’ll use the set.seed(123) and generate data with x and y equal to cumsum(rnorm(25)) for consistency across examples.

Click through for three common techniques.

Comments closed