Press "Enter" to skip to content

Day: October 13, 2022

Training a Language Transformer Model

Stefania Cristina continues a series on building a language transformer:

We have put together the complete Transformer model, and now we are ready to train it for neural machine translation. We shall be making use of a training dataset for this purpose, which contains short English and German sentence pairs. We will also be revisiting the role of masking in computing the accuracy and loss metrics during the training process. 

In this tutorial, you will discover how to train the Transformer model for neural machine translation. 

Read on for the process, including a lot of code.


Azure Stream Analytics Job Diagram Simulator

Alex Lin points out some functionality to understand Azure Stream Analytics jobs:

Azure Stream Analytics (ASA) uses a SQL query language that has been augmented with powerful temporal operators to analyze data streams. One way to optimize a Stream Analytics job’s performance is to leverage parallelism in query. A parallel job divides the workload of queries into multiple streaming nodes and reduces the overall execution time.  

We strongly recommend that you use the Azure Stream Analytics Tools extension in Visual Studio Code (VS Code) for testing and debugging queries locally. For a job to be parallel, you need to align partition keys between all inputs, query steps, and outputs. The Job Diagram Simulator in VS Code allows you to simulate a job running topology with different streaming units (SUs) and provides suggestions for improving your query parallelism. 

Click through to see how it works and the kinds of information you can get from it.
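To give a rough sense of what an embarrassingly parallel query looks like, here's a minimal sketch (the input, output, and column names here are hypothetical, not taken from Alex's post):

-- Assumes the input is partitioned on PartitionId and the output's
-- partition key is aligned with it, so each node works independently.
SELECT TollBoothId, PartitionId, COUNT(*) AS EntryCount
INTO [CountOutput]
FROM [TollInput] PARTITION BY PartitionId
GROUP BY TumblingWindow(minute, 3), TollBoothId, PartitionId

On compatibility level 1.2 the explicit PARTITION BY on the input is no longer required, but the idea is the same: every step keys on the same partition column.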


Updates with Nested REPLACE Functions

Chad Callihan needs to make an update:

Have you ever run into a situation where you needed to replace both ends of a string? Maybe it was tags in a set of XML values or phrases at the beginning and end. I came across that situation recently and was able to show if two statements were needed or if REPLACE could be used against the same column in the same statement. Let’s look at a quick example and see what’s possible.

Chad’s solution is one I use fairly frequently. I agree that it’s not pretty, though one thing I like to do is tokenize code, so I might have something that looks like:

-- Procedure input parameters here
@FirstName NVARCHAR(150),
@FavoriteColor NVARCHAR(30)

-- Guts of procedure here
DECLARE @msg NVARCHAR(4000) = N'Hello, {FIRST_NAME}!  Your favorite color is {FAVORITE_COLOR}!';

SELECT
    REPLACE(REPLACE(@msg,
        '{FIRST_NAME}', @FirstName),
        '{FAVORITE_COLOR}', @FavoriteColor);
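With @FirstName = N'Kevin' and @FavoriteColor = N'blue', that returns Hello, Kevin!  Your favorite color is blue!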

It’s definitely not the prettiest, but when you have several tokens to replace, it’s a lot easier to read than building the string all at once.


Portworx and Kubernetes Storage Failover

Andrew Pruski digs into a problem:

In a nutshell, the issue is that the attachdetach-controller in Kubernetes won’t detach storage from an offline node until that node is either brought back online or is removed from the cluster. What this means is that a pod spinning up on a new node that requires that storage can’t come online.

Aka, if you’re running SQL Server in Kubernetes and a node fails, SQL won’t be able to come back online until someone manually brings the node online or deletes the node.

Not great tbh, and it’s been a blocker for my PoC testing.

However, there are ways around this…one of them is by a product called Portworx which I’m going to demo here.

After a short disclaimer, there’s plenty of good content.


Power Automate Trigger on Power BI Dataset Refresh Completion

Imke Feldmann wants to know when the work is done:

If you have been working with Power BI for a while now you might have come across the standard Power BI actions in Power Automate. They allow you to automate a nice bunch of Power BI processes. But while there is a trigger that starts a flow when a Power BI dataflow action has completed, no such trigger is available for when a Power BI dataset action has (successfully) completed. So here I will present a workaround that is still possible without a premium Power Automate license.

Read on for that workaround.


Dynamic SQL and String Data Types

Erik Darling theorizes about strings:

If you write the good kind of dynamic SQL, that is:

1. Parameterized

2. Executed with sp_executesql

You’ll probably have run into some silly-ish errors in the practice. Namely, that sp_executesql expects your SQL string and your Parameter string to be NVARCHAR(…).

Read on for some thoughts on data types, max-ness, and the like.
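For illustration, here's a minimal sketch of the parameterized pattern (the table and column names are made up for this example):

-- Both the statement and the parameter definition list need to be NVARCHAR;
-- NVARCHAR(MAX) keeps longer strings from getting truncated.
DECLARE @sql NVARCHAR(MAX) = N'SELECT * FROM dbo.Users WHERE UserName = @UserName;';
DECLARE @params NVARCHAR(MAX) = N'@UserName NVARCHAR(150)';

EXEC sys.sp_executesql @sql, @params, @UserName = N'Kevin';

Pass a VARCHAR for either of the first two and you get the classic ntext/nchar/nvarchar error, which is exactly the silliness Erik is talking about.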


A Duplicate Despite a Distinct

Forrest McDaniel wrote a murder mystery:

“A duplicate despite a DISTINCT bodyguard…” one mutters. “This isn’t the normal level of foul play.”

Mods arrive to cordon off the scene. Twelve different butlers have already been accused, but each has a solid alibi, even MAXDOP.

Generic Protagonist paces back and forth. They’re waiting for the forensics report. Finally it arrives.

Of course, those forensics people don’t know what they’re doing, so you’d better click through and find the real killer.


Generating SQL Code from Metadata

Richard Swinbank shows off one of the more common uses of dynamic SQL:

Now you come to ingest your second table – and you have to do this all again!

It’s a lot of code, and worse, it’s all basically the same – only the names and types are any different. You can easily make a mistake, and the boredom of repetition only makes that more likely. Later on, when you decide you want to add something to the process – a logging statement at the end of the stored proc, for example – you’ll have to add it separately to every stored procedure you’ve created.

The solution? Generate your code automatically, from metadata, using dynamic SQL! 

Read on for what you’d need to pull this off.
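As a small taste of the approach (a bare-bones sketch rather than Richard's code, and the table name is invented), you can build a statement from sys.columns and execute it only once you're happy with the output:

-- Build a SELECT statement for a hypothetical dbo.SalesOrder table from metadata.
-- STRING_AGG requires SQL Server 2017 or later; the CONVERT keeps the
-- aggregated column list from hitting the NVARCHAR(4000) cap.
DECLARE @SchemaName sysname = N'dbo',
        @TableName sysname = N'SalesOrder',
        @sql NVARCHAR(MAX);

SELECT @sql = N'SELECT ' +
        STRING_AGG(CONVERT(NVARCHAR(MAX), QUOTENAME(c.name)), N', ')
            WITHIN GROUP (ORDER BY c.column_id) +
        N' FROM ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName) + N';'
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName));

PRINT @sql;                      -- review the generated statement first
-- EXEC sys.sp_executesql @sql;  -- then run it

The same pattern extends to generating INSERT statements, staging tables, or the stored procedures Richard describes.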
