Passing Data To Java In SQL Server 2019

Niels Berglund continues looking at the SQL Server 2019 Extensibility Framework’s support for Java:

In Java, there are also helper components (a topic for future posts), but the integration is not as tight, so when we want to pass data into and out of Java we need to be somewhat more explicit in our code to make data passing possible.
In our Java code, we need to represent the data passed in and out as class member column arrays. In your classes, you define one array per column passed in and one array per column returned. These column arrays are some of the “magic” members I mentioned above, and they are the equivalent of InputDataSet and OutputDataSet.
The components that are part of the Java extension need to know about these members, as the components either populate them when pushing data into Java or read from them when returning data from Java. The way the components know about the members is based on a naming standard.
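To give a sense of what this looks like from the T-SQL side, here is a minimal sketch (mine, not Niels’s code, and the exact calling convention has shifted between CTPs): the fully qualified Java class goes into @script, and the query in @input_data_1 is what gets shredded into the class’s input column arrays.

EXEC sp_execute_external_script
    @language = N'Java',
    @script = N'sqlext.samples.PassData',   -- hypothetical package.class
    @input_data_1 = N'SELECT CustomerID, OrderTotal FROM dbo.Orders'   -- hypothetical table
WITH RESULT SETS ((CustomerID int, OrderTotal money));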

It’s definitely easier to pass data in and get data back with R and Python, but I suppose part of that is Java being a statically typed, compiled language.

The Intuition Behind Principal Component Analysis

Holger von Jouanne-Diedrich gives us an intuition behind how principal component analysis (PCA) works:


Principal component analysis (PCA) is a dimension-reduction method that can be used to reduce a large set of (often correlated) variables into a smaller set of (uncorrelated) variables, called principal components, which still contain most of the information.
PCA is a concept that is traditionally hard to grasp, so instead of giving you the n-th mathematical derivation I will provide you with some intuition.
Basically, PCA is nothing but a projection of some higher-dimensional object into a lower dimension. What sounds complicated is really something we encounter every day: when we watch TV, we see a 2D projection of 3D objects!

Click through for the rest of the story.
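To put a little math behind the intuition (my gloss, not Holger’s): after centering the data matrix, the projection is simply a matrix product with the top eigenvectors of the covariance matrix,

Z = \tilde{X} W_k, \qquad C w_i = \lambda_i w_i, \qquad C = \tfrac{1}{n-1} \tilde{X}^{\top} \tilde{X},

where W_k = [w_1, \dots, w_k] collects the k eigenvectors with the largest eigenvalues \lambda_i and Z holds the principal component scores.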

Thoughts On Exclusive Locks

Louis Davidson shares some thoughts on exclusive locks in SQL Server:


You will find that the SELECT statement executes, ignoring the exclusive lock, because it is not a write lock, and the data on the page has not been changed.
The main reason people try to do this is to force access to a row in a single-threaded manner. For example, building their own sequence number, either in a row they update or by doing MAX() on all of the data in a table, to make sure no two readers get the same value.
This is generally a bad idea, as is locking an entire table, but if you need to block readers, you can couple the XLOCK with a PAGLOCK. So, change the first reader to:

BEGIN TRANSACTION;
SELECT *
FROM   Demo.Test WITH (XLOCK,PAGLOCK);

As Louis points out in the summary, locking is complicated.  Having a good understanding of the locking model will serve you well, though.
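To see the difference for yourself, open a second session against the same Demo.Test table from the demo; with the XLOCK plus PAGLOCK in place, this plain read now waits until the first transaction commits or rolls back:

-- second session: blocked by the XLOCK + PAGLOCK held above
SELECT *
FROM   Demo.Test;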

NoSQL? No! MoSQL

Steve Jones points out a bit of a shift at Google:


Google is doing more SQL, or at least shifting towards relational SQL databases as a way of storing data. At least, some of their engineers see this as a better way to store data for many problems. Since I’m a relational database advocate, I found this to be interesting.
When Google first started to publish information on BigTable and other new ways of dealing with large amounts of data, I felt that these weren’t solutions I’d use or problems that many people had. The idea of Map Reduce is interesting and certainly applicable to the problem space Google had of a global database of sites, but that’s not a problem I’ve ever encountered. Instead, most of the struggles I’ve had with relational systems are still better addressed in a relational system.

Read the whole thing.  Note that this is slightly different than Feasel’s Law, as Steve is focusing more on the consistency side of things rather than the interface.

When Azure SQL DB Tells You You’re Hitting Your DTU Limits

Brent Ozar shows us how to use sp_BlitzFirst to determine if you’re hitting DTU limits:


You’ve got an Azure SQL DB, and your queries are going slow. You’re wondering, “Am I hitting the performance limits? Is Microsoft throttling my queries?”
There’s an easy way to check: run sp_BlitzFirst. sp_BlitzFirst is our free performance health check stored procedure that analyzes a lot of common performance issues and then gives you a prioritized list of reasons why your server might be slow right now.

Click through for an example.  The same signal can mean “it’s time to move to a higher level in your tier,” “it’s time to move up to the next tier,” or “it’s time to tune that cursor in a cursor in a cursor which performs several scans of a very large table for each operation.”
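If you have the First Responder Kit installed, running the check is a one-liner:

-- samples the server and reports what it is waiting on right now
EXEC dbo.sp_BlitzFirst;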

Finding The SQL Server Port With T-SQL

Jack Vamvas shows us how to find the port SQL Server is listening on using T-SQL:


Question: Without going into SQL Server Configuration Manager via the GUI, is there a command-oriented method to extract the port number SQL Server is listening on?

Answer: There are a few different methods to extract the port number without going into the Configuration interface.
Method 1 – use xp_readerrorlog

Read on for an explanation of this technique as well as links to a couple other methods.  I hadn’t thought about using the error log as a source, but it works.
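For reference, the error log approach (plus a DMV-based alternative) looks something like this:

-- search the current error log for the 'Server is listening on' messages
EXEC sys.xp_readerrorlog 0, 1, N'listening';

-- or, for the current connection (returns NULL over non-TCP connections)
SELECT local_tcp_port
FROM   sys.dm_exec_connections
WHERE  session_id = @@SPID;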

Preventing Credential Compromise When Using AWS

Will Bengtston walks us through techniques Netflix uses to protect credentials in AWS:

Scope

In this post, we’ll discuss how to prevent or mitigate compromise of credentials due to certain classes of vulnerabilities, such as Server Side Request Forgery (SSRF) and XML External Entity (XXE) injection. If an attacker has remote code execution (RCE) or local presence on the AWS server, the methods discussed here will not prevent compromise. For more information on how the AWS services mentioned work, see the Background section at the end of this post.

Protecting Your Credentials

There are many ways that you can protect your AWS temporary credentials. The two methods covered here are:

  • Enforcing where API calls are allowed to originate from.

  • Protecting the EC2 Metadata service so that credentials cannot be retrieved via a vulnerability in an application such as Server Side Request Forgery (SSRF).

Read the whole thing if you’re an AWS user.

Plotting Diagrams In R With nest() And map()

Sebastian Sauer shows how to display multiple ggplot2 diagrams together using facets as well as a combination of the nest() and map() functions:

One simple way is to plot several facets according to the grouping variable:

d %>% ggplot() + aes(x = hp, y = mpg) + geom_point() + facet_wrap(~ cyl)

Faceting is great, but it’s good to know the other technique as well.

Indexed View Matching

Erik Darling has a series of posts on indexed views, with the latest covering how a query can be matched to an indexed view even when the query uses a keyword that isn’t allowed in the creation of the indexed view itself:

There are a whole bunch of limitations in creating indexed views. One of them is that you can’t base the query on DISTINCT.

Fair enough, but you can do GROUP BY.

And what’s pretty cool is that the optimizer can match a query written to find distinct values to an indexed view with a group by.

Click through for the best example ever.
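If you want to play along at home, the shape of the demo is something like this (hypothetical table and column names, not Erik’s script); on editions and settings where automatic view matching applies, the DISTINCT query at the end can be answered from the indexed view:

CREATE VIEW dbo.OrderCustomers
WITH SCHEMABINDING
AS
SELECT   CustomerID,
         COUNT_BIG(*) AS CustomerRows   -- required when an indexed view uses GROUP BY
FROM     dbo.Orders
GROUP BY CustomerID;
GO

CREATE UNIQUE CLUSTERED INDEX CIX_OrderCustomers
    ON dbo.OrderCustomers (CustomerID);
GO

-- written with DISTINCT, yet the optimizer can match it to the view above
SELECT DISTINCT CustomerID
FROM   dbo.Orders;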

Continuing The Advent Of Code In T-SQL

Wayne Sheffield has a few more posts in the Advent of Code series.  His latest edition:

In Day 5, we find ourselves working with the polymers of the new Santa suit. A polymer (the input file) consists of units, represented by upper- and lower-case letters. Adjacent units of the same letter but of different polarity (case) cancel each other out. This may lead to other units that can then cancel each other out. The goal is to reduce the polymer to as small as possible and report back the reduced size.

Tasks:

  1. Perform a case-sensitive search/replace for each letter of the alphabet. The search is for a pair of the same letter, where one is upper case, and the other is lower case.
  2. Recursively perform this operation until the string can no longer be reduced.

In my opinion, the key part to this is that the operation needs to be performed recursively. I can think of only two ways to recursively perform an operation in SQL Server:

  1. A recursive common table expression (cte).
  2. Using a WHILE loop.

I don’t like using either of these mechanisms in SQL Server – they both perform operations in a “Row-By-Agonizing-Row” manner instead of a more set-based approach. However, set-based recursion usually performs extremely poorly. So, I’m going to use a WHILE loop.

The recursion requirement does limit things a bit; otherwise I could see putting something together with the LEAD() window function.
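For the curious, a WHILE loop sketch of the reduction (a toy version of the approach, not Wayne’s solution) keeps stripping case-sensitive pairs until a full pass makes no change:

DECLARE @polymer varchar(max) = 'dabAcCaCBAcCcaDA',   -- the puzzle's sample input
        @previousLength bigint = -1,
        @letter int;

WHILE LEN(@polymer) <> @previousLength
BEGIN
    SET @previousLength = LEN(@polymer);
    SET @letter = ASCII('a');

    WHILE @letter <= ASCII('z')
    BEGIN
        -- remove both polarities of this letter pair, case-sensitively
        SET @polymer = REPLACE(@polymer COLLATE Latin1_General_CS_AS,
                               CHAR(@letter) + CHAR(@letter - 32), '');
        SET @polymer = REPLACE(@polymer COLLATE Latin1_General_CS_AS,
                               CHAR(@letter - 32) + CHAR(@letter), '');
        SET @letter += 1;
    END;
END;

SELECT LEN(@polymer) AS ReducedLength;   -- 10 for the sample input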
