Press "Enter" to skip to content

Month: April 2025

Securing Parquet Files

Vamshidhar Morusu writes on vulnerabilities:

Although open-source Java libraries are essential for contemporary software development, they frequently introduce serious security flaws that put systems at risk. The risks are highlighted by recent examples:

  • Deep Java Library (CVE-2025-0851): Attackers can write files outside of designated directories due to a path traversal vulnerability in DJL’s archive extraction tools. Versions 0.1.0 through 0.31.0 are affected by this vulnerability, which may result in data corruption or illegal system access. Version 0.31.1 has a patch for it.
  • CVE-2022-42003, Jackson Library: Unsafe serialization/deserialization configurations in the well-known JSON parser cause a high-severity problem (CVSS 7.5) that could result in denial-of-service attacks.

These illustrations highlight how crucial it is for open-source libraries to have careful dependency management, frequent updates, and security audits. Companies should enforce stringent validation and use automated vulnerability scanning tools.

Click through for a more detailed view of a third CVE, as well as tips to protect your data.

Comments closed

Loading Excel from SQL Server via Power BI XMLA

Jared Westover doesn’t want to share:

Users want to pull data from tables in an Azure SQL database into Excel via Power Query. This situation sounds simple. However, I don’t want to provide direct access to the database for several reasons, including the potential governance and permissions nightmare. We have a Fabric workspace, and most of the data already exists in Power BI reports. How can we give users access to the data they need without providing direct access to the database for an easy SQL export to Excel?

Click through for the answer. This solution is a bit more roundabout than granting direct database access, but also comes with a host of security benefits.

Comments closed

Reviewing Two Explain Plans in PostgreSQL

Semab Tariq talks query tuning:

Performance optimization in a production database is crucial, but over-optimizing can complicate things without real improvements.

In this blog post, I’ll share two very basic EXPLAIN ANALYZE outputs from a production system. A user asked us to help optimize these plans (I’ve changed the table and column names for privacy).

We will look at how to spot slow parts of a query, improve performance the right way, and avoid unnecessary tuning.

One of the things you eventually learn as a performance tuner is that sometimes, it’s best not to try to optimize a particular query. This may seem a bit contradictory–who doesn’t want to go faster? But there are costs to actions, and spending a long time tuning an ad hoc query that somebody ran one time and probably won’t run again isn’t worth it. Ultimately, know how to tune, but also when to tune and what will give you the biggest marginal benefit.

Comments closed

Binding a Power BI Report to a Separate Semantic Model via Power BI Studio

Gilbert Quevauvilliers makes use of a Gerhard Brueckl extension:

The default option to rebind a Power BI report is to use the Power BI REST API.

This works well, but for a lot of people this can be quite intimidating.

Fortunately, Gerhard Brueckl, has created the amazing Power BI Studio, which is a Visual Studio Code Extension.

Click through to see how to install it and how to use this extension to rebind an existing Power BI report to a different semantic model, whether in the same workspace or even a different one.

Comments closed

Comparing INSERT INTO and SELECT * INTO

Haripriya Naidu runs an experiment:

Instead of looking at which option is faster, you may want to look at which option is better suited for a given context. Let’s take a look at 2 common options to insert data and analyze them.

INSERT INTO TARGETTABLE
SELECT * FROM SOURCETABLE

OR

SELECT * INTO TARGETTABLE 
FROM SOURCETABLE 

Click through for a comparison of the two, not just for which is faster but also the pros and cons of each approach.

Comments closed

The Monty Hall Problem

I have a new video:

In this video, I explain the classic Monty Hall problem, based on the concept of the show Let’s Make a Deal. I explain the paradox behind the problem and demonstrate that it’s better to switch doors.

I’m not joking at all when I say it took me years of listening to explanations before it actually clicked. Some of it is my innate stubbornness, but I think this is a great example of a true paradox, where the intuitive answer is wrong and first-level reasoning also leads you astray.

Comments closed

Testing ChatGPT with Bad Advice

Louis Davidson continues a series:

As started in part 1 of this series, I have set out to test an LLMs ability to technical edit. For my first set of tests, I am using a pair of articles I created, filled with very bad advice. The advice is the same for both articles, but what differs is the intro and the conclusion. One says the advice is good, the other said it is bad. It is all very very bad, including a really terrible SELECT statement versus loop construct that will cause an eternal loop that inserts into a temporary table.

My goal is to see how much of that advice will be noted as bad, and if it says anything nice at all about the text, etc. If you want to see the entire documents, you can get them here in a zip file, both in text and word document formats.

Starting with an extreme example like this is fine, I believe. Given the results, they were fine, though it sounds like Louis won’t be out of a job anytime soon.

Comments closed

Finding Object Dependencies in SQL Server

Vlad Drumea is looking for matches:

Figuring out object dependencies in a SQL Server database, especially one you’re not familiar with, can be a challenge.
Luckily, Microsoft provides the sys.sql_expression_dependencies system catalog view to help DBAs and developers track down various object dependencies.

I leveraged this catalog view in a query that returns the necessary info in a more user-friendly way, while also allowing easy filtering for referenced or referencing object.

Click through for the script. It’s been a while since I’ve used this DMV in anger, though my recollection is that I had problems finding dependencies over linked servers.

Comments closed

The WINDOW Clause in SQL Server 2022

Andy Brownsword takes a look at one quality of life improvement in SQL Server 2022:

Window functions allow us to perform a function across a set of rows in a result set, rather than how we might typically group them. In SQL Server 2022 we have a new clause available for our queries which can help tidy up how these are defined.

You can do some neat things with the WINDOW clause, including partial matches. For example, you could define a window r1 AS (PARTITION BY x.MyColumn) and then another window r2 AS (r1 ORDER BY x.SomeOtherColumn). I don’t think there are a huge number of scenarios in which this is helpful, especially because performance typically depends upon minimizing the unique number of window functions in your query, but every once in a while it’s a really neat thing.

Comments closed

Microsoft Fabric SKU Estimator

Jonathan Garriss has an announcement:

We’re excited to unveil the Microsoft Fabric SKU estimator, now available in preview—an enhanced version of the previously introduced Microsoft Fabric Capacity Calculator. This advanced tool has been refined based on extensive user feedback to provide tailored capacity estimations for businesses.

Designed to optimize data infrastructure planning, the Microsoft Fabric SKU Estimator helps customers and partners accurately assess capacity requirements and select the most suitable SKU for their workloads, protecting users from under-provisioning and overcommitment.

And, in classic Microsoft Fabric fashion, it’s in preview.

Coming up with some fairly low estimates for a lot of things, it bounced me between an F32 and an F64, which seemed about right.

Comments closed