This is a bit of a longer one, a look at how to do all the different joins and the exciting thing for MSSQL developers is that we get a couple of extra joins (semi and anti semi oooooooh).
Click through for lots of examples.
As software changes, it often accretes feature and drifts away from its design, if it even started with one, and many defaults and settings become undesirable. New users are blamed for not moving parameter settings away from the defaults to the “obvious” acceptable values.
Click through for the origin of the name and more info on how to avoid it.
Every Parquet file has the footer that contains metadata information: schema, row groups and column statistics. The footer is located at the end of the file.
A parquet file content starts and ends with 4-byte
PAR1“magic” string. Right before the ending
PAR1there is 4-byte footer length size (little-endian encoding):
Click through for more details, as well as one downside to Parquet 1.x.
When you build an ETL Solution, at some point you will most likely feel the need to compare the data between your source and your staging (or datawarehouse) database. There may be various reasons for them to be out of sync, such as delta loads, aggregations or added business logic, but one day your phone will ring and whoever is on the other end will tell you that the numbers are wrong. While this doesn’t necessarily have to be an issue within your ETL process, it might just as well be the report itself. In many cases, this is a good starting point to look at.
This article (formally published on PASS Community Blog) focusses on SSIS as your orchestrator, but the same principles could, obviously, also be applied to Azure Data Factory for example. Also, as we want to solve this task as lightweight as possible, we will, demonstrably, use Biml to implement it!
Click through to see how.
Jon Schwabish over at PolicyViz has created great initiative called the One Chart at a Time Video Series. It’s an effort to expand readers’ graphic literacy through short videos explaining how to read and use different charts. Each video is from a different person in the data visualization industry. Participants include people I admire such as Andy Kirk, Ben Jones, and Cole Nussbaumer Knaflic. Jon releases a new video each weekday. The initiative started January 11 and will continue through mid-March.
Click through for Meagan’s contribution on waterfall charts.
Introduced in SQL Server 2016 (13.x) SP1 and Azure SQL Database is the USE HINT option. At the time of writing this article, there are 23 of those. You can see an up to date list by running Dynamic Management View sys.dm_exec_valid_use_hints.
With the introduction of this feature, we can replace some of the trace flags with hints. I personally like hints over trace flags (if I have to choose between two devils). That way someone reading the code have some idea what the hint is about instead of remembering the trace flag numbers.
I agree with Taiob on that sentiment.
I don’t recall ever having a performance issue with
sys.partitions. My initial thought (echoed by Joey D’Antoni) was that a filter on the
data_compressioncolumn should avoid the redundant scan, and reduce query runtime by about half. However, this predicate doesn’t get pushed down, and the reason why takes a bit of unpacking.
Click through for the results of Aaron’s research.
Our story begins on a test server hosting a couple hundred databases. Over time, the server kept getting slower and slower when trying to navigate in SSMS. Query windows would take too much time to load, logging in would lag a bit, etc. When investigating with sp_Blitz, I found that the server had been experiencing Threadpool waits.
There are some cases where increasing thread count is important, especially when you’re working with database mirroring or availability groups. I worked with a customer with thousands of mirrored databases per server. None of the databases were particularly large or heavily-used, so it was on properly-sized hardware. As a result, to prevent the server from falling over due to threadpool waits, we had to scale thread counts to scary-high levels.