Curated SQL Posts

Using a Date Template in Power BI

Haroon Ashraf recommends using a template with date dimension details:

A Power BI Template

A Power BI template is a structure or model that typically contains commonly used tables, relationships, and hierarchies belonging to an organization or an individual. This model is reused in any Power BI report. More information is provided in the previous article:

Centralized Data Modelling using Power BI Templates

What is a Date Template in Power BI?

A Date template is a precise structure of the Date table that serves as a foundation for building reports in the organization. In other words, it is like a built-in Date table that any report developer or skilled business user can apply to build Power BI reports.

Read on for more Q&A as well as how to create a simple version of a date table for this template. The idea of using a template makes even more sense as you have more complicated date table requirements, such as adding in fiscal year details, holiday information (especially holidays which don’t always fall on the same solar calendar day, such as Passover or Easter), and dates important to the company.
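To make the idea of a reusable date table concrete, here is a minimal sketch in plain Python rather than Power BI (the linked article builds its table inside Power BI itself). The function name `build_date_rows`, the column names, and the July fiscal-year start are assumptions chosen for illustration, not details from the article.

```python
from datetime import date, timedelta


def build_date_rows(start, end, fiscal_start_month=7):
    """Return one dict per calendar day from start to end, inclusive.

    fiscal_start_month is a hypothetical setting: 7 means the fiscal
    year begins on July 1.
    """
    rows = []
    current = start
    while current <= end:
        # Assumption: a fiscal year is labeled by the calendar year in which it ends.
        if fiscal_start_month > 1 and current.month >= fiscal_start_month:
            fiscal_year = current.year + 1
        else:
            fiscal_year = current.year
        rows.append({
            "date": current,
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day_of_week": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
            "fiscal_year": fiscal_year,
        })
        current += timedelta(days=1)
    return rows


rows_2021 = build_date_rows(date(2021, 1, 1), date(2021, 12, 31))
```

Columns for holidays or company-specific dates would be added in the same way, which is where keeping a single shared definition starts to pay off.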

Defining an Ad Hoc Query

Kathi Kellenberger explains what it means to be an ad hoc query:

Someone recently asked me which queries are ad hoc in SQL Server. An ad hoc query is a single query not included in a stored procedure and not parameterized or prepared. Depending on the server settings, SQL Server can parameterize some statements initially written as ad hoc queries. Ad hoc doesn’t mean dynamic.
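As a rough illustration of that distinction, the sketch below contrasts a statement with the value embedded as a literal against a parameterized one. It uses Python's built-in `sqlite3` module as a stand-in so that it runs anywhere; the `orders` table and both function names are hypothetical, and nothing here reproduces SQL Server's plan caching or its parameterization settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO orders (id, customer) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)


def find_ad_hoc(order_id):
    # The value is embedded in the statement text, so every distinct
    # order_id yields a different statement. int() guards against injection.
    sql = f"SELECT customer FROM orders WHERE id = {int(order_id)}"
    return sql, conn.execute(sql).fetchone()


def find_parameterized(order_id):
    # The statement text is constant; only the bound value changes.
    sql = "SELECT customer FROM orders WHERE id = ?"
    return sql, conn.execute(sql, (order_id,)).fetchone()
```

Both functions return the same row for a given id. The difference is that the ad hoc form produces a new statement text for each value, while the parameterized form reuses one.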

Next on the list, a post hoc ergo propter hoc query. That’s where I explain to the DBAs that just because the server goes down every time I run a query, it doesn’t mean my queries caused this.

When PyODBC fast_executemany Isn’t

Jon Morisi troubleshoots a performance issue:

I recently had a project in which I needed to transfer a 60 GB SQLite database to SQL Server. After some research, I found the sqlite3 and pyodbc modules, and set about scripting connections and insert statements.

The basic form of my script is to import the modules, set up the database connections, and iterate (via cursor) over the rows of the select statement, creating insert statements and executing them.

The issue here is that this method results in single inserts being sent one at a time, yielding less than satisfactory performance. Inserting 35m+ rows in this fashion takes ~5 hrs on my system.

Jon tries out a few different options. It would appear that there is no easy bulk insertion operation with PyODBC.
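For context, one common alternative to row-at-a-time inserts is to read the source in batches and hand each batch to the DB-API `executemany` call. The sketch below is not Jon's script: the function `copy_in_batches`, its parameters, and the batch size are assumptions, and both ends are SQLite in-memory databases so that the example runs without SQL Server. With pyodbc as the destination, the `fast_executemany` cursor attribute named in the post title would be set where the comment indicates, and whether it actually helps is the subject of the linked post.

```python
import sqlite3


def copy_in_batches(src_conn, dst_conn, select_sql, insert_sql, batch_size=10_000):
    """Copy rows from src_conn to dst_conn in batches; return the row count."""
    src_cur = src_conn.cursor()
    dst_cur = dst_conn.cursor()
    # Assumption: with a pyodbc destination cursor you would set
    # dst_cur.fast_executemany = True here. It is not exercised in this sketch.
    src_cur.execute(select_sql)
    total = 0
    while True:
        batch = src_cur.fetchmany(batch_size)
        if not batch:
            break
        # One executemany call per batch instead of one execute per row.
        dst_cur.executemany(insert_sql, batch)
        dst_conn.commit()
        total += len(batch)
    return total


# Usage with two in-memory SQLite databases standing in for source and target.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE t (id INTEGER, name TEXT)")
source.executemany("INSERT INTO t VALUES (?, ?)", [(i, f"row{i}") for i in range(25)])

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE t (id INTEGER, name TEXT)")

copied = copy_in_batches(
    source, target,
    "SELECT id, name FROM t",
    "INSERT INTO t (id, name) VALUES (?, ?)",
    batch_size=10,
)
```

Committing once per batch keeps transactions bounded. The right batch size depends on row width and the driver, so treat the default here as a placeholder.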

Availability Groups and Logins

Andrea Allred runs into a post-failover issue:

While doing a planned Availability Group failover, the application stopped talking to the database. After checking the SQL Server log, we found that all the SQL Logins were failing with an “incorrect password” error. The logins were on the server, the users were in the databases, and the passwords were even right, so what was wrong? It all comes down to SIDs (Security Identifiers).

Read on for the cause and the solution. I’d also recommend Sync-DbaAvailabilityGroup as a good dbatools cmdlet to use.

Managing PowerShell Functions with PSFunctionInfo

Jeffrey Hicks announces a new tool:

Over the last year, I’ve been working on a solution. I’ve been using it and finding it helpful. My friend Gladys Kravitz was also bemoaning the lack of tools for managing stand-alone functions. And while she had her own approach, I thought my solution might offer more. So I polished it up, set up a GitHub repository, and published a preview release to the PowerShell Gallery. The module is called PSFunctionInfo. You can find the repository on GitHub. Because it is a pre-release, you might need to install the newest version of the PowerShellGet module so you have the prerelease parameters.

Click through for more detail on how to use it.

Aligning Kubernetes Nodes to Physical Infrastructure

Frank Denneman has some advice for us:

With the new VM service and the customizable VM classes, you can help the developer align their nodes to the infrastructure. Infrastructure details are not always visible at the Kubernetes layers, and maybe not all developers are keen to learn about the intricacies of your environment. The VM service allows you to publish only the VM classes you see fit for that particular application project. One of the reasons could be the avoidance of monster-VM deployment. Before this update, developers could have deployed a six-worker-node Kubernetes cluster using the guaranteed 8XLarge class (each worker node equipped with 32 vCPUs, 128Gi all reserved), granted if your hosts config is sufficient. But the restriction is only one angle to this situation. Long-lived relationships are typically symbiotic in nature, and powerplays typically don’t help build relationships between developers and the InfraOps team. What would be better is to align it with the NUMA configuration of the ESXi hosts within the cluster.

Click through for more detail. This is aimed particularly at operations people running Kubernetes clusters over VMware.

Recent Apache NiFi Updates

Pierre Villard has some news for us around Apache NiFi:

Cloudera released a lot of things around Apache NiFi recently! We just released Cloudera Flow Management (CFM) 2.1.1 that provides Apache NiFi on top of Cloudera Data Platform (CDP) 7.1.6. This major release provides the latest and greatest of Apache NiFi as it includes Apache NiFi 1.13.2 and additional improvements, bug fixes, components, etc. Cloudera also released CDP 7.2.9 on all three major cloud platforms, and it also brings Flow Management on DataHub with Apache NiFi 1.13.2 and more. Let’s have a look at the main highlights of these releases.

Click through to see what’s included.

Writing SQL to Query R data.frames

Tomaz Kastrun tries out a package:

There are many R packages for querying SQL Databases. Recently, I was looking into the sqldf package (CRAN documentation).

There are so many great advantages (simple execution of SQL statements; creating, loading, and deleting data in data.frames; connectivity to many databases; support for SQL functions and data types; and many more), but the one that was really a major win was the interaction between data frames and the SQL language.

Between sqldf and dbplyr, you get it both ways: treat a data.frame like a SQL table, or treat a SQL database like R data.frames.
