Press "Enter" to skip to content

Curated SQL Posts

Building a Simple Streamlit App

I jump into a new web framework:

In the course of working on my book, I wanted to build an easy-to-use website for outlier detection. The idea here is that I have a REST API to perform the outlier detection work but I’d like something a little easier to read than JSON blobs coming out of Postman. That’s where Streamlit comes into play.

Click through to see how it all works. I was impressed with how easy it was to build a decent interactive website.

Comments closed

Making a Scatter Plot in Excel

Mike Cisneros shows how to create a nice-looking scatter plot in Excel:

Scatter plots are excellent charts for showing a relationship between two numerical variables across a number of unique observations. We see them in business communications from time to time, although they’re much more commonly used in the “exploration” part of the process—when we’re still trying to understand our data and find the important insights. 

If you’re unfamiliar with scatter plots, their common use cases, or their benefits and drawbacks in a range of scenarios, check out the what is a scatter plot? article in our SWD Chart Guide. There, we explore some of the basics of scatter plots via an example, share tips for designing them more effectively, and discuss common variations (bubble charts, connected scatter plots, and more).

Read on for the process, which can be a lot more difficult than you may first expect.

Comments closed

Avoiding SQL Injection in Entity Framework Raw Queries

Josh Darnell has a public service announcement (and I would have had that lead-in even if he didn’t say it himself!):

These days, most .NET developers seem to write SQL using Entity Framework, and specifically using LINQ. This is very convenient for us developers, and I’m sure most DBAs love it too.

One of the nice things about LINQ is that it’s not really vulnerable to SQL injection. C# code gets turned into SQL queries, and variables get translated into parameters automatically.

Thus folks who have only worked with ORMs and LINQ may not have ever learned about the bad old days. Which is normally fine, until…

Click through for an example of what not to do.

Comments closed

Working with Views in PostgreSQL

Gauri Mahajan tries out different types of views in Azure Database for PostgreSQL:

A view can be considered as a dataset that has a pre-determined schema, the data is derived based on a specific criterion and is the source from one or more underlying tables. When a view is queried for data, it, in turn, queries the underlying tables and presents the intended dataset. At times, when the volume of data is very large, a view may start impacting query performance. In those cases, data from the underlying tables with the intended logic that makes up the view is physically stored in another table. This specific construct is called a materialized view. As the data from the source table gets updated, this materialized view needs to be refreshed from time to time to keep the data update in the materialized view, unlike a regular view. Azure’s offering of PostgreSQL database is Azure Database for PostgreSQL and it offers the same features as well.

Click through for more information on creating and working with views, materialized or not. Note that SQL Server’s indexed views are not the same as materialized views here.

Comments closed

Paginated Reports vs Interactive Power BI Reports

Paul Turley explains when to use each type of report:

Technology evolves, the industry changes and the way businesses use technology changes. The road that has brought us to the current state of reporting capabilities in the Microsoft data platform has been long and winding. Ten years ago, we were using SQL Server Reporting Services (SSRS) to create interactive, dashboard-like reports scorecard reports. It wasn’t easy and the report interactivity wasn’t exactly “snappy” as the report server had to re-query data and re-render the report every time a link or chart series was clicked. But, because SSRS was the best tool that we had back in the day, the techniques we used for advanced report development were arcane by today’s standards. Over the years, I’ve written a few articles and posts about how to create reports with interactive KPI gauges, charts and scorecards using SSRS. Today we have a better tool for that style of reporting and analytics. Reporting Services remains a powerful and flexible tool for a few different styles of reports. Now that SSRS has been integrated into the premium Power BI service as “Paginated Reports”, this raises questions about how and when to use each tool – and how to use them together.

Read on for more information, including a handy set of bullet points to help you make a decision.

Comments closed

Azure Data Explorer Dynamic Columns in Power Query

Chris Webb blows out the margins:

Now you can easily extract the individual property values from this column in KQL, and indeed I did so in the KQL queries in this post, but the interesting thing is you can also extract these values in Power Query M very easily and – crucially – maintain query folding using the Record.FieldOrDefault M function in a custom column without needing to write any KQL yourself, in both Import mode and DirectQuery mode.

Read on to see how.

Comments closed

SQL Server Scheduler Affinity

Bob Dorr re-posts an oldie but goodie:

SQL Server uses 3 types of affinity to control where the SQL Server worker threads execute.  Before explaining the different scheduler affinity types let me clarify some terminology.

Click through to see what these three types of affinity are, as well as some of the terms you’ll need to know in order to understand affinity types.

Comments closed

Sentiment Analysis with Python

Sanil Mhatre performs a bit of sentiment analysis:

Previous articles in this series have focused on platforms like Azure Cognitive Services and Oracle Text features to perform the core tasks of Natural Language Processing (NLP) and Sentiment Analysis. These easy-to-use platforms allow users to quickly analyze their text data with easy-to-use pre-built models. Potential drawbacks of this approach include lack of flexibility to customize models, data locality & security concerns, subscription fees, and service availability of the cloud platform. Programming languages like Python and R are convenient when you need to build your models for Natural Language Processing and keep your code as well as data contained within your data centers. This article explains how to do sentiment analysis using Python.

Python is a versatile and modern general-purpose programming language that is powerful, fast, and easy to learn. Python runs on interpreters, making it compatible with multiple platforms, and is widely used in applications for web platforms, graphical interfaces, data science, and machine learning. Python is increasingly gaining popularity in data analysis and is one of the most widely used languages for data science. You can learn more about Python from the official Python Software Foundation website.

Click through to see what’s available in the NLP world for Python. The short version is “a lot.”

Comments closed

Generating Python Documentation with Sphinx

Evan Seabrook generates some docs:

As you can see above, we have docstrings defined for properties, methods and the classes themselves. Ultimately, these docstrings will be used by Sphinx to generate the documentation. If you’re using a different docstring format, you can use a Sphinx extension called Napoleon to use your existing docstrings. Once your project has a level of docstring usage that you’re happy with, we can move on to the next step of configuring Sphinx.

And that’s the downside to this: you get auto-generated documentation, which means it’s only as good as your developers’ ability to explain the code.

Comments closed