Press "Enter" to skip to content

Author: Kevin Feasel

Plotting Multiple Histograms in R

Steven Sanderson shows us two libraries to plot two histograms:

Histograms are a powerful tool for visualizing the distribution of numerical data. They allow us to quickly understand the frequency distribution of values within a dataset. In this tutorial, we’ll explore how to create multiple histograms using two popular R packages: base R and ggplot2. By the end of this guide, you’ll be able to confidently display multiple histograms on a single graph using both methods.

Click through for more than two examples.

Comments closed

File Not Found in SQL Server 2022 with Distributed AG and Filestream

Sean Gallardy goes sleuthing:

I don’t often find many people using FileStream in their databases (which isn’t a bad or good thing, in my opinion, just a statement of fact). Some technologies in SQL Server use it behind the scene, such as FileTable or Hekaton, and there isn’t really any getting around it in those cases. However, I was brought an interesting issue by a friend on Database Administrators Stack Exchange, Hannah Vernon (w), when it came to a database that was in a Distributed Availability Group in 2019 and had no issues, but after upgrading to SQL Server 2022, started having a major problem.

Read on for Sean’s analysis of the problem and solution.

Comments closed

Powershell Quizzes

Jeff Hicks wants you to think fast:

Time to get back to the to blog. I’ve been working through my backlog of projects. These are things that I started writing or updating but then got pushed to the back of the line. One of these projects is a PowerShell module I wrote as a teaching tool. The idea was to create short quizzes on PowerShell topics that someone could take in a PowerShell session. My idea was to create quizzes on PowerShell topics, but you can create a quiz on anything. If you want to try things out, install the PSQuizMaster module from the PowerShell Gallery. The module will work in Windows PowerShell and PowerShell 7, including cross-platform.

Read on to see what’s in a quiz and how to create your own quiz.

Comments closed

ADX Date and Time Representations in Power Query and Power BI

Dany Hoter does some explaining:

Data in ADX (aka Kusto aka RTA in Fabric) almost always has columns that contain datetime values like 2023-08-01 16:45 and sometimes timespan values like 2 hours or 36 minutes.

In this article I’ll describe how these values are represented in ADX in Power Query and in Power BI.

Notice that I don’t just say Power BI because timespan values have different types in Power Query and in Power BI.

Read on for those details.

Comments closed

CREATE OR ALTER and sp_ Procedures

Louis Davidson hits an edge case:

It isn’t that such objects are to be completely avoided, it is that they are ONLY to be used when you need the special qualities. Ola Hallengren’s backup solution creates a dbo.sp_BackupServer procedure so you can run the backup command from any database.

But if you don’t need the special properties of the sp_procedure, they are bad for the reasons Aaron stated, the reason I stumbled upon today being just a special subset. In this case CREATE OR ALTER behaves differently than CREATE in a way that was really confusing to me as I was working on a piece of code today.

Read on for Louis’s tale of woe and confusion, but mostly confusion.

Comments closed

External Table is Not in the Expected Format for Power Query

Chris Webb handles an error:

Sometimes when you’re importing data from files using Power Query in either Power BI or Excel you may encounter the following error:

DataFormat.Error: External table is not in the expected format

What causes it? TL;DR it’s because you’re trying to load data from one type of file, probably Excel (I don’t think you can get this error with any other source but I’m not sure), and actually connecting to a different type of file.

Read on for an example, a more detailed description of when you’d get the error, and how to fix it.

Comments closed

Accessing OneLake Files from Power BI Desktop

Marc Lelijveld reads a file:

Fabric content is all over the place by now. In Fabric, as a SaaS platform, most (if not all) services have interconnectivity. In a few clicks you connect your web-developed Power BI dataset to a lakehouse, or warehouse to fetch data from OneLake. But what about Power BI Desktop? You might have uploaded some files to OneLake which you cannot access from Power BI Desktop.

In this blog I’ll explain on how you can connect to OneLake data using Power BI Desktop!

This turns out to be a bit trickier than I would have expected. Hopefully the experience gets better over time.

Comments closed

Running Python in Excel

Alex Woodie reports on a new beta:

Excel-maker Microsoft and Anaconda, a key distributor of Python tools, unveiled a collaboration this week that will see Python integrated with Excel.

The new Anaconda Python Distribution in Excel, which is currently in beta, will bring Python data analysis and data science capabilities to the popular spreadsheet program from Microsoft. The integration will enable users to use a variety of Python libraries and tools to prep, manipulate, analyze, and visualize data in Excel.

It’s still in preview, but it is interesting to see.

Comments closed

Loading OpenStreetMap Data in Postgres

Ryan Lambert gets just the right amount of data:

Populating a PostGIS database with OpenStreetMap data is favorite way to start a new geospatial project. Loading a region of OpenStreetMap data enables you with data ranging from roads, buildings, water features, amenities, and so much more! The breadth and bulk of data is great, but it can turn into a hinderance especially for projects focused on smaller regions. This post explores how to use PgOSM Flex with custom layersets, multiple schemas, and osmium. The goal is load limited data for a larger region, while loading detailed data for a smaller, target region.

The larger region for this post will be the Colorado extract from Geofabrik. The smaller region will be the Fort Collins area, extracted from the Colorado file. The following image shows the data loaded in this post with two maps side-by-side. The minimal data loaded for all of Colorado is shown on the left and the full details of Fort Collins is on the right.

Click through for more details on these two examples.

Comments closed

Visualizing Univariate Data Distributions in R

Steven Sanderson reviews the shape of the data:

Understanding the distribution of your data is a fundamental step in any data analysis process. It gives you insights into the spread, central tendency, and overall shape of your data. In this blog post, we’ll explore two popular functions in R for visualizing data distribution: density() and hist(). We’ll use the classic Iris dataset for our examples. Additionally, we will introduce the {TidyDensity} library and show how it can be used to create distribution plots.

Click through for three different functions for visualizing the density of a variable.

Comments closed