Day: August 14, 2023

Counting Groups in R

Steven Sanderson counts items in a group:

As data-driven decision-making becomes more critical in various fields, the ability to extract valuable insights from datasets has never been more important. One common task is to calculate counts by group, which can shed light on trends and patterns within your data. In this guide, we’ll explore three different approaches to achieve this using the powerful R programming language. So, let’s dive into the world of grouped counting with the help of the classic mtcars dataset!

Read on for the base R solution, the dplyr solution (which looks a lot like how we’d solve it in SQL), and the data.table solution.
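The linked post works in R; as a language-neutral illustration of the same idea, here is a minimal Python sketch of counting rows per group. The handful of (model, cylinders) rows and the helper name count_by_group are assumptions for illustration only and are not taken from the article.

```python
from collections import Counter

# Illustrative (model, cylinders) rows; a small stand-in for mtcars, not the full dataset.
cars = [
    ("Mazda RX4", 6),
    ("Datsun 710", 4),
    ("Hornet Sportabout", 8),
    ("Valiant", 6),
    ("Duster 360", 8),
    ("Merc 240D", 4),
    ("Merc 450SE", 8),
]


def count_by_group(rows):
    """Return a dict mapping each cylinder count to the number of rows that have it."""
    return dict(Counter(cyl for _, cyl in rows))


counts = count_by_group(cars)

# Print one line per group, ordered by the group key.
for cyl in sorted(counts):
    print(cyl, counts[cyl])
```

This is roughly the shape of a SQL GROUP BY with COUNT(*): one pass over the rows, one counter per distinct key.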

Goldbach’s Conjecture and the Sieve of Sundaram in R

Tomaz Kastrun promised us there would be no math on the quiz and yet here we are:

This is fun It is also O(MAX) complexity. But first some background. Since the problem is super old, we are not intending to solve it, merely to play with it. In the number theory of mathematics, the Goldbach’s conjecture states that for every even integer (greater than 2) can be expressed with the sum of two prime numbers. There are also far cries from this theory. For example, prove that every even number can be written as the sum of not more than 300.000 primes (by Schnirelman (1939)).

Read on for the functions and trials of Goldbach’s conjecture.
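For readers who want a feel for the two ideas before clicking through, the following is a rough Python sketch rather than the author's R code. It assumes the textbook form of the Sieve of Sundaram (strike out every index of the form i + j + 2ij, then map each surviving index m to the odd prime 2m + 1) and a simple search for one Goldbach pair; the function names are hypothetical.

```python
def sundaram_primes(limit):
    """Return all primes <= limit using the Sieve of Sundaram."""
    if limit < 2:
        return []
    k = (limit - 1) // 2  # odd numbers up to limit are 2m + 1 for m = 1..k
    marked = [False] * (k + 1)
    for i in range(1, k + 1):
        j = i
        # Strike out every index of the form i + j + 2ij that fits in range.
        while i + j + 2 * i * j <= k:
            marked[i + j + 2 * i * j] = True
            j += 1
    # Surviving indices map to odd primes; 2 is added by hand.
    return [2] + [2 * m + 1 for m in range(1, k + 1) if not marked[m]]


def goldbach_pair(n):
    """Return one pair of primes (p, q) with p <= q and p + q == n, or None."""
    if n <= 2 or n % 2 != 0:
        raise ValueError("n must be an even integer greater than 2")
    primes = sundaram_primes(n)
    prime_set = set(primes)
    for p in primes:
        if p > n // 2:
            break
        if n - p in prime_set:
            return (p, n - p)
    return None


print(sundaram_primes(30))
print(goldbach_pair(28))
```

Returning None leaves room for a counterexample, although none is known; the conjecture has been checked by computer far beyond anything this sketch could reach.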

Creating a Time Dimension in Power BI via DAX

Angela Henry gets a watch:

There are some instances when you want to analyze data over time, not just dates. Most of us are familiar with having to create date tables and use them in analysis, but having to analyze data over time is not as common. Let’s say you run a taxi company and you want to determine when your busiest times of day are. This would come in handy for scheduling drivers. You need more drivers during busy times because no one wants to wait for a taxi!

Read on to see one way to create the table in Power BI.
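The article builds its table in DAX; purely to illustrate what a time-of-day dimension contains, here is a small Python sketch. The one-row-per-minute grain, the column names, and the function name build_time_dimension are assumptions for this example, not details from the post.

```python
from datetime import time


def build_time_dimension():
    """Build one row per minute of the day (assumed grain: 1,440 rows)."""
    rows = []
    for minute_of_day in range(24 * 60):
        hour, minute = divmod(minute_of_day, 60)
        rows.append(
            {
                "time_key": minute_of_day,   # surrogate key: minutes since midnight
                "time": time(hour, minute),  # the time-of-day value itself
                "hour": hour,
                "minute": minute,
                "am_pm": "AM" if hour < 12 else "PM",
            }
        )
    return rows


time_dim = build_time_dimension()
print(len(time_dim), time_dim[0]["time"], time_dim[-1]["time"])
```

A fact table such as taxi pickups could then join on time_key and be grouped by hour to find the busiest periods.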

Database Concurrency in Postgres

Mohan Saraswatipura explains how database concurrency works in Postgres:

Concurrency control is an essential aspect of database systems that deals with multiple concurrent transactions. PostgreSQL employs various techniques to ensure concurrent access to the database while maintaining data consistency using atomicity and isolation of ACID (stands for Atomicity, Consistency, Isolation and Durability – https://en.wikipedia.org/wiki/ACID) properties.

Most of the article focuses on Multi-Version Concurrency Control, which is also the concurrency model likely to be least familiar to SQL Server users.

Lessons Learned from Azure Data Factory Integrating with Db2 on Mainframe

Teo Lachev shares some thoughts:

I’ve done a few BI integration projects extracting data from ERPs running on IBM Db2. Most of the implementations would use a hybrid architecture where the ERP would be running on an on-prem mainframe while the data was loaded in Microsoft Azure. Here are a few tips if you’re facing this challenge:

Click through for five major points. Surprisingly, one of them isn’t “Avoid Db2 like the plague.”

Power BI and Eventual Browser Development

Chris Webb talks about the present and the future:

Turning the question around, however, leads you to some aspects of the question that haven’t been fully explored. Instead of asking “Can I run Power BI Desktop on my Mac?”, you can instead ask “Can I do all of my Power BI development using only a browser?”. At Microsoft our long-term goal is to make all Power BI development web-based, but how close are we to that goal?

Read on for Chris’s answer.

SQL Server on Linux 2022 Available in Preview

Amit Khandelwal has an update on SQL Server on Linux:

We are glad to announce that SQL Server 2022 is now available in preview mode for both Red Hat Enterprise Linux (RHEL) 9 and Ubuntu 22.04. For this preview, only Evaluation edition is available, which is limited to 180 days starting Thursday, July 27th, 2023. 

In your Dev/Test environments, you may now take advantage of the most recent SQL Server 2022 improvements on both RHEL 9 and Ubuntu 22.04. Currently, production workloads on RHEL 9 and Ubuntu 22.04 are not supported by the SQL Server 2022 preview packages. You can run the production workloads for SQL Server 2022 on RHEL 8 and Ubuntu 20.04 and they are fully supported.

I’m going to wait until it’s actually available for real, not just in preview.
