Press "Enter" to skip to content

Month: May 2024

Counting Words in a String in R

Steven Sanderson counts the ways:

Counting words in a string is a common task in data manipulation and text analysis. Whether you’re parsing tweets, analyzing survey responses, or processing any textual data, knowing how to count words is crucial. In this post, we’ll explore three ways to achieve this in R: using base R’s strsplit(), the stringr package, and the stringi package. We’ll provide clear examples and explanations to help you get started.

I, of course, would commission a 128-node Hadoop cluster and write a few dozen pages of Java code to get the answer.

Leave a Comment

Backup Storage Redundancy in Cosmos DB

Manvendra Singh talks about backups:

This article will explain backup storage redundancy for Azure Cosmos DB. Backups are a critical feature to keep copies of our data to ensure data protection and recoverability in case of any accidental deletion, updating, or any kind of disaster. But this is not enough to run backups only to save its copies. We must also protect those backup copies from accidental deletes or corruption and ensure their proper resiliency should be in place to keep backups safe from any unforeseen circumstances. It depends on the criticality of your data whether you want to keep them locally to want to replicate them in other locations or regions to ensure their resiliencies.

The backup process isn’t the same as with a relational database, but it’s still critical to back up your data, for the same reasons that you’d take backups of relational data.

Leave a Comment

Testing if PowerShell is Running from Windows Terminal

Mike Robbins wants to know your terminal:

In the evolving landscape of software development and system administration, understanding your environment is essential for ensuring that scripts and commands run as expected. One useful PowerShell function that helps identify whether a script is running within Windows Terminal is my Test-IsWindowsTerminal function. This function is valuable for developers and system administrators who must tailor their scripts based on the terminal environment.

Click through for a PowerShell function that does just that.

Leave a Comment

Using Static Cursors

Hugo Kornelis digs into one type of cursor:

I have used the GLOBAL scope option, so that it is possible to step through the code one statement at a time. This option does not affect the execution plans used. The FORWARD_ONLY and READ_ONLY options are probably the most commonly used options with cursors, especially with static cursors. We will briefly look at the effect of other options for read direction and concurrency at the end of this post.

Click through to learn more about how cursors show up in execution plans and how you can tell, based on the execution plan of a static cursor, why it’s either really fast (relative to other cursor types) or really slow.

Leave a Comment

Sorting Date Values in Power BI Slicers

Kenneth Omorodion demands order:

Sorting values on a Power BI visual is a common requirement for reporting. For example, it is standard practice to sort months from January to December when months are represented in a visual. This might also be required in a slicer visual. In both cases, it is easy to sort when it’s a continuous value, like months, years, quarters, or alphabetically.

However, in Power BI, business users might require a slicer visual with string values to be ordered in a way that makes it easier for them to make a slicer selection based on what they want to view on a report page. The issue here is that string values are not continuous in nature, and we cannot leverage the usual sorting approaches. This tip will demonstrate this problem and an approach to resolve it.

Read on for the solution.

Leave a Comment

Azure SQL Database Watcher and Query Store

Kendra Little is happy:

I’ve spent a bit of time with Microsoft’s new database watcher tool for Azure SQL recently.

There are a lot of things I like about database watcher– which is currently in preview and which refuses to Capitalize Its Name– but it does one big thing that I really, really like: it collects data from Query Store. You can access that Query Store data from built-in database watcher dashboards, query it using KQL, or (something something) in Microsoft Fabric if you’ve got money to burn on your monitoring data.

Query Store has been available since SQL Server 2016, but I haven’t yet heard of monitoring tools that truly take advantage of it. It’s about time.

This is where I’d also plug QDS Toolbox for on-premises environments. A good amount of the reporting information comes out of Query Store and it helps manage Query Store to boot.

Leave a Comment

Selecting Columns Containing a Specific String in R

Steven Sanderson goes hunting for strings:

Today I want to discuss a common task in data manipulation: selecting columns containing a specific string. Whether you’re working with base R or popular packages like stringrstringi, or dplyr, I’ll show you how to efficiently achieve this. We’ll cover various methods and provide clear examples to help you understand each approach. Let’s get started!

Click through for five examples across the three methods.

Leave a Comment

Making Code Developer Friendly with an Example in R

Mark Niemann-Ross says the rest is commentary:

If you are reading this, you’re a coder and use functions. We write them for ourselves. If someone else writes a function, you can hope it works. If it doesn’t, you can hope to fix it. Hopefully, the return value is obviously correct. But maybe it’s subtly wrong?

If things are amiss, read the name of the function and hope it’s descriptive. I worked with a programmer who omitted all vowels from his function names. So the above code would expand to this…

Read on for the rationale behind commenting your functions appropriately, as well as one way to do it in R. There is a bit of art and a bit of science to writing good comments, but the starting point is simply having them to begin with. And the more clever you feel like you’re being, the more you need to comment this, because three months from now, you probably won’t be feeling quite as clever. H/T R-Bloggers.

Leave a Comment

The Joy of Partitioned Views

Rod Edwards talks partitioned views:

This post came around when I was at a loose end one evening, and just started poking at a local sandpit database, and it got me reminiscing and revisiting / testing a few things. The devil makes work for idle thumbs and all that…

Partitioned Views…do they have a place in society anymore?

Rod does a great job of following Betteridge’s Law of Headlines, as well as saving the ‘Yes’ answer for the post itself. Partitioned views come with their own pains, though one use case Rod did not include is using PolyBase and partitioned views to move “cold” data to slower external storage.

Leave a Comment

Job Threading and Thread Partitioning in SQL Server

Aaron Bertrand continues a series on threading:

In part 2 of this series, I showed an example implementation of distributing a long-running workload in parallel, in order to finish faster. In reality, though, this involves more than just restoring databases. And I have significant skew to deal with: one database that is many times larger than all the rest and has a higher growth rate. So, even though I had spread out my 9-hour job with 400 databases to run faster by having four threads with 100 databases each, one of the threads still took 5 hours, while the others all finished within 1.5 hours.

Read on to learn what Aaron did to make things move faster.

Leave a Comment