Day: December 6, 2024

Finding Columns in R with No Data

Steven Sanderson looks for the missing columns:

When working with real-world datasets in R, it’s common to encounter missing values, often represented as NA. These missing values can impact the quality and reliability of your analyses. One important step in data preprocessing is identifying columns that consist entirely of missing values. By detecting these columns, you can decide whether to remove them or take appropriate action based on your specific use case. In this article, we’ll explore how to find columns with all missing values using base R functions.

Click through to see how you can do this. It’s not quite as simple as checking rows for missing values (complete.cases()), but it’s not much of an ordeal, either.

Comparing Positron to RStudio

Theo Roe performs a product comparison:

Positron is the new beta Data Science IDE from Posit. Though Posit have stressed that maintenance and development of RStudio will continue, I want to use this blog to explore if Positron is worth the switch. I’m coming at this from the R development side but there will of course be some nuances from other languages in use within Positron that require some thought.

Read on for Theo’s perspective. Knowing that it’s using the same underlying framework as Visual Studio Code, I kind of wish this were an extension for VS Code rather than a separate app.

Deployment Parameters in Azure AI Foundry

Tomaz Kastrun continues a series on Azure AI:

Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant’s personality, tell it what it should and shouldn’t answer, and tell it how to format responses. There’s no token limit for this section, but it will be included with every API call, so it counts against the overall token limit.

Click through for a description of each part of the deployment parameters section.

Mounting ADF Instances in Microsoft Fabric

Koen Verbeeck has an existing Azure Data Factory:

We recently started using Microsoft Fabric for our cloud data platform. However, we already have quite an estate of Azure data services running in our company, including a huge number of Azure Data Factory (ADF) pipelines. It seems cumbersome to migrate all those pipelines to Microsoft Fabric, especially because some features are not supported yet and ADF is the mature choice at the moment. We like the concept of Microsoft Fabric’s centralization, where everything is managed in one platform. Is there an option to manage ADF in Fabric?

Read on for the answer, but make sure to check out its limitations as well.

The Power of Rounding

Denny Cherry makes a change:

So, I ran across a problem with QuickBooks that involves some of the most basic math that we were all taught in elementary school: how to round numbers properly. You’d think that a company that makes accounting and invoicing software for a living would understand how rounding of numbers works. But based on the last hour of having to edit the data that gets sent to QuickBooks from our internal system, you’d be wrong.

Denny’s example is $3.18497736 and rounding happens after four decimal spots, so it’s $3.1849 or $3.1850. Denny expects $3.1850 and QuickBooks gives $3.1849.

In this case, Denny’s right: the digits being dropped (7736) are past the halfway point, so both round-half-up and banker’s rounding give $3.1850. The part that confuses people is banker’s rounding, which has you round to the even digit when the dropped portion is exactly a 5. For example, if it were $3.18495 and you round to four spots after the decimal, that would be $3.1850. With $3.18485, it would round to $3.1848.

.NET uses banker’s rounding by default, which can confuse people unfamiliar with the concept. SQL Server, meanwhile, rounds the way that Denny expects: 5 or higher rounds up, 0-4 rounds down.
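To make the difference concrete, here is a minimal sketch in Python comparing the two rules on the three values mentioned above. It uses the standard library's decimal module rather than .NET or SQL Server, so it only illustrates the rounding rules themselves; the helper names round_half_up and round_half_even are made up for this example.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Round to four places after the decimal point, as in Denny's example.
FOUR_PLACES = Decimal("0.0001")


def round_half_up(value: str) -> Decimal:
    """Schoolbook rounding: an exact tie (a trailing 5) always rounds up."""
    return Decimal(value).quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)


def round_half_even(value: str) -> Decimal:
    """Banker's rounding: an exact tie rounds to the nearest even digit."""
    return Decimal(value).quantize(FOUR_PLACES, rounding=ROUND_HALF_EVEN)


# Values are passed as strings so no binary floating-point error creeps in.
for amount in ("3.18497736", "3.18495", "3.18485"):
    print(amount, round_half_up(amount), round_half_even(amount))
```

The two rules only disagree on the last value, an exact tie whose preceding digit is already even: half-up gives 3.1849 and banker's rounding gives 3.1848. For Denny's $3.18497736 there is no tie, so both give 3.1850.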
