
Month: April 2022

Retrieving Twitter Engagements in R

Bryan Shalloway continues looking at Twitter data:

This is a follow-up to a short post I wrote on R Access to Twitter’s v2 API. In this post I’ll walk through a few more examples of pulling data from Twitter using a mix of Twitter’s v2 API and the {rtweet} package.

I’ll pull all Twitter users that I (brshallo) have recently been engaged by (e.g. they like my tweet) or engaged with (e.g. I like their tweet). I’ll lean towards using {rtweet} but will use {httr} in cases where it’s more convenient to use Twitter’s v2 API.

Click through for more information, including several R scripts.

Comments closed

Downloading Power BI Reports with Powershell

Jon Fletcher needs to get some PBIX files:

In this blog post I will be sharing a PowerShell script that allows multiple Power BI reports to be downloaded at once.

In the Power BI service, there is no way of downloading multiple Power BI reports at once. Therefore, users must download files one by one, which is slow, time-consuming and inefficient. Thankfully, there are ways around this, one of which is using PowerShell.

Read on for the script and some additional notes.


Streaming Data into Synapse Dedicated SQL Pool

Lionel Penuchot loads some data:

This article reviews a common pattern of streaming data (i.e. real-time message ingestion) in Synapse dedicated pool. It opens a discussion on the simple standard way to implement this, as well as the challenges and drawbacks. It then presents an alternate solution which enables optimal performance and greatly reduces maintenance tasks when using clustered column store indexes. This is aimed at developers, DBAs, architects, and anyone who works with streams of data that are captured in real-time.

I’d probably avoid the MERGE statement in there because of how many problems there are with it. That said, this is a useful pattern for trickle-loading columnstore tables.


Making Redis Do Your Bidding

Arun Sirpal looks at some of the command language for Azure Redis:

Now that we have created our Redis Cache, let’s connect to it. You can use the most common tool, redis-cli.exe (https://redis.io/download), or, as I am going to do, use the console directly in the Azure Portal. This probably isn’t the best way, but it’s the easiest for this blog. 2 key points here:

Read on for those points, as well as examples of commands you can run.


Including Zero on Charts

Steve Jones thinks about zero:

I’m not great at building charts and graphs. I can build a basic chart, but I often depend on the tooling I use to size, scale, etc. appropriately for whatever I’m graphing. That, or I just use a basic graph that starts from zero and has some sort of linear scale. Or I just present a table of numbers.

There are plenty of misleading charts, especially used by the media that want to show some particular aspect of data that suits the story they are reporting. Many of these misleading charts often don’t start at zero, and they end up scaling in a way that can confuse people.

Steve references a lengthy article on the topic, one which is definitely worth the read, especially because as far as I’m aware, most of the academic literature on visualization and starting at 0 ignores line charts. The only work I’m familiar with is Cleveland, McGill, and McGill, who recommended banking to 45 degrees (and here’s an example of it in SAS).
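To make "banking to 45 degrees" a bit more concrete, here is a minimal Python sketch of one common variant, median-absolute-slope banking: pick the plot's height-to-width ratio so that the median line segment is drawn at 45 degrees. The function name `bank_to_45` and the choice to skip vertical segments are my own assumptions for illustration; this is not taken from Steve's post, the linked article, or the SAS example.

```python
from statistics import median


def bank_to_45(xs, ys):
    """Return a height/width aspect ratio for a line chart.

    Hypothetical helper: chooses the ratio at which the median absolute
    slope of the drawn line segments equals 1 (i.e. 45 degrees).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if x_range == 0 or y_range == 0:
        raise ValueError("data must vary in both x and y")

    # Absolute slope of each consecutive segment, in data units.
    # Vertical segments (no change in x) are skipped to avoid dividing by zero.
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:])
        if x1 != x0
    ]
    if not slopes:
        raise ValueError("no usable segments")

    med = median(slopes)
    if med == 0:
        raise ValueError("median slope is zero; banking is undefined")

    # On screen, a data slope s is drawn with slope s * (x_range / y_range) * (height / width).
    # Setting the median drawn slope to 1 and solving for height / width gives:
    return y_range / (x_range * med)


if __name__ == "__main__":
    # A single sharp peak: rises 10 over 1 unit of x, then falls back.
    print(bank_to_45([0, 1, 2], [0, 10, 0]))
```

The returned ratio is only a starting point for sizing a chart. A series with a few extreme segments can still look too steep or too flat, so treat the result as a suggestion and adjust by eye.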


Code Formatting Holy Wars

Tom Zika and I are on opposite sides:

So I’ll take it one step further.
I’ll never use semicolons unless I have to.

Tools like Redgate’s SQL Prompt can add semicolons automatically, but I still won’t do it.

My quick thoughts:

  • Semicolons? Love them. The chaotic neutral part of me wants to see Microsoft make good on their deprecation notice of code lacking semicolons just to watch the world burn.
  • Commas go at the end because we are not barbarians.
  • Aliases should be short and sufficiently meaningful within the context of the statement. Tom and I agree here.
  • PascalCase is the best case.
  • INNER JOIN instead of JOIN because, again, we are not barbarians. LEFT OUTER JOIN instead of LEFT JOIN because, well, you guessed it.

And at the end of the day, consistency and readability are the most important things…though I’ll fight for my aesthetics like I’m the third monkey in line for Noah’s Ark and brother, it’s starting to rain.
