Press "Enter" to skip to content

Day: August 10, 2021

Diving into Prophet for Time Series Analysis

Dan Lantos continues a series on the Prophet library:

These plots give us a little insight into how the model is formed. The trend plot (top) exhibits a linear, piecewise function, with approximately appropriate values for our dataset throughout the years. This looks to be a baseline for predictions.
The weekly plot (middle) demonstrates some interesting behaviour – weekdays have a small negative impact on the predictions (approximately -50), and we see large spikes for the weekends. This appears peculiar, as we have no weekend data in our dataset, but it is a product of fitting a 7-day periodic function to only 5 days of data. Thankfully, this won’t be an issue as we have no need to forecast weekends.
The yearly plot (bottom) shows a much more volatile impact on predictions (-200 to +180) with frequent changepoints throughout. This points to a more sensitive and complex relationship between the time of year and the FTSE100 index than the day of the week.

If you’re already familiar with techniques like ARMA or ARIMA, this post will let you see immediately what the key differences are.

Comments closed

Logic Branching in Powershell

Robert Cain continues to have fun with Powershell:

In this post, we’ll see how some of PowerShell’s logic branching works. Logic branching is where you execute certain lines of code only when a condition is true.

For all of the examples, we’ll display the code, then under it the result of our code. In this article I’ll be using PowerShell Core, 7.1.3, and VSCode. The examples should work in PowerShell 5.1 in the PowerShell IDE, although they’ve not been tested there.

As always, Robert has a large number of examples here to illuminate the process.

Comments closed

Finding Procedures Using SELECT *

Michael J. Swart hunts for the real performance killer:

I have trouble with procedures that use SELECT *. They are often not “Blue-Green safe“. In other words, if a procedure has a query that uses SELECT * then I can’t change the underlying tables can’t change without causing some tricky deployment issues. (The same is not true for ad hoc queries from the application).

I also have a lot of procedures to look at (about 5000) and I’d like to find the procedures that use SELECT *.
I want to maybe ignore SELECT * when selecting from a subquery with a well-defined column list.
I also want to maybe include related queries like OUTPUT inserted.*.

Read on to see Michael’s strategy for attacking the problem while not including benign instances of it (such as WHERE EXISTS (SELECT * ...), which won’t cause any issues because the database engine doesn’t expand that wildcard).

Comments closed

Scaling ADF and Synapse Analytics Pipelines

Paul Andrew has a process for us:

Back in May 2020 I wrote a blog post about ‘When You Should Use Multiple Azure Data Factory’s‘. Following on from this post with a full year+ now passed and having implemented many more data platform solutions for some crazy massive (technical term) enterprise customers I’ve been reflecting on these scenario’s. Specifically considering:

– The use of having multiple regional Data Factory instances and integration runtime services.

– The decoupling of wider orchestration processes from workers.

Furthermore, to supplement this understanding and for added context, in December 2020 I wrote about Data Factory Activity Concurrency Limits – What Happens Next? and Pipelines – Understanding Internal vs External Activities. Both of which now add to a much clearer picture regarding the ability to scale pipelines for the purposes of large-scale extraction and transformation processes.

Read on for details about the scenario, as well as a design pattern to explain the process. This is a large solution for a large-scale problem.

Comments closed

Creating a REST API on SQL Server Data

Tomaz Kastrun needs to get at some data:

REST API (or RESTful API) is a application programming interface (abbreviated as API) that complies with REST architectural style and is created to allow communication, information flow and interaction with REST (Representational State Transfare) web services. API is used as an interface for communication between two or more programs (or machines) and provides content required from one side to another side (known as consumer and producer; call/request and response).

Getting data from and to database using API is great for building integration with application and retrieving to client all the information in JSON format only once, instead of every time an action occurs on frontend. Yes there are multiple ways to push data to frontend, it can be entity framework, it can be Dapper, it can be T-SQL with procedures layer, it can be REST API, and many many others.

Click through to see how Tomaz does it with Node.Js.

Comments closed

Excel Cube Functions and Lambdas for Grouping

Chris Webb continues a series on lambda helper functions in Excel:

In the last post in this series I showed how you can use Excel’s new Lambda helper functions to return tables. In this post I’ll show you how you can use them to return a dynamic array of CubeSet functions which can be used to build a histogram and do the kind of ABC-type analysis that can be difficult to do in a regular Power BI report.

Read on to see a pair of examples along these lines.

Comments closed