Month: October 2020

Full Moon Finder in R

Tomaz Kastrun has a not-so-useless function:

The full moon function, or should we call it fool moon – due to its simplistic and approximate nature, calculates the difference between the date (only date, no time, no long/lat coordinates) and a Julian constant. Should you be using a different calendar, don’t run the function, just look out the window.

The function is written based on a generalized equation for Julian day numbers and months. Another approach would be to calculate the RMSE of the predicted values against the realization of lunar behavior (lunatic start time). In this case – reverse engineering – you would use the approximate date/time for the first new moon after that date if the synodic period were constant. The number thus obtained is only empirically proven, by recursively solving for the new “possible date/time” of lunar behavior and calculating the prediction error. Minimizing the RMSE of the difference between the full moon dates/times predicted by the formula and the actual full moon dates/times over the next 10 years, you get something like this.
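To make the idea of a constant synodic period concrete, here is a minimal Python sketch. It is not Tomaz's R function: it simply counts days from a reference new moon and takes the remainder modulo the mean synodic month. The reference epoch (6 January 2000, 18:14 UTC), the function names, and the one-day tolerance are assumptions chosen for illustration, and the result can be off by several hours because real lunations vary in length.

```python
from datetime import date, datetime, timezone

# Mean length of one lunation in days (the real value varies month to month).
SYNODIC_MONTH = 29.530588853
# Assumed reference new moon used as the starting point of the count.
REFERENCE_NEW_MOON = datetime(2000, 1, 6, 18, 14, tzinfo=timezone.utc)


def moon_age(d: date) -> float:
    """Approximate days since the last new moon, evaluated at 12:00 UTC."""
    noon = datetime(d.year, d.month, d.day, 12, tzinfo=timezone.utc)
    elapsed_days = (noon - REFERENCE_NEW_MOON).total_seconds() / 86400
    return elapsed_days % SYNODIC_MONTH


def is_full_moon(d: date, tolerance_days: float = 1.0) -> bool:
    """True when the approximate moon age is within tolerance of half a lunation."""
    return abs(moon_age(d) - SYNODIC_MONTH / 2) <= tolerance_days
```

Calling `is_full_moon(date(2020, 10, 31))` checks the Halloween 2020 full moon; widening `tolerance_days` trades precision for fewer missed dates.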

Click through for the function as well as sound advice if it’s not a full moon.

Preparing for the Kafka-Zookeeper Breakup

Yeva Byzek prepares us:

As described in the blog post Apache Kafka® Needs No Keeper: Removing the Apache ZooKeeper Dependency, when KIP-500 lands next year, Apache Kafka will replace its usage of Apache ZooKeeper with its own built-in consensus layer. This means that you’ll be able to remove ZooKeeper from your Apache Kafka deployments so that the only thing you need to run Kafka is…Kafka itself. Kafka’s new architecture provides three distinct benefits. First, it simplifies the architecture by consolidating metadata in Kafka itself, rather than splitting it between Kafka and ZooKeeper. This improves stability, simplifies the software, and makes it easier to monitor, administer, and support Kafka. Second, it improves control plane performance, enabling clusters to scale to millions of partitions. Finally, it allows Kafka to have a single security model for the whole system, rather than having one for Kafka and one for ZooKeeper. Together, these three benefits greatly simplify overall infrastructure design and operational workflows.

Read on to see where this story stands and what kinds of changes you’ll have to make to your code.

Purging the SSIS Catalog

Peter Schott extends Tim Mitchell’s work:

I wrote about cleaning up the SSIS Catalog a while back, but needed to revisit this recently for a different use and needed something that can run in an ongoing manner. My earlier post still works, but I recently adapted some code from Tim Mitchell to create a stored procedure that can do that cleanup. Tim wrote the majority of this. I adapted it to wrap it in a stored procedure to handle varying batch sizes for the deletes as well as to ensure all of the tables are deleted in smaller sets of rows.

Here is the code to create the stored procedure. If your SSIS Catalog is not named SSISDB, adjust accordingly.

Click through for the script. It’s interesting to note how frequently the cleanup processes for functionality in SQL Server are inadequate for the task at scale. I’ve regularly seen people write these sorts of things for SSISDB, Query Store, replication, ML Services (though that, at least, was changed), etc.
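The general pattern behind deleting in smaller sets of rows can be sketched in a few lines. The example below is only an illustration, not Peter's or Tim's script: it uses Python's built-in `sqlite3` as a stand-in for SQL Server, and the table name `operation_log`, the `created_at` column, and both function names are hypothetical. Each loop iteration deletes at most `batch_size` rows and commits, so no single transaction grows very large.

```python
import sqlite3


def make_demo_db(old_rows: int, new_rows: int) -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE operation_log (id INTEGER PRIMARY KEY, created_at TEXT)")
    conn.executemany(
        "INSERT INTO operation_log (created_at) VALUES (?)",
        [("2019-01-01",)] * old_rows + [("2020-10-01",)] * new_rows,
    )
    conn.commit()
    return conn


def purge_in_batches(conn, table: str, cutoff: str, batch_size: int = 1000) -> int:
    """Delete rows older than cutoff a batch at a time; return rows deleted.

    The table name is interpolated into the SQL, so it must come from
    trusted code, never from user input.
    """
    total = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN ("
            f"SELECT rowid FROM {table} WHERE created_at < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # commit each batch to keep transactions short
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

A smaller `batch_size` means more round trips but shorter locks and less log growth per transaction, which is the trade-off a batch-size parameter lets you tune.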

SSAS Performance Counters to Monitor

Steven Wright takes us through a set of SQL Server Analysis Services performance counters we should track:

That said, I wanted to provide some updates with more of a focus on Tabular mode, as it has clearly become the new standard for how most organizations deploy SSAS. I recommend referring to that original series for a focus on Multidimensional mode, but much of the information provided in this blog post will be applicable to both modes. I recommend referencing Allen White’s blog post on SQL Server counters to learn more about many of the Windows-level counters that apply across the board, as I won’t speak to them here.

Let’s dive into the 15 SSAS performance counters you should be monitoring.

Click through for the set, as well as explanations for why.

Recommendations for Hosting SQL Server on VMware

Michelle Gutzait walks us through recommendations on hosting SQL Server in Windows on VMware:

VMware has created a very detailed best-practice document for us, specifically for SQL Server. You may find the latest one here.

In case the link doesn’t work for you, or you have a different version of VMware, you can search for the proper SQL Server best practices on the VMware site.

Here are the main best practices VMware recommends, and the most important based on Pythian’s experience (SQL Server on Windows):

Click through for a detailed checklist.

Installing an Always On Availability Group in SQL Server 2019

Ginger Daniel takes us through the process of setting up an Always On Availability Group in SQL Server 2019:

With SQL Server 2012, Microsoft introduced the AlwaysOn Availability Group feature, and since then many changes and improvements have been made. This article is an update to our previous article, https://www.sqlrx.com/steps-for-installing-sql-server-alwayson-availability-groups/, and will cover the prerequisites and steps for installing AlwaysOn in your SQL Server 2019 environment.

Click through for a checklist of prerequisites, followed by the installation and configuration steps.

Understanding Digital Twins

Kurt Cagle explains a concept:

At the core of such digital twins is the notion of a model. A model, in the most basic definition of the word, is a proxy for a thing or process. A runway model, for instance, is a person who is intended to be a proxy for the viewer, showing off how a given garment looks. An artist’s model is a stand-in or proxy for the image, scene, or illustration that an artist is producing. An architectural model is a simulation of how a given building will look when constructed, and with 3D rendering technology, such models can appear quite life-like. Additionally, though, the models can also simulate more than appearance – they can simulate structural integrity, strain analysis, and even chemistry interactions. We create models of stars, black holes, and neutron stars based upon our understanding of physics, and models of disease spread in the case of epidemics.

This is a really good explanation of the concept. Contrast it with the explanation of, say, Azure Digital Twins. The first time I saw it, I thought one thing; then, when I read the intro page, I thought something different. Then I walked through the demo and thought something different yet again. I might have just missed the part where it lays out exactly what a digital twin is and why it matters, but I do like Kurt’s explanation a lot more.

Combining Two Survey Questions into a Graph

Stephanie Evergreen solves a challenge:

You’ve asked employees to rate a bunch of different aspects of their job. You want to know if they think that aspect is important AND how satisfied they are with that aspect of their job. So, naturally, you make two individual questions with response options like Not at all Important to Very Important and Not at all Satisfied to Very Satisfied. I would probably do the same thing.

But then you’ve got to show the data and, importantly, how those two variables – Importance and Satisfaction – relate to each other.

Click through for two methods of visualizing the results.
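Before charting anything, the two questions have to be brought into one table per job aspect. The sketch below shows one common way to do that, and it is not necessarily either of the methods in Stephanie's post: it averages Importance and Satisfaction for each aspect and sorts by the gap between them. The 1-to-5 scale, the function name `summarize`, and the sample aspects are all assumptions for illustration.

```python
from statistics import mean


def summarize(responses):
    """Average importance and satisfaction per aspect, largest gap first.

    responses: iterable of (aspect, importance, satisfaction) tuples,
    assumed here to use a 1-5 scale for both ratings.
    """
    by_aspect = {}
    for aspect, importance, satisfaction in responses:
        imp, sat = by_aspect.setdefault(aspect, ([], []))
        imp.append(importance)
        sat.append(satisfaction)

    rows = []
    for aspect, (imp, sat) in by_aspect.items():
        mean_imp, mean_sat = mean(imp), mean(sat)
        rows.append({
            "aspect": aspect,
            "importance": mean_imp,
            "satisfaction": mean_sat,
            "gap": mean_imp - mean_sat,  # positive: important but unsatisfying
        })
    return sorted(rows, key=lambda r: r["gap"], reverse=True)


# Hypothetical sample data: (aspect, importance, satisfaction)
sample = [("Pay", 5, 2), ("Pay", 4, 3), ("Commute", 2, 4)]
summary = summarize(sample)
```

Each row of `summary` then supplies one point for a scatter plot or one pair of endpoints for a dumbbell-style chart.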
