
Author: Kevin Feasel

Automatically Restarting Telegraf On Windows

Tracy Boggiano has a quick PowerShell script to try starting Telegraf until it succeeds:

I’ve noticed on demo machines that sometimes Telegraf doesn’t start on the first try. This seems not to happen on most of my production servers, but they have a lot more memory and CPU power. So I figured I would write a quick blog post and provide a way to get the service to start when the machine is rebooted. This is a known issue, and a user has offered a bounty to get it fixed, so if you know some Go and have time, please check out the issue on GitHub.

Click through for the script.
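Tracy’s actual script is in the post; purely as a sketch of the retry idea (assuming the service is named telegraf), it amounts to a loop like this:

# Keep trying to start the Telegraf service until it reports Running.
# The service name here is an assumption; adjust it to your install.
$service = Get-Service -Name 'telegraf'
while ($service.Status -ne 'Running') {
    Start-Service -Name 'telegraf' -ErrorAction SilentlyContinue
    Start-Sleep -Seconds 5
    $service.Refresh()
}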


Execution Plans And GDPR

Grant Fritchey isn’t crazy when it comes to execution plans:

Now, when you save an execution plan out to a file, you’re potentially transmitting PI data. It goes further. When you hard code values, PI is not just in the query. Those PI values can also be stored throughout the plan in various properties.

So now you see what I mean when I say that the GDPR affects how we deal with execution plans. I’m not done yet.

Unfortunately, questions like the one Grant raises here won’t be answered until we see a few test cases in the European courts.


Calling Azure Cognitive Services From SSIS

Rolf Tesmer shows off how easy it is to call Azure Cognitive Services from SQL Server Integration Services:

My SQL SSIS package leverages the Translator Text API service.  For those who want to learn the secret sauce, I suggest checking here – https://azure.microsoft.com/en-us/services/cognitive-services/translator-text-api/

Essentially, this API is pretty simple:

  1. It accepts source text, source language, and target language.  (The API can translate to/from over 60 different languages.)

  2. You call the API with your request parameters plus your API key.

  3. The API responds with the translation of the source text you sent in.

  4. So simple, so fast, so effective!

Click through for the full post.  It really is simple.
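As an illustration of that call pattern outside of SSIS (this uses the current v3.0 REST endpoint; the key and languages are placeholders, not anything from Rolf’s package), a request looks something like this in Python:

import requests

# Placeholder subscription key and languages; swap in your own values.
API_KEY = "your-api-key"
endpoint = "https://api.cognitive.microsofttranslator.com/translate"
params = {"api-version": "3.0", "from": "en", "to": "de"}
headers = {"Ocp-Apim-Subscription-Key": API_KEY, "Content-Type": "application/json"}
body = [{"Text": "Hello, world"}]

response = requests.post(endpoint, params=params, headers=headers, json=body)
# The response contains the translation of the source text.
print(response.json()[0]["translations"][0]["text"])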


Anticipating Disk Growth

Adrian Buckman has a script which gives you an idea of what would happen if your databases all grew by some factor overnight:

The other day I got thinking about what would happen if all databases on a single instance grew out, every single one of them! But not just once: what if they all grew out three, four, or five times overnight – what would things look like?

Well, I know the likelihood may be slim, but wouldn’t it be nice just to see how many times things could grow before it all runs out of space?

I decided for a bit of fun I would write a query to see what the drive space would look like; this would simulate database growth and then show what drive space would be left after the total growths specified.

It’s a good idea to anticipate this kind of activity, though based on the companies I’ve worked for in the past, the answer would be “run out of disk really fast.”
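Adrian’s query is the real deal; as a much-simplified sketch of the idea (project each file forward a number of growth increments and compare the total against per-volume free space from sys.dm_os_volume_stats), it looks something like this:

-- Simulate @Growths growth events per file and compare against free space.
-- This is an illustrative approximation, not Adrian's script.
DECLARE @Growths int = 3;

SELECT vs.volume_mount_point,
       MIN(vs.available_bytes) / 1048576.0 AS free_space_mb,
       SUM(mf.size / 128.0) AS current_size_mb,
       SUM(CASE WHEN mf.is_percent_growth = 1
                THEN (mf.size / 128.0) * (POWER(1 + mf.growth / 100.0, @Growths) - 1)
                ELSE (mf.growth / 128.0) * @Growths
           END) AS projected_growth_mb
FROM sys.master_files AS mf
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs
GROUP BY vs.volume_mount_point;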


VS Code And Splatting PowerShell

Rob Sewell shows off how easy it is to splat PowerShell parameters with Visual Studio Code:

Well, you will know that when you call a PowerShell function, you can use IntelliSense to get the parameters, and sometimes the parameter values as well. This can leave you with a command that looks like this on the screen

It goes on and on and on, and while it is easy to type once, it is not so easy to see which values have been chosen. It is also not so easy to change the values.
Splatting the parameters makes the command much easier to read and also to alter.

To learn more about splatting, there’s a whole section in PowerShell help on the topic: run Get-Help about_Splatting.
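As a minimal illustration (the cmdlet and values here are placeholders, not Rob’s example): collect the parameters in a hashtable and pass it with the @ sigil rather than one long line.

# Build the parameters as a hashtable, then splat it into the cmdlet.
$params = @{
    Name         = 'telegraf'
    ComputerName = 'SQL01'
    ErrorAction  = 'Stop'
}
Get-Service @params

Each parameter sits on its own line, so it is easy to see, reorder, or change a value.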


Apache Spark 2.3

The Databricks team has been busy.  They’ve recently announced Apache Spark 2.3 on Databricks:

Continuing with the objectives to make Spark faster, easier, and smarter, Spark 2.3 marks a major milestone for Structured Streaming by introducing low-latency continuous processing and stream-to-stream joins; boosts PySpark by improving performance with pandas UDFs; and runs on Kubernetes clusters by providing native support for Apache Spark applications.

In addition to extending new functionality to SparkR, Python, MLlib, and GraphX, the release focuses on usability, stability, and refinement, resolving over 1400 tickets.

Anirudh Ramanathan and Palak Bathia also get into Kubernetes support in Spark 2.3:

Starting with Spark 2.3, users can run Spark workloads in an existing Kubernetes 1.7+ cluster and take advantage of Apache Spark’s ability to manage distributed data processing tasks. Apache Spark workloads can make direct use of Kubernetes clusters for multi-tenancy and sharing through Namespaces and Quotas, as well as administrative features such as Pluggable Authorization and Logging. Best of all, it requires no changes or new installations on your Kubernetes cluster; simply create a container image and set up the right RBAC roles for your Spark Application and you’re all set.

Concretely, a native Spark Application in Kubernetes acts as a custom controller, which creates Kubernetes resources in response to requests made by the Spark scheduler. In contrast with deploying Apache Spark in Standalone Mode in Kubernetes, the native approach offers fine-grained management of Spark Applications, improved elasticity, and seamless integration with logging and monitoring solutions. The community is also exploring advanced use cases such as managing streaming workloads and leveraging service meshes like Istio.
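For reference, native submission targets the cluster via the k8s:// master scheme. A sketch along the lines of the 2.3 documentation, where the API server address and image registry are placeholders for your environment:

# Submit the bundled SparkPi example to a Kubernetes cluster.
# <k8s-apiserver> and <registry> are placeholders.
bin/spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<registry>/spark:2.3.0 \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar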

Stream-to-stream joins look particularly interesting.


For Loops And R

John Mount has a couple of tips around using for loops in R.  First up, pre-allocate lists to make certain types of iterative processing faster:

Another R tip. Use vector(mode = "list") to pre-allocate lists.

result <- vector(mode = "list", 3)
print(result)
#> [[1]]
#> NULL
#>
#> [[2]]
#> NULL
#>
#> [[3]]
#> NULL

The above used to be critical for writing performant R code (R seems to have greatly improved incremental list growth over the years). It remains a convenient thing to know.

Also, use loop indices when iterating through for loops:

Below is an R annoyance that occurs again and again: vectors lose class attributes when you iterate over them in a for()-loop.

d <- c(Sys.time(), Sys.time())
print(d)
#> [1] "2018-02-18 10:16:16 PST" "2018-02-18 10:16:16 PST"
for (di in d) {
  print(di)
}
#> [1] 1518977777
#> [1] 1518977777

Notice we printed numbers, not dates/times.
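Following John’s tip, iterating over indices sidesteps the problem; a quick sketch:

# Indexing into the vector preserves the class attribute.
for (i in seq_along(d)) {
  print(d[[i]])
}
#> [1] "2018-02-18 10:16:16 PST"
#> [1] "2018-02-18 10:16:16 PST"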

Very useful information.


Using Python In SQL Server 2017

Emma Stewart has a post covering setup and configuration of SQL Server 2017 Machine Learning Services and using Python within SQL Server:

One of the new features of SQL Server 2017 was the ability to execute Python Scripts within SQL Server. For anyone who hasn’t heard of Python, it is the language of choice for data analysis. It has a lot of libraries for data analysis and predictive modelling, offers power and flexibility for various machine learning tasks and is also a much simpler language to learn than others.

The release of SQL Server 2016 saw the integration of the database engine with R Services, a data science language. By extending this support to Python, Microsoft have renamed R Services to ‘Machine Learning Services’ to include both R and Python.

The benefits of being able to run Python from SQL Server are that you can keep analytics close to the data (if your data is held within a SQL Server database) and reduce any unnecessary data movement. In a production environment you can simply execute your Python solution via a T-SQL Stored Procedure and you can also deploy the solution using the familiar development tool, Visual Studio.
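Once Machine Learning Services is installed and the instance is configured, a minimal round trip looks something like this (the input query is just an illustration):

-- Requires: EXEC sp_configure 'external scripts enabled', 1; RECONFIGURE;
-- Passes a result set into Python and returns it unchanged.
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'OutputDataSet = InputDataSet',
    @input_data_1 = N'SELECT name, database_id FROM sys.databases'
WITH RESULT SETS ((name sysname, database_id int));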

ML Services is a great addition to SQL Server.


Query Store UserVoice Requests

Erin Stellato has a compendium of Query Store UserVoice requests:

In early January Microsoft announced that Connect, the method for filing SQL Server bugs and feature requests, was being retired.  It was replaced by UserVoice, and any bugs/requests were ported over.  Sadly, the votes from Connect did not come across to UserVoice, so I went through and found all the Query Store requests, which are listed below.  If you could please take the time to up-vote them, that would be fantastic.  If you could also take time to write about why this would help your business, help you upgrade, or purchase more SQL Server licenses, that is even better.  It helps the product team immensely to understand how this feature/fix/functionality helps you and your company, so taking 5 minutes to write about that is important.

Check them out and upvote any which look interesting.


Accessibility And Power BI Reports

Meagan Longoria has some tips to make your Power BI reports easier for people to read:

Avoid using color as the only means of conveying information. Add text cues where possible. It’s very common to show KPIs with a background color or a box next to a metric that uses red/yellow/green to indicate status. Users who have difficulties seeing color need another way to understand the status of a key metric. This could mean that you use a text icon in addition to or instead of color to indicate a status. Power BI reports often include conditional formatting to change the background color or font color of items in a table to convey high/low or acceptable/unacceptable values. If that is important for your users to understand, you could add a field containing the values “high” and “low” to the table itself or to the tooltips. Tooltips are accessible to screen readers via the accessible Show Data table (Alt + Shift + F11).

These are good design principles in addition to providing accessibility benefits.
