Press "Enter" to skip to content

Author: Kevin Feasel

Breaking Out URLs With M

Chris Webb shows the RelativePath and Query parameters of Web.Contents in M:

Generates a call that returns 20 results, rather than the default 10:

https://data.gov.uk/api/3/action/package_search?q=cows&rows=20

Obviously these options make it easier to construct urls and the code is much clearer, but there are also other benefits to using these options which I’ll cover in another blog post soon.

This makes for a more maintainable, dynamic URL generation.  Think about an internal product dashboard, where you might need to make API calls to pull in data by product (or maybe you want to send people to an external link for each product).  This can help you parameterize your URLs quite easily.

Comments closed

SQLPS Is Dead; Long Live SQLPS

Mike Fal thought he had escaped his SQLPS nightmare:

The second issue is that even if you do install SSMS 2016, SQL Agent won’t recognize and give you access to the new module if you use a PowerShell job step. When you create a PowerShell job step, the script in that job step runs within a specific context. It’s hidden from you, but whenever that script runs the first thing that happens is SQL Server launchessqlps.exe.

Check out the links Mike provides to Connect items and the Trello board if you want to see the issues he brought up fixed.

Comments closed

Troubleshooting Parameter Sniffing

Brent Ozar has a guide on troubleshooting parameter sniffing:

Parameter sniffing fixes are based on your career progression with databases, and they go like this:

1. Reboot the server! – Junior folks panic and freak out, and just restart the server. Sure enough, that erases all cached execution plans. As soon as the box comes back up, they run rpt_Sales for China because that’s the one that was having problems. Because it’s called first, it gets a great plan for big data – and the junior admin believes they’ve fixed the problem.

2. Restart the SQL Server instance – Eventually, as these folks’ careers progress, they realize they can’t go rebooting Windows all the time, so they try this instead. It has the same effect.

If a reboot can’t fix the problem, I’m out of ideas…

By the way, I second Brent’s recommendation of Erland’s query plan article.  Erland doesn’t publish frequently, but when he does it’s worth the wait.

Comments closed

Learning JSON

Jason Brimhall wants to learn a bit of JSON:

Let’s just get this out there right now – I suck at JSON. I suck at XML. The idea of querying a non-normalized document to get the data is not very endearing to me. It is for that reason that I have written utilities or scripts to help generate my XML shredding scripts – as can be seen here.

Knowing that I have this allergy to features similar to XML, I need to build up some resistance to the allergy through a little learning and a little practice. Based on that, my plan is pretty simple:

  1. Read up on JSON

  2. Find some tutorials on JSON

  3. Practice using the feature

  4. Potentially do something destructive with JSON

I’m not particularly excited about JSON support in SQL Server 2016 but the fact that it is there, combined with the fact that so many developers love JSON means that it’s a good idea to learn how to integrate, if only to figure out when it’s a bad idea to parse JSON within your very expensive SQL Server instances.

Comments closed

R Or M?

Ryan Wade gives a few scenarios in which R might be a better language choice than M for Power BI integration:

When referring to what can be done in iOS, Apple often say that there is an “app” for that. Likewise, when R developers refer to what can be done in R, we often say that there is a “package” for that. For instance:

· If one needs to scrap data from the web there are packages for that (rvest, rcurl, and others)

· If one needs to make complicated transformations to their data there are packages for that (dplyr, tidyr, lubrdiate, stringr, and others)

I like the F#-ness of M, but I admit that I’m happy there’s some fairly close R integration within Power BI, as that means there’s one fewer language I need to learn right now…

Comments closed

DBCC CHECKDB’s NOINDEX Option

Kendra Little explains the NOINDEX option of DBCC CHECKDB and how it relates to PHYSICAL_ONLY:

PHYSICAL_ONLY looks at page structures and makes sure something isn’t obviously wrong. If you have the database’s page verification option set to CHECKSUM (the default since SQL Server 2005, and a very good thing), it runs a checksum on the page and compares it to a checksum recorded in the header when the page was last modified.

When you use the PHYSICAL_ONLY option, you’re telling SQL Server to skip logical checks. It won’t do things like make sure the data in a clustered and nonclustered index on a table is consistent with one another.

I like the analogy, as well as the how-to guide on messing up your database.  But try not to do that to a production database…

Comments closed

SparkSession

Jules Damji shows off SparkSession:

Beyond a time-bounded interaction, SparkSession provides a single point of entry to interact with underlying Spark functionality and allows programming Spark with DataFrame and Dataset APIs. Most importantly, it curbs the number of concepts and constructs a developer has to juggle while interacting with Spark.

In this blog and its accompanying Databricks notebook, we will explore SparkSession functionality in Spark 2.0.

This looks to be an easier method for integrating various parts of Spark in one user session.  Read the whole thing.

Comments closed

Concurrency In Hadoop Using ZooKeeper

Garima Dosi discusses an architecture using ZooKeeper to introduce some limited protections for concurrent access in HDFS:

The ZooKeeper nodes topology as per the design looks like this. ZooKeeper works like a filesystem starting with a root directory followed with several nodes (analogous to folders) and finally the data nodes (analogous to files). The circles in the image represent the name of a property/folder that we are trying to maintain and the rounded boxes are the values/files for those properties/folders.

So, the image above shows that the “global version” is 100 and there are 10 & 20 read requests being executed on versions 98 and 99 respectively and since there is a write request in progress, no other write request would be taken up until it completes.

This feels a little overly complicated to me.

Comments closed

Analytic Tool Usage

Alex Woodie notes the increased popularity of Python for data analysis:

According to the results of the 2016 survey, R is the preferred tool for 42% of analytics professionals, followed by SAS at 39% and Python at 20%. While Python’s placing may at first appear to relegate the language to Bronze Medal status, it’s the delta here that really matters.

It’s interesting to see the breakdowns of who uses which language, comparing across industry, education, work experience, and geographic lines.

Comments closed