Press "Enter" to skip to content

Author: Kevin Feasel

Introduction To ElasticSearch

Hasan Rahhal gives a good introduction to ElasticSearch:

At the top level, hits.total is the total number of docs returned by an empty search query, and max_score is the maximum score a document can take in a specific query. In our case it’s one, since no query was specified.

In _shards.total, the value is the number of Lucene indexes (shards) that Elasticsearch created for that index. The default is 5 unless we specify otherwise at index creation time. More details about shards are explained here.

ElasticSearch is designed to store things like logs and monitoring metrics, and the interface is JSON. This makes it very useful for certain tasks and infuriatingly difficult for others (like advanced aggregations). Still, in a medium-sized or larger environment, this is probably a technology you are either using today or want to use.
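To make the response shape concrete, here’s a sketch of an empty search and the kind of response it returns (the index name and hit count are invented for illustration):

GET /my_index/_search

{
  "took": 2,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1234,
    "max_score": 1.0,
    "hits": [ ... ]
  }
}

Here _shards.total is 5 because the index was created with the default shard count, and max_score is 1.0 because no scoring query was specified.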


Minion CheckDB Beta

Sean McCown has released a beta of Minion CheckDB:

We’ve had many of you asking to be part of the Minion CheckDB beta and now is the time. We’re putting the finishing touches on the 1st beta and it’s looking great with some fabulous features.
So this is the open call for beta users. If you’d like to meet Codex before anyone else, send me an email.
We have some requirements, though. We don’t want dead beta users. This is your chance to shape the product, and we want to hear from you. If you’re serious about putting the product through its paces, then we definitely want you, and you should be ready to provide real feedback, report bugs as you find them, and work with us to fix them.

Considering that I’ve bothered Sean about this at every SQL Saturday the two of us have been at this year, I’d better get moving and join the beta.


Building Blocks Of Cortana Intelligence Suite

Melissa Coates has put together a new presentation on the building blocks of the Cortana Intelligence Suite:

Each section will wrap up with an example of the ‘building blocks’ to formulate a solution. Although these ‘building blocks’ examples are greatly simplified, my hope is that they will generate ideas for how the different Azure components can fit together for formulating hybrid solutions.

Check it out, as there are a lot of pieces.


Cortana Intelligence Suite

Buck Woody discusses various components of the Cortana Intelligence Suite:

It’s not a simple matter of “choose one from column B and two from column A” – you have to learn the processes, and then the tools, and then think about your situation. In other words, some things are complicated because they are…complicated. However:

There are some things you can consider out of the box. So I spoke with my friend Romit Girdhar while we were co-teaching in London last week, and he put together a great visualization. You can see them here, and download the PDF below. Thanks, Romit!

And of course they had to change the name—it wouldn’t be a Microsoft product if the name didn’t change every six months…


Forcing Query Store Plans

Grant Fritchey wonders: if you force a plan using Query Store but the plan ages out of cache, do you still use the forced plan?

To start with, a small stored procedure that I use all the time for bad parameter sniffing demos that reliably gets different plans with different values due to statistics skew:

CREATE PROC dbo.spAddressByCity @City NVARCHAR(30)
AS
SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;

If this procedure is called for a value of ‘London’ it gets a plan with a Merge Join. For most other values it gets a plan with a Loops Join.

It’s a good question with a good answer.
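For reference, here’s roughly what the setup for that question looks like in practice: find the query and its plans in Query Store, then force the one you want. This is a sketch; the ID values are placeholders rather than anything from Grant’s demo.

-- Locate the procedure's query and plans in Query Store
SELECT qsq.query_id,
       qsp.plan_id,
       qsq.last_execution_time
FROM sys.query_store_query AS qsq
JOIN sys.query_store_plan AS qsp
    ON qsq.query_id = qsp.query_id
WHERE qsq.object_id = OBJECT_ID('dbo.spAddressByCity');

-- Force the chosen plan (placeholder IDs)
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;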


Threat Modeling

Michael Howard discusses threat modeling in Azure:

Many conversations I have with customers go like this:

Customer: “We cannot deploy on Azure until we know that appropriate defenses are in place.”

Me: “I agree 100%, so let’s build a threat model for the proposed design and see what you need to do, and what Microsoft provides.”

A couple of days pass as we build and iterate on the threat model.

Now here’s when the customer has an “a-ha” moment. At the end of the process we have a list of defenses for each part of the architecture and we all agree that the defenses are correct and appropriate.

It’s at that point the customer realizes that they can deploy a cloud-based solution securely.

My tongue-in-cheek response: of course a customer can deploy a cloud-based solution securely if they have Michael Howard walking them through it. Michael does include some links to Azure security configuration and threat modeling resources, so check those out.


Default Query Store Settings

Erin Stellato talks about default query store settings:

For each database where I enable Query Store, I’d consider the workload and then look at the settings. I tend to think that the default value of 100MB for MAX_STORAGE_SIZE_MB is really low, and I would be inclined to bump up STALE_QUERY_THRESHOLD_DAYS from 30 to something a bit higher. I’d also probably drop DATA_FLUSH_INTERVAL_SECONDS to something lower than 900 seconds (15 minutes) if my storage can support it. This setting determines how often Query Store data is flushed to disk. If it’s every 15 minutes, then I could potentially lose 15 minutes of Query Store data if my server happened to crash before it could be written to disk. I’d also think about changing INTERVAL_LENGTH_MINUTES to a value smaller than 60 if I wanted to aggregate my query data over a smaller amount of time. Sometimes interesting events happen within a 60-minute time frame, and they can get lost when data is aggregated across that window. However, aggregating more frequently means I’m adding processing overhead to the system – there’s a trade-off there to figure out.

In our environment at least, 100 MB of query store data would last, oh, a couple hours?  Definitely tweak your settings and keep an eye on them early on.
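As a point of reference, all of those settings live in one ALTER DATABASE statement. The values below are illustrative, not recommendations:

ALTER DATABASE [YourDatabase]
SET QUERY_STORE
(
    OPERATION_MODE = READ_WRITE,
    MAX_STORAGE_SIZE_MB = 2048,                          -- default is 100
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 60),  -- default is 30
    DATA_FLUSH_INTERVAL_SECONDS = 300,                   -- default is 900 (15 minutes)
    INTERVAL_LENGTH_MINUTES = 15                         -- default is 60
);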


DBA Scripts Update

Rob Sewell has updated his DBA-Database project:

By making use of the dbo.InstanceList in my DBA database I am able to target instances, by SQL Version, OS Version, Environment, Data Centre, System, Client or any other variable I choose. An agent job that runs every night will automatically pick up the instances and the scripts that are marked as needing installing. This is great when people release updates to the above scripts allowing you to target the development environment and test before they get put onto live.

I talked to a lot of people in Hannover and they all suggested that I place the scripts onto GitHub, and after some how-to instructions from a few people (thank you, Luke) I spent the weekend updating and cleaning up the code. You can now find it on GitHub here.

Check out his solution, especially if you do not already have an administrative database on your instances.
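To make the targeting idea concrete, here’s a hypothetical query against an instance-list table. The column names are guesses for illustration; see Rob’s GitHub repository for the actual dbo.InstanceList schema.

-- Hypothetical: find active development instances running SQL Server 2012
SELECT il.ServerName,
       il.InstanceName
FROM dbo.InstanceList AS il
WHERE il.Environment = 'Development'
      AND il.SQLVersion = '2012';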


Reporting Services Accessibility

Andrew Notarian notes that Reporting Services reports still aren’t Section 508 compliant, even in 2016:

Last weekend I spent some time creating very basic tabular reports in SSRS 2016 to see how they handle accessibility, WCAG, and Section 508 issues. It looks like things are largely unchanged since the first SQL 2008 Service Pack, which introduced the AccessibleTablix property to the various render options (see this prior post). Now you will want to add the AccessibleTablix item to your HTML4, HTML5, and MHTML renderers. You will get tags linking the detail cells to the header, but you will still be lacking a few of the items needed to pass a WCAG audit with flying colors (e.g., TH tags in the header row).

That’s a shame.
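For anyone who hasn’t configured this before, enabling the property means adding a DeviceInfo element to each relevant rendering extension in rsreportserver.config. Here’s a rough sketch of the HTML4.0 entry (repeat the idea for the HTML5 and MHTML renderers):

<Extension Name="HTML4.0" Type="Microsoft.ReportingServices.Rendering.HtmlRenderer.Html40RenderingExtension,Microsoft.ReportingServices.HtmlRendering">
  <Configuration>
    <DeviceInfo>
      <AccessibleTablix>true</AccessibleTablix>
    </DeviceInfo>
  </Configuration>
</Extension>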


Exploring Spark

Adnan Masood has photos of slides from a Spark-related meetup:

Apache Spark is a general-purpose cluster computing platform which extends map-reduce to support multiple computation types, including but not limited to stream processing and interactive queries. Last week IBM’s Moktar Kandil presented at the Tampa Hadoop and Tampa Data Science Group joint meetup on the topic of exploring Apache Spark.

Apache Spark for Azure HD-Insight

Following are some of the slides discussed in the meetup. To play with the ALS Recommendation engine notebook, please register at www.datascientistworkbench.com, a free notebook platform for Apache Spark intended for educational purposes.

Check out the links.
