Press "Enter" to skip to content

Month: November 2016

Get-DbaTcpPort

Steve Jones looks at one Powershell function inside dbatools:

I like using PoSh for some tasks, especially when I don’t have an easy way to do something in SSMS or want to run a task across a variety of instances. In this case, as I glanced through the September updates, I found a good one.

Get-DbaTcpPort

I don’t love the mixed naming, and I’ll get used to it, but I do love the autocomplete in PoSh.

Steve has lots of screenshots walking you through this function.

Comments closed

Tar And Polybase

I look at what the deal is with Polybase and Tar files:

The select statement returned 3104 records, exactly 4 shy of the 3108 I would have expected (777 * 4 = 3108).  In each case, the missing row was the first, meaning when I search for LastName = ‘Turgeon’ (the first player in my data set), I get zero rows.  When I search for another second basemen in the set, I get back four rows, exactly as I would have expected.

What’s really interesting is the result I get back from Wireshark when I run a query without pushdown:  it does actually return the row for Casey Turgeon.

This isn’t an ideal scenario, but it did seem to be consistent in my limited testing.

Comments closed

Ways To Crash Elasticsearch

Roi Ravhon shows how to take down an Elasticsearch instance:

Cardinality aggregation is used to count distinct values in a data set. For example, if you want to know the number of IPs used in your system, you can use this aggregation on an IP field and then count the results.

Despite the usefulness, cardinality can also be a touchy Elasticsearch feature to use. Performing a unique count on a field with a multitude of possible values when configuring a visualization, for example, can bring Elasticsearch to a halt.

Most of it comes down to writing good queries.  But if you don’t know what good Elasticsearch queries look like, read on.

Comments closed

Index Create Dates

Kenneth Fisher looks to see when his indexes were created (or at least updated):

SQL Server stores a create date and a change date for each object in the sys.objects system view.

Unfortunately while tables, views and even constraints are objects, indexes are not. Or at least they aren’t stored in the sys.objects system view. And the sys.indexes system view doesn’t have any dates associated with it. So how do we get the create/update date on an index? Well, short answer is you don’t. Long answer is that in some cases you can get some information.

These aren’t ideal answers, but they can be better than nothing.

Comments closed

SSIS Fast Load

Chris Taylor runs into an issue with the OLE DB Destination’s fast load option in Integration Services:

What I do want to bring to your attention is the differences between the two when it comes to redirecting error rows, specifically rows that are truncated. One of the beauties of SSIS is the ability to output rows that fail to import through the error pipeline and push them into an error table for example. With fast load there is a downside to this, the whole batch will be output even if there is only 1 row that fails, there are ways to handle this and a tried and tested method is to push those rows into another OLE DB Destination where you can run them either in smaller batches and keep getting smaller or simply push that batch to run in row-by-row to eventually output the 1 error you want. Take a look at Marco Schreuder’s blog for how this can be done.

One of the issues we have exerienced in the past is that any truncation of a column’s data in fast load will not force the package to fail. What? So a package can succeed when in fact the data itself could potentially not be complete!?! Yes this is certainly the case, lets take a quick look with an example.

Read on for details and potential workarounds.

Comments closed

Showplan XML Updates

Pedro Lopes discusses new properties which show up on execution plans:

This caused some friction, and so we are happy to announce that in the latest (October) release of SSMS, these attributes are now readily available in the Properties window of an Actual Execution Plan, in the scope of each operator.

Here’s an example with the same query as the March post, where we can see the Actual Number of Rows returned by each of the two threads executing the index seek.

Click through for more details.

Comments closed

Network Shares And SQL Server R Services

John Pertell runs into an issue reading a file on a network share using SQL Server R Services:

What I’m trying to do should be pretty easy. I want to use R code inside a stored procedure to read all the log files created by my weather program and store the results in a database. I also want to read the current monthly file on a regular basis, at least once a day. Once the data is in the database I’ll create some mobile reports with data and charts I can read on my phone. I’ll also be able to use my own data to play with local weather predicting. I thought it would make a pretty cool demo for my R sessions.

However when I run my stored procedure to read the logs I receive an error that there is no such file if I map the share as a drive, or I’m using an invalid parameter if I try to access the share directly.

Read on for code and the specific error message.

Comments closed

Filtered Indexes

Kendra Little explains the two types of filtered indexes:

These two filtered indexes are very different – and the SQL Server optimizer can use them very differently!

While classic filtered nonclustered rowstore indexes must reliably “cover” parts of the query to be used to the optimizer, filtered nonclustered columnstore indexes may be combined with other indexes to produce a plan returning a larger range of data.

This sounds a little weird. I’ll show you what I mean using the WideWorldImporters database.

Kendra exposes an interesting difference in the two types of index and a case where filtered indexes simply fail  (though that’s not a situation you want to be in anyhow!).

Comments closed

Installing Zeppelin On Windows 10

Paul Hernandez shows how to install Apache Zeppelin on Windows 10:

There are several settings you can adjust. Basically, there are two main files in the ZEPPELIN_DIR\conf :

  • zeppelin-env
  • zeppelin-site.xml

In the first one you can configure some interpreter settings. In the second more aspects related to the Website, like for instance, the Zeppelin server port (I am using the 8080 but most probably yours is already used by another application)

This is a very clear walkthrough.  Jupyter is still easier to install, but Paul’s blog post lowers that Zeppelin installation learning curve.

Comments closed