Cardinality aggregation is used to count distinct values in a data set. For example, if you want to know the number of IPs used in your system, you can run this aggregation on an IP field and read back the distinct count.
Despite its usefulness, cardinality can also be a touchy Elasticsearch feature to use. Performing a unique count on a field with a multitude of possible values when configuring a visualization, for example, can bring Elasticsearch to a halt.
Most of it comes down to writing good queries. But if you don’t know what good Elasticsearch queries look like, read on.
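As a minimal sketch of what such a query looks like (the index pattern and the `clientip` field name are assumptions, not from the original post), here is a cardinality aggregation body; `precision_threshold` is the main knob for trading accuracy against memory on high-cardinality fields:

```python
# Request body for an Elasticsearch cardinality aggregation.
# "clientip" is a hypothetical field name; adjust it to your mapping.
query = {
    "size": 0,  # we only want the aggregation result, not the hits
    "aggs": {
        "distinct_ips": {
            "cardinality": {
                "field": "clientip",
                # Approximate-counting threshold: lower values use less
                # memory at the cost of accuracy on fields with many values.
                "precision_threshold": 1000,
            }
        }
    },
}

# With the official Python client this would be sent roughly like so
# (sketch only; requires a running cluster):
# from elasticsearch import Elasticsearch
# es = Elasticsearch()
# resp = es.search(index="logs-*", body=query)
# print(resp["aggregations"]["distinct_ips"]["value"])
```

Keeping `size` at 0 and capping `precision_threshold` are two of the cheapest ways to stop a distinct count from hammering the cluster.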
SQL Server stores a create date and a change date for each object in the sys.objects system view.
Unfortunately, while tables, views and even constraints are objects, indexes are not. Or at least they aren't stored in the sys.objects system view. And the sys.indexes system view doesn't have any dates associated with it. So how do we get the create/update date on an index? Well, the short answer is that you don't. The long answer is that in some cases you can get some of that information.
These aren’t ideal answers, but they can be better than nothing.
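To see what sys.objects does give you, here is a hedged sketch that pulls the create and modify dates for user objects; the connection details are placeholders, and, as noted above, indexes will not appear in these results:

```python
# Sketch: read create/modify dates from SQL Server's sys.objects view.
# name, type_desc, create_date, modify_date and is_ms_shipped are real
# columns of sys.objects; the connection string below is an assumption.
OBJECT_DATES_SQL = """
SELECT name, type_desc, create_date, modify_date
FROM sys.objects
WHERE is_ms_shipped = 0
ORDER BY modify_date DESC;
"""

def fetch_object_dates(cursor):
    """Run the query and return (name, type_desc, create_date, modify_date) rows."""
    cursor.execute(OBJECT_DATES_SQL)
    return cursor.fetchall()

# Usage sketch (requires a live SQL Server; not run here):
# import pyodbc
# conn = pyodbc.connect(
#     "DRIVER={ODBC Driver 17 for SQL Server};SERVER=.;DATABASE=mydb;"
#     "Trusted_Connection=yes;")
# for row in fetch_object_dates(conn.cursor()):
#     print(row)
```

Note that `modify_date` on a table does change when an index on that table is created or rebuilt, which is one of the partial workarounds the linked post explores.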
With the Scroller, the values move from left to right on an animated scroll bar.
The settings allow you to customize the speed, size, status indicator and colors used.
If you’d like to see the status indicator then you must have a deviation value.
Maybe it’s just me, but I’m not a fan of tickers. They’re too visually noisy for my taste.
What I do want to bring to your attention is the difference between the two when it comes to redirecting error rows, specifically rows that are truncated. One of the beauties of SSIS is the ability to output rows that fail to import through the error pipeline and push them into an error table, for example. With fast load there is a downside to this: the whole batch will be output even if only one row fails. There are ways to handle this, and a tried and tested method is to push those rows into another OLE DB Destination, where you can run them in progressively smaller batches, or simply run that batch row by row, to eventually output the one error row you want. Take a look at Marco Schreuder’s blog for how this can be done.
One of the issues we have experienced in the past is that any truncation of a column’s data in fast load will not force the package to fail. What? So a package can succeed when in fact the data itself could potentially not be complete!?! Yes, this is certainly the case; let’s take a quick look with an example.
Read on for details and potential workarounds.
This caused some friction, and so we are happy to announce that in the latest (October) release of SSMS, these attributes are now readily available in the Properties window of an Actual Execution Plan, in the scope of each operator.
Here’s an example with the same query as the March post, where we can see the Actual Number of Rows returned by each of the two threads executing the index seek.
Click through for more details.
What I’m trying to do should be pretty easy. I want to use R code inside a stored procedure to read all the log files created by my weather program and store the results in a database. I also want to read the current monthly file on a regular basis, at least once a day. Once the data is in the database I’ll create some mobile reports with data and charts I can read on my phone. I’ll also be able to use my own data to play with local weather predicting. I thought it would make a pretty cool demo for my R sessions.
However, when I run my stored procedure to read the logs, I receive an error that there is no such file if I map the share as a drive, or that I’m using an invalid parameter if I try to access the share directly.
Read on for code and the specific error message.
These two filtered indexes are very different – and the SQL Server optimizer can use them very differently!
While classic filtered nonclustered rowstore indexes must reliably “cover” parts of the query to be usable by the optimizer, filtered nonclustered columnstore indexes may be combined with other indexes to produce a plan returning a larger range of data.
This sounds a little weird. I’ll show you what I mean using the WideWorldImporters database.
Kendra exposes an interesting difference in the two types of index and a case where filtered indexes simply fail (though that’s not a situation you want to be in anyhow!).
There are several settings you can adjust. Basically, there are two main files in ZEPPELIN_DIR\conf:
In the first one you can configure some interpreter settings. In the second, you configure aspects of the website itself, such as the Zeppelin server port (I am using 8080, but most probably that port is already used by another application on your machine).
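As an illustration, changing the server port in Zeppelin's site configuration file (zeppelin-site.xml) looks like the fragment below; 8180 is an arbitrary free port chosen for the example:

```xml
<!-- zeppelin-site.xml: override the default port (8080) if it is
     already taken; 8180 here is just an example value -->
<property>
  <name>zeppelin.server.port</name>
  <value>8180</value>
</property>
```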
This is a very clear walkthrough. Jupyter is still easier to install, but Paul’s blog post lowers that Zeppelin installation learning curve.
I was perusing the release notes for SQL Server 2014 SP2 and found this gem:
Ohhhhhh, very cool. This is great information for troubleshooting!
Read the whole thing.
This is a very interesting set of results. First, 7Zip archived files do not work with the default encoding. I’m not particularly surprised by this result, as 7Zip support is relatively scarce across the board and it’s a niche file format (though a very efficient format).
The next failure case is tar. Tar is a weird case because it missed the first row in the file but was able to collect the remaining 776 records. Same goes for .tar.gz. I unpackaged the .tar file and the constituent SecondBasemen.csv file did in fact have all 777 records, so it’s something weird about the codec.
Stick to BZip2 and GZip if you’re using flat files.