In fact, the drillthrough/multi-select improvements (which I blogged about here) already shipped as part of SSAS 2014 and rely on improvements in Excel 2016 as much as in SSAS; similarly, the Excel 2016 query optimisations do not rely on any changes in SSAS 2016 and will benefit users of all versions of SSAS.
So what has actually changed with SSAS 2016 Multidimensional? I don’t know all the details on every change, but here’s what I know right now.
It sounds like the answer is “not much.” Tabular has been getting more love in Analysis Services.
The last method works by generating scripts; it helps you copy not only the table schema and data, but also views, functions, constraints, triggers, etc.
These are three built-in methods that require no additional tooling.
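As a quick illustration (my own, not from the article): the fastest built-in T-SQL way to copy a table’s schema and data is SELECT … INTO, though it silently drops constraints, triggers, indexes, and defaults, which is exactly why the script-generation method is the most complete option. Table names here are hypothetical:

```sql
-- Copies column definitions and data into a new table, but NOT
-- constraints, triggers, indexes, or defaults (names are hypothetical).
SELECT *
INTO dbo.SalesOrders_Copy
FROM dbo.SalesOrders;
```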
Finally, keep in mind that because the clone is a read-only, empty database, you should be able to test repeatedly without updating statistics and skewing your results. Since I wanted to see this for myself, I executed a set of updates and selects against the SQLSentryData and SQLSentryDataClone databases. Because the clone holds no data and is read-only, no rows were actually updated in SQLSentryDataClone. As a result, the statistics were updated in the SQLSentryData database but remained unchanged in SQLSentryDataClone.
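If you want to reproduce this test yourself, here’s a minimal sketch. DBCC CLONEDATABASE is the real command (available in SQL Server 2014 SP2 and later); the table name used to inspect statistics is an assumption:

```sql
-- Create a schema-and-statistics-only clone; no data is copied.
DBCC CLONEDATABASE (SQLSentryData, SQLSentryDataClone);

-- After running your workload against both databases, compare when
-- statistics were last updated (dbo.SomeTable is a placeholder).
USE SQLSentryDataClone;
SELECT s.name,
       sp.last_updated,
       sp.rows,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.SomeTable');
```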
Read the whole thing.
Click on “SQL Server 2016 In-memory OLTP technical white paper” and you can open or save it.
It’s 90 pages long, so you might reserve it for beach reading.
The query runs faster, make no mistake – but check out the estimates:
- Estimated number of rows = 1
- Actual number of rows = 165,367
Those estimates are built by SQL Server’s cardinality estimator (CE), and there have been major changes to it over the last couple of versions. You can control which CE you’re using by changing the database’s compatibility level. This particular StackOverflow database is running in 2016 compat mode – so what happens if we switch it back to 2012 compat mode?
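For reference, flipping between CE versions is just a compatibility level change. The database name matches the post; 110 maps to the legacy 2012-era CE and 130 to the 2016 CE:

```sql
-- Switch the database to SQL Server 2012 compatibility (legacy CE).
ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 110;

-- And back to SQL Server 2016 compatibility (new CE).
ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 130;
```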
Based on this result, there might be further optimizations available. Read on for more of Brent’s thoughts.
Now, due to an unfortunate incident when I was a Software Support Engineer, one that involved a 3-week-old backup and a production database, I prefer not to use the GUI if I can help it.
I’m not joking about that, either: if there is ANY way that I can accomplish something with scripts instead of the GUI, I will take it!
When the need arose to document the properties of over 100 articles, I was particularly not looking forward to opening the article properties window for each article and copying the values out individually.
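As a sketch of the scripted alternative (the database and publication names are placeholders), replication exposes article properties through stored procedures, so you can pull every article’s settings in one result set instead of clicking through the GUI:

```sql
-- Run at the Publisher, in the published database (name is a placeholder).
USE PublishedDatabase;

-- Returns the properties of every article in the publication;
-- N'MyPublication' is a placeholder publication name.
EXEC sp_helparticle @publication = N'MyPublication';
```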
Check it out.
The Enhanced Scatter functions very similarly to the standard Power BI scatter chart, but with a few new properties added to it, including:
- Shapes as markers
- Background image support
I’ve enjoyed going through this series and getting a chance to dig into custom visuals others have created.
SSC and SSDT require the use of compare tools to build deployment scripts; this is referred to as a state-based migration. I’d done deployments like this in the past and saw that people reviewing a release found it difficult to review the scripts when the changes were more than trivial. For this reason, I decided to look at some migration-based solutions. Migration solutions generate scripts during the development process that will later be used to deploy the changes to production. This lets the developer break the changes down into small, manageable individual scripts, which in turn makes code reviews easier and deployments feel controlled. These scripts sit in the VS project and are therefore source-controlled in the same way as the database.
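To make those small, manageable individual scripts concrete, here is a hypothetical example of what one migration script in such a project might look like (all object names are invented; the existence check makes the script safe to re-run):

```sql
-- 0042_add_email_to_customer.sql: one small, reviewable change.
IF NOT EXISTS (
    SELECT 1
    FROM sys.columns
    WHERE object_id = OBJECT_ID(N'dbo.Customer')
      AND name = N'Email'
)
BEGIN
    ALTER TABLE dbo.Customer ADD Email nvarchar(256) NULL;
END;
```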
James recommends Git here. I’m not Git’s biggest fan, but it’s much, much better than not having any source control at all.
Remember chemistry class in high school or college? You might remember having to keep a lab notebook for your experiments. The purpose of this notebook was two-fold: first, so you could remember what you did and why you did each step; second, so others could repeat what you did. A well-done lab notebook has all you need to replicate an experiment, and independent replication is a huge part of what makes hard sciences “hard.”
Take that concept and apply it to statistical analysis of data, and you get the type of notebook I’m talking about here. You start with a data set, perform cleansing activities, potentially prune elements (e.g., getting rid of rows with missing values), calculate descriptive statistics, and apply models to the data set.
I didn’t realize just how useful notebooks were until I started using them regularly.
I like to think of Pig as a high-level pipeline of MapReduce commands. As a former SQL programmer, I find it quite intuitive, and at my organization, our Hadoop jobs are still mostly developed in Pig.
Pig has a lot of qualities: it is stable, scales very well, and integrates natively with the Hive metastore through HCatalog. By describing each step atomically, it minimizes the conceptual bugs you often find in complicated SQL code.
But Pig has some limitations that can sometimes make it a poor programming paradigm for your needs.
Philippe includes a couple of examples in Pig, PySpark, and SparkSQL. Even if you aren’t familiar with Pig, this is a good article to help familiarize yourself with Spark.