I have not met a setup where applying compression was not an option, yet. Obviously this has a penalty cost on CPU while executing the backup, and will affect the rest of the tasks running on the server (even if you have your data and backup dir on different drives). But in my experience, the impact is negligible.
This may not be the case with the encryption option, as this has a much larger foot print on the server. You should be using this with some caution in production. Test on smaller subsets of the data if in doubt.
Another thing to keep in mind, as always when dealing with encryption, do remember the password. There is no way of retrieving the data other than with the proper password.
My goal is to be able to rebuild any cube from the relational database, but even with that goal in mind, it is smart to have backups.
I came across an interesting MDX challenge this week; within a cube’s Date dimension, how to show December twice, once where it should be and again as the opening value for the following year. i.e. for each year I need to show Dec (prev yr), Jan, …, Nov, Dec.
Why? Well if you consider the following pivot chart, you can clearly see growth from Jan to Feb, Feb to Mar, etc., but it’s very difficult to see the growth between Dec and Jan.
The solution is easier than I would have expected.
Frequently when developing updates to an SSAS cube I want to deploy my schema and process the dimension. Sometimes several of dimensions process successfully and then fails on one. At this point I go and correct the error, deploy the new schema, and then I only want to process all of my dimensions except the dimensions which did process successfully the first time. Sometimes this is really easy, but if you have a large number of dimensions this can become cumbersome since the only way to know which dimensions had been processed successfully or to right-click each dimension one at a time and find out, or to have memorized which dimensions had processed successfully on the earlier attempt. There can be a better way, and of course, PowerShell is one of those options. J The only problem is that as things currently stand, PowerShell is not as easy as it could be; the Invoke-ProcessDimension cmdlet doesn’t accept [direct] pipeline input. What is one to do when PowerShell isn’t as easy as it could be? File a Connect item of course!
Check out the Trello board. It’s been instrumental in helping Microsoft developers get the leverage they need to dedicate time to improving particular aspects of the product.
In addition to being an “in-memory” technology, Analysis Services Tabular is also a “column-store” technology which means all the values in a table for a single column are stored together. As a result – and this is especially true for dimensional models – we are able to achieve very high compression ratios. On average, you can expect to see compression ratios anywhere from 10x-100x depending on the model, data types, and value distribution.
What this ultimately means is that your 2 TB data mart will likely only requrie between 20 GB of memory (low-end) and 200 GB (high-end) of memory. That’s pretty amazing – but still leaves us with a fairly wide margin of uncertainty. In order to further reduce the level of uncertainty, you will want to take a representative sample from your source database, load it into a model on a server in your DEV environment, and calculate the compression factor.
Read the whole thing; Bill has several factors he considers when sizing a machine.
The Analysis Services product team explained to me that a a user from a tenant which has never provisioned Azure Analysis Services cannot be added to another tenant’s provisioned server. Put another way, our Corporate tenant had never provisioned AAS so the Development tenant could not do so via cross-tenant guest security.
One resolution for this is to provision an AAS server in a subscription associated with the Corporate tenant, and then immediately delete the service from the Corporate tenant. Doing that initial provisioning will do the magic behind the scenes and allow the tenant to be known to Azure Analysis Services. Then we can proceed to provision it for real in the Development tenant.
Read the whole thing.
Varying Levels of Peak Workloads
Let’s say during month-end close the reporting activity spikes much higher than the rest of a typical month. In this situation, it’s a shame to provision hardware that is underutilized a large percentage of the rest of the month. This type of scenario makes a scalable PaaS service more attractive than dedicated hardware. Do note that currently Azure SSAS scales compute, known as the QPU or Query Processing Unit level, along with max data size (which is different than some other Azure services which decouple those two).
Read on for more use cases.
A semi-additive average? What exactly are you trying to calculate? Let me explain first. A semi-additive measure is a measure that can be summed across some dimensions, but not all. Typically it’s the time dimension that isn’t additive. For example, the stock level at various warehouses. You can add all the stock levels of your warehouses together, to get an idea of how much stock you have for your entire company. However, you can’t add the stock level across time. 250 stock yesterday and 240 stock today doesn’t equal 490 stock for the two days. In reality the sum aggregation is replaced with another aggregation when aggregating over the non-additive dimension. In our stock example, we could use the last value known (240) or the average (245). Which aggregation you want depends on the requirements.
In this blog post I’m going to calculate a semi-additive measure, using the average for the non-additive dimension. Quite recently a colleague asked how you could calculate this in DAX. The use case is simple: there are employees that perform hours on specific tasks. The number of hours is our measure. The different tasks (the task dimension) is additive. The employee dimension however is not when we calculate an average. When two employees are selected, the result should not be the average of all the individual hours, but rather the average of the sum of the hours per employee. Let’s illustrate with an example:
That’s really interesting, and a good bit easier to do than the T-SQL equivalent (at least in one step).
Whenever you are introduced to a new environment, either because you visit a new client or take over a new position from someone else, it’s always crucial to get on top of what’s going on. More often than not, any documentation (if you are lucky to even get hands on that) is out of date or not properly maintained. So going through that may even end up making you even more confused – or in worst case; misinformed.
In a previous engagement of mine came a request from the Data Architecture team. I was asked to produce a list of all servers and cubes running in a specific environment. They provided the list of servers and wanted to know which servers were hit by running solutions. Along with this information the team also needed all sorts of information on the connection strings from the Data Source Views, as well as which credentials were used, if possible.
If you’re dealing with a large number of cubes, this becomes even more useful.
Wow, right on the heals of Azure AS and just when you thought things couldn’t get any better for SSAS geeks of the world… Microsoft releases SP1 for SQL Server 2016… an voila, Tabular is now NUMA-Aware!
Click through for more information.
This post contains a list of various methods that can be used to process (i.e. load data into) an Azure AS tabular model. As you will see – not much has changed from the regular on-premise version (which is a very good thing as it softens the learning curve).
Read on if you’re looking at putting an Analysis Services model into Azure.