Jan Mulkens has started an interesting series on data science using the Microsoft stack. His first post is an overview of the products currently available:
But on a more serious note, I’m going to be crude to Microsoft here.
A long time ago, Power BI started as an over-hyped and underwhelming experience. Everyone saw the potential this Excel stuff had but I’m guessing the experience most people had was similar to mine. That is, Power BI back then was a disappointment because of what we were expecting.
The one good thing it did have at one point was PowerPivot.
Skip forward to august 2015.
The Power BI dream had suddenly come true!
Most of the things we were expecting in the past suddenly were there, in a web service AND a desktop application.
From there, Mulkens shares a number of training materials:
Make Microsoft’s Virtual Academy your first or last stop when learning, but you should always pay it a visit!
It’s filled with incredible information broken down in some great free courses.
It seems that (at least some of) the closed edX.org courses are being placed on here, so you can follow up on them at your own pace.
Do be aware that you can’t receive certificates on Microsoft Virtual Academy.
This is an exciting time to jump into analytics. Most of the material is free, and it’s easy to get VMs to practice, so the barrier to entry is low.
As I sit here typing this blog post in my home, we are 145 days or more accurately about 3480 hours until April 12, 2016. That is the date when Extended Support for SQL Server 2005 will be done. Over. Dunzo. Kaput. Yes, Microsoft can do Custom Support Agreements for large companies but it will cost you and you still will be stuck without all the sweet features that have been released in the last 7 years (since 2008 came out). So let’s face, it’s time to upgrade SQL Server 2005!
Chaves gives two good options: either upgrade or move your database into Azure. The unfortunate thing is that there are industries (health care, I’m looking at you) whose vendors are so slow to support new versions that some servers will be stuck on 2005 or (ick) 2000 forever. I feel for you guys.
Organizations looking to take control of this onslaught of information are turning to other solutions to meet their data needs, either in addition to or instead of the traditional EDW. Quite often this means turning to a logical architecture that abstracts the inherent complexities of the big data universe. Such an approach embraces mixed environments through the use of distributed processing, data virtualization, metadata management, and other technologies that help ease the pain of accessing and federating data.
Dubbed the logical data warehouse (LDW), this virtual approach to a BI analytics infrastructure originated with Mark Beyer, when participating in Gartner’s Big Data, Extreme Information and Information Capabilities Framework research in 2011. According to his blog post “ Mark Beyer, Father of the Logical Data Warehouse, Guest Post ,” Beyer believes that the way to approach analytical data is to focus on the logic of the information, rather than the mechanics:
This feels like something that first-movers are starting to adopt, but won’t be mainstream for another 6-8 years. That should give the idea some time to mature as we see the first round of successes and (more importantly) failures.
I trust the utilization reported by “Processor Info”. Note that the greatest reported “Resource Pool Stats” utilization (approaching 120%) is when total “Processor Info” utilization is near 100% across all 12 physical/24 logical cores. Nominal rating of the core is 3.06 GHz, top SpeedStep is 3.46 GHz. That would give a maximum ratio of 3.46/3.06 = 113%, which is still under the number reported by SQL Server (for Default pool alone, I’ll add). Even if the numbers made it seem possible that SpeedStep was responsible for more than 100% utilization reported by SQL Server, I don’t think SpeedStep is the culprit. The older Intel processors were by default conservative with SpeedStep, to stay well within power and heat envelope. And no-one’s been souping this server up for overclocking 🙂
So… if my database engine will give 110% (and sometimes more…) I guess I better, too. 🙂
Math is hard.
With the domain-centric approach, on the other hand, programmers view the domain model as the most important part of the software project. It is usually represented in the application code, using an OO or functional language. Data (as well as other notions such as UI) is considered to be secondary in this case:
Each of the approaches brings its own pros and cons, as well as some differences in the way developers address common design challenges. Let’s elaborate on that.
Khorikov is domain-centric, whereas I am data-centric. My justification is as follows: 20 years from now, the most likely scenario is that your application has been re-written three or four times, whereas my database is still chugging along. Therefore, we should design in ways which make it easier to maintain correct data.
Once I click Properties, I get a dialog with a lot of items on the left. The bottom one is for Extended Properties, with a simple add/edit/delete grid. Here I can see the property(ies) I’ve added.
However this is cumbersome for me. I’d much rather find a way to query the information, which is what I need to do with an application of some sort. I’d think sp_help would work, but it doesn’t.
Extended properties, like Service Broker, was a great idea that floundered because there was never a good UI. Given how much fighting Steve has to do to see one object’s properties, it’s no wonder they’re so relatively unused. And that’s a shame, because with the right tooling, they can be a great way of documenting data structures.
Next I wanted to see if you got all these same project types with an install of Visual Studio. They announced at Summit that “SSDT” would now be “included” with Visual Studio. So I went out and downloaded Visual Studio (CTP3, Community Edition, i.e., free) from here. And look what shows up on the install features list, there it is in black and white, Microsoft SQL Server Data Tools, almost too good to be true.
Well, we all know that if something seems too good to be true, then it usually is. This is no exception. Let’s see if you can pick out the reason for my disappointment in the picture below.
It’s one step closer, at least.
Randolph West has another nice post on temporal tables, in which he lets the cat out of the bag:
Look at the log records. They are identical to before. In fact, because a trigger is called in the same implicit transaction as the
UPDATEstatement, we didn’t even need to wrap the
INSERTstatement in a transaction.
What have we learned?
Temporal tables are doing exactly what an update or delete trigger (or stored procedure) would do, based on what we saw in the log records.
This is an in-depth look at what, exactly, is happening when temporal tables get updated.
When I needed to do an rmarkdown repository for making R Consortium Infrastructure Proposals, I was able to take the opportunity to take Jan’s code and move forward with it so that the ISC proposal is always web-facing. Here’s how I did it:
She’s using this to build the satRday planning site.