Microsoft Atop Hadoop Cloud Solutions

Forrester has named Microsoft a leader in the Hadoop cloud solutions space:

This week, we’re excited that Forrester recognized Microsoft Azure as a leader in their Big Data Hadoop Cloud Solutions. Apache Hadoop as a technology has become popular amongst organizations to unlock insights from data of all size, shape, and speed. Hadoop power solutions to help businesses improve their performance, educators to better connect with the needs of their students, medical professionals to improve the quality of their care, or researchers to accelerate new advancements in science.

As an example, Ultra Tendency uses Hadoop to achieve something not possible before – visualize more than 27 million distinct sensor readings to give Japanese citizens accurate, up-to-date information about the radiation contamination from the Fukushima nuclear plant meltdown. More and more organizations are also deploying Hadoop in the cloud with 47% of Forrester’s respondents to a 2015 survey increasing their cloud deployments either by 5-10% (37%) or more than 10% (10%).1 This makes sense because the cloud allows you to scale elastically on demand to handle the processing of any amount of data.

AWS and IBM also have very good solutions, and Google is trying to get a stronger foothold on the cloud game.

SQLCover Updated

Kevin Feasel

2016-05-09

Tools

Ed Elliott has an update to SQLCover, his code coverage tool:

This includes a few minor fixes but also support for SQL Azure so if you run your test in a v12 database or higher you can now get an idea of code coverage from that.

If you are interested in using this but don’t know where to start, there is a powershell script in the download (https://the.agilesql.club/SQLCover/download.php) and if you also get reportgenerator (https://github.com/danielpalme/ReportGenerator/releases/tag/v2.4.5.0):

Check out the tool and add a few database tests…or any database tests…

More RAM In Standard Edition Requested

Erik Darling wants Standard Edition to support more RAM:

Microsoft needs to make money. I get it. There’s no such thing as a free etc. But do they really need to make Enterprise licensing money off of people who will never use a single Enterprise feature? Should a small shop with a lot of data really have to make a $5000 jump per core just to cache another 128-256GB of data? That seems unreasonable to me. RAM is cheap. Licensing is not.

I wouldn’t suggest à la carte pricing, because licensing is already complicated enough. What could make sense is offering higher memory limits to shops with Software Assurance. Say up to 512GB on Standard Edition. That way, Microsoft can still manage to keep the lights on, and smaller shops that don’t need all the pizzaz and razzmatazz of Enterprise Edition can still hope to cache a reasonable amount of their data.

It’s an interesting argument.  I’m always sympathetic to having more features (and I’m including stretching limits as a feature here) in Standard Edition as not every company can afford Enterprise Edition.

Script Those Indexes

Kendra Little provides a T-SQL script to script out all indexes on a database:

Sometimes you need to script out all the indexes in a database.

Maybe you’re concerned something has changed since they were last checked in.

Or maybe the indexes aren’t checked into source control, and you’re working on fixing that. (Important!)

Either way, sometimes you need to do it, and it’s not fun through the GUI. I needed to write some fresh demo code for this recently, and I needed it to give the details for partitioned tables using data compression, and I thought I’d share.

The fact that the built-in Generate Scripts does not include compression is annoying, but Kendra’s script does.  For bonus points, use Powershell to update scripts automatically with index changes and check them into your source control system of choice.

Think Set-Based

Paul Randal explains why you should think in a set-based manner when writing T-SQL queries:

This explanation involves a similar problem but not involving SQL Server. Imagine you need to acquire twelve 4′ x 8′ plywood sheets from your local home improvement store.

You could drive to and from the store twelve times, and each time you need to go into the store, purchase the sheet, and wait for a staff member to become available to load the sheet into your pickup truck, then drive home and unload the sheet.

Or you could drive to the store once and purchase all twelve sheets in one go, with maybe four staff members making three trips each out to your pickup, carrying one sheet each time. Or even just one staff member making twelve trips out to your pickup.

Which method is more efficient? Multiple trips to the store or one trip to the store, no matter how many staff members are available to carry the sheets out?

On the SQL Server side, the APPLY operator is a method for bridging set-based operations with procedural thought patterns.  It’s not the perfect answer for everything, but there are some fantastic use cases (like simplifying calculations via chained APPLY operators).

New R And RTVS

Kevin Feasel

2016-05-09

R

R 3.3.0 is now available:

As this is a major release, you’ll need to re-install any packages you were using (and perhaps wait a little while until package authors make any compatibility fixes needed for version 3.3.0). If you’re on the Windows platform, Tal Galili’s installr package automates the process for you. If you are using the checkpoint package (on any platform) you can simply increment the checkpoint date to anytime after May 2, 2016.

(For Microsoft R Open users, the next version to be released will be MRO 3.2.5, and MRO 3.3.0 will follow soon thereafter.)

For more information about R 3.3.0, including the detailed list of changes and bug fixes, follow the link to the announcement from the R Core Group below.

David Smith also notes that R Tools for Visual Studio 0.3 has been released:

R Tools for Visual Studio, the open-source extenstion to Visual Studio that provides an IDE for the R language, has been upgraded to include several new features.

The latest update, RTVS 0.3, now includes:

  • An R package manager, allowing you to review, install, and uninstall packages using a convenient user interface.

  • The Variable Explorer now allows you to open data-frames for viewing in an Excel workbook.

  • New toolbar buttons to run selected code, source the current script, import data from a URL or file, and start/stop a Shiny app.

This is a great time to get interested in R.  If you’re familiar with Visual Studio, Microsoft is making great strides toward integrating things nicely.

HBase’s Failure To Catch On

Kevin Feasel

2016-05-09

Hadoop

Matt Asay has an interesting article on how HBase started as a big thing but has fizzled since:

Ex-Googler (and current Amazon Web Services employee) Tim Bray argues “there is a real cost to this continuous widening of the base of knowledge a developer has to have to remain relevant.” RedMonk analyst Stephen O’Grady takes this a step further: “It could be that we’re approaching the too-much-of-a-good-thing stage. In which case, the logical outcome will be a gradual slowing of fragmentation followed by gradual consolidation.”

In other words, niche data stores that do one thing really well are giving way to more generally applicable databases that can serve a broader range of enterprise needs.

The second part of Keep’s sentence above, however, spells out another reason HBase is struggling: It’s really hard to use.

I have a statement which is 90% serious and 10% joke:  a database product is truly mature once it supports SQL.  So what’s the answer for HBase?  The current attempt at an answer is Phoenix, which is…SQL for HBase.

Traces Can Be Expensive

Matt Slocum warns against having too many server-side traces running:

Before the client took the last ditch effort of just restarting SQL Server, I checked traces.  There were 9 user traces collecting a ton of trace events.

I manually killed them all and suddenly performance returned to normal.  Phew!  Crisis averted.

As a follow up from that issue, I created a script that stops and then deletes all user traces.  We are discussing setting up a job to run this script periodically to keep traces at bay and I am educating the group on proper trace discipline (and Extended Events).

Even Extended Events aren’t free, so the same advice applies.

Categories

May 2016
MTWTFSS
« Apr Jun »
 1
2345678
9101112131415
16171819202122
23242526272829
3031