Category: Learning

Diving into Kubernetes: a Workshop

Published 2020-04-01 by Kevin Feasel

I have not blogged for a while, it was my hope to produce part 5 in the series of creating a Kubernetes cluster for production grade Big Data Clusters. However, there is a very good reason for this, and that is because I have been working on a one day workshop to be delivered at SQL Bits in September, the material can be found here, enjoy !

I’ve only looked at the module listings, but Chris does a great job putting long-form articles together, so I’ve already added it to my todos.


Confluent Developer

Published 2020-03-05 by Kevin Feasel

Tim Berglund announces Confluent Developer:

Today, I am pleased to announce the launch of Confluent Developer, the one and only portal for everything you need to get started with Apache Kafka®, Confluent Platform, and Confluent Cloud! Everything on Confluent Developer is completely free and ungated. It’s a single online source of everything you’ll need to learn Kafka: links to documentation, collections of video tutorials, links to sample code, the entire collection of guided Kafka Tutorials, an index of podcast episodes, and a link to our global network of meetups.

The site is laid out really well.


The Hype Cycle for Artificial Intelligence

Published 2020-03-05 by Kevin Feasel

William Vorhies takes a look at Gartner’s hype cycle for AI (among other things):

Supposing you’re a business leader and supposing you’re trying to make an intelligent decision about prioritizing your AI adoption plans. It’s likely that like many of us the first thing you’d reach for would be one of Gartner’s many hype cycle or magic quadrant analyses.
What you might not know is that you now need an expert just to guide you through the expert literature. There has been such a proliferation of hype cycles and magic quadrants that you could easily be looking in the wrong place.

The hype cycle is definitely opinion-based, but I think it’s a useful look at the relative maturity of different segments of an industry or technology cluster. Do read the whole thing, though, as these things aren’t perfect.


Dirty Deeds in SQL: Identity-Based Looping Edition

Published 2020-03-03 by Kevin Feasel

Nate Johnson has a confession to make:

I’ve done some things I’m not proud of. We all do, in IT, typically when we’re under-the-gun for a deadline or when the systems and frameworks in which we work have some sort of nuance or limitation that we just cannot get around, past, or over. And so we hack. We write code we’re not happy with. We even write code that we despise with every fiber of our well-intentioned being. But it has to be done. Because there’s no other choice.

Read on for the story. And if you want a much less ugly way to find gaps, I know a guy.
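For a sense of what gap-finding boils down to, here is a minimal sketch in Python rather than T-SQL. It is not Nate's approach or the one linked above; the function name `find_gaps` and the sample values are made up for illustration, and it assumes the identity values fit comfortably in memory.

```python
def find_gaps(ids):
    """Return inclusive (start, end) ranges missing between the lowest and highest id.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(set(ids))  # de-duplicate and sort so neighbors can be compared
    gaps = []
    # Walk each adjacent pair; a difference greater than 1 means values are missing.
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps


if __name__ == "__main__":
    print(find_gaps([1, 2, 3, 7, 8, 12]))
```

The idea is the same one a set-based query would use: compare each value with its successor instead of looping over every possible identity value.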


Goodbye, MCSE

Published 2020-02-28 by Kevin Feasel

John Deardurff helps break the news:

Major Announcement from Microsoft Learning today. As Microsoft continues to invest in role-based learning offerings, the Microsoft Certified Solutions Associate (MCSA), Microsoft Certified Solutions Developer (MCSD), and Microsoft Certified Solutions Expert (MCSE) certifications will be phased out with a final retirement date of June 30th, 2020. Find the entire list of retired certifications here.

On the plus side, at least people who hold the next iteration of the MCSE won’t be confused with people who worked with NT4 anymore…


2020 Data Professional Salary Survey Results

Published 2020-01-10 by Kevin Feasel

Brent Ozar has another year of salary data for us:

A few things to know about it:
– The data is public domain. The license tab makes it clear that you can use this data for any purpose, and you don’t have to credit or mention anyone.
– The spreadsheet includes the results for all 4 years (2017-2020.) We’ve gradually asked different questions over time, so if a question wasn’t asked in a year, the answers are populated with Not Asked.
– The postal code field was totally optional, and may be wildly unreliable. Folks asked to be able to put in small portions of their zip code, like the leading numbers.
– Frankly, anytime you let human beings enter data directly, the data can be pretty questionable – for example, there were 14 folks this year who entered annual salaries below $500. If you’re doing analysis on this, you’re going to want to discard some outliers.

It’s on my agenda (somewhere…probably a bit further back than I’d like) to dig into this year’s data and try to come up with something a little more comprehensive now that there are four years of data.
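As a starting point for that kind of cleanup, here is a rough sketch of dropping the implausibly low salaries Brent mentions. The column name `SalaryUSD`, the function name, and the 500 cutoff as a default are assumptions for illustration; check the actual spreadsheet headers before using anything like this.

```python
def drop_low_salaries(rows, floor=500.0, field="SalaryUSD"):
    """Keep rows whose salary parses as a number and is at least `floor`.

    `rows` is a list of dicts, one per survey response. The field name is
    a hypothetical placeholder, not confirmed against the real data.
    """
    kept = []
    for row in rows:
        try:
            # Survey entries may carry currency symbols or thousands separators.
            raw = str(row[field]).replace("$", "").replace(",", "").strip()
            salary = float(raw)
        except (KeyError, ValueError):
            continue  # missing or non-numeric salary: treat as unusable
        if salary >= floor:
            kept.append(row)
    return kept


if __name__ == "__main__":
    sample = [{"SalaryUSD": "95,000"}, {"SalaryUSD": "120"}, {"SalaryUSD": "n/a"}]
    print(drop_low_salaries(sample))
```

A fixed floor is the bluntest possible filter. It only handles the low end, so a real analysis would likely want a ceiling or a percentile-based cut as well.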


Chesterton’s Fence in Development Terms

Published 2020-01-02 by Kevin Feasel

Pete Warden picked a blog post title I couldn’t refuse:

This script came to mind as I was thinking back over the year for a few reasons. One of them was that I spent a non-trivial amount of time writing and debugging it, despite its small size and the apparent simplicity of the problem it tackled. Even in apparently glamorous fields like machine learning, 90% of the work is nuts and bolts integration like this. If anything you should be doing more of it as you become more senior, since it requires a subtle understanding of the whole system and its requirements, but doesn’t look impressive from the outside. Save the easier-to-explain projects for more junior engineers, they need them for their promotion packets.
The reason this kind of work is so hard is precisely because of all the funky requirements and edge cases that only become apparent when code is used in production. As a young engineer my first instinct when looking at a snarl of complex code for something that looked simple on the surface was to imagine the original authors were idiots. I still remember scoffing at the Diablo PC programmers as I was helping port the codebase to the Playstation because they used inline assembler to do a simple signed to unsigned cast. My lead, Gary Liddon, very gently reminded me that they had managed to ship a chart-topping game and I hadn’t, so maybe I had something to learn from their approach?

I am a huge fan of the concept which, made brief, states that if you do not understand why something is the case, don’t change it. If you do understand it, maybe change it, but be prudent about it. It’s also something I often have trouble with, as my natural inclination toward code bases is to use the cleansing power of fire to burn it all down.


Why So Few Columnstore Indexes Around?

Published 2019-12-18 by Kevin Feasel

Grant Fritchey has a bit of a rant about people not using Columnstore indexes as much as they should:

It was already common knowledge that columnstore indexes didn’t work for most of us.
Fact is, that’s not true. Now that we have clustered columnstore and non-clustered columnstore, you can go nuts. Most of your data access is through analytical channels? Awesome, use a clustered columnstore. Sometimes though, you need point lookups. Not a problem, add a nonclustered b-tree index to the clustered columnstore. Go here to learn more about Columnstore Indexes.
In short, today, we can completely orient our data storage with our principal data access. Yet, most people are not using these things at all.

One of my interview questions is about columnstore indexes. I’ve learned that I needed to preface it with “What’s the latest version of SQL Server you’ve worked with?” A lot of people answer 2012. Even among the people who use 2016, the normal answer is that they haven’t learned about columnstore yet. And that goes back to Grant’s learning gap: it’s not that hard to grab a book on SQL Server 2019, spin up a Docker container, and dive in. Or watch a course, spin up a Docker container, and follow along. Or read a blog post, spin up a Docker container, and…well, you get the idea.


Learning to Learn

Published 2019-12-06 by Kevin Feasel

Buck Woody has a great post on learning how to learn:

In this new world of fast-paced learning, you’ll often find that you have to “throw away” what you’ve learned, meaning that a new language or tool is out now that requires your attention, and you won’t return to the one you know now. That doesn’t mean your hard study was wasted, because you’ll often find that new technology builds on the one you just learned, but I find that Type-A technologists are loath to drop something they just learned. You’ll have to get over that – it’s the way it is.
However, it can be true that once you learn something, it may be in an area that you just had to come up to speed on quickly, or it has “staying power” and will be around for a while. In that case, take this same process, and repeat all the steps, taking time to fill in the gaps and go much deeper in the areas you didn’t spend time on during your speed learning.

I really liked this post. The first thing it reminded me of was Sir Francis Bacon’s Of Studies (pdf, but with bonus content from Samuel Johnson), specifically the part about how we should superficially breeze through some books but must digest others. The same goes for technologies.


Why the DBA is Important

Published 2019-11-21 by Kevin Feasel

Melody Zacharias takes us through five areas where DBAs are important in the SQL Server 2019 world:

Databases are the beating heart of digital transformation. Businesses increasingly realize that having a unified view gives them a competitive advantage in a world where data is king. The task of breaking down those silos will fall to highly skilled DBAs using cool new technologies such as PolyBase [https://docs.microsoft.com/en-us/sql/relational-databases/polybase/polybase-guide?view=sql-server-ver15]. Although it was introduced in SQL Server 2016, PolyBase got a whole lot more interesting in SQL Server 2019 with the ability to query external SQL Server, Oracle, Teradata, and MongoDB using T-SQL. Our world just got a whole lot bigger!

Read on for the full set of reasons. My agreement with this comes with one caveat: DBAs are important inasmuch as they are willing to grow, try new things, and develop skills. If you’re a stodgy type who hasn’t learned a thing since SQL Server 2008, you’ve got a shelf life.
