
Category: Learning

Becoming An Expert

Adrian Colyer wraps up The Morning Paper for the year by reviewing a big picture paper on developer expertise:

You’ll know an expert programmer by the quality of the code that they write. Experts have good communication skills, both sharing their own knowledge and soliciting input from others. They are self-aware, understanding the kinds of mistakes they can make, and reflective. They are also fast (but not at the expense of quality).
Experience should be measured not just on its quantity (i.e., number of years in the role), but on its quality. For example, working on a variety of different code bases, shipping significant amounts of code to production, and working on shared code bases. The knowledge of an expert is T-shaped with depth in the programming language and domain at hand, and a broad knowledge of algorithms, data structures, and programming paradigms.

Click through for the full review.


Thoughts On The Year’s Big Data Platform News

Kevin Chant shares some thoughts on some of the biggest news stories of 2018 for data platform professionals:

The Hortonworks and Cloudera announcement about their merger is certainly an interesting one for the Big Data landscape. These two are thought to be the leaders in the Hadoop industry.
Undeniably, a lot of people have seen what these two Big Data giants have delivered over the years within the Hadoop ecosystem.
With this merger they are aiming to use their combined expertise to deliver an enterprise data cloud. We’ve already seen what Hadoop based cloud offerings like HDInsight are capable of, so the potential here is huge.
Certainly, there’s potential for this to have massive implications in the Big Data industry. And this merger could also encourage even more Data Platform offerings to emerge.

Read on for Kevin’s thoughts on five major stories this year.


Lessons From Being Self-Employed

Eugene Meidinger shares some hard-learned lessons from being self-employed:

I’m naturally an introvert. If you and I have a conversation, it’s like a little taxi meter starts running. I may deeply, deeply enjoy the conversation and find it incredibly exciting, but it still taxes my energy levels. Small talk even more so. Imagine that every time someone chatted about the weather, you had to pay the same price as a Lyft ride to go 4 blocks. That’s how I feel about small talk.
That being said, we are still social creatures, and even introverts need human interaction. Especially so when you need to think through new situations, new problems. One of the things I realized attending PASS Summit is that I need social interaction to thrive. So now I spend a lot more time on Twitter and am part of a peer group of authors. I work down at the library whenever I have the chance.

When I did the work-from-home full-time thing, I sought out user groups to build up some technical skills and, more importantly, to get out of the house and talk to a group of people a couple of times a week. That paid off really well in the long run.

Speaking of paying off in the long run, check out Eugene’s BI newsletter.


Take The Data Professional Salary Survey

Brent Ozar has a new edition of the Data Professional Salary Survey:

It’s time for our annual salary survey to find out what data professionals make. You fill out the data, we open source the whole thing, and you can analyze the data to spot trends and do a better job of negotiating your own salary:

Take the Data Professional Salary Survey now.

The anonymous survey closes Sunday, January 6, 2019. The results will be completely open source, and shared with the community for your analysis.

I like this survey so much that I delivered a talk at PASS Summit making heavy use of it.


Gaining Business Understanding Through Paying Attention

Laura Ellis lays down some good tips for understanding business problems:

I know this sounds somewhat silly. But, when thinking through the steps that I take to solve a business problem, I realized that I do employ a strategy. The backbone of that strategy is based on the principles of solving a word problem. Yes, that’s right. Does anyone else remember staring at those first complex word problems as a kid and not quite knowing where to start? I do! However, when my teacher provided us with strategies to break down the problem into less intimidating, actionable steps, everything became rather doable. The steps: circle the question, highlight the important information and cross out unnecessary information. Do these steps and all of a sudden the problem is simplified and much less scary. What a relief! By employing the same basic strategy, we too can feel that sense of calm when working on a business problem.

It sounds blasé, but a lot of this comes down to paying attention to what people are saying (or writing) rather than hearing a few words and assuming the rest.


Wat-Provenance And Debugging Distributed Systems

Adrian Colyer reviews an interesting paper on debugging distributed systems:

Why why-provenance doesn’t work

Relational databases have why-provenance, which sounds on the surface exactly like what we’re looking for.

Given a relational database, a query issued against the database, and a tuple in the output of the query, why-provenance explains why the output tuple was produced. That is, why-provenance produces the input tuples that, if passed through the relational operators of the query, would produce the output tuple in question.

One reason that won’t work in our distributed systems setting is that the state of the system is not relational, and the operations can be much more complex and arbitrary than the well-defined set of relational operators why-provenance works with.
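
The definition of why-provenance quoted above can be made concrete with a small sketch. This is a simplified illustration rather than the paper's formalism: the two relations, their contents, and the function name `join_with_provenance` are all made up for the example. For each output tuple of a join-and-project query, it records every set of input tuples that would produce that output.

```python
from collections import defaultdict

# Hypothetical input relations (not from the paper).
EMPLOYEES = [("ann", "sales"), ("ann", "support"), ("bob", "sales")]
DEPTS = [("sales", "london"), ("support", "london"), ("ops", "paris")]


def join_with_provenance(employees, depts):
    """Evaluate: SELECT name, city FROM employees JOIN depts USING (dept).

    Returns a dict mapping each output tuple to a list of witnesses.
    Each witness is a set of (relation_name, input_tuple) pairs that,
    run through the join and projection, yields that output tuple.
    """
    provenance = defaultdict(list)
    for emp in employees:
        for dept in depts:
            if emp[1] == dept[0]:            # join condition on dept
                output = (emp[0], dept[1])   # projection to (name, city)
                witness = {("employees", emp), ("depts", dept)}
                provenance[output].append(witness)
    return dict(provenance)


result = join_with_provenance(EMPLOYEES, DEPTS)
for output, witnesses in result.items():
    print(output, "has", len(witnesses), "witness(es)")
```

Here the output tuple `("ann", "london")` has two independent explanations, one through each of Ann's departments, while the `("ops", "paris")` row contributes to no output at all. The sketch only works because the query is built from a few well-defined relational operators, which is the restriction the quote points to.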

Read the whole thing.


Analysis Of A Failed Project

Eugene Meidinger looks back at a big project which fell apart:

So the first issue was that the software was built in-house by another company in the same industry. Imagine, for example, if a large bakery had created an ERP system and another large bakery wanted to move to that system. Sounds great, right? Well, you run into two issues in that scenario.

First, a bakery is not an independent software vendor. Programming, by definition, is not their core competency. Which means that you may run into fragility or issues that you wouldn’t run into with a commercial piece of software. It also means that there isn’t going to be any documentation on migrating to the software or implementing it. Why would there be? If you built software for one company, why would you create scaffolding to move other companies onto it?

Second, not every business is the same. A lot of the fundamentals are the same, but you will run into many edge cases. We do invoices this way. They do workorders this way. We handle purchase orders this way. They handle inventory that way.

The way that I think about it is like a sea shell. It’s this intricate curve that’s grown over time, organically, to fit that creature. If you just try to fit a different snail or mollusk in that shell, it may not work out.

Read the whole thing.


What You Can Learn At SQL Saturday

Nate Johnson shares a few things he picked up at the SQL Saturday in San Diego:

This was an interesting and even slightly entertaining session presented by Max @ SQLHA. One analogy that really stood out to me was this:

SANs have become a bit like the printer industry — You don’t pay a lot for the enclosure, the device itself, i.e. the SAN box & software; but you pay through the nose for ‘refills’, i.e. the drives that your SAN vendor gods deem worthy of their enclosure.

It’s frighteningly accurate. Ask your storage admin what it costs to add a single drive (or pair of drives, if you’re using something with built-in redundancy) to your SAN. Then compare that cost with the same exact drive off the retail market. It’s highway robbery. And we’re letting them get away with it because we can’t evolve fast enough to take advantage of storage virtualization tech (S2D, SOFS, RDMA) that effectively makes servers with locally attached SSDs a superior architecture. (As long as they’re not using a horribly outdated interface like SAS!)

Nate also includes several more interesting lessons. SQL Saturdays are great for picking up useful knowledge.


Tips For Troubleshooting Code Problems

Bert Wagner shares some techniques he uses to troubleshoot code:

1. Rubber Duck Debugging

The first thing I usually do when I hit a wall like this is talk myself through the problem again.

This technique usually works well for me and is equivalent to those times when you ask someone for help but realize the solution while explaining the problem to them.

To save themselves embarrassment (and to let their coworkers keep working uninterrupted), people often substitute an inanimate object, like a rubber duck, for a coworker and try to work out the problem on their own.

Alas, in this case explaining the problem to myself didn’t help, so I moved on to the next technique.

This one works more often than you might expect, and is a big part of the value behind pair programming.


Understanding Binary Trees

Robert Maclean has a couple of posts on binary trees. In the first post, he explains the basics of a binary tree:

As a binary tree has some flexibility in it, a number of classifications have come up to have a consistent way to discuss a binary tree. Common classifications are:

  • Full binary tree: Each node in a binary tree can have zero, one, or two child nodes. In a full binary tree, each node can only have zero or two child nodes.
  • Perfect binary tree: This is a full binary tree with the additional condition that all leaf nodes (i.e. nodes with no children) are at the same level/depth.
  • Complete binary tree: The complete binary tree is where each leaf node is as far left as possible.
  • Balanced binary tree: A balanced binary tree is a tree where the height of the tree is as small a number as possible.
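
Those four classifications can be checked mechanically. The following is a minimal sketch, not Robert's code: the `Node` class, the helper names, and the sample trees are all hypothetical. It uses the usual level-order definition of a complete tree and takes "balanced" in the sense quoted above (the smallest height possible for the number of nodes). Other definitions of balance, such as the AVL rule, are stricter or looser.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(node):
    """Number of levels in the subtree (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def count(node):
    """Number of nodes in the subtree."""
    if node is None:
        return 0
    return 1 + count(node.left) + count(node.right)


def is_full(node):
    """Every node has either zero or two children."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False  # exactly one child
    return is_full(node.left) and is_full(node.right)


def is_perfect(node):
    """Every level is completely filled: exactly 2^height - 1 nodes."""
    return count(node) == 2 ** height(node) - 1


def is_complete(root):
    """In level order, no node appears after the first missing child."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        elif seen_gap:
            return False
        else:
            queue.append(node.left)
            queue.append(node.right)
    return True


def is_balanced(node):
    """Height is the minimum possible for this many nodes."""
    return height(node) == count(node).bit_length()


# Hypothetical sample trees.
perfect = Node(2, Node(1), Node(3))
full_only = Node(1, Node(2, Node(4), Node(5)), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))
```

With these samples, `perfect` passes all four checks, `full_only` is full, complete, and balanced but not perfect, and `chain` fails every check.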

Then, he looks at binary search trees:

So, why should we care about a BST? We should care because searching is really performant in it as each time you move a depth down, you eliminate approximately 50% of the potential nodes.

So, for example, if we wanted to find the item in our example with the key 66, we could start at the root (50) and move right. At that point, we have eliminated 8 possible nodes immediately. The next is to the left from the node with the 70 (total possible nodes removed 12). Next is to the right of the node with the value of 65, and then to 66 to the left of 67. So we found the node with 5 steps.

Going to Big O Notation, this means we achieved a performance of close to O(log n). It is possible to have a worst case of O(n) when the tree is not optimal or is unbalanced.
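
To see the gap between O(log n) and O(n) in practice, here is a rough sketch of a binary search tree lookup that counts how many nodes it visits. The `Node` class, the `insert` and `search` helpers, and the keys are assumptions for illustration and do not reproduce the tree from Robert's example. The same seven keys are inserted twice: once in an order that yields a balanced tree, and once in sorted order, which degenerates into a chain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored


def search(root, key):
    """Return (found, nodes_visited) for a lookup of key."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if key == node.key:
            return True, visited
        # Each step discards the subtree that cannot contain the key.
        node = node.left if key < node.key else node.right
    return False, visited


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


# Hypothetical keys: same values, different insertion orders.
balanced = build([50, 25, 75, 12, 37, 62, 87])
degenerate = build([12, 25, 37, 50, 62, 75, 87])

print(search(balanced, 87))    # visits 50, 75, 87
print(search(degenerate, 87))  # visits every node in the chain
```

Looking up 87 touches 3 nodes in the balanced tree and all 7 in the degenerate one, which is why self-balancing variants exist for data that may arrive in sorted order.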

Binary search trees are an easy-to-understand, reasonably efficient model for searching for data. Even when there are better options, this is an easy algorithm to implement and can often be good enough to solve the problem.
