Press "Enter" to skip to content

Category: Learning

API Servers and the Importance of Learning

Steve Jones tells a story:

While talking with a client recently about their performance challenges, I was relieved to find that the database wasn’t the problem. Instead, their API server was overloaded by the number of calls taking place in their application. While the database did provide the backing for the API calls, there was a fair amount of caching. However, as they’d moved to microservices, more and more of the interaction between modules was taking place as a network call to a single server, which became overloaded.

Steve goes on to make a broader point about people freely donating their time and expertise to explain how to solve problems. The situation above is also a major pitfall of moving to microservices: everything gets several times chattier. The biggest tricks I have there are to embrace asynchronous processing via queues and to keep the messages passed back and forth as small as possible, which means getting rid of the idea of passing big lists of fully-hydrated objects around.
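
To make that concrete, here's a minimal sketch of the idea. The OrderPlaced message and the in-memory queue are stand-ins I made up for illustration; in a real system you'd publish to a broker like Kafka, RabbitMQ, or Azure Service Bus. The point is that the producer fires off a small message of identifiers instead of making a synchronous call with a fully-hydrated object graph, and the consumer rehydrates only what it needs.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class LeanMessagingSketch {

    // A fully hydrated order would drag along the customer, line items,
    // addresses, and so on. The message we enqueue carries only the
    // identifiers the consumer needs to look the rest up on its side.
    record OrderPlaced(long orderId, long customerId) { }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a real message broker.
        BlockingQueue<OrderPlaced> queue = new LinkedBlockingQueue<>();

        // Producer: enqueue and move on instead of blocking on an API call
        // to another service.
        queue.put(new OrderPlaced(42L, 7L));

        // Consumer: pick up the small message asynchronously and rehydrate
        // only the pieces it actually needs from its own store or cache.
        OrderPlaced msg = queue.take();
        System.out.printf("Processing order %d for customer %d%n",
                msg.orderId(), msg.customerId());
    }
}
```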

SQL Saturday Orlando Notes

Andy Warren reflects on hosting the only in-person SQL Saturday in the United States this year:

We held an in-person SQLSaturday here in Orlando last weekend (Oct 30th). We didn’t organize one last year, there was just too much risk and too much uncertainty, so it felt good to return to something close to normal this year, even in scaled back fashion. I’ve got a lot of notes to share about how we ran the event this year!

The journey started at the end of 2020. We wrote up our plan for 2021 knowing there were a lot of unknowns, but hoping things would improve enough to resume doing the things we used to do as a local group and that included organizing a SQLSaturday. As this year has progressed attendance at our virtual meetings dropped, as did our enthusiasm for having them. Enthusiasm matters a lot when it comes to volunteer work and while I know many of you like the virtual format, it’s just not what I want to do. That narrowed the option list to having an in-person SQLSaturday or not doing one at all, not a great range of choices.

Read on for a lot of details. I appreciate how transparent Andy has always been with respect to running events like this, and if you're thinking about hosting a SQL Saturday in 2022, definitely read Andy's post.

Also, the event was small, but it was really nice to get to see people I hadn’t seen in years, so thank you, Andy, for putting on the show.

Eliminate the DeWitt Clause

Justin Olsson and Reynold Xin throw down the gauntlet:

At Databricks, we often use the phrase “the future is open” to refer to technology; it reflects our belief that open data architecture will win out and subsume proprietary ones (we just set a new official record on TPC-DS). But “open” isn’t just about code. It’s about how we as an industry operate and foster debate. Today, many companies in tech have tried to control the narrative on their products’ performance through a legal maneuver called the DeWitt Clause, which prevents comparative benchmarking. We think this practice is bad for customers and bad for innovation, and it’s time for it to go. That’s why we are removing the DeWitt Clause from our service terms, and calling upon the rest of the industry to follow.

One example of how you can tell if you’re influential is how many legal terms are named after you, which I’m pretty sure makes Dr. DeWitt the Steve Tasker of the database industry. So put David DeWitt in the Data Platform Hall of Fame.

And good of Databricks to eliminate their DeWitt Clause. Vendors ostensibly put the clause in to prevent rigged or invalid comparisons between products, but there's a much better way to do that: publish the benchmark configuration and allow peer validation. If you put out garbage numbers (including by accident, because you didn't know the right way to do something), people are smart enough to catch that. And if a vendor isn't willing to publish the process, call for them to do it, and if they still don't, ignore the results. 100 times out of 100, that's the right way to do it…assuming that you're looking for the truth and not just trying to hide inferiorities in your product *cough* Oracle *cough*.

Thinking like an Escalation Engineer

Stacy Gray shares stories:

“You new?” asked with an amused grin.

“Yes,” I replied floating 2 inches off the ground with a huge, toothy smile.

“Which team?”

“SQL!”

“Good luck.”

I glanced at the badge.  It was blue.  My opportunity to get some secret, inside wisdom!

“I want to become a blue badge.  Do you have any advice on that?” The elevator doors opened.

“Solve your own cases,” was the reply.

Read on for stories, advice, and more.

A Primer on Kafka Streams

Bill Bejeck has an introduction to Kafka Streams:

Kafka Streams is an abstraction over Apache Kafka® producers and consumers that lets you forget about low-level details and focus on processing your Kafka data. You could of course write your own code to process your data using the vanilla Kafka clients, but the Kafka Streams equivalent will have far fewer lines, because it’s declarative rather than imperative. As a library, Kafka Streams lets you create a standalone application that can be run anywhere that can connect to a Kafka broker, whether that’s a laptop or a hefty cloud server. You just need to provide it with the host and port name of a broker. Combining Kafka Streams with Confluent Cloud grants you even more processing power with very little code investment.

Click through for a description as well as a whole series of embedded videos.
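
To give a flavor of the "declarative rather than imperative" point, here's a minimal Kafka Streams sketch. The topic names are made up for illustration, and you'd point bootstrap.servers at your own broker; the topology just reads one topic, upper-cases each value, and writes to another.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");
        // The one connection detail you provide: a broker host and port.
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Declarative topology: read from one topic, transform, write to another.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("raw-text");
        source.mapValues(value -> value.toUpperCase())
              .to("uppercased-text");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Shut the topology down cleanly when the application exits.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```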

Attorneys General Anti-Trust Suit against Google

This isn’t entirely data-related, but it’s fascinating to read the claims against Google. Here is the full text of the suit (with mild redactions). @PatrickMcGee_ dives into the details on Twitter (link goes to ThreadReaderApp), as does @fasterthanlime.

It’s important to keep in mind that these are allegations not yet proved, and Google’s lawyers will get their chance to respond. But, because Google’s not giving me any money to shill for them, I will say that this looks bad for them. Assuming these allegations are close to accurate, there’s some pretty blatant abuse of monopoly power.

Supporting 100 Languages with Microsoft Translator

Krishna Doss Mohan and Jann Skotdal take us through the evolution of Microsoft Translator:

Today, we’re excited to announce that Microsoft Translator has added 12 new languages and dialects to the growing repertoire of Microsoft Azure Cognitive Services Translator, bringing us to a total of 103 languages!

The new languages, which are natively spoken by 84.6 million people, are Bashkir, Dhivehi, Georgian, Kyrgyz, Macedonian, Mongolian (Cyrillic), Mongolian (Traditional), Tatar, Tibetan, Turkmen, Uyghur, and Uzbek (Latin). With this release, the Translator service can translate text and documents to and from languages natively spoken by 5.66 billion people worldwide.

I’ve used the live translation service a few times. It’s a little clunky but it does work pretty well.

Scaling Limitations with Site Reliability Engineering

Tyler Treat argues that the Site Reliability Engineering paradigm doesn’t scale:

We encounter a lot of organizations talking about or attempting to implement SRE as part of our consulting at Real Kinetic. We’ve even discussed and debated ourselves, ad nauseam, how we can apply it at our own product company, Witful. There’s a brief, unassuming section in the SRE book tucked away towards the tail end of chapter 32, “The Evolving SRE Engagement Model.” Between the SLIs and SLOs, the error budgets, alerting, and strategies for handling change management, it’s probably one of the most overlooked parts of the book. It’s also, in my opinion, one of the most important.

Read on for an explanation of this chapter and how it applies to organizations trying to implement SRE.

Calculating Lead Time from Jira and GitHub

Maria Zakourdaev wants to measure agility:

Do you want to visualize your RnD team performance to drive business value? Is there anything that is slowing down your development pipeline? How agile is your team? How long are your customers waiting for the features?

There are many things that can hold you back. Backlog management, code review delays, resources provisioning, manual testing and deployment automation efficiency. In this article I will show you my method of measuring one of the metrics described in this book called LeadTime.

Read on to see how you can do this.
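
As a rough sketch of the arithmetic involved (this is not Maria's implementation, and the hard-coded timestamps stand in for values you'd pull from the Jira and GitHub APIs), lead time boils down to the elapsed time between when the work item was created and when the change was merged or deployed:

```java
import java.time.Duration;
import java.time.Instant;

public class LeadTimeSketch {
    public static void main(String[] args) {
        // In practice these would come from the Jira REST API (the issue's
        // "created" field) and the GitHub API (pull request merge time or
        // deployment time).
        Instant jiraIssueCreated = Instant.parse("2021-10-01T09:15:00Z");
        Instant gitHubPrMerged   = Instant.parse("2021-10-12T17:40:00Z");

        // Lead time: how long the customer waited from request to shipped change.
        Duration leadTime = Duration.between(jiraIssueCreated, gitHubPrMerged);

        System.out.printf("Lead time: %d days, %d hours%n",
                leadTime.toDays(), leadTime.toHoursPart());
    }
}
```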
