Category: Learning

Exploring a Dataset for Microsoft Fabric Suitability

Eugene Meidinger continues a series on learning Microsoft Fabric:

This is week 1 where I try to take Magic: The Gathering draft data to learn Microsoft Fabric. Check out week 0 for some reasoning why.

So, before I do anything else, I want to get a sense of the data I’m looking at to see if it’s suitable for this project. I download the data, and because it’s gzipped, I use 7-Zip to open it up on Windows 10, or Windows Explorer on Windows 11. In either case, the first thing I notice is the huge size disparity. When compressed, it is a quarter of a gigabyte. Uncompressed, it’s about 10 GB. This tells us something.

Read on to learn more about the dataset and how Eugene tackled some of the exploratory data analysis.
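
If you would rather not extract a 10 GB file just to look at it, the same first pass can be done by streaming. What follows is a minimal sketch in Python rather than anything from Eugene's post: the function names and the file name in the usage note are hypothetical, and it assumes the download is a single gzipped text file.

```python
import gzip
import os
from itertools import islice


def peek_lines(path, n=5, encoding="utf-8"):
    """Return the first n lines of a gzipped text file without extracting it."""
    with gzip.open(path, "rt", encoding=encoding, newline="") as handle:
        return [line.rstrip("\r\n") for line in islice(handle, n)]


def size_report(path, chunk_size=1 << 20):
    """Stream through the archive and compare compressed vs. uncompressed bytes."""
    compressed = os.path.getsize(path)
    uncompressed = 0
    with gzip.open(path, "rb") as handle:
        # Read in fixed-size chunks so memory use stays flat regardless of file size.
        while chunk := handle.read(chunk_size):
            uncompressed += len(chunk)
    ratio = uncompressed / compressed if compressed else float("inf")
    return compressed, uncompressed, ratio
```

Calling `peek_lines("draft_data.csv.gz")` (a placeholder file name) would show the header and the first few rows, and `size_report` returns the two sizes along with their ratio. A quarter of a gigabyte expanding to about 10 GB works out to roughly 40:1, and a ratio that high generally points to very repetitive content.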

I also agree completely with Eugene’s point about serendipity. Keeping your metaphorical eyes open will increase the likelihood that you’ll just happen upon something that can help you later, or something that serves a need you didn’t know you had. I used to wander around the library back in my university days because I didn’t know what I didn’t know about topics (that is, the “unknown unknown” quadrant), so I’d just pick up some books that caught my eye. Not all of them were hits, though enough were to make the strategy worthwhile.

The Importance of Asking the Right Question

Brian Kelley offers some advice:

This is not a “clickbait” title, but an important consideration when it comes to developing technical solutions. Let me give you an example between two questions for SQL Server on-premises running on Windows.

Question 1: Does SQL Server allow you to set things like password complexity, password length, and the number of failed login attempts before the account is locked?

Question 2: Does SQL Server support things like password complexity, password length, and locking the account after a number of failed login attempts?

Betteridge’s Law of Headlines also applies to Brian’s post.

It’s so easy to get locked into answering the question without that additional context, and it’s also hard to tell if a person is asking question 1 because they don’t know the answer in general, or if they’re asking because they know you can do it in Windows but aren’t sure if there is a separate mechanism for SQL Server.

Whitepapers for Oracle and SQL Server in Azure

Kellyn Gorman has been busy:

I’ve been pretty busy with work and travel, but I finally got an official Silk GitHub repository to publish a couple new white papers and sizing assessment worksheets for customer access. These are primarily Oracle and SQL Server to Azure focused white papers, but I will be publishing ones on GCP next, to be followed by AI and other database platforms soon.

Click through for links to the documents.

A Path to Avoid Getting Overwhelmed with Microsoft Fabric

Kurt Buhler tries to limit information overload:

It’s just too much; I don’t have time for all this stuff.

I think this is a big problem. It’s a problem not just because people shouldn’t feel overwhelmed, but also because it says something about how effectively these new features, tools, and resources are being communicated, understood, and used. But what is the problem, exactly? And if you’re in the minority of people not feeling overwhelmed, why should you care?

Perhaps most importantly, how can we approach these new features, tools, and resources to ensure we understand them and can find value without feeling overwhelmed?

Read on for several tips on how to tackle learning about a product with a large surface area. And I’d also note that anybody who is comfortable working in SQL Server had to go through the same process.

The Search for Extended Events Information

Grant Fritchey stays on the first page:

Here’s their paraphrased (probably badly) story:

“I was working with an organization just a few weeks back. They found that Trace was truncating the text on some queries they were trying to track. I asked them if they had tried using Extended Events. They responded: What’s that? After explaining it to them, they went away for an hour or so and came back to me saying that had fixed the problem.”

We all smiled and chuckled. But then it struck me. This wasn’t a case of someone who simply had a lot more experience and understanding of Profiler/Trace, so they preferred to use it. They had literally never heard of Extended Events.

Why?

This led Grant to perform some search engine shenanigans and what he found was curious. A couple of points with search engines, though:

  • Search engine results will differ based on your location (IP address) and whether you are signed in or not. Google is particularly selective about this stuff. It might also affect Bing, but let’s face it: if you’re using Bing to search for anything other than images, you’ve already resigned yourself to failure.
  • In my case, a search for “extended events” (without quotation marks) did show quite a few pages which I’d consider reasonable for the topic: a Microsoft Learn quickstart article on using extended events, Brent Ozar’s extended events material, a SQL Shack article on the topic, etc. A good number of these links are content from the past 5 years, as well.
  • Grant mentions the “page 1” effect in search engines, and he’s absolutely right. The vast majority of people performing a search never leave the first page of results. This is part of why Google went to an infinite scrolling approach rather than showing explicit numbered pages.

Microsoft Fabric Presentations

Wolfgang Strasser opens a vault:

Are you searching for Microsoft Fabric Presentations? Do you want to learn more about the new unified analytics solution?

There are plenty of presentations available around the internet – some only as recordings, some as PDFs only.

BUT – last week, I found a (no longer) hidden gem of Microsoft Fabric content on the internet – the Microsoft Fabric Readiness repository

Click through for the link to those presentations.

Request: Fill out the Redgate State of the Database Landscape Survey

Ryan Booz would like a few minutes of your time:

We’d like to hear what you have to say about the topology of your database landscape, and we want to give you first access to the data after the survey closes.

By taking a few minutes to answer the questions, you can help provide clarity on how our jobs as database professionals are changing and what skills will be needed in the future to successfully manage change.

Click through for the article and fill out the survey at https://rd.gt/survey. This survey is open until September 30, 2023, so there’s still a bit of time to share your thoughts. One annoying thing about the survey is that they ask you about all of the database platforms, even if you didn’t select that you actually use them. Fortunately, you can skip those questions.

Freshness Labels on Content

Steve Jones does some noodling:

I chose the title slightly to poke at Stack Overflow (SO), but the same take expressed in this tweet could be said about SQL Server Central. It’s not quite the same, as anyone can answer questions on SQL Server Central.

The tweet is a (long) hot take from Jerry Nixon, a C# developer and MS evangelist in Denver. Essentially he says that a lot of the SO answers are wrong, especially as the software and languages change. Old answers are upvoted, and remain at the top of the list, even as newer answers might be better. People don’t like the behavior on SO of moderators and people who post, which is something we’ve tried to avoid or limit here at SQL Server Central. We want there to be professional discussions. SO also doesn’t allow much discussion or nuance in the questions or answers.

This isn’t just a SO problem or an SSC one.

Read the whole thing. This is a huge problem with search engines today and there’s a hacky solution for it. Going back to the original PageRank algorithm that Google used, your rank on the search results list was heavily tied to how many individuals linked back to you. Older pages tend to have more linkbacks because they’ve been around longer, and so there’s a built-in bias toward older content. Google, in particular, has done a lot to work around this problem, but there’s a real issue with timeliness in articles: sometimes, you want the brand new information (like, say, product recommendations); other times, you want older or even the original information (such as if you’re researching historical activities). The problem is that there’s no good way to indicate this to the search engines we have, so the hacky solution is for content creators to create sites like “The May 2023 Guide to Blahblahblah” and for search engine users to look for terms like “2023 blahblahblah” so they can avoid all of the outdated 2022 and 2021 blahblahblah discussions.
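
To make the linkback bias concrete, here is a toy sketch of a simplified PageRank computed by power iteration. It is not how any current search engine ranks results, and the link graph and page names are made up for illustration: an older guide that three blogs link to versus a newer guide with a single inbound link.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration over a dict of page -> outbound links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank across all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: the older guide has had time to collect more linkbacks.
links = {
    "blog_a": ["old_guide"],
    "blog_b": ["old_guide"],
    "blog_c": ["old_guide", "new_guide"],
}
ranks = pagerank(links)
```

In this made-up graph, `ranks["old_guide"]` comes out higher than `ranks["new_guide"]` purely because more pages point at it, regardless of which guide is more accurate today.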

There’s also a story in here around keeping things up to date. Some people are good about that—they’ll go back and update years-old blog posts based on what’s new and happening. I am not one of those people.