Kevin Feasel – Page 969

Python in Power BI Desktop

Published 2020-07-22 by Kevin Feasel

David Eldersveld dives into using Python as an external tool in Power BI:

Why use Python as an external “tool”? Even though Python isn’t a “tool” in the same sense as the “Big 3” community tools focused on this month, I want to show how versatile the External Tools feature is. I also want to encourage people to use their imagination and explore how Power BI isn’t really as closed as some people think–at least the data model…
Some of these ideas are not exclusive to Python, but there’s enough variety in the Power BI and data science communities for people to possibly figure out if some of this might be useful within the context of their own environments, skills, and organizations.

David also follows up with a series of sample ideas.

Disk Caching with SQL Server VM Disks in Azure

Published 2020-07-22 by Kevin Feasel

Niko Neugebauer performs some tests:

Microsoft has been extremely clear in the best practices recommendation for the SQL Server workloads on Azure VMs:
– use read caching for the data drives/storage pools
– use no caching for the log drives/storage pools
– use read caching for the tempdb drives/storage pools
Sounds simple and direct, doesn’t it?
Let me borrow your attention for the next couple of minutes pointing to some situations where you might want to reconsider the best practices.

But do read on for some important notes.
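
For reference, the three baseline recommendations quoted above can be written down as a simple lookup. This is only an illustrative sketch: the role names and the `recommended_caching` helper are made up for this example, and the values use the names of Azure's disk host caching settings (None, ReadOnly, ReadWrite). Niko's post covers situations where these defaults might be worth revisiting.

```python
# Baseline host-caching recommendations as quoted above.
# The dictionary keys ("data", "log", "tempdb") are illustrative role names.
RECOMMENDED_HOST_CACHING = {
    "data": "ReadOnly",    # read caching for data drives/storage pools
    "log": "None",         # no caching for log drives/storage pools
    "tempdb": "ReadOnly",  # read caching for tempdb drives/storage pools
}


def recommended_caching(drive_role: str) -> str:
    """Return the baseline caching setting for a drive role (hypothetical helper)."""
    key = drive_role.strip().lower()
    if key not in RECOMMENDED_HOST_CACHING:
        raise ValueError(f"Unknown drive role: {drive_role!r}")
    return RECOMMENDED_HOST_CACHING[key]


if __name__ == "__main__":
    for role in ("data", "log", "tempdb"):
        print(role, "->", recommended_caching(role))
```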

Finding and Downloading SQL Server Updates

Published 2020-07-22 by Kevin Feasel

Andy Levy combines dbatools and KBUpdate:

Another of Chrissy LeMaire’s (blog | twitter) projects is KBUpdate. Compared to dbatools it’s a pretty compact module, but it’s incredibly useful – it’ll seek out information about KB updates and even download them for you! She’s also rolled these functions into dbatools for convenience, so we don’t need to install or import that module separately.

Read on to see how Andy ties it all together.

Waiting on a SQL Agent Job to Complete

Published 2020-07-22 by Kevin Feasel

Thomas Rushton doesn’t have time to sit around and wait:

You know how it is. You need to wait for a job to complete before moving onto the next one… And, just sometimes, you need to do the same with SQL Server.

Read on for a script which waits for one SQL Agent job to complete before moving on and doing something else.
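
The general wait-then-continue pattern can be sketched as a polling loop. This is not Thomas's script; it is a minimal illustration in Python, and `wait_for_job`, its parameters, and the `is_running` callable (which would wrap whatever check you use to ask whether the job is still executing) are all hypothetical names for this example.

```python
import time
from typing import Callable


def wait_for_job(
    is_running: Callable[[], bool],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until is_running() returns False.

    Returns True if the job finished, False if the timeout expired first.
    `is_running` is a caller-supplied check (an assumption of this sketch).
    """
    deadline = clock() + timeout_seconds
    while is_running():
        if clock() >= deadline:
            return False  # gave up waiting
        sleep(poll_seconds)
    return True  # job is no longer running


if __name__ == "__main__":
    # Simulated job that reports "running" twice and then completes.
    states = iter([True, True, False])
    finished = wait_for_job(lambda: next(states), poll_seconds=0.1)
    print("finished" if finished else "timed out")
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays.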

Latching in SQL Server

Published 2020-07-22 by Kevin Feasel

Dan Jackson explains the concept of latching in SQL Server:

To start with, a basic definition: ‘Latches are lightweight synchronization objects that are used by the storage engine of SQL Server to protect the internal memory structures’. Compared with locks in SQL Server, which are a transaction-level construct to manage concurrency, latches work at the thread level to maintain data integrity within the internal memory structures. They are not exposed outside of the SQL Server Operating System (SQLOS). They are only managed by SQL Server itself, not by users (unlike locks, which can be overridden via lock hints or by changing the isolation level). It is useful to keep in mind that a single transaction can use multiple threads at the same time.

Latching is a funny concept to me, in that I think people say “This must be a latching problem” far more often than it actually is one, but when there is a proper latching problem, it usually winds up being a pretty big deal.

Making a Heap Molehill out of a Heap Mountain

Published 2020-07-22 by Kevin Feasel

Bert Wagner needed to move a large subset of a heap into its own heap:

Recently I had to filter out 1.2 billion records from a 3.5 billion row heap. Don’t ask me why this 3.5 billion row table is stored as a heap.
If the lack of a clustered index wasn’t bad enough, I also had some other restrictions:
– I couldn’t add a clustered index (or any index for that matter) sorted on the key I needed to filter on. It wasn’t my system, and I needed to access the 1.2 billion records sooner than it would take to get a clustered index approved and added. Even then, maybe the lack of a clustered index on this table is a feature and my request would be denied. Who knows.
– I didn’t have a server that could store all of the data. At first I thought of copying all 3.5 billion rows to my own server and indexing it how I needed, but I didn’t have enough storage space anywhere to do that.
– My connection to the server had a relatively short timeout set on it. This also couldn’t be changed. If I couldn’t copy all 3.5 billion rows because of storage, I also couldn’t copy all 1.2 billion records in one fell swoop because the connection would time out.

I’m also glad to see that Bert is back in action and look forward to seeing these posts and videos pop up again.

Result Window Too Large in Elasticsearch

Published 2020-07-21 by Kevin Feasel

Samir Behara explains a common Elasticsearch error:

I have configured Error Logs for my Elasticsearch cluster, and I frequently see the error below in the logs:

org.elasticsearch.ElasticsearchException$1: Result window is too large, from + size must be less than or equal to: [10000] but was [15020]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.

Click through to understand what the issue is and how you can resolve it.
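
The arithmetic behind the message is simple enough to sketch. The check below mirrors the rule stated in the error (from + size must be less than or equal to the window, 10000 in the log above); the function name is hypothetical, and the split of 15020 into `from_=15000` and `size=20` is an assumption for illustration, since the log only reports the sum.

```python
# Window size taken from the error message above; it corresponds to the
# index.max_result_window index-level setting.
DEFAULT_MAX_RESULT_WINDOW = 10_000


def exceeds_result_window(
    from_: int,
    size: int,
    max_result_window: int = DEFAULT_MAX_RESULT_WINDOW,
) -> bool:
    """Return True if a from/size request would be rejected under the stated rule."""
    return from_ + size > max_result_window


if __name__ == "__main__":
    # Assumed split of the 15020 reported in the log (only the sum is known).
    print(exceeds_result_window(from_=15_000, size=20))
    # A request ending exactly at the window boundary is still allowed.
    print(exceeds_result_window(from_=9_990, size=10))
```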

The Cloudera Operational Database Experience

Published 2020-07-21 by Kevin Feasel

Liliana Kadar, et al., cover scalability options for DBAs working with Cloudera:

Cloudera’s Operational Database (OpDB) supports a scale-up (SMP) environment. The caching layer is able to consume all memory in a large SMP environment. Memory has to be large enough to cover RegionServers, DataNodes and operating system, and to have enough extra space to allow the block cache to assist with reads. When HBase is running with other components, CPU contention and memory contention can be a problem that is easy to address with proper YARN tuning.
As a result of the scale-up architecture, multiple services and engines can be run on a single node. For smaller nodes, multiple services and engines have to be spread out amongst a larger set of nodes.

In addition, Krishna Maheshwari, et al., announce a technical preview of their Cloudera Operational Database experience:

The Cloudera Operational Database (COD) experience is a managed dbPaaS solution which abstracts the underlying cluster instance as a Database. It can auto-scale based on the workload utilization of the cluster and will be adding the ability to auto-tune (better performance within the existing infrastructure footprint) and auto-heal (resolve operational problems automatically) later this year. It offers multi-modal client access with NoSQL key-value using Apache HBase APIs and relational SQL with JDBC (via Apache Phoenix). The latter makes COD accessible to developers who are used to building applications that use MySQL, Postgres, etc.

It’s interesting to see Cloudera move in this direction.

Missing Indexes Don’t Tell the Whole Story

Published 2020-07-21 by Kevin Feasel

Erik Darling explains some of the shortcomings of the missing indexes DMV:

The problem with relying on any data point is that when it’s not there, it can look like there’s nothing to see.
Missing index requests are one of those data points. Even though there are many reasons why they might not be there, sometimes it’s not terribly clear why one might not surface.
That can be annoying if you’re trying to do a general round of tuning on a server, because you can miss some easy opportunities to make improvements.

Read on for a few examples of where the results can betray you.

Auto-Shutdown an Azure VM and Notify You on Slack

Published 2020-07-21 by Kevin Feasel

Daniel Hutmacher has a fun assignment:

Virtual machines cost money when they’re powered on. Most servers obviously need to be on 24 hours a day. Others, like development machines, only have to be on when you’re using them. And if you forget to turn them off, they’ll empty out your Azure credits (or your credit card) before you know it.
Today, I’ll show you how to set an Auto-shutdown time to turn a VM off if you forget, as well as have Azure notify you on Slack 30 minutes ahead of time, so you have the option to postpone or cancel the shutdown.

There are a few steps to the process, but everything is straightforward.

Author: Kevin Feasel