Category: Administration

A Trace Flag (Generally) to Avoid

Erik Darling takes us through trace flag 3608:

According to the docs:

Prevents SQL Server from automatically starting and recovering any database except the master database. If activities that require TempDB are initiated, then model is recovered and TempDB is created. Other databases will be started and recovered when accessed. Some features, such as snapshot isolation and read committed snapshot, might not work. Use for Move System Databases and Move User Databases.

Note: Do not use during normal operation.

Scope: global only

But it turns out it can do quite a bit of harm. Many things stop working when it’s in use, including the automatic creation of statistics.

Click through to see what kinds of things fail to work as a result of this trace flag.
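If you just want to check whether the flag is active, or reproduce the missing auto-created statistics on a test instance, a quick T-SQL sketch along these lines works (the table is made up, and this is not Erik's exact demo):

-- Check whether trace flag 3608 is enabled globally
DBCC TRACESTATUS(3608, -1);

-- Query a column with no index or statistics; this normally triggers
-- an auto-created _WA_Sys_* statistic on val.
CREATE TABLE dbo.StatsTest (id int IDENTITY(1,1), val int NOT NULL);
INSERT dbo.StatsTest (val) SELECT TOP (1000) 1 FROM sys.all_objects;
SELECT COUNT(*) FROM dbo.StatsTest WHERE val = 1;

-- Normally you'd see a _WA_Sys_* statistic on val here;
-- with the trace flag on, expect no auto-created statistics.
SELECT s.name, s.auto_created
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.StatsTest');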

Scaling Hadoop Beyond 10,000 Nodes

Keqiu Hu, et al, take us through a problem of scale:

At LinkedIn, we use Hadoop as our backbone for big data analytics and machine learning. With an exponentially growing data volume, and the company heavily investing in machine learning and data science, we have been doubling our cluster size year over year to match the compute workload growth. Our largest cluster now has ~10,000 nodes, one of the largest (if not the largest) Hadoop clusters on the planet. Scaling Hadoop YARN has emerged as one of the most challenging tasks for our infrastructure over the years.

In this blog post, we will first discuss the YARN cluster slowdowns we observed as we approached 10,000 nodes and the fixes we developed for these slowdowns. Then, we will share the ways we proactively monitored for future performance degradations, including a now open-sourced tool we wrote called DynoYARN, which reliably forecasts performance for YARN clusters of arbitrary size. Finally, we will describe Robin, an in-house service which enables us to horizontally scale our clusters beyond 10,000 nodes.

Read on to learn about the problems they experienced and how they resolved them.

AWS EC2 I3 Instance Types and Storage Persistence

Steve Rezhener has a warning for us:

Amazon Web Services Elastic Compute Cloud (a.k.a. EC2) is a service that lets anyone with a credit card rent a virtualized server from Amazon. To cater to different clients’ needs, AWS provides various instance types that are either general-purpose or specific-purpose instances (focused on CPU, RAM, IO). You can see the different types in Fig 1. This blog post is going to talk about a storage optimized instance, the I3 instance type family, its little-known problem, and the solution in the form of Elastic Block Storage (a.k.a. EBS).

Click through for the warning, more explanation, and what you can do about it. H/T the SQLServerCentral newsletter.

Fixing Remote Desktop Resolution to Match Screen Size

Kenneth Fisher fixes a problem:

One of the things I hate most about using remote desktop is when the resolution for the remote server is way off what the size of my window is. You end up with the scroll bars on the side and bottom of the screen and spend half your time moving the screen around so that you can find what you need. Open Notepad, scroll to the top of the screen so you can save it, then scroll to the bottom of the screen so you can switch to the Excel spreadsheet that’s hidden by the Notepad window, etc.

It turns out there is an easy setting that forces the resolution to match the window size. Even if you end up with really weird window sizes.

My fix historically was to use mRemoteNG, which did this for you.

Attributing Redshift Costs to Users

Jason Pedreza, et al, show how you can break down query utilization by user in an Amazon Redshift database:

In its simplest form, cost attribution can be determined from the amount of storage assigned to individual objects, using the ownership of those objects to map them to groups. But the downside of this approach is that it doesn’t provide a true translation of the resource usage. For example, let’s say Team 1 has a total object size of 1 TB, whereas Team 2 has 100 GB in total size. Team 1 members run 10 queries daily, and Team 2 runs 1,000 queries per day. Of course, Team 2 uses more resources than Team 1.

The Amazon Redshift RA3 architecture allows you to pay for the compute and data warehouse storage capacity separately, therefore storage doesn’t reflect the resources used by the teams for the cost attribution.

Click through to see how.
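As a rough illustration of that usage-based idea (a simplified sketch, not the full approach from the AWS post), you can aggregate query runtime per user from Redshift's system tables:

-- Approximate per-user query usage over the last week.
-- Note that stl_query only keeps a few days of history.
SELECT
    u.usename AS user_name,
    COUNT(*) AS query_count,
    SUM(DATEDIFF(ms, q.starttime, q.endtime)) / 1000.0 AS total_exec_seconds
FROM stl_query AS q
JOIN pg_user AS u ON u.usesysid = q.userid
WHERE q.starttime >= DATEADD(day, -7, GETDATE())
GROUP BY u.usename
ORDER BY total_exec_seconds DESC;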

Learn, Test, Forget

Erik Darling shows us the bright side of a memory equipped to forget:

I often have to talk clients down from odd ledges. At this point, there’s about 20 years of advice about SQL Server out there, and for various reasons received wisdom often gets migrated en masse, whether it’s still applicable or not.

Outdated trace flags and settings are pretty common, and don’t get me started on coding practices.

But I get it — that stuff came along with advice that still makes sense — like parallelism and memory settings, separating off tempdb, etc.

Unfortunately, it often leads people to the “it used to be fast so it should always be fast” trap based on what used to work, rather than what makes sense currently. Or even what the current problem is.

Read the whole thing. Also, in a rare double-quotation, I want to cite this line from Erik:

“I don’t know, xxVanWilderFan420xx said they were bad and we should avoid them.”

Never fully trust a person going by the pseudonym xxVanWilderFan420xx. However, if it were xxSolomonKaneFan420xx, I'd believe them 100%.

Azure Database for PostgreSQL Replicas

Gauri Mahajan takes us through replica creation in Azure Database for PostgreSQL:

Azure Database for PostgreSQL is an Azure offering of the open-source Postgres database. As there are many databases and data warehouses that are derived from Postgres, during migration from Postgres to a different flavor of another database or data warehouse that is compatible with Postgres, often read replicas are employed. The replicas are read-only since it’s a one-way replication from the master database to replicas. And replicas serve the purpose of decreasing the load on the primary transactional database in production environments. Replicas are typically used as migration sources, reporting and ad-hoc analytics sources and for other purposes. Let’s go ahead and learn to create and manage read replicas in Azure Database for PostgreSQL.

Click through for the process.
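Creating the replica itself is a platform operation (portal, CLI, or PowerShell), which Gauri walks through, but once it exists a couple of standard Postgres queries make for handy sanity checks (nothing Azure-specific in them):

-- On the replica: returns true, confirming the server is read-only and replaying WAL
SELECT pg_is_in_recovery();

-- On the primary: one row per connected replica, including replication state and replay position
SELECT application_name, state, sent_lsn, replay_lsn
FROM pg_stat_replication;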

Accessing Network Shares from SQL Server

Daniel Hutmacher engages in chicanery:

If you use a local service account for your SQL Server service, your server won’t automatically have permissions to access other network resources like UNC paths. Most commonly, this is needed to be able to perform backups directly to a network share.

Using a domain account as your SQL Server service account will allow the server to access a network share on the same domain, but if the network share is not on your domain, like an Azure File Share, you need a different solution.

There’s a relatively easy way to make all of this work, though.

Click through to see how, as well as several methods to make it work within SQL Server.
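To give a flavor of the trick, here is a hypothetical sketch (placeholder storage account, share, and key; not necessarily Daniel's exact method) that maps an Azure File Share inside the SQL Server service session via xp_cmdshell and then backs up to the UNC path:

-- Enable xp_cmdshell temporarily; weigh the security implications first
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1; RECONFIGURE;

-- Map the share under the SQL Server service account's session
-- (storage account name, share name, and key are placeholders)
EXEC xp_cmdshell 'net use \\mystorage.file.core.windows.net\backups <storage-account-key> /user:AZURE\mystorage';

-- The UNC path is now reachable, e.g. for a backup
BACKUP DATABASE [MyDatabase]
TO DISK = N'\\mystorage.file.core.windows.net\backups\MyDatabase.bak';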
