Press "Enter" to skip to content

Category: Administration

Thinking About The Gitlab Outage

Brent Ozar shares his thoughts on the recent Gitlab outage:

You can read more about the details in GitLab’s outage timeline doc, which they heroically shared while they worked on the outage. Oh, and they streamed the whole thing live on YouTube with over 5,000 viewers.

There are so many amazing lessons to learn from this outage: transparency, accountability, processes, checklists, you name it. I’m not sure that you, dear reader, can actually put a lot of those lessons to use, though. After all, your company probably isn’t going to let you live stream your outages. (I do pledge to you that I’m gonna do my damnedest to do that ourselves with our own services, though.)

There are some good pointers in here.

Comments closed

SQL On Linux Connectivity

Slava Murygin troubleshoots connectivity issues with SQL Server on Linux:

After you enter SA password you have to get “vNext” version of your SQL Server.
If you did not get the correct response you might have following problems:
I)   Wrong SA password. To fix it, just re-configure SQL Server.
II)  SQL Server Tools are not installed.
III) Typo. Check your syntax.

The troubleshooting process is a bit different from SQL Server on Windows, but it’s still pretty straightforward.

Comments closed

Avoiding Percent Growth

Angela Henry has a script to tame percent growth on database files:

I decided I needed to do something else other than just send an email notification, I needed to take corrective action when it occurred.  So I wrote a little stored procedure that will take the ALTER DATABASE statement as a parameter, parse it and take the appropriate corrective action. 

Simple enough, right?  Now I just need to add the call to my newly created stored procedure in my server level trigger and we are good to go.  But wait, you can’t ALTER a database within an ALTER DATABASE statement (don’t believe me? Use this as a learning exercise to see what happens when you try).  So what could I do?  There are several things you could do, but I chose to create a table that could hold this newly created ALTER DATABASE statement and insert the record there.  Then I created a SQL Agent job that runs once every hour and reads that table and executes any entries it finds, then deletes them after successfully executing.

Read the whole thing, including the disclaimer.

Comments closed

Troubleshooting Connectivity Errors

The CSS SQL Server Engineers have a guide for solving connectivity issues:

In addition to providing a quick checklist of items that you can go through, the doc provides step by step troubleshooting procedures for the following error messages:

  • A network-related or instance-specific error occurred while establishing a connection to SQL Server

  • No connection could be made because the target machine actively refused it

  • SQL Server does not exist or access denied

  • PivotTable Operation Failed: We cannot locate a server to load the workbook Data Model

  • Cannot generate SSPI context

  • Login failed for user

  • Timeout Expired

  • The timeout period elapsed prior to obtaining a connection from the pool

Click through for the guide.  It’s in choose-your-own-adventure format, though without nice graphics.

Comments closed

SSRS Log File Location Change

Wolfgang Strasser points out that SSRS log files are in a new directory structure for vNext:

The log files can be found in the Logfiles directory (it was the same directory also for the older versions). In SSRS vNext there more different log files..

The logging information seems to be splitted into multiple log files – if you want for example dig into the Power BI on-premises logging I propose to have a look at the RSPower*.log files.

This happens every once in a while, so it’s good to know when the log files move somewhere else.

Comments closed

Checking Database Availability

Dimitri Furman explains that database availability is a trickier problem than it may first appear:

To check the availability of the database, the application executes the spCheckDbAvailability stored procedure. This starts a transaction, inserts a row into the AvailabilityCheck table, flushes the data to the transaction log to ensure that the write is persisted to disk even if delayed durability is enabled, explicitly reads the inserted row, and then rolls back the transaction, to avoid accumulating unnecessary synthetic transaction data in the database. The database is available if the stored procedure completes successfully, and returns a single row with the value 1 in the single column.

Note that an execution of sp_flush_log procedure is scoped to the entire database. Executing this stored procedure will flush log buffers for all sessions that are currently writing to the database and have uncommitted transactions, or are running with delayed durability enabled and have committed transactions not yet flushed to storage. The assumption here is that the availability check is executed relatively infrequently, e.g. every 30-60 seconds, therefore the potential performance impact from an occasional extra log flush is minimal.

This ends up being much more useful than a simple “is the service on?” heartbeat, as it shows that the database is available, not just that the engine is running.

Comments closed

Comparing Column Names

Jen McCown has a script to compare column names between tables to find case inconsistencies:

I’m reviewing the code for the upcoming Minion CheckDB, and one of the things we’re checking for is case consistency in column names. For example, if Table1 has a column named Col1, and Table2 has COL1, well that’s no good.

But, how do we easily find those mismatches on a system that’s not case sensitive? Easy: collations.

Click through for the script.

Comments closed

Why Logins Are Failing

Kenneth Fisher looks at various error log messages to show how to fix different login failures:

Error Displayed
Login failed for user ‘Kenneth_Test’. (Microsoft SQL Server, Error: 18456)

How do I fix it
Simple enough. Change the password. However, there are a few warnings here. If you change the password and it’s being used by someone (for example this is an application id) you may be breaking the application. If this is production, that could be a no-no. Just in case I like to back up the password hash before changing it.

I’ve also gone back to the well several times over the years with this blog post showing what the various error states mean when you get error 18456.

Comments closed

Thinking About Dead Connections

Lonny Niederstadt has a half-baked idea on dead connections:

In SQL Server 2005, a valuable addition was made.  SQL Server would use a “Keep Alive” value specified specifically per instance, rather than the KeepAliveTime specified by the Windows registry.  It defaulted to 30000 ms/30 seconds.

The blog post below explains this new addition, and mentions that the interval for SQL Server will be a fixed 1000 ms/1 second, regardless of the KeepAliveInterval specified in the Windows registry. At the time SQL Server 2005 was introduced, TCPMaxDataRetransmissions from the Windows registries still controlled the maximum number of probes.

Read on for more.

Comments closed