I recently ran into this same problem, where the bcp error message stated:
An error occurred while processing the file “D:\MyBigFeedFile.txt” on data row 123798766555678.
Given that the text file was far in excess of Notepad++'s limits (or the limits of practical human interaction), I decided not to waste my time trying to find an editor that could cope, and simply looked for a way to pull a batch of lines out of the file programmatically so I could compare the good rows with the bad.
Click through for a precise and useful PowerShell snippet. The .NET file stream libraries are also good for this kind of thing.
This week, we look at the TIME data type. It is formatted as HH:mm:ss.fffffff, where HH is hours between 0 and 23, mm is minutes between 0 and 59, ss is seconds between 0 and 59, and f represents 0 or more fractional seconds, up to a maximum of seven decimal places.
With a maximum length of 5 bytes, TIME can store a value with a granularity of up to 100 nanoseconds.
I tend not to use TIME very often. It’s useful if you need it, but I rarely find myself needing a dateless time.
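As a quick illustration of the format and precision described above (the literal value is just an example), here is how it plays out in T-SQL:

```sql
-- TIME(7) stores up to seven fractional digits, i.e. 100-nanosecond granularity.
DECLARE @t TIME(7) = '13:45:30.1234567';

SELECT
    @t                       AS FullPrecision,        -- 13:45:30.1234567
    CAST(@t AS TIME(3))      AS MillisecondPrecision, -- 13:45:30.123
    DATEPART(NANOSECOND, @t) AS Nanoseconds;          -- 123456700 (a multiple of 100 ns)
```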
One limitation in the current public preview is that tempdb doesn't preserve custom settings after a failover happens. If you add new files to tempdb or change file sizes, these settings will not be preserved after failover, and the original tempdb will be re-created on the new instance. This is a temporary limitation and it will be fixed during public preview.
However, since Managed Instance supports SQL Agent, and SQL Agent can be configured to execute a script when it starts, you can work around this issue by creating a SQL Agent job that will pre-configure your tempdb.
SQL Agent will start whenever Managed Instance fails over, and a job containing a script along these lines can increase the tempdb size before you start running your workload on the new instance.
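The script from the post isn't reproduced here, but a startup job step could run something like the following. This is only a sketch: the logical file names (tempdev, templog) and sizes are assumptions, so check sys.master_files for your own values.

```sql
-- Hypothetical SQL Agent job step: re-apply custom tempdb sizing after a failover.
-- File names and sizes are illustrative; adjust to your instance's configuration.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 10GB, FILEGROWTH = 1GB);

ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, SIZE = 2GB, FILEGROWTH = 512MB);
```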
Managed Instance is a dedicated resource placed in an Azure Virtual Network with an assigned private IP address. Before you create a Managed Instance, you need to create an Azure Virtual Network using the Azure portal, PowerShell, or the Azure CLI.
If you are using the Azure portal, make sure that you use the Resource Manager deployment model and that the Service Endpoints option is Disabled in the Create Virtual Network blade (this is the default option, so don't change it).
If you want to have only one subnet in your Virtual Network (the Virtual Network blade will let you define a first subnet called default), you need to know that the Managed Instance subnet can have between 16 and 256 addresses. Therefore, use subnet masks /28 to /24 when defining the IP range for the default subnet. If you know how many instances you will have, make sure that you have at least 2 addresses per instance plus 5 system addresses in the default subnet; for example, four instances need 4 × 2 + 5 = 13 addresses, so a /28 subnet (16 addresses) would just cover them.
Both posts are useful if you’re interested in getting started with a managed instance.
This command applies only to Azure SQL Database; at a high level, it empties the database authentication cache of logins and firewall rules for the current USER database.
In Azure SQL Database the authentication cache makes a copy of logins and server firewall rules which are in the master database and puts them into memory within the user database. The Database Engine attempts re-authorisation using the originally submitted password and no user input is required.
If this still doesn’t make sense, then an example will really help.
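Reading between the lines, the command in question appears to be DBCC FLUSHAUTHCACHE (that identification is my inference; its documented behaviour matches the description above). A minimal run, executed in the user database rather than in master, is simply:

```sql
-- Azure SQL Database only: clear the cached logins and firewall rules
-- so they are re-read from master on the next authentication attempt.
DBCC FLUSHAUTHCACHE;
```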
The new data set “iris2” does not have any rows containing “setosa” as a possible value of “Species”, yet the levels() function still shows “setosa” in its output.
According to the user G5W on Stack Overflow, this is desirable behaviour for the levels() function. Here is my interpretation of the intent of the creators of base R: the possible values of a factor are fundamental attributes of that variable, which should not be altered because of changes in the data.
There’s some back-and-forth in the comments; my takeaway is that both are useful functions depending upon what, exactly, you want to learn.
The optimizer assumes that people write a semi join (indirectly e.g. using EXISTS) with the expectation that the row being searched for will be found. An apply semi join row goal is set by the optimizer to help find that expected matching row quickly.
For anti join (expressed e.g. using NOT EXISTS) the optimizer’s assumption is that a matching row will not be found. An apply anti join row goal is not set by the optimizer, because it expects to have to check all rows to confirm there is no match.
If there does turn out to be a matching row, the apply anti join might take longer to locate this row than it would if a row goal had been used. Nevertheless, the anti join will still terminate its search as soon as the (unexpected) match is encountered.
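To make the two shapes concrete, here is a minimal sketch; the table and column names are placeholders rather than anything from the original post:

```sql
-- Semi join: the optimizer sets a row goal, expecting a matching row to exist.
SELECT c.CustomerID, c.CustomerName
FROM dbo.Customers AS c
WHERE EXISTS (SELECT 1 FROM dbo.Orders AS o WHERE o.CustomerID = c.CustomerID);

-- Anti join: no row goal, since the optimizer expects to check every row to
-- prove there is no match (though it still stops early if one turns up).
SELECT c.CustomerID, c.CustomerName
FROM dbo.Customers AS c
WHERE NOT EXISTS (SELECT 1 FROM dbo.Orders AS o WHERE o.CustomerID = c.CustomerID);
```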
Another very interesting part of the series and well worth the time to read.
If you've ever dug down into the SQL Server transaction logs or had to build up restore chains, then you'll have come across Log Sequence Numbers (LSNs). Ever wondered why they're so large, why they all look suspiciously similar, why they don't start from 0, and just how SQL Server generates these LSNs? Well, here we're going to take a look at them.
Below we'll go through examples of how to look inside the current transaction log and backed-up transaction logs. This will involve using some DBCC commands and the undocumented fn_dblog and fn_dump_dblog functions. The last two are very handy for digging into SQL Server internals, but be wary about running them on a production system without understanding what's going on. They can leave file handles and processes behind that can impact your system.
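As a taste of what that looks like, here's a minimal query against fn_dblog. This is my own sketch rather than one of the post's examples, and the usual caveat about undocumented functions applies:

```sql
-- Peek at recent records in the current database's transaction log.
-- fn_dblog(NULL, NULL) means no starting or ending LSN filter.
SELECT TOP (10)
       [Current LSN],
       [Operation],
       [Context],
       [Transaction ID]
FROM sys.fn_dblog(NULL, NULL)
ORDER BY [Current LSN] DESC;
```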
It’s an interesting look into SQL Server’s internals.
As you may have seen at PASS Summit 2017 or another event, with the announcement of Azure Data Factory v2 (ADF), Biml will natively support ADF objects.
Please note that the native support is currently only available in BimlStudio 2018. If you're using BimlExpress, you can still generate the JSON for your pipelines, datasets, etc. using Biml, but you cannot use the newly introduced tags.
The really good parts are only available in the paid product; if you do a lot of Azure Data Factory work, that might tip the scales in favor of getting BimlStudio.
Since we are prognosticating, I want to take a guess at one of the constraints limiting the future. I present you with Meidinger’s law:
An industry’s growth is constrained by how much your junior dev can learn in two years.
Let me explain. On my team, one of our developers just left for a different company. We also have a college student who will be going full-time in May, upon graduation. How long do you think it's going to take the new guy to get up to speed?
And how long do you think he’s going to stay?
I think this is a useful dictum, one which explains a fair amount of industry movement.