In the middle, we have an open source time series database called InfluxDB, which is designed for collecting timestamped data such as performance metrics. Into that, we feed data from an open source project called Telegraf, which can collect more than just SQL Server statistics. To show us the data in nice, pretty graphs that we can manipulate, drill down on, and even set up alerts against, we display it using Grafana. You'll find links to all of these products as we go through the setup of the solution.
Tracy’s post is dedicated to installation and configuration more than defining metrics, but it does get you on the road to custom metrics visualization.
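To give a flavor of the plumbing Tracy walks through, a minimal telegraf.conf along these lines would ship SQL Server metrics into InfluxDB. This is a sketch, not her exact configuration: the connection string, credentials, and database name are all placeholders.

```toml
# Collect SQL Server performance metrics...
[[inputs.sqlserver]]
  servers = ["Server=localhost;Port=1433;User Id=telegraf;Password=CHANGEME;app name=telegraf;"]

# ...and write them to a local InfluxDB instance for Grafana to query.
[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "telegraf"
</imports>
```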
As I was writing this, I thought I'd play with the autodetect_column_names setting. Unfortunately, it wasn't an option for this particular file. Logstash threw an error, :exception=>java.lang.ArrayIndexOutOfBoundsException: -1, which leads me to guess that my file is too wide for this setting. This file is staggeringly wide at 75 columns. If you have a narrower file, this could be a really cool option: if your file format changes because someone adds or removes a column from the CSV, it'll be a lot easier to maintain. Alas, it's not an option in this situation.
Check out the script.
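For comparison, a Logstash csv filter with the columns spelled out explicitly looks something like the following. The column names here are invented placeholders, not the author's 75-column layout.

```conf
filter {
  csv {
    separator => ","
    # autodetect_column_names => true failed on this wide file,
    # so the columns are listed explicitly instead (placeholders shown):
    columns => ["event_time", "server_name", "metric_value"]
  }
}
```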
Every day is an interesting day in the life of a DBA. Today, I received a request to run a HUGE script. This script had 125k INSERT statements, each on a separate line. Just opening the script was an issue, let alone running it against a database. You get this error in SSMS just trying to open it: “The operation could not be completed. Not enough storage is available to complete this operation”
This is how I handled it:
- I split the file into manageable 1,000-line files (125 files in total)
- Looped through each split file and ran it in!
Yes, it was that simple. Thanks to PowerShell!
Read on to see how simple it is.
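The author's solution is in PowerShell; the same split-and-run idea can be sketched in Python. This is an illustration of the chunking step only, and the table name, chunk size variable, and sqlcmd invocation mentioned in the comments are assumptions, not the original script.

```python
import itertools

def split_script(lines, chunk_size=1000):
    """Yield chunks of at most chunk_size lines from an iterable of SQL lines."""
    it = iter(lines)
    while True:
        chunk = list(itertools.islice(it, chunk_size))
        if not chunk:
            return
        yield chunk

# 125,000 single-line INSERT statements -> 125 files of 1,000 lines each.
lines = [f"INSERT INTO t VALUES ({i});" for i in range(125_000)]
chunks = list(split_script(lines))

# Each chunk could then be written to its own .sql file and run in turn,
# e.g. with sqlcmd -S server -d database -i chunk_001.sql (not done here).
```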
Right now, the data is not ideal for analysis. Keeping in mind how I want to use the data, I need to perform some cleansing and transformation tasks. Any time I work with a new data source, I look to see if I need to do any of the following:
Remove unneeded rows or columns. Power BI stores all my data in memory when I have the PBIX file open. For optimal performance when it comes time to calculate something in a report and to minimize the overhead required for my reports, I need to get rid of anything I don’t need.
Expand lists or records. Whether I need to perform this step depends on my data source. I’ve noticed it more commonly in JSON data sources whenever there are multiple levels of nesting.
Rename columns. I prefer column names to be as short, sweet, and user-friendly as possible. Short and sweet because the length of the name affects the width of the column in a report, and it drives me crazy when the name is ten miles long but the value is an inch long, relatively speaking. User-friendly is important because a report is pretty much useless if no one understands what a column value represents without consulting a data dictionary.
Rearrange columns. This step is mostly for me to look at things logically in the query editor. When the model is built, the fields in the model are listed alphabetically.
Set data types. The model uses data types to determine how to display data or how to use the data in calculations. Therefore, it’s important to get the data types set correctly in the Query Editor.
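The checklist above (minus the expand-records step, which depends on the source) can be sketched outside Power BI with plain Python dictionaries standing in for query rows. The column names here are invented for illustration.

```python
rows = [
    {"InternalID": "1", "CustNm": "Adams", "SalesAmt": "100.5", "Unused": "x"},
    {"InternalID": "2", "CustNm": "Baker", "SalesAmt": "75.25", "Unused": "y"},
]

# 1. Remove unneeded columns to keep the in-memory model lean.
for r in rows:
    r.pop("Unused", None)

# 2. Rename columns: short, sweet, user-friendly.
renames = {"CustNm": "Customer", "SalesAmt": "Sales"}
rows = [{renames.get(k, k): v for k, v in r.items()} for r in rows]

# 3. Rearrange columns (dicts preserve insertion order in Python 3.7+).
order = ["Customer", "Sales", "InternalID"]
rows = [{k: r[k] for k in order} for r in rows]

# 4. Set data types so calculations and display behave correctly.
for r in rows:
    r["Sales"] = float(r["Sales"])
    r["InternalID"] = int(r["InternalID"])
```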
It’s a fun topic to use for learning about Power BI…says the guy wearing a Blue Jackets shirt right now…
If you do not enable implicit transactions, and you don’t start an explicit transaction, you are in the default “autocommit” mode.
This mode means that individual statements are automatically committed or rolled back as whole units. You can’t end up in a place where only half your statement is committed.
Our question is really about whether there are unseen problems with this default mode of autocommit for single-statement units of work.
By force of habit, I wrap data modification operations in an explicit transaction. It lets me test my changes before committing, and the time you're most likely to spot an error seems to be right after hitting F5.
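That habit is easy to demonstrate outside SQL Server. Here is the same pattern with Python's sqlite3 driver in autocommit mode; it is an illustration of the idea, not T-SQL, and the table and values are made up.

```python
import sqlite3

# isolation_level=None puts the driver in autocommit mode: each statement
# commits on its own unless we open a transaction explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO accounts VALUES (1, 100.0)")  # committed immediately

# Force of habit: wrap the modification so it can be inspected before committing.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 500 WHERE id = 1")
check = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
# check is -400.0 -- that looks wrong, so back the change out instead.
conn.execute("ROLLBACK")

balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
# balance is back to 100.0
```

In autocommit mode, that bad UPDATE would have been permanent the moment it ran; the explicit BEGIN is what bought the chance to look first.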
Here’s a strange one that I’ve recently come across. I had a customer report that their log shipping restore jobs were chock-a-block with errors. Now, the logs seem to have been restoring just fine, but before every restore attempt, the job reports the error:
Error: Failed to update database “DATABASE NAME” because the database is read-only.
Unfortunately, I haven’t got any direct access to the server, but their log shipping is set up to disconnect users before the restore and leave the database in standby afterward. After a bit of to-ing and fro-ing, I asked the customer to send me a trace file covering the period when the restore job ran.
Read on for the details and keep those servers patched.
Below this, I see a list of the recommendations. This has grabbed tables that appear to contain data that might be sensitive and require classification. One of the tenets of the GDPR is that you know your data. You aren’t allowed to figure this out later; rather, you must proactively know what data you are collecting and processing.
It’s a good overview of the feature. Like Steve mentions, I appreciate that this data is stored as extended properties: that way, third-party and custom-built tools can make use of it. You can also script the properties out for migration.
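Because the labels live in sys.extended_properties, a query along these lines would surface them for other tools. This is a sketch: the sys_% name filter is an assumption about how the classification properties are named on a given instance, so check yours before relying on it.

```sql
-- Sketch: read classification labels stored as extended properties.
SELECT OBJECT_NAME(ep.major_id)           AS table_name,
       COL_NAME(ep.major_id, ep.minor_id) AS column_name,
       ep.name                            AS property_name,
       CAST(ep.value AS nvarchar(256))    AS property_value
FROM sys.extended_properties AS ep
WHERE ep.class = 1        -- object/column level
  AND ep.minor_id > 0     -- column-level properties only
  AND ep.name LIKE 'sys_%';   -- assumed naming for classification labels
```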
As I started to work on this, my first thought was that it would be helpful to know how many tables had a composite primary key. This would give me an idea of how many tables I was dealing with. Thankfully, SQL Server exposes this information through system DMVs (dynamic management views) along with the COL_NAME function.
Note: the COL_NAME function will only work with SQL Server 2008 and newer.
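The author's exact query isn't reproduced here, but the general shape, using sys.indexes, sys.index_columns, and COL_NAME, looks something like this sketch:

```sql
-- List primary key columns per table; a count above 1 means a composite key.
SELECT OBJECT_NAME(ic.object_id)                 AS table_name,
       COL_NAME(ic.object_id, ic.column_id)      AS column_name,
       COUNT(*) OVER (PARTITION BY ic.object_id) AS pk_column_count
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id
 AND ic.index_id = i.index_id
WHERE i.is_primary_key = 1
ORDER BY table_name, ic.key_ordinal;
```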
All of this time, I’d never known about COL_NAME.