Day: April 11, 2017

In this post I’m going to share a function (actually two) I use run scripts against multiple instances of SQL servers and run the data into a data table. I use this mainly for a replacement of the CMS feature of running against a folder and to put the data into a DataTable object which I output to a GridView that I can sort and filter any way I want which you can’t do in CMS.

Click through for the script.

Comments closed

Risk Vs Opportunity With Technical Advancement

Published 2017-04-11 by Kevin Feasel

Rob Farley on this month’s T-SQL Tuesday topic:

Does Automatic Tuning in Azure mean the end of query tuners? Does Self-Service BI in Excel and Power BI mean the end of BI practitioners? Does PaaS mean the end of DBAs?

I think yes. And no.

Yes, because there are tasks that will disappear. For people that only do one very narrow thing, they probably have reason to fear. But they’ve had reason to fear for a lot longer than Azure has been around. If all you do is check that backups have worked, you should have expected to be replaced by a script a very long time ago. The same has applied in many industries, from production lines in factories to ploughing lines in fields. If your contribution is narrow, you are at risk.

But no, because the opportunity here is to use the tools to become a different kind of expert. The person who drove animals to plough fields learned to drive tractors, but could use their skills in ploughing to offer a better service. The person who painted cars in a factory makes an excellent candidate for retouching dent repair, or custom paint jobs. Their expertise sets them apart from those whose careers didn’t have the same background.

Read the whole thing. Rob is characteristically thoughtful.

Comments closed

Weights In Graphs

Published 2017-04-11 by Kevin Feasel

Angshuman Talukdar shows how to use neo4j to solve minimum weighted distance problems:

A sample dataset is created in Neo4j using the CREATE clause in Cypher as given in Query 1 (create clause in Cypher). This loads the data into Neo4j and generates the graph database as shown in Figure 2.

Neo4j has a lot of graph algorithms shipped with it as a package and those are accessible only from the JAVA API. Implementing some of these algorithms in Cypher is quite complex and time consuming. From Neo4j 3.x, the concept of user defined procedures had been introduced called APOC (Awesome Procedures On Cypher). Those are custom implementations of certain functionality, that can’t be (easily) expressed in Cypher itself. The APOC library consists of many (about 300) procedures to help with many different tasks in areas like data integration, graph algorithms or data conversion.

Graph databases aren’t common, but they can be very useful for certain questions like the one Angshuman solves.

Comments closed

R Plots In Power BI

Published 2017-04-11 by Kevin Feasel

Leila Etaati has a three-part series on displaying R visuals in Power BI. Part 1 shows how to create a scatter plot:

so in the above picture we can see that we have 3 different fields that has been shown in the chart :highway and city speed in y and x axis. while the car’s cylinder varibale has been shown as different cycle size. However may be you need a bigger cycle to differentiate cylinder with 8 to 4 so we able to do that with add another layer by adding a function name

Part 2 shows how to use facet_grid to show multiple plots in one visual:

now I want to add other layer to this chart. by adding year and car drive option to the chart. To do that first choose year and drv from data field in power BI. As I have mentioned before, now the dataset variable will hold data about speed in city, speed in highway, number of cylinder, years of cars and type of drive.

I am going to use another function in the ggplot packages name “facet_grid” that helps me to show the different facet in my scatter chart. in this function, year and drv (driver) will be shown against each other.

Part 3 shows how to place charts on a map in R:

Now I have to merg the data to get the location information from “sPDF” into “ddf”. To do that I am going to use” merge” function. As you can see in below code, first argument is our first dataset “ddf” and the second one is the data on Lat and Lon of location (sPDF). the third and forth columns show the main variables for joining these two dataset as “ddf” (x) is “country” and in the second one “sPDF” is “Admin”. the result will be stored in “df” dataset

Aside from my strong dislike of bar/pie charts on maps, this is good to know, particularly if there is not a built-in or customer Power BI visual to replicate something you can do in R.

Comments closed

Azure Data Lake Store Best Practices

Published 2017-04-11 by Kevin Feasel

Ust Oldfield provides recommendations on how to size and lay out files in Azure Data Lake Store:

The format of the file has a huge implication for the storage and parallelisation. Splittable formats – files which are row oriented, such as CSV – are parallelizable as data does not span extents. Non-splittable formats, however, – files what are not row oriented and data is often delivered in blocks, such as XML or JSON – cannot be parallelized as data spans extents and can only be processed by a single vertex.

In addition to the storage of unstructured data, Azure Data Lake Store also stores structured data in the form of row-oriented, distributed clustered index storage, which can also be partitioned. The data itself is held within the “Catalog” folder of the data lake store, but the metadata is contained in the data lake analytics. For many, working with the structured data in the data lake is very similar to working with SQL databases.

This is the type of thing that you can easily forget about, but it makes a huge difference down the line.

Comments closed

Inline Outsourcing

Published 2017-04-11 by Kevin Feasel

Shane O’Neill coins a term:

There’s never enough hours in the day for everything I need to do!

How many times have we heard a complaint similar to that? Especially now-a-days when DBAs are tasked to look after more and more servers and instances. I cannot remember the last time I heard of a DBA taking care of servers in the single digits.

The work of the DBA keeps increasing but the amount of time that we have remains the same. How do we combat this? How do we make it so we are not sprinting just to keep up?

The only answer I have to this problem is this.

Don’t try to re-invent the wheel…let someone else do it.

It’s an interesting riff on the T-SQL Tuesday theme this month.

Comments closed

Azure SQL Automation

Published 2017-04-11 by Kevin Feasel

Arun Sirpal thinks about Azure automation in the context of how the job market is changing:

This ultimately maps to Query ID 297 where if you click the bar you can see the actual code.

Now, a debate occurred. This code was pretty awful, implicit conversions, GUIDs as cluster keys etc. I took the above code and analysed the execution plan and made some recommendations. I was quickly shut down; I was told to bump up the DTU of the database! Talk about masking the issue with hardware.

Check it out.

Comments closed

Backups Causing Transaction Log Growth In Simple Mode

Published 2017-04-11 by Kevin Feasel

Andy Mallon explains why the transaction log will grow during a backup even if you’re in simple recovery mode:

When SQL Server begins backing up data pages, it also starts keeping track of transactions, via the transaction log. After it has backed up the last data page, it then also backs up all of the transactions that occurred during the data backup. Upon restore, it will then roll those transactions forward or backward, as necessary, to ensure a consistent image is restored.

In our librarian metaphor, she would keep an activity log, which would include the changes to books A and D from the first update, then also the changes to D, X, Y, and Z from the second update. She would not “fix” the data within the backup, but simply store those update details along with her mashed-up copy. In the unlikely event she had to recreate the books (ie, a restore), then she would go back and spend the effort to piece it back together. During that restore process, she would look at the first transaction and see that her copy of Book A in her backup was too old, but Book D already had the update, and she would roll forward the update to Book A. Next, she would process the second update and see that Books X, Y, and Z had the updates, but D still needed this second update, and she would roll forward that second update to Book D. At this point, she would have successfully reconstructed an image that is consistent to the time the backup completed.

Great metaphor to describe consistency during backups.

Comments closed

Logging R Scripts

Published 2017-04-11 by Kevin Feasel

Tomaz Kastrun shows the places where you might be able to track R scripts running on your system:

Extensibility Log will store information about the session but it will not store the R or R environment information or data, just session information and data. Navigate to:

C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\LOG\ExtensibilityLog

to check the content and to see, if there is anything useful for your needs.

It’s not a great answer today.

Comments closed

Linchpins

Published 2017-04-11 by Kevin Feasel

Bert Wagner on the ongoing “what happens to my tech job?” question:

Seth Godin discusses the concept of a Linchpin in his same-titled book. A Linchpin is someone who is so good at what they do that they become indispensable to their organization. Linchpins are the kind of people who are self-motivated and are able to consistently deliver quality work. They are integral to the operation of a business, even if they don’t get all of the glamour of having VP or Director in their title.

And why are Linchpins always guaranteed jobs? In one scenario, Linchpins will outgrow their role and be promoted or find a better job. They are always learning and growing in addition to delivering, and so this is the natural procession. In the alternate scenario, if the Linchpin has to lose his or her current job (ie. think company buyouts where entire departments close), they will either 1) become promoted to elsewhere in the company because management recognizes their great skills or 2) they will have no problem finding work elsewhere, especially with great recommendations from their former employer.

It’s an interesting read.

Comments closed

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30