Press "Enter" to skip to content

Day: September 12, 2018

Subsetting Lists In R

Dave Mason continues his look at lists in R:

Subsetting the list with single brackets [] for the first element returns “Atlantic”. But if we take a closer look using the str() function, we see R returned the data as a class of type list:

> #Appears to return "Atlantic" as a character class.
> division[1]
$Name
[1] "Atlantic"
> #str shows us the return is actually a list of 1 element.
> str(division[1])
List of 1
 $ Name: chr "Atlantic"

Dave also explains the difference between single brackets and double brackets for list elements.
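
In short, double brackets return the element itself rather than a one-element list. Continuing Dave's division example, a quick sketch of the difference:

> #Double brackets return the element itself, not a list.
> division[[1]]
[1] "Atlantic"
> str(division[[1]])
 chr "Atlantic"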


Naive Bayes Against Large Data Sets

Catherine Bernadorne walks us through using Naive Bayes for sentiment analysis:

The more data that is used to train the classifier, the more accurate it will become over time. So if we continue to train it with actual results in 2017, then what it predicts in 2018 will be more accurate. Also, when Bayes gives a prediction, it will attach a probability. So it may answer the above question as follows: “Based on past data, I predict with 60% confidence that it will rain today.”

So the classifier is either in training mode or predicting mode. It is in training mode when we are teaching it. In this case, we are feeding it the outcome (the category). It is in predicting mode when we are giving it the features, but asking it what the most likely outcome will be.
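
To make the two modes concrete, here is a minimal sketch in R — my own example using the e1071 package's naiveBayes() on the built-in iris data, not Catherine's sentiment data:

library(e1071)

#Training mode: feed the classifier features plus the known outcome (the category).
model <- naiveBayes(Species ~ ., data = iris)

#Predicting mode: give it features only and ask for the most likely outcome...
predict(model, iris[1, -5])

#...or for the probability it attaches to each possible outcome.
predict(model, iris[1, -5], type = "raw")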

My contribution is a joke that I heard last night: a Bayesian statistician hears hooves clomping the ground. He turns around and sees a tiger. Therefore, he decides that it must be a zebra. It was the first time I’d heard that joke, and as a Bayesian zebra-spotter, I enjoyed it.


Triggers: Good, Bad, Mostly Ugly

Bob Pusateri walks us through a poorly-written DDL trigger:

First, the scope. While the application that deployed this trigger has its own database, AppDB, this trigger is firing for events on the entire server, which is what the ON ALL SERVER line means. Any qualifying event on this server, even if it pertains to another application with a separate database, will be written into this database. And what is a “qualifying event”? Literally any DDL statement. The line AFTER DDL_EVENTS specifies the very top of the event hierarchy used by DDL triggers.

So to recap on scope, this application is capturing all DDL statements on the entire server and saving a copy for itself. This application is seeing (and recording) plenty of events that it has no need to see. If this were a healthcare application or a system that dealt with PII it would be a legal nightmare, but fortunately it isn’t.

However, scope isn’t the only issue.

Worth the read. If you use DDL triggers at the instance level, make sure you know what you’re looking for and limit their scope as much as possible.
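
To make that advice concrete, here is a minimal sketch of a narrowly-scoped server-level DDL trigger (the trigger, database, and log-table names are all hypothetical). Instead of AFTER DDL_EVENTS, it subscribes only to index events and filters to the application’s own database:

CREATE TRIGGER trg_CaptureIndexChanges
ON ALL SERVER
AFTER CREATE_INDEX, ALTER_INDEX, DROP_INDEX
AS
BEGIN
    SET NOCOUNT ON;

    -- EVENTDATA() returns XML describing the statement that fired the trigger.
    DECLARE @evt XML = EVENTDATA();

    -- Only record events that belong to the application's own database.
    IF @evt.value('(/EVENT_INSTANCE/DatabaseName)[1]', 'nvarchar(128)') = N'AppDB'
        INSERT INTO AppDB.dbo.DDLEventLog (EventTime, EventData)
        VALUES (SYSDATETIME(), @evt);
END;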


Tracking Database Logins: 5 Methods

Eugene Meidinger has a medley of options for tracking server logins:

I once had to do some auditing for a customer, and it was a complicated, multi-stage process. We had to be able to demonstrate who had admin access and what kind of activity was going on on the server. But before we could do any of that, we first had to identify who was actually logging on.

We get a brief walkthrough of each, and an important warning.
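
As a taste of the simplest end of the spectrum, here is a minimal sketch using the sys.dm_exec_sessions DMV — note that it only shows who is connected right now and records nothing once a session ends:

SELECT login_name,
       COUNT(*)        AS session_count,
       MAX(login_time) AS most_recent_login
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY login_name
ORDER BY most_recent_login DESC;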


Server-Level Triggers

Shane O’Neill makes me wish Policy-Based Management had ever gotten the love it needed:

Now, my normal attitude with regard to triggers tends to run to the negative. Which is horrible, because triggers are just like any other tool; neutral by themselves and only good or bad based on how we use them.

So, with that being said, I’ve forced myself to think of a positive use for them. So here is a time when I’ve used triggers for a “good” cause and used them to get some visibility on when new databases are created.

DDL triggers can be useful things, as Shane shows us.
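
A minimal sketch of that sort of visibility trigger (the log table and all names here are hypothetical):

CREATE TRIGGER trg_DatabaseCreated
ON ALL SERVER
AFTER CREATE_DATABASE
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @evt XML = EVENTDATA();

    -- Record what was created, by whom, and when.
    INSERT INTO DBA.dbo.DatabaseCreationLog (DatabaseName, CreatedBy, CreatedAt)
    VALUES (@evt.value('(/EVENT_INSTANCE/DatabaseName)[1]', 'nvarchar(128)'),
            ORIGINAL_LOGIN(),
            SYSDATETIME());
END;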


Saving Table History With Triggers

Bert Wagner shows us a way of saving table history in SQL Server using triggers:

Triggers are something that I rarely use. I don’t shy away from them because of some horrible experience I’ve had, but rather because I rarely have a good need for them.

The one exception is when I need a poor man’s temporal table.

Check it out.  My main comment is, make sure you write the triggers to handle updating multiple rows; otherwise, you’ll be disappointed when rows go missing.
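
The multi-row point deserves emphasis. Here is a minimal sketch of the pattern (table and column names are hypothetical); because it selects from the deleted pseudo-table rather than assuming a single row, it handles multi-row updates and deletes correctly:

CREATE TRIGGER trg_Customer_History
ON dbo.Customer
AFTER UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Set-based: one history row per modified row, however many there are.
    INSERT INTO dbo.Customer_History (CustomerId, Name, Email, ValidUntil)
    SELECT d.CustomerId, d.Name, d.Email, SYSDATETIME()
    FROM deleted AS d;
END;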


Managing Power BI Dashboard Releases

Jesse Gorter gives us a couple examples of process for releasing Power BI dashboard changes in a corporate environment:

Corporate BI is the ‘one version of the truth’ data that must be governed and tested. The report on top of that might need to be governed too. So any changes to the model or the report (or both) need to go through some testing before they can be accepted. Hence DTAP & Power BI.

Let’s explore a case where an organization has an HR department with a manager and a data analyst. Those two make the reports and decide what they want to publish to their target users: HR employees. A developer creates the datamart and the Analysis Services Tabular model cube on which they create Power BI reports. For the first version, they create a pbix file and upload it to Powerbi.com. What happens when the business wants the report or the model changed?

Read on for two potential solutions.


Table Swaps With Triggers

Jay Robinson walks through the process of making a breaking change to a large, active table with limited downtime:

I can only recall one time in the past several years (at least a decade) that I’ve found triggers to be useful. It involves data migration.

The problem: You have a massive, high-activity table. Let’s call this Table_A. You need to make significant changes to it. For example, the clustered index needs to change. How do you accomplish this?

I’ve used a similar process with good success in the past.
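
As a rough sketch of the general shape (names are hypothetical, and Jay’s actual steps may differ): keep a rebuilt copy of the table in sync with a trigger while you backfill it, then swap the two with a quick rename.

CREATE TRIGGER trg_TableA_Sync
ON dbo.Table_A
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Replay each change into the rebuilt table (an update arrives as a delete plus an insert).
    DELETE b
    FROM dbo.Table_B AS b
    JOIN deleted AS d ON d.Id = b.Id;

    INSERT INTO dbo.Table_B (Id, Payload)
    SELECT i.Id, i.Payload
    FROM inserted AS i;
END;
GO

-- Once Table_B is backfilled and caught up, swap the names during a brief outage:
BEGIN TRAN;
EXEC sp_rename 'dbo.Table_A', 'Table_A_Old';
EXEC sp_rename 'dbo.Table_B', 'Table_A';
COMMIT;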
