R6 Classes In R

Kevin Feasel

2017-07-28

R

David Smith explains what R6 classes are in R:

The big advantage of R6 is that it makes it much easier to implement some common data structures in a user-friendly manner. For example, to implement a stack “pop” operation in S3 or S4 you have to do something like this:

x <- topval(mystack)
mystack <- remove_top(mystack)

In R6, the implementation is much simpler to use:

x <- mystack$pop()

David links to some good resources on the topic, so check those out as well.

Finding Running Agent Jobs

Adrian Buckman has a script to find all running SQL Agent jobs:

For our Procedure we wanted to show currently running jobs regardless of run time or run time vs historical run time, we also wanted to be able to see if the job was started by the Agent itself or a User, and to see which step the job is currently running on including that steps Elapsed time and last but not least the Total job elapsed time.

Now I will be the first to admit , this is not the prettiest code I have ever produced but getting some of this information out is quite tricky 🙂

This Procedure is a great alternative to the SQL Agent Activity Monitor, it doesn’t have all the information that the Monitor has but it probably has everything that you need to see at a glance – and whats more being that it is a Stored Procedure you could run this across multiple servers via registered servers for example and get results within seconds.

Click through for the script.

Row Headers In U-SQL

Kevin Feasel

2017-07-28

U-SQL

Melissa Coates shows how to handle row headers in CSV files when writing U-SQL queries:

This is a quick tip about syntax for handling row headers in U-SQL, the data processing language of Azure Data Lake Analytics. There are two components: handling row headers on the source data which is being queried, and row headers on the dataset being generated by ADLA.

Click through for the one-liners as well as sample queries.

Powershell To Copy Reporting Services Subscriptions

Claudio Silva has a new contribution to the Reporting Services Powershell module:

If we take a look to the “New Subscription” form, we will discover about a dozen of fields that need to be configured. Doing this by hand can make you want to pull your hairs, also the probability of error is huge, even with copy & paste.
Who wants to do copy & paste of dozens of fields between reports? I know who doesn’t – me 🙂

Click through to learn more about Claudio’s cmdlets for getting, setting, and removing Reporting Services subscriptions.

Moving TempDB In Linux

David Klee shows how to migrate the tempdb database when running SQL Server on Linux:

We previously created a folder at /var/opt/mssql/data/tempdb01 for these files. Moving them is straightforward, once you know the file system structure. The following commands will move them to the new location, and I also add additional files to equal the four vCPUs I have on this SQL Server VM. The file growth is my model database’s default of 64MB for this instance. Do as you would normally do with SQL Server on Windows with tempdb file counts and separation of duties for your workload.

Read on for the process.  As a general spoiler, the “how to do this in Linux” answer is usually pretty close to the same as the “how to do this in Windows” answer, at least once you get into Management Studio.

The Risks Of Clearing The Procedure Cache

Erin Stellato explains two downsides to running DBCC FREEPROCCACHE or anything else which clears query plans:

Ideally, you should remove only what’s absolutely necessary.  Using DBCC FREEPROCCACHE is a sledgehammer approach and typically creates a spike in CPU as all subsequent queries need to have their plans re-generated.  Glenn gives examples on how to use each statement (and others) in his post Eight Different Ways to Clear the SQL Server Plan Cache, and I want to show you one more thing that happens when you clear a plan (or all plans) from cache.

For this demo script, I recommend running it against a TEST/DEV/QA environment because I am removing plans from cache which can adversely affect performance.

There are reasons to run these commands, but ideally, you should be as precise as possible in clearing plans out of the cache.

Power KPI Custom Visual

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Power KPI Custom Visual.  The Power KPI displays your KPI indicator values on a helpful multi-line chart with labels.

I like Devin’s example of using this for time series projections versus actuals versus priors.

Extracting Phone Numbers With Apache Tika

Kevin Feasel

2017-07-27

Hadoop

Unni Mana knows how to get your digits:

Last time, I had difficulties detecting phone numbers from different types of documents. The challenge was that I had to use different parsers to parse and extract the phone numbers. For example, to extract phone numbers from a Word document, I had to use a library that supports Word. Also, I cannot use the same library or logic to parse a PDF file. Ultimately, I need to maintain different libraries for different document types, which, as you can image, can lead to many issues.

It looks like this covers international phone numbers as well.  Seems pretty interesting.

ConvertTo-HTML Tips

Jeff Hicks shows off some of the niceties of Powershell’s ConvertTo-HTML cmdlet:

This is because Convertto-Html, like Export-CSV and Export-Clixml, take the entire object. This is not just the default result you see on the screen. Remember, everything will be treated as a string. In my example, if I want a similar HTML file, I will have to recreate the output with Select-Object. This might require piping the original result to Get-Member to discover the “real” property names.

It won’t output beautiful results, but with the appropriate CSS theming, you can generate good internal reports.

Binary Collation Case-Sensitivity

Solomon Rutzky explains that binary collations are not really case-sensitive:

Quite often people will use, or will recommend using, a binary Collation (one ending in “_BIN” or “_BIN2“) when wanting to do a case-sensitive operation. While in many cases it appears to behave as expected, it is best to not use a binary Collation for this purpose. The problem with using binary Collations to achieve case-sensitivity is that they have no concept of linguistic rules and cannot equate different versions of characters that should be considered equal. And the reason why using a binary Collation often appears to work correctly is simply the result of working with a set of characters that has no accents or other versions. One such character set (a common one, hence the confusion), is US English (i.e. “A” – “Z” and “a” – “z”; values 65 – 90 and 97 – 122, respectively). However, there are a few areas where binary collations don’t behave as many (most, perhaps?) people expect them to.

Solomon gives examples of false negatives (such as the same character represented by different code point combinations) and also explains how sort order can change.

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31