Hadoop 3.0.0-alpha2

Kevin Feasel

2017-02-23

Hadoop

Andrew Wang and Ray Chaing note that a new Hadoop 3 alpha is now available:

YARN introduces the notion of opportunistic containers in addition to the current guaranteed containers. An opportunistic container is queued at the NodeManager waiting for resources to become available, and run opportunistically so long as resources are available. They are preempted, if and when needed, to make room for guaranteed containers. Running opportunistic containers between the completion of a guaranteed container and the allocation of a new one should improve cluster utilization.

There are a couple other new features, including support for Azure Data Lake Store.

Using popt To Switch Tabs In Powershell ISE

Jana Sattainathan shows how to open a specific Powershell ISE tab without moving your hands off the keyboard:

This is a function that activates a tab based on a partial search string that you type in the CLI portion of the ISE. Among the numerous tabs you have open in the ISE, to activate a tab that has the word “register”, you would just type in

1
popt register

That’s it. The tab that has the word “register” will become active.

Why “popt” instead of a standard name?

“popt” is the alias you can create for the real “search and activate function” Pop-FileInISETab. It stands for “poptab”. You don’t want to trade one problem with another by having to type that long name, so, there is functionality to optionally create an alias named “popt” for the function.

At first I said to myself, “Why would anybody have this many tabs open?”  Then I looked at my instance of SQL Server Management Studio.  Then I decided I wanted this for SSMS.

SQL Server Error Handling Scenarios

Dan Guzman goes into some detail about error handling in SQL Server:

T-SQL and ADO.NET data access code must work in concert with one another to ensure SQL errors are detected in application code. The T-SQL constructs used in multi-statement batches can affect if and how when errors are reported by ADO.NET during batch execution. I’ll start by citing core T-SQL error handling objectives, which can be summarized as:

1) Ensure a multi-statement T-SQL batch doesn’t continue after an error occurs.
2) Rollback transaction after errors.
3) Raise error so that the client application is aware a problem occurred.

Read the whole thing.

Thinking About Contexts

Ewald Cress has started a new series on system internals:

Per-processor partitioning of certain thread management functions makes perfect sense, since we’d aim to minimise the amount of global state. Thus each processor would have its own dispatcher state, its own timer list… And hang on, this is familiar territory we know from SQLOS! The only difference is that SQLOS operates on the premise of virtualising a CPU in the form of a Scheduler, whereas the OS kernel deals with physical CPUs, or at least what it honestly believes to be physical CPUs even in the face of CPU virtualisation.

This is a start to a very interesting series.

Python + knitr

Steph Locke shows how to integrate Python code into knitr:

One of the nifty things about using R is that you can use it for many different purposes and even other languages!

If you want to use Python in your knitr docs or the newish RStudio R notebook functionality, you might encounter some fiddliness getting all the moving parts running on Windows. This is a quick knitr Python Windows setup checklist to make sure you don’t miss any important steps.

Between knitr, Zeppelin, and Jupyter, you should be able to find a cross-compatible notebook which works for you.

Getting Started With Azure Cognitive Services

Rolf Tesmer has a demo app showing what Azure Cognitive Services Text Analytics can do:

Each execution of the application on any input file will generate 3 text output files with the results of the assessment.  The application runs at a rate of about 1-2 calls per second (the max send rate cannot exceed 100/min as this is the API limit).

  • File 1 [AzureTextAPI_SentimentText_YYYYMMDDHHMMSS.txt] – the sentiment score between 0 and 1 for each individual line in the Source Text File.  The entire line in the file is graded as a single data point.  0 is negative, 1 is positive.

  • File 2 [AzureTextAPI_SentenceText_YYYYMMDDHHMMSS.txt] – if the “Split Document into Sentences” option was selected then this contains each individual sentence in each individual line with the sentiment score of that sentence between 0 and 1.  0 is negative, 1 is positive.  RegEx is used to split the line into sentences.

  • File 3 [AzureTextAPI_KeyPhrasesText_YYYYMMDDHHMMSS.txt] – the key phrases identified within the text on each individual line in the Source Text File.

Rolf has also put his code on GitHub, so read on and check out his repo.

Pattern Matching With LIKE

Kevin Feasel

2017-02-23

T-SQL

Shane O’Neill looks at different ways to match patterns with the LIKE operator:

So let’s test out this bad boy using the WideWorldImporters database, see if we can find everyone with the first name of Leyla.

Simple right? And because [Sales].[Customers] uses the full name, we have to use LIKE.

Now a developer comes along and says “Wait a second, my sister is Leila”. So we try to cheat and add a wildcard in there.

Leonardo!? Well I suppose he does count in this situation, but there’s 2 characters between the ‘e’ and the ‘a’ and I only wanted one.

Click through for a couple pattern matching tricks and look for ways to avoid Lejla in your life.

Unpivoting Wide Tables

Kevin Feasel

2017-02-23

T-SQL

Dave Mason has to unpivot a particularly wide table:

Today’s less-than-ugent challenge was to un-pivot the output of RESTORE HEADERONLY. I thought for certain someone else, somewhere, at at some time must have wanted to do the same thing. So I asked the Twitterverse, but no one responded. I guess I’ll have to make do myself without the easy button. No worries, though. We can do this!

Metadata tables are good friends at times like these.

Clusterless Availability Groups

Allan Hirt talks about clusterless AGs:

A quick history lesson: through SQL Server 2016, we have three main variants of AGs:

  • “Regular” AGs (i.e. the ones deployed using an underlying Windows Server failover cluster [WSFC] requiring Active Directory [AD]; SQL Server 2012+)
  •  AGs that can be deployed without AD, but using a WSFC and certificates (SQL Server 2016+ with Windows Server 2016+)
  • Distributed AGs (SQL Server 2016+)

SQL Server v.Next (download the bits here) adds another variant which is, to a degree, a side effect of how things can be deployed in Linux: AGs with no underlying cluster. In the case of a Windows Server-based install, this means that there could be no WSFC, and for Linux, currently no Pacemaker.

Read on for more details, including limitations and expectations.

Categories

February 2017
MTWTFSS
« Jan Mar »
 12345
6789101112
13141516171819
20212223242526
2728