
Day: April 3, 2019

Python Natural Language Processing Tools

Sandeep Aspari takes us through some of the tooling available in Python around Natural Language Processing:

TextBlob
TextBlob is a Python library and an extension of NLTK. It provides a simple API, exposes a large number of NLTK functions, and also includes the pattern library’s functionality. If you are just getting started, it is an excellent tool for learning, and it can be used in production applications that don’t require heavy performance. TextBlob objects behave like Python strings, so we can quickly transform and play with them just as we would in plain Python. Finally, TextBlob is used everywhere and is best suited for smaller projects.
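
As a quick illustration of that string-like API, here is a minimal sketch (the sample sentence is mine, not from the article):

    from textblob import TextBlob

    # First use may require the bundled corpora: python -m textblob.download_corpora
    blob = TextBlob("TextBlob wraps NLTK behind a simple, string-like API. It works well for small projects.")

    print(blob.words)         # tokenized words
    print(blob.sentences)     # sentence objects
    print(blob.tags)          # part-of-speech tags
    print(blob.noun_phrases)  # noun phrase extraction
    print(blob.sentiment)     # polarity and subjectivity
    print(blob.upper())       # behaves like a Python string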

There are several tools from which you can choose. Sandeep also covers some Node- and Java-based tools.

Comments closed

Developing Big Data Cluster Spark Jobs with IntelliJ

Jenny Jiang shows how we can use IntelliJ IDEA to develop Spark jobs against SQL Server Big Data Clusters:

We’re delighted to release the Azure Toolkit for IntelliJ support for SQL Server Big Data Cluster Spark job development and submission. For first-time Spark developers, it can often be hard to get started and build their first application, with long and tedious development cycles in the integrated development environment (IDE). This toolkit empowers new users to get started with Spark in just a few minutes. Experienced Spark developers also find it faster and easier to iterate their development cycle.

The toolkit extends IntelliJ support for the Spark job life cycle starting from creation, authoring, and debugging, through submission of jobs to SQL Server Big Data Clusters. It enables you to enjoy a native Scala and Java Spark application development experience and quickly start a project using built-in templates and sample code. The integration with SQL Server Big Data Cluster empowers you to quickly submit a job to the big data cluster as well as monitor its progress. The Spark console allows you to check schemas, preview data, and validate your code logic in a shell-like environment while you can develop Spark batch jobs within the same toolkit.
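
To put that job life cycle in concrete terms, here is a minimal Scala Spark batch job of the kind you might author and submit from the IDE (the object name and file path are illustrative, not taken from the toolkit’s templates):

    import org.apache.spark.sql.SparkSession

    object SampleBatchJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("SampleBatchJob").getOrCreate()

        // The same kinds of checks work interactively in the Spark console:
        // inspect the schema and preview rows before finishing the batch logic.
        val df = spark.read.option("header", "true").csv("/data/sample.csv")
        df.printSchema()
        df.show(10)

        spark.stop()
      }
    }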

It looks pretty good from my vantage point.

Comments closed

Quick Data Migration with PowerShell

Emanuele Meazzo shows how you can use dbatools to perform a quick table-by-table data migration using PowerShell:

I’m using the sqlserver and dbatools PowerShell modules to accomplish such a tedious task in the fastest way possible.

The Write-DbaDbTableData cmdlet is pretty neat because it can automatically create the destination table if it doesn’t exist, truncate the table if it does exist (or append, your choice), and keep identity values and nulls if necessary; everything is done via a bulk insert with a configurable batch size.
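
As a rough sketch of the pattern for a single table (instance, database, and table names are placeholders, and the switches map to the options Emanuele describes):

    # Read from the source and bulk-load into the destination with dbatools
    Import-Module dbatools

    $rows = Invoke-DbaQuery -SqlInstance "SourceServer" -Database "SourceDb" -Query "SELECT * FROM dbo.SalesOrders"

    $rows | Write-DbaDbTableData -SqlInstance "DestServer" -Database "DestDb" -Table "dbo.SalesOrders" `
        -AutoCreateTable -Truncate -KeepIdentity -KeepNulls -BatchSize 50000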

Click through for the script. It’s not a replacement for a real ETL process but if you just need something fast, it will do the job.

Comments closed

Using Temp Stored Procedures

Bert Wagner shows how you can perform testing with temporary stored procedures:

A while back I learned that it’s possible to create temporary stored procedures in SQL Server.

I never put that knowledge into practice however because I struggled to think of a good use case for when a temporary stored procedure would be preferable to a permanent stored procedure.

Not long ago I encountered a scenario where using a temporary stored procedure was the perfect solution to my problem.
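
For reference, the syntax mirrors temp tables: prefix the procedure name with # and it lives in tempdb for the life of the session. A minimal sketch (the table and columns are made up, not Bert’s scenario):

    -- Dropped automatically when the creating session closes
    CREATE PROCEDURE #GetRecentOrders
        @Days int
    AS
    BEGIN
        SELECT OrderId, OrderDate
        FROM dbo.Orders
        WHERE OrderDate >= DATEADD(DAY, -@Days, GETDATE());
    END;
    GO

    EXEC #GetRecentOrders @Days = 7;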

Those scenarios are rare but Bert did hit one of them.

Comments closed

SQL Server Isolation Levels

Pamela M. takes us through a humorous journey covering transaction isolation levels in SQL Server:

READ COMMITTED: One step up (and the default for SQL Server). A query in the current transaction can’t read data modified by someone else that hasn’t yet committed. No dirty reads. BUT… data could be changed by others between statements in the current transaction, so the data may not look the same twice. READ COMMITTED uses shared locks to prevent dirty reads, but that’s about all you get. You still get non-repeatable reads and phantom reads here (more on phantom reads below).
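
If you want to see a non-repeatable read for yourself, a sketch like this works (dbo.Accounts is a made-up table):

    -- Session 1
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    BEGIN TRANSACTION;

    SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;

    -- While this transaction is still open, session 2 commits:
    --   UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;

    SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;  -- may return a different value

    COMMIT;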

Click through for the full list.

Comments closed

CLR Assembly Failure After Availability Group Failover

Paul Randal walks us through a permissions issue after an Availability Group failover:

Jonathan was working with a client recently who experienced a CLR assembly failure after an AG failover and needed to figure out why. They’d been testing their AG disaster recovery strategy and ran into an unexpected problem with their application which relies heavily on SQLCLR and an UNSAFE assembly that calls a web service from inside SQL Server.  When they failed over their AG to their DR server, the CLR assembly failed with the following error:

An error occurred in the Microsoft .NET Framework while trying to load assembly id 65546. The server may be running out of resources, or the assembly may not be trusted with PERMISSION_SET = EXTERNAL_ACCESS or UNSAFE. Run the query again, or check documentation to see how to solve the assembly trust issues. For more information about this error: System.IO.FileLoadException: Could not load file or assembly ‘sqlclr_assemblyname, Version=1.0.0.0, Culture=neutral, PublicKeyToken=fa39443c11b12591’ or one of its dependencies. Exception from HRESULT: 0x80FC80F1
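
As general background (and not necessarily the root cause Paul identifies), trust for an EXTERNAL_ACCESS or UNSAFE assembly is typically granted at the instance level in master, so it has to exist on every replica that might host the database. One common setup looks like this (the key name, login name, and file path are illustrative):

    USE master;
    GO
    -- Create a key from the signed assembly and a login that carries the permission
    CREATE ASYMMETRIC KEY SqlclrKey
        FROM EXECUTABLE FILE = 'C:\Assemblies\sqlclr_assemblyname.dll';
    GO
    CREATE LOGIN SqlclrLogin FROM ASYMMETRIC KEY SqlclrKey;
    GO
    GRANT UNSAFE ASSEMBLY TO SqlclrLogin;
    GO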

Read on to see the root cause and what you can do to correct it.

Comments closed