Category: Versions

Using Specific R Package Versions in Docker Images

Roman Lustrik shares how to fix package versions in Docker images:

Using packages in R is easy. You install from CRAN using install.packages("packagename"), it resolves dependencies and you’re good to go. What R natively doesn’t handle so well is installing a particular package version without jumping through hoops. Technically you need the source file of the package version you want to install AND all source files of the dependencies (in the correct version, of course). This has been made almost seamless with the packages packrat and, more recently, renv.

This comes in handy when you are constructing a Dockerfile to run in production. Usually you want to run this defensively and do not want things to change from one image build to another. To get there, you can save all your package names and versions into a file (renv.lock) and use that to reconstruct the now defined package structure with predictable versions (see renv vignette here).
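As a rough illustration of what pinning buys you, here is a minimal Python sketch that reads a lock file shaped like renv.lock (JSON with a top-level "Packages" map whose entries carry a "Version" field) and reports any package whose installed version has drifted from the pinned one. The sample contents, the function names, and the "installed" dictionary are all hypothetical; in a real image build, renv itself restores the library from the lock file.

```python
import json

# Hypothetical lock file contents, trimmed to the fields this sketch reads.
# Real renv.lock files carry more metadata per package.
SAMPLE_LOCK = """
{
  "R": {"Version": "3.6.3"},
  "Packages": {
    "jsonlite": {"Package": "jsonlite", "Version": "1.6.1", "Source": "Repository"},
    "R6": {"Package": "R6", "Version": "2.4.1", "Source": "Repository"}
  }
}
"""


def pinned_versions(lock_text):
    """Return {package name: pinned version} from lock file text."""
    lock = json.loads(lock_text)
    return {name: rec["Version"] for name, rec in lock.get("Packages", {}).items()}


def drifted(lock_text, installed):
    """Return {package: (pinned, installed)} for every mismatch.

    `installed` is a hypothetical {package: version} mapping, for example
    gathered from the library inside a freshly built image. A package that
    is missing from `installed` is reported with None.
    """
    pinned = pinned_versions(lock_text)
    return {
        name: (want, installed.get(name))
        for name, want in pinned.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    # One package matches the lock file, one was silently upgraded.
    print(drifted(SAMPLE_LOCK, {"jsonlite": "1.6.1", "R6": "2.5.0"}))
```

An empty result means the build matches the lock file; anything else is a version that changed between builds.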

This is quite useful as R package developers tend not to covet backwards compatibility, and one of the key benefits of containers is to have the option to keep the same code base and configuration in all environments.

SQL Server 2019 CU5

Microsoft has released SQL Server 2019 CU5:

This article describes Cumulative Update package 5 (CU5) for SQL Server 2019. This update contains fixes that were released after the initial release of SQL Server 2019 and updates the SQL Server and Analysis services components to the following builds.

I see a half-dozen or so PolyBase-related fixes, though I was hoping to read that they’ve fixed the issue around creating and using external objects when you are connected to SQL Server via Windows authentication. I’ll have to test that out to see if it was fixed—that was broken in CU3 and remained so in CU4.

Azure Data Studio June 2020 Release

Alan Yu announces a new release of Azure Data Studio:

The Data Virtualization extension for Azure Data Studio is now updated with more functionality and a new logo. This update allows you to use the data virtualization wizard to virtualize MongoDB and Teradata data sources into your SQL Server. This new functionality is available for SQL Server 2019 instances running CU5 or later.

To install the extension, search for Data Virtualization in the extension viewlet in Azure Data Studio and click install.

Of course I’m going to clip the bit about PolyBase.

Merge Replication: Subscriber and Publisher Versions

Steve Stedman gives us the proper order for upgrading SQL Server when you’re using merge replication:

Working on a recent SQL Server merge replication project, we needed to update some of the servers in a merge replication scenario without upgrading all of them. Consider a merge replication setup with a publisher, a distributor and 2 or more subscribers all on the same version of SQL Server, and you need to upgrade the SQL Server version on the subscriber to a newer version like SQL Server 2019.

Before doing any type of upgrade, I wanted to confirm that things would or would not work. First checking some Microsoft documentation, it appears that replication from a SQL 2012, SQL 2014, SQL 2016, or any older version of a publisher is not supported to a subscriber running on SQL Server 2019. Or more specifically, the subscriber needs to be on the same or older version than the publisher.
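The rule in that last sentence can be sketched as a simple version comparison. This is only an illustration in Python, assuming the rule is exactly "subscriber version is less than or equal to publisher version"; the function and table names are made up for the example, and it does not account for the caveat covered in the linked post.

```python
# Major version numbers for the SQL Server releases named in the post.
SQL_SERVER_MAJOR = {
    "2012": 11,
    "2014": 12,
    "2016": 13,
    "2017": 14,
    "2019": 15,
}


def merge_subscriber_supported(publisher, subscriber):
    """True when the subscriber is on the same or an older version than the publisher."""
    return SQL_SERVER_MAJOR[subscriber] <= SQL_SERVER_MAJOR[publisher]


if __name__ == "__main__":
    # Upgrading only the subscriber to 2019 breaks the rule.
    print(merge_subscriber_supported("2016", "2019"))
    # Upgrading the publisher first keeps an older subscriber valid.
    print(merge_subscriber_supported("2019", "2016"))
```

Read this way, the publisher has to be upgraded before (or along with) any subscriber.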

Read on for a demo, as well as an interesting caveat.

May 2020 Release of Azure Data Studio

Alan Yu has some goodies for us:

The key highlights to cover this month include:

– Announcing Redgate SQL Prompt extension
– Announcing the new machine learning extension
– Added new Python dependencies wizard
– Added support for parameterization for Always Encrypted
– Improvements to the notebook markdown toolbar
– Bug fixes

For a complete list of updates, refer to the Azure Data Studio release notes.

I’ll have to check out the ML extension.

Don’t Install Hadoop on Windows

Hadi speaks truth:

A few days ago, I published the installation guides for Hadoop, Hive, and Pig on Windows 10. And yesterday, I finished installing and configuring the ecosystem. The only conclusion I have is: “Think 1000 times before installing Hadoop and related technologies on Windows!”

The biggest problem is that Microsoft got flaky about this. Back in 2012-2013, they backed running Hadoop on Windows as part of getting HDInsight up and running. I even remember the HDInsight emulator which could run on a local desktop. By 2014 or so, they shifted directions and decided it wasn’t worth the effort. Because Apache Spark (which does have pretty decent Windows support, at least for development) really wants Hive, you can fake it with winutils.

Availability Group Offerings in Standard Edition

Guy Glantser notes an issue with Availability Group documentation:

In SQL Server 2017 Microsoft added a new flavor called Read-Scale Availability Groups. This is different, because the goal here is not high availability or disaster recovery, but rather read-scalability. As opposed to the other flavors, in RSAG there is no cluster, and there is also no automatic failover mechanism. But you can set up multiple secondary replicas with read-only access and load balancing, and offload read workloads from the primary replica. This is a great scalability feature, and you can read more about it here.

Now, if you check Microsoft documentation regarding the editions and supported features of SQL Server, you will be happy to see that RSAG is supported in Standard Edition. I was happy to see it too. Unfortunately, if you try to set up a Read-Scale Availability Group on Standard Edition, it will not work. You will only be able to create a Basic Availability Group, as discussed earlier.

Click through for the answer, as well as what you can do in Standard Edition.

Azure Data Studio April 2020 Release

Alan Yu announces the April 2020 release of Azure Data Studio:

KQL magic extension support is now available in Azure Data Studio Notebooks. It allows you to connect to, query, and explore Azure Data Explorer (Kusto), Application Insights, and Log Analytics data using KQL (Kusto Query Language). If you are using Log Analytics today for your Azure SQL DB as described here, you can now do log metric analysis using KQL magic in Azure Data Studio Notebooks.

The KQL magic package can be installed from Manage Packages in a Python Notebook or by using pip install. In a Python Notebook in Azure Data Studio, load KQL magic using %reload_ext Kqlmagic. Start connecting, querying, and exploring using %kql, or %%kql for multi-line queries.

KQL magic allows you to see tabular results similar to SQL Notebook, where you can also have the benefits of exporting outputs to other formats (csv, Excel, JSON, XML) and using the Charting functionality. You can also take advantage of rendering charts directly with plotly for richer interactivity. 

There are several fairly big changes in here, so check them all out.

R 4.0 Released

David Smith walks us through what’s new in R 4.0:

R 4.0.0 was released in source form on Friday, and binaries for Windows, Mac and Linux are available for download now.

As the version number bump suggests, this is a major update to R that makes some significant changes. Some of these changes — particularly the first one listed below — are likely to affect the results of R’s calculations, so I would not recommend running scripts written for prior versions of R without validating them first. In any case, you’ll need to reinstall any packages you were using for R 4.0.0. (You might find this R script useful for checking what packages you have installed for R 3.x.)

And I just got 3.6 into production yesterday. Them’s the breaks…
