R User Salaries By Country

Capri Granville shares a chart showing a box plot of salaries for professional R users by country:

Interesting analysis done in R, about salaries of R developers broken down by country, featuring salary range and median salary. 

The dataset consists of survey answers from nearly 90,000 respondents. About 5,000 of them reported using R for “extensive development work over the past year”. The first filter used reduces the dataset from 88,883 respondents to 5,048. The second filter excludes students, hobby programmers and former developers. This reduces the dataset to 4,047 respondents. The third filter excludes unemployed and retired respondents and the dataset is further reduced to 3,871 respondents. Finally, we exclude respondents from an unknown country and respondents with unknown or zero salary.

Check out Tomaz Weiss’s detailed post which dives into these numbers for United States respondents.

Logistic Regression Defaults and sklearn

Giovanni Lanzani shares some thoughts on scikit-learn defaults for Logistic Regression:

If you read the post, you can see that the biggest problem with the choice is that, unless your data is regularized, you will train a model that probably under performs: you are unnecessarily penalizing it by making it learn less than what it could from the data.

The second problem with the default behavior of LogisticRegression is about choosing a regularization constant that is — in effect — a magic number (equal to 1.0). This hides the fact that the regularization constant should be tuned by hyperparameter search, and not set in advance without knowing how the data and problem looks like.

Knowledge is power. Also read the post Giovanni links to in order to learn more about the issue.

Azure Data Studio September Release

Alan Yu announces the September release of Azure Data Studio:

As we continue to bring over key features from SQL Server Management Studio, one highly requested feature was enabling SQL Server command line (SQLCMD) mode in our Query Editor. SQLCMD mode allows users to write and edit queries as SQLCMD scripts. In addition, users can also execute the SQLCMD scripts.

This feature is now possible in Azure Data Studio.

Looks like there were several good improvements this month.

The Value of Query Store

Erin Stellato has started a series on the benefits of Query Store:

The Query Store feature previewed in Azure SQL Database in summer 2015, was made generally available that fall, and was part of the SQL Server 2016 release the following summer.  Over the past four years (has it really been that long?!) I have devoted significant time to learning Query Store – not just understanding how it works from the bottom up, but also why it works the way it does.  I’ve also spent a lot of time sharing that information and helping customers understand and implement the feature, and then working with them to use the query store data to troubleshoot issues and stabilize performance.  During this time I have developed a deep appreciation for the Query Store feature, as its capabilities go far beyond its original marketing.  Is it perfect?  No.  But it’s a feature that Microsoft continues to invest in, and in this series of blog posts my aim is to help you understand why Query Store is a tool you need to leverage in your environment.

Read on for a high-level overview of how Query Store is useful.

Execution Plan Properties

Erik Darling reminds you to check out the properties of execution plan elements:

I read a lot of posts about query plans, and I rarely see people bring up the properties tab.

And I get it. The F4 button is right next to the F5 button. If you hit the wrong one, you might ruin everything.

But hear me out, dear reader. I care about you. I want your query plan reading experience to be better.

Erik provides sound advice here.

Troubleshooting AWS Database Migration Service Errors

Samir Behara takes us through troubleshooting AWS Database Migration Service issues:

For troubleshooting any issues with AWS DMS, it is necessary to have logs enabled. The DMS logs would typically give a better picture and helps find errors or warnings that would indicate the root cause of the failure. If the logs are not available there is nothing much you can do from a detailed troubleshooting analysis perspective. So basically next step is to turn on DMS logs and kick the job again and validate if the errors are captured in the logs.

If logs are not enabled, you need to set up a new task with logging enabled so if and when it errors out, you can take a look and troubleshoot the same.

I’ll save my full rant for another day, but I’m not that impressed with DMS. It could be a failing on my part, though.

Uploading to Blob Storage Archive Tier

Bub Pusateri has a helpful script for us:

Last year I wrote about how to upload data to Azure Blob Storage Archive Tier, and included a PowerShell script to do so. It’s something I use regularly, as I have hundreds of gigabytes of photos and videos safely (and cheaply!) stored in Azure Blob Storage using Archive Tier.

But read on for a recent announcement to make the process easier.

Shared Access Signatures

Arun Sirpal explains what an Azure Shared Access Signature is:

Using a Shared Access Signature (SAS) is usually the best way to control access rights to Azure storage resources (like a container for backups) without exposing the primary / secondary storage keys. It is based on a URI and this is what I want to look at today.

Read on to learn about the components which make up a Shared Access Signature.

Categories

September 2019
MTWTFSS
« Aug  
 1
2345678
9101112131415
16171819202122
23242526272829
30