Cohen’s d is a measure of effect size for the difference of two means that takes the variance of the population into account. It’s defined as
d = |μ1 – μ2| / σ_pooled
where σ_pooled is the pooled standard deviation over both cohorts.
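As a quick sanity check on the definition, here is a minimal Python sketch (not from the linked post; the sample data and the `cohens_d` helper are invented for illustration) that computes the pooled standard deviation and then d:

```python
# Illustrative helper for Cohen's d; data and function name are hypothetical.
from math import sqrt
from statistics import mean, variance  # variance() is the sample variance

def cohens_d(x, y):
    nx, ny = len(x), len(y)
    # Pooled standard deviation over both cohorts
    s_pooled = sqrt(((nx - 1) * variance(x) + (ny - 1) * variance(y))
                    / (nx + ny - 2))
    return abs(mean(x) - mean(y)) / s_pooled

a = [1, 2, 3, 4, 5]
b = [3, 4, 5, 6, 7]
print(round(cohens_d(a, b), 4))  # 1.2649
```

Both samples here have a sample variance of 2.5, so σ_pooled = √2.5 ≈ 1.581 and d = |3 − 5| / 1.581 ≈ 1.26.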
Read the whole thing.
It is often said that for- and while-loops should be avoided in R. I was curious about just how the different alternatives compare in terms of speed.
The first loop is perhaps the worst I can think of: the return vector is initialized without type or length, so memory is constantly reallocated as the vector grows.
The performance of map isn’t great, though the benefits to me are less about performance and more about readability. H/T R-bloggers
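The anti-pattern in that worst-case loop isn’t unique to R. A rough Python analog (invented for illustration, not from the benchmark post): rebuilding the result with concatenation reallocates on every pass, while a preallocated vector of fixed type and length does not.

```python
# Python analog of the R anti-pattern: growing a result by
# concatenation vs. preallocating it. Hypothetical example code.
import timeit

def grow_by_concat(n):
    v = []
    for i in range(n):
        v = v + [i * i]   # builds a brand-new list each pass: O(n^2) overall
    return v

def preallocated(n):
    v = [0] * n           # type and length fixed up front,
    for i in range(n):    # like vector("numeric", n) in R
        v[i] = i * i
    return v

n = 2000
assert grow_by_concat(n) == preallocated(n)
print("concat:", timeit.timeit(lambda: grow_by_concat(n), number=5))
print("prealloc:", timeit.timeit(lambda: preallocated(n), number=5))
```

On any reasonable machine the preallocated version wins by a wide margin as n grows, which is the same lesson the R benchmarks teach.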
Let’s say you’re trying to improve the fuel efficiency of your car, and you have a budget of $100 to make those improvements. Furthermore, let’s say that you can only spend your budget on three different kinds of improvements: better tires, better spark plugs, and a better exhaust system.
The simple answer, of course, is to buy the Best of each part – the best Tires, the best Sparkplugs, and the best Exhaust System.
But what if Reality rears its head and rudely insists that the Best Costs More? What if there are a range of options available in each bucket, with varying price AND effectiveness? Then what?
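With one pick per category and a shared budget, this is a small multiple-choice knapsack problem. A brute-force Python sketch (all part names, prices, and MPG gains below are invented; the linked post has the real treatment) simply tries every one-per-category combination and keeps the affordable one with the largest total gain:

```python
# Hypothetical parts catalog: (name, cost in $, MPG gain). Numbers invented.
from itertools import product

tires   = [("cheap", 20, 1.0), ("mid", 40, 2.5), ("best", 60, 3.0)]
plugs   = [("cheap", 10, 0.5), ("mid", 20, 1.2), ("best", 35, 1.5)]
exhaust = [("cheap", 25, 1.0), ("mid", 45, 2.0), ("best", 70, 3.5)]
budget = 100

best_gain, best_combo = 0.0, None
# Try every one-per-category combination (3 * 3 * 3 = 27 here)
# and keep the affordable one with the largest total gain.
for combo in product(tires, plugs, exhaust):
    cost = sum(c for _, c, _ in combo)
    gain = sum(g for _, _, g in combo)
    if cost <= budget and gain > best_gain:
        best_gain, best_combo = gain, combo

print(best_gain, [name for name, _, _ in best_combo])
```

Note that buying the best of everything costs $165 here, well over budget, and the winning mix is not simply the three mid-range parts either; with more categories or options you would switch from brute force to dynamic programming.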
Read on for the solution which has not nearly enough baseball.
The *absolutely* unique feature that BDCs offer, one that *no* other company or product offers, is Data Virtualization.
The new and enhanced SQL Server 2019 PolyBase feature comes with connectors to many different data sources: Oracle, Teradata, Apache Spark, MongoDB, Azure Cosmos DB, and even ODBC connectivity to IBM’s DB2, SAP HANA, and Excel (see image below).
So far, Microsoft does not have a simple way to create a Big Data Cluster. The process is a bit cumbersome and the learning curve is a bit steep. However, Microsoft is currently working on making it easier to deploy a Big Data Cluster via Notebook in Azure Data Studio, and eventually some type of “deployment wizard.” But for now, the only option is to do it the long way.
The series will continue, but check out the setup work.
Regardless of the method you decide to use, at the end of the replays you will have two distinct sets of tables containing the workload analysis data, sitting in different schemas in the same database or in completely different databases.
WorkloadViewer will let you visualize performance over time, as we have seen for a single workload analysis, but this time it will be able to show you data from both workloads, so that you can compare them.
This sort of production load testing is both important and difficult; WorkloadTools makes it easier.
This is a tale of troubleshooting…
When you unexpectedly or intermittently encounter a change in query performance, it can be extremely frustrating and troublesome for a DBA or developer. If you’re not using Query Store, a third-party application, or your own method to capture query data, then you probably don’t have information about what query performance looked like when things were good…you just know how it’s running now. I was working with a customer of Tim’s last week who had been chasing this exact problem for, in their words, years. They had recently upgraded to SQL Server 2016, and the problem was still occurring.
Strangely, “blame the network” didn’t appear in Erin’s post, so I don’t know if it’s comprehensive.
Let’s list the basic known details of the possible solution (for the Enterprise Edition of SQL Server, that is):
– We can make the secondary replica readable and read the same data on it. (Not that you should do that by default, but if you really know what you are doing …)
– We can copy our object into the TempDB (yeah, your Multi-TB table is probably not the best candidate for this operation), or maybe into some other writable DB.
– We can write the results to a shared folder between the replicas (say, into a text file on a file share).
– We can export the BLOB object of the statistics out of SQL Server.
– We can import the BLOB object of the statistics into the statistics.
Read the whole thing.
And that’s when it happened: I started getting a value back from QUOTENAME(ORIGINAL_DB_NAME()), and not just the square brackets with nothing between them. The name of a database was now between the square brackets. Why wasn’t that showing up before?
Solomon knows the how but not the why in this case.