
Day: March 3, 2021

Including and Resizing External Images in knitr

The folks at Jumping Rivers continue a series on knitr and rmarkdown:

In this third post, we’ll look at including external images, such as figures and logos, in HTML documents. This is relevant for all R markdown files, including fancy things like {bookdown}, {distill} and {pkgdown}. The main difference with the images discussed in this post is that the image isn’t generated by R. Instead, we’re thinking of something like a photograph. When including an image in your web-page, the two key points are

– What size is your image?
– What’s the size of your HTML/CSS container on your web-page?

Read the whole thing.
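
The two questions above boil down to a small piece of arithmetic: compare the image width to the container width and scale accordingly. The following is a minimal sketch of that calculation, not code from the linked post; the function name `fit_to_container` and the choice never to upscale are assumptions made for illustration.

```python
def fit_to_container(img_w, img_h, container_w):
    """Scale an image to fit a container's width, keeping its aspect ratio.

    Hypothetical helper: images narrower than the container are left
    at their original size rather than being stretched.
    """
    if img_w <= 0 or img_h <= 0 or container_w <= 0:
        raise ValueError("all dimensions must be positive")
    # Shrink only when the image is wider than the container.
    scale = min(1.0, container_w / img_w)
    return round(img_w * scale), round(img_h * scale)


# A 4000x3000 photograph placed in an 800-pixel-wide container.
print(fit_to_container(4000, 3000, 800))
```

A photograph far larger than its container still has to be downloaded at full size, which is one reason the image size matters independently of the container size.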

Comments closed

Semantic Search in Azure Cognitive Search

Rangan Majumder, et al., have an article on how semantic search works in Azure Cognitive Search:

As part of our AI at Scale effort, we lean heavily on recent developments in large Transformer-based language models to improve the relevance quality of Microsoft Bing. These improvements allow a search engine to go beyond keyword matching to searching using the semantic meaning behind words and content. We call this transformational ability semantic search—a major showcase of what AI at Scale can deliver for customers.

Semantic search has significantly advanced the quality of Bing search results, and it has been a companywide effort: top applied scientists and engineers from Bing leverage the latest technology from Microsoft Research and Microsoft Azure. Maximizing the power of AI at Scale requires a lot of sophistication. One needs to pretrain large Transformer-based models, perform multi-task fine-tuning across various tasks, and distill big models to a servable form with very minimal loss of quality. We recognize that it takes a large group of specialized talent to integrate and deploy AI at Scale products for customers, and many companies can’t afford these types of teams. To empower every person and every organization on the planet, we need to significantly lower the bar for everyone to use AI at Scale technology.

Click through to learn more about the technology.


The Logging Costs of DROP TABLE and TRUNCATE

Paul Randal explains that DROP TABLE and TRUNCATE TABLE are logged operations:

Hopefully you all know that it’s a myth that DROP TABLE and TRUNCATE TABLE are non-logged operations. If you didn’t know that, read my blog post that explains the deferred drop mechanism. Both operations are fully logged, and will generate quite a bit of transaction log.

The bulk of the log that’s generated comes from having to log the deallocation of extents and the pages within them. For each extent, a bit must be cleared in the corresponding GAM page and IAM page, and all 8 pages in the extent must be marked as deallocated in the corresponding PFS page (turning off the 0x40 bit in each PFS byte). So that’s three log records per allocated extent.

To get a feeling for how much that is, Paul provides an example of a 20TB table being dropped.
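
For a rough sense of scale, the three-records-per-extent figure from the quote can be turned into a back-of-the-envelope count. This is a sketch under stated assumptions, not Paul's own calculation: it takes 20TB to mean 20 × 2^40 bytes, assumes every byte sits in a fully allocated 64KB extent (8 pages of 8KB), and counts only the three deallocation records per extent mentioned above.

```python
PAGE_BYTES = 8 * 1024                        # SQL Server page size
PAGES_PER_EXTENT = 8                         # pages per extent, as in the quote
EXTENT_BYTES = PAGE_BYTES * PAGES_PER_EXTENT
LOG_RECORDS_PER_EXTENT = 3                   # GAM, IAM and PFS, per the quote


def dealloc_log_records(table_bytes):
    """Return (extents, log_records) for deallocating a table of this size."""
    # Ceiling division: a partly used extent still has to be deallocated.
    extents = -(-table_bytes // EXTENT_BYTES)
    return extents, extents * LOG_RECORDS_PER_EXTENT


table_bytes = 20 * 1024**4                   # assumption: 20TB in binary units
extents, records = dealloc_log_records(table_bytes)
print(f"{extents:,} extents -> {records:,} log records")
# prints "335,544,320 extents -> 1,006,632,960 log records"
```

That is about a billion log records before counting anything else the operation logs.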


Using Terraform to Tag Created Date

John Martin has an interesting use case for tagging in Terraform:

One of the key properties missing from Azure resources, in my opinion anyway, is a CreatedDate. This can be largely overcome with Azure Policy, but what if you don’t have access to create one that applies a timestamp tag at resource creation?

It is possible to use Terraform to tag the resource and set the value for when the resource is created. There is a little more work that needs to go into it to ensure that once it is set that Terraform does not overwrite it on subsequent deployments. But, it is achievable and brings this into your control if needed.

Click through to see how.
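
The behavior described in the quote (set the tag once, then leave it alone on later deployments) can be modeled in a few lines. This is a sketch of the logic only, not Terraform code and not necessarily John's approach; the function `merge_tags` and the tag name `CreatedDate` as a dictionary key are illustrative assumptions. In Terraform itself, this kind of "don't overwrite" behavior is commonly expressed with the `ignore_changes` lifecycle argument.

```python
from datetime import datetime, timezone


def merge_tags(existing, desired, now=None):
    """Return the tags to apply, preserving an existing CreatedDate.

    Hypothetical helper: `existing` holds the tags already on the
    resource, `desired` holds the tags from the current deployment.
    """
    now = now or datetime.now(timezone.utc)
    merged = dict(desired)
    if "CreatedDate" in existing:
        # Later deployments keep the original timestamp untouched.
        merged["CreatedDate"] = existing["CreatedDate"]
    else:
        # First deployment stamps the creation time.
        merged["CreatedDate"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return merged


first = merge_tags({}, {"Owner": "data-team"})
second = merge_tags(first, {"Owner": "platform-team"})
print(second["CreatedDate"] == first["CreatedDate"])  # prints "True"
```

Other tags still follow the latest deployment; only the creation timestamp is pinned.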


Azure Synapse Pathway

John Macintyre announces a new product:

Azure Synapse Pathway connects to the source system and inspects details about your database objects. An assessment report captures further details on the database objects that can be translated into Azure Synapse Analytics. With Azure Synapse Pathway the source database objects are automatically converted and optimized to T-SQL code on Azure Synapse Analytics. This means your existing code, whether a thousand or million lines of code, will be converted by Azure Synapse Pathway.

As a result of these capabilities, the traditional process of manual code conversion can now be automated in a fraction of the time; all while cutting out manual errors and reducing the total cost of the migration.

They’re starting with a few data sources (including Snowflake), but it’s an interesting product. I could see it being useful for getting 80-85% of the migration done, though I don’t trust auto-generated code to be optimal.


Batch Mode on Row Store in SQL Server 2019

Deepthi Goguri looks at a nice performance improvement in SQL Server 2019:

In the previous post, we learned about table variable deferred compilation. In this blog, let’s focus on the batch mode on rowstore feature introduced in SQL Server 2019. This feature improves the performance of analytical queries by using batch mode query processing. It is a CPU optimization that helps analytical queries run faster. We do not have to specify this option if the database compatibility level is 150.

This feature is aimed especially at CPU-bound analytic workloads, without needing columnstore indexes. We can also use query hints to explicitly allow or disallow batch mode.

There are specific rules which must be met before it kicks in, but the performance benefit can be significant. If you’re running SQL Server 2017, you need a columnstore index on a table to get batch mode, though there is a trick around this: you can create a filtered, nonclustered columnstore index whose predicate can never be true (conceptually WHERE 1=0, though the filter has to reference a column, so something like WHERE Id = -1 AND Id = -2 on a hypothetical Id column) so that it doesn’t have any rows. Then, any queries which hit that table are potentially eligible for batch mode processing, even though none of them use the columnstore index.


Creating a Database Publish Profile in Visual Studio

Elizabeth Noble shows us how to create a database publish profile using Visual Studio:

One of our fears was always how to prevent losing data and critical database code. Publish profiles came to our rescue. We also found that some of our database code had specific values depending on the environment or contained references to other databases. Once again, publish profiles could solve these problems!

While I’d love to say that you could use ADS to manage your database projects, that just isn’t true right now. However, we have a way to help you get a publish profile created. If you don’t want to use Visual Studio yourself, you might want to ask your Developer friends real nice and see if they’d be willing to help you out.

Click through for a video and a sample of what a publish profile looks like.
