We are going to represent the content of a Facebook post using word embeddings and compare the transformed posts using word mover’s distance. The combination of the two has shown lower k-nearest-neighbor document classification error rates than other state-of-the-art techniques.
The advantage of word embeddings is that the words which have similar meanings but don’t have any letters in common will still have similar vectors (be close) in the embedded space (e.g. lion and tiger).
There’s a good high-level discussion of techniques in this post.
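As a rough illustration of the idea in the excerpt, here is a minimal Python sketch. The three-dimensional vectors are invented for this example (real embeddings are learned and have hundreds of dimensions), and `relaxed_wmd` is a simplified one-sided relaxation, not the full word mover's distance, which solves an optimal transport problem. All names here are hypothetical and do not come from the linked post.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
EMBEDDINGS = {
    "lion": [0.90, 0.80, 0.10],
    "tiger": [0.85, 0.75, 0.15],
    "spreadsheet": [0.05, 0.10, 0.95],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def relaxed_wmd(doc_a, doc_b):
    """Average distance from each word in doc_a to its nearest word in doc_b.

    Assumption: every word in doc_a carries equal weight. This is a
    simplification of word mover's distance, used here only to show the idea.
    """
    total = 0.0
    for word_a in doc_a:
        total += min(
            euclidean(EMBEDDINGS[word_a], EMBEDDINGS[word_b]) for word_b in doc_b
        )
    return total / len(doc_a)


if __name__ == "__main__":
    print(cosine_similarity(EMBEDDINGS["lion"], EMBEDDINGS["tiger"]))
    print(cosine_similarity(EMBEDDINGS["lion"], EMBEDDINGS["spreadsheet"]))
    print(relaxed_wmd(["lion"], ["tiger"]))
    print(relaxed_wmd(["lion"], ["spreadsheet"]))
```

With these made-up vectors, "lion" and "tiger" come out far more similar to each other than either is to "spreadsheet", and the document distance follows the same pattern.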
When browsing for the symbols, you can use this command: x /1 *!TCP*. By using the option /1 you’ll only see the names, and no addresses. On my machine that gives me quite a lot, but there are two entries that catch my eye: sqllang!Tcp::AcceptConnection and sqllang!Tcp::Close. So let us set breakpoints at those two symbols, and see what happens when we execute our code.
The result when executing the code is that we initially break at sqllang!Tcp::AcceptConnection, followed somewhat later by breaking at sqllang!Tcp::Close. Cool, this seems to work – let us set some more breakpoints and try to figure out the flow of events.
The first half recapitulates his previous findings, and then he incorporates new information in the second half.
SQL is a really cool language. I can write really complex business logic with this logic programming language. I was again thrilled about SQL recently, at a customer site:
Given a string, find all substrings of that string that are palindromes. Challenge accepted! (For the moment, let’s forget about algorithmic complexity.)
His answer is in Postgres syntax, and a commenter includes Oracle syntax. T-SQL is left as an exercise for the reader.
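For readers who want to check results from any of those SQL versions, here is a brute-force Python sketch of the same problem. It is not the author's SQL approach. The function name and the `min_length` parameter are assumptions made for this example, since the excerpt does not say whether single characters count as palindromes. In the spirit of the original, it ignores algorithmic complexity and simply tests every substring.

```python
def palindromic_substrings(text, min_length=2):
    """Return the distinct palindromic substrings of text, shortest first.

    min_length is a hypothetical knob: 2 skips single characters,
    1 includes them.
    """
    found = set()
    n = len(text)
    for start in range(n):
        # end is exclusive, so the substring has at least min_length characters
        for end in range(start + min_length, n + 1):
            candidate = text[start:end]
            if candidate == candidate[::-1]:
                found.add(candidate)
    # Sort by length, then alphabetically, for stable output
    return sorted(found, key=lambda s: (len(s), s))


if __name__ == "__main__":
    print(palindromic_substrings("racecar"))
```

The set removes duplicates, so a palindrome that occurs at several positions is reported once.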
The answer is via Azure Automation.
At a high level this is what I did.
Create an Automation Account.
Create a credential.
Create a PowerShell Runbook which has the code for index rebuilds.
Create a schedule and link it to the above.
Configure parameters within the schedule (if any).
Configure logging level (if desired).
Click through for the detailed steps.
This solution is easier than Solution 1:
8 steps instead of 22!
No extra Project
However, a very small amount of risk was added by overriding the default MSBuild workflow for SSDT. This risk can be eliminated if Microsoft provides a pre-defined Target for the appropriate event. Please upvote my suggestion to have this happen: Add MSBuild predefined Targets for “BeforeSqlBuild” and “BeforePublish” to SSDT SQL Server Data Projects.
ALSO: Even though we did not sign the assembly with a Strong Name Key, it is still probably a good idea to do that.
If you use CLR, this is worth the read.
Yesterday, I was running a health assessment for a client. They are running a weekly maintenance plan that is shrinking all of their data files. After I picked myself up off the floor, I searched the web for “Paul Randal shrink” and hit on Paul’s excellent post Why you should not shrink your data files. In the post, Paul demonstrates the effect of DBCC SHRINKDATABASE on index fragmentation. After the demo script, Paul writes, “As well as introducing index fragmentation, data file shrink also generates a lot of I/O, uses a lot of CPU and generates *loads* (emphasis Paul’s) of transaction log.”
This led me to ask the question, “How much is *loads*?”. To find an answer, I made the following modification to Paul’s script:
Read on for the answer. There are legitimate reasons to shrink data files, but it comes at a very high cost.
There are, as of RC2 being released, 194(!) new Events to Extend your mind with. Not all of them are interesting to me, and I haven’t had time to pry into all of the ones that are interesting just yet.
This is a rundown of the new Events with names or descriptions that I found interesting, and will try to spend some time with.
I can’t promise anything.
After all, getting some of these to fire is tougher than using a Debugger.
There are some interesting events here.
The ARM API deploys resources to Azure, but doesn’t deploy code onto those resources. For example, you can use ARM to deploy a virtual machine with SQL Server already installed, but you can’t use ARM to deploy a database from an SSDT DacPac.
To save time when designing solutions, it is important to understand that the ARM API is used only for resources, and that we need some other technology, such as DSC or PowerShell, to manage deployments onto the infrastructure once it is in place.
This is a nice overview of the topic, and because it’s Ed (who is much better about this than most), he goes into how to test before even getting into how to create.