R is an interpreted programming language with vectorized data structures. This means a single R command can ask for very many arithmetic operations to be performed. This also means R computation can be fast. We will show an example of this using Conway’s Game of Life.
Conway’s Game of Life is one of the most interesting examples of cellular automata. It is traditionally simulated on a rectangular grid (like a chessboard) and each cell is considered either live or dead. The rules of evolution are simple: the next life grid is computed as follows:
To compute the state of a cell on the next grid sum the number of live cells in the eight neighboring cells on the current grid.
If this sum is 3 or if the current cell is live and the sum is 2 or 3, then the cell in the next grid will be live.
Not only is the R code faster, but it’s also terser.
Basically, the idea is to keep the fast stuff fast and the slow stuff slow. I wrote a paper 14 years ago on the challenges of real-time data warehousing. Fortunately, both the data streaming, database, and BI layers have all evolved significantly since then, and now there exist databases and other data storage engines which can support the feature trinity that is needed to do both real-time and historical analytics right, without a Lambda architecture:
- Accept real-time streams of data at high rates.
- Simultaneously respond to large volumes of queries, including on the most recently added data.
- Store all the history needed for analysis.
We call these engines “fast data sinks” and there are four main groups of them today:
It’s an interesting argument.
The first thing that you need to do is write a Powershell script that when run can send an email. While researching this I discovered many ways to accomplish this task, so what I’m about to show you is just one way, but feel free to experiment and use what is right for your environment.
In my lab I do not run my own SMTP server, so I had to write a script that could leverage my Gmail account. You will see in my Powershell script the password to the email account that authenticates to the SMTP server is in plain text. If you are concerned that someone may have access to your script and discover your password then you will want to encrypt your credentials. Gmail requires and SSL connection so your password should be safe on the wire, just like any other email client.
Here is an example of a Powershell script that when used in conjunction with Task Scheduler which will send an email alert automatically when any specified Event is logged in the Windows Event Log.
Read on for the script as well as some bonus troubleshooting.
By large databases I’m roughly meaning databases with several hundred tables, and I usually see a lot of these tables with several hundred GB’s of data in them.
When I generally ask about the reason for no foreign key, I’m told
- they add overhead
- they give no benefit
- we can’t enter our data properly when we have them
The last one in the above list is generally down to poor modelling – an example being entering a later part of a financial transaction before the first part of the financial transaction has taken place. Once both parts of the financial transaction have taken place then the database is left in a consistent state – BUT, that generally being handled by the app NOT the database – OUCH!
There are times where key constraints are too much—often-updating fact tables might be one such scenario. And some of “too much” comes down to hardware quality. But for the most part, those key constraints are one of the clearest forms of database documentation available, not to mention their normal benefits.
There are certain design patterns in T-SQL that give me pause. They may not be “code smells” per se, but when I encounter them I find myself thinking “there’s got to be a more sensible way to accomplish this”. WAITFOR DELAY is one example. I’ve used it a few times here and there, mostly in one-off scripts. If I wrote some code that used it regularly in production, I’d be paranoid about putting my SPID to sleep forever. Maybe a little bit of paranoia is a good thing. But I digress.
A recent task found its way to me, and I’ve decided to use WAITFOR DELAY as part of my solution. (It hasn’t been tested or implemented yet–perhaps more on this in another post.) My usage this time has been more complex than in the past. What I already knew is that you can use a string literal for the time_to_pass argument. For example, this will delay for 3½ seconds:
WAITFOR DELAY '00:00:03.500'
Click through for a bunch of testing.
The other day, Erin Stellato asked a question on twitter about the value of nested SPs. Here’s how I weighed in:
I’m not a fan of nested anything. Too much hidden complexity. Code reusability leads to queries that are jack of all trades, master of none.
“Don’t repeat yourself” doesn’t work as well in SQL as it does in other code.
— Michael J Swart (@MJSwart) October 19, 2018
Hidden complexity has given me many problems in the past. SQL Server really really likes things simple and so it’s nice to be able to uncover that complexity. Andy Yun has tackled this problem for nested views with his
Click through for a script which helps.