There’s no video yet available of Joel’s talk, but you can guess the theme from that opening slide, and walking through the slides conveys the message well, I think. Yihui Xie, author and creator of the rmarkdown package, provides a detailed summary and response to Joel’s talk, where he lists Joel’s main critiques of Notebooks:
Hidden state and out-of-order execution
Notebooks are difficult for beginners
Notebooks encourage bad habits
Notebooks discourage modularity and testing
Jupyter’s autocomplete, linting, and way of looking up the help are awkward
Notebooks encourage bad processes
Notebooks hinder reproducible + extensible science
Notebooks make it hard to copy and paste into Slack/GitHub issues
Errors will always halt execution
Notebooks make it easy to teach poorly
Notebooks make it hard to teach well
Read the whole thing. I agree with some of these points, but disagree with a few on the list.
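To make the first critique on that list concrete, here is a minimal Python sketch of hidden state and out-of-order execution. It is not taken from Joel's slides or Yihui's response: the `cells` dictionary and the `run` helper are hypothetical stand-ins for notebook cells and a kernel that shares one namespace across them.

```python
# Hypothetical stand-in for a notebook: each "cell" is a snippet of code,
# and all cells share one namespace, as they would in a single kernel.
cells = {
    "cell_1": "x = 10",
    "cell_2": "y = x * 2",
    "cell_3": "x = 100",
}


def run(order):
    """Execute the named cells in the given order and return the value of y."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["y"]


# Running top to bottom: y is computed while x is still 10.
top_to_bottom = run(["cell_1", "cell_2", "cell_3"])

# Running cell_3 before cell_2: the same code now sees x == 100.
out_of_order = run(["cell_1", "cell_3", "cell_2"])

print(top_to_bottom, out_of_order)
```

The cells on the page are identical in both runs; only the execution order differs, and so does the answer.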
The notation we used above is the “explicit argument” variation we recommend for readability. What a lot of dplyr users do not seem to know is: base-R already has this functionality. The function is called
To demonstrate this, let’s first detach dplyr to show that we are not using functions from
detach("package:dplyr", unload = TRUE)
Now let’s write the equivalent pipeline using exclusively base-R.
Click through for the way to do this as a pipeline operation.
A long time ago in a galaxy far, far away, I had to troubleshoot an interesting performance issue in SQL Server. Suddenly, the CPU load on the server started to climb. Nothing had changed in terms of workload. The system still processed the same number of requests. The execution plans of the critical queries stayed the same. Nevertheless, the CPU usage grew slowly and steadily by a few percent per hour.
Eventually, we nailed it down. The problem occurred in a very busy OLTP system with very volatile data. We noticed that the system performed much more I/O (logical and physical) than before. It was very strange, because nothing should have changed that day. Finally, we found that we had a large number of deleted rows in the database that had not been cleaned up by the ghost cleanup task.
Read on to learn what caused this mess.
Fellow Canadian Doran Douglas brought this issue to my attention recently, and I wanted to share it with you as well.
Let’s say you have a file in UTF-8 format. What this means is that some of the characters will be single-byte, and some may be more than that.
Where this becomes problematic is that a fixed-width file has fields that are, well, fixed in size. If a Unicode character requires more than one byte, it’s going to cry havoc and let slip the dogs of truncation.
Click through for an example. This seems like a bug to me: I interpret fixed-width as a fixed number of characters, not a fixed number of bytes. At the very least, it’s liable to cause confusion.
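As a rough illustration of the characters-versus-bytes distinction, here is a small Python sketch. It is a generic demonstration, not a reproduction of the behavior in the linked post: the helper names `fit_chars` and `fit_bytes`, the sample value, and the field width of 10 are all assumptions made for the example.

```python
def fit_chars(value, width):
    """Fit a value into a field that is `width` characters wide."""
    return value[:width].ljust(width)


def fit_bytes(value, width):
    """Fit a value into a field that is `width` bytes wide (UTF-8 encoded)."""
    raw = value.encode("utf-8")[:width]
    # errors="ignore" drops a trailing partial character if the cut
    # happens to land in the middle of a multi-byte sequence.
    return raw.decode("utf-8", errors="ignore")


name = "Zoë Müller"                   # 10 characters
byte_length = len(name.encode("utf-8"))  # ë and ü take two bytes each

by_chars = fit_chars(name, 10)
by_bytes = fit_bytes(name, 10)

print(len(name), byte_length)
print(repr(by_chars))
print(repr(by_bytes))
```

Counted in characters, the name fits the field exactly. Counted in bytes, the two accented letters push the tail of the name past the boundary and it gets cut off.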
To do this, a trigger was created which would send all the details via a Service Broker message to another SQL Server. This second SQL Server was used to hold details of the AD accounts, and from there changes were automatically propagated out to AD.
This was working well until one day when it was realised that any changes to account permissions in AD weren’t reflected in the personnel database. To solve this, another trigger was created to send a Service Broker message back to the personnel database with details of the change.
This was where I came in. It was noticed that the system had started to run slower and slower; not only that, but permissions seemed to be constantly changing for no obvious reason. Were the machines finally waking up and taking over?
There’s a reasonable explanation here, for some definition of reasonable.
Simon Su has an interesting tool available:
I have now developed a tool to analyze AG log block movement latency between replicas and create a report accordingly.
Click through for more info and check it out on GitHub.