“Some R functions have an awful lot of arguments”, you think to yourself. “I wonder which has the most?”
It’s not an original thought: the same question as applied to the R base package is an exercise in the Functions chapter of the excellent Advanced R. Much of the information in this post came from there.
There are lots of R packages. We’ll limit ourselves to those packages which ship with R, and which load on startup. Which ones are they?
It’s a fun exercise and helpful for learning a bit more about how to work with arguments when metaprogramming in R.
Download specific macroeconomic data from FRED St. Louis economic databases and ETL the data. Many other data series can be found at the FRED’s website.# get unemployment data time series from FRED St. Louis dfunrate <- get_fred_series("UNRATE", "unrate", observation_start = startdate, observation_end = enddate) # get University of Michigan consumer sentiment index data time series from FRED St. Louis dfumcsent <- get_fred_series("UMCSENT", "umcsent", observation_start = startdate, observation_end = enddate) # combine the two time series data into one data frame dfall <- cbind(dfunrate,dfumcsent) # strip or remove redundant month field from data downloaded from FRED St. Louis dfall <- dfall[,c(1,2,4)] # obtain the number of data points in the dataframe mdx <- (1:nrow(dfall)) # convert FRED date field from string to R's date type dfall$date <- as.Date(dfall$date)
There’s a nice chart builder on the FRED website too, but it’s good to be able to grab the data on your own.
code that actually runs
code that I don’t have to run
code that can be easily run
Very useful if you want to get help on a problem.
I had no sooner published my blog post about the coolness of Biml 2018 when I encountered a bug trying to use one of the features I really like – converting SSIS packages to Biml using (FREE!) BimlExpress 2018.
My first response was, “Durnit! This worked in the test versions.” My second response was to drop a note into an issue-tracking site Varigence set up to record these kinds of things.
And then I started getting emails similar to, “Hey Andy, I get this error when I try to use the new ‘Convert SSIS Packages to Biml’ feature”…
David Stein has a workaround for us until Varigence can fix the bug.
Whatever you elect to do there will always be a master branch, where you go from here depends on whether you favor branching or feature toggles. Wikipedia provides a nice definition of what a feature toggle is, thus:
A feature toggle (also feature switch, feature flag, feature flipper, conditional feature, etc.) is a technique in software development that attempts to provide an alternative to maintaining multiple source-code branches (known as feature branches), such that a feature can be tested even before it is completed and ready for release. Feature toggle is used to hide, enable or disable the feature during run time. For example, during the development process, a developer can enable the feature for testing and disable it for other users.
A branch is initially a clone of the master branch to begin with, developers work on the branch. Once the work on that branch is code complete and it has been tested to satisfaction, it is merged into the master. An overview of the branching and merging process is provided in the Git documentation here.
The continuous integration and delivery purist are not great fans of branches and prefer the ethos of integrating changes into one place to be rigidly adhered to, ergo one code branch only. However, in practice you will find that most projects have to come up with some sane branching strategy. The subject of branching is a topic in its own right, suffice it to say there is an overhead in applying changes across multiple branches and overheads involved in merging into the master branch. Therefore, there needs to be some governance and rigor applied around the number of branches in the source code repository.
Chris then shows us how to create a multi-branch pipeline with Jenkins.
We have to be sure we know what accesses our data. There isn’t a technical solution that can automatically give us the answer. We can’t run a PowerShell script and know immediately everything that hits our key financial database. Over time we can collect that information, but the key word is “time.” If I look today, and today is not quarter end, then I don’t see the quarter end processes. If we’re looking at our HR related databases, then we really don’t know everything unless we also take into account the annual enrollment period.
The only way to be able to follow the principle of least privilege correctly is to know who and what access our data. This also includes ad hoc access, like folks running reports through SQL Server Reporting Services (SSRS) or doing analysis through Microsoft Excel. Therefore, in order to improve our data protection, we have to understand what accesses that data.
Obviously, documentation is required. When we have documentation there’s always the problem with keeping that documentation updated. While there are tools available, this task ultimately falls to people. Realistically, this is a battle we will always have to fight. Taking time to update documentation means we take time from other efforts. However, if we want to be serious about data protection, we have to know what accesses that data in order to be able to protect it.
It’s interesting to contrast this with Alex Yates’s essay on the topic.
I got my haircut today (pretty spiffy one too, even if I do say so myself). While I was chatting I asked my hair dresser “on average, how often should I get my hair cut”? She told me (for men) around 4-6 weeks. Then I got thinking (as I do), I wonder if I could data-mine my credit card data using Power BI and find out how often I actually get my own hair cut. It turns out I was able to do this, and this article explains the hardest part of that task – find the number of days between two transaction dates using DAX.
I’d probably end up doing this in SQL with the LAG function, but it’s good to know several ways to solve date difference problems.
As you can see, the upgrade feature rules check failed around the KB2919355 installation. At this point, reading the error message, I assumed (I know, I know, it’s something we should never do as a DBA!) that the patch had been downloaded and applied during the latest round of Windows patching, and all that was required was a server reboot. I was wrong.
Upon running the upgrade again, I got the same error message. Hmm, annoying. So, after some Googling I was confident I knew what to do to resolve this; download and install the KB2919355 patch. So, I downloaded the patch from the official Microsoft website (KB2919355) and kicked off the installation.
There’s a bit more to it than “install the patch.”
When SQL Server replication is used on environments with high traffic OLTP systems, users often need to adjust the agent profile parameters to increase the throughput of the log reader and distribution agents to keep up with the workload. We recently performed a series of tests to measure the performance of log reader and distribution agents while changing some of the parameters for these agents. This blog summarizes the outcomes and conclusions from this testing.
Read on for the relevant parameters.