My personal website is a static site: 100% HTML, JS, and CSS files with no server-side processing. I have custom code that pulls data from a variety of sources and builds updated versions of the files from templates, which are then deployed to the host. I do this to move the CPU cost of building the pages to my time, instead of charging it to visitors on each page hit. While I have a host, a strategy like this means I could also choose to host for free via GitHub or similar services.
So there’s a great benefit to the reader and our wallet, but no server-side execution makes things like contact forms trickier. Luckily, Azure Functions or AWS Lambda can be used as a webhook to receive the form post and process it, costing next to nothing to use (AWS and Azure both offer a free tier for 1M requests/month and 400,000 GB-seconds of compute time).
Eli has a working example in the post, which I recommend checking out.
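To give a rough idea of what such a webhook involves, here is a minimal sketch in Python rather than the code from the linked post. It assumes an AWS Lambda function behind an API Gateway proxy integration (so the form post arrives in event["body"]) and a URL-encoded form; the function name and the field names (name, email, message) are hypothetical.

```python
import base64
import json
from urllib.parse import parse_qs

# Hypothetical form fields; a real contact form may use different names.
REQUIRED_FIELDS = ("name", "email", "message")


def handle_contact_form(event, context=None):
    """Lambda-style handler for a URL-encoded contact form post."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")

    # parse_qs maps each field name to a list of values; keep the first one.
    fields = {name: values[0] for name, values in parse_qs(body).items()}

    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        return {"statusCode": 400, "body": json.dumps({"missing": missing})}

    # A real function would forward the message here (email, queue, etc.).
    return {"statusCode": 200, "body": json.dumps({"received": fields["name"]})}
```

The static page only needs a form whose action points at the function's URL; everything else stays plain HTML.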
The timings in this post came from combining 8 csv files with 13 columns and a combined total of 9.2 million rows.
I first tried combining the files with the PowerShell technique described here. It was painfully slow and took an hour and a half! This is likely because it is deserializing and then serializing every bit of data in the files, which adds a lot of unnecessary overhead.
Next I tried the C# script below using LINQPad. When reading from and writing to a network share, it took 3 minutes and 56 seconds. Much better! Next I tried it on a local SSD drive and it took just 44 seconds.
Read on for the script itself. The ReadAllLines method works fine as long as the file isn’t larger than your working memory.
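If the files are too large to hold in memory, one alternative is to stream them line by line. The sketch below is in Python, not the author's C# script, and it assumes every input file starts with the same header row and contains no quoted fields with embedded line breaks. The function name combine_csv_files is made up for illustration.

```python
from pathlib import Path


def combine_csv_files(input_paths, output_path):
    """Concatenate CSV files line by line, writing the header row only once.

    Lines are copied as text without parsing, so memory use stays flat.
    Returns the number of data rows written.
    """
    first_header = None
    rows_written = 0
    with open(output_path, "w", encoding="utf-8", newline="") as out:
        for path in input_paths:
            with open(Path(path), "r", encoding="utf-8", newline="") as src:
                header = src.readline()
                if not header:
                    continue  # skip empty files
                if first_header is None:
                    first_header = header.strip()
                    out.write(first_header + "\n")
                elif header.strip() != first_header:
                    raise ValueError(f"header mismatch in {path}")
                for line in src:
                    # Make sure the last line of each file ends with a newline.
                    out.write(line if line.endswith("\n") else line + "\n")
                    rows_written += 1
    return rows_written
```

Because nothing is deserialized, this avoids the kind of overhead described for the PowerShell approach, at the cost of not validating the data.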
Since I’ve started to play with (and rave about) functional programming (FP), a lot of people have asked me how to get started.
Instead of writing the same email multiple times, I decided to create a blog post I can refer them to. Also, it’s a central place to put all my notes about the topic.
Here’s a small collection of all the resources I’ve accumulated on my adventure on learning functional programming.
I think the functional paradigm fits relational database development extremely well, better than the object-oriented paradigm.
It took me a while to make the transition from SQL Profiler to Extended Events. Eventually I got comfortable enough with it to use it 100% of the time. As I read more about the XEvents architecture (as opposed to just “using” XEvents), I gained a deeper appreciation of just how great the feature is. My only gripe is that there isn’t a way to handle the related events from within SQL Server using T-SQL. DDL triggers can’t be created for XEvents. And they can’t be targeted to Service Broker for Event Notifications (not yet, anyway). For now, the one way I know of to handle an XEvent is by accessing the event_stream target via the .NET framework. I’ll demonstrate with C#.
9/10, would have preferred F# but would read again.
The Cross-site request forgery (CSRF) exploit uses cross-site scripting (mentioned above), browser insecurities, and other techniques to cause a user to unwittingly perform an action within their current authenticated context that allows the attacker to access the user’s account. This type of attack usually occurs when a malicious email, blog, or a message causes a user’s Web browser to perform an unwanted action on a trusted site for which the user is currently authenticated.
This is a nice overview of the most common attack vectors for web applications.
The trickiest part of wiring a circuit like this is detecting a button press. Most logic boards don’t know if an input circuit should poll at high or low levels. That’s where pull-ups come in. Above, you can see we set one of the pins for the button to be a pull-up (or an input if we were using another board). That means it will pull the current and look for impedance. The other important thing is our debounce. With circuits, one button press can actually turn into lots because as soon as the switch completes (or interrupts) the circuit, it starts sending signals. A debounce is like a referee saying “only look for a signal for this long” and it will filter out extra “presses” based on current that might linger on a press.
Once we detect our button press, we’re calling the function below. All it does is read the current LED pin values, and looks to see which one is currently lit, and then lights the next one.
Go from understanding general purpose input/output pins to calling SMO via a web service all in one post. If you’ve got an itch for a weekend project, have at it.
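The debounce and "light the next LED" logic can be sketched without any hardware. This is plain Python, not the code from the linked post, and it assumes a pull-up input (the pin reads 1 when idle and 0 while pressed) that is sampled with a millisecond timestamp. DebouncedButton and next_led are hypothetical names.

```python
class DebouncedButton:
    """Reports one press after the raw reading has been stable for debounce_ms."""

    def __init__(self, debounce_ms=50):
        self.debounce_ms = debounce_ms
        self._last_raw = 1         # pull-up: the idle level is high
        self._stable = 1           # last level accepted as real
        self._last_change_ms = 0

    def update(self, raw, now_ms):
        """Feed one raw sample; return True exactly once per debounced press."""
        if raw != self._last_raw:
            # The contact is still bouncing: restart the timer.
            self._last_raw = raw
            self._last_change_ms = now_ms
        elif raw != self._stable and now_ms - self._last_change_ms >= self.debounce_ms:
            self._stable = raw
            return raw == 0        # a stable low level means "pressed"
        return False


def next_led(states):
    """Given LED states such as [1, 0, 0], return the states with the next LED lit."""
    lit = states.index(1) if 1 in states else -1
    following = (lit + 1) % len(states)
    return [1 if i == following else 0 for i in range(len(states))]
```

A noisy press that flickers between 0 and 1 for a few milliseconds produces a single True from update, and each True can then be passed through next_led to advance the lit LED.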
This is of course not a new invention. The earliest instance I could find with a bit of searching was from 2004, with Andrew Morton mentioning it in a code review so casually that it seems to have been a well-established trick. But the vast majority of implementations I looked at do not do this.
So here’s the question: Why do people use the version that’s inferior and more complicated? I must have written a dozen ring buffers over the years, and before being forced to really think about it, I’d always just used the first definition. I can understand why a textbook wouldn’t take advantage of unsigned integer wraparound. But it seems like it should be exactly the kind of cleverness that hackers would relish using and passing on.
Check out the comments for more information, a bit of code golf, and multiple links on tying shoelaces.
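The excerpt does not spell the trick out, so the following is a sketch under an assumption: that it refers to letting the read and write indices run free as unsigned integers and masking them only when indexing into the array. Python integers do not wrap, so the 32-bit wraparound is emulated with a mask. The class and method names are illustrative.

```python
class RingBuffer:
    """Ring buffer with free-running indices (capacity must be a power of two)."""

    _U32 = 0xFFFFFFFF  # emulates 32-bit unsigned wraparound

    def __init__(self, capacity):
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._items = [None] * capacity
        self._capacity = capacity
        self._read = 0
        self._write = 0

    def __len__(self):
        # Unsigned subtraction stays correct even after the indices wrap.
        return (self._write - self._read) & self._U32

    def push(self, item):
        if len(self) == self._capacity:
            raise IndexError("ring buffer is full")
        self._items[self._write & (self._capacity - 1)] = item
        self._write = (self._write + 1) & self._U32

    def shift(self):
        if len(self) == 0:
            raise IndexError("ring buffer is empty")
        item = self._items[self._read & (self._capacity - 1)]
        self._read = (self._read + 1) & self._U32
        return item
```

With this layout every slot can be used, and "empty" (length 0) and "full" (length equal to capacity) are told apart without a separate flag or a wasted element.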
Recently, we launched Amazon Athena as an interactive query service to analyze data on Amazon S3. With Amazon Athena there are no clusters to manage and tune, no infrastructure to set up or manage, and customers pay only for the queries they run. Athena is able to query many file types straight from S3. This flexibility gives you the ability to interact easily with your datasets, whether they are in a raw text format (CSV/JSON) or specialized formats (e.g. Parquet). By being able to flexibly query different types of data sources, researchers can more rapidly progress through the data exploration phase for discovery. Additionally, researchers don’t have to know nuances of managing and running a big data system. This makes Athena an excellent complement to data warehousing on Amazon Redshift and big data analytics on Amazon EMR.
In this post, I discuss how to prepare genomic data for analysis with Amazon Athena as well as demonstrating how Athena is well-adapted to address common genomics query paradigms. I use the Thousand Genomes dataset hosted on Amazon S3, a seminal genomics study, to demonstrate these approaches. All code that is used as part of this post is available in our GitHub repository.
This feels a lot like a data lake PaaS process where they’re spinning up a Hadoop cluster in the background, but one which you won’t need to manage. Cf. Azure Data Lake Analytics.
OzCode’s new LINQ debugging capability is tremendous, no doubt about it. But it is not a panacea; it is still constrained by Visual Studio’s own modeling capability. As a case in point, Figure 17 shows another example from my earlier article. This code comes from an open-source application I wrote called HostSwitcher. In a nutshell, HostSwitcher lets you re-route entries in your hosts file with a single click on the context menu attached to the icon in the system tray. I discussed the LINQ debugging aspects of this code in the same article I mentioned previously, LINQ Secrets Revealed: Chaining and Debugging, but if you want a full understanding of the entire HostSwitcher application, see my other article that discusses it at length, Creating Tray Applications in .NET: A Practical Guide.
This is quite interesting. My big problem with LINQ in the past was that Visual Studio’s debugger treated a LINQ statement as a black box, so if you got anything wrong inside a long chain of commands, good luck figuring it out. This lowers that barrier a bit, and once you get really comfortable with LINQ, it’s time to give F# a try.