I’m going to show you how to use ksqlDB to do the following:
– Configure the live ingest of a stream of data from an external source (in this case, Twitter)
– Filter the stream for certain columns
– Create a new stream populated only by messages that match a given predicate
– Build aggregate materialised views, and use pull queries to directly fetch the state from these
Let’s dive in! As always, you’ll find the full test rig for trying this out yourself on GitHub.
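As a rough mental model of the four steps above, and explicitly not ksqlDB syntax, the same flow can be mimicked in plain Python. Everything here is a hypothetical stand-in: the sample messages, the helper names (project, where, materialise_counts, pull) and the column names are assumptions made for illustration, not anything taken from the original post.

```python
from collections import Counter

# Hypothetical sample messages standing in for a live ingested stream.
TWEETS = [
    {"id": 1, "user": "alice", "lang": "en", "text": "ksqlDB is fun"},
    {"id": 2, "user": "bob", "lang": "fr", "text": "bonjour"},
    {"id": 3, "user": "alice", "lang": "en", "text": "streams everywhere"},
]


def project(stream, columns):
    """Keep only the named columns of each message."""
    for msg in stream:
        yield {col: msg[col] for col in columns}


def where(stream, predicate):
    """Derive a new stream holding only messages that match the predicate."""
    for msg in stream:
        if predicate(msg):
            yield msg


def materialise_counts(stream, key):
    """Fold a stream into a keyed count, loosely like an aggregate view."""
    counts = Counter()
    for msg in stream:
        counts[msg[key]] += 1
    return counts


def pull(view, key):
    """Look up the current state for one key, loosely like a pull query."""
    return view.get(key, 0)


slim = project(TWEETS, ["user", "lang"])
english = where(slim, lambda m: m["lang"] == "en")
tweets_per_user = materialise_counts(english, "user")
```

Calling pull(tweets_per_user, "alice") fetches the current count for a single key without rescanning the stream, which is the idea behind fetching state directly from a materialised view.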
Day: January 2, 2020
This script came to mind as I was thinking back over the year for a few reasons. One of them was that I spent a non-trivial amount of time writing and debugging it, despite its small size and the apparent simplicity of the problem it tackled. Even in apparently glamorous fields like machine learning, 90% of the work is nuts-and-bolts integration like this. If anything, you should be doing more of it as you become more senior, since it requires a subtle understanding of the whole system and its requirements, but doesn’t look impressive from the outside. Save the easier-to-explain projects for more junior engineers; they need them for their promotion packets.
This kind of work is so hard precisely because of all the funky requirements and edge cases that only become apparent when code is used in production. As a young engineer, my first instinct when looking at a snarl of complex code for something that looked simple on the surface was to imagine the original authors were idiots. I still remember scoffing at the Diablo PC programmers as I was helping port the codebase to the PlayStation because they used inline assembler to do a simple signed-to-unsigned cast. My lead, Gary Liddon, very gently reminded me that they had managed to ship a chart-topping game and I hadn’t, so maybe I had something to learn from their approach?
I am a huge fan of the concept which, made brief, states that if you do not understand why something is the case, don’t change it. If you do understand it, maybe change it, but be prudent about it. It’s also something I often have trouble with, as my natural inclination toward code bases is to use the cleansing power of fire to burn it all down.
You can buy all types of sensors and connect them to a Raspberry Pi. Then you can use Python or .NET Core to write small applications that check your connected sensors and read data from them. If you would like to push this data to Azure to store or analyze it, then you need to make the Raspberry Pi ready by installing a couple of applications.
Installing an application in Windows is not a big deal for me. In this project, I had to install and configure all the applications in Linux. The first thing we need to do is copy some files to register the Microsoft GPG key and software repository feed. To do that, we will use the curl command. Curl is used for transferring data using various protocols, including HTTP/S. We are going to use it to copy some files from the Internet to local storage. It’s a fancy copy tool.
There are a few steps involved, but nothing too onerous. I think I know where Hasan is going with this, too.
I show only some of the results in the output because, as you can imagine, there are limits on what output will and will not be cached. Beyond the basic logical requirements, such as using only deterministic functions (functions whose output does not vary from one execution to the next), not using system objects or UDFs (and it seems that scalar UDF inlining is not a part of Azure SQL DW yet), and having no row-level or column-level security enabled, the main restriction, which seems to be a pretty good decision as far as I am concerned, is that results with a row size larger than 64KB won’t be cached, period.
Read on to see what Niko has learned, including cache performance and limitations. Between this and the data pools in SQL Server Big Data Clusters, Microsoft’s spent some time thinking about data caching in cloud-based versions of SQL Server.
Take quick note of the port number I have circled in red. This doesn’t match the original query at all. In fact, it doesn’t come anywhere close to the actual port number. In addition, the port number shown here is a negative value. Obviously a negative port is not correct as TCP/IP ports only range from 0-65535. So what is happening here?
Read on for the answer.
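Without presuming the linked post's answer, one generic way a valid port can show up as a negative number is when an unsigned 16-bit value is reinterpreted as a signed 16-bit value. The following is a minimal sketch of that effect only; the helper name as_signed_16 is hypothetical, and whether this is what happened in the case above is an assumption, not something the excerpt confirms.

```python
import struct


def as_signed_16(port):
    """Reinterpret an unsigned 16-bit port number as a signed 16-bit integer.

    Hypothetical helper for illustration; it only demonstrates the bit-level
    reinterpretation, not any specific SQL Server behaviour.
    """
    if not 0 <= port <= 65535:
        raise ValueError("TCP ports range from 0 to 65535")
    # Pack as unsigned short ("H"), then unpack the same two bytes as signed ("h").
    return struct.unpack("<h", struct.pack("<H", port))[0]
```

Ports up to 32767 come back unchanged, while anything above that wraps around into the negative range (for example, 50000 becomes 50000 - 65536).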
Microsoft released 13 new functions in 2019. The first 4 functions are related to the calculation group feature, which is now only available on Azure Analysis Services and Analysis Services 2019:
Read on for those 13 functions. Then, keep reading to see what Marco, Alberto, & crew have in mind for 2020.
The moment everyone has been waiting for is here: the PowerShell 7 Release Candidate is available for download. This is a “Go Live” release, officially supported in production by Microsoft.
Everyone in the Microsoft PowerShell Team, with the help of the community, has done an excellent job with the evolution of this new version of PowerShell. Read all about it in the recent PowerShell DevBlogs post “Announcing the PowerShell 7.0 Release Candidate”.
Make sure to read all previous posts, as they are perfectly outlined under the “Why is PowerShell 7 so awesome?” section of the release candidate post.
Click through for more details. One of the nice things in this RC is a consistent Out-GridView experience, so it’s not just for Windows anymore.
Welcome to 2020! I wanted to start this year by giving all my fellow consultants another way to troubleshoot our beloved SQL Servers. I’ve already talked about diagnostic notebooks in the past, and now, since Azure Data Studio has implemented the feature, I wanted to group them into a Diagnostic Book.
As the name implies, a Jupyter book is none other than a collection of notebooks (and markdown files) that groups everything in a coherent space, with an index and navigation options.
I think this sort of collection of notebooks (a, uh, note-book), if put together well, makes it easier to learn a new environment and understand key problems than a big Scripts.txt file or a folder full of scripts.
The new color picker allows colors in RGB format in addition to the hex color format that Power BI has used exclusively until now.
The new one also easily allows users to choose from a wider selection of shades and tones. This builds upon the simpler selection of hues and tints in the original.
In case you don’t know what David means, there is an excellent explanation of each term.
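For readers who have only worked with one of the two formats mentioned above, here is a minimal sketch of how a hex color and an RGB triple relate. The function names hex_to_rgb and rgb_to_hex are hypothetical helpers written for this illustration, and this is plain Python, not anything Power BI exposes.

```python
def hex_to_rgb(hex_color):
    """Convert a color such as '#01B8AA' to an (R, G, B) tuple of 0-255 ints."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected six hex digits, e.g. '#01B8AA'")
    # Each pair of hex digits encodes one channel: red, green, then blue.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(red, green, blue):
    """Convert 0-255 channel values back to a '#RRGGBB' string."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)
```

The two forms carry the same information: each pair of hex digits is simply one 0 to 255 channel value written in base 16, so converting in either direction is lossless.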