The Apache Flink project has long followed the philosophy of a unified approach to batch and stream data processing, built on the core paradigm of “continuous processing of unbounded data streams.” If you think about it, offline processing of bounded data sets fits this paradigm naturally: these are just streams of recorded data that happen to end at some point in time.
Flink is not alone in this: other open source projects, such as Apache Beam, also embrace “streaming first, with batch as a special case of streaming.” This philosophy is often cited as a powerful way to reduce the complexity of data infrastructure, since it lets you build data applications that generalize across real-time and offline processing.
Check it out. At the end, the authors also describe Blink, a fork of Flink which supports this paradigm and is (slowly) being merged back in.
Kudu supports coarse-grained authorization of client requests based on the authenticated client Kerberos principal. The two levels of access which can be configured are:
1. Superuser – principals authorized as a superuser are able to perform certain administrative functions, such as using the kudu command line tool to diagnose or repair cluster issues.
2. User – principals authorized as a user are able to access and modify all data in the Kudu cluster. This includes the ability to create, drop, and alter tables as well as read, insert, update, and delete data.
Access levels are granted using whitelist-style Access Control Lists (ACLs), one for each of the two levels.
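As a rough sketch, these ACLs are set via flags on the Kudu daemons; the flag names below match recent Kudu releases, but the principals (admin1, etl_service, and so on) are made-up examples:

```
# Sketch only: grant superuser access to one admin principal and
# normal user access to two service accounts. Pass these flags to
# kudu-master and kudu-tserver at startup.
--superuser_acl=admin1
--user_acl=etl_service,reporting_service
```

By default the user ACL is `*` (everyone), so locking it down means enumerating your allowed principals explicitly.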
Read on to see how to tie it all together.
One thing I need you to understand first: you have to provide the database and the queries. The tools in this post (except the last one) are designed to help you run queries, but they don’t include the queries themselves. The whole idea with load testing is that you’re trying to mimic your own workloads. If you’re just trying to test a server with generic workloads, start with my post, “How to Check Performance on a New SQL Server.”
Click through for a list of tools. I’d also throw in Pigdog from Mark Wilkinson (one of my co-workers). It helped us replicate a few issues in SQL Server 2017 around tempdb performance.
There is always discussion about how to store data back from Power BI to a local computer or a SQL Server database. In this short blog post, I will show how to do it by writing R scripts inside Power Query.
Leila also describes a complication you may hit where writes happen twice.
Pamela Mooney has a two-part series on dropping database objects. Part one includes a big setup script:
Some months ago, a fellow DBA came to me and expressed her concern over the normal housecleaning process that occurred at her company. Developers or product owners would submit changes for objects which were no longer used, only to find that sometimes, they were. No damage had been done, but the possibility of creating an emergent situation was there.
I could well understand her concern. Most of us who have been DBAs for any length of time have probably come across a similar scenario. I thought it would be fun to write a process that reduced the risk of dropping database objects and made rollbacks a snap.
Now, the objects in the table will be dropped after 120 days. But what if you need them to be dropped sooner (or later)? Both options work, but I’ll show you the sooner, and also what happens if there is a problem with the drop.
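Pamela’s actual scripts are in the post; as a rough sketch of the pattern (the table and column names here are invented for illustration, not her schema), the idea is a queue table of candidate objects plus a job that drops anything past its grace period:

```sql
-- Hypothetical safety-net table: objects queued for eventual drop
CREATE TABLE dbo.ObjectsToDrop
(
    DatabaseName sysname      NOT NULL,
    SchemaName   sysname      NOT NULL,
    ObjectName   sysname      NOT NULL,
    QueuedOn     datetime2(0) NOT NULL DEFAULT SYSDATETIME()
);

-- Anything queued more than 120 days ago is due to be dropped;
-- a scheduled job would iterate over this result set.
SELECT DatabaseName, SchemaName, ObjectName
FROM dbo.ObjectsToDrop
WHERE QueuedOn < DATEADD(DAY, -120, SYSDATETIME());
```

The rollback story falls out naturally: until the grace period expires, “undropping” is just deleting the row from the queue table.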
Check it out and drop with impunity. Or at least the bare minimum of punity.
These results confirm that:
1. You can import a certificate from a
2. You can import a private key when creating a certificate from a
3. You cannot import a private key when creating a certificate from an assembly
4. Except when creating a certificate from an assembly, any combination of sources for the certificate (i.e. public key and meta-data) and the private key should be valid
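For reference, the two import paths look like this in T-SQL (certificate names, file paths, and the password are placeholders):

```sql
-- Import a certificate (public key + metadata) from a file,
-- together with its private key:
CREATE CERTIFICATE SigningCert
FROM FILE = 'C:\certs\SigningCert.cer'
WITH PRIVATE KEY (
    FILE = 'C:\certs\SigningCert.pvk',
    DECRYPTION BY PASSWORD = '<strong password here>'
);

-- Create a certificate from a signed assembly; per the findings
-- above, you cannot attach a private key in this case:
CREATE CERTIFICATE AssemblyCert
FROM ASSEMBLY SignedAssembly;
```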
It’s a long post with a lot of detail and quite a few tests, so check it out.
If you’ve been query tuning for a while, you probably know about SARGability, and that wrapping columns in functions is generally a bad idea.
But just like there are slightly different rules for CAST and CONVERT with dates, the repercussions of the function also vary.
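As a quick illustration (the table and column are hypothetical), wrapping the column in a function usually prevents an index seek, and the fix is to move the computation to the other side of the predicate:

```sql
-- Typically non-SARGable: the function runs against every row,
-- so an index on OrderDate can't be used for a seek
SELECT OrderID
FROM dbo.Orders
WHERE DATEADD(DAY, 1, OrderDate) > GETDATE();

-- SARGable rewrite: apply the arithmetic to the constant instead
SELECT OrderID
FROM dbo.Orders
WHERE OrderDate > DATEADD(DAY, -1, GETDATE());
```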
Read the whole thing. Maybe “go to brunch” in the middle of it for maximum effect.
Fine, let’s talk about a business then. How about 24 million loan records, including bank account information, email addresses, phone numbers, Social Security numbers, and all the rest? Yeah, that was sitting in an Elasticsearch database with no password of any kind. Oh, and the S3 storage was completely open too. Security? Is that still a thing?
How about exposing your entire client list because you left the password off the database (Elasticsearch again; is it really that hard to put a password on Elasticsearch)? How about stacks of resumes (Elasticsearch again, and MongoDB)?
Those are just breaches from this year. If we go back, we can find more and more. Please, put a password on your systems.
The OWASP Top 10 list of application security risks is out there and provides a lot of useful information on how to prevent the problems Grant mentions.
In order to build the HTML table, I have used a function, table_frame, which can be used as a container in DT::renderDataTable. This function basically uses htmltools. For more references on the basics of HTML tables, please refer here.
What is Results to Grid, and what can it do for you? Results to Grid is a set of Query Results options in SQL Server Management Studio (SSMS) that let users customize their query results in a variety of ways, helping them work more efficiently. Some of these might be little changes, but when used often throughout the day, they can make a big difference. Once you change a setting, you will need to open a new query window for the change to take effect.
The two I wish were on by default are column headers when copying or saving results, and retaining CR/LF on copy or save.