If you have done any text or data analysis, you might already be familiar with Regular Expressions (RegEx). RegEx evolved as a necessary tool for text editing. If you are still relying on RegEx for text processing, though, you may run into problems. Why? On large texts, the low efficiency of RegEx can make data analysis unacceptably slow.
In this article, we will discuss how you can use FlashText, a Python library said to be 100 times faster than RegEx, to perform data analysis.
Learn more on the GitHub repo. I haven’t used this before but I could see it being handy.
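To make the idea concrete, here is a minimal pure-Python sketch of trie-based keyword lookup, which as I understand it is the general approach FlashText takes. This is not FlashText's code or API; the function names and the sample sentence are my own illustration.

```python
# Hypothetical sketch of trie-based keyword extraction (not FlashText itself).
END = object()  # sentinel key marking the end of a complete keyword


def build_trie(keywords):
    """Build a nested-dict trie keyed by lowercased word."""
    trie = {}
    for keyword in keywords:
        node = trie
        for word in keyword.lower().split():
            node = node.setdefault(word, {})
        node[END] = keyword
    return trie


def extract_keywords(text, trie):
    """Walk the text once, returning the longest keyword match at each position."""
    words = [w.strip(".,!?;:") for w in text.lower().split()]
    found = []
    i = 0
    while i < len(words):
        node, match, match_end, j = trie, None, i, i
        # Follow the trie as far as the upcoming words allow.
        while j < len(words) and words[j] in node:
            node = node[words[j]]
            j += 1
            if END in node:
                match, match_end = node[END], j
        if match is not None:
            found.append(match)
            i = match_end  # skip past the matched keyword
        else:
            i += 1
    return found


trie = build_trie(["Big Apple", "Bay Area", "Python"])
hits = extract_keywords(
    "I moved from the Big Apple to the Bay Area for a Python job.", trie
)
print(hits)
```

The appeal is that each word of the text is looked up in a dictionary, so the work grows with the length of the text rather than with the number of keywords.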
We started our tooling efforts by providing visibility into device and server logs, so that our users can go to one tool instead of having to use separate data-specific tools or log in to servers. Providing visibility into logs is valuable because log messages include important contextual information, especially when errors occur.
However, at some point in our business growth, storing device and server logs didn’t scale because the increasing volume of log data caused our storage cost to balloon and query times to increase. Besides reducing our storage retention period, we addressed scalability by implementing a real-time stream processing platform called Mantis. Instead of saving all logs to persistent storage, Mantis enables our users to stream logs into memory, and keep only those logs that match SQL-like query criteria. Users also have the choice to transform and save matching logs to persistent storage. A query that retrieves a sample of playback start events for the Apple iPad is shown in a screenshot in the original post.
It’s an interesting post.
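The "keep only what matches" idea can be sketched in a few lines of Python. This is only an illustration of filtering and sampling a stream in memory, not Mantis or its query language; the event fields and function name are hypothetical.

```python
from itertools import islice


def matching_events(events, predicate, projection=None):
    """Lazily yield only events that satisfy the predicate, optionally transformed."""
    for event in events:
        if predicate(event):
            yield projection(event) if projection else event


# Hypothetical in-memory event stream; real events would arrive continuously.
events = [
    {"type": "playback_start", "device": "iPad", "title_id": 101},
    {"type": "playback_error", "device": "iPad", "title_id": 102},
    {"type": "playback_start", "device": "Roku", "title_id": 103},
    {"type": "playback_start", "device": "iPad", "title_id": 104},
    {"type": "playback_start", "device": "iPad", "title_id": 105},
]

ipad_starts = matching_events(
    iter(events),
    predicate=lambda e: e["type"] == "playback_start" and e["device"] == "iPad",
    projection=lambda e: {"title_id": e["title_id"]},
)

# Take a sample of two matches; non-matching events are never stored.
sample = list(islice(ipad_starts, 2))
print(sample)
```

Because the generator is lazy, nothing beyond the requested sample is read or retained, which is the storage saving the excerpt describes.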
Switching between editor modes is pretty simple and straightforward. At the bottom of the screen, we can see the help bar, which guides us through switching between the available editor modes. The options available for instant switching are the multi-line mode, activated by pressing F3, and the Emacs mode, activated by pressing F4.
To run a query in multi-line mode, end the query with a semicolon and then press the Enter key to execute it.
Use the same keys to turn the editor modes on and off: F3 for the multi-line query mode and F4 for the Emacs mode.
If you’re big on command-line interfaces, you’ll probably enjoy this client.
As many of us know, GDPR is approaching, and we made some updates. In the past, file history stored entire T-SQL queries, and if a query contained any secrets or passwords, it wasn’t smart enough to scrub those out. This is no longer the case: file history has been updated so that it no longer stores secrets or passwords.
In addition, we have added 24-hour rotation of UserID when we collect telemetry.
Read on for other improvements.
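For a rough idea of what scrubbing a query before saving it to history could look like, here is a small Python sketch. The excerpt does not show how the tool actually does this; the regular expression, the function name, and the choice to mask only PASSWORD and SECRET literals are my assumptions.

```python
import re

# Assumption: only string literals assigned to PASSWORD or SECRET are masked.
# (?:[^']|'')* also consumes doubled quotes, T-SQL's escape for a quote in a string.
_SECRET_LITERAL = re.compile(
    r"\b(PASSWORD|SECRET)\s*=\s*N?'(?:[^']|'')*'",
    re.IGNORECASE,
)


def scrub_query(query):
    """Return the query text with password and secret literals masked."""
    return _SECRET_LITERAL.sub(r"\1 = '***'", query)


original = "CREATE LOGIN app WITH PASSWORD = 'S3cr3t!''x';"
print(scrub_query(original))
```

A pattern like this only catches the obvious cases, so a real tool would presumably need something sturdier than a single regular expression.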
On Tuesday the 3rd of this month, I invited people in the SQL Server community to share which tools are essential to their daily work. I was really overwhelmed by the number of stories that the topic triggered. 22 people in total took the time to write down and share which tools they use for their work chores.
Going through 22 posts and aggregating them has taken more time than I had hoped, since my trusted laptop broke down – blinking codes are alive and well, I tell you!
Click through for the 22 submissions as well as Jens’s set of links to the tools people mentioned.
When I need to quickly create smaller sets of test data or dummy data, I use Mockaroo. It is highly configurable with over 140 built-in field types for locations, personal information, product information, technical information and much more. Every field type can be customized, and you can also use your own regular expression to generate data. The data can then be exported to CSV, JSON, SQL, and Excel formats. The interface is simple to use and understand, and you can save your schemas and data sets for later reuse.
I’m fond of Coblis and was aware of the last two, but the first two were new to me.
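If you just want the flavor of what a tool like Mockaroo does, a few lines of Python can generate small dummy data sets too. This sketch is not related to Mockaroo's own implementation; the field names and value lists are made up for illustration.

```python
import csv
import io
import random

# Hypothetical value pools standing in for built-in field types.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Edsger", "Barbara"]
CITIES = ["Oslo", "Lima", "Austin", "Kyoto"]


def generate_rows(count, seed=0):
    """Generate `count` dummy rows; the seed makes the output repeatable."""
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "first_name": rng.choice(FIRST_NAMES),
            "city": rng.choice(CITIES),
            "order_total": round(rng.uniform(5, 500), 2),
        }
        for i in range(1, count + 1)
    ]


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = generate_rows(5, seed=42)
csv_text = to_csv(rows)
print(csv_text)
```

Seeding the generator plays the same role as saving a schema for reuse: the same call gives you the same data set later.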
The Admin stack is probably the most important stack here. Still running maintenance tasks via SSMS? Stop doing that. Rebuilding indexes every night? Maybe rethink that.
How do you keep track of, monitor, and do basic DBA tasks?
Ok, so this can involve SSMS, but it’s a feature a lot of people may not use. We use it to keep track of all of our instances and push things..oh baby baaaby! It also allows me to combine it with PoSh to do work against instances, gather data (historical, DMV…etc) and do a boatload of admin stuff without pointing and clicking. Heck, I don’t even have to open SSMS to use my CMS server at all.
SQLSentry can automatically defrag indexes for you and update stats. You could use this instead of the below choices for this aspect if desired. Although not free, it’s an option we have in our environment and I love me some options.
Both are amazing options for backup, reindexing and checkdb. Most places I’ve worked use Ola’s scripts by default. HOWEVER…. Minion has some pretty nice options that are FAR more configurable than Ola’s. We have mitigated some large DB issues by rolling our own code on top of Ola’s scripts. We could avoid this by simply using Minionware!
I’m a huge fan of the Minionware suite. And several other things Warren mentions.
Documentation is an important part of every DBA’s job and Snagit Screen Capture is my screenshot tool of choice. I don’t know how I ever lived without it. Snagit is awesome for both pics and videos – wayyy better than the built-in (but still useful) Snipping tool.
Click through for several other tooling recommendations.
This T-SQL Tuesday is brought to us by Jens Vestergaard (b | t), and we are asked to share our favorite SQL Server tools. Hint: Profiler will not be on the list. But where do you start? There are so many tools out there. In alphabetical order, here are my top 5 tools, because I can’t pick which one is better than the others.
Click through to see Tracy’s top 5 list.
In the middle, we have an open source time series database called InfluxDB, which is designed for collecting timestamped data such as performance metrics. Into that, we feed data from an open source project called Telegraf, which can feed in more than just SQL Server statistics. And to show the data in nice, pretty graphs that we can manipulate, drill down on, and even set up alerts for, we display it using Grafana. You will find links to all of these products as we go through the setup of the solution.
Tracy’s post is dedicated to installation and configuration more than defining metrics, but it does get you on the road to custom metrics visualization.