When making a chart, we should use the same color scheme for the same metrics. And we need to avoid the excessive color interference to the user.
For example, when we do sales analysis, we usually analyze the indicators of sales and payment collection. Then, when we do data visualization analysis of different dimensions for the same indicator, we recommend using the same color system for sales and payment collection. It means that the sales amount can be indicated by the yellow-green color, and the return amount can be indicated by the blue color accordingly. After following the principle of consistency of indicator color, the user can quickly understand the meaning of the indicator expressed by the current data visualization chart according to the color distinction.
Color is a pre-attentive attribute: we sub-consciously pay attention to it before we consciously observe it. That has advantages but it also comes with responsibilities.
I have been working on some improvements to some of the regular ways we monitor for important changes. We always have to be on the lookout for unexpected changes being made in the SQL instances that we monitor since often times we are not the only team who has sysadmin access to the instance. We are always the best trained to take care of and configure things but we sometimes find that someone makes a change either to the SQL or database configuration without telling us. We want to know when things like this happen!
I’m a big fan of these. Of course you need to get the code right, as a bad trigger can be devastating but you can get a lot of useful information out of it and figure out who’s hand was in the cookie jar.
Flink’s network stack is one of the core components that make up the
flink-runtimemodule and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows through and it is therefore crucial to the performance of your Flink job for both the throughput as well as latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers which are using RPCs via Akka, the network stack between TaskManagers relies on a much lower-level API using Netty.
This blog post is the first in a series of posts about the network stack. In the sections below, we will first have a high-level look at what abstractions are exposed to the stream operators and then go into detail on the physical implementation and various optimisations Flink did. We will briefly present the result of these optimisations and Flink’s trade-off between throughput and latency. Future blog posts in this series will elaborate more on monitoring and metrics, tuning parameters, and common anti-patterns.
There’s a lot in here and it’s worth reading.
SQL 2017 on Windows Server 2016 behaves the same as SQL 2016 on Windows Server 2016 – “tail of the log” is supported. However, there is no support for PMEM with SQL 2017 on supported Linux distributions (except as a traditional block store). Using PMEM with SQL 2019 on Linux supports what’s known as “enlightenment”, which allows us to place data and log files on DAX formatted volumes, thereby reducing latency considerably. SQL 2019 on Linux also support “tail of the log”.
This is one of those areas where understanding Linux versus Windows administration really pays off, at least until Windows Server supports something like enlightenment.
Of course, no one reads the documentation. And even if they do, they certainly don’t continue on to the 4th paragraph of the “General Remarks” section. That’s like going to the second page of Google search results.
If one did trudge on through the docs, they would find this gem:
Although BEGIN TRANSACTION starts a local transaction, it is not recorded in the transaction log until the application subsequently performs an action that must be recorded in the log, such as executing an INSERT, UPDATE, or DELETE statement. An application can perform actions such as acquiring locks to protect the transaction isolation level of SELECT statements, but nothing is recorded in the log until the application performs a modification action.
Well that’s interesting! Let’s check it out.
Four paragraphs? Where’s the tl;dr, Microsoft Docs?
Read the whole thing even though Josh just made a joke about nobody reading the whole thing and I followed along with it.
I’d suggest starting with CTEs because they’re easy to write and to read. If you hit a performance wall, try ripping out a CTE and writing it to a temp table, then joining to the temp table.
This is my advice, too. Start with the thing which is easiest for you to develop and maintain. If it suffices for performance, stick with it; otherwise, move to the next-lowest level of complication. Stop when you have good enough performance. This optimizes for one of the most precious resources people rarely think about: developer maintenance time. Developers are pretty expensive, so the more time they spend trying to understand complex code, the less time they’re doing stuff which pushes the business forward.
Monitoring Agents – The first (and easiest) thing to be fixed were the monitoring agents (ie. MS Systems Center Operations Manager). All that took was a quick Redeploy and they came back online.
SQL Agent Jobs – The next issue was SQL Agent Jobs. Many of the maintenance jobs (eg. syspolicy_check_schedule_xxxx) have the server name hardcoded into them. This can be easily remediated by editing the job step properties and replacing the old server name with the new one.
Read on for more problems you might run into. None of these is insurmountable but they are annoying enough that I wouldn’t want to rename servers willy-nilly.
Starting with CDH 6.2, Cloudera now includes the ability to use Intel’s newly released Optane Memory as an alternate destination for the 2nd tier of the bucket cache. This deployment configuration enables you to have ~3x the size of the cache for constant cost (as compared to off-heap cache on DRAM). It does incur some additional latency compared to the traditional off-heap configuration, but our testing indicates that by allowing more (if not all) of the data’s working set to fit in the cache the set up results in a net performance improvement when the data is ultimately stored on HDFS (using HDDs).
When deploying to the cloud or using on-prem object storage, the performance improvement will be even better as object storage tends to be very expensive for random reads of small amounts of data.
There aren’t too many changes to HBase in the blog post, but the two mentioned are pretty good ones.
If you do define your Spark DataFrames well, you get a much happier result. Here’s me creating a better-looking DataFrame in Spark:
INT(SUMLEV) AS SummaryLevel,
INT(COUNTY) AS CountyID,
INT(PLACE) AS PlaceID,
BOOLEAN(PRIMGEO_FLAG) AS IsPrimaryGeography,
NAME AS Name,
POPTYPE AS PopulationType,
INT(YEAR) AS Year,
INT(POPULATION) AS Population
POPULATION <> 'A'
It’s not all perfect, though: I also cover driver problems that I ran into here with Spark and Hive.
One of the more perplexing problems to troubleshoot in SQL Server can be those related to memory grants. Some queries need more memory than others to execute, based on what operations need to be performed (e.g. sort, hash). SQL Server’s optimizer estimates how much memory is needed, and the query must obtain the memory grant in order to start executing. It holds that grant for the duration of query execution – which means if the optimizer overestimates memory you can run into concurrency issues. If it underestimates memory, then you can see spills in tempdb. Neither is ideal, and when you simply have too many queries asking for more memory than is available to grant, you’ll see RESOURCE_SEMAPHORE waits. There are multiple ways to attack this issue, and one of my new favorite methods is to use Query Store.
Click through for a demonstration.