Press "Enter" to skip to content

Day: May 11, 2022

Shipping Kafka Logs to Kibana via Filebeat

Shivani Sarthi uses Filebeat to perform log shipping:

To ship the Kafka logs, we will be using the filebeat agent. A filebeat agent is a lightweight shipper whose purpose is to forward and centralize the log data.

For filebeat to work, you need to install it as an agent on the desired servers. Filebeat then monitors the log files, collects the log events, and forwards them to the ElasticSearch or LogStash for indexing.

Click through for an Ansible script to install Filebeat, integrate with Kafka, and communicate with Logstash for eventual querying via Kibana.

Comments closed

Optimizing Hive Performance with Tez

Jay Desai has some recommendations around tuning Tez queries:

Tuning Hive on Tez queries can never be done in a one-size-fits-all approach. The performance on queries depends on the size of the data, file types, query design, and query patterns. During performance testing, evaluate and validate configuration parameters and any SQL modifications. It is advisable to make one change at a time during performance testing of the workload, and would be best to assess the impact of tuning changes in your development and QA environments before using them in production environments. Cloudera WXM can assist in evaluating the benefits of query changes during performance testing.

Click through for several configuration and query considerations.

Comments closed

Context Transition in DAX

Marco Russo and Alberto Ferrari explain the idea of context transition in DAX:

Let us state this from the very beginning: context transition is a simple concept. It is a powerful feature aiming to simplify the authoring of DAX code. That said, most new DAX developers find context transition hard to understand, and they consider it to be the major reason behind incorrect results. There are two reasons for developers to feel this way:

– A solid understanding of the difference between the row context and the filter context is an important prerequisite to understand and master the concept of context transition.

– You need to remember when and how the context transition works. Most errors involving context transition are due to the developer forgetting to take the context transition into account, rather than not knowing how it works. Once they realize that context transition is happening, the code suddenly makes sense.

Read on to understand more about the idea of context transition.

Comments closed

Thoughts on Code Obfuscation

Joy George Kunjikkur reminds us of worse times:

Long long ago I was given the special task of hiding code. Hiding code..what? Yes, we have to deliver code in such a way nobody should be able to reverse engineer.

I’ve run into problems around this in modern code as well. For example, using client-side React means that you aren’t going to be able to hide secrets like credentials, connection strings, etc. in a way that users absolutely won’t be able to see them. In the SQL Server world, some companies use encrypted stored procedures, which is a joke considering that you also need to ship the keys to decrypt those procedures, meaning that an enterprising user can get around your obfuscation attempt in moments.

Comments closed

Reversing Strings with Powershell

Jeff Hicks solves a challenge:

The warm-up challenge was all about manipulating strings. That alone may not sound like much. But the process of discovering how to do it and wrapping it up in a PowerShell function is where the true value lies.

Beginner Level: Write PowerShell code to take a string like ‘PowerShell’ and display it in reverse.

Intermediate Level: Take a sentence like, “This is how you can improve your PowerShell skills.” and write PowerShell code to display the entire sentence in reverse with each word also reversed. You should be able to encode and decode text. Ideally, your functions should take pipeline input. For bonus points, toggle upper and lower case when reversing the word.

Click through to see both of these solutions in action.

Comments closed

Window Functions and Tips on Sorting

Itzik Ben-Gan shares some good advice:

A supporting index can potentially help avoid the need for explicit sorting in the query plan when optimizing T-SQL queries involving window functions. By a supporting index, I mean one with the window partitioning and ordering elements as the index key, and the rest of the columns that appear in the query as the index included columns. I often refer to such an indexing pattern as a POC index as an acronym for partitioningordering, and covering. Naturally, if a partitioning or ordering element doesn’t appear in the window function, you omit that part from the index definition.

But what about queries involving multiple window functions with different ordering needs? Similarly, what if other elements in the query besides window functions also require arranging input data as ordered in the plan, such as a presentation ORDER BY clause? These can result in different parts of the plan needing to process the input data in different orders.

Read on for six tips. By the way, if you see broken images on the page (which I saw at the time of posting), click the broken image icon to see the image. It appears that the problem is just with inline images.

Comments closed

Getting Cluster File Share Witness Sharepath via Powershell

Tom Collins needs some information:

I normally use the Windows reg utility to get the  Cluster File Share Witness sharepath information. This is an example command line on a server returning the File Share Witness details

reg query \\myCluster\HKEY_LOCAL_MACHINE\Cluster\Resources /s /f sharepath

I want to start integrating these sorts of tasks into automated information gathering procedures and migrate this task into the Powershell library. Could you share some Powershell code to get the Cluster File Share Witness sharepath  information

Click through for a Powershell one-liner which gets this information.

Comments closed

Running Kubernetes Clusters on Windows

Boemo Mmopelwa shows off Kubernetes Kind:

Kubernetes production clusters are typically run on cloud platforms. However, running and deploying Kubernetes applications on cloud platforms such as Google Kubernetes Engine is costly. These high costs can restrict the Kubernetes learning process for beginners. However, running Kubernetes clusters locally helps you efficiently test applications without disrupting the production environment or paying for cloud services. 

To make things easier, the Kubernetes team developed two tools,  Minikube and Kind, that allow Kubernetes users to run clusters locally without spending a dollar. This article will cover how to do it with Kind. Kind is a command-line tool used to run Kubernetes clusters locally on your computer using Docker containers. Kind works with Docker by loading Docker containers into Kind clusters. A Kubernetes cluster is a group of nodes used to run containerized applications.

Read the whole thing.

Comments closed