Apache describes Kafka as a distributed streaming platform that lets us:
- Publish and subscribe to streams of records.
- Store streams of records in a fault-tolerant way.
- Process streams of records as they occur.
Kafka is probably the most generally interesting technology in the current Hadoop ecosystem, with Spark not too far behind. By “generally interesting,” I mean that companies with no vested interest in Hadoop as a whole could still be excited by the prospect of Kafka.
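To make the three capabilities listed above a little more concrete, here is a minimal in-memory sketch in Python. It is not Kafka and does not use any Kafka client API. `ToyLog`, `publish`, `poll`, and `process` are hypothetical names invented for this illustration, and the toy has no fault tolerance at all. It only shows the shape of the idea: an append-only log that subscribers read at their own pace.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for a single topic: an append-only list of records."""

    def __init__(self):
        self._records = []                # the stored stream; records are only appended
        self._offsets = defaultdict(int)  # next unread position for each subscriber

    def publish(self, record):
        """Append a record and return its offset in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, subscriber):
        """Return the records this subscriber has not read yet, then advance its offset."""
        start = self._offsets[subscriber]
        batch = self._records[start:]
        self._offsets[subscriber] = len(self._records)
        return batch


def process(records):
    """Toy processing step: upper-case each record as it arrives."""
    return [r.upper() for r in records]


log = ToyLog()
log.publish("order created")
log.publish("order shipped")
first = process(log.poll("billing"))   # billing reads the first two records
log.publish("order delivered")
second = process(log.poll("billing"))  # billing reads only the new record
replay = log.poll("audit")             # a new subscriber reads the stored stream from the start
```

Because records stay in the log after they are read, a subscriber that joins late (`audit` here) still sees the full history, which is the main difference from a queue that deletes messages on delivery.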
Local Debug lets you debug your C# code-behind, step through the code, and validate your script locally before submitting it to ADLA.
Use the command ADL: Start Local Run Service to start the local run service and set a breakpoint in your code-behind, then run the command ADL: Local Debug to start the local debug service. You can debug through the debug console and view parameter, variable, and call stack information.
Click through to see the other improvements.
Data Science & Engineering
Cloudera Data Science Workbench enhancements include:
- GPU Support: Cloudera Data Science Workbench now enables popular deep learning frameworks to run on GPUs, both on-premises and in the cloud.
- Embedded Web UIs: Users can work with the Apache Spark Web UI for Spark sessions. Other interactive web applications like TensorBoard, Shiny, and Plotly now appear directly in the workbench.
- Enhanced Job Scheduling: Cloudera Data Science Workbench users can now schedule jobs directly from external schedulers or orchestration systems via the new Jobs API.
Read on for more enhancements.
So, to briefly sum up, to use SMO on Linux, you need to do the following:
- Install .NET Core 2.0
- Install PowerShell beta 2
- Install SQL Tool Service
You can use PowerShell from the Terminal, but I prefer something like an IDE, so these steps are optional:
- Download Visual Studio Code
- Install the PowerShell plugin
- Change the settings file to point explicitly to PowerShell beta 2
Read the whole thing.
Microsoft says that turning on TDE (Transparent Data Encryption) for a database will result in a 2-4% performance penalty, which is actually not too bad given the benefits of having your data more secure. There is even more of a performance hit when enabling cell level or column level encryption. When encrypting any of your databases, keep in mind that the tempdb database will also be encrypted. This could have a performance impact on your other non-encrypted databases on the same instance.
In a previous post, I demonstrated how to add an encrypted database to an AlwaysOn group in SQL Server 2016. In this article, I will demonstrate the performance effects of having an encrypted database in your AlwaysOn group compared to the same database unencrypted.
The results aren’t surprising, though the magnitude of the results might be.
As DBAs, our stock in trade is information, and there is certainly an impressive amount available. The diagnostic views are the most common place to get the information we need, but every now and again it’s nice to get an organized/pretty view. To that end, you can write your own reports or you can use the default reports that Microsoft makes available through SSMS. There are reports at the Server, Database, and Agent levels.
The Disk Usage by Table report is on my go-to list.
We typically think of error logs as somewhere to go to find issues, but what if your error logs ARE the issue? Like most anything else in SQL Server, if you neglect your error logs you can run into trouble. Even on a low-traffic SQL Server instance, a bad piece of code or a hardware issue could easily fill your error logs. With the introduction of Hekaton in SQL Server 2014, the SQL Server error log also started getting a lot more data pumped into it than you might have been used to before. What this means for the DBA is that you can quickly start filling your main system drive (if your SQL install and error logs are in the default location) with massive error logs. So what questions should you be answering about error logs to make sure you don’t run into problems?
Read on to learn more.
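One of the first questions is simply how much space the error logs take up right now. The Python sketch below is one way to check that from the file system, assuming you pass in the directory that holds the logs. The helper names `error_log_usage` and `exceeds_budget`, the directory argument, and the size budget are all hypothetical and not taken from the linked post. The only SQL Server detail it relies on is that error log files are named `ERRORLOG`, `ERRORLOG.1`, `ERRORLOG.2`, and so on.

```python
import os


def error_log_usage(log_dir, prefix="ERRORLOG"):
    """Return (file_count, total_bytes) for files in log_dir whose names start with prefix."""
    count = 0
    total = 0
    for entry in os.scandir(log_dir):
        # Match ERRORLOG, ERRORLOG.1, ... case-insensitively; skip subdirectories.
        if entry.is_file() and entry.name.upper().startswith(prefix):
            count += 1
            total += entry.stat().st_size
    return count, total


def exceeds_budget(total_bytes, budget_bytes):
    """True when the logs use more space than the budget you chose (an assumption)."""
    return total_bytes > budget_bytes
```

A scheduled job could call `error_log_usage` on the instance's log directory and raise an alert when `exceeds_budget` returns `True`. Picking the budget is a judgment call that depends on the size of the drive.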