Avro has support for primitive types (
bytes, etc…), complex types (
unions, optionals), logical types (
decimal), and data record (
namespace). All the types you’ll ever need.
Avro has support for embedded documentation. Although documentation is optional, in my workflow I will reject any Avro Schema PR (pull request) that does not document every single field, even if obvious. By embedding documentation in the schema, you reduce data interpretation misunderstandings, you allow other teams to know about your data without searching a wiki, and you allow your devs to document your schema where they define it. It’s a win-win for everyone.
Avro schemas are defined using JSON. Because every developer knows or can easily learn JSON, there’s a very low barrier to entry
Read on for more about Avro as well as the possibilities of using other techniques for defining schemas in Kafka.
We set our goal here to investigate the age distribution of Maine residents, men and women, using SQL queries. But the question is… on Apache Hive or on Apache Spark? Well, why not both? We could use SparkSQL to extract men’s age distribution and HiveQL to extract women’s age distribution. We could then compare the two distributions and see if they show any difference.
But the main question, as usual, is: Will SparkSQL queries and HiveQL queries blend?
Topic: Age distribution for men and women in the U.S. state of Maine.
Challenge: Blend results from Hive SQL and Spark SQL queries.
Access mode: Apache Spark and Apache Hive nodes for SQL processing.
Using KNIME, the authors are able to blend together data from different sources.
This article shows you how to troubleshoot a failed installation of SQL Server and how to implement a workaround to allow SQL Server 2017’s PolyBase feature to be installed when version 9 of the Java Runtime Environment (JRE) is present. An installation of all features in SQL Server 2017 has three external dependencies. Python, R, and the JRE are third party or open source software needed in a full installation. Changes to external software after the release of SQL Server 2017 can introduce breaking changes. Oracle, the company that owns Java, changed how Windows registry keys are named. This caused a breaking change for SQL Server 2017. Version 8 of the JRE is compatible with the SQL Server 2017 installer. Version 9 of the JRE is not. If version 9 of the JRE is the only version of the JRE on a Windows machine, it is not possible to install the PolyBase feature. The JRE version bug also is found in the SQL Server 2016 installer. The same workaround works for both SQL Server 2016 and 2017.
Mssql-cli is a new and interactive command line tool that provides the following key enhancements over sqlcmd in the Terminal environment:
- T-SQL IntelliSense
- Syntax highlighting
- Pretty formatting for query results, including Vertical Format
- Multi-line edit mode
- Configuration file support
Mssql-cli aims to offer an improved interactive command line experience for T-SQL. It is fully open source under the BSD-3 license, and a contribution to the dbcli organization, an open source suite of interactive CLI tools for relational databases including SQL Server, PostgresSQL, and MySQL. The command-line UI is written in Python and the tool leverages the same microservice backend (sqltoolsservice) that powers the VS Code SQL extension, SQL Operations Studio, and the other Python CLI tool we announced earlier, mssql-scripter.
Something very cool that Alan points out is that this is “the first time our team is contributing source code to an existing open source organization with a commitment to be a good citizen in an existing open source community.”
You don’t need a gateway in all scenarios. Only if the data source is located on-premises, you need a gateway. For online or cloud-based data sources, no gateway is required. For example; if you are getting data from CRM Online, you don’t need a gateway. However, if you are getting data from SQL Server database located on your local domain server, then you need a gateway. For Azure SQL DB you don’t need a gateway. However, a SQL Server database located on Azure Virtual Machine is considered as on-premises and needs gateway.
This post could not have come at a better time for me, so I’m definitely happy to see it.
A few things to know:
It’s totally anonymous (we’re not getting your email, IP address, or anything like that.)
It’s open to all database platforms.
As with last year’s results, we’ll publish the raw data in Excel for anyone to analyze. If you want to set up your analysis ahead of time, here’s the incoming raw results as they happen, and we’ll share them in that exact same format.
Please take the survey, especially if you’re hitting Curated SQL for the analytics or Hadoop/Spark side of things rather than the SQL Server side. That way there’s a broader distribution of entries.
I could not read my error log on one of my local SQL Servers, when I executed the following code:EXEC sp_readerrorlog
I received the below:
Msg 22004, Level 16, State 1, Line 2 Failed to open loopback connection. Please see event log for more information. Msg 22004, Level 16, State 1, Line 2 Error log location not found.
Fortunately, the error logs had a bit more detail, so Arun has the answer for you.
This blog post is around the situation where you have SSRS setup to use HTTPS and thus using a certificate and the certificate expires (or just needs replacing). We had caught the initial error via our Continuous Monitoring of the SSRS site — basically when the certificate expired we got an exception and alerted on it.
The client installed a new certificate but the issue arose where in Reporting Service Configuration Manager we went to use the new certificate but when we chose it we got this error:
“We are unable to create the certificate binding”
Read on to see how to get past this.
Managing the password, access tokens and private keys are being tedious in the application. Any small mistakes accidentally expose all the secret information. Even storing such thing in docker images can be easily accessible one should just run the image in the interactive mode container and all your application code is available in containers. Docker provides secrets to protect all secret data.
This blog explains the low-level of storage information as well as secured access to docker secret. so, let’s get started.
Read the whole thing, especially if you’ve gone container-happy.
I was sitting in a bagel shop on Saturday with my 9 year old daughter. We had brought along hexagonal graph paper and a six sided die. We decided that we would choose a hexagon in the middle of the page and then roll the die to determine a direction:
1 up (North)
2 diagonal to the upper right (Northeast)
3 diagonal to the lower right (Southeast)
4 down (South)
5 diagonal to the lower left (Southwest)
6 diagonal to the upper left (Northwest)
Our first roll was a six so we drew a line to the hexagon northwest of where we started. That was the first “step.”
After a few rolls we found ourselves coming back along a path we had gone down before. We decided to draw a second line close to the first in those cases.
We did this about 50 times. The results are pictured above, along with kid hands for scale.
Last Monday we celebrated a “Scientific Marathon” at Royal Botanic Garden in Madrid, a kind of mini-conference to talk about our research. I was talking about the relation between fungal spore size and environmental variables such as temperature and precipitation. To make my presentation more friendly, I created a GIF to explain the Brownian Motion model. In evolutionary biology, we can use this model to simulate the random variation of a continuous trait through time. Under this model, we can notice how closer species tend to maintain closer trait values due to shared evolutionary history. You have a lot of information about Brownian Motion models in evolutionary biology everywhere!
Another place that this is useful is in describing stock market movements in the short run.