As the world continually becomes “eaten by software,” more and more services are being replaced by software. IT pros have most likely seen this in the form of software-defined everything. One of the premier components of this focus on software and with the continuing adoption of DevOps is application programming interfaces (APIs). All of these services needs to talk together and must provide a way for programs and users to interact with them. This is where APIs come in handy. But, what does this have to do with PowerShell and JSON, you ask?
APIs, more specifically REST APIs, return data when queried. This data is typically in the JSON format. JSON is a way of structuring data that makes it easy for software to consume. When working with PowerShell, Microsoft has provided some helpful tools to work with JSON called the ConvertTo-Json and ConvertFrom-Json commands. These commands allow you to quickly work with REST APIs or any other service that returns or accepts JSON as an input.
Read on for more details on how to use these commands.
Once the Script Component has been defined as a source, the output columns can be defined. For this post, the same USGS Earthquake data that was used in the “Download JSON data with PowerShell” post will serve as an example. Be careful to choose the correct data types here. This can be tedious because you have to choose the correct data types in the C# code as well, and ensure that they correspond with the SSIS types. It’s helpful to bookmark a SSIS data type translation table for reference.
It does involve creating a script component, but aside from the tedium that Chris mentions, it’s not too bad.
Here are results of the queries:SQL Server Execution Times: CPU time = 656 ms, elapsed time = 651 ms. SQL Server Execution Times: CPU time = 204 ms, elapsed time = 197 ms.
As you can see, WITH clause specify that OPENJSON should immediately return properties from the JSON array without second parsing. Performance of the queries might be increased 3 times if you avoid double parsing.
That’s a pretty big difference when you specify the relevant data model elements.
This is basically a cheat code for indexing computed columns.
SQL will only compute the “Make” value on a row’s insert or update into the table (or during the initial index creation) — all future retrievals of our computed column will come from the pre-computed index page.
This is how SQL is able to parse indexed JSON properties so fast; instead of needing to do a table scan and parsing the JSON data for each row of our table, SQL Server can go look up the pre-parsed values in the index and return the correct data incredibly fast.
Personally, I think this makes JSON that much easier (and practical) to use in SQL Server 2016. Even though we are storing large JSON strings in our database, we can still index individual properties and return results incredibly fast.
It’s great that the database engine is smart enough to do this, but I’m not really a big fan of storing data in JSON and parsing it within SQL Server, as that violates first normal form. If you know you’re going to use Make as an attribute and query it in SQL, make it a real attribute instead of holding multiple values in a single attribute.
In SQL Server, datetime2’s format is defined as follows:YYYY-MM-DD hh:mm:ss[.fractional seconds]YYYY-MM-DDTHH:mm:ss.sssZ
Honestly, they look pretty similar. However, there are few key differences:
JSON separates the date and time portion of the string with the letter
Zis optional and indicates that the datetime is in UTC (if the
-05:00for Eastern Standard Time)
Read on for a few scripts handling datetime conversions between these types.
Every once in a while I hear of some technologist say that relational databases are dead; instead, a non-table based NoSQL storage format is the way of the future. SQL Server 2016 introduced JSON functionality, making it possible for some “non-SQL” data storage to make its way into the traditionally tabled-based SQL Server.
Does this mean all data in SQL Server going forward should be stored in long JSON strings? No, that would be a terrible idea. There are instances when storing JSON in SQL Server is a good choice though. In this post I want to create recommendations for when data should be stored as JSON and when it shouldn’t.
Protip: anyone who says relational databases are dead is already working with one strike against. Bert has great use cases for JSON as well as a good understanding that there are plenty of anti-use cases, making his post well worth reading.
SQL Server 2016 and Azure SQL Database enable you to parse JSON text and transform it into tabular format. In this post, you might see that JSON functions can handle very large JSON text – up to 4GB.
First, I would need very large JSON document. I’m using TPCH database so I will export the content of lineitem table in a file. JSON can be exported using the bcp.exe program:
My first draft read “Jovan Popovic has created a monster.” I might go back to that one. On the plus side, the operation took a lot less time than I had expected, though I’d have to imagine that his SQL Express instance had some decent specs.
CLUSTERED COLUMNSTORE INDEXES provide extreme data compression in SQL Server and Azure SQL Database. With NVARCHAR(MAX) support in CCI indexes you can use them on your JSON data stored is database and get high 25x compression. Therefore, CCI is a perfect solution if you need to store a large volume of JSON data in your SQL Database.
ContosoDW database is publicly available for download, so you can use this database and the script below to re-create this table and try this in your environment.
I’m curious whether this will also apply to non-JSON data.
Although this is not a valid JSON format, many system use it to exchange data.
One advantage of line-delimited JSON format compared to the standard JSON is the fact that you can append new JSON objects at the end of the file without removing closing array bracket as in the standard JSON.
This might be a niche use case, but I’m sure that in this post-XML-all-the-things era, this is more common than you might first expect.
This is equivalent to collections that you might find in classic NoSQL database because they store each JSON document as a single entity and optionally create indexes on these documents. The only difference is CLUSTERED COLUMNSTORE index on this table that provides the following benefits:
Data compression – CCI uses various techniques to analyze your data and choose optimal compression algorithms to compress data.
Batch mode analytic – queries executed on CCI process rows in the batches from 100 to 900 rows, which might be much faster than row-mode execution.
I think it’s worth reading this in conjunction with Niko Neugebauer’s comments regarding strings in columnstore.