In this article, I continue to review the exciting features available in SQL Server 2016. One such feature is the long awaited T-SQL support for JSON formatted data. In this article we take a look at how JSON support will impact data warehouse solutions.
Since the advent of EXtensible Markup Language (XML) many modern web applications have focused on providing data that is both human-readable and machine-readable. From a relational database perspective, SQL Server kept up with these modern web applications by providing support for XML data in a form of an XML data type and several functions that could be used to parse, query and manipulate XML formatted data.
As a result of being supported in SQL Server, data warehouse solutions based off SQL Server were then able to source XML-based OLTP data into a data mart. To illustrate this point, let’s take a look at the XML representation of our fictitious Fruit Sales data shown in figure below.
To process this data in data warehouse, we would first have to convert it into relational format of rows and columns using T-SQL XML built-in functions such as the nodes() function.
The results of the above script are shown in figure below in a recognisable format for data warehouse.
Although there were workarounds (i.e. using Json.Net) to addressing the lack of JSON support in SQL Server, there was always sense that these workarounds were inadequate, time-wasting, and were forcing data warehouse development teams to pick up a new skill (i.e. learn .Net). Fortunately, the release of SQL Server 2016 has ensured that development teams can throw away their JSON workarounds as JSON is supported in SQL Server 2016.
Parsing JSON Data into Data Warehouse
Similarly to XML support in SQL Server, SQL Server supports of JSON can be classified into two ways:
- Converting Relational dataset into JSON format
- Converting JSON dataset into relational format
However, for the purposes of this discussion we are focusing primarily on the second part – which is converting a JSON formatted data (retrieved from OLTP sources) into a relational format of rows and columns. To illustrate our discussion points we once again make use of the fictitious fruit sales dataset. This time around the fictitious dataset has been converted into a JSON format as shown below.
As part of supporting JSON formatted data in other relational databases such as MySQL and PostgreSQL 9.2, there is a separate JSON data type that has been introduced by these vendors. Amongst other things, JSON data type conducts validation checks to ensure that values being stored are indeed of valid JSON format.
Unfortunately, SQL Server 2016 (and ORACLE 12c) do not have a special data type for storing JSON data instead a variable character (varchar/nvarchar) data type is used. Therefore, a recommended practice to dealing with JSON data in SQL Server 2016 is to firstly ensure that you are indeed dealing with a valid JSON data. The simplest way to do so is to use the ISJSON function. This is a built-in T-SQL function that returns 1 for a valid JSON dataset and 0 for invalids.
Image below shows us the implementation of ISJSON function whereby we validate our fictitious sample dataset.
Now that we have confirmed that we are working with a valid JSON dataset, the next step is to convert the data into a table format. Again, we have a built-in T-SQL function to do this in a form of OPENJSON. OPENJSON works similar to OPENXML in that it takes in an object and convert its data into rows and columns.
Figure below shows a complete T-SQL script for converting JSON object into rows and columns.
Once we execute the above script, we get relational output shown below.
Now that we have our relational dataset, we can process this data into data warehouse.
Prior to concluding our discussion of JSON in SQL Server 2016, it is worth mentioning that in addition to OPENJSON, you have other functions such as JSON_VALUE that could be used to query JSON data. However this function returns a scalar value which means that unlike the multiple rows and columns returned using OPENJSON, JSON_VALUE returns a single value as shown below.
If you the JSON object that you are querying doesn’t have multiple elements, than you don’t have to specify the row index (i.e. ) as shown below.
The long wait is finally over and with the release of SQL Server 2016, JSON is now supported. Similarly to XML, T-SQL support the conversion of JSON object to relational format as well the conversion of relational tables to a JSON object. This support is implemented via built-in T-SQL functions such as OPENJSON and JSON_VALUE. In spite of all the excitement with the support of JSON is SQL Server 2016, we still don’t have a JSON data type. The ISJSON function can then be used to validate JSON text.