SQL Server administration and T-SQL development, Web Programming with ASP.NET, HTML5 and Javascript, Windows Phone 8 app development, SAP Smartforms and ABAP Programming, Windows 7, Visual Studio and MS Office software
Development resources, articles, tutorials, code samples, tools and downloads for ASP.Net, SQL Server, Reporting Services, T-SQL, Windows, AWS, SAP HANA and ABAP


Parse JSON Response of Amazon Transcribe Service using SQL OpenJSON

Amazon Web Services AWS provides Amazon Transcribe service for developers to convert speech to text. While I was developing a prototype application using Amazon Transcribe service, I tried to parse JSON response of the AWS service in SQL Server with the latest enhancements of SQL Server 2017 to support JSON data handling.

If you are new to cloud computing and interested on AWS Amazon Web Services solutions for speech to text applications, you can visit Transcribe service at Amazon AWS platform.

Amazon Transcribe Service from AWS speech to text

When a transcription job is created on AWS Transcribe service, a TranscriptFileUri is returned back to the user. If the URL provided by TranscriptFileUri is called, a JSON data similar to following example is returned.

{"jobName":"transcription-job-test01","accountId":"777999555111","results":{"transcripts":[{"transcript":"Hello, my name is joanna. Nice to meet you."}],"items":[{"start_time":"0.000","end_time":"0.640","alternatives":[{"confidence":"1.0000","content":"Hello"}],"type":"pronunciation"},{"alternatives":[{"content":","}],"type":"punctuation"},{"start_time":"0.740","end_time":"0.900","alternatives":[{"confidence":"1.0000","content":"my"}],"type":"pronunciation"},{"start_time":"0.900","end_time":"1.120","alternatives":[{"confidence":"1.0000","content":"name"}],"type":"pronunciation"},{"start_time":"1.120","end_time":"1.250","alternatives":[{"confidence":"1.0000","content":"is"}],"type":"pronunciation"},{"start_time":"1.250","end_time":"1.870","alternatives":[{"confidence":"1.0000","content":"joanna"}],"type":"pronunciation"},{"alternatives":[{"content":"."}],"type":"punctuation"},{"start_time":"1.880","end_time":"2.200","alternatives":[{"confidence":"1.0000","content":"Nice"}],"type":"pronunciation"},{"start_time":"2.200","end_time":"2.310","alternatives":[{"confidence":"0.9992","content":"to"}],"type":"pronunciation"},{"start_time":"2.310","end_time":"2.520","alternatives":[{"confidence":"0.9833","content":"meet"}],"type":"pronunciation"},{"start_time":"2.520","end_time":"2.830","alternatives":[{"confidence":"0.9831","content":"you"}],"type":"pronunciation"},{"alternatives":[{"content":"."}],"type":"punctuation"}]},"status":"COMPLETED"}
Code

Above JSON file is the transcript of a speech file which is located in a S3 bucket.
Using Amazon Transcribe, developers can build speech-to-text applications easily.

Below code is showing how I declared SQL variable for JSON data for this SQL tutorial.

declare @json nvarchar(max)

set @json = N'
{"jobName":"transcription-job-test01","accountId":"777999555111","results":{"transcripts":[{"transcript":"Hello, my name is joanna. Nice to meet you."}],"items":[{"start_time":"0.000","end_time":"0.640","alternatives":[{"confidence":"1.0000","content":"Hello"}],"type":"pronunciation"},{"alternatives":[{"content":","}],"type":"punctuation"},{"start_time":"0.740","end_time":"0.900","alternatives":[{"confidence":"1.0000","content":"my"}],"type":"pronunciation"},{"start_time":"0.900","end_time":"1.120","alternatives":[{"confidence":"1.0000","content":"name"}],"type":"pronunciation"},{"start_time":"1.120","end_time":"1.250","alternatives":[{"confidence":"1.0000","content":"is"}],"type":"pronunciation"},{"start_time":"1.250","end_time":"1.870","alternatives":[{"confidence":"1.0000","content":"joanna"}],"type":"pronunciation"},{"alternatives":[{"content":"."}],"type":"punctuation"},{"start_time":"1.880","end_time":"2.200","alternatives":[{"confidence":"1.0000","content":"Nice"}],"type":"pronunciation"},{"start_time":"2.200","end_time":"2.310","alternatives":[{"confidence":"0.9992","content":"to"}],"type":"pronunciation"},{"start_time":"2.310","end_time":"2.520","alternatives":[{"confidence":"0.9833","content":"meet"}],"type":"pronunciation"},{"start_time":"2.520","end_time":"2.830","alternatives":[{"confidence":"0.9831","content":"you"}],"type":"pronunciation"},{"alternatives":[{"content":"."}],"type":"punctuation"}]},"status":"COMPLETED"}
'
Code

It is always a good idea to execute a simple SQL SELECT statement. As seen in the output of the SQL query, text data is fetched with type equals to 1 and JSON data is fetched having value 5 in type column.

select * from openJSON(@json)
Code

execute SQL OpenJSON query on Amazon Transcribe service response

Executing SQL OpenJSON using WITH clausing including return data types and names will convert this row based output into a column based view just like SQL PIVOT queries.

select *
from openJSON(@json) with (
 jobName nvarchar(100),
 accountId varchar(30),
 results nvarchar(max),
 [status] varchar(30)
) as j
Code

As seen in following screenshot of the outcome, everything is OK except the results column.
There is no value returned for results, it is NULL.

SQL OpenJson query using WITH clause

But if you remember from the previous SQL OpenJSON query, the row with "key" column value "results" has a JSON data in its "value" column. Please note that the same row has "type" column with 5

Please keep in mind, for data that includes JSON data and has type 5 or 4 can be processed as follows with a slight change only by adding "as JSON" after data type declaration in WITH clause.
Here is the corrected OpenJSON SQL query

select *
from openJSON(@json) with (
 jobName nvarchar(100),
 accountId varchar(30),
 results nvarchar(max) as json,
 [status] varchar(30)
) as j
Code

And the output is as follows

SQL Server OpenJSON SQL syntax using WITH and AS JSON clause

After modifying SQL Select query to parse JSON string on SQL Server using OpenJSON, we have a new question.
How to parse the JSON string in "results" column and combine with other fields like jobName, etc.
Here comes the cross apply join type for our requirement.
Database developers can use a second OpenJSON in combination with CROSS APPLY as in below SQL code block

select *
from openJSON(@json) with (
 jobName nvarchar(100),
 accountId varchar(30),
 results nvarchar(max) as json,
 [status] varchar(30)
) as j
cross apply openJSON(results)
Code

As database developers will see on "key" column, the JSON string in "result" column has JSON objects with name "transcripts" and "items".
You can understand the JSON string on these columns from their data types indicated with 4.
Remember, data types with 4 and 5 can be queried with "CROSS APPLY"

SQL OpenJSON and CROSS APPLY commands

Before executing a Transact-SQL query to parse "results" data, let's convert above output by applying a pivot conversion.

select
 jobName, accountId, [status], r.*
from openJSON(@json) with (
 jobName nvarchar(100),
 accountId varchar(30),
 results nvarchar(max) as json,
 [status] varchar(30)
) as j
cross apply openJSON(results) with (
 transcripts nvarchar(max) as json,
 items nvarchar(max) as json
) as r
Code

As seen in the output, we have now transacript column and items column in the output columns list

parse JSON string using OpenJSON table function on SQL Server

SQL Server programmers will see that the output has still JSON data in query outputs.
The solution is same as above.
Using CROSS APPLY with OpenJSON and in output column list defined using WITH clause use AS JSON for Json string containing columns.

Following SQL Server OpenJSON query will return the JSON string response of Amazon Transcribe service call in tabular format.

select
 ROW_NUMBER() over (order by getdate()) as rn,
 jobName,
 accountId,
 [status],
 transcript,
 i.start_time,
 i.end_time,
 i.[type],
 a.confidence,
 a.content
from openJSON(@json) with (
 jobName nvarchar(100),
 accountId varchar(30),
 results nvarchar(max) as json,
 [status] varchar(30)
) as j
cross apply openJSON(results) with (
 transcripts nvarchar(max) as json,
 items nvarchar(max) as json
) as r
cross apply openJSON(transcripts) with (
 transcript nvarchar(max)
) as t
cross apply openJSON(items) with (
 start_time varchar(10),
 end_time varchar(10),
 alternatives nvarchar(max) as json,
 [type] varchar(20)
) as i
cross apply openJSON(alternatives) with (
 confidence decimal(10,5),
 content nvarchar(100)
) as a
Code

The JSON string output is converted into tabular format by executing above SQL OpenJSON query on a SQL Server 2017 database.

Parse JSON Response of Amazon Transcribe Transcription Job using SQL



SQL Server

SQL Server 2019 Installation
download SQL Server 2019
download SQL Server 2017
download SQL Server 2016
download SQL Server 2014
download SQL Server 2012
MacOS ve SQL Server 2019


Copyright © 2004 - 2021 Eralper YILMAZ. All rights reserved.