SQL Server administration and T-SQL development, Web Programming with ASP.NET, HTML5 and Javascript, Windows Phone 8 app development, SAP Smartforms and ABAP Programming, Windows 7, Visual Studio and MS Office software
Development resources, articles, tutorials, code samples, tools and downloads for AWS Amazon Web Services, Redshift, AWS Lambda Functions, S3 Buckets, VPC, EC2, IAM

Concatenate String Values on Amazon Redshift using ListAgg Function

In this Amazon Redshift Data Warehouse tutorial for SQL programmers I want to give SQL sample query codes showing how to concatenate column table values using SQL ListAgg functions. To concatenate string values on Amazon Redshift database, SQL developers can use ListAgg SQL aggregate function with several arguments like comma separate character for fetching a CSV list, or WithIn Group Order By clause for sorting values within the result list, or using a Group By clause as well as using ListAgg function with Distinct clause.
In this SQL tutorial for Amazon Redshift database developers and administrators, I will provide sample SQL codes with different usages of ListAgg string aggregation function.

Assume that in our Amazon Redshift database we have created a SQL table using following CREATE TABLE DDL statement

 catgroup VARCHAR(10) ENCODE lzo,
 catname VARCHAR(10) ENCODE lzo,
 catdesc VARCHAR(50) ENCODE lzo

And by executing a SQL SELECT query, we can display the data inserted into this category table to have a better understand of the data.

select * from category;

Amazon Redshift database sample table data

Now using ListAgg function, I want to show SQL developers to concatenate category names field catname in a comma separated format as string data type variable
Here is the SQL code with ListAgg function which we will execute

select listagg(catname, ', ') from category;

As SQL code builders will see, the category names under catname column are aggregated and concatenated as a single string and separated by a comma character we provided as the second argument of the Amazon Redshift SQL ListAgg aggregate function.

Redshift SQL Aggregate function ListAgg sample code

Here is the returning result:
MLS, Plays, Pop, Opera, NFL, Musicals, NHL, NBA, Jazz, MLB, Classical

Let's order the list so that the items will be listed alphabetically.
The related SQL query that will sort the items in the return list includes WITHIN GROUP (Group By ...) clause.

select listagg(catname, ', ') within group(order by catname) from category;

Now in the return list, the category names are sorted in alphabetical order with the help of "Within Group (Order By catname)" SQL clause

In following screenshot, it is seen that the items are sorted by the Order By clause:
Classical, Jazz, MLB, MLS, Musicals, NBA, NFL, NHL, Opera, Plays, Pop

Redshift ListAgg aggregate SQL function Within Group Order By clause

How to concatatenate these characted data typed column values in such a manner that we will put them in different lists according to their category group values.
Each table row so each category includes a category group attribute by "catgroup" field.
Following SQL query will show database programmers to concatenata catname string values grouped by category group attribute catgroup

select catgroup, listagg(catname, ', ') from category group by catgroup;

The execution of the SQL aggregation function ListAgg() with Group By clause results as follows on our sample Amazon Redshift database.

SQL aggregation function ListAgg with Group By clause on Redshift database

SQL database developers will see now a separate list of concatenated values for each category group value.

Of course, it is also possible to sort the aggregated items within each group alphebatically as seen in below SQL query

select catgroup, listagg(catname, ', ')
within group(order by catname)
from category
group by catgroup
order by catgroup;

Now we have a better sorted and grouped concatenated column lists using Redshift SQL aggregate function ListAgg as in below SQL execution result

Redshift ListAgg aggregate function with Group By and Order By SQL

The return data type with Redshift ListAgg SQL aggregation function is Varchar(Max).
The maximum number of characters allowed with varchar(max) data type on an Amazon Redshift is 65535 characters which is 64K-1 characters.

So if the return list of concatenated string values has a longer list than 65535 characters, ListAgg() SQL function will raise following error message:
SQL Error [500310] [XX000]: [Amazon](500310) Invalid operation: Result size exceeds LISTAGG limit

To visualize this error, I want to execute a SQL Select query with ListAgg() function in use but this time on a Redshift database table with thousands of rows

select listagg(country, ', ') from covid19;

The error indicating that the result size exceeds maximum number of characters allowed

Amazon Redshift SQL error: result size exceeds ListAgg limit

SQL Error [500310] [XX000]: [Amazon](500310) Invalid operation: Result size exceeds LISTAGG limit
error: Result size exceeds LISTAGG limit
code: 8001
context: LISTAGG limit: 65535
query: 3458881
location: _rds_bin_padb.1.0.19097_data_exec_113_2080259697_9ef..3_0.cpp:158
process: query0_236_3458881 [pid=16437]

In this case, the return list includes repeating values of country names.
If it is OK for you to include only distinct values of this column used with ListAgg SQL function for concatenation, then SQL programmer can convert ListAgg SQL query as follows with ListAgg(Distict ...) format

select listagg(distinct country, ', ') from covid19;

Of course, Redshift data warehouse SQL developers can also sort the country attributes within the result list

select listagg(distinct country, ', ') within group(order by country) from covid19;

I hope SQL developers building data warehouse solutions on Amazon Redshift will find these ListAgg functions sample codes useful.


Copyright © 2004 - 2021 Eralper YILMAZ. All rights reserved.