SQL Server administration and T-SQL development, Web Programming with ASP.NET, HTML5 and Javascript, Windows Phone 8 app development, SAP Smartforms and ABAP Programming, Windows 7, Visual Studio and MS Office software Amazon Web Services AWS Tutorials and Guides
Development resources, articles, tutorials, code samples and tools and downloads for ASP.Net, SQL Server, R Script, Windows, Windows Phone, AWS, SAP HANA and ABAP, like SAP UI5, Screen Personas, etc.




Load Data from Compressed Text File into Amazon Redshift with COPY Command


In Amazon S3 bucket folders cloud users can store compressed data to gain from storage space. As many AWS services Amazon Redshift SQL COPY command supports to load data from compressed text files. In this tutorial, I want to share how compressed text files including delimited or fixed length data can be easily imported into Amazon Redshift database tables.

In this tutorial, I will use sample Amazon Redshift database table sales in tickit schema. For more on Amazon Redshift sample database, please check referenced tutorial.

Please note that AWS supports load of compressed files using following options gzip, lzop, or bzip2.
For details, check official documentation for loading compressed data files from Amazon S3.

This is very important, otherwise during COPY command execution to load data into Redshift table some errors may occur as in this sample

copy tickit.sales
from 's3://kodyaz/sales_tab.zip'
iam_role 'arn:aws:iam::1234567890:role/Amazon-Redshift-Copy-Command-Role'
timeformat as 'MM/DD/YYYY HH24:MI:SS'
delimiter '\t' region 'eu-central-1' gzip;

[Amazon](500310) Invalid operation: S3CurlException: Failed writing body (0 != 16347)
error: S3CurlException: Failed writing body (0 != 16347)
Cause: Failed to inflateinvalid or incomplete deflate data. zlib
error code: -3, CurlError 23, multiCurlError 0, CanRetry 0, UserError 0 code: 9001
context: S3 key being read : s3://kodyaz/sales_tab.zip
query: 1260803
location: table_s3_scanner.cpp:373 process: query0_119_1260803 [pid=9448];

This error occurred since the text data file was not compressed with "gzip" compression method.
In such cases where the compressed file or zipped file format does not match with the format passed to the Amazon Redshift COPY command, these error will occur.

On the other hand, if you download following file sales_tab.bz2 which is compressed using the BZIP2 compression method, by modifying the COPY command for the compression method parameter, we can easily load data from compressed text files into Amazon Redshift database table from AWS S3 buckets.

archiving type to compress text file for Amazon Redshift COPY command

copy tickit.sales
from 's3://kodyaz/sales_tab.bz2'
iam_role 'arn:aws:iam::1234567890:role/Amazon-Redshift-Copy-Command-Role'
timeformat as 'MM/DD/YYYY HH24:MI:SS'
delimiter '\t' region 'eu-central-1' bzip2;

This COPY command will be exexcuted successfully loading data from compressed text file stored on Amazon S3 bucket folders into Redshift database table sales.

As last note please visit and have a look at following resources to load data from SQL Server to Amazon Redshift database using Copy command, to create sample database on Amazon Redshift and load sample data with COPY command and lastly for common COPY command errors on Amazon Redshift database.





AWS Tutorials

AWS Tutorials

AWS








Copyright © 2004 - 2019 Eralper YILMAZ. All rights reserved.
Community Server by Telligent Systems