Redshift’s UNLOAD command is a great little tool that complements Redshift’s COPY command, by doing the exact reverse function. While COPY grabs data from an Amazon s3 bucket and puts into a Redshift table for you, UNLOAD takes the result of a query, and stores the data in Amazon s3.
Why would you want to store the result of a query in s3?
If you ever need your data to be consumed by different applications, or if you want to analyze your data in ways that you can’t do in Redshift (e.g. machine learning), then it makes sense to export it.
How do you use it?
Firstly, you need to determine the data you want to unload into s3. Play around with your SELECT statement until you get it right. Once you have that, here are the basic steps:
UNLOAD(“your select query goes here”)
TO ‘the full path of your Amazon s3 bucket goes here’
AUTHORIZATION (this uses the same authorization template as the COPY command)
What do you need to be aware of when using it?
It’s important to know that max file size of the result sent to s3 can only be 6.2 GB. If your result is larger than that, you’ll either need to trim it down somehow or break it up into multiple SELECTs. More info can be found in AWS’s official documentation.