Redshift SQL Reference

ANALYZE
 - Specify a table name to anlayze a table. If you do not specify then it will analyze all the tables in the current database including persistent system catalog tables.
 - Redshift automatically analyzes the tabel when you use CREATE TABLE AS / SELECT INTO
 - By default, the COPY command performs an analysis after it loads data into an empty table.You
can force an analysis regardless of whether a table is empty by setting STATUPDATE ON.

 ANALYZE COMPRESSION
  - Run this to get recommendations for  column encoding schemas. It is generated by performing compression analysis.
  - You can use COMPROWS to specify number of rows to be sampled. It does sampling in all the node slices.
 - By default, Amazon Redshift runs a sample pass for the DISTKEY column and another sample pass for all of the other columns in the table.
 - If you want to generate statistics for a subset of columns, you can specify a comma-separated column list
 - To find out when ANALYZE commands were run, you can query STL_QUERY and SVL_STATEMENTTEXT and include a restriction on querytxt=padb_fetch_sample

CREATE TABLE
  - ENCODE - compression encoding for the column. RAW is default if no compression is specified.
  - DISTKEY
  - SORTKEY
  - PRIMARY KEY - Informational only. They will be used by the optimzer.
  - DISTSTYLE {EVEN|KEY|ALL} - specify the distribution style of data to compute nodes.

COPY

EXPLAIN [VERBOSE] query
 - Displays execution plan of the query.
 - Query is breaken down into discrete sequence of stesp and table operations
 - You can run explain plan for SELECT and DML. For DDL it will fail
 
 UNLOAD
 - To unload the results of a query or table into S3 files.
 - By default it unloads data in parallel to multiple files (based on number of slices in the cluster). If you specify PARALLEL OFF, hten it writes serially.

 VACCUM
 - Reclaims disk space occupied by deleted rows and resorts all rows.

Comments