Redshift SQL Reference

ANALYZE
- Specify a table name to anlayze a table. If you do not specify then it will analyze all the tables in the current database including persistent system catalog tables.
- Redshift automatically analyzes the tabel when you use CREATE TABLE AS / SELECT INTO
- By default, the COPY command performs an analysis after it loads data into an empty table.You
can force an analysis regardless of whether a table is empty by setting STATUPDATE ON.

ANALYZE COMPRESSION
- Run this to get recommendations for column encoding schemas. It is generated by performing compression analysis.
- You can use COMPROWS to specify number of rows to be sampled. It does sampling in all the node slices.
- By default, Amazon Redshift runs a sample pass for the DISTKEY column and another sample pass for all of the other columns in the table.
- If you want to generate statistics for a subset of columns, you can specify a comma-separated column list
- To find out when ANALYZE commands were run, you can query STL_QUERY and SVL_STATEMENTTEXT and include a restriction on querytxt=padb_fetch_sample

CREATE TABLE
- ENCODE - compression encoding for the column. RAW is default if no compression is specified.
- DISTKEY
- SORTKEY
- PRIMARY KEY - Informational only. They will be used by the optimzer.
- DISTSTYLE {EVEN|KEY|ALL} - specify the distribution style of data to compute nodes.

COPY

EXPLAIN [VERBOSE] query
- Displays execution plan of the query.
- Query is breaken down into discrete sequence of stesp and table operations
- You can run explain plan for SELECT and DML. For DDL it will fail

UNLOAD
- To unload the results of a query or table into S3 files.
- By default it unloads data in parallel to multiple files (based on number of slices in the cluster). If you specify PARALLEL OFF, hten it writes serially.

VACCUM
- Reclaims disk space occupied by deleted rows and resorts all rows.

Datawarehousing

Search This Blog

Redshift SQL Reference

Comments