Snowflake

Comparison to Redshift

Pros

Data loading

Snowflake's COPY INTO is much more flexible and powerful than Redshift's COPY command:

  • JSON parsing (can implement schema on read rather than just on load)
  • Loading small files: it seemed to deal better with large numbers of small files
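As a rough sketch of the schema-on-read approach: raw JSON can be loaded into a single VARIANT column and typed only when it is queried. The table `raw_events`, the stage `my_stage`, and the JSON field names are hypothetical, chosen purely for illustration.

```sql
-- Hypothetical landing table: one VARIANT column holds each JSON document as-is.
CREATE TABLE IF NOT EXISTS raw_events (payload VARIANT);

-- Load every JSON file under a (hypothetical) stage path; no target schema is needed up front.
COPY INTO raw_events
FROM @my_stage/events/
FILE_FORMAT = (TYPE = 'JSON')
ON_ERROR = 'CONTINUE';  -- skip rows that fail to parse instead of aborting the load

-- Schema on read: pick out fields and cast them at query time.
SELECT
    payload:user_id::NUMBER    AS user_id,
    payload:event_type::STRING AS event_type
FROM raw_events;
```

Because the types are applied in the SELECT rather than in the load, a new field in the source JSON does not require reloading the data.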

Scaling

The separation of storage and compute in the architecture is great!

  • S3 storage makes it much more cost effective to warehouse large, infrequently used datasets
  • No downtime when you need to rescale compute infrastructure
  • Enables separate compute for processing ELT jobs, so they don’t impact the performance users experience
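The points above might look like this in practice, as a minimal sketch. The warehouse names `elt_wh` and `bi_wh`, the sizes, and the suspend timeouts are assumptions for illustration only.

```sql
-- Hypothetical warehouse dedicated to ELT jobs.
CREATE WAREHOUSE IF NOT EXISTS elt_wh
    WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND = 60      -- suspend after 60 idle seconds to save credits
    AUTO_RESUME = TRUE;

-- Hypothetical warehouse for analysts and BI tools, isolated from ELT load.
CREATE WAREHOUSE IF NOT EXISTS bi_wh
    WAREHOUSE_SIZE = 'SMALL'
    AUTO_SUSPEND = 300
    AUTO_RESUME = TRUE;

-- Resize the ELT warehouse in place, without taking it offline.
ALTER WAREHOUSE elt_wh SET WAREHOUSE_SIZE = 'LARGE';

-- Each session picks the compute it runs on.
USE WAREHOUSE elt_wh;
```

Both warehouses read the same stored tables, so a heavy ELT run on `elt_wh` does not compete for compute with queries on `bi_wh`.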

Semi-structured data

Has first-class support for semi-structured data, e.g. JSON, arrays, and mappings
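For example, a nested array inside a JSON document can be expanded into rows directly in SQL. This sketch assumes a hypothetical table `orders` with a VARIANT column `payload` that contains an `items` array; all field names are illustrative.

```sql
-- One output row per element of the (hypothetical) payload:items array.
SELECT
    o.payload:order_id::NUMBER    AS order_id,
    item.value:sku::STRING        AS sku,
    item.value:quantity::NUMBER   AS quantity
FROM orders AS o,
     LATERAL FLATTEN(input => o.payload:items) AS item;
```

The `:` path syntax navigates into the document and `::` casts the extracted value to a SQL type.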

Query performance visibility

You can check on the status of a query while it’s running to get a sense of how far it has progressed
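Besides the web interface, one way to see which queries are still in flight is the `QUERY_HISTORY` table function. This is only a sketch: it reports status and elapsed time rather than step-by-step progress, and the limit of 20 is an arbitrary choice.

```sql
-- List recent queries that are still running, newest first.
SELECT
    query_id,
    query_text,
    execution_status,
    start_time,
    total_elapsed_time   -- in milliseconds
FROM TABLE(information_schema.query_history(result_limit => 20))
WHERE execution_status = 'RUNNING'
ORDER BY start_time DESC;
```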

Other

  • The pre-computed metadata makes lots of basic queries lightning fast
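Queries of the following shape are the kind that can often be answered from stored table metadata instead of a full scan. The table `events` and column `created_at` are hypothetical. Whether a given query is served from metadata depends on the table and column types, so it is worth confirming in the query profile.

```sql
-- Row count for a whole table.
SELECT COUNT(*) FROM events;

-- Smallest and largest value of a column.
SELECT MIN(created_at), MAX(created_at) FROM events;
```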

Cons

  • Lack of simple integration with PyCharm’s (DataGrip) Database section (now exists)
  • Tableau did not work with the SSO login
  • Doesn’t support subqueries nested within the SELECT list, e.g.
SELECT
    col_1 IN (SELECT col_1 FROM table_2)
FROM table_1
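
One possible workaround, sketched here using the table and column names from the example above, is to move the membership test into a LEFT JOIN. The aliases `t1`, `t2` and the output column `in_table_2` are illustrative.

```sql
-- Flag whether each table_1.col_1 value also appears in table_2.
SELECT
    t2.col_1 IS NOT NULL AS in_table_2
FROM table_1 AS t1
LEFT JOIN (
    -- DISTINCT prevents duplicate matches from multiplying table_1 rows.
    SELECT DISTINCT col_1 FROM table_2
) AS t2
    ON t1.col_1 = t2.col_1;
```

NULL handling differs slightly from IN: where IN can yield NULL (for instance when `table_1.col_1` is NULL), this version yields FALSE.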