Snowflake
Comparison to Redshift
Pros
Data loading
Snowflake's COPY INTO is much more flexible and powerful than Redshift's COPY command:
* JSON parsing (can implement schema on read rather than just on load)
* Loading small files (it seemed to cope better with large numbers of small files)
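As a rough sketch of the schema-on-read point above: the raw JSON is loaded into a single VARIANT column and typed only at query time. The stage `@my_stage`, the table `raw_events` and the JSON field names are made up for illustration.

```sql
-- Hypothetical landing table: one VARIANT column holds each JSON document as-is.
CREATE OR REPLACE TABLE raw_events (payload VARIANT);

-- Load every JSON file under the (assumed) stage path; no schema is needed at load time.
COPY INTO raw_events
  FROM @my_stage/events/
  FILE_FORMAT = (TYPE = 'JSON');

-- Schema on read: pick fields and cast them when querying.
SELECT
  payload:user_id::NUMBER    AS user_id,
  payload:event_type::STRING AS event_type
FROM raw_events;
```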
Scaling
Separation of storage and compute architecture is great!
- S3 storage makes it much more cost-effective to warehouse large, infrequently used datasets
- No downtime when you need to resize compute
- Enables separate compute for ELT jobs, so they don’t impact the performance users experience
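A minimal sketch of what the points above look like in practice. The warehouse names `reporting_wh` and `elt_wh` are hypothetical, and the sizes and timeout are arbitrary.

```sql
-- Resize an existing (hypothetical) warehouse in place.
ALTER WAREHOUSE reporting_wh SET WAREHOUSE_SIZE = 'LARGE';

-- Give ELT jobs their own compute so they don't compete with user queries.
CREATE WAREHOUSE IF NOT EXISTS elt_wh
  WITH WAREHOUSE_SIZE = 'MEDIUM'
       AUTO_SUSPEND = 300      -- suspend after 5 idle minutes
       AUTO_RESUME = TRUE;

-- In the ELT session, switch to the dedicated warehouse.
USE WAREHOUSE elt_wh;
```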
Semi-structured data
Has first-class support for semi-structured data, e.g. JSON, arrays, mappings
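For example, a nested array inside a JSON document can be unpacked into rows with FLATTEN. This is only a sketch: the table `raw_orders`, its VARIANT column `payload` and the field names are assumptions.

```sql
-- One output row per element of the (assumed) "items" array in each order document.
SELECT
  o.payload:order_id::NUMBER AS order_id,
  item.value:sku::STRING     AS sku,
  item.value:quantity::NUMBER AS quantity
FROM raw_orders o,
  LATERAL FLATTEN(input => o.payload:items) item;
```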
Query performance visibility
You can check on the status of a query while it’s running to get a sense of how far through it’s got
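Alongside the progress view in the UI, running queries can also be listed from SQL. A sketch using the INFORMATION_SCHEMA.QUERY_HISTORY table function; note that this shows status and elapsed time only, not how far through the plan a query is.

```sql
-- List queries that are currently executing, oldest first.
SELECT
  query_id,
  query_text,
  execution_status,
  start_time,
  total_elapsed_time  -- milliseconds
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE execution_status = 'RUNNING'
ORDER BY start_time;
```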
Other
- The pre-computed metadata make lots of basic queries lightning fast
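The kind of basic query this refers to is, as far as I understand it, simple aggregates that can be answered from table metadata rather than by scanning data. `big_table` and `created_at` are placeholder names.

```sql
-- Row count for a whole table.
SELECT COUNT(*) FROM big_table;

-- Minimum and maximum of a column, with no filter applied.
SELECT MIN(created_at), MAX(created_at) FROM big_table;
```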
Cons
- Lack of simple integration with PyCharm's (DataGrip) Database section (now exists)
- Tableau did not work with the SSO login
- Doesn’t support nested subqueries within the SELECT, e.g.
SELECT
col_1 IN (SELECT col_1 FROM table_2)
FROM table_1
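One possible workaround is to move the subquery out of the SELECT list and into a join. This is a sketch, not necessarily the best rewrite, and it differs from IN when NULLs are involved: the join version returns FALSE where IN would return NULL.

```sql
-- Same membership check, expressed as a LEFT JOIN against the distinct values.
SELECT
  t1.col_1,
  t2.col_1 IS NOT NULL AS col_1_in_table_2
FROM table_1 t1
LEFT JOIN (SELECT DISTINCT col_1 FROM table_2) t2
  ON t1.col_1 = t2.col_1;
```

The DISTINCT keeps the join from duplicating rows of table_1 when table_2 contains repeated values.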