Snowflake

Comparison to Redshift

Pros

Data loading

Snowflake’s COPY INTO is much more flexible and powerful than Redshift’s COPY command:

  • JSON parsing (can implement schema on read rather than just on load)
  • Loading small files: it seemed to deal better with large numbers of small files
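
The schema-on-read idea mentioned above can be sketched locally. This is only an illustration, assuming Python's standard-library sqlite3 as a stand-in for the warehouse: the table `raw_events`, the column `payload`, the helper `json_get`, and the sample records are all hypothetical. In Snowflake itself the raw documents would typically sit in a VARIANT column and be queried with path expressions instead.

```python
import json
import sqlite3

# Hypothetical raw JSON records, as they might arrive in staged files.
RAW_LINES = [
    '{"user": {"id": "u1"}, "event": "login"}',
    '{"user": {"id": "u2"}, "event": "purchase", "amount": 9.5}',
]


def json_get(doc, path):
    """Walk a dotted path through a JSON document; return None if missing."""
    node = json.loads(doc)
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    # Nested values are handed back as JSON text so SQLite can store them.
    if isinstance(node, (dict, list)):
        return json.dumps(node)
    return node


def load_and_query(lines):
    """Load raw documents untouched, then apply a schema only at query time."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("json_get", 2, json_get)
    # Load step: one untyped column, no schema imposed on the documents.
    conn.execute("CREATE TABLE raw_events (payload TEXT)")
    conn.executemany(
        "INSERT INTO raw_events VALUES (?)", [(line,) for line in lines]
    )
    # Read step: the "schema" is whatever fields this query asks for.
    rows = conn.execute(
        """
        SELECT json_get(payload, 'user.id'),
               json_get(payload, 'event'),
               json_get(payload, 'amount')
        FROM raw_events
        ORDER BY rowid
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in load_and_query(RAW_LINES):
        print(row)
```

Because nothing is parsed at load time, a record that lacks a field (here `amount`) still loads and simply yields NULL when that field is read.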

Scaling

The architecture’s separation of storage and compute is great!

  • S3 storage makes it much more cost-effective to warehouse large, infrequently used datasets
  • No downtime when you need to rescale some compute infra
  • Enables separate compute for ELT jobs, so they don’t impact the performance users experience

Semi-structured data

Snowflake has first-class support for semi-structured data, e.g. JSON, arrays, and mappings (objects)

Query performance visibility

You can check on the status of a query while it’s running to get a sense of how far it has progressed

Other

  • The pre-computed metadata makes lots of basic queries lightning fast

Cons

  • Lack of a simple integration with PyCharm’s (DataGrip) Database section (this now exists)
  • Tableau did not work with the SSO login
  • Doesn’t support nested subqueries within the SELECT list, e.g.
SELECT
col_1 IN (SELECT col_1 FROM table_2)
FROM table_1
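
If a query of this shape is rejected, one possible rewrite moves the subquery out of the SELECT list and into a LEFT JOIN. The sketch below is an assumption-laden illustration rather than a tested Snowflake recipe: it runs on Python's standard-library sqlite3 (which accepts both forms) purely to show that the two queries return the same flags, and the sample rows and the alias `in_table_2` are made up.

```python
import sqlite3

# Original shape: membership test as a subquery inside the SELECT list.
IN_QUERY = """
    SELECT col_1, col_1 IN (SELECT col_1 FROM table_2) AS in_table_2
    FROM table_1
    ORDER BY col_1
"""

# Rewrite: de-duplicate table_2 first so the join cannot multiply rows,
# then treat "a match was found" as the membership flag.
JOIN_QUERY = """
    SELECT t1.col_1, t2.col_1 IS NOT NULL AS in_table_2
    FROM table_1 AS t1
    LEFT JOIN (SELECT DISTINCT col_1 FROM table_2) AS t2
        ON t1.col_1 = t2.col_1
    ORDER BY t1.col_1
"""


def run(query):
    """Run a query against small, made-up sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table_1 (col_1 INTEGER);
        CREATE TABLE table_2 (col_1 INTEGER);
        INSERT INTO table_1 VALUES (1), (2), (3);
        INSERT INTO table_2 VALUES (2), (2), (3);
        """
    )
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run(IN_QUERY))
    print(run(JOIN_QUERY))
```

One caveat on the design: the two forms can differ when NULLs are involved, since IN may yield NULL where the join-based flag yields false, so the rewrite is only a drop-in replacement when col_1 is not nullable.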