Debugging Kubernetes

23 Jul 2024

Helpful tips for debugging applications running in k8s

build/k8s.png

Handling multiple errors in Rust iterator adapters

17 Dec 2023

Approaches for handling multiple errors within iterator adapters

build/rust.png

Better FastAPI Background Jobs

29 Aug 2022

A more featureful background task runner for Async apps like FastAPI or Discord bots

build/fastapi_logo.png

Useful Linux Examples

21 Dec 2021

A plethora of helpful tips for working in Linux

build/bash_logo.png
Continue to all blog posts

Star Schemas with Tableau Relationships

TDLR: Tableau’s Relational data layer is a great way to easily expose Star Schema Datasets

Exposing Data Warehouse content as simple Tableau Data Source has been a key part of our strategy in enabling self serve analytics.

Until version 2020.2, Tableau Data Source always effectively boiled down to a flat table of data. In order to reduce the amount of logic being stored in Tableau Data Source/Workbooks (not version controlled, or code reviewed) we use to create the flat tables in the database as views or materialised as tables. This though had the downside of high code duplication in exposing the same common dimensions (e.g. customer account) in almost every dataset, and in the case of tables, heavily increased the space used for the same data. Additionally, when new fields get added to the dimensions they need to be added to every view too.

As with many Data Warehouses we have a large body of data stored in star schemas which can now be more easily directly exposed. Just bring the tables into the relational layer, and relate the Surrogate keys

Pros

Increased performance

There is less need to extract the data to make it performant Tableau can write smarter queries against the database which don’t need to materialise a big join to compute a simple aggregate

Less space required

Reduced data duplication in materialised tables in the Data Warehouse, or in Tableau Extracts. Less disk space used by not materialising flat tables, instead running smart queries and performing joins at a later stage post aggregation of fact tables metrics.

Easier Maintenance

As new fields are added to dimension, they will automatically become available

Cons

Some Constraints

Datatypes of relationship link columns need to be the same

Links