Satellites store descriptive data with full history. Every time an attribute changes — a customer's phone number, an order's status, a product's price — a new row is inserted with a new timestamp. Nothing is ever updated or deleted.
What a Satellite Contains
- Hub/link hash key (FK) — which entity this data describes
- Hashdiff — MD5 of all descriptive columns, used to detect changes
- Descriptive columns — the actual attributes (name, status, amount, etc.)
- Load datetime — when this version was loaded
- Record source — which source provided this version
Satellite Models in This Pipeline
sat_customer_details — name, address, region, segment sat_customer_contact — phone, email, contact person sat_product — name, category, price, weight sat_order_status — status (new → confirmed → shipped → delivered) sat_order_financials — total amount, discount, tax sat_order_line — quantity, unit price, line total sat_invoice — amount, due date, status sat_payment — amount, payment date, method sat_warehouse — name, location, capacity sat_manager — name, department, hire date sat_contract — terms, start/end date, value
Change Detection with Hashdiff
On each load, the satellite compares the incoming hashdiff against the latest row for that hub key. If the hash differs, a new row is inserted. If it's the same, the row is skipped. This is equivalent to SCD Type 2 but handled automatically by the automate_dv sat macro.
Why Split Satellites?
Customers have two satellites: sat_customer_details and sat_customer_contact. This is intentional — contact info changes more frequently than demographic details. By splitting, we avoid creating unnecessary history rows when only one group of attributes changes.