Skip to content

Data Layers

This document explains the data pipeline architecture used in the EWE Numera platform.

Pipeline Overview

Data flows through several layers from source systems to the customer-facing dimensional model:

Pipeline Overview

Layer Details

Raw Data Layer (SAT)

Purpose: Capture full history of source system data.

Characteristics: - Materialization: Incremental tables - Pattern: Satellite tables with full change history - Source: Endur UDSRs (User-Defined Stored Results), clearing bank files - Key Feature: Preserves all versions for audit trail

Naming Convention: sat_<source>_<entity>

Examples: - sat_endur_sim_result_udsr_credit_exposures - sat_endur_sim_result_udsr_party_data_all - sat_endur_sim_result_udsr_collateral_data_all

Key Columns: - sim_run_id, sim_version, scenario_id - Simulation identifiers - _ingested_at - When data arrived in Snowflake

Staging Layer (STG)

Purpose: Apply schema and data types to raw data.

Characteristics: - Materialization: Views - Pattern: Select from raw with explicit column typing - Benefit: Schema changes isolated to this layer

Naming Convention: stg_<source>_<entity>

Examples: - stg_endur_sim_result_udsr_credit_exposures - stg_abn_open_position - stg_bnp_open_position

Transformations Applied: - Column selection and renaming - Data type casting - NULL handling - Basic validation

Point-in-Time Layer (PIT)

Purpose: Provide current state snapshots for each metric date.

Characteristics: - Materialization: Incremental tables - Pattern: Select latest version per business key per date - Strategy: Delete+insert by metric_date partition

Naming Convention: pit_<source>_<entity>

Examples: - pit_endur_credit_exposure - pit_endur_credit_exposure_monthly - pit_clearing_bank_statement_open_position

Key Logic:

-- Select latest simulation run for each metric date
SELECT *
FROM sat_endur_sim_result_udsr_credit_exposures
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY metric_date, exp_line_id
    ORDER BY sim_run_id DESC, sim_version DESC
) = 1

Mapping Layer (MAP)

Purpose: Create cross-reference mappings between entities.

Characteristics: - Materialization: Incremental tables - Pattern: Join multiple sources to create unified keys

Naming Convention: map_<source>_<mapping_purpose>

Examples: - map_endur_exposure_party_attribute - Links exposures to party dimension

Key Function: Creates dim_party_external_bu_le_attribute_key for dimensional joins.

Dimensional Layer (DIM/FCT)

Purpose: Customer-facing star schema for BI tools.

Characteristics: - Materialization: Secure views - Pattern: Kimball-style star schema - Access: NUMERA_DIM_MODEL schema

Naming Conventions: - dim_<entity> - Dimension tables (descriptive attributes) - fct_<entity> - Fact tables (measurable events)

Key Features: - Surrogate keys (dim_*_id) - Placeholder rows (ID = 0 for unknown) - Relationship tests between facts and dimensions - Secure views for row-level security

Layer Summary

Layer Prefix Materialization Schema Purpose
Raw sat_ Incremental ENDUR Full history
Staging stg_ View ENDUR Apply schema
PIT pit_ Incremental ENDUR Current state
Mapping map_ Incremental ENDUR Cross-references
Dimension dim_ Secure View NUMERA_DIM_MODEL Entity attributes
Fact fct_ Secure View NUMERA_DIM_MODEL Business metrics

Data Freshness

Layer Update Frequency Latency
Raw (SAT) Continuous (Snowpipe) Near real-time
Staging (STG) Real-time (views) None
PIT On dbt run Minutes
Mapping On dbt run Minutes
Dimensional On dbt run Minutes

Schema Organization

Schema Purpose Access
ENDUR Staging, history, PIT, mapping models Transformer roles
NUMERA_DIM_MODEL Customer-facing dimensional models Reporter roles
DATA_QUALITY Data quality test results Developer roles
MONITORING Operational monitoring All roles

EWE-Specific Extensions

Credit Exposure Flow

Credit Exposure Flow

Clearing Statement Flow

Clearing Statement Flow

Power Volume Flow

Power Volume Flow

Floating Power Volume Flow

For index-linked (PWR Index) trades with associated index prices:

Floating Power Volume Flow