Testing Reference¶
The nexus package includes comprehensive data quality tests to ensure ID
uniqueness, data integrity, and proper relationships between models. All tests
are defined in models/nexus-models/nexus.yml.
0. Source Testing (Strongly Recommended)¶
Before nexus processing begins, create comprehensive tests for your source models. Source tests are your first line of defense against data quality issues.
Why Source Tests Matter¶
- Early Detection: Catch problems before they propagate through nexus processing
- Pipeline Reliability: Prevent downstream failures in identity resolution
- Data Quality Assurance: Ensure IDs are unique and required fields are populated
Essential Source Tests¶
Create tests for your source models following this pattern:
# models/sources/your_source/your_source.yml
version: 2
models:
  - name: your_source_events
    tests:
      - unique:
          column_name: event_id
          config:
            severity: error
    columns:
      - name: event_id
        tests:
          - not_null:
              config:
                severity: error
          - dbt_utils.expression_is_true:
              expression: "like 'evt_%'"
              config:
                severity: warn
  - name: your_source_person_identifiers
    tests:
      - unique:
          column_name: person_identifier_id
          config:
            severity: error
    columns:
      - name: person_identifier_id
        tests:
          - not_null:
              config:
                severity: error
          - dbt_utils.expression_is_true:
              expression: "like 'per_idfr_%'"
              config:
                severity: warn
Running Source Tests¶
# Test all models in a specific source
dbt test --select models/sources/segment/
# Test and build everything in a source folder
dbt build --select models/sources/segment/
Complete Guide: See Source Testing Best Practices for detailed examples and patterns.
1. Test Categories¶
Primary Key Tests¶
- Uniqueness: Ensures no duplicate IDs across all records
- Not Null: Ensures all ID fields have values
Composite Key Tests¶
- Multi-column uniqueness: Validates unique combinations across multiple fields
- Edge relationship integrity: Ensures proper identifier connections
Data Integrity Tests¶
- Foreign key relationships: Validates references between models
- Business rule compliance: Ensures data follows expected patterns
- ID prefix validation: Ensures all ID columns follow the expected naming convention from the create_nexus_id macro
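The categories above map onto generic dbt tests. As a minimal sketch rather than the package's actual definitions, a composite-key check can use dbt_utils.unique_combination_of_columns and a foreign-key check can use dbt's built-in relationships test. The model names are borrowed from this reference for illustration; the columns event_id, person_id, and role on nexus_person_participants are assumptions, and only person_id on nexus_persons is documented here.

```yaml
version: 2

models:
  - name: nexus_person_participants
    tests:
      # Composite key: one row per event, person, and role (assumed columns)
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - event_id
            - person_id
            - role
          config:
            severity: error
    columns:
      - name: person_id  # assumed foreign key column
        tests:
          # Foreign key: every non-null person_id must exist in nexus_persons
          - relationships:
              to: ref('nexus_persons')
              field: person_id
              config:
                severity: warn
```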
2. Event-Level Tests¶
nexus_events¶
Purpose: Validates the unified events table from all enabled sources.
tests:
  - unique:
      column_name: event_id
      config:
        severity: error
columns:
  - name: event_id
    tests:
      - not_null:
          config:
            severity: error
      - dbt_utils.expression_is_true:
          expression: "like 'evt_%'"
          config:
            severity: warn
  - name: occurred_at
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each event has a unique event_id
- No events are missing IDs or timestamps
- Event IDs follow the expected evt_ prefix pattern (warning level)
- Events from all sources (Gmail, Google Calendar, Notion) are properly unified
Common failures:
- Duplicate event IDs when source models generate non-unique IDs
- Event IDs not following the expected evt_ prefix pattern
2.1. ID Prefix Validation Tests¶
All nexus models include ID prefix validation tests to ensure consistency with
the create_nexus_id macro. These tests use dbt_utils.expression_is_true with
warning severity to validate that ID columns start with the expected
prefixes.
Expected ID Prefixes¶
Entity Type | Expected Prefix | Example ID |
---|---|---|
Events | evt_ | evt_abc123... |
Persons | per_ | per_def456... |
Groups | grp_ | grp_ghi789... |
Memberships | mem_ | mem_jkl012... |
States | st_ | st_mno345... |
Person Identifiers | per_idfr_ | per_idfr_pqr678... |
Group Identifiers | grp_idfr_ | grp_idfr_stu901... |
Person Traits | per_tr_ | per_tr_vwx234... |
Group Traits | grp_tr_ | grp_tr_yza567... |
Person Participants | per_prt_ | per_prt_bcd890... |
Group Participants | grp_prt_ | grp_prt_efg123... |
Person Edges | per_edg_ | per_edg_hij456... |
Group Edges | grp_edg_ | grp_edg_klm789... |
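One caveat with these patterns: in SQL LIKE, the underscore is itself a wildcard that matches any single character, so like 'evt_%' also accepts a value such as evtX123. If a literal underscore must be enforced, one option is a model-level test that compares the leading characters directly. This is a sketch, not the package's configuration; it is shown on a source model in your own project and assumes your warehouse provides substr() with 1-based positions.

```yaml
version: 2

models:
  - name: your_source_events
    tests:
      # Model-level form: the expression is used as written,
      # so the column is named explicitly inside it
      - dbt_utils.expression_is_true:
          expression: "substr(event_id, 1, 4) = 'evt_'"
          config:
            severity: warn
```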
Test Configuration¶
Why warning severity?
- Allows builds to continue even with prefix violations
- Provides visibility into naming convention compliance
- Enables gradual adoption of naming standards
Common prefix violations:
- Manual ID generation that bypasses the create_nexus_id macro
- Source data with unexpected ID formats
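For gradual adoption, dbt's warn_if and error_if configs can keep a prefix test at warning level while violations are rare and escalate it once they pass a threshold. The sketch below applies this to a source model in your own project; the threshold of 100 failing rows is a hypothetical value, not a package default.

```yaml
version: 2

models:
  - name: your_source_person_identifiers
    columns:
      - name: person_identifier_id
        tests:
          - dbt_utils.expression_is_true:
              expression: "like 'per_idfr_%'"
              config:
                severity: error     # error_if is only evaluated when severity is error
                warn_if: ">= 1"     # any violation produces a warning
                error_if: ">= 100"  # hypothetical threshold: fail once violations are widespread
```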
3. Identifier-Level Tests¶
nexus_person_identifiers¶
Purpose: Validates person identifiers from all sources have unique IDs.
tests:
  - unique:
      column_name: person_identifier_id
      config:
        severity: error
columns:
  - name: person_identifier_id
    tests:
      - not_null:
          config:
            severity: error
      - dbt_utils.expression_is_true:
          expression: "like 'per_idfr_%'"
          config:
            severity: warn
What it tests:
- Each person identifier record has a unique ID
- No person identifiers are missing IDs
- Person identifier IDs follow the expected per_idfr_ prefix pattern (warning level)
- Person identifiers from Gmail, Google Calendar, and Notion are properly deduplicated
Common failures:
- Same person appears multiple times with different roles but same ID
- Duplicate source data not properly deduplicated
- Missing role or timestamp in ID generation
nexus_group_identifiers¶
Purpose: Validates group identifiers (domains, organizations) have unique IDs.
tests:
  - unique:
      column_name: group_identifier_id
      config:
        severity: error
columns:
  - name: group_identifier_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each group identifier record has a unique ID
- No group identifiers are missing IDs
- Group identifiers properly deduplicated when multiple people from same domain attend same event
Common failures:
- Multiple employees from same company create duplicate group records
- Missing deduplication in source models
- Role not included in ID generation
nexus_membership_identifiers¶
Purpose: Validates person-to-group membership relationships have unique IDs.
tests:
  - unique:
      column_name: membership_identifier_id
      config:
        severity: error
columns:
  - name: membership_identifier_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each membership relationship has a unique ID
- No memberships are missing IDs
- Same person can belong to multiple groups with different roles
Common failures:
- Same person-group combination with different roles gets same ID
- Missing role in membership ID generation
4. Trait-Level Tests¶
nexus_person_traits¶
Purpose: Validates person traits (names, emails, etc.) have unique IDs.
tests:
  - unique:
      column_name: person_trait_id
      config:
        severity: error
columns:
  - name: person_trait_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each person trait record has a unique ID
- No person traits are missing IDs
- Person traits properly linked to identifiers
nexus_group_traits¶
Purpose: Validates group traits (domain names, organization details) have unique IDs.
tests:
  - unique:
      column_name: group_trait_id
      config:
        severity: error
columns:
  - name: group_trait_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each group trait record has a unique ID
- No group traits are missing IDs
- Group traits properly linked to identifiers
5. Resolved Entity Tests¶
nexus_persons¶
Purpose: Validates final resolved person entities after identity resolution.
tests:
  - unique:
      column_name: person_id
      config:
        severity: error
columns:
  - name: person_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each resolved person has a unique final ID
- Identity resolution properly merged duplicate identifiers
- No persons are missing final IDs
nexus_groups¶
Purpose: Validates final resolved group entities after identity resolution.
tests:
  - unique:
      column_name: group_id
      config:
        severity: error
columns:
  - name: group_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each resolved group has a unique final ID
- Identity resolution properly merged duplicate identifiers
- No groups are missing final IDs
nexus_memberships¶
Purpose: Validates final resolved membership relationships.
tests:
  - unique:
      column_name: membership_id
      config:
        severity: error
columns:
  - name: membership_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each resolved membership has a unique final ID
- Memberships properly link resolved persons to resolved groups
- No memberships are missing final IDs
6. Participant-Level Tests¶
nexus_person_participants¶
Purpose: Validates person participation in events with proper role handling.
tests:
  - unique:
      column_name: person_participant_id
      config:
        severity: error
columns:
  - name: person_participant_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each person-event-role combination has a unique participant ID
- The same person can participate in the same event with multiple roles
- No participants are missing IDs
Common failures:
- Role not included in participant ID generation
- Same person-event combination with different roles gets same ID
nexus_group_participants¶
Purpose: Validates group participation in events with proper role handling.
tests:
  - unique:
      column_name: group_participant_id
      config:
        severity: error
columns:
  - name: group_participant_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each group-event-role combination has a unique participant ID
- The same group can participate in the same event with multiple roles
- No participants are missing IDs
Common failures:
- Role not included in participant ID generation
- Same group-event combination with different roles gets same ID
7. Identity Resolution Tests¶
nexus_resolved_person_identifiers¶
Purpose: Validates resolved person identifiers after identity resolution processing.
tests:
  - unique:
      column_name: person_identifier_id
      config:
        severity: error
columns:
  - name: person_identifier_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Resolved identifiers maintain unique IDs
- Identity resolution process doesn't create duplicates
- All identifiers properly linked to resolved persons
nexus_resolved_group_identifiers¶
Purpose: Validates resolved group identifiers after identity resolution processing.
tests:
  - unique:
      column_name: group_identifier_id
      config:
        severity: error
columns:
  - name: group_identifier_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Resolved identifiers maintain unique IDs
- Identity resolution process doesn't create duplicates
- All identifiers properly linked to resolved groups
nexus_resolved_person_traits¶
Purpose: Validates resolved person traits after identity resolution processing.
tests:
  - unique:
      column_name: person_trait_id
      config:
        severity: error
columns:
  - name: person_trait_id
    tests:
      - not_null:
          config:
            severity: error
nexus_resolved_group_traits¶
Purpose: Validates resolved group traits after identity resolution processing.
tests:
  - unique:
      column_name: group_trait_id
      config:
        severity: error
columns:
  - name: group_trait_id
    tests:
      - not_null:
          config:
            severity: error
8. Edge Relationship Tests¶
nexus_person_identifiers_edges¶
Purpose: Validates edges connecting person identifiers for identity resolution.
tests:
  - unique:
      column_name:
        "edge_id || '|' || identifier_type_a || '|' || identifier_value_a || '|'
        || identifier_type_b || '|' || identifier_value_b"
      config:
        severity: error
columns:
  - name: edge_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each edge relationship is unique across all identifier combinations
- No edges are missing IDs
- Bidirectional edges are properly handled
Note: Uses concatenated string syntax for composite key uniqueness testing.
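Concatenation has one subtlety: on most warehouses, || returns NULL when any operand is NULL, and the unique test ignores NULL values, so rows with a missing identifier field would escape the check. An alternative sketch, assuming dbt_utils is installed (the prefix validation tests already rely on it), is dbt_utils.unique_combination_of_columns, which groups by the listed columns instead of building a string. This is an illustration of the technique, not the package's own definition.

```yaml
version: 2

models:
  - name: nexus_person_identifiers_edges
    tests:
      # Fails if any combination of these columns occurs more than once
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - edge_id
            - identifier_type_a
            - identifier_value_a
            - identifier_type_b
            - identifier_value_b
          config:
            severity: error
```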
nexus_group_identifiers_edges¶
Purpose: Validates edges connecting group identifiers for identity resolution.
tests:
  - unique:
      column_name:
        "edge_id || '|' || identifier_type_a || '|' || identifier_value_a || '|'
        || identifier_type_b || '|' || identifier_value_b"
      config:
        severity: error
columns:
  - name: edge_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each edge relationship is unique across all identifier combinations
- No edges are missing IDs
- Bidirectional edges are properly handled
9. State Management Tests¶
nexus_states¶
Purpose: Validates entity state tracking and transitions.
tests:
  - unique:
      column_name: state_id
      config:
        severity: error
columns:
  - name: state_id
    tests:
      - not_null:
          config:
            severity: error
What it tests:
- Each state record has a unique ID
- State transitions are properly tracked
- No states are missing IDs
10. Running Tests¶
Run All Tests¶
Run Specific Model Tests¶
Run Only Uniqueness Tests¶
Run Tests with Increased Verbosity¶
Run Only Prefix Validation Tests¶
# Run the expression_is_true tests (prefix validation) attached to nexus models
dbt test --select "nexus_*,test_name:expression_is_true"
# Check prefix compliance across all models in the project
dbt test --select "test_name:expression_is_true"
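Selections like these can also be saved as named YAML selectors so they do not have to be retyped. The following is a sketch of a selectors.yml file at the project root; the selector names are hypothetical, and the test_name method matches a generic test's name without its package prefix.

```yaml
selectors:
  - name: uniqueness_tests  # hypothetical name
    description: "Every instance of the built-in unique test"
    definition:
      method: test_name
      value: unique

  - name: prefix_validation_tests  # hypothetical name
    description: "Every dbt_utils.expression_is_true test (ID prefix validation)"
    definition:
      method: test_name
      value: expression_is_true
```

With this file in place, dbt test --selector uniqueness_tests runs only the uniqueness tests, and dbt test --selector prefix_validation_tests runs only the prefix checks.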
11. Test Failure Investigation¶
When tests fail, use these approaches:
1. Check Test Results¶
# View compiled test SQL
cat target/compiled/nexus/models/nexus-models/nexus.yml/unique_nexus_person_identifiers_person_identifier_id.sql
2. Run Diagnostic Queries¶
See Troubleshooting Duplicates for specific diagnostic queries.
3. Validate Fixes¶
# Rebuild and test incrementally
dbt run --select source_model
dbt run --select nexus_model
dbt test --select nexus_model
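When a uniqueness test fails, it often helps to look at the offending rows directly. As a sketch, dbt's store_failures config writes a test's failing rows to a table in the warehouse. It is shown here on a test for a source model in your own project; where the results land depends on your project configuration.

```yaml
version: 2

models:
  - name: your_source_events
    tests:
      - unique:
          column_name: event_id
          config:
            severity: error
            store_failures: true  # persist failing rows for inspection
```

The same behavior can be switched on for a single run with dbt test --store-failures.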
12. Test Configuration¶
Severity Levels¶
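- error: A failing test is reported as an error and dbt exits with a non-zero status; under dbt build, resources downstream of the failed test are skipped (used for uniqueness and not-null tests)
- warn: A failing test logs a warning and the run continues (used for ID prefix validation tests)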
Custom Test Thresholds¶
tests:
  - unique:
      column_name: person_id
      config:
        severity: error
        error_if: ">= 1" # Fail if there are any duplicates
        warn_if: ">= 1" # Checked only when error_if does not trigger; ">= 0" would warn on every run
Test Tags¶
All nexus tests are automatically tagged for easy filtering.
For troubleshooting specific test failures, see the Troubleshooting Duplicates Guide.