Atomic Data Warehouse
The Atomic Data Warehouse is the part of the Data Analytics environment where structured data is
broken down into low-level components and integrated with other components in
preparation for exposure to data consumers. The Atomic Warehouse is designed using
normalization techniques along with methods that record history and
make data fast to load and retrieve. The Atomic Warehouse must:
- Act as a data hub that distributes information to the Dimensional Warehouse and other
targets.
- Accept, integrate, and cleanse data coming from multiple sources.
- Enable tracing back to the source system by showing the lineage of data.
- Support controls to ensure that data is complete and correct.
The Atomic Warehouse tends to have these Level 2 structures
or similar structures with different names:
- Ingest: zone where data is input to the database.
It is best practice to limit database inputs to the Ingest Zone.
- Core: zone where subject content is stored. This is
the central focus of the database.
It is best practice to isolate the Core Zone from external processes and databases.
- Expose: zone where data is made available outside of the database.
It is best practice to limit database outputs to the Expose Zone.
- Process: zone where database processes are tracked and controlled.
It is best practice to use the same Process Zone schema across databases.
- Archive: zone where history data is stored in raw, immutable form.
This means that data is stored in its Ingest form and is not altered, which
makes for an effective audit trail.
- Metadata: zone where data describing database content and structure are stored. Glossaries and Data Lineage are examples of data managed here.
It is best practice to use the same Metadata Zone schema across databases
or to share a Metadata repository across the enterprise.
- Notify: zone where logs of events are stored.
It is best practice to use the same Notify Zone schema across databases.
Notifications may be sent to a centralized Notifications System or Database.
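The input and output rules for the Ingest, Core and Expose zones can be sketched as a small access policy. This is a minimal illustration only: the `Zone` enum, the two policy sets and the `external_access_allowed` function are hypothetical names, and the sketch assumes that "external" means any process or database outside the Atomic Warehouse.

```python
from enum import Enum


class Zone(Enum):
    """Level 2 zones named in the list above."""
    INGEST = "ingest"
    CORE = "core"
    EXPOSE = "expose"
    PROCESS = "process"
    ARCHIVE = "archive"
    METADATA = "metadata"
    NOTIFY = "notify"


# Assumed policy: external writers may only touch Ingest,
# external readers may only touch Expose, and Core is isolated.
EXTERNAL_WRITE_ZONES = {Zone.INGEST}
EXTERNAL_READ_ZONES = {Zone.EXPOSE}


def external_access_allowed(zone: Zone, operation: str) -> bool:
    """Return True if an external process may perform the operation on the zone."""
    if operation == "write":
        return zone in EXTERNAL_WRITE_ZONES
    if operation == "read":
        return zone in EXTERNAL_READ_ZONES
    raise ValueError(f"unknown operation: {operation!r}")
```

In a real database these rules would more likely be enforced through schema-level grants than through application code; the sketch only makes the intent of the best practices explicit.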
The Atomic Warehouse Core tends to have these Level 3 structures
or similar structures with different names:
- Object: tables whose rows are identified through
business keys. An Object row contains only enough information to identify it.
The primary key of the object is inherited by the Tie and Properties tables.
Similar tables in other approaches include: header, master and
hub (Data Vault).
- Tie: tables that associate one or more Object tables.
Similar tables in other approaches include: association, relationship and
link (Data Vault).
- Properties: tables that contain data elements that
describe an Object or Tie. Properties tables are designed to retain history
by including a datetime or timestamp as part of the primary key.
Data is inserted into the Properties table rather than updated.
Similar tables in other approaches include: detail, satellite (Data Vault).
- Guide: tables that enable efficient and effective
data access such as: hierarchy navigation, supertype / subtype
and use conditions.
Similar tables in other approaches include: helper, bridge (Data Vault), PIT (Data Vault).
- Reference: tables that contain static lookup data such as:
calendars, currencies, countries and transaction codes.
Similar tables in other approaches include: code and lookup.
Data Vault also calls these Reference tables.
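The relationship between Object, Tie and Properties tables can be sketched with an in-memory SQLite database. This is an illustration under stated assumptions, not a prescribed design: the customer and order subject areas, all table and column names, and the helper functions `record_customer_name` and `current_customer_name` are hypothetical. It shows the Object primary key being inherited by the Tie and Properties tables, and history being kept by inserting rows rather than updating them.

```python
import sqlite3

SCHEMA = """
-- Object tables: identified by a business key
CREATE TABLE customer_object (
    customer_id     INTEGER PRIMARY KEY,
    customer_number TEXT NOT NULL UNIQUE      -- business key
);
CREATE TABLE order_object (
    order_id     INTEGER PRIMARY KEY,
    order_number TEXT NOT NULL UNIQUE         -- business key
);
-- Tie table: associates Objects and inherits their primary keys
CREATE TABLE customer_order_tie (
    customer_id INTEGER NOT NULL REFERENCES customer_object(customer_id),
    order_id    INTEGER NOT NULL REFERENCES order_object(order_id),
    PRIMARY KEY (customer_id, order_id)
);
-- Properties table: the timestamp is part of the primary key
CREATE TABLE customer_properties (
    customer_id   INTEGER NOT NULL REFERENCES customer_object(customer_id),
    effective_ts  TEXT NOT NULL,              -- ISO 8601 text sorts chronologically
    customer_name TEXT,
    PRIMARY KEY (customer_id, effective_ts)
);
"""


def record_customer_name(conn, customer_id, effective_ts, name):
    """Insert a new Properties row; existing rows are never updated."""
    conn.execute(
        "INSERT INTO customer_properties (customer_id, effective_ts, customer_name) "
        "VALUES (?, ?, ?)",
        (customer_id, effective_ts, name),
    )


def current_customer_name(conn, customer_id):
    """Return the most recent name for the customer, or None if there is none."""
    row = conn.execute(
        "SELECT customer_name FROM customer_properties "
        "WHERE customer_id = ? ORDER BY effective_ts DESC LIMIT 1",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

customer_id = conn.execute(
    "INSERT INTO customer_object (customer_number) VALUES ('C-001')"
).lastrowid
record_customer_name(conn, customer_id, "2024-01-01T00:00:00", "Acme Ltd")
record_customer_name(conn, customer_id, "2024-06-01T00:00:00", "Acme Holdings")
```

After the two inserts, both versions of the name remain in `customer_properties`, so earlier states can still be queried by timestamp while `current_customer_name` reads only the latest row.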