Cloud Normalization¶
Agent BOM keeps cloud ingestion accurate by separating three layers:
- Raw provider payloads from AWS, Azure, GCP, Databricks, and other sources
- A canonical
agent-bomnormalization layer - Optional downstream projections such as OCSF
The product does not force every cloud response into OCSF first. It preserves source-specific semantics, normalizes them into stable envelopes for the rest of the product, and projects to OCSF only where that mapping is accurate.
Canonical envelopes¶
The current canonical cloud envelopes are additive metadata attached to discovered assets and emitted through JSON output. They give the CLI, API, UI, graphs, and future exports a stable internal contract without discarding provider-specific evidence.
cloud_origin¶
Identity and provenance for the cloud asset:
providerserviceresource_typeresource_idresource_namelocation- scoped identifiers such as
account_id,subscription_id, orproject_id - constrained
raw_identityfields for debugging and mapping verification
cloud_state¶
Normalized lifecycle state for the asset:
lifecycle_stateraw_statestate_source
This lets the product distinguish canonical status from provider-specific state strings without losing the original value.
cloud_timestamps¶
Normalized timestamps for asset lifecycle history:
created_atupdated_atsources
Timestamps are normalized to UTC ISO-8601 so downstream graph, timeline, and history features do not need to guess provider formats.
cloud_principal¶
Execution identity or attached principal where the provider exposes one:
principal_typeprincipal_idprincipal_nametenant_idsource_field- constrained
raw_identity
This helps preserve the relationship between a resource and the principal it runs as, without forcing every provider into a single vendor-specific schema.
Why this layer exists¶
Cloud providers do not agree on:
- field names
- nested response shapes
- lifecycle enums
- timestamp formats
- identity semantics
The normalization layer prevents those differences from leaking into every product surface.
Relationship to OCSF¶
OCSF is optional and sits downstream of the canonical model:
- raw cloud payload in
- source-specific adapter
- canonical
agent-bomenvelope - optional OCSF projection
If OCSF does not fit a provider-specific concept cleanly, Agent BOM keeps the canonical field or extension instead of forcing a bad mapping.
Persistence and history¶
Canonical cloud metadata supports current-state and historical views:
first_seenandlast_seenstay in the product model- lifecycle changes remain queryable over time
- deactivated or disappeared entities can remain visible in history even when they no longer appear in the latest inventory
The design goal is persistent truth plus derived views, not transient recomputation of every cloud shape on every page load.