Amazon Security Lake

What Is This Service?

Amazon Security Lake is a managed AWS service that centralizes security data in an S3-based data lake and normalizes it to the Open Cybersecurity Schema Framework (OCSF).

Mental model:

Security Lake = Security logs → Normalize → Store → Query → Investigate

Primary purpose:

Aggregate security and operational telemetry across AWS, multi-account environments, and third-party tools into a unified security analytics platform.


Why It Matters for Security

Modern security failures are rarely caused by lack of logs.

They happen because logs are:

  • fragmented
  • inconsistent
  • account-isolated
  • difficult to correlate
  • expensive to operationalize

Security Lake exists to:

  • centralize security telemetry
  • normalize data into OCSF
  • accelerate investigations
  • support SIEM platforms
  • simplify compliance
  • reduce security data engineering

Security outcomes:

  • centralized detection
  • faster incident response
  • unified investigations
  • long-term evidence retention

MOST TESTED:

Security Lake is not a SIEM.

It is the security data foundation layer.


Architecture Example

flowchart LR

subgraph Source Accounts
CT[CloudTrail]
VPC[VPC Flow Logs]
Route53[Resolver Logs]
GuardDuty[GuardDuty]
WAF[WAF Logs]
SecurityHub[Security Hub]
Inspector[Inspector]
end

subgraph Security Lake
Collector[Ingestion]
OCSF[OCSF Normalization]
S3[S3 Data Lake]
Subscriber[Subscribers]
end

subgraph Analytics
Athena[Athena]
OpenSearch[OpenSearch]
SIEM[Third Party SIEM]
QuickSight[QuickSight]
end

CT --> Collector
VPC --> Collector
Route53 --> Collector
GuardDuty --> SecurityHub
WAF --> Collector
SecurityHub --> Collector
Inspector --> SecurityHub

Collector --> OCSF
OCSF --> S3

S3 --> Athena
S3 --> OpenSearch
S3 --> SIEM
S3 --> QuickSight

Subscriber --> S3

Architecture goals:

  • centralize
  • normalize
  • decouple producers and consumers

Workflows

Security Data Ingestion

sequenceDiagram

participant Source
participant SecurityLake
participant OCSF
participant S3
participant Consumer

Source->>SecurityLake: Emit security data

SecurityLake->>OCSF: Normalize records

OCSF->>S3: Store parquet objects

Consumer->>S3: Query security data

Incident Investigation

sequenceDiagram

participant Analyst
participant Athena
participant SecurityLake
participant SecurityHub

Analyst->>Athena: Query indicators

Athena->>SecurityLake: Retrieve records

SecurityLake->>Athena: Return normalized logs

Athena->>Analyst: Correlated findings

Analyst->>SecurityHub: Validate findings
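
The "Query indicators" step can be sketched as a small helper that builds the Athena SQL for one suspicious source IP. This is a minimal sketch, not a definitive query: the database and table names are assumptions (real names depend on the Region and source version), the column aliases are illustrative, and the helper name `build_indicator_query` is hypothetical.

```python
import ipaddress

# Assumed names: real Glue database and table names depend on Region and source version.
DATABASE = "amazon_security_lake_glue_db_us_east_1"
TABLE = "amazon_security_lake_table_us_east_1_vpc_flow_2_0"


def build_indicator_query(ip: str, limit: int = 100) -> str:
    """Build an Athena SQL string that looks up flow records for one source IP."""
    # Parsing rejects anything that is not a valid IP address (raises ValueError),
    # so the value is safe to embed in the SQL text.
    address = ipaddress.ip_address(ip)
    return (
        "SELECT src_endpoint.ip AS src_ip, dst_endpoint.ip AS dst_ip, "
        "dst_endpoint.port AS dst_port\n"
        f'FROM "{DATABASE}"."{TABLE}"\n'
        f"WHERE src_endpoint.ip = '{address}'\n"
        f"LIMIT {int(limit)}"
    )


query = build_indicator_query("198.51.100.7")
print(query)
```

Because every source is normalized to OCSF, the same `src_endpoint.ip` filter should carry over to other source tables that include that field.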

Core Concepts

Security Data Lake

Security Lake creates:

  • centralized storage
  • normalized telemetry
  • analytics-ready datasets

Storage engine:

  • Amazon S3

Storage format:

  • Apache Parquet

Benefits:

  • compression
  • efficient analytics
  • lower cost

Open Cybersecurity Schema Framework (OCSF)

MOST TESTED

OCSF standardizes security event formats.

Problem solved:

Every security tool produces different schemas.

Without OCSF:

Tool A → source_ip
Tool B → src_ip
Tool C → client_address

With OCSF:

src_endpoint.ip

Benefits:

  • unified analytics
  • easier SIEM integration
  • lower ETL complexity
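
The field-name problem above can be sketched in a few lines. This is only an illustration of the idea, assuming the three vendor field names from the example; the function name `normalize_source_ip` is hypothetical, and real OCSF mapping covers far more than one field.

```python
from typing import Any

# Vendor field names from the example above; all three mean "source IP".
SOURCE_IP_ALIASES = ("source_ip", "src_ip", "client_address")


def normalize_source_ip(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record with the source IP moved to src_endpoint.ip."""
    normalized = {k: v for k, v in record.items() if k not in SOURCE_IP_ALIASES}
    for alias in SOURCE_IP_ALIASES:
        if alias in record:
            # OCSF nests the address under the src_endpoint object.
            normalized["src_endpoint"] = {"ip": record[alias]}
            break
    return normalized


events = [
    {"tool": "A", "source_ip": "10.0.0.5"},
    {"tool": "B", "src_ip": "10.0.0.6"},
    {"tool": "C", "client_address": "10.0.0.7"},
]
normalized_events = [normalize_source_ip(e) for e in events]
ips = [e["src_endpoint"]["ip"] for e in normalized_events]
print(ips)  # ['10.0.0.5', '10.0.0.6', '10.0.0.7']
```

After normalization, one query on `src_endpoint.ip` covers all three tools.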

Exam trap:

Security Lake normalizes.

It does not create detections.


Native AWS Sources

Supported examples:

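  • CloudTrail (management and data events)
  • VPC Flow Logs
  • Route 53 Resolver query logs
  • AWS WAF logs
  • Security Hub findings
  • GuardDuty (through Security Hub findings)
  • Inspector (through Security Hub findings)

Once a source is enabled, Security Lake collects from it continuously.

No manual exports required.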


Custom Sources

Supports:

  • custom applications
  • third-party security tools
  • partner integrations

Requirements:

  • OCSF-compatible ingestion

Examples:

  • Splunk
  • CrowdStrike
  • Palo Alto
  • SIEM platforms
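
A custom source writes its OCSF Parquet objects into the lake bucket under a partitioned prefix. The sketch below builds such a prefix. The `ext/<source>/region=/accountId=/eventDay=` layout is an assumption based on the custom source guidance and should be checked against the current user guide; the source name `my-edr` and the helper name are hypothetical.

```python
from datetime import date


def custom_source_prefix(source: str, region: str, account_id: str, day: date) -> str:
    """Build the object key prefix for one day of records from one custom source."""
    # Assumed layout: ext/<source>/region=<region>/accountId=<id>/eventDay=<YYYYMMDD>/
    return (
        f"ext/{source}/region={region}/"
        f"accountId={account_id}/eventDay={day:%Y%m%d}/"
    )


prefix = custom_source_prefix("my-edr", "us-east-1", "123456789012", date(2024, 3, 9))
print(prefix)  # ext/my-edr/region=us-east-1/accountId=123456789012/eventDay=20240309/
```

Partitioning by Region, account, and day lets query engines skip objects that cannot match a filter.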

Subscribers

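Consumers of Security Lake data, with either data access (read objects in S3) or query access (query tables, for example with Athena).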

Examples:

  • Athena
  • OpenSearch
  • Security analytics
  • custom pipelines

Subscribers do not own ingestion.

Security Lake remains the source of truth.


Important Integrations

| Service | Purpose |
|---|---|
| S3 | Data storage |
| Athena | Investigation |
| Security Hub | Findings correlation |
| GuardDuty | Threat findings |
| CloudTrail | API telemetry |
| VPC Flow Logs | Network visibility |
| Route 53 Resolver Logs | DNS analysis |
| AWS WAF | Application telemetry |
| OpenSearch | Search |
| EventBridge | Automation |
| Organizations | Central governance |
| IAM | Access control |
| KMS | Encryption |

Security Features

Centralized Log Collection

Organization-wide collection.

Benefits:

  • fewer blind spots
  • simpler investigations

Encryption

Security Lake encrypts data at rest.

Supported:

  • S3 managed keys (SSE-S3), the default
  • customer managed AWS KMS keys (SSE-KMS)

MOST TESTED:

Customer-managed KMS keys provide:

  • auditability
  • key lifecycle control
  • cross-account governance

Fine-Grained Access

Access controlled through:

  • IAM
  • S3 permissions
  • AWS Lake Formation permissions (query access)

Supports:

  • least privilege
  • account separation
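
Least privilege for a consumer that reads S3 objects directly can be sketched as a read-only policy scoped to one source prefix. This is a minimal sketch under assumptions: the bucket name and prefix are hypothetical, and in practice Security Lake provisions subscriber access itself, so this only illustrates the scoping idea.

```python
import json

# Hypothetical bucket name and source prefix, used only for illustration.
BUCKET = "aws-security-data-lake-us-east-1-example"
PREFIX = "aws/VPC_FLOW/"


def read_only_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy that allows listing and reading one prefix only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOnePrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Restrict listing to keys under the chosen prefix.
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}*"}},
            },
            {
                "Sid": "ReadOnePrefix",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


policy = read_only_policy(BUCKET, PREFIX)
print(json.dumps(policy, indent=2))
```

The policy grants no write or delete actions, so the consumer cannot alter evidence in the lake.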

Cross-Account Aggregation

Organization model:

flowchart TB

Org[Organizations]

AccountA --> SecurityLake
AccountB --> SecurityLake
AccountC --> SecurityLake

SecurityLake --> SecurityAccount

Central security account receives telemetry.


Advanced Security and Operational Concepts

Control Plane vs Data Plane

Control Plane:

  • configure lake
  • onboard sources
  • manage subscribers

Data Plane:

  • ingest logs
  • normalize
  • store records

Exam trap:

Query operations occur outside Security Lake.

Security Lake manages storage.


Regional Architecture

HIGH VALUE

Security Lake is regional.

Implications:

  • data residency matters
  • collection configured per Region
  • retention managed regionally

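Cross-Region consolidation is supported through rollup Regions: contributing Regions replicate their data into a Region you designate.

Without a rollup Region, data stays in the Region where it was collected.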


Multi-Account Governance

MOST TESTED

flowchart LR

Organizations --> DelegatedAdmin

DelegatedAdmin --> SecurityLake

SecurityLake --> Member1
SecurityLake --> Member2
SecurityLake --> Member3

Benefits:

  • centralized operations
  • delegated administration
  • lower operational overhead

Security Lake vs SIEM

MASSIVE EXAM TRAP

Security Lake:

  • collects
  • normalizes
  • stores

SIEM:

  • correlates
  • alerts
  • detects

Security Lake complements SIEM.

Not replacement.


Security Lake vs CloudTrail Lake

HIGH VALUE

| Capability | Security Lake | CloudTrail Lake |
|---|---|---|
| Purpose | Security analytics | API activity |
| Sources | Multiple | CloudTrail |
| Schema | OCSF | CloudTrail |
| Storage | S3 | Managed |
| Query | Consumers | Built-in |

Rule:

Need broad security telemetry → Security Lake

Need API audit analysis → CloudTrail Lake


Security Lake vs OpenSearch

| Capability | Security Lake | OpenSearch |
|---|---|---|
| Storage | Long-term | Search index |
| Query | External | Native |
| Analytics | Consumer-driven | Built-in |
| Detection | No | Possible |

Security Lake often feeds OpenSearch.


Storage and Cost Model

Costs driven by:

  • ingestion
  • processing
  • retention
  • S3 storage
  • analytics

Optimization:

  • Parquet
  • lifecycle policies
  • archive tiers

Exam trap:

Security Lake does not eliminate S3 charges.
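
The cost drivers add up rather than replace one another. The sketch below sums three of them for one month. The rates are placeholders, NOT real AWS prices, and the function name `monthly_cost` is hypothetical; substitute current pricing for your Region.

```python
def monthly_cost(ingested_gb, stored_gb, scanned_tb, rates):
    """Split one month of spend into ingestion, storage, and analytics."""
    return {
        "ingestion": ingested_gb * rates["ingest_per_gb"],
        "storage": stored_gb * rates["storage_per_gb"],   # S3 is still billed
        "analytics": scanned_tb * rates["scan_per_tb"],   # query engine scans
    }


# Placeholder rates for illustration only, NOT real AWS prices.
rates = {"ingest_per_gb": 0.5, "storage_per_gb": 0.02, "scan_per_tb": 5.0}
costs = monthly_cost(ingested_gb=1000, stored_gb=5000, scanned_tb=2, rates=rates)
total = sum(costs.values())
print(costs, total)
```

Storage grows with retention, so it can come to dominate the total even when ingestion volume is flat.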


Retention Strategy

Security telemetry frequently follows:

Hot → S3 Standard
Warm → S3 Intelligent-Tiering
Cold → S3 Glacier
Archive → S3 Glacier Deep Archive

Compliance often determines retention.
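
The tiering above can be sketched as one rule in the S3 lifecycle configuration shape (the dict form accepted by boto3's `put_bucket_lifecycle_configuration`). The day thresholds, rule ID, and function name are hypothetical, and Security Lake also exposes its own retention settings, so treat this only as an illustration of the hot-to-archive progression.

```python
# Hypothetical day thresholds; real values come from compliance requirements.
TIERS = [
    (30, "INTELLIGENT_TIERING"),  # warm
    (90, "GLACIER"),              # cold
    (365, "DEEP_ARCHIVE"),        # archive
]


def lifecycle_rule(tiers, expire_after_days):
    """Build one S3 lifecycle rule with ordered storage class transitions."""
    days = [d for d, _ in tiers]
    if days != sorted(days) or len(set(days)) != len(days):
        raise ValueError("transition days must be strictly increasing")
    if expire_after_days <= days[-1]:
        raise ValueError("expiration must come after the last transition")
    return {
        "ID": "security-telemetry-tiering",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # empty prefix applies the rule to every object
        "Transitions": [{"Days": d, "StorageClass": c} for d, c in tiers],
        "Expiration": {"Days": expire_after_days},
    }


rule = lifecycle_rule(TIERS, expire_after_days=2555)
config = {"Rules": [rule]}
print(config)
```

Objects start in S3 Standard (hot) and need no transition entry of their own.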


Detection Pipeline Pattern

flowchart LR

Sources --> SecurityLake

SecurityLake --> Athena

Athena --> SecurityHub

SecurityHub --> EventBridge

EventBridge --> Lambda

Flow:

Collect → Normalize → Investigate → Respond
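
The last hop of the pipeline (EventBridge → Lambda) can be sketched as a handler that picks out urgent Security Hub findings. This is a minimal sketch: it assumes the event carries findings under `detail.findings` with a `Severity.Label` field, as Security Hub finding events delivered through EventBridge do, and the sample event and response shape are hypothetical.

```python
def handler(event, context):
    """Return the IDs of HIGH or CRITICAL findings from a Security Hub event."""
    findings = event.get("detail", {}).get("findings", [])
    urgent = [
        f["Id"]
        for f in findings
        if f.get("Severity", {}).get("Label") in {"HIGH", "CRITICAL"}
    ]
    # A real responder would act here (isolate, ticket, notify); this sketch only reports.
    return {"urgent_finding_ids": urgent}


# Hypothetical, heavily trimmed sample event for local testing.
sample_event = {
    "detail-type": "Security Hub Findings - Imported",
    "detail": {
        "findings": [
            {"Id": "finding-1", "Severity": {"Label": "HIGH"}},
            {"Id": "finding-2", "Severity": {"Label": "LOW"}},
        ]
    },
}
result = handler(sample_event, None)
print(result)  # {'urgent_finding_ids': ['finding-1']}
```

Detection and response live in the services after Security Lake; the lake itself only supplies the data.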


OCSF vs ASFF

HIGH VALUE

| Standard | Purpose |
|---|---|
| OCSF (Open Cybersecurity Schema Framework) | Security event schema |
| ASFF (AWS Security Finding Format) | Security findings |

Security Lake → OCSF

Security Hub → ASFF

MASSIVE EXAM TRAP:

Do not confuse them.


Comparisons

| Service | Primary Role |
|---|---|
| Security Lake | Security telemetry lake |
| Security Hub | Findings aggregation |
| GuardDuty | Threat detection |
| CloudTrail Lake | API analytics |
| Athena | Query engine |
| OpenSearch | Search and analytics |
| SIEM | Detection platform |

Common Exam Traps

  1. Security Lake is not a SIEM.

  2. Security Lake stores data; it does not detect threats.

  3. OCSF ≠ ASFF.

  4. Security Lake is regional.

  5. Data resides in S3.

  6. Athena commonly queries Security Lake.

  7. Security Hub findings are not raw logs.

  8. Security Lake supports third-party ingestion.

  9. CloudTrail Lake is API-specific.

  10. Security Lake normalizes schema automatically.

  11. Security Lake does not replace GuardDuty.

  12. Security Lake ingestion and storage both affect cost.

  13. Security Lake improves investigation, not prevention.

  14. Multi-account collection is a common architecture.

  15. Subscriber access should follow least privilege.


5-Second Recall

  • Security Lake = centralized security data lake
  • OCSF = normalized schema
  • S3 = storage
  • Athena = investigation
  • Security Hub = findings
  • GuardDuty = detection
  • Not a SIEM
  • Regional service

Quick Revision Notes

  • Collect security telemetry centrally
  • Normalize using OCSF
  • Store in S3 using Parquet
  • Query with Athena/OpenSearch
  • Multi-account through Organizations
  • KMS for encryption governance
  • Subscribers consume data
  • Security Hub correlates findings
  • CloudTrail Lake ≠ Security Lake
  • Security Lake enables investigations, not detections