Amazon Macie¶
What Is This Service?¶
Managed AWS data security and privacy service that discovers, classifies, and monitors sensitive data stored in Amazon S3.
Mental model:
Amazon Macie = Discover → Classify → Prioritize → Alert → Protect
Primary purpose:
Identify:
- sensitive data
- accidental exposure
- risky access configurations
- compliance violations
inside Amazon S3 environments.
Typical discoveries:
- PII
- credentials
- financial records
- healthcare information
- secrets
- intellectual property
- regulated datasets
Why It Matters for Security¶
One of the most common cloud security failures:
Sensitive data exists
↓
Nobody knows where
↓
Bucket becomes exposed
↓
Data breach
Macie exists to answer:
What sensitive data exists?
Where is it stored?
How exposed is it?
Security outcomes:
- data visibility
- privacy protection
- reduced exposure risk
- compliance monitoring
MOST TESTED:
Macie analyzes data at rest in S3. It reports findings but does not block access or change configurations on its own.
It is not network inspection.
It is not DLP for all AWS services.
Architecture Example¶
Organization-Wide Sensitive Data Discovery¶
flowchart TD
subgraph Organization
AccountA[S3 Account A]
AccountB[S3 Account B]
AccountC[S3 Account C]
end
subgraph SecurityAccount
Macie[Amazon Macie]
Findings[Findings]
end
subgraph Response
SecurityHub[Security Hub]
EventBridge[EventBridge]
Lambda[Lambda]
SNS[SNS]
end
AccountA --> Macie
AccountB --> Macie
AccountC --> Macie
Macie --> Findings
Findings --> SecurityHub
Findings --> EventBridge
EventBridge --> Lambda
Lambda --> SNS
Architecture goals:
- centralized visibility
- automated discovery
- response orchestration
Workflow(s)¶
Sensitive Data Discovery¶
sequenceDiagram
participant S3
participant Macie
participant Classifier
participant Findings
Macie->>S3: Retrieve bucket and object metadata
Macie->>Classifier: Analyze object content
Classifier->>Macie: Detect sensitive data
Macie->>Findings: Generate finding
Bucket Exposure Evaluation¶
sequenceDiagram
participant Bucket
participant Macie
participant Policy
participant Analyst
Macie->>Bucket: Retrieve bucket metadata
Macie->>Policy: Evaluate exposure
Policy->>Macie: Determine risk
Macie->>Analyst: Publish finding
Automated Remediation¶
sequenceDiagram
participant Macie
participant EventBridge
participant Lambda
participant S3
Macie->>EventBridge: Publish finding
EventBridge->>Lambda: Invoke
Lambda->>S3: Correct configuration
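The first hop in the sequence above (Macie to EventBridge) depends on a rule whose event pattern selects Macie findings. The sketch below shows one such pattern, assuming the `aws.macie` source and `Macie Finding` detail-type that Macie uses for published findings. The `matches` helper is a hypothetical, simplified local check for top-level keys only; it is not how EventBridge evaluates patterns.

```python
import json

# Event pattern for an EventBridge rule that selects Macie findings.
MACIE_FINDING_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified local check: each pattern key must list the event's value.

    Hypothetical helper for illustration; real EventBridge matching also
    supports nested keys, prefixes, and other operators.
    """
    return all(event.get(key) in allowed for key, allowed in pattern.items())


# Trimmed sample event; real events carry many more fields.
sample_event = {
    "source": "aws.macie",
    "detail-type": "Macie Finding",
    "detail": {"type": "Policy:IAMUser/S3BucketPublic"},
}

if __name__ == "__main__":
    # The JSON form is what would be supplied as the rule's event pattern.
    print(json.dumps(MACIE_FINDING_PATTERN, indent=2))
    print(matches(MACIE_FINDING_PATTERN, sample_event))
```
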
Core Concepts¶
Sensitive Data Discovery¶
MOST TESTED
Macie discovers:
- PII
- credentials
- healthcare records
- financial identifiers
- authentication secrets
- regulated information
Examples:
Credit Card Numbers
Email Addresses
Passport Numbers
API Keys
Detection methods:
- machine learning
- managed identifiers
- regex-based matching
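To make the idea of regex-based matching concrete, here is a minimal sketch of pattern detection followed by a checksum validation step. It is an illustration of the general technique only, not Macie's implementation; the regular expression and the function names are assumptions chosen for the example.

```python
import re

# Candidate pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
# Illustrative only; Macie's managed identifiers are not defined this way.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    checksum = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        checksum += digit
    return len(digits) >= 13 and checksum % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return substrings that look like card numbers and pass the checksum."""
    return [
        match.group().strip()
        for match in CARD_CANDIDATE.finditer(text)
        if luhn_valid(match.group())
    ]
```

Pairing a pattern with a validation step reduces false positives: a 16-digit order number matches the pattern but usually fails the checksum.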
Bucket Metadata Analysis¶
Macie continuously evaluates:
- encryption status
- public exposure
- bucket policies
- object inventory
- replication state
Purpose:
Identify risky storage patterns.
Examples:
Public bucket
Unencrypted bucket
Sensitive data publicly accessible
Discovery Jobs¶
HIGH VALUE
Macie supports:
One-Time Jobs¶
Used for:
- audits
- migration analysis
- initial discovery
Scheduled Jobs¶
Used for:
- continuous governance
- recurring scans
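The difference between the two job types shows up in the request sent to Macie. The sketch below builds parameters for the `CreateClassificationJob` operation as plain dictionaries. The helper name, the bucket name, the account ID, and the fixed Monday schedule are hypothetical; treat it as a starting point under those assumptions rather than a complete job definition.

```python
import uuid


def build_job_request(name, account_id, buckets, schedule=None, sampling_percentage=100):
    """Build keyword arguments for macie2 create_classification_job.

    schedule: None for a one-time job, or "daily" / "weekly" for a scheduled job.
    """
    request = {
        "clientToken": str(uuid.uuid4()),  # idempotency token
        "name": name,
        "jobType": "SCHEDULED" if schedule else "ONE_TIME",
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(buckets)}
            ]
        },
        "samplingPercentage": sampling_percentage,
    }
    if schedule == "daily":
        request["scheduleFrequency"] = {"dailySchedule": {}}
    elif schedule == "weekly":
        # The day of week is an arbitrary choice for this example.
        request["scheduleFrequency"] = {"weeklySchedule": {"dayOfWeek": "MONDAY"}}
    elif schedule is not None:
        raise ValueError(f"unsupported schedule: {schedule}")
    return request


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("macie2").create_classification_job(
#     **build_job_request("initial-audit", "111122223333", ["example-bucket"])
# )
```
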
Automated Sensitive Data Discovery¶
MOST TESTED
Macie supports continuous discovery.
Characteristics:
- evaluates bucket inventory
- prioritizes sampling
- identifies changing risk
Benefits:
- lower cost
- broad coverage
MASSIVE EXAM TRAP:
Automated discovery does not scan every object continuously.
Findings¶
Macie produces two major finding categories.
Policy Findings¶
Examples:
Public bucket
Encryption disabled
Sensitive Data Findings¶
Examples:
Credit card numbers detected
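The two categories can be told apart by the prefix of a finding's `type` field, for example `Policy:IAMUser/S3BucketPublic` versus `SensitiveData:S3Object/Financial`. A minimal sketch of routing on that prefix follows; the function name and the returned labels are hypothetical.

```python
def finding_category(finding_type: str) -> str:
    """Map a Macie finding type string to a coarse category label."""
    prefix = finding_type.split(":", 1)[0]
    return {
        "Policy": "policy",
        "SensitiveData": "sensitive_data",
    }.get(prefix, "unknown")


if __name__ == "__main__":
    for example in (
        "Policy:IAMUser/S3BucketPublic",
        "Policy:IAMUser/S3BucketEncryptionDisabled",
        "SensitiveData:S3Object/Financial",
    ):
        print(example, "->", finding_category(example))
```
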
Important Integrations¶
| Service | Purpose |
|---|---|
| Amazon S3 | Data source |
| Security Hub | Findings aggregation |
| EventBridge | Automation |
| Lambda | Remediation |
| CloudTrail | Auditing |
| IAM | Access visibility |
| Organizations | Multi-account |
| KMS | Encryption posture |
| Config | Governance |
| SNS | Notifications |
| Detective | Investigation |
| GuardDuty | Threat context |
Security Features¶
Managed Data Identifiers¶
MOST TESTED
Prebuilt recognition for:
- PII
- credentials
- financial data
- regulated identifiers
Benefits:
- no custom setup
Custom Data Identifiers¶
HIGH VALUE
Supports custom matching.
Examples:
Employee IDs
Internal Tokens
Project Codes
Built using:
- a regular expression (required)
- optional keywords with a maximum match distance (proximity rule)
- optional ignore words
Exam trap:
Custom identifiers supplement managed identifiers; they do not replace them.
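A custom data identifier combines a regex with optional keywords that must appear near the match. The sketch below defines one for a hypothetical employee ID format and approximates the proximity rule locally. The identifier name, the `EMP-` format, the keywords, and the distance are assumptions, and the local matcher only approximates how Macie applies `maximumMatchDistance`.

```python
import re

# Parameters shaped like a macie2 create_custom_data_identifier request.
EMPLOYEE_ID = {
    "name": "employee-id",               # hypothetical identifier name
    "regex": r"EMP-\d{6}",               # hypothetical internal ID format
    "keywords": ["employee", "staff id"],
    "maximumMatchDistance": 30,
}


def local_matches(identifier: dict, text: str) -> list:
    """Approximate the proximity rule: keep a regex match only if a keyword
    ends within `maximumMatchDistance` characters before the match ends."""
    pattern = re.compile(identifier["regex"])
    lowered = text.lower()  # keywords are compared case-insensitively here
    max_distance = identifier["maximumMatchDistance"]
    hits = []
    for match in pattern.finditer(text):
        near_keyword = any(
            0 <= match.end() - keyword_hit.end() <= max_distance
            for keyword in identifier["keywords"]
            for keyword_hit in re.finditer(re.escape(keyword.lower()), lowered)
        )
        if near_keyword:
            hits.append(match.group())
    return hits
```

Testing the regex and keywords locally like this can catch obvious misses before the identifier is created in Macie.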
Automated Discovery¶
Continuously prioritizes:
- high-risk buckets
- changing environments
- new data
Purpose:
Reduce manual operations.
Encryption Visibility¶
Macie evaluates:
- SSE-S3
- SSE-KMS
- encryption posture
Purpose:
Identify weak protection.
Advanced Security and Operational Concepts¶
Control Plane vs Data Plane¶
Control Plane:
- configure jobs
- manage findings
- enable accounts
Data Plane:
- S3 object analysis
Exam trap:
Macie does not intercept S3 requests.
Multi-Account Architecture¶
MOST TESTED
flowchart LR
Organizations
Organizations --> DelegatedAdmin
DelegatedAdmin --> Macie
Macie --> Member1
Macie --> Member2
Macie --> Member3
Benefits:
- centralized visibility
- reduced operations
Delegated Administrator¶
Best practice:
Dedicated security account.
Responsibilities:
- findings
- discovery jobs
- governance
Object Selection and Sampling¶
HIGH VALUE
Macie prioritizes:
Risk
+
Sampling
+
ML
Purpose:
Improve coverage while controlling cost.
Exam trap:
Macie does not continuously inspect entire buckets.
Macie vs GuardDuty¶
MASSIVE EXAM TRAP
| Capability | Macie | GuardDuty |
|---|---|---|
| Sensitive data discovery | Yes | No |
| Threat detection | No | Yes |
| S3 analysis | Yes | Limited |
| Attack detection | No | Yes |
Rule:
GuardDuty detects attacks.
Macie discovers data.
Macie vs Security Hub¶
| Capability | Macie | Security Hub |
|---|---|---|
| Generate findings | Yes | Limited (own control checks) |
| Aggregate findings | No | Yes |
| Discover data | Yes | No |
Rule:
Macie generates.
Security Hub centralizes.
Macie vs Inspector¶
| Capability | Macie | Inspector |
|---|---|---|
| Data discovery | Yes | No |
| Vulnerability scanning | No | Yes |
| S3 sensitivity | Yes | No |
Macie vs S3 Inventory¶
MASSIVE EXAM TRAP
| Capability | Macie | S3 Inventory |
|---|---|---|
| Object metadata | Partial | Yes |
| Sensitive content | Yes | No |
| ML classification | Yes | No |
Rule:
Need sensitive data → Macie
Need object inventory → S3 Inventory
Event-Driven Response Pattern¶
flowchart LR
Macie
Macie --> EventBridge
EventBridge --> Lambda
Lambda --> S3
Examples:
- block public access
- apply encryption
- notify security teams
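The "block public access" example could look roughly like the Lambda handler below. It is a sketch under assumptions: it reacts only to the `Policy:IAMUser/S3BucketPublic` finding type, it reads the bucket name from `detail.resourcesAffected.s3Bucket.name`, and the optional `s3_client` parameter is a hypothetical addition so the function can be exercised without AWS access.

```python
PUBLIC_BUCKET_TYPE = "Policy:IAMUser/S3BucketPublic"


def handler(event, context, s3_client=None):
    """Block public access on the bucket named in a Macie policy finding."""
    detail = event.get("detail", {})
    if detail.get("type") != PUBLIC_BUCKET_TYPE:
        return {"action": "ignored"}

    bucket = detail["resourcesAffected"]["s3Bucket"]["name"]

    if s3_client is None:
        import boto3  # imported lazily so the module loads without boto3
        s3_client = boto3.client("s3")

    # Turn on all four S3 Block Public Access settings for the bucket.
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    return {"action": "blocked_public_access", "bucket": bucket}
```

In practice, automatic remediation is usually limited to buckets that are not meant to be public, for example by checking a tag or an allow list before making the change.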
Regional Behavior¶
HIGH VALUE
Macie is regional.
Implications:
- enable per Region
- findings remain regional
Exam trap:
Macie does not automatically scan globally.
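Because Macie is enabled per Region, rollout scripts typically loop over the Regions in scope. The sketch below assumes a caller-supplied `client_factory` (a hypothetical parameter) that returns a `macie2` client for a Region, which keeps the loop testable without AWS access. Error handling, such as the case where Macie is already enabled, is omitted.

```python
def enable_in_regions(regions, client_factory):
    """Enable Macie in each Region and return a per-Region status map."""
    results = {}
    for region in regions:
        client = client_factory(region)
        client.enable_macie(
            status="ENABLED",
            findingPublishingFrequency="FIFTEEN_MINUTES",
        )
        results[region] = "ENABLED"
    return results


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# enable_in_regions(
#     ["us-east-1", "eu-west-1"],
#     lambda region: boto3.client("macie2", region_name=region),
# )
```
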
Data Residency¶
Data processing remains in Region.
Benefits:
- compliance
- governance
Cost Model¶
Primary drivers:
- number of buckets monitored
- volume of data inspected by discovery jobs
- scope of automated discovery
Optimization:
- targeted jobs
- scheduled discovery
- scope reduction
Exam trap:
Pricing follows analysis volume.
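A back-of-the-envelope estimate shows why scope reduction matters. The sketch below assumes a simplified model with two drivers, buckets monitored and gigabytes inspected. The rates are placeholders, not AWS prices; check the current Macie pricing page for real figures.

```python
def estimate_monthly_cost(buckets, gb_inspected, rate_per_bucket, rate_per_gb):
    """Estimate a monthly cost from two drivers under a simplified model.

    All rates are caller-supplied placeholders, not published AWS prices.
    """
    return buckets * rate_per_bucket + gb_inspected * rate_per_gb


if __name__ == "__main__":
    # Placeholder rates chosen only to make the arithmetic easy to follow.
    full_scan = estimate_monthly_cost(50, 2000, rate_per_bucket=0.10, rate_per_gb=1.00)
    targeted = estimate_monthly_cost(50, 200, rate_per_bucket=0.10, rate_per_gb=1.00)
    print(f"full scan: {full_scan:.2f}, targeted: {targeted:.2f}")
```

Under these placeholder rates, inspecting a tenth of the data cuts the inspection portion of the estimate by the same factor, while the per-bucket portion stays fixed.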
Comparisons¶
| Service | Primary Role |
|---|---|
| Macie | Sensitive data discovery |
| GuardDuty | Threat detection |
| Security Hub | Findings aggregation |
| Inspector | Vulnerability scanning |
| Config | Compliance |
| S3 Inventory | Inventory |
Common Exam Traps¶
- Macie only analyzes Amazon S3.
- Macie is not GuardDuty.
- Macie is not a SIEM.
- Macie is regional.
- Macie uses ML and identifiers.
- Findings integrate with Security Hub.
- EventBridge automates remediation.
- Automated Discovery uses sampling.
- Macie does not inspect network traffic.
- Policy findings differ from sensitive findings.
- Custom identifiers use regex.
- Macie does not replace DLP platforms.
- Delegated admin is preferred.
- Macie analyzes content and exposure.
- Pricing depends on analyzed data volume.
5-Second Recall¶
- Macie = S3 sensitive data discovery
- ML + identifiers
- S3 only
- Regional
- Generates findings
- Security Hub aggregates
- EventBridge automates
- GuardDuty ≠ Macie
Quick Revision Notes¶
- Discover sensitive data in S3
- Evaluate exposure risk
- Centralize using delegated admin
- Automate with EventBridge
- Use managed + custom identifiers
- Automated Discovery uses sampling
- Findings feed Security Hub
- Enable per Region
- Macie is not threat detection
- Visibility before protection