
OpenSearch Hot-Warm-Cold Cost Trade-offs | SAP-C02

Jeff Taakey, 21+ Year Enterprise Architect | Multi-Cloud Architect & Strategist.

While preparing for the AWS SAP-C02, many candidates get confused by OpenSearch Service tiered storage architectures. In the real world, this is fundamentally a decision about storage performance tiers vs. total cost of ownership (TCO). The exam wants you to recognize when to use hot (data nodes), warm (UltraWarm), and cold storage—but more importantly, why each tier exists and how S3 lifecycle policies integrate with compliance requirements. Let’s drill into a simulated scenario.

The Scenario

GlobalStream Analytics, a digital media intelligence company, operates a log analysis platform using Amazon OpenSearch Service. Currently, their architecture includes:

  • A 10-node data cluster (hot storage) running 24/7
  • Data ingestion from an S3 bucket using S3 Standard storage class
  • 1-month retention window for active, read-only analysis in OpenSearch
  • Automated deletion of indexes from the cluster after 30 days
  • Compliance mandate: All raw input data must be retained indefinitely for audit purposes

The CFO has flagged OpenSearch as the third-highest AWS spend in the organization and asked the Solutions Architect to propose a cost-optimized architecture that does not compromise compliance or query performance during the active 30-day analysis window.

Key Requirements

Minimize monthly TCO while:

  • Maintaining query performance for the active 30-day analysis period
  • Ensuring indefinite retention of raw data in S3 for compliance
  • Reducing OpenSearch cluster operational costs
  • Avoiding unnecessary complexity

The Options

  • A) Replace all 10 data nodes with UltraWarm nodes scaled for expected capacity. Transition input data from S3 Standard to S3 Glacier Deep Archive immediately upon cluster ingestion.

  • B) Reduce data nodes to 2. Add UltraWarm nodes for expected capacity. Configure indexes to automatically migrate to UltraWarm upon ingestion. Use S3 lifecycle policy to transition input data to S3 Glacier Deep Archive after 30 days.

  • C) Reduce data nodes to 2. Add UltraWarm nodes for expected capacity. Configure indexes to migrate to UltraWarm upon ingestion. Add cold storage nodes. Transition indexes from UltraWarm to cold storage. Use S3 lifecycle policy to delete input data after 30 days.

  • D) Reduce data nodes to 2. Add instance-store-backed data nodes for expected capacity. Transition input data from S3 Standard to S3 Glacier Deep Archive immediately upon cluster ingestion.

Correct Answer

Option B: Reduce to 2 data nodes, add UltraWarm nodes, auto-migrate indexes to UltraWarm on ingestion, and transition S3 data to Glacier Deep Archive after 30 days.
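As a sketch of what the winning answer looks like in practice, the 30-day transition maps onto a standard S3 lifecycle rule. The rule ID and the `raw/` prefix below are hypothetical placeholders; the essential parts are the 30-day delay and the `DEEP_ARCHIVE` storage class, with no expiration action so objects are retained indefinitely:

```json
{
  "Rules": [
    {
      "ID": "raw-input-to-deep-archive",
      "Status": "Enabled",
      "Filter": { "Prefix": "raw/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "DEEP_ARCHIVE" }
      ]
    }
  ]
}
```

A configuration like this can be applied with `aws s3api put-bucket-lifecycle-configuration`.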


Step-by-Step Winning Logic

This solution represents the optimal trade-off across four dimensions:

  1. Cost Optimization:

    • Reduces expensive hot-tier data nodes from 10 to 2 (80% reduction)
    • Leverages UltraWarm (S3-backed storage with caching) for the read-only workload—designed specifically for this use case
    • Implements an S3 lifecycle policy to move compliance data to Glacier Deep Archive ($0.00099/GB/month vs. $0.023/GB/month for Standard)
  2. Performance Preservation:

    • 2 hot nodes handle active ingestion and indexing
    • UltraWarm provides acceptable query performance for read-only analytics (slightly higher latency than hot, but sufficient for batch analysis)
  3. Compliance Adherence:

    • Raw data remains in S3 indefinitely (transitioned to Glacier Deep Archive, not deleted)
    • 30-day delay before archival ensures data availability during active analysis period
  4. Operational Simplicity:

    • Automated index migration to UltraWarm (native OpenSearch feature)
    • Standard S3 lifecycle policy (no custom Lambda orchestration)
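The per-GB storage math above is worth checking for yourself. A minimal sketch, using the list prices cited in this article (actual prices vary by region) and the scenario's 5 TB dataset:

```python
# Sanity check of the per-GB storage math cited above. Prices are the
# article's figures, not a live AWS price lookup.
STANDARD_PER_GB = 0.023        # S3 Standard, USD per GB-month
DEEP_ARCHIVE_PER_GB = 0.00099  # S3 Glacier Deep Archive, USD per GB-month

def monthly_storage_cost(size_gb: float, per_gb: float) -> float:
    """Flat per-GB monthly storage cost, ignoring request/retrieval fees."""
    return size_gb * per_gb

size_gb = 5 * 1024  # 5 TB of retained raw logs
standard = monthly_storage_cost(size_gb, STANDARD_PER_GB)
deep_archive = monthly_storage_cost(size_gb, DEEP_ARCHIVE_PER_GB)

print(f"S3 Standard:    ${standard:,.2f}/month")      # ~$117.76
print(f"Deep Archive:   ${deep_archive:,.2f}/month")  # ~$5.07
print(f"Savings factor: {standard / deep_archive:.0f}x")
```

The roughly 23x gap between the two tiers is why the lifecycle transition dominates the long-term compliance cost.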

💎 The Architect’s Deep Dive: Why Options Fail

The Traps (Distractor Analysis)

Why not Option A?

  • Fatal Flaw #1: UltraWarm nodes cannot ingest data—they only store pre-existing indexes migrated from hot nodes. This solution would fail immediately.
  • Fatal Flaw #2: Transitioning data to Glacier Deep Archive at ingestion time violates the requirement for “1-month active analysis”—Glacier Deep Archive has 12-hour retrieval times and is incompatible with OpenSearch query patterns.
  • Cost Reality: Would require keeping all 10 hot nodes anyway (defeating the purpose).

Why not Option C?

  • Compliance Violation: The S3 lifecycle policy deletes input data after 30 days, directly contradicting the “retain all input data indefinitely” requirement.
  • Over-Engineering: Cold storage is appropriate for rarely accessed data (e.g., regulatory archives accessed once per year). This scenario has active 30-day analysis—cold storage adds unnecessary complexity and retrieval delays.
  • Hidden Cost: Cold storage itself is cheap, but repeated queries during the 30-day window would incur significant retrieval fees.

Why not Option D?

  • Misunderstanding of Instance Store: Instance-store-backed nodes provide ephemeral, high-IOPS storage. They don't reduce costs (the larger instance types required often increase them), and ephemeral disks add data-durability risk.
  • Same Glacier Issue as Option A: Immediate transition to Glacier Deep Archive breaks the 30-day query requirement.
  • No Warm Tier Utilization: Ignores the purpose-built UltraWarm feature designed for this exact workload pattern.


The Architect Blueprint

```mermaid
graph TD
    S3[S3 Bucket - Standard Storage<br/>Raw Input Data] -->|Ingest| Hot[2 Hot Data Nodes<br/>Active Indexing]
    Hot -->|Auto-migrate after indexing| UW[UltraWarm Nodes<br/>30-Day Read-Only Analysis]
    UW -->|After 30 days| Delete[Delete Index from Cluster]
    S3 -->|S3 Lifecycle Policy<br/>After 30 days| Glacier[S3 Glacier Deep Archive<br/>Indefinite Compliance Storage]
    style Hot fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style UW fill:#4ecdc4,stroke:#0a9396,color:#fff
    style Glacier fill:#a8dadc,stroke:#457b9d,color:#000
    style S3 fill:#f1faee,stroke:#457b9d,color:#000
```


Diagram Note: Data flows from S3 Standard into hot nodes for indexing, immediately migrates to UltraWarm for the 30-day analysis window, then gets deleted from OpenSearch while the S3 source transitions to Glacier Deep Archive for compliance.

The Decision Matrix

| Option | Est. Complexity | Est. Monthly Cost | Pros | Cons |
|---|---|---|---|---|
| A | Low | $12,000+ | Simple conceptually | UltraWarm cannot ingest data; Glacier transition breaks queries; no cost savings |
| B ✅ | Medium | $2,500 | 80% OpenSearch cost reduction; performance maintained; compliance met; native AWS features | Requires understanding of tiered architecture |
| C | High | $3,200 | Technically functional for queries | Deletes S3 data (compliance violation); over-engineered with cold storage; higher ops burden |
| D | Low | $8,500 | Reduces node count | Instance store doesn't reduce cost; Glacier transition breaks queries; ignores UltraWarm |

Cost Breakdown (Estimated for 5TB dataset):

  • Current State (10 hot nodes): ~$6,000/month (OpenSearch) + $115/month (S3 Standard) = $6,115/month
  • Option B:
    • 2 hot nodes: ~$1,200/month
    • UltraWarm (5TB): ~$1,150/month
    • S3 Standard (30 days): ~$115/month
    • S3 Glacier Deep Archive (long-term): ~$5/month (accumulates over time)
    • Total: ~$2,470/month (60% reduction)
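The breakdown above reduces to simple arithmetic. A quick recomputation, using the article's rough monthly figures (not quoted AWS prices):

```python
# Recompute the Option B estimate from the article's rough monthly figures.
current = 6_000 + 115  # 10 hot nodes + S3 Standard

option_b = {
    "2 hot nodes": 1_200,
    "UltraWarm (5TB)": 1_150,
    "S3 Standard (first 30 days)": 115,
    "Glacier Deep Archive": 5,
}
total_b = sum(option_b.values())
reduction = (current - total_b) / current

print(f"Option B total: ${total_b:,}/month")  # $2,470/month
print(f"Reduction:      {reduction:.0%}")     # 60%
```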


Real-World Practitioner Insight

Exam Rule

For the SAP-C02 exam, remember this decision tree:

  • Hot nodes (data nodes): Active writes + sub-second queries
  • UltraWarm: Read-heavy workloads, acceptable latency (seconds), significantly lower cost
  • Cold storage: Rarely accessed archives (regulatory, annual reviews)
  • S3 Lifecycle to Glacier: Compliance archives with no query requirement

When you see “read-only analysis” + “cost optimization” + “compliance retention”, the answer almost always involves UltraWarm + S3 tiering.
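One way to internalize the decision tree is as a tiny lookup function. This is purely illustrative (not an AWS API), with assumed latency and frequency buckets:

```python
def pick_tier(active_writes: bool, query_latency: str, query_frequency: str) -> str:
    """Illustrative mapping of the exam decision tree to a storage tier.

    query_latency:   "subsecond" | "seconds" | "hours"
    query_frequency: "constant" | "frequent" | "rare" | "none"
    """
    if active_writes:
        return "hot (data nodes)"          # active writes always need hot
    if query_frequency == "none":
        return "S3 lifecycle to Glacier"   # compliance-only, never queried
    if query_frequency == "rare":
        return "cold storage"              # e.g. annual regulatory reviews
    if query_latency == "subsecond":
        return "hot (data nodes)"          # latency SLA forces hot tier
    return "UltraWarm"                     # read-heavy, seconds-level latency

# This scenario: read-only analysis, frequent queries, seconds-level latency OK.
print(pick_tier(False, "seconds", "frequent"))  # UltraWarm
```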

Real World

In production, I’d also consider:

  1. Index State Management (ISM) Policies: Automate the hot → warm → delete transitions with granular control (e.g., move to UltraWarm after 7 days if query frequency drops).

  2. S3 Intelligent-Tiering: For the first 30 days, use S3 Intelligent-Tiering instead of Standard—it automatically moves infrequently accessed data to cheaper tiers without lifecycle policy delays.

  3. Query Pattern Analysis: If 80% of queries target the last 7 days, keep only the last 7 days in hot nodes, and move 8-30 day data to UltraWarm earlier.

  4. Snapshot-Based Alternative: For some workloads, automated snapshots to S3 + on-demand cluster restoration can be cheaper than UltraWarm if queries are truly infrequent (e.g., ad-hoc investigations). This wasn’t an option here, but it’s worth evaluating in real projects.

  5. Reserved Instance Pricing: The cost estimates above assume On-Demand pricing. A 1-year RI commitment on the 2 hot nodes would reduce costs by another 30-40%.
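Point 1 above can be sketched as an Index State Management policy document. The state names, the 7-day warm threshold, and the 30-day delete threshold below are illustrative assumptions, not values mandated by the scenario:

```json
{
  "policy": {
    "description": "Hot for 7 days, UltraWarm until day 30, then delete",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          { "state_name": "warm", "conditions": { "min_index_age": "7d" } }
        ]
      },
      {
        "name": "warm",
        "actions": [ { "warm_migration": {} } ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "30d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ]
      }
    ]
  }
}
```

Deletion here only removes the index from the cluster; the raw inputs remain in S3, where the lifecycle policy handles long-term retention.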

The exam simplifies to test your architectural pattern recognition—but real-world decisions require measuring actual query latency SLAs and access patterns over 30-60 days before committing to a tier.
