
File Gateway vs EFS for Low Latency | SAA-C03

Jeff Taakey
21+ Year Enterprise Architect | Multi-Cloud Architect & Strategist.

While preparing for the AWS SAA-C03, many candidates get confused by hybrid storage migration strategies. In the real world, this is fundamentally a decision about latency requirements vs. storage economics. Let’s drill into a simulated scenario.

The Scenario

GlobalMedia Productions operates an on-premises Windows SMB file server that stores large video editing project files. Their creative teams heavily access newly created assets—approximately 85% of file operations occur within the first 7 days after creation. After this initial period, files are rarely accessed but must be retained for compliance and occasional reference for up to 7 years.

The company’s storage infrastructure is reaching 92% capacity, and procurement cycles for additional SAN hardware take 6-8 weeks. The IT Director needs a solution that:

  • Immediately increases available storage capacity
  • Preserves sub-50ms latency for recently created files
  • Reduces long-term storage costs through automated tiering
  • Requires minimal changes to existing user workflows (users currently access files via SMB shares)

Key Requirements

Extend storage capacity immediately while maintaining low-latency access to recently created files, and implement automated lifecycle management to control future storage growth and costs.

The Options

  • A) Use AWS DataSync to replicate files older than 7 days from the SMB file server to AWS.
  • B) Deploy an Amazon S3 File Gateway to extend storage capacity, and configure S3 lifecycle policies to transition objects to S3 Glacier Deep Archive after 7 days.
  • C) Create an Amazon FSx for Windows File Server file system to extend the company’s storage capacity.
  • D) Install Amazon S3 client utilities on each user workstation to access S3 directly, create S3 lifecycle policies to transition data to S3 Glacier Flexible Retrieval after 7 days.

Correct Answer

Option B: Deploy an Amazon S3 File Gateway with S3 lifecycle policies.

The Architect’s Analysis


Step-by-Step Winning Logic

S3 File Gateway provides the optimal latency-cost trade-off for this hybrid storage scenario:

1. Transparent SMB Integration

  • Users continue accessing files via familiar SMB protocol (no workflow changes)
  • Gateway presents S3 storage as a standard Windows file share
  • No client-side software installation or training required

2. Intelligent Caching Architecture

  • Gateway appliance caches frequently accessed files locally (your “hot” 7-day window)
  • Sub-50ms latency for cached data (meets the low-latency requirement)
  • Asynchronous upload to S3 in the background

3. Automated Lifecycle Economics

  • S3 lifecycle policies automatically transition aging data through storage tiers
  • Days 1-7: S3 Standard (~$23/TB/month) with local cache
  • Day 8+: S3 Glacier Deep Archive (~$1/TB/month) = 95% cost reduction
  • No manual intervention required—meets “avoid future storage problems” requirement

4. Immediate Capacity Expansion

  • Gateway deploys as a VM or hardware appliance in days, not weeks
  • Effectively unlimited S3 backend storage (no more capacity planning cycles)
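The lifecycle half of this design is a single S3 rule. A minimal sketch of the payload shape boto3's `put_bucket_lifecycle_configuration` accepts, assuming a placeholder bucket and rule name:

```python
# Sketch: the 7-day hot-window rule from the logic above, expressed as the
# boto3 payload for put_bucket_lifecycle_configuration.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-after-hot-window",  # hypothetical rule name
            "Filter": {"Prefix": ""},          # apply to every object in the bucket
            "Status": "Enabled",
            "Transitions": [
                # After the 7-day hot window, move objects straight to
                # Glacier Deep Archive (~$1/TB/month).
                {"Days": 7, "StorageClass": "DEEP_ARCHIVE"}
            ],
        }
    ]
}

# Applying it would look like this (requires boto3 and credentials):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="globalmedia-assets",  # placeholder bucket name
#     LifecycleConfiguration=lifecycle_config,
# )
print(lifecycle_config["Rules"][0]["Transitions"])
```

The gateway itself needs no lifecycle awareness; it reads and writes S3 Standard objects, and S3 tiers them behind the scenes.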

The Traps (Distractor Analysis)

Why not Option A (AWS DataSync)?

  • DataSync is a one-way migration/sync tool, not a transparent storage extension
  • Does NOT provide SMB access to data after migration to AWS
  • Users would lose access to migrated files through existing workflows
  • Requires building an entirely new access pattern (violates minimal workflow change requirement)
  • Cost model: Adds DataSync transfer fees (~$0.0125/GB) without solving the access latency problem

Why not Option C (Amazon FSx for Windows File Server)?

  • FSx provides excellent SMB performance but at much higher cost (~$0.13-0.65/GB/month depending on throughput)
  • No automated lifecycle management to cheaper storage tiers
  • 100% of data remains in expensive high-performance storage
  • For a 100TB dataset: FSx = ~$13,000-65,000/month vs. File Gateway with lifecycle = ~$2,300/month initially, dropping to ~$100/month for archived data
  • Doesn’t address “avoid future storage problems”—just moves the cost burden to AWS

Why not Option D (S3 client tools on workstations)?

  • Catastrophic user experience change: completely abandons SMB workflow
  • Users must learn new tools and access patterns
  • Applications expecting SMB paths will break
  • S3 API calls have higher latency than SMB for small file operations
  • Glacier Flexible Retrieval takes minutes to hours to restore data (expedited: 1-5 minutes; standard: 3-5 hours), violating the low-latency requirement for any archived file access
  • No local caching mechanism

The Architect Blueprint

```mermaid
graph TD
    Users[Creative Teams<br/>SMB Clients] -->|SMB Protocol<br/>Sub-50ms for cached| Gateway[S3 File Gateway<br/>Local Cache: Hot Data 0-7 days]
    Gateway -->|Async Upload<br/>HTTPS| S3Standard[S3 Standard Bucket<br/>Days 0-7<br/>~$23/TB/month]
    S3Standard -->|Lifecycle Policy<br/>After 7 days| Glacier[S3 Glacier Deep Archive<br/>Days 8+<br/>~$1/TB/month]
    Gateway -.->|Cache Miss<br/>Retrieve from S3| S3Standard
    style Gateway fill:#FF9900,stroke:#232F3E,stroke-width:3px,color:#fff
    style S3Standard fill:#569A31,stroke:#232F3E,stroke-width:2px,color:#fff
    style Glacier fill:#5294CF,stroke:#232F3E,stroke-width:2px,color:#fff
    style Users fill:#232F3E,stroke:#FF9900,stroke-width:2px,color:#fff
```

Diagram Note: Users access S3 storage transparently via SMB protocol through the File Gateway, which caches hot data locally and automatically tiers cold data to Glacier Deep Archive, creating a cost-optimized hybrid storage architecture.

The Decision Matrix

| Option | Est. Complexity | Est. Monthly Cost (100 TB Dataset) | Pros | Cons |
|---|---|---|---|---|
| A: DataSync | Medium | ~$1,250 initial transfer + ~$2,300 S3 storage = ~$3,550 first month, then ~$2,300 | Simple one-way migration • Good for backup scenarios | Breaks SMB access • No transparent storage extension • Doesn't solve latency requirement |
| B: S3 File Gateway | Low-Medium | Months 1-2: ~$2,400 • Month 12: ~$350 (80% tiered to Glacier) • Long-term: ~$150 | Transparent SMB access • Local cache for low latency • Automated lifecycle = 95% cost reduction • Unlimited scalability | Initial gateway setup • Cache size planning needed • Retrieval latency for archived data |
| C: FSx for Windows | Low | $13,000-$65,000 (depending on throughput tier) | Native Windows experience • High performance • Active Directory integration | 10-40x more expensive • No automated tiering to cold storage • Doesn't address future growth economics |
| D: S3 Direct Access | High | Months 1-2: ~$2,300 • Month 12: ~$300 (with lifecycle) | Lowest storage cost potential • Direct cloud integration | Requires user retraining • Application compatibility issues • Glacier retrieval delays (minutes to hours) • No local caching |

Cost Calculation Notes:

  • S3 Standard: $0.023/GB/month = $23/TB
  • Glacier Deep Archive: $0.00099/GB/month = $1/TB
  • FSx Windows (50 MB/s): $0.13/GB/month = $130/TB
  • DataSync: $0.0125/GB one-time transfer
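A quick sanity check of those published rates against the 100 TB figures used throughout this article (prices vary by region, so treat these as order-of-magnitude estimates):

```python
# Back-of-the-envelope monthly costs for a 100 TB dataset at the rates above.
DATASET_TB = 100

s3_standard = 23.0 * DATASET_TB      # $23/TB/month (S3 Standard)
deep_archive = 0.99 * DATASET_TB     # $0.00099/GB ~= $0.99/TB/month
fsx_low = 130.0 * DATASET_TB         # $0.13/GB/month FSx throughput tier
datasync_once = 12.5 * DATASET_TB    # $0.0125/GB one-time transfer fee

print(f"S3 Standard:          ${s3_standard:,.0f}/month")
print(f"Glacier Deep Archive: ${deep_archive:,.0f}/month")
print(f"FSx (50 MB/s tier):   ${fsx_low:,.0f}/month")
print(f"DataSync (one-time):  ${datasync_once:,.0f}")
```

This is where the "95% cost reduction" claim comes from: $2,300/month in Standard versus roughly $99/month once everything has aged into Deep Archive.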

Real-World Practitioner Insight

Exam Rule

For the AWS SAA-C03 exam, when you see:

  • “SMB file server” + “low latency for recent files” + “lifecycle management” → Choose S3 File Gateway with lifecycle policies
  • “Transparent access” + “minimal user workflow changes” → Storage Gateway family (not DataSync or direct S3 access)
  • “7 days hot, then cold” → Perfect use case for S3 lifecycle transitions

Real World

In production environments, we’d enhance this architecture with:

1. Hybrid Cache Sizing

  • Calculate cache size based on actual working set: (Daily new data × 7 days) + 20% buffer
  • For GlobalMedia’s scenario: If they create 2TB/day, cache appliance needs ~17TB local storage
  • Consider multiple gateway appliances for high-availability
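The sizing formula above is simple enough to encode directly; a sketch using GlobalMedia's stated 2 TB/day workload:

```python
# Cache-sizing heuristic: (daily new data x hot-window days) + 20% buffer.
def gateway_cache_tb(daily_new_tb: float, hot_days: int = 7, buffer: float = 0.20) -> float:
    """Recommended local cache size in TB for an S3 File Gateway."""
    return daily_new_tb * hot_days * (1 + buffer)

# 2 TB/day of new assets over a 7-day hot window:
print(gateway_cache_tb(2.0))  # 16.8 TB, i.e. roughly 17 TB of local cache
```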

2. Multi-Tiered Lifecycle

  • Days 0-7: S3 Standard (hot cache)
  • Days 8-90: S3 Intelligent-Tiering (automatically optimizes)
  • Days 91-365: S3 Glacier Flexible Retrieval (3-5 hour retrieval acceptable)
  • Day 366+: S3 Glacier Deep Archive (12-hour retrieval for compliance-only access)
  • This approach saves an additional 15-25% vs. direct Standard→Deep Archive transition
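That multi-tier schedule maps onto a single lifecycle rule with chained transitions. A sketch using the S3 API's storage-class constants (the rule ID is a placeholder, and minimum-duration constraints per class should be verified against current S3 documentation):

```python
# Sketch: the multi-tier schedule above as one lifecycle rule with
# chained transitions, in the shape boto3 expects.
multi_tier_rule = {
    "ID": "tiered-archival",  # hypothetical rule name
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "Transitions": [
        {"Days": 8,   "StorageClass": "INTELLIGENT_TIERING"},  # days 8-90
        {"Days": 91,  "StorageClass": "GLACIER"},              # Flexible Retrieval, days 91-365
        {"Days": 366, "StorageClass": "DEEP_ARCHIVE"},         # compliance-only access
    ],
}
print([t["StorageClass"] for t in multi_tier_rule["Transitions"]])
```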

3. Bandwidth Considerations

  • File Gateway requires adequate bandwidth: Estimate (Daily change rate × 1.5) for comfortable asynchronous upload
  • For 2TB/day workload: Minimum 500 Mbps dedicated connection recommended
  • Consider AWS Direct Connect if on-premises link is constrained
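The 1.5x headroom guidance converts to a sustained line rate as follows (decimal units; a rough sketch, not a network design):

```python
# Sustained Mbps needed to upload the daily change rate with 1.5x headroom.
def required_mbps(daily_change_tb: float, headroom: float = 1.5) -> float:
    seconds_per_day = 86_400
    megabits = daily_change_tb * headroom * 8_000_000  # 1 TB = 8,000,000 Mb (decimal)
    return megabits / seconds_per_day

print(round(required_mbps(2.0)))  # ~278 Mbps sustained for a 2 TB/day workload
```

That is the average; the 500 Mbps recommendation in the text leaves room for intraday bursts and cache-miss retrievals sharing the same link.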

4. Monitoring & Alerting

  • CloudWatch metrics: CachePercentDirty (data not yet uploaded)
  • CacheHitPercent (efficiency of cache sizing)
  • Set alarms when cache hit rate drops below 85% (indicates undersized cache)
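The 85% cache-hit threshold can be wired up as a CloudWatch alarm. A sketch of the parameters in the shape `put_metric_alarm` takes (the gateway ID and alarm name are placeholders):

```python
# Sketch: alarm on the Storage Gateway CacheHitPercent metric dropping
# below 85%, shaped for cloudwatch.put_metric_alarm(**cache_alarm).
cache_alarm = {
    "AlarmName": "file-gateway-cache-hit-low",  # hypothetical alarm name
    "Namespace": "AWS/StorageGateway",
    "MetricName": "CacheHitPercent",
    "Dimensions": [{"Name": "GatewayId", "Value": "sgw-EXAMPLE"}],  # placeholder
    "Statistic": "Average",
    "Period": 300,            # 5-minute evaluation windows
    "EvaluationPeriods": 3,   # alarm after 15 minutes below threshold
    "Threshold": 85.0,        # per the guidance above
    "ComparisonOperator": "LessThanThreshold",
}

# With boto3 and credentials:
# boto3.client("cloudwatch").put_metric_alarm(**cache_alarm)
print(cache_alarm["MetricName"])
```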

5. Disaster Recovery Enhancement

  • S3 Cross-Region Replication for business-critical video assets
  • Versioning enabled to protect against accidental deletions
  • MFA Delete for compliance environments
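The versioning and MFA Delete settings are a one-call change per bucket. A sketch of the `put_bucket_versioning` payload (note MFA Delete can only be toggled by the root account presenting an MFA device; the bucket name and MFA string below are placeholders):

```python
# Sketch: enable versioning and MFA Delete on the backing bucket.
versioning_config = {"Status": "Enabled", "MFADelete": "Enabled"}

# With boto3 and root-account credentials:
# boto3.client("s3").put_bucket_versioning(
#     Bucket="globalmedia-assets",          # placeholder bucket name
#     VersioningConfiguration=versioning_config,
#     MFA="<mfa-device-serial> <token>",    # placeholder "serial code" string
# )
print(versioning_config)
```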

The Exam Simplification: The exam question intentionally omits these nuances to test your foundational understanding. In real projects, we’d also evaluate:

  • Existing MPLS/Direct Connect infrastructure costs
  • Whether the company needs multi-site access (favors FSx for Windows with Multi-AZ)
  • Actual retrieval SLAs for archived content (might influence Glacier tier selection)
  • Integration with existing backup solutions (Veeam, Commvault, etc.)

Accelerate Your Cloud Certification.

Stop memorizing exam dumps. Join our waitlist for logic-driven blueprints tailored to your specific certification path.