1. AWS Compute Overview

Professional-level design questions expect you to remember that “compute” isn’t just EC2 boxes. AWS provides a toolkit and expects architects to combine pieces that best fit the workload’s shape.

The remaining sections follow a logical progression: start with regional entry points, cover cost levers, tighten up networking, then finish with scaling and inline security.


2. Regional and Global Architecture

High-level designs usually start at the regional entry point and then trace how traffic fans out globally; the diagram below marks the key checkpoints along that path.

Simplified AWS Architecture
🌍 Regional entry → global delivery path

3. EC2 Purchase & Savings Strategies

After the region/edge layout is clear, cost strategy becomes the next focus. Picking the right commercial model matters just as much as sizing instances, so the tables below act as a quick reference for when each option shines.

3.1 Standard Purchase Options

| Model | Billing Traits | Best For | Key Watchouts |
|---|---|---|---|
| On-Demand | Per-second billing on shared hardware, no commitments. | Default choice for short-lived, unpredictable, or interruption-intolerant tasks. | Most expensive rate; capacity not reserved. |
| Spot | Per-second billing, deep discounts (up to ~90%) on unused capacity. | Stateless, batch, or flexible jobs that can handle sudden interruptions. | Instances can disappear with little warning; design retry logic. |
| Reserved Instances | Commit to 1 or 3 years for reduced hourly rates; zonal RIs also reserve capacity. | Always-on workloads with predictable instance families/regions. | Locked to attributes (region, tenancy, family). Unused reservations still bill. |
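
The Spot row’s “design retry logic” usually starts with watching for the two-minute interruption notice. A minimal sketch, assuming IMDSv1 is enabled (IMDSv2 would need a session token first); `checkpoint_and_drain` is a hypothetical cleanup hook, not an AWS API:

```python
import time
import urllib.request

# Instance metadata path that returns 200 with reclaim details roughly
# two minutes before a Spot interruption, and 404 until then.
SPOT_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def interruption_pending() -> bool:
    """True once AWS has scheduled this Spot instance for reclaim."""
    try:
        with urllib.request.urlopen(SPOT_ACTION_URL, timeout=1):
            return True
    except OSError:        # covers the 404 (HTTPError), timeouts, and refusals
        return False

while not interruption_pending():
    time.sleep(5)          # poll every few seconds

checkpoint_and_drain()     # hypothetical: save state, stop accepting work
```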

3.2 Dedicated Capacity Choices

| Option | What You Get | Ideal Use | Notes |
|---|---|---|---|
| Dedicated Host | Entire physical server under your control; BYOL-friendly. | Socket/core-licensed software, strict compliance, pinning instances. | You manage host capacity; pair with Host Affinity for placement guarantees. |
| Dedicated Instance | Instance runs on hardware isolated to your account, but AWS manages hosts. | Isolation requirements without host-level management. | No host visibility; pay instance fees with dedicated tenancy surcharge. |
Dedicated Host/Instances
🧩 EC2 tenancy isolation options

Quick note to future me: a host is a single physical server that runs the VMs; a rack holds many hosts, and placement groups control how instances land across those racks.
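
For context, claiming a host is a single boto3 call. A minimal sketch, assuming the `ec2:AllocateHosts` permission; the AZ and instance type are placeholder choices:

```python
import boto3

ec2 = boto3.client("ec2")

# Allocate one physical server in a chosen AZ (values are placeholders).
resp = ec2.allocate_hosts(
    AvailabilityZone="us-east-1a",
    InstanceType="m5.large",
    Quantity=1,
    AutoPlacement="off",  # "off": only launches that target this HostId land here
)
print("Dedicated Host:", resp["HostIds"][0])
```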


3.3 Reserved Instance Flavors


3.4 Savings Plans


3.5 Taxi Analogy for Charging Models

| Plan | Plain-English Explanation | Taxi Analogy 🚕 | Commitment | Capacity Guaranteed? | Discount? | When to Use |
|---|---|---|---|---|---|---|
| On-Demand | Pay only while the instance runs. | Hail a taxi whenever needed. | None | ❌ | ❌ | Tests, spikes, unpredictable workloads. |
| Savings Plan | Promise to spend a set $/hr for a discount; usage stays flexible. | Retainer for any taxi ride at a lower price. | 1–3 yrs | ❌ | ✅ | Continuous usage that may change families/regions. |
| Reserved Instance | Lock specific attributes for a big discount. | Lease the same taxi full-time. | 1–3 yrs | ❌ (✅ for zonal) | ✅ | Steady 24×7 fleets. |
| Scheduled RI | Recurring reserved blocks. | Pre-book the same taxi for rush hour. | 1 yr | ✅ (during window) | ✅ | Predictable periodic jobs. |
| Capacity Reservation | Keep capacity idle but ready. | Pay driver to wait while you shop. | Open-ended | ✅ | ❌ (unless paired with Savings Plans) | Mission-critical or DR kickoffs. |
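
A quick break-even check makes the table concrete. The hourly rates below are hypothetical placeholders, not real AWS prices:

```python
# Back-of-envelope yearly cost: On-Demand vs. a 1-yr Reserved Instance.
HOURS_PER_YEAR = 24 * 365

on_demand_rate = 0.10   # $/hr -- hypothetical, check the real rate card
ri_rate = 0.06          # $/hr -- hypothetical discounted rate
utilization = 0.40      # instance actually runs 40% of the year

on_demand_cost = on_demand_rate * HOURS_PER_YEAR * utilization
ri_cost = ri_rate * HOURS_PER_YEAR   # RIs bill whether you run or not

print(f"On-Demand: ${on_demand_cost:,.0f}/yr, RI: ${ri_cost:,.0f}/yr")
# Break-even utilization = ri_rate / on_demand_rate = 60% here; below
# that, the "hail a taxi" model wins.
```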

3.6 Dedicated Options Recap

| Option | Analogy | Discount Potential | Why Choose It |
|---|---|---|---|
| Dedicated Instance | Private taxi, company-owned. | ❌ | Compliance isolation without hardware management. |
| Dedicated Host | Lease the whole car. | ✅ (with RIs/SP) | BYOL licensing, socket/core tracking, placement control. |

4. EC2 Networking & Image Strategy

Cost planning is pointless if instances can’t talk or boot quickly, so the next pass is all about ENIs and image hygiene.

4.1 Elastic Network Interfaces (ENIs)


4.2 Bootstrapping vs. AMI Baking

Application provisioning typically follows three layers:

  1. Base OS + Dependencies – slowest layer, rarely changes.
  2. Application Binaries – updated occasionally.
  3. Runtime Configuration – fast, environment-specific tweaks.

Strategies:

  1. Bootstrapping – run user-data scripts at launch; maximum flexibility, but every boot pays the install cost.
  2. AMI baking – pre-install the slow layers into a golden AMI; fastest boot, but changes require a rebuild.

Optimal builds usually blend baked AMIs for heavy dependencies with minimal bootstrapping for secrets, region settings, or last-minute patches.
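
As a sketch of that blend, here is a boto3 launch where the AMI carries the heavy layers and user data only applies last-mile config. The AMI ID, script contents, and service name are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Last-mile bootstrap only: binaries and dependencies are baked into the AMI.
user_data = """#!/bin/bash
echo "ENV=prod" >> /etc/myapp/env   # placeholder runtime tweak
systemctl start myapp               # app already installed in the AMI
"""

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder: your golden AMI
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    UserData=user_data,
)
```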


5. Elastic Load Balancing

With networking squared away, the next layer is the load balancers that shape every request before it hits compute.

5.1 Architecture & Traffic Flow

ELB Architecture
⚖️ ELB request flow (public + internal tiers)

5.2 Cross-Zone Load Balancing

ELB Architecture - Cross Zone
🔁 Cross-zone load balancing fan-out
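
One operational detail worth pinning down: ALBs ship with cross-zone enabled by default, while NLBs default to off. A minimal sketch of flipping it for an NLB via a load balancer attribute (the ARN is a placeholder):

```python
import boto3

elbv2 = boto3.client("elbv2")

# Enable cross-zone fan-out on an NLB (on by default for ALBs).
elbv2.modify_load_balancer_attributes(
    LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/net/my-nlb/0123456789abcdef",  # placeholder
    Attributes=[{"Key": "load_balancing.cross_zone.enabled", "Value": "true"}],
)
```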

5.3 User Session State


5.4 Load Balancer Generations

| Type | Protocols | Highlights | Limitations |
|---|---|---|---|
| Classic LB (CLB) | HTTP/HTTPS/TCP | Legacy option, basic Layer 4/7 support. | No SNI, only one SSL cert, minimal app awareness. |
| Application LB (ALB) | HTTP/HTTPS/WebSocket | True Layer 7: content-based routing, multiple certs, WebSockets. | No TCP/UDP listeners; TLS terminates at ALB. |
| Network LB (NLB) | TCP/TLS/UDP | Ultra-low latency, static IPs, preserves end-to-end encryption. | No HTTP header visibility or cookies. |
| Gateway LB (GWLB) | GENEVE encapsulated L3/L4 | Auto-scales third-party appliances inline. | Requires appliance support for GENEVE. |
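
To make the ALB’s “content-based routing” concrete, a minimal sketch adding a path-based listener rule with boto3; both ARNs are placeholders:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Layer 7 routing: requests matching /api/* go to a dedicated target group.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:listener/app/my-alb/abc/def",  # placeholder
    Priority=10,  # lower number = evaluated first
    Conditions=[{"Field": "path-pattern", "Values": ["/api/*"]}],
    Actions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/api/123",  # placeholder
    }],
)
```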

5.5 Session Stickiness


5.6 Connection Draining & Deregistration Delay


5.7 Client IP Preservation


6. Auto Scaling Groups (ASG)

All that front-door engineering falls flat without an automated fleet, so this section collects focused ASG reminders.

6.1 Core Concepts

ASG Sequence
⚙️ ASG lifecycle hook timeline

6.2 Scaling Policies

Target tracking behaves like a thermostat-style control loop. A rough sketch of the idea (the helper names below stand in for the ASG/CloudWatch plumbing AWS runs for you; they are not real APIs):

```python
import time

TARGET_CPU = 50   # setpoint: aim for 50% average CPU across the group
DEADBAND = 10     # tolerance band so the loop doesn't flap

while True:
    # get_average_cpu / scale_out / scale_in and the instance counters
    # are placeholders for what the managed policy does behind the scenes.
    avg = get_average_cpu(asg_instances)
    error = avg - TARGET_CPU

    if error > DEADBAND and desired_instances < max_instances:
        scale_out(by=1)    # too hot: add an instance
        time.sleep(120)    # cooldown before acting again
    elif error < -DEADBAND and desired_instances > min_instances:
        scale_in(by=1)     # too cold: remove an instance
        time.sleep(120)

    time.sleep(60)         # evaluate the metric once per minute
```

| Policy | Behavior | Notes |
|---|---|---|
| Simple | One metric threshold, one action. | “If CPU > 70%, add 1 instance.” |
| Step | Multiple bands with different step sizes. | “>90% add 3, >70% add 2…” |
| Target Tracking | Feedback loop aims for target metric. | Works like cruise control; AWS manages math. |
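
The pseudocode above compresses what target tracking does; in practice it is a single API call. A minimal sketch, assuming an existing ASG named `my-asg`:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: AWS creates the CloudWatch alarms and runs the
# control-loop math; you only pick the metric and the setpoint.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",   # placeholder ASG name
    PolicyName="keep-cpu-near-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,
    },
)
```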

6.3 Health Checks & Hooks

6.4 ASGs with Load Balancers


7. EC2 Placement Groups

When latency or blast-radius guarantees show up, placement groups keep the topology honest.

| Strategy | Layout | Best For | Constraints |
|---|---|---|---|
| Cluster | Packs instances close together in one AZ. | HPC, tightly coupled apps needing 10 Gbps+ per flow. | No cross-AZ spreading; low resilience. Launch similar instances simultaneously for best results. |
| Spread | Places up to 7 instances per AZ on distinct racks with separate power/network. | Small fleets needing maximum isolation (critical services, HA pairs). | Not available for Dedicated Hosts/Instances; limit 7 per AZ. |
| Partition | Segments racks into partitions; instances within a partition share hardware, but partitions are isolated. | Large-scale distributed systems (Hadoop, Cassandra, Kafka). | Up to 7 partitions per AZ; you assign instances to partitions for failure-domain control. |
EC2 Placement Group
🧱 Placement group rack layout
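
A minimal sketch of the partition strategy with boto3; the group name, partition count, AMI, and instance type are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Carve the fleet into 3 isolated rack partitions (max 7 per AZ).
ec2.create_placement_group(
    GroupName="kafka-brokers",
    Strategy="partition",
    PartitionCount=3,
)

# Pin an instance to a specific partition for failure-domain control.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder
    InstanceType="m5.large",
    MinCount=1,
    MaxCount=1,
    Placement={"GroupName": "kafka-brokers", "PartitionNumber": 0},
)
```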

8. Gateway Load Balancer (GWLB)

The final piece is inline security. GWLB keeps third-party appliances elastic so next hops are not hard-coded everywhere.

Traditional inspection tiers often require manually chaining firewalls or IDS appliances, creating scaling pain. GWLB changes that by sitting inline as a transparent Layer 3/4 hop: it distributes flows across a fleet of appliances over GENEVE, health-checks them, and grows or shrinks the fleet automatically.

GWLB Architecture
🛡️ GWLB inline inspection flow

Use cases:

  1. Centralized security VPC – route spoke VPC traffic via GWLB endpoints for inspection before reaching workloads (see the route sketch after this list).
  2. Inline egress filtering – send outbound flows through IDS/IPS without hard-wiring next hops.
  3. Scalable appliances – autoscale firewall AMIs with GWLB controlling attachment and health.
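
For use case 1, the steering itself is just a route table entry whose target is the GWLB endpoint. A minimal sketch with placeholder IDs:

```python
import boto3

ec2 = boto3.client("ec2")

# Send the spoke VPC's default route through the inspection VPC by
# targeting the Gateway Load Balancer endpoint.
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",    # placeholder spoke route table
    DestinationCidrBlock="0.0.0.0/0",
    VpcEndpointId="vpce-0123456789abcdef0",  # placeholder GWLB endpoint
)
```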

9. Operations & Exam Notes

