Numbers That Matter

These numbers are useful during system design interviews and architecture reviews. They are not exact guarantees; use them to sanity-check scale, latency, storage, and reliability before choosing deeper implementation details.


Powers of Two

Computer systems usually count storage, memory, partitions, hash spaces, and binary IDs in powers of two. The decimal approximation is often enough for quick capacity math.

The reason is binary. One bit has two possible states: 0 or 1. Every extra bit doubles the number of possible combinations because it can be added as either 0 or 1 in front of every existing pattern:

1 bit  = 2 combinations    = 0, 1
2 bits = 4 combinations    = 00, 01, 10, 11
3 bits = 8 combinations    = 000, 001, 010, 011, 100, 101, 110, 111
n bits = 2^n combinations

That doubling is why systems naturally “grow” as 2, 4, 8, 16, 32, 64, 128, and so on. For example, 8 bits make 2^8 = 256 possible byte values, and 32 bits make 2^32, or ~4.3 billion, possible unsigned integer values.
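
The doubling rule can be checked by enumerating every pattern for small bit widths. This is a minimal sketch; the helper name bit_patterns is made up for illustration.

```python
from itertools import product


def bit_patterns(n: int) -> list[str]:
    """Return every n-bit pattern as a string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


for n in (1, 2, 3):
    patterns = bit_patterns(n)
    # Each extra bit doubles the count, so n bits give 2**n patterns.
    assert len(patterns) == 2 ** n
    print(n, len(patterns), patterns)

print(2 ** 8)   # 256 possible byte values
print(2 ** 32)  # 4294967296 possible unsigned 32-bit values
```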

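For storage, people usually say kilobyte, megabyte, gigabyte, terabyte, and petabyte in conversation. Strictly speaking, there are two naming systems: decimal (SI) units such as KB, MB, and GB, where each step is 1,000x, and binary (IEC) units such as KiB, MiB, and GiB, where each step is 1,024x.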

In system design interviews, it is usually fine to say KB, MB, GB, TB, and PB and approximate each step as ~1,000x. The binary names are included here so you know why 2^10 = 1,024 is technically 1 KiB, even though people often casually call it 1 KB.

| Power | Approximate Value | Common Name | Precise Binary Name | Common Use |
| --- | --- | --- | --- | --- |
| 2^8 | 256 | Two hundred fifty-six | 256 | Byte values, small lookup tables |
| 2^10 | ~1 thousand | Kilobyte-ish / Thousand | Kibibyte (KiB) | 1 KB-ish blocks, small buffers |
| 2^20 | ~1 million | Megabyte-ish / Million | Mebibyte (MiB) | 1 MB-ish objects, user counts |
| 2^30 | ~1 billion | Gigabyte-ish / Billion | Gibibyte (GiB) | 1 GB-ish files, rows, events |
| 2^40 | ~1 trillion | Terabyte-ish / Trillion | Tebibyte (TiB) | 1 TB-ish storage, logs, analytics |
| 2^50 | ~1 quadrillion | Petabyte-ish / Quadrillion | Pebibyte (PiB) | 1 PB-ish data lakes, large archives |
| 2^60 | ~1 quintillion | Exabyte-ish / Quintillion | Exbibyte (EiB) | Very large ID spaces, global-scale storage |

Storage conversions:

| Unit | Decimal Meaning | Binary Approximation | Interview Shortcut |
| --- | --- | --- | --- |
| 1 KB | 1,000 bytes | 1 KiB = 1,024 bytes | ~1 thousand bytes |
| 1 MB | 1,000 KB = 1,000,000 bytes | 1 MiB = 1,024 KiB | ~1 million bytes |
| 1 GB | 1,000 MB = 1,000,000,000 bytes | 1 GiB = 1,024 MiB | ~1 billion bytes |
| 1 TB | 1,000 GB = 1,000,000,000,000 bytes | 1 TiB = 1,024 GiB | ~1 trillion bytes |
| 1 PB | 1,000 TB = 1,000,000,000,000,000 bytes | 1 PiB = 1,024 TiB | ~1 quadrillion bytes |
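
The ~1,000x shortcut drifts a little further from the binary value at every step. The sketch below measures that gap; the function name is illustrative, and the point is only that the error stays small enough for quick capacity math.

```python
PREFIXES = ["K", "M", "G", "T", "P"]


def binary_vs_decimal_gap(step: int) -> float:
    """Relative gap between 1024**step and 1000**step (0.024 means 2.4%)."""
    return 1024 ** step / 1000 ** step - 1


for step, prefix in enumerate(PREFIXES, start=1):
    gap = binary_vs_decimal_gap(step)
    print(f"1 {prefix}iB is {gap:.1%} larger than 1 {prefix}B")
```

The gap grows from about 2.4% at the kilobyte step to about 12.6% at the petabyte step, which matters for billing or disk-label disputes but rarely for an order-of-magnitude estimate.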


Average Object Sizes

Use these as rough estimates when converting product requirements into storage, bandwidth, cache, and database sizing. Real systems vary a lot because of data types, text encoding, metadata, indexes, compression, replication, and media quality.

For a stored object, start with this mental model:

total object size
~= primitive fields
 + text fields
 + structured format overhead
 + metadata fields
 + indexes
 + storage engine overhead
 + replication / compression effects
 + media objects, if any

Primitive fields are the easiest to estimate because they are usually fixed-size:

| Primitive / Field Type | Rough Size | Why It Matters | Example Estimate |
| --- | --- | --- | --- |
| Boolean | 1 byte-ish | Often padded by storage formats; tiny alone, noticeable across billions of rows. | is_deleted, is_verified |
| Integer ID | 4-8 bytes | Primary keys, foreign keys, counters, timestamps. | 32-bit int = 4 bytes; 64-bit int = 8 bytes |
| UUID | 16 bytes binary / 36 chars text | Text UUIDs cost more in storage and indexes than binary UUIDs. | 550e8400-e29b-41d4-a716-446655440000 |
| Timestamp | 8 bytes | Common on almost every event and row. | Unix milliseconds or database timestamp |
| Decimal / money | 8-16 bytes | Exact decimals often cost more than integers or floats. | Price, balance, invoice amount |
| Foreign key pointer | 4-8 bytes | References multiply quickly across relational rows and indexes. | user_id, product_id, order_id |

Text fields depend on character count and encoding:

text bytes ~= character_count x bytes_per_character

Encoding comparison:

| Encoding | Bytes per Character | What It Represents | Design Note |
| --- | --- | --- | --- |
| ASCII | 1 byte | Basic English letters, digits, punctuation, and control characters. | Small and simple, but limited to 128 characters. |
| UTF-8 | 1-4 bytes | Unicode using variable-length bytes. | ASCII stays 1 byte; many European characters use 2 bytes; CJK often uses 3 bytes; emoji often use 4 bytes. |
| UTF-16 | 2-4 bytes | Unicode using 16-bit code units. | Common in some runtimes; many common characters use 2 bytes, while some symbols and emoji use 4 bytes. |
| UTF-32 | 4 bytes | Unicode using fixed-width 32-bit code units. | Simple indexing by code unit, but usually wastes space for normal text. |

The difference is not that UTF-16 or UTF-32 are “richer” than UTF-8. They can represent the same Unicode character set; they just encode characters with different byte layouts. UTF-8 is compact for ASCII-heavy text, UTF-16 is common in some language runtimes, and UTF-32 trades storage efficiency for fixed-width representation.
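
The per-character costs can be measured directly rather than memorized. The sketch below encodes a few illustrative sample strings (chosen for this example, not taken from any real dataset) and prints the byte length in each encoding.

```python
# Escapes keep the exact code points unambiguous regardless of editor settings.
SAMPLES = {
    "ascii": "hello",
    "accented": "h\u00e9llo",  # contains one precomposed e-acute
    "cjk": "\u4f60\u597d",     # two CJK characters
    "emoji": "\U0001F600",     # one emoji outside the Basic Multilingual Plane
}

# The "-le" variants skip the byte-order mark so only the characters are counted.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_sizes(text: str) -> dict[str, int]:
    """Return the encoded byte length of text in each encoding."""
    return {enc: len(text.encode(enc)) for enc in ENCODINGS}


for label, text in SAMPLES.items():
    print(label, len(text), encoded_sizes(text))
```

For the ASCII sample, UTF-8 is half the size of UTF-16 and a quarter of UTF-32; for the CJK sample, UTF-16 is the smallest of the three.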

Useful text estimates:

| Text Field | Characters | ASCII / English UTF-8 | Worst-Case UTF-8 |
| --- | --- | --- | --- |
| Username | 20 chars | ~20 bytes | ~80 bytes |
| Short title | 100 chars | ~100 bytes | ~400 bytes |
| Short post text | 280 chars | ~280 bytes | ~1.1 KB |
| Description | 1,000 chars | ~1 KB | ~4 KB |

Composite objects are bigger than the visible text because they include fixed fields, metadata, pointers, and storage overhead:

| Object Type | Main Sizing Driver | Rough Size | Example Estimate |
| --- | --- | --- | --- |
| Small JSON event | Field names + values + quotes + commas + braces | 0.5-2 KB | { user_id, action, timestamp, metadata } |
| Metadata row | IDs + timestamps + counters + flags + text + pointers | 0.5-2 KB | Post ID + author ID + short text + counters + media pointers |
| User profile row | Bio + settings + links + counters + media pointers | 1-5 KB | Name, bio, settings, counters, links |
| Product catalog row | Description + attributes + variants + media URLs | 2-10 KB | Product title, description, category, price, inventory |
| Log line | JSON fields + request metadata | 0.5-4 KB | JSON log with request ID, route, status, latency |
| Thumbnail image | Compression quality + resolution + format | 10-100 KB | Avatar, preview image, product thumbnail |
| Compressed photo | Compression quality + resolution + format | 0.5-5 MB | Phone photo after web/mobile compression |
| Short video | Resolution + duration + codec + bitrate | 5-100+ MB | Short social video at multiple renditions |
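
To see why a JSON event outgrows its primitive fields, the sketch below serializes a tiny event both as compact JSON and as a fixed binary layout. The event values and the packed layout (two 8-byte integers plus a 1-byte action code) are hypothetical, chosen only to make the overhead visible.

```python
import json
import struct

# Hypothetical event; real events usually carry far more metadata.
event = {"user_id": 1234567890123, "action": "like", "timestamp": 1700000000000}

# Compact JSON still stores key names, quotes, and decimal digits as text.
json_bytes = json.dumps(event, separators=(",", ":")).encode("utf-8")

# Assumed binary layout: 8-byte user_id, 8-byte timestamp, 1-byte action code.
ACTION_CODES = {"like": 1}
packed = struct.pack(
    "<qqB", event["user_id"], event["timestamp"], ACTION_CODES[event["action"]]
)

print(len(json_bytes), "bytes as JSON")
print(len(packed), "bytes packed")
```

The gap widens as events gain fields, because every JSON event repeats its key names while a fixed layout does not.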

How those estimates add up:

| Object Type | Example Field Breakdown | Why the Range Moves |
| --- | --- | --- |
| Small JSON event | user_id integer (8 bytes) becomes larger in JSON because the key name and decimal text are stored too; action string, timestamp, and metadata fields add more repeated keys and values. | JSON repeats field names on every event, and request/session/device metadata often outweighs the core values. |
| Metadata row | post_id (8 bytes) + author_id (8 bytes) + created_at (8 bytes) + counters + flags + short text + media pointers + row overhead. | The text length, number of pointers, and storage engine/index overhead usually decide whether it is closer to 0.5 KB or 2 KB. |
| User profile row | user_id (8 bytes) + handle/name text + bio text + settings JSON + counters/timestamps + links/avatar/banner pointers. | Profiles with empty bios are small; rich profiles with settings, links, localization, and media pointers are larger. |
| Product catalog row | product_id (8 bytes) + seller_id (8 bytes) + timestamps/price/inventory + title + description + attributes JSON + category/brand/tags + media URLs. | Descriptions, variant attributes, and media URL lists dominate the primitive fields. |
| Log line | request_id UUID string (36 chars) + route/status/latency + timestamp + user/session/IP/user-agent/trace fields. | Logs become large when they include stack traces, user agents, headers, request bodies, or nested JSON metadata. |
| Thumbnail image | Small image file, for example 100x100 to 400x400 pixels, compressed as JPEG/WebP/AVIF; the database row usually stores only the URL or object key. | Resolution, image complexity, format, and quality setting dominate. Store the binary in object storage, not inside the row. |
| Compressed photo | Photo file resized and compressed for web/mobile; the row usually stores metadata and an object-storage pointer, not the binary photo. | Original resolution, compression quality, and format matter more than database field sizes. |
| Short video | Video size is roughly bitrate x duration. For example, 2 Mbps for 30 seconds is about 60 megabits, or ~7.5 MB, before extra renditions. | Codec, bitrate, duration, resolution, and number of transcoded renditions dominate everything else. |
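
The bitrate x duration rule for video is easy to get wrong by a factor of 8, because bitrates are quoted in bits and storage in bytes. A small sketch follows; the rendition ladder is an assumed example, not a recommendation.

```python
def video_size_mb(bitrate_mbps: float, duration_s: float) -> float:
    """Megabits per second x seconds gives megabits; divide by 8 for megabytes."""
    return bitrate_mbps * duration_s / 8


# The example from the text: 2 Mbps for 30 seconds.
print(video_size_mb(2, 30), "MB for one rendition")

# Assumed rendition ladder in Mbps (illustrative values only).
RENDITIONS_MBPS = [0.5, 1.0, 2.0]
total_mb = sum(video_size_mb(rate, 30) for rate in RENDITIONS_MBPS)
print(total_mb, "MB across all renditions")
```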

For a database row:

row size
~= fixed fields
 + text fields
 + pointers / URLs
 + format overhead
 + storage overhead

Example product catalog row:

| Field Group | Rough Size |
| --- | --- |
| product_id, seller_id, timestamps, price, inventory | ~40-60 bytes |
| Title | ~100 bytes |
| Description | ~1-4 KB |
| Category, brand, tags | ~100-500 bytes |
| Attributes JSON | ~500 bytes-3 KB |
| Media URLs | ~300 bytes-1 KB |
| Rough total | ~2-10 KB |
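
Summing the low and high ends of each field group shows where the rough total comes from. The sketch below copies the ranges from the table, treats 1 KB as 1,000 bytes, and uses made-up dictionary keys as labels.

```python
# (low, high) byte estimates per field group, copied from the table.
FIELD_GROUPS = {
    "ids_timestamps_price_inventory": (40, 60),
    "title": (100, 100),
    "description": (1_000, 4_000),
    "category_brand_tags": (100, 500),
    "attributes_json": (500, 3_000),
    "media_urls": (300, 1_000),
}

low = sum(lo for lo, _ in FIELD_GROUPS.values())
high = sum(hi for _, hi in FIELD_GROUPS.values())

print(f"{low / 1000:.1f}-{high / 1000:.1f} KB before format and storage overhead")
```

The field groups alone land at roughly 2.0-8.7 KB, so the ~2-10 KB range leaves headroom for format and storage overhead at the top end.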

Quick sizing pattern:

Daily storage ~= events per day x average event size
Monthly storage ~= daily storage x 30
With replication ~= raw storage x replication factor

Example: 100M posts/day at 1 KB metadata each is ~100 GB/day of metadata. If 10% include a 1 MB compressed image, media adds ~10 TB/day before replicas, thumbnails, CDN cache, and backups.
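
The same example can be written as a few lines of arithmetic. The sketch below uses decimal units, and the replication factor of 3 is an assumption for illustration, not a figure from the example.

```python
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12


def daily_bytes(events_per_day: int, avg_event_bytes: int) -> int:
    """Daily storage ~= events per day x average event size."""
    return events_per_day * avg_event_bytes


posts_per_day = 100_000_000
metadata_per_day = daily_bytes(posts_per_day, 1 * KB)
media_per_day = daily_bytes(posts_per_day // 10, 1 * MB)  # 10% of posts carry an image

REPLICATION_FACTOR = 3  # assumed value
monthly_raw = (metadata_per_day + media_per_day) * 30
monthly_replicated = monthly_raw * REPLICATION_FACTOR

print(metadata_per_day / GB, "GB/day of metadata")
print(media_per_day / TB, "TB/day of media")
print(monthly_raw / TB, "TB/month raw")
print(monthly_replicated / TB, "TB/month with replication")
```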

Interview shortcut: for text, estimate characters x encoding size. For rows, add fixed fields, metadata, pointers, format overhead, and storage overhead. For media, estimate from compression, resolution, codec, and duration because media usually dominates storage and bandwidth.


Latency Units

Before reading latency numbers, keep the time units straight. Each step below is 1,000 times larger than the previous one.

| Unit | Short Name | Seconds | Compared to Next Unit | System Design Intuition |
| --- | --- | --- | --- | --- |
| Nanosecond | ns | 0.000000001 s | 1 us = 1,000 ns | CPU cache and very low-level memory timing. |
| Microsecond | us | 0.000001 s | 1 ms = 1,000 us | Fast local operations, SSD access, kernel/network overhead. |
| Millisecond | ms | 0.001 s | 1 s = 1,000 ms | Databases, service calls, same-region networks, user-visible latency. |
| Second | s | 1 s | 1 s = 1,000,000,000 ns | Slow user flows, retries, batch jobs, timeout budgets. |

Order from smallest to largest: ns < us < ms < s. A useful mental model is: if 1 ns were 1 second, then 1 us would be ~17 minutes, 1 ms would be ~11.6 days, and 1 second would be ~31.7 years.
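
The stretched-time mental model is plain unit conversion. As a quick check, the sketch below treats each nanosecond as one second and converts the result into human-scale units; a year is taken as 365.25 days.

```python
# Real durations expressed in nanoseconds.
UNITS_NS = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}

# If 1 ns is stretched to 1 s, a duration of N ns becomes N seconds.
stretched_s = dict(UNITS_NS)

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

us_minutes = stretched_s["us"] / SECONDS_PER_MINUTE
ms_days = stretched_s["ms"] / SECONDS_PER_DAY
s_years = stretched_s["s"] / SECONDS_PER_YEAR

print(f"1 us -> {us_minutes:.1f} minutes")
print(f"1 ms -> {ms_days:.1f} days")
print(f"1 s  -> {s_years:.1f} years")
```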


Latency Numbers Every Programmer Should Know

Latency changes by hardware, cloud provider, region, load, and implementation. The point is the order of magnitude: memory is nanoseconds, local disk/network is microseconds to milliseconds, and cross-region work is tens to hundreds of milliseconds.

| Operation | Typical Latency | Design Meaning |
| --- | --- | --- |
| L1 cache reference | ~1 ns | Fastest CPU-local access. |
| Main memory reference | ~100 ns | Still very fast; avoid unnecessary network calls before optimizing RAM access. |
| Compress 1 KB with a fast codec | ~10 us | Often cheaper than sending extra bytes over a slow network. |
| Read 1 MB sequentially from memory | ~250 us | Batching can be efficient when access is sequential. |
| Same-AZ network round trip | ~0.2-1 ms | Cheap enough for service calls, but still much slower than memory. |
| SSD random read | ~100 us-1 ms | Indexes and caches matter when random reads dominate. |
| Read 1 MB sequentially from SSD | ~1-3 ms | Sequential I/O is much cheaper than many small random I/Os. |
| Cross-AZ network round trip | ~1-3 ms | Good for HA, but can add visible tail latency if every request crosses AZs repeatedly. |
| Database query on indexed hot data | ~1-10 ms | Reasonable for user paths; watch p95/p99 and connection pool limits. |
| Cross-region round trip | ~30-200 ms | Avoid synchronous cross-region dependencies on latency-sensitive paths. |
| Internet request from browser to origin | ~50-500 ms | Use CDNs, caching, compression, and fewer round trips. |

Rule of thumb: every synchronous dependency adds to tail latency, because a request only needs one slow call to land in the tail. If a request path calls five services in sequence and each has a p99 of 100 ms, the end-to-end p99 can easily miss a 300 ms target even when the average looks healthy.
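
One way to quantify the rule of thumb: if each call independently exceeds its own p99 with probability 1%, the chance that a request sees at least one slow call grows with the number of calls. The sketch below assumes independent latencies, which real systems only approximate.

```python
def p_any_slow(n_calls: int, quantile: float = 0.99) -> float:
    """Probability that at least one of n independent calls exceeds its quantile latency."""
    return 1 - quantile ** n_calls


for n in (1, 5, 20, 100):
    print(f"{n:>3} calls: {p_any_slow(n):.1%} of requests hit at least one p99-slow call")
```

With five calls, about 4.9% of requests include at least one call slower than that call's p99, so the per-service p99 understates what users experience end to end.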


Availability Numbers

Availability is usually expressed as “number of nines”. More nines reduce allowed downtime, but the cost and operational complexity rise quickly. For deeper SLA/SLO/SLI design, see SLA, SLO & SLI.

| Availability | Common Name | Downtime / Year | Downtime / Month | Typical Design Implication |
| --- | --- | --- | --- | --- |
| 90% | One nine | ~36.5 days | ~3 days | Best effort; manual recovery may be acceptable. |
| 99% | Two nines | ~3.65 days | ~7.3 hours | Basic monitoring, backups, and restart procedures. |
| 99.9% | Three nines | ~8.77 hours | ~43.8 minutes | Redundancy, automated deploy rollback, clear on-call response. |
| 99.95% | Three and a half nines | ~4.38 hours | ~21.9 minutes | Multi-AZ, health checks, failover, error budgets. |
| 99.99% | Four nines | ~52.6 minutes | ~4.4 minutes | Automated recovery, no routine manual operations in the hot path. |
| 99.999% | Five nines | ~5.26 minutes | ~26.3 seconds | Multi-region or equivalent isolation, rigorous release safety, graceful degradation. |
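
The downtime columns follow from multiplying the unavailable fraction by the length of the period. A minimal sketch, assuming a 365.25-day year and a month of one twelfth of that:

```python
HOURS_PER_YEAR = 365.25 * 24
HOURS_PER_MONTH = HOURS_PER_YEAR / 12


def downtime_hours_per_year(availability_pct: float) -> float:
    """Allowed downtime per year, in hours, for a given availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


def downtime_minutes_per_month(availability_pct: float) -> float:
    """Allowed downtime per month, in minutes, for a given availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_MONTH * 60


for pct in (90, 99, 99.9, 99.95, 99.99, 99.999):
    print(
        f"{pct}%: {downtime_hours_per_year(pct):.2f} h/year, "
        f"{downtime_minutes_per_month(pct):.2f} min/month"
    )
```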

Important caveat: a system made of serial dependencies is only as strong as the combined path. Roughly:

System availability ~= Dependency A x Dependency B x Dependency C

So three required services that are each 99.9% available produce about 99.7% end-to-end availability before any fallback, caching, retry, or degradation strategy.
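
The serial formula is a straight product, which makes it easy to see how quickly extra required dependencies erode availability. A small sketch, assuming independent failures; the function name is illustrative.

```python
import math


def serial_availability(*dependencies: float) -> float:
    """Availability of a path where every dependency must be up (independent failures assumed)."""
    return math.prod(dependencies)


three = serial_availability(0.999, 0.999, 0.999)
ten = serial_availability(*[0.999] * 10)

print(f"3 dependencies at 99.9%:  {three:.4%}")
print(f"10 dependencies at 99.9%: {ten:.4%}")
```

Ten required dependencies at 99.9% each already drop the path to roughly 99.0%, which is why fallbacks and graceful degradation matter on long call chains.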
