Numbers That Matter

These numbers are useful during system design interviews and architecture reviews. They are not exact guarantees; use them to sanity-check scale, latency, storage, and reliability before choosing deeper implementation details.

Powers of Two

Computer systems usually count storage, memory, partitions, hash spaces, and binary IDs in powers of two. The decimal approximation is often enough for quick capacity math.

The reason is binary. One bit has two possible states: 0 or 1. Every extra bit doubles the number of possible combinations because it can be added as either 0 or 1 in front of every existing pattern:

1 bit  = 2 combinations    = 0, 1
2 bits = 4 combinations    = 00, 01, 10, 11
3 bits = 8 combinations    = 000, 001, 010, 011, 100, 101, 110, 111
n bits = 2^n combinations

That doubling is why systems naturally “grow” as 2, 4, 8, 16, 32, 64, 128, and so on. For example, 8 bits make 2⁸ = 256 possible byte values, and 32 bits make 2³² = ~4.3 billion possible unsigned integer values.
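
The doubling can be checked directly. The following is a minimal sketch in Python; the helper name `bit_patterns` is made up for illustration and is not part of any library.

```python
from itertools import product


def bit_patterns(n: int) -> list[str]:
    """Return every n-bit pattern as a string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


for n in (1, 2, 3):
    patterns = bit_patterns(n)
    # The number of patterns always equals 2**n.
    print(n, len(patterns), patterns)

print(2**8)   # 256 possible byte values
print(2**32)  # 4294967296, about 4.3 billion
```

Enumerating patterns is only practical for small n; for capacity math, `2**n` alone is enough.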

For storage, people usually say kilobyte, megabyte, gigabyte, terabyte, petabyte in conversation. Strictly speaking, there are two naming systems:

Decimal units: KB, MB, GB, TB, PB use powers of 10. Example: 1 KB = 1,000 bytes.
Binary units: KiB, MiB, GiB, TiB, PiB use powers of 2. Example: 1 KiB = 1,024 bytes.

In system design interviews, it is usually fine to say KB, MB, GB, TB, and PB and approximate each step as ~1,000x. The binary names are included here so you know why 2¹⁰ = 1,024 bytes is technically 1 KiB, even though people often casually call it 1 KB.

Power	Approximate Value	Common Name	Precise Binary Name	Common Use
2⁸	256	Two hundred fifty-six	256	Byte values, small lookup tables
2¹⁰	~1 thousand	Kilobyte-ish / Thousand	Kibibyte (KiB)	1 KB-ish blocks, small buffers
2²⁰	~1 million	Megabyte-ish / Million	Mebibyte (MiB)	1 MB-ish objects, user counts
2³⁰	~1 billion	Gigabyte-ish / Billion	Gibibyte (GiB)	1 GB-ish files, rows, events
2⁴⁰	~1 trillion	Terabyte-ish / Trillion	Tebibyte (TiB)	1 TB-ish storage, logs, analytics
2⁵⁰	~1 quadrillion	Petabyte-ish / Quadrillion	Pebibyte (PiB)	1 PB-ish data lakes, large archives
2⁶⁰	~1 quintillion	Exabyte-ish / Quintillion	Exbibyte (EiB)	Very large ID spaces, global-scale storage

Storage conversions:

Unit	Decimal Meaning	Binary Counterpart	Interview Shortcut
1 KB	1,000 bytes	1 KiB = 1,024 bytes	~1 thousand bytes
1 MB	1,000 KB = 1,000,000 bytes	1 MiB = 1,024 KiB	~1 million bytes
1 GB	1,000 MB = 1,000,000,000 bytes	1 GiB = 1,024 MiB	~1 billion bytes
1 TB	1,000 GB = 1,000,000,000,000 bytes	1 TiB = 1,024 GiB	~1 trillion bytes
1 PB	1,000 TB = 1,000,000,000,000,000 bytes	1 PiB = 1,024 TiB	~1 quadrillion bytes

Quick notes:

2¹⁰ = 1,024, so treat it as 10³ for rough math.
In casual speech, 1 KB often means “about 1 thousand bytes”; technically, 1 KiB = 1,024 bytes.
1 byte = 8 bits. In binary terms, 1 KiB = 1,024 bytes and 1 MiB = 1,024 KiB; in decimal terms, replace 1,024 with 1,000.
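
The 2¹⁰ ≈ 10³ shortcut drifts as the units get larger. A rough sketch of that drift, assuming Python; the function name `binary_vs_decimal` is illustrative only.

```python
# Compare each binary unit (2**(10*step) bytes) with its decimal
# counterpart (10**(3*step) bytes).
PREFIXES = ["K", "M", "G", "T", "P", "E"]


def binary_vs_decimal(step: int) -> float:
    """Return the ratio 2**(10*step) / 10**(3*step)."""
    return 2 ** (10 * step) / 10 ** (3 * step)


for step, prefix in enumerate(PREFIXES, start=1):
    ratio = binary_vs_decimal(step)
    print(f"1 {prefix}iB is {100 * (ratio - 1):.1f}% larger than 1 {prefix}B")
```

The gap is about 2.4% at the kilo step and grows to roughly 15% at the exa step, so the ~1,000x shortcut is fine for rough estimates but worth remembering when exact capacity matters.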

Average Object Sizes

Use these as rough estimates when converting product requirements into storage, bandwidth, cache, and database sizing. Real systems vary a lot because of data types, text encoding, metadata, indexes, compression, replication, and media quality.

For a stored object, start with this mental model:

total object size
~= primitive fields
 + text fields
 + structured format overhead
 + metadata fields
 + indexes
 + storage engine overhead
 + replication / compression effects
 + media objects, if any

Primitive fields are the easiest to estimate because they are usually fixed-size:

Primitive / Field Type	Rough Size	Why It Matters	Example Estimate
Boolean	1 byte-ish	Often padded by storage formats; tiny alone, noticeable across billions of rows.	`is_deleted`, `is_verified`
Integer ID	4-8 bytes	Primary keys, foreign keys, counters, timestamps.	32-bit int = 4 bytes; 64-bit int = 8 bytes
UUID	16 bytes binary / 36 chars text	Text UUIDs cost more in storage and indexes than binary UUIDs.	`550e8400-e29b-41d4-a716-446655440000`
Timestamp	8 bytes	Common on almost every event and row.	Unix milliseconds or database timestamp
Decimal / money	8-16 bytes	Exact decimals often cost more than integers or floats.	Price, balance, invoice amount
Foreign key pointer	4-8 bytes	References multiply quickly across relational rows and indexes.	`user_id`, `product_id`, `order_id`

Text fields depend on character count and encoding:

text bytes ~= character_count x bytes_per_character

Encoding comparison:

Encoding	Bytes per Character	What It Represents	Design Note
ASCII	1 byte	Basic English letters, digits, punctuation, and control characters.	Small and simple, but limited to 128 characters.
UTF-8	1-4 bytes	Unicode using variable-length bytes.	ASCII stays 1 byte; many European characters use 2 bytes; CJK often uses 3 bytes; emoji often use 4 bytes.
UTF-16	2 or 4 bytes	Unicode using 16-bit code units.	Common in some runtimes; most common characters use 2 bytes, while emoji and other characters outside the Basic Multilingual Plane use 4 bytes.
UTF-32	4 bytes	Unicode using fixed-width 32-bit code units.	Simple indexing by code point, but usually wastes space for normal text.

The difference is not that UTF-16 or UTF-32 are “richer” than UTF-8. They can represent the same Unicode character set; they just encode characters with different byte layouts. UTF-8 is compact for ASCII-heavy text, UTF-16 is common in some language runtimes, and UTF-32 trades storage efficiency for fixed-width representation.
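
The per-character costs in the table can be measured with Python's built-in codecs. This is a small sketch; the sample characters and the helper name `encoded_sizes` are chosen for illustration.

```python
# One sample character per row of interest, written as escapes so the
# exact code points are unambiguous.
SAMPLES = {
    "ASCII letter a": "a",
    "accented e (U+00E9)": "\u00e9",
    "CJK character (U+4E2D)": "\u4e2d",
    "emoji (U+1F600)": "\U0001F600",
}


def encoded_sizes(text: str) -> dict[str, int]:
    """Return the byte length of text in UTF-8, UTF-16, and UTF-32.

    The "-le" codec variants are used so that no byte-order mark is
    prepended, which would inflate the counts.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


for label, ch in SAMPLES.items():
    print(label, encoded_sizes(ch))
```

For these samples, UTF-8 costs 1, 2, 3, and 4 bytes respectively, UTF-16 costs 2, 2, 2, and 4 bytes, and UTF-32 always costs 4 bytes.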

Useful text estimates:

Text Field	Characters	ASCII / English UTF-8	Worst-Case UTF-8
Username	20 chars	~20 bytes	~80 bytes
Short title	100 chars	~100 bytes	~400 bytes
Short post text	280 chars	~280 bytes	~1.1 KB
Description	1,000 chars	~1 KB	~4 KB

Composite objects are bigger than the visible text because they include fixed fields, metadata, pointers, and storage overhead:

Object Type	Main Sizing Driver	Rough Size	Example Estimate
Small JSON event	Field names + values + quotes + commas + braces	0.5-2 KB	`{ user_id, action, timestamp, metadata }`
Metadata row	IDs + timestamps + counters + flags + text + pointers	0.5-2 KB	Post ID + author ID + short text + counters + media pointers
User profile row	Bio + settings + links + counters + media pointers	1-5 KB	Name, bio, settings, counters, links
Product catalog row	Description + attributes + variants + media URLs	2-10 KB	Product title, description, category, price, inventory
Log line	JSON fields + request metadata	0.5-4 KB	JSON log with request ID, route, status, latency
Thumbnail image	Compression quality + resolution + format	10-100 KB	Avatar, preview image, product thumbnail
Compressed photo	Compression quality + resolution + format	0.5-5 MB	Phone photo after web/mobile compression
Short video	Resolution + duration + codec + bitrate	5-100+ MB	Short social video at multiple renditions
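
To see why a JSON event is larger than the values it carries, the sketch below compares one integer stored as 8 binary bytes with the same integer stored as a JSON key/value pair. The event contents are hypothetical; only the field names come from the table above.

```python
import json
import struct

# Hypothetical event; values are made up for illustration.
event = {
    "user_id": 1234567890123,
    "action": "post_liked",
    "timestamp": 1700000000000,
    "metadata": {"device": "ios", "app_version": "5.2.1"},
}

# Compact JSON (no spaces after separators), encoded as UTF-8 bytes.
json_bytes = json.dumps(event, separators=(",", ":")).encode("utf-8")

# The same user_id as a fixed-width signed 64-bit integer.
binary_user_id = struct.pack("<q", event["user_id"])

# Bytes JSON spends on just the user_id key and value:
# "user_id":1234567890123
json_user_id = json.dumps({"user_id": event["user_id"]}, separators=(",", ":"))
user_id_field_bytes = len(json_user_id) - 2  # drop the surrounding braces

print(len(binary_user_id))   # 8
print(user_id_field_bytes)   # 23
print(len(json_bytes))
```

This stripped-down event is only a little over a hundred bytes, well under the table's range. Real events tend to reach 0.5-2 KB once request, session, and device metadata are attached.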

How those estimates add up:

Object Type	Example Field Breakdown	Why the Range Moves
Small JSON event	`user_id` integer (`8 bytes`) becomes larger in JSON because the key name and decimal text are stored too; `action` string, `timestamp`, and `metadata` fields add more repeated keys and values.	JSON repeats field names on every event, and request/session/device metadata often outweighs the core values.
Metadata row	`post_id` (`8 bytes`) + `author_id` (`8 bytes`) + `created_at` (`8 bytes`) + counters + flags + short text + media pointers + row overhead.	The text length, number of pointers, and storage engine/index overhead usually decide whether it is closer to 0.5 KB or 2 KB.
User profile row	`user_id` (`8 bytes`) + handle/name text + bio text + settings JSON + counters/timestamps + links/avatar/banner pointers.	Profiles with empty bios are small; rich profiles with settings, links, localization, and media pointers are larger.
Product catalog row	`product_id` (`8 bytes`) + `seller_id` (`8 bytes`) + timestamps/price/inventory + title + description + attributes JSON + category/brand/tags + media URLs.	Descriptions, variant attributes, and media URL lists dominate the primitive fields.
Log line	`request_id` UUID string (`36 chars`) + route/status/latency + timestamp + user/session/IP/user-agent/trace fields.	Logs become large when they include stack traces, user agents, headers, request bodies, or nested JSON metadata.
Thumbnail image	Small image file, for example 100x100 to 400x400 pixels, compressed as JPEG/WebP/AVIF; the database row usually stores only the URL or object key.	Resolution, image complexity, format, and quality setting dominate. Store the binary in object storage, not inside the row.
Compressed photo	Photo file resized and compressed for web/mobile; the row usually stores metadata and an object-storage pointer, not the binary photo.	Original resolution, compression quality, and format matter more than database field sizes.
Short video	Video size is roughly `bitrate x duration`. For example, 2 Mbps for 30 seconds is about 60 megabits, or ~7.5 MB, before extra renditions.	Codec, bitrate, duration, resolution, and number of transcoded renditions dominate everything else.
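
The `bitrate x duration` estimate from the last row can be written as a short helper. This is a sketch only; the function name and the rendition bitrates are assumptions for illustration, not a recommended encoding ladder.

```python
def video_size_mb(bitrate_mbps: float, duration_s: float) -> float:
    """Estimate file size in megabytes from bitrate (megabits/s) and duration."""
    megabits = bitrate_mbps * duration_s
    return megabits / 8  # 8 bits per byte


print(video_size_mb(2, 30))  # 7.5

# Hypothetical set of renditions for the same 30-second clip.
rendition_bitrates_mbps = [0.5, 1, 2, 4]
total_mb = sum(video_size_mb(b, 30) for b in rendition_bitrates_mbps)
print(total_mb)  # 28.125
```

Storing several renditions multiplies the per-video cost, which is why the rendition count matters as much as the bitrate of any single one.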

For a database row:

row size
~= fixed fields
 + text fields
 + pointers / URLs
 + format overhead
 + storage overhead

Example product catalog row:

Field Group	Rough Size
`product_id`, `seller_id`, timestamps, price, inventory	~40-60 bytes
Title	~100 bytes
Description	~1-4 KB
Category, brand, tags	~100-500 bytes
Attributes JSON	~500 bytes-3 KB
Media URLs	~300 bytes-1 KB
Rough total	~2-10 KB
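
As a check on the rough total, the field groups can be summed directly. The sketch below copies the low and high estimates from the table and treats 1 KB as 1,000 bytes.

```python
# (low, high) byte estimates per field group, taken from the table above.
FIELD_GROUPS = {
    "ids, timestamps, price, inventory": (40, 60),
    "title": (100, 100),
    "description": (1_000, 4_000),
    "category, brand, tags": (100, 500),
    "attributes JSON": (500, 3_000),
    "media URLs": (300, 1_000),
}

low = sum(lo for lo, _ in FIELD_GROUPS.values())
high = sum(hi for _, hi in FIELD_GROUPS.values())
print(low, high)  # 2040 8660
```

The fields alone sum to roughly 2-8.7 KB. Rounding the upper bound to ~10 KB leaves headroom for format and storage overhead, which this sum does not include.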

Quick sizing pattern:

Daily storage ~= events per day x average event size
Monthly storage ~= daily storage x 30
With replication ~= raw storage x replication factor

Example: 100M posts/day at 1 KB metadata each is ~100 GB/day of metadata. If 10% include a 1 MB compressed image, media adds ~10 TB/day before replicas, thumbnails, CDN cache, and backups.
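
The same arithmetic as a small Python sketch. The replication factor of 3 is an assumption added for illustration and does not come from the example above.

```python
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12  # decimal units for rough math


def daily_storage_bytes(events_per_day: int, avg_event_bytes: int) -> int:
    """Daily storage ~= events per day x average event size."""
    return events_per_day * avg_event_bytes


posts_per_day = 100_000_000
metadata = daily_storage_bytes(posts_per_day, 1 * KB)
media = daily_storage_bytes(posts_per_day // 10, 1 * MB)  # 10% carry an image

replication_factor = 3  # assumed value, for illustration only
monthly_replicated = (metadata + media) * 30 * replication_factor

print(metadata / GB)            # 100.0
print(media / TB)               # 10.0
print(monthly_replicated / TB)  # 909.0
```

Under these assumptions, media accounts for about 99% of the bytes, so errors in the image-size estimate matter far more than errors in the metadata estimate.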

Interview shortcut: for text, estimate characters x encoding size. For rows, add fixed fields, metadata, pointers, format overhead, and storage overhead. For media, estimate from compression, resolution, codec, and duration because media usually dominates storage and bandwidth.

Latency Units

Before reading latency numbers, keep the time units straight. Each step below is 1,000 times larger than the previous one.

Unit	Short Name	Seconds	Compared to Next Unit	System Design Intuition
Nanosecond	ns	0.000000001 s	1 us = 1,000 ns	CPU cache and very low-level memory timing.
Microsecond	us	0.000001 s	1 ms = 1,000 us	Fast local operations, SSD access, kernel/network overhead.
Millisecond	ms	0.001 s	1 s = 1,000 ms	Databases, service calls, same-region networks, user-visible latency.
Second	s	1 s	1 s = 1,000,000,000 ns	Slow user flows, retries, batch jobs, timeout budgets.

Order from smallest to largest: ns < us < ms < s. A useful mental model is: if 1 ns were 1 second, then 1 us would be ~17 minutes, 1 ms would be ~11.6 days, and 1 second would be ~31.7 years.
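
The human-scale comparison is a single multiplication. A minimal sketch, with a made-up helper name:

```python
NS_PER_UNIT = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}


def human_scale_seconds(value: float, unit: str) -> float:
    """Scale a duration so that 1 ns maps to 1 second."""
    return value * NS_PER_UNIT[unit]


print(human_scale_seconds(1, "us") / 60)                # ~16.7 minutes
print(human_scale_seconds(1, "ms") / 86_400)            # ~11.6 days
print(human_scale_seconds(1, "s") / (86_400 * 365.25))  # ~31.7 years
```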

Latency Numbers Every Programmer Should Know

Latency changes by hardware, cloud provider, region, load, and implementation. The point is the order of magnitude: memory is nanoseconds, local disk/network is microseconds to milliseconds, and cross-region work is tens to hundreds of milliseconds.

Operation	Typical Latency	Design Meaning
L1 cache reference	~1 ns	Fastest CPU-local access.
Main memory reference	~100 ns	Still very fast; avoid unnecessary network calls before optimizing RAM access.
Compress 1 KB with a fast codec	~10 us	Often cheaper than sending extra bytes over a slow network.
Read 1 MB sequentially from memory	~250 us	Batching can be efficient when access is sequential.
Same-AZ network round trip	~0.2-1 ms	Cheap enough for service calls, but still much slower than memory.
SSD random read	~100 us-1 ms	Indexes and caches matter when random reads dominate.
Read 1 MB sequentially from SSD	~1-3 ms	Sequential I/O is much cheaper than many small random I/Os.
Cross-AZ network round trip	~1-3 ms	Good for HA, but can add visible tail latency if every request crosses AZs repeatedly.
Database query on indexed hot data	~1-10 ms	Reasonable for user paths; watch p95/p99 and connection pool limits.
Cross-region round trip	~30-200 ms	Avoid synchronous cross-region dependencies on latency-sensitive paths.
Internet request from browser to origin	~50-500 ms	Use CDNs, caching, compression, and fewer round trips.

Rule of thumb: every synchronous dependency adds to tail latency. If a request path calls five services and each has a p99 of 100 ms, the end-to-end p99 can easily miss a 300 ms target even when the average looks healthy.
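
One way to see the effect is to compute how often a request hits at least one slow call. The sketch below assumes the calls are independent, which real systems only approximate; the function name is illustrative.

```python
def p_any_slow(calls: int, percentile: float = 0.99) -> float:
    """Probability that at least one of `calls` independent calls lands
    above the given percentile of its own latency distribution."""
    return 1 - percentile**calls


print(round(p_any_slow(5), 4))    # 0.049
print(round(p_any_slow(100), 2))  # 0.63
```

With five dependencies, roughly 1 request in 20 includes at least one call slower than that call's p99. With 100 calls, for example a wide fan-out, most requests do.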

Availability Numbers

Availability is usually expressed as “number of nines”. More nines reduce allowed downtime, but the cost and operational complexity rise quickly. For deeper SLA/SLO/SLI design, see SLA, SLO & SLI.

Availability	Common Name	Downtime / Year	Downtime / Month	Typical Design Implication
90%	One nine	~36.5 days	~3 days	Best effort; manual recovery may be acceptable.
99%	Two nines	~3.65 days	~7.3 hours	Basic monitoring, backups, and restart procedures.
99.9%	Three nines	~8.77 hours	~43.8 minutes	Redundancy, automated deploy rollback, clear on-call response.
99.95%	Three and a half nines	~4.38 hours	~21.9 minutes	Multi-AZ, health checks, failover, error budgets.
99.99%	Four nines	~52.6 minutes	~4.4 minutes	Automated recovery, no routine manual operations in the hot path.
99.999%	Five nines	~5.26 minutes	~26.3 seconds	Multi-region or equivalent isolation, rigorous release safety, graceful degradation.
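
The downtime columns follow from one formula: allowed downtime = (1 - availability) x period. A short sketch that approximately reproduces the table, assuming a 365.25-day year and a month of one twelfth of that:

```python
HOURS_PER_YEAR = 365.25 * 24  # 8,766 hours; averages in leap years


def downtime_hours(availability_pct: float, period_hours: float) -> float:
    """Allowed downtime in hours for a given availability percentage."""
    return (1 - availability_pct / 100) * period_hours


for pct in (90, 99, 99.9, 99.95, 99.99, 99.999):
    per_year = downtime_hours(pct, HOURS_PER_YEAR)
    per_month = downtime_hours(pct, HOURS_PER_YEAR / 12)
    print(f"{pct}%: {per_year:.2f} h/year, {per_month * 60:.1f} min/month")
```

Each extra nine divides the allowed downtime by ten, which is why the operational requirements in the last column escalate so quickly.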

Important caveat: a system made of serial dependencies is only as strong as the combined path. Roughly:

System availability ~= Dependency A x Dependency B x Dependency C

So three required services that are each 99.9% available produce about 99.7% end-to-end availability before any fallback, caching, retry, or degradation strategy.
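
The multiplication can be sketched in a few lines. It assumes failures are independent and that every dependency is required for the request to succeed.

```python
from math import prod


def serial_availability(*availabilities: float) -> float:
    """Availability of a path where every dependency is required."""
    return prod(availabilities)


print(round(serial_availability(0.999, 0.999, 0.999), 4))  # 0.997
print(round(serial_availability(*[0.999] * 10), 4))        # 0.99
```

Ten required three-nines services leave the path at about two nines, so long dependency chains need fallbacks or degradation to hold a higher target.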