Good System Design: Practical Principles Every Engineer Should Know

8/19/2025 · 2 min read


System design advice is everywhere—and a lot of it is bad. You’ve probably seen the “queues will change your life” style posts or the Twitter-bait hot takes like “never store booleans in a database.” Even well-intentioned advice, like lessons from Designing Data-Intensive Applications, often doesn’t match the day-to-day challenges most engineers face.

So what does good system design actually look like? Let’s break it down.

What Is System Design?

If software design is about how you structure code (variables, functions, classes), system design is about how you structure services: app servers, databases, caches, queues, proxies, event buses. It’s the architecture that makes your code scale, stay reliable, and not wake you up at 3 AM.

The Hallmark of Good Design

Good design usually feels… boring. If nothing breaks, everything feels easier than expected, and parts of the system disappear into the background—that’s success.
On the flip side, flashy systems packed with distributed consensus, fancy event-driven pipelines, and clever tricks are often a red flag. Complexity should evolve naturally from a simple system that works—not be the starting point.

Stateful vs. Stateless

State is the hardest part of system design. Stateless services (like a PDF-to-HTML API) are easy to restart, repair, and scale. Stateful services (like databases) are trickier—they can get corrupted, run out of space, or fail in ways automation can’t fix.

Rule of thumb: keep as much logic stateless as possible, and funnel all writes through a single service that “owns” the database.

Databases: The Core of State

Since most state lives in the database, schema design and indexing are crucial:

  • Schema: Keep it human-readable and not overly flexible. JSON blobs everywhere = slow, messy queries.

  • Indexes: Match them to common queries. Too few = slow reads. Too many = slow writes.

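  • Bottlenecks: Let the database do the work. Prefer a JOIN over fetching rows separately and stitching them together in application memory, and avoid issuing hundreds of ORM queries in a loop (the N+1 problem).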
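
To make the query-in-a-loop problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The authors/posts schema, the index name, and both function names are made up for illustration. The first function issues one query per author (the N+1 pattern), the second gets the same answer with a single JOIN.

```python
import sqlite3

# Hypothetical schema for illustration: authors and their posts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    -- Index chosen to match the common lookup "posts by author".
    CREATE INDEX idx_posts_author_id ON posts(author_id);
""")
conn.executemany("INSERT INTO authors (id, name) VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace")])
conn.executemany("INSERT INTO posts (author_id, title) VALUES (?, ?)",
                 [(1, "On Engines"), (1, "Notes"), (2, "Compilers")])


def titles_n_plus_one(conn):
    """One query for the authors, then one more query per author."""
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_with_join(conn):
    """Same data in a single round trip; the database does the matching."""
    result = {}
    rows = conn.execute("""
        SELECT a.name, p.title
        FROM authors AS a
        JOIN posts AS p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """)
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result
```

With two authors the difference is invisible. With thousands, the loop version makes thousands of round trips while the JOIN still makes one. Note that an inner JOIN drops authors with no posts, so the two functions only agree when every author has at least one.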

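And remember: replicas are your friend. Send reads to replicas and writes to the primary, keeping in mind that a replica can lag slightly behind the primary. Watch out for query spikes, especially from write queries and transactions, which are the most likely to overload the database.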
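
One way to picture read/write splitting is a small router that hands out connections. This is only a sketch: the QueryRouter class, its method names, and the needs_fresh flag are all hypothetical, and plain strings stand in for real database connections.

```python
import itertools


class QueryRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # Round-robin over replicas; None means "no replicas configured".
        self._replica_cycle = itertools.cycle(replicas) if replicas else None

    def for_write(self):
        return self._primary

    def for_read(self, needs_fresh=False):
        # Assumption: replicas may lag behind the primary, so callers that must
        # see their own latest write can opt in to reading from the primary.
        if needs_fresh or self._replica_cycle is None:
            return self._primary
        return next(self._replica_cycle)


router = QueryRouter("primary-db", ["replica-1", "replica-2"])
```

The needs_fresh escape hatch is the design choice worth noticing: most reads can tolerate slightly stale data, and the few that cannot should say so explicitly.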

Background Jobs & Caching

Not all operations can be instant. Heavy work (like processing a big PDF) belongs in background jobs—queue it, process it later, and keep your APIs responsive.
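
Here is a rough sketch of the "queue it, process it later" shape, using only Python's standard library. The in-process queue.Queue is a stand-in for a durable queue, and process_pdf, enqueue, and the job id are invented names. The point is that enqueue returns immediately while a worker does the slow part.

```python
import queue
import threading

jobs = queue.Queue()
results = {}
results_lock = threading.Lock()


def process_pdf(payload):
    # Stand-in for the heavy work; here it only measures the payload.
    return len(payload)


def worker():
    while True:
        item = jobs.get()
        if item is None:  # Sentinel value: shut the worker down.
            jobs.task_done()
            break
        job_id, payload = item
        try:
            outcome = process_pdf(payload)
            with results_lock:
                results[job_id] = outcome
        finally:
            jobs.task_done()


def enqueue(job_id, payload):
    """What the API handler would do: record the job and return right away."""
    jobs.put((job_id, payload))
    return {"job_id": job_id, "status": "queued"}


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()
response = enqueue("job-1", b"%PDF-1.7 fake bytes")
jobs.join()       # Only for the demo: wait until the job has been processed.
jobs.put(None)    # Ask the worker to stop.
worker_thread.join()
```

In a real deployment the queue would need to survive restarts, which an in-memory queue does not, so treat this as an illustration of the flow only.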

Caching is another tool, but don’t overdo it. A cache is just more state, and stale or inconsistent caches cause nasty bugs. Always try to optimize the underlying query first before reaching for Redis or Memcached.
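
To see why a cache counts as extra state, consider this minimal in-memory TTL cache. It is a sketch, not a replacement for Redis or Memcached; the TTLCache class and its get_or_load and invalidate methods are hypothetical names. Every entry is a copy of data that can go stale until it expires or is invalidated.

```python
import time


class TTLCache:
    """Hypothetical read-through cache with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # Injectable so expiry can be tested.
        self._entries = {}       # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # Fresh hit: may still be out of date at the source.
        value = loader(key)      # Miss or expired: go back to the source of truth.
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call this when the underlying data changes, or readers see stale values.
        self._entries.pop(key, None)
```

Forgetting to call invalidate on a write path is exactly the kind of nasty bug mentioned above, which is a good reason to speed up the query first.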

Events, Push vs. Pull, and Hot Paths

  • Events: Use them when the sender doesn’t care about the response (e.g., “new account created” → send email, scan for fraud). Otherwise, stick to direct API calls for clarity.

  • Push vs. Pull: Push when clients need updates as they happen (like new mail appearing in Gmail). Pull, where clients request data when they want it, when simplicity matters more than immediacy.

  • Hot Paths: Focus optimization on the few critical flows that handle most of the traffic—these are the parts that can take your whole system down if mishandled.
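
The "new account created" example from the events bullet can be sketched as a tiny in-process publish/subscribe bus. Everything here is illustrative: EventBus, the "account.created" event name, and create_account are made-up names, and a real system would more likely publish to a message broker than call handlers synchronously.

```python
class EventBus:
    """Hypothetical in-process event bus for illustration."""

    def __init__(self):
        self._handlers = {}  # event name -> list of handler callables

    def subscribe(self, event_name, handler):
        self._handlers.setdefault(event_name, []).append(handler)

    def publish(self, event_name, payload):
        # The sender does not wait on or care about handler results, so a
        # failing handler is collected rather than allowed to break the sender.
        errors = []
        for handler in self._handlers.get(event_name, []):
            try:
                handler(payload)
            except Exception as exc:
                errors.append(exc)
        return errors


sent_emails = []
fraud_checks = []

bus = EventBus()
bus.subscribe("account.created", lambda event: sent_emails.append(event["email"]))
bus.subscribe("account.created", lambda event: fraud_checks.append(event["account_id"]))


def create_account(account_id, email):
    # The account service only announces what happened; it does not know
    # which consumers (email, fraud scanning) are listening.
    bus.publish("account.created", {"account_id": account_id, "email": email})
    return account_id
```

If create_account needed an answer back, say whether the fraud scan passed, an event would be the wrong tool and a direct call would be clearer.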

Observability & Failure Planning

A reliable system isn’t just about uptime—it’s about visibility.

  • Log everything important (especially unhappy paths).

  • Track metrics like CPU, memory, queue sizes, and p95/p99 latencies.

  • Fail gracefully: Use circuit breakers, retries with idempotency keys, and decide carefully when to fail open (e.g., rate limits) vs. fail closed (e.g., authentication).
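
Two of the ideas in the last bullet, circuit breakers and retries with idempotency keys, can be sketched in a few dozen lines. All names below (CircuitBreaker, PaymentService, charge_with_retry) are hypothetical, and the thresholds and delays are arbitrary; this shows the shape of the technique, not a production implementation.

```python
import time
import uuid


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call a dependency that keeps failing."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed (healthy).

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self._opened_at = None
        return result


class PaymentService:
    """Hypothetical server that deduplicates requests by idempotency key."""

    def __init__(self, fail_first=0):
        self.charges = {}  # idempotency key -> amount actually charged
        self._fail_remaining = fail_first

    def charge(self, idempotency_key, amount):
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = amount  # Charge at most once per key.
        if self._fail_remaining > 0:
            self._fail_remaining -= 1
            raise TimeoutError("charge recorded but response lost")
        return {"key": idempotency_key, "amount": self.charges[idempotency_key]}


def charge_with_retry(service, amount, attempts=3, base_delay=0.0, sleep=time.sleep):
    key = str(uuid.uuid4())  # One key for all attempts, so retries are safe.
    for attempt in range(attempts):
        try:
            return service.charge(key, amount)
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # Exponential backoff between tries.
```

The key detail is that the idempotency key is generated once, outside the retry loop. Generating a new key per attempt would turn every lost response into a possible double charge.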

Final Thoughts

Good system design isn’t about chasing the newest framework or brag-worthy architecture diagram. It’s about using boring, proven tools—databases, queues, caches—correctly and sparingly. If your system feels almost invisible, congratulations: you probably did it right.

Or, to put it bluntly: great system design looks unremarkable because it just works.