50 System Design Concepts

When you start learning system design, the hardest part is not the concepts themselves.

It is finding clear explanations in one place.

That is why having a single guide that covers all the essentials is such a game-changer.

So I’ve designed this guide to cover 50 of the most important system design concepts.

Think of it as your one-stop reference for understanding how real systems scale, stay reliable, communicate, and handle data.

My goal is to walk you through fifty important ideas using short explanations and simple examples so everything clicks quickly.

If you are preparing for a system design interview, this guide is your go-to resource.

I. Core Architecture Principles

Vertical vs Horizontal Scaling

  • Vertical scaling means upgrading a single machine, like adding more CPU, RAM, or faster storage.
  • Horizontal scaling means adding more machines and spreading work across them.

Vertical is easier but hits hardware limits and becomes expensive.

Horizontal is harder because you need load balancing, stateless services, and shared storage.

Think of it this way: vertical is one superhero getting stronger, horizontal is building a team.

CAP Theorem

The CAP Theorem says that in the presence of a network partition, a distributed system must choose between Consistency and Availability. Consistency means every user sees the same data at the same time.

Availability means the system always responds, even if the data might be slightly stale.

You cannot have perfect consistency and perfect availability when your network is broken, so you decide which one to sacrifice for your use case.

PACELC Theorem

PACELC extends CAP and says: if there is a Partition, choose Availability or Consistency; Else choose Latency or Consistency.

Even when the network is fine, you still trade off slow but consistent reads vs fast but eventually consistent reads. Systems that sync across regions often pay in latency to keep strong consistency.

It explains why some databases are fast but slightly stale, while others are slower but always accurate.

 

ACID vs BASE

ACID is about strict, reliable transactions: Atomicity, Consistency, Isolation, Durability. It suits financial systems, inventory, and anything where mistakes are very costly.

BASE stands for Basically Available, Soft state, Eventual consistency and is used in large distributed systems that need to stay up and respond quickly.

BASE systems might show temporary inconsistencies but fix themselves over time.

In practice, many architectures combine both, using ACID for core money flows and BASE for things like feeds and analytics.

Throughput vs Latency

  • Throughput is how many requests your system can handle per second.
  • Latency is how long a single request takes from start to finish.

You can often increase throughput by doing more work in parallel, but that may increase latency if queues build up.

Think of a restaurant that takes many orders at once but makes customers wait longer. Good system design tries to balance both: enough throughput for peak load but low latency for a smooth user experience.

Amdahl’s Law

Amdahl’s Law says that the speedup from parallelization is limited by the part that cannot be parallelized.

If 20 percent of your system is always sequential, no amount of extra machines will fix that bottleneck.

Let me break it down.

If your request always has to hit a single master database, that master will cap your performance. This law reminds you to hunt for bottlenecks instead of just adding more servers.
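The law itself is one line of arithmetic. Here is a minimal sketch (the function name is my own, for illustration):

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Overall speedup when `parallel_fraction` of the work runs on
    `workers` machines and the rest stays sequential."""
    return 1 / ((1 - parallel_fraction) + parallel_fraction / workers)

# With 20 percent sequential work, even 1000 workers cap out near 5x.
print(round(amdahl_speedup(0.8, 4), 2))     # 2.5
print(round(amdahl_speedup(0.8, 1000), 2))  # 4.98
```

Notice how the curve flattens: going from 4 workers to 1000 only doubles the speedup, because the sequential 20 percent dominates.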

 

Strong vs Eventual Consistency

  • Strong consistency means all users see the same data immediately after a write.
  • Eventual consistency means updates spread over time and nodes may briefly disagree.

Strong consistency is easier to reason about but usually slower and less available under failures.

Eventual consistency is great for large-scale systems like timelines or counters where perfect freshness is not critical.

The key is to choose the model that matches the user experience you need.

Stateful vs Stateless Architecture

  • A stateful service remembers user context between requests, often storing session data locally.
  • A stateless service treats every request as new, relying on external stores like caches or databases for any state.

Stateless services are easier to scale horizontally because any instance can handle any request.

Stateful systems can be simpler to code but harder to load balance and fail over.

In modern cloud systems, we try to push state into databases and keep services as stateless as possible.

Microservices vs Monoliths

A monolith is a single application that contains many features in one deployable unit.

Microservices split features into separate services that communicate over the network.

Microservices help teams work independently and scale different parts separately, but introduce complexity around communication, debugging, and data consistency.

Monoliths are simpler to start with and often fine up to a certain scale. Here is the tricky part.

Many great systems start as monoliths and gradually evolve into microservices when the pain is real.

Serverless Architecture

Serverless lets you run small functions in the cloud without managing servers directly. You pay only when your code runs, and the platform handles scaling and infrastructure for you.

It is ideal for event-driven workloads such as webhooks, background jobs, or light APIs with spiky traffic.

The tradeoff is less control over long-running tasks, cold starts, and sometimes a higher cost at very high volumes.

Think of serverless as “functions as a service,” perfect for glue code and lightweight services.

II. Networking and Communication

Load Balancing

Load balancing spreads incoming traffic across multiple servers so no single server gets overloaded. It improves both reliability and performance, since a single server’s failure does not bring down the entire system.

Load balancers can be hardware devices or software services. They often support health checks so they stop sending traffic to unhealthy instances.

From an interview point of view, they are your first building block when scaling horizontally.

Load Balancing Algorithms

Common load balancing algorithms include Round Robin, Least Connections, and IP Hash.

  • Round Robin cycles through servers in order and is simple to implement.
  • Least Connections sends traffic to the server with the fewest active connections, which helps when requests vary in length.
  • IP Hash uses a hash of the client IP so the same user usually goes to the same server, which helps with simple session stickiness.

Picking the right algorithm affects fairness, resource usage, and user experience.
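The three algorithms above can be sketched in a few lines of Python (the server names and the connection-count dict are illustrative):

```python
import hashlib
import itertools

class RoundRobin:
    """Cycle through servers in order."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)
    def pick(self) -> str:
        return next(self._cycle)

def least_connections(active: dict) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)

def ip_hash(client_ip: str, servers: list) -> str:
    """Hash the client IP so the same user lands on the same server."""
    h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

rr = RoundRobin(["s1", "s2", "s3"])
print([rr.pick() for _ in range(4)])          # ['s1', 's2', 's3', 's1']
print(least_connections({"s1": 7, "s2": 2}))  # s2
```

Real load balancers add health checks and weighting on top, but the core selection logic is this simple.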

Reverse Proxy vs Forward Proxy

  • A reverse proxy sits in front of servers and represents them to clients. It hides the internal topology and can handle TLS termination, caching, compression, and routing.
  • A forward proxy sits in front of clients and represents them to the outside world, often for security, caching, or content filtering.

 

Think of a reverse proxy as the reception desk of a company that hides all the internal rooms, and a forward proxy as a gateway your laptop must pass through to reach the internet.

Knowing the difference helps when you talk about API gateways and corporate proxies.

API Gateway

An API gateway is a special reverse proxy that acts as a single entry point for all API calls in a microservices system. It handles routing to the right service, rate limiting, authentication, logging, and sometimes response shaping.

This reduces complexity on the client side, since clients only talk to a single endpoint.

If you put too much logic in the gateway, it can become a bottleneck or a mini monolith of its own. Good designs keep it focused and thin.

CDN (Content Delivery Network)

A CDN is a network of geographically distributed servers that cache static content like images, videos, and scripts closer to users.

When a user requests content, they are routed to the nearest CDN node, which greatly reduces latency. This also offloads traffic from your origin servers, improving scalability and resilience.

CDNs are essential for global applications and front-end performance.

Think of them as “local copies” of your website’s heavy files sprinkled around the world.

DNS (Domain Name System)

DNS maps human readable domain names to IP addresses.

When you type a website name, your device queries DNS to find the numeric address of the server.

DNS has multiple layers of caching, so responses are fast after the first lookup. It can also be used for simple load balancing by returning different IPs for the same name.

Understanding DNS helps you reason about why name changes take time to propagate and why some outages are caused by misconfigured DNS.

TCP vs UDP

  • TCP is a reliable, connection-oriented protocol. It guarantees ordered, error checked delivery by using acknowledgments and retries.
  • UDP is connectionless and does not guarantee delivery or order, which makes it much faster and lighter.

TCP suits APIs, web pages, and file transfers where accuracy matters.

UDP works well for real time applications like video calls or games where occasional packet loss is acceptable.

Think of TCP as registered mail and UDP as quick postcards.

HTTP/2 and HTTP/3 (QUIC)

  • HTTP/2 introduced multiplexing, which lets multiple requests share a single TCP connection, reducing overhead. It also brought features like header compression and server push.
  • HTTP/3 runs over QUIC, which is built on UDP and improves connection setup time and performance on unreliable networks. These versions mainly aim to reduce latency and better use modern network conditions.

For you as an engineer, the key idea is: fewer connection setups and better use of a single connection.

gRPC vs REST

  • REST typically uses HTTP with JSON and focuses on resources like /users or /orders. It is simple, human-readable, and widely used for public APIs.
  • gRPC uses HTTP/2 and binary encoded messages (protobuf), which are smaller and faster over the wire. It also supports bidirectional streaming and strong typing.

In microservices, gRPC is often preferred for service-to-service calls, while REST is common for external clients.

Use REST when readability and compatibility matter, gRPC when performance and contracts matter.

WebSocket and Server-Sent Events (SSE)

WebSockets create a full-duplex connection where client and server can send messages to each other at any time.

SSE allows the server to push events to the client over a one way channel using HTTP.

WebSockets are great for chats, multiplayer games, and live collaboration.

SSE is simpler and fits cases like live score updates or notifications, where only the server needs to push updates.

Both solve real-time communication problems that plain HTTP cannot handle well.

Long Polling

Long polling is a technique where the client sends a request and the server holds it open until there is new data or a timeout.

When the response comes back, the client immediately opens another request. This simulates real time updates over plain HTTP without special protocols.

It is less efficient than WebSockets but easier to implement and works through most proxies and firewalls.

Think of it as asking “anything new?” and waiting quietly until there is an answer.
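A toy version of that loop, where `fake_server_poll` stands in for a real HTTP endpoint that holds the request open until data arrives or a timeout expires:

```python
import time

def fake_server_poll(timeout_s: float, events: list):
    """Server side: hold the 'request' open until there is new data
    or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if events:
            return events.pop(0)
        time.sleep(0.01)
    return None  # timeout: the client simply reconnects

events = ["score-update"]
received = []
# Client side: as soon as one request returns, open the next one.
for _ in range(2):
    data = fake_server_poll(timeout_s=0.1, events=events)
    if data is not None:
        received.append(data)
print(received)  # ['score-update']
```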

Gossip Protocol

A gossip protocol lets nodes in a distributed system share information by periodically talking to random peers.

Over time, information spreads like gossip in a social group until everyone has roughly the same view. It is used to share membership, health status, or configuration in a fault tolerant way.

The protocol is eventually consistent and does not rely on a central authority. This makes it ideal for large clusters where nodes frequently join and leave.

III. Database and Storage Internals

Sharding (Data Partitioning)

Sharding splits data across multiple machines, each holding a subset of the data. Common strategies include range-based sharding, hash-based sharding, and directory-based sharding.

The main goal is to scale storage and throughput by avoiding a single giant database node.

The tricky part is choosing a shard key that avoids hot spots where one shard gets most of the traffic. Once you shard, moving data between shards (resharding) becomes an important operational challenge.
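A minimal hash-based sharding sketch (the shard count and key format are illustrative). A stable hash spreads keys evenly, and the comment notes why resharding is painful:

```python
import hashlib

NUM_SHARDS = 4

def shard_for(key: str) -> int:
    """Map a key to a shard with a stable hash. Note that changing
    NUM_SHARDS remaps almost every key, which is why resharding
    with plain modulo hashing is so disruptive."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

counts = [0] * NUM_SHARDS
for i in range(1000):
    counts[shard_for(f"user{i}")] += 1
print(counts)  # roughly even, around 250 keys per shard
```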

Replication Patterns (Master Slave, Master Master)

Replication means keeping multiple copies of data on different nodes.

  • In master slave (or primary replica), one node handles writes and replicates changes to others that serve reads.
  • In master master (multi-primary), multiple nodes accept writes and reconcile conflicts.

Replication improves read performance and availability, but makes consistency harder, especially when writes go to multiple nodes.

In interviews, expect to talk about how replication lag affects reads and how failover works when a master dies.

Consistent Hashing

Consistent hashing is a technique to distribute keys across nodes in a way that minimizes data movement when nodes are added or removed.

Keys and nodes are placed on a logical ring, and each key belongs to the next node on the ring.

When a node joins or leaves, only a small portion of keys need to move. This property is very helpful in distributed caches and databases.

Think of it as a smooth mapping that does not get scrambled when the cluster size changes.
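A minimal ring sketch with virtual nodes (node names and counts are illustrative). Adding a fourth node moves only a fraction of the keys, instead of reshuffling nearly all of them as plain modulo hashing would:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        # Place `vnodes` virtual points per node on the ring to
        # smooth out the key distribution.
        points = []
        for node in nodes:
            for i in range(vnodes):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # A key belongs to the next point clockwise on the ring.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[idx]

ring_a = ConsistentHashRing(["cache1", "cache2", "cache3"])
ring_b = ConsistentHashRing(["cache1", "cache2", "cache3", "cache4"])
keys = [f"key{i}" for i in range(1000)]
moved = sum(ring_a.node_for(k) != ring_b.node_for(k) for k in keys)
print(f"{moved} of 1000 keys moved")  # roughly a quarter, not all
```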

Database Indexing (B Trees, LSM Trees)

Indexes speed up queries by organizing data in a way that allows fast lookup.

B Trees are balanced trees that keep data sorted and let you find ranges efficiently, common in relational databases.

LSM Trees batch writes in memory and periodically flush them to disk, which makes writes very fast but reads more complex.

The tradeoff is write heavy vs read heavy workloads.

The key idea is that indexes are a separate structure that must be updated on every write, which is why too many indexes hurt insert performance.

Write Ahead Logging (WAL)

Write Ahead Logging records changes to a log before applying them to the main database.

If a crash happens in the middle of a transaction, the system can replay the log to restore a consistent state. WAL ensures durability and atomicity of transactions. It also allows techniques like replication from the log stream. Let me tell you why it is important.

Without WAL, a crash could leave your data in a half updated, corrupt state.

Normalization vs Denormalization

  • Normalization organizes data into tables that reduce redundancy and dependencies, following rules like first normal form, second normal form, and so on. This avoids anomalies on updates and inserts.
  • Denormalization intentionally duplicates data to speed up reads and reduce joins. In high scale systems, denormalization is common for read heavy paths, such as storing user names along with posts instead of joining every time.

The real skill is knowing where you can safely denormalize without breaking consistency.

Polyglot Persistence

Polyglot persistence means using multiple types of databases within the same system, each chosen for what it does best. You might use a relational database for transactions, a document store for logs, a key value store for caching, and a graph database for relationships.

Instead of forcing everything into one database, you pick the right tool for each job.

The tradeoff is more operational complexity and more knowledge required from the team.

Bloom Filters

A Bloom filter is a space efficient data structure that quickly answers “might this item be in the set?” with possible false positives but no false negatives. It uses multiple hash functions to set bits in a bit array when items are inserted.

To check membership, you test the same bits; if any bit is zero, the item is definitely not present.

Databases and caches use Bloom filters to avoid unnecessary disk lookups or cache misses.

Think of them as fast gatekeepers that say “definitely not” or “maybe.”
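A toy Bloom filter (the bit-array size and the salted-hash scheme are simplified for illustration):

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive k bit positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # Any zero bit means "definitely not"; all ones means "maybe".
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
print(bf.might_contain("alice"))  # True
print(bf.might_contain("bob"))    # almost certainly False
```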

Vector Databases

Vector databases store and query vectors, which are numeric representations of data such as text, images, or audio. These vectors come from models like embeddings and allow similarity search, such as “find documents most similar to this one.”

Instead of exact equality comparisons, they use distance metrics like cosine similarity or Euclidean distance. This is essential for modern search, recommendation, and AI assistant systems.

In interviews, it is enough to know that vector databases support nearest neighbor search over high-dimensional data.
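The core operation is just a distance metric over vectors. A sketch of cosine similarity with made-up three-dimensional “embeddings” (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b) -> float:
    """1.0 means same direction; values near 0 mean unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [0.9, 0.1, 0.0]
docs = {"doc_a": [0.8, 0.2, 0.0], "doc_b": [0.0, 0.1, 0.9]}
best = max(docs, key=lambda d: cosine_similarity(query, docs[d]))
print(best)  # doc_a — its vector points the same way as the query
```

A real vector database avoids this brute-force scan by using approximate nearest neighbor indexes, but the similarity math is the same.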

IV. Reliability and Fault Tolerance

Rate Limiting

Rate limiting controls how many requests a user, IP, or API key can make in a given time window. It protects your system from abuse, accidental traffic spikes, and runaway loops.

Common strategies include fixed window, sliding window, and token bucket.

Rate limits are often enforced at the API gateway or load balancer.

Think of them as safety brakes that keep shared resources from being overwhelmed.
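A minimal token bucket (capacity and refill rate are illustrative): each request spends a token, and tokens refill at a steady rate, which allows short bursts while capping the sustained rate:

```python
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill based on elapsed time, capped at the bucket capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, refill_per_sec=1.0)
results = [bucket.allow() for _ in range(5)]
print(results)  # first 3 allowed; the burst beyond capacity is rejected
```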

Circuit Breaker Pattern

A circuit breaker monitors calls to a remote service and “opens” if there are too many failures.

When open, it immediately fails new requests instead of trying the broken service again.

After a cooldown period, it allows a few trial calls to see if the service has recovered and closes if they succeed. This pattern prevents cascading failures where one slow service drags down the entire system.

Here is the tricky part. Circuit breakers must be tuned carefully so they do not open too aggressively or too late.
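A simplified sketch of the pattern (the thresholds and the single half-open trial call are illustrative choices; production libraries track more state):

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown_s=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown_s:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # a success closes the circuit again
        return result

cb = CircuitBreaker(failure_threshold=2, cooldown_s=60)
def flaky():
    raise IOError("downstream is down")

for _ in range(2):
    try:
        cb.call(flaky)
    except IOError:
        pass  # real failures still surface to the caller

try:
    cb.call(flaky)
except RuntimeError as err:
    print(err)  # circuit open: failing fast
```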

Bulkhead Pattern

The bulkhead pattern isolates parts of a system so a failure in one area does not sink everything. This can mean separate connection pools, thread pools, or even entire service clusters for different features.

If one bulkhead is flooded with traffic, others keep working.

The name comes from ship bulkheads that contain flooding in one compartment.

In design discussions, using bulkheads shows you are thinking about fault isolation and blast radius.

Retry Patterns and Exponential Backoff

Retries help recover from transient errors like network timeouts or temporary overload.

Exponential backoff means each retry waits longer than the previous one, such as 1 second, 2 seconds, 4 seconds, and so on. This prevents your client from hammering a service that is already struggling.

Good retry policies also use jitter (small randomness) to avoid thundering herds.

Let me break it down.

Retries without backoff can make outages worse instead of helping.
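A sketch of exponential backoff with “full jitter”, where each delay is drawn randomly up to the exponential cap (the base and cap values are illustrative):

```python
import random

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0):
    """Each attempt waits a random amount up to min(cap, base * 2^attempt).
    The randomness (jitter) spreads out clients that failed together."""
    return [random.uniform(0, min(cap, base * 2 ** attempt))
            for attempt in range(retries)]

delays = backoff_delays(5)
print([round(d, 2) for d in delays])  # e.g. [0.4, 1.7, 2.9, 6.1, 12.8]
```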

Idempotency

An operation is idempotent if performing it multiple times has the same effect as performing it once.

For example, “set user status to active” is idempotent, while “increment account balance by 10” is not.

Idempotency is critical when systems use retries, because the same request may be sent more than once.

APIs often require idempotency keys on operations like payments to avoid double charging.

In interviews, always mention idempotency when you talk about at least once delivery or retries.
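A toy sketch of an idempotency key in action: the dict stands in for a durable key store, and `do_charge` is a made-up side effect:

```python
balance = {"total": 0}
processed = {}  # idempotency key -> result; a durable store in practice

def do_charge(amount: int) -> int:
    """The real side effect (made up for illustration)."""
    balance["total"] += amount
    return balance["total"]

def charge(idempotency_key: str, amount: int) -> int:
    """Replays with the same key return the stored result
    instead of charging twice."""
    if idempotency_key in processed:
        return processed[idempotency_key]
    result = do_charge(amount)
    processed[idempotency_key] = result
    return result

charge("req-123", 10)
charge("req-123", 10)  # a retry of the same request
print(balance["total"])  # 10, not 20
```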

Heartbeat

A heartbeat is a periodic signal sent by a service or node to indicate that it is alive and healthy.

Monitoring systems or coordinators listen for heartbeats.

If they stop receiving them, they mark the node as down and trigger failover or scaling actions.

Heartbeats are simple but powerful tools for liveness detection. Think of them as the system’s “pulse checks.”

 

Leader Election (Paxos, Raft)

Leader election is the process of choosing a single node to act as a coordinator among many.

Algorithms like Paxos and Raft ensure that only one leader is chosen and that all nodes eventually agree on who that leader is.

The leader handles tasks like assigning work, managing metadata, or ordering writes. If the leader fails, a new one is elected automatically.

You do not need to memorize the math for interviews, but you should know that consensus algorithms power many critical systems like metadata stores and distributed logs.

Distributed Transactions (SAGA Pattern)

A distributed transaction spans multiple services or databases.

The SAGA pattern models such a transaction as a sequence of local steps with compensating actions for rollbacks.

Instead of locking everything like a single ACID transaction, each service performs its part and publishes an event. If something fails, compensating steps attempt to undo previous changes. This fits naturally with microservices and eventual consistency.

The tradeoff is more complex logic and the possibility of partial failures that must be handled gracefully.

Two Phase Commit (2PC)

Two Phase Commit is a protocol that tries to provide atomic transactions across multiple nodes.

  • In the first phase, the coordinator asks all participants if they can commit.
  • In the second phase, if everyone agrees, it tells them to commit; otherwise, it tells them to roll back.

2PC provides strong guarantees but can block if the coordinator fails, and it is expensive at scale due to locking.

In modern cloud systems, 2PC is often avoided for high throughput paths and replaced by patterns like SAGA.

V. Caching and Messaging

Caching

Caching stores frequently accessed data in a fast storage layer, usually memory, to reduce latency and backend load.

Common cache layers include in process caches, external key value stores, and CDNs. Caching is especially effective for read heavy workloads and expensive computations.

Here is the tricky part. Stale data and invalidation make caching harder than it first appears.

As the saying goes, cache invalidation is one of the hard problems in computer science.

Caching Strategies (Cache Aside, Write Through, etc.)

  • Cache aside means the application reads from the cache, and on a miss, loads from the database and writes to the cache.
  • Write through writes to the cache and database at the same time, ensuring cache and source are always in sync.
  • Write back writes to the cache first and flushes to the database later, which is fast but risky if the cache fails.

Each strategy balances freshness, complexity, and performance differently.

Interviewers love when you mention which strategy you would pick for a given scenario.
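A minimal cache-aside sketch, with in-memory dicts standing in for the cache and the database:

```python
cache = {}
database = {"user:1": {"name": "Ada"}}
db_reads = {"count": 0}

def get_user(key: str):
    """Cache aside: try the cache, fall back to the database on a
    miss, then populate the cache for the next reader."""
    if key in cache:
        return cache[key]
    db_reads["count"] += 1       # cache miss: hit the database
    value = database.get(key)
    cache[key] = value
    return value

get_user("user:1")
get_user("user:1")
print(db_reads["count"])  # 1 — the second read came from the cache
```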

Cache Eviction Policies (LRU, LFU)

Cache eviction policies decide which items to remove when the cache is full.

  • LRU (Least Recently Used) evicts items that have not been accessed recently, assuming recent items are more likely to be used again.
  • LFU (Least Frequently Used) evicts items that are rarely accessed, focusing on long term popularity.

Some systems use random, FIFO, or advanced algorithms.

The key idea is that cache space is limited, so you want to keep the most valuable items in memory.
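An LRU cache is a classic interview exercise. A minimal version built on Python’s `OrderedDict`:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # touch "a" so "b" becomes least recent
cache.put("c", 3)      # evicts "b"
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```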

Message Queues (Point to Point)

A message queue allows one component to send messages to another without needing both to be online at the same time.

In a point to point model, messages in a queue are consumed by one receiver and then removed. This decouples sender and receiver so they can scale and fail independently.

Queues are great for background jobs, email sending, and processing heavy tasks asynchronously.

Think of them as a todo list shared between services.

Pub Sub (Publish Subscribe)

In pub sub, publishers send messages to topics, not directly to consumers.

Subscribers listen to topics they care about and receive copies of relevant messages. This enables broadcast style communication and loose coupling between producers and consumers.

Multiple services can react to the same event in different ways, such as logging, analytics, and notifications.

In interviews, pub sub often appears in event driven designs like activity feeds or event sourcing.

Dead Letter Queues

A dead letter queue stores messages that could not be processed successfully after several attempts.

Instead of retrying forever and blocking the main queue, these messages are moved aside.

Engineers can inspect the dead letter queue to debug issues, fix data, or replay messages later. This pattern improves resiliency and keeps your system from getting stuck on “poison messages.”

Think of it as a holding area for problematic jobs.

VI. Observability and Security

Distributed Tracing

Distributed tracing tracks a single request as it flows through multiple services. Each service adds a trace ID and span information so you can reconstruct the full path of a request. This is extremely helpful when debugging slow responses or failures in microservice architectures.

Without tracing, you just see errors in isolation. With it, you see the whole story across services, queues, and databases.

SLA vs SLO vs SLI

An SLA (Service Level Agreement) is an external promise to customers, such as “99.9 percent uptime per month.”

An SLO (Service Level Objective) is an internal target that engineers aim to meet, usually stricter than the SLA. An SLI (Service Level Indicator) is the actual measured metric, like real uptime or request success rate.

Think of SLA as the contract, SLO as the goal, and SLI as the scoreboard.

In interviews, using these terms correctly shows maturity in thinking about reliability.

OAuth 2.0 and OIDC

OAuth 2.0 is a framework for delegated authorization. It lets users grant an application limited access to their resources without sharing passwords.

OIDC (OpenID Connect) builds on OAuth 2.0 to add authentication, letting clients verify who the user is and get user identity information. This is the basis of many “Login with X” flows.

The key idea is that an authorization server issues tokens that clients and APIs can trust.

TLS/SSL Handshake

TLS/SSL secures communication between client and server by encrypting data in transit.

During the handshake, the client and server agree on encryption algorithms, exchange keys securely, and verify certificates.

Once the handshake completes, all subsequent data is encrypted and safe from eavesdropping. This is what puts the little lock icon in your browser.

Without TLS, anyone on the network could read or modify sensitive traffic.

Zero Trust Security

Zero Trust is a security model that says: “Never trust, always verify.” It assumes that threats can exist both outside and inside the network.

Every request must be authenticated, authorized, and encrypted, even if it comes from within your data center or VPC. Access is granted based on identity, device posture, and context, not just on being “inside the firewall.”

In modern architectures, Zero Trust is becoming the default approach to secure system design.

Key Takeaways

  • System design is mostly about understanding trade-offs: consistency vs. availability, latency vs. throughput, simplicity vs. flexibility.
  • Scaling is not just “add more servers.” You must think about load balancing, sharding, replication, and bottlenecks.
  • Reliability patterns like rate limiting, circuit breakers, retries, and bulkheads exist because failures are normal in distributed systems.
  • Caching, queues, and pub-sub are your best friends for performance and decoupling, but they introduce new challenges around consistency and ordering.
  • Observability and security concepts such as tracing, SLIs, OAuth, TLS, and Zero Trust are essential for systems that are not just fast but also safe and debuggable.

Source: https://designgurus.substack.com/p/50-system-design-concepts-for-beginners

 

50 Core System Design Concepts

Executive Summary

This document synthesizes 50 fundamental concepts in system design, drawing from a comprehensive guide on the subject. The core insight is that effective system design is an exercise in managing trade-offs, particularly between consistency and availability, latency and throughput, and simplicity versus flexibility. Successful scaling extends beyond merely adding servers; it necessitates a deep understanding of load balancing, data sharding, replication, and bottleneck identification.

Reliability in distributed systems is not an accident but a deliberate architectural choice, achieved through patterns like rate limiting, circuit breakers, retries, and bulkheads, which are designed to handle expected failures gracefully. Performance and decoupling are significantly enhanced by tools such as caching, message queues, and publish-subscribe models, though these introduce their own complexities regarding data consistency and message ordering. Finally, modern systems must be built with observability and security as primary concerns, incorporating distributed tracing, service level indicators (SLIs), robust authentication (OAuth/OIDC), data-in-transit encryption (TLS), and a Zero Trust security posture to ensure they are not only performant but also safe, secure, and debuggable.

I. Core Architecture Principles

This section outlines the foundational principles and architectural choices that govern how systems are structured, scaled, and managed.

Vertical vs. Horizontal Scaling

  • Vertical Scaling: Involves upgrading a single machine by adding more CPU, RAM, or faster storage. It is simpler to implement but is constrained by hardware limits and becomes progressively more expensive. The analogy provided is a single superhero getting stronger.
  • Horizontal Scaling: Involves adding more machines and distributing the workload across them. While more complex, requiring load balancing, stateless services, and shared storage, it offers greater scalability. The analogy is building a team of superheroes.

CAP Theorem

  • The CAP Theorem states that in a distributed system experiencing a network partition, it is impossible to simultaneously guarantee both Consistency and Availability.
  • Consistency: Every user sees the same data at the same time.
  • Availability: The system always provides a response, even if the data may be temporarily out of date.
  • A system must choose which of these two guarantees to sacrifice during a network failure.

PACELC Theorem

  • PACELC is an extension of the CAP theorem. It posits that: if there is a Partition, a system must choose between Availability and Consistency; Else (in normal operation), it must choose between Latency and Consistency.
  • This theorem clarifies that even without network failures, systems face a trade-off between fast, eventually consistent reads (lower latency) and slower, strongly consistent reads (higher consistency).

ACID vs. BASE

  • ACID (Atomicity, Consistency, Isolation, Durability): A set of properties for strict, reliable database transactions. It is essential for systems where data integrity is paramount, such as financial or inventory management systems.
  • BASE (Basically Available, Soft state, Eventual consistency): An alternative model for large-scale distributed systems that prioritize high availability and rapid response times. BASE systems may exhibit temporary inconsistencies that resolve over time.
  • Many modern architectures employ a hybrid approach, using ACID for critical transactional flows and BASE for less critical functions like activity feeds or analytics.

Throughput vs. Latency

  • Throughput: The number of requests a system can process per unit of time (e.g., requests per second).
  • Latency: The time taken to process a single request from start to finish.
  • These two metrics are often in opposition; increasing throughput by processing more work in parallel can lead to queue buildup and increased latency for individual requests. Effective system design seeks to balance both for an optimal user experience.

Amdahl’s Law

  • This law states that the potential performance improvement from parallelization is limited by the portion of the system that must remain sequential.
  • If a part of a process is inherently non-parallelizable (e.g., a final step that must hit a single master database), that part will become the ultimate bottleneck, capping overall performance regardless of how many more resources are added.
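To make the math concrete, here is a minimal sketch (with hypothetical workload numbers) that computes the speedup bound directly:

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Maximum speedup when `parallel_fraction` of the work parallelizes
    perfectly across `workers` and the rest stays sequential."""
    sequential = 1.0 - parallel_fraction
    return 1.0 / (sequential + parallel_fraction / workers)

# Even with thousands of workers, a 10% sequential portion caps speedup near 10x.
print(round(amdahl_speedup(0.90, 1000), 2))    # 9.91
print(round(amdahl_speedup(0.90, 10_000), 2))  # 9.99
```

The takeaway: past a point, adding machines buys almost nothing; shrinking the sequential portion is what moves the ceiling.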

Strong vs. Eventual Consistency

  • Strong Consistency: Guarantees that all users see the same data immediately following a write operation. It is simpler to reason about but can be slower and less available during failures.
  • Eventual Consistency: Allows for a brief period where different nodes in a distributed system may have different versions of the data. Updates propagate through the system over time. This model is suited for large-scale applications where immediate consistency is not critical, such as social media timelines.

Stateful vs. Stateless Architecture

  • Stateful Service: Remembers user-specific context or session data between requests, often storing it locally. This can simplify application logic but complicates scaling, load balancing, and failover.
  • Stateless Service: Treats every request as new and self-contained, relying on external storage (e.g., databases, caches) for any required state. Stateless services are easier to scale horizontally, as any server instance can handle any request.

Microservices vs. Monoliths

  • Monolith: A single, unified application where all features are contained within one deployable unit. Monoliths are simpler to develop and deploy initially.
  • Microservices: An architectural style that splits application features into small, independent services that communicate over a network. This approach allows teams to work independently and scale different components separately but introduces complexity in communication, debugging, and data management.
  • A common evolutionary path is to start with a monolith and gradually break it apart into microservices as the system grows and its pain points become clear.

Serverless Architecture

  • Also known as “Functions as a Service” (FaaS), serverless architecture allows developers to run small, event-driven functions in the cloud without managing the underlying server infrastructure.
  • Advantages: Pay-per-use pricing and automatic scaling handled by the cloud provider. Ideal for workloads with spiky traffic like webhooks, background jobs, or simple APIs.
  • Trade-offs: Can involve “cold starts” (initial latency), less control over long-running tasks, and potentially higher costs at sustained high volumes.

II. Networking and Communication

This section covers the protocols, patterns, and components used to manage traffic and facilitate communication between different parts of a system.


Load Balancing

  • Function: Distributes incoming network traffic across multiple servers to prevent any single server from becoming a bottleneck.
  • Benefits: Improves both system performance and reliability, as the failure of one server does not bring down the entire application.
  • Implementation: Can be a hardware appliance or a software service. Load balancers typically use health checks to avoid sending traffic to unresponsive servers.

Load Balancing Algorithms

  • Round Robin: Distributes requests to servers sequentially in a circular order. Simple but does not account for server load or request complexity.
  • Least Connections: Sends new requests to the server with the fewest active connections. This is effective when requests have varying completion times.
  • IP Hash: Uses a hash of the client’s IP address to determine which server receives the request. This provides a basic form of “session stickiness,” ensuring a user is consistently routed to the same server.
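A toy sketch of the three algorithms above (server names and the IP-hash function are hypothetical, for illustration only):

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend names

# Round Robin: hand out servers in a fixed circular order.
rr = cycle(servers)
round_robin_picks = [next(rr) for _ in range(5)]
print(round_robin_picks)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']

# Least Connections: pick the server with the fewest active connections.
active = {"app-1": 12, "app-2": 3, "app-3": 7}
print(min(active, key=active.get))  # 'app-2'

# IP Hash: a stable hash of the client IP keeps a user on the same server.
def pick_by_ip(ip: str) -> str:
    return servers[sum(ip.encode()) % len(servers)]  # toy hash, not production-grade

assert pick_by_ip("203.0.113.9") == pick_by_ip("203.0.113.9")  # session stickiness
```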

Reverse Proxy vs. Forward Proxy

  • Reverse Proxy: Sits in front of a group of servers, intercepting client requests and forwarding them to the appropriate backend server. It can handle tasks like TLS termination, caching, compression, and routing, while hiding the internal network topology.
  • Forward Proxy: Sits in front of clients, forwarding their requests to the internet. It is often used for security, content filtering, or caching within a corporate or private network.

API Gateway

  • An API Gateway is a specialized reverse proxy that serves as the single entry point for all API calls in a microservices architecture.
  • Responsibilities: Handles routing, rate limiting, authentication, logging, and response transformation.
  • Benefit: Simplifies the client-side by providing a single, unified endpoint.
  • Risk: Can become a bottleneck or a “mini monolith” if too much business logic is embedded within it.

CDN (Content Delivery Network)

  • A CDN is a geographically distributed network of proxy servers that cache static assets (images, videos, CSS, JavaScript) close to end-users.
  • Function: When a user requests content, the request is routed to the nearest CDN node, dramatically reducing latency.
  • Benefits: Offloads traffic from origin servers, improves front-end performance, and increases application scalability and resilience.

DNS (Domain Name System)

  • DNS is the system that translates human-readable domain names (e.g., www.example.com) into machine-readable IP addresses (e.g., 192.0.2.1).
  • It operates with multiple layers of caching for fast lookups and can be used for basic load balancing by returning different IP addresses for the same domain name.

TCP vs. UDP

  • TCP (Transmission Control Protocol): A connection-oriented protocol that guarantees reliable, ordered, and error-checked delivery of data. It is suitable for applications where data integrity is critical, such as web browsing, file transfers, and APIs.
  • UDP (User Datagram Protocol): A connectionless protocol that is faster and has less overhead than TCP but does not guarantee delivery or order. It is well-suited for real-time applications like video streaming and online gaming, where speed is more important than perfect reliability.

HTTP/2 and HTTP/3 (QUIC)

  • HTTP/2: Improved upon HTTP/1.1 by introducing request multiplexing over a single TCP connection, header compression, and server push, all aimed at reducing latency.
  • HTTP/3: Further enhances performance by running over QUIC (a transport protocol built on UDP), which reduces connection setup time and performs better on unreliable networks with packet loss.

gRPC vs. REST

  • REST: An architectural style that typically uses HTTP and JSON. It is resource-oriented, human-readable, and widely adopted for public-facing APIs.
  • gRPC: A high-performance RPC framework that uses HTTP/2 for transport and Protocol Buffers (protobuf) for binary serialization. It is smaller and faster than REST/JSON and supports features like bidirectional streaming, making it a popular choice for internal service-to-service communication in microservices architectures.

WebSocket and Server-Sent Events (SSE)

  • WebSockets: Provide a persistent, full-duplex (two-way) communication channel between a client and a server over a single TCP connection. Ideal for real-time interactive applications like chat, collaborative editing, and multiplayer games.
  • SSE: A simpler protocol that allows a server to push updates to a client over a one-way channel using standard HTTP. It is suitable for use cases where only the server needs to send data, such as live news feeds or stock tickers.

Long Polling

  • A technique that simulates server-push functionality over standard HTTP. The client sends a request to the server, which holds the connection open until it has new data to send or a timeout occurs. Upon receiving a response, the client immediately initiates a new request.
  • It is less efficient than WebSockets but is easier to implement and compatible with older proxies and firewalls.

Gossip Protocol

  • A decentralized communication protocol where nodes in a distributed system share information by periodically exchanging data with random peers.
  • Information propagates through the network “like gossip,” ensuring that all nodes eventually converge on a consistent view without a central coordinator. It is highly fault-tolerant and used for service discovery, health monitoring, and state dissemination in large clusters.

III. Database and Storage Internals

This section details the techniques and technologies used to manage data at scale, focusing on partitioning, replication, indexing, and transactional integrity.


Sharding (Data Partitioning)

  • Definition: The process of splitting a large database into smaller, more manageable pieces called shards, with each shard residing on a separate machine.
  • Goal: To scale database storage capacity and throughput horizontally.
  • Strategies: Include range-based, hash-based, and directory-based sharding.
  • Challenge: Choosing an effective shard key is crucial to avoid “hot spots,” where one shard receives a disproportionate amount of traffic.

Replication Patterns

  • Definition: The practice of keeping multiple copies of data on different nodes to improve availability and read performance.
  • Master-Slave (Primary-Replica): One node (the master) handles all write operations, which are then replicated to one or more slave nodes that can serve read requests.
  • Master-Master (Multi-Primary): Multiple nodes can accept write operations, and they synchronize data with each other. This increases write availability but introduces complexity in resolving write conflicts.

Consistent Hashing

  • A hashing technique designed to minimize data re-shuffling when nodes are added to or removed from a distributed system (like a cache or database).
  • Both keys and nodes are mapped to a logical ring. A key is assigned to the first node encountered moving clockwise on the ring. This ensures that when a node is added or removed, only a small, adjacent set of keys needs to be remapped.
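A minimal sketch of the clockwise ring lookup described above, assuming MD5 as the hash function and omitting the virtual nodes that production systems typically add for smoother balance:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Minimal hash ring (no virtual nodes) showing the clockwise lookup."""
    def __init__(self, nodes):
        self.ring = sorted((_hash(n), n) for n in nodes)
        self.keys = [h for h, _ in self.ring]

    def node_for(self, key: str) -> str:
        # First node at or after the key's position, wrapping around the ring.
        i = bisect.bisect(self.keys, _hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
before = {k: ring.node_for(k) for k in ("user:1", "user:2", "user:3", "user:4")}

# Adding a node remaps only the keys that now fall before it on the ring.
ring2 = ConsistentHashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
after = {k: ring2.node_for(k) for k in before}
moved = [k for k in before if before[k] != after[k]]
print(f"{len(moved)} of {len(before)} keys moved")
```

With naive modulo hashing (`hash(key) % num_nodes`), adding a node would remap nearly every key; here only the keys adjacent to the new node move.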

Database Indexing

  • Purpose: Indexes are data structures that improve the speed of data retrieval operations on a database table at the cost of slower writes and increased storage space.
  • B-Trees: Balanced tree structures common in relational databases. They keep data sorted and are efficient for both point lookups and range queries.
  • LSM (Log-Structured Merge) Trees: Optimize for high write throughput by batching writes in memory and periodically flushing them to sorted files on disk. Reads can be more complex as they may need to check multiple files.

Write-Ahead Logging (WAL)

  • A standard method for ensuring data durability and atomicity. Before any changes are applied to the database itself, they are first recorded in a sequential log file on durable storage.
  • In the event of a system crash, the database can replay the log to recover to a consistent state, preventing data corruption from partially completed transactions.

Normalization vs. Denormalization

  • Normalization: The process of organizing data in a relational database to minimize redundancy and improve data integrity by dividing larger tables into smaller, well-structured ones.
  • Denormalization: The intentional introduction of redundancy by duplicating data across multiple tables. This is often done in high-scale systems to optimize read performance by avoiding expensive join operations.

Polyglot Persistence

  • The practice of using multiple different database technologies within a single application, choosing the best tool for each specific job.
  • An application might use a relational database for transactional data, a document store for unstructured content, a key-value store for caching, and a graph database for relationship-heavy data. This adds operational complexity but allows for optimized performance and functionality.

Bloom Filters

  • A probabilistic, space-efficient data structure used to test whether an element is a member of a set.
  • It can produce false positives (it might incorrectly say an element is in the set) but never false negatives (if it says an element is not in the set, it is definitively not).
  • They are used to avoid expensive lookups for items that are likely not present, such as checking a cache before querying a database.
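A toy Bloom filter illustrating the no-false-negatives guarantee (the bit-array size, hash count, and key names are arbitrary choices for illustration):

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hash positions over an m-bit array."""
    def __init__(self, m: int = 1024, k: int = 3):
        self.m, self.k = m, k
        self.bits = 0  # a Python int doubles as an arbitrary-size bit array

    def _positions(self, item: str):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("user:42")
print(bf.might_contain("user:42"))   # True — never a false negative
print(bf.might_contain("user:999"))  # almost certainly False (false positives possible)
```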

Vector Databases

  • Specialized databases designed to store, manage, and query high-dimensional vector embeddings, which are numerical representations of data like text or images.
  • They excel at similarity searches using distance metrics (e.g., cosine similarity), enabling applications like semantic search, recommendation engines, and other AI-powered features.

IV. Reliability and Fault Tolerance

This section explores patterns and strategies for building resilient systems that can withstand and recover from failures.


Rate Limiting

  • Function: Controls the frequency of requests a user or client can make to an API or service within a specific time window.
  • Purpose: Protects backend services from abuse, accidental overload, and denial-of-service attacks.
  • Strategies: Common algorithms include fixed window, sliding window, and token bucket.
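As one illustration, here is a minimal token bucket (one of the algorithms listed above), with an injectable clock so the demo is deterministic; real code would use `time.monotonic` directly:

```python
import time

class TokenBucket:
    """Token bucket: refills at `rate` tokens/sec up to `capacity`."""
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate, self.capacity, self.clock = rate, capacity, clock
        self.tokens = capacity
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

t = [0.0]  # fake clock for a repeatable demonstration
bucket = TokenBucket(rate=5, capacity=3, clock=lambda: t[0])
print([bucket.allow() for _ in range(5)])  # [True, True, True, False, False]
t[0] += 1.0  # one simulated second passes: tokens refill
print(bucket.allow())  # True
```

The bucket permits a burst of 3 requests, then sustains 5 requests per second, which is exactly the burst-tolerant behaviour that makes token buckets popular for API throttling.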

Circuit Breaker Pattern

  • A pattern that prevents an application from repeatedly trying to execute an operation that is likely to fail.
  • Mechanism: A circuit breaker monitors calls to a downstream service. If the number of failures exceeds a threshold, the breaker “opens,” and subsequent calls fail immediately without attempting to contact the service. After a timeout, the breaker enters a “half-open” state to test if the service has recovered.
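A simplified sketch of that state machine (real breakers usually require a successful trial call before fully closing again; the fake clock here is just for a repeatable demo):

```python
import time

class CircuitBreaker:
    """Sketch of the closed / open / half-open state machine."""
    def __init__(self, threshold=3, reset_after=30.0, clock=time.monotonic):
        self.threshold, self.reset_after, self.clock = threshold, reset_after, clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()  # trip the breaker
            raise
        self.failures = 0
        return result

t = [0.0]
cb = CircuitBreaker(threshold=2, reset_after=30, clock=lambda: t[0])

def flaky():
    raise IOError("downstream down")

for _ in range(2):            # two failures trip the breaker
    try:
        cb.call(flaky)
    except IOError:
        pass
try:
    cb.call(flaky)            # breaker is open: no call is attempted
except RuntimeError as e:
    print(e)                  # circuit open: failing fast
t[0] = 31.0                   # after the timeout, a half-open trial is allowed
print(cb.call(lambda: "ok"))  # ok
```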

Bulkhead Pattern

  • An application design pattern that isolates system elements into pools so that if one fails, the others can continue to function.
  • Named after the partitioned sections of a ship’s hull, this pattern can be implemented by using separate thread pools or connection pools for different services, preventing a failure in one area from cascading and taking down the entire system.

Retry Patterns and Exponential Backoff

  • Retries: A mechanism for handling transient failures by automatically re-attempting a failed operation.
  • Exponential Backoff: A crucial enhancement to retries where the delay between attempts increases exponentially (e.g., 1s, 2s, 4s). This prevents a client from overwhelming a struggling service with rapid-fire retries. Adding “jitter” (a small random delay) is also recommended to avoid synchronized retry storms.
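A minimal sketch of the "full jitter" variant, where each retry sleeps a random amount between zero and the exponentially growing cap:

```python
import random

def backoff_delays(base=1.0, cap=30.0, attempts=5, rng=random.random):
    """Full-jitter exponential backoff: each retry sleeps a random
    duration in [0, min(cap, base * 2**attempt)] seconds."""
    return [min(cap, base * 2 ** n) * rng() for n in range(attempts)]

random.seed(7)  # seeded only so the illustration is repeatable
for attempt, delay in enumerate(backoff_delays()):
    print(f"retry {attempt}: sleep {delay:.2f}s")
```

Because every client draws a different random delay, a crashed dependency coming back online sees a trickle of retries instead of a synchronized stampede.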

Idempotency

  • An operation is idempotent if it can be performed multiple times with the same result as performing it once. For example, setting a value is idempotent, while incrementing a counter is not.
  • Idempotency is critical in distributed systems where network failures can lead to retries, ensuring that a re-sent request does not cause unintended side effects like duplicate transactions.
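A sketch of the common idempotency-key technique (the handler, key, and in-memory store are hypothetical; real systems persist the key in a database):

```python
# Hypothetical payment handler made idempotent with a client-supplied key.
processed = {}  # idempotency key -> result (a durable store in practice)

def charge(idempotency_key: str, amount_cents: int) -> str:
    if idempotency_key in processed:       # retry of a request we already handled
        return processed[idempotency_key]
    receipt = f"charged {amount_cents} cents"  # side effect happens exactly once
    processed[idempotency_key] = receipt
    return receipt

first = charge("key-123", 500)
retry = charge("key-123", 500)  # network retry: no duplicate charge
print(first == retry, len(processed))  # True 1
```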

Heartbeat

  • A periodic signal sent from a node or service to a monitoring system to indicate it is alive and functioning correctly.
  • If the monitoring system stops receiving heartbeats from a node, it can assume the node has failed and trigger a failover process.

Leader Election

  • The process in a distributed system by which a single node is chosen to assume a special role, such as a coordinator or primary for writes.
  • Consensus algorithms like Paxos and Raft provide fault-tolerant mechanisms to ensure that all nodes agree on a single leader and can elect a new one if the current leader fails.

Distributed Transactions (SAGA Pattern)

  • The SAGA pattern is a way to manage data consistency across multiple microservices without using traditional two-phase commit locks.
  • A transaction is structured as a sequence of local transactions, each with a corresponding compensating action. If any step fails, the compensating actions are executed in reverse order to undo the preceding steps, thus maintaining overall consistency.

Two-Phase Commit (2PC)

  • A protocol used to achieve atomic transactions across multiple distributed nodes.
  • Phase 1 (Prepare): A coordinator asks all participating nodes if they are ready to commit.
  • Phase 2 (Commit/Abort): If all participants vote “yes,” the coordinator instructs them to commit. If any vote “no” or fail to respond, the coordinator instructs all to roll back.
  • 2PC provides strong consistency but is prone to blocking if the coordinator fails and can be a performance bottleneck.
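The two phases can be sketched as follows (an in-memory toy that ignores coordinator failure and timeouts, which are exactly where real 2PC gets hard):

```python
class Participant:
    """Toy participant that votes in the prepare phase and applies on commit."""
    def __init__(self, name: str, healthy: bool = True):
        self.name, self.healthy = name, healthy
        self.committed = False

    def prepare(self) -> bool:
        return self.healthy  # vote "yes" only if we are able to commit

    def commit(self):
        self.committed = True

    def rollback(self):
        self.committed = False

def two_phase_commit(participants) -> bool:
    # Phase 1 (Prepare): every participant must vote yes.
    if all(p.prepare() for p in participants):
        for p in participants:   # Phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:       # Phase 2: abort everywhere
        p.rollback()
    return False

nodes = [Participant("orders-db"), Participant("inventory-db")]
print(two_phase_commit(nodes))  # True: both voted yes

nodes.append(Participant("ledger-db", healthy=False))
print(two_phase_commit(nodes))  # False: one "no" vote aborts all
```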

V. Caching and Messaging

This section describes key technologies for improving performance and decoupling system components through in-memory data storage and asynchronous communication.

Caching

  • Definition: Storing copies of frequently accessed data in a fast, temporary storage layer (typically memory) to serve future requests more quickly.
  • Benefits: Reduces latency for end-users and decreases the load on backend systems like databases.
  • Challenge: The primary difficulty with caching is “cache invalidation” — ensuring that stale data is removed or updated when the source data changes.

Caching Strategies

  • Cache-Aside: The application is responsible for managing the cache. It first checks the cache; on a miss, it reads data from the database, then writes that data into the cache for future requests.
  • Write-Through: The application writes data to the cache and the database simultaneously. This ensures the cache is always consistent with the database but adds latency to write operations.
  • Write-Back: The application writes data only to the cache, which acknowledges the write immediately. The data is then flushed to the database asynchronously at a later time. This offers very low write latency but risks data loss if the cache fails before the data is persisted.

Cache Eviction Policies

  • LRU (Least Recently Used): When the cache is full, the item that has been accessed least recently is removed.
  • LFU (Least Frequently Used): When the cache is full, the item that has been accessed the fewest times is removed.
  • Other policies include FIFO (First-In, First-Out) and random replacement. The choice of policy depends on the application’s access patterns.
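LRU in particular is straightforward to sketch with an ordered dict:

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: most recently used entries move to the end of the dict."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used

lru = LRUCache(2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")         # touch "a", so "b" is now least recently used
lru.put("c", 3)      # cache is full: evicts "b"
print(lru.get("b"))  # None
print(lru.get("a"))  # 1
```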

Message Queues (Point-to-Point)

  • A message queue enables asynchronous communication between services. A “producer” sends a message to a queue, and a “consumer” retrieves it for processing at a later time.
  • Each message is typically processed by only one consumer. This pattern decouples the sender and receiver, allowing them to operate and scale independently. It is commonly used for background jobs.

Pub/Sub (Publish-Subscribe)

  • A messaging pattern where “publishers” send messages to a “topic” without knowledge of the “subscribers.” Any number of subscribers can listen to a topic and receive a copy of every message sent to it.
  • This enables one-to-many, broadcast-style communication and is central to event-driven architectures.

Dead Letter Queues (DLQ)

  • A secondary queue used to store messages that could not be processed successfully after a certain number of retries.
  • Moving “poison messages” to a DLQ prevents them from blocking the main processing queue. Engineers can later inspect the DLQ to diagnose and resolve the underlying issues.

VI. Observability and Security

This section covers essential concepts for monitoring system health, understanding behavior, and implementing robust security measures.


Distributed Tracing

  • A method for monitoring and profiling applications, especially those built using a microservices architecture.
  • It tracks a single request as it travels through multiple services, assigning a unique trace ID that allows developers to visualize the entire request path, identify bottlenecks, and debug cross-service issues.

SLA vs. SLO vs. SLI

  • SLA (Service Level Agreement): A formal contract with a customer that defines the level of service they can expect, often with financial penalties for non-compliance (e.g., “99.9% uptime”).
  • SLO (Service Level Objective): An internal target for system reliability that is stricter than the SLA. This is the goal that engineering teams strive to meet.
  • SLI (Service Level Indicator): The actual, quantitative metric used to measure compliance with an SLO (e.g., the success rate of HTTP requests). The SLI is the “scoreboard” that measures performance.

OAuth 2.0 and OIDC

  • OAuth 2.0: An authorization framework that allows a user to grant a third-party application limited access to their resources on another service without sharing their credentials.
  • OIDC (OpenID Connect): A thin layer built on top of OAuth 2.0 that adds an authentication component. It allows an application to verify a user’s identity and obtain basic profile information. Together, they form the foundation of modern “Login with…” features.

TLS/SSL Handshake

  • TLS (Transport Layer Security)/SSL (Secure Sockets Layer): Cryptographic protocols that provide secure communication over a computer network.
  • The handshake is the initial process where the client and server establish a secure connection. During the handshake, they agree on an encryption cipher, exchange cryptographic keys, and authenticate the server via its digital certificate.

Zero Trust Security

  • A security model based on the principle of “never trust, always verify.” It assumes that threats can originate from anywhere, both inside and outside the network perimeter.
  • In a Zero Trust architecture, every request must be authenticated, authorized, and encrypted, regardless of its origin. Access is granted based on user identity and device posture, not on network location.

You can think of system design like running a professional restaurant. Vertical scaling is buying a bigger stove, while horizontal scaling is hiring a whole team of chefs. Load balancing is the host at the front door assigning customers to different tables so no waiter is overwhelmed. A CDN is like having pre-made snacks available at local convenience stores so people don’t have to travel to your main kitchen for everything. Finally, Circuit Breakers are like a safety fuse in the kitchen: if one appliance starts smoking, it cuts the power immediately to that section so the whole restaurant doesn’t burn down.

Source: https://medium.com/@MaheshwariRishabh/50-core-system-design-concepts-6828ed73c2e8

Top 50 System Design

Performance in system design interviews is a critical signal of whether a candidate can design scalable, efficient systems, and knowing the major terminology goes a long way. Below are the top 50 must-know system design interview terms, each with a definition, a working example, and a resource for further learning.

1. Scalability

  • Definition: It is the ability of a system to support increased load by adding resources.
  • Example: Addition of more servers to handle the increase in web traffic.
  • Learn More: What is Scalability and How to Achieve it?

2. Load Balancer

  • Definition: Distributing incoming network traffic across multiple servers so that no single server is overloaded.
  • Example: Load balancing web traffic across multiple EC2 instances using the AWS Elastic Load Balancer (ELB) service.
  • Learn More: Understanding Load Balancer

3. Microservices

  • Definition: An architectural pattern that structures an application as a collection of loosely coupled, independently deployable services.
  • Example: Breaking down a monolithic application into independent services responsible for user management, payment processing, and notifications.
  • Learn More: What are Microservices?

4. CAP Theorem

  • Definition: It states that when a network partition occurs, a distributed system can guarantee at most one of Consistency and Availability; partition tolerance itself cannot be given up in a distributed system.
  • Example: Deciding when to trade off consistency for availability, and vice versa, in distributed database design.
  • Learn More: Understanding CAP Theorem

5. Sharding

  • Definition: It involves breaking down a large database into smaller pieces called shards for better management.
  • Example: Sharding a user database based on geographic region.
  • Learn More: Database Sharding Explained

6. Latency

  • Definition: The time it takes for data to travel from point A to point B.
  • Example: Measuring the delay in message delivery through a chat application.
  • Learn More: Latency explained!

7. Throughput

  • Definition: A measure of how much data a system processes in a given timeframe.
  • Example: The number of requests a web server processes in one second.
  • Learn More: Throughput in Computer Networks

8. Cache

  • Definition: A hardware or software component that stores data so future requests for the same data can be served quickly.
  • Example: Implementing Redis caching for repeated database queries.
  • Learn More: Caching Explained

9. Content Delivery Network (CDN)

  • Definition: A geographically dispersed system of servers that delivers web content to users based on their location.
  • Example: Using Cloudflare CDN for faster web page loading.
  • Learn More: What is a CDN?

10. REST API

  • Definition: An architectural style for building web services in which data is accessed and manipulated using HTTP requests.
  • Example: Designing a social media API following REST (Representational State Transfer) principles.
  • Learn More: REST API Tutorial

11. GraphQL

  • Definition: A query language for APIs that lets clients request exactly the data they need, which can make it more efficient and flexible than REST for complex data requirements.
  • Example: Using GraphQL to query user information in a single request.
  • Learn More: GraphQL Introduction

12. ACID

  • Definition: A set of properties ensuring reliable processing of database transactions. The properties are Atomicity, Consistency, Isolation, and Durability.
  • Example: Ensuring that a banking transaction has ACID properties prevents corrupted data.
  • Learn More: ACID Properties in Databases

13. BASE

  • Definition: An alternative to ACID (Basically Available, Soft state, Eventually consistent) that emphasizes availability and partition tolerance over strict consistency.
  • Example: Designing a highly available, eventually consistent NoSQL database.
  • Learn More: BASE vs ACID

14. NoSQL

  • Definition: A class of databases that store and retrieve data using models other than the tabular relations of relational databases.
  • Example: Using MongoDB as a document-based data store.
  • Learn More: What is a NoSQL Database?

15. SQL

  • Definition: It is the standard language used for storing, manipulating, and retrieving data in relational databases.
  • Example: Writing SQL queries to get data back from a relational database.
  • Learn More: SQL Tutorial

16. Database Indexing

  • Definition: A data structure technique that allows quick searching and retrieval of data from a database.
  • Example: Creating an index on the User ID column to speed up searches.
  • Learn More: Database Indexing

17. Replication

  • Definition: The process of copying and maintaining database objects across the multiple databases that make up a distributed database system.
  • Example: Using replication to keep a database highly available across different geographic locations.
  • Learn More: Database Replication

18. Failover

  • Definition: A backup operational mode in which the functions of a failed primary component are taken over by a standby component.
  • Example: Automatically failing over to standby servers when a primary server of a web application fails.
  • Learn More: Failover vs Disaster Recovery

19. API Gateway

  • Definition: A server that sits at the front of an API, receiving API requests, applying throttling and security policies, and then forwarding them to back-end services.
  • Example: Using AWS API Gateway to manage APIs.
  • Learn More: What is an API Gateway?

20. Service Mesh

  • Definition: A dedicated infrastructure layer for facilitating service-to-service communications between microservices.
  • Example: Integrating Istio as a service mesh for the management of microservice interactions.
  • Learn More: Introduction to Service Mesh

21. Serverless Computing

  • Definition: A cloud computing model in which the cloud provider dynamically allocates machine resources on demand.
  • Example: Running backend code without provisioning any servers using AWS Lambda.
  • Learn More: What is Serverless Computing?

22. Event-Driven Architecture

  • Definition: A software architecture paradigm built around the production, detection, and consumption of, and reaction to, events.
  • Example: Designing a system in which microservices communicate through events using Apache Kafka.
  • Learn More: Event-Driven Architecture

23. Monolithic Architecture

  • Definition: A software architecture in which all components are combined into a single application and run as a single service.
  • Example: Traditional enterprise applications built as one large unit.
  • Learn More: Monolithic vs Microservices Architecture

24. Distributed Systems

  • Definition: A model wherein components located on networked computers communicate with each other and coordinate their actions by passing messages.
  • Example: Designing a distributed file system like Hadoop.
  • Learn More: Introduction to Distributed Systems

25. Message Queue

  • Definition: This method allows asynchronous, service-to-service communication in both serverless and microservices architectures.
  • Example: Using RabbitMQ to queue messages between services.
  • Learn More: Message Queues Explained

26. Pub/Sub Model

  • Definition: A messaging pattern in which publishers send messages to a topic without knowing the identity of the subscribers that will receive them.
  • Example: A notification system that uses Google Cloud Pub/Sub.
  • Learn More: Pub/Sub Messaging

27. Data Partitioning

  • Definition: Division of a database into smaller, manageable parts.
  • Example: Partitioning a table in a database by date to allow super-fast query execution.
  • Learn More: Database Partitioning

28. Horizontal Scaling

  • Definition: Increasing the capacity by adding more machines or nodes within a system.
  • Example: Adding more web servers to handle an increasing volume of user traffic.
  • Learn More: Horizontal vs Vertical Scaling

29. Vertical Scaling

  • Definition: Increasing the capacity of an existing machine by adding more power, such as CPU or RAM.
  • Example: Upgrading a server’s RAM so that it can handle more requests at once.
  • Learn More: Horizontal vs Vertical Scaling

30. Rate Limiting

  • Definition: Controlling the number of requests a client can send to a service within a given time window.
  • Example: Throttling an API to prevent abusive behaviour.
  • Learn More: Understanding Rate Limiting

31. Circuit Breaker Pattern

  • Definition: A design pattern that detects failures and prevents a failing operation from being retried endlessly.
  • Example: Using a circuit breaker to handle failed remote service calls in a microservices architecture.
  • Learn More: Circuit Breaker Pattern

32. Data Consistency

  • Definition: Ensuring that data is the same across multiple instances and is not corrupted.
  • Example: Maintaining the consistency of user data through multiple replicas of a database.
  • Learn More: Data Consistency Models

33. Eventual Consistency

  • Definition: A consistency model used in distributed computing to achieve high availability: updates propagate asynchronously, and all nodes eventually reflect the latest value.
  • Example: Amazon DynamoDB uses eventually consistent reads by default.
  • Learn More: Eventual Consistency
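The replica-lag behaviour can be sketched with a toy primary/replica pair and an explicit replication step (class and method names are invented; a real system replicates continuously in the background):

```python
class ReplicatedStore:
    """Writes hit the primary and reach the replica asynchronously,
    so a replica read can briefly return stale data."""
    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []              # replication log not yet applied

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def read_replica(self, key):
        return self.replica.get(key)   # may lag behind the primary

    def replicate(self):
        # In a real system this runs continuously in the background.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

db = ReplicatedStore()
db.write("balance", 100)
stale = db.read_replica("balance")
print(stale)           # None: the replica has not caught up yet
db.replicate()
fresh = db.read_replica("balance")
print(fresh)           # 100: the replicas have converged
```

The window between the stale and fresh reads is exactly the "eventual" in eventual consistency.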

34. Strong Consistency

  • Definition: A consistency model ensuring every read gets the most recent write on a given unit of data.
  • Example: Using strong consistency in a financial transaction system.
  • Learn More: Strong Consistency

35. Containerization

  • Definition: Packaging an application together with its dependencies into a container so it can run in any computational environment.
  • Example: Using Docker to containerize applications for deployment across environments such as dev, test, and prod.
  • Learn More: What is Containerization?

36. Kubernetes

  • Definition: An open-source platform that automates the process of application container deployment, scaling, and operation.
  • Example: Deploying and running containerized applications with Kubernetes.
  • Learn More: Kubernetes Documentation

37. Autoscaling

  • Definition: Automatically adjusting the number of computational resources based on the user load.
  • Example: Utilizing AWS EC2 Auto Scaling feature to dynamically adjust the number of instances.
  • Learn More: Auto Scaling Explained

38. Multi-Tenancy

  • Definition: Architecture where a single instance of a software application serves multiple consumers/customers.
  • Example: SaaS applications such as Salesforce use multi-tenancy to serve many different customers from a single instance.
  • Learn More: Single Tenancy vs Multi-Tenancy

39. Load Shedding

  • Definition: Dropping some requests or degrading service in order to keep the overall system healthy under high load.
  • Example: Turning off non-essential features during peak traffic so core functionality stays responsive.
  • Learn More: Load Shedding

40. Idempotence

  • Definition: A property of certain mathematical and computer-science operations whereby performing the operation multiple times has the same effect as performing it once.
  • Example: An HTTP DELETE request is idempotent.
  • Learn More: Idempotence in APIs
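In APIs, idempotence is often achieved with idempotency keys, which make retries safe; the sketch below uses invented names (`charge`, `req-42`) to illustrate the pattern rather than any specific payment API.

```python
processed = {}   # idempotency key -> cached result of the first attempt

def charge(idempotency_key, amount, ledger):
    """Replaying the same request (same key) has no additional effect."""
    if idempotency_key in processed:
        return processed[idempotency_key]   # replay: return cached result
    ledger.append(amount)                   # the real side effect
    processed[idempotency_key] = len(ledger) - 1
    return processed[idempotency_key]

ledger = []
charge("req-42", 100, ledger)
charge("req-42", 100, ledger)   # client retried after a timeout
print(ledger)                   # [100]: charged exactly once
```

This is why clients can safely retry a failed request: duplicates are detected by key, not by guessing whether the first attempt went through.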

41. Quorum

  • Definition: The minimum number of nodes that must agree for a distributed operation, such as committing a transaction, to succeed.
  • Example: Quorum-based replication ensures consistency in a distributed database.
  • Learn More: Quorum Systems
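The key arithmetic behind quorum-based replication: with N replicas, a write quorum W and a read quorum R are guaranteed to overlap whenever W + R > N, so every read contacts at least one replica holding the latest acknowledged write. A one-function sketch:

```python
def quorum_overlap(n, w, r):
    """True if a W-node write quorum and an R-node read quorum must
    intersect among N replicas, i.e. reads always see the latest
    acknowledged write."""
    return w + r > n

# Classic Dynamo-style configuration: N=3, W=2, R=2.
print(quorum_overlap(3, 2, 2))   # True:  2 + 2 > 3, quorums overlap
print(quorum_overlap(3, 1, 1))   # False: 1 + 1 <= 3, stale reads possible
```

Tuning W and R trades latency for consistency: lower quorums respond faster but give up the overlap guarantee.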

42. Orchestration

  • Definition: A pattern of service interaction where a central coordinator controls the interaction between services.
  • Example: Using a workflow engine to manage a multi-step business process.
  • Learn More: Orchestration

43. Choreography

  • Definition: A service interaction pattern in which each service acts independently and communicates with others through events, with no central coordinator or orchestrator.
  • Example: Microservices communicating through an event bus.
  • Learn More: Choreography vs. Orchestration

44. Service Registry

  • Definition: A database that keeps track of instances of microservices.
  • Example: Using the Eureka service registry in a microservice architecture.
  • Learn More: Service Registry and Discovery

45. API Rate Limiting

  • Definition: Controlling how many requests a client can make to an API within a given timeframe.
  • Example: Limiting an API to 100 requests per minute per client to prevent abuse.
  • Learn More: API Rate Limiting

46. Data Warehouse

  • Definition: A central repository of integrated data used for reporting and business analytics; the hub of business intelligence.
  • Example: Using Amazon Redshift as a data warehouse.
  • Learn More: Understanding Data Warehouses

47. Data Lake

  • Definition: A system or repository where data is kept in native/raw format, generally as object blobs or files.
  • Example: Storing and managing petabytes of structured and unstructured data in a data lake.
  • Learn More: Data Lake

48. OLAP

  • Definition: Online Analytical Processing: a category of software that enables analysis of data stored in a database.
  • Example: Using OLAP cubes for ad-hoc, multidimensional analytical queries.
  • Learn More: OLAP Explained

49. OLTP

  • Definition: Online Transaction Processing: a class of systems that manage transaction-oriented applications.
  • Example: Using OLTP systems to manage transactional data, as in banking systems.
  • Learn More: OLTP Explained

50. Big Data

  • Definition: Data sets so large and complex that conventional data-processing software cannot manage them efficiently.
  • Example: Analyzing social media interactions to predict fashion trends.
  • Learn More: Introduction to Big Data

Keep in mind that progressing in system design is all about continuous learning and practice. Work through the linked resources, get involved in discussions, and apply these concepts in your own projects; that is how the vocabulary and real-world usage of each concept will sink in.

Source: https://interviewnoodle.com/top-50-system-design-terminologies-you-must-know-3c78f5fb99c1