System Design Methodology

How to run a system-design walkthrough top-down: gather functional and non-functional requirements, do back-of-envelope capacity math, pick an availability budget in nines, justify the API style, separate HLD from LLD, plan for operations, and avoid the classic interview pitfalls.

A system-design interview is not about guessing the right architecture; it is about running a repeatable top-down process. First you establish what the system does and what load it runs under, then you estimate that load to an order of magnitude, then you fix the availability you can promise, and only then do you draw components. Skip any of the early steps and you are designing blind: you do not know what to size, which components are critical, or how to justify each decision.

The central trap of this topic is starting from the end. The candidate hears "design a marketplace" and immediately drags Kafka, a sharded Postgres, and a distributed cache onto the board without writing down a single user role or a single load number. The result is a system over-engineered for requirements that do not exist, built on a technology the candidate does not understand, and aimed at showing off breadth rather than solving the given task. This topic breaks the methodology into layers, from gathering requirements to operational readiness, so that the answer has a skeleton rather than a stream of consciousness.

Topic Map

Gathering requirements — functional requirements (what the system does: roles, permissions, search, payment) versus non-functional ones (DAU/MAU, growth, content size, read/write ratio, latency, and availability).
Capacity estimation — back-of-envelope math: from DAU and actions per user derive average and peak QPS, storage per year, and bandwidth, to find the critical components and justify the architecture.
Availability budgets — the nines and their cost in downtime per year, the difference between SLA, SLO, and SLI, and why you must not promise more nines than you are willing to pay for.
Choosing an API style — REST for simple, cacheable, external traffic versus gRPC for binary, typed, internal service-to-service traffic, with a justification for the choice.
HLD versus LLD — high-level design as boxes and arrows (services, data stores, queues, data flows) versus low-level design (schemas, indexes, API signatures, structs, and algorithms).
Operational readiness — backups with a tested restore, versioned migrations, metrics, and distributed tracing, without which a design stays just a picture.
Interview pitfalls — a trendy technology you do not understand, designing beyond the requirements, trying to dazzle with everything at once, ignoring product metrics, and over-engineering.
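The capacity-estimation step from the map above can be sketched in a few lines of Go. This is a minimal illustration, not a prescribed method: the `Workload` type, its field names, and every input value (one million DAU, 20 actions per user, a peak factor of 3, and the payload sizes) are assumptions made up for the example.

```go
package main

import "fmt"

// Workload holds the non-functional inputs gathered from the interviewer.
// The type and all values used in main are illustrative assumptions.
type Workload struct {
	DAU            float64 // daily active users
	ActionsPerUser float64 // requests per user per day
	PeakFactor     float64 // assumed peak-to-average ratio
	WriteShare     float64 // fraction of requests that store data
	BytesPerWrite  float64 // average stored payload per write
	BytesPerResp   float64 // average response size
}

const secondsPerDay = 86_400

// AvgQPS spreads the daily request volume evenly over 24 hours.
func (w Workload) AvgQPS() float64 {
	return w.DAU * w.ActionsPerUser / secondsPerDay
}

// PeakQPS scales the average by the assumed peak factor.
func (w Workload) PeakQPS() float64 {
	return w.AvgQPS() * w.PeakFactor
}

// StoragePerYearBytes counts raw payload only: no replication, no indexes.
func (w Workload) StoragePerYearBytes() float64 {
	return w.DAU * w.ActionsPerUser * w.WriteShare * w.BytesPerWrite * 365
}

// PeakEgressBytesPerSec is the outbound bandwidth needed at peak.
func (w Workload) PeakEgressBytesPerSec() float64 {
	return w.PeakQPS() * w.BytesPerResp
}

func main() {
	w := Workload{
		DAU:            1_000_000,
		ActionsPerUser: 20,
		PeakFactor:     3,
		WriteShare:     0.1,
		BytesPerWrite:  1_000,
		BytesPerResp:   10_000,
	}
	fmt.Printf("avg QPS:      %.0f\n", w.AvgQPS())
	fmt.Printf("peak QPS:     %.0f\n", w.PeakQPS())
	fmt.Printf("storage/year: %.0f GB\n", w.StoragePerYearBytes()/1e9)
	fmt.Printf("peak egress:  %.1f MB/s\n", w.PeakEgressBytesPerSec()/1e6)
}
```

With these assumed inputs the sketch gives about 231 QPS on average, about 694 QPS at peak, roughly 730 GB of raw storage per year, and about 7 MB/s of peak egress. Only the order of magnitude matters: it tells you whether one database node is plausible or whether storage and bandwidth become the critical components.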

Common Mistakes and Traps

Mistake	Consequence
Drawing the architecture before gathering requirements	You design for a problem that does not exist — the solution misses the goal
Confusing functional and non-functional requirements	You size the wrong load and promise the wrong availability
Skipping capacity estimation	Nothing justifies the choice of components or the node count
Doing back-of-envelope math just for show	The numbers do not affect the design — estimation becomes a ritual
Taking average QPS instead of peak	The system falls over at the daily peak it was never sized for
Promising more nines than you can pay for	The SLA is breached — penalties and reputational loss
Confusing SLA, SLO, and SLI	You do not know what you measure and what you owe by contract
Choosing REST or gRPC without justification	An external API on gRPC is awkward for clients; an internal one on REST pays extra latency
Mixing HLD and LLD in one step	You drown in table schemas before the services and flows are drawn
Not planning backup and restore	A backup exists but the restore is untested — the data cannot be recovered
Reaching for a technology you do not understand	The design falls apart under follow-up questions
Designing more than the requirements need	Over-engineering — time goes to what the task never asked for

Interview Relevance

System design methodology is a mandatory senior topic in a Go interview, and what it tests is discipline of thought, not a catalog of technologies. The interviewer watches whether you run the walkthrough top-down: whether you ask clarifying questions and separate functional from non-functional requirements; whether you size the load out loud and use those numbers to choose components; whether you understand the cost of nines and the difference between SLA, SLO, and SLI; and whether you justify the API style and keep the HLD level apart from the LLD level. A strong candidate starts simple and complicates the system only when a concrete requirement forces it.

What interviewers usually check:

How you gather requirements and how functional requirements differ from non-functional ones.
How to derive average and peak QPS, storage, and bandwidth from DAU and actions per user, and why this estimation matters.
What the nines of availability mean in downtime per year and the difference between SLA, SLO, and SLI.
When to choose REST versus gRPC and how to justify the API style.
How high-level design (HLD) differs from low-level design (LLD) and in what order to do them.
What operational readiness includes — backups with a tested restore, migrations, metrics, and tracing.
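The arithmetic behind the nines is short enough to do on the spot. The sketch below converts an availability target into allowed downtime per year; the function name `DowntimePerYear` is made up for this example, and it assumes a 365-day year.

```go
package main

import (
	"fmt"
	"time"
)

// DowntimePerYear returns the downtime an availability target allows,
// assuming a 365-day year.
func DowntimePerYear(availability float64) time.Duration {
	const year = 365 * 24 * time.Hour
	return time.Duration((1 - availability) * float64(year))
}

func main() {
	for _, a := range []float64{0.99, 0.999, 0.9999, 0.99999} {
		d := DowntimePerYear(a).Round(time.Second)
		fmt.Printf("%.3f%% -> %v per year\n", a*100, d)
	}
}
```

Two nines allow roughly 3.65 days of downtime per year, three nines about 8.8 hours, four nines about 53 minutes, and five nines about 5 minutes. Each extra nine cuts the budget tenfold, which is why each one has to be paid for with redundancy and operational effort.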

A typical wrong answer is to start drawing boxes and arrows and reaching for trendy technologies without gathering requirements or sizing the load. The interviewer will then point out that without functional and non-functional requirements it is unclear what is being designed; that without QPS and storage estimates nothing justifies the choice of components; that promising five nines with no budget for them means a breached SLA later; and that showing off breadth instead of solving the given task is the most common way to fail a system-design interview.
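One way to show that the estimate actually drives the design is to turn QPS into a node count. The sketch below is a rough illustration under stated assumptions: the helper `NodesFor` is hypothetical, the per-node throughput of 100 QPS is invented, and the average and peak figures (about 231 and 694 QPS) are example inputs, not measurements.

```go
package main

import (
	"fmt"
	"math"
)

// NodesFor returns how many nodes are needed to serve qps when a single
// node sustains perNodeQPS. It ignores headroom and failover capacity,
// which a real sizing would add on top.
func NodesFor(qps, perNodeQPS float64) int {
	return int(math.Ceil(qps / perNodeQPS))
}

func main() {
	const perNodeQPS = 100 // assumed throughput of one node
	fmt.Println("nodes for average load:", NodesFor(231.48, perNodeQPS))
	fmt.Println("nodes for peak load:   ", NodesFor(694.44, perNodeQPS))
}
```

Sized for the average, this hypothetical service gets 3 nodes; sized for the peak, it needs 7. That gap is why the estimate must use peak QPS and why the numbers belong in the justification of each component.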

Why it matters

A system-design interview tests your order of thinking, not isolated knowledge of databases and queues: can you run the walkthrough top-down instead of jumping straight to boxes and arrows? Whoever starts from the architecture without writing down what the system does and what load it lives under designs a solution for a problem that does not exist. They over-engineer where there are no requirements and forget to size the QPS, storage, and availability budget that actually drive the choice of components. This methodology gives the answer a skeleton (requirements, capacity estimation, nines of availability, API style, HLD and LLD, operations) and names up front the traps that sink even strong engineers: reaching for a trendy technology they do not understand, designing more than the requirements need, and trying to show off all their knowledge instead of solving the task at hand.
