Designing a DSP architecture to handle 1M QPS of CPM ads without overspending
1 point • by charzlie • 14 days ago
I'm working on the system architecture for a high-throughput AdTech DSP and would love feedback from people who've built large-scale bidding / serving systems.

Constraints / Goals

- DSP only (no exchange)
- Target: 1M ad requests/sec
- End-to-end DSP latency budget: ~100ms
- Pricing model: CPM
- Hard requirement: no advertiser or campaign overspend

Targeting / Campaign Fetch

- I modeled targeting (geo, interests, etc.) using Redis + Roaring Bitmaps
- Fetching candidate campaigns alone:
  - Redis: ~1000 RPS at ~8ms (local machine, not cloud)
  - Aerospike: ~200–400 RPS at ~10ms
- This is only campaign fetching, not bidding or scoring

Budget / Wallet Model

- Advertiser has a wallet
- Campaign has:
  - Total budget
  - Daily budget
  - Daily spend tracking
- Overspend is not acceptable (even a small % matters at scale)

Budget Control Approaches Considered

- Splitting daily budgets into hourly buckets
- Rate limiting via:
  - Token bucket
  - PID controllers
- These reduce overspend but don't guarantee correctness under bursty traffic
- Recently considering micros (integer currency units) to reduce rounding errors

Open Questions

- At 1M QPS, how do people actually enforce budget guarantees?
  - Soft overspend with reconciliation?
  - Hard atomic checks in the hot path?
- Is Redis bitmap-based targeting viable at this scale, or does everyone eventually:
  - Pre-materialize campaign sets?
  - Push logic into memory / C++?
- How do you balance:
  - Strict budget enforcement
  - Low latency
  - High throughput
  without introducing global locks or cross-region contention?
- Is "no overspend ever" a realistic requirement, or is bounded error the industry norm?

I'm less interested in textbook answers and more in what has actually worked (or failed) in production.
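To make the bitmap-targeting part concrete, here's a toy sketch of the candidate fetch. Plain Python integers stand in for the Roaring Bitmaps / Redis structures; the segment names are made up for illustration:

```python
# Each targeting segment (geo, interest, ...) is a bitset over
# campaign IDs; a request's eligible candidates are the bitwise AND
# of the segments it matches. A Python int stands in for a Roaring
# Bitmap here.

def bitset(campaign_ids):
    """Build a bitset with one bit per campaign ID."""
    b = 0
    for cid in campaign_ids:
        b |= 1 << cid
    return b

def candidates(*segment_bitsets):
    """Intersect segment bitsets and decode the matching campaign IDs."""
    acc = segment_bitsets[0]
    for b in segment_bitsets[1:]:
        acc &= b
    out, i = [], 0
    while acc:
        if acc & 1:
            out.append(i)
        acc >>= 1
        i += 1
    return out

geo_us = bitset([1, 2, 5, 9])      # campaigns targeting US geo
interest_auto = bitset([2, 3, 9])  # campaigns targeting "auto" interest

print(candidates(geo_us, interest_auto))  # campaigns matching both
```

In the real setup the AND would run server-side (e.g. Redis BITOP or Roaring's intersection), so only the final candidate set crosses the wire.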
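For reference, the token-bucket pacing I mentioned looks roughly like this. It's a simplified single-process sketch (real deployment would be per campaign shard), and it shows exactly the weakness I described: it smooths spend but gives no hard guarantee under bursts:

```python
import time

class TokenBucket:
    """Minimal token-bucket pacer: tokens are budget micros released
    at a steady rate; a bid is allowed only if the bucket can cover
    its cost. Reduces burst overspend but is not a hard guarantee."""

    def __init__(self, rate_micros_per_sec: float, capacity_micros: int):
        self.rate = rate_micros_per_sec
        self.capacity = capacity_micros
        self.tokens = float(capacity_micros)  # start full
        self.last = time.monotonic()

    def try_spend(self, cost_micros: int) -> bool:
        # Refill based on elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost_micros:
            self.tokens -= cost_micros
            return True
        return False

bucket = TokenBucket(rate_micros_per_sec=1000, capacity_micros=5000)
print(bucket.try_spend(5000))  # drains the bucket
```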
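By "micros" I mean something like the following: keep all money as integer micro-units (1 currency unit = 1,000,000 micros) so per-impression CPM charges never accumulate float rounding error. The function names are just illustrative:

```python
MICROS_PER_UNIT = 1_000_000  # 1 currency unit = 1,000,000 micros

def cpm_to_micros_per_impression(cpm_units: float) -> int:
    """Convert a CPM price (cost per 1000 impressions, in currency
    units) to an integer cost per single impression, in micros.
    Floor division rounds down, which errs on the safe side for
    overspend."""
    # e.g. $2.50 CPM -> 2,500,000 micros per 1000 imps -> 2,500 micros/imp
    return round(cpm_units * MICROS_PER_UNIT) // 1000

def charge(budget_micros: int, cpm_units: float, impressions: int) -> int:
    """Debit a budget for N impressions; all arithmetic stays integer."""
    return budget_micros - cpm_to_micros_per_impression(cpm_units) * impressions

# $10 budget, $2.50 CPM, 1000 impressions served
remaining = charge(10 * MICROS_PER_UNIT, 2.50, 1000)
print(remaining)  # 7_500_000 micros = $7.50 left
```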
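And the "hard atomic check in the hot path" option I'm weighing is essentially check-and-reserve before bidding, release on loss. Sketched here with a local lock purely to show the invariant; in production this would be a CAS/script in a store like Redis or Aerospike, or a single-writer shard per campaign:

```python
import threading

class BudgetLedger:
    """Sketch of atomic check-and-reserve: reserve the bid cost before
    bidding, settle (or release) after the auction. The invariant is
    remaining - reserved >= 0 at all times, so overspend is impossible
    at the cost of a serialized check per bid."""

    def __init__(self, budget_micros: int):
        self._lock = threading.Lock()
        self.remaining = budget_micros  # unspent budget
        self.reserved = 0               # held by in-flight bids

    def try_reserve(self, cost_micros: int) -> bool:
        with self._lock:
            if self.remaining - self.reserved >= cost_micros:
                self.reserved += cost_micros
                return True
            return False  # would risk overspend; do not bid

    def settle(self, cost_micros: int, won: bool) -> None:
        with self._lock:
            self.reserved -= cost_micros
            if won:
                self.remaining -= cost_micros  # loss releases the hold

ledger = BudgetLedger(1000)
print(ledger.try_reserve(600))  # True: 600 held, 400 still free
```

The trade-off I'm asking about is exactly this serialization point: per-campaign it's cheap, but a shared wallet across campaigns turns it into the contention hotspot.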