{"id":1168,"date":"2026-05-23T17:00:00","date_gmt":"2026-05-23T22:00:00","guid":{"rendered":"https:\/\/tolinku.com\/blog\/?p=1168"},"modified":"2026-03-07T03:35:00","modified_gmt":"2026-03-07T08:35:00","slug":"webhook-rate-limiting","status":"publish","type":"post","link":"https:\/\/tolinku.com\/blog\/webhook-rate-limiting\/","title":{"rendered":"Webhook Rate Limiting: Handling High-Volume Events"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">A viral campaign link can go from 10 clicks per minute to 10,000 clicks per minute overnight. Each click generates a <code>link.clicked<\/code> webhook event. If your receiver processes events synchronously and writes to a database on each request, it falls over. If it responds slowly, Tolinku&#39;s 10-second timeout kicks in, the delivery is marked as failed, and retries add even more traffic.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rate limiting on the receiver side isn&#39;t about rejecting events. It&#39;s about absorbing traffic spikes without losing data or destabilizing downstream systems. This guide covers strategies for handling high-volume <a href=\"https:\/\/tolinku.com\/features\/webhooks\">Tolinku webhooks<\/a> gracefully. For the general webhook architecture, see the <a href=\"https:\/\/tolinku.com\/blog\/webhooks-integrations-deep-linking\/\">webhooks and integrations pillar post<\/a>. 
For retry behavior, see the <a href=\"https:\/\/tolinku.com\/blog\/webhook-retry-logic\/\">retry logic guide<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><img decoding=\"async\" src=\"https:\/\/tolinku.com\/blog\/wp-content\/uploads\/2026\/03\/platform-webhooks.png\" alt=\"Tolinku webhook configuration for event notifications\">\n<em>The webhooks page with create form, webhook list, and delivery log.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How Tolinku Delivers Webhooks<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Understanding the delivery characteristics helps you design your receiver:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Timeout<\/strong>: 10 seconds. If your endpoint doesn&#39;t respond within 10 seconds, the delivery fails.<\/li>\n<li><strong>Retries<\/strong>: 3 retries at 1 minute, 5 minutes, and 30 minutes after a failure.<\/li>\n<li><strong>Concurrency<\/strong>: Tolinku may deliver multiple events simultaneously. There&#39;s no guarantee of sequential delivery.<\/li>\n<li><strong>No redirects<\/strong>: Tolinku won&#39;t follow HTTP redirects. Your endpoint must respond directly.<\/li>\n<li><strong>Success criteria<\/strong>: HTTP status 200-299. Anything else triggers a retry.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">The critical implication: your endpoint must respond quickly. Everything else can happen asynchronously.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Strategy 1: Respond First, Process Later<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The most important pattern for handling any volume of webhooks. 
Separate the acknowledgement from the processing.<\/p>\n\n\n\n<pre><code class=\"language-typescript\">import express from &#39;express&#39;;\nimport crypto from &#39;crypto&#39;;\n\nconst app = express();\napp.use(&#39;\/webhooks&#39;, express.raw({ type: &#39;application\/json&#39; }));\n\napp.post(&#39;\/webhooks\/tolinku&#39;, (req, res) =&gt; {\n  \/\/ Verify signature (fast: ~0.1ms)\n  const signature = Buffer.from((req.headers[&#39;x-webhook-signature&#39;] as string | undefined) ?? &#39;&#39;);\n  const expected = Buffer.from(\n    crypto\n      .createHmac(&#39;sha256&#39;, process.env.WEBHOOK_SECRET!)\n      .update(req.body)\n      .digest(&#39;hex&#39;)\n  );\n\n  \/\/ Compare in constant time; timingSafeEqual throws on unequal lengths, so check length first\n  if (signature.length !== expected.length || !crypto.timingSafeEqual(signature, expected)) {\n    return res.status(401).send(&#39;Invalid signature&#39;);\n  }\n\n  \/\/ Respond immediately\n  res.status(200).send(&#39;OK&#39;);\n\n  \/\/ Process asynchronously (does not block the response)\n  processAsync(req.body).catch(err =&gt;\n    console.error(&#39;Processing failed:&#39;, err.message)\n  );\n});\n\nasync function processAsync(rawBody: Buffer) {\n  const event = JSON.parse(rawBody.toString());\n  \/\/ Your processing logic here\n}\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Response time is typically under 5ms, so the 10-second timeout is irrelevant: you&#39;ve already responded. Processing happens in the background. If it fails, you&#39;ve still acknowledged the delivery, so Tolinku won&#39;t retry and the event is lost unless you log or persist it.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Strategy 2: Queue as a Buffer<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">For sustained high volume, an in-process async function isn&#39;t enough. 
A message queue absorbs spikes and lets you process at a controlled rate.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">With Redis (BullMQ)<\/h3>\n\n\n\n<pre><code class=\"language-typescript\">import { Queue, Worker } from &#39;bullmq&#39;;\nimport IORedis from &#39;ioredis&#39;;\n\n\/\/ BullMQ workers require maxRetriesPerRequest to be null on the ioredis connection\nconst connection = new IORedis(process.env.REDIS_URL!, { maxRetriesPerRequest: null });\nconst webhookQueue = new Queue(&#39;webhooks&#39;, { connection });\n\n\/\/ Receiver: enqueue immediately\napp.post(&#39;\/webhooks\/tolinku&#39;, async (req, res) =&gt; {\n  \/\/ Verify signature...\n  res.status(200).send(&#39;OK&#39;);\n\n  const eventHash = crypto\n    .createHash(&#39;sha256&#39;)\n    .update(req.body)\n    .digest(&#39;hex&#39;)\n    .substring(0, 16);\n\n  try {\n    await webhookQueue.add(&#39;process&#39;, {\n      body: req.body.toString(),\n      eventType: req.headers[&#39;x-webhook-event&#39;],\n    }, {\n      jobId: eventHash, \/\/ Deduplicate retries\n    });\n  } catch (err) {\n    \/\/ The response is already sent, so log the failure instead of leaving the rejection unhandled\n    console.error(&#39;Enqueue failed:&#39;, (err as Error).message);\n  }\n});\n\n\/\/ Worker: process at controlled rate\nconst worker = new Worker(&#39;webhooks&#39;, async (job) =&gt; {\n  const event = JSON.parse(job.data.body);\n  await processEvent(event);\n}, {\n  connection,\n  concurrency: 10,       \/\/ Process up to 10 events concurrently\n  limiter: {\n    max: 100,             \/\/ Max 100 jobs\n    duration: 1000,       \/\/ Per second\n  },\n});\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The <code>limiter<\/code> option controls throughput to downstream systems. If your database can handle 100 writes per second, set the limiter to match. Events exceeding that rate queue up and process when capacity is available.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The <code>jobId<\/code> based on the event hash deduplicates webhook retries automatically. 
If Tolinku sends the same event twice, BullMQ ignores the duplicate.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">With AWS SQS<\/h3>\n\n\n\n<pre><code class=\"language-typescript\">import { SQSClient, SendMessageCommand } from &#39;@aws-sdk\/client-sqs&#39;;\n\nconst sqs = new SQSClient({});\n\napp.post(&#39;\/webhooks\/tolinku&#39;, async (req, res) =&gt; {\n  \/\/ Verify signature...\n  res.status(200).send(&#39;OK&#39;);\n\n  const eventHash = crypto\n    .createHash(&#39;sha256&#39;)\n    .update(req.body)\n    .digest(&#39;hex&#39;);\n\n  try {\n    await sqs.send(new SendMessageCommand({\n      QueueUrl: process.env.SQS_QUEUE_URL!, \/\/ Must be a FIFO queue for the two fields below\n      MessageBody: req.body.toString(),\n      MessageDeduplicationId: eventHash,\n      MessageGroupId: req.headers[&#39;x-webhook-event&#39;] as string,\n    }));\n  } catch (err) {\n    \/\/ The response is already sent, so log the failure instead of leaving the rejection unhandled\n    console.error(&#39;Enqueue failed:&#39;, (err as Error).message);\n  }\n});\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">SQS FIFO queues provide built-in deduplication via <code>MessageDeduplicationId<\/code>. Both <code>MessageDeduplicationId<\/code> and <code>MessageGroupId<\/code> are valid only on FIFO queues, and deduplication applies within a 5-minute window, so retries that arrive later than that are not deduplicated by SQS. Standard queues offer higher throughput but don&#39;t deduplicate.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Strategy 3: Load Shedding<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">When the system is overwhelmed, it&#39;s better to drop low-priority events than to crash. 
Load shedding prioritizes important events over noise.<\/p>\n\n\n\n<pre><code class=\"language-typescript\">const PRIORITY_EVENTS = new Set([\n  &#39;install.tracked&#39;,\n  &#39;referral.created&#39;,\n  &#39;referral.completed&#39;,\n]);\n\nlet activeProcessing = 0;\nconst MAX_CONCURRENT = 50;\n\napp.post(&#39;\/webhooks\/tolinku&#39;, (req, res) =&gt; {\n  \/\/ Verify signature...\n\n  const eventType = req.headers[&#39;x-webhook-event&#39;] as string;\n\n  \/\/ Always accept and process priority events\n  if (PRIORITY_EVENTS.has(eventType)) {\n    res.status(200).send(&#39;OK&#39;);\n    processAsync(req.body).catch(err =&gt;\n      console.error(&#39;Processing failed:&#39;, err.message)\n    );\n    return;\n  }\n\n  \/\/ Shed low-priority events when overloaded\n  if (activeProcessing &gt;= MAX_CONCURRENT) {\n    \/\/ Still respond 200 to prevent retries (we&#39;re intentionally shedding)\n    res.status(200).send(&#39;OK&#39;);\n    console.log(`Shed ${eventType} event (${activeProcessing} active)`);\n    return;\n  }\n\n  res.status(200).send(&#39;OK&#39;);\n  activeProcessing++;\n  \/\/ Catch failures so a rejected promise never goes unhandled, then release the slot\n  processAsync(req.body)\n    .catch(err =&gt; console.error(&#39;Processing failed:&#39;, err.message))\n    .finally(() =&gt; activeProcessing--);\n});\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Note: we respond 200 even for shed events. Responding with 429 or 503 would trigger Tolinku&#39;s retry logic, adding more traffic during the spike. If you&#39;re intentionally shedding, acknowledge the delivery and move on.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For <code>link.clicked<\/code> events (the highest volume), losing a small percentage during a spike is usually acceptable since the analytics will still be directionally correct. 
Losing <code>install.tracked<\/code> or <code>referral.completed<\/code> events is not acceptable, hence the priority system.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Strategy 4: Receiver Auto-Scaling<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">If you deploy your receiver on an auto-scaling platform (AWS Lambda, Google Cloud Run, Fly.io), the infrastructure handles spikes by spinning up more instances.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">AWS Lambda<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Lambda scales automatically per request. Each invocation handles one webhook delivery.<\/p>\n\n\n\n<pre><code class=\"language-typescript\">\/\/ handler.ts (Lambda function)\nimport crypto from &#39;crypto&#39;;\n\nexport async function handler(event: any) {\n  \/\/ The body may arrive base64-encoded; decode it before verifying\n  const body = event.isBase64Encoded\n    ? Buffer.from(event.body ?? &#39;&#39;, &#39;base64&#39;).toString(&#39;utf8&#39;)\n    : (event.body ?? &#39;&#39;);\n  const signature = Buffer.from(event.headers[&#39;x-webhook-signature&#39;] ?? &#39;&#39;);\n\n  const expected = Buffer.from(\n    crypto\n      .createHmac(&#39;sha256&#39;, process.env.WEBHOOK_SECRET!)\n      .update(body)\n      .digest(&#39;hex&#39;)\n  );\n\n  \/\/ Compare in constant time; timingSafeEqual throws on unequal lengths, so check length first\n  if (signature.length !== expected.length || !crypto.timingSafeEqual(signature, expected)) {\n    return { statusCode: 401, body: &#39;Invalid signature&#39; };\n  }\n\n  const webhookEvent = JSON.parse(body);\n\n  \/\/ Process or enqueue\n  await processEvent(webhookEvent);\n\n  return { statusCode: 200, body: &#39;OK&#39; };\n}\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Lambda concurrency limits protect downstream systems. Set a reserved concurrency of 100 to cap the number of simultaneous executions. Invocations beyond the limit are throttled and return a non-2xx response (such as 429), which Tolinku treats as a failed delivery and retries later. Because retries stop after three attempts, size the limit so that sustained throttling doesn&#39;t end up dropping events.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Google Cloud Run<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Cloud Run scales containers based on request concurrency. 
Set the <code>--max-instances<\/code> flag to cap total capacity:<\/p>\n\n\n\n<pre><code class=\"language-bash\">gcloud run deploy webhook-receiver \\\n  --max-instances=10 \\\n  --concurrency=80 \\\n  --timeout=10\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This gives you up to 800 concurrent webhook requests (10 instances x 80 concurrency).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Rate Limiting Downstream Calls<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Your receiver might handle thousands of events per second, but your downstream systems (database, CRM, analytics API) probably can&#39;t. Rate limit the outbound calls.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Token Bucket Rate Limiter<\/h3>\n\n\n\n<pre><code class=\"language-typescript\">class TokenBucket {\n  private tokens: number;\n  private lastRefill: number;\n\n  constructor(\n    private readonly maxTokens: number,\n    private readonly refillRate: number, \/\/ tokens per second\n  ) {\n    this.tokens = maxTokens;\n    this.lastRefill = Date.now();\n  }\n\n  async acquire(): Promise&lt;boolean&gt; {\n    this.refill();\n    if (this.tokens &gt;= 1) {\n      this.tokens--;\n      return true;\n    }\n    return false;\n  }\n\n  private refill() {\n    const now = Date.now();\n    const elapsed = (now - this.lastRefill) \/ 1000;\n    this.tokens = Math.min(this.maxTokens, this.tokens + elapsed * this.refillRate);\n    this.lastRefill = now;\n  }\n}\n\n\/\/ Allow 50 database writes per second with burst capacity of 100\nconst dbLimiter = new TokenBucket(100, 50);\n\nasync function processEvent(event: any) {\n  if (await dbLimiter.acquire()) {\n    await writeToDatabase(event);\n  } else {\n    \/\/ Queue for later or drop if non-critical\n    await enqueueForLater(event);\n  }\n}\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Monitoring Under Load<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">During high-volume periods, watch these metrics:<\/p>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><strong>Queue depth<\/strong>: If it&#39;s growing faster than it&#39;s draining, your workers need to scale up or you need to shed more aggressively.<\/li>\n<li><strong>Response time p99<\/strong>: Should stay under 100ms. If it spikes, your receiver is doing too much synchronous work.<\/li>\n<li><strong>Event loss rate<\/strong>: How many events are being shed? Is it within acceptable bounds?<\/li>\n<li><strong>Downstream error rate<\/strong>: Are your database, CRM, or analytics APIs returning errors under load?<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">See the <a href=\"https:\/\/tolinku.com\/blog\/webhook-delivery-monitoring\/\">delivery monitoring guide<\/a> for a complete monitoring setup.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Choosing Your Strategy<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Situation<\/th>\n<th>Strategy<\/th>\n<\/tr>\n<\/thead>\n<tbody><tr>\n<td>Occasional traffic spikes<\/td>\n<td>Respond first, process later<\/td>\n<\/tr>\n<tr>\n<td>Sustained high volume with multiple destinations<\/td>\n<td>Queue as buffer<\/td>\n<\/tr>\n<tr>\n<td>Extreme spikes with non-critical event types<\/td>\n<td>Load shedding<\/td>\n<\/tr>\n<tr>\n<td>Variable traffic on managed infrastructure<\/td>\n<td>Auto-scaling (Lambda\/Cloud Run)<\/td>\n<\/tr>\n<tr>\n<td>Rate-sensitive downstream systems<\/td>\n<td>Token bucket on outbound calls<\/td>\n<\/tr>\n<\/tbody><\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Most teams need a combination. Start with &quot;respond first, process later.&quot; Add a queue when you need reliability guarantees. Add load shedding when you need to protect critical event types during extreme spikes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For real-time processing patterns beyond rate limiting, see the <a href=\"https:\/\/tolinku.com\/blog\/real-time-event-processing\/\">real-time event processing guide<\/a>. 
For testing your receiver under load, see the <a href=\"https:\/\/tolinku.com\/blog\/webhook-testing-tools\/\">webhook testing tools guide<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Handle high-volume webhook traffic without dropping events. Implement rate limiting, backpressure, and load shedding for deep link event receivers.<\/p>\n","protected":false},"author":2,"featured_media":1167,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"rank_math_title":"Webhook Rate Limiting: Handling High-Volume Events","rank_math_description":"Handle high-volume webhook traffic without dropping events. Implement rate limiting, backpressure, and load shedding for deep link event receivers.","rank_math_focus_keyword":"webhook rate limiting","rank_math_canonical_url":"","rank_math_facebook_title":"","rank_math_facebook_description":"","rank_math_facebook_image":"https:\/\/tolinku.com\/blog\/wp-content\/uploads\/2026\/03\/og-webhook-rate-limiting.png","rank_math_facebook_image_id":"","rank_math_twitter_title":"","rank_math_twitter_description":"","rank_math_twitter_image":"https:\/\/tolinku.com\/blog\/wp-content\/uploads\/2026\/03\/og-webhook-rate-limiting.png","footnotes":""},"categories":[15],"tags":[62,20,264,296,295,297,61],"class_list":["post-1168","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-engineering","tag-api","tag-deep-linking","tag-engineering","tag-performance","tag-rate-limiting","tag-scalability","tag-webhooks"],"_links":{"self":[{"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/posts\/1168","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json
\/wp\/v2\/comments?post=1168"}],"version-history":[{"count":2,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/posts\/1168\/revisions"}],"predecessor-version":[{"id":2265,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/posts\/1168\/revisions\/2265"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/media\/1167"}],"wp:attachment":[{"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/media?parent=1168"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/categories?post=1168"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/tags?post=1168"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}