{"id":1162,"date":"2026-05-23T09:00:00","date_gmt":"2026-05-23T14:00:00","guid":{"rendered":"https:\/\/tolinku.com\/blog\/?p=1162"},"modified":"2026-03-07T03:34:59","modified_gmt":"2026-03-07T08:34:59","slug":"webhook-delivery-monitoring","status":"publish","type":"post","link":"https:\/\/tolinku.com\/blog\/webhook-delivery-monitoring\/","title":{"rendered":"Monitoring Webhook Delivery: Ensuring Reliability"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">A webhook integration that works today can silently break tomorrow. An SSL certificate expires, a server runs out of disk space, a deployment changes a route, or a firewall rule blocks the webhook IP. Without monitoring, you don&#39;t know events are being lost until someone asks why the analytics don&#39;t add up.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This guide covers how to monitor <a href=\"https:\/\/tolinku.com\/features\/webhooks\">Tolinku webhook<\/a> delivery from both sides: what Tolinku provides in its dashboard, and what you should track on your receiver. For the initial webhook setup, see the <a href=\"https:\/\/tolinku.com\/blog\/webhook-setup-guide\/\">webhook setup guide<\/a>. 
For debugging specific issues, see the <a href=\"https:\/\/tolinku.com\/blog\/webhook-debugging\/\">webhook debugging guide<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><img decoding=\"async\" src=\"https:\/\/tolinku.com\/blog\/wp-content\/uploads\/2026\/03\/platform-webhooks.png\" alt=\"Tolinku webhook configuration for event notifications\">\n<em>The webhooks page with create form, webhook list, and delivery log.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What Tolinku Tracks<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Every webhook delivery is logged with:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>delivery_id<\/strong>: Unique identifier for the delivery attempt<\/li>\n<li><strong>event_type<\/strong>: The event that triggered the delivery (e.g., <code>link.clicked<\/code>)<\/li>\n<li><strong>timestamp<\/strong>: When the delivery was attempted<\/li>\n<li><strong>status_code<\/strong>: The HTTP response code your endpoint returned (0 if timeout or network error)<\/li>\n<li><strong>response_time_ms<\/strong>: How long your endpoint took to respond<\/li>\n<li><strong>success<\/strong>: Whether the delivery was considered successful (status 200-299)<\/li>\n<li><strong>error_message<\/strong>: Description of the failure (e.g., &quot;Timeout&quot;, connection refused)<\/li>\n<li><strong>attempt<\/strong>: Which attempt this was (1 for the initial delivery, 2-4 for retries)<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Each webhook endpoint in the dashboard also shows:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>last_triggered_at<\/strong>: When the most recent event was sent<\/li>\n<li><strong>last_status_code<\/strong>: The response from the most recent delivery<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">You can view the last 50 deliveries per webhook in the <a href=\"https:\/\/tolinku.com\/docs\/user-guide\/webhooks\/\">Tolinku dashboard<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What You Should Track 
(Receiver Side)<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The dashboard shows what Tolinku sent. Your receiver should track what it received and how it processed the events. Together, the two views form a complete picture.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Key Metrics<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Events received per minute.<\/strong> Track the count of webhook requests hitting your endpoint. A sudden drop means either Tolinku stopped sending (check the dashboard) or your receiver is unreachable.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Processing latency.<\/strong> Time from receiving the webhook to completing your handler logic (database write, API call, queue push). High processing latency means your handler is doing too much synchronously. Respond with 200 first, then process asynchronously.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Error rate by type.<\/strong> Categorize errors:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Signature verification failures (potential security issue or misconfigured secret)<\/li>\n<li>Downstream failures (database down, API timeout, queue full)<\/li>\n<li>Parsing failures (unexpected payload format)<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Duplicate rate.<\/strong> How often you receive the same event more than once. 
Some duplication is normal (retries after timeouts), but a high rate might indicate your endpoint is responding too slowly, causing Tolinku to retry events that were actually received.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Structured Logging<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Log every webhook request with structured fields so you can query and aggregate:<\/p>\n\n\n\n<pre><code class=\"language-typescript\">import express from &#39;express&#39;;\nimport crypto from &#39;crypto&#39;;\n\nconst app = express();\napp.use(&#39;\/webhooks&#39;, express.raw({ type: &#39;application\/json&#39; }));\n\napp.post(&#39;\/webhooks\/tolinku&#39;, async (req, res) =&gt; {\n  const startTime = Date.now();\n  const eventType = req.headers[&#39;x-webhook-event&#39;] as string;\n\n  \/\/ Verify signature with a constant-time comparison\n  const signature = (req.headers[&#39;x-webhook-signature&#39;] as string | undefined) ?? &#39;&#39;;\n  const expected = crypto\n    .createHmac(&#39;sha256&#39;, process.env.WEBHOOK_SECRET!)\n    .update(req.body)\n    .digest(&#39;hex&#39;);\n  const sigBuf = Buffer.from(signature);\n  const expBuf = Buffer.from(expected);\n\n  \/\/ timingSafeEqual throws on unequal lengths, so compare lengths first\n  if (sigBuf.length !== expBuf.length || !crypto.timingSafeEqual(sigBuf, expBuf)) {\n    console.log(JSON.stringify({\n      level: &#39;warn&#39;,\n      type: &#39;webhook_signature_failed&#39;,\n      event_type: eventType,\n      ip: req.ip,\n    }));\n    return res.status(401).send(&#39;Invalid signature&#39;);\n  }\n\n  \/\/ Respond immediately\n  res.status(200).send(&#39;OK&#39;);\n  const responseTime = Date.now() - startTime;\n\n  \/\/ Generate dedup key\n  const eventHash = crypto\n    .createHash(&#39;sha256&#39;)\n    .update(req.body)\n    .digest(&#39;hex&#39;)\n    .substring(0, 16);\n\n  try {\n    \/\/ Parse inside the try block so a malformed payload is logged, not thrown\n    const event = JSON.parse(req.body.toString());\n    await processEvent(event);\n\n    console.log(JSON.stringify({\n      level: &#39;info&#39;,\n      type: &#39;webhook_processed&#39;,\n      event_type: eventType,\n      event_hash: eventHash,\n      response_time_ms: responseTime,\n      processing_time_ms: Date.now() - startTime,\n      timestamp: event.timestamp,\n    }));\n  } catch 
(err: any) {\n    console.log(JSON.stringify({\n      level: &#39;error&#39;,\n      type: &#39;webhook_processing_failed&#39;,\n      event_type: eventType,\n      event_hash: eventHash,\n      error: err.message,\n      response_time_ms: responseTime,\n    }));\n  }\n});\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Building a Monitoring Dashboard<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Whether you use Grafana, Datadog, or a custom dashboard, track these panels:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. Delivery Success Rate<\/h3>\n\n\n\n<pre><code class=\"language-sql\">-- Success rate over the last hour (from your receiver logs)\nSELECT\n  date_trunc(&#39;minute&#39;, received_at) AS minute,\n  COUNT(*) AS total,\n  COUNT(*) FILTER (WHERE status = &#39;processed&#39;) AS success,\n  ROUND(\n    COUNT(*) FILTER (WHERE status = &#39;processed&#39;)::numeric \/ COUNT(*) * 100,\n    1\n  ) AS success_rate\nFROM webhook_logs\nWHERE received_at &gt; NOW() - INTERVAL &#39;1 hour&#39;\nGROUP BY minute\nORDER BY minute;\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">A healthy integration maintains a success rate of 99% or higher. A dip below 95% warrants investigation.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2. Response Time Distribution<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Track the p50, p95, and p99 response times of your endpoint. Tolinku times out after 10 seconds, so if your p95 approaches that threshold, you&#39;re at risk of timeouts triggering retries.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Target: p95 under 500ms. If processing takes longer, respond with 200 first and process asynchronously.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3. Events by Type Over Time<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A time series of event counts broken down by type. 
This reveals:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Traffic patterns (when are your links most clicked?)<\/li>\n<li>Campaign launches (sudden spike in <code>link.clicked<\/code>)<\/li>\n<li>Conversion health (is the ratio of clicks to installs stable?)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">4. Retry Rate<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Track how many events arrive as retries (attempt &gt; 1). A rising retry rate means your endpoint is failing more often.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you can identify retries (by detecting duplicate event hashes), chart the dedup rate over time. Normal is under 1%. Above 5% means something is consistently failing.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Alerting<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Set up alerts for these conditions:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">No Events Received<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">If your receiver hasn&#39;t received a webhook in the last N minutes (where N depends on your traffic volume), something is wrong. For high-traffic apps, alert after 5 minutes of silence. For low-traffic apps, alert after 1 hour.<\/p>\n\n\n\n<pre><code class=\"language-typescript\">\/\/ Simple staleness check\n\/\/ Tune to your traffic: about 5 for high-traffic apps, 60 for low-traffic apps\nconst SILENCE_THRESHOLD_MINUTES = 15;\nlet lastEventTime = Date.now();\n\napp.post(&#39;\/webhooks\/tolinku&#39;, (req, res) =&gt; {\n  lastEventTime = Date.now();\n  \/\/ ... process\n});\n\n\/\/ Check once a minute; this fires on every check while the silence lasts\nsetInterval(() =&gt; {\n  const silenceMinutes = (Date.now() - lastEventTime) \/ 60000;\n  if (silenceMinutes &gt; SILENCE_THRESHOLD_MINUTES) {\n    sendAlert(`No webhook events received in ${Math.round(silenceMinutes)} minutes`);\n  }\n}, 60000);\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">High Error Rate<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Alert when the error rate exceeds 5% over a 10-minute window. 
Errors include signature failures, processing exceptions, and downstream failures.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Slow Response Times<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Alert when p95 response time exceeds 5 seconds. At that point, you&#39;re at risk of timeouts on the next traffic spike.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Signature Failures<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Any signature verification failure should trigger an alert. Either someone is sending fake requests to your endpoint, or your webhook secret is misconfigured. Both warrant immediate attention.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Health Check Endpoint<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Add a health check endpoint to your receiver so your monitoring system can verify it&#39;s running:<\/p>\n\n\n\n<pre><code class=\"language-typescript\">app.get(&#39;\/health&#39;, (req, res) =&gt; {\n  const silenceMinutes = (Date.now() - lastEventTime) \/ 60000;\n\n  res.json({\n    status: &#39;ok&#39;,\n    last_event_received: new Date(lastEventTime).toISOString(),\n    minutes_since_last_event: Math.round(silenceMinutes),\n    uptime_seconds: Math.round(process.uptime()),\n  });\n});\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Point your uptime monitor (UptimeRobot, Pingdom, or a simple cron curl) at this endpoint.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Reconciliation<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Even with monitoring, events can be lost. 
Periodic reconciliation compares what Tolinku sent with what your receiver processed.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Approach<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Query the Tolinku <a href=\"https:\/\/tolinku.com\/docs\/user-guide\/analytics\/\">analytics API<\/a> for event counts over a time range<\/li>\n<li>Query your receiver&#39;s logs for event counts over the same range<\/li>\n<li>Compare the totals<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">If the numbers diverge by more than 1-2% (accounting for timing differences at range boundaries), investigate:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Check the delivery logs in the Tolinku dashboard for failed deliveries<\/li>\n<li>Check your receiver logs for processing errors<\/li>\n<li>Look for gaps in your receiver&#39;s uptime<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Automated Reconciliation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Run a daily job that compares counts and alerts on significant discrepancies:<\/p>\n\n\n\n<pre><code class=\"language-typescript\">async function reconcile() {\n  \/\/ Count events your receiver processed yesterday\n  const result = await db.query(`\n    SELECT event_type, COUNT(*) as count\n    FROM webhook_logs\n    WHERE received_at &gt;= CURRENT_DATE - INTERVAL &#39;1 day&#39;\n      AND received_at &lt; CURRENT_DATE\n    GROUP BY event_type\n  `);\n\n  const receiverCounts = Object.fromEntries(\n    result.rows.map(r =&gt; [r.event_type, parseInt(r.count)])\n  );\n\n  \/\/ Compare with expected counts from your analytics\n  \/\/ If discrepancy &gt; 2%, alert\n  for (const [eventType, count] of Object.entries(receiverCounts)) {\n    console.log(`${eventType}: ${count} events received`);\n  }\n}\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Runbook: Common Issues<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>Symptom<\/th>\n<th>Likely 
Cause<\/th>\n<th>Fix<\/th>\n<\/tr>\n<\/thead>\n<tbody><tr>\n<td>All deliveries failing with status 0<\/td>\n<td>Endpoint unreachable<\/td>\n<td>Check DNS, firewall rules, SSL certificate<\/td>\n<\/tr>\n<tr>\n<td>Status 401 on all deliveries<\/td>\n<td>Signature mismatch<\/td>\n<td>Verify the webhook secret matches between Tolinku and your receiver<\/td>\n<\/tr>\n<tr>\n<td>Status 500 intermittently<\/td>\n<td>Application error<\/td>\n<td>Check your receiver logs for exceptions<\/td>\n<\/tr>\n<tr>\n<td>Status 503 during high traffic<\/td>\n<td>Server overloaded<\/td>\n<td>Scale your receiver or add a queue between receiver and processing<\/td>\n<\/tr>\n<tr>\n<td>Timeouts (10s)<\/td>\n<td>Slow processing<\/td>\n<td>Respond 200 immediately, process asynchronously<\/td>\n<\/tr>\n<tr>\n<td>Duplicate events increasing<\/td>\n<td>Endpoint slow or flaky<\/td>\n<td>Fix the root cause of failures\/timeouts<\/td>\n<\/tr>\n<\/tbody><\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">For detailed debugging techniques, see the <a href=\"https:\/\/tolinku.com\/blog\/webhook-debugging\/\">webhook debugging guide<\/a>. For retry behavior details, see the <a href=\"https:\/\/tolinku.com\/blog\/webhook-retry-logic\/\">webhook retry logic guide<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Monitor webhook delivery health with dashboards, alerting, and logging. Detect failures early and ensure every deep link event reaches its destination.<\/p>\n","protected":false},"author":2,"featured_media":1161,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"rank_math_title":"Monitoring Webhook Delivery: Ensuring Reliability","rank_math_description":"Monitor webhook delivery health with dashboards, alerting, and logging. 
Detect failures early and ensure every deep link event reaches its destination.","rank_math_focus_keyword":"webhook delivery monitoring","rank_math_canonical_url":"","rank_math_facebook_title":"","rank_math_facebook_description":"","rank_math_facebook_image":"https:\/\/tolinku.com\/blog\/wp-content\/uploads\/2026\/03\/og-webhook-delivery-monitoring.png","rank_math_facebook_image_id":"","rank_math_twitter_title":"","rank_math_twitter_description":"","rank_math_twitter_image":"https:\/\/tolinku.com\/blog\/wp-content\/uploads\/2026\/03\/og-webhook-delivery-monitoring.png","footnotes":""},"categories":[15],"tags":[20,293,264,274,292,266,61],"class_list":["post-1162","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-engineering","tag-deep-linking","tag-devops","tag-engineering","tag-monitoring","tag-observability","tag-reliability","tag-webhooks"],"_links":{"self":[{"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/posts\/1162","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/comments?post=1162"}],"version-history":[{"count":2,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/posts\/1162\/revisions"}],"predecessor-version":[{"id":2263,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/posts\/1162\/revisions\/2263"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/media\/1161"}],"wp:attachment":[{"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/media?parent=1162"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tolinku.com\/blog\/wp-json\/wp\/v2\/categories?post=1162"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/toli
nku.com\/blog\/wp-json\/wp\/v2\/tags?post=1162"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}