Monitoring, Debugging & Recovery
Once an endpoint is live, the webhook portal becomes your operational control surface.
Use it to answer four questions quickly:
- did the event reach my endpoint?
- if not, why not?
- will it retry automatically?
- how do I replay it safely?
Inspecting deliveries
From the endpoint page you can review the messages sent to that endpoint and drill into individual deliveries.
This is where you inspect:
- the event type
- the payload that was sent
- whether the delivery succeeded or failed
- each delivery attempt for the same message
Check the portal first, before you start digging through your own application logs.
Finding the right message
When you are tracking down a specific delivery, filter the message list by:
- endpoint
- event type
- date or time window
If you know roughly when the issue occurred, date filtering is usually the fastest way to narrow the list.
What counts as success
An event is considered successfully delivered when your endpoint returns a 2xx response within 15 seconds.
Everything else is treated as a failure, including:
- timeouts
- 3xx redirects
- 4xx responses
- 5xx responses
This means a redirecting URL is not acceptable for production webhook delivery. The configured endpoint must be the final destination.
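The success rule above can be expressed as a small check. This is a minimal sketch for reasoning about your own endpoint's behaviour, not the portal's actual implementation: the function name, the `TIMEOUT_SECONDS` constant, and the choice to treat a response at exactly 15 seconds as in time are assumptions made for illustration.

```python
from typing import Optional

# Assumed constant mirroring the 15-second window described above.
TIMEOUT_SECONDS = 15.0


def is_successful_delivery(status_code: Optional[int], elapsed_seconds: float) -> bool:
    """Return True only for a 2xx response received within the timeout.

    status_code is None when no response arrived at all (for example,
    a connection error or a timeout on the sender's side).
    """
    if status_code is None:
        return False
    if elapsed_seconds > TIMEOUT_SECONDS:
        # A late response counts as a timeout, whatever its status code.
        return False
    # Redirects (3xx), client errors (4xx) and server errors (5xx) all fail.
    return 200 <= status_code <= 299
```

A practical consequence is that your handler should acknowledge quickly with a 2xx and defer slow work, because a correct but late response is still recorded as a failure.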
Automatic retry schedule
If an attempt fails, YunoJuno retries automatically using this schedule:
- immediately
- 5 seconds later
- 5 minutes later
- 30 minutes later
- 2 hours later
- 5 hours later
- 10 hours later
- 10 hours later again
In practice, the final automatic attempt happens roughly 27 hours and 35 minutes after the first attempt, assuming each prior attempt failed.
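The arithmetic behind that figure can be checked with a short sketch. It assumes each delay in the schedule is measured from the previous attempt and ignores the time each attempt itself takes; the names `RETRY_DELAYS` and `cumulative_offsets` are illustrative and not part of any YunoJuno API.

```python
from datetime import timedelta
from typing import List

# The schedule listed above, as delays relative to the previous attempt.
RETRY_DELAYS: List[timedelta] = [
    timedelta(0),            # immediately
    timedelta(seconds=5),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=5),
    timedelta(hours=10),
    timedelta(hours=10),
]


def cumulative_offsets(delays: List[timedelta]) -> List[timedelta]:
    """Return the elapsed time since the first attempt at each step."""
    offsets = []
    elapsed = timedelta(0)
    for delay in delays:
        elapsed += delay
        offsets.append(elapsed)
    return offsets


offsets = cumulative_offsets(RETRY_DELAYS)
final_offset = offsets[-1]
```

Under these assumptions `final_offset` equals `timedelta(hours=27, minutes=35, seconds=5)`, which matches the roughly 27 hours and 35 minutes quoted above. The same list is useful when estimating how long an outage can last before messages reach the failed state.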
What happens after repeated failure
If all automatic attempts are exhausted, the message is marked as failed for that endpoint.
Operationally, this is the dead-letter state:
- the message stops retrying automatically
- it remains visible in the portal for inspection
- you can replay or bulk-recover it later once the issue is fixed
If an endpoint keeps failing over multiple days, the portal can automatically disable the endpoint. That is intended to stop endless failed traffic against a broken destination.
Replay and recovery tools
The portal gives you three useful recovery patterns:
- Resend: replay a single message to the endpoint
- Recover Failed Messages: replay failed messages from a chosen point in time
- Replay Missing: replay messages that were never attempted to that endpoint
These tools are what you use after downtime, misconfiguration, or a bug in your receiving service.
Practical recovery workflow
- Fix the underlying endpoint problem first.
- Verify the fix with a test message.
- Use Resend for a single known failure, or Recover Failed Messages for a wider outage.
- Monitor the replayed deliveries in the portal while they are being processed.
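Replaying after an outage can deliver a message your service has already processed, so it helps if the receiver tolerates duplicates. The sketch below shows one way to do that, assuming every delivery carries a stable message identifier that stays the same across retries and replays. The `IdempotentReceiver` class, the `message_id` parameter, and the in-memory set are all hypothetical; a real service would typically use a durable store instead.

```python
from typing import Callable, Set


class IdempotentReceiver:
    """Process each message at most once, keyed by an assumed message ID."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        # In-memory only, for illustration; this is lost on restart.
        self._seen: Set[str] = set()

    def receive(self, message_id: str, payload: dict) -> bool:
        """Return True if the payload was processed, False if it was a duplicate."""
        if message_id in self._seen:
            # Already handled: acknowledge without repeating side effects.
            return False
        self._handler(payload)
        # Record the ID only after the handler succeeds, so a message whose
        # handler raised can still be retried or replayed later.
        self._seen.add(message_id)
        return True
```

With this in place, a replayed duplicate can still be answered with a 2xx, so the portal records it as delivered while your side effects run only once.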
Payload history
The portal retains payloads long enough to make debugging and recovery practical. The current default retention period is 90 days.
That retention window is another reason to use the portal early in an incident rather than trying to reconstruct failed deliveries later from incomplete logs.