Userpilot services degradation
Resolved
Apr 28, 2026 at 6:35pm UTC
Postmortem
The incident was caused by our database's background optimization process not running aggressively enough. As a result, data fragments accumulated across the tables used for real-time cache loading.
As the number of fragments grew, queries that normally complete in milliseconds had to scan far more fragments than usual. This significantly increased query latency and exhausted the available database connections, which degraded data ingestion and content publishing performance.
To resolve this, we reconfigured our database to optimize these tables more frequently and in smaller batches. This keeps fragment counts low and ensures real-time queries remain fast and stable.
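As a rough illustration of the "smaller batches" idea, the sketch below models one optimization pass that merges fragments in groups of bounded size. It is not our database's actual mechanism: the function name `merge_pass`, the `max_batch` parameter, and the representation of fragments as a list of sizes are all assumptions made for the example.

```python
from typing import List


def merge_pass(fragment_sizes: List[int], max_batch: int) -> List[int]:
    """Model one optimization pass over a table (hypothetical helper).

    Consecutive fragments are merged in groups of at most `max_batch`,
    so no single merge has to rewrite an unbounded amount of data.
    Returns the fragment sizes that remain after the pass.
    """
    if max_batch < 2:
        raise ValueError("a merge needs at least two fragments per batch")

    merged: List[int] = []
    for start in range(0, len(fragment_sizes), max_batch):
        batch = fragment_sizes[start:start + max_batch]
        # A trailing batch with a single fragment is carried over unchanged.
        merged.append(sum(batch))
    return merged


# Assumed example: five fragments, merged at most two at a time.
remaining = merge_pass([1, 2, 3, 4, 5], max_batch=2)
print(remaining)  # [3, 7, 5]
```

Under these assumptions, running such a pass often keeps the fragment count low at all times, while the batch limit bounds how much work each individual pass performs.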
Additionally, we have added monitoring and alerting on fragment count and size per table so we can detect abnormal accumulation early and prevent similar incidents in the future.
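A minimal sketch of what a per-table check on fragment count and size could look like is shown below. The names `TableStats` and `tables_over_threshold`, the table names, and the threshold values are hypothetical and chosen only for illustration; they do not describe our production alerting rules.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TableStats:
    """Fragment statistics for one table (hypothetical shape)."""
    fragment_count: int
    total_bytes: int


def tables_over_threshold(
    stats: Dict[str, TableStats],
    max_fragments: int,
    max_bytes: int,
) -> List[str]:
    """Return the names of tables that exceed either limit, sorted by name."""
    return sorted(
        name
        for name, s in stats.items()
        if s.fragment_count > max_fragments or s.total_bytes > max_bytes
    )


# Assumed sample data and limits, for illustration only.
sample = {
    "events": TableStats(fragment_count=1200, total_bytes=5_000_000),
    "profiles": TableStats(fragment_count=40, total_bytes=2_000_000),
}
print(tables_over_threshold(sample, max_fragments=300, max_bytes=10_000_000))
# ['events']
```

Checking both count and size per table means abnormal accumulation can be flagged before query latency is affected.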
Affected services
Updated
Apr 28, 2026 at 5:51pm UTC
A fix has been implemented and deployed, and we are currently monitoring system performance to ensure stability.
We will share the postmortem for this incident in a follow-up message.
Thank you for your patience.
Created
Apr 28, 2026 at 3:07pm UTC
We are currently experiencing degraded performance affecting data ingestion.
Some customers may notice delays in incoming data.
Our engineering team is actively investigating and working to restore full performance as quickly as possible.
Thank you for your patience.