docs/operations.md
Operations — bugs and support
How we triage bugs, route support tickets, and run the on-call rotation.
draft, rajiv, operations, support, bugs
Status: draft scaffold. Most of this is tribal knowledge today. Plan: document the actual flow as we observe it for two weeks, then codify it.
PH analogue: PostHog folds this into per-team rotation docs + their support team handbook. No single canonical "ops" page — we'd be inventing the shape.
Support flow (Chatwoot)
TODO. Things to capture:
- Inbound channels (in-app, email, social).
- Chatwoot routing and labels.
- SLA policy (memory: SLAs are assigned by support-bot's apply_sla_policy call, not by a Chatwoot automation rule). Link to internal/support-bot/ once that's written up.
- Escalation path: support → eng on-call.
- Templates / canned responses.
Bug intake
TODO.
- Where bugs come from (Chatwoot, Linear, Sentry, internal Slack).
- Triage cadence — who, when, how often.
- Severity rubric (S0/S1/S2/S3 — define).
- "Reproduce first" vs "ticket first" rule.
On-call
TODO.
- Rotation (who's on, for how long).
- Pager source (Sentry? Better Stack? PostHog alerts?).
- Hours of coverage (business hours? 24/7?).
- Hand-off ritual.
Bug-to-fix lifecycle
TODO.
- Linear states: Triage → Backlog → In Progress → In Review → Done.
- When to write a postmortem (see docs/postmortems/README.md).
- Customer comms loop: who tells the reporter the bug is fixed.
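The Linear state sequence listed above can be sketched as an ordered enum. This is only an illustration: it assumes tickets move strictly forward, one state at a time, which this doc has not confirmed yet (reopening or skipping Backlog may turn out to be normal). The names LinearState, next_state, and can_advance are hypothetical helpers, not part of the Linear API.

```python
from enum import IntEnum


class LinearState(IntEnum):
    """Workflow states in the order listed in this doc."""
    TRIAGE = 0
    BACKLOG = 1
    IN_PROGRESS = 2
    IN_REVIEW = 3
    DONE = 4


def next_state(state):
    """Return the state that follows `state`, or None if it is DONE."""
    if state is LinearState.DONE:
        return None
    return LinearState(state + 1)


def can_advance(current, target):
    """Assumption: only a single forward step is a valid move."""
    return next_state(current) == target
```

If the observed flow allows skips or reopens, the single-step rule in can_advance is the part to replace, for example with an explicit table of allowed transitions.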
Tooling map
- Support inbox: Chatwoot (dashboard)
- Bug tracker: Linear (project studyflash)
- Errors: Sentry
- Replays: PostHog session replay
- AI traces: PostHog LLM analytics
- Support assistant: internal/support-bot/ (CLAUDE.md inside)
TODO
- Document the support-bot decision tree.
- Document who has Chatwoot admin vs agent access.
- Add a "first hour of a customer-reported outage" runbook.