docs/operations.md

Operations — bugs and support

How we triage bugs, route support tickets, and run the on-call rotation.

draft, rajiv, operations, support, bugs

Status: draft scaffold. Most of this is tribal today. Document the actual flow as we observe it for two weeks, then codify.

PostHog analogue: PostHog folds this into per-team rotation docs and its support team handbook. There is no single canonical "ops" page, so we would be inventing the shape.

Support flow (Chatwoot)

TODO. Things to capture:

Bug intake

TODO.

  • Where bugs come from (Chatwoot, Linear, Sentry, internal Slack).
  • Triage cadence — who, when, how often.
  • Severity rubric (S0/S1/S2/S3 — define).
  • "Reproduce first" vs "ticket first" rule.

On-call

TODO.

  • Rotation (who's on, for how long).
  • Pager source (Sentry? Better Stack? PostHog alerts?).
  • Hours of coverage (business hours? 24/7?).
  • Hand-off ritual.

Bug-to-fix lifecycle

TODO.

  • Linear states: Triage → Backlog → In Progress → In Review → Done.
  • When to write a postmortem (see docs/postmortems/README.md).
  • Customer comms loop — who tells the reporter the bug is fixed.
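
A minimal sketch of the Linear state sequence listed above, in case we want to sanity-check transitions in tooling later. The state names come from the list. The `State` enum, `next_state`, `is_forward_move`, and the "one step forward at a time" rule are hypothetical and only for illustration. The real workflow in Linear may allow skips or moves back (for example In Review back to In Progress), and nothing here calls the Linear API.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    """Linear workflow states, in the order listed in this doc."""
    TRIAGE = "Triage"
    BACKLOG = "Backlog"
    IN_PROGRESS = "In Progress"
    IN_REVIEW = "In Review"
    DONE = "Done"


# Enum iteration preserves definition order, so this mirrors the list above.
ORDER = list(State)


def next_state(current: State) -> Optional[State]:
    """Return the state after `current`, or None once the issue is Done."""
    idx = ORDER.index(current)
    return ORDER[idx + 1] if idx + 1 < len(ORDER) else None


def is_forward_move(src: State, dst: State) -> bool:
    """True when dst is exactly one step after src.

    Assumed rule for illustration only; not confirmed against our Linear setup.
    """
    return next_state(src) is dst
```

If the two-week observation period shows that issues routinely skip Backlog or bounce back from review, the single-step rule should be replaced with an explicit table of allowed transitions.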

Tooling map

  • Support inbox: Chatwoot (dashboard)
  • Bug tracker: Linear (project studyflash)
  • Errors: Sentry
  • Replays: PostHog session replay
  • AI traces: PostHog LLM analytics
  • Support assistant: internal/support-bot/ (CLAUDE.md inside)

TODO

  • Document the support-bot decision tree.
  • Document who has Chatwoot admin vs agent access.
  • Add a "first hour of a customer-reported outage" runbook.