docs/working-with-ai.md

Working with AI

How we use Claude / Cursor / agents at studyflash — pair vs peer programming, and the rule that you own your slop.

Tags: draft, rajiv, process, ai, engineering

Status: draft scaffold. Captures how we actually use Claude Code, Cursor, and ad-hoc agents in day-to-day work. Two parts: the workflow modes, and the cultural rule about output ownership.

The slop rule (non-negotiable)

You own everything you commit. The AI is a tool; the slop is yours.

What this means in practice:

  • No "the AI wrote it" defense. If you open a PR, you signed off. Reviewer pushback is on you, not the model.
  • You read every line before staging. No exceptions. If the diff is too big to read carefully, it's too big to merge.
  • You do not hand unread output to teammates. Not to reviewers, not in tickets, not in Slack threads. If you didn't read it, neither will they.
  • Slop doesn't mean AI-generated; it means unread. Hand-written code you didn't think about is also slop.

PostHog analogue: none. Their /engineering/ai/* handbook pages are about building AI products, not about engineers using AI to write code. They codify AI guardrails in .cursor/rules and CLAUDE.md files at the repo level, not in narrative docs. The closest cultural artifact is a PostHog newsletter post by Ian Vanagas: "Ultimately, you are responsible for the end product of what you create. This is true whether you use AI or not." That's the slop rule in one sentence, but off-handbook.

This doc is net-new territory.

Pair programming with AI

You drive, the AI watches. Tight feedback loop, you stay in the driver's seat.

When: small targeted changes, debugging, exploring an unfamiliar area, anything you'd want a second pair of eyes on while typing.

Shape:

  • Claude Code or Cursor in your editor.
  • You type / prompt → AI suggests → you accept / edit / reject.
  • Conversation lives in your head; the artifact is the diff.

Studyflash-specific tips: TODO.

Peer programming with AI

You delegate, the AI works alone, you review. Async, batch-able.

When: well-scoped tasks where you can specify the contract clearly, parallel work, things you'd otherwise put off.

Shape:

  • Spawn an agent (Claude Code background task, /loop, worktree agent, /schedule for a recurring task).
  • Brief it like a smart colleague who walked into the room — context, intent, success criteria.
  • Come back, review the diff like any other PR.
  • Trust but verify: an agent's summary describes what it intended; check what it actually did.
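"Brief it like a smart colleague" is concrete enough to template. A sketch of what a peer-mode brief might look like (the task, paths, and criteria are hypothetical, not a real studyflash ticket):

```
Context: we're migrating flashcard exports from CSV to JSONL.
         The exporter lives in src/export/ (path hypothetical).

Intent:  replace the CSV writer with a JSONL writer; keep the CLI
         flags and output filenames unchanged.

Success criteria:
  - existing export tests pass unmodified
  - a new golden-file test covers the JSONL output
  - no changes outside src/export/ and tests/
```

The "no changes outside X" line matters most: it gives you a mechanical check to run against the diff when you review.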

Studyflash-specific tips (TODO: expand beyond the commands below):

  • /worktree for isolated work (defaults to opening a PR).
  • /ultrareview for a multi-agent review pass on a branch / PR.
  • Background agents for long-running scoped tasks (eval runs, migration sweeps, doc audits).
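The "trust but verify" step above is mostly plain git. A minimal sketch, assuming the agent worked on a branch (the branch name is hypothetical; the scratch repo exists only so the commands run as-is, and you'd skip that setup in a real checkout):

```shell
set -eu
# Throwaway repo standing in for your project, so this sketch is runnable.
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main
git config user.email agent@example.com && git config user.name demo
echo "base" > app.py && git add . && git commit -qm "base"

# Pretend an agent delivered work on this branch (hypothetical name).
git checkout -qb agent/fix-imports
echo "patched" > app.py && git commit -qam "agent: fix imports"
git checkout -q main

# 1. What did the agent actually touch? Compare against the brief's scope.
git diff --stat main...agent/fix-imports

# 2. Read the diff itself, not the agent's summary of it.
git diff main...agent/fix-imports

# 3. Any commits you didn't expect (dependency bumps, config edits)?
git log --oneline main..agent/fix-imports
```

The three-dot form diffs against the merge base, so you see only the agent's changes even if main has moved on since the branch was cut.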

Mode-picking heuristic

| Situation | Mode |
| --- | --- |
| Bug fix in code you don't fully understand | Pair |
| "Touch all the files matching X pattern" | Peer |
| Stuck on a hard problem | Pair |
| Routine refactor with a clear pattern | Peer |
| Net-new feature with unclear shape | Pair (until the shape's clear, then maybe split off pieces to peer) |
| Eval runs / long-running batch | Peer (background) |

Memory and durable instructions

TODO. Capture:

  • Where global Claude memory lives (~/.claude/projects/.../memory/).
  • When to add a CLAUDE.md / AGENTS.md to a directory.
  • When to update a skill vs ask in-the-moment.
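For reference while this section is TODO: a directory-level CLAUDE.md is just a markdown file of durable instructions the agent picks up when working there. A hypothetical sketch (none of these commands or paths are our actual config):

```markdown
# Example CLAUDE.md (all commands and paths hypothetical)

- Run `pnpm test` before claiming a task is done.
- Never edit generated files under src/gen/.
- Prefer small, single-purpose commits.
- Ask before adding new dependencies.
```

Rule of thumb: if you've typed the same instruction into a prompt three times, it probably belongs in a CLAUDE.md or a skill instead.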

Skills, hooks, commands

TODO. Inventory what's set up across the team:

  • .claude/skills/* checked into this repo.
  • User-level skills via Skill Store.
  • Hooks (pre-commit AI checks, etc. — none yet?).
  • Slash commands we lean on (/ultrareview, /loop, /schedule, /worktree, /triage-and-note, etc.).

Things we explicitly don't do

TODO. Candidates:

Open questions

  • Do we want a shared agent prompt library? Where?
  • How do we share learnings ("this prompt shape worked great for X")? Slack? This handbook?
  • When does a recurring agent task become a /schedule cron vs a manual /loop?