# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

TAKT (Task Agent Koordination Tool) is a multi-agent orchestration system for Claude Code. It enables YAML-based workflow definitions that coordinate multiple AI agents through state machine transitions with rule-based routing.

## Development Commands

| Command | Description |
|---------|-------------|
| `npm run build` | TypeScript build |
| `npm run watch` | TypeScript build in watch mode |
| `npm run test` | Run all tests |
| `npm run test:watch` | Run tests in watch mode (alias: `npm run test -- --watch`) |
| `npm run lint` | ESLint |
| `npx vitest run src/__tests__/client.test.ts` | Run single test file |
| `npx vitest run -t "pattern"` | Run tests matching pattern |
| `npm run prepublishOnly` | Lint, build, and test before publishing |

## CLI Subcommands

| Command | Description |
|---------|-------------|
| `takt {task}` | Execute task with current workflow |
| `takt` | Interactive task input mode (chat with AI to refine requirements) |
| `takt run` | Execute all pending tasks from `.takt/tasks/` once |
| `takt watch` | Watch `.takt/tasks/` and auto-execute tasks (resident process) |
| `takt add` | Add a new task via AI conversation |
| `takt list` | List task branches (try merge, merge & cleanup, or delete) |
| `takt switch` | Switch workflow interactively |
| `takt clear` | Clear agent conversation sessions (reset state) |
| `takt eject` | Copy builtin workflow/agents to `~/.takt/` for customization |
| `takt config` | Configure settings (permission mode) |
| `takt --help` | Show help message |

**Interactive mode:** Running `takt` (without arguments) or `takt {initial message}` starts an interactive planning session. The AI helps refine task requirements through conversation. Type `/go` to execute the task with the selected workflow, or `/cancel` to abort. Implemented in `src/features/interactive/`.
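
As a rough illustration of the interactive commands, the sketch below classifies one line of user input. The `InteractiveAction` type and `classifyInput` function are hypothetical names, and exact matching on `/go` and `/cancel` is an assumption; the code in `src/features/interactive/` may parse input differently.

```typescript
// Hypothetical action type; only the `/go` and `/cancel` commands come from this document.
type InteractiveAction =
  | { kind: "go" }
  | { kind: "cancel" }
  | { kind: "message"; text: string };

// Classify one line of user input from the interactive session.
// Exact string matching is an assumption made for this sketch.
function classifyInput(line: string): InteractiveAction {
  const trimmed = line.trim();
  if (trimmed === "/go") return { kind: "go" };
  if (trimmed === "/cancel") return { kind: "cancel" };
  // Anything else is treated as conversation that refines the task.
  return { kind: "message", text: trimmed };
}

const interactiveActions = ["Add retry logic to the client", "/go"].map(classifyInput);
console.log(interactiveActions.map((action) => action.kind).join(", ")); // prints "message, go"
```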

**Pipeline mode:** Specifying `--pipeline` enables a non-interactive mode suitable for CI/CD. It automatically creates a branch, runs the workflow, commits, and pushes. Use `--auto-pr` to also create a pull request. Use `--skip-git` to run the workflow only (no git operations). Implemented in `src/features/pipeline/`.

**GitHub issue references:** `takt #6` fetches issue #6 and executes it as a task.

## Architecture

### Core Flow

```
CLI (cli.ts)
  → Slash commands or executeTask()
    → WorkflowEngine (workflow/engine.ts)
      → Per step: 3-phase execution
        Phase 1: runAgent() → main work
        Phase 2: runReportPhase() → report output (if step.report defined)
        Phase 3: runStatusJudgmentPhase() → status tag output (if tag-based rules)
      → detectMatchedRule() → rule evaluation → determineNextStep()
      → Parallel steps: Promise.all() for sub-steps, aggregate evaluation
```

### Three-Phase Step Execution

Each step executes in up to 3 phases (the session is resumed across phases):

| Phase | Purpose | Tools | When |
|-------|---------|-------|------|
| Phase 1 | Main work (coding, review, etc.) | Step's allowed_tools (Write excluded if report defined) | Always |
| Phase 2 | Report output | Write only | When `step.report` is defined |
| Phase 3 | Status judgment | None (judgment only) | When step has tag-based rules |

Phases 2 and 3 are implemented in `src/core/workflow/engine/phase-runner.ts`. The session is resumed so the agent retains context from Phase 1.

### Rule Evaluation (5-Stage Fallback)

After step execution, rules are evaluated to determine the next step. Evaluation order (first match wins):

1. **Aggregate** (`all()`/`any()`) - For parallel parent steps
2. **Phase 3 tag** - `[STEP:N]` tag from status judgment output
3. **Phase 1 tag** - `[STEP:N]` tag from main execution output (fallback)
4. **AI judge (ai() only)** - AI evaluates `ai("condition text")` rules
5. **AI judge fallback** - AI evaluates ALL conditions as a last resort

Implemented in `src/core/workflow/evaluation/RuleEvaluator.ts`.
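
The first-match-wins order above can be sketched as a list of stages tried in sequence. This is a minimal illustration, not the code in `RuleEvaluator.ts`: the `Stage` interface, `parseStepTag`, and `evaluateRules` are hypothetical, using N from `[STEP:N]` directly as the rule index is an assumption, and the AI judge stages are stubbed synchronously although real judging would presumably be asynchronous.

```typescript
// Method names are taken from this document; everything else is a hypothetical sketch.
type RuleMatchMethod =
  | "aggregate"
  | "phase3_tag"
  | "phase1_tag"
  | "ai_judge"
  | "ai_fallback";

interface RuleMatch {
  index: number;
  method: RuleMatchMethod;
}

// One evaluation stage: returns a rule index, or undefined when it cannot decide.
interface Stage {
  method: RuleMatchMethod;
  evaluate: () => number | undefined;
}

// Extract N from a `[STEP:N]` tag. Using N directly as the rule index is an assumption.
function parseStepTag(output: string): number | undefined {
  const tagMatch = /\[STEP:(\d+)\]/.exec(output);
  const digits = tagMatch?.[1];
  return digits === undefined ? undefined : Number.parseInt(digits, 10);
}

// Try each stage in order; the first stage that yields a valid index wins.
function evaluateRules(stages: Stage[], ruleCount: number): RuleMatch {
  for (const stage of stages) {
    const index = stage.evaluate();
    if (index !== undefined && index >= 0 && index < ruleCount) {
      return { index, method: stage.method };
    }
  }
  // Fail-fast: rules exist but nothing matched.
  throw new Error("No rule matched");
}

const phase3Output = "Judgment could not be determined.";
const phase1Output = "Implementation finished. [STEP:1]";

const ruleMatch = evaluateRules(
  [
    { method: "aggregate", evaluate: () => undefined }, // not a parallel parent step
    { method: "phase3_tag", evaluate: () => parseStepTag(phase3Output) },
    { method: "phase1_tag", evaluate: () => parseStepTag(phase1Output) },
    { method: "ai_judge", evaluate: () => undefined }, // stub for the ai() judge
    { method: "ai_fallback", evaluate: () => undefined }, // stub for the all-conditions judge
  ],
  3,
);
console.log(`${ruleMatch.method} -> rule ${ruleMatch.index}`); // prints "phase1_tag -> rule 1"
```

Because Phase 3 produced no tag in this example, the Phase 1 tag decides the match, and the stage that decided is recorded alongside the index.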
The matched method is tracked as the `RuleMatchMethod` type.

### Key Components

**WorkflowEngine** (`src/core/workflow/engine/WorkflowEngine.ts`)
- State machine that orchestrates agent execution via EventEmitter
- Manages step transitions based on rule evaluation results
- Emits events: `step:start`, `step:complete`, `step:blocked`, `step:loop_detected`, `workflow:complete`, `workflow:abort`, `iteration:limit`
- Supports loop detection (`LoopDetector`) and iteration limits
- Maintains agent sessions per step for conversation continuity
- Delegates to `StepExecutor` (normal steps) and `ParallelRunner` (parallel steps)

**StepExecutor** (`src/core/workflow/engine/StepExecutor.ts`)
- Executes a single workflow step through the 3-phase model
- Phase 1: Main agent execution (with tools)
- Phase 2: Report output (Write-only, optional)
- Phase 3: Status judgment (no tools, optional)
- Builds instructions via `InstructionBuilder`, detects matched rules via `RuleEvaluator`

**ParallelRunner** (`src/core/workflow/engine/ParallelRunner.ts`)
- Executes parallel sub-steps concurrently via `Promise.all()`
- Aggregates sub-step results for parent rule evaluation
- Supports `all()` / `any()` aggregate conditions

**RuleEvaluator** (`src/core/workflow/evaluation/RuleEvaluator.ts`)
- 5-stage fallback evaluation: aggregate → Phase 3 tag → Phase 1 tag → ai() judge → all-conditions AI judge
- Returns `RuleMatch` with index and detection method (`aggregate`, `phase3_tag`, `phase1_tag`, `ai_judge`, `ai_fallback`)
- Fail-fast: throws if rules exist but no rule matched

**Instruction Builder** (`src/core/workflow/instruction/InstructionBuilder.ts`)
- Auto-injects standard sections into every instruction (no need for `{task}` or `{previous_response}` placeholders in templates):
  1. Execution context (working dir, edit permission rules)
  2. Workflow context (iteration counts, report dir)
  3. User request (`{task}`, auto-injected unless placeholder present)
  4. Previous response (auto-injected if `pass_previous_response: true`)
  5. User inputs (auto-injected unless `{user_inputs}` placeholder present)
  6. `instruction_template` content
  7. Status output rules (auto-injected for tag-based rules)
- Localized for `en` and `ja`
- Related: `ReportInstructionBuilder` (Phase 2), `StatusJudgmentBuilder` (Phase 3)

**Agent Runner** (`src/agents/runner.ts`)
- Resolves agent specs (name or path) to agent configurations
- Built-in agents with default tools:
  - `coder`: Read/Glob/Grep/Edit/Write/Bash/WebSearch/WebFetch
  - `architect`: Read/Glob/Grep/WebSearch/WebFetch
  - `supervisor`: Read/Glob/Grep/Bash/WebSearch/WebFetch
  - `planner`: Read/Glob/Grep/Bash/WebSearch/WebFetch
- Custom agents via `.takt/agents.yaml` or prompt files (.md)

**Provider Integration** (`src/infra/claude/`, `src/infra/codex/`)
- **Claude** - Uses `@anthropic-ai/claude-agent-sdk`
  - `client.ts` - High-level API: `callClaude()`, `callClaudeCustom()`, `callClaudeAgent()`, `callClaudeSkill()`
  - `process.ts` - SDK wrapper with `ClaudeProcess` class
  - `executor.ts` - Query execution
  - `query-manager.ts` - Concurrent query tracking with query IDs
- **Codex** - Direct OpenAI SDK integration
  - `CodexStreamHandler.ts` - Stream handling and tool execution

**Configuration** (`src/infra/config/`)
- `loaders/loader.ts` - Custom agent loading from `.takt/agents.yaml`
- `loaders/workflowParser.ts` - YAML parsing, step/rule normalization with Zod validation
- `loaders/workflowResolver.ts` - 3-layer resolution (builtin → user → project-local)
- `loaders/workflowCategories.ts` - Workflow categorization and filtering
- `loaders/agentLoader.ts` - Agent prompt file loading
- `paths.ts` - Directory structure (`.takt/`, `~/.takt/`), session management
- `global/globalConfig.ts` - Global configuration (provider, model, trusted dirs)
- `project/projectConfig.ts` - Project-level configuration

**Task Management** (`src/features/tasks/`)
- `execute/taskExecution.ts` - Main task execution orchestration
- `execute/workflowExecution.ts` - Workflow execution wrapper
- `add/index.ts` - Interactive task addition via AI conversation
- `list/index.ts` - List task branches with merge/delete actions
- `watch/index.ts` - Watch for task files and auto-execute

**GitHub Integration** (`src/infra/github/`)
- `issue.ts` - Fetches issues via `gh` CLI, formats as task text with title/body/labels/comments
- `pr.ts` - Creates pull requests via `gh` CLI

### Data Flow

1. User provides task (text or `#N` issue reference) or slash command → CLI
2. CLI loads workflow: user `~/.takt/workflows/` → builtin `resources/global/{lang}/workflows/` fallback
3. WorkflowEngine starts at `initial_step`
4. Each step: `buildInstruction()` → Phase 1 (main) → Phase 2 (report) → Phase 3 (status) → `detectMatchedRule()` → `determineNextStep()`
5. Rule evaluation determines next step name
6. Special transitions: `COMPLETE` ends workflow successfully, `ABORT` ends with failure

## Directory Structure

```
~/.takt/                  # Global user config (created on first run)
  config.yaml             # Trusted dirs, default workflow, log level, language
  workflows/              # User workflow YAML files (override builtins)
  agents/                 # User agent prompt files (.md)

.takt/                    # Project-level config
  agents.yaml             # Custom agent definitions
  tasks/                  # Task files for `takt run`
  reports/                # Execution reports (auto-generated)
  logs/                   # Session logs in NDJSON format (gitignored)

resources/                # Bundled defaults (builtin, read from dist/ at runtime)
  global/
    en/                   # English agents and workflows
    ja/                   # Japanese agents and workflows
```

Builtin resources are embedded in the npm package (`dist/resources/`). User files in `~/.takt/` take priority. Use `takt eject` to copy builtins to `~/.takt/` for customization.
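
The "user files take priority" behavior can be pictured as a lookup over candidate paths in priority order. The sketch below is illustrative only: `resolveWorkflowPath`, its parameters, the injected `exists` callback, and the `.yaml` file extension are all assumptions, and it covers only the user and builtin layers named in the data flow rather than the resolver in `loaders/workflowResolver.ts`.

```typescript
// Hypothetical helper: returns the first existing candidate, user directory first.
// The `.yaml` extension and the injected `exists` callback are assumptions for illustration.
function resolveWorkflowPath(
  workflowName: string,
  lang: string,
  homeDir: string,
  packageRoot: string,
  exists: (file: string) => boolean,
): string | undefined {
  const fileName = `${workflowName}.yaml`;
  const candidates = [
    // 1. User override in ~/.takt/workflows/
    [homeDir, ".takt", "workflows", fileName].join("/"),
    // 2. Builtin fallback in resources/global/{lang}/workflows/
    [packageRoot, "resources", "global", lang, "workflows", fileName].join("/"),
  ];
  return candidates.find(exists);
}

// A fake file system keeps the example self-contained.
const installedFiles = new Set([
  "/home/dev/.takt/workflows/default.yaml",
  "/pkg/resources/global/en/workflows/default.yaml",
  "/pkg/resources/global/en/workflows/simple.yaml",
]);
const fileExists = (file: string): boolean => installedFiles.has(file);

const userOverride = resolveWorkflowPath("default", "en", "/home/dev", "/pkg", fileExists);
const builtinOnly = resolveWorkflowPath("simple", "en", "/home/dev", "/pkg", fileExists);
console.log(userOverride); // prints "/home/dev/.takt/workflows/default.yaml"
console.log(builtinOnly); // prints "/pkg/resources/global/en/workflows/simple.yaml"
```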
## Workflow YAML Schema

```yaml
name: workflow-name
description: Optional description
max_iterations: 10
initial_step: plan # First step to execute

steps:
  # Normal step
  - name: step-name
    agent: ../agents/default/coder.md # Path to agent prompt
    agent_name: coder # Display name (optional)
    provider: codex # claude|codex (optional)
    model: opus # Model name (optional)
    edit: true # Whether step can edit files
    permission_mode: acceptEdits # Tool permission mode (optional)
    instruction_template: |
      Custom instructions for this step.
      {task}, {previous_response} are auto-injected if not present as placeholders.
    pass_previous_response: true # Default: true
    report:
      name: 01-plan.md # Report file name
      format: | # Report format template
        # Plan Report
        ...
    rules:
      - condition: "Human-readable condition"
        next: next-step-name
      - condition: ai("AI evaluates this condition text")
        next: other-step
      - condition: blocked
        next: ABORT

  # Parallel step (sub-steps execute concurrently)
  - name: reviewers
    parallel:
      - name: arch-review
        agent: ../agents/default/architecture-reviewer.md
        rules:
          - condition: approved # next is optional for sub-steps
          - condition: needs_fix
        instruction_template: |
          Review architecture...
      - name: security-review
        agent: ../agents/default/security-reviewer.md
        rules:
          - condition: approved
          - condition: needs_fix
        instruction_template: |
          Review security...
    rules: # Parent rules use aggregate conditions
      - condition: all("approved")
        next: supervise
      - condition: any("needs_fix")
        next: fix
```

Key points about parallel steps:
- Sub-step `rules` define possible outcomes but `next` is ignored (parent handles routing)
- Parent `rules` use `all("X")`/`any("X")` to aggregate sub-step results
- `all("X")`: true if ALL sub-steps matched condition X
- `any("X")`: true if ANY sub-step matched condition X

### Rule Condition Types

| Type | Syntax | Evaluation |
|------|--------|------------|
| Tag-based | `"condition text"` | Agent outputs `[STEP:N]` tag, matched by index |
| AI judge | `ai("condition text")` | AI evaluates condition against agent output |
| Aggregate | `all("X")` / `any("X")` | Aggregates parallel sub-step matched conditions |

### Template Variables

| Variable | Description |
|----------|-------------|
| `{task}` | Original user request (auto-injected if not in template) |
| `{iteration}` | Workflow-wide iteration count |
| `{max_iterations}` | Maximum iterations allowed |
| `{step_iteration}` | Per-step iteration count |
| `{previous_response}` | Previous step output (auto-injected if not in template) |
| `{user_inputs}` | Accumulated user inputs (auto-injected if not in template) |
| `{report_dir}` | Report directory name |

### Workflow Categories

Workflows can be organized into categories for better UI presentation.
Categories are configured in:
- `resources/global/{lang}/default-categories.yaml` - Default builtin categories
- `~/.takt/config.yaml` - User-defined categories (via `workflow_categories` field)

Category configuration supports:
- Nested categories (unlimited depth)
- Per-category workflow lists
- "Others" category for uncategorized workflows (can be disabled via `show_others_category: false`)
- Builtin workflow filtering (disable via `builtin_workflows_enabled: false`, or selectively via `disabled_builtins: [name1, name2]`)

Example category config:

```yaml
workflow_categories:
  Development:
    workflows: [default, simple]
    children:
      Backend:
        workflows: [expert-cqrs]
      Frontend:
        workflows: [expert]
  Research:
    workflows: [research, magi]

show_others_category: true
others_category_name: "Other Workflows"
```

Implemented in `src/infra/config/loaders/workflowCategories.ts`.

### Model Resolution

Model is resolved in the following priority order:

1. **Workflow step `model`** - Highest priority (specified in step YAML)
2. **Custom agent `model`** - Agent-level model in `.takt/agents.yaml`
3. **Global config `model`** - Default model in `~/.takt/config.yaml`
4. **Provider default** - Falls back to provider's default (Claude: sonnet, Codex: gpt-5.2-codex)

Example `~/.takt/config.yaml`:

```yaml
provider: claude
model: opus # Default model for all steps (unless overridden)
```

## NDJSON Session Logging

Session logs use NDJSON (`.jsonl`) format for real-time append-only writes. Record types:

| Record | Description |
|--------|-------------|
| `workflow_start` | Workflow initialization with task, workflow name |
| `step_start` | Step execution start |
| `step_complete` | Step result with status, content, matched rule info |
| `workflow_complete` | Successful completion |
| `workflow_abort` | Abort with reason |

Files: `.takt/logs/{sessionId}.jsonl`, with `latest.json` pointer. Legacy `.json` format is still readable via `loadSessionLog()`.
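
Since each NDJSON line is an independent JSON object, a log can be read line by line even while it is still being appended to. The sketch below is a minimal reader, not the project's `loadSessionLog()`: the record type names come from the table above, while the `type` discriminator field, the other field names in the sample, and the `parseSessionLog` helper are assumptions.

```typescript
// Record type names come from the table above; the `type` discriminator field is an assumption.
const RECORD_TYPES = [
  "workflow_start",
  "step_start",
  "step_complete",
  "workflow_complete",
  "workflow_abort",
] as const;
type RecordType = (typeof RECORD_TYPES)[number];

interface SessionRecord {
  type: RecordType;
  [key: string]: unknown;
}

function isRecordType(value: unknown): value is RecordType {
  return typeof value === "string" && (RECORD_TYPES as readonly string[]).includes(value);
}

// Parse NDJSON text: one JSON object per line. Blank lines, unparsable lines
// (for example a partially written last line), and unknown record types are skipped.
function parseSessionLog(text: string): SessionRecord[] {
  const records: SessionRecord[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue;
    }
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      isRecordType((parsed as { type?: unknown }).type)
    ) {
      records.push(parsed as SessionRecord);
    }
  }
  return records;
}

// Sample field names other than `type` are made up for the example.
const sampleLog = [
  '{"type":"workflow_start","task":"Add tests","workflowName":"default"}',
  '{"type":"step_start","step":"plan"}',
  '{"type":"step_complete","step":"plan","status":"done"}',
  '{"type":"workflow_complete"}',
].join("\n");

const sessionRecords = parseSessionLog(sampleLog);
console.log(sessionRecords.map((record) => record.type).join(" -> "));
// prints "workflow_start -> step_start -> step_complete -> workflow_complete"
```

Skipping unparsable lines is a design choice for this sketch: it lets a reader tolerate a line that is still being written, at the cost of silently ignoring corrupt records.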
## TypeScript Notes

- ESM modules with `.js` extensions in imports
- Strict TypeScript with `noUncheckedIndexedAccess`
- Zod schemas for runtime validation (`src/core/models/schemas.ts`)
- Uses `@anthropic-ai/claude-agent-sdk` for Claude integration

## Design Principles

**Keep commands minimal.** One command per concept. Use arguments/modes instead of multiple similar commands. Before adding a new command, consider whether existing commands can be extended.

**Do NOT expand schemas carelessly.** Rule conditions are free-form text (not enum-restricted). However, the engine's behavior depends on specific patterns (`ai()`, `all()`, `any()`). Do not add new special syntax without updating the loader's regex parsing in `workflowParser.ts`.

**Instruction auto-injection over explicit placeholders.** The instruction builder auto-injects `{task}`, `{previous_response}`, `{user_inputs}`, and status rules. Templates should contain only step-specific instructions, not boilerplate.

**Agent prompts contain only domain knowledge.** Agent prompt files (`resources/global/{lang}/agents/**/*.md`) must contain only domain expertise and behavioral principles, never workflow-specific procedures. Workflow-specific details (which reports to read, step routing, specific templates with hardcoded step names) belong in the workflow YAML's `instruction_template`. This keeps agents reusable across different workflows.

What belongs in agent prompts:
- Role definition ("You are a ... specialist")
- Domain expertise, review criteria, judgment standards
- Do / Don't behavioral rules
- Tool usage knowledge (general, not workflow-specific)

What belongs in workflow `instruction_template`:
- Step-specific procedures ("Read these specific reports")
- References to other steps or their outputs
- Specific report file names or formats
- Comment/output templates with hardcoded review type names

**Separation of concerns in workflow engine:**
- `WorkflowEngine` - Orchestration, state management, event emission
- `StepExecutor` - Single step execution (3-phase model)
- `ParallelRunner` - Parallel step execution
- `RuleEvaluator` - Rule matching and evaluation
- `InstructionBuilder` - Instruction template processing

**Session management:** Agent sessions are stored per-cwd in `~/.claude/projects/{encoded-path}/` (Claude Code) or in-memory (Codex). Sessions are resumed across phases (Phase 1 → Phase 2 → Phase 3) to maintain context. When `cwd !== projectCwd` (worktree/clone execution), session resume is skipped to avoid cross-directory contamination.

## Isolated Execution (Shared Clone)

When tasks specify `worktree: true` or `worktree: "path"`, code runs in a `git clone --shared` (a lightweight clone with an independent `.git` directory). Clones are ephemeral: created before task execution, auto-committed and pushed after success, then deleted.

> **Why `worktree` in YAML but `git clone --shared` internally?** The YAML field name `worktree` is retained for backward compatibility. The original implementation used `git worktree`, but git worktrees have a `.git` file containing `gitdir: /path/to/main/.git/worktrees/...`. Claude Code follows this path and recognizes the main repository as the project root, causing agents to work on main instead of the worktree. `git clone --shared` creates an independent `.git` directory that prevents this traversal.
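
To make the `gitdir:` traversal concrete, the sketch below parses the contents of a worktree's `.git` pointer file and derives the main repository root from it. Both helper names are hypothetical and the paths are made-up examples; the sketch shows what any tool following the pointer would find, and it does not claim to reproduce how Claude Code resolves the project root.

```typescript
// Parse the contents of a `.git` pointer file as found in a linked git worktree.
function parseGitdirPointer(contents: string): string | undefined {
  const pointerMatch = /^gitdir:\s*(.+)$/m.exec(contents);
  return pointerMatch?.[1]?.trim();
}

// Hypothetical helper: given a pointer target such as /repo/.git/worktrees/<name>,
// derive the main repository root that the pointer leads back to.
function mainRepoFromGitdir(gitdir: string): string | undefined {
  const marker = "/.git/worktrees/";
  const at = gitdir.indexOf(marker);
  return at === -1 ? undefined : gitdir.slice(0, at);
}

// Example contents of <worktree>/.git (paths are made up for illustration).
const pointerFile = "gitdir: /work/app/.git/worktrees/task-42\n";
const gitdirTarget = parseGitdirPointer(pointerFile);
const mainRepoRoot = gitdirTarget === undefined ? undefined : mainRepoFromGitdir(gitdirTarget);
console.log(gitdirTarget); // prints "/work/app/.git/worktrees/task-42"
console.log(mainRepoRoot); // prints "/work/app"
```

A shared clone has a real `.git` directory instead of this pointer file, so there is no such path leading back to the main repository.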

Key constraints:
- **Independent `.git`**: Shared clones have their own `.git` directory, preventing Claude Code from traversing `gitdir:` back to the main repository.
- **Ephemeral lifecycle**: Clone is created → task runs → auto-commit + push → clone is deleted. Branches are the single source of truth.
- **Session isolation**: Claude Code sessions are stored per-cwd in `~/.claude/projects/{encoded-path}/`. Sessions from the main project cannot be resumed in a clone. The engine skips session resume when `cwd !== projectCwd`.
- **No node_modules**: Clones only contain tracked files. `node_modules/` is absent.
- **Dual cwd**: `cwd` = clone path (where agents run), `projectCwd` = project root (where `.takt/` lives). Reports, logs, and session data always write to `projectCwd`.
- **List**: Use `takt list` to list branches. The instruct action creates a temporary clone for the branch, executes, pushes, then removes the clone.

## Error Propagation

`ClaudeResult` (from the SDK) has an `error` field. This must be propagated through `AgentResponse.error` → session log history → console output. Without this, SDK failures (exit code 1, rate limits, auth errors) appear as an empty `blocked` status with no diagnostic info.

**Error handling flow:**
1. Provider error (Claude SDK / Codex) → `AgentResponse.error`
2. `StepExecutor` captures error → `WorkflowEngine` emits `step:complete` with error
3. Error logged to session log (`.takt/logs/{sessionId}.jsonl`)
4. Console output shows error details
5. Workflow transitions to `ABORT` if the error is unrecoverable

## Debugging

**Debug logging:** Set `debug_enabled: true` in `~/.takt/config.yaml` or create a `.takt/debug.yaml` file:

```yaml
enabled: true
```

Debug logs are written to `.takt/logs/debug.log` (NDJSON format). Log levels: `debug`, `info`, `warn`, `error`.

**Verbose mode:** Create an empty `.takt/verbose` file to enable verbose console output. This automatically enables debug logging and sets the log level to `debug`.

**Session logs:** All workflow executions are logged to `.takt/logs/{sessionId}.jsonl`. Use `tail -f .takt/logs/{sessionId}.jsonl` to monitor in real time.

**Testing with mocks:** Use `--provider mock` to test workflows without calling real AI APIs. Mock responses are deterministic and configurable via test fixtures.

## Testing Notes

- Vitest is the testing framework
- Tests use file system fixtures in `__tests__/` subdirectories
- Mock workflows and agent configs are used for integration tests
- Test single files: `npx vitest run src/__tests__/filename.test.ts`
- Pattern matching: `npx vitest run -t "test pattern"`
- Integration tests: Tests with the `it-` prefix simulate full workflow execution
- Engine tests: Tests with the `engine-` prefix cover specific WorkflowEngine scenarios (happy path, error handling, parallel execution, etc.)

## Important Implementation Notes

**Agent prompt resolution:**
- Agent paths in workflow YAML are resolved relative to the workflow file's directory
- `../agents/default/coder.md` resolves from the workflow file location
- Built-in agents are loaded from `dist/resources/global/{lang}/agents/`
- User agents are loaded from `~/.takt/agents/` or `.takt/agents.yaml`
- If the agent file doesn't exist, the agent string is used as an inline system prompt

**Report directory structure:**
- Report dirs are created at `.takt/reports/{timestamp}-{slug}/`
- Report files specified in `step.report` are written relative to the report dir
- Report dir path is available as the `{report_dir}` variable in instruction templates
- When `cwd !== projectCwd` (worktree execution), reports still write to `projectCwd/.takt/reports/`

**Session continuity across phases:**
- Agent sessions persist across Phase 1 → Phase 2 → Phase 3 for context continuity
- Session ID is passed via `resumeFrom` in `RunAgentOptions`
- Sessions are stored per-cwd, so worktree executions create new sessions
- Use `takt clear` to reset all agent sessions

**Worktree execution gotchas:**
- `git clone --shared` creates an independent `.git` directory (not `git worktree`)
- Clone cwd ≠ project cwd: agents work in the clone, but reports/logs write to the project
- Session resume is skipped when `cwd !== projectCwd` to avoid cross-directory contamination
- Clones are ephemeral: created → task runs → auto-commit + push → deleted
- Use `takt list` to manage task branches after clone deletion

**Rule evaluation quirks:**
- Tag-based rules match by array index (0-based), not by exact condition text
- `ai()` conditions are evaluated by Claude/Codex, not by string matching
- Aggregate conditions (`all()`, `any()`) only work in parallel parent steps
- Fail-fast: if rules exist but no rule matches, the workflow aborts
- Interactive-only rules are skipped in pipeline mode (`rule.interactiveOnly === true`)

**Provider-specific behavior:**
- Claude: Uses session files in `~/.claude/projects/`, supports skill/agent calls
- Codex: In-memory sessions, no skill/agent calls
- Model names are passed directly to the provider (no alias resolution in TAKT)
- Claude supports aliases: `opus`, `sonnet`, `haiku`
- Codex uses its provider default if no model is specified (see Model Resolution)

**Permission modes:**
- `default`: Claude Code default behavior (prompts for file writes)
- `acceptEdits`: Auto-accept file edits without prompts
- `bypassPermissions`: Bypass all permission checks
- Specified at step level (`permission_mode` field) or global config
- Implemented via `--sandbox-mode` and `--accept-edits` flags passed to Claude Code CLI
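
As a closing illustration of the aggregate conditions mentioned under "Rule evaluation quirks", the sketch below parses `all("X")` / `any("X")` and evaluates them against sub-step results. It is a sketch under stated assumptions: the `SubStepResult` and `AggregateCondition` shapes and all function names are hypothetical, the regex is not the loader's actual pattern in `workflowParser.ts`, and treating an empty result list as a non-match for `all()` is a choice made here.

```typescript
// Hypothetical shapes for illustration; TAKT's internal types may differ.
interface SubStepResult {
  stepName: string;
  matchedCondition: string | undefined;
}

interface AggregateCondition {
  kind: "all" | "any";
  target: string;
}

// Parse `all("X")` / `any("X")`. This regex is an assumption, not the loader's actual pattern.
function parseAggregate(condition: string): AggregateCondition | undefined {
  const parts = /^(all|any)\("([^"]*)"\)$/.exec(condition.trim());
  const kind = parts?.[1];
  const target = parts?.[2];
  if ((kind !== "all" && kind !== "any") || target === undefined) return undefined;
  return { kind, target };
}

function evaluateAggregate(cond: AggregateCondition, results: SubStepResult[]): boolean {
  const hit = (result: SubStepResult): boolean => result.matchedCondition === cond.target;
  // Treating an empty result list as "no match" for all() is a choice made for this sketch.
  return cond.kind === "all" ? results.length > 0 && results.every(hit) : results.some(hit);
}

// Return the index of the first parent rule whose aggregate condition holds, or -1.
function matchParentRule(conditions: string[], results: SubStepResult[]): number {
  return conditions.findIndex((text) => {
    const parsed = parseAggregate(text);
    return parsed !== undefined && evaluateAggregate(parsed, results);
  });
}

const reviewResults: SubStepResult[] = [
  { stepName: "arch-review", matchedCondition: "approved" },
  { stepName: "security-review", matchedCondition: "needs_fix" },
];
const parentRuleIndex = matchParentRule(['all("approved")', 'any("needs_fix")'], reviewResults);
console.log(parentRuleIndex); // prints 1
```

With one reviewer approving and one requesting a fix, `all("approved")` fails and `any("needs_fix")` matches, so the parent step would route through its second rule.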