CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
TAKT (Task Agent Koordination Tool) is a multi-agent orchestration system for Claude Code. It enables YAML-based workflow definitions that coordinate multiple AI agents through state machine transitions with rule-based routing.
Development Commands
| Command | Description |
|---|---|
| npm run build | TypeScript build |
| npm run watch | TypeScript build in watch mode |
| npm run test | Run all tests |
| npm run test:watch | Run tests in watch mode (alias: npm run test -- --watch) |
| npm run lint | ESLint |
| npx vitest run src/__tests__/client.test.ts | Run single test file |
| npx vitest run -t "pattern" | Run tests matching pattern |
| npm run prepublishOnly | Lint, build, and test before publishing |
CLI Subcommands
| Command | Description |
|---|---|
| takt {task} | Execute task with current workflow |
| takt | Interactive task input mode (chat with AI to refine requirements) |
| takt run | Execute all pending tasks from .takt/tasks/ once |
| takt watch | Watch .takt/tasks/ and auto-execute tasks (resident process) |
| takt add | Add a new task via AI conversation |
| takt list | List task branches (try merge, merge & cleanup, or delete) |
| takt switch | Switch workflow interactively |
| takt clear | Clear agent conversation sessions (reset state) |
| takt eject | Copy builtin workflow/agents to ~/.takt/ for customization |
| takt config | Configure settings (permission mode) |
| takt --help | Show help message |
Interactive mode: Running takt (without arguments) or takt {initial message} starts an interactive planning session. The AI helps refine task requirements through conversation. Type /go to execute the task with the selected workflow, or /cancel to abort. Implemented in src/features/interactive/.
Pipeline mode: Specifying --pipeline enables non-interactive mode suitable for CI/CD. Automatically creates a branch, runs the workflow, commits, and pushes. Use --auto-pr to also create a pull request. Use --skip-git to run workflow only (no git operations). Implemented in src/features/pipeline/.
GitHub issue references: takt #6 fetches issue #6 and executes it as a task.
Architecture
Core Flow
CLI (cli.ts)
  → Slash commands or executeTask()
    → WorkflowEngine (workflow/engine.ts)
      → Per step: 3-phase execution
        Phase 1: runAgent() → main work
        Phase 2: runReportPhase() → report output (if step.report defined)
        Phase 3: runStatusJudgmentPhase() → status tag output (if tag-based rules)
      → detectMatchedRule() → rule evaluation → determineNextStep()
      → Parallel steps: Promise.all() for sub-steps, aggregate evaluation
Three-Phase Step Execution
Each step executes in up to 3 phases (session is resumed across phases):
| Phase | Purpose | Tools | When |
|---|---|---|---|
| Phase 1 | Main work (coding, review, etc.) | Step's allowed_tools (Write excluded if report defined) | Always |
| Phase 2 | Report output | Write only | When step.report is defined |
| Phase 3 | Status judgment | None (judgment only) | When step has tag-based rules |
Phase 2/3 are implemented in src/core/workflow/engine/phase-runner.ts. The session is resumed so the agent retains context from Phase 1.
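The phase gating in the table above can be sketched as a small function. This is a minimal illustration, not the code in phase-runner.ts: the Step and Rule shapes, the Phase names, and the isTagBased() helper are assumptions made for the example. In particular, treating any condition that is not ai(), all(), or any() as tag-based is inferred from the condition types described in this document.

```typescript
// Hypothetical, simplified shapes; the real step and rule types may differ.
interface Rule {
  condition: string;
  next?: string;
}

interface Step {
  name: string;
  report?: { name: string };
  rules?: Rule[];
}

type Phase = "phase1_main" | "phase2_report" | "phase3_status";

// Assumption: a rule is tag-based when it is not an ai(), all(), or any() expression.
function isTagBased(rule: Rule): boolean {
  return !/^(ai|all|any)\(/.test(rule.condition.trim());
}

// Returns the phases a step would run, in execution order.
function phasesFor(step: Step): Phase[] {
  const phases: Phase[] = ["phase1_main"]; // Phase 1 always runs
  if (step.report !== undefined) {
    phases.push("phase2_report"); // only when step.report is defined
  }
  if ((step.rules ?? []).some(isTagBased)) {
    phases.push("phase3_status"); // only when the step has tag-based rules
  }
  return phases;
}

const planStep: Step = {
  name: "plan",
  report: { name: "01-plan.md" },
  rules: [{ condition: "Plan is complete", next: "implement" }],
};

console.log(phasesFor(planStep)); // all three phases run for this step
```

A step whose only rules are ai() conditions would run Phase 1 alone under these assumptions, since there is no status tag to judge.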
Rule Evaluation (5-Stage Fallback)
After step execution, rules are evaluated to determine the next step. Evaluation order (first match wins):
- Aggregate (all()/any()) - For parallel parent steps
- Phase 3 tag - [STEP:N] tag from status judgment output
- Phase 1 tag - [STEP:N] tag from main execution output (fallback)
- AI judge (ai() only) - AI evaluates ai("condition text") rules
- AI judge fallback - AI evaluates ALL conditions as final resort
Implemented in src/core/workflow/evaluation/RuleEvaluator.ts. The matched method is tracked as RuleMatchMethod type.
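The first-match-wins chain can be sketched as an ordered list of stages. This is an illustration under stated assumptions rather than the RuleEvaluator implementation: the Stage type, firstMatch(), and parseStepTag() are hypothetical names, the stages are synchronous for brevity (real AI judge stages would be asynchronous), and the sketch assumes the number in a [STEP:N] tag is used directly as the rule index.

```typescript
// Method names follow the RuleMatchMethod values listed in this document.
type RuleMatchMethod = "aggregate" | "phase3_tag" | "phase1_tag" | "ai_judge" | "ai_fallback";

interface RuleMatch {
  index: number;
  method: RuleMatchMethod;
}

// Hypothetical stage shape: returns a matched rule index, or undefined to fall through.
interface Stage {
  method: RuleMatchMethod;
  evaluate: () => number | undefined;
}

function firstMatch(stages: Stage[]): RuleMatch {
  for (const stage of stages) {
    const index = stage.evaluate();
    if (index !== undefined) {
      return { index, method: stage.method };
    }
  }
  // Fail-fast, mirroring the documented behavior when rules exist but none match.
  throw new Error("No rule matched");
}

// Hypothetical helper: reads N from the first [STEP:N] tag in the output.
// Assumption: N is used as the rule index without further mapping.
function parseStepTag(output: string): number | undefined {
  const m = /\[STEP:(\d+)\]/.exec(output);
  return m ? Number(m[1]) : undefined;
}

const phase3Output = ""; // status judgment produced no tag
const phase1Output = "Implementation finished. [STEP:1]";

const match = firstMatch([
  { method: "aggregate", evaluate: () => undefined }, // not a parallel parent step
  { method: "phase3_tag", evaluate: () => parseStepTag(phase3Output) },
  { method: "phase1_tag", evaluate: () => parseStepTag(phase1Output) },
  { method: "ai_judge", evaluate: () => undefined },
  { method: "ai_fallback", evaluate: () => undefined },
]);

console.log(match); // index 1, matched via phase1_tag
```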
Key Components
WorkflowEngine (src/core/workflow/engine/WorkflowEngine.ts)
- State machine that orchestrates agent execution via EventEmitter
- Manages step transitions based on rule evaluation results
- Emits events: step:start, step:complete, step:blocked, step:loop_detected, workflow:complete, workflow:abort, iteration:limit
- Supports loop detection (LoopDetector) and iteration limits
- Maintains agent sessions per step for conversation continuity
- Delegates to StepExecutor (normal steps) and ParallelRunner (parallel steps)
StepExecutor (src/core/workflow/engine/StepExecutor.ts)
- Executes a single workflow step through the 3-phase model
- Phase 1: Main agent execution (with tools)
- Phase 2: Report output (Write-only, optional)
- Phase 3: Status judgment (no tools, optional)
- Builds instructions via InstructionBuilder, detects matched rules via RuleEvaluator
ParallelRunner (src/core/workflow/engine/ParallelRunner.ts)
- Executes parallel sub-steps concurrently via Promise.all()
- Aggregates sub-step results for parent rule evaluation
- Supports all()/any() aggregate conditions
RuleEvaluator (src/core/workflow/evaluation/RuleEvaluator.ts)
- 5-stage fallback evaluation: aggregate → Phase 3 tag → Phase 1 tag → ai() judge → all-conditions AI judge
- Returns RuleMatch with index and detection method (aggregate, phase3_tag, phase1_tag, ai_judge, ai_fallback)
- Fail-fast: throws if rules exist but no rule matched
Instruction Builder (src/core/workflow/instruction/InstructionBuilder.ts)
- Auto-injects standard sections into every instruction (no need for {task} or {previous_response} placeholders in templates):
  - Execution context (working dir, edit permission rules)
  - Workflow context (iteration counts, report dir)
  - User request ({task}, auto-injected unless placeholder present)
  - Previous response (auto-injected if pass_previous_response: true)
  - User inputs (auto-injected unless {user_inputs} placeholder present)
  - instruction_template content
  - Status output rules (auto-injected for tag-based rules)
- Localized for en and ja
- Related: ReportInstructionBuilder (Phase 2), StatusJudgmentBuilder (Phase 3)
Agent Runner (src/agents/runner.ts)
- Resolves agent specs (name or path) to agent configurations
- Built-in agents with default tools:
  - coder: Read/Glob/Grep/Edit/Write/Bash/WebSearch/WebFetch
  - architect: Read/Glob/Grep/WebSearch/WebFetch
  - supervisor: Read/Glob/Grep/Bash/WebSearch/WebFetch
  - planner: Read/Glob/Grep/Bash/WebSearch/WebFetch
- Custom agents via .takt/agents.yaml or prompt files (.md)
Provider Integration (src/infra/claude/, src/infra/codex/)
- Claude - Uses @anthropic-ai/claude-agent-sdk
  - client.ts - High-level API: callClaude(), callClaudeCustom(), callClaudeAgent(), callClaudeSkill()
  - process.ts - SDK wrapper with ClaudeProcess class
  - executor.ts - Query execution
  - query-manager.ts - Concurrent query tracking with query IDs
- Codex - Direct OpenAI SDK integration
  - CodexStreamHandler.ts - Stream handling and tool execution
Configuration (src/infra/config/)
- loaders/loader.ts - Custom agent loading from .takt/agents.yaml
- loaders/workflowParser.ts - YAML parsing, step/rule normalization with Zod validation
- loaders/workflowResolver.ts - 3-layer resolution (builtin → user → project-local)
- loaders/workflowCategories.ts - Workflow categorization and filtering
- loaders/agentLoader.ts - Agent prompt file loading
- paths.ts - Directory structure (.takt/, ~/.takt/), session management
- global/globalConfig.ts - Global configuration (provider, model, trusted dirs)
- project/projectConfig.ts - Project-level configuration
Task Management (src/features/tasks/)
- execute/taskExecution.ts - Main task execution orchestration
- execute/workflowExecution.ts - Workflow execution wrapper
- add/index.ts - Interactive task addition via AI conversation
- list/index.ts - List task branches with merge/delete actions
- watch/index.ts - Watch for task files and auto-execute
GitHub Integration (src/infra/github/)
- issue.ts - Fetches issues via gh CLI, formats as task text with title/body/labels/comments
- pr.ts - Creates pull requests via gh CLI
Data Flow
- User provides task (text or #N issue reference) or slash command → CLI
- CLI loads workflow: user ~/.takt/workflows/ → builtin resources/global/{lang}/workflows/ fallback
- WorkflowEngine starts at initial_step
- Each step: buildInstruction() → Phase 1 (main) → Phase 2 (report) → Phase 3 (status) → detectMatchedRule() → determineNextStep()
- Rule evaluation determines next step name
- Special transitions: COMPLETE ends workflow successfully, ABORT ends with failure
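The transition loop in the data flow above can be sketched as a tiny state machine. MiniStep, MiniWorkflow, and runWorkflow() are hypothetical names for illustration, and each step's three phases and rule evaluation are collapsed into a single run() callback that returns the next step name. How the engine finishes when the iteration limit is reached is not described here, so the sketch treats it as an abort, which is an assumption.

```typescript
// Hypothetical shapes: run() stands in for "execute phases, evaluate rules, pick next".
interface MiniStep {
  name: string;
  run: () => string; // next step name, or "COMPLETE" / "ABORT"
}

interface MiniWorkflow {
  initial_step: string;
  max_iterations: number;
  steps: MiniStep[];
}

interface RunResult {
  status: "completed" | "aborted";
  visited: string[];
}

function runWorkflow(workflow: MiniWorkflow): RunResult {
  const byName = new Map<string, MiniStep>();
  for (const step of workflow.steps) byName.set(step.name, step);

  const visited: string[] = [];
  let current = workflow.initial_step;

  for (let i = 0; i < workflow.max_iterations; i++) {
    const step = byName.get(current);
    if (step === undefined) return { status: "aborted", visited }; // unknown step name
    visited.push(step.name);

    const next = step.run();
    if (next === "COMPLETE") return { status: "completed", visited };
    if (next === "ABORT") return { status: "aborted", visited };
    current = next;
  }
  // Assumption: hitting the iteration limit ends the run as an abort.
  return { status: "aborted", visited };
}

let reviews = 0;
const result = runWorkflow({
  initial_step: "plan",
  max_iterations: 10,
  steps: [
    { name: "plan", run: () => "implement" },
    { name: "implement", run: () => "review" },
    // The first review requests a fix, the second approves.
    { name: "review", run: () => (++reviews < 2 ? "implement" : "COMPLETE") },
  ],
});

console.log(result.status, result.visited.join(" → "));
```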
Directory Structure
~/.takt/                  # Global user config (created on first run)
  config.yaml             # Trusted dirs, default workflow, log level, language
  workflows/              # User workflow YAML files (override builtins)
  agents/                 # User agent prompt files (.md)
.takt/                    # Project-level config
  agents.yaml             # Custom agent definitions
  tasks/                  # Task files for /run-tasks
  reports/                # Execution reports (auto-generated)
  logs/                   # Session logs in NDJSON format (gitignored)
resources/                # Bundled defaults (builtin, read from dist/ at runtime)
  global/
    en/                   # English agents and workflows
    ja/                   # Japanese agents and workflows
Builtin resources are embedded in the npm package (dist/resources/). User files in ~/.takt/ take priority. Use takt eject to copy builtins to ~/.takt/ for customization.
Workflow YAML Schema
name: workflow-name
description: Optional description
max_iterations: 10
initial_step: plan # First step to execute
steps:
  # Normal step
  - name: step-name
    agent: ../agents/default/coder.md # Path to agent prompt
    agent_name: coder # Display name (optional)
    provider: codex # claude|codex (optional)
    model: opus # Model name (optional)
    edit: true # Whether step can edit files
    permission_mode: acceptEdits # Tool permission mode (optional)
    instruction_template: |
      Custom instructions for this step.
      {task}, {previous_response} are auto-injected if not present as placeholders.
    pass_previous_response: true # Default: true
    report:
      name: 01-plan.md # Report file name
      format: | # Report format template
        # Plan Report
        ...
    rules:
      - condition: "Human-readable condition"
        next: next-step-name
      - condition: ai("AI evaluates this condition text")
        next: other-step
      - condition: blocked
        next: ABORT
  # Parallel step (sub-steps execute concurrently)
  - name: reviewers
    parallel:
      - name: arch-review
        agent: ../agents/default/architecture-reviewer.md
        rules:
          - condition: approved # next is optional for sub-steps
          - condition: needs_fix
        instruction_template: |
          Review architecture...
      - name: security-review
        agent: ../agents/default/security-reviewer.md
        rules:
          - condition: approved
          - condition: needs_fix
        instruction_template: |
          Review security...
    rules: # Parent rules use aggregate conditions
      - condition: all("approved")
        next: supervise
      - condition: any("needs_fix")
        next: fix
Key points about parallel steps:
- Sub-step rules define possible outcomes but next is ignored (parent handles routing)
- Parent rules use all("X")/any("X") to aggregate sub-step results
- all("X"): true if ALL sub-steps matched condition X
- any("X"): true if ANY sub-step matched condition X
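The aggregate semantics above can be sketched as follows. This is an illustration, not the ParallelRunner or RuleEvaluator code: SubStepResults, ParentRule, evaluateAggregate(), and routeParent() are hypothetical names, parent rules are given in pre-parsed form (mode plus target) instead of the all("X") string, and treating all() over zero sub-steps as false is an assumption.

```typescript
type AggregateMode = "all" | "any";

// Maps each sub-step name to the condition text it matched.
type SubStepResults = Record<string, string>;

function evaluateAggregate(mode: AggregateMode, target: string, results: SubStepResults): boolean {
  const matched = Object.keys(results).map((name) => results[name]);
  if (mode === "all") {
    // Assumption: all() over zero sub-steps is treated as false.
    return matched.length > 0 && matched.every((condition) => condition === target);
  }
  return matched.some((condition) => condition === target);
}

// Hypothetical pre-parsed form of a parent rule such as all("approved").
interface ParentRule {
  mode: AggregateMode;
  target: string;
  next: string;
}

// First matching parent rule wins; undefined means no rule matched.
function routeParent(rules: ParentRule[], results: SubStepResults): string | undefined {
  for (const rule of rules) {
    if (evaluateAggregate(rule.mode, rule.target, results)) return rule.next;
  }
  return undefined;
}

const parentRules: ParentRule[] = [
  { mode: "all", target: "approved", next: "supervise" },
  { mode: "any", target: "needs_fix", next: "fix" },
];

console.log(routeParent(parentRules, { "arch-review": "approved", "security-review": "approved" })); // supervise
console.log(routeParent(parentRules, { "arch-review": "approved", "security-review": "needs_fix" })); // fix
```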
Rule Condition Types
| Type | Syntax | Evaluation |
|---|---|---|
| Tag-based | "condition text" | Agent outputs [STEP:N] tag, matched by index |
| AI judge | ai("condition text") | AI evaluates condition against agent output |
| Aggregate | all("X") / any("X") | Aggregates parallel sub-step matched conditions |
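One way to classify a condition string into these three types is a pair of regular expressions feeding a discriminated union. The ParsedCondition type and parseCondition() below are hypothetical, and the patterns are an assumption for illustration; the actual parsing in workflowParser.ts may differ.

```typescript
type ParsedCondition =
  | { kind: "tag"; text: string }
  | { kind: "ai"; text: string }
  | { kind: "aggregate"; mode: "all" | "any"; text: string };

// Assumed patterns: ai("..."), all("..."), any("..."); anything else is tag-based text.
function parseCondition(raw: string): ParsedCondition {
  const trimmed = raw.trim();

  const ai = /^ai\("(.*)"\)$/.exec(trimmed);
  if (ai) return { kind: "ai", text: ai[1] ?? "" };

  const aggregate = /^(all|any)\("(.*)"\)$/.exec(trimmed);
  if (aggregate) {
    return {
      kind: "aggregate",
      mode: aggregate[1] === "all" ? "all" : "any",
      text: aggregate[2] ?? "",
    };
  }

  return { kind: "tag", text: trimmed };
}

for (const raw of ["Human-readable condition", 'ai("AI evaluates this condition text")', 'all("approved")']) {
  console.log(parseCondition(raw).kind); // tag, then ai, then aggregate
}
```

Because conditions are free-form text, any string that does not match one of the special patterns falls through to the tag-based case.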
Template Variables
| Variable | Description |
|---|---|
| {task} | Original user request (auto-injected if not in template) |
| {iteration} | Workflow-wide iteration count |
| {max_iterations} | Maximum iterations allowed |
| {step_iteration} | Per-step iteration count |
| {previous_response} | Previous step output (auto-injected if not in template) |
| {user_inputs} | Accumulated user inputs (auto-injected if not in template) |
| {report_dir} | Report directory name |
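Placeholder substitution and the "auto-injected if not in template" behavior can be sketched as below. renderTemplate() and buildInstruction() are hypothetical helpers, and the "User Request" section label is invented for the example; the sections InstructionBuilder actually emits, and their wording, are not shown in this document.

```typescript
type TemplateVars = Record<string, string | number>;

// Replaces {name} placeholders that have a value; unknown placeholders are left untouched.
function renderTemplate(template: string, vars: TemplateVars): string {
  return template.replace(/\{(\w+)\}/g, (whole: string, name: string) => {
    const value = vars[name];
    return value === undefined ? whole : String(value);
  });
}

// Prepends the task as its own section only when the template has no {task} placeholder.
// The section label is hypothetical.
function buildInstruction(template: string, vars: TemplateVars): string {
  const sections: string[] = [];
  const task = vars["task"];
  if (task !== undefined && !template.includes("{task}")) {
    sections.push(`User Request:\n${task}`);
  }
  sections.push(renderTemplate(template, vars));
  return sections.join("\n\n");
}

const vars: TemplateVars = { task: "Add login", iteration: 2, max_iterations: 10 };

console.log(renderTemplate("Iteration {iteration}/{max_iterations}: {task}", vars));
// Iteration 2/10: Add login

console.log(buildInstruction("Review the change.", vars));
// User Request:
// Add login
//
// Review the change.
```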
Workflow Categories
Workflows can be organized into categories for better UI presentation. Categories are configured in:
- resources/global/{lang}/default-categories.yaml - Default builtin categories
- ~/.takt/config.yaml - User-defined categories (via workflow_categories field)
Category configuration supports:
- Nested categories (unlimited depth)
- Per-category workflow lists
- "Others" category for uncategorized workflows (can be disabled via show_others_category: false)
- Builtin workflow filtering (disable via builtin_workflows_enabled: false, or selectively via disabled_builtins: [name1, name2])
Example category config:
workflow_categories:
  Development:
    workflows: [default, simple]
    children:
      Backend:
        workflows: [expert-cqrs]
      Frontend:
        workflows: [expert]
  Research:
    workflows: [research, magi]
show_others_category: true
others_category_name: "Other Workflows"
Implemented in src/infra/config/loaders/workflowCategories.ts.
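How the "Others" category could be derived from a nested category tree is sketched below. CategoryNode, collectCategorized(), and othersCategory() are hypothetical names modeled on the example config, and the workflow name "minimal" is invented for the example; this is not the logic in workflowCategories.ts.

```typescript
// Shape modeled on the example config: optional workflow list plus nested children.
interface CategoryNode {
  workflows?: string[];
  children?: Record<string, CategoryNode>;
}

type CategoryTree = Record<string, CategoryNode>;

// Walks the tree to any depth and collects every workflow name that is categorized.
function collectCategorized(tree: CategoryTree, seen: Set<string> = new Set()): Set<string> {
  for (const name of Object.keys(tree)) {
    const node = tree[name];
    if (node === undefined) continue;
    for (const workflow of node.workflows ?? []) seen.add(workflow);
    if (node.children !== undefined) collectCategorized(node.children, seen);
  }
  return seen;
}

// Workflows that appear in no category would land in "Others".
function othersCategory(allWorkflows: string[], tree: CategoryTree): string[] {
  const seen = collectCategorized(tree);
  return allWorkflows.filter((workflow) => !seen.has(workflow));
}

const categories: CategoryTree = {
  Development: {
    workflows: ["default", "simple"],
    children: {
      Backend: { workflows: ["expert-cqrs"] },
      Frontend: { workflows: ["expert"] },
    },
  },
  Research: { workflows: ["research", "magi"] },
};

// "minimal" is a made-up workflow name that no category lists.
const available = ["default", "simple", "expert", "expert-cqrs", "research", "magi", "minimal"];
console.log(othersCategory(available, categories)); // only "minimal" is uncategorized
```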
Model Resolution
Model is resolved in the following priority order:
- Workflow step model - Highest priority (specified in step YAML)
- Custom agent model - Agent-level model in .takt/agents.yaml
- Global config model - Default model in ~/.takt/config.yaml
- Provider default - Falls back to provider's default (Claude: sonnet, Codex: gpt-5.2-codex)
Example ~/.takt/config.yaml:
provider: claude
model: opus # Default model for all steps (unless overridden)
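The priority order can be expressed as a chain of nullish fallbacks. ModelSources and resolveModel() are hypothetical names used for illustration; only the ordering comes from the list above.

```typescript
// Hypothetical container for the four places a model name can come from.
interface ModelSources {
  stepModel?: string; // workflow step YAML
  agentModel?: string; // .takt/agents.yaml
  globalModel?: string; // ~/.takt/config.yaml
  providerDefault: string; // provider's own default
}

// The first defined value wins, in the documented priority order.
function resolveModel(sources: ModelSources): string {
  return sources.stepModel ?? sources.agentModel ?? sources.globalModel ?? sources.providerDefault;
}

console.log(resolveModel({ globalModel: "opus", providerDefault: "sonnet" })); // opus
console.log(resolveModel({ stepModel: "haiku", globalModel: "opus", providerDefault: "sonnet" })); // haiku
console.log(resolveModel({ providerDefault: "sonnet" })); // sonnet
```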
NDJSON Session Logging
Session logs use NDJSON (.jsonl) format for real-time append-only writes. Record types:
| Record | Description |
|---|---|
| workflow_start | Workflow initialization with task, workflow name |
| step_start | Step execution start |
| step_complete | Step result with status, content, matched rule info |
| workflow_complete | Successful completion |
| workflow_abort | Abort with reason |
Files: .takt/logs/{sessionId}.jsonl, with latest.json pointer. Legacy .json format is still readable via loadSessionLog().
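The append-only property of NDJSON comes from each record being one JSON document on its own line. The sketch below shows serializing and parsing such lines; LogRecord, toNdjsonLine(), and parseNdjson() are hypothetical, and the field names (including "type" and "step") are assumptions, since the real record schema is not spelled out here.

```typescript
// Hypothetical record shape; only the record type names come from the table above.
interface LogRecord {
  type: string;
  [key: string]: unknown;
}

// One record per line: appending never requires rewriting earlier content.
function toNdjsonLine(record: LogRecord): string {
  return JSON.stringify(record) + "\n";
}

// Parses a whole .jsonl payload, skipping blank lines such as the trailing one.
function parseNdjson(text: string): LogRecord[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as LogRecord);
}

const log =
  toNdjsonLine({ type: "workflow_start", task: "Add login" }) +
  toNdjsonLine({ type: "step_start", step: "plan" }) +
  toNdjsonLine({ type: "workflow_complete" });

console.log(parseNdjson(log).map((record) => record.type).join(", "));
// workflow_start, step_start, workflow_complete
```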
TypeScript Notes
- ESM modules with .js extensions in imports
- Strict TypeScript with noUncheckedIndexedAccess
- Zod schemas for runtime validation (src/core/models/schemas.ts)
- Uses @anthropic-ai/claude-agent-sdk for Claude integration
Design Principles
Keep commands minimal. One command per concept. Use arguments/modes instead of multiple similar commands. Before adding a new command, consider if existing commands can be extended.
Do NOT expand schemas carelessly. Rule conditions are free-form text (not enum-restricted). However, the engine's behavior depends on specific patterns (ai(), all(), any()). Do not add new special syntax without updating the loader's regex parsing in workflowParser.ts.
Instruction auto-injection over explicit placeholders. The instruction builder auto-injects {task}, {previous_response}, {user_inputs}, and status rules. Templates should contain only step-specific instructions, not boilerplate.
Agent prompts contain only domain knowledge. Agent prompt files (resources/global/{lang}/agents/**/*.md) must contain only domain expertise and behavioral principles — never workflow-specific procedures. Workflow-specific details (which reports to read, step routing, specific templates with hardcoded step names) belong in the workflow YAML's instruction_template. This keeps agents reusable across different workflows.
What belongs in agent prompts:
- Role definition ("You are a ... specialist")
- Domain expertise, review criteria, judgment standards
- Do / Don't behavioral rules
- Tool usage knowledge (general, not workflow-specific)
What belongs in workflow instruction_template:
- Step-specific procedures ("Read these specific reports")
- References to other steps or their outputs
- Specific report file names or formats
- Comment/output templates with hardcoded review type names
Separation of concerns in workflow engine:
- WorkflowEngine - Orchestration, state management, event emission
- StepExecutor - Single step execution (3-phase model)
- ParallelRunner - Parallel step execution
- RuleEvaluator - Rule matching and evaluation
- InstructionBuilder - Instruction template processing
Session management: Agent sessions are stored per-cwd in ~/.claude/projects/{encoded-path}/ (Claude Code) or in-memory (Codex). Sessions are resumed across phases (Phase 1 → Phase 2 → Phase 3) to maintain context. When cwd !== projectCwd (worktree/clone execution), session resume is skipped to avoid cross-directory contamination.
Isolated Execution (Shared Clone)
When tasks specify worktree: true or worktree: "path", code runs in a git clone --shared (lightweight clone with independent .git directory). Clones are ephemeral: created before task execution, auto-committed + pushed after success, then deleted.
Why worktree in YAML but git clone --shared internally? The YAML field name worktree is retained for backward compatibility. The original implementation used git worktree, but git worktrees have a .git file containing gitdir: /path/to/main/.git/worktrees/.... Claude Code follows this path and recognizes the main repository as the project root, causing agents to work on main instead of the worktree. git clone --shared creates an independent .git directory that prevents this traversal.
Key constraints:
- Independent .git: Shared clones have their own .git directory, preventing Claude Code from traversing gitdir: back to the main repository.
- Ephemeral lifecycle: Clone is created → task runs → auto-commit + push → clone is deleted. Branches are the single source of truth.
- Session isolation: Claude Code sessions are stored per-cwd in ~/.claude/projects/{encoded-path}/. Sessions from the main project cannot be resumed in a clone. The engine skips session resume when cwd !== projectCwd.
- No node_modules: Clones only contain tracked files. node_modules/ is absent.
- Dual cwd: cwd = clone path (where agents run), projectCwd = project root (where .takt/ lives). Reports, logs, and session data always write to projectCwd.
- List: Use takt list to list branches. Instruct action creates a temporary clone for the branch, executes, pushes, then removes the clone.
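The dual-cwd rule and the session-resume guard can be sketched in a few lines. RunContext, sessionToResume(), and taktDir() are hypothetical names, and the example paths are invented; only the cwd !== projectCwd check and the "write to projectCwd" rule come from the constraints above.

```typescript
// Hypothetical context: where the agent runs versus where .takt/ lives.
interface RunContext {
  cwd: string;
  projectCwd: string;
  previousSessionId?: string;
}

// Resume only when the agent runs in the project root; clones start a fresh session.
function sessionToResume(ctx: RunContext): string | undefined {
  return ctx.cwd === ctx.projectCwd ? ctx.previousSessionId : undefined;
}

// Reports and logs are always placed under the project root, never under the clone.
function taktDir(ctx: RunContext, kind: "reports" | "logs"): string {
  return `${ctx.projectCwd}/.takt/${kind}`;
}

// Example paths are made up for illustration.
const inProject: RunContext = { cwd: "/work/app", projectCwd: "/work/app", previousSessionId: "abc123" };
const inClone: RunContext = { cwd: "/tmp/clones/app-task-1", projectCwd: "/work/app", previousSessionId: "abc123" };

console.log(sessionToResume(inProject)); // abc123
console.log(sessionToResume(inClone)); // undefined
console.log(taktDir(inClone, "reports")); // /work/app/.takt/reports
```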
Error Propagation
ClaudeResult (from SDK) has an error field. This must be propagated through AgentResponse.error → session log history → console output. Without this, SDK failures (exit code 1, rate limits, auth errors) appear as empty blocked status with no diagnostic info.
Error handling flow:
- Provider error (Claude SDK / Codex) → AgentResponse.error
- StepExecutor captures error → WorkflowEngine emits step:complete with error
- Error logged to session log (.takt/logs/{sessionId}.jsonl)
- Console output shows error details
- Workflow transitions to ABORT step if error is unrecoverable
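The propagation requirement amounts to never dropping the error field when converting between result types. In the sketch below, the ClaudeResult and AgentResponse shapes are simplified assumptions (only the error field is taken from this document), and toAgentResponse() and formatForConsole() are hypothetical helpers.

```typescript
// Simplified, assumed shapes; only the `error` field is described in this document.
interface ClaudeResult {
  success: boolean;
  content: string;
  error?: string;
}

interface AgentResponse {
  status: "done" | "blocked";
  content: string;
  error?: string;
}

// Carries the provider error through instead of collapsing it into an empty "blocked".
function toAgentResponse(result: ClaudeResult): AgentResponse {
  return {
    status: result.success ? "done" : "blocked",
    content: result.content,
    ...(result.error !== undefined ? { error: result.error } : {}),
  };
}

// Console line that surfaces the diagnostic when one is present.
function formatForConsole(stepName: string, response: AgentResponse): string {
  const detail = response.error !== undefined ? ` (${response.error})` : "";
  return `${stepName}: ${response.status}${detail}`;
}

const failed = toAgentResponse({ success: false, content: "", error: "rate limit exceeded" });
console.log(formatForConsole("implement", failed)); // implement: blocked (rate limit exceeded)
```

Without the spread of the error field, the same failure would print as a bare "implement: blocked", which is the symptom described above.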
Debugging
Debug logging: Set debug_enabled: true in ~/.takt/config.yaml or create a .takt/debug.yaml file:
enabled: true
Debug logs are written to .takt/logs/debug.log (ndjson format). Log levels: debug, info, warn, error.
Verbose mode: Create .takt/verbose file (empty file) to enable verbose console output. This automatically enables debug logging and sets log level to debug.
Session logs: All workflow executions are logged to .takt/logs/{sessionId}.jsonl. Use tail -f .takt/logs/{sessionId}.jsonl to monitor in real-time.
Testing with mocks: Use --provider mock to test workflows without calling real AI APIs. Mock responses are deterministic and configurable via test fixtures.
Testing Notes
- Vitest as the testing framework
- Tests use file system fixtures in __tests__/ subdirectories
- Mock workflows and agent configs for integration tests
- Test single files: npx vitest run src/__tests__/filename.test.ts
- Pattern matching: npx vitest run -t "test pattern"
- Integration tests: Tests with it- prefix are integration tests that simulate full workflow execution
- Engine tests: Tests with engine- prefix test specific WorkflowEngine scenarios (happy path, error handling, parallel execution, etc.)
Important Implementation Notes
Agent prompt resolution:
- Agent paths in workflow YAML are resolved relative to the workflow file's directory
- ../agents/default/coder.md resolves from workflow file location
- Built-in agents are loaded from dist/resources/global/{lang}/agents/
- User agents are loaded from ~/.takt/agents/ or .takt/agents.yaml
- If agent file doesn't exist, the agent string is used as inline system prompt
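The "relative to the workflow file, else inline prompt" rule can be sketched as below. Everything here is hypothetical: resolveAgent(), the PromptSource type, and the small POSIX-only path helpers, which stand in for node:path so the example has no dependencies. The file-existence check is injected as a callback for the same reason, and the example paths are made up.

```typescript
// Minimal POSIX-style helpers (assumption: absolute, "/"-separated paths).
function dirname(filePath: string): string {
  const i = filePath.lastIndexOf("/");
  return i <= 0 ? "/" : filePath.slice(0, i);
}

function joinNormalized(base: string, relative: string): string {
  const parts: string[] = [];
  for (const segment of `${base}/${relative}`.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return "/" + parts.join("/");
}

type PromptSource =
  | { kind: "file"; path: string }
  | { kind: "inline"; prompt: string };

// Resolves the agent spec against the workflow file's directory;
// if no such file exists, the spec itself is used as the system prompt.
function resolveAgent(
  agentSpec: string,
  workflowFile: string,
  fileExists: (path: string) => boolean,
): PromptSource {
  const candidate = joinNormalized(dirname(workflowFile), agentSpec);
  return fileExists(candidate)
    ? { kind: "file", path: candidate }
    : { kind: "inline", prompt: agentSpec };
}

// Example paths are made up for illustration.
const workflowFile = "/home/user/.takt/workflows/default.yaml";
const existing = new Set(["/home/user/.takt/agents/default/coder.md"]);
const exists = (path: string): boolean => existing.has(path);

console.log(resolveAgent("../agents/default/coder.md", workflowFile, exists));
console.log(resolveAgent("You are a terse reviewer.", workflowFile, exists));
```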
Report directory structure:
- Report dirs are created at .takt/reports/{timestamp}-{slug}/
- Report files specified in step.report are written relative to report dir
- Report dir path is available as {report_dir} variable in instruction templates
- When cwd !== projectCwd (worktree execution), reports still write to projectCwd/.takt/reports/
Session continuity across phases:
- Agent sessions persist across Phase 1 → Phase 2 → Phase 3 for context continuity
- Session ID is passed via resumeFrom in RunAgentOptions
- Sessions are stored per-cwd, so worktree executions create new sessions
- Use takt clear to reset all agent sessions
Worktree execution gotchas:
- git clone --shared creates independent .git directory (not git worktree)
- Clone cwd ≠ project cwd: agents work in clone, but reports/logs write to project
- Session resume is skipped when cwd !== projectCwd to avoid cross-directory contamination
- Clones are ephemeral: created → task runs → auto-commit + push → deleted
- Use takt list to manage task branches after clone deletion
Rule evaluation quirks:
- Tag-based rules match by array index (0-based), not by exact condition text
- ai() conditions are evaluated by Claude/Codex, not by string matching
- Aggregate conditions (all(), any()) only work in parallel parent steps
- Fail-fast: if rules exist but no rule matches, workflow aborts
- Interactive-only rules are skipped in pipeline mode (rule.interactiveOnly === true)
Provider-specific behavior:
- Claude: Uses session files in ~/.claude/projects/, supports skill/agent calls
- Codex: In-memory sessions, no skill/agent calls
- Model names are passed directly to provider (no alias resolution in TAKT)
- Claude supports aliases: opus, sonnet, haiku
- Codex defaults to codex if model not specified
Permission modes:
- default: Claude Code default behavior (prompts for file writes)
- acceptEdits: Auto-accept file edits without prompts
- bypassPermissions: Bypass all permission checks
- Specified at step level (permission_mode field) or global config
- Implemented via --sandbox-mode and --accept-edits flags passed to Claude Code CLI