diff --git a/.github/workflows/auto-tag.yml b/.github/workflows/auto-tag.yml index 502d777..1f61cad 100644 --- a/.github/workflows/auto-tag.yml +++ b/.github/workflows/auto-tag.yml @@ -86,13 +86,21 @@ jobs: - name: Verify dist-tags run: | PACKAGE_NAME=$(node -p "require('./package.json').name") - LATEST=$(npm view "${PACKAGE_NAME}" dist-tags.latest) - NEXT=$(npm view "${PACKAGE_NAME}" dist-tags.next || true) - echo "latest=${LATEST}" - echo "next=${NEXT}" + for attempt in 1 2 3 4 5; do + LATEST=$(npm view "${PACKAGE_NAME}" dist-tags.latest) + NEXT=$(npm view "${PACKAGE_NAME}" dist-tags.next || true) - if [ "${{ steps.npm-tag.outputs.tag }}" = "latest" ] && [ "${LATEST}" != "${NEXT}" ]; then - echo "Expected next to match latest on stable release, but they differ." - exit 1 - fi + echo "Attempt ${attempt}: latest=${LATEST}, next=${NEXT}" + + if [ "${{ steps.npm-tag.outputs.tag }}" != "latest" ] || [ "${LATEST}" = "${NEXT}" ]; then + echo "Dist-tags verified." + exit 0 + fi + + if [ "$attempt" -eq 5 ]; then + echo "::warning::dist-tags not synced after 5 attempts (latest=${LATEST}, next=${NEXT}). Registry propagation may be delayed." + exit 0 + fi + sleep $((attempt * 10)) + done diff --git a/CHANGELOG.md b/CHANGELOG.md index c170daf..75f9acd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,46 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). 
+## [0.13.0-alpha.1] - 2026-02-13 + +### Added + +- **Team Leader movement**: A new movement type in which a team leader agent dynamically decomposes the task into subtasks (Parts) within a movement and runs multiple part agents in parallel. Supports the `team_leader` configuration (persona, maxParts, timeoutMs, partPersona, partEdit, partPermissionMode) (#244) +- **Structured Output**: Introduced JSON Schema-based structured output for agent calls. Added three schemas to `builtins/schemas/`: task decomposition (decomposition), rule evaluation (evaluation), and status judgment (judgment). Supported by both the Claude and Codex providers (#257) +- **`backend` builtin piece**: Added a new piece specialized for backend development, with parallel expert reviews for backend, security, and QA +- **`backend-cqrs` builtin piece**: Added a new backend development piece specialized for CQRS+ES, with parallel expert reviews for CQRS+ES, security, and QA +- **Part timeout via AbortSignal**: Added timeout control and an AbortSignal linked to the parent signal for Team Leader part execution +- **Agent use-case layer**: Consolidated the agent-call use cases (`decomposeTask`, `executeAgent`, `evaluateRules`) in `agent-usecases.ts`, centralizing structured-output injection + +### Changed + +- **BREAKING: Public API cleanup**: Significantly narrowed the public API of `src/index.ts`. Internal implementation details (session management, Claude/Codex client details, utility functions, etc.) are no longer exported, leaving a stable, minimal API surface (#257) +- **Phase 3 judgment logic overhaul**: Removed `JudgmentDetector` / `FallbackStrategy` and consolidated them into the structured-output-based `status-judgment-phase.ts`, improving judgment stability and maintainability (#257) +- **Report phase retry improvement**: When the Report Phase (Phase 2) fails, it now retries automatically in a new session (#245) +- **Unified Ctrl+C shutdown**: Removed `sigintHandler.ts` and consolidated it into `ShutdownManager`. The three-stage control (graceful shutdown → timeout → forced termination) is now shared across all providers (#237) +- Added design token and theme scope guidance to the frontend knowledge +- Improved the architecture knowledge (both en and ja) + +### Fixed + +- Fixed checkout of an existing branch failing during clone: `--branch` is now passed to `git clone --shared` before the remote is removed +- Removed `#` from branch names that reference an issue (`takt/#N/slug` → `takt/N/slug`) +- Removed the dependency on deprecated tools in the OpenCode report phase and moved to permission-based control (#246) +- Removed unnecessary exports to keep the public API consistent + +### Internal + +- Added Team Leader tests (engine-team-leader, team-leader-schema-loader, task-decomposer) +- Added structured output tests (parseStructuredOutput, claude-executor-structured-output, codex-structured-output, provider-structured-output, structured-output E2E) +- Added ShutdownManager unit tests
+- Added AbortSignal unit tests (abort-signal, claude-executor-abort-signal, claude-provider-abort-signal) +- Added Report Phase retry unit tests (report-phase-retry) +- Added public API export unit tests (public-api-exports) +- Greatly expanded E2E tests: cycle-detection, model-override, multi-step-sequential, pipeline-local-repo, report-file-output, run-sigint-graceful, session-log, structured-output, task-status-persistence +- Refactored E2E test helpers (extracted shared setup functions) +- Removed the `judgment/` directory (JudgmentDetector, FallbackStrategy) +- Added the `ruleIndex.ts` utility (1-based → 0-based index conversion) + ## [0.12.1] - 2026-02-11 ### Fixed diff --git a/CLAUDE.md b/CLAUDE.md index f26a082..f195450 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -371,19 +371,30 @@ Files: `.takt/logs/{sessionId}.jsonl`, with `latest.json` pointer. Legacy `.json **Instruction auto-injection over explicit placeholders.** The instruction builder auto-injects `{task}`, `{previous_response}`, `{user_inputs}`, and status rules. Templates should contain only step-specific instructions, not boilerplate. -**Persona prompts contain only domain knowledge.** Persona prompt files (`builtins/{lang}/personas/*.md`) must contain only domain expertise and behavioral principles — never piece-specific procedures. Piece-specific details (which reports to read, step routing, specific templates with hardcoded step names) belong in the piece YAML's `instruction_template`. This keeps personas reusable across different pieces. +**Faceted prompting: each facet has a dedicated file type.** TAKT assembles agent prompts from 4 facets. Each facet has a distinct role. When adding new rules or knowledge, place content in the correct facet. -What belongs in persona prompts: -- Role definition ("You are a ... 
specialist") -- Domain expertise, review criteria, judgment standards -- Do / Don't behavioral rules -- Tool usage knowledge (general, not piece-specific) +``` +builtins/{lang}/ + personas/ — WHO: identity, expertise, behavioral habits + policies/ — HOW: judgment criteria, REJECT/APPROVE rules, prohibited patterns + knowledge/ — WHAT TO KNOW: domain patterns, anti-patterns, detailed reasoning with examples + instructions/ — WHAT TO DO NOW: step-specific procedures and checklists +``` -What belongs in piece `instruction_template`: -- Step-specific procedures ("Read these specific reports") -- References to other steps or their outputs -- Specific report file names or formats -- Comment/output templates with hardcoded review type names +| Deciding where to place content | Facet | Example | +|--------------------------------|-------|---------| +| Role definition, AI habit prevention | Persona | "置き換えたコードを残す → 禁止" | +| Actionable REJECT/APPROVE criterion | Policy | "内部実装のパブリックAPIエクスポート → REJECT" | +| Detailed reasoning, REJECT/OK table with examples | Knowledge | "パブリックAPIの公開範囲" section | +| This-step-only procedure or checklist | Instruction | "レビュー観点: 構造・設計の妥当性..." | +| Workflow structure, facet assignment | Piece YAML | `persona: coder`, `policy: coding`, `knowledge: architecture` | + +Key rules: +- Persona files are reusable across pieces. Never include piece-specific procedures (report names, step references) +- Policy REJECT lists are what reviewers enforce. If a criterion is not in the policy REJECT list, reviewers will not catch it — even if knowledge explains the reasoning +- Knowledge provides the WHY behind policy criteria. Knowledge alone does not trigger enforcement +- Instructions are bound to a single piece step. 
They reference procedures, not principles +- Piece YAML `instruction_template` is for step-specific details (which reports to read, step routing, output templates) **Separation of concerns in piece engine:** - `PieceEngine` - Orchestration, state management, event emission diff --git a/README.md b/README.md index 7faa1a3..0838200 100644 --- a/README.md +++ b/README.md @@ -474,6 +474,8 @@ TAKT includes multiple builtin pieces: | `unit-test` | Unit test focused piece: test analysis → test implementation → review → fix. | | `e2e-test` | E2E test focused piece: E2E analysis → E2E implementation → review → fix (Vitest-based E2E flow). | | `frontend` | Frontend-specialized development piece with React/Next.js focused reviews and knowledge injection. | +| `backend` | Backend-specialized development piece with backend, security, and QA expert reviews. | +| `backend-cqrs` | CQRS+ES-specialized backend development piece with CQRS+ES, security, and QA expert reviews. | **Per-persona provider overrides:** Use `persona_providers` in config to route specific personas to different providers (e.g., coder on Codex, reviewers on Claude) without duplicating pieces. diff --git a/builtins/en/knowledge/architecture.md b/builtins/en/knowledge/architecture.md index 04eccfb..c410c6d 100644 --- a/builtins/en/knowledge/architecture.md +++ b/builtins/en/knowledge/architecture.md @@ -18,6 +18,26 @@ - No circular dependencies - Appropriate directory hierarchy +**Operation Discoverability:** + +When calls to the same generic function are scattered across the codebase with different purposes, it becomes impossible to understand what the system does without grepping every call site. Group related operations into purpose-named functions within a single module. Reading that module should reveal the complete list of operations the system performs. 
+ +| Judgment | Criteria | +|----------|----------| +| REJECT | Same generic function called directly from 3+ places with different purposes | +| REJECT | Understanding all system operations requires grepping every call site | +| OK | Purpose-named functions defined and collected in a single module | + +**Public API Surface:** + +Public APIs should expose only domain-level functions and types. Do not export infrastructure internals (provider-specific functions, internal parsers, etc.). + +| Judgment | Criteria | +|----------|----------| +| REJECT | Infrastructure-layer functions exported from public API | +| REJECT | Internal implementation functions callable from outside | +| OK | External consumers interact only through domain-level abstractions | + **Function Design:** - One responsibility per function @@ -299,19 +319,18 @@ Correct handling: ## DRY Violation Detection -Detect duplicate code. +Eliminate duplication by default. When logic is essentially the same and should be unified, apply DRY. Do not judge mechanically by count. 
| Pattern | Judgment | |---------|----------| -| Same logic in 3+ places | Immediate REJECT - Extract to function/method | -| Same validation in 2+ places | Immediate REJECT - Extract to validator function | -| Similar components 3+ | Immediate REJECT - Create shared component | -| Copy-paste derived code | Immediate REJECT - Parameterize or abstract | +| Essentially identical logic duplicated | REJECT - Extract to function/method | +| Same validation duplicated | REJECT - Extract to validator function | +| Essentially identical component structure | REJECT - Create shared component | +| Copy-paste derived code | REJECT - Parameterize or abstract | -AHA principle (Avoid Hasty Abstractions) balance: -- 2 duplications → Wait and see -- 3 duplications → Extract immediately -- Different domain duplications → Don't abstract (e.g., customer validation vs admin validation are different) +When NOT to apply DRY: +- Different domains: Don't abstract (e.g., customer validation vs admin validation are different things) +- Superficially similar but different reasons to change: Treat as separate code ## Spec Compliance Verification diff --git a/builtins/en/knowledge/frontend.md b/builtins/en/knowledge/frontend.md index 8674246..b1781e2 100644 --- a/builtins/en/knowledge/frontend.md +++ b/builtins/en/knowledge/frontend.md @@ -224,6 +224,54 @@ Signs to make separate components: - Added variant is clearly different from original component's purpose - Props specification becomes complex on the usage side +### Theme Differences and Design Tokens + +When you need different visuals with the same functional components, manage it with design tokens + theme scope. + +Principles: +- Define color, spacing, radius, shadow, and typography as tokens (CSS variables) +- Apply role/page-specific differences by overriding tokens in a theme scope (e.g. 
`.consumer-theme`, `.admin-theme`) +- Do not hardcode hex colors (`#xxxxxx`) in feature components +- Keep logic differences (API/state) separate from visual differences (tokens) + +```css +/* tokens.css */ +:root { + --color-bg-page: #f3f4f6; + --color-surface: #ffffff; + --color-text-primary: #1f2937; + --color-border: #d1d5db; + --color-accent: #2563eb; +} + +.consumer-theme { + --color-bg-page: #f7f8fa; + --color-accent: #4daca1; +} +``` + +```tsx +// same component, different look by scope +
+ +
+``` + +Operational rules: +- Implement shared UI primitives (Button/Card/Input/Tabs) using tokens only +- In feature views, use theme-common utility classes (e.g. `surface`, `title`, `chip`) to avoid duplicated styling logic +- For a new theme, follow: "add tokens -> override by scope -> reuse existing components" + +Review checklist: +- No copy-pasted hardcoded colors/spacings +- No duplicated components per theme for the same UI behavior +- No API/state-management changes made solely for visual adjustments + +Anti-patterns: +- Creating `ButtonConsumer`, `ButtonAdmin` for styling only +- Hardcoding colors in each feature component +- Changing response shaping logic when only the theme changed + ## Abstraction Level Evaluation **Conditional branch bloat detection:** diff --git a/builtins/en/personas/coder.md b/builtins/en/personas/coder.md index 926db8b..f222620 100644 --- a/builtins/en/personas/coder.md +++ b/builtins/en/personas/coder.md @@ -33,4 +33,5 @@ You are the implementer. Focus on implementation, not design decisions. 
- Making design decisions arbitrarily → Report and ask for guidance - Dismissing reviewer feedback → Prohibited - Adding backward compatibility or legacy support without being asked → Absolutely prohibited +- Leaving replaced code/exports after refactoring → Prohibited (remove unless explicitly told to keep) - Layering workarounds that bypass safety mechanisms on top of a root cause fix → Prohibited diff --git a/builtins/en/piece-categories.yaml b/builtins/en/piece-categories.yaml index 27db8db..1cd1933 100644 --- a/builtins/en/piece-categories.yaml +++ b/builtins/en/piece-categories.yaml @@ -9,7 +9,10 @@ piece_categories: 🎨 Frontend: pieces: - frontend - ⚙️ Backend: {} + ⚙️ Backend: + pieces: + - backend + - backend-cqrs 🔧 Expert: Full Stack: pieces: diff --git a/builtins/en/pieces/backend-cqrs.yaml b/builtins/en/pieces/backend-cqrs.yaml new file mode 100644 index 0000000..aa4a112 --- /dev/null +++ b/builtins/en/pieces/backend-cqrs.yaml @@ -0,0 +1,267 @@ +name: backend-cqrs +description: CQRS+ES, Security, QA Expert Review +max_movements: 30 +initial_movement: plan +movements: + - name: plan + edit: false + persona: planner + allowed_tools: + - Read + - Glob + - Grep + - Bash + - WebSearch + - WebFetch + instruction: plan + rules: + - condition: Task analysis and planning is complete + next: implement + - condition: Requirements are unclear and planning cannot proceed + next: ABORT + output_contracts: + report: + - name: 00-plan.md + format: plan + - name: implement + edit: true + persona: coder + policy: + - coding + - testing + session: refresh + knowledge: + - backend + - cqrs-es + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: implement + rules: + - condition: Implementation is complete + next: ai_review + - condition: No implementation (report only) + next: ai_review + - condition: Cannot proceed with implementation + next: ai_review + - condition: User input 
required + next: implement + requires_user_input: true + interactive_only: true + output_contracts: + report: + - Scope: 01-coder-scope.md + - Decisions: 02-coder-decisions.md + - name: ai_review + edit: false + persona: ai-antipattern-reviewer + policy: + - review + - ai-antipattern + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + instruction: ai-review + rules: + - condition: No AI-specific issues found + next: reviewers + - condition: AI-specific issues detected + next: ai_fix + output_contracts: + report: + - name: 03-ai-review.md + format: ai-review + - name: ai_fix + edit: true + persona: coder + policy: + - coding + - testing + session: refresh + knowledge: + - backend + - cqrs-es + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: ai-fix + rules: + - condition: AI Reviewer's issues have been fixed + next: ai_review + - condition: No fix needed (verified target files/spec) + next: ai_no_fix + - condition: Unable to proceed with fixes + next: ai_no_fix + - name: ai_no_fix + edit: false + persona: architecture-reviewer + policy: review + allowed_tools: + - Read + - Glob + - Grep + rules: + - condition: ai_review's findings are valid (fix required) + next: ai_fix + - condition: ai_fix's judgment is valid (no fix needed) + next: reviewers + instruction: arbitrate + - name: reviewers + parallel: + - name: cqrs-es-review + edit: false + persona: cqrs-es-reviewer + policy: review + knowledge: + - cqrs-es + - backend + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-cqrs-es + output_contracts: + report: + - name: 04-cqrs-es-review.md + format: cqrs-es-review + - name: security-review + edit: false + persona: security-reviewer + policy: review + knowledge: security + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - 
condition: approved + - condition: needs_fix + instruction: review-security + output_contracts: + report: + - name: 05-security-review.md + format: security-review + - name: qa-review + edit: false + persona: qa-reviewer + policy: + - review + - qa + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-qa + output_contracts: + report: + - name: 06-qa-review.md + format: qa-review + rules: + - condition: all("approved") + next: supervise + - condition: any("needs_fix") + next: fix + - name: fix + edit: true + persona: coder + policy: + - coding + - testing + knowledge: + - backend + - cqrs-es + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + permission_mode: edit + rules: + - condition: Fix complete + next: reviewers + - condition: Cannot proceed, insufficient info + next: plan + instruction: fix + - name: supervise + edit: false + persona: expert-supervisor + policy: review + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + instruction: supervise + rules: + - condition: All validations pass and ready to merge + next: COMPLETE + - condition: Issues detected during final review + next: fix_supervisor + output_contracts: + report: + - Validation: 07-supervisor-validation.md + - Summary: summary.md + - name: fix_supervisor + edit: true + persona: coder + policy: + - coding + - testing + knowledge: + - backend + - cqrs-es + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: fix-supervisor + rules: + - condition: Supervisor's issues have been fixed + next: supervise + - condition: Unable to proceed with fixes + next: plan diff --git a/builtins/en/pieces/backend.yaml b/builtins/en/pieces/backend.yaml new file mode 100644 index 0000000..d3d7522 --- /dev/null +++ 
b/builtins/en/pieces/backend.yaml @@ -0,0 +1,263 @@ +name: backend +description: Backend, Security, QA Expert Review +max_movements: 30 +initial_movement: plan +movements: + - name: plan + edit: false + persona: planner + allowed_tools: + - Read + - Glob + - Grep + - Bash + - WebSearch + - WebFetch + instruction: plan + rules: + - condition: Task analysis and planning is complete + next: implement + - condition: Requirements are unclear and planning cannot proceed + next: ABORT + output_contracts: + report: + - name: 00-plan.md + format: plan + - name: implement + edit: true + persona: coder + policy: + - coding + - testing + session: refresh + knowledge: + - backend + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: implement + rules: + - condition: Implementation is complete + next: ai_review + - condition: No implementation (report only) + next: ai_review + - condition: Cannot proceed with implementation + next: ai_review + - condition: User input required + next: implement + requires_user_input: true + interactive_only: true + output_contracts: + report: + - Scope: 01-coder-scope.md + - Decisions: 02-coder-decisions.md + - name: ai_review + edit: false + persona: ai-antipattern-reviewer + policy: + - review + - ai-antipattern + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + instruction: ai-review + rules: + - condition: No AI-specific issues found + next: reviewers + - condition: AI-specific issues detected + next: ai_fix + output_contracts: + report: + - name: 03-ai-review.md + format: ai-review + - name: ai_fix + edit: true + persona: coder + policy: + - coding + - testing + session: refresh + knowledge: + - backend + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: ai-fix + rules: + - condition: AI Reviewer's issues have been fixed + next: ai_review + 
- condition: No fix needed (verified target files/spec) + next: ai_no_fix + - condition: Unable to proceed with fixes + next: ai_no_fix + - name: ai_no_fix + edit: false + persona: architecture-reviewer + policy: review + allowed_tools: + - Read + - Glob + - Grep + rules: + - condition: ai_review's findings are valid (fix required) + next: ai_fix + - condition: ai_fix's judgment is valid (no fix needed) + next: reviewers + instruction: arbitrate + - name: reviewers + parallel: + - name: arch-review + edit: false + persona: architecture-reviewer + policy: review + knowledge: + - architecture + - backend + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-arch + output_contracts: + report: + - name: 04-architect-review.md + format: architecture-review + - name: security-review + edit: false + persona: security-reviewer + policy: review + knowledge: security + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-security + output_contracts: + report: + - name: 05-security-review.md + format: security-review + - name: qa-review + edit: false + persona: qa-reviewer + policy: + - review + - qa + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-qa + output_contracts: + report: + - name: 06-qa-review.md + format: qa-review + rules: + - condition: all("approved") + next: supervise + - condition: any("needs_fix") + next: fix + - name: fix + edit: true + persona: coder + policy: + - coding + - testing + knowledge: + - backend + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + permission_mode: edit + rules: + - condition: Fix complete + next: reviewers + - condition: Cannot proceed, insufficient info + next: 
plan + instruction: fix + - name: supervise + edit: false + persona: expert-supervisor + policy: review + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + instruction: supervise + rules: + - condition: All validations pass and ready to merge + next: COMPLETE + - condition: Issues detected during final review + next: fix_supervisor + output_contracts: + report: + - Validation: 07-supervisor-validation.md + - Summary: summary.md + - name: fix_supervisor + edit: true + persona: coder + policy: + - coding + - testing + knowledge: + - backend + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: fix-supervisor + rules: + - condition: Supervisor's issues have been fixed + next: supervise + - condition: Unable to proceed with fixes + next: plan diff --git a/builtins/en/policies/coding.md b/builtins/en/policies/coding.md index 7ef4af3..1296a9a 100644 --- a/builtins/en/policies/coding.md +++ b/builtins/en/policies/coding.md @@ -7,7 +7,7 @@ Prioritize correctness over speed, and code accuracy over ease of implementation | Principle | Criteria | |-----------|----------| | Simple > Easy | Prioritize readability over writability | -| DRY | Extract after 3 repetitions | +| DRY | Eliminate essential duplication | | Comments | Why only. Never write What/How | | Function size | One function, one responsibility. ~30 lines | | File size | ~300 lines as a guideline. Be flexible depending on the task | @@ -245,23 +245,19 @@ Request → toInput() → UseCase/Service → Output → Response.from() ## Shared Code Decisions -### Rule of Three - -- 1st occurrence: Write it inline -- 2nd occurrence: Do not extract yet (observe) -- 3rd occurrence: Consider extracting +Eliminate duplication by default. When logic is essentially the same and should be unified, apply DRY. Do not decide mechanically by count. 
### Should Be Shared -- Same logic in 3+ places +- Essentially identical logic duplicated - Same style/UI pattern - Same validation logic - Same formatting logic ### Should Not Be Shared -- Similar but subtly different (forced generalization adds complexity) -- Used in only 1-2 places +- Duplication across different domains (e.g., customer validation and admin validation are separate concerns) +- Superficially similar code with different reasons to change - Based on "might need it in the future" predictions ```typescript @@ -289,4 +285,6 @@ function formatPercentage(value: number): string { ... } - **Hardcoded secrets** - **Scattered try-catch** - Centralize error handling at the upper layer - **Unsolicited backward compatibility / legacy support** - Not needed unless explicitly instructed +- **Internal implementation exported from public API** - Only export domain-level functions and types. Do not export infrastructure functions or internal classes +- **Replaced code surviving after refactoring** - Remove replaced code and exports. Do not keep unless explicitly told to - **Workarounds that bypass safety mechanisms** - If the root fix is correct, no additional bypass is needed diff --git a/builtins/en/policies/review.md b/builtins/en/policies/review.md index 4908fec..ea70654 100644 --- a/builtins/en/policies/review.md +++ b/builtins/en/policies/review.md @@ -38,9 +38,11 @@ REJECT without exception if any of the following apply. 
- Direct mutation of objects/arrays - Swallowed errors (empty catch blocks) - TODO comments (not tracked in an issue) -- Duplicated code in 3+ places (DRY violation) +- Essentially identical logic duplicated (DRY violation) - Method proliferation doing the same thing (should be absorbed by configuration differences) - Specific implementation leaking into generic layers (imports and branching for specific implementations in generic layers) +- Internal implementation exported from public API (infrastructure functions or internal classes exposed publicly) +- Replaced code/exports surviving after refactoring - Missing cross-validation of related fields (invariants of semantically coupled config values left unverified) ### Warning diff --git a/builtins/ja/knowledge/architecture.md b/builtins/ja/knowledge/architecture.md index 7527100..a00ffb0 100644 --- a/builtins/ja/knowledge/architecture.md +++ b/builtins/ja/knowledge/architecture.md @@ -18,6 +18,26 @@ - 循環依存がないか - 適切なディレクトリ階層か +**操作の一覧性** + +同じ汎用関数への呼び出しがコードベースに散在すると、システムが何をしているか把握できなくなる。操作には目的に応じた名前を付けて関数化し、関連する操作を1つのモジュールにまとめる。そのモジュールを読めば「このシステムが行う操作の全体像」がわかる状態にする。 + +| 判定 | 基準 | +|------|------| +| REJECT | 同じ汎用関数が目的の異なる3箇所以上から直接呼ばれている | +| REJECT | 呼び出し元を全件 grep しないとシステムの操作一覧がわからない | +| OK | 目的ごとに名前付き関数が定義され、1モジュールに集約されている | + +**パブリック API の公開範囲** + +パブリック API が公開するのは、ドメインの操作に対応する関数・型のみ。インフラの実装詳細(特定プロバイダーの関数、内部パーサー等)を公開しない。 + +| 判定 | 基準 | +|------|------| +| REJECT | インフラ層の関数がパブリック API からエクスポートされている | +| REJECT | 内部実装の関数が外部から直接呼び出し可能になっている | +| OK | 外部消費者がドメインレベルの抽象のみを通じて対話する | + **関数設計** - 1関数1責務になっているか @@ -299,19 +319,18 @@ TODOが許容される唯一のケース: ## DRY違反の検出 -重複コードを検出する。 +基本的に重複は排除する。本質的に同じロジックであり、まとめるべきと判断したら DRY にする。回数で機械的に判断しない。 | パターン | 判定 | |---------|------| -| 同じロジックが3箇所以上 | 即REJECT - 関数/メソッドに抽出 | -| 同じバリデーションが2箇所以上 | 即REJECT - バリデーター関数に抽出 | -| 似たようなコンポーネントが3個以上 | 即REJECT - 共通コンポーネント化 | -| コピペで派生したコード | 即REJECT - パラメータ化または抽象化 | +| 本質的に同じロジックの重複 | REJECT - 関数/メソッドに抽出 | +| 同じバリデーションの重複 | REJECT - バリデーター関数に抽出 | 
+| 本質的に同じ構造のコンポーネント | REJECT - 共通コンポーネント化 | +| コピペで派生したコード | REJECT - パラメータ化または抽象化 | -AHA原則(Avoid Hasty Abstractions)とのバランス: -- 2回の重複 → 様子見 -- 3回の重複 → 即抽出 -- ドメインが異なる重複 → 抽象化しない(例: 顧客用バリデーションと管理者用バリデーションは別物) +DRY にしないケース: +- ドメインが異なる重複は抽象化しない(例: 顧客用バリデーションと管理者用バリデーションは別物) +- 表面的に似ているが、変更理由が異なるコードは別物として扱う ## 仕様準拠の検証 diff --git a/builtins/ja/knowledge/frontend.md b/builtins/ja/knowledge/frontend.md index 1860932..621a437 100644 --- a/builtins/ja/knowledge/frontend.md +++ b/builtins/ja/knowledge/frontend.md @@ -369,6 +369,54 @@ export function StepperButton(props) { - 追加したvariantが元のコンポーネントの用途と明らかに違う - 使う側のprops指定が複雑になる +### テーマ差分とデザイントークン + +同じ機能コンポーネントを再利用しつつ見た目だけ変える場合は、デザイントークン + テーマスコープで管理する。 + +原則: +- 色・余白・角丸・影・タイポをトークン(CSS Variables)として定義する +- 画面/ロール別の差分はテーマスコープ(例: `.consumer-theme`, `.admin-theme`)で上書きする +- コンポーネント内に16進カラー値(`#xxxxxx`)を直書きしない +- ロジック差分(API・状態管理)と見た目差分(トークン)を混在させない + +```css +/* tokens.css */ +:root { + --color-bg-page: #f3f4f6; + --color-surface: #ffffff; + --color-text-primary: #1f2937; + --color-border: #d1d5db; + --color-accent: #2563eb; +} + +.consumer-theme { + --color-bg-page: #f7f8fa; + --color-accent: #4daca1; +} +``` + +```tsx +// same component, different look by scope +
+ +
+``` + +運用ルール: +- 共通UI(Button/Card/Input/Tabs)はトークン参照のみで実装する +- feature側はテーマ共通クラス(例: `surface`, `title`, `chip`)を利用し、装飾ロジックを重複させない +- 追加テーマ実装時は「トークン追加 → スコープ上書き → 既存コンポーネント流用」の順で進める + +レビュー観点: +- 直書き色・直書き余白のコピペがないか +- 同一UIパターンがテーマごとに別コンポーネント化されていないか +- 見た目変更のためにデータ取得/状態管理が改変されていないか + +NG例: +- 見た目差分のために `ButtonConsumer`, `ButtonAdmin` を乱立 +- featureコンポーネントごとに色を直書き +- テーマ切り替えのたびにAPIレスポンス整形ロジックを変更 + ## 抽象化レベルの評価 ### 条件分岐の肥大化検出 diff --git a/builtins/ja/personas/coder.md b/builtins/ja/personas/coder.md index 45d83e6..ad81de3 100644 --- a/builtins/ja/personas/coder.md +++ b/builtins/ja/personas/coder.md @@ -33,4 +33,5 @@ - 設計判断を勝手にする → 報告して判断を仰ぐ - レビュワーの指摘を軽視する → 禁止 - 後方互換・Legacy 対応を勝手に追加する → 絶対禁止 +- リファクタリングで置き換えたコード・エクスポートを残す → 禁止(明示的に残すよう指示されない限り削除する) - 根本原因を修正した上で安全機構を迂回するワークアラウンドを重ねる → 禁止 diff --git a/builtins/ja/piece-categories.yaml b/builtins/ja/piece-categories.yaml index 5858f9b..6daf434 100644 --- a/builtins/ja/piece-categories.yaml +++ b/builtins/ja/piece-categories.yaml @@ -9,7 +9,10 @@ piece_categories: 🎨 フロントエンド: pieces: - frontend - ⚙️ バックエンド: {} + ⚙️ バックエンド: + pieces: + - backend + - backend-cqrs 🔧 エキスパート: フルスタック: pieces: diff --git a/builtins/ja/pieces/backend-cqrs.yaml b/builtins/ja/pieces/backend-cqrs.yaml new file mode 100644 index 0000000..efd269b --- /dev/null +++ b/builtins/ja/pieces/backend-cqrs.yaml @@ -0,0 +1,267 @@ +name: backend-cqrs +description: CQRS+ES・セキュリティ・QA専門家レビュー +max_movements: 30 +initial_movement: plan +movements: + - name: plan + edit: false + persona: planner + allowed_tools: + - Read + - Glob + - Grep + - Bash + - WebSearch + - WebFetch + instruction: plan + rules: + - condition: タスク分析と計画が完了した + next: implement + - condition: 要件が不明確で計画を立てられない + next: ABORT + output_contracts: + report: + - name: 00-plan.md + format: plan + - name: implement + edit: true + persona: coder + policy: + - coding + - testing + session: refresh + knowledge: + - backend + - cqrs-es + - security + - architecture + allowed_tools: + - Read + - Glob + - 
Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: implement + rules: + - condition: 実装が完了した + next: ai_review + - condition: 実装未着手(レポートのみ) + next: ai_review + - condition: 実装を進行できない + next: ai_review + - condition: ユーザー入力が必要 + next: implement + requires_user_input: true + interactive_only: true + output_contracts: + report: + - Scope: 01-coder-scope.md + - Decisions: 02-coder-decisions.md + - name: ai_review + edit: false + persona: ai-antipattern-reviewer + policy: + - review + - ai-antipattern + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + instruction: ai-review + rules: + - condition: AI特有の問題が見つからない + next: reviewers + - condition: AI特有の問題が検出された + next: ai_fix + output_contracts: + report: + - name: 03-ai-review.md + format: ai-review + - name: ai_fix + edit: true + persona: coder + policy: + - coding + - testing + session: refresh + knowledge: + - backend + - cqrs-es + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: ai-fix + rules: + - condition: AI Reviewerの指摘に対する修正が完了した + next: ai_review + - condition: 修正不要(指摘対象ファイル/仕様の確認済み) + next: ai_no_fix + - condition: 修正を進行できない + next: ai_no_fix + - name: ai_no_fix + edit: false + persona: architecture-reviewer + policy: review + allowed_tools: + - Read + - Glob + - Grep + rules: + - condition: ai_reviewの指摘が妥当(修正すべき) + next: ai_fix + - condition: ai_fixの判断が妥当(修正不要) + next: reviewers + instruction: arbitrate + - name: reviewers + parallel: + - name: cqrs-es-review + edit: false + persona: cqrs-es-reviewer + policy: review + knowledge: + - cqrs-es + - backend + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-cqrs-es + output_contracts: + report: + - name: 04-cqrs-es-review.md + format: cqrs-es-review + - name: security-review + edit: false + persona: security-reviewer + policy: 
review + knowledge: security + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-security + output_contracts: + report: + - name: 05-security-review.md + format: security-review + - name: qa-review + edit: false + persona: qa-reviewer + policy: + - review + - qa + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-qa + output_contracts: + report: + - name: 06-qa-review.md + format: qa-review + rules: + - condition: all("approved") + next: supervise + - condition: any("needs_fix") + next: fix + - name: fix + edit: true + persona: coder + policy: + - coding + - testing + knowledge: + - backend + - cqrs-es + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + permission_mode: edit + rules: + - condition: 修正が完了した + next: reviewers + - condition: 修正を進行できない + next: plan + instruction: fix + - name: supervise + edit: false + persona: expert-supervisor + policy: review + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + instruction: supervise + rules: + - condition: すべての検証が完了し、マージ可能な状態である + next: COMPLETE + - condition: 問題が検出された + next: fix_supervisor + output_contracts: + report: + - Validation: 07-supervisor-validation.md + - Summary: summary.md + - name: fix_supervisor + edit: true + persona: coder + policy: + - coding + - testing + knowledge: + - backend + - cqrs-es + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: fix-supervisor + rules: + - condition: 監督者の指摘に対する修正が完了した + next: supervise + - condition: 修正を進行できない + next: plan diff --git a/builtins/ja/pieces/backend.yaml b/builtins/ja/pieces/backend.yaml new file mode 100644 index 0000000..1ac6ea6 --- /dev/null +++ 
b/builtins/ja/pieces/backend.yaml @@ -0,0 +1,263 @@ +name: backend +description: バックエンド・セキュリティ・QA専門家レビュー +max_movements: 30 +initial_movement: plan +movements: + - name: plan + edit: false + persona: planner + allowed_tools: + - Read + - Glob + - Grep + - Bash + - WebSearch + - WebFetch + instruction: plan + rules: + - condition: タスク分析と計画が完了した + next: implement + - condition: 要件が不明確で計画を立てられない + next: ABORT + output_contracts: + report: + - name: 00-plan.md + format: plan + - name: implement + edit: true + persona: coder + policy: + - coding + - testing + session: refresh + knowledge: + - backend + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: implement + rules: + - condition: 実装が完了した + next: ai_review + - condition: 実装未着手(レポートのみ) + next: ai_review + - condition: 実装を進行できない + next: ai_review + - condition: ユーザー入力が必要 + next: implement + requires_user_input: true + interactive_only: true + output_contracts: + report: + - Scope: 01-coder-scope.md + - Decisions: 02-coder-decisions.md + - name: ai_review + edit: false + persona: ai-antipattern-reviewer + policy: + - review + - ai-antipattern + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + instruction: ai-review + rules: + - condition: AI特有の問題が見つからない + next: reviewers + - condition: AI特有の問題が検出された + next: ai_fix + output_contracts: + report: + - name: 03-ai-review.md + format: ai-review + - name: ai_fix + edit: true + persona: coder + policy: + - coding + - testing + session: refresh + knowledge: + - backend + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: ai-fix + rules: + - condition: AI Reviewerの指摘に対する修正が完了した + next: ai_review + - condition: 修正不要(指摘対象ファイル/仕様の確認済み) + next: ai_no_fix + - condition: 修正を進行できない + next: ai_no_fix + - name: ai_no_fix + edit: false + persona: architecture-reviewer + policy: review + 
allowed_tools: + - Read + - Glob + - Grep + rules: + - condition: ai_reviewの指摘が妥当(修正すべき) + next: ai_fix + - condition: ai_fixの判断が妥当(修正不要) + next: reviewers + instruction: arbitrate + - name: reviewers + parallel: + - name: arch-review + edit: false + persona: architecture-reviewer + policy: review + knowledge: + - architecture + - backend + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-arch + output_contracts: + report: + - name: 04-architect-review.md + format: architecture-review + - name: security-review + edit: false + persona: security-reviewer + policy: review + knowledge: security + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-security + output_contracts: + report: + - name: 05-security-review.md + format: security-review + - name: qa-review + edit: false + persona: qa-reviewer + policy: + - review + - qa + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + rules: + - condition: approved + - condition: needs_fix + instruction: review-qa + output_contracts: + report: + - name: 06-qa-review.md + format: qa-review + rules: + - condition: all("approved") + next: supervise + - condition: any("needs_fix") + next: fix + - name: fix + edit: true + persona: coder + policy: + - coding + - testing + knowledge: + - backend + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + permission_mode: edit + rules: + - condition: 修正が完了した + next: reviewers + - condition: 修正を進行できない + next: plan + instruction: fix + - name: supervise + edit: false + persona: expert-supervisor + policy: review + allowed_tools: + - Read + - Glob + - Grep + - WebSearch + - WebFetch + instruction: supervise + rules: + - condition: すべての検証が完了し、マージ可能な状態である + next: COMPLETE + - condition: 問題が検出された + next: 
fix_supervisor + output_contracts: + report: + - Validation: 07-supervisor-validation.md + - Summary: summary.md + - name: fix_supervisor + edit: true + persona: coder + policy: + - coding + - testing + knowledge: + - backend + - security + - architecture + allowed_tools: + - Read + - Glob + - Grep + - Edit + - Write + - Bash + - WebSearch + - WebFetch + instruction: fix-supervisor + rules: + - condition: 監督者の指摘に対する修正が完了した + next: supervise + - condition: 修正を進行できない + next: plan diff --git a/builtins/ja/policies/coding.md b/builtins/ja/policies/coding.md index 3ec45f3..dafb098 100644 --- a/builtins/ja/policies/coding.md +++ b/builtins/ja/policies/coding.md @@ -7,7 +7,7 @@ | 原則 | 基準 | |------|------| | Simple > Easy | 書きやすさより読みやすさを優先 | -| DRY | 3回重複したら抽出 | +| DRY | 本質的な重複は排除する | | コメント | Why のみ。What/How は書かない | | 関数サイズ | 1関数1責務。30行目安 | | ファイルサイズ | 目安として300行。タスクに応じて柔軟に | @@ -245,23 +245,19 @@ Request → toInput() → UseCase/Service → Output → Response.from() ## 共通化の判断 -### 3回ルール - -- 1回目: そのまま書く -- 2回目: まだ共通化しない(様子見) -- 3回目: 共通化を検討 +基本的に重複は排除する。本質的に同じロジックであり、まとめるべきと判断したら DRY にする。回数で機械的に判断しない。 ### 共通化すべきもの -- 同じ処理が3箇所以上 +- 本質的に同じロジックの重複 - 同じスタイル/UIパターン - 同じバリデーションロジック - 同じフォーマット処理 ### 共通化すべきでないもの -- 似ているが微妙に違うもの(無理に汎用化すると複雑化) -- 1-2箇所しか使わないもの +- ドメインが異なる重複(例: 顧客用バリデーションと管理者用バリデーションは別物) +- 表面的に似ているが変更理由が異なるコード - 「将来使うかも」という予測に基づくもの ```typescript @@ -289,4 +285,6 @@ function formatPercentage(value: number): string { ... 
} - **機密情報のハードコーディング** - **各所でのtry-catch** - エラーは上位層で一元処理 - **後方互換・Legacy対応の自発的追加** - 明示的な指示がない限り不要 +- **内部実装のパブリック API エクスポート** - 公開するのはドメイン操作の関数・型のみ。インフラ層の関数や内部クラスをエクスポートしない +- **リファクタリング後の旧コード残存** - 置き換えたコード・エクスポートは削除する。明示的に残すよう指示されない限り残さない - **安全機構を迂回するワークアラウンド** - 根本修正が正しいなら追加の迂回は不要 diff --git a/builtins/ja/policies/review.md b/builtins/ja/policies/review.md index 1abdfdb..2921def 100644 --- a/builtins/ja/policies/review.md +++ b/builtins/ja/policies/review.md @@ -38,9 +38,11 @@ - オブジェクト/配列の直接変更 - エラーの握りつぶし(空の catch) - TODO コメント(Issue化されていないもの) -- 3箇所以上の重複コード(DRY違反) +- 本質的に同じロジックの重複(DRY違反) - 同じことをするメソッドの増殖(構成の違いで吸収すべき) - 特定実装の汎用層への漏洩(汎用層に特定実装のインポート・分岐がある) +- 内部実装のパブリック API エクスポート(インフラ層の関数・内部クラスが公開されている) +- リファクタリングで置き換えられた旧コード・旧エクスポートの残存 - 関連フィールドのクロスバリデーション欠如(意味的に結合した設定値の不変条件が未検証) ### Warning(警告) diff --git a/builtins/schemas/decomposition.json b/builtins/schemas/decomposition.json new file mode 100644 index 0000000..e9116c7 --- /dev/null +++ b/builtins/schemas/decomposition.json @@ -0,0 +1,33 @@ +{ + "type": "object", + "properties": { + "parts": { + "type": "array", + "items": { + "type": "object", + "properties": { + "id": { + "type": "string", + "description": "Unique part identifier" + }, + "title": { + "type": "string", + "description": "Human-readable part title" + }, + "instruction": { + "type": "string", + "description": "Instruction for the part agent" + }, + "timeout_ms": { + "type": ["integer", "null"], + "description": "Optional timeout in ms" + } + }, + "required": ["id", "title", "instruction", "timeout_ms"], + "additionalProperties": false + } + } + }, + "required": ["parts"], + "additionalProperties": false +} diff --git a/builtins/schemas/evaluation.json b/builtins/schemas/evaluation.json new file mode 100644 index 0000000..1526206 --- /dev/null +++ b/builtins/schemas/evaluation.json @@ -0,0 +1,15 @@ +{ + "type": "object", + "properties": { + "matched_index": { + "type": "integer", + "description": "Matched condition number (1-based)" + }, + 
"reason": { + "type": "string", + "description": "Why this condition was matched" + } + }, + "required": ["matched_index", "reason"], + "additionalProperties": false +} diff --git a/builtins/schemas/judgment.json b/builtins/schemas/judgment.json new file mode 100644 index 0000000..a8d6aed --- /dev/null +++ b/builtins/schemas/judgment.json @@ -0,0 +1,15 @@ +{ + "type": "object", + "properties": { + "step": { + "type": "integer", + "description": "Matched rule number (1-based)" + }, + "reason": { + "type": "string", + "description": "Brief justification for the decision" + } + }, + "required": ["step", "reason"], + "additionalProperties": false +} diff --git a/docs/README.ja.md b/docs/README.ja.md index a5186a5..c67bb76 100644 --- a/docs/README.ja.md +++ b/docs/README.ja.md @@ -474,6 +474,8 @@ TAKTには複数のビルトインピースが同梱されています: | `unit-test` | ユニットテスト重視ピース: テスト分析 → テスト実装 → レビュー → 修正。 | | `e2e-test` | E2Eテスト重視ピース: E2E分析 → E2E実装 → レビュー → 修正(VitestベースのE2Eフロー)。 | | `frontend` | フロントエンド特化開発ピース: React/Next.js 向けのレビューとナレッジ注入。 | +| `backend` | バックエンド特化開発ピース: バックエンド、セキュリティ、QA の専門家レビュー。 | +| `backend-cqrs` | CQRS+ES 特化バックエンド開発ピース: CQRS+ES、セキュリティ、QA の専門家レビュー。 | **ペルソナ別プロバイダー設定:** 設定ファイルの `persona_providers` で、特定のペルソナを異なるプロバイダーにルーティングできます(例: coder は Codex、レビュアーは Claude)。ピースを複製する必要はありません。 diff --git a/docs/implements/structured-output.ja.md b/docs/implements/structured-output.ja.md new file mode 100644 index 0000000..660fbe3 --- /dev/null +++ b/docs/implements/structured-output.ja.md @@ -0,0 +1,127 @@ +# Structured Output — Phase 3 ステータス判定 + +## 概要 + +Phase 3(ステータス判定)において、エージェントの出力を structured output(JSON スキーマ)で取得し、ルールマッチングの精度と信頼性を向上させる。 + +## プロバイダ別の挙動 + +| プロバイダ | メソッド | 仕組み | +|-----------|---------|--------| +| Claude | `structured_output` | SDK が `StructuredOutput` ツールを自動追加。エージェントがツール経由で `{ step, reason }` を返す | +| Codex | `structured_output` | `TurnOptions.outputSchema` で API レベルの JSON 制約。テキストが JSON になる | +| OpenCode | `structured_output` | プロンプト末尾に JSON スキーマ付き出力指示を注入。テキストレスポンスから 
`parseStructuredOutput()` で JSON を抽出 | + +## フォールバックチェーン + +`judgeStatus()` は3段階の独立した LLM 呼び出しでルールをマッチする。 + +``` +Stage 1: structured_output — outputSchema 付き LLM 呼び出し → structuredOutput.step(1-based integer) +Stage 2: phase3_tag — outputSchema なし LLM 呼び出し → content 内の [MOVEMENT:N] タグ検出 +Stage 3: ai_judge — evaluateCondition() による AI 条件評価 +``` + +各ステージは専用のインストラクションで LLM に問い合わせる。Stage 1 は「ルール番号を JSON で返せ」、Stage 2 は「タグを1行で出力せよ」と聞き方が異なる。 + +セッションログには `toJudgmentMatchMethod()` で変換された値が記録される。 + +| 内部メソッド | セッションログ | +|-------------|--------------| +| `structured_output` | `structured_output` | +| `phase3_tag` / `phase1_tag` | `tag_fallback` | +| `ai_judge` / `ai_judge_fallback` | `ai_judge` | + +## インストラクション分岐 + +Phase 3 テンプレート(`perform_phase3_message`)は `structuredOutput` フラグで2つのモードを持つ。 + +### Structured Output モード(`structuredOutput: true`) + +主要指示: ルール番号(1-based)と理由を返せ。 +フォールバック指示: structured output が使えない場合はタグを出力せよ。 + +### タグモード(`structuredOutput: false`) + +従来の指示: 対応するタグを1行で出力せよ。 + +現在、Phase 3 は常に `structuredOutput: true` で実行される。 + +## アーキテクチャ + +``` +StatusJudgmentBuilder + └─ structuredOutput: true + ├─ criteriaTable: ルール条件テーブル(常に含む) + ├─ outputList: タグ一覧(フォールバック用に含む) + └─ テンプレート: "ルール番号と理由を返せ + タグはフォールバック" + +runStatusJudgmentPhase() + └─ judgeStatus() → JudgeStatusResult { ruleIndex, method } + └─ StatusJudgmentPhaseResult { tag, ruleIndex, method } + +MovementExecutor + ├─ Phase 3 あり → judgeStatus の結果を直接使用(method 伝搬) + └─ Phase 3 なし → detectMatchedRule() で Phase 1 コンテンツから検出 +``` + +## JSON スキーマ + +### judgment.json(judgeStatus 用) + +```json +{ + "type": "object", + "properties": { + "step": { "type": "integer", "description": "Matched rule number (1-based)" }, + "reason": { "type": "string", "description": "Brief justification" } + }, + "required": ["step", "reason"], + "additionalProperties": false +} +``` + +### evaluation.json(evaluateCondition 用) + +```json +{ + "type": "object", + "properties": { + "matched_index": { "type": "integer" }, + "reason": { 
"type": "string" } + }, + "required": ["matched_index", "reason"], + "additionalProperties": false +} +``` + +## parseStructuredOutput() — JSON 抽出 + +Codex と OpenCode はテキストレスポンスから JSON を抽出する。3段階のフォールバック戦略を持つ。 + +``` +1. Direct parse — テキスト全体が `{` で始まる JSON オブジェクト +2. Code block — ```json ... ``` または ``` ... ``` 内の JSON +3. Brace extraction — テキスト内の最初の `{` から最後の `}` までを切り出し +``` + +## OpenCode 固有の仕組み + +OpenCode SDK は `outputFormat` を型定義でサポートしていない。代わりにプロンプト末尾に JSON 出力指示を注入する。 + +``` +--- +IMPORTANT: You MUST respond with ONLY a valid JSON object matching this schema. No other text, no markdown code blocks, no explanation. +```json +{ "type": "object", ... } +``` +``` + +エージェントが返すテキストを `parseStructuredOutput()` でパースし、`AgentResponse.structuredOutput` に格納する。 + +## 注意事項 + +- OpenAI API(Codex)は `required` に全プロパティを含めないとエラーになる(`additionalProperties: false` 時) +- Codex SDK の `TurnCompletedEvent` には `finalResponse` フィールドがない。structured output は `AgentMessageItem.text` の JSON テキストから `parseStructuredOutput()` でパースする +- Claude SDK は `StructuredOutput` ツール方式のため、インストラクションでタグ出力を強調しすぎるとエージェントがツールを呼ばずタグを出力してしまう +- OpenCode のプロンプト注入方式はモデルの指示従順性に依存する。JSON 以外のテキストが混在する場合は `parseStructuredOutput()` の code block / brace extraction で回収する diff --git a/docs/report-phase-permissions.md b/docs/report-phase-permissions.md new file mode 100644 index 0000000..38b1e95 --- /dev/null +++ b/docs/report-phase-permissions.md @@ -0,0 +1,34 @@ +# Report Phase Permissions Design + +## Summary + +The report phase now uses permission mode as the primary control surface. +Call sites only provide resume metadata (for example, `maxTurns`), and tool compatibility details are isolated inside `OptionsBuilder`. + +## Problem + +Historically, report phase calls passed `allowedTools: []` directly from `phase-runner`. +This made phase control depend on a tool list setting that is treated as legacy in OpenCode. + +## Design + +1. `phase-runner` uses `buildResumeOptions(step, sessionId, { maxTurns })`. +2. 
`OptionsBuilder.buildResumeOptions` enforces: + - `permissionMode: 'readonly'` + - `allowedTools: []` (compatibility layer for SDK behavior differences) +3. OpenCode-specific execution is controlled by permission rules (`readonly` => deny). + +## Rationale + +- OpenCode permission rules are the stable and explicit control mechanism for report-phase safety. +- Centralizing compatibility behavior in `OptionsBuilder` prevents policy leakage into movement orchestration code. +- Resume-session behavior remains deterministic for both report and status phases. + +## Test Coverage + +- `src/__tests__/options-builder.test.ts` + - verifies report/status resume options force `readonly` and empty tools. +- `src/__tests__/phase-runner-report-history.test.ts` + - verifies report phase passes only `{ maxTurns: 3 }` override. +- `src/__tests__/opencode-types.test.ts` + - verifies readonly maps to deny in OpenCode permission config. diff --git a/docs/testing/e2e.md b/docs/testing/e2e.md index cc996b3..b536ab0 100644 --- a/docs/testing/e2e.md +++ b/docs/testing/e2e.md @@ -17,7 +17,8 @@ E2Eテストを追加・変更した場合は、このドキュメントも更 ## E2E用config.yaml - E2Eのグローバル設定は `e2e/fixtures/config.e2e.yaml` を基準に生成する。 - `createIsolatedEnv()` は毎回一時ディレクトリ配下(`$TAKT_CONFIG_DIR/config.yaml`)にこの基準設定を書き出す。 -- 通知音は `notification_sound_events` でタイミング別に制御し、E2E既定では道中(`iteration_limit` / `piece_complete` / `piece_abort`)をOFF、全体終了時(`run_complete` / `run_abort`)のみONにする。 +- E2E実行中の `takt` 内通知音は `notification_sound: false` で無効化する。 +- `npm run test:e2e` は成否にかかわらず最後に1回ベルを鳴らし、終了コードはテスト結果を維持する。 - 各スペックで `provider` や `concurrency` を変更する場合は、`updateIsolatedConfig()` を使って差分のみ上書きする。 - `~/.takt/config.yaml` はE2Eでは参照されないため、通常実行の設定には影響しない。 diff --git a/e2e/fixtures/config.e2e.yaml b/e2e/fixtures/config.e2e.yaml index 6eea1b8..fca15ce 100644 --- a/e2e/fixtures/config.e2e.yaml +++ b/e2e/fixtures/config.e2e.yaml @@ -2,10 +2,10 @@ provider: claude language: en log_level: info default_piece: default -notification_sound: true +notification_sound: 
false notification_sound_events: iteration_limit: false piece_complete: false piece_abort: false run_complete: true - run_abort: true + run_abort: false diff --git a/e2e/fixtures/pieces/mock-cycle-detect.yaml b/e2e/fixtures/pieces/mock-cycle-detect.yaml new file mode 100644 index 0000000..c98ea48 --- /dev/null +++ b/e2e/fixtures/pieces/mock-cycle-detect.yaml @@ -0,0 +1,37 @@ +name: e2e-cycle-detect +description: Piece with loop_monitors for cycle detection E2E testing + +max_movements: 20 +initial_movement: review + +loop_monitors: + - cycle: [review, fix] + threshold: 2 + judge: + persona: ../agents/test-reviewer-b.md + rules: + - condition: continue + next: review + - condition: abort_loop + next: ABORT + +movements: + - name: review + persona: ../agents/test-reviewer-a.md + instruction_template: | + Review the code. + rules: + - condition: approved + next: COMPLETE + - condition: needs_fix + next: fix + + - name: fix + persona: ../agents/test-coder.md + edit: true + permission_mode: edit + instruction_template: | + Fix the issues found in review. + rules: + - condition: fixed + next: review diff --git a/e2e/fixtures/pieces/structured-output.yaml b/e2e/fixtures/pieces/structured-output.yaml new file mode 100644 index 0000000..fcb7280 --- /dev/null +++ b/e2e/fixtures/pieces/structured-output.yaml @@ -0,0 +1,18 @@ +name: e2e-structured-output +description: E2E piece to verify structured output rule matching + +max_movements: 5 + +movements: + - name: execute + edit: false + persona: ../agents/test-coder.md + permission_mode: readonly + instruction_template: | + Reply with exactly: "Task completed successfully." + Do not do anything else. 
+ rules: + - condition: Task completed + next: COMPLETE + - condition: Task failed + next: ABORT diff --git a/e2e/fixtures/scenarios/cycle-detect-abort.json b/e2e/fixtures/scenarios/cycle-detect-abort.json new file mode 100644 index 0000000..c4d6b1a --- /dev/null +++ b/e2e/fixtures/scenarios/cycle-detect-abort.json @@ -0,0 +1,13 @@ +[ + {"persona": "agents/test-reviewer-a", "status": "done", "content": "[REVIEW:2]\n\nNeeds fix."}, + {"persona": "conductor", "status": "done", "content": "[REVIEW:2]"}, + {"persona": "conductor", "status": "done", "content": "[REVIEW:2]"}, + {"persona": "agents/test-coder", "status": "done", "content": "[FIX:1]\n\nFixed."}, + {"persona": "agents/test-reviewer-a", "status": "done", "content": "[REVIEW:2]\n\nStill needs fix."}, + {"persona": "conductor", "status": "done", "content": "[REVIEW:2]"}, + {"persona": "conductor", "status": "done", "content": "[REVIEW:2]"}, + {"persona": "agents/test-coder", "status": "done", "content": "[FIX:1]\n\nFixed again."}, + {"persona": "agents/test-reviewer-b", "status": "done", "content": "[_LOOP_JUDGE_REVIEW_FIX:2]\n\nAbort this loop."}, + {"persona": "conductor", "status": "done", "content": "[_LOOP_JUDGE_REVIEW_FIX:2]"}, + {"persona": "conductor", "status": "done", "content": "[_LOOP_JUDGE_REVIEW_FIX:2]"} +] diff --git a/e2e/fixtures/scenarios/cycle-detect-pass.json b/e2e/fixtures/scenarios/cycle-detect-pass.json new file mode 100644 index 0000000..999ece9 --- /dev/null +++ b/e2e/fixtures/scenarios/cycle-detect-pass.json @@ -0,0 +1,9 @@ +[ + {"persona": "agents/test-reviewer-a", "status": "done", "content": "[REVIEW:2]\n\nNeeds fix."}, + {"persona": "conductor", "status": "done", "content": "[REVIEW:2]"}, + {"persona": "conductor", "status": "done", "content": "[REVIEW:2]"}, + {"persona": "agents/test-coder", "status": "done", "content": "[FIX:1]\n\nFixed."}, + {"persona": "agents/test-reviewer-a", "status": "done", "content": "[REVIEW:1]\n\nApproved."}, + {"persona": "conductor", "status": 
"done", "content": "[REVIEW:1]"}, + {"persona": "conductor", "status": "done", "content": "[REVIEW:1]"} +] diff --git a/e2e/fixtures/scenarios/multi-step-all-approved.json b/e2e/fixtures/scenarios/multi-step-all-approved.json index 5392a8b..bb38ddc 100644 --- a/e2e/fixtures/scenarios/multi-step-all-approved.json +++ b/e2e/fixtures/scenarios/multi-step-all-approved.json @@ -1,7 +1,9 @@ [ - { "persona": "test-coder", "status": "done", "content": "Plan created." }, - { "persona": "test-reviewer-a", "status": "done", "content": "Architecture approved." }, - { "persona": "test-reviewer-b", "status": "done", "content": "Security approved." }, + { "persona": "agents/test-coder", "status": "done", "content": "Plan created." }, + { "persona": "agents/test-reviewer-a", "status": "done", "content": "Architecture approved." }, + { "persona": "agents/test-reviewer-b", "status": "done", "content": "Security approved." }, + { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:1]" }, + { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:1]" }, { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:1]" }, { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:1]" } ] diff --git a/e2e/fixtures/scenarios/multi-step-needs-fix.json b/e2e/fixtures/scenarios/multi-step-needs-fix.json index 52b595d..fda74e3 100644 --- a/e2e/fixtures/scenarios/multi-step-needs-fix.json +++ b/e2e/fixtures/scenarios/multi-step-needs-fix.json @@ -1,15 +1,19 @@ [ - { "persona": "test-coder", "status": "done", "content": "Plan created." }, + { "persona": "agents/test-coder", "status": "done", "content": "Plan created." }, - { "persona": "test-reviewer-a", "status": "done", "content": "Architecture looks good." }, - { "persona": "test-reviewer-b", "status": "done", "content": "Security issues found." 
}, + { "persona": "agents/test-reviewer-a", "status": "done", "content": "Architecture looks good." }, + { "persona": "agents/test-reviewer-b", "status": "done", "content": "Security issues found." }, + { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:2]" }, + { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:2]" }, { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:2]" }, { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:2]" }, - { "persona": "test-coder", "status": "done", "content": "Fix applied." }, + { "persona": "agents/test-coder", "status": "done", "content": "Fix applied." }, - { "persona": "test-reviewer-a", "status": "done", "content": "Architecture still approved." }, - { "persona": "test-reviewer-b", "status": "done", "content": "Security now approved." }, + { "persona": "agents/test-reviewer-a", "status": "done", "content": "Architecture still approved." }, + { "persona": "agents/test-reviewer-b", "status": "done", "content": "Security now approved." }, + { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:1]" }, + { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:1]" }, { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:1]" }, { "persona": "conductor", "status": "done", "content": "[ARCH-REVIEW:1] [SECURITY-REVIEW:1]" } ] diff --git a/e2e/fixtures/scenarios/report-judge.json b/e2e/fixtures/scenarios/report-judge.json index 7277cc2..57d7c66 100644 --- a/e2e/fixtures/scenarios/report-judge.json +++ b/e2e/fixtures/scenarios/report-judge.json @@ -1,14 +1,19 @@ [ { - "persona": "test-reporter", + "persona": "agents/test-reporter", "status": "done", "content": "Work completed." 
}, { - "persona": "test-reporter", + "persona": "agents/test-reporter", "status": "done", "content": "Report summary: OK" }, + { + "persona": "conductor", + "status": "done", + "content": "[EXECUTE:1]" + }, { "persona": "conductor", "status": "done", diff --git a/e2e/helpers/session-log.ts b/e2e/helpers/session-log.ts new file mode 100644 index 0000000..4d458f1 --- /dev/null +++ b/e2e/helpers/session-log.ts @@ -0,0 +1,26 @@ +import { readdirSync, readFileSync } from 'node:fs'; +import { join } from 'node:path'; + +/** + * Read session NDJSON log records from a piece execution run. + * Finds the first .jsonl file whose first record is `piece_start`. + */ +export function readSessionRecords(repoPath: string): Array<Record<string, unknown>> { + const runsDir = join(repoPath, '.takt', 'runs'); + const runDirs = readdirSync(runsDir).sort(); + + for (const runDir of runDirs) { + const logsDir = join(runsDir, runDir, 'logs'); + const logFiles = readdirSync(logsDir).filter((file) => file.endsWith('.jsonl')); + for (const file of logFiles) { + const content = readFileSync(join(logsDir, file), 'utf-8').trim(); + if (!content) continue; + const records = content.split('\n').map((line) => JSON.parse(line) as Record<string, unknown>); + if (records[0]?.type === 'piece_start') { + return records; + } + } + } + + throw new Error('Session NDJSON log not found'); +} diff --git a/e2e/helpers/test-repo.ts b/e2e/helpers/test-repo.ts index 8c57f4e..35cd4f1 100644 --- a/e2e/helpers/test-repo.ts +++ b/e2e/helpers/test-repo.ts @@ -1,9 +1,13 @@ -import { rmSync } from 'node:fs'; -import { mkdtempSync } from 'node:fs'; +import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; import { join } from 'node:path'; import { tmpdir } from 'node:os'; import { execFileSync } from 'node:child_process'; +export interface LocalRepo { + path: string; + cleanup: () => void; +} + export interface TestRepo { path: string; repoName: string; @@ -11,6 +15,26 @@ export interface TestRepo { cleanup: () => void; } +/** + * Create a local git
repository in a temporary directory. + * Use this for tests that don't need a remote (GitHub) repository. + */ +export function createLocalRepo(): LocalRepo { + const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-')); + execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); + execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); + execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); + writeFileSync(join(repoPath, 'README.md'), '# test\n'); + execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); + execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); + return { + path: repoPath, + cleanup: () => { + rmSync(repoPath, { recursive: true, force: true }); + }, + }; +} + export interface CreateTestRepoOptions { /** Skip creating a test branch (stay on default branch). Use for pipeline tests. */ skipBranch?: boolean; @@ -30,6 +54,66 @@ function getGitHubUser(): string { return user; } +function canUseGitHubRepo(): boolean { + try { + const user = getGitHubUser(); + const repoName = `${user}/takt-testing`; + execFileSync('gh', ['repo', 'view', repoName], { + encoding: 'utf-8', + stdio: 'pipe', + }); + return true; + } catch { + return false; + } +} + +export function isGitHubE2EAvailable(): boolean { + return canUseGitHubRepo(); +} + +function createOfflineTestRepo(options?: CreateTestRepoOptions): TestRepo { + const sandboxPath = mkdtempSync(join(tmpdir(), 'takt-e2e-repo-')); + const originPath = join(sandboxPath, 'origin.git'); + const repoPath = join(sandboxPath, 'work'); + + execFileSync('git', ['init', '--bare', originPath], { stdio: 'pipe' }); + execFileSync('git', ['clone', originPath, repoPath], { stdio: 'pipe' }); + execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); + execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); + 
writeFileSync(join(repoPath, 'README.md'), '# test\n'); + execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); + execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); + execFileSync('git', ['push', '-u', 'origin', 'HEAD'], { cwd: repoPath, stdio: 'pipe' }); + + const testBranch = options?.skipBranch ? undefined : `e2e-test-${Date.now()}`; + if (testBranch) { + execFileSync('git', ['checkout', '-b', testBranch], { + cwd: repoPath, + stdio: 'pipe', + }); + } + + const currentBranch = testBranch + ?? execFileSync('git', ['branch', '--show-current'], { + cwd: repoPath, + encoding: 'utf-8', + }).trim(); + + return { + path: repoPath, + repoName: 'local/takt-testing', + branch: currentBranch, + cleanup: () => { + try { + rmSync(sandboxPath, { recursive: true, force: true }); + } catch { + // Best-effort cleanup + } + }, + }; +} + /** * Clone the takt-testing repository and create a test branch. * @@ -39,6 +123,10 @@ function getGitHubUser(): string { * 3. 
Delete local directory */ export function createTestRepo(options?: CreateTestRepoOptions): TestRepo { + if (!canUseGitHubRepo()) { + return createOfflineTestRepo(options); + } + const user = getGitHubUser(); const repoName = `${user}/takt-testing`; diff --git a/e2e/specs/add.e2e.ts b/e2e/specs/add.e2e.ts index bc7979c..f16cdce 100644 --- a/e2e/specs/add.e2e.ts +++ b/e2e/specs/add.e2e.ts @@ -9,11 +9,12 @@ import { updateIsolatedConfig, type IsolatedEnv, } from '../helpers/isolated-env'; -import { createTestRepo, type TestRepo } from '../helpers/test-repo'; +import { createTestRepo, isGitHubE2EAvailable, type TestRepo } from '../helpers/test-repo'; import { runTakt } from '../helpers/takt-runner'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); +const requiresGitHub = isGitHubE2EAvailable(); // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Add task from GitHub issue (takt add)', () => { @@ -67,7 +68,7 @@ describe('E2E: Add task from GitHub issue (takt add)', () => { } }); - it('should create a task file from issue reference', () => { + it.skipIf(!requiresGitHub)('should create a task file from issue reference', () => { const scenarioPath = resolve(__dirname, '../fixtures/scenarios/add-task.json'); const result = runTakt({ diff --git a/e2e/specs/cli-catalog.e2e.ts b/e2e/specs/cli-catalog.e2e.ts index 881cde1..074a104 100644 --- a/e2e/specs/cli-catalog.e2e.ts +++ b/e2e/specs/cli-catalog.e2e.ts @@ -1,31 +1,12 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; - -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 
'takt-e2e-catalog-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Catalog command (takt catalog)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/cli-clear.e2e.ts b/e2e/specs/cli-clear.e2e.ts index 81ccad7..09b27ab 100644 --- a/e2e/specs/cli-clear.e2e.ts +++ b/e2e/specs/cli-clear.e2e.ts @@ -1,31 +1,12 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; - -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-clear-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - 
execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Clear sessions command (takt clear)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/cli-config.e2e.ts b/e2e/specs/cli-config.e2e.ts index e51cfc4..19a6433 100644 --- a/e2e/specs/cli-config.e2e.ts +++ b/e2e/specs/cli-config.e2e.ts @@ -1,31 +1,14 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, writeFileSync, readFileSync, rmSync } from 'node:fs'; +import { readFileSync } from 'node:fs'; import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; - -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-config-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* 
best-effort */ } - }, - }; -} +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Config command (takt config)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/cli-export-cc.e2e.ts b/e2e/specs/cli-export-cc.e2e.ts index b1d771c..7181106 100644 --- a/e2e/specs/cli-export-cc.e2e.ts +++ b/e2e/specs/cli-export-cc.e2e.ts @@ -1,31 +1,15 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, writeFileSync, existsSync, readdirSync, rmSync } from 'node:fs'; +import { mkdtempSync, existsSync, readdirSync, rmSync } from 'node:fs'; import { join } from 'node:path'; import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; - -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-export-cc-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Export-cc command (takt export-cc)', () => { let isolatedEnv: IsolatedEnv; 
- let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; let fakeHome: string; beforeEach(() => { diff --git a/e2e/specs/cli-help.e2e.ts b/e2e/specs/cli-help.e2e.ts index c375f23..6d0a03e 100644 --- a/e2e/specs/cli-help.e2e.ts +++ b/e2e/specs/cli-help.e2e.ts @@ -1,31 +1,12 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; - -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-help-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Help command (takt --help)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/cli-prompt.e2e.ts b/e2e/specs/cli-prompt.e2e.ts index 47b78fe..205498d 100644 --- a/e2e/specs/cli-prompt.e2e.ts +++ b/e2e/specs/cli-prompt.e2e.ts @@ -1,36 +1,17 @@ import { describe, it, expect, beforeEach, afterEach 
} from 'vitest'; import { resolve, dirname } from 'node:path'; import { fileURLToPath } from 'node:url'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-prompt-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} - // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Prompt preview command (takt prompt)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/cli-reset-categories.e2e.ts b/e2e/specs/cli-reset-categories.e2e.ts index f53131e..701fac4 100644 --- a/e2e/specs/cli-reset-categories.e2e.ts +++ b/e2e/specs/cli-reset-categories.e2e.ts @@ -1,31 +1,14 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, writeFileSync, readFileSync, existsSync, rmSync } from 'node:fs'; +import { 
readFileSync, existsSync } from 'node:fs'; import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; - -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-reset-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Reset categories command (takt reset categories)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/cli-switch.e2e.ts b/e2e/specs/cli-switch.e2e.ts index f9d05e8..2efa979 100644 --- a/e2e/specs/cli-switch.e2e.ts +++ b/e2e/specs/cli-switch.e2e.ts @@ -1,31 +1,12 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; - -function createLocalRepo(): 
{ path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-switch-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Switch piece command (takt switch)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/cycle-detection.e2e.ts b/e2e/specs/cycle-detection.e2e.ts new file mode 100644 index 0000000..4c3503a --- /dev/null +++ b/e2e/specs/cycle-detection.e2e.ts @@ -0,0 +1,88 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { resolve, dirname } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { + createIsolatedEnv, + updateIsolatedConfig, + type IsolatedEnv, +} from '../helpers/isolated-env'; +import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; +import { readSessionRecords } from '../helpers/session-log'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// E2E更新時は docs/testing/e2e.md も更新すること +describe('E2E: Cycle detection via loop_monitors (mock)', () => { + let isolatedEnv: IsolatedEnv; + let repo: LocalRepo; + + beforeEach(() => { + 
isolatedEnv = createIsolatedEnv(); + updateIsolatedConfig(isolatedEnv.taktDir, { + provider: 'mock', + }); + repo = createLocalRepo(); + }); + + afterEach(() => { + try { repo.cleanup(); } catch { /* best-effort */ } + try { isolatedEnv.cleanup(); } catch { /* best-effort */ } + }); + + it('should abort when cycle threshold is reached and judge selects ABORT', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-cycle-detect.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/cycle-detect-abort.json'); + + const result = runTakt({ + args: [ + '--task', 'Test cycle detection abort', + '--piece', piecePath, + '--create-worktree', 'no', + '--provider', 'mock', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).not.toBe(0); + + const records = readSessionRecords(repo.path); + const judgeStep = records.find((r) => r.type === 'step_complete' && r.step === '_loop_judge_review_fix'); + const abort = records.find((r) => r.type === 'piece_abort'); + + expect(judgeStep).toBeDefined(); + expect(abort).toBeDefined(); + }, 240_000); + + it('should complete when cycle threshold is not reached', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-cycle-detect.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/cycle-detect-pass.json'); + + const result = runTakt({ + args: [ + '--task', 'Test cycle detection pass', + '--piece', piecePath, + '--create-worktree', 'no', + '--provider', 'mock', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + + const records = readSessionRecords(repo.path); + expect(records.some((r) => r.type === 'piece_complete')).toBe(true); + expect(records.some((r) => r.type === 'piece_abort')).toBe(false); + }, 240_000); +}); diff --git a/e2e/specs/eject.e2e.ts 
b/e2e/specs/eject.e2e.ts index 6ced7f3..bbb1628 100644 --- a/e2e/specs/eject.e2e.ts +++ b/e2e/specs/eject.e2e.ts @@ -1,40 +1,14 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; -import { existsSync, readFileSync, mkdirSync, writeFileSync, mkdtempSync, rmSync } from 'node:fs'; +import { existsSync, readFileSync } from 'node:fs'; import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; - -/** - * Create a minimal local git repository for eject tests. - * No GitHub access needed — just a local git init. - */ -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-eject-e2e-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - // Create initial commit so branch exists - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { - rmSync(repoPath, { recursive: true, force: true }); - } catch { - // best-effort - } - }, - }; -} +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Eject builtin pieces (takt eject)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/error-handling.e2e.ts b/e2e/specs/error-handling.e2e.ts index 9c6cb0d..92dadb7 100644 --- 
a/e2e/specs/error-handling.e2e.ts +++ b/e2e/specs/error-handling.e2e.ts @@ -1,36 +1,17 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; import { resolve, dirname } from 'node:path'; import { fileURLToPath } from 'node:url'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-error-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} - // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Error handling edge cases (mock)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/model-override.e2e.ts b/e2e/specs/model-override.e2e.ts new file mode 100644 index 0000000..89a842b --- /dev/null +++ b/e2e/specs/model-override.e2e.ts @@ -0,0 +1,74 @@ +import { describe, it, expect, beforeEach, afterEach } from 
'vitest'; +import { resolve, dirname } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; +import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// E2E更新時は docs/testing/e2e.md も更新すること +describe('E2E: --model option override (mock)', () => { + let isolatedEnv: IsolatedEnv; + let repo: LocalRepo; + + beforeEach(() => { + isolatedEnv = createIsolatedEnv(); + repo = createLocalRepo(); + }); + + afterEach(() => { + try { repo.cleanup(); } catch { /* best-effort */ } + try { isolatedEnv.cleanup(); } catch { /* best-effort */ } + }); + + it('should complete direct task execution with --model', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-single-step.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/execute-done.json'); + + const result = runTakt({ + args: [ + '--task', 'Test model override direct', + '--piece', piecePath, + '--create-worktree', 'no', + '--provider', 'mock', + '--model', 'mock-model-override', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + expect(result.stdout).toContain('Piece completed'); + }, 240_000); + + it('should complete pipeline --skip-git execution with --model', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-single-step.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/execute-done.json'); + + const result = runTakt({ + args: [ + '--pipeline', + '--task', 'Test model override pipeline', + '--piece', piecePath, + '--skip-git', + '--provider', 'mock', + '--model', 'mock-model-override', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + 
timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + expect(result.stdout).toContain('completed'); + }, 240_000); +}); diff --git a/e2e/specs/multi-step-sequential.e2e.ts b/e2e/specs/multi-step-sequential.e2e.ts new file mode 100644 index 0000000..e5f063e --- /dev/null +++ b/e2e/specs/multi-step-sequential.e2e.ts @@ -0,0 +1,57 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { resolve, dirname } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; +import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; +import { readSessionRecords } from '../helpers/session-log'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// E2E更新時は docs/testing/e2e.md も更新すること +describe('E2E: Sequential multi-step session log transitions (mock)', () => { + let isolatedEnv: IsolatedEnv; + let repo: LocalRepo; + + beforeEach(() => { + isolatedEnv = createIsolatedEnv(); + repo = createLocalRepo(); + }); + + afterEach(() => { + try { repo.cleanup(); } catch { /* best-effort */ } + try { isolatedEnv.cleanup(); } catch { /* best-effort */ } + }); + + it('should record step_complete for both step-1 and step-2', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-two-step.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/two-step-done.json'); + + const result = runTakt({ + args: [ + '--task', 'Test sequential transitions', + '--piece', piecePath, + '--create-worktree', 'no', + '--provider', 'mock', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + + const records = readSessionRecords(repo.path); + const completedSteps = records + .filter((r) => r.type === 'step_complete') + .map((r) => String(r.step)); + + 
expect(completedSteps).toContain('step-1'); + expect(completedSteps).toContain('step-2'); + expect(records.some((r) => r.type === 'piece_complete')).toBe(true); + }, 240_000); +}); diff --git a/e2e/specs/piece-error-handling.e2e.ts b/e2e/specs/piece-error-handling.e2e.ts index 5badea4..0cfae86 100644 --- a/e2e/specs/piece-error-handling.e2e.ts +++ b/e2e/specs/piece-error-handling.e2e.ts @@ -1,36 +1,17 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; import { resolve, dirname } from 'node:path'; import { fileURLToPath } from 'node:url'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-piece-err-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} - // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Piece error handling (mock)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: 
LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/pipeline-local-repo.e2e.ts b/e2e/specs/pipeline-local-repo.e2e.ts new file mode 100644 index 0000000..12a6141 --- /dev/null +++ b/e2e/specs/pipeline-local-repo.e2e.ts @@ -0,0 +1,91 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { resolve, dirname, join } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; +import { tmpdir } from 'node:os'; +import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; +import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +function createNonGitDir(): { path: string; cleanup: () => void } { + const dirPath = mkdtempSync(join(tmpdir(), 'takt-e2e-pipeline-nongit-')); + writeFileSync(join(dirPath, 'README.md'), '# non-git\n'); + return { + path: dirPath, + cleanup: () => { + try { rmSync(dirPath, { recursive: true, force: true }); } catch { /* best-effort */ } + }, + }; +} + +// E2E更新時は docs/testing/e2e.md も更新すること +describe('E2E: Pipeline --skip-git on local/non-git directories (mock)', () => { + let isolatedEnv: IsolatedEnv; + let repo: LocalRepo; + + beforeEach(() => { + isolatedEnv = createIsolatedEnv(); + repo = createLocalRepo(); + }); + + afterEach(() => { + try { repo.cleanup(); } catch { /* best-effort */ } + try { isolatedEnv.cleanup(); } catch { /* best-effort */ } + }); + + it('should execute pipeline with --skip-git in a local git repository', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-single-step.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/execute-done.json'); + + const result = runTakt({ + args: [ + '--pipeline', + '--task', 'Pipeline local repo test', + '--piece', piecePath, + '--skip-git', + 
'--provider', 'mock', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + expect(result.stdout).toContain('completed'); + }, 240_000); + + it('should execute pipeline with --skip-git in a non-git directory', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-single-step.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/execute-done.json'); + const dir = createNonGitDir(); + + try { + const result = runTakt({ + args: [ + '--pipeline', + '--task', 'Pipeline non-git test', + '--piece', piecePath, + '--skip-git', + '--provider', 'mock', + ], + cwd: dir.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + expect(result.stdout).toContain('completed'); + } finally { + dir.cleanup(); + } + }, 240_000); +}); diff --git a/e2e/specs/provider-error.e2e.ts b/e2e/specs/provider-error.e2e.ts index e2e6978..511a72e 100644 --- a/e2e/specs/provider-error.e2e.ts +++ b/e2e/specs/provider-error.e2e.ts @@ -1,40 +1,21 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; import { resolve, dirname } from 'node:path'; import { fileURLToPath } from 'node:url'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, updateIsolatedConfig, type IsolatedEnv, } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-provider-')); - execFileSync('git', ['init'], { cwd: repoPath, 
stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} - // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Provider error handling (mock)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/quiet-mode.e2e.ts b/e2e/specs/quiet-mode.e2e.ts index 085fb04..5475fbe 100644 --- a/e2e/specs/quiet-mode.e2e.ts +++ b/e2e/specs/quiet-mode.e2e.ts @@ -1,36 +1,17 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; import { resolve, dirname } from 'node:path'; import { fileURLToPath } from 'node:url'; -import { mkdtempSync, writeFileSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-quiet-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 
'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} - // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Quiet mode (--quiet)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; beforeEach(() => { isolatedEnv = createIsolatedEnv(); diff --git a/e2e/specs/report-file-output.e2e.ts b/e2e/specs/report-file-output.e2e.ts new file mode 100644 index 0000000..e570713 --- /dev/null +++ b/e2e/specs/report-file-output.e2e.ts @@ -0,0 +1,68 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { resolve, dirname, join } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { existsSync, readdirSync, readFileSync } from 'node:fs'; +import { + createIsolatedEnv, + updateIsolatedConfig, + type IsolatedEnv, +} from '../helpers/isolated-env'; +import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// E2E更新時は docs/testing/e2e.md も更新すること +describe('E2E: Report file output (mock)', () => { + let isolatedEnv: IsolatedEnv; + let repo: LocalRepo; + + beforeEach(() => { + isolatedEnv = createIsolatedEnv(); + updateIsolatedConfig(isolatedEnv.taktDir, { + provider: 'mock', + }); + repo = createLocalRepo(); + }); + + afterEach(() => { + try { repo.cleanup(); } catch { /* best-effort */ } + try { isolatedEnv.cleanup(); } catch { /* best-effort */ } + }); + + it('should write report file to .takt/runs/*/reports with expected content', () => { + const piecePath = 
resolve(__dirname, '../fixtures/pieces/report-judge.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/report-judge.json'); + + const result = runTakt({ + args: [ + '--task', 'Test report output', + '--piece', piecePath, + '--create-worktree', 'no', + '--provider', 'mock', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + + const runsDir = join(repo.path, '.takt', 'runs'); + expect(existsSync(runsDir)).toBe(true); + + const runDirs = readdirSync(runsDir).sort(); + expect(runDirs.length).toBeGreaterThan(0); + + const latestRun = runDirs[runDirs.length - 1]!; + const reportPath = join(runsDir, latestRun, 'reports', 'report.md'); + + expect(existsSync(reportPath)).toBe(true); + const report = readFileSync(reportPath, 'utf-8'); + expect(report).toContain('Report summary: OK'); + }, 240_000); +}); diff --git a/e2e/specs/run-multiple-tasks.e2e.ts b/e2e/specs/run-multiple-tasks.e2e.ts index 518db71..ce1de94 100644 --- a/e2e/specs/run-multiple-tasks.e2e.ts +++ b/e2e/specs/run-multiple-tasks.e2e.ts @@ -1,40 +1,23 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; import { resolve, dirname } from 'node:path'; import { fileURLToPath } from 'node:url'; -import { mkdtempSync, mkdirSync, writeFileSync, rmSync } from 'node:fs'; +import { mkdirSync, writeFileSync } from 'node:fs'; import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, updateIsolatedConfig, type IsolatedEnv, } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 
'takt-e2e-run-multi-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} - // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Run multiple tasks (takt run)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; const piecePath = resolve(__dirname, '../fixtures/pieces/mock-single-step.yaml'); diff --git a/e2e/specs/run-sigint-graceful.e2e.ts b/e2e/specs/run-sigint-graceful.e2e.ts index 79c2d85..041ce7c 100644 --- a/e2e/specs/run-sigint-graceful.e2e.ts +++ b/e2e/specs/run-sigint-graceful.e2e.ts @@ -171,4 +171,85 @@ describe('E2E: Run tasks graceful shutdown on SIGINT (parallel)', () => { expect(stderr).not.toContain('UnhandledPromiseRejection'); } }, 120_000); + + it('should force exit immediately on second SIGINT', async () => { + const binPath = resolve(__dirname, '../../bin/takt'); + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-slow-multi-step.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/run-sigint-parallel.json'); + + const tasksFile = join(testRepo.path, '.takt', 'tasks.yaml'); + mkdirSync(join(testRepo.path, '.takt'), { recursive: true }); + + const now = new Date().toISOString(); + writeFileSync( + tasksFile, + [ + 'tasks:', + ' - name: sigint-a', + ' status: pending', + ' content: "E2E SIGINT task A"', + ` piece: "${piecePath}"`, + ' worktree: true', + ` created_at: "${now}"`, + ' 
started_at: null', + ' completed_at: null', + ' owner_pid: null', + ' - name: sigint-b', + ' status: pending', + ' content: "E2E SIGINT task B"', + ` piece: "${piecePath}"`, + ' worktree: true', + ` created_at: "${now}"`, + ' started_at: null', + ' completed_at: null', + ' owner_pid: null', + ' - name: sigint-c', + ' status: pending', + ' content: "E2E SIGINT task C"', + ` piece: "${piecePath}"`, + ' worktree: true', + ` created_at: "${now}"`, + ' started_at: null', + ' completed_at: null', + ' owner_pid: null', + ].join('\n'), + 'utf-8', + ); + + const child = spawn('node', [binPath, 'run', '--provider', 'mock'], { + cwd: testRepo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + TAKT_E2E_SELF_SIGINT_TWICE: '1', + }, + stdio: ['ignore', 'pipe', 'pipe'], + }); + + let stdout = ''; + let stderr = ''; + child.stdout?.on('data', (chunk) => { + stdout += chunk.toString(); + }); + child.stderr?.on('data', (chunk) => { + stderr += chunk.toString(); + }); + + const workersFilled = await waitFor( + () => stdout.includes('=== Task: sigint-b ==='), + 30_000, + 20, + ); + expect(workersFilled, `stdout:\n${stdout}\n\nstderr:\n${stderr}`).toBe(true); + + const exit = await waitForClose(child, 60_000); + expect( + exit.signal === 'SIGINT' || exit.code === 130, + `unexpected exit: code=${exit.code}, signal=${exit.signal}`, + ).toBe(true); + + if (stderr.trim().length > 0) { + expect(stderr).not.toContain('UnhandledPromiseRejection'); + } + }, 120_000); }); diff --git a/e2e/specs/session-log.e2e.ts b/e2e/specs/session-log.e2e.ts new file mode 100644 index 0000000..a65cdc2 --- /dev/null +++ b/e2e/specs/session-log.e2e.ts @@ -0,0 +1,81 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { resolve, dirname } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; +import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type 
LocalRepo } from '../helpers/test-repo'; +import { readSessionRecords } from '../helpers/session-log'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +// E2E更新時は docs/testing/e2e.md も更新すること +describe('E2E: Session NDJSON log output (mock)', () => { + let isolatedEnv: IsolatedEnv; + let repo: LocalRepo; + + beforeEach(() => { + isolatedEnv = createIsolatedEnv(); + repo = createLocalRepo(); + }); + + afterEach(() => { + try { repo.cleanup(); } catch { /* best-effort */ } + try { isolatedEnv.cleanup(); } catch { /* best-effort */ } + }); + + it('should write piece_start, step_complete, and piece_complete on success', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-single-step.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/execute-done.json'); + + const result = runTakt({ + args: [ + '--task', 'Test session log success', + '--piece', piecePath, + '--create-worktree', 'no', + '--provider', 'mock', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + + const records = readSessionRecords(repo.path); + expect(records.some((r) => r.type === 'piece_start')).toBe(true); + expect(records.some((r) => r.type === 'step_complete')).toBe(true); + expect(records.some((r) => r.type === 'piece_complete')).toBe(true); + }, 240_000); + + it('should write piece_abort with reason on failure', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-no-match.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/no-match.json'); + + const result = runTakt({ + args: [ + '--task', 'Test session log abort', + '--piece', piecePath, + '--create-worktree', 'no', + '--provider', 'mock', + ], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).not.toBe(0); + + const 
records = readSessionRecords(repo.path); + const abortRecord = records.find((r) => r.type === 'piece_abort'); + expect(abortRecord).toBeDefined(); + expect(typeof abortRecord?.reason).toBe('string'); + expect((abortRecord?.reason as string).length).toBeGreaterThan(0); + }, 240_000); +}); diff --git a/e2e/specs/structured-output.e2e.ts b/e2e/specs/structured-output.e2e.ts new file mode 100644 index 0000000..1742e63 --- /dev/null +++ b/e2e/specs/structured-output.e2e.ts @@ -0,0 +1,96 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { resolve, dirname } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; +import { runTakt } from '../helpers/takt-runner'; +import { readSessionRecords } from '../helpers/session-log'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +/** + * E2E: Structured output for status judgment (Phase 3). + * + * Verifies that real providers (Claude, Codex, OpenCode) can execute a piece + * where the status judgment phase uses structured output (`outputSchema`) + * internally via `judgeStatus()`. + * + * The piece has 2 rules per step, so `judgeStatus` cannot auto-select + * and must actually call the provider with an outputSchema to determine + * which rule matched. + * + * If structured output works correctly, `judgeStatus` extracts the step + * number from `response.structuredOutput.step` (recorded as `structured_output`). + * If the agent happens to output `[STEP:N]` tags, the RuleEvaluator detects + * them as `phase3_tag`/`phase1_tag` (recorded as `tag_fallback` in session log). + * The session log matchMethod is transformed by `toJudgmentMatchMethod()`. 
+ * + * Run with: + * TAKT_E2E_PROVIDER=claude vitest run --config vitest.config.e2e.structured-output.ts + * TAKT_E2E_PROVIDER=codex vitest run --config vitest.config.e2e.structured-output.ts + * TAKT_E2E_PROVIDER=opencode TAKT_E2E_MODEL=openai/gpt-4 vitest run --config vitest.config.e2e.structured-output.ts + */ +describe('E2E: Structured output rule matching', () => { + let isolatedEnv: IsolatedEnv; + let repo: LocalRepo; + + beforeEach(() => { + isolatedEnv = createIsolatedEnv(); + repo = createLocalRepo(); + }); + + afterEach(() => { + try { repo.cleanup(); } catch { /* best-effort */ } + try { isolatedEnv.cleanup(); } catch { /* best-effort */ } + }); + + it('should complete piece via Phase 3 status judgment with 2-rule step', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/structured-output.yaml'); + + const result = runTakt({ + args: [ + '--task', 'Say hello', + '--piece', piecePath, + '--create-worktree', 'no', + ], + cwd: repo.path, + env: isolatedEnv.env, + timeout: 240_000, + }); + + if (result.exitCode !== 0) { + console.log('=== STDOUT ===\n', result.stdout); + console.log('=== STDERR ===\n', result.stderr); + } + + // Always log the matchMethod for diagnostic purposes + const allRecords = readSessionRecords(repo.path); + const sc = allRecords.find((r) => r.type === 'step_complete'); + console.log(`=== matchMethod: ${sc?.matchMethod ?? 
'(none)'} ===`); + + expect(result.exitCode).toBe(0); + expect(result.stdout).toContain('Piece completed'); + + // Verify session log has proper step_complete with matchMethod + const records = readSessionRecords(repo.path); + + const pieceComplete = records.find((r) => r.type === 'piece_complete'); + expect(pieceComplete).toBeDefined(); + + const stepComplete = records.find((r) => r.type === 'step_complete'); + expect(stepComplete).toBeDefined(); + + // matchMethod should be present — the 2-rule step required actual judgment + // (auto_select is only used for single-rule steps) + const matchMethod = stepComplete?.matchMethod as string | undefined; + expect(matchMethod).toBeDefined(); + + // Session log records transformed matchMethod via toJudgmentMatchMethod(): + // structured_output → structured_output (judgeStatus extracted from structuredOutput.step) + // phase3_tag / phase1_tag → tag_fallback (agent output [STEP:N] tag, detected by RuleEvaluator) + // ai_judge / ai_judge_fallback → ai_judge (AI evaluated conditions as fallback) + const validMethods = ['structured_output', 'tag_fallback', 'ai_judge']; + expect(validMethods).toContain(matchMethod); + }, 240_000); +}); diff --git a/e2e/specs/task-content-file.e2e.ts b/e2e/specs/task-content-file.e2e.ts index 4e79acb..ae36721 100644 --- a/e2e/specs/task-content-file.e2e.ts +++ b/e2e/specs/task-content-file.e2e.ts @@ -1,40 +1,23 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest'; import { resolve, dirname } from 'node:path'; import { fileURLToPath } from 'node:url'; -import { mkdtempSync, mkdirSync, writeFileSync, rmSync } from 'node:fs'; +import { mkdirSync, writeFileSync } from 'node:fs'; import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { execFileSync } from 'node:child_process'; import { createIsolatedEnv, updateIsolatedConfig, type IsolatedEnv, } from '../helpers/isolated-env'; import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type 
LocalRepo } from '../helpers/test-repo'; const __filename = fileURLToPath(import.meta.url); const __dirname = dirname(__filename); -function createLocalRepo(): { path: string; cleanup: () => void } { - const repoPath = mkdtempSync(join(tmpdir(), 'takt-e2e-contentfile-')); - execFileSync('git', ['init'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.email', 'test@example.com'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['config', 'user.name', 'Test'], { cwd: repoPath, stdio: 'pipe' }); - writeFileSync(join(repoPath, 'README.md'), '# test\n'); - execFileSync('git', ['add', '.'], { cwd: repoPath, stdio: 'pipe' }); - execFileSync('git', ['commit', '-m', 'init'], { cwd: repoPath, stdio: 'pipe' }); - return { - path: repoPath, - cleanup: () => { - try { rmSync(repoPath, { recursive: true, force: true }); } catch { /* best-effort */ } - }, - }; -} - // E2E更新時は docs/testing/e2e.md も更新すること describe('E2E: Task content_file reference (mock)', () => { let isolatedEnv: IsolatedEnv; - let repo: { path: string; cleanup: () => void }; + let repo: LocalRepo; const piecePath = resolve(__dirname, '../fixtures/pieces/mock-single-step.yaml'); diff --git a/e2e/specs/task-status-persistence.e2e.ts b/e2e/specs/task-status-persistence.e2e.ts new file mode 100644 index 0000000..632b905 --- /dev/null +++ b/e2e/specs/task-status-persistence.e2e.ts @@ -0,0 +1,109 @@ +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { resolve, dirname, join } from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { mkdirSync, writeFileSync, readFileSync } from 'node:fs'; +import { parse as parseYaml } from 'yaml'; +import { createIsolatedEnv, updateIsolatedConfig, type IsolatedEnv } from '../helpers/isolated-env'; +import { runTakt } from '../helpers/takt-runner'; +import { createLocalRepo, type LocalRepo } from '../helpers/test-repo'; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = 
dirname(__filename); + +function writeSinglePendingTask(repoPath: string, piecePath: string): void { + const now = new Date().toISOString(); + mkdirSync(join(repoPath, '.takt'), { recursive: true }); + writeFileSync( + join(repoPath, '.takt', 'tasks.yaml'), + [ + 'tasks:', + ' - name: task-1', + ' status: pending', + ' content: "Task 1"', + ` piece: "${piecePath}"`, + ` created_at: "${now}"`, + ' started_at: null', + ' completed_at: null', + ].join('\n'), + 'utf-8', + ); +} + +// E2E更新時は docs/testing/e2e.md も更新すること +describe('E2E: Task status persistence in tasks.yaml (mock)', () => { + let isolatedEnv: IsolatedEnv; + let repo: LocalRepo; + + beforeEach(() => { + isolatedEnv = createIsolatedEnv(); + repo = createLocalRepo(); + + updateIsolatedConfig(isolatedEnv.taktDir, { + provider: 'mock', + }); + }); + + afterEach(() => { + try { repo.cleanup(); } catch { /* best-effort */ } + try { isolatedEnv.cleanup(); } catch { /* best-effort */ } + }); + + it('should remove task record after successful completion', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-single-step.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/execute-done.json'); + + writeSinglePendingTask(repo.path, piecePath); + + const result = runTakt({ + args: ['run', '--provider', 'mock'], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + + const tasksContent = readFileSync(join(repo.path, '.takt', 'tasks.yaml'), 'utf-8'); + const tasks = parseYaml(tasksContent) as { tasks: Array<Record<string, unknown>> }; + expect(Array.isArray(tasks.tasks)).toBe(true); + expect(tasks.tasks.length).toBe(0); + }, 240_000); + + it('should persist failed status and failure details on failure', () => { + const piecePath = resolve(__dirname, '../fixtures/pieces/mock-no-match.yaml'); + const scenarioPath = resolve(__dirname, '../fixtures/scenarios/no-match.json'); + + 
writeSinglePendingTask(repo.path, piecePath); + + const result = runTakt({ + args: ['run', '--provider', 'mock'], + cwd: repo.path, + env: { + ...isolatedEnv.env, + TAKT_MOCK_SCENARIO: scenarioPath, + }, + timeout: 240_000, + }); + + expect(result.exitCode).toBe(0); + + const tasksContent = readFileSync(join(repo.path, '.takt', 'tasks.yaml'), 'utf-8'); + const tasks = parseYaml(tasksContent) as { + tasks: Array<{ + status: string; + started_at: string | null; + completed_at: string | null; + failure?: { error?: string }; + }>; + }; + + expect(tasks.tasks.length).toBe(1); + expect(tasks.tasks[0]?.status).toBe('failed'); + expect(tasks.tasks[0]?.started_at).toBeTruthy(); + expect(tasks.tasks[0]?.completed_at).toBeTruthy(); + expect(tasks.tasks[0]?.failure?.error).toBeTruthy(); + }, 240_000); +}); diff --git a/package-lock.json b/package-lock.json index 0f1b8a4..135f507 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "takt", - "version": "0.12.1", + "version": "0.13.0-alpha.1", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "takt", - "version": "0.12.1", + "version": "0.13.0-alpha.1", "license": "MIT", "dependencies": { "@anthropic-ai/claude-agent-sdk": "^0.2.37", diff --git a/package.json b/package.json index 92b5859..794d349 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "takt", - "version": "0.12.1", + "version": "0.13.0-alpha.1", "description": "TAKT: TAKT Agent Koordination Topology - AI Agent Piece Orchestration", "main": "dist/index.js", "types": "dist/index.d.ts", @@ -14,8 +14,8 @@ "watch": "tsc --watch", "test": "vitest run", "test:watch": "vitest", - "test:e2e": "npm run test:e2e:all", - "test:e2e:mock": "vitest run --config vitest.config.e2e.mock.ts --reporter=verbose", + "test:e2e": "npm run test:e2e:mock; code=$?; if [ \"$code\" -eq 0 ]; then msg='test:e2e passed'; else msg=\"test:e2e failed (exit=$code)\"; fi; if command -v osascript >/dev/null 2>&1; then osascript -e 
\"display notification \\\"$msg\\\" with title \\\"takt\\\" subtitle \\\"E2E\\\"\" >/dev/null 2>&1 || true; fi; echo \"[takt] $msg\"; exit $code", + "test:e2e:mock": "TAKT_E2E_PROVIDER=mock vitest run --config vitest.config.e2e.mock.ts --reporter=verbose", "test:e2e:all": "npm run test:e2e:mock && npm run test:e2e:provider", "test:e2e:provider": "npm run test:e2e:provider:claude && npm run test:e2e:provider:codex", "test:e2e:provider:claude": "TAKT_E2E_PROVIDER=claude vitest run --config vitest.config.e2e.provider.ts --reporter=verbose", diff --git a/src/__tests__/LogManager.test.ts b/src/__tests__/LogManager.test.ts new file mode 100644 index 0000000..b3b59cc --- /dev/null +++ b/src/__tests__/LogManager.test.ts @@ -0,0 +1,72 @@ +import { beforeEach, describe, expect, it, vi } from 'vitest'; + +vi.mock('chalk', () => { + const passthrough = (value: string) => value; + const bold = Object.assign((value: string) => value, { + cyan: (value: string) => value, + }); + + return { + default: { + gray: passthrough, + blue: passthrough, + yellow: passthrough, + red: passthrough, + green: passthrough, + white: passthrough, + bold, + }, + }; +}); + +import { LogManager } from '../shared/ui/LogManager.js'; + +describe('LogManager', () => { + beforeEach(() => { + // Given: テスト間でシングルトン状態が共有されないようにする + LogManager.resetInstance(); + vi.clearAllMocks(); + }); + + it('should filter by info level as debug=false, info=true, error=true', () => { + // Given: ログレベルが info + const manager = LogManager.getInstance(); + manager.setLogLevel('info'); + + // When: 各レベルの出力可否を判定する + const debugResult = manager.shouldLog('debug'); + const infoResult = manager.shouldLog('info'); + const errorResult = manager.shouldLog('error'); + + // Then: info基準のフィルタリングが適用される + expect(debugResult).toBe(false); + expect(infoResult).toBe(true); + expect(errorResult).toBe(true); + }); + + it('should reflect level change after setLogLevel', () => { + // Given: 初期レベル(info) + const manager = LogManager.getInstance(); + 
+ // When: warn レベルに変更する + manager.setLogLevel('warn'); + + // Then: info は抑制され warn は出力対象になる + expect(manager.shouldLog('info')).toBe(false); + expect(manager.shouldLog('warn')).toBe(true); + }); + + it('should clear singleton state when resetInstance is called', () => { + // Given: エラーレベルに変更済みのインスタンス + const first = LogManager.getInstance(); + first.setLogLevel('error'); + expect(first.shouldLog('info')).toBe(false); + + // When: シングルトンをリセットして再取得する + LogManager.resetInstance(); + const second = LogManager.getInstance(); + + // Then: 新しいインスタンスは初期レベルに戻る + expect(second.shouldLog('info')).toBe(true); + }); +}); diff --git a/src/__tests__/abort-signal.test.ts b/src/__tests__/abort-signal.test.ts new file mode 100644 index 0000000..c3b80b2 --- /dev/null +++ b/src/__tests__/abort-signal.test.ts @@ -0,0 +1,68 @@ +import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; +import { buildAbortSignal } from '../core/piece/engine/abort-signal.js'; + +describe('buildAbortSignal', () => { + beforeEach(() => { + vi.useFakeTimers(); + }); + + afterEach(() => { + vi.useRealTimers(); + vi.restoreAllMocks(); + }); + + it('タイムアウトでabortされる', () => { + const { signal, dispose } = buildAbortSignal(100, undefined); + + expect(signal.aborted).toBe(false); + vi.advanceTimersByTime(100); + expect(signal.aborted).toBe(true); + expect(signal.reason).toBeInstanceOf(Error); + expect((signal.reason as Error).message).toBe('Part timeout after 100ms'); + + dispose(); + }); + + it('親シグナルがabortされると子シグナルへ伝搬する', () => { + const parent = new AbortController(); + const { signal, dispose } = buildAbortSignal(1000, parent.signal); + const reason = new Error('parent aborted'); + + parent.abort(reason); + + expect(signal.aborted).toBe(true); + expect(signal.reason).toBe(reason); + + dispose(); + }); + + it('disposeでタイマーと親リスナーを解放する', () => { + const parent = new AbortController(); + const addSpy = vi.spyOn(parent.signal, 'addEventListener'); + const removeSpy = vi.spyOn(parent.signal, 
'removeEventListener'); + const { signal, dispose } = buildAbortSignal(100, parent.signal); + + expect(addSpy).toHaveBeenCalledTimes(1); + + dispose(); + vi.advanceTimersByTime(200); + + expect(signal.aborted).toBe(false); + expect(removeSpy).toHaveBeenCalledTimes(1); + }); + + it('親シグナルが既にabort済みなら即時伝搬する', () => { + const parent = new AbortController(); + const reason = new Error('already aborted'); + const addSpy = vi.spyOn(parent.signal, 'addEventListener'); + parent.abort(reason); + + const { signal, dispose } = buildAbortSignal(1000, parent.signal); + + expect(signal.aborted).toBe(true); + expect(signal.reason).toBe(reason); + expect(addSpy).not.toHaveBeenCalled(); + + dispose(); + }); +}); diff --git a/src/__tests__/agent-usecases.test.ts b/src/__tests__/agent-usecases.test.ts new file mode 100644 index 0000000..91d0f40 --- /dev/null +++ b/src/__tests__/agent-usecases.test.ts @@ -0,0 +1,232 @@ +import { beforeEach, describe, expect, it, vi } from 'vitest'; +import { runAgent } from '../agents/runner.js'; +import { parseParts } from '../core/piece/engine/task-decomposer.js'; +import { detectJudgeIndex } from '../agents/judge-utils.js'; +import { + executeAgent, + generateReport, + executePart, + evaluateCondition, + judgeStatus, + decomposeTask, +} from '../core/piece/agent-usecases.js'; + +vi.mock('../agents/runner.js', () => ({ + runAgent: vi.fn(), +})); + +vi.mock('../core/piece/schema-loader.js', () => ({ + loadJudgmentSchema: vi.fn(() => ({ type: 'judgment' })), + loadEvaluationSchema: vi.fn(() => ({ type: 'evaluation' })), + loadDecompositionSchema: vi.fn((maxParts: number) => ({ type: 'decomposition', maxParts })), +})); + +vi.mock('../core/piece/engine/task-decomposer.js', () => ({ + parseParts: vi.fn(), +})); + +vi.mock('../agents/judge-utils.js', () => ({ + buildJudgePrompt: vi.fn(() => 'judge prompt'), + detectJudgeIndex: vi.fn(() => -1), +})); + +function doneResponse(content: string, structuredOutput?: Record<string, unknown>) { + return { + persona: 'tester', + 
status: 'done' as const, + content, + timestamp: new Date('2026-02-12T00:00:00Z'), + structuredOutput, + }; +} + +const judgeOptions = { cwd: '/repo', movementName: 'review' }; + +describe('agent-usecases', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it('executeAgent/generateReport/executePart は runAgent に委譲する', async () => { + vi.mocked(runAgent).mockResolvedValue(doneResponse('ok')); + + await executeAgent('coder', 'do work', { cwd: '/tmp' }); + await generateReport('coder', 'write report', { cwd: '/tmp' }); + await executePart('coder', 'part work', { cwd: '/tmp' }); + + expect(runAgent).toHaveBeenCalledTimes(3); + expect(runAgent).toHaveBeenNthCalledWith(1, 'coder', 'do work', { cwd: '/tmp' }); + expect(runAgent).toHaveBeenNthCalledWith(2, 'coder', 'write report', { cwd: '/tmp' }); + expect(runAgent).toHaveBeenNthCalledWith(3, 'coder', 'part work', { cwd: '/tmp' }); + }); + + it('evaluateCondition は構造化出力の matched_index を優先する', async () => { + vi.mocked(runAgent).mockResolvedValue(doneResponse('ignored', { matched_index: 2 })); + + const result = await evaluateCondition('agent output', [ + { index: 0, text: 'first' }, + { index: 1, text: 'second' }, + ], { cwd: '/repo' }); + + expect(result).toBe(1); + expect(runAgent).toHaveBeenCalledWith(undefined, 'judge prompt', expect.objectContaining({ + cwd: '/repo', + outputSchema: { type: 'evaluation' }, + })); + }); + + it('evaluateCondition は構造化出力が使えない場合にタグ検出へフォールバックする', async () => { + vi.mocked(runAgent).mockResolvedValue(doneResponse('[JUDGE:2]')); + vi.mocked(detectJudgeIndex).mockReturnValue(1); + + const result = await evaluateCondition('agent output', [ + { index: 0, text: 'first' }, + { index: 1, text: 'second' }, + ], { cwd: '/repo' }); + + expect(result).toBe(1); + expect(detectJudgeIndex).toHaveBeenCalledWith('[JUDGE:2]'); + }); + + it('evaluateCondition は runAgent が done 以外なら -1 を返す', async () => { + vi.mocked(runAgent).mockResolvedValue({ + persona: 'tester', + status: 'error', + content: 
'failed', + timestamp: new Date('2026-02-12T00:00:00Z'), + }); + + const result = await evaluateCondition('agent output', [ + { index: 0, text: 'first' }, + ], { cwd: '/repo' }); + + expect(result).toBe(-1); + expect(detectJudgeIndex).not.toHaveBeenCalled(); + }); + + // --- judgeStatus: 3-stage fallback --- + + it('judgeStatus は単一ルール時に auto_select を返す', async () => { + const result = await judgeStatus('structured', 'tag', [{ condition: 'always', next: 'done' }], judgeOptions); + + expect(result).toEqual({ ruleIndex: 0, method: 'auto_select' }); + expect(runAgent).not.toHaveBeenCalled(); + }); + + it('judgeStatus はルールが空ならエラー', async () => { + await expect(judgeStatus('structured', 'tag', [], judgeOptions)) + .rejects.toThrow('judgeStatus requires at least one rule'); + }); + + it('judgeStatus は Stage 1 で構造化出力 step を採用する', async () => { + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('x', { step: 2 })); + + const result = await judgeStatus('structured', 'tag', [ + { condition: 'a', next: 'one' }, + { condition: 'b', next: 'two' }, + ], judgeOptions); + + expect(result).toEqual({ ruleIndex: 1, method: 'structured_output' }); + expect(runAgent).toHaveBeenCalledTimes(1); + expect(runAgent).toHaveBeenCalledWith('conductor', 'structured', expect.objectContaining({ + outputSchema: { type: 'judgment' }, + })); + }); + + it('judgeStatus は Stage 2 でタグ検出を使う', async () => { + // Stage 1: structured output fails (no structuredOutput) + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no match')); + // Stage 2: tag detection succeeds + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('[REVIEW:2]')); + + const result = await judgeStatus('structured', 'tag', [ + { condition: 'a', next: 'one' }, + { condition: 'b', next: 'two' }, + ], judgeOptions); + + expect(result).toEqual({ ruleIndex: 1, method: 'phase3_tag' }); + expect(runAgent).toHaveBeenCalledTimes(2); + expect(runAgent).toHaveBeenNthCalledWith(1, 'conductor', 'structured', expect.objectContaining({ + 
outputSchema: { type: 'judgment' }, + })); + expect(runAgent).toHaveBeenNthCalledWith(2, 'conductor', 'tag', expect.not.objectContaining({ + outputSchema: expect.anything(), + })); + }); + + it('judgeStatus は Stage 3 で AI Judge を使う', async () => { + // Stage 1: structured output fails + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no match')); + // Stage 2: tag detection fails + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no tag')); + // Stage 3: evaluateCondition succeeds + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('ignored', { matched_index: 2 })); + + const result = await judgeStatus('structured', 'tag', [ + { condition: 'a', next: 'one' }, + { condition: 'b', next: 'two' }, + ], judgeOptions); + + expect(result).toEqual({ ruleIndex: 1, method: 'ai_judge' }); + expect(runAgent).toHaveBeenCalledTimes(3); + }); + + it('judgeStatus は全ての判定に失敗したらエラー', async () => { + // Stage 1: structured output fails + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no match')); + // Stage 2: tag detection fails + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no tag')); + // Stage 3: evaluateCondition fails + vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('still no match')); + vi.mocked(detectJudgeIndex).mockReturnValue(-1); + + await expect(judgeStatus('structured', 'tag', [ + { condition: 'a', next: 'one' }, + { condition: 'b', next: 'two' }, + ], judgeOptions)).rejects.toThrow('Status not found for movement "review"'); + }); + + // --- decomposeTask --- + + it('decomposeTask は構造化出力 parts を返す', async () => { + vi.mocked(runAgent).mockResolvedValue(doneResponse('x', { + parts: [ + { id: 'p1', title: 'Part 1', instruction: 'Do 1', timeout_ms: 1000 }, + ], + })); + + const result = await decomposeTask('instruction', 3, { cwd: '/repo', persona: 'team-leader' }); + + expect(result).toEqual([ + { id: 'p1', title: 'Part 1', instruction: 'Do 1', timeoutMs: 1000 }, + ]); + expect(parseParts).not.toHaveBeenCalled(); + }); + + 
it('decomposeTask は構造化出力がない場合 parseParts にフォールバックする', async () => { + vi.mocked(runAgent).mockResolvedValue(doneResponse('```json [] ```')); + vi.mocked(parseParts).mockReturnValue([ + { id: 'p1', title: 'Part 1', instruction: 'fallback', timeoutMs: undefined }, + ]); + + const result = await decomposeTask('instruction', 2, { cwd: '/repo' }); + + expect(parseParts).toHaveBeenCalledWith('```json [] ```', 2); + expect(result).toEqual([ + { id: 'p1', title: 'Part 1', instruction: 'fallback', timeoutMs: undefined }, + ]); + }); + + it('decomposeTask は done 以外をエラーにする', async () => { + vi.mocked(runAgent).mockResolvedValue({ + persona: 'team-leader', + status: 'error', + content: 'failure', + error: 'bad output', + timestamp: new Date('2026-02-12T00:00:00Z'), + }); + + await expect(decomposeTask('instruction', 2, { cwd: '/repo' })) + .rejects.toThrow('Team leader failed: bad output'); + }); +}); diff --git a/src/__tests__/blocked-handler.test.ts b/src/__tests__/blocked-handler.test.ts index 215b139..05a5081 100644 --- a/src/__tests__/blocked-handler.test.ts +++ b/src/__tests__/blocked-handler.test.ts @@ -6,17 +6,9 @@ import { describe, it, expect, vi } from 'vitest'; import { handleBlocked } from '../core/piece/engine/blocked-handler.js'; -import type { PieceMovement, AgentResponse } from '../core/models/types.js'; +import type { AgentResponse } from '../core/models/types.js'; import type { PieceEngineOptions } from '../core/piece/types.js'; - -function makeMovement(): PieceMovement { - return { - name: 'test-movement', - personaDisplayName: 'tester', - instructionTemplate: '', - passPreviousResponse: false, - }; -} +import { makeMovement } from './test-helpers.js'; function makeResponse(content: string): AgentResponse { return { diff --git a/src/__tests__/branchGitCommands.test.ts b/src/__tests__/branchGitCommands.test.ts new file mode 100644 index 0000000..6ddb2bf --- /dev/null +++ b/src/__tests__/branchGitCommands.test.ts @@ -0,0 +1,78 @@ +import { describe, expect, 
it, vi, beforeEach } from 'vitest'; + +vi.mock('node:child_process', () => ({ + execFileSync: vi.fn(), +})); + +import { execFileSync } from 'node:child_process'; +import { parseDistinctHashes, runGit } from '../infra/task/branchGitCommands.js'; + +const mockExecFileSync = vi.mocked(execFileSync); + +describe('parseDistinctHashes', () => { + it('should remove only consecutive duplicates', () => { + // Given: 連続重複と非連続重複を含む出力 + const output = 'a\na\nb\nb\na\n'; + + // When: ハッシュを解析する + const result = parseDistinctHashes(output); + + // Then: 連続重複のみ除去される + expect(result).toEqual(['a', 'b', 'a']); + }); + + it('should return empty array when output is empty', () => { + // Given: 空文字列 + const output = ''; + + // When: ハッシュを解析する + const result = parseDistinctHashes(output); + + // Then: 空配列を返す + expect(result).toEqual([]); + }); + + it('should trim each line and drop blank lines', () => { + // Given: 前後空白と空行を含む出力 + const output = ' hash1 \n\n hash2\n \n'; + + // When: ハッシュを解析する + const result = parseDistinctHashes(output); + + // Then: トリム済みの値のみ残る + expect(result).toEqual(['hash1', 'hash2']); + }); + + it('should return single hash as one-element array', () => { + // Given: 単一ハッシュ + const output = 'single-hash'; + + // When: ハッシュを解析する + const result = parseDistinctHashes(output); + + // Then: 1件配列として返る + expect(result).toEqual(['single-hash']); + }); +}); + +describe('runGit', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it('should execute git command with expected options and trim output', () => { + // Given: gitコマンドのモック応答 + mockExecFileSync.mockReturnValue(' abc123 \n' as never); + + // When: runGit を実行する + const result = runGit('/repo', ['rev-parse', 'HEAD']); + + // Then: execFileSync が正しい引数で呼ばれ、trimされた値を返す + expect(mockExecFileSync).toHaveBeenCalledWith('git', ['rev-parse', 'HEAD'], { + cwd: '/repo', + encoding: 'utf-8', + stdio: 'pipe', + }); + expect(result).toBe('abc123'); + }); +}); diff --git a/src/__tests__/branchList.regression.test.ts 
b/src/__tests__/branchList.regression.test.ts index a62caf6..c38e9a9 100644 --- a/src/__tests__/branchList.regression.test.ts +++ b/src/__tests__/branchList.regression.test.ts @@ -46,7 +46,7 @@ function setupRepoForIssue167(options?: { disableReflog?: boolean; firstBranchCo writeAndCommit(repoDir, 'develop-takt.txt', 'develop takt\n', 'takt: old instruction on develop'); writeAndCommit(repoDir, 'develop-b.txt', 'develop b\n', 'develop commit B'); - const taktBranch = 'takt/#167/fix-original-instruction'; + const taktBranch = 'takt/167/fix-original-instruction'; runGit(repoDir, ['checkout', '-b', taktBranch]); const firstBranchCommitMessage = options?.firstBranchCommitMessage ?? 'takt: github-issue-167-fix-original-instruction'; writeAndCommit(repoDir, 'task-1.txt', 'task1\n', firstBranchCommitMessage); diff --git a/src/__tests__/claude-executor-abort-signal.test.ts b/src/__tests__/claude-executor-abort-signal.test.ts new file mode 100644 index 0000000..dca6c24 --- /dev/null +++ b/src/__tests__/claude-executor-abort-signal.test.ts @@ -0,0 +1,89 @@ +import { beforeEach, describe, expect, it, vi } from 'vitest'; + +const { + queryMock, + interruptMock, + AbortErrorMock, +} = vi.hoisted(() => { + const interruptMock = vi.fn(async () => {}); + class AbortErrorMock extends Error {} + const queryMock = vi.fn(() => { + let interrupted = false; + interruptMock.mockImplementation(async () => { + interrupted = true; + }); + + return { + interrupt: interruptMock, + async *[Symbol.asyncIterator](): AsyncGenerator { + while (!interrupted) { + await new Promise((resolve) => setTimeout(resolve, 5)); + } + throw new AbortErrorMock('aborted'); + }, + }; + }); + + return { + queryMock, + interruptMock, + AbortErrorMock, + }; +}); + +vi.mock('@anthropic-ai/claude-agent-sdk', () => ({ + query: queryMock, + AbortError: AbortErrorMock, +})); + +vi.mock('../shared/utils/index.js', async (importOriginal) => { + const original = await importOriginal(); + return { + ...original, + 
createLogger: vi.fn().mockReturnValue({ + debug: vi.fn(), + info: vi.fn(), + error: vi.fn(), + }), + }; +}); + +import { QueryExecutor } from '../infra/claude/executor.js'; + +describe('QueryExecutor abortSignal wiring', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it('abortSignal 発火時に query.interrupt() を呼ぶ', async () => { + const controller = new AbortController(); + const executor = new QueryExecutor(); + + const promise = executor.execute('test', { + cwd: '/tmp/project', + abortSignal: controller.signal, + }); + + await new Promise((resolve) => setTimeout(resolve, 20)); + controller.abort(); + + const result = await promise; + + expect(interruptMock).toHaveBeenCalledTimes(1); + expect(result.interrupted).toBe(true); + }); + + it('開始前に中断済みの signal でも query.interrupt() を呼ぶ', async () => { + const controller = new AbortController(); + controller.abort(); + + const executor = new QueryExecutor(); + const result = await executor.execute('test', { + cwd: '/tmp/project', + abortSignal: controller.signal, + }); + + expect(interruptMock).toHaveBeenCalledTimes(1); + expect(result.interrupted).toBe(true); + }); +}); diff --git a/src/__tests__/claude-executor-structured-output.test.ts b/src/__tests__/claude-executor-structured-output.test.ts new file mode 100644 index 0000000..4bbe16e --- /dev/null +++ b/src/__tests__/claude-executor-structured-output.test.ts @@ -0,0 +1,164 @@ +/** + * Claude SDK layer structured output tests. + * + * Tests two internal components: + * 1. SdkOptionsBuilder — outputSchema → outputFormat conversion + * 2. 
QueryExecutor — structured_output extraction from SDK result messages + */ + +import { beforeEach, describe, expect, it, vi } from 'vitest'; + +// ===== SdkOptionsBuilder tests (no mock needed) ===== + +import { buildSdkOptions } from '../infra/claude/options-builder.js'; + +describe('SdkOptionsBuilder — outputFormat 変換', () => { + it('outputSchema が outputFormat に変換される', () => { + const schema = { type: 'object', properties: { step: { type: 'integer' } } }; + const sdkOptions = buildSdkOptions({ cwd: '/tmp', outputSchema: schema }); + + expect((sdkOptions as Record).outputFormat).toEqual({ + type: 'json_schema', + schema, + }); + }); + + it('outputSchema 未設定なら outputFormat は含まれない', () => { + const sdkOptions = buildSdkOptions({ cwd: '/tmp' }); + expect(sdkOptions).not.toHaveProperty('outputFormat'); + }); +}); + +// ===== QueryExecutor tests (mock @anthropic-ai/claude-agent-sdk) ===== + +const { mockQuery } = vi.hoisted(() => ({ + mockQuery: vi.fn(), +})); + +vi.mock('@anthropic-ai/claude-agent-sdk', () => ({ + query: mockQuery, + AbortError: class AbortError extends Error { + constructor(message?: string) { + super(message); + this.name = 'AbortError'; + } + }, +})); + +// QueryExecutor は executor.ts 内で query() を使うため、mock 後にインポート +const { QueryExecutor } = await import('../infra/claude/executor.js'); + +/** + * query() が返す Query オブジェクト(async iterable + interrupt)のモック + */ +function createMockQuery(messages: Array>) { + return { + [Symbol.asyncIterator]: async function* () { + for (const msg of messages) { + yield msg; + } + }, + interrupt: vi.fn(), + }; +} + +describe('QueryExecutor — structuredOutput 抽出', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it('result メッセージの structured_output (snake_case) を抽出する', async () => { + mockQuery.mockReturnValue(createMockQuery([ + { type: 'result', subtype: 'success', result: 'done', structured_output: { step: 2 } }, + ])); + + const executor = new QueryExecutor(); + const result = await 
executor.execute('test', { cwd: '/tmp' }); + + expect(result.success).toBe(true); + expect(result.structuredOutput).toEqual({ step: 2 }); + }); + + it('result メッセージの structuredOutput (camelCase) を抽出する', async () => { + mockQuery.mockReturnValue(createMockQuery([ + { type: 'result', subtype: 'success', result: 'done', structuredOutput: { step: 3 } }, + ])); + + const executor = new QueryExecutor(); + const result = await executor.execute('test', { cwd: '/tmp' }); + + expect(result.structuredOutput).toEqual({ step: 3 }); + }); + + it('structured_output が snake_case 優先 (snake_case と camelCase 両方ある場合)', async () => { + mockQuery.mockReturnValue(createMockQuery([ + { + type: 'result', + subtype: 'success', + result: 'done', + structured_output: { step: 1 }, + structuredOutput: { step: 9 }, + }, + ])); + + const executor = new QueryExecutor(); + const result = await executor.execute('test', { cwd: '/tmp' }); + + expect(result.structuredOutput).toEqual({ step: 1 }); + }); + + it('structuredOutput がない場合は undefined', async () => { + mockQuery.mockReturnValue(createMockQuery([ + { type: 'result', subtype: 'success', result: 'plain text' }, + ])); + + const executor = new QueryExecutor(); + const result = await executor.execute('test', { cwd: '/tmp' }); + + expect(result.structuredOutput).toBeUndefined(); + }); + + it('structured_output が配列の場合は無視する', async () => { + mockQuery.mockReturnValue(createMockQuery([ + { type: 'result', subtype: 'success', result: 'done', structured_output: [1, 2, 3] }, + ])); + + const executor = new QueryExecutor(); + const result = await executor.execute('test', { cwd: '/tmp' }); + + expect(result.structuredOutput).toBeUndefined(); + }); + + it('structured_output が null の場合は無視する', async () => { + mockQuery.mockReturnValue(createMockQuery([ + { type: 'result', subtype: 'success', result: 'done', structured_output: null }, + ])); + + const executor = new QueryExecutor(); + const result = await executor.execute('test', { cwd: '/tmp' }); + + 
expect(result.structuredOutput).toBeUndefined(); + }); + + it('assistant テキストと structured_output を同時に取得する', async () => { + mockQuery.mockReturnValue(createMockQuery([ + { + type: 'assistant', + message: { content: [{ type: 'text', text: 'thinking...' }] }, + }, + { + type: 'result', + subtype: 'success', + result: 'final text', + structured_output: { step: 1, reason: 'approved' }, + }, + ])); + + const executor = new QueryExecutor(); + const result = await executor.execute('test', { cwd: '/tmp' }); + + expect(result.success).toBe(true); + expect(result.content).toBe('final text'); + expect(result.structuredOutput).toEqual({ step: 1, reason: 'approved' }); + }); +}); diff --git a/src/__tests__/claude-provider-abort-signal.test.ts b/src/__tests__/claude-provider-abort-signal.test.ts new file mode 100644 index 0000000..b3f4a8a --- /dev/null +++ b/src/__tests__/claude-provider-abort-signal.test.ts @@ -0,0 +1,52 @@ +import { beforeEach, describe, expect, it, vi } from 'vitest'; +import type { AgentSetup } from '../infra/providers/types.js'; + +const { + mockCallClaude, + mockResolveAnthropicApiKey, +} = vi.hoisted(() => ({ + mockCallClaude: vi.fn(), + mockResolveAnthropicApiKey: vi.fn(), +})); + +vi.mock('../infra/claude/client.js', () => ({ + callClaude: mockCallClaude, + callClaudeCustom: vi.fn(), + callClaudeAgent: vi.fn(), + callClaudeSkill: vi.fn(), +})); + +vi.mock('../infra/config/index.js', () => ({ + resolveAnthropicApiKey: mockResolveAnthropicApiKey, +})); + +import { ClaudeProvider } from '../infra/providers/claude.js'; + +describe('ClaudeProvider abortSignal wiring', () => { + beforeEach(() => { + vi.clearAllMocks(); + mockResolveAnthropicApiKey.mockReturnValue(undefined); + mockCallClaude.mockResolvedValue({ + persona: 'coder', + status: 'done', + content: 'ok', + timestamp: new Date(), + }); + }); + + it('ProviderCallOptions.abortSignal を Claude call options に渡す', async () => { + const provider = new ClaudeProvider(); + const setup: AgentSetup = { name: 
'coder' }; + const agent = provider.setup(setup); + const controller = new AbortController(); + + await agent.call('test prompt', { + cwd: '/tmp/project', + abortSignal: controller.signal, + }); + + expect(mockCallClaude).toHaveBeenCalledTimes(1); + const callOptions = mockCallClaude.mock.calls[0]?.[2]; + expect(callOptions).toHaveProperty('abortSignal', controller.signal); + }); +}); diff --git a/src/__tests__/client.test.ts b/src/__tests__/client.test.ts index fdb0652..a44cfa4 100644 --- a/src/__tests__/client.test.ts +++ b/src/__tests__/client.test.ts @@ -3,10 +3,8 @@ */ import { describe, it, expect } from 'vitest'; -import { - detectRuleIndex, - isRegexSafe, -} from '../infra/claude/client.js'; +import { isRegexSafe } from '../infra/claude/utils.js'; +import { detectRuleIndex } from '../shared/utils/ruleIndex.js'; describe('isRegexSafe', () => { it('should accept simple patterns', () => { diff --git a/src/__tests__/clone.test.ts b/src/__tests__/clone.test.ts index 2b293da..11a5e79 100644 --- a/src/__tests__/clone.test.ts +++ b/src/__tests__/clone.test.ts @@ -207,7 +207,7 @@ describe('branch and worktree path formatting with issue numbers', () => { }); } - it('should format branch as takt/#{issue}/{slug} when issue number is provided', () => { + it('should format branch as takt/{issue}/{slug} when issue number is provided', () => { // Given: issue number 99 with slug setupMockForPathTest(); @@ -219,7 +219,7 @@ describe('branch and worktree path formatting with issue numbers', () => { }); // Then: branch should use issue format - expect(result.branch).toBe('takt/#99/fix-login-timeout'); + expect(result.branch).toBe('takt/99/fix-login-timeout'); }); it('should format branch as takt/{timestamp}-{slug} when no issue number', () => { diff --git a/src/__tests__/codex-structured-output.test.ts b/src/__tests__/codex-structured-output.test.ts new file mode 100644 index 0000000..a262b14 --- /dev/null +++ b/src/__tests__/codex-structured-output.test.ts @@ -0,0 +1,152 @@ 
+/** + * Codex SDK layer structured output tests. + * + * Tests CodexClient's extraction of structuredOutput by parsing + * JSON text from agent_message items when outputSchema is provided. + * + * Codex SDK returns structured output as JSON text in agent_message + * items (not via turn.completed.finalResponse which doesn't exist + * on TurnCompletedEvent). + */ + +import { beforeEach, describe, expect, it, vi } from 'vitest'; + +// ===== Codex SDK mock ===== + +let mockEvents: Array> = []; + +vi.mock('@openai/codex-sdk', () => { + return { + Codex: class MockCodex { + async startThread() { + return { + id: 'thread-mock', + runStreamed: async () => ({ + events: (async function* () { + for (const event of mockEvents) { + yield event; + } + })(), + }), + }; + } + async resumeThread() { + return this.startThread(); + } + }, + }; +}); + +// CodexClient は @openai/codex-sdk をインポートするため、mock 後にインポート +const { CodexClient } = await import('../infra/codex/client.js'); + +describe('CodexClient — structuredOutput 抽出', () => { + beforeEach(() => { + vi.clearAllMocks(); + mockEvents = []; + }); + + it('outputSchema 指定時に agent_message の JSON テキストを structuredOutput として返す', async () => { + const schema = { type: 'object', properties: { step: { type: 'integer' } } }; + mockEvents = [ + { type: 'thread.started', thread_id: 'thread-1' }, + { + type: 'item.completed', + item: { id: 'msg-1', type: 'agent_message', text: '{"step": 2, "reason": "approved"}' }, + }, + { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } }, + ]; + + const client = new CodexClient(); + const result = await client.call('coder', 'prompt', { cwd: '/tmp', outputSchema: schema }); + + expect(result.status).toBe('done'); + expect(result.structuredOutput).toEqual({ step: 2, reason: 'approved' }); + }); + + it('outputSchema なしの場合はテキストを JSON パースしない', async () => { + mockEvents = [ + { type: 'thread.started', thread_id: 'thread-1' }, + { + type: 'item.completed', + item: { id: 
'msg-1', type: 'agent_message', text: '{"step": 2}' }, + }, + { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } }, + ]; + + const client = new CodexClient(); + const result = await client.call('coder', 'prompt', { cwd: '/tmp' }); + + expect(result.status).toBe('done'); + expect(result.structuredOutput).toBeUndefined(); + }); + + it('agent_message が JSON でない場合は undefined', async () => { + const schema = { type: 'object', properties: { step: { type: 'integer' } } }; + mockEvents = [ + { type: 'thread.started', thread_id: 'thread-1' }, + { + type: 'item.completed', + item: { id: 'msg-1', type: 'agent_message', text: 'plain text response' }, + }, + { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } }, + ]; + + const client = new CodexClient(); + const result = await client.call('coder', 'prompt', { cwd: '/tmp', outputSchema: schema }); + + expect(result.status).toBe('done'); + expect(result.structuredOutput).toBeUndefined(); + }); + + it('JSON が配列の場合は無視する', async () => { + const schema = { type: 'object', properties: { step: { type: 'integer' } } }; + mockEvents = [ + { type: 'thread.started', thread_id: 'thread-1' }, + { + type: 'item.completed', + item: { id: 'msg-1', type: 'agent_message', text: '[1, 2, 3]' }, + }, + { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } }, + ]; + + const client = new CodexClient(); + const result = await client.call('coder', 'prompt', { cwd: '/tmp', outputSchema: schema }); + + expect(result.structuredOutput).toBeUndefined(); + }); + + it('agent_message がない場合は structuredOutput なし', async () => { + const schema = { type: 'object', properties: { step: { type: 'integer' } } }; + mockEvents = [ + { type: 'thread.started', thread_id: 'thread-1' }, + { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } }, + ]; + + const client = new CodexClient(); + const result = await 
client.call('coder', 'prompt', { cwd: '/tmp', outputSchema: schema }); + + expect(result.status).toBe('done'); + expect(result.structuredOutput).toBeUndefined(); + }); + + it('outputSchema 付きで呼び出して structuredOutput が返る', async () => { + const schema = { type: 'object', properties: { step: { type: 'integer' } } }; + mockEvents = [ + { type: 'thread.started', thread_id: 'thread-1' }, + { + type: 'item.completed', + item: { id: 'msg-1', type: 'agent_message', text: '{"step": 1}' }, + }, + { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } }, + ]; + + const client = new CodexClient(); + const result = await client.call('coder', 'prompt', { + cwd: '/tmp', + outputSchema: schema, + }); + + expect(result.structuredOutput).toEqual({ step: 1 }); + }); +}); diff --git a/src/__tests__/e2e-helpers.test.ts b/src/__tests__/e2e-helpers.test.ts index 63b395d..f7b25d6 100644 --- a/src/__tests__/e2e-helpers.test.ts +++ b/src/__tests__/e2e-helpers.test.ts @@ -76,7 +76,7 @@ describe('createIsolatedEnv', () => { expect(isolated.env.GIT_CONFIG_GLOBAL).toContain('takt-e2e-'); }); - it('should create config.yaml from E2E fixture with notification_sound timing controls', () => { + it('should create config.yaml from E2E fixture with notification_sound disabled', () => { const isolated = createIsolatedEnv(); cleanups.push(isolated.cleanup); @@ -86,13 +86,13 @@ describe('createIsolatedEnv', () => { expect(config.language).toBe('en'); expect(config.log_level).toBe('info'); expect(config.default_piece).toBe('default'); - expect(config.notification_sound).toBe(true); + expect(config.notification_sound).toBe(false); expect(config.notification_sound_events).toEqual({ iteration_limit: false, piece_complete: false, piece_abort: false, run_complete: true, - run_abort: true, + run_abort: false, }); }); @@ -120,13 +120,13 @@ describe('createIsolatedEnv', () => { expect(config.provider).toBe('mock'); expect(config.concurrency).toBe(2); - 
expect(config.notification_sound).toBe(true); + expect(config.notification_sound).toBe(false); expect(config.notification_sound_events).toEqual({ iteration_limit: false, piece_complete: false, piece_abort: false, run_complete: true, - run_abort: true, + run_abort: false, }); expect(config.language).toBe('en'); }); @@ -149,7 +149,7 @@ describe('createIsolatedEnv', () => { piece_complete: false, piece_abort: false, run_complete: false, - run_abort: true, + run_abort: false, }); }); diff --git a/src/__tests__/engine-abort.test.ts b/src/__tests__/engine-abort.test.ts index dae845c..0d9fdff 100644 --- a/src/__tests__/engine-abort.test.ts +++ b/src/__tests__/engine-abort.test.ts @@ -25,7 +25,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ diff --git a/src/__tests__/engine-arpeggio.test.ts b/src/__tests__/engine-arpeggio.test.ts index 3523c60..35f55b2 100644 --- a/src/__tests__/engine-arpeggio.test.ts +++ b/src/__tests__/engine-arpeggio.test.ts @@ -21,7 +21,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async () => { diff --git a/src/__tests__/engine-blocked.test.ts b/src/__tests__/engine-blocked.test.ts index 8cf10e6..02abc20 100644 --- a/src/__tests__/engine-blocked.test.ts +++ 
b/src/__tests__/engine-blocked.test.ts @@ -23,7 +23,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ diff --git a/src/__tests__/engine-error.test.ts b/src/__tests__/engine-error.test.ts index bcc9ca2..553ec2f 100644 --- a/src/__tests__/engine-error.test.ts +++ b/src/__tests__/engine-error.test.ts @@ -24,7 +24,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ @@ -36,7 +36,7 @@ vi.mock('../shared/utils/index.js', async (importOriginal) => ({ import { PieceEngine } from '../core/piece/index.js'; import { runAgent } from '../agents/runner.js'; -import { detectMatchedRule } from '../core/piece/index.js'; +import { detectMatchedRule } from '../core/piece/evaluation/index.js'; import { makeResponse, makeMovement, diff --git a/src/__tests__/engine-happy-path.test.ts b/src/__tests__/engine-happy-path.test.ts index d067fa4..c42e613 100644 --- a/src/__tests__/engine-happy-path.test.ts +++ b/src/__tests__/engine-happy-path.test.ts @@ -28,7 +28,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: 
vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ diff --git a/src/__tests__/engine-loop-monitors.test.ts b/src/__tests__/engine-loop-monitors.test.ts index e363264..31aff5d 100644 --- a/src/__tests__/engine-loop-monitors.test.ts +++ b/src/__tests__/engine-loop-monitors.test.ts @@ -27,7 +27,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ diff --git a/src/__tests__/engine-parallel-failure.test.ts b/src/__tests__/engine-parallel-failure.test.ts index a48d6c1..2ead682 100644 --- a/src/__tests__/engine-parallel-failure.test.ts +++ b/src/__tests__/engine-parallel-failure.test.ts @@ -23,7 +23,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ @@ -35,7 +35,7 @@ vi.mock('../shared/utils/index.js', async (importOriginal) => ({ import { PieceEngine } from '../core/piece/index.js'; import { runAgent } from '../agents/runner.js'; -import { detectMatchedRule } from '../core/piece/index.js'; +import { detectMatchedRule } from '../core/piece/evaluation/index.js'; import { makeResponse, makeMovement, diff --git 
a/src/__tests__/engine-parallel.test.ts b/src/__tests__/engine-parallel.test.ts index bb5cf77..f86f1bf 100644 --- a/src/__tests__/engine-parallel.test.ts +++ b/src/__tests__/engine-parallel.test.ts @@ -24,7 +24,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ diff --git a/src/__tests__/engine-team-leader.test.ts b/src/__tests__/engine-team-leader.test.ts new file mode 100644 index 0000000..3d0de7e --- /dev/null +++ b/src/__tests__/engine-team-leader.test.ts @@ -0,0 +1,172 @@ +import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'; +import { existsSync, rmSync } from 'node:fs'; +import { runAgent } from '../agents/runner.js'; +import { detectMatchedRule } from '../core/piece/evaluation/index.js'; +import { PieceEngine } from '../core/piece/engine/PieceEngine.js'; +import { makeMovement, makeRule, makeResponse, createTestTmpDir, applyDefaultMocks } from './engine-test-helpers.js'; +import type { PieceConfig } from '../core/models/index.js'; + +vi.mock('../agents/runner.js', () => ({ + runAgent: vi.fn(), +})); + +vi.mock('../core/piece/evaluation/index.js', () => ({ + detectMatchedRule: vi.fn(), +})); + +vi.mock('../core/piece/phase-runner.js', () => ({ + needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), + runReportPhase: vi.fn().mockResolvedValue(undefined), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), +})); + +vi.mock('../shared/utils/index.js', async (importOriginal) => ({ + ...(await importOriginal<Record<string, unknown>>()), + generateReportDir: vi.fn().mockReturnValue('test-report-dir'), +})); + +function
buildTeamLeaderConfig(): PieceConfig { + return { + name: 'team-leader-piece', + initialMovement: 'implement', + maxMovements: 5, + movements: [ + makeMovement('implement', { + instructionTemplate: 'Task: {task}', + teamLeader: { + persona: '../personas/team-leader.md', + maxParts: 3, + timeoutMs: 10000, + partPersona: '../personas/coder.md', + partAllowedTools: ['Read', 'Edit', 'Write'], + partEdit: true, + partPermissionMode: 'edit', + }, + rules: [makeRule('done', 'COMPLETE')], + }), + ], + }; +} + +describe('PieceEngine Integration: TeamLeaderRunner', () => { + let tmpDir: string; + + beforeEach(() => { + vi.resetAllMocks(); + applyDefaultMocks(); + tmpDir = createTestTmpDir(); + }); + + afterEach(() => { + if (existsSync(tmpDir)) { + rmSync(tmpDir, { recursive: true, force: true }); + } + }); + + it('team leaderが分解したパートを並列実行し集約する', async () => { + const config = buildTeamLeaderConfig(); + const engine = new PieceEngine(config, tmpDir, 'implement feature', { projectCwd: tmpDir }); + + vi.mocked(runAgent) + .mockResolvedValueOnce(makeResponse({ + persona: 'team-leader', + content: [ + '```json', + '[{"id":"part-1","title":"API","instruction":"Implement API"},{"id":"part-2","title":"Test","instruction":"Add tests"}]', + '```', + ].join('\n'), + })) + .mockResolvedValueOnce(makeResponse({ persona: 'coder', content: 'API done' })) + .mockResolvedValueOnce(makeResponse({ persona: 'coder', content: 'Tests done' })); + + vi.mocked(detectMatchedRule).mockResolvedValueOnce({ index: 0, method: 'phase1_tag' }); + + const state = await engine.run(); + + expect(state.status).toBe('completed'); + expect(vi.mocked(runAgent)).toHaveBeenCalledTimes(3); + const output = state.movementOutputs.get('implement'); + expect(output).toBeDefined(); + expect(output!.content).toContain('## decomposition'); + expect(output!.content).toContain('## part-1: API'); + expect(output!.content).toContain('API done'); + expect(output!.content).toContain('## part-2: Test'); + 
expect(output!.content).toContain('Tests done'); + }); + + it('全パートが失敗した場合はムーブメント失敗として中断する', async () => { + const config = buildTeamLeaderConfig(); + const engine = new PieceEngine(config, tmpDir, 'implement feature', { projectCwd: tmpDir }); + + vi.mocked(runAgent) + .mockResolvedValueOnce(makeResponse({ + persona: 'team-leader', + content: [ + '```json', + '[{"id":"part-1","title":"API","instruction":"Implement API"},{"id":"part-2","title":"Test","instruction":"Add tests"}]', + '```', + ].join('\n'), + })) + .mockResolvedValueOnce(makeResponse({ persona: 'coder', status: 'error', error: 'api failed' })) + .mockResolvedValueOnce(makeResponse({ persona: 'coder', status: 'error', error: 'test failed' })); + + const state = await engine.run(); + + expect(state.status).toBe('aborted'); + }); + + it('一部パートが失敗しても成功パートがあれば集約結果は完了する', async () => { + const config = buildTeamLeaderConfig(); + const engine = new PieceEngine(config, tmpDir, 'implement feature', { projectCwd: tmpDir }); + + vi.mocked(runAgent) + .mockResolvedValueOnce(makeResponse({ + persona: 'team-leader', + content: [ + '```json', + '[{"id":"part-1","title":"API","instruction":"Implement API"},{"id":"part-2","title":"Test","instruction":"Add tests"}]', + '```', + ].join('\n'), + })) + .mockResolvedValueOnce(makeResponse({ persona: 'coder', content: 'API done' })) + .mockResolvedValueOnce(makeResponse({ persona: 'coder', status: 'error', error: 'test failed' })); + + vi.mocked(detectMatchedRule).mockResolvedValueOnce({ index: 0, method: 'phase1_tag' }); + + const state = await engine.run(); + + expect(state.status).toBe('completed'); + const output = state.movementOutputs.get('implement'); + expect(output).toBeDefined(); + expect(output!.content).toContain('## part-1: API'); + expect(output!.content).toContain('API done'); + expect(output!.content).toContain('## part-2: Test'); + expect(output!.content).toContain('[ERROR] test failed'); + }); + + it('パート失敗時にerrorがなくてもcontentの詳細をエラー表示に使う', async () => { + 
const config = buildTeamLeaderConfig(); + const engine = new PieceEngine(config, tmpDir, 'implement feature', { projectCwd: tmpDir }); + + vi.mocked(runAgent) + .mockResolvedValueOnce(makeResponse({ + persona: 'team-leader', + content: [ + '```json', + '[{"id":"part-1","title":"API","instruction":"Implement API"},{"id":"part-2","title":"Test","instruction":"Add tests"}]', + '```', + ].join('\n'), + })) + .mockResolvedValueOnce(makeResponse({ persona: 'coder', status: 'error', content: 'api failed from content' })) + .mockResolvedValueOnce(makeResponse({ persona: 'coder', content: 'Tests done' })); + + vi.mocked(detectMatchedRule).mockResolvedValueOnce({ index: 0, method: 'phase1_tag' }); + + const state = await engine.run(); + + expect(state.status).toBe('completed'); + const output = state.movementOutputs.get('implement'); + expect(output).toBeDefined(); + expect(output!.content).toContain('[ERROR] api failed from content'); + }); +}); diff --git a/src/__tests__/engine-test-helpers.ts b/src/__tests__/engine-test-helpers.ts index d8c893f..f17dc03 100644 --- a/src/__tests__/engine-test-helpers.ts +++ b/src/__tests__/engine-test-helpers.ts @@ -10,18 +10,21 @@ import { mkdirSync } from 'node:fs'; import { join } from 'node:path'; import { tmpdir } from 'node:os'; import { randomUUID } from 'node:crypto'; -import type { PieceConfig, PieceMovement, AgentResponse, PieceRule } from '../core/models/index.js'; +import type { PieceConfig, PieceMovement, AgentResponse } from '../core/models/index.js'; +import { makeRule } from './test-helpers.js'; // --- Mock imports (consumers must call vi.mock before importing this) --- import { runAgent } from '../agents/runner.js'; -import { detectMatchedRule } from '../core/piece/index.js'; -import type { RuleMatch } from '../core/piece/index.js'; -import { needsStatusJudgmentPhase, runReportPhase, runStatusJudgmentPhase } from '../core/piece/index.js'; +import { detectMatchedRule } from '../core/piece/evaluation/index.js'; +import type 
{ RuleMatch } from '../core/piece/evaluation/index.js'; +import { needsStatusJudgmentPhase, runReportPhase, runStatusJudgmentPhase } from '../core/piece/phase-runner.js'; import { generateReportDir } from '../shared/utils/index.js'; // --- Factory functions --- +export { makeRule }; + export function makeResponse(overrides: Partial<AgentResponse> = {}): AgentResponse { return { persona: 'test-agent', @@ -33,10 +36,6 @@ export function makeResponse(overrides: Partial<AgentResponse> = {}): AgentRespo }; } -export function makeRule(condition: string, next: string, extra: Partial<PieceRule> = {}): PieceRule { - return { condition, next, ...extra }; -} - export function makeMovement(name: string, overrides: Partial<PieceMovement> = {}): PieceMovement { return { name, @@ -174,7 +173,7 @@ export function createTestTmpDir(): string { export function applyDefaultMocks(): void { vi.mocked(needsStatusJudgmentPhase).mockReturnValue(false); vi.mocked(runReportPhase).mockResolvedValue(undefined); - vi.mocked(runStatusJudgmentPhase).mockResolvedValue(''); + vi.mocked(runStatusJudgmentPhase).mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }); vi.mocked(generateReportDir).mockReturnValue('test-report-dir'); } diff --git a/src/__tests__/engine-worktree-report.test.ts b/src/__tests__/engine-worktree-report.test.ts index 1021c0a..f90084f 100644 --- a/src/__tests__/engine-worktree-report.test.ts +++ b/src/__tests__/engine-worktree-report.test.ts @@ -24,7 +24,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ @@ -35,7 +35,7 @@ vi.mock('../shared/utils/index.js', async (importOriginal) => ({ // --- Imports (after mocks) --- import {
PieceEngine } from '../core/piece/index.js'; -import { runReportPhase } from '../core/piece/index.js'; +import { runReportPhase } from '../core/piece/phase-runner.js'; import { makeResponse, makeMovement, diff --git a/src/__tests__/escape.test.ts b/src/__tests__/escape.test.ts index 081c643..8c97a04 100644 --- a/src/__tests__/escape.test.ts +++ b/src/__tests__/escape.test.ts @@ -9,31 +9,7 @@ import { escapeTemplateChars, replaceTemplatePlaceholders, } from '../core/piece/instruction/escape.js'; -import type { PieceMovement } from '../core/models/types.js'; -import type { InstructionContext } from '../core/piece/instruction/instruction-context.js'; - -function makeMovement(overrides: Partial<PieceMovement> = {}): PieceMovement { - return { - name: 'test-movement', - personaDisplayName: 'tester', - instructionTemplate: '', - passPreviousResponse: false, - ...overrides, - }; -} - -function makeContext(overrides: Partial<InstructionContext> = {}): InstructionContext { - return { - task: 'test task', - iteration: 1, - maxMovements: 10, - movementIteration: 1, - cwd: '/tmp/test', - projectCwd: '/tmp/project', - userInputs: [], - ...overrides, - }; -} +import { makeMovement, makeInstructionContext } from './test-helpers.js'; describe('escapeTemplateChars', () => { it('should replace curly braces with full-width equivalents', () => { @@ -62,7 +38,7 @@ describe('escapeTemplateChars', () => { describe('replaceTemplatePlaceholders', () => { it('should replace {task} placeholder', () => { const step = makeMovement(); - const ctx = makeContext({ task: 'implement feature X' }); + const ctx = makeInstructionContext({ task: 'implement feature X' }); const template = 'Your task is: {task}'; const result = replaceTemplatePlaceholders(template, step, ctx); @@ -71,7 +47,7 @@ describe('replaceTemplatePlaceholders', () => { it('should escape braces in task content', () => { const step = makeMovement(); - const ctx = makeContext({ task: 'fix {bug} in code' }); + const ctx = makeInstructionContext({ task: 'fix {bug} in
code' }); const template = '{task}'; const result = replaceTemplatePlaceholders(template, step, ctx); @@ -80,7 +56,7 @@ describe('replaceTemplatePlaceholders', () => { it('should replace {iteration} and {max_movements}', () => { const step = makeMovement(); - const ctx = makeContext({ iteration: 3, maxMovements: 20 }); + const ctx = makeInstructionContext({ iteration: 3, maxMovements: 20 }); const template = 'Iteration {iteration}/{max_movements}'; const result = replaceTemplatePlaceholders(template, step, ctx); @@ -89,7 +65,7 @@ describe('replaceTemplatePlaceholders', () => { it('should replace {movement_iteration}', () => { const step = makeMovement(); - const ctx = makeContext({ movementIteration: 5 }); + const ctx = makeInstructionContext({ movementIteration: 5 }); const template = 'Movement run #{movement_iteration}'; const result = replaceTemplatePlaceholders(template, step, ctx); @@ -98,7 +74,7 @@ describe('replaceTemplatePlaceholders', () => { it('should replace {previous_response} when passPreviousResponse is true', () => { const step = makeMovement({ passPreviousResponse: true }); - const ctx = makeContext({ + const ctx = makeInstructionContext({ previousOutput: { persona: 'coder', status: 'done', @@ -114,7 +90,7 @@ describe('replaceTemplatePlaceholders', () => { it('should prefer preprocessed previous response text when provided', () => { const step = makeMovement({ passPreviousResponse: true }); - const ctx = makeContext({ + const ctx = makeInstructionContext({ previousOutput: { persona: 'coder', status: 'done', @@ -131,7 +107,7 @@ describe('replaceTemplatePlaceholders', () => { it('should replace {previous_response} with empty string when no previous output', () => { const step = makeMovement({ passPreviousResponse: true }); - const ctx = makeContext(); + const ctx = makeInstructionContext(); const template = 'Previous: {previous_response}'; const result = replaceTemplatePlaceholders(template, step, ctx); @@ -140,7 +116,7 @@ 
describe('replaceTemplatePlaceholders', () => { it('should not replace {previous_response} when passPreviousResponse is false', () => { const step = makeMovement({ passPreviousResponse: false }); - const ctx = makeContext({ + const ctx = makeInstructionContext({ previousOutput: { persona: 'coder', status: 'done', @@ -156,7 +132,7 @@ describe('replaceTemplatePlaceholders', () => { it('should replace {user_inputs} with joined inputs', () => { const step = makeMovement(); - const ctx = makeContext({ userInputs: ['input 1', 'input 2', 'input 3'] }); + const ctx = makeInstructionContext({ userInputs: ['input 1', 'input 2', 'input 3'] }); const template = 'Inputs: {user_inputs}'; const result = replaceTemplatePlaceholders(template, step, ctx); @@ -165,7 +141,7 @@ describe('replaceTemplatePlaceholders', () => { it('should replace {report_dir} with report directory', () => { const step = makeMovement(); - const ctx = makeContext({ reportDir: '/tmp/reports/run-1' }); + const ctx = makeInstructionContext({ reportDir: '/tmp/reports/run-1' }); const template = 'Reports: {report_dir}'; const result = replaceTemplatePlaceholders(template, step, ctx); @@ -174,7 +150,7 @@ describe('replaceTemplatePlaceholders', () => { it('should replace {report:filename} with full path', () => { const step = makeMovement(); - const ctx = makeContext({ reportDir: '/tmp/reports' }); + const ctx = makeInstructionContext({ reportDir: '/tmp/reports' }); const template = 'Read {report:review.md} and {report:plan.md}'; const result = replaceTemplatePlaceholders(template, step, ctx); @@ -183,7 +159,7 @@ describe('replaceTemplatePlaceholders', () => { it('should handle template with multiple different placeholders', () => { const step = makeMovement(); - const ctx = makeContext({ + const ctx = makeInstructionContext({ task: 'test task', iteration: 2, maxMovements: 5, @@ -198,7 +174,7 @@ describe('replaceTemplatePlaceholders', () => { it('should leave unreplaced placeholders when reportDir is undefined', 
() => { const step = makeMovement(); - const ctx = makeContext({ reportDir: undefined }); + const ctx = makeInstructionContext({ reportDir: undefined }); const template = 'Dir: {report_dir} File: {report:test.md}'; const result = replaceTemplatePlaceholders(template, step, ctx); diff --git a/src/__tests__/i18n.test.ts b/src/__tests__/i18n.test.ts index bed0d12..5e962b7 100644 --- a/src/__tests__/i18n.test.ts +++ b/src/__tests__/i18n.test.ts @@ -114,6 +114,7 @@ describe('label integrity', () => { expect(() => getLabel('piece.notifyComplete')).not.toThrow(); expect(() => getLabel('piece.notifyAbort')).not.toThrow(); expect(() => getLabel('piece.sigintGraceful')).not.toThrow(); + expect(() => getLabel('piece.sigintTimeout')).not.toThrow(); expect(() => getLabel('piece.sigintForce')).not.toThrow(); }); diff --git a/src/__tests__/instruction-helpers.test.ts b/src/__tests__/instruction-helpers.test.ts index ee75ba9..dcb6c50 100644 --- a/src/__tests__/instruction-helpers.test.ts +++ b/src/__tests__/instruction-helpers.test.ts @@ -10,31 +10,8 @@ import { renderReportContext, renderReportOutputInstruction, } from '../core/piece/instruction/InstructionBuilder.js'; -import type { PieceMovement, OutputContractEntry } from '../core/models/types.js'; -import type { InstructionContext } from '../core/piece/instruction/instruction-context.js'; - -function makeMovement(overrides: Partial<PieceMovement> = {}): PieceMovement { - return { - name: 'test-movement', - personaDisplayName: 'tester', - instructionTemplate: '', - passPreviousResponse: false, - ...overrides, - }; -} - -function makeContext(overrides: Partial<InstructionContext> = {}): InstructionContext { - return { - task: 'test task', - iteration: 1, - maxMovements: 10, - movementIteration: 1, - cwd: '/tmp/test', - projectCwd: '/tmp/project', - userInputs: [], - ...overrides, - }; -} +import type { OutputContractEntry } from '../core/models/types.js'; +import { makeMovement, makeInstructionContext } from './test-helpers.js'; describe('isOutputContractItem',
() => { it('should return true for OutputContractItem (has name)', () => { @@ -84,19 +61,19 @@ describe('renderReportContext', () => { describe('renderReportOutputInstruction', () => { it('should return empty string when no output contracts', () => { const step = makeMovement(); - const ctx = makeContext({ reportDir: '/tmp/reports' }); + const ctx = makeInstructionContext({ reportDir: '/tmp/reports' }); expect(renderReportOutputInstruction(step, ctx, 'en')).toBe(''); }); it('should return empty string when no reportDir', () => { const step = makeMovement({ outputContracts: [{ name: 'report.md' }] }); - const ctx = makeContext(); + const ctx = makeInstructionContext(); expect(renderReportOutputInstruction(step, ctx, 'en')).toBe(''); }); it('should render English single-file instruction', () => { const step = makeMovement({ outputContracts: [{ name: 'report.md' }] }); - const ctx = makeContext({ reportDir: '/tmp/reports', movementIteration: 2 }); + const ctx = makeInstructionContext({ reportDir: '/tmp/reports', movementIteration: 2 }); const result = renderReportOutputInstruction(step, ctx, 'en'); expect(result).toContain('Report output'); @@ -108,7 +85,7 @@ describe('renderReportOutputInstruction', () => { const step = makeMovement({ outputContracts: [{ name: 'plan.md' }, { name: 'review.md' }], }); - const ctx = makeContext({ reportDir: '/tmp/reports' }); + const ctx = makeInstructionContext({ reportDir: '/tmp/reports' }); const result = renderReportOutputInstruction(step, ctx, 'en'); expect(result).toContain('Report Files'); @@ -116,7 +93,7 @@ describe('renderReportOutputInstruction', () => { it('should render Japanese single-file instruction', () => { const step = makeMovement({ outputContracts: [{ name: 'report.md' }] }); - const ctx = makeContext({ reportDir: '/tmp/reports', movementIteration: 1 }); + const ctx = makeInstructionContext({ reportDir: '/tmp/reports', movementIteration: 1 }); const result = renderReportOutputInstruction(step, ctx, 'ja'); 
expect(result).toContain('レポート出力'); @@ -128,7 +105,7 @@ describe('renderReportOutputInstruction', () => { const step = makeMovement({ outputContracts: [{ name: 'plan.md' }, { name: 'review.md' }], }); - const ctx = makeContext({ reportDir: '/tmp/reports' }); + const ctx = makeInstructionContext({ reportDir: '/tmp/reports' }); const result = renderReportOutputInstruction(step, ctx, 'ja'); expect(result).toContain('Report Files'); diff --git a/src/__tests__/it-error-recovery.test.ts b/src/__tests__/it-error-recovery.test.ts index 0635fc0..75199ba 100644 --- a/src/__tests__/it-error-recovery.test.ts +++ b/src/__tests__/it-error-recovery.test.ts @@ -14,7 +14,8 @@ import { join } from 'node:path'; import { tmpdir } from 'node:os'; import { setMockScenario, resetScenario } from '../infra/mock/index.js'; import type { PieceConfig, PieceMovement, PieceRule } from '../core/models/index.js'; -import { detectRuleIndex } from '../infra/claude/index.js'; +import { detectRuleIndex } from '../shared/utils/ruleIndex.js'; +import { makeRule } from './test-helpers.js'; import { callAiJudge } from '../agents/ai-judge.js'; // --- Mocks --- @@ -30,7 +31,7 @@ vi.mock('../agents/ai-judge.js', async (importOriginal) => { vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ @@ -56,10 +57,6 @@ import { PieceEngine } from '../core/piece/index.js'; // --- Test helpers --- -function makeRule(condition: string, next: string): PieceRule { - return { condition, next }; -} - function makeMovement(name: string, agentPath: string, rules: PieceRule[]): PieceMovement { return { name, diff --git a/src/__tests__/it-instruction-builder.test.ts 
b/src/__tests__/it-instruction-builder.test.ts index 3753b10..dac6b25 100644 --- a/src/__tests__/it-instruction-builder.test.ts +++ b/src/__tests__/it-instruction-builder.test.ts @@ -8,7 +8,8 @@ */ import { describe, it, expect, vi } from 'vitest'; -import type { PieceMovement, PieceRule, AgentResponse } from '../core/models/index.js'; +import type { PieceMovement, AgentResponse } from '../core/models/index.js'; +import { makeRule } from './test-helpers.js'; vi.mock('../infra/config/global/globalConfig.js', () => ({ loadGlobalConfig: vi.fn().mockReturnValue({}), @@ -34,10 +35,6 @@ function buildStatusJudgmentInstruction(movement: PieceMovement, ctx: StatusJudg // --- Test helpers --- -function makeRule(condition: string, next: string, extra?: Partial<PieceRule>): PieceRule { - return { condition, next, ...extra }; -} - function makeMovement(overrides: Partial<PieceMovement> = {}): PieceMovement { return { name: 'test-step', diff --git a/src/__tests__/it-notification-sound.test.ts b/src/__tests__/it-notification-sound.test.ts index ce54a0f..5f4d4a0 100644 --- a/src/__tests__/it-notification-sound.test.ts +++ b/src/__tests__/it-notification-sound.test.ts @@ -104,8 +104,7 @@ vi.mock('../core/piece/index.js', () => ({ PieceEngine: MockPieceEngine, })); -vi.mock('../infra/claude/index.js', () => ({ - detectRuleIndex: vi.fn(), +vi.mock('../infra/claude/query-manager.js', () => ({ interruptAllQueries: mockInterruptAllQueries, })); diff --git a/src/__tests__/it-piece-execution.test.ts b/src/__tests__/it-piece-execution.test.ts index 3539bb7..912fa8b 100644 --- a/src/__tests__/it-piece-execution.test.ts +++ b/src/__tests__/it-piece-execution.test.ts @@ -15,7 +15,8 @@ import { join } from 'node:path'; import { tmpdir } from 'node:os'; import { setMockScenario, resetScenario } from '../infra/mock/index.js'; import type { PieceConfig, PieceMovement, PieceRule } from '../core/models/index.js'; -import { detectRuleIndex } from '../infra/claude/index.js'; +import { detectRuleIndex } from
'../shared/utils/ruleIndex.js'; +import { makeRule } from './test-helpers.js'; import { callAiJudge } from '../agents/ai-judge.js'; // --- Mocks (minimal — only infrastructure, not core logic) --- @@ -34,7 +35,7 @@ vi.mock('../agents/ai-judge.js', async (importOriginal) => { vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ @@ -59,10 +60,6 @@ import { PieceEngine } from '../core/piece/index.js'; // --- Test helpers --- -function makeRule(condition: string, next: string): PieceRule { - return { condition, next }; -} - function makeMovement(name: string, agentPath: string, rules: PieceRule[]): PieceMovement { return { name, diff --git a/src/__tests__/it-piece-patterns.test.ts b/src/__tests__/it-piece-patterns.test.ts index 4ea6d59..bd99736 100644 --- a/src/__tests__/it-piece-patterns.test.ts +++ b/src/__tests__/it-piece-patterns.test.ts @@ -13,7 +13,7 @@ import { mkdtempSync, mkdirSync, rmSync } from 'node:fs'; import { join } from 'node:path'; import { tmpdir } from 'node:os'; import { setMockScenario, resetScenario } from '../infra/mock/index.js'; -import { detectRuleIndex } from '../infra/claude/index.js'; +import { detectRuleIndex } from '../shared/utils/ruleIndex.js'; import { callAiJudge } from '../agents/ai-judge.js'; // --- Mocks --- @@ -37,7 +37,7 @@ vi.mock('../agents/ai-judge.js', async (importOriginal) => { vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' 
}), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ diff --git a/src/__tests__/it-pipeline-modes.test.ts b/src/__tests__/it-pipeline-modes.test.ts index 0916bef..a381483 100644 --- a/src/__tests__/it-pipeline-modes.test.ts +++ b/src/__tests__/it-pipeline-modes.test.ts @@ -144,7 +144,7 @@ vi.mock('../shared/prompt/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); // --- Imports (after mocks) --- diff --git a/src/__tests__/it-pipeline.test.ts b/src/__tests__/it-pipeline.test.ts index ad02723..5743f29 100644 --- a/src/__tests__/it-pipeline.test.ts +++ b/src/__tests__/it-pipeline.test.ts @@ -125,7 +125,7 @@ vi.mock('../shared/prompt/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); // --- Imports (after mocks) --- diff --git a/src/__tests__/it-rule-evaluation.test.ts b/src/__tests__/it-rule-evaluation.test.ts index 5bd728f..57ec9cc 100644 --- a/src/__tests__/it-rule-evaluation.test.ts +++ b/src/__tests__/it-rule-evaluation.test.ts @@ -16,6 +16,7 @@ import { describe, it, expect, beforeEach, vi } from 'vitest'; import type { PieceMovement, PieceState, PieceRule, AgentResponse } from '../core/models/index.js'; +import { makeRule } from './test-helpers.js'; // --- Mocks --- @@ -33,16 +34,13 @@ vi.mock('../infra/config/project/projectConfig.js', () => ({ // --- Imports (after mocks) --- -import { detectMatchedRule, evaluateAggregateConditions } from 
'../core/piece/index.js'; -import { detectRuleIndex } from '../infra/claude/index.js'; +import { evaluateAggregateConditions } from '../core/piece/index.js'; +import { detectMatchedRule } from '../core/piece/evaluation/index.js'; +import { detectRuleIndex } from '../shared/utils/ruleIndex.js'; import type { RuleMatch, RuleEvaluatorContext } from '../core/piece/index.js'; // --- Test helpers --- -function makeRule(condition: string, next: string, extra?: Partial<PieceRule>): PieceRule { - return { condition, next, ...extra }; -} - function makeMovement( name: string, rules: PieceRule[], diff --git a/src/__tests__/it-sigint-interrupt.test.ts b/src/__tests__/it-sigint-interrupt.test.ts index 28abafe..e15226b 100644 --- a/src/__tests__/it-sigint-interrupt.test.ts +++ b/src/__tests__/it-sigint-interrupt.test.ts @@ -74,8 +74,8 @@ vi.mock('../core/piece/index.js', () => ({ PieceEngine: MockPieceEngine, })); -vi.mock('../infra/claude/index.js', () => ({ - detectRuleIndex: vi.fn(), +vi.mock('../infra/claude/query-manager.js', async (importOriginal) => ({ + ...(await importOriginal>()), interruptAllQueries: mockInterruptAllQueries, })); diff --git a/src/__tests__/it-three-phase-execution.test.ts b/src/__tests__/it-three-phase-execution.test.ts index a7e250a..d5b173e 100644 --- a/src/__tests__/it-three-phase-execution.test.ts +++ b/src/__tests__/it-three-phase-execution.test.ts @@ -15,7 +15,8 @@ import { join } from 'node:path'; import { tmpdir } from 'node:os'; import { setMockScenario, resetScenario } from '../infra/mock/index.js'; import type { PieceConfig, PieceMovement, PieceRule } from '../core/models/index.js'; -import { detectRuleIndex } from '../infra/claude/index.js'; +import { detectRuleIndex } from '../shared/utils/ruleIndex.js'; +import { makeRule } from './test-helpers.js'; import { callAiJudge } from '../agents/ai-judge.js'; // --- Mocks --- @@ -61,10 +62,6 @@ import { PieceEngine } from '../core/piece/index.js'; // --- Test helpers --- -function makeRule(condition:
string, next: string): PieceRule { - return { condition, next }; -} - function createTestEnv(): { dir: string; agentPath: string } { const dir = mkdtempSync(join(tmpdir(), 'takt-it-3ph-')); mkdirSync(join(dir, '.takt', 'reports', 'test-report-dir'), { recursive: true }); @@ -117,7 +114,7 @@ describe('Three-Phase Execution IT: phase1 only (no report, no tag rules)', () = // No tag rules needed → Phase 3 not needed mockNeedsStatusJudgmentPhase.mockReturnValue(false); mockRunReportPhase.mockResolvedValue(undefined); - mockRunStatusJudgmentPhase.mockResolvedValue(''); + mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }); }); afterEach(() => { @@ -169,7 +166,7 @@ describe('Three-Phase Execution IT: phase1 + phase2 (report defined)', () => { mockNeedsStatusJudgmentPhase.mockReturnValue(false); mockRunReportPhase.mockResolvedValue(undefined); - mockRunStatusJudgmentPhase.mockResolvedValue(''); + mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }); }); afterEach(() => { @@ -249,7 +246,7 @@ describe('Three-Phase Execution IT: phase1 + phase3 (tag rules defined)', () => mockNeedsStatusJudgmentPhase.mockReturnValue(true); mockRunReportPhase.mockResolvedValue(undefined); // Phase 3 returns content with a tag - mockRunStatusJudgmentPhase.mockResolvedValue('[STEP:1]'); + mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '[STEP:1]', ruleIndex: 0, method: 'structured_output' }); }); afterEach(() => { @@ -301,7 +298,7 @@ describe('Three-Phase Execution IT: all three phases', () => { mockNeedsStatusJudgmentPhase.mockReturnValue(true); mockRunReportPhase.mockResolvedValue(undefined); - mockRunStatusJudgmentPhase.mockResolvedValue('[STEP:1]'); + mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '[STEP:1]', ruleIndex: 0, method: 'structured_output' }); }); afterEach(() => { @@ -372,7 +369,7 @@ describe('Three-Phase Execution IT: phase3 tag → rule match', () => { ]); // Phase 3 returns rule 2 
(ABORT) - mockRunStatusJudgmentPhase.mockResolvedValue('[STEP1:2]'); + mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '[STEP1:2]', ruleIndex: 1, method: 'structured_output' }); const config: PieceConfig = { name: 'it-phase3-tag', diff --git a/src/__tests__/judgment-detector.test.ts b/src/__tests__/judgment-detector.test.ts deleted file mode 100644 index 1bd198c..0000000 --- a/src/__tests__/judgment-detector.test.ts +++ /dev/null @@ -1,70 +0,0 @@ -/** - * Test for JudgmentDetector - */ - -import { describe, it, expect } from 'vitest'; -import { JudgmentDetector } from '../core/piece/judgment/JudgmentDetector.js'; - -describe('JudgmentDetector', () => { - describe('detect', () => { - it('should detect tag in simple response', () => { - const result = JudgmentDetector.detect('[ARCH-REVIEW:1]'); - expect(result.success).toBe(true); - expect(result.tag).toBe('[ARCH-REVIEW:1]'); - }); - - it('should detect tag with surrounding text', () => { - const result = JudgmentDetector.detect('Based on the review, I choose [MOVEMENT:2] because...'); - expect(result.success).toBe(true); - expect(result.tag).toBe('[MOVEMENT:2]'); - }); - - it('should detect tag with hyphenated movement name', () => { - const result = JudgmentDetector.detect('[AI-ANTIPATTERN-REVIEW:1]'); - expect(result.success).toBe(true); - expect(result.tag).toBe('[AI-ANTIPATTERN-REVIEW:1]'); - }); - - it('should detect tag with underscored movement name', () => { - const result = JudgmentDetector.detect('[AI_REVIEW:1]'); - expect(result.success).toBe(true); - expect(result.tag).toBe('[AI_REVIEW:1]'); - }); - - it('should detect "判断できない" (Japanese)', () => { - const result = JudgmentDetector.detect('判断できない:情報が不足しています'); - expect(result.success).toBe(false); - expect(result.reason).toBe('Conductor explicitly stated it cannot judge'); - }); - - it('should detect "Cannot determine" (English)', () => { - const result = JudgmentDetector.detect('Cannot determine: Insufficient information'); - 
expect(result.success).toBe(false); - expect(result.reason).toBe('Conductor explicitly stated it cannot judge'); - }); - - it('should detect "unable to judge"', () => { - const result = JudgmentDetector.detect('I am unable to judge based on the provided information.'); - expect(result.success).toBe(false); - expect(result.reason).toBe('Conductor explicitly stated it cannot judge'); - }); - - it('should fail when no tag and no explicit "cannot judge"', () => { - const result = JudgmentDetector.detect('This is a response without a tag or explicit statement.'); - expect(result.success).toBe(false); - expect(result.reason).toBe('No tag found and no explicit "cannot judge" statement'); - }); - - it('should fail on empty response', () => { - const result = JudgmentDetector.detect(''); - expect(result.success).toBe(false); - expect(result.reason).toBe('No tag found and no explicit "cannot judge" statement'); - }); - - it('should detect first tag when multiple tags exist', () => { - const result = JudgmentDetector.detect('[MOVEMENT:1] or [MOVEMENT:2]'); - expect(result.success).toBe(true); - expect(result.tag).toBe('[MOVEMENT:1]'); - }); - }); -}); diff --git a/src/__tests__/judgment-fallback.test.ts b/src/__tests__/judgment-fallback.test.ts deleted file mode 100644 index 0d7d560..0000000 --- a/src/__tests__/judgment-fallback.test.ts +++ /dev/null @@ -1,183 +0,0 @@ -/** - * Test for Fallback Strategies - */ - -import { describe, it, expect, vi, beforeEach } from 'vitest'; -import { mkdtempSync, mkdirSync, rmSync, writeFileSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import type { PieceMovement } from '../core/models/types.js'; -import type { JudgmentContext } from '../core/piece/judgment/FallbackStrategy.js'; -import { runAgent } from '../agents/runner.js'; -import { - AutoSelectStrategy, - ReportBasedStrategy, - ResponseBasedStrategy, - AgentConsultStrategy, - JudgmentStrategyFactory, -} from 
'../core/piece/judgment/FallbackStrategy.js'; - -// Mock runAgent -vi.mock('../agents/runner.js', () => ({ - runAgent: vi.fn(), -})); - -describe('JudgmentStrategies', () => { - const mockStep: PieceMovement = { - name: 'test-movement', - persona: 'test-agent', - rules: [ - { description: 'Rule 1', condition: 'approved' }, - { description: 'Rule 2', condition: 'rejected' }, - ], - }; - - const mockContext: JudgmentContext = { - step: mockStep, - cwd: '/test/cwd', - language: 'en', - reportDir: '/test/reports', - lastResponse: 'Last response content', - sessionId: 'session-123', - }; - - beforeEach(() => { - vi.clearAllMocks(); - }); - - describe('AutoSelectStrategy', () => { - it('should apply when step has only one rule', () => { - const singleRuleStep: PieceMovement = { - name: 'single-rule', - rules: [{ description: 'Only rule', condition: 'always' }], - }; - const strategy = new AutoSelectStrategy(); - expect(strategy.canApply({ ...mockContext, step: singleRuleStep })).toBe(true); - }); - - it('should not apply when step has multiple rules', () => { - const strategy = new AutoSelectStrategy(); - expect(strategy.canApply(mockContext)).toBe(false); - }); - - it('should return auto-selected tag', async () => { - const singleRuleStep: PieceMovement = { - name: 'single-rule', - rules: [{ description: 'Only rule', condition: 'always' }], - }; - const strategy = new AutoSelectStrategy(); - const result = await strategy.execute({ ...mockContext, step: singleRuleStep }); - expect(result.success).toBe(true); - expect(result.tag).toBe('[SINGLE-RULE:1]'); - }); - }); - - describe('ReportBasedStrategy', () => { - it('should apply when reportDir and output contracts are configured', () => { - const strategy = new ReportBasedStrategy(); - const stepWithOutputContracts: PieceMovement = { - ...mockStep, - outputContracts: [{ label: 'review', path: 'review-report.md' }], - }; - expect(strategy.canApply({ ...mockContext, step: stepWithOutputContracts })).toBe(true); - }); - - 
it('should not apply when reportDir is missing', () => { - const strategy = new ReportBasedStrategy(); - expect(strategy.canApply({ ...mockContext, reportDir: undefined })).toBe(false); - }); - - it('should not apply when step has no output contracts configured', () => { - const strategy = new ReportBasedStrategy(); - // mockStep has no outputContracts field → getReportFiles returns [] - expect(strategy.canApply(mockContext)).toBe(false); - }); - - it('should use only latest report file from reports directory', async () => { - const tmpRoot = mkdtempSync(join(tmpdir(), 'takt-judgment-report-')); - try { - const reportDir = join(tmpRoot, '.takt', 'runs', 'sample-run', 'reports'); - const historyDir = join(tmpRoot, '.takt', 'runs', 'sample-run', 'logs', 'reports-history'); - mkdirSync(reportDir, { recursive: true }); - mkdirSync(historyDir, { recursive: true }); - - const latestFile = '05-architect-review.md'; - writeFileSync(join(reportDir, latestFile), 'LATEST-ONLY-CONTENT'); - writeFileSync(join(historyDir, '05-architect-review.20260210T061143Z.md'), 'OLD-HISTORY-CONTENT'); - - const stepWithOutputContracts: PieceMovement = { - ...mockStep, - outputContracts: [{ name: latestFile }], - }; - - const runAgentMock = vi.mocked(runAgent); - runAgentMock.mockResolvedValue({ - persona: 'conductor', - status: 'done', - content: '[TEST-MOVEMENT:1]', - timestamp: new Date('2026-02-10T07:11:43Z'), - }); - - const strategy = new ReportBasedStrategy(); - const result = await strategy.execute({ - ...mockContext, - step: stepWithOutputContracts, - reportDir, - }); - - expect(result.success).toBe(true); - expect(runAgentMock).toHaveBeenCalledTimes(1); - const instruction = runAgentMock.mock.calls[0]?.[1]; - expect(instruction).toContain('LATEST-ONLY-CONTENT'); - expect(instruction).not.toContain('OLD-HISTORY-CONTENT'); - } finally { - rmSync(tmpRoot, { recursive: true, force: true }); - } - }); - }); - - describe('ResponseBasedStrategy', () => { - it('should apply when 
lastResponse is provided', () => { - const strategy = new ResponseBasedStrategy(); - expect(strategy.canApply(mockContext)).toBe(true); - }); - - it('should not apply when lastResponse is missing', () => { - const strategy = new ResponseBasedStrategy(); - expect(strategy.canApply({ ...mockContext, lastResponse: undefined })).toBe(false); - }); - - it('should not apply when lastResponse is empty', () => { - const strategy = new ResponseBasedStrategy(); - expect(strategy.canApply({ ...mockContext, lastResponse: '' })).toBe(false); - }); - }); - - describe('AgentConsultStrategy', () => { - it('should apply when sessionId is provided', () => { - const strategy = new AgentConsultStrategy(); - expect(strategy.canApply(mockContext)).toBe(true); - }); - - it('should not apply when sessionId is missing', () => { - const strategy = new AgentConsultStrategy(); - expect(strategy.canApply({ ...mockContext, sessionId: undefined })).toBe(false); - }); - - it('should not apply when sessionId is empty', () => { - const strategy = new AgentConsultStrategy(); - expect(strategy.canApply({ ...mockContext, sessionId: '' })).toBe(false); - }); - }); - - describe('JudgmentStrategyFactory', () => { - it('should create strategies in correct order', () => { - const strategies = JudgmentStrategyFactory.createStrategies(); - expect(strategies).toHaveLength(4); - expect(strategies[0]).toBeInstanceOf(AutoSelectStrategy); - expect(strategies[1]).toBeInstanceOf(ReportBasedStrategy); - expect(strategies[2]).toBeInstanceOf(ResponseBasedStrategy); - expect(strategies[3]).toBeInstanceOf(AgentConsultStrategy); - }); - }); -}); diff --git a/src/__tests__/judgment-strategies.test.ts b/src/__tests__/judgment-strategies.test.ts deleted file mode 100644 index 927c1e9..0000000 --- a/src/__tests__/judgment-strategies.test.ts +++ /dev/null @@ -1,204 +0,0 @@ -/** - * Unit tests for FallbackStrategy judgment strategies - * - * Tests AutoSelectStrategy and canApply logic for all strategies. 
- * Strategies requiring external agent calls (ReportBased, ResponseBased, - * AgentConsult) are tested for canApply and input validation only. - */ - -import { describe, it, expect } from 'vitest'; -import { - AutoSelectStrategy, - ReportBasedStrategy, - ResponseBasedStrategy, - AgentConsultStrategy, - JudgmentStrategyFactory, - type JudgmentContext, -} from '../core/piece/judgment/FallbackStrategy.js'; -import type { PieceMovement } from '../core/models/types.js'; - -function makeMovement(overrides: Partial<PieceMovement> = {}): PieceMovement { - return { - name: 'test-movement', - personaDisplayName: 'tester', - instructionTemplate: '', - passPreviousResponse: false, - ...overrides, - }; -} - -function makeContext(overrides: Partial<JudgmentContext> = {}): JudgmentContext { - return { - step: makeMovement(), - cwd: '/tmp/test', - ...overrides, - }; -} - -describe('AutoSelectStrategy', () => { - const strategy = new AutoSelectStrategy(); - - it('should have name "AutoSelect"', () => { - expect(strategy.name).toBe('AutoSelect'); - }); - - describe('canApply', () => { - it('should return true when movement has exactly one rule', () => { - const ctx = makeContext({ - step: makeMovement({ - rules: [{ condition: 'done', next: 'COMPLETE' }], - }), - }); - expect(strategy.canApply(ctx)).toBe(true); - }); - - it('should return false when movement has multiple rules', () => { - const ctx = makeContext({ - step: makeMovement({ - rules: [ - { condition: 'approved', next: 'implement' }, - { condition: 'rejected', next: 'review' }, - ], - }), - }); - expect(strategy.canApply(ctx)).toBe(false); - }); - - it('should return false when movement has no rules', () => { - const ctx = makeContext({ - step: makeMovement({ rules: undefined }), - }); - expect(strategy.canApply(ctx)).toBe(false); - }); - }); - - describe('execute', () => { - it('should return auto-selected tag for single-branch movement', async () => { - const ctx = makeContext({ - step: makeMovement({ - name: 'review', - rules: [{ condition: 'done',
next: 'COMPLETE' }], - }), - }); - - const result = await strategy.execute(ctx); - expect(result.success).toBe(true); - expect(result.tag).toBe('[REVIEW:1]'); - }); - }); -}); - -describe('ReportBasedStrategy', () => { - const strategy = new ReportBasedStrategy(); - - it('should have name "ReportBased"', () => { - expect(strategy.name).toBe('ReportBased'); - }); - - describe('canApply', () => { - it('should return true when reportDir and outputContracts are present', () => { - const ctx = makeContext({ - reportDir: '/tmp/reports', - step: makeMovement({ - outputContracts: [{ name: 'report.md' }], - }), - }); - expect(strategy.canApply(ctx)).toBe(true); - }); - - it('should return false when reportDir is missing', () => { - const ctx = makeContext({ - step: makeMovement({ - outputContracts: [{ name: 'report.md' }], - }), - }); - expect(strategy.canApply(ctx)).toBe(false); - }); - - it('should return false when outputContracts is empty', () => { - const ctx = makeContext({ - reportDir: '/tmp/reports', - step: makeMovement({ outputContracts: [] }), - }); - expect(strategy.canApply(ctx)).toBe(false); - }); - - it('should return false when outputContracts is undefined', () => { - const ctx = makeContext({ - reportDir: '/tmp/reports', - step: makeMovement(), - }); - expect(strategy.canApply(ctx)).toBe(false); - }); - }); -}); - -describe('ResponseBasedStrategy', () => { - const strategy = new ResponseBasedStrategy(); - - it('should have name "ResponseBased"', () => { - expect(strategy.name).toBe('ResponseBased'); - }); - - describe('canApply', () => { - it('should return true when lastResponse is non-empty', () => { - const ctx = makeContext({ lastResponse: 'some response' }); - expect(strategy.canApply(ctx)).toBe(true); - }); - - it('should return false when lastResponse is undefined', () => { - const ctx = makeContext({ lastResponse: undefined }); - expect(strategy.canApply(ctx)).toBe(false); - }); - - it('should return false when lastResponse is empty string', () => { 
- const ctx = makeContext({ lastResponse: '' }); - expect(strategy.canApply(ctx)).toBe(false); - }); - }); -}); - -describe('AgentConsultStrategy', () => { - const strategy = new AgentConsultStrategy(); - - it('should have name "AgentConsult"', () => { - expect(strategy.name).toBe('AgentConsult'); - }); - - describe('canApply', () => { - it('should return true when sessionId is non-empty', () => { - const ctx = makeContext({ sessionId: 'session-123' }); - expect(strategy.canApply(ctx)).toBe(true); - }); - - it('should return false when sessionId is undefined', () => { - const ctx = makeContext({ sessionId: undefined }); - expect(strategy.canApply(ctx)).toBe(false); - }); - - it('should return false when sessionId is empty string', () => { - const ctx = makeContext({ sessionId: '' }); - expect(strategy.canApply(ctx)).toBe(false); - }); - }); - - describe('execute', () => { - it('should return failure when sessionId is not provided', async () => { - const ctx = makeContext({ sessionId: undefined }); - const result = await strategy.execute(ctx); - expect(result.success).toBe(false); - expect(result.reason).toBe('Session ID not provided'); - }); - }); -}); - -describe('JudgmentStrategyFactory', () => { - it('should create strategies in correct priority order', () => { - const strategies = JudgmentStrategyFactory.createStrategies(); - expect(strategies).toHaveLength(4); - expect(strategies[0]!.name).toBe('AutoSelect'); - expect(strategies[1]!.name).toBe('ReportBased'); - expect(strategies[2]!.name).toBe('ResponseBased'); - expect(strategies[3]!.name).toBe('AgentConsult'); - }); -}); diff --git a/src/__tests__/options-builder.test.ts b/src/__tests__/options-builder.test.ts new file mode 100644 index 0000000..e1db23e --- /dev/null +++ b/src/__tests__/options-builder.test.ts @@ -0,0 +1,49 @@ +import { describe, expect, it } from 'vitest'; +import { OptionsBuilder } from '../core/piece/engine/OptionsBuilder.js'; +import type { PieceMovement } from '../core/models/types.js'; 
+import type { PieceEngineOptions } from '../core/piece/types.js'; + +function createMovement(): PieceMovement { + return { + name: 'reviewers', + personaDisplayName: 'Reviewers', + instructionTemplate: 'review', + passPreviousResponse: false, + permissionMode: 'full', + }; +} + +function createBuilder(step: PieceMovement): OptionsBuilder { + const engineOptions: PieceEngineOptions = { + projectCwd: '/project', + }; + + return new OptionsBuilder( + engineOptions, + () => '/project', + () => '/project', + () => undefined, + () => '.takt/runs/sample/reports', + () => 'ja', + () => [{ name: step.name }], + () => 'default', + () => 'test piece', + ); +} + +describe('OptionsBuilder.buildResumeOptions', () => { + it('should enforce readonly permission and empty allowedTools for report/status phases', () => { + // Given + const step = createMovement(); + const builder = createBuilder(step); + + // When + const options = builder.buildResumeOptions(step, 'session-123', { maxTurns: 3 }); + + // Then + expect(options.permissionMode).toBe('readonly'); + expect(options.allowedTools).toEqual([]); + expect(options.maxTurns).toBe(3); + expect(options.sessionId).toBe('session-123'); + }); +}); diff --git a/src/__tests__/parseStructuredOutput.test.ts b/src/__tests__/parseStructuredOutput.test.ts new file mode 100644 index 0000000..7f247e7 --- /dev/null +++ b/src/__tests__/parseStructuredOutput.test.ts @@ -0,0 +1,86 @@ +import { describe, it, expect } from 'vitest'; +import { parseStructuredOutput } from '../shared/utils/structuredOutput.js'; + +describe('parseStructuredOutput', () => { + it('should return undefined when hasOutputSchema is false', () => { + expect(parseStructuredOutput('{"step":1}', false)).toBeUndefined(); + }); + + it('should return undefined for empty text', () => { + expect(parseStructuredOutput('', true)).toBeUndefined(); + }); + + // Strategy 1: Direct JSON parse + describe('direct JSON parse', () => { + it('should parse pure JSON object', () => { + 
expect(parseStructuredOutput('{"step":1,"reason":"done"}', true)) + .toEqual({ step: 1, reason: 'done' }); + }); + + it('should parse JSON with whitespace', () => { + expect(parseStructuredOutput(' { "step": 2, "reason": "ok" } ', true)) + .toEqual({ step: 2, reason: 'ok' }); + }); + + it('should ignore arrays', () => { + expect(parseStructuredOutput('[1,2,3]', true)).toBeUndefined(); + }); + + it('should ignore primitive JSON', () => { + expect(parseStructuredOutput('"hello"', true)).toBeUndefined(); + }); + }); + + // Strategy 2: Code block extraction + describe('code block extraction', () => { + it('should extract JSON from ```json code block', () => { + const text = 'Here is the result:\n```json\n{"step":1,"reason":"matched"}\n```'; + expect(parseStructuredOutput(text, true)) + .toEqual({ step: 1, reason: 'matched' }); + }); + + it('should extract JSON from ``` code block (no language)', () => { + const text = 'Result:\n```\n{"step":2,"reason":"fallback"}\n```'; + expect(parseStructuredOutput(text, true)) + .toEqual({ step: 2, reason: 'fallback' }); + }); + }); + + // Strategy 3: Brace extraction + describe('brace extraction', () => { + it('should extract JSON with preamble text', () => { + const text = 'The matched rule is: {"step":1,"reason":"condition met"}'; + expect(parseStructuredOutput(text, true)) + .toEqual({ step: 1, reason: 'condition met' }); + }); + + it('should extract JSON with postamble text', () => { + const text = '{"step":3,"reason":"done"}\nEnd of response.'; + expect(parseStructuredOutput(text, true)) + .toEqual({ step: 3, reason: 'done' }); + }); + + it('should extract JSON with both preamble and postamble', () => { + const text = 'Based on my analysis:\n{"matched_index":2,"reason":"test"}\nThat is my judgment.'; + expect(parseStructuredOutput(text, true)) + .toEqual({ matched_index: 2, reason: 'test' }); + }); + }); + + // Edge cases + describe('edge cases', () => { + it('should return undefined for text without JSON', () => { + 
expect(parseStructuredOutput('No JSON here at all.', true)).toBeUndefined(); + }); + + it('should return undefined for invalid JSON', () => { + expect(parseStructuredOutput('{invalid json}', true)).toBeUndefined(); + }); + + it('should handle nested objects', () => { + const text = '{"step":1,"reason":"ok","meta":{"detail":"extra"}}'; + expect(parseStructuredOutput(text, true)) + .toEqual({ step: 1, reason: 'ok', meta: { detail: 'extra' } }); + }); + }); +}); diff --git a/src/__tests__/phase-runner-report-history.test.ts b/src/__tests__/phase-runner-report-history.test.ts index f296066..6017616 100644 --- a/src/__tests__/phase-runner-report-history.test.ts +++ b/src/__tests__/phase-runner-report-history.test.ts @@ -4,6 +4,7 @@ import { join } from 'node:path'; import { tmpdir } from 'node:os'; import { runReportPhase, type PhaseRunnerContext } from '../core/piece/phase-runner.js'; import type { PieceMovement } from '../core/models/types.js'; +import type { RunAgentOptions } from '../agents/runner.js'; vi.mock('../agents/runner.js', () => ({ runAgent: vi.fn(), @@ -21,7 +22,10 @@ function createStep(fileName: string): PieceMovement { }; } -function createContext(reportDir: string): PhaseRunnerContext { +function createContext( + reportDir: string, + onBuildResumeOptions?: (overrides: Pick) => void, +): PhaseRunnerContext { let currentSessionId = 'session-1'; return { cwd: reportDir, @@ -30,6 +34,13 @@ function createContext(reportDir: string): PhaseRunnerContext { buildResumeOptions: ( _step, _sessionId, + overrides, + ) => { + onBuildResumeOptions?.(overrides); + return { cwd: reportDir }; + }, + buildNewSessionReportOptions: ( + _step, _overrides, ) => ({ cwd: reportDir }), updatePersonaSession: (_persona, sessionId) => { @@ -140,4 +151,28 @@ describe('runReportPhase report history behavior', () => { '06-qa-review.20260210T061143Z.md', ]); }); + + it('should build report resume options with maxTurns override only', async () => { + // Given + const reportDir = 
join(tmpRoot, '.takt', 'runs', 'sample-run', 'reports'); + const step = createStep('07-permissions-check.md'); + const capturedOverrides: Array> = []; + const ctx = createContext(reportDir, (overrides) => { + capturedOverrides.push(overrides); + }); + const runAgentMock = vi.mocked(runAgent); + runAgentMock.mockResolvedValueOnce({ + persona: 'reviewers', + status: 'done', + content: 'Permission-based report execution', + timestamp: new Date('2026-02-10T06:21:17Z'), + sessionId: 'session-2', + }); + + // When + await runReportPhase(step, 1, ctx); + + // Then + expect(capturedOverrides).toEqual([{ maxTurns: 3 }]); + }); }); diff --git a/src/__tests__/pieceExecution-debug-prompts.test.ts b/src/__tests__/pieceExecution-debug-prompts.test.ts index 75aa4e5..5fb8402 100644 --- a/src/__tests__/pieceExecution-debug-prompts.test.ts +++ b/src/__tests__/pieceExecution-debug-prompts.test.ts @@ -77,8 +77,7 @@ vi.mock('../core/piece/index.js', () => ({ PieceEngine: MockPieceEngine, })); -vi.mock('../infra/claude/index.js', () => ({ - detectRuleIndex: vi.fn(), +vi.mock('../infra/claude/query-manager.js', () => ({ interruptAllQueries: vi.fn(), })); diff --git a/src/__tests__/pieceExecution-session-loading.test.ts b/src/__tests__/pieceExecution-session-loading.test.ts index 92ff51e..e6402da 100644 --- a/src/__tests__/pieceExecution-session-loading.test.ts +++ b/src/__tests__/pieceExecution-session-loading.test.ts @@ -46,8 +46,7 @@ vi.mock('../core/piece/index.js', () => ({ PieceEngine: MockPieceEngine, })); -vi.mock('../infra/claude/index.js', () => ({ - detectRuleIndex: vi.fn(), +vi.mock('../infra/claude/query-manager.js', () => ({ interruptAllQueries: vi.fn(), })); diff --git a/src/__tests__/provider-resolution.test.ts b/src/__tests__/provider-resolution.test.ts new file mode 100644 index 0000000..fa60189 --- /dev/null +++ b/src/__tests__/provider-resolution.test.ts @@ -0,0 +1,80 @@ +import { describe, expect, it } from 'vitest'; +import { resolveMovementProviderModel } from 
'../core/piece/provider-resolution.js'; + +describe('resolveMovementProviderModel', () => { + it('should prefer step.provider when step provider is defined', () => { + // Given: step.provider is specified + const result = resolveMovementProviderModel({ + step: { provider: 'codex', model: undefined, personaDisplayName: 'coder' }, + provider: 'claude', + personaProviders: { coder: 'opencode' }, + }); + + // When: resolving provider/model + // Then: step.provider takes highest priority + expect(result.provider).toBe('codex'); + }); + + it('should use personaProviders when step.provider is undefined', () => { + // Given: step.provider is undefined and personaProviders has a matching entry + const result = resolveMovementProviderModel({ + step: { provider: undefined, model: undefined, personaDisplayName: 'reviewer' }, + provider: 'claude', + personaProviders: { reviewer: 'opencode' }, + }); + + // When: resolving provider/model + // Then: the personaProviders value is used + expect(result.provider).toBe('opencode'); + }); + + it('should fallback to input.provider when persona mapping is missing', () => { + // Given: step.provider is undefined and no persona mapping exists + const result = resolveMovementProviderModel({ + step: { provider: undefined, model: undefined, personaDisplayName: 'unknown' }, + provider: 'mock', + personaProviders: { reviewer: 'codex' }, + }); + + // When: resolving provider/model + // Then: input.provider is used + expect(result.provider).toBe('mock'); + }); + + it('should return undefined provider when all provider candidates are missing', () => { + // Given: all provider candidates are undefined + const result = resolveMovementProviderModel({ + step: { provider: undefined, model: undefined, personaDisplayName: 'none' }, + provider: undefined, + personaProviders: undefined, + }); + + // When: resolving provider/model + // Then: provider is undefined + expect(result.provider).toBeUndefined(); + }); + + it('should prefer step.model over input.model', () => { + // Given: both step.model and input.model are specified + const result = 
resolveMovementProviderModel({ + step: { provider: undefined, model: 'step-model', personaDisplayName: 'coder' }, + model: 'input-model', + }); + + // When: resolving provider/model + // Then: step.model takes highest priority + expect(result.model).toBe('step-model'); + }); + + it('should fallback to input.model when step.model is undefined', () => { + // Given: step.model is undefined and input.model is specified + const result = resolveMovementProviderModel({ + step: { provider: undefined, model: undefined, personaDisplayName: 'coder' }, + model: 'input-model', + }); + + // When: resolving provider/model + // Then: input.model is used + expect(result.model).toBe('input-model'); + }); +}); diff --git a/src/__tests__/provider-structured-output.test.ts b/src/__tests__/provider-structured-output.test.ts new file mode 100644 index 0000000..3f2206e --- /dev/null +++ b/src/__tests__/provider-structured-output.test.ts @@ -0,0 +1,244 @@ +/** + * Provider layer structured output tests. + * + * Verifies that each provider (Claude, Codex, OpenCode) correctly passes + * `outputSchema` through to its underlying client function and returns + * `structuredOutput` in the AgentResponse. 
+ */ + +import { beforeEach, describe, expect, it, vi } from 'vitest'; + +// ===== Claude ===== +const { + mockCallClaude, + mockCallClaudeCustom, +} = vi.hoisted(() => ({ + mockCallClaude: vi.fn(), + mockCallClaudeCustom: vi.fn(), +})); + +vi.mock('../infra/claude/client.js', () => ({ + callClaude: mockCallClaude, + callClaudeCustom: mockCallClaudeCustom, + callClaudeAgent: vi.fn(), + callClaudeSkill: vi.fn(), +})); + +// ===== Codex ===== +const { + mockCallCodex, + mockCallCodexCustom, +} = vi.hoisted(() => ({ + mockCallCodex: vi.fn(), + mockCallCodexCustom: vi.fn(), +})); + +vi.mock('../infra/codex/index.js', () => ({ + callCodex: mockCallCodex, + callCodexCustom: mockCallCodexCustom, +})); + +// ===== OpenCode ===== +const { + mockCallOpenCode, + mockCallOpenCodeCustom, +} = vi.hoisted(() => ({ + mockCallOpenCode: vi.fn(), + mockCallOpenCodeCustom: vi.fn(), +})); + +vi.mock('../infra/opencode/index.js', () => ({ + callOpenCode: mockCallOpenCode, + callOpenCodeCustom: mockCallOpenCodeCustom, +})); + +// ===== Config (API key resolvers) ===== +vi.mock('../infra/config/index.js', () => ({ + resolveAnthropicApiKey: vi.fn(() => undefined), + resolveOpenaiApiKey: vi.fn(() => undefined), + resolveOpencodeApiKey: vi.fn(() => undefined), +})); + +// Bypass Codex's isInsideGitRepo check +vi.mock('node:child_process', () => ({ + execFileSync: vi.fn(() => 'true'), +})); + +import { ClaudeProvider } from '../infra/providers/claude.js'; +import { CodexProvider } from '../infra/providers/codex.js'; +import { OpenCodeProvider } from '../infra/providers/opencode.js'; + +const SCHEMA = { + type: 'object', + properties: { step: { type: 'integer' } }, + required: ['step'], +}; + +function doneResponse(persona: string, structuredOutput?: Record) { + return { + persona, + status: 'done' as const, + content: 'ok', + timestamp: new Date(), + structuredOutput, + }; +} + +// ---------- Claude ---------- + +describe('ClaudeProvider — structured output', () => { + beforeEach(() => { + 
vi.clearAllMocks(); + }); + + it('passes outputSchema to callClaude and returns structuredOutput', async () => { + mockCallClaude.mockResolvedValue(doneResponse('coder', { step: 2 })); + + const agent = new ClaudeProvider().setup({ name: 'coder' }); + const result = await agent.call('prompt', { cwd: '/tmp', outputSchema: SCHEMA }); + + const opts = mockCallClaude.mock.calls[0]?.[2]; + expect(opts).toHaveProperty('outputSchema', SCHEMA); + expect(result.structuredOutput).toEqual({ step: 2 }); + }); + + it('passes outputSchema to callClaudeCustom even when systemPrompt is specified', async () => { + mockCallClaudeCustom.mockResolvedValue(doneResponse('judge', { step: 1 })); + + const agent = new ClaudeProvider().setup({ name: 'judge', systemPrompt: 'You are a judge.' }); + const result = await agent.call('prompt', { cwd: '/tmp', outputSchema: SCHEMA }); + + const opts = mockCallClaudeCustom.mock.calls[0]?.[3]; + expect(opts).toHaveProperty('outputSchema', SCHEMA); + expect(result.structuredOutput).toEqual({ step: 1 }); + }); + + it('returns undefined when structuredOutput is absent', async () => { + mockCallClaude.mockResolvedValue(doneResponse('coder')); + + const agent = new ClaudeProvider().setup({ name: 'coder' }); + const result = await agent.call('prompt', { cwd: '/tmp', outputSchema: SCHEMA }); + + expect(result.structuredOutput).toBeUndefined(); + }); + + it('passes undefined when outputSchema is not specified', async () => { + mockCallClaude.mockResolvedValue(doneResponse('coder')); + + const agent = new ClaudeProvider().setup({ name: 'coder' }); + await agent.call('prompt', { cwd: '/tmp' }); + + const opts = mockCallClaude.mock.calls[0]?.[2]; + expect(opts.outputSchema).toBeUndefined(); + }); +}); + +// ---------- Codex ---------- + +describe('CodexProvider — structured output', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it('passes outputSchema to callCodex and returns structuredOutput', async () => { + mockCallCodex.mockResolvedValue(doneResponse('coder', { step: 2 })); + + const agent = new 
CodexProvider().setup({ name: 'coder' }); + const result = await agent.call('prompt', { cwd: '/tmp', outputSchema: SCHEMA }); + + const opts = mockCallCodex.mock.calls[0]?.[2]; + expect(opts).toHaveProperty('outputSchema', SCHEMA); + expect(result.structuredOutput).toEqual({ step: 2 }); + }); + + it('passes outputSchema to callCodexCustom even when systemPrompt is specified', async () => { + mockCallCodexCustom.mockResolvedValue(doneResponse('judge', { step: 1 })); + + const agent = new CodexProvider().setup({ name: 'judge', systemPrompt: 'sys' }); + const result = await agent.call('prompt', { cwd: '/tmp', outputSchema: SCHEMA }); + + const opts = mockCallCodexCustom.mock.calls[0]?.[3]; + expect(opts).toHaveProperty('outputSchema', SCHEMA); + expect(result.structuredOutput).toEqual({ step: 1 }); + }); + + it('returns undefined when structuredOutput is absent', async () => { + mockCallCodex.mockResolvedValue(doneResponse('coder')); + + const agent = new CodexProvider().setup({ name: 'coder' }); + const result = await agent.call('prompt', { cwd: '/tmp', outputSchema: SCHEMA }); + + expect(result.structuredOutput).toBeUndefined(); + }); + + it('passes undefined when outputSchema is not specified', async () => { + mockCallCodex.mockResolvedValue(doneResponse('coder')); + + const agent = new CodexProvider().setup({ name: 'coder' }); + await agent.call('prompt', { cwd: '/tmp' }); + + const opts = mockCallCodex.mock.calls[0]?.[2]; + expect(opts.outputSchema).toBeUndefined(); + }); +}); + +// ---------- OpenCode ---------- + +describe('OpenCodeProvider — structured output', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it('passes outputSchema to callOpenCode and returns structuredOutput', async () => { + mockCallOpenCode.mockResolvedValue(doneResponse('coder', { step: 2 })); + + const agent = new OpenCodeProvider().setup({ name: 'coder' }); + const result = await agent.call('prompt', { + cwd: '/tmp', + model: 'openai/gpt-4', + outputSchema: SCHEMA, + }); + + const opts = mockCallOpenCode.mock.calls[0]?.[2]; + 
expect(opts).toHaveProperty('outputSchema', SCHEMA); + expect(result.structuredOutput).toEqual({ step: 2 }); + }); + + it('passes outputSchema to callOpenCodeCustom even when systemPrompt is specified', async () => { + mockCallOpenCodeCustom.mockResolvedValue(doneResponse('judge', { step: 1 })); + + const agent = new OpenCodeProvider().setup({ name: 'judge', systemPrompt: 'sys' }); + const result = await agent.call('prompt', { + cwd: '/tmp', + model: 'openai/gpt-4', + outputSchema: SCHEMA, + }); + + const opts = mockCallOpenCodeCustom.mock.calls[0]?.[3]; + expect(opts).toHaveProperty('outputSchema', SCHEMA); + expect(result.structuredOutput).toEqual({ step: 1 }); + }); + + it('returns undefined when structuredOutput is absent', async () => { + mockCallOpenCode.mockResolvedValue(doneResponse('coder')); + + const agent = new OpenCodeProvider().setup({ name: 'coder' }); + const result = await agent.call('prompt', { + cwd: '/tmp', + model: 'openai/gpt-4', + outputSchema: SCHEMA, + }); + + expect(result.structuredOutput).toBeUndefined(); + }); + + it('passes undefined when outputSchema is not specified', async () => { + mockCallOpenCode.mockResolvedValue(doneResponse('coder')); + + const agent = new OpenCodeProvider().setup({ name: 'coder' }); + await agent.call('prompt', { cwd: '/tmp', model: 'openai/gpt-4' }); + + const opts = mockCallOpenCode.mock.calls[0]?.[2]; + expect(opts.outputSchema).toBeUndefined(); + }); +}); diff --git a/src/__tests__/public-api-exports.test.ts b/src/__tests__/public-api-exports.test.ts new file mode 100644 index 0000000..3ec0b57 --- /dev/null +++ b/src/__tests__/public-api-exports.test.ts @@ -0,0 +1,83 @@ +import { describe, expect, it } from 'vitest'; + +describe('public API exports', () => { + it('should expose piece usecases, engine, and piece loader APIs', async () => { + // Given: the package's public API + const api = await import('../index.js'); + + // When: referencing the main use-case functions, the public engine API, and the piece-loading API + // Then: the required public symbols are available + expect(typeof api.executeAgent).toBe('function'); + expect(typeof 
api.generateReport).toBe('function'); + expect(typeof api.executePart).toBe('function'); + expect(typeof api.judgeStatus).toBe('function'); + expect(typeof api.evaluateCondition).toBe('function'); + expect(typeof api.decomposeTask).toBe('function'); + + expect(typeof api.PieceEngine).toBe('function'); + + expect(typeof api.loadPiece).toBe('function'); + expect(typeof api.loadPieceByIdentifier).toBe('function'); + expect(typeof api.listPieces).toBe('function'); + }); + + it('should not expose internal engine implementation details', async () => { + // Given: the package's public API + const api = await import('../index.js'); + + // When: checking for internal symbols that should be private + // Then: internal implementation details are not exposed + expect('AgentRunner' in api).toBe(false); + expect('RuleEvaluator' in api).toBe(false); + expect('AggregateEvaluator' in api).toBe(false); + expect('evaluateAggregateConditions' in api).toBe(false); + expect('needsStatusJudgmentPhase' in api).toBe(false); + expect('StatusJudgmentBuilder' in api).toBe(false); + expect('buildEditRule' in api).toBe(false); + expect('detectRuleIndex' in api).toBe(false); + expect('ParallelLogger' in api).toBe(false); + expect('InstructionBuilder' in api).toBe(false); + expect('ReportInstructionBuilder' in api).toBe(false); + expect('COMPLETE_MOVEMENT' in api).toBe(false); + expect('ABORT_MOVEMENT' in api).toBe(false); + expect('ERROR_MESSAGES' in api).toBe(false); + expect('determineNextMovementByRules' in api).toBe(false); + expect('extractBlockedPrompt' in api).toBe(false); + expect('LoopDetector' in api).toBe(false); + expect('createInitialState' in api).toBe(false); + expect('addUserInput' in api).toBe(false); + expect('getPreviousOutput' in api).toBe(false); + expect('handleBlocked' in api).toBe(false); + }); + + it('should not expose infrastructure implementations and internal shared utilities', async () => { + // Given: the package's public API + const api = await import('../index.js'); + + // When: checking for infrastructure implementations and internal utilities that should be private + // Then: implementation details not meant for direct use are not exposed + 
expect('ClaudeClient' in api).toBe(false); + expect('executeClaudeCli' in api).toBe(false); + expect('CodexClient' in api).toBe(false); + expect('mapToCodexSandboxMode' in api).toBe(false); + expect('getResourcesDir' in api).toBe(false); + expect('DEFAULT_PIECE_NAME' in api).toBe(false); + expect('buildPrompt' in api).toBe(false); + expect('writeFileAtomic' in api).toBe(false); + expect('getInputHistoryPath' in api).toBe(false); + expect('MAX_INPUT_HISTORY' in api).toBe(false); + expect('loadInputHistory' in api).toBe(false); + expect('saveInputHistory' in api).toBe(false); + expect('addToInputHistory' in api).toBe(false); + expect('getPersonaSessionsPath' in api).toBe(false); + expect('loadPersonaSessions' in api).toBe(false); + expect('savePersonaSessions' in api).toBe(false); + expect('updatePersonaSession' in api).toBe(false); + expect('clearPersonaSessions' in api).toBe(false); + expect('getWorktreeSessionsDir' in api).toBe(false); + expect('encodeWorktreePath' in api).toBe(false); + expect('getWorktreeSessionPath' in api).toBe(false); + expect('loadWorktreeSessions' in api).toBe(false); + expect('updateWorktreeSession' in api).toBe(false); + }); +}); diff --git a/src/__tests__/report-phase-blocked.test.ts b/src/__tests__/report-phase-blocked.test.ts index 3afad14..0241784 100644 --- a/src/__tests__/report-phase-blocked.test.ts +++ b/src/__tests__/report-phase-blocked.test.ts @@ -23,7 +23,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({ vi.mock('../core/piece/phase-runner.js', () => ({ needsStatusJudgmentPhase: vi.fn().mockReturnValue(false), runReportPhase: vi.fn().mockResolvedValue(undefined), - runStatusJudgmentPhase: vi.fn().mockResolvedValue(''), + runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }), })); vi.mock('../shared/utils/index.js', async (importOriginal) => ({ @@ -34,7 +34,7 @@ vi.mock('../shared/utils/index.js', async (importOriginal) => ({ // --- Imports (after mocks) --- import { 
PieceEngine } from '../core/piece/index.js'; -import { runReportPhase } from '../core/piece/index.js'; +import { runReportPhase } from '../core/piece/phase-runner.js'; import { makeResponse, makeMovement, diff --git a/src/__tests__/report-phase-retry.test.ts b/src/__tests__/report-phase-retry.test.ts new file mode 100644 index 0000000..26c8807 --- /dev/null +++ b/src/__tests__/report-phase-retry.test.ts @@ -0,0 +1,211 @@ +import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; +import { existsSync, mkdtempSync, readFileSync, rmSync } from 'node:fs'; +import { join } from 'node:path'; +import { tmpdir } from 'node:os'; +import { runReportPhase, type PhaseRunnerContext } from '../core/piece/phase-runner.js'; +import type { PieceMovement } from '../core/models/types.js'; + +vi.mock('../agents/runner.js', () => ({ + runAgent: vi.fn(), +})); + +import { runAgent } from '../agents/runner.js'; + +function createStep(fileName: string): PieceMovement { + return { + name: 'implement', + persona: 'coder', + personaDisplayName: 'Coder', + instructionTemplate: 'Implement task', + passPreviousResponse: false, + outputContracts: [{ name: fileName }], + }; +} + +function createContext(reportDir: string, lastResponse = 'Phase 1 result'): PhaseRunnerContext { + let currentSessionId = 'session-resume-1'; + + return { + cwd: reportDir, + reportDir, + language: 'en', + lastResponse, + getSessionId: (_persona: string) => currentSessionId, + buildResumeOptions: (_step, sessionId, overrides) => ({ + cwd: reportDir, + sessionId, + allowedTools: overrides.allowedTools, + maxTurns: overrides.maxTurns, + }), + buildNewSessionReportOptions: (_step, overrides) => ({ + cwd: reportDir, + allowedTools: overrides.allowedTools, + maxTurns: overrides.maxTurns, + }), + updatePersonaSession: (_persona, sessionId) => { + if (sessionId) { + currentSessionId = sessionId; + } + }, + }; +} + +describe('runReportPhase retry with new session', () => { + let tmpRoot: string; + + beforeEach(() 
=> { + tmpRoot = mkdtempSync(join(tmpdir(), 'takt-report-retry-')); + vi.resetAllMocks(); + }); + + afterEach(() => { + if (existsSync(tmpRoot)) { + rmSync(tmpRoot, { recursive: true, force: true }); + } + }); + + it('should retry with new session when first attempt returns empty content', async () => { + // Given + const reportDir = join(tmpRoot, '.takt', 'runs', 'sample-run', 'reports'); + const step = createStep('02-coder.md'); + const ctx = createContext(reportDir, 'Implemented feature X'); + const runAgentMock = vi.mocked(runAgent); + runAgentMock + .mockResolvedValueOnce({ + persona: 'coder', + status: 'done', + content: ' ', + timestamp: new Date('2026-02-11T00:00:00Z'), + sessionId: 'session-resume-2', + }) + .mockResolvedValueOnce({ + persona: 'coder', + status: 'done', + content: '# Report\nRecovered output', + timestamp: new Date('2026-02-11T00:00:01Z'), + sessionId: 'session-fresh-1', + }); + + // When + await runReportPhase(step, 1, ctx); + + // Then + const reportPath = join(reportDir, '02-coder.md'); + expect(readFileSync(reportPath, 'utf-8')).toBe('# Report\nRecovered output'); + expect(runAgentMock).toHaveBeenCalledTimes(2); + + const secondCallOptions = runAgentMock.mock.calls[1]?.[2] as { sessionId?: string }; + expect(secondCallOptions.sessionId).toBeUndefined(); + + const secondInstruction = runAgentMock.mock.calls[1]?.[1] as string; + expect(secondInstruction).toContain('## Previous Work Context'); + expect(secondInstruction).toContain('Implemented feature X'); + }); + + it('should retry with new session when first attempt status is error', async () => { + // Given + const reportDir = join(tmpRoot, '.takt', 'runs', 'sample-run', 'reports'); + const step = createStep('03-review.md'); + const ctx = createContext(reportDir); + const runAgentMock = vi.mocked(runAgent); + runAgentMock + .mockResolvedValueOnce({ + persona: 'coder', + status: 'error', + content: 'Tool use is not allowed in this phase', + timestamp: new Date('2026-02-11T00:01:00Z'), + 
error: 'Tool use is not allowed in this phase', + }) + .mockResolvedValueOnce({ + persona: 'coder', + status: 'done', + content: 'Recovered report', + timestamp: new Date('2026-02-11T00:01:01Z'), + }); + + // When + await runReportPhase(step, 1, ctx); + + // Then + const reportPath = join(reportDir, '03-review.md'); + expect(readFileSync(reportPath, 'utf-8')).toBe('Recovered report'); + expect(runAgentMock).toHaveBeenCalledTimes(2); + }); + + it('should throw when both attempts return empty output', async () => { + // Given + const reportDir = join(tmpRoot, '.takt', 'runs', 'sample-run', 'reports'); + const step = createStep('04-qa.md'); + const ctx = createContext(reportDir); + const runAgentMock = vi.mocked(runAgent); + runAgentMock + .mockResolvedValueOnce({ + persona: 'coder', + status: 'done', + content: ' ', + timestamp: new Date('2026-02-11T00:02:00Z'), + }) + .mockResolvedValueOnce({ + persona: 'coder', + status: 'done', + content: '\n\n', + timestamp: new Date('2026-02-11T00:02:01Z'), + }); + + // When / Then + await expect(runReportPhase(step, 1, ctx)).rejects.toThrow('Report phase failed for 04-qa.md: Report output is empty'); + expect(runAgentMock).toHaveBeenCalledTimes(2); + }); + + it('should not retry when first attempt succeeds', async () => { + // Given + const reportDir = join(tmpRoot, '.takt', 'runs', 'sample-run', 'reports'); + const step = createStep('05-ok.md'); + const ctx = createContext(reportDir); + const runAgentMock = vi.mocked(runAgent); + runAgentMock.mockResolvedValueOnce({ + persona: 'coder', + status: 'done', + content: 'Single-pass success', + timestamp: new Date('2026-02-11T00:03:00Z'), + sessionId: 'session-resume-2', + }); + + // When + await runReportPhase(step, 1, ctx); + + // Then + expect(runAgentMock).toHaveBeenCalledTimes(1); + const reportPath = join(reportDir, '05-ok.md'); + expect(readFileSync(reportPath, 'utf-8')).toBe('Single-pass success'); + }); + + it('should return blocked result without retry', async () => { + // 
Given + const reportDir = join(tmpRoot, '.takt', 'runs', 'sample-run', 'reports'); + const step = createStep('06-blocked.md'); + const ctx = createContext(reportDir); + const runAgentMock = vi.mocked(runAgent); + runAgentMock.mockResolvedValueOnce({ + persona: 'coder', + status: 'blocked', + content: 'Need permission', + timestamp: new Date('2026-02-11T00:04:00Z'), + }); + + // When + const result = await runReportPhase(step, 1, ctx); + + // Then + expect(result).toEqual({ + blocked: true, + response: { + persona: 'coder', + status: 'blocked', + content: 'Need permission', + timestamp: new Date('2026-02-11T00:04:00Z'), + }, + }); + expect(runAgentMock).toHaveBeenCalledTimes(1); + }); +}); diff --git a/src/__tests__/rule-evaluator.test.ts b/src/__tests__/rule-evaluator.test.ts index 7cfd834..444e8d0 100644 --- a/src/__tests__/rule-evaluator.test.ts +++ b/src/__tests__/rule-evaluator.test.ts @@ -6,17 +6,8 @@ import { describe, it, expect, vi } from 'vitest'; import { RuleEvaluator, type RuleEvaluatorContext } from '../core/piece/evaluation/RuleEvaluator.js'; -import type { PieceMovement, PieceState } from '../core/models/types.js'; - -function makeMovement(overrides: Partial<PieceMovement> = {}): PieceMovement { - return { - name: 'test-movement', - personaDisplayName: 'tester', - instructionTemplate: '', - passPreviousResponse: false, - ...overrides, - }; -} +import type { PieceState } from '../core/models/types.js'; +import { makeMovement } from './test-helpers.js'; function makeState(): PieceState { return { diff --git a/src/__tests__/rule-utils.test.ts b/src/__tests__/rule-utils.test.ts index b690377..80e184b 100644 --- a/src/__tests__/rule-utils.test.ts +++ b/src/__tests__/rule-utils.test.ts @@ -12,17 +12,8 @@ import { getAutoSelectedTag, getReportFiles, } from '../core/piece/evaluation/rule-utils.js'; -import type { PieceMovement, OutputContractEntry } from '../core/models/types.js'; - -function makeMovement(overrides: Partial<PieceMovement> = {}): PieceMovement { - return { - name: 
'test-movement', - personaDisplayName: 'tester', - instructionTemplate: '', - passPreviousResponse: false, - ...overrides, - }; -} +import type { OutputContractEntry } from '../core/models/types.js'; +import { makeMovement } from './test-helpers.js'; describe('hasTagBasedRules', () => { it('should return false when movement has no rules', () => { diff --git a/src/__tests__/runAllTasks-concurrency.test.ts b/src/__tests__/runAllTasks-concurrency.test.ts index 9bab686..5e0b1ce 100644 --- a/src/__tests__/runAllTasks-concurrency.test.ts +++ b/src/__tests__/runAllTasks-concurrency.test.ts @@ -115,9 +115,8 @@ vi.mock('../infra/github/index.js', () => ({ pushBranch: vi.fn(), })); -vi.mock('../infra/claude/index.js', () => ({ +vi.mock('../infra/claude/query-manager.js', () => ({ interruptAllQueries: vi.fn(), - detectRuleIndex: vi.fn(), })); vi.mock('../agents/ai-judge.js', () => ({ diff --git a/src/__tests__/schema-loader.test.ts b/src/__tests__/schema-loader.test.ts new file mode 100644 index 0000000..a44341c --- /dev/null +++ b/src/__tests__/schema-loader.test.ts @@ -0,0 +1,76 @@ +import { beforeEach, describe, expect, it, vi } from 'vitest'; + +const readFileSyncMock = vi.fn((path: string) => { + if (path.endsWith('judgment.json')) { + return JSON.stringify({ type: 'object', properties: { step: { type: 'integer' } } }); + } + if (path.endsWith('evaluation.json')) { + return JSON.stringify({ type: 'object', properties: { matched_index: { type: 'integer' } } }); + } + if (path.endsWith('decomposition.json')) { + return JSON.stringify({ + type: 'object', + properties: { + parts: { + type: 'array', + items: { + type: 'object', + properties: { + id: { type: 'string' }, + title: { type: 'string' }, + instruction: { type: 'string' }, + }, + }, + }, + }, + }); + } + throw new Error(`Unexpected schema path: ${path}`); +}); + +vi.mock('node:fs', () => ({ + readFileSync: readFileSyncMock, +})); + +vi.mock('../infra/resources/index.js', () => ({ + getResourcesDir: vi.fn(() => 
'/mock/resources'), +})); + +describe('schema-loader', () => { + beforeEach(() => { + vi.resetModules(); + readFileSyncMock.mockClear(); + }); + + it('同じスキーマを複数回ロードしても readFileSync は1回だけ', async () => { + const { loadJudgmentSchema } = await import('../core/piece/schema-loader.js'); + + const first = loadJudgmentSchema(); + const second = loadJudgmentSchema(); + + expect(first).toEqual(second); + expect(readFileSyncMock).toHaveBeenCalledTimes(1); + expect(readFileSyncMock).toHaveBeenCalledWith('/mock/resources/schemas/judgment.json', 'utf-8'); + }); + + it('loadDecompositionSchema は maxItems を注入し、呼び出しごとに独立したオブジェクトを返す', async () => { + const { loadDecompositionSchema } = await import('../core/piece/schema-loader.js'); + + const first = loadDecompositionSchema(2); + const second = loadDecompositionSchema(5); + + const firstParts = (first.properties as Record).parts as Record; + const secondParts = (second.properties as Record).parts as Record; + + expect(firstParts.maxItems).toBe(2); + expect(secondParts.maxItems).toBe(5); + expect(readFileSyncMock).toHaveBeenCalledTimes(1); + }); + + it('loadDecompositionSchema は不正な maxParts を拒否する', async () => { + const { loadDecompositionSchema } = await import('../core/piece/schema-loader.js'); + + expect(() => loadDecompositionSchema(0)).toThrow('maxParts must be a positive integer: 0'); + expect(() => loadDecompositionSchema(-1)).toThrow('maxParts must be a positive integer: -1'); + }); +}); diff --git a/src/__tests__/shutdownManager.test.ts b/src/__tests__/shutdownManager.test.ts new file mode 100644 index 0000000..0a0f1fb --- /dev/null +++ b/src/__tests__/shutdownManager.test.ts @@ -0,0 +1,173 @@ +import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; + +const { + mockWarn, + mockError, + mockBlankLine, + mockGetLabel, +} = vi.hoisted(() => ({ + mockWarn: vi.fn(), + mockError: vi.fn(), + mockBlankLine: vi.fn(), + mockGetLabel: vi.fn((key: string) => key), +})); + +vi.mock('../shared/ui/index.js', () => ({ + 
warn: mockWarn, + error: mockError, + blankLine: mockBlankLine, +})); + +vi.mock('../shared/i18n/index.js', () => ({ + getLabel: mockGetLabel, +})); + +import { ShutdownManager } from '../features/tasks/execute/shutdownManager.js'; + +describe('ShutdownManager', () => { + let savedSigintListeners: ((...args: unknown[]) => void)[]; + let originalShutdownTimeoutEnv: string | undefined; + + beforeEach(() => { + vi.clearAllMocks(); + savedSigintListeners = process.rawListeners('SIGINT') as ((...args: unknown[]) => void)[]; + originalShutdownTimeoutEnv = process.env.TAKT_SHUTDOWN_TIMEOUT_MS; + delete process.env.TAKT_SHUTDOWN_TIMEOUT_MS; + }); + + afterEach(() => { + vi.useRealTimers(); + process.removeAllListeners('SIGINT'); + for (const listener of savedSigintListeners) { + process.on('SIGINT', listener as NodeJS.SignalsListener); + } + if (originalShutdownTimeoutEnv === undefined) { + delete process.env.TAKT_SHUTDOWN_TIMEOUT_MS; + } else { + process.env.TAKT_SHUTDOWN_TIMEOUT_MS = originalShutdownTimeoutEnv; + } + }); + + it('1回目SIGINTでgracefulコールバックを呼ぶ', () => { + const onGraceful = vi.fn(); + const onForceKill = vi.fn(); + + const manager = new ShutdownManager({ + callbacks: { onGraceful, onForceKill }, + gracefulTimeoutMs: 1_000, + }); + manager.install(); + + const listeners = process.rawListeners('SIGINT') as Array<() => void>; + listeners[listeners.length - 1]!(); + + expect(onGraceful).toHaveBeenCalledTimes(1); + expect(onForceKill).not.toHaveBeenCalled(); + expect(mockWarn).toHaveBeenCalledWith('piece.sigintGraceful'); + + manager.cleanup(); + }); + + it('graceful timeoutでforceコールバックを呼ぶ', () => { + vi.useFakeTimers(); + const onGraceful = vi.fn(); + const onForceKill = vi.fn(); + + const manager = new ShutdownManager({ + callbacks: { onGraceful, onForceKill }, + gracefulTimeoutMs: 50, + }); + manager.install(); + + const listeners = process.rawListeners('SIGINT') as Array<() => void>; + listeners[listeners.length - 1]!(); + vi.advanceTimersByTime(50); + + 
expect(onGraceful).toHaveBeenCalledTimes(1); + expect(onForceKill).toHaveBeenCalledTimes(1); + expect(mockError).toHaveBeenCalledWith('piece.sigintTimeout'); + expect(mockError).toHaveBeenCalledWith('piece.sigintForce'); + + manager.cleanup(); + }); + + it('2回目SIGINTで即時forceコールバックを呼び、timeoutを待たない', () => { + vi.useFakeTimers(); + const onGraceful = vi.fn(); + const onForceKill = vi.fn(); + + const manager = new ShutdownManager({ + callbacks: { onGraceful, onForceKill }, + gracefulTimeoutMs: 10_000, + }); + manager.install(); + + const listeners = process.rawListeners('SIGINT') as Array<() => void>; + const handler = listeners[listeners.length - 1]!; + handler(); + handler(); + vi.advanceTimersByTime(10_000); + + expect(onGraceful).toHaveBeenCalledTimes(1); + expect(onForceKill).toHaveBeenCalledTimes(1); + expect(mockError).toHaveBeenCalledWith('piece.sigintForce'); + + manager.cleanup(); + }); + + it('環境変数未設定時はデフォルト10_000msを使う', () => { + vi.useFakeTimers(); + const onGraceful = vi.fn(); + const onForceKill = vi.fn(); + + const manager = new ShutdownManager({ + callbacks: { onGraceful, onForceKill }, + }); + manager.install(); + + const listeners = process.rawListeners('SIGINT') as Array<() => void>; + listeners[listeners.length - 1]!(); + + vi.advanceTimersByTime(9_999); + expect(onForceKill).not.toHaveBeenCalled(); + + vi.advanceTimersByTime(1); + expect(onForceKill).toHaveBeenCalledTimes(1); + + manager.cleanup(); + }); + + it('環境変数設定時はその値をtimeoutとして使う', () => { + vi.useFakeTimers(); + process.env.TAKT_SHUTDOWN_TIMEOUT_MS = '25'; + const onGraceful = vi.fn(); + const onForceKill = vi.fn(); + + const manager = new ShutdownManager({ + callbacks: { onGraceful, onForceKill }, + }); + manager.install(); + + const listeners = process.rawListeners('SIGINT') as Array<() => void>; + listeners[listeners.length - 1]!(); + + vi.advanceTimersByTime(24); + expect(onForceKill).not.toHaveBeenCalled(); + + vi.advanceTimersByTime(1); + 
expect(onForceKill).toHaveBeenCalledTimes(1); + + manager.cleanup(); + }); + + it('不正な環境変数値ではエラーをthrowする', () => { + process.env.TAKT_SHUTDOWN_TIMEOUT_MS = '0'; + + expect( + () => + new ShutdownManager({ + callbacks: { onGraceful: vi.fn(), onForceKill: vi.fn() }, + }), + ).toThrowError('TAKT_SHUTDOWN_TIMEOUT_MS must be a positive integer'); + }); +}); diff --git a/src/__tests__/sleep.test.ts b/src/__tests__/sleep.test.ts index cafb307..7f099d1 100644 --- a/src/__tests__/sleep.test.ts +++ b/src/__tests__/sleep.test.ts @@ -71,7 +71,7 @@ describe('preventSleep', () => { expect(spawn).toHaveBeenCalledWith( '/usr/bin/caffeinate', - ['-di', '-w', String(process.pid)], + ['-dis', '-w', String(process.pid)], { stdio: 'ignore', detached: true } ); expect(mockChild.unref).toHaveBeenCalled(); diff --git a/src/__tests__/streamDiagnostics.test.ts b/src/__tests__/streamDiagnostics.test.ts new file mode 100644 index 0000000..ba4fc92 --- /dev/null +++ b/src/__tests__/streamDiagnostics.test.ts @@ -0,0 +1,103 @@ +import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; + +const { debugMock, createLoggerMock } = vi.hoisted(() => ({ + debugMock: vi.fn(), + createLoggerMock: vi.fn(), +})); + +createLoggerMock.mockImplementation(() => ({ + debug: debugMock, + info: vi.fn(), + error: vi.fn(), +})); + +vi.mock('../shared/utils/debug.js', () => ({ + createLogger: createLoggerMock, +})); + +import { createStreamDiagnostics } from '../shared/utils/streamDiagnostics.js'; + +describe('createStreamDiagnostics', () => { + beforeEach(() => { + vi.clearAllMocks(); + vi.useFakeTimers(); + vi.setSystemTime(new Date('2026-02-11T12:00:00.000Z')); + }); + + afterEach(() => { + vi.useRealTimers(); + }); + + it('should log connected event with elapsedMs', () => { + // Given: 診断オブジェクト + const diagnostics = createStreamDiagnostics('component', { runId: 'r1' }); + + // When: 接続完了を通知する + diagnostics.onConnected(); + + // Then: elapsedMs を含むデバッグログが出力される + 
expect(debugMock).toHaveBeenCalledWith('Stream connected', { + runId: 'r1', + elapsedMs: 0, + }); + }); + + it('should log first event only once even when called twice', () => { + // Given: 診断オブジェクト + const diagnostics = createStreamDiagnostics('component', { runId: 'r2' }); + + // When: first event を2回通知する + diagnostics.onFirstEvent('event-a'); + diagnostics.onFirstEvent('event-b'); + + // Then: first event ログは1回だけ出る + expect(debugMock).toHaveBeenCalledTimes(1); + expect(debugMock).toHaveBeenCalledWith('Stream first event', { + runId: 'r2', + firstEventType: 'event-a', + elapsedMs: 0, + }); + }); + + it('should include eventCount and durationMs on completion', () => { + // Given: 複数イベントを処理した診断オブジェクト + const diagnostics = createStreamDiagnostics('component', { runId: 'r3' }); + diagnostics.onConnected(); + diagnostics.onEvent('turn.started'); + vi.advanceTimersByTime(120); + diagnostics.onEvent('turn.completed'); + vi.advanceTimersByTime(80); + + // When: 完了通知を行う + diagnostics.onCompleted('normal', 'done'); + + // Then: 集計情報を含む完了ログが出力される + expect(debugMock).toHaveBeenLastCalledWith('Stream completed', { + runId: 'r3', + reason: 'normal', + detail: 'done', + eventCount: 2, + lastEventType: 'turn.completed', + durationMs: 200, + connected: true, + iterationStarted: false, + }); + }); + + it('should increment eventCount and use it in stream error log', () => { + // Given: 1イベント処理済みの診断オブジェクト + const diagnostics = createStreamDiagnostics('component', { runId: 'r4' }); + diagnostics.onEvent('turn.started'); + + // When: ストリームエラーを通知する + diagnostics.onStreamError('turn.failed', 'failed'); + + // Then: eventCount がエラーログに反映される + expect(debugMock).toHaveBeenLastCalledWith('Stream error event', { + runId: 'r4', + eventType: 'turn.failed', + message: 'failed', + eventCount: 1, + }); + }); +}); diff --git a/src/__tests__/task-decomposer.test.ts b/src/__tests__/task-decomposer.test.ts new file mode 100644 index 0000000..d374072 --- /dev/null +++ 
b/src/__tests__/task-decomposer.test.ts @@ -0,0 +1,76 @@ +import { describe, it, expect } from 'vitest'; +import { parseParts } from '../core/piece/engine/task-decomposer.js'; + +describe('parseParts', () => { + it('最後のjsonコードブロックをパースする', () => { + const content = [ + '説明', + '```json', + '[{"id":"old","title":"old","instruction":"old"}]', + '```', + '最終案', + '```json', + '[{"id":"a","title":"A","instruction":"Do A"},{"id":"b","title":"B","instruction":"Do B","timeout_ms":1200}]', + '```', + ].join('\n'); + + const result = parseParts(content, 3); + + expect(result).toHaveLength(2); + expect(result[0]).toEqual({ + id: 'a', + title: 'A', + instruction: 'Do A', + timeoutMs: undefined, + }); + expect(result[1]!.timeoutMs).toBe(1200); + }); + + it('jsonコードブロックがない場合はエラー', () => { + expect(() => parseParts('no json', 3)).toThrow( + 'Team leader output must include a ```json ... ``` block', + ); + }); + + it('max_partsを超えたらエラー', () => { + const content = '```json\n[{"id":"a","title":"A","instruction":"Do A"},{"id":"b","title":"B","instruction":"Do B"}]\n```'; + + expect(() => parseParts(content, 1)).toThrow( + 'Team leader produced too many parts: 2 > 1', + ); + }); + + it('必須フィールドが不足したらエラー', () => { + const content = '```json\n[{"id":"a","title":"A"}]\n```'; + + expect(() => parseParts(content, 3)).toThrow( + 'Part[0] "instruction" must be a non-empty string', + ); + }); + + it('jsonコードブロックが配列でない場合はエラー', () => { + const content = '```json\n{"not":"array"}\n```'; + + expect(() => parseParts(content, 3)).toThrow( + 'Team leader JSON must be an array', + ); + }); + + it('空配列の場合はエラー', () => { + const content = '```json\n[]\n```'; + + expect(() => parseParts(content, 3)).toThrow( + 'Team leader JSON must contain at least one part', + ); + }); + + it('重複したpart idがある場合はエラー', () => { + const content = [ + '```json', + '[{"id":"dup","title":"A","instruction":"Do A"},{"id":"dup","title":"B","instruction":"Do B"}]', + '```', + ].join('\n'); + + expect(() => parseParts(content, 
3)).toThrow('Duplicate part id: dup'); + }); +}); diff --git a/src/__tests__/taskStatusLabel.test.ts b/src/__tests__/taskStatusLabel.test.ts new file mode 100644 index 0000000..54ba349 --- /dev/null +++ b/src/__tests__/taskStatusLabel.test.ts @@ -0,0 +1,39 @@ +import { describe, expect, it } from 'vitest'; +import { formatTaskStatusLabel } from '../features/tasks/list/taskStatusLabel.js'; +import type { TaskListItem } from '../infra/task/types.js'; + +describe('formatTaskStatusLabel', () => { + it("should format pending task as '[running] name'", () => { + // Given: pending タスク + const task: TaskListItem = { + kind: 'pending', + name: 'implement test', + createdAt: '2026-02-11T00:00:00.000Z', + filePath: '/tmp/task.md', + content: 'content', + }; + + // When: ステータスラベルを生成する + const result = formatTaskStatusLabel(task); + + // Then: pending は running 表示になる + expect(result).toBe('[running] implement test'); + }); + + it("should format failed task as '[failed] name'", () => { + // Given: failed タスク + const task: TaskListItem = { + kind: 'failed', + name: 'retry payment', + createdAt: '2026-02-11T00:00:00.000Z', + filePath: '/tmp/task.md', + content: 'content', + }; + + // When: ステータスラベルを生成する + const result = formatTaskStatusLabel(task); + + // Then: failed は failed 表示になる + expect(result).toBe('[failed] retry payment'); + }); +}); diff --git a/src/__tests__/team-leader-schema-loader.test.ts b/src/__tests__/team-leader-schema-loader.test.ts new file mode 100644 index 0000000..df92632 --- /dev/null +++ b/src/__tests__/team-leader-schema-loader.test.ts @@ -0,0 +1,105 @@ +import { describe, it, expect } from 'vitest'; +import { join } from 'node:path'; +import { PieceMovementRawSchema } from '../core/models/schemas.js'; +import { normalizePieceConfig } from '../infra/config/loaders/pieceParser.js'; + +describe('team_leader schema', () => { + it('max_parts <= 3 の設定を受け付ける', () => { + const raw = { + name: 'implement', + team_leader: { + persona: 'team-leader', + max_parts: 3, 
+ timeout_ms: 120000, + }, + instruction_template: 'decompose', + }; + + const result = PieceMovementRawSchema.safeParse(raw); + expect(result.success).toBe(true); + }); + + it('max_parts > 3 は拒否する', () => { + const raw = { + name: 'implement', + team_leader: { + max_parts: 4, + }, + instruction_template: 'decompose', + }; + + const result = PieceMovementRawSchema.safeParse(raw); + expect(result.success).toBe(false); + }); + + it('parallel と team_leader の同時指定は拒否する', () => { + const raw = { + name: 'implement', + parallel: [{ name: 'sub', instruction_template: 'x' }], + team_leader: { + max_parts: 2, + }, + instruction_template: 'decompose', + }; + + const result = PieceMovementRawSchema.safeParse(raw); + expect(result.success).toBe(false); + }); + + it('arpeggio と team_leader の同時指定は拒否する', () => { + const raw = { + name: 'implement', + arpeggio: { + source: 'csv', + source_path: './data.csv', + template: './prompt.md', + }, + team_leader: { + max_parts: 2, + }, + instruction_template: 'decompose', + }; + + const result = PieceMovementRawSchema.safeParse(raw); + expect(result.success).toBe(false); + }); +}); + +describe('normalizePieceConfig team_leader', () => { + it('team_leader を内部形式へ正規化する', () => { + const pieceDir = join(process.cwd(), 'src', '__tests__'); + const raw = { + name: 'piece', + movements: [ + { + name: 'implement', + team_leader: { + persona: 'team-leader', + max_parts: 2, + timeout_ms: 90000, + part_persona: 'coder', + part_allowed_tools: ['Read', 'Edit'], + part_edit: true, + part_permission_mode: 'edit', + }, + instruction_template: 'decompose', + }, + ], + }; + + const config = normalizePieceConfig(raw, pieceDir); + const movement = config.movements[0]; + expect(movement).toBeDefined(); + expect(movement!.teamLeader).toEqual({ + persona: 'team-leader', + personaPath: undefined, + maxParts: 2, + timeoutMs: 90000, + partPersona: 'coder', + partPersonaPath: undefined, + partAllowedTools: ['Read', 'Edit'], + partEdit: true, + partPermissionMode: 
'edit', + }); + }); +}); diff --git a/src/__tests__/test-helpers.ts b/src/__tests__/test-helpers.ts new file mode 100644 index 0000000..e6118df --- /dev/null +++ b/src/__tests__/test-helpers.ts @@ -0,0 +1,36 @@ +/** + * Shared helpers for unit tests and integration tests. + * + * Unlike engine-test-helpers.ts, this file has no mock dependencies and + * can be safely imported from any test file without requiring vi.mock() setup. + */ + +import type { PieceMovement, PieceRule } from '../core/models/types.js'; +import type { InstructionContext } from '../core/piece/instruction/instruction-context.js'; + +export function makeRule(condition: string, next: string, extra: Partial<PieceRule> = {}): PieceRule { + return { condition, next, ...extra }; +} + +export function makeMovement(overrides: Partial<PieceMovement> = {}): PieceMovement { + return { + name: 'test-movement', + personaDisplayName: 'tester', + instructionTemplate: '', + passPreviousResponse: false, + ...overrides, + }; +} + +export function makeInstructionContext(overrides: Partial<InstructionContext> = {}): InstructionContext { + return { + task: 'test task', + iteration: 1, + maxMovements: 10, + movementIteration: 1, + cwd: '/tmp/test', + projectCwd: '/tmp/project', + userInputs: [], + ...overrides, + }; +} diff --git a/src/__tests__/watchTasks.test.ts b/src/__tests__/watchTasks.test.ts index 2012287..81991ee 100644 --- a/src/__tests__/watchTasks.test.ts +++ b/src/__tests__/watchTasks.test.ts @@ -12,6 +12,8 @@ const { mockBlankLine, mockStatus, mockSuccess, + mockWarn, + mockError, mockGetCurrentPiece, } = vi.hoisted(() => ({ mockRecoverInterruptedRunningTasks: vi.fn(), @@ -24,6 +26,8 @@ const { mockBlankLine: vi.fn(), mockStatus: vi.fn(), mockSuccess: vi.fn(), + mockWarn: vi.fn(), + mockError: vi.fn(), mockGetCurrentPiece: vi.fn(), })); @@ -45,11 +49,17 @@ vi.mock('../features/tasks/execute/taskExecution.js', () => ({ vi.mock('../shared/ui/index.js', () => ({ header: mockHeader, info: mockInfo, + warn: mockWarn, + error: mockError, success: 
mockSuccess, status: mockStatus, blankLine: mockBlankLine, })); +vi.mock('../shared/i18n/index.js', () => ({ + getLabel: vi.fn((key: string) => key), +})); + vi.mock('../infra/config/index.js', () => ({ getCurrentPiece: mockGetCurrentPiece, })); diff --git a/src/agents/ai-judge.ts b/src/agents/ai-judge.ts index 178d072..004b3d9 100644 --- a/src/agents/ai-judge.ts +++ b/src/agents/ai-judge.ts @@ -6,39 +6,12 @@ */ import type { AiJudgeCaller, AiJudgeCondition } from '../core/piece/types.js'; -import { loadTemplate } from '../shared/prompts/index.js'; import { createLogger } from '../shared/utils/index.js'; -import { runAgent } from './runner.js'; +import { evaluateCondition } from '../core/piece/agent-usecases.js'; const log = createLogger('ai-judge'); -/** - * Detect judge rule index from [JUDGE:N] tag pattern. - * Returns 0-based rule index, or -1 if no match. - */ -export function detectJudgeIndex(content: string): number { - const regex = /\[JUDGE:(\d+)\]/i; - const match = content.match(regex); - if (match?.[1]) { - const index = Number.parseInt(match[1], 10) - 1; - return index >= 0 ? index : -1; - } - return -1; -} - -/** - * Build the prompt for the AI judge that evaluates agent output against ai() conditions. - */ -export function buildJudgePrompt( - agentOutput: string, - aiConditions: AiJudgeCondition[], -): string { - const conditionList = aiConditions - .map((c) => `| ${c.index + 1} | ${c.text} |`) - .join('\n'); - - return loadTemplate('perform_judge_message', 'en', { agentOutput, conditionList }); -} +export { detectJudgeIndex, buildJudgePrompt } from './judge-utils.js'; /** * Call AI judge to evaluate agent output against ai() conditions. 
@@ -50,18 +23,9 @@ export const callAiJudge: AiJudgeCaller = async ( conditions: AiJudgeCondition[], options: { cwd: string }, ): Promise<number> => { - const prompt = buildJudgePrompt(agentOutput, conditions); - - const response = await runAgent(undefined, prompt, { - cwd: options.cwd, - maxTurns: 1, - permissionMode: 'readonly', - }); - - if (response.status !== 'done') { - log.error('AI judge call failed', { error: response.error }); - return -1; + const result = await evaluateCondition(agentOutput, conditions, options); + if (result < 0) { + log.error('AI judge call failed to match a condition'); } - - return detectJudgeIndex(response.content); + return result; }; diff --git a/src/agents/index.ts b/src/agents/index.ts index 6adc5d9..2355cce 100644 --- a/src/agents/index.ts +++ b/src/agents/index.ts @@ -2,6 +2,5 @@ * Agents module - exports agent execution utilities */ -export { AgentRunner, runAgent } from './runner.js'; -export { callAiJudge, detectJudgeIndex, buildJudgePrompt } from './ai-judge.js'; +export { AgentRunner } from './runner.js'; export type { RunAgentOptions, StreamCallback } from './types.js'; diff --git a/src/agents/judge-utils.ts b/src/agents/judge-utils.ts new file mode 100644 index 0000000..0fb4795 --- /dev/null +++ b/src/agents/judge-utils.ts @@ -0,0 +1,22 @@ +import { loadTemplate } from '../shared/prompts/index.js'; + +export function detectJudgeIndex(content: string): number { + const regex = /\[JUDGE:(\d+)\]/i; + const match = content.match(regex); + if (match?.[1]) { + const index = Number.parseInt(match[1], 10) - 1; + return index >= 0 ? 
index : -1; + } + return -1; +} + +export function buildJudgePrompt( + agentOutput: string, + aiConditions: Array<{ index: number; text: string }>, +): string { + const conditionList = aiConditions + .map((c) => `| ${c.index + 1} | ${c.text} |`) + .join('\n'); + + return loadTemplate('perform_judge_message', 'en', { agentOutput, conditionList }); +} diff --git a/src/agents/runner.ts b/src/agents/runner.ts index cb54c93..5b126d0 100644 --- a/src/agents/runner.ts +++ b/src/agents/runner.ts @@ -111,6 +111,7 @@ export class AgentRunner { onPermissionRequest: options.onPermissionRequest, onAskUserQuestion: options.onAskUserQuestion, bypassPermissions: options.bypassPermissions, + outputSchema: options.outputSchema, }; } diff --git a/src/agents/types.ts b/src/agents/types.ts index d27882a..dad5d17 100644 --- a/src/agents/types.ts +++ b/src/agents/types.ts @@ -2,7 +2,7 @@ * Type definitions for agent execution */ -import type { StreamCallback, PermissionHandler, AskUserQuestionHandler } from '../infra/claude/index.js'; +import type { StreamCallback, PermissionHandler, AskUserQuestionHandler } from '../infra/claude/types.js'; import type { PermissionMode, Language, McpServerConfig } from '../core/models/index.js'; export type { StreamCallback }; @@ -39,4 +39,6 @@ export interface RunAgentOptions { movementsList: ReadonlyArray<{ name: string; description?: string }>; currentPosition: string; }; + /** JSON Schema for structured output */ + outputSchema?: Record; } diff --git a/src/core/models/index.ts b/src/core/models/index.ts index 8221700..9177e1b 100644 --- a/src/core/models/index.ts +++ b/src/core/models/index.ts @@ -10,6 +10,9 @@ export type { McpServerConfig, AgentResponse, SessionState, + PartDefinition, + PartResult, + TeamLeaderConfig, PieceRule, PieceMovement, ArpeggioMovementConfig, diff --git a/src/core/models/part.ts b/src/core/models/part.ts new file mode 100644 index 0000000..c04e8a5 --- /dev/null +++ b/src/core/models/part.ts @@ -0,0 +1,42 @@ +import type { 
PermissionMode } from './status.js'; +import type { AgentResponse } from './response.js'; + +/** Part definition produced by movement team leader agent */ +export interface PartDefinition { + /** Unique ID inside the parent movement */ + id: string; + /** Human-readable title */ + title: string; + /** Instruction passed to the part agent */ + instruction: string; + /** Optional per-part timeout in milliseconds */ + timeoutMs?: number; +} + +/** Result of a single part execution */ +export interface PartResult { + part: PartDefinition; + response: AgentResponse; +} + +/** team_leader config on a movement */ +export interface TeamLeaderConfig { + /** Persona reference for the team leader agent */ + persona?: string; + /** Resolved absolute path for team leader persona */ + personaPath?: string; + /** Maximum number of parts to run in parallel */ + maxParts: number; + /** Default timeout for parts in milliseconds */ + timeoutMs: number; + /** Persona reference for part agents */ + partPersona?: string; + /** Resolved absolute path for part persona */ + partPersonaPath?: string; + /** Allowed tools for part agents */ + partAllowedTools?: string[]; + /** Whether part agents can edit files */ + partEdit?: boolean; + /** Permission mode for part agents */ + partPermissionMode?: PermissionMode; +} diff --git a/src/core/models/piece-types.ts b/src/core/models/piece-types.ts index 7c029cc..49deeec 100644 --- a/src/core/models/piece-types.ts +++ b/src/core/models/piece-types.ts @@ -5,6 +5,7 @@ import type { PermissionMode } from './status.js'; import type { AgentResponse } from './response.js'; import type { InteractiveMode } from './interactive-mode.js'; +import type { TeamLeaderConfig } from './part.js'; /** Rule-based transition configuration (unified format) */ export interface PieceRule { @@ -116,6 +117,8 @@ export interface PieceMovement { parallel?: PieceMovement[]; /** Arpeggio configuration for data-driven batch processing. 
When set, this movement reads from a data source, expands templates, and calls LLM per batch. */ arpeggio?: ArpeggioMovementConfig; + /** Team leader configuration for dynamic part decomposition + parallel execution */ + teamLeader?: TeamLeaderConfig; /** Resolved policy content strings (from piece-level policies map, resolved at parse time) */ policyContents?: string[]; /** Resolved knowledge content strings (from piece-level knowledge map, resolved at parse time) */ diff --git a/src/core/models/response.ts b/src/core/models/response.ts index b687e8f..532584c 100644 --- a/src/core/models/response.ts +++ b/src/core/models/response.ts @@ -17,5 +17,6 @@ export interface AgentResponse { matchedRuleIndex?: number; /** How the rule match was detected */ matchedRuleMethod?: RuleMatchMethod; + /** Structured output returned by provider SDK (JSON Schema mode) */ + structuredOutput?: Record; } - diff --git a/src/core/models/schemas.ts b/src/core/models/schemas.ts index 6ab743d..0c0f812 100644 --- a/src/core/models/schemas.ts +++ b/src/core/models/schemas.ts @@ -170,6 +170,24 @@ export const ArpeggioConfigRawSchema = z.object({ output_path: z.string().optional(), }); +/** Team leader configuration schema for dynamic part decomposition */ +export const TeamLeaderConfigRawSchema = z.object({ + /** Persona reference for team leader agent */ + persona: z.string().optional(), + /** Maximum number of parts (must be <= 3) */ + max_parts: z.number().int().positive().max(3).optional().default(3), + /** Default timeout per part in milliseconds */ + timeout_ms: z.number().int().positive().optional().default(600000), + /** Persona reference for part agents */ + part_persona: z.string().optional(), + /** Allowed tools for part agents */ + part_allowed_tools: z.array(z.string()).optional(), + /** Whether part agents can edit files */ + part_edit: z.boolean().optional(), + /** Permission mode for part agents */ + part_permission_mode: PermissionModeSchema.optional(), +}); + /** 
Sub-movement schema for parallel execution */ export const ParallelSubMovementRawSchema = z.object({ name: z.string().min(1), @@ -232,7 +250,15 @@ export const PieceMovementRawSchema = z.object({ parallel: z.array(ParallelSubMovementRawSchema).optional(), /** Arpeggio configuration for data-driven batch processing */ arpeggio: ArpeggioConfigRawSchema.optional(), -}); + /** Team leader configuration for dynamic part decomposition */ + team_leader: TeamLeaderConfigRawSchema.optional(), +}).refine( + (data) => [data.parallel, data.arpeggio, data.team_leader].filter((v) => v != null).length <= 1, + { + message: "'parallel', 'arpeggio', and 'team_leader' are mutually exclusive", + path: ['parallel'], + }, +); /** Loop monitor rule schema */ export const LoopMonitorRuleSchema = z.object({ diff --git a/src/core/models/status.ts b/src/core/models/status.ts index fb77af9..0dde9f7 100644 --- a/src/core/models/status.ts +++ b/src/core/models/status.ts @@ -21,6 +21,8 @@ export type Status = /** How a rule match was detected */ export type RuleMatchMethod = | 'aggregate' + | 'auto_select' + | 'structured_output' | 'phase3_tag' | 'phase1_tag' | 'ai_judge' diff --git a/src/core/models/types.ts b/src/core/models/types.ts index 42e49e9..84b8340 100644 --- a/src/core/models/types.ts +++ b/src/core/models/types.ts @@ -23,6 +23,13 @@ export type { SessionState, } from './session.js'; +// Part decomposition +export type { + PartDefinition, + PartResult, + TeamLeaderConfig, +} from './part.js'; + // Piece configuration and runtime state export type { PieceRule, diff --git a/src/core/piece/agent-usecases.ts b/src/core/piece/agent-usecases.ts new file mode 100644 index 0000000..2c20ee2 --- /dev/null +++ b/src/core/piece/agent-usecases.ts @@ -0,0 +1,170 @@ +import type { AgentResponse, PartDefinition, PieceRule, RuleMatchMethod, Language } from '../models/types.js'; +import { runAgent, type RunAgentOptions } from '../../agents/runner.js'; +import { detectJudgeIndex, buildJudgePrompt } from 
'../../agents/judge-utils.js'; +import { parseParts } from './engine/task-decomposer.js'; +import { loadJudgmentSchema, loadEvaluationSchema, loadDecompositionSchema } from './schema-loader.js'; +import { detectRuleIndex } from '../../shared/utils/ruleIndex.js'; +import { ensureUniquePartIds, parsePartDefinitionEntry } from './part-definition-validator.js'; + +export interface JudgeStatusOptions { + cwd: string; + movementName: string; + language?: Language; +} + +export interface JudgeStatusResult { + ruleIndex: number; + method: RuleMatchMethod; +} + +export interface EvaluateConditionOptions { + cwd: string; +} + +export interface DecomposeTaskOptions { + cwd: string; + persona?: string; + language?: Language; + model?: string; + provider?: 'claude' | 'codex' | 'opencode' | 'mock'; +} + +function toPartDefinitions(raw: unknown, maxParts: number): PartDefinition[] { + if (!Array.isArray(raw)) { + throw new Error('Structured output "parts" must be an array'); + } + if (raw.length === 0) { + throw new Error('Structured output "parts" must not be empty'); + } + if (raw.length > maxParts) { + throw new Error(`Structured output produced too many parts: ${raw.length} > ${maxParts}`); + } + + const parts: PartDefinition[] = raw.map((entry, index) => parsePartDefinitionEntry(entry, index)); + ensureUniquePartIds(parts); + + return parts; +} + +export async function executeAgent( + persona: string | undefined, + instruction: string, + options: RunAgentOptions, +): Promise<AgentResponse> { + return runAgent(persona, instruction, options); +} +export const generateReport = executeAgent; +export const executePart = executeAgent; + +export async function evaluateCondition( + agentOutput: string, + conditions: Array<{ index: number; text: string }>, + options: EvaluateConditionOptions, +): Promise<number> { + const prompt = buildJudgePrompt(agentOutput, conditions); + const response = await runAgent(undefined, prompt, { + cwd: options.cwd, + maxTurns: 1, + permissionMode: 'readonly', + outputSchema:
loadEvaluationSchema(), + }); + + if (response.status !== 'done') { + return -1; + } + + const matchedIndex = response.structuredOutput?.matched_index; + if (typeof matchedIndex === 'number' && Number.isInteger(matchedIndex)) { + const zeroBased = matchedIndex - 1; + if (zeroBased >= 0 && zeroBased < conditions.length) { + return zeroBased; + } + } + + return detectJudgeIndex(response.content); +} + +export async function judgeStatus( + structuredInstruction: string, + tagInstruction: string, + rules: PieceRule[], + options: JudgeStatusOptions, +): Promise<JudgeStatusResult> { + if (rules.length === 0) { + throw new Error('judgeStatus requires at least one rule'); + } + + if (rules.length === 1) { + return { ruleIndex: 0, method: 'auto_select' }; + } + + const agentOptions = { + cwd: options.cwd, + maxTurns: 3, + permissionMode: 'readonly' as const, + language: options.language, + }; + + // Stage 1: Structured output + const structuredResponse = await runAgent('conductor', structuredInstruction, { + ...agentOptions, + outputSchema: loadJudgmentSchema(), + }); + + if (structuredResponse.status === 'done') { + const stepNumber = structuredResponse.structuredOutput?.step; + if (typeof stepNumber === 'number' && Number.isInteger(stepNumber)) { + const ruleIndex = stepNumber - 1; + if (ruleIndex >= 0 && ruleIndex < rules.length) { + return { ruleIndex, method: 'structured_output' }; + } + } + } + + // Stage 2: Tag detection (dedicated call, no outputSchema) + const tagResponse = await runAgent('conductor', tagInstruction, agentOptions); + + if (tagResponse.status === 'done') { + const tagRuleIndex = detectRuleIndex(tagResponse.content, options.movementName); + if (tagRuleIndex >= 0 && tagRuleIndex < rules.length) { + return { ruleIndex: tagRuleIndex, method: 'phase3_tag' }; + } + } + + // Stage 3: AI judge + const conditions = rules.map((rule, index) => ({ index, text: rule.condition })); + const fallbackIndex = await evaluateCondition(structuredInstruction, conditions, { cwd: options.cwd
}); + if (fallbackIndex >= 0 && fallbackIndex < rules.length) { + return { ruleIndex: fallbackIndex, method: 'ai_judge' }; + } + + throw new Error(`Status not found for movement "${options.movementName}"`); +} + +export async function decomposeTask( + instruction: string, + maxParts: number, + options: DecomposeTaskOptions, +): Promise<PartDefinition[]> { + const response = await runAgent(options.persona, instruction, { + cwd: options.cwd, + language: options.language, + model: options.model, + provider: options.provider, + permissionMode: 'readonly', + maxTurns: 3, + outputSchema: loadDecompositionSchema(maxParts), + }); + + if (response.status !== 'done') { + const detail = response.error ?? response.content; + throw new Error(`Team leader failed: ${detail}`); + } + + const parts = response.structuredOutput?.parts; + if (parts != null) { + return toPartDefinitions(parts, maxParts); + } + + return parseParts(response.content, maxParts); +} diff --git a/src/core/piece/engine/ArpeggioRunner.ts b/src/core/piece/engine/ArpeggioRunner.ts index 017c247..24adc45 100644 --- a/src/core/piece/engine/ArpeggioRunner.ts +++ b/src/core/piece/engine/ArpeggioRunner.ts @@ -15,7 +15,8 @@ import type { ArpeggioMovementConfig, BatchResult, DataBatch } from '../arpeggio import { createDataSource } from '../arpeggio/data-source-factory.js'; import { loadTemplate, expandTemplate } from '../arpeggio/template.js'; import { buildMergeFn, writeMergedOutput } from '../arpeggio/merge.js'; -import { runAgent, type RunAgentOptions } from '../../../agents/runner.js'; +import type { RunAgentOptions } from '../../../agents/runner.js'; +import { executeAgent } from '../agent-usecases.js'; import { detectMatchedRule } from '../evaluation/index.js'; import { incrementMovementIteration } from './state-manager.js'; import { createLogger } from '../../../shared/utils/index.js'; @@ -84,7 +85,7 @@ async function executeBatchWithRetry( for (let attempt = 0; attempt <= maxRetries; attempt++) { try { - const response = await
runAgent(persona, prompt, agentOptions); + const response = await executeAgent(persona, prompt, agentOptions); if (response.status === 'error') { lastError = response.error ?? response.content ?? 'Agent returned error status'; log.info('Batch execution failed, retrying', { diff --git a/src/core/piece/engine/MovementExecutor.ts b/src/core/piece/engine/MovementExecutor.ts index 9f0d994..32a2b8a 100644 --- a/src/core/piece/engine/MovementExecutor.ts +++ b/src/core/piece/engine/MovementExecutor.ts @@ -15,7 +15,7 @@ import type { Language, } from '../../models/types.js'; import type { PhaseName } from '../types.js'; -import { runAgent } from '../../../agents/runner.js'; +import { executeAgent } from '../agent-usecases.js'; import { InstructionBuilder, isOutputContractItem } from '../instruction/InstructionBuilder.js'; import { needsStatusJudgmentPhase, runReportPhase, runStatusJudgmentPhase } from '../phase-runner.js'; import { detectMatchedRule } from '../evaluation/index.js'; @@ -202,7 +202,7 @@ export class MovementExecutor { // Phase 1: main execution (Write excluded if movement has report) this.deps.onPhaseStart?.(step, 1, 'execute', instruction); const agentOptions = this.deps.optionsBuilder.buildAgentOptions(step); - let response = await runAgent(step.persona, instruction, agentOptions); + let response = await executeAgent(step.persona, instruction, agentOptions); updatePersonaSession(sessionKey, response.sessionId); this.deps.onPhaseComplete?.(step, 1, 'execute', response.content, response.status, response.error); @@ -220,22 +220,28 @@ export class MovementExecutor { } } - // Phase 3: status judgment (resume session, no tools, output status tag) - let tagContent = ''; - if (needsStatusJudgmentPhase(step)) { - tagContent = await runStatusJudgmentPhase(step, phaseCtx); - } + // Phase 3: status judgment (new session, no tools, determines matched rule) + const phase3Result = needsStatusJudgmentPhase(step) + ? 
await runStatusJudgmentPhase(step, phaseCtx) + : undefined; - const match = await detectMatchedRule(step, response.content, tagContent, { - state, - cwd: this.deps.getCwd(), - interactive: this.deps.getInteractive(), - detectRuleIndex: this.deps.detectRuleIndex, - callAiJudge: this.deps.callAiJudge, - }); - if (match) { - log.debug('Rule matched', { movement: step.name, ruleIndex: match.index, method: match.method }); - response = { ...response, matchedRuleIndex: match.index, matchedRuleMethod: match.method }; + if (phase3Result) { + // Phase 3 already determined the matched rule — use its result directly + log.debug('Rule matched (Phase 3)', { movement: step.name, ruleIndex: phase3Result.ruleIndex, method: phase3Result.method }); + response = { ...response, matchedRuleIndex: phase3Result.ruleIndex, matchedRuleMethod: phase3Result.method }; + } else { + // No Phase 3 — use rule evaluator with Phase 1 content + const match = await detectMatchedRule(step, response.content, '', { + state, + cwd: this.deps.getCwd(), + interactive: this.deps.getInteractive(), + detectRuleIndex: this.deps.detectRuleIndex, + callAiJudge: this.deps.callAiJudge, + }); + if (match) { + log.debug('Rule matched', { movement: step.name, ruleIndex: match.index, method: match.method }); + response = { ...response, matchedRuleIndex: match.index, matchedRuleMethod: match.method }; + } } state.movementOutputs.set(step.name, response); diff --git a/src/core/piece/engine/OptionsBuilder.ts b/src/core/piece/engine/OptionsBuilder.ts index 8fe68c3..1b11ddc 100644 --- a/src/core/piece/engine/OptionsBuilder.ts +++ b/src/core/piece/engine/OptionsBuilder.ts @@ -1,10 +1,3 @@ -/** - * Builds RunAgentOptions for different execution phases. - * - * Centralizes the option construction logic that was previously - * scattered across PieceEngine methods. 
- */ - import { join } from 'node:path'; import type { PieceMovement, PieceState, Language } from '../../models/types.js'; import type { RunAgentOptions } from '../../../agents/runner.js'; @@ -85,13 +78,26 @@ export class OptionsBuilder { buildResumeOptions( step: PieceMovement, sessionId: string, - overrides: Pick, + overrides: Pick, ): RunAgentOptions { return { ...this.buildBaseOptions(step), // Report/status phases are read-only regardless of movement settings. permissionMode: 'readonly', sessionId, + allowedTools: [], + maxTurns: overrides.maxTurns, + }; + } + + /** Build RunAgentOptions for Phase 2 retry with a new session */ + buildNewSessionReportOptions( + step: PieceMovement, + overrides: Pick, + ): RunAgentOptions { + return { + ...this.buildBaseOptions(step), + permissionMode: 'readonly', allowedTools: overrides.allowedTools, maxTurns: overrides.maxTurns, }; @@ -113,6 +119,7 @@ export class OptionsBuilder { lastResponse, getSessionId: (persona: string) => state.personaSessions.get(persona), buildResumeOptions: this.buildResumeOptions.bind(this), + buildNewSessionReportOptions: this.buildNewSessionReportOptions.bind(this), updatePersonaSession, onPhaseStart, onPhaseComplete, diff --git a/src/core/piece/engine/ParallelRunner.ts b/src/core/piece/engine/ParallelRunner.ts index a72a04e..1a5d865 100644 --- a/src/core/piece/engine/ParallelRunner.ts +++ b/src/core/piece/engine/ParallelRunner.ts @@ -10,7 +10,7 @@ import type { PieceState, AgentResponse, } from '../../models/types.js'; -import { runAgent } from '../../../agents/runner.js'; +import { executeAgent } from '../agent-usecases.js'; import { ParallelLogger } from './parallel-logger.js'; import { needsStatusJudgmentPhase, runReportPhase, runStatusJudgmentPhase } from '../phase-runner.js'; import { detectMatchedRule } from '../evaluation/index.js'; @@ -101,7 +101,7 @@ export class ParallelRunner { : baseOptions; this.deps.onPhaseStart?.(subMovement, 1, 'execute', subInstruction); - const subResponse = 
await runAgent(subMovement.persona, subInstruction, agentOptions); + const subResponse = await executeAgent(subMovement.persona, subInstruction, agentOptions); updatePersonaSession(subSessionKey, subResponse.sessionId); this.deps.onPhaseComplete?.(subMovement, 1, 'execute', subResponse.content, subResponse.status, subResponse.error); @@ -114,15 +114,19 @@ export class ParallelRunner { } // Phase 3: status judgment for sub-movement - let subTagContent = ''; - if (needsStatusJudgmentPhase(subMovement)) { - subTagContent = await runStatusJudgmentPhase(subMovement, phaseCtx); - } + const subPhase3 = needsStatusJudgmentPhase(subMovement) + ? await runStatusJudgmentPhase(subMovement, phaseCtx) + : undefined; - const match = await detectMatchedRule(subMovement, subResponse.content, subTagContent, ruleCtx); - const finalResponse = match - ? { ...subResponse, matchedRuleIndex: match.index, matchedRuleMethod: match.method } - : subResponse; + let finalResponse: AgentResponse; + if (subPhase3) { + finalResponse = { ...subResponse, matchedRuleIndex: subPhase3.ruleIndex, matchedRuleMethod: subPhase3.method }; + } else { + const match = await detectMatchedRule(subMovement, subResponse.content, '', ruleCtx); + finalResponse = match + ? 
{ ...subResponse, matchedRuleIndex: match.index, matchedRuleMethod: match.method } + : subResponse; + } state.movementOutputs.set(subMovement.name, finalResponse); this.deps.movementExecutor.emitMovementReports(subMovement); diff --git a/src/core/piece/engine/PieceEngine.ts b/src/core/piece/engine/PieceEngine.ts index cfde239..b65a9de 100644 --- a/src/core/piece/engine/PieceEngine.ts +++ b/src/core/piece/engine/PieceEngine.ts @@ -31,6 +31,7 @@ import { OptionsBuilder } from './OptionsBuilder.js'; import { MovementExecutor } from './MovementExecutor.js'; import { ParallelRunner } from './ParallelRunner.js'; import { ArpeggioRunner } from './ArpeggioRunner.js'; +import { TeamLeaderRunner } from './TeamLeaderRunner.js'; import { buildRunPaths, type RunPaths } from '../run/run-paths.js'; const log = createLogger('engine'); @@ -63,6 +64,7 @@ export class PieceEngine extends EventEmitter { private readonly movementExecutor: MovementExecutor; private readonly parallelRunner: ParallelRunner; private readonly arpeggioRunner: ArpeggioRunner; + private readonly teamLeaderRunner: TeamLeaderRunner; private readonly detectRuleIndex: (content: string, movementName: string) => number; private readonly callAiJudge: ( agentOutput: string, @@ -163,6 +165,22 @@ export class PieceEngine extends EventEmitter { }, }); + this.teamLeaderRunner = new TeamLeaderRunner({ + optionsBuilder: this.optionsBuilder, + movementExecutor: this.movementExecutor, + engineOptions: this.options, + getCwd: () => this.cwd, + getInteractive: () => this.options.interactive === true, + detectRuleIndex: this.detectRuleIndex, + callAiJudge: this.callAiJudge, + onPhaseStart: (step, phase, phaseName, instruction) => { + this.emit('phase:start', step, phase, phaseName, instruction); + }, + onPhaseComplete: (step, phase, phaseName, content, phaseStatus, error) => { + this.emit('phase:complete', step, phase, phaseName, content, phaseStatus, error); + }, + }); + log.debug('PieceEngine initialized', { piece: 
config.name, movements: config.movements.map(s => s.name), @@ -337,6 +355,10 @@ export class PieceEngine extends EventEmitter { result = await this.arpeggioRunner.runArpeggioMovement( step, this.state, ); + } else if (step.teamLeader) { + result = await this.teamLeaderRunner.runTeamLeaderMovement( + step, this.state, this.task, this.config.maxMovements, updateSession, + ); } else { result = await this.movementExecutor.runNormalMovement( step, this.state, this.task, this.config.maxMovements, updateSession, prebuiltInstruction, @@ -531,8 +553,8 @@ export class PieceEngine extends EventEmitter { this.state.iteration++; // Build instruction before emitting movement:start so listeners can log it. - // Parallel and arpeggio movements handle iteration incrementing internally. - const isDelegated = (movement.parallel && movement.parallel.length > 0) || !!movement.arpeggio; + // Parallel/arpeggio/team_leader movements handle iteration incrementing internally. + const isDelegated = (movement.parallel && movement.parallel.length > 0) || !!movement.arpeggio || !!movement.teamLeader; let prebuiltInstruction: string | undefined; if (!isDelegated) { const movementIteration = incrementMovementIteration(this.state, movement.name); @@ -562,7 +584,7 @@ export class PieceEngine extends EventEmitter { } if (response.status === 'error') { - const detail = response.error ?? response.content ?? `Movement "${movement.name}" returned error status`; + const detail = response.error ?? 
response.content; this.state.status = 'aborted'; this.emit('piece:abort', this.state, `Movement "${movement.name}" failed: ${detail}`); break; diff --git a/src/core/piece/engine/TeamLeaderRunner.ts b/src/core/piece/engine/TeamLeaderRunner.ts new file mode 100644 index 0000000..9b92eaf --- /dev/null +++ b/src/core/piece/engine/TeamLeaderRunner.ts @@ -0,0 +1,274 @@ +import type { + PieceMovement, + PieceState, + AgentResponse, + PartDefinition, + PartResult, +} from '../../models/types.js'; +import { decomposeTask, executeAgent } from '../agent-usecases.js'; +import { detectMatchedRule } from '../evaluation/index.js'; +import { buildSessionKey } from '../session-key.js'; +import { ParallelLogger } from './parallel-logger.js'; +import { incrementMovementIteration } from './state-manager.js'; +import { buildAbortSignal } from './abort-signal.js'; +import { createLogger, getErrorMessage } from '../../../shared/utils/index.js'; +import type { OptionsBuilder } from './OptionsBuilder.js'; +import type { MovementExecutor } from './MovementExecutor.js'; +import type { PieceEngineOptions, PhaseName } from '../types.js'; +import type { ParallelLoggerOptions } from './parallel-logger.js'; + +const log = createLogger('team-leader-runner'); + +function resolvePartErrorDetail(partResult: PartResult): string { + const detail = partResult.response.error ?? 
partResult.response.content; + if (!detail) { + throw new Error(`Part "${partResult.part.id}" failed without error detail`); + } + return detail; +} + +export interface TeamLeaderRunnerDeps { + readonly optionsBuilder: OptionsBuilder; + readonly movementExecutor: MovementExecutor; + readonly engineOptions: PieceEngineOptions; + readonly getCwd: () => string; + readonly getInteractive: () => boolean; + readonly detectRuleIndex: (content: string, movementName: string) => number; + readonly callAiJudge: ( + agentOutput: string, + conditions: Array<{ index: number; text: string }>, + options: { cwd: string } + ) => Promise<number>; + readonly onPhaseStart?: (step: PieceMovement, phase: 1 | 2 | 3, phaseName: PhaseName, instruction: string) => void; + readonly onPhaseComplete?: (step: PieceMovement, phase: 1 | 2 | 3, phaseName: PhaseName, content: string, status: string, error?: string) => void; +} + +function createPartMovement(step: PieceMovement, part: PartDefinition): PieceMovement { + if (!step.teamLeader) { + throw new Error(`Movement "${step.name}" has no teamLeader configuration`); + } + + return { + name: `${step.name}.${part.id}`, + description: part.title, + persona: step.teamLeader.partPersona ?? step.persona, + personaPath: step.teamLeader.partPersonaPath ?? step.personaPath, + personaDisplayName: `${step.name}:${part.id}`, + session: 'refresh', + allowedTools: step.teamLeader.partAllowedTools ?? step.allowedTools, + mcpServers: step.mcpServers, + provider: step.provider, + model: step.model, + permissionMode: step.teamLeader.partPermissionMode ?? step.permissionMode, + edit: step.teamLeader.partEdit ??
step.edit, + instructionTemplate: part.instruction, + passPreviousResponse: false, + }; +} + +export class TeamLeaderRunner { + constructor( + private readonly deps: TeamLeaderRunnerDeps, + ) {} + + async runTeamLeaderMovement( + step: PieceMovement, + state: PieceState, + task: string, + maxMovements: number, + updatePersonaSession: (persona: string, sessionId: string | undefined) => void, + ): Promise<{ response: AgentResponse; instruction: string }> { + if (!step.teamLeader) { + throw new Error(`Movement "${step.name}" has no teamLeader configuration`); + } + const teamLeaderConfig = step.teamLeader; + + const movementIteration = incrementMovementIteration(state, step.name); + const leaderStep: PieceMovement = { + ...step, + persona: teamLeaderConfig.persona ?? step.persona, + personaPath: teamLeaderConfig.personaPath ?? step.personaPath, + }; + const instruction = this.deps.movementExecutor.buildInstruction( + leaderStep, + movementIteration, + state, + task, + maxMovements, + ); + + this.deps.onPhaseStart?.(leaderStep, 1, 'execute', instruction); + const parts = await decomposeTask(instruction, teamLeaderConfig.maxParts, { + cwd: this.deps.getCwd(), + persona: leaderStep.persona, + model: leaderStep.model, + provider: leaderStep.provider, + }); + const leaderResponse: AgentResponse = { + persona: leaderStep.persona ?? leaderStep.name, + status: 'done', + content: JSON.stringify({ parts }, null, 2), + timestamp: new Date(), + }; + this.deps.onPhaseComplete?.(leaderStep, 1, 'execute', leaderResponse.content, leaderResponse.status, leaderResponse.error); + log.debug('Team leader decomposed parts', { + movement: step.name, + partCount: parts.length, + partIds: parts.map((part) => part.id), + }); + + const parallelLogger = this.deps.engineOptions.onStream + ? 
new ParallelLogger(this.buildParallelLoggerOptions( + step.name, + movementIteration, + parts.map((part) => part.id), + state.iteration, + maxMovements, + )) + : undefined; + + const settled = await Promise.allSettled( + parts.map((part, index) => this.runSinglePart( + step, + part, + index, + teamLeaderConfig.timeoutMs, + updatePersonaSession, + parallelLogger, + )), + ); + + const partResults: PartResult[] = settled.map((result, index) => { + const part = parts[index]; + if (!part) { + throw new Error(`Missing part at index ${index}`); + } + + if (result.status === 'fulfilled') { + state.movementOutputs.set(result.value.response.persona, result.value.response); + return result.value; + } + + const errorMsg = getErrorMessage(result.reason); + const errorResponse: AgentResponse = { + persona: `${step.name}.${part.id}`, + status: 'error', + content: '', + timestamp: new Date(), + error: errorMsg, + }; + state.movementOutputs.set(errorResponse.persona, errorResponse); + return { part, response: errorResponse }; + }); + + const allFailed = partResults.every((result) => result.response.status === 'error'); + if (allFailed) { + const errors = partResults.map((result) => `${result.part.id}: ${resolvePartErrorDetail(result)}`).join('; '); + throw new Error(`All team leader parts failed: ${errors}`); + } + + if (parallelLogger) { + parallelLogger.printSummary( + step.name, + partResults.map((result) => ({ name: result.part.id, condition: undefined })), + ); + } + + const aggregatedContent = [ + '## decomposition', + leaderResponse.content, + ...partResults.map((result) => [ + `## ${result.part.id}: ${result.part.title}`, + result.response.status === 'error' + ? 
`[ERROR] ${resolvePartErrorDetail(result)}` + : result.response.content, + ].join('\n')), + ].join('\n\n---\n\n'); + + const ruleCtx = { + state, + cwd: this.deps.getCwd(), + interactive: this.deps.getInteractive(), + detectRuleIndex: this.deps.detectRuleIndex, + callAiJudge: this.deps.callAiJudge, + }; + const match = await detectMatchedRule(step, aggregatedContent, '', ruleCtx); + + const aggregatedResponse: AgentResponse = { + persona: step.name, + status: 'done', + content: aggregatedContent, + timestamp: new Date(), + ...(match && { matchedRuleIndex: match.index, matchedRuleMethod: match.method }), + }; + + state.movementOutputs.set(step.name, aggregatedResponse); + state.lastOutput = aggregatedResponse; + this.deps.movementExecutor.persistPreviousResponseSnapshot( + state, + step.name, + movementIteration, + aggregatedResponse.content, + ); + this.deps.movementExecutor.emitMovementReports(step); + + return { response: aggregatedResponse, instruction }; + } + + private async runSinglePart( + step: PieceMovement, + part: PartDefinition, + partIndex: number, + defaultTimeoutMs: number, + updatePersonaSession: (persona: string, sessionId: string | undefined) => void, + parallelLogger: ParallelLogger | undefined, + ): Promise<PartResult> { + const partMovement = createPartMovement(step, part); + const baseOptions = this.deps.optionsBuilder.buildAgentOptions(partMovement); + const timeoutMs = part.timeoutMs ?? defaultTimeoutMs; + const { signal, dispose } = buildAbortSignal(timeoutMs, baseOptions.abortSignal); + const options = parallelLogger + ?
{ ...baseOptions, abortSignal: signal, onStream: parallelLogger.createStreamHandler(part.id, partIndex) } + : { ...baseOptions, abortSignal: signal }; + + try { + const response = await executeAgent(partMovement.persona, part.instruction, options); + updatePersonaSession(buildSessionKey(partMovement), response.sessionId); + return { + part, + response: { + ...response, + persona: partMovement.name, + }, + }; + } finally { + dispose(); + } + } + + private buildParallelLoggerOptions( + movementName: string, + movementIteration: number, + subMovementNames: string[], + iteration: number, + maxMovements: number, + ): ParallelLoggerOptions { + const options: ParallelLoggerOptions = { + subMovementNames, + parentOnStream: this.deps.engineOptions.onStream, + progressInfo: { iteration, maxMovements }, + }; + + if (this.deps.engineOptions.taskPrefix != null && this.deps.engineOptions.taskColorIndex != null) { + return { + ...options, + taskLabel: this.deps.engineOptions.taskPrefix, + taskColorIndex: this.deps.engineOptions.taskColorIndex, + parentMovementName: movementName, + movementIteration, + }; + } + + return options; + } +} diff --git a/src/core/piece/engine/abort-signal.ts b/src/core/piece/engine/abort-signal.ts new file mode 100644 index 0000000..73e009d --- /dev/null +++ b/src/core/piece/engine/abort-signal.ts @@ -0,0 +1,29 @@ +export function buildAbortSignal( + timeoutMs: number, + parentSignal: AbortSignal | undefined, +): { signal: AbortSignal; dispose: () => void } { + const timeoutController = new AbortController(); + const timeoutId = setTimeout(() => { + timeoutController.abort(new Error(`Part timeout after ${timeoutMs}ms`)); + }, timeoutMs); + + let abortListener: (() => void) | undefined; + if (parentSignal) { + abortListener = () => timeoutController.abort(parentSignal.reason); + if (parentSignal.aborted) { + abortListener(); + } else { + parentSignal.addEventListener('abort', abortListener, { once: true }); + } + } + + return { + signal: 
timeoutController.signal, + dispose: () => { + clearTimeout(timeoutId); + if (parentSignal && abortListener) { + parentSignal.removeEventListener('abort', abortListener); + } + }, + }; +} diff --git a/src/core/piece/engine/index.ts b/src/core/piece/engine/index.ts index 505be71..2058869 100644 --- a/src/core/piece/engine/index.ts +++ b/src/core/piece/engine/index.ts @@ -9,6 +9,7 @@ export { MovementExecutor } from './MovementExecutor.js'; export type { MovementExecutorDeps } from './MovementExecutor.js'; export { ParallelRunner } from './ParallelRunner.js'; export { ArpeggioRunner } from './ArpeggioRunner.js'; +export { TeamLeaderRunner } from './TeamLeaderRunner.js'; export { OptionsBuilder } from './OptionsBuilder.js'; export { CycleDetector } from './cycle-detector.js'; export type { CycleCheckResult } from './cycle-detector.js'; diff --git a/src/core/piece/engine/task-decomposer.ts b/src/core/piece/engine/task-decomposer.ts new file mode 100644 index 0000000..34ffc6f --- /dev/null +++ b/src/core/piece/engine/task-decomposer.ts @@ -0,0 +1,44 @@ +import type { PartDefinition } from '../../models/part.js'; +import { ensureUniquePartIds, parsePartDefinitionEntry } from '../part-definition-validator.js'; + +const JSON_CODE_BLOCK_REGEX = /```json\s*([\s\S]*?)```/g; + +function parseJsonBlock(content: string): unknown { + let lastJsonBlock: string | undefined; + let match: RegExpExecArray | null; + + while ((match = JSON_CODE_BLOCK_REGEX.exec(content)) !== null) { + if (match[1]) { + lastJsonBlock = match[1].trim(); + } + } + + if (!lastJsonBlock) { + throw new Error('Team leader output must include a ```json ... ``` block'); + } + + try { + return JSON.parse(lastJsonBlock) as unknown; + } catch (error) { + const message = error instanceof Error ? 
error.message : String(error); + throw new Error(`Failed to parse part JSON: ${message}`); + } +} + +export function parseParts(content: string, maxParts: number): PartDefinition[] { + const parsed = parseJsonBlock(content); + if (!Array.isArray(parsed)) { + throw new Error('Team leader JSON must be an array'); + } + if (parsed.length === 0) { + throw new Error('Team leader JSON must contain at least one part'); + } + if (parsed.length > maxParts) { + throw new Error(`Team leader produced too many parts: ${parsed.length} > ${maxParts}`); + } + + const parts = parsed.map((entry, index) => parsePartDefinitionEntry(entry, index)); + ensureUniquePartIds(parts); + + return parts; +} diff --git a/src/core/piece/index.ts b/src/core/piece/index.ts index 776c810..3684ea4 100644 --- a/src/core/piece/index.ts +++ b/src/core/piece/index.ts @@ -60,8 +60,19 @@ export { buildEditRule, type InstructionContext } from './instruction/instructio export { generateStatusRulesComponents, type StatusRulesComponents } from './instruction/status-rules.js'; // Rule evaluation -export { RuleEvaluator, type RuleMatch, type RuleEvaluatorContext, detectMatchedRule, evaluateAggregateConditions } from './evaluation/index.js'; +export { RuleEvaluator, type RuleMatch, type RuleEvaluatorContext, evaluateAggregateConditions } from './evaluation/index.js'; export { AggregateEvaluator } from './evaluation/AggregateEvaluator.js'; // Phase runner -export { needsStatusJudgmentPhase, runReportPhase, runStatusJudgmentPhase, type ReportPhaseBlockedResult } from './phase-runner.js'; +export { needsStatusJudgmentPhase, type ReportPhaseBlockedResult } from './phase-runner.js'; + +// Agent usecases +export { + executeAgent, + generateReport, + executePart, + judgeStatus, + evaluateCondition, + decomposeTask, + type JudgeStatusResult, +} from './agent-usecases.js'; diff --git a/src/core/piece/instruction/ReportInstructionBuilder.ts b/src/core/piece/instruction/ReportInstructionBuilder.ts index 6ed90d4..009822c 
100644 --- a/src/core/piece/instruction/ReportInstructionBuilder.ts +++ b/src/core/piece/instruction/ReportInstructionBuilder.ts @@ -25,6 +25,8 @@ export interface ReportInstructionContext { language?: Language; /** Target report file name (when generating a single report) */ targetFile?: string; + /** Last response from Phase 1 (used when report phase retries in a new session) */ + lastResponse?: string; } /** @@ -45,7 +47,6 @@ export class ReportInstructionBuilder { const language = this.context.language ?? 'en'; - // Build report context for Piece Context section let reportContext: string; if (this.context.targetFile) { reportContext = `- Report Directory: ${this.context.reportDir}/\n- Report File: ${this.context.reportDir}/${this.context.targetFile}`; @@ -53,7 +54,6 @@ export class ReportInstructionBuilder { reportContext = renderReportContext(this.step.outputContracts, this.context.reportDir); } - // Build report output instruction let reportOutput = ''; let hasReportOutput = false; const instrContext: InstructionContext = { @@ -68,7 +68,6 @@ export class ReportInstructionBuilder { language, }; - // Check for order instruction in first output contract item const firstContract = this.step.outputContracts[0]; if (firstContract && isOutputContractItem(firstContract) && firstContract.order) { reportOutput = replaceTemplatePlaceholders(firstContract.order.trimEnd(), this.step, instrContext); @@ -81,7 +80,6 @@ export class ReportInstructionBuilder { } } - // Build output contract (from first item's format) let outputContract = ''; let hasOutputContract = false; if (firstContract && isOutputContractItem(firstContract) && firstContract.format) { @@ -92,6 +90,8 @@ export class ReportInstructionBuilder { return loadTemplate('perform_phase2_message', language, { workingDirectory: this.context.cwd, reportContext, + hasLastResponse: this.context.lastResponse != null && this.context.lastResponse.trim().length > 0, + lastResponse: this.context.lastResponse ?? 
'', hasReportOutput, reportOutput, hasOutputContract, diff --git a/src/core/piece/instruction/StatusJudgmentBuilder.ts b/src/core/piece/instruction/StatusJudgmentBuilder.ts index 85a9354..9529050 100644 --- a/src/core/piece/instruction/StatusJudgmentBuilder.ts +++ b/src/core/piece/instruction/StatusJudgmentBuilder.ts @@ -27,6 +27,8 @@ export interface StatusJudgmentContext { lastResponse?: string; /** Input source type for fallback strategies */ inputSource?: 'report' | 'response'; + /** When true, omit tag output instructions (structured output schema handles format) */ + structuredOutput?: boolean; } /** @@ -64,12 +66,17 @@ export class StatusJudgmentBuilder { contentToJudge = this.buildFromResponse(); } + const isStructured = this.context.structuredOutput ?? false; + return loadTemplate('perform_phase3_message', language, { reportContent: contentToJudge, criteriaTable: components.criteriaTable, - outputList: components.outputList, - hasAppendix: components.hasAppendix, - appendixContent: components.appendixContent, + structuredOutput: isStructured, + ...(isStructured ? {} : { + outputList: components.outputList, + hasAppendix: components.hasAppendix, + appendixContent: components.appendixContent, + }), }); } diff --git a/src/core/piece/judgment/FallbackStrategy.ts b/src/core/piece/judgment/FallbackStrategy.ts deleted file mode 100644 index f3007c8..0000000 --- a/src/core/piece/judgment/FallbackStrategy.ts +++ /dev/null @@ -1,255 +0,0 @@ -/** - * Fallback strategies for Phase 3 judgment. - * - * Implements Chain of Responsibility pattern to try multiple judgment methods - * when conductor cannot determine the status from report alone. 
- */ - -import { readFileSync } from 'node:fs'; -import { resolve } from 'node:path'; -import type { PieceMovement, Language } from '../../models/types.js'; -import { runAgent } from '../../../agents/runner.js'; -import { StatusJudgmentBuilder } from '../instruction/StatusJudgmentBuilder.js'; -import { JudgmentDetector, type JudgmentResult } from './JudgmentDetector.js'; -import { hasOnlyOneBranch, getAutoSelectedTag, getReportFiles } from '../evaluation/rule-utils.js'; -import { createLogger } from '../../../shared/utils/index.js'; - -const log = createLogger('fallback-strategy'); - -export interface JudgmentContext { - step: PieceMovement; - cwd: string; - language?: Language; - reportDir?: string; - lastResponse?: string; // Final response from Phase 1 - sessionId?: string; -} - -export interface JudgmentStrategy { - readonly name: string; - canApply(context: JudgmentContext): boolean; - execute(context: JudgmentContext): Promise<JudgmentResult>; -} - -/** - * Base class for judgment strategies using Template Method Pattern. - */ -abstract class JudgmentStrategyBase implements JudgmentStrategy { - abstract readonly name: string; - - abstract canApply(context: JudgmentContext): boolean; - - async execute(context: JudgmentContext): Promise<JudgmentResult> { - try { - // 1. Gather input (implemented by subclasses) - const input = await this.gatherInput(context); - - // 2. Build the instruction (implemented by subclasses) - const instruction = this.buildInstruction(input, context); - - // 3. Run the conductor (shared) - const response = await this.runConductor(instruction, context); - - // 4. Detect the result (shared) - return JudgmentDetector.detect(response); - } catch (error) { - const errorMsg = error instanceof Error ?
error.message : String(error); - log.debug(`Strategy ${this.name} threw error`, { error: errorMsg }); - return { - success: false, - reason: `Strategy failed with error: ${errorMsg}`, - }; - } - } - - protected abstract gatherInput(context: JudgmentContext): Promise; - - protected abstract buildInstruction(input: string, context: JudgmentContext): string; - - protected async runConductor(instruction: string, context: JudgmentContext): Promise { - const response = await runAgent('conductor', instruction, { - cwd: context.cwd, - maxTurns: 3, - permissionMode: 'readonly', - language: context.language, - }); - - if (response.status !== 'done') { - throw new Error(`Conductor failed: ${response.error || response.content || 'Unknown error'}`); - } - - return response.content; - } -} - -/** - * Strategy 1: Auto-select when there's only one branch. - * This strategy doesn't use conductor - just returns the single tag. - */ -export class AutoSelectStrategy implements JudgmentStrategy { - readonly name = 'AutoSelect'; - - canApply(context: JudgmentContext): boolean { - return hasOnlyOneBranch(context.step); - } - - async execute(context: JudgmentContext): Promise { - const tag = getAutoSelectedTag(context.step); - log.debug('Auto-selected tag (single branch)', { tag }); - return { - success: true, - tag, - }; - } -} - -/** - * Strategy 2: Report-based judgment. - * Read report files and ask conductor to judge. 
- */ -export class ReportBasedStrategy extends JudgmentStrategyBase { - readonly name = 'ReportBased'; - - canApply(context: JudgmentContext): boolean { - return context.reportDir !== undefined && getReportFiles(context.step.outputContracts).length > 0; - } - - protected async gatherInput(context: JudgmentContext): Promise { - if (!context.reportDir) { - throw new Error('Report directory not provided'); - } - - const reportFiles = getReportFiles(context.step.outputContracts); - if (reportFiles.length === 0) { - throw new Error('No report files configured'); - } - - const reportContents: string[] = []; - for (const fileName of reportFiles) { - const filePath = resolve(context.reportDir, fileName); - try { - const content = readFileSync(filePath, 'utf-8'); - reportContents.push(`# ${fileName}\n\n${content}`); - } catch (error) { - const errorMsg = error instanceof Error ? error.message : String(error); - throw new Error(`Failed to read report file ${fileName}: ${errorMsg}`); - } - } - - return reportContents.join('\n\n---\n\n'); - } - - protected buildInstruction(input: string, context: JudgmentContext): string { - return new StatusJudgmentBuilder(context.step, { - language: context.language, - reportContent: input, - inputSource: 'report', - }).build(); - } -} - -/** - * Strategy 3: Response-based judgment. - * Use the last response from Phase 1 to judge. 
- */ -export class ResponseBasedStrategy extends JudgmentStrategyBase { - readonly name = 'ResponseBased'; - - canApply(context: JudgmentContext): boolean { - return context.lastResponse !== undefined && context.lastResponse.length > 0; - } - - protected async gatherInput(context: JudgmentContext): Promise { - if (!context.lastResponse) { - throw new Error('Last response not provided'); - } - return context.lastResponse; - } - - protected buildInstruction(input: string, context: JudgmentContext): string { - return new StatusJudgmentBuilder(context.step, { - language: context.language, - lastResponse: input, - inputSource: 'response', - }).build(); - } -} - -/** - * Strategy 4: Agent consult. - * Resume the Phase 1 agent session and ask which tag is appropriate. - */ -export class AgentConsultStrategy implements JudgmentStrategy { - readonly name = 'AgentConsult'; - - canApply(context: JudgmentContext): boolean { - return context.sessionId !== undefined && context.sessionId.length > 0; - } - - async execute(context: JudgmentContext): Promise { - if (!context.sessionId) { - return { - success: false, - reason: 'Session ID not provided', - }; - } - - try { - const question = this.buildQuestion(context); - - const response = await runAgent(context.step.persona ?? context.step.name, question, { - cwd: context.cwd, - sessionId: context.sessionId, - maxTurns: 3, - language: context.language, - }); - - if (response.status !== 'done') { - return { - success: false, - reason: `Agent consultation failed: ${response.error || 'Unknown error'}`, - }; - } - - return JudgmentDetector.detect(response.content); - } catch (error) { - const errorMsg = error instanceof Error ? 
error.message : String(error); - log.debug('Agent consult strategy failed', { error: errorMsg }); - return { - success: false, - reason: `Agent consultation error: ${errorMsg}`, - }; - } - } - - private buildQuestion(context: JudgmentContext): string { - const rules = context.step.rules || []; - const ruleDescriptions = rules.map((rule, idx) => { - const tag = `[${context.step.name.toUpperCase()}:${idx + 1}]`; - const desc = rule.condition || `Rule ${idx + 1}`; - return `- ${tag}: ${desc}`; - }).join('\n'); - - const lang = context.language || 'en'; - - if (lang === 'ja') { - return `あなたの作業結果に基づいて、以下の判定タグのうちどれが適切か教えてください:\n\n${ruleDescriptions}\n\n該当するタグを1つだけ出力してください(例: [${context.step.name.toUpperCase()}:1])。`; - } else { - return `Based on your work, which of the following judgment tags is appropriate?\n\n${ruleDescriptions}\n\nPlease output only one tag (e.g., [${context.step.name.toUpperCase()}:1]).`; - } - } -} - -/** - * Factory for creating judgment strategies in order of priority. - */ -export class JudgmentStrategyFactory { - static createStrategies(): JudgmentStrategy[] { - return [ - new AutoSelectStrategy(), - new ReportBasedStrategy(), - new ResponseBasedStrategy(), - new AgentConsultStrategy(), - ]; - } -} diff --git a/src/core/piece/judgment/JudgmentDetector.ts b/src/core/piece/judgment/JudgmentDetector.ts deleted file mode 100644 index a00a5da..0000000 --- a/src/core/piece/judgment/JudgmentDetector.ts +++ /dev/null @@ -1,45 +0,0 @@ -/** - * Detect judgment result from conductor's response. - */ -export interface JudgmentResult { - success: boolean; - tag?: string; // e.g., "[ARCH-REVIEW:1]" - reason?: string; -} - -export class JudgmentDetector { - private static readonly TAG_PATTERN = /\[([A-Z_-]+):(\d+)\]/; - private static readonly CANNOT_JUDGE_PATTERNS = [ - /判断できない/i, - /cannot\s+determine/i, - /unable\s+to\s+judge/i, - /insufficient\s+information/i, - ]; - - static detect(response: string): JudgmentResult { - // 1. 
Tag detection - const tagMatch = response.match(this.TAG_PATTERN); - if (tagMatch) { - return { - success: true, - tag: tagMatch[0], // e.g., "[ARCH-REVIEW:1]" - }; - } - - // 2. Detect an explicit "cannot judge" statement - for (const pattern of this.CANNOT_JUDGE_PATTERNS) { - if (pattern.test(response)) { - return { - success: false, - reason: 'Conductor explicitly stated it cannot judge', - }; - } - } - - // 3. Neither a tag nor a "cannot judge" statement → failure - return { - success: false, - reason: 'No tag found and no explicit "cannot judge" statement', - }; - } -} diff --git a/src/core/piece/judgment/index.ts b/src/core/piece/judgment/index.ts deleted file mode 100644 index 58f3cae..0000000 --- a/src/core/piece/judgment/index.ts +++ /dev/null @@ -1,18 +0,0 @@ -/** - * Judgment module exports - */ - -export { - JudgmentDetector, - type JudgmentResult, -} from './JudgmentDetector.js'; - -export { - AutoSelectStrategy, - ReportBasedStrategy, - ResponseBasedStrategy, - AgentConsultStrategy, - JudgmentStrategyFactory, - type JudgmentContext, - type JudgmentStrategy, -} from './FallbackStrategy.js'; diff --git a/src/core/piece/part-definition-validator.ts b/src/core/piece/part-definition-validator.ts new file mode 100644 index 0000000..bdd2939 --- /dev/null +++ b/src/core/piece/part-definition-validator.ts @@ -0,0 +1,41 @@ +import type { PartDefinition } from '../models/part.js'; + +function assertNonEmptyString(value: unknown, fieldName: string, index: number): string { + if (typeof value !== 'string' || value.trim().length === 0) { + throw new Error(`Part[${index}] "${fieldName}" must be a non-empty string`); + } + return value; +} + +export function parsePartDefinitionEntry(entry: unknown, index: number): PartDefinition { + if (typeof entry !== 'object' || entry == null || Array.isArray(entry)) { + throw new Error(`Part[${index}] must be an object`); + } + + const raw = entry as Record; + const id = assertNonEmptyString(raw.id, 'id', index); + const title = assertNonEmptyString(raw.title, 'title', index); + const instruction =
assertNonEmptyString(raw.instruction, 'instruction', index); + + const timeoutMs = raw.timeout_ms; + if (timeoutMs != null && (typeof timeoutMs !== 'number' || !Number.isInteger(timeoutMs) || timeoutMs <= 0)) { + throw new Error(`Part[${index}] "timeout_ms" must be a positive integer`); + } + + return { + id, + title, + instruction, + timeoutMs: timeoutMs as number | undefined, + }; +} + +export function ensureUniquePartIds(parts: PartDefinition[]): void { + const ids = new Set(); + for (const part of parts) { + if (ids.has(part.id)) { + throw new Error(`Duplicate part id: ${part.id}`); + } + ids.add(part.id); + } +} diff --git a/src/core/piece/phase-runner.ts b/src/core/piece/phase-runner.ts index 51b7a93..0feaada 100644 --- a/src/core/piece/phase-runner.ts +++ b/src/core/piece/phase-runner.ts @@ -9,12 +9,13 @@ import { existsSync, mkdirSync, readFileSync, writeFileSync } from 'node:fs'; import { dirname, parse, resolve, sep } from 'node:path'; import type { PieceMovement, Language, AgentResponse } from '../models/types.js'; import type { PhaseName } from './types.js'; -import { runAgent, type RunAgentOptions } from '../../agents/runner.js'; +import type { RunAgentOptions } from '../../agents/runner.js'; import { ReportInstructionBuilder } from './instruction/ReportInstructionBuilder.js'; import { hasTagBasedRules, getReportFiles } from './evaluation/rule-utils.js'; -import { JudgmentStrategyFactory, type JudgmentContext } from './judgment/index.js'; +import { executeAgent } from './agent-usecases.js'; import { createLogger } from '../../shared/utils/index.js'; import { buildSessionKey } from './session-key.js'; +export { runStatusJudgmentPhase, type StatusJudgmentPhaseResult } from './status-judgment-phase.js'; const log = createLogger('phase-runner'); @@ -35,7 +36,9 @@ export interface PhaseRunnerContext { /** Get persona session ID */ getSessionId: (persona: string) => string | undefined; /** Build resume options for a movement */ - buildResumeOptions: (step: 
PieceMovement, sessionId: string, overrides: Pick) => RunAgentOptions; + buildResumeOptions: (step: PieceMovement, sessionId: string, overrides: Pick) => RunAgentOptions; + /** Build options for report phase retry in a new session */ + buildNewSessionReportOptions: (step: PieceMovement, overrides: Pick) => RunAgentOptions; /** Update persona session after a phase run */ updatePersonaSession: (persona: string, sessionId: string | undefined) => void; /** Callback for phase lifecycle logging */ @@ -140,102 +143,101 @@ export async function runReportPhase( targetFile: fileName, }).build(); - ctx.onPhaseStart?.(step, 2, 'report', reportInstruction); - const reportOptions = ctx.buildResumeOptions(step, currentSessionId, { + maxTurns: 3, + }); + const firstAttempt = await runSingleReportAttempt(step, reportInstruction, reportOptions, ctx); + if (firstAttempt.kind === 'blocked') { + return { blocked: true, response: firstAttempt.response }; + } + if (firstAttempt.kind === 'success') { + writeReportFile(ctx.reportDir, fileName, firstAttempt.content); + if (firstAttempt.response.sessionId) { + currentSessionId = firstAttempt.response.sessionId; + ctx.updatePersonaSession(sessionKey, currentSessionId); + } + log.debug('Report file generated', { movement: step.name, fileName }); + continue; + } + + log.info('Report phase failed, retrying with new session', { + movement: step.name, + fileName, + reason: firstAttempt.errorMessage, + }); + + const retryInstruction = new ReportInstructionBuilder(step, { + cwd: ctx.cwd, + reportDir: ctx.reportDir, + movementIteration: movementIteration, + language: ctx.language, + targetFile: fileName, + lastResponse: ctx.lastResponse, + }).build(); + const retryOptions = ctx.buildNewSessionReportOptions(step, { allowedTools: [], maxTurns: 3, }); - let reportResponse; - try { - reportResponse = await runAgent(step.persona, reportInstruction, reportOptions); - } catch (error) { - const errorMsg = error instanceof Error ? 
error.message : String(error); - ctx.onPhaseComplete?.(step, 2, 'report', '', 'error', errorMsg); - throw error; + const retryAttempt = await runSingleReportAttempt(step, retryInstruction, retryOptions, ctx); + if (retryAttempt.kind === 'blocked') { + return { blocked: true, response: retryAttempt.response }; + } + if (retryAttempt.kind === 'retryable_failure') { + throw new Error(`Report phase failed for ${fileName}: ${retryAttempt.errorMessage}`); } - if (reportResponse.status === 'blocked') { - ctx.onPhaseComplete?.(step, 2, 'report', reportResponse.content, reportResponse.status); - return { blocked: true, response: reportResponse }; - } - - if (reportResponse.status !== 'done') { - const errorMsg = reportResponse.error || reportResponse.content || 'Unknown error'; - ctx.onPhaseComplete?.(step, 2, 'report', reportResponse.content, reportResponse.status, errorMsg); - throw new Error(`Report phase failed for ${fileName}: ${errorMsg}`); - } - - const content = reportResponse.content.trim(); - if (content.length === 0) { - throw new Error(`Report output is empty for file: ${fileName}`); - } - - writeReportFile(ctx.reportDir, fileName, content); - - if (reportResponse.sessionId) { - currentSessionId = reportResponse.sessionId; + writeReportFile(ctx.reportDir, fileName, retryAttempt.content); + if (retryAttempt.response.sessionId) { + currentSessionId = retryAttempt.response.sessionId; ctx.updatePersonaSession(sessionKey, currentSessionId); } - - ctx.onPhaseComplete?.(step, 2, 'report', reportResponse.content, reportResponse.status); log.debug('Report file generated', { movement: step.name, fileName }); } log.debug('Report phase complete', { movement: step.name, filesGenerated: reportFiles.length }); } -/** - * Phase 3: Status judgment. - * Uses the 'conductor' agent in a new session to output a status tag. - * Implements multi-stage fallback logic to ensure judgment succeeds. - * Returns the Phase 3 response content (containing the status tag). 
- */ -export async function runStatusJudgmentPhase( +type ReportAttemptResult = + | { kind: 'success'; content: string; response: AgentResponse } + | { kind: 'blocked'; response: AgentResponse } + | { kind: 'retryable_failure'; errorMessage: string }; + +async function runSingleReportAttempt( step: PieceMovement, + instruction: string, + options: RunAgentOptions, ctx: PhaseRunnerContext, -): Promise<string> { - log.debug('Running status judgment phase', { movement: step.name }); +): Promise<ReportAttemptResult> { + ctx.onPhaseStart?.(step, 2, 'report', instruction); - // Try fallback strategies in order (including AutoSelectStrategy) - const strategies = JudgmentStrategyFactory.createStrategies(); - const sessionKey = buildSessionKey(step); - const judgmentContext: JudgmentContext = { - step, - cwd: ctx.cwd, - language: ctx.language, - reportDir: ctx.reportDir, - lastResponse: ctx.lastResponse, - sessionId: ctx.getSessionId(sessionKey), - }; - - for (const strategy of strategies) { - if (!strategy.canApply(judgmentContext)) { - log.debug(`Strategy ${strategy.name} not applicable, skipping`); - continue; - } - - log.debug(`Trying strategy: ${strategy.name}`); - ctx.onPhaseStart?.(step, 3, 'judge', `Strategy: ${strategy.name}`); - - try { - const result = await strategy.execute(judgmentContext); - if (result.success) { - log.debug(`Strategy ${strategy.name} succeeded`, { tag: result.tag }); - ctx.onPhaseComplete?.(step, 3, 'judge', result.tag!, 'done'); - return result.tag!; - } - - log.debug(`Strategy ${strategy.name} failed`, { reason: result.reason }); - } catch (error) { - const errorMsg = error instanceof Error ? error.message : String(error); - log.debug(`Strategy ${strategy.name} threw error`, { error: errorMsg }); - } + let response: AgentResponse; + try { + response = await executeAgent(step.persona, instruction, options); + } catch (error) { + const errorMsg = error instanceof Error ?
error.message : String(error); + ctx.onPhaseComplete?.(step, 2, 'report', '', 'error', errorMsg); + throw error; } - // All strategies failed - const errorMsg = 'All judgment strategies failed'; - ctx.onPhaseComplete?.(step, 3, 'judge', '', 'error', errorMsg); - throw new Error(errorMsg); + if (response.status === 'blocked') { + ctx.onPhaseComplete?.(step, 2, 'report', response.content, response.status); + return { kind: 'blocked', response }; + } + + if (response.status !== 'done') { + const errorMessage = response.error || response.content || 'Unknown error'; + ctx.onPhaseComplete?.(step, 2, 'report', response.content, response.status, errorMessage); + return { kind: 'retryable_failure', errorMessage }; + } + + const trimmedContent = response.content.trim(); + if (trimmedContent.length === 0) { + const errorMessage = 'Report output is empty'; + ctx.onPhaseComplete?.(step, 2, 'report', response.content, 'error', errorMessage); + return { kind: 'retryable_failure', errorMessage }; + } + + ctx.onPhaseComplete?.(step, 2, 'report', response.content, response.status); + return { kind: 'success', content: trimmedContent, response }; } diff --git a/src/core/piece/schema-loader.ts b/src/core/piece/schema-loader.ts new file mode 100644 index 0000000..d4067aa --- /dev/null +++ b/src/core/piece/schema-loader.ts @@ -0,0 +1,50 @@ +import { readFileSync } from 'node:fs'; +import { join } from 'node:path'; +import { getResourcesDir } from '../../infra/resources/index.js'; + +type JsonSchema = Record; + +const schemaCache = new Map(); + +function loadSchema(name: string): JsonSchema { + const cached = schemaCache.get(name); + if (cached) { + return cached; + } + const schemaPath = join(getResourcesDir(), 'schemas', name); + const content = readFileSync(schemaPath, 'utf-8'); + const parsed = JSON.parse(content) as JsonSchema; + schemaCache.set(name, parsed); + return parsed; +} + +function cloneSchema(schema: JsonSchema): JsonSchema { + return JSON.parse(JSON.stringify(schema)) as JsonSchema; +}
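Reviewer note: the cache-then-clone approach in schema-loader.ts can be sketched in isolation as follows. This is a minimal illustration, not the code in this PR: the in-memory `rawSchemas` table stands in for the files read via `readFileSync`, and the names `loadCached`, `withMaxParts`, `maxItemsOf` and `parseCount` are hypothetical helpers introduced only for this example. It shows why the schema is deep-copied before `maxItems` is set, assuming the intent is that per-call limits never leak into the shared cached object.

```typescript
// Hypothetical stand-in for schema files on disk (illustrative only).
type JsonSchema = Record<string, unknown>;

const rawSchemas: Record<string, string> = {
  'decomposition.json': JSON.stringify({
    type: 'object',
    properties: { parts: { type: 'array', items: { type: 'object' } } },
  }),
};

const cache = new Map<string, JsonSchema>();
let parseCount = 0; // counts how often a schema is actually parsed

// Parse each schema once and reuse the parsed object afterwards.
function loadCached(name: string): JsonSchema {
  const hit = cache.get(name);
  if (hit) {
    return hit;
  }
  const raw = rawSchemas[name];
  if (raw === undefined) {
    throw new Error(`Unknown schema: ${name}`);
  }
  parseCount += 1;
  const parsed = JSON.parse(raw) as JsonSchema;
  cache.set(name, parsed);
  return parsed;
}

// Return a per-call copy with maxItems applied; the cached object is not mutated.
function withMaxParts(maxParts: number): JsonSchema {
  if (!Number.isInteger(maxParts) || maxParts <= 0) {
    throw new Error(`maxParts must be a positive integer: ${maxParts}`);
  }
  // Deep copy via a JSON round-trip, which is sufficient for plain JSON data.
  const copy = JSON.parse(JSON.stringify(loadCached('decomposition.json'))) as JsonSchema;
  const properties = copy.properties as Record<string, Record<string, unknown>> | undefined;
  const parts = properties?.parts;
  if (!parts) {
    throw new Error('decomposition schema is invalid: parts is missing');
  }
  parts.maxItems = maxParts;
  return copy;
}

// Read back parts.maxItems from a schema (undefined when it is not set).
function maxItemsOf(schema: JsonSchema): unknown {
  const properties = schema.properties as Record<string, Record<string, unknown>> | undefined;
  return properties?.parts?.maxItems;
}
```

Without the copy, two movements with different `maxParts` values would overwrite each other's limit on the one cached object; with it, each caller gets an independent schema while the file is still parsed only once.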
+ +export function loadJudgmentSchema(): JsonSchema { + return loadSchema('judgment.json'); +} + +export function loadEvaluationSchema(): JsonSchema { + return loadSchema('evaluation.json'); +} + +export function loadDecompositionSchema(maxParts: number): JsonSchema { + if (!Number.isInteger(maxParts) || maxParts <= 0) { + throw new Error(`maxParts must be a positive integer: ${maxParts}`); + } + + const schema = cloneSchema(loadSchema('decomposition.json')); + const properties = schema.properties; + if (!properties || typeof properties !== 'object' || Array.isArray(properties)) { + throw new Error('decomposition schema is invalid: properties is missing'); + } + const rawParts = (properties as Record).parts; + if (!rawParts || typeof rawParts !== 'object' || Array.isArray(rawParts)) { + throw new Error('decomposition schema is invalid: parts is missing'); + } + + (rawParts as Record).maxItems = maxParts; + return schema; +} diff --git a/src/core/piece/status-judgment-phase.ts b/src/core/piece/status-judgment-phase.ts new file mode 100644 index 0000000..3c5d899 --- /dev/null +++ b/src/core/piece/status-judgment-phase.ts @@ -0,0 +1,101 @@ +import { existsSync, readFileSync } from 'node:fs'; +import { resolve } from 'node:path'; +import type { PieceMovement, RuleMatchMethod } from '../models/types.js'; +import { judgeStatus } from './agent-usecases.js'; +import { StatusJudgmentBuilder, type StatusJudgmentContext } from './instruction/StatusJudgmentBuilder.js'; +import { getReportFiles } from './evaluation/rule-utils.js'; +import { createLogger } from '../../shared/utils/index.js'; +import type { PhaseRunnerContext } from './phase-runner.js'; + +const log = createLogger('phase-runner'); + +/** Result of Phase 3 status judgment, including the detection method. */ +export interface StatusJudgmentPhaseResult { + tag: string; + ruleIndex: number; + method: RuleMatchMethod; +} + +/** + * Build the base context (shared by structured output and tag instructions). 
+ */ +function buildBaseContext( + step: PieceMovement, + ctx: PhaseRunnerContext, +): Omit | undefined { + const reportFiles = getReportFiles(step.outputContracts); + + if (reportFiles.length > 0) { + const reports: string[] = []; + for (const fileName of reportFiles) { + const filePath = resolve(ctx.reportDir, fileName); + if (!existsSync(filePath)) continue; + const content = readFileSync(filePath, 'utf-8'); + reports.push(`# ${fileName}\n\n${content}`); + } + if (reports.length > 0) { + return { + language: ctx.language, + reportContent: reports.join('\n\n---\n\n'), + inputSource: 'report', + }; + } + } + + if (!ctx.lastResponse) return undefined; + + return { + language: ctx.language, + lastResponse: ctx.lastResponse, + inputSource: 'response', + }; +} + +/** + * Phase 3: Status judgment. + * + * Builds two instructions from the same context: + * - Structured output instruction (JSON schema) + * - Tag instruction (free-form tag detection) + * + * `judgeStatus()` tries them in order: structured → tag → ai_judge. 
+ */ +export async function runStatusJudgmentPhase( + step: PieceMovement, + ctx: PhaseRunnerContext, +): Promise { + log.debug('Running status judgment phase', { movement: step.name }); + if (!step.rules || step.rules.length === 0) { + throw new Error(`Status judgment requires rules for movement "${step.name}"`); + } + + const baseContext = buildBaseContext(step, ctx); + if (!baseContext) { + throw new Error(`Status judgment requires report or lastResponse for movement "${step.name}"`); + } + + const structuredInstruction = new StatusJudgmentBuilder(step, { + ...baseContext, + structuredOutput: true, + }).build(); + + const tagInstruction = new StatusJudgmentBuilder(step, { + ...baseContext, + }).build(); + + ctx.onPhaseStart?.(step, 3, 'judge', structuredInstruction); + try { + const result = await judgeStatus(structuredInstruction, tagInstruction, step.rules, { + cwd: ctx.cwd, + movementName: step.name, + language: ctx.language, + }); + const tag = `[${step.name.toUpperCase()}:${result.ruleIndex + 1}]`; + ctx.onPhaseComplete?.(step, 3, 'judge', tag, 'done'); + return { tag, ruleIndex: result.ruleIndex, method: result.method }; + } catch (error) { + const errorMsg = error instanceof Error ? 
error.message : String(error); + ctx.onPhaseComplete?.(step, 3, 'judge', '', 'error', errorMsg); + throw error; + } +} diff --git a/src/features/tasks/execute/parallelExecution.ts b/src/features/tasks/execute/parallelExecution.ts index 2967c56..39a67fd 100644 --- a/src/features/tasks/execute/parallelExecution.ts +++ b/src/features/tasks/execute/parallelExecution.ts @@ -14,9 +14,10 @@ import type { TaskRunner, TaskInfo } from '../../../infra/task/index.js'; import { info, blankLine } from '../../../shared/ui/index.js'; import { TaskPrefixWriter } from '../../../shared/ui/TaskPrefixWriter.js'; +import { EXIT_SIGINT } from '../../../shared/exitCodes.js'; import { createLogger } from '../../../shared/utils/index.js'; import { executeAndCompleteTask } from './taskExecution.js'; -import { installSigIntHandler } from './sigintHandler.js'; +import { ShutdownManager } from './shutdownManager.js'; import type { TaskExecutionOptions } from './types.js'; const log = createLogger('worker-pool'); @@ -96,8 +97,15 @@ export async function runWithWorkerPool( pollIntervalMs: number, ): Promise { const abortController = new AbortController(); - const { cleanup } = installSigIntHandler(() => abortController.abort()); + const shutdownManager = new ShutdownManager({ + callbacks: { + onGraceful: () => abortController.abort(), + onForceKill: () => process.exit(EXIT_SIGINT), + }, + }); + shutdownManager.install(); const selfSigintOnce = process.env.TAKT_E2E_SELF_SIGINT_ONCE === '1'; + const selfSigintTwice = process.env.TAKT_E2E_SELF_SIGINT_TWICE === '1'; let selfSigintInjected = false; let successCount = 0; @@ -111,9 +119,16 @@ export async function runWithWorkerPool( while (queue.length > 0 || active.size > 0) { if (!abortController.signal.aborted) { fillSlots(queue, active, concurrency, taskRunner, cwd, pieceName, options, abortController, colorCounter); - if (selfSigintOnce && !selfSigintInjected && active.size > 0) { + if ((selfSigintOnce || selfSigintTwice) && !selfSigintInjected && 
active.size > 0) { selfSigintInjected = true; process.emit('SIGINT'); + if (selfSigintTwice) { + // E2E deterministic path: force-exit shortly after graceful SIGINT. + // Avoids intermittent hangs caused by listener ordering/races. + setTimeout(() => { + process.exit(EXIT_SIGINT); + }, 25); + } } } @@ -169,7 +184,7 @@ export async function runWithWorkerPool( } } } finally { - cleanup(); + shutdownManager.cleanup(); } return { success: successCount, fail: failCount }; diff --git a/src/features/tasks/execute/pieceExecution.ts b/src/features/tasks/execute/pieceExecution.ts index 01bd805..215ad50 100644 --- a/src/features/tasks/execute/pieceExecution.ts +++ b/src/features/tasks/execute/pieceExecution.ts @@ -6,7 +6,8 @@ import { readFileSync } from 'node:fs'; import { PieceEngine, type IterationLimitRequest, type UserInputRequest } from '../../../core/piece/index.js'; import type { PieceConfig } from '../../../core/models/index.js'; import type { PieceExecutionResult, PieceExecutionOptions } from './types.js'; -import { detectRuleIndex, interruptAllQueries } from '../../../infra/claude/index.js'; +import { detectRuleIndex } from '../../../shared/utils/ruleIndex.js'; +import { interruptAllQueries } from '../../../infra/claude/query-manager.js'; import { callAiJudge } from '../../../agents/ai-judge.js'; export type { PieceExecutionResult, PieceExecutionOptions }; @@ -65,7 +66,8 @@ import { } from '../../../shared/utils/providerEventLogger.js'; import { selectOption, promptInput } from '../../../shared/prompt/index.js'; import { getLabel } from '../../../shared/i18n/index.js'; -import { installSigIntHandler } from './sigintHandler.js'; +import { EXIT_SIGINT } from '../../../shared/exitCodes.js'; +import { ShutdownManager } from './shutdownManager.js'; import { buildRunPaths } from '../../../core/piece/run/run-paths.js'; import { resolveMovementProviderModel } from '../../../core/piece/provider-resolution.js'; import { writeFileAtomic, ensureDir } from 
'../../../infra/config/index.js'; @@ -112,6 +114,16 @@ function assertTaskPrefixPair( } } +function toJudgmentMatchMethod( + matchedRuleMethod: string | undefined, +): string | undefined { + if (!matchedRuleMethod) return undefined; + if (matchedRuleMethod === 'structured_output') return 'structured_output'; + if (matchedRuleMethod === 'ai_judge' || matchedRuleMethod === 'ai_judge_fallback') return 'ai_judge'; + if (matchedRuleMethod === 'phase3_tag' || matchedRuleMethod === 'phase1_tag') return 'tag_fallback'; + return undefined; +} + function createOutputFns(prefixWriter: TaskPrefixWriter | undefined): OutputFns { if (!prefixWriter) { return { @@ -407,7 +419,7 @@ export async function executePiece( const movementIterations = new Map(); let engine: PieceEngine | null = null; let onAbortSignal: (() => void) | undefined; - let sigintCleanup: (() => void) | undefined; + let shutdownManager: ShutdownManager | undefined; let onEpipe: ((err: NodeJS.ErrnoException) => void) | undefined; const runAbortController = new AbortController(); @@ -586,6 +598,7 @@ export async function executePiece( } // Write step_complete record to NDJSON log + const matchMethod = toJudgmentMatchMethod(response.matchedRuleMethod); const record: NdjsonStepComplete = { type: 'step_complete', step: step.name, @@ -595,6 +608,7 @@ export async function executePiece( instruction, ...(response.matchedRuleIndex != null ? { matchedRuleIndex: response.matchedRuleIndex } : {}), ...(response.matchedRuleMethod ? { matchedRuleMethod: response.matchedRuleMethod } : {}), + ...(matchMethod ? { matchMethod } : {}), ...(response.error ? 
{ error: response.error } : {}), timestamp: response.timestamp.toISOString(), }; @@ -730,8 +744,13 @@ export async function executePiece( options.abortSignal!.addEventListener('abort', onAbortSignal, { once: true }); } } else { - const handler = installSigIntHandler(abortEngine); - sigintCleanup = handler.cleanup; + shutdownManager = new ShutdownManager({ + callbacks: { + onGraceful: abortEngine, + onForceKill: () => process.exit(EXIT_SIGINT), + }, + }); + shutdownManager.install(); } const finalState = await engine.run(); @@ -749,7 +768,7 @@ export async function executePiece( throw error; } finally { prefixWriter?.flush(); - sigintCleanup?.(); + shutdownManager?.cleanup(); if (onAbortSignal && options.abortSignal) { options.abortSignal.removeEventListener('abort', onAbortSignal); } diff --git a/src/features/tasks/execute/shutdownManager.ts b/src/features/tasks/execute/shutdownManager.ts new file mode 100644 index 0000000..1ad509e --- /dev/null +++ b/src/features/tasks/execute/shutdownManager.ts @@ -0,0 +1,108 @@ +import { blankLine, warn, error } from '../../../shared/ui/index.js'; +import { getLabel } from '../../../shared/i18n/index.js'; + +export interface ShutdownCallbacks { + onGraceful: () => void; + onForceKill: () => void; +} + +export interface ShutdownManagerOptions { + callbacks: ShutdownCallbacks; + gracefulTimeoutMs?: number; +} + +type ShutdownState = 'idle' | 'graceful' | 'forcing'; + +const DEFAULT_SHUTDOWN_TIMEOUT_MS = 10_000; + +function parseTimeoutMs(raw: string | undefined): number | undefined { + if (!raw) { + return undefined; + } + + const value = Number.parseInt(raw, 10); + if (!Number.isFinite(value) || value <= 0) { + throw new Error('TAKT_SHUTDOWN_TIMEOUT_MS must be a positive integer'); + } + + return value; +} + +function resolveShutdownTimeoutMs(): number { + return parseTimeoutMs(process.env.TAKT_SHUTDOWN_TIMEOUT_MS) ?? 
DEFAULT_SHUTDOWN_TIMEOUT_MS; +} + +export class ShutdownManager { + private readonly callbacks: ShutdownCallbacks; + private readonly gracefulTimeoutMs: number; + private state: ShutdownState = 'idle'; + private timeoutId: ReturnType | undefined; + private readonly sigintHandler: () => void; + + constructor(options: ShutdownManagerOptions) { + this.callbacks = options.callbacks; + this.gracefulTimeoutMs = options.gracefulTimeoutMs ?? resolveShutdownTimeoutMs(); + this.sigintHandler = () => this.handleSigint(); + } + + install(): void { + process.on('SIGINT', this.sigintHandler); + } + + cleanup(): void { + process.removeListener('SIGINT', this.sigintHandler); + this.clearTimeout(); + } + + private handleSigint(): void { + if (this.state === 'idle') { + this.beginGracefulShutdown(); + return; + } + + if (this.state === 'graceful') { + this.forceShutdown(); + } + } + + private beginGracefulShutdown(): void { + this.state = 'graceful'; + + blankLine(); + warn(getLabel('piece.sigintGraceful')); + this.callbacks.onGraceful(); + + this.timeoutId = setTimeout(() => { + this.timeoutId = undefined; + if (this.state !== 'graceful') { + return; + } + + blankLine(); + error(getLabel('piece.sigintTimeout', undefined, { + timeoutMs: String(this.gracefulTimeoutMs), + })); + this.forceShutdown(); + }, this.gracefulTimeoutMs); + } + + private forceShutdown(): void { + if (this.state === 'forcing') { + return; + } + + this.state = 'forcing'; + this.clearTimeout(); + + blankLine(); + error(getLabel('piece.sigintForce')); + this.callbacks.onForceKill(); + } + + private clearTimeout(): void { + if (this.timeoutId !== undefined) { + clearTimeout(this.timeoutId); + this.timeoutId = undefined; + } + } +} diff --git a/src/features/tasks/execute/sigintHandler.ts b/src/features/tasks/execute/sigintHandler.ts deleted file mode 100644 index 1b8f028..0000000 --- a/src/features/tasks/execute/sigintHandler.ts +++ /dev/null @@ -1,32 +0,0 @@ -/** - * Shared SIGINT handler for graceful/force 
shutdown pattern. - * - * 1st Ctrl+C = graceful abort via onAbort callback - * 2nd Ctrl+C = force exit - */ - -import { blankLine, warn, error } from '../../../shared/ui/index.js'; -import { EXIT_SIGINT } from '../../../shared/exitCodes.js'; -import { getLabel } from '../../../shared/i18n/index.js'; - -interface SigIntHandler { - cleanup: () => void; -} - -export function installSigIntHandler(onAbort: () => void): SigIntHandler { - let sigintCount = 0; - const handler = () => { - sigintCount++; - if (sigintCount === 1) { - blankLine(); - warn(getLabel('piece.sigintGraceful')); - onAbort(); - } else { - blankLine(); - error(getLabel('piece.sigintForce')); - process.exit(EXIT_SIGINT); - } - }; - process.on('SIGINT', handler); - return { cleanup: () => process.removeListener('SIGINT', handler) }; -} diff --git a/src/features/tasks/watch/index.ts b/src/features/tasks/watch/index.ts index b450515..d2131fa 100644 --- a/src/features/tasks/watch/index.ts +++ b/src/features/tasks/watch/index.ts @@ -16,6 +16,8 @@ import { } from '../../../shared/ui/index.js'; import { executeAndCompleteTask } from '../execute/taskExecution.js'; import { DEFAULT_PIECE_NAME } from '../../../shared/constants.js'; +import { EXIT_SIGINT } from '../../../shared/exitCodes.js'; +import { ShutdownManager } from '../execute/shutdownManager.js'; import type { TaskExecutionOptions } from '../execute/types.js'; /** @@ -41,13 +43,20 @@ export async function watchTasks(cwd: string, options?: TaskExecutionOptions): P info('Waiting for tasks... 
(Ctrl+C to stop)'); blankLine(); - // Graceful shutdown on SIGINT - const onSigInt = () => { - blankLine(); - info('Stopping watch...'); - watcher.stop(); - }; - process.on('SIGINT', onSigInt); + const shutdownManager = new ShutdownManager({ + callbacks: { + onGraceful: () => { + blankLine(); + info('Stopping watch...'); + watcher.stop(); + }, + onForceKill: () => { + watcher.stop(); + process.exit(EXIT_SIGINT); + }, + }, + }); + shutdownManager.install(); try { await watcher.watch(async (task: TaskInfo) => { @@ -68,7 +77,7 @@ export async function watchTasks(cwd: string, options?: TaskExecutionOptions): P info('Waiting for tasks... (Ctrl+C to stop)'); }); } finally { - process.removeListener('SIGINT', onSigInt); + shutdownManager.cleanup(); } // Summary on exit diff --git a/src/index.ts b/src/index.ts index 623af4d..bd9541d 100644 --- a/src/index.ts +++ b/src/index.ts @@ -5,12 +5,30 @@ */ // Models -export * from './core/models/index.js'; +export type { + Status, + PieceRule, + PieceMovement, + PieceConfig, + PieceState, + Language, + PartDefinition, + PartResult, +} from './core/models/types.js'; -// Configuration (PermissionMode excluded to avoid name conflict with core/models PermissionMode) -export * from './infra/config/paths.js'; -export * from './infra/config/loaders/index.js'; -export * from './infra/config/global/index.js'; +// Configuration +export { + loadPiece, + loadPieceByIdentifier, + listPieces, + listPieceEntries, + loadAllPieces, + loadAllPiecesWithSources, + getPieceDescription, + getBuiltinPiece, + isPiecePath, +} from './infra/config/loaders/index.js'; +export type { PieceSource, PieceWithSource, PieceDirEntry } from './infra/config/loaders/index.js'; export { loadProjectConfig, saveProjectConfig, @@ -19,108 +37,18 @@ export { setCurrentPiece, isVerboseMode, type ProjectLocalConfig, - writeFileAtomic, - getInputHistoryPath, - MAX_INPUT_HISTORY, - loadInputHistory, - saveInputHistory, - addToInputHistory, - type PersonaSessionData, - 
getPersonaSessionsPath, - loadPersonaSessions, - savePersonaSessions, - updatePersonaSession, - clearPersonaSessions, - getWorktreeSessionsDir, - encodeWorktreePath, - getWorktreeSessionPath, - loadWorktreeSessions, - updateWorktreeSession, - getClaudeProjectSessionsDir, - clearClaudeProjectSessions, } from './infra/config/project/index.js'; -// Claude integration -export { - ClaudeClient, - ClaudeProcess, - QueryExecutor, - QueryRegistry, - executeClaudeCli, - executeClaudeQuery, - generateQueryId, - hasActiveProcess, - isQueryActive, - getActiveQueryCount, - registerQuery, - unregisterQuery, - interruptQuery, - interruptAllQueries, - interruptCurrentProcess, - sdkMessageToStreamEvent, - createCanUseToolCallback, - createAskUserQuestionHooks, - buildSdkOptions, - callClaude, - callClaudeCustom, - callClaudeAgent, - callClaudeSkill, - detectRuleIndex, - isRegexSafe, -} from './infra/claude/index.js'; -export type { - StreamEvent, - StreamCallback, - PermissionRequest, - PermissionHandler, - AskUserQuestionInput, - AskUserQuestionHandler, - ClaudeResult, - ClaudeResultWithQueryId, - ClaudeCallOptions, - ClaudeSpawnOptions, - InitEventData, - ToolUseEventData, - ToolResultEventData, - ToolOutputEventData, - TextEventData, - ThinkingEventData, - ResultEventData, - ErrorEventData, -} from './infra/claude/index.js'; - -// Codex integration -export * from './infra/codex/index.js'; - -// Agent execution -export * from './agents/index.js'; - // Piece engine export { PieceEngine, - COMPLETE_MOVEMENT, - ABORT_MOVEMENT, - ERROR_MESSAGES, - determineNextMovementByRules, - extractBlockedPrompt, - LoopDetector, - createInitialState, - addUserInput, - getPreviousOutput, - handleBlocked, - ParallelLogger, - InstructionBuilder, isOutputContractItem, - ReportInstructionBuilder, - StatusJudgmentBuilder, - buildEditRule, - RuleEvaluator, - detectMatchedRule, - evaluateAggregateConditions, - AggregateEvaluator, - needsStatusJudgmentPhase, - runReportPhase, - runStatusJudgmentPhase, + 
executeAgent, + generateReport, + executePart, + judgeStatus, + evaluateCondition, + decomposeTask, } from './core/piece/index.js'; export type { PieceEvents, @@ -129,24 +57,6 @@ export type { SessionUpdateCallback, IterationLimitCallback, PieceEngineOptions, - LoopCheckResult, ProviderType, - RuleMatch, - RuleEvaluatorContext, - ReportInstructionContext, - StatusJudgmentContext, - InstructionContext, - StatusRulesComponents, - BlockedHandlerResult, + JudgeStatusResult, } from './core/piece/index.js'; - -// Utilities -export * from './shared/utils/index.js'; -export * from './shared/ui/index.js'; -export * from './shared/prompt/index.js'; -export * from './shared/constants.js'; -export * from './shared/context.js'; -export * from './shared/exitCodes.js'; - -// Resources (embedded prompts and templates) -export * from './infra/resources/index.js'; diff --git a/src/infra/claude/client.ts b/src/infra/claude/client.ts index be473d1..cb568e5 100644 --- a/src/infra/claude/client.ts +++ b/src/infra/claude/client.ts @@ -11,7 +11,6 @@ import { createLogger } from '../../shared/utils/index.js'; import { loadTemplate } from '../../shared/prompts/index.js'; export type { ClaudeCallOptions } from './types.js'; -export { detectRuleIndex, isRegexSafe } from './utils.js'; const log = createLogger('client'); @@ -38,6 +37,7 @@ export class ClaudeClient { private static toSpawnOptions(options: ClaudeCallOptions): ClaudeSpawnOptions { return { cwd: options.cwd, + abortSignal: options.abortSignal, sessionId: options.sessionId, allowedTools: options.allowedTools, mcpServers: options.mcpServers, @@ -51,6 +51,7 @@ export class ClaudeClient { onAskUserQuestion: options.onAskUserQuestion, bypassPermissions: options.bypassPermissions, anthropicApiKey: options.anthropicApiKey, + outputSchema: options.outputSchema, }; } @@ -75,6 +76,7 @@ export class ClaudeClient { timestamp: new Date(), sessionId: result.sessionId, error: result.error, + structuredOutput: result.structuredOutput, }; } @@ 
-103,6 +105,7 @@ export class ClaudeClient { timestamp: new Date(), sessionId: result.sessionId, error: result.error, + structuredOutput: result.structuredOutput, }; } @@ -125,6 +128,7 @@ export class ClaudeClient { const fullPrompt = `/${skillName}\n\n${prompt}`; const spawnOptions: ClaudeSpawnOptions = { cwd: options.cwd, + abortSignal: options.abortSignal, sessionId: options.sessionId, allowedTools: options.allowedTools, mcpServers: options.mcpServers, @@ -151,6 +155,7 @@ export class ClaudeClient { timestamp: new Date(), sessionId: result.sessionId, error: result.error, + structuredOutput: result.structuredOutput, }; } @@ -192,4 +197,3 @@ export async function callClaudeSkill( ): Promise { return defaultClient.callSkill(skillName, prompt, options); } - diff --git a/src/infra/claude/executor.ts b/src/infra/claude/executor.ts index 254749c..0365dad 100644 --- a/src/infra/claude/executor.ts +++ b/src/infra/claude/executor.ts @@ -94,10 +94,28 @@ export class QueryExecutor { let resultContent: string | undefined; let hasResultMessage = false; let accumulatedAssistantText = ''; + let structuredOutput: Record | undefined; + let onExternalAbort: (() => void) | undefined; try { const q = query({ prompt, options: sdkOptions }); registerQuery(queryId, q); + if (options.abortSignal) { + const interruptQuery = () => { + void q.interrupt().catch((interruptError: unknown) => { + log.debug('Failed to interrupt Claude query', { + queryId, + error: getErrorMessage(interruptError), + }); + }); + }; + if (options.abortSignal.aborted) { + interruptQuery(); + } else { + onExternalAbort = interruptQuery; + options.abortSignal.addEventListener('abort', onExternalAbort, { once: true }); + } + } for await (const message of q) { if ('session_id' in message) { @@ -122,6 +140,17 @@ export class QueryExecutor { const resultMsg = message as SDKResultMessage; if (resultMsg.subtype === 'success') { resultContent = resultMsg.result; + const rawStructuredOutput = (resultMsg as unknown as { + 
structured_output?: unknown; + structuredOutput?: unknown; + }).structured_output ?? (resultMsg as unknown as { structuredOutput?: unknown }).structuredOutput; + if ( + rawStructuredOutput + && typeof rawStructuredOutput === 'object' + && !Array.isArray(rawStructuredOutput) + ) { + structuredOutput = rawStructuredOutput as Record; + } success = true; } else { success = false; @@ -133,6 +162,9 @@ export class QueryExecutor { } unregisterQuery(queryId); + if (onExternalAbort && options.abortSignal) { + options.abortSignal.removeEventListener('abort', onExternalAbort); + } const finalContent = resultContent || accumulatedAssistantText; @@ -149,8 +181,12 @@ export class QueryExecutor { content: finalContent.trim(), sessionId, fullContent: accumulatedAssistantText.trim(), + structuredOutput, }; } catch (error) { + if (onExternalAbort && options.abortSignal) { + options.abortSignal.removeEventListener('abort', onExternalAbort); + } unregisterQuery(queryId); return QueryExecutor.handleQueryError(error, queryId, sessionId, hasResultMessage, success, resultContent, stderrChunks); } diff --git a/src/infra/claude/index.ts b/src/infra/claude/index.ts index 5e2cb5c..7fb5387 100644 --- a/src/infra/claude/index.ts +++ b/src/infra/claude/index.ts @@ -61,13 +61,5 @@ export { buildSdkOptions, } from './options-builder.js'; -// Client functions -export { - callClaude, - callClaudeCustom, - callClaudeAgent, - callClaudeSkill, - detectRuleIndex, - isRegexSafe, -} from './client.js'; + diff --git a/src/infra/claude/options-builder.ts b/src/infra/claude/options-builder.ts index 86e0871..dac37ab 100644 --- a/src/infra/claude/options-builder.ts +++ b/src/infra/claude/options-builder.ts @@ -65,6 +65,12 @@ export class SdkOptionsBuilder { if (this.options.agents) sdkOptions.agents = this.options.agents; if (this.options.mcpServers) sdkOptions.mcpServers = this.options.mcpServers; if (this.options.systemPrompt) sdkOptions.systemPrompt = this.options.systemPrompt; + if 
(this.options.outputSchema) { + (sdkOptions as Record).outputFormat = { + type: 'json_schema', + schema: this.options.outputSchema, + }; + } if (canUseTool) sdkOptions.canUseTool = canUseTool; if (hooks) sdkOptions.hooks = hooks; diff --git a/src/infra/claude/types.ts b/src/infra/claude/types.ts index cf33daa..1c19741 100644 --- a/src/infra/claude/types.ts +++ b/src/infra/claude/types.ts @@ -109,6 +109,8 @@ export interface ClaudeResult { interrupted?: boolean; /** All assistant text accumulated during execution (for status detection) */ fullContent?: string; + /** Structured output returned by Claude SDK */ + structuredOutput?: Record; } /** Extended result with query ID for concurrent execution */ @@ -119,6 +121,7 @@ export interface ClaudeResultWithQueryId extends ClaudeResult { /** Options for calling Claude (high-level, used by client/providers/agents) */ export interface ClaudeCallOptions { cwd: string; + abortSignal?: AbortSignal; sessionId?: string; allowedTools?: string[]; /** MCP servers configuration */ @@ -140,11 +143,14 @@ export interface ClaudeCallOptions { bypassPermissions?: boolean; /** Anthropic API key to inject via env (bypasses CLI auth) */ anthropicApiKey?: string; + /** JSON Schema for structured output */ + outputSchema?: Record; } /** Options for spawning a Claude SDK query (low-level, used by executor/process) */ export interface ClaudeSpawnOptions { cwd: string; + abortSignal?: AbortSignal; sessionId?: string; allowedTools?: string[]; /** MCP servers configuration */ @@ -166,6 +172,8 @@ export interface ClaudeSpawnOptions { bypassPermissions?: boolean; /** Anthropic API key to inject via env (bypasses CLI auth) */ anthropicApiKey?: string; + /** JSON Schema for structured output */ + outputSchema?: Record; /** Callback for stderr output from the Claude Code process */ onStderr?: (data: string) => void; } diff --git a/src/infra/claude/utils.ts b/src/infra/claude/utils.ts index 7255a24..1810bb0 100644 --- a/src/infra/claude/utils.ts +++ 
b/src/infra/claude/utils.ts @@ -1,27 +1,7 @@ /** * Utility functions for Claude client operations. - * - * Stateless helpers for rule detection and regex safety validation. */ -/** - * Detect rule index from numbered tag pattern [STEP_NAME:N]. - * Returns 0-based rule index, or -1 if no match. - * - * Example: detectRuleIndex("... [PLAN:2] ...", "plan") → 1 - */ -export function detectRuleIndex(content: string, movementName: string): number { - const tag = movementName.toUpperCase(); - const regex = new RegExp(`\\[${tag}:(\\d+)\\]`, 'gi'); - const matches = [...content.matchAll(regex)]; - const match = matches.at(-1); - if (match?.[1]) { - const index = Number.parseInt(match[1], 10) - 1; - return index >= 0 ? index : -1; - } - return -1; -} - /** Validate regex pattern for ReDoS safety */ export function isRegexSafe(pattern: string): boolean { if (pattern.length > 200) { diff --git a/src/infra/codex/client.ts b/src/infra/codex/client.ts index 1fb5da4..0a0e045 100644 --- a/src/infra/codex/client.ts +++ b/src/infra/codex/client.ts @@ -4,9 +4,9 @@ * Uses @openai/codex-sdk for native TypeScript integration. 
*/ -import { Codex } from '@openai/codex-sdk'; +import { Codex, type TurnOptions } from '@openai/codex-sdk'; import type { AgentResponse } from '../../core/models/index.js'; -import { createLogger, getErrorMessage, createStreamDiagnostics, type StreamDiagnostics } from '../../shared/utils/index.js'; +import { createLogger, getErrorMessage, createStreamDiagnostics, parseStructuredOutput, type StreamDiagnostics } from '../../shared/utils/index.js'; import { mapToCodexSandboxMode, type CodexCallOptions } from './types.js'; import { type CodexEvent, @@ -150,9 +150,11 @@ export class CodexClient { const diag = createStreamDiagnostics('codex-sdk', { agentType, model: options.model, attempt }); diagRef = diag; - const { events } = await thread.runStreamed(fullPrompt, { + const turnOptions: TurnOptions = { signal: streamAbortController.signal, - }); + ...(options.outputSchema ? { outputSchema: options.outputSchema } : {}), + }; + const { events } = await thread.runStreamed(fullPrompt, turnOptions); resetIdleTimeout(); diag.onConnected(); @@ -270,6 +272,7 @@ export class CodexClient { } const trimmed = content.trim(); + const structuredOutput = parseStructuredOutput(trimmed, !!options.outputSchema); emitResult(options.onStream, true, trimmed, currentThreadId); return { @@ -278,6 +281,7 @@ export class CodexClient { content: trimmed, timestamp: new Date(), sessionId: currentThreadId, + structuredOutput, }; } catch (error) { const message = getErrorMessage(error); diff --git a/src/infra/codex/types.ts b/src/infra/codex/types.ts index 1834167..097eed6 100644 --- a/src/infra/codex/types.ts +++ b/src/infra/codex/types.ts @@ -31,4 +31,6 @@ export interface CodexCallOptions { onStream?: StreamCallback; /** OpenAI API key (bypasses CLI auth) */ openaiApiKey?: string; + /** JSON Schema for structured output */ + outputSchema?: Record; } diff --git a/src/infra/config/loaders/pieceParser.ts b/src/infra/config/loaders/pieceParser.ts index bd5f6df..1616800 100644 --- 
a/src/infra/config/loaders/pieceParser.ts +++ b/src/infra/config/loaders/pieceParser.ts @@ -10,7 +10,7 @@ import { dirname, resolve } from 'node:path'; import { parse as parseYaml } from 'yaml'; import type { z } from 'zod'; import { PieceConfigRawSchema, PieceMovementRawSchema } from '../../../core/models/index.js'; -import type { PieceConfig, PieceMovement, PieceRule, OutputContractEntry, OutputContractLabelPath, OutputContractItem, LoopMonitorConfig, LoopMonitorJudge, ArpeggioMovementConfig, ArpeggioMergeMovementConfig } from '../../../core/models/index.js'; +import type { PieceConfig, PieceMovement, PieceRule, OutputContractEntry, OutputContractLabelPath, OutputContractItem, LoopMonitorConfig, LoopMonitorJudge, ArpeggioMovementConfig, ArpeggioMergeMovementConfig, TeamLeaderConfig } from '../../../core/models/index.js'; import { getLanguage } from '../global/globalConfig.js'; import { type PieceSections, @@ -179,6 +179,31 @@ function normalizeArpeggio( }; } +/** Normalize raw team_leader config from YAML into internal format. */ +function normalizeTeamLeader( + raw: RawStep['team_leader'], + pieceDir: string, + sections: PieceSections, + context?: FacetResolutionContext, +): TeamLeaderConfig | undefined { + if (!raw) return undefined; + + const { personaSpec, personaPath } = resolvePersona(raw.persona, sections, pieceDir, context); + const { personaSpec: partPersona, personaPath: partPersonaPath } = resolvePersona(raw.part_persona, sections, pieceDir, context); + + return { + persona: personaSpec, + personaPath, + maxParts: raw.max_parts, + timeoutMs: raw.timeout_ms, + partPersona, + partPersonaPath, + partAllowedTools: raw.part_allowed_tools, + partEdit: raw.part_edit, + partPermissionMode: raw.part_permission_mode, + }; +} + /** Normalize a raw step into internal PieceMovement format. 
*/ function normalizeStepFromRaw( step: RawStep, @@ -237,6 +262,11 @@ function normalizeStepFromRaw( result.arpeggio = arpeggioConfig; } + const teamLeaderConfig = normalizeTeamLeader(step.team_leader, pieceDir, sections, context); + if (teamLeaderConfig) { + result.teamLeader = teamLeaderConfig; + } + return result; } diff --git a/src/infra/fs/session.ts b/src/infra/fs/session.ts index 2e4660e..34ea8a7 100644 --- a/src/infra/fs/session.ts +++ b/src/infra/fs/session.ts @@ -113,6 +113,7 @@ export class SessionManager { ...(record.error ? { error: record.error } : {}), ...(record.matchedRuleIndex != null ? { matchedRuleIndex: record.matchedRuleIndex } : {}), ...(record.matchedRuleMethod ? { matchedRuleMethod: record.matchedRuleMethod } : {}), + ...(record.matchMethod ? { matchMethod: record.matchMethod } : {}), }); sessionLog.iterations++; } diff --git a/src/infra/mock/client.ts b/src/infra/mock/client.ts index 4fb0d4f..9912032 100644 --- a/src/infra/mock/client.ts +++ b/src/infra/mock/client.ts @@ -65,6 +65,7 @@ export async function callMock( content, timestamp: new Date(), sessionId, + structuredOutput: options.structuredOutput, }; } diff --git a/src/infra/mock/types.ts b/src/infra/mock/types.ts index f55b2bb..1c3d0c8 100644 --- a/src/infra/mock/types.ts +++ b/src/infra/mock/types.ts @@ -13,6 +13,8 @@ export interface MockCallOptions { mockResponse?: string; /** Fixed status to return (optional, defaults to 'done') */ mockStatus?: 'done' | 'blocked' | 'error' | 'approved' | 'rejected' | 'improve'; + /** Structured output payload returned as-is */ + structuredOutput?: Record; } /** A single entry in a mock scenario */ diff --git a/src/infra/opencode/client.ts b/src/infra/opencode/client.ts index 18c70b6..0b04999 100644 --- a/src/infra/opencode/client.ts +++ b/src/infra/opencode/client.ts @@ -8,7 +8,7 @@ import { createOpencode } from '@opencode-ai/sdk/v2'; import { createServer } from 'node:net'; import type { AgentResponse } from '../../core/models/index.js'; 
-import { createLogger, getErrorMessage, createStreamDiagnostics, type StreamDiagnostics } from '../../shared/utils/index.js'; +import { createLogger, getErrorMessage, createStreamDiagnostics, parseStructuredOutput, type StreamDiagnostics } from '../../shared/utils/index.js'; import { parseProviderModel } from '../../shared/utils/providerModel.js'; import { buildOpenCodePermissionConfig, @@ -236,16 +236,34 @@ export class OpenCodeClient { }); } + /** Build a prompt suffix that instructs the agent to return JSON matching the schema */ + private buildStructuredOutputSuffix(schema: Record): string { + return [ + '', + '---', + 'IMPORTANT: You MUST respond with ONLY a valid JSON object matching this schema. No other text, no markdown code blocks, no explanation.', + '```', + JSON.stringify(schema, null, 2), + '```', + ].join('\n'); + } + /** Call OpenCode with an agent prompt */ async call( agentType: string, prompt: string, options: OpenCodeCallOptions, ): Promise { - const fullPrompt = options.systemPrompt + const basePrompt = options.systemPrompt ? `${options.systemPrompt}\n\n${prompt}` : prompt; + // OpenCode SDK does not natively support structured output via outputFormat. + // Inject JSON output instructions into the prompt to make the agent return JSON. + const fullPrompt = options.outputSchema + ? `${basePrompt}${this.buildStructuredOutputSuffix(options.outputSchema)}` + : basePrompt; + for (let attempt = 1; attempt <= OPENCODE_RETRY_MAX_ATTEMPTS; attempt++) { let idleTimeoutId: ReturnType | undefined; const streamAbortController = new AbortController(); @@ -329,16 +347,25 @@ export class OpenCodeClient { diag.onConnected(); const tools = mapToOpenCodeTools(options.allowedTools); - await client.session.promptAsync( - { - sessionID: sessionId, - directory: options.cwd, - model: parsedModel, - ...(tools ? 
{ tools } : {}), - parts: [{ type: 'text' as const, text: fullPrompt }], - }, - { signal: streamAbortController.signal }, - ); + const promptPayload: Record = { + sessionID: sessionId, + directory: options.cwd, + model: parsedModel, + ...(tools ? { tools } : {}), + parts: [{ type: 'text' as const, text: fullPrompt }], + }; + if (options.outputSchema) { + promptPayload.outputFormat = { + type: 'json_schema', + schema: options.outputSchema, + }; + } + + // OpenCode SDK types do not yet expose outputFormat even though runtime accepts it. + const promptPayloadForSdk = promptPayload as unknown as Parameters[0]; + await client.session.promptAsync(promptPayloadForSdk, { + signal: streamAbortController.signal, + }); emitInit(options.onStream, options.model, sessionId); @@ -571,6 +598,7 @@ export class OpenCodeClient { } const trimmed = content.trim(); + const structuredOutput = parseStructuredOutput(trimmed, !!options.outputSchema); emitResult(options.onStream, true, trimmed, sessionId); return { @@ -579,6 +607,7 @@ export class OpenCodeClient { content: trimmed, timestamp: new Date(), sessionId, + structuredOutput, }; } catch (error) { const message = getErrorMessage(error); diff --git a/src/infra/opencode/types.ts b/src/infra/opencode/types.ts index fb5fa5a..d981247 100644 --- a/src/infra/opencode/types.ts +++ b/src/infra/opencode/types.ts @@ -170,4 +170,6 @@ export interface OpenCodeCallOptions { onAskUserQuestion?: AskUserQuestionHandler; /** OpenCode API key */ opencodeApiKey?: string; + /** JSON Schema for structured output */ + outputSchema?: Record; } diff --git a/src/infra/providers/claude.ts b/src/infra/providers/claude.ts index 2ce14a2..a47702f 100644 --- a/src/infra/providers/claude.ts +++ b/src/infra/providers/claude.ts @@ -2,7 +2,8 @@ * Claude provider implementation */ -import { callClaude, callClaudeCustom, callClaudeAgent, callClaudeSkill, type ClaudeCallOptions } from '../claude/index.js'; +import { callClaude, callClaudeCustom, callClaudeAgent, 
callClaudeSkill } from '../claude/client.js'; +import type { ClaudeCallOptions } from '../claude/types.js'; import { resolveAnthropicApiKey } from '../config/index.js'; import type { AgentResponse } from '../../core/models/index.js'; import type { AgentSetup, Provider, ProviderAgent, ProviderCallOptions } from './types.js'; @@ -10,6 +11,7 @@ import type { AgentSetup, Provider, ProviderAgent, ProviderCallOptions } from '. function toClaudeOptions(options: ProviderCallOptions): ClaudeCallOptions { return { cwd: options.cwd, + abortSignal: options.abortSignal, sessionId: options.sessionId, allowedTools: options.allowedTools, mcpServers: options.mcpServers, @@ -21,6 +23,7 @@ function toClaudeOptions(options: ProviderCallOptions): ClaudeCallOptions { onAskUserQuestion: options.onAskUserQuestion, bypassPermissions: options.bypassPermissions, anthropicApiKey: options.anthropicApiKey ?? resolveAnthropicApiKey(), + outputSchema: options.outputSchema, }; } diff --git a/src/infra/providers/codex.ts b/src/infra/providers/codex.ts index 88c67d2..47a4de8 100644 --- a/src/infra/providers/codex.ts +++ b/src/infra/providers/codex.ts @@ -33,6 +33,7 @@ function toCodexOptions(options: ProviderCallOptions): CodexCallOptions { permissionMode: options.permissionMode, onStream: options.onStream, openaiApiKey: options.openaiApiKey ?? resolveOpenaiApiKey(), + outputSchema: options.outputSchema, }; } diff --git a/src/infra/providers/opencode.ts b/src/infra/providers/opencode.ts index 19e9798..3243158 100644 --- a/src/infra/providers/opencode.ts +++ b/src/infra/providers/opencode.ts @@ -22,6 +22,7 @@ function toOpenCodeOptions(options: ProviderCallOptions): OpenCodeCallOptions { onStream: options.onStream, onAskUserQuestion: options.onAskUserQuestion, opencodeApiKey: options.opencodeApiKey ?? 
resolveOpencodeApiKey(), + outputSchema: options.outputSchema, }; } diff --git a/src/infra/providers/types.ts b/src/infra/providers/types.ts index d2bc48d..e9214b8 100644 --- a/src/infra/providers/types.ts +++ b/src/infra/providers/types.ts @@ -40,6 +40,8 @@ export interface ProviderCallOptions { openaiApiKey?: string; /** OpenCode API key for OpenCode provider */ opencodeApiKey?: string; + /** JSON Schema for structured output */ + outputSchema?: Record; } /** A configured agent ready to be called */ diff --git a/src/infra/task/clone.ts b/src/infra/task/clone.ts index 098c0d0..f424707 100644 --- a/src/infra/task/clone.ts +++ b/src/infra/task/clone.ts @@ -77,7 +77,7 @@ export class CloneManager { const slug = slugify(options.taskSlug); if (options.issueNumber !== undefined && slug) { - return `takt/#${options.issueNumber}/${slug}`; + return `takt/${options.issueNumber}/${slug}`; } const timestamp = CloneManager.generateTimestamp(); @@ -124,13 +124,24 @@ export class CloneManager { return projectDir; } - /** Clone a repository and remove origin to isolate from the main repo */ - private static cloneAndIsolate(projectDir: string, clonePath: string): void { + /** Clone a repository and remove origin to isolate from the main repo. + * When `branch` is specified, `--branch` is passed to `git clone` so the + * branch is checked out as a local branch *before* origin is removed. + * Without this, non-default branches are lost when `git remote remove origin` + * deletes the remote-tracking refs. 
+ */ + private static cloneAndIsolate(projectDir: string, clonePath: string, branch?: string): void { const referenceRepo = CloneManager.resolveMainRepo(projectDir); fs.mkdirSync(path.dirname(clonePath), { recursive: true }); - execFileSync('git', ['clone', '--reference', referenceRepo, '--dissociate', projectDir, clonePath], { + const cloneArgs = ['clone', '--reference', referenceRepo, '--dissociate']; + if (branch) { + cloneArgs.push('--branch', branch); + } + cloneArgs.push(projectDir, clonePath); + + execFileSync('git', cloneArgs, { cwd: projectDir, stdio: 'pipe', }); @@ -174,11 +185,10 @@ export class CloneManager { log.info('Creating shared clone', { path: clonePath, branch }); - CloneManager.cloneAndIsolate(projectDir, clonePath); - - if (CloneManager.branchExists(clonePath, branch)) { - execFileSync('git', ['checkout', branch], { cwd: clonePath, stdio: 'pipe' }); + if (CloneManager.branchExists(projectDir, branch)) { + CloneManager.cloneAndIsolate(projectDir, clonePath, branch); } else { + CloneManager.cloneAndIsolate(projectDir, clonePath); execFileSync('git', ['checkout', '-b', branch], { cwd: clonePath, stdio: 'pipe' }); } @@ -195,8 +205,7 @@ export class CloneManager { log.info('Creating temp clone for branch', { path: clonePath, branch }); - CloneManager.cloneAndIsolate(projectDir, clonePath); - execFileSync('git', ['checkout', branch], { cwd: clonePath, stdio: 'pipe' }); + CloneManager.cloneAndIsolate(projectDir, clonePath, branch); this.saveCloneMeta(projectDir, branch, clonePath); log.info('Temp clone created', { path: clonePath, branch }); diff --git a/src/shared/i18n/labels_en.yaml b/src/shared/i18n/labels_en.yaml index c5fea74..0d76f2f 100644 --- a/src/shared/i18n/labels_en.yaml +++ b/src/shared/i18n/labels_en.yaml @@ -65,6 +65,7 @@ piece: notifyComplete: "Piece complete ({iteration} iterations)" notifyAbort: "Aborted: {reason}" sigintGraceful: "Ctrl+C: Aborting piece..." 
+ sigintTimeout: "Graceful shutdown timed out after {timeoutMs}ms" sigintForce: "Ctrl+C: Force exit" run: diff --git a/src/shared/i18n/labels_ja.yaml b/src/shared/i18n/labels_ja.yaml index bbd4e38..2bc9ed0 100644 --- a/src/shared/i18n/labels_ja.yaml +++ b/src/shared/i18n/labels_ja.yaml @@ -65,6 +65,7 @@ piece: notifyComplete: "ピース完了 ({iteration} iterations)" notifyAbort: "中断: {reason}" sigintGraceful: "Ctrl+C: ピースを中断しています..." + sigintTimeout: "graceful停止がタイムアウトしました ({timeoutMs}ms)" sigintForce: "Ctrl+C: 強制終了します" run: diff --git a/src/shared/prompts/en/perform_phase2_message.md b/src/shared/prompts/en/perform_phase2_message.md index 0811cb6..ae38839 100644 --- a/src/shared/prompts/en/perform_phase2_message.md +++ b/src/shared/prompts/en/perform_phase2_message.md @@ -1,7 +1,7 @@ @@ -17,6 +17,13 @@ Note: This section is metadata. Follow the language used in the rest of the prom ## Piece Context {{reportContext}} +{{#if hasLastResponse}} + +## Previous Work Context +The following is the output from Phase 1 (your main work). Use this as context to generate the report: + +{{lastResponse}} +{{/if}} ## Instructions Respond with the results of the work you just completed as a report. **Tools are not available in this phase. Respond with the report content directly as text.** diff --git a/src/shared/prompts/en/perform_phase3_message.md b/src/shared/prompts/en/perform_phase3_message.md index a3aa41b..80e5839 100644 --- a/src/shared/prompts/en/perform_phase3_message.md +++ b/src/shared/prompts/en/perform_phase3_message.md @@ -1,10 +1,14 @@ +{{#if structuredOutput}} +**Review is already complete. Evaluate the report below and determine which numbered rule (1-based) best matches the result.** +{{else}} **Review is already complete. Output exactly one tag corresponding to the judgment result shown in the report below.** +{{/if}} {{reportContent}} @@ -12,12 +16,21 @@ {{criteriaTable}} +{{#if structuredOutput}} + +## Task + +Evaluate the report against the criteria above. 
Return the matched rule number (1-based integer) and a brief reason for your decision. +{{else}} + ## Output Format **Output the tag corresponding to the judgment shown in the report in one line:** {{outputList}} +{{/if}} {{#if hasAppendix}} ### Appendix Template -{{appendixContent}}{{/if}} +{{appendixContent}} +{{/if}} diff --git a/src/shared/prompts/ja/perform_phase2_message.md b/src/shared/prompts/ja/perform_phase2_message.md index 9b1e65e..7630655 100644 --- a/src/shared/prompts/ja/perform_phase2_message.md +++ b/src/shared/prompts/ja/perform_phase2_message.md @@ -1,7 +1,7 @@ @@ -16,6 +16,13 @@ ## Piece Context {{reportContext}} +{{#if hasLastResponse}} + +## Previous Work Context +以下はPhase 1(本来の作業)の出力です。レポート生成の文脈として使用してください: + +{{lastResponse}} +{{/if}} ## Instructions あなたが今行った作業の結果をレポートとして回答してください。**このフェーズではツールは使えません。レポート内容をテキストとして直接回答してください。** diff --git a/src/shared/prompts/ja/perform_phase3_message.md b/src/shared/prompts/ja/perform_phase3_message.md index becfa29..89299ef 100644 --- a/src/shared/prompts/ja/perform_phase3_message.md +++ b/src/shared/prompts/ja/perform_phase3_message.md @@ -1,10 +1,14 @@ +{{#if structuredOutput}} +**既にレビューは完了しています。以下のレポートを評価し、どの番号のルール(1始まり)が結果に最も合致するか判定してください。** +{{else}} **既にレビューは完了しています。以下のレポートで示された判定結果に対応するタグを1つだけ出力してください。** +{{/if}} {{reportContent}} @@ -12,12 +16,21 @@ {{criteriaTable}} +{{#if structuredOutput}} + +## タスク + +上記の判定基準に照らしてレポートを評価してください。合致するルール番号(1始まりの整数)と簡潔な理由を返してください。 +{{else}} + ## 出力フォーマット **レポートで示した判定に対応するタグを1行で出力してください:** {{outputList}} +{{/if}} {{#if hasAppendix}} ### 追加出力テンプレート -{{appendixContent}}{{/if}} +{{appendixContent}} +{{/if}} diff --git a/src/shared/utils/index.ts b/src/shared/utils/index.ts index 69050cb..ffa23c0 100644 --- a/src/shared/utils/index.ts +++ b/src/shared/utils/index.ts @@ -11,6 +11,7 @@ export * from './slackWebhook.js'; export * from './sleep.js'; export * from './slug.js'; export * from './streamDiagnostics.js'; +export * from './structuredOutput.js'; export * from 
'./taskPaths.js'; export * from './text.js'; export * from './types.js'; diff --git a/src/shared/utils/ruleIndex.ts b/src/shared/utils/ruleIndex.ts new file mode 100644 index 0000000..e0758c4 --- /dev/null +++ b/src/shared/utils/ruleIndex.ts @@ -0,0 +1,15 @@ +/** + * Detect rule index from numbered tag pattern [STEP_NAME:N]. + * Returns 0-based rule index, or -1 if no match. + */ +export function detectRuleIndex(content: string, movementName: string): number { + const tag = movementName.toUpperCase(); + const regex = new RegExp(`\\[${tag}:(\\d+)\\]`, 'gi'); + const matches = [...content.matchAll(regex)]; + const match = matches.at(-1); + if (match?.[1]) { + const index = Number.parseInt(match[1], 10) - 1; + return index >= 0 ? index : -1; + } + return -1; +} diff --git a/src/shared/utils/sleep.ts b/src/shared/utils/sleep.ts index 5e8164c..f2cdc55 100644 --- a/src/shared/utils/sleep.ts +++ b/src/shared/utils/sleep.ts @@ -8,10 +8,10 @@ const log = createLogger('sleep'); let caffeinateStarted = false; /** - * takt実行中のmacOSアイドルスリープおよびディスプレイスリープを防止する。 + * takt実行中のmacOSスリープを防止する。 * -d: ディスプレイスリープ防止(App Nap によるプロセス凍結を回避) * -i: アイドルスリープ防止 - * 蓋を閉じた場合のスリープは防げない(-s はAC電源が必要なため)。 + * -s: システムスリープ防止(AC電源接続時のみ有効、蓋閉じでも動作継続) */ export function preventSleep(): void { if (caffeinateStarted) { @@ -28,7 +28,7 @@ export function preventSleep(): void { return; } - const child = spawn(caffeinatePath, ['-di', '-w', String(process.pid)], { + const child = spawn(caffeinatePath, ['-dis', '-w', String(process.pid)], { stdio: 'ignore', detached: true, }); diff --git a/src/shared/utils/structuredOutput.ts b/src/shared/utils/structuredOutput.ts new file mode 100644 index 0000000..e1b8838 --- /dev/null +++ b/src/shared/utils/structuredOutput.ts @@ -0,0 +1,56 @@ +/** + * Parse structured output from provider text response. + * + * Codex and OpenCode return structured output as JSON text in agent messages. + * This function extracts a JSON object from the text when outputSchema was requested. 
+ * + * Extraction strategies (in order): + * 1. Direct JSON parse — text is pure JSON starting with `{` + * 2. Code block extraction — JSON inside ```json ... ``` or ``` ... ``` + * 3. Brace extraction — find outermost `{` ... `}` in the text + */ + +function tryParseJsonObject(text: string): Record<string, unknown> | undefined { + try { + const parsed = JSON.parse(text) as unknown; + if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) { + return parsed as Record<string, unknown>; + } + } catch { + // Not valid JSON + } + return undefined; +} + +export function parseStructuredOutput( + text: string, + hasOutputSchema: boolean, +): Record<string, unknown> | undefined { + if (!hasOutputSchema || !text) return undefined; + + const trimmed = text.trim(); + + // Strategy 1: Direct JSON parse (text is pure JSON) + if (trimmed.startsWith('{')) { + const result = tryParseJsonObject(trimmed); + if (result) return result; + } + + // Strategy 2: Extract from markdown code block (```json\n{...}\n```) + const codeBlockMatch = trimmed.match(/```(?:json)?\s*\n(\{[\s\S]*?\})\s*\n```/); + if (codeBlockMatch?.[1]) { + const result = tryParseJsonObject(codeBlockMatch[1].trim()); + if (result) return result; + } + + // Strategy 3: Find first `{` and last `}` (handles preamble/postamble text) + const firstBrace = trimmed.indexOf('{'); + const lastBrace = trimmed.lastIndexOf('}'); + if (firstBrace >= 0 && lastBrace > firstBrace) { + const candidate = trimmed.slice(firstBrace, lastBrace + 1); + const result = tryParseJsonObject(candidate); + if (result) return result; + } + + return undefined; +} diff --git a/src/shared/utils/types.ts b/src/shared/utils/types.ts index 2f33f52..2926689 100644 --- a/src/shared/utils/types.ts +++ b/src/shared/utils/types.ts @@ -26,6 +26,8 @@ export interface SessionLog { matchedRuleIndex?: number; /** How the rule match was detected */ matchedRuleMethod?: string; + /** Method used by status judgment phase */ + matchMethod?: string; }>; } @@ -56,6 +58,7 @@ export interface NdjsonStepComplete 
{ instruction: string; matchedRuleIndex?: number; matchedRuleMethod?: string; + matchMethod?: string; error?: string; timestamp: string; } diff --git a/vitest.config.e2e.mock.ts b/vitest.config.e2e.mock.ts index 12bc9fc..3d6abf5 100644 --- a/vitest.config.e2e.mock.ts +++ b/vitest.config.e2e.mock.ts @@ -5,14 +5,21 @@ export default defineConfig({ include: [ 'e2e/specs/direct-task.e2e.ts', 'e2e/specs/pipeline-skip-git.e2e.ts', + 'e2e/specs/pipeline-local-repo.e2e.ts', 'e2e/specs/report-judge.e2e.ts', + 'e2e/specs/report-file-output.e2e.ts', 'e2e/specs/add.e2e.ts', 'e2e/specs/watch.e2e.ts', 'e2e/specs/list-non-interactive.e2e.ts', 'e2e/specs/multi-step-parallel.e2e.ts', + 'e2e/specs/multi-step-sequential.e2e.ts', 'e2e/specs/run-sigint-graceful.e2e.ts', 'e2e/specs/piece-error-handling.e2e.ts', + 'e2e/specs/cycle-detection.e2e.ts', 'e2e/specs/run-multiple-tasks.e2e.ts', + 'e2e/specs/task-status-persistence.e2e.ts', + 'e2e/specs/session-log.e2e.ts', + 'e2e/specs/model-override.e2e.ts', 'e2e/specs/provider-error.e2e.ts', 'e2e/specs/error-handling.e2e.ts', 'e2e/specs/cli-catalog.e2e.ts', @@ -23,6 +30,7 @@ export default defineConfig({ 'e2e/specs/cli-config.e2e.ts', 'e2e/specs/cli-reset-categories.e2e.ts', 'e2e/specs/cli-export-cc.e2e.ts', + 'e2e/specs/eject.e2e.ts', 'e2e/specs/quiet-mode.e2e.ts', 'e2e/specs/task-content-file.e2e.ts', ], diff --git a/vitest.config.e2e.provider.ts b/vitest.config.e2e.provider.ts index 84c2932..cd00e37 100644 --- a/vitest.config.e2e.provider.ts +++ b/vitest.config.e2e.provider.ts @@ -7,6 +7,7 @@ export default defineConfig({ 'e2e/specs/worktree.e2e.ts', 'e2e/specs/pipeline.e2e.ts', 'e2e/specs/github-issue.e2e.ts', + 'e2e/specs/structured-output.e2e.ts', ], environment: 'node', globals: false, diff --git a/vitest.config.e2e.structured-output.ts b/vitest.config.e2e.structured-output.ts new file mode 100644 index 0000000..9926aa5 --- /dev/null +++ b/vitest.config.e2e.structured-output.ts @@ -0,0 +1,20 @@ +import { defineConfig } from 
'vitest/config'; + +export default defineConfig({ + test: { + include: [ + 'e2e/specs/structured-output.e2e.ts', + ], + environment: 'node', + globals: false, + testTimeout: 240000, + hookTimeout: 60000, + teardownTimeout: 30000, + pool: 'threads', + poolOptions: { + threads: { + singleThread: true, + }, + }, + }, +});
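
Review note on `src/shared/utils/ruleIndex.ts`: the new `detectRuleIndex` helper maps a `[STEP_NAME:N]` tag to a 0-based rule index, lets the last tag in the text win, and returns -1 when nothing matches or when `N` is 0. The sketch below is one way to exercise that contract in isolation; it is not the code from the diff. `detectRuleIndexSketch`, `escapeRegExp`, and the sample strings are hypothetical names and inputs introduced here. The escaping step is an assumption about movement names that might contain regex metacharacters, and the `exec` loop stands in for `matchAll` only so the snippet compiles under older targets.

```typescript
// Hypothetical helper, not part of the diff: escapes regex metacharacters so a
// movement name such as "ai-review(v2)" is matched literally.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Same contract as the diff's detectRuleIndex: returns a 0-based rule index,
// or -1 when no [STEP_NAME:N] tag is found or N is 0.
function detectRuleIndexSketch(content: string, movementName: string): number {
  const tag = escapeRegExp(movementName.toUpperCase());
  const regex = new RegExp(`\\[${tag}:(\\d+)\\]`, 'gi');

  // Walk every match and keep the last one, so a later tag overrides an earlier one.
  let last: RegExpExecArray | null = null;
  for (let m = regex.exec(content); m !== null; m = regex.exec(content)) {
    last = m;
  }

  if (last && last[1]) {
    const index = parseInt(last[1], 10) - 1; // tags are 1-based
    return index >= 0 ? index : -1;
  }
  return -1;
}

// Hypothetical inputs covering the main cases.
const samples = {
  single: detectRuleIndexSketch('Review done. [REVIEW:2]', 'review'),
  lastWins: detectRuleIndexSketch('[PLAN:1] revised to [PLAN:3]', 'plan'),
  zeroTag: detectRuleIndexSketch('[PLAN:0]', 'plan'),
  noTag: detectRuleIndexSketch('no tag here', 'plan'),
  literalName: detectRuleIndexSketch('[AI-REVIEW(V2):1]', 'ai-review(v2)'),
};

console.log(samples);
```

Without the escaping step, a name containing characters such as `(` or `+` would be interpreted as regex syntax when interpolated into the pattern; whether real movement names can contain such characters is not visible from this diff.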