Fix judgment processing
parent 034bd059b3
commit 4919bc759f
@@ -19,11 +19,11 @@
           "description": "Instruction for the part agent"
         },
         "timeout_ms": {
-          "type": "integer",
+          "type": ["integer", "null"],
           "description": "Optional timeout in ms"
         }
       },
-      "required": ["id", "title", "instruction"],
+      "required": ["id", "title", "instruction", "timeout_ms"],
       "additionalProperties": false
     }
   }
@@ -10,6 +10,6 @@
       "description": "Why this condition was matched"
     }
   },
-  "required": ["matched_index"],
+  "required": ["matched_index", "reason"],
   "additionalProperties": false
 }
@@ -10,6 +10,6 @@
       "description": "Brief justification for the decision"
     }
   },
-  "required": ["step"],
+  "required": ["step", "reason"],
   "additionalProperties": false
 }
127
docs/implements/structured-output.ja.md
Normal file
@@ -0,0 +1,127 @@
# Structured Output: Phase 3 Status Judgment

## Overview

In Phase 3 (status judgment), the agent's output is obtained as structured output (constrained by a JSON schema) to improve the accuracy and reliability of rule matching.

## Behavior by Provider

| Provider | Method | Mechanism |
|----------|--------|-----------|
| Claude | `structured_output` | The SDK automatically adds a `StructuredOutput` tool. The agent returns `{ step, reason }` through the tool |
| Codex | `structured_output` | `TurnOptions.outputSchema` applies a JSON constraint at the API level. The response text is JSON |
| OpenCode | `structured_output` | An output instruction carrying the JSON schema is injected at the end of the prompt. JSON is extracted from the text response with `parseStructuredOutput()` |

## Fallback Chain

`judgeStatus()` matches rules through three independent LLM calls, tried in order. A later stage runs only when the earlier one yields no match.

```
Stage 1: structured_output - LLM call with outputSchema → structuredOutput.step (1-based integer)
Stage 2: phase3_tag        - LLM call without outputSchema → detect the [MOVEMENT:N] tag in content
Stage 3: ai_judge          - AI condition evaluation via evaluateCondition()
```

Each stage queries the LLM with its own dedicated instruction, and the stages phrase the question differently: Stage 1 asks it to "return the rule number as JSON", while Stage 2 asks it to "output the tag on a single line".

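The chain above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the types, the `JudgeDeps` interface, the `judgeStatusSketch` name, and the tag regular expression are all assumptions made for the example. Only the stage order, the method names, and the two error messages come from this commit.

```typescript
// Hypothetical shapes for illustration; the project's real types may differ.
type Rule = { condition: string; next: string };
type AgentReply = { content: string; structuredOutput?: Record<string, unknown> };
type JudgeMethod = 'auto_select' | 'structured_output' | 'phase3_tag' | 'ai_judge';
type JudgeResult = { ruleIndex: number; method: JudgeMethod };

interface JudgeDeps {
  // Stand-in for the provider call; `withSchema` toggles outputSchema.
  callAgent(instruction: string, withSchema: boolean): Promise<AgentReply>;
  // Stand-in for evaluateCondition(); resolves to a 0-based index or -1.
  evaluate(rules: Rule[]): Promise<number>;
}

async function judgeStatusSketch(
  structuredInstruction: string,
  tagInstruction: string,
  rules: Rule[],
  movementName: string,
  deps: JudgeDeps,
): Promise<JudgeResult> {
  if (rules.length === 0) throw new Error('judgeStatus requires at least one rule');
  if (rules.length === 1) return { ruleIndex: 0, method: 'auto_select' };

  const inRange = (n: number): boolean => Number.isInteger(n) && n >= 1 && n <= rules.length;

  // Stage 1: ask for JSON and read the 1-based `step` field.
  const first = await deps.callAgent(structuredInstruction, true);
  const step = first.structuredOutput?.step;
  if (typeof step === 'number' && inRange(step)) {
    return { ruleIndex: step - 1, method: 'structured_output' };
  }

  // Stage 2: ask for a tag and look for [NAME:N] in the text (assumed tag syntax).
  const second = await deps.callAgent(tagInstruction, false);
  const tag = /\[[A-Za-z0-9_-]+:(\d+)\]/.exec(second.content);
  const tagged = tag ? Number(tag[1]) : NaN;
  if (inRange(tagged)) return { ruleIndex: tagged - 1, method: 'phase3_tag' };

  // Stage 3: fall back to AI evaluation of the rule conditions.
  const judged = await deps.evaluate(rules);
  if (judged >= 0 && judged < rules.length) return { ruleIndex: judged, method: 'ai_judge' };

  throw new Error(`Status not found for movement "${movementName}"`);
}
```

Passing the agent call in as a dependency keeps the sketch self-contained and lets each stage be exercised with canned replies.
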
The session log records the value converted by `toJudgmentMatchMethod()`.

| Internal method | Session log |
|-----------------|-------------|
| `structured_output` | `structured_output` |
| `phase3_tag` / `phase1_tag` | `tag_fallback` |
| `ai_judge` / `ai_judge_fallback` | `ai_judge` |

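A minimal sketch of this mapping, assuming a plain lookup table. The function name `toJudgmentMatchMethodSketch` is hypothetical, and returning `undefined` for methods outside the table (for example `auto_select`) is an assumption; this document does not say how the real `toJudgmentMatchMethod()` treats them.

```typescript
type InternalMethod =
  | 'structured_output'
  | 'phase3_tag'
  | 'phase1_tag'
  | 'ai_judge'
  | 'ai_judge_fallback';
type LoggedMethod = 'structured_output' | 'tag_fallback' | 'ai_judge';

// The three rows of the table above, expanded to one entry per internal method.
const LOGGED_METHOD: Record<InternalMethod, LoggedMethod> = {
  structured_output: 'structured_output',
  phase3_tag: 'tag_fallback',
  phase1_tag: 'tag_fallback',
  ai_judge: 'ai_judge',
  ai_judge_fallback: 'ai_judge',
};

function toJudgmentMatchMethodSketch(method: string): LoggedMethod | undefined {
  // Assumption: anything not listed in the table is not recorded.
  return Object.prototype.hasOwnProperty.call(LOGGED_METHOD, method)
    ? LOGGED_METHOD[method as InternalMethod]
    : undefined;
}
```
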
## Instruction Branching

The Phase 3 template (`perform_phase3_message`) has two modes, selected by the `structuredOutput` flag.

### Structured Output mode (`structuredOutput: true`)

Primary instruction: return the rule number (1-based) and the reason.
Fallback instruction: if structured output is unavailable, output the tag.

### Tag mode (`structuredOutput: false`)

Legacy instruction: output the corresponding tag on a single line.

Currently, Phase 3 always runs with `structuredOutput: true`.

## Architecture

```
StatusJudgmentBuilder
  └─ structuredOutput: true
       ├─ criteriaTable: rule condition table (always included)
       ├─ outputList: tag list (included for fallback)
       └─ template: "return the rule number and reason + tags as fallback"

runStatusJudgmentPhase()
  └─ judgeStatus() → JudgeStatusResult { ruleIndex, method }
       └─ StatusJudgmentPhaseResult { tag, ruleIndex, method }

MovementExecutor
  ├─ Phase 3 present → use the judgeStatus result directly (method propagated)
  └─ Phase 3 absent  → detect from Phase 1 content via detectMatchedRule()
```

## JSON Schemas

### judgment.json (for judgeStatus)

```json
{
  "type": "object",
  "properties": {
    "step": { "type": "integer", "description": "Matched rule number (1-based)" },
    "reason": { "type": "string", "description": "Brief justification" }
  },
  "required": ["step", "reason"],
  "additionalProperties": false
}
```

### evaluation.json (for evaluateCondition)

```json
{
  "type": "object",
  "properties": {
    "matched_index": { "type": "integer" },
    "reason": { "type": "string" }
  },
  "required": ["matched_index", "reason"],
  "additionalProperties": false
}
```

## parseStructuredOutput(): JSON Extraction

Codex and OpenCode extract JSON from the text response. Extraction uses a three-stage fallback strategy.

```
1. Direct parse: the entire text is a JSON object starting with `{`
2. Code block: JSON inside ```json ... ``` or ``` ... ```
3. Brace extraction: cut from the first `{` to the last `}` in the text
```

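The three stages could look roughly like the sketch below. It is an illustration under assumptions, not the project's `parseStructuredOutput()`: the function name, the regular expression, and the decision to accept only plain objects (rejecting arrays, as the Codex tests in this commit expect) are choices made for the example.

```typescript
type JsonObject = Record<string, unknown>;

// Built at runtime so the sketch itself contains no literal fence markers.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + '(?:json)?\\s*\\n([\\s\\S]*?)\\n?' + FENCE);

// Parse `text` and accept the result only if it is a plain JSON object.
function tryParseObject(text: string): JsonObject | undefined {
  try {
    const value: unknown = JSON.parse(text);
    if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      return value as JsonObject;
    }
  } catch {
    // Not valid JSON; the caller moves on to the next stage.
  }
  return undefined;
}

function parseStructuredOutputSketch(text: string): JsonObject | undefined {
  const trimmed = text.trim();

  // 1. Direct parse: the whole text is a JSON object.
  if (trimmed.startsWith('{')) {
    const direct = tryParseObject(trimmed);
    if (direct) return direct;
  }

  // 2. Code block: JSON inside a fenced block, with or without a "json" tag.
  const body = FENCED_BLOCK.exec(trimmed)?.[1];
  if (body !== undefined) {
    const fromBlock = tryParseObject(body.trim());
    if (fromBlock) return fromBlock;
  }

  // 3. Brace extraction: slice from the first "{" to the last "}".
  const start = trimmed.indexOf('{');
  const end = trimmed.lastIndexOf('}');
  if (start !== -1 && end > start) return tryParseObject(trimmed.slice(start, end + 1));

  return undefined;
}
```

Each stage is strictly more permissive than the one before it, so well-formed responses never reach the looser heuristics.
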
## OpenCode-Specific Mechanism

The OpenCode SDK's type definitions do not support `outputFormat`. Instead, a JSON output instruction is injected at the end of the prompt.

````
---
IMPORTANT: You MUST respond with ONLY a valid JSON object matching this schema. No other text, no markdown code blocks, no explanation.
```json
{ "type": "object", ... }
```
````

The text the agent returns is parsed with `parseStructuredOutput()` and stored in `AgentResponse.structuredOutput`.

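A minimal sketch of the injection step, assuming the suffix is simply appended to the prompt text. The helper name `appendSchemaInstruction` is hypothetical, and the exact spacing and schema serialization used by the project are not shown in this commit; only the instruction wording is taken from the block above.

```typescript
// Built at runtime so the sketch itself contains no literal fence markers.
const CODE_FENCE = '`'.repeat(3);

// Append a JSON-only output instruction, followed by the schema, to the prompt.
function appendSchemaInstruction(prompt: string, schema: Record<string, unknown>): string {
  return [
    prompt,
    '',
    '---',
    'IMPORTANT: You MUST respond with ONLY a valid JSON object matching this schema. No other text, no markdown code blocks, no explanation.',
    CODE_FENCE + 'json',
    JSON.stringify(schema, null, 2),
    CODE_FENCE,
  ].join('\n');
}
```

Because the constraint lives only in the prompt, the response still has to go through the tolerant JSON extraction rather than being trusted as JSON.
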
## Notes

- The OpenAI API (Codex) returns an error unless `required` lists every property (when `additionalProperties: false` is set)
- The Codex SDK's `TurnCompletedEvent` has no `finalResponse` field. Structured output is parsed from the JSON text in `AgentMessageItem.text` with `parseStructuredOutput()`
- Because the Claude SDK uses the `StructuredOutput` tool approach, over-emphasizing tag output in the instruction causes the agent to emit a tag instead of calling the tool
- OpenCode's prompt-injection approach depends on how well the model follows instructions. If non-JSON text is mixed in, the code block / brace extraction stages of `parseStructuredOutput()` recover the JSON
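The first note is what drives the schema changes in this commit: every property is added to `required`, and a field that used to be optional (`timeout_ms`) becomes nullable instead. A small check along these lines can catch the mismatch early. It is a sketch only; the `ObjectSchema` type and the `missingRequired` helper are hypothetical and cover just a flat object schema.

```typescript
// Hypothetical minimal view of an object schema (no nesting handled).
type ObjectSchema = {
  type: 'object';
  properties: Record<string, unknown>;
  required?: string[];
  additionalProperties?: boolean;
};

// Return the property names that are declared but not listed in `required`.
function missingRequired(schema: ObjectSchema): string[] {
  const required = new Set(schema.required ?? []);
  return Object.keys(schema.properties).filter((name) => !required.has(name));
}

// The judgment schema before this commit: `reason` was not required.
const judgmentBefore: ObjectSchema = {
  type: 'object',
  properties: { step: { type: 'integer' }, reason: { type: 'string' } },
  required: ['step'],
  additionalProperties: false,
};

// After this commit both properties are required.
const judgmentAfter: ObjectSchema = { ...judgmentBefore, required: ['step', 'reason'] };

console.log(missingRequired(judgmentBefore)); // lists "reason"
console.log(missingRequired(judgmentAfter)); // empty
```
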
18
e2e/fixtures/pieces/structured-output.yaml
Normal file
@@ -0,0 +1,18 @@
name: e2e-structured-output
description: E2E piece to verify structured output rule matching

max_movements: 5

movements:
  - name: execute
    edit: false
    persona: ../agents/test-coder.md
    permission_mode: readonly
    instruction_template: |
      Reply with exactly: "Task completed successfully."
      Do not do anything else.
    rules:
      - condition: Task completed
        next: COMPLETE
      - condition: Task failed
        next: ABORT
96
e2e/specs/structured-output.e2e.ts
Normal file
@@ -0,0 +1,96 @@
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { resolve, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
import { createIsolatedEnv, type IsolatedEnv } from '../helpers/isolated-env';
import { createLocalRepo, type LocalRepo } from '../helpers/test-repo';
import { runTakt } from '../helpers/takt-runner';
import { readSessionRecords } from '../helpers/session-log';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

/**
 * E2E: Structured output for status judgment (Phase 3).
 *
 * Verifies that real providers (Claude, Codex, OpenCode) can execute a piece
 * where the status judgment phase uses structured output (`outputSchema`)
 * internally via `judgeStatus()`.
 *
 * The piece has 2 rules per step, so `judgeStatus` cannot auto-select
 * and must actually call the provider with an outputSchema to determine
 * which rule matched.
 *
 * If structured output works correctly, `judgeStatus` extracts the step
 * number from `response.structuredOutput.step` (recorded as `structured_output`).
 * If the agent happens to output `[STEP:N]` tags, the RuleEvaluator detects
 * them as `phase3_tag`/`phase1_tag` (recorded as `tag_fallback` in session log).
 * The session log matchMethod is transformed by `toJudgmentMatchMethod()`.
 *
 * Run with:
 *   TAKT_E2E_PROVIDER=claude vitest run --config vitest.config.e2e.structured-output.ts
 *   TAKT_E2E_PROVIDER=codex vitest run --config vitest.config.e2e.structured-output.ts
 *   TAKT_E2E_PROVIDER=opencode TAKT_E2E_MODEL=openai/gpt-4 vitest run --config vitest.config.e2e.structured-output.ts
 */
describe('E2E: Structured output rule matching', () => {
  let isolatedEnv: IsolatedEnv;
  let repo: LocalRepo;

  beforeEach(() => {
    isolatedEnv = createIsolatedEnv();
    repo = createLocalRepo();
  });

  afterEach(() => {
    try { repo.cleanup(); } catch { /* best-effort */ }
    try { isolatedEnv.cleanup(); } catch { /* best-effort */ }
  });

  it('should complete piece via Phase 3 status judgment with 2-rule step', () => {
    const piecePath = resolve(__dirname, '../fixtures/pieces/structured-output.yaml');

    const result = runTakt({
      args: [
        '--task', 'Say hello',
        '--piece', piecePath,
        '--create-worktree', 'no',
      ],
      cwd: repo.path,
      env: isolatedEnv.env,
      timeout: 240_000,
    });

    if (result.exitCode !== 0) {
      console.log('=== STDOUT ===\n', result.stdout);
      console.log('=== STDERR ===\n', result.stderr);
    }

    // Always log the matchMethod for diagnostic purposes
    const allRecords = readSessionRecords(repo.path);
    const sc = allRecords.find((r) => r.type === 'step_complete');
    console.log(`=== matchMethod: ${sc?.matchMethod ?? '(none)'} ===`);

    expect(result.exitCode).toBe(0);
    expect(result.stdout).toContain('Piece completed');

    // Verify session log has proper step_complete with matchMethod
    const records = readSessionRecords(repo.path);

    const pieceComplete = records.find((r) => r.type === 'piece_complete');
    expect(pieceComplete).toBeDefined();

    const stepComplete = records.find((r) => r.type === 'step_complete');
    expect(stepComplete).toBeDefined();

    // matchMethod should be present: the 2-rule step required actual judgment
    // (auto_select is only used for single-rule steps)
    const matchMethod = stepComplete?.matchMethod as string | undefined;
    expect(matchMethod).toBeDefined();

    // Session log records transformed matchMethod via toJudgmentMatchMethod():
    //   structured_output → structured_output (judgeStatus extracted from structuredOutput.step)
    //   phase3_tag / phase1_tag → tag_fallback (agent output [STEP:N] tag, detected by RuleEvaluator)
    //   ai_judge / ai_judge_fallback → ai_judge (AI evaluated conditions as fallback)
    const validMethods = ['structured_output', 'tag_fallback', 'ai_judge'];
    expect(validMethods).toContain(matchMethod);
  }, 240_000);
});
@@ -40,6 +40,8 @@ function doneResponse(content: string, structuredOutput?: Record<string, unknown
   };
 }

+const judgeOptions = { cwd: '/repo', movementName: 'review' };
+
 describe('agent-usecases', () => {
   beforeEach(() => {
     vi.clearAllMocks();
@@ -102,83 +104,90 @@ describe('agent-usecases', () => {
     expect(detectJudgeIndex).not.toHaveBeenCalled();
   });

+  // --- judgeStatus: 3-stage fallback ---
+
   it('judgeStatus returns auto_select when there is a single rule', async () => {
-    const result = await judgeStatus('instruction', [{ condition: 'always', next: 'done' }], {
-      cwd: '/repo',
-      movementName: 'review',
-    });
+    const result = await judgeStatus('structured', 'tag', [{ condition: 'always', next: 'done' }], judgeOptions);

     expect(result).toEqual({ ruleIndex: 0, method: 'auto_select' });
     expect(runAgent).not.toHaveBeenCalled();
   });

   it('judgeStatus throws when rules are empty', async () => {
-    await expect(judgeStatus('instruction', [], {
-      cwd: '/repo',
-      movementName: 'review',
-    })).rejects.toThrow('judgeStatus requires at least one rule');
+    await expect(judgeStatus('structured', 'tag', [], judgeOptions))
+      .rejects.toThrow('judgeStatus requires at least one rule');
   });

-  it('judgeStatus adopts the structured output step', async () => {
-    vi.mocked(runAgent).mockResolvedValue(doneResponse('x', { step: 2 }));
+  it('judgeStatus adopts the structured output step in Stage 1', async () => {
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('x', { step: 2 }));

-    const result = await judgeStatus('instruction', [
+    const result = await judgeStatus('structured', 'tag', [
       { condition: 'a', next: 'one' },
       { condition: 'b', next: 'two' },
-    ], {
-      cwd: '/repo',
-      movementName: 'review',
-    });
+    ], judgeOptions);

     expect(result).toEqual({ ruleIndex: 1, method: 'structured_output' });
+    expect(runAgent).toHaveBeenCalledTimes(1);
+    expect(runAgent).toHaveBeenCalledWith('conductor', 'structured', expect.objectContaining({
+      outputSchema: { type: 'judgment' },
+    }));
   });

-  it('judgeStatus uses the tag fallback', async () => {
-    vi.mocked(runAgent).mockResolvedValue(doneResponse('[REVIEW:2]'));
+  it('judgeStatus uses tag detection in Stage 2', async () => {
+    // Stage 1: structured output fails (no structuredOutput)
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no match'));
+    // Stage 2: tag detection succeeds
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('[REVIEW:2]'));

-    const result = await judgeStatus('instruction', [
+    const result = await judgeStatus('structured', 'tag', [
       { condition: 'a', next: 'one' },
       { condition: 'b', next: 'two' },
-    ], {
-      cwd: '/repo',
-      movementName: 'review',
-    });
+    ], judgeOptions);

     expect(result).toEqual({ ruleIndex: 1, method: 'phase3_tag' });
+    expect(runAgent).toHaveBeenCalledTimes(2);
+    expect(runAgent).toHaveBeenNthCalledWith(1, 'conductor', 'structured', expect.objectContaining({
+      outputSchema: { type: 'judgment' },
+    }));
+    expect(runAgent).toHaveBeenNthCalledWith(2, 'conductor', 'tag', expect.not.objectContaining({
+      outputSchema: expect.anything(),
+    }));
   });

-  it('judgeStatus uses AI Judge as a last resort', async () => {
-    vi.mocked(runAgent)
-      .mockResolvedValueOnce(doneResponse('no match'))
-      .mockResolvedValueOnce(doneResponse('ignored', { matched_index: 2 }));
+  it('judgeStatus uses AI Judge in Stage 3', async () => {
+    // Stage 1: structured output fails
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no match'));
+    // Stage 2: tag detection fails
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no tag'));
+    // Stage 3: evaluateCondition succeeds
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('ignored', { matched_index: 2 }));

-    const result = await judgeStatus('instruction', [
+    const result = await judgeStatus('structured', 'tag', [
       { condition: 'a', next: 'one' },
       { condition: 'b', next: 'two' },
-    ], {
-      cwd: '/repo',
-      movementName: 'review',
-    });
+    ], judgeOptions);

     expect(result).toEqual({ ruleIndex: 1, method: 'ai_judge' });
-    expect(runAgent).toHaveBeenCalledTimes(2);
+    expect(runAgent).toHaveBeenCalledTimes(3);
   });

   it('judgeStatus throws when every judgment fails', async () => {
-    vi.mocked(runAgent)
-      .mockResolvedValueOnce(doneResponse('no match'))
-      .mockResolvedValueOnce(doneResponse('still no match'));
+    // Stage 1: structured output fails
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no match'));
+    // Stage 2: tag detection fails
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('no tag'));
+    // Stage 3: evaluateCondition fails
+    vi.mocked(runAgent).mockResolvedValueOnce(doneResponse('still no match'));
     vi.mocked(detectJudgeIndex).mockReturnValue(-1);

-    await expect(judgeStatus('instruction', [
+    await expect(judgeStatus('structured', 'tag', [
       { condition: 'a', next: 'one' },
       { condition: 'b', next: 'two' },
-    ], {
-      cwd: '/repo',
-      movementName: 'review',
-    })).rejects.toThrow('Status not found for movement "review"');
+    ], judgeOptions)).rejects.toThrow('Status not found for movement "review"');
   });

+  // --- decomposeTask ---
+
   it('decomposeTask returns structured output parts', async () => {
     vi.mocked(runAgent).mockResolvedValue(doneResponse('x', {
       parts: [
@@ -1,8 +1,12 @@
 /**
  * Codex SDK layer structured output tests.
  *
- * Tests CodexClient's extraction of structuredOutput from
- * `turn.completed` events' `finalResponse` field.
+ * Tests CodexClient's extraction of structuredOutput by parsing
+ * JSON text from agent_message items when outputSchema is provided.
+ *
+ * Codex SDK returns structured output as JSON text in agent_message
+ * items (not via turn.completed.finalResponse which doesn't exist
+ * on TurnCompletedEvent).
  */

 import { beforeEach, describe, expect, it, vi } from 'vitest';
@@ -42,34 +46,32 @@ describe('CodexClient - structuredOutput extraction', () => {
     mockEvents = [];
   });

-  it('returns finalResponse from turn.completed as structuredOutput', async () => {
+  it('returns the agent_message JSON text as structuredOutput when outputSchema is specified', async () => {
+    const schema = { type: 'object', properties: { step: { type: 'integer' } } };
     mockEvents = [
       { type: 'thread.started', thread_id: 'thread-1' },
       {
         type: 'item.completed',
-        item: { id: 'msg-1', type: 'agent_message', text: 'response text' },
-      },
-      {
-        type: 'turn.completed',
-        turn: { finalResponse: { step: 2, reason: 'approved' } },
+        item: { id: 'msg-1', type: 'agent_message', text: '{"step": 2, "reason": "approved"}' },
       },
+      { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } },
     ];

     const client = new CodexClient();
-    const result = await client.call('coder', 'prompt', { cwd: '/tmp' });
+    const result = await client.call('coder', 'prompt', { cwd: '/tmp', outputSchema: schema });

     expect(result.status).toBe('done');
     expect(result.structuredOutput).toEqual({ step: 2, reason: 'approved' });
   });

-  it('returns undefined when turn.completed has no finalResponse', async () => {
+  it('does not JSON-parse the text when outputSchema is absent', async () => {
     mockEvents = [
       { type: 'thread.started', thread_id: 'thread-1' },
       {
         type: 'item.completed',
-        item: { id: 'msg-1', type: 'agent_message', text: 'text' },
+        item: { id: 'msg-1', type: 'agent_message', text: '{"step": 2}' },
       },
-      { type: 'turn.completed', turn: {} },
+      { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } },
     ];

     const client = new CodexClient();
@@ -79,86 +81,64 @@ describe('CodexClient - structuredOutput extraction', () => {
     expect(result.structuredOutput).toBeUndefined();
   });

-  it('ignores finalResponse when it is an array', async () => {
-    mockEvents = [
-      { type: 'thread.started', thread_id: 'thread-1' },
-      {
-        type: 'item.completed',
-        item: { id: 'msg-1', type: 'agent_message', text: 'text' },
-      },
-      { type: 'turn.completed', turn: { finalResponse: [1, 2, 3] } },
-    ];
-
-    const client = new CodexClient();
-    const result = await client.call('coder', 'prompt', { cwd: '/tmp' });
-
-    expect(result.structuredOutput).toBeUndefined();
-  });
-
-  it('returns undefined when finalResponse is null', async () => {
-    mockEvents = [
-      { type: 'thread.started', thread_id: 'thread-1' },
-      { type: 'turn.completed', turn: { finalResponse: null } },
-    ];
-
-    const client = new CodexClient();
-    const result = await client.call('coder', 'prompt', { cwd: '/tmp' });
-
-    expect(result.structuredOutput).toBeUndefined();
-  });
-
-  it('has no structuredOutput when turn.completed is missing', async () => {
-    mockEvents = [
-      { type: 'thread.started', thread_id: 'thread-1' },
-      {
-        type: 'item.completed',
-        item: { id: 'msg-1', type: 'agent_message', text: 'response' },
-      },
-    ];
-
-    const client = new CodexClient();
-    const result = await client.call('coder', 'prompt', { cwd: '/tmp' });
-
-    expect(result.status).toBe('done');
-    expect(result.structuredOutput).toBeUndefined();
-  });
-
-  it('passes outputSchema to runStreamed', async () => {
+  it('returns undefined when agent_message is not JSON', async () => {
     const schema = { type: 'object', properties: { step: { type: 'integer' } } };
-    const runStreamedSpy = vi.fn().mockResolvedValue({
-      events: (async function* () {
-        yield { type: 'thread.started', thread_id: 'thread-1' };
-        yield {
-          type: 'item.completed',
-          item: { id: 'msg-1', type: 'agent_message', text: 'ok' },
-        };
-        yield {
-          type: 'turn.completed',
-          turn: { finalResponse: { step: 1 } },
-        };
-      })(),
-    });
-
-    // Replace runStreamed on the thread returned by the mock SDK's startThread with a spy
-    const { Codex } = await import('@openai/codex-sdk');
-    const codex = new Codex({} as never);
-    const thread = await codex.startThread();
-    thread.runStreamed = runStreamedSpy;
-
-    // CodexClient instantiates Codex internally, so
-    // the return value of startThread is controlled by mocking the SDK class itself
-    // -> the simple mockEvents-based tests cannot verify runStreamed arguments directly
-    // The provider-layer test (provider-structured-output.test.ts) already
-    // verifies outputSchema passthrough, so SDK-internal argument checks are skipped here
-
-    // Instead, call with outputSchema and confirm that structuredOutput is returned
     mockEvents = [
       { type: 'thread.started', thread_id: 'thread-1' },
       {
         type: 'item.completed',
-        item: { id: 'msg-1', type: 'agent_message', text: 'ok' },
+        item: { id: 'msg-1', type: 'agent_message', text: 'plain text response' },
       },
-      { type: 'turn.completed', turn: { finalResponse: { step: 1 } } },
+      { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } },
+    ];
+
+    const client = new CodexClient();
+    const result = await client.call('coder', 'prompt', { cwd: '/tmp', outputSchema: schema });
+
+    expect(result.status).toBe('done');
+    expect(result.structuredOutput).toBeUndefined();
+  });
+
+  it('ignores JSON when it is an array', async () => {
+    const schema = { type: 'object', properties: { step: { type: 'integer' } } };
+    mockEvents = [
+      { type: 'thread.started', thread_id: 'thread-1' },
+      {
+        type: 'item.completed',
+        item: { id: 'msg-1', type: 'agent_message', text: '[1, 2, 3]' },
+      },
+      { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } },
+    ];
+
+    const client = new CodexClient();
+    const result = await client.call('coder', 'prompt', { cwd: '/tmp', outputSchema: schema });
+
+    expect(result.structuredOutput).toBeUndefined();
+  });
+
+  it('has no structuredOutput when there is no agent_message', async () => {
+    const schema = { type: 'object', properties: { step: { type: 'integer' } } };
+    mockEvents = [
+      { type: 'thread.started', thread_id: 'thread-1' },
+      { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } },
+    ];
+
+    const client = new CodexClient();
+    const result = await client.call('coder', 'prompt', { cwd: '/tmp', outputSchema: schema });
+
+    expect(result.status).toBe('done');
+    expect(result.structuredOutput).toBeUndefined();
+  });
+
+  it('returns structuredOutput when called with outputSchema', async () => {
+    const schema = { type: 'object', properties: { step: { type: 'integer' } } };
+    mockEvents = [
+      { type: 'thread.started', thread_id: 'thread-1' },
+      {
+        type: 'item.completed',
+        item: { id: 'msg-1', type: 'agent_message', text: '{"step": 1}' },
+      },
+      { type: 'turn.completed', usage: { input_tokens: 0, cached_input_tokens: 0, output_tokens: 0 } },
     ];

     const client = new CodexClient();
@@ -25,7 +25,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));

 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -21,7 +21,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));

 vi.mock('../shared/utils/index.js', async () => {
@@ -23,7 +23,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));

 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -24,7 +24,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));

 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -28,7 +28,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));

 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -27,7 +27,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -23,7 +23,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -24,7 +24,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -17,7 +17,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -173,7 +173,7 @@ export function createTestTmpDir(): string {
 export function applyDefaultMocks(): void {
   vi.mocked(needsStatusJudgmentPhase).mockReturnValue(false);
   vi.mocked(runReportPhase).mockResolvedValue(undefined);
-  vi.mocked(runStatusJudgmentPhase).mockResolvedValue('');
+  vi.mocked(runStatusJudgmentPhase).mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' });
   vi.mocked(generateReportDir).mockReturnValue('test-report-dir');
 }
 
@@ -24,7 +24,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -31,7 +31,7 @@ vi.mock('../agents/ai-judge.js', async (importOriginal) => {
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -35,7 +35,7 @@ vi.mock('../agents/ai-judge.js', async (importOriginal) => {
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -37,7 +37,7 @@ vi.mock('../agents/ai-judge.js', async (importOriginal) => {
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -144,7 +144,7 @@ vi.mock('../shared/prompt/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 // --- Imports (after mocks) ---
@@ -125,7 +125,7 @@ vi.mock('../shared/prompt/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 // --- Imports (after mocks) ---
@@ -114,7 +114,7 @@ describe('Three-Phase Execution IT: phase1 only (no report, no tag rules)', () =
     // No tag rules needed → Phase 3 not needed
     mockNeedsStatusJudgmentPhase.mockReturnValue(false);
     mockRunReportPhase.mockResolvedValue(undefined);
-    mockRunStatusJudgmentPhase.mockResolvedValue('');
+    mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' });
   });
 
   afterEach(() => {
@@ -166,7 +166,7 @@ describe('Three-Phase Execution IT: phase1 + phase2 (report defined)', () => {
 
     mockNeedsStatusJudgmentPhase.mockReturnValue(false);
     mockRunReportPhase.mockResolvedValue(undefined);
-    mockRunStatusJudgmentPhase.mockResolvedValue('');
+    mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' });
   });
 
   afterEach(() => {
@@ -246,7 +246,7 @@ describe('Three-Phase Execution IT: phase1 + phase3 (tag rules defined)', () =>
     mockNeedsStatusJudgmentPhase.mockReturnValue(true);
     mockRunReportPhase.mockResolvedValue(undefined);
     // Phase 3 returns content with a tag
-    mockRunStatusJudgmentPhase.mockResolvedValue('[STEP:1]');
+    mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '[STEP:1]', ruleIndex: 0, method: 'structured_output' });
   });
 
   afterEach(() => {
@@ -298,7 +298,7 @@ describe('Three-Phase Execution IT: all three phases', () => {
 
     mockNeedsStatusJudgmentPhase.mockReturnValue(true);
     mockRunReportPhase.mockResolvedValue(undefined);
-    mockRunStatusJudgmentPhase.mockResolvedValue('[STEP:1]');
+    mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '[STEP:1]', ruleIndex: 0, method: 'structured_output' });
   });
 
   afterEach(() => {
@@ -369,7 +369,7 @@ describe('Three-Phase Execution IT: phase3 tag → rule match', () => {
     ]);
 
     // Phase 3 returns rule 2 (ABORT)
-    mockRunStatusJudgmentPhase.mockResolvedValue('[STEP1:2]');
+    mockRunStatusJudgmentPhase.mockResolvedValue({ tag: '[STEP1:2]', ruleIndex: 1, method: 'structured_output' });
 
     const config: PieceConfig = {
       name: 'it-phase3-tag',
86
src/__tests__/parseStructuredOutput.test.ts
Normal file
@@ -0,0 +1,86 @@
+import { describe, it, expect } from 'vitest';
+import { parseStructuredOutput } from '../shared/utils/structuredOutput.js';
+
+describe('parseStructuredOutput', () => {
+  it('should return undefined when hasOutputSchema is false', () => {
+    expect(parseStructuredOutput('{"step":1}', false)).toBeUndefined();
+  });
+
+  it('should return undefined for empty text', () => {
+    expect(parseStructuredOutput('', true)).toBeUndefined();
+  });
+
+  // Strategy 1: Direct JSON parse
+  describe('direct JSON parse', () => {
+    it('should parse pure JSON object', () => {
+      expect(parseStructuredOutput('{"step":1,"reason":"done"}', true))
+        .toEqual({ step: 1, reason: 'done' });
+    });
+
+    it('should parse JSON with whitespace', () => {
+      expect(parseStructuredOutput(' { "step": 2, "reason": "ok" } ', true))
+        .toEqual({ step: 2, reason: 'ok' });
+    });
+
+    it('should ignore arrays', () => {
+      expect(parseStructuredOutput('[1,2,3]', true)).toBeUndefined();
+    });
+
+    it('should ignore primitive JSON', () => {
+      expect(parseStructuredOutput('"hello"', true)).toBeUndefined();
+    });
+  });
+
+  // Strategy 2: Code block extraction
+  describe('code block extraction', () => {
+    it('should extract JSON from ```json code block', () => {
+      const text = 'Here is the result:\n```json\n{"step":1,"reason":"matched"}\n```';
+      expect(parseStructuredOutput(text, true))
+        .toEqual({ step: 1, reason: 'matched' });
+    });
+
+    it('should extract JSON from ``` code block (no language)', () => {
+      const text = 'Result:\n```\n{"step":2,"reason":"fallback"}\n```';
+      expect(parseStructuredOutput(text, true))
+        .toEqual({ step: 2, reason: 'fallback' });
+    });
+  });
+
+  // Strategy 3: Brace extraction
+  describe('brace extraction', () => {
+    it('should extract JSON with preamble text', () => {
+      const text = 'The matched rule is: {"step":1,"reason":"condition met"}';
+      expect(parseStructuredOutput(text, true))
+        .toEqual({ step: 1, reason: 'condition met' });
+    });
+
+    it('should extract JSON with postamble text', () => {
+      const text = '{"step":3,"reason":"done"}\nEnd of response.';
+      expect(parseStructuredOutput(text, true))
+        .toEqual({ step: 3, reason: 'done' });
+    });
+
+    it('should extract JSON with both preamble and postamble', () => {
+      const text = 'Based on my analysis:\n{"matched_index":2,"reason":"test"}\nThat is my judgment.';
+      expect(parseStructuredOutput(text, true))
+        .toEqual({ matched_index: 2, reason: 'test' });
+    });
+  });
+
+  // Edge cases
+  describe('edge cases', () => {
+    it('should return undefined for text without JSON', () => {
+      expect(parseStructuredOutput('No JSON here at all.', true)).toBeUndefined();
+    });
+
+    it('should return undefined for invalid JSON', () => {
+      expect(parseStructuredOutput('{invalid json}', true)).toBeUndefined();
+    });
+
+    it('should handle nested objects', () => {
+      const text = '{"step":1,"reason":"ok","meta":{"detail":"extra"}}';
+      expect(parseStructuredOutput(text, true))
+        .toEqual({ step: 1, reason: 'ok', meta: { detail: 'extra' } });
+    });
+  });
+});
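The tests above exercise three parsing strategies in order: direct parse, fenced code block, then brace extraction. The implementation of `parseStructuredOutput` is not part of this excerpt, so the following is only a minimal sketch of a function that would satisfy those cases. The names `parseStructuredOutputSketch`, `tryParseObject`, `JsonObject`, and `FENCE` are hypothetical and do not come from the repository.

```typescript
// Hypothetical sketch; NOT the repository's parseStructuredOutput implementation.
type JsonObject = Record<string, unknown>;

// Three backticks, built at runtime so the marker never appears literally here.
const FENCE = '`'.repeat(3);

/** Parse `candidate` and return it only when it is a plain JSON object. */
function tryParseObject(candidate: string): JsonObject | undefined {
  try {
    const value: unknown = JSON.parse(candidate);
    if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      return value as JsonObject;
    }
  } catch {
    // Not valid JSON: fall through so the caller can try the next strategy.
  }
  return undefined;
}

function parseStructuredOutputSketch(text: string, hasOutputSchema: boolean): JsonObject | undefined {
  const trimmed = text.trim();
  if (!hasOutputSchema || trimmed === '') return undefined;

  // Strategy 1: the whole text is a JSON object.
  const direct = tryParseObject(trimmed);
  if (direct) return direct;

  // Strategy 2: the first fenced code block, with or without a "json" tag.
  const fencePattern = new RegExp(FENCE + '(?:json)?\\s*\\n([\\s\\S]*?)\\n\\s*' + FENCE);
  const body = fencePattern.exec(trimmed)?.[1];
  if (body !== undefined) {
    const fromFence = tryParseObject(body.trim());
    if (fromFence) return fromFence;
  }

  // Strategy 3: the span from the first "{" to the last "}".
  const start = trimmed.indexOf('{');
  const end = trimmed.lastIndexOf('}');
  if (start >= 0 && end > start) {
    return tryParseObject(trimmed.slice(start, end + 1));
  }
  return undefined;
}
```

Arrays and primitives are rejected at every stage because `tryParseObject` accepts only non-null, non-array objects, which is what the "should ignore arrays" and "should ignore primitive JSON" cases expect.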
@@ -23,7 +23,7 @@ vi.mock('../core/piece/evaluation/index.js', () => ({
 vi.mock('../core/piece/phase-runner.js', () => ({
   needsStatusJudgmentPhase: vi.fn().mockReturnValue(false),
   runReportPhase: vi.fn().mockResolvedValue(undefined),
-  runStatusJudgmentPhase: vi.fn().mockResolvedValue(''),
+  runStatusJudgmentPhase: vi.fn().mockResolvedValue({ tag: '', ruleIndex: 0, method: 'auto_select' }),
 }));
 
 vi.mock('../shared/utils/index.js', async (importOriginal) => ({
@@ -85,7 +85,8 @@ export async function evaluateCondition(
 }
 
 export async function judgeStatus(
-  instruction: string,
+  structuredInstruction: string,
+  tagInstruction: string,
   rules: PieceRule[],
   options: JudgeStatusOptions,
 ): Promise<JudgeStatusResult> {
@@ -94,48 +95,47 @@ export async function judgeStatus(
   }
 
   if (rules.length === 1) {
-    return {
-      ruleIndex: 0,
-      method: 'auto_select',
-    };
+    return { ruleIndex: 0, method: 'auto_select' };
   }
 
-  const response = await runAgent('conductor', instruction, {
+  const agentOptions = {
     cwd: options.cwd,
     maxTurns: 3,
-    permissionMode: 'readonly',
+    permissionMode: 'readonly' as const,
     language: options.language,
+  };
+
+  // Stage 1: Structured output
+  const structuredResponse = await runAgent('conductor', structuredInstruction, {
+    ...agentOptions,
     outputSchema: loadJudgmentSchema(),
   });
 
-  if (response.status === 'done') {
-    const stepNumber = response.structuredOutput?.step;
+  if (structuredResponse.status === 'done') {
+    const stepNumber = structuredResponse.structuredOutput?.step;
     if (typeof stepNumber === 'number' && Number.isInteger(stepNumber)) {
       const ruleIndex = stepNumber - 1;
       if (ruleIndex >= 0 && ruleIndex < rules.length) {
-        return {
-          ruleIndex,
-          method: 'structured_output',
-        };
+        return { ruleIndex, method: 'structured_output' };
+      }
     }
   }
 
-    const tagRuleIndex = detectRuleIndex(response.content, options.movementName);
+  // Stage 2: Tag detection (dedicated call, no outputSchema)
+  const tagResponse = await runAgent('conductor', tagInstruction, agentOptions);
+
+  if (tagResponse.status === 'done') {
+    const tagRuleIndex = detectRuleIndex(tagResponse.content, options.movementName);
     if (tagRuleIndex >= 0 && tagRuleIndex < rules.length) {
-      return {
-        ruleIndex: tagRuleIndex,
-        method: 'phase3_tag',
-      };
+      return { ruleIndex: tagRuleIndex, method: 'phase3_tag' };
     }
   }
 
+  // Stage 3: AI judge
   const conditions = rules.map((rule, index) => ({ index, text: rule.condition }));
-  const fallbackIndex = await evaluateCondition(instruction, conditions, { cwd: options.cwd });
+  const fallbackIndex = await evaluateCondition(structuredInstruction, conditions, { cwd: options.cwd });
   if (fallbackIndex >= 0 && fallbackIndex < rules.length) {
-    return {
-      ruleIndex: fallbackIndex,
-      method: 'ai_judge',
-    };
+    return { ruleIndex: fallbackIndex, method: 'ai_judge' };
   }
 
   throw new Error(`Status not found for movement "${options.movementName}"`);
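The `judgeStatus` hunk above tries three stages in order and returns the first one that yields an in-range rule index. As a rough illustration of that control flow only, the sketch below replaces the agent calls with injected stage functions. `judgeWithFallback`, `Stage`, `Judgment`, and `Method` are hypothetical names, and a stage that cannot decide is assumed to return -1.

```typescript
// Hypothetical names throughout; only the stage ordering mirrors the diff above.
type Method = 'auto_select' | 'structured_output' | 'phase3_tag' | 'ai_judge';

interface Judgment {
  ruleIndex: number;
  method: Method;
}

/** A stage resolves to a 0-based rule index, or -1 when it cannot decide (assumed convention). */
interface Stage {
  method: Method;
  run: () => Promise<number>;
}

async function judgeWithFallback(ruleCount: number, stages: Stage[]): Promise<Judgment> {
  // A single rule needs no agent call at all.
  if (ruleCount === 1) return { ruleIndex: 0, method: 'auto_select' };

  for (const stage of stages) {
    const index = await stage.run();
    // Accept only integer indices within the rule list; otherwise try the next stage.
    if (Number.isInteger(index) && index >= 0 && index < ruleCount) {
      return { ruleIndex: index, method: stage.method };
    }
  }
  throw new Error('Status not found');
}
```

Later stages run only when earlier ones fail, so a stage that costs an extra agent call (the tag instruction in the diff) is never invoked if structured output already produced a valid index.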
@@ -220,13 +220,18 @@ export class MovementExecutor {
       }
     }
 
-    // Phase 3: status judgment (resume session, no tools, output status tag)
-    let tagContent = '';
-    if (needsStatusJudgmentPhase(step)) {
-      tagContent = await runStatusJudgmentPhase(step, phaseCtx);
-    }
+    // Phase 3: status judgment (new session, no tools, determines matched rule)
+    const phase3Result = needsStatusJudgmentPhase(step)
+      ? await runStatusJudgmentPhase(step, phaseCtx)
+      : undefined;
 
-    const match = await detectMatchedRule(step, response.content, tagContent, {
+    if (phase3Result) {
+      // Phase 3 already determined the matched rule — use its result directly
+      log.debug('Rule matched (Phase 3)', { movement: step.name, ruleIndex: phase3Result.ruleIndex, method: phase3Result.method });
+      response = { ...response, matchedRuleIndex: phase3Result.ruleIndex, matchedRuleMethod: phase3Result.method };
+    } else {
+      // No Phase 3 — use rule evaluator with Phase 1 content
+      const match = await detectMatchedRule(step, response.content, '', {
         state,
         cwd: this.deps.getCwd(),
         interactive: this.deps.getInteractive(),
@@ -237,6 +242,7 @@ export class MovementExecutor {
         log.debug('Rule matched', { movement: step.name, ruleIndex: match.index, method: match.method });
         response = { ...response, matchedRuleIndex: match.index, matchedRuleMethod: match.method };
       }
+    }
 
     state.movementOutputs.set(step.name, response);
     state.lastOutput = response;
@@ -114,15 +114,19 @@ export class ParallelRunner {
         }
 
         // Phase 3: status judgment for sub-movement
-        let subTagContent = '';
-        if (needsStatusJudgmentPhase(subMovement)) {
-          subTagContent = await runStatusJudgmentPhase(subMovement, phaseCtx);
-        }
+        const subPhase3 = needsStatusJudgmentPhase(subMovement)
+          ? await runStatusJudgmentPhase(subMovement, phaseCtx)
+          : undefined;
 
-        const match = await detectMatchedRule(subMovement, subResponse.content, subTagContent, ruleCtx);
-        const finalResponse = match
+        let finalResponse: AgentResponse;
+        if (subPhase3) {
+          finalResponse = { ...subResponse, matchedRuleIndex: subPhase3.ruleIndex, matchedRuleMethod: subPhase3.method };
+        } else {
+          const match = await detectMatchedRule(subMovement, subResponse.content, '', ruleCtx);
+          finalResponse = match
             ? { ...subResponse, matchedRuleIndex: match.index, matchedRuleMethod: match.method }
             : subResponse;
+        }
 
         state.movementOutputs.set(subMovement.name, finalResponse);
         this.deps.movementExecutor.emitMovementReports(subMovement);
@@ -27,8 +27,8 @@ export interface StatusJudgmentContext {
   lastResponse?: string;
   /** Input source type for fallback strategies */
   inputSource?: 'report' | 'response';
-  /** Structured output mode omits tag-format instructions */
-  useStructuredOutput?: boolean;
+  /** When true, omit tag output instructions (structured output schema handles format) */
+  structuredOutput?: boolean;
 }
 
 /**
@@ -66,14 +66,17 @@ export class StatusJudgmentBuilder {
       contentToJudge = this.buildFromResponse();
     }
 
+    const isStructured = this.context.structuredOutput ?? false;
+
     return loadTemplate('perform_phase3_message', language, {
       reportContent: contentToJudge,
       criteriaTable: components.criteriaTable,
-      outputList: this.context.useStructuredOutput
-        ? ''
-        : components.outputList,
+      structuredOutput: isStructured,
+      ...(isStructured ? {} : {
+        outputList: components.outputList,
         hasAppendix: components.hasAppendix,
         appendixContent: components.appendixContent,
+      }),
     });
   }
 
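The builder change above uses a conditional object spread so that the tag-related template variables are omitted entirely in structured mode, rather than passed as empty strings. A small sketch of that pattern follows; `buildTemplateVars` and `TagComponents` are hypothetical names, and the variable shapes are assumptions rather than the project's real types.

```typescript
// Hypothetical shapes; only the conditional-spread pattern is taken from the diff above.
interface TagComponents {
  outputList: string;
  hasAppendix: boolean;
  appendixContent: string;
}

function buildTemplateVars(
  reportContent: string,
  criteriaTable: string,
  components: TagComponents,
  structuredOutput = false,
): Record<string, string | boolean> {
  return {
    reportContent,
    criteriaTable,
    structuredOutput,
    // In structured mode the tag-output variables are left out of the object entirely.
    ...(structuredOutput
      ? {}
      : {
          outputList: components.outputList,
          hasAppendix: components.hasAppendix,
          appendixContent: components.appendixContent,
        }),
  };
}
```

Omitting the keys, instead of setting them to `''`, lets a template distinguish "not applicable" from "empty", assuming the template engine treats missing variables as falsy.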
@@ -15,7 +15,7 @@ import { hasTagBasedRules, getReportFiles } from './evaluation/rule-utils.js';
 import { executeAgent } from './agent-usecases.js';
 import { createLogger } from '../../shared/utils/index.js';
 import { buildSessionKey } from './session-key.js';
-export { runStatusJudgmentPhase } from './status-judgment-phase.js';
+export { runStatusJudgmentPhase, type StatusJudgmentPhaseResult } from './status-judgment-phase.js';
 
 const log = createLogger('phase-runner');
 
@ -1,75 +1,98 @@
|
|||||||
import { existsSync, readFileSync } from 'node:fs';
|
import { existsSync, readFileSync } from 'node:fs';
|
||||||
import { resolve } from 'node:path';
|
import { resolve } from 'node:path';
|
||||||
import type { PieceMovement } from '../models/types.js';
|
import type { PieceMovement, RuleMatchMethod } from '../models/types.js';
|
||||||
import { judgeStatus } from './agent-usecases.js';
|
import { judgeStatus } from './agent-usecases.js';
|
||||||
import { StatusJudgmentBuilder } from './instruction/StatusJudgmentBuilder.js';
|
import { StatusJudgmentBuilder, type StatusJudgmentContext } from './instruction/StatusJudgmentBuilder.js';
|
||||||
import { getReportFiles } from './evaluation/rule-utils.js';
|
import { getReportFiles } from './evaluation/rule-utils.js';
|
||||||
import { createLogger } from '../../shared/utils/index.js';
|
import { createLogger } from '../../shared/utils/index.js';
|
||||||
import type { PhaseRunnerContext } from './phase-runner.js';
|
import type { PhaseRunnerContext } from './phase-runner.js';
|
||||||
|
|
||||||
const log = createLogger('phase-runner');
|
const log = createLogger('phase-runner');
|
||||||
|
|
||||||
|
/** Result of Phase 3 status judgment, including the detection method. */
|
||||||
|
export interface StatusJudgmentPhaseResult {
|
||||||
|
tag: string;
|
||||||
|
ruleIndex: number;
|
||||||
|
method: RuleMatchMethod;
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Phase 3: Status judgment.
|
* Build the base context (shared by structured output and tag instructions).
|
||||||
* Uses the 'conductor' agent in a new session to output a status tag.
|
|
||||||
* Implements multi-stage fallback logic to ensure judgment succeeds.
|
|
||||||
* Returns the Phase 3 response content (containing the status tag).
|
|
||||||
*/
|
*/
|
||||||
export async function runStatusJudgmentPhase(
|
function buildBaseContext(
|
||||||
step: PieceMovement,
|
step: PieceMovement,
|
||||||
ctx: PhaseRunnerContext,
|
ctx: PhaseRunnerContext,
|
||||||
): Promise<string> {
|
): Omit<StatusJudgmentContext, 'structuredOutput'> | undefined {
|
||||||
log.debug('Running status judgment phase', { movement: step.name });
|
|
||||||
if (!step.rules || step.rules.length === 0) {
|
|
||||||
throw new Error(`Status judgment requires rules for movement "${step.name}"`);
|
|
||||||
}
|
|
||||||
|
|
||||||
const reportFiles = getReportFiles(step.outputContracts);
|
const reportFiles = getReportFiles(step.outputContracts);
|
||||||
let instruction: string | undefined;
|
|
||||||
|
|
||||||
if (reportFiles.length > 0) {
|
if (reportFiles.length > 0) {
|
||||||
const reports: string[] = [];
|
const reports: string[] = [];
|
||||||
for (const fileName of reportFiles) {
|
for (const fileName of reportFiles) {
|
||||||
const filePath = resolve(ctx.reportDir, fileName);
|
const filePath = resolve(ctx.reportDir, fileName);
|
||||||
if (!existsSync(filePath)) {
|
if (!existsSync(filePath)) continue;
|
||||||
continue;
|
|
||||||
}
|
|
||||||
const content = readFileSync(filePath, 'utf-8');
|
const content = readFileSync(filePath, 'utf-8');
|
||||||
reports.push(`# ${fileName}\n\n${content}`);
|
reports.push(`# ${fileName}\n\n${content}`);
|
||||||
}
|
}
|
||||||
if (reports.length > 0) {
|
if (reports.length > 0) {
|
||||||
instruction = new StatusJudgmentBuilder(step, {
|
return {
|
||||||
language: ctx.language,
|
language: ctx.language,
|
||||||
reportContent: reports.join('\n\n---\n\n'),
|
reportContent: reports.join('\n\n---\n\n'),
|
||||||
inputSource: 'report',
|
inputSource: 'report',
|
||||||
useStructuredOutput: true,
|
};
|
||||||
}).build();
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (instruction == null) {
|
if (!ctx.lastResponse) return undefined;
|
||||||
if (!ctx.lastResponse) {
|
|
||||||
throw new Error(`Status judgment requires report or lastResponse for movement "${step.name}"`);
|
|
||||||
}
|
|
||||||
|
|
||||||
instruction = new StatusJudgmentBuilder(step, {
|
return {
|
||||||
language: ctx.language,
|
language: ctx.language,
|
||||||
lastResponse: ctx.lastResponse,
|
lastResponse: ctx.lastResponse,
|
||||||
inputSource: 'response',
|
inputSource: 'response',
|
||||||
useStructuredOutput: true,
|
};
|
||||||
}).build();
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Phase 3: Status judgment.
|
||||||
|
*
|
||||||
|
* Builds two instructions from the same context:
|
||||||
|
* - Structured output instruction (JSON schema)
|
||||||
|
* - Tag instruction (free-form tag detection)
|
||||||
|
*
|
||||||
|
* `judgeStatus()` tries them in order: structured → tag → ai_judge.
|
||||||
|
*/
|
||||||
|
export async function runStatusJudgmentPhase(
|
||||||
|
step: PieceMovement,
|
||||||
|
ctx: PhaseRunnerContext,
|
||||||
|
): Promise<StatusJudgmentPhaseResult> {
|
||||||
|
log.debug('Running status judgment phase', { movement: step.name });
|
||||||
|
if (!step.rules || step.rules.length === 0) {
|
||||||
|
throw new Error(`Status judgment requires rules for movement "${step.name}"`);
|
||||||
}
|
}
|
||||||
|
|
||||||
ctx.onPhaseStart?.(step, 3, 'judge', instruction);
|
const baseContext = buildBaseContext(step, ctx);
|
||||||
|
if (!baseContext) {
|
||||||
|
throw new Error(`Status judgment requires report or lastResponse for movement "${step.name}"`);
|
||||||
|
}
|
||||||
|
|
||||||
|
const structuredInstruction = new StatusJudgmentBuilder(step, {
|
||||||
|
...baseContext,
|
||||||
|
structuredOutput: true,
|
||||||
|
}).build();
|
||||||
|
|
||||||
|
const tagInstruction = new StatusJudgmentBuilder(step, {
|
||||||
|
...baseContext,
|
||||||
|
}).build();
|
||||||
|
|
||||||
|
ctx.onPhaseStart?.(step, 3, 'judge', structuredInstruction);
|
||||||
try {
|
try {
|
||||||
const result = await judgeStatus(instruction, step.rules, {
|
const result = await judgeStatus(structuredInstruction, tagInstruction, step.rules, {
|
||||||
cwd: ctx.cwd,
|
cwd: ctx.cwd,
|
||||||
movementName: step.name,
|
movementName: step.name,
|
||||||
language: ctx.language,
|
language: ctx.language,
|
||||||
});
|
});
|
||||||
const tag = `[${step.name.toUpperCase()}:${result.ruleIndex + 1}]`;
|
const tag = `[${step.name.toUpperCase()}:${result.ruleIndex + 1}]`;
|
||||||
ctx.onPhaseComplete?.(step, 3, 'judge', tag, 'done');
|
ctx.onPhaseComplete?.(step, 3, 'judge', tag, 'done');
|
||||||
return tag;
|
return { tag, ruleIndex: result.ruleIndex, method: result.method };
|
||||||
} catch (error) {
|
} catch (error) {
|
||||||
const errorMsg = error instanceof Error ? error.message : String(error);
|
const errorMsg = error instanceof Error ? error.message : String(error);
|
||||||
ctx.onPhaseComplete?.(step, 3, 'judge', '', 'error', errorMsg);
|
ctx.onPhaseComplete?.(step, 3, 'judge', '', 'error', errorMsg);
|
||||||
|
|||||||
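The doc comment above says `judgeStatus()` tries structured output first, then tag detection, then `ai_judge`. A minimal sketch of that fallback order follows. The `Strategy` shape, the method labels, and `judgeWithFallback` are hypothetical names used for illustration, not the real `judgeStatus()` signature, and the strategies are synchronous here only for brevity.

```typescript
// Illustrative sketch only: the method labels and the Strategy shape are assumptions,
// not the real judgeStatus() signature.
type JudgeMethod = 'structured_output' | 'tag' | 'ai_judge';

interface JudgeResult {
  ruleIndex: number; // 0-based index into the movement's rules
  method: JudgeMethod; // which strategy produced the match
}

interface Strategy {
  method: JudgeMethod;
  // Returns a 0-based rule index, or undefined when the strategy cannot decide.
  run: () => number | undefined;
}

// Try each strategy in order and stop at the first one that yields a valid index.
function judgeWithFallback(strategies: Strategy[], ruleCount: number): JudgeResult {
  for (const { method, run } of strategies) {
    const ruleIndex = run();
    if (
      ruleIndex !== undefined &&
      Number.isInteger(ruleIndex) &&
      ruleIndex >= 0 &&
      ruleIndex < ruleCount
    ) {
      return { ruleIndex, method };
    }
  }
  throw new Error('No strategy produced a valid rule index');
}

// Same tag format as in the diff: upper-cased movement name plus a 1-based rule number.
function formatTag(movementName: string, ruleIndex: number): string {
  return `[${movementName.toUpperCase()}:${ruleIndex + 1}]`;
}
```

Returning the method alongside the index, as the new `{ tag, ruleIndex, method }` return value does, lets callers see which path actually decided the status.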
@ -4,9 +4,9 @@
|
|||||||
* Uses @openai/codex-sdk for native TypeScript integration.
|
* Uses @openai/codex-sdk for native TypeScript integration.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
import { Codex } from '@openai/codex-sdk';
|
import { Codex, type TurnOptions } from '@openai/codex-sdk';
|
||||||
import type { AgentResponse } from '../../core/models/index.js';
|
import type { AgentResponse } from '../../core/models/index.js';
|
||||||
import { createLogger, getErrorMessage, createStreamDiagnostics, type StreamDiagnostics } from '../../shared/utils/index.js';
|
import { createLogger, getErrorMessage, createStreamDiagnostics, parseStructuredOutput, type StreamDiagnostics } from '../../shared/utils/index.js';
|
||||||
import { mapToCodexSandboxMode, type CodexCallOptions } from './types.js';
|
import { mapToCodexSandboxMode, type CodexCallOptions } from './types.js';
|
||||||
import {
|
import {
|
||||||
type CodexEvent,
|
type CodexEvent,
|
||||||
@ -150,20 +150,15 @@ export class CodexClient {
|
|||||||
const diag = createStreamDiagnostics('codex-sdk', { agentType, model: options.model, attempt });
|
const diag = createStreamDiagnostics('codex-sdk', { agentType, model: options.model, attempt });
|
||||||
diagRef = diag;
|
diagRef = diag;
|
||||||
|
|
||||||
const runOptions: Record<string, unknown> = {
|
const turnOptions: TurnOptions = {
|
||||||
signal: streamAbortController.signal,
|
signal: streamAbortController.signal,
|
||||||
|
...(options.outputSchema ? { outputSchema: options.outputSchema } : {}),
|
||||||
};
|
};
|
||||||
if (options.outputSchema) {
|
const { events } = await thread.runStreamed(fullPrompt, turnOptions);
|
||||||
runOptions.outputSchema = options.outputSchema;
|
|
||||||
}
|
|
||||||
// Codex SDK types do not yet expose outputSchema even though runtime accepts it.
|
|
||||||
const runStreamedOptions = runOptions as unknown as Parameters<typeof thread.runStreamed>[1];
|
|
||||||
const { events } = await thread.runStreamed(fullPrompt, runStreamedOptions);
|
|
||||||
resetIdleTimeout();
|
resetIdleTimeout();
|
||||||
diag.onConnected();
|
diag.onConnected();
|
||||||
|
|
||||||
let content = '';
|
let content = '';
|
||||||
let structuredOutput: Record<string, unknown> | undefined;
|
|
||||||
const contentOffsets = new Map<string, number>();
|
const contentOffsets = new Map<string, number>();
|
||||||
let success = true;
|
let success = true;
|
||||||
let failureMessage = '';
|
let failureMessage = '';
|
||||||
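The hunk above replaces the untyped `runOptions` record and its cast with a typed `TurnOptions` object that adds `outputSchema` through a conditional spread. A small sketch of that pattern follows; `SketchTurnOptions` and `buildTurnOptions` are hypothetical stand-ins so the snippet runs without `@openai/codex-sdk`.

```typescript
// Hypothetical stand-in for the SDK's TurnOptions type (assumption for illustration).
interface SketchTurnOptions {
  signal: { aborted: boolean };
  outputSchema?: Record<string, unknown>;
}

function buildTurnOptions(
  signal: { aborted: boolean },
  outputSchema?: Record<string, unknown>,
): SketchTurnOptions {
  return {
    signal,
    // Spreading an empty object adds no key at all, so `outputSchema`
    // is absent from the result instead of being present as `undefined`.
    ...(outputSchema ? { outputSchema } : {}),
  };
}

const withoutSchema = buildTurnOptions({ aborted: false });
const withSchema = buildTurnOptions({ aborted: false }, { type: 'object' });
console.log('outputSchema' in withoutSchema, 'outputSchema' in withSchema);
```

Keeping the key absent matters when the receiving code distinguishes a missing option from one explicitly set to `undefined`.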
@ -196,20 +191,6 @@ export class CodexClient {
|
|||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (event.type === 'turn.completed') {
|
|
||||||
const rawFinalResponse = (event as unknown as {
|
|
||||||
turn?: { finalResponse?: unknown };
|
|
||||||
}).turn?.finalResponse;
|
|
||||||
if (
|
|
||||||
rawFinalResponse
|
|
||||||
&& typeof rawFinalResponse === 'object'
|
|
||||||
&& !Array.isArray(rawFinalResponse)
|
|
||||||
) {
|
|
||||||
structuredOutput = rawFinalResponse as Record<string, unknown>;
|
|
||||||
}
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
if (event.type === 'item.started') {
|
if (event.type === 'item.started') {
|
||||||
const item = event.item as CodexItem | undefined;
|
const item = event.item as CodexItem | undefined;
|
||||||
if (item) {
|
if (item) {
|
||||||
@ -291,6 +272,7 @@ export class CodexClient {
|
|||||||
}
|
}
|
||||||
|
|
||||||
const trimmed = content.trim();
|
const trimmed = content.trim();
|
||||||
|
const structuredOutput = parseStructuredOutput(trimmed, !!options.outputSchema);
|
||||||
emitResult(options.onStream, true, trimmed, currentThreadId);
|
emitResult(options.onStream, true, trimmed, currentThreadId);
|
||||||
|
|
||||||
return {
|
return {
|
||||||
|
|||||||
@ -8,7 +8,7 @@
|
|||||||
import { createOpencode } from '@opencode-ai/sdk/v2';
|
import { createOpencode } from '@opencode-ai/sdk/v2';
|
||||||
import { createServer } from 'node:net';
|
import { createServer } from 'node:net';
|
||||||
import type { AgentResponse } from '../../core/models/index.js';
|
import type { AgentResponse } from '../../core/models/index.js';
|
||||||
import { createLogger, getErrorMessage, createStreamDiagnostics, type StreamDiagnostics } from '../../shared/utils/index.js';
|
import { createLogger, getErrorMessage, createStreamDiagnostics, parseStructuredOutput, type StreamDiagnostics } from '../../shared/utils/index.js';
|
||||||
import { parseProviderModel } from '../../shared/utils/providerModel.js';
|
import { parseProviderModel } from '../../shared/utils/providerModel.js';
|
||||||
import {
|
import {
|
||||||
buildOpenCodePermissionConfig,
|
buildOpenCodePermissionConfig,
|
||||||
@ -236,16 +236,34 @@ export class OpenCodeClient {
|
|||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/** Build a prompt suffix that instructs the agent to return JSON matching the schema */
|
||||||
|
private buildStructuredOutputSuffix(schema: Record<string, unknown>): string {
|
||||||
|
return [
|
||||||
|
'',
|
||||||
|
'---',
|
||||||
|
'IMPORTANT: You MUST respond with ONLY a valid JSON object matching this schema. No other text, no markdown code blocks, no explanation.',
|
||||||
|
'```',
|
||||||
|
JSON.stringify(schema, null, 2),
|
||||||
|
'```',
|
||||||
|
].join('\n');
|
||||||
|
}
|
||||||
|
|
||||||
/** Call OpenCode with an agent prompt */
|
/** Call OpenCode with an agent prompt */
|
||||||
async call(
|
async call(
|
||||||
agentType: string,
|
agentType: string,
|
||||||
prompt: string,
|
prompt: string,
|
||||||
options: OpenCodeCallOptions,
|
options: OpenCodeCallOptions,
|
||||||
): Promise<AgentResponse> {
|
): Promise<AgentResponse> {
|
||||||
const fullPrompt = options.systemPrompt
|
const basePrompt = options.systemPrompt
|
||||||
? `${options.systemPrompt}\n\n${prompt}`
|
? `${options.systemPrompt}\n\n${prompt}`
|
||||||
: prompt;
|
: prompt;
|
||||||
|
|
||||||
|
// OpenCode SDK does not natively support structured output via outputFormat.
|
||||||
|
// Inject JSON output instructions into the prompt to make the agent return JSON.
|
||||||
|
const fullPrompt = options.outputSchema
|
||||||
|
? `${basePrompt}${this.buildStructuredOutputSuffix(options.outputSchema)}`
|
||||||
|
: basePrompt;
|
||||||
|
|
||||||
for (let attempt = 1; attempt <= OPENCODE_RETRY_MAX_ATTEMPTS; attempt++) {
|
for (let attempt = 1; attempt <= OPENCODE_RETRY_MAX_ATTEMPTS; attempt++) {
|
||||||
let idleTimeoutId: ReturnType<typeof setTimeout> | undefined;
|
let idleTimeoutId: ReturnType<typeof setTimeout> | undefined;
|
||||||
const streamAbortController = new AbortController();
|
const streamAbortController = new AbortController();
|
||||||
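Because the OpenCode SDK path has no native schema constraint, the hunk above appends the schema to the prompt instead. The sketch below restates that suffix builder as a standalone function and shows how the final prompt is assembled. `buildPrompt` and `exampleSchema` are hypothetical names; the schema fields mirror the `step` / `reason` shape mentioned in this commit.

```typescript
// Build the fence marker programmatically so this snippet nests safely in documentation.
const FENCE = '`'.repeat(3);

// Standalone restatement of the suffix builder from the diff.
function buildStructuredOutputSuffix(schema: Record<string, unknown>): string {
  return [
    '',
    '---',
    'IMPORTANT: You MUST respond with ONLY a valid JSON object matching this schema. No other text, no markdown code blocks, no explanation.',
    FENCE,
    JSON.stringify(schema, null, 2),
    FENCE,
  ].join('\n');
}

// Hypothetical helper mirroring how call() combines system prompt, prompt, and suffix.
function buildPrompt(
  prompt: string,
  systemPrompt?: string,
  outputSchema?: Record<string, unknown>,
): string {
  const basePrompt = systemPrompt ? `${systemPrompt}\n\n${prompt}` : prompt;
  return outputSchema
    ? `${basePrompt}${buildStructuredOutputSuffix(outputSchema)}`
    : basePrompt;
}

// Example schema (assumed shape, for illustration only).
const exampleSchema: Record<string, unknown> = {
  type: 'object',
  properties: { step: { type: 'integer' }, reason: { type: 'string' } },
  required: ['step', 'reason'],
  additionalProperties: false,
};

console.log(buildPrompt('Judge the report.', 'You are a reviewer.', exampleSchema));
```

Prompt-level instructions are only a request, so the response still has to be parsed defensively, which is what `parseStructuredOutput` is for.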
@ -580,17 +598,7 @@ export class OpenCodeClient {
|
|||||||
}
|
}
|
||||||
|
|
||||||
const trimmed = content.trim();
|
const trimmed = content.trim();
|
||||||
let structuredOutput: Record<string, unknown> | undefined;
|
const structuredOutput = parseStructuredOutput(trimmed, !!options.outputSchema);
|
||||||
if (options.outputSchema && trimmed.startsWith('{')) {
|
|
||||||
try {
|
|
||||||
const parsed = JSON.parse(trimmed) as unknown;
|
|
||||||
if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
|
|
||||||
structuredOutput = parsed as Record<string, unknown>;
|
|
||||||
}
|
|
||||||
} catch {
|
|
||||||
// Non-JSON response falls back to text path.
|
|
||||||
}
|
|
||||||
}
|
|
||||||
emitResult(options.onStream, true, trimmed, sessionId);
|
emitResult(options.onStream, true, trimmed, sessionId);
|
||||||
|
|
||||||
return {
|
return {
|
||||||
|
|||||||
@ -1,10 +1,14 @@
|
|||||||
<!--
|
<!--
|
||||||
template: perform_phase3_message
|
template: perform_phase3_message
|
||||||
phase: 3 (status judgment)
|
phase: 3 (status judgment)
|
||||||
vars: reportContent, criteriaTable, outputList, hasAppendix, appendixContent
|
vars: reportContent, criteriaTable, outputList, hasAppendix, appendixContent, structuredOutput
|
||||||
builder: StatusJudgmentBuilder
|
builder: StatusJudgmentBuilder
|
||||||
-->
|
-->
|
||||||
|
{{#if structuredOutput}}
|
||||||
|
**Review is already complete. Evaluate the report below and determine which numbered rule (1-based) best matches the result.**
|
||||||
|
{{else}}
|
||||||
**Review is already complete. Output exactly one tag corresponding to the judgment result shown in the report below.**
|
**Review is already complete. Output exactly one tag corresponding to the judgment result shown in the report below.**
|
||||||
|
{{/if}}
|
||||||
|
|
||||||
{{reportContent}}
|
{{reportContent}}
|
||||||
|
|
||||||
@ -12,12 +16,21 @@
|
|||||||
|
|
||||||
{{criteriaTable}}
|
{{criteriaTable}}
|
||||||
|
|
||||||
|
{{#if structuredOutput}}
|
||||||
|
|
||||||
|
## Task
|
||||||
|
|
||||||
|
Evaluate the report against the criteria above. Return the matched rule number (1-based integer) and a brief reason for your decision.
|
||||||
|
{{else}}
|
||||||
|
|
||||||
## Output Format
|
## Output Format
|
||||||
|
|
||||||
**Output the tag corresponding to the judgment shown in the report in one line:**
|
**Output the tag corresponding to the judgment shown in the report in one line:**
|
||||||
|
|
||||||
{{outputList}}
|
{{outputList}}
|
||||||
|
{{/if}}
|
||||||
{{#if hasAppendix}}
|
{{#if hasAppendix}}
|
||||||
|
|
||||||
### Appendix Template
|
### Appendix Template
|
||||||
{{appendixContent}}{{/if}}
|
{{appendixContent}}
|
||||||
|
{{/if}}
|
||||||
|
|||||||
@ -1,10 +1,14 @@
|
|||||||
<!--
|
<!--
|
||||||
template: perform_phase3_message
|
template: perform_phase3_message
|
||||||
phase: 3 (status judgment)
|
phase: 3 (status judgment)
|
||||||
vars: reportContent, criteriaTable, outputList, hasAppendix, appendixContent
|
vars: reportContent, criteriaTable, outputList, hasAppendix, appendixContent, structuredOutput
|
||||||
builder: StatusJudgmentBuilder
|
builder: StatusJudgmentBuilder
|
||||||
-->
|
-->
|
||||||
|
{{#if structuredOutput}}
|
||||||
|
**既にレビューは完了しています。以下のレポートを評価し、どの番号のルール(1始まり)が結果に最も合致するか判定してください。**
|
||||||
|
{{else}}
|
||||||
**既にレビューは完了しています。以下のレポートで示された判定結果に対応するタグを1つだけ出力してください。**
|
**既にレビューは完了しています。以下のレポートで示された判定結果に対応するタグを1つだけ出力してください。**
|
||||||
|
{{/if}}
|
||||||
|
|
||||||
{{reportContent}}
|
{{reportContent}}
|
||||||
|
|
||||||
@ -12,12 +16,21 @@
|
|||||||
|
|
||||||
{{criteriaTable}}
|
{{criteriaTable}}
|
||||||
|
|
||||||
|
{{#if structuredOutput}}
|
||||||
|
|
||||||
|
## タスク
|
||||||
|
|
||||||
|
上記の判定基準に照らしてレポートを評価してください。合致するルール番号(1始まりの整数)と簡潔な理由を返してください。
|
||||||
|
{{else}}
|
||||||
|
|
||||||
## 出力フォーマット
|
## 出力フォーマット
|
||||||
|
|
||||||
**レポートで示した判定に対応するタグを1行で出力してください:**
|
**レポートで示した判定に対応するタグを1行で出力してください:**
|
||||||
|
|
||||||
{{outputList}}
|
{{outputList}}
|
||||||
|
{{/if}}
|
||||||
{{#if hasAppendix}}
|
{{#if hasAppendix}}
|
||||||
|
|
||||||
### 追加出力テンプレート
|
### 追加出力テンプレート
|
||||||
{{appendixContent}}{{/if}}
|
{{appendixContent}}
|
||||||
|
{{/if}}
|
||||||
|
|||||||
@ -11,6 +11,7 @@ export * from './slackWebhook.js';
|
|||||||
export * from './sleep.js';
|
export * from './sleep.js';
|
||||||
export * from './slug.js';
|
export * from './slug.js';
|
||||||
export * from './streamDiagnostics.js';
|
export * from './streamDiagnostics.js';
|
||||||
|
export * from './structuredOutput.js';
|
||||||
export * from './taskPaths.js';
|
export * from './taskPaths.js';
|
||||||
export * from './text.js';
|
export * from './text.js';
|
||||||
export * from './types.js';
|
export * from './types.js';
|
||||||
|
|||||||
56
src/shared/utils/structuredOutput.ts
Normal file
@ -0,0 +1,56 @@
/**
 * Parse structured output from provider text response.
 *
 * Codex and OpenCode return structured output as JSON text in agent messages.
 * This function extracts a JSON object from the text when outputSchema was requested.
 *
 * Extraction strategies (in order):
 * 1. Direct JSON parse — text is pure JSON starting with `{`
 * 2. Code block extraction — JSON inside ```json ... ``` or ``` ... ```
 * 3. Brace extraction — find outermost `{` ... `}` in the text
 */

function tryParseJsonObject(text: string): Record<string, unknown> | undefined {
  try {
    const parsed = JSON.parse(text) as unknown;
    if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
  } catch {
    // Not valid JSON
  }
  return undefined;
}

export function parseStructuredOutput(
  text: string,
  hasOutputSchema: boolean,
): Record<string, unknown> | undefined {
  if (!hasOutputSchema || !text) return undefined;

  const trimmed = text.trim();

  // Strategy 1: Direct JSON parse (text is pure JSON)
  if (trimmed.startsWith('{')) {
    const result = tryParseJsonObject(trimmed);
    if (result) return result;
  }

  // Strategy 2: Extract from markdown code block (```json\n{...}\n```)
  const codeBlockMatch = trimmed.match(/```(?:json)?\s*\n(\{[\s\S]*?\})\s*\n```/);
  if (codeBlockMatch?.[1]) {
    const result = tryParseJsonObject(codeBlockMatch[1].trim());
    if (result) return result;
  }

  // Strategy 3: Find first `{` and last `}` (handles preamble/postamble text)
  const firstBrace = trimmed.indexOf('{');
  const lastBrace = trimmed.lastIndexOf('}');
  if (firstBrace >= 0 && lastBrace > firstBrace) {
    const candidate = trimmed.slice(firstBrace, lastBrace + 1);
    const result = tryParseJsonObject(candidate);
    if (result) return result;
  }

  return undefined;
}
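The three extraction strategies in `parseStructuredOutput` are easiest to follow with concrete inputs. The sketch below restates the function compactly so it runs on its own (the name `parseStructuredOutputSketch` and the sample strings are made up for illustration), then feeds it one input per strategy.

```typescript
// Returns the parsed value only when it is a plain JSON object.
function tryParseObject(text: string): Record<string, unknown> | undefined {
  try {
    const parsed = JSON.parse(text) as unknown;
    if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
  } catch {
    // Not valid JSON: fall through to undefined.
  }
  return undefined;
}

// Compact restatement of the three strategies from the diff.
function parseStructuredOutputSketch(
  text: string,
  hasOutputSchema: boolean,
): Record<string, unknown> | undefined {
  if (!hasOutputSchema || !text) return undefined;
  const trimmed = text.trim();

  // Strategy 1: the whole text is a JSON object.
  if (trimmed.startsWith('{')) {
    const direct = tryParseObject(trimmed);
    if (direct) return direct;
  }

  // Strategy 2: a JSON object inside a markdown code block.
  const block = trimmed.match(/```(?:json)?\s*\n(\{[\s\S]*?\})\s*\n```/);
  if (block?.[1]) {
    const fromBlock = tryParseObject(block[1].trim());
    if (fromBlock) return fromBlock;
  }

  // Strategy 3: everything between the first `{` and the last `}`.
  const first = trimmed.indexOf('{');
  const last = trimmed.lastIndexOf('}');
  if (first >= 0 && last > first) {
    return tryParseObject(trimmed.slice(first, last + 1));
  }
  return undefined;
}

const pure = '{"step": 1, "reason": "pure"}';
const fenced = 'Here is the result:\n```json\n{"step": 2, "reason": "fenced"}\n```';
const inline = 'Result: {"step": 3, "reason": "inline"} done.';

console.log(parseStructuredOutputSketch(pure, true));
console.log(parseStructuredOutputSketch(fenced, true));
console.log(parseStructuredOutputSketch(inline, true));
console.log(parseStructuredOutputSketch(pure, false)); // undefined: no schema was requested
```

One limitation worth knowing: when the text contains two separate objects with prose between them, the brace strategy spans both, fails to parse, and the function returns `undefined` rather than guessing.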
@ -7,6 +7,7 @@ export default defineConfig({
|
|||||||
'e2e/specs/worktree.e2e.ts',
|
'e2e/specs/worktree.e2e.ts',
|
||||||
'e2e/specs/pipeline.e2e.ts',
|
'e2e/specs/pipeline.e2e.ts',
|
||||||
'e2e/specs/github-issue.e2e.ts',
|
'e2e/specs/github-issue.e2e.ts',
|
||||||
|
'e2e/specs/structured-output.e2e.ts',
|
||||||
],
|
],
|
||||||
environment: 'node',
|
environment: 'node',
|
||||||
globals: false,
|
globals: false,
|
||||||
|
|||||||
20
vitest.config.e2e.structured-output.ts
Normal file
@ -0,0 +1,20 @@
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    include: [
      'e2e/specs/structured-output.e2e.ts',
    ],
    environment: 'node',
    globals: false,
    testTimeout: 240000,
    hookTimeout: 60000,
    teardownTimeout: 30000,
    pool: 'threads',
    poolOptions: {
      threads: {
        singleThread: true,
      },
    },
  },
});