Implement unified story generation flow
46
README.md
@@ -21,6 +21,15 @@ docs/ Current product, planning, and technical docs
|
||||
docker-compose.yml
|
||||
```
|
||||
|
||||
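## Environment variable files

The repository may contain two git-ignored env files at the same time, each with a different responsibility:

- `backend/.env`: application runtime configuration. The backend API, admin backend, Celery worker, and Celery beat all read this file; AI keys, OAuth keys, `SECRET_KEY`, `DATABASE_URL`, and the provider lists all go here.
- Root `.env`: used only by Docker Compose for build overrides. It holds only image-source/registry variables such as `PYTHON_BASE_IMAGE`, `NODE_BASE_IMAGE`, `NGINX_BASE_IMAGE`, and `NPM_REGISTRY`; it does not hold backend secrets or AI/OAuth keys.

The backend code reads `backend/.env` by absolute path, so whether you run `uvicorn` from the repository root or after `cd backend`, it loads the same application config file. `backend/.env.example` is the template for `backend/.env`; the root `.env` has no template and is not required, so create it only when you need to replace the Docker base images, the npm registry, or ports.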
|
||||
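The absolute-path lookup described above can be sketched as follows. This is a minimal illustration, assuming the settings module lives at `backend/app/core/config.py` (inferred from the `app.core.config` import elsewhere in this commit); the helper name `backend_env_file` and the checkout path are hypothetical.

```python
from pathlib import PurePosixPath


def backend_env_file(config_module_path: str) -> PurePosixPath:
    """Return the .env path in the directory three levels above the config module file."""
    # parents[0] = .../backend/app/core, parents[1] = .../backend/app,
    # parents[2] = .../backend
    backend_dir = PurePosixPath(config_module_path).parents[2]
    return backend_dir / ".env"


# Hypothetical checkout location; only the relative layout matters.
env_path = backend_env_file("/srv/dreamweaver/backend/app/core/config.py")
print(env_path)
```

Because the path is derived from the module's own location rather than the current working directory, starting the server from the repository root or from `backend/` resolves to the same file.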
|
||||
## Local Docker demo

1. Prepare the environment file:
|
||||
@@ -42,6 +51,15 @@ STORYBOOK_PROVIDERS=["demo", "storybook_primary"]
|
||||
|
||||
`SECRET_KEY` must be set to a strong random value. `backend/.env` is git-ignored; do not commit real secrets.

By default, the Docker demo uses the in-container connection addresses in `backend/.env`:
|
||||
|
||||
```env
|
||||
DATABASE_URL=postgresql+asyncpg://dreamweaver:dreamweaver_password@db:5432/dreamweaver_db
|
||||
CELERY_BROKER_URL=redis://redis:6379/0
|
||||
CELERY_RESULT_BACKEND=redis://redis:6379/0
|
||||
REDIS_URL=redis://redis:6379/0
|
||||
```
|
||||
|
||||
2. Start the full local stack:
|
||||
|
||||
```bash
|
||||
@@ -64,14 +82,30 @@ docker compose logs -f backend
|
||||
./scripts/demo_smoke.sh
|
||||
SMOKE_AUDIO=1 ./scripts/demo_smoke.sh
|
||||
SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
SMOKE_REAL_ASR=1 ./scripts/demo_smoke.sh
|
||||
docker compose down
|
||||
docker compose down -v
|
||||
```
|
||||
|
||||
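`scripts/demo_smoke.sh` checks health status, local login, the unified generation background job, main-record persistence, asset retry, the story list, and provider capability tiers. By default it skips TTS and voice co-creation; use `SMOKE_AUDIO=1` to verify the read-aloud pipeline before a demo, and `SMOKE_VOICE=1` to verify Voice Studio Alpha.
`scripts/demo_smoke.sh` checks health status, local login, the unified generation background job, main-record persistence, asset retry, the story list, and provider capability tiers. By default it skips TTS, voice co-creation, and real ASR; use `SMOKE_AUDIO=1` to verify the read-aloud pipeline before a demo, `SMOKE_VOICE=1` to verify Voice Studio Alpha, and `SMOKE_REAL_ASR=1` to run acceptance on upload transcription with a real OpenAI ASR key.

The ASR capability for voice co-creation is now part of the provider tiers. With the default `ASR_PROVIDERS=["demo"]`, `transcript_hint` or a text upload serves as the local demo transcription; for real transcription, set `ASR_PROVIDERS=["openai_asr", "demo"]` and configure `OPENAI_API_KEY`.

For real ASR acceptance, confirm the following in `backend/.env`: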
|
||||
|
||||
```env
|
||||
ASR_PROVIDERS=["openai_asr", "demo"]
|
||||
OPENAI_API_KEY=sk-...
|
||||
OPENAI_API_BASE=
|
||||
VOICE_TRANSCRIPTION_MODE=provider
|
||||
VOICE_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
|
||||
VOICE_TRANSCRIPTION_LANGUAGE=zh
|
||||
```
|
||||
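The `.env.example` comments in this commit note that the real ASR smoke only reaches OpenAI when `openai_asr` is listed before `demo`. A small pre-flight check along those lines might look like the sketch below. It assumes `ASR_PROVIDERS` is a JSON list tried in order; the function name `real_asr_runs_first` is hypothetical and not part of the repository.

```python
import json


def real_asr_runs_first(raw_value: str) -> bool:
    """Return True when openai_asr is configured and precedes demo."""
    providers = json.loads(raw_value)
    if "openai_asr" not in providers:
        return False  # real transcription is not enabled at all
    if "demo" not in providers:
        return True  # nothing can shadow the real provider
    return providers.index("openai_asr") < providers.index("demo")


print(real_asr_runs_first('["openai_asr", "demo"]'))
print(real_asr_runs_first('["demo", "openai_asr"]'))
```

Running a check like this before `SMOKE_REAL_ASR=1` would surface a misordered list early, instead of silently exercising the demo hint path.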
|
||||
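After editing `backend/.env`, restart the API and worker. If the ASR provider was changed in the admin Provider table, also call `POST /admin/providers/reload` and restart the API process so the running cache picks up the new configuration. `SMOKE_REAL_ASR=1` automatically enables `SMOKE_VOICE=1`; on macOS it generates a short audio clip with `say`/`afconvert` by default, and other environments can pass `REAL_ASR_AUDIO_FILE=/path/to/sample.m4a`.

When the real ASR smoke fails, the script prints the upload endpoint response, the Voice Session events, and the Admin ASR analytics. Common failures include `OPENAI_API_KEY 未配置` (key not configured), 401/403 for an invalid key or a project without permission, 429/insufficient_quota for exhausted quota, 404/model_not_found for an unavailable model name, connection timeouts or a misdirected `OPENAI_API_BASE`, and an audio file format the transcription endpoint does not accept.

## Manual development

Backend: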
|
||||
@@ -83,6 +117,15 @@ alembic upgrade head
|
||||
uvicorn app.main:app --reload --port 8000
|
||||
```
|
||||
|
||||
When running the backend directly on the host, still edit `backend/.env`; just switch the database and Redis addresses to the host-port versions:
|
||||
|
||||
```env
|
||||
DATABASE_URL=postgresql+asyncpg://dreamweaver:dreamweaver_password@localhost:52432/dreamweaver_db
|
||||
CELERY_BROKER_URL=redis://localhost:52379/0
|
||||
CELERY_RESULT_BACKEND=redis://localhost:52379/0
|
||||
REDIS_URL=redis://localhost:52379/0
|
||||
```
|
||||
|
||||
Celery:
|
||||
|
||||
```bash
|
||||
@@ -162,6 +205,7 @@ npm run build
|
||||
- `docs/planning/week-4-sprint-review.md`: Week 4 retrospective and productionization backlog
- `docs/technical/architecture.md`: architecture overview (job-application edition)
- `docs/technical/api-compatibility.md`: compatibility-layer strategy for the legacy generation API
- `docs/technical/environment-configuration.md`: env file responsibilities and Docker/host switching conventions
- `docs/technical/generation-job-state.md`: decision on persisting Generation Job state
- `docs/technical/memory-system-dev.md`: memory system technical notes
- `docs/technical/provider-routing.md`: provider capability and routing strategy
|
||||
|
||||
@@ -1,23 +1,26 @@
|
||||
# Build Stage
|
||||
FROM node:18-alpine AS build-stage
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
COPY package*.json ./
|
||||
RUN npm install
|
||||
|
||||
COPY . .
|
||||
RUN npm run build
|
||||
|
||||
# Production Stage
|
||||
FROM nginx:alpine AS production-stage
|
||||
|
||||
# Copy build artifacts into Nginx
|
||||
COPY --from=build-stage /app/dist /usr/share/nginx/html
|
||||
|
||||
# Copy custom Nginx config (handles SPA routing)
|
||||
COPY nginx.conf /etc/nginx/conf.d/default.conf
|
||||
|
||||
EXPOSE 80
|
||||
|
||||
CMD ["nginx", "-g", "daemon off;"]
|
||||
# Build Stage
|
||||
ARG NODE_BASE_IMAGE=node:18-alpine
|
||||
ARG NGINX_BASE_IMAGE=nginx:alpine
|
||||
FROM ${NODE_BASE_IMAGE} AS build-stage
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
ARG NPM_REGISTRY=https://registry.npmjs.org/
|
||||
COPY package*.json ./
|
||||
RUN npm ci --registry="${NPM_REGISTRY}" --no-audit --no-fund
|
||||
|
||||
COPY . .
|
||||
RUN npm run build
|
||||
|
||||
# Production Stage
|
||||
FROM ${NGINX_BASE_IMAGE} AS production-stage
|
||||
|
||||
# Copy build artifacts into Nginx
|
||||
COPY --from=build-stage /app/dist /usr/share/nginx/html
|
||||
|
||||
# Copy custom Nginx config (handles SPA routing)
|
||||
COPY nginx.conf /etc/nginx/conf.d/default.conf
|
||||
|
||||
EXPOSE 80
|
||||
|
||||
CMD ["nginx", "-g", "daemon off;"]
|
||||
|
||||
@@ -157,12 +157,20 @@
|
||||
<template v-else-if="analytics">
|
||||
<div class="mt-6 grid grid-cols-2 gap-3 lg:grid-cols-4">
|
||||
<div class="rounded-xl border border-gray-100 bg-white px-4 py-3">
|
||||
<div class="text-xs text-gray-500">覆盖故事</div>
|
||||
<div class="mt-1 text-lg font-semibold text-gray-900">{{ analytics.story_count }}</div>
|
||||
<div class="text-xs text-gray-500">
|
||||
{{ analyticsCapability === 'asr' ? '语音会话' : '覆盖故事' }}
|
||||
</div>
|
||||
<div class="mt-1 text-lg font-semibold text-gray-900">
|
||||
{{ analyticsCapability === 'asr' ? analytics.voice_session_count : analytics.story_count }}
|
||||
</div>
|
||||
</div>
|
||||
<div class="rounded-xl border border-gray-100 bg-white px-4 py-3">
|
||||
<div class="text-xs text-gray-500">覆盖任务</div>
|
||||
<div class="mt-1 text-lg font-semibold text-gray-900">{{ analytics.job_count }}</div>
|
||||
<div class="text-xs text-gray-500">
|
||||
{{ analyticsCapability === 'asr' ? '上传回合' : '覆盖任务' }}
|
||||
</div>
|
||||
<div class="mt-1 text-lg font-semibold text-gray-900">
|
||||
{{ analyticsCapability === 'asr' ? analytics.voice_turn_count : analytics.job_count }}
|
||||
</div>
|
||||
</div>
|
||||
<div class="rounded-xl border border-gray-100 bg-white px-4 py-3">
|
||||
<div class="text-xs text-gray-500">平均耗时</div>
|
||||
@@ -581,6 +589,8 @@ type ProviderAnalyticsResponse = {
|
||||
user_count: number
|
||||
job_count: number
|
||||
story_count: number
|
||||
voice_session_count: number
|
||||
voice_turn_count: number
|
||||
by_provider: ProviderAnalyticsBucket[]
|
||||
by_user: ProviderAnalyticsUserBucket[]
|
||||
failure_reasons: Array<{
|
||||
|
||||
@@ -2,25 +2,24 @@
|
||||
# DREAMWEAVER environment variable template
# ==============================================
# Usage:
# 1. Copy this file to .env
# 1. From the repository root, run: cp backend/.env.example backend/.env
# 2. Fill in your API keys
# 3. Start with docker-compose.yml
# 3. The backend, Celery, and the Docker demo all read backend/.env
# 4. The root .env is read only by Docker Compose itself for build arguments; do not put backend secrets there
# ==============================================
|
||||
|
||||
# ----------------------------------------------
|
||||
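# 1. Infrastructure [required]
# ----------------------------------------------
# ⚠️ No changes needed here when starting with Docker; use the defaults as-is
# ⚠️ Change this only if you want to connect to an external database
# ⚠️ The Docker demo usually needs no changes here; use the defaults as-is
# ⚠️ When running the backend directly on the host, change DATABASE_URL/CELERY_* to the localhost versions at the end of this file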
|
||||
POSTGRES_USER=dreamweaver
|
||||
POSTGRES_PASSWORD=dreamweaver_password
|
||||
POSTGRES_DB=dreamweaver_db
|
||||
POSTGRES_PORT=5432
|
||||
REDIS_PORT=6379
|
||||
|
||||
DATABASE_URL=postgresql+asyncpg://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db:5432/${POSTGRES_DB}
|
||||
DATABASE_URL=postgresql+asyncpg://dreamweaver:dreamweaver_password@db:5432/dreamweaver_db
|
||||
CELERY_BROKER_URL=redis://redis:6379/0
|
||||
CELERY_RESULT_BACKEND=redis://redis:6379/0
|
||||
REDIS_URL=redis://redis:6379/0
|
||||
|
||||
# Web Security
|
||||
SECRET_KEY=change-me-to-a-secure-random-string-in-production
|
||||
@@ -44,6 +43,7 @@ TTS_PROVIDERS=["minimax", "elevenlabs", "edge_tts"]
|
||||
# Storybook structure generation: reuses the Gemini Storybook adapter by default
|
||||
STORYBOOK_PROVIDERS=["storybook_primary"]
|
||||
# Speech recognition: the local demo defaults to demo; for real transcription set ["openai_asr", "demo"]
# The real ASR smoke requires openai_asr to come before demo; otherwise the demo hint path matches first.
|
||||
ASR_PROVIDERS=["demo"]
|
||||
|
||||
# [Model parameters]
|
||||
@@ -83,8 +83,10 @@ ELEVENLABS_API_KEY=
|
||||
|
||||
# OpenAI (if used)
|
||||
OPENAI_API_KEY=
|
||||
# Optional: leave empty for the official OpenAI endpoint; for a compatible gateway, use something like https://example.com/v1
|
||||
OPENAI_API_BASE=
|
||||
# OpenAI ASR
|
||||
VOICE_TRANSCRIPTION_MODE=provider
|
||||
VOICE_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
|
||||
VOICE_TRANSCRIPTION_LANGUAGE=zh
|
||||
|
||||
@@ -122,6 +124,8 @@ CORS_ORIGINS=["http://localhost:52080", "http://localhost:52888", "http://localh
|
||||
|
||||
# [Local dev override]
# If you run `python -m uvicorn ...` directly on the host instead of using Docker,
# uncomment the following lines to connect to the localhost database:
# use the following values to connect to the localhost database/Redis:
|
||||
# DATABASE_URL=postgresql+asyncpg://dreamweaver:dreamweaver_password@localhost:52432/dreamweaver_db
|
||||
# CELERY_BROKER_URL=redis://localhost:52379/0
|
||||
# CELERY_RESULT_BACKEND=redis://localhost:52379/0
|
||||
# REDIS_URL=redis://localhost:52379/0
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
FROM python:3.11-slim
|
||||
ARG PYTHON_BASE_IMAGE=python:3.11-slim
|
||||
FROM ${PYTHON_BASE_IMAGE}
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
|
||||
@@ -9,8 +9,8 @@ from app.core.admin_auth import admin_guard
|
||||
from app.db.admin_models import Provider
|
||||
from app.db.database import get_db
|
||||
from app.services.adapters.registry import AdapterRegistry
|
||||
from app.services.admin_provider_analytics import get_admin_provider_analytics
|
||||
from app.services.cost_tracker import cost_tracker
|
||||
from app.services.generation_jobs import get_admin_provider_analytics
|
||||
from app.services.provider_policy import DEFAULT_PROVIDERS, list_capability_policies
|
||||
from app.services.secret_service import SecretService
|
||||
|
||||
@@ -97,6 +97,8 @@ class ProviderAnalyticsResponse(BaseModel):
|
||||
user_count: int
|
||||
job_count: int
|
||||
story_count: int
|
||||
voice_session_count: int = 0
|
||||
voice_turn_count: int = 0
|
||||
by_provider: list[ProviderAnalyticsBucket]
|
||||
by_user: list[ProviderAnalyticsUserBucket]
|
||||
failure_reasons: list[ProviderAnalyticsFailureReason]
|
||||
|
||||
@@ -1,15 +1,20 @@
|
||||
from pydantic import Field, model_validator
|
||||
from pydantic_settings import BaseSettings, SettingsConfigDict
|
||||
|
||||
|
||||
class Settings(BaseSettings):
|
||||
"""Global application settings."""
|
||||
|
||||
model_config = SettingsConfigDict(
|
||||
env_file=".env",
|
||||
env_file_encoding="utf-8",
|
||||
extra="ignore",
|
||||
)
|
||||
from pathlib import Path
|
||||
|
||||
from pydantic import Field, model_validator
|
||||
from pydantic_settings import BaseSettings, SettingsConfigDict
|
||||
|
||||
BACKEND_DIR = Path(__file__).resolve().parents[2]
|
||||
BACKEND_ENV_FILE = BACKEND_DIR / ".env"
|
||||
|
||||
|
||||
class Settings(BaseSettings):
|
||||
"""Global application settings."""
|
||||
|
||||
model_config = SettingsConfigDict(
|
||||
env_file=BACKEND_ENV_FILE,
|
||||
env_file_encoding="utf-8",
|
||||
extra="ignore",
|
||||
)
|
||||
|
||||
# Basic application settings
|
||||
app_name: str = "DreamWeaver"
|
||||
@@ -34,9 +39,10 @@ class Settings(BaseSettings):
|
||||
tts_api_key: str = ""
|
||||
image_api_key: str = ""
|
||||
|
||||
# Additional Provider API Keys
|
||||
openai_api_key: str = ""
|
||||
elevenlabs_api_key: str = ""
|
||||
# Additional Provider API Keys
|
||||
openai_api_key: str = ""
|
||||
openai_api_base: str = ""
|
||||
elevenlabs_api_key: str = ""
|
||||
cqtai_api_key: str = ""
|
||||
minimax_api_key: str = ""
|
||||
minimax_group_id: str = ""
|
||||
|
||||
@@ -9,6 +9,7 @@ from app.services.adapters.asr import openai as _asr_openai_adapter # noqa: F40
|
||||
from app.services.adapters.base import AdapterConfig, BaseAdapter
|
||||
|
||||
# Image adapters
|
||||
from app.services.adapters.image import antigravity as _image_antigravity_adapter # noqa: F401
|
||||
from app.services.adapters.image import cqtai as _image_cqtai_adapter # noqa: F401
|
||||
from app.services.adapters.registry import AdapterRegistry
|
||||
|
||||
|
||||
@@ -2,10 +2,11 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
from io import BytesIO
|
||||
|
||||
from fastapi import HTTPException
|
||||
from openai import AsyncOpenAI
|
||||
from openai import APIConnectionError, APIStatusError, APITimeoutError, AsyncOpenAI
|
||||
|
||||
from app.core.logging import get_logger
|
||||
from app.services.adapters.asr.models import TranscriptionOutput
|
||||
@@ -15,6 +16,14 @@ from app.services.adapters.registry import AdapterRegistry
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
def _mask_openai_error(message: str) -> str:
|
||||
"""Avoid leaking bearer tokens while keeping ASR smoke failures actionable."""
|
||||
|
||||
sanitized = message.replace("\n", " ").strip()
|
||||
sanitized = re.sub(r"Bearer\s+[A-Za-z0-9._-]+", "Bearer ***", sanitized)
|
||||
return re.sub(r"sk-[A-Za-z0-9_-]+", "sk-***", sanitized)
|
||||
|
||||
|
||||
@AdapterRegistry.register("asr", "openai_asr")
|
||||
class OpenAIASRAdapter(BaseAdapter[TranscriptionOutput]):
|
||||
"""Transcribe uploaded voice turn audio with OpenAI audio transcription."""
|
||||
@@ -37,7 +46,11 @@ class OpenAIASRAdapter(BaseAdapter[TranscriptionOutput]):
|
||||
detail="OPENAI_API_KEY 未配置,无法使用 OpenAI 语音转写。",
|
||||
)
|
||||
|
||||
client = AsyncOpenAI(api_key=self.config.api_key)
|
||||
client = AsyncOpenAI(
|
||||
api_key=self.config.api_key,
|
||||
base_url=self.config.api_base or None,
|
||||
timeout=self.config.timeout_ms / 1000,
|
||||
)
|
||||
audio_file = BytesIO(audio_bytes)
|
||||
audio_file.name = file_name or "voice-turn.webm"
|
||||
|
||||
@@ -51,11 +64,29 @@ class OpenAIASRAdapter(BaseAdapter[TranscriptionOutput]):
|
||||
language=language,
|
||||
prompt=prompt,
|
||||
)
|
||||
except APIStatusError as exc:
|
||||
detail = _mask_openai_error(getattr(exc, "message", str(exc)))
|
||||
logger.warning(
|
||||
"openai_asr_failed",
|
||||
status_code=exc.status_code,
|
||||
error=detail,
|
||||
)
|
||||
raise HTTPException(
|
||||
status_code=503,
|
||||
detail=f"OpenAI ASR 调用失败(HTTP {exc.status_code}):{detail}",
|
||||
) from exc
|
||||
except (APITimeoutError, APIConnectionError) as exc:
|
||||
detail = _mask_openai_error(str(exc))
|
||||
logger.warning("openai_asr_failed", error=detail)
|
||||
raise HTTPException(
|
||||
status_code=503,
|
||||
detail=f"OpenAI ASR 网络连接失败:{detail}",
|
||||
) from exc
|
||||
except Exception as exc:
|
||||
logger.warning("openai_asr_failed", error=str(exc))
|
||||
raise HTTPException(
|
||||
status_code=503,
|
||||
detail="语音转写服务暂时不可用,请稍后重试。",
|
||||
detail=f"OpenAI ASR 调用异常:{_mask_openai_error(str(exc))}",
|
||||
) from exc
|
||||
|
||||
transcript_text = (getattr(response, "text", "") or "").strip()
|
||||
|
||||
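To see what the masking helper added in this hunk does to a typical error string, here is a standalone sketch. The two regular expressions are copied from the diff; the function is renamed without the leading underscore, and the sample message is invented for illustration.

```python
import re


def mask_openai_error(message: str) -> str:
    """Flatten newlines, then redact bearer tokens and sk- style keys."""
    sanitized = message.replace("\n", " ").strip()
    sanitized = re.sub(r"Bearer\s+[A-Za-z0-9._-]+", "Bearer ***", sanitized)
    return re.sub(r"sk-[A-Za-z0-9_-]+", "sk-***", sanitized)


# Invented sample; real OpenAI error text will differ.
sample = (
    "401 Unauthorized\n"
    "Authorization: Bearer sk-test_123 rejected; key sk-abc-DEF invalid"
)
print(mask_openai_error(sample))
# prints: 401 Unauthorized Authorization: Bearer *** rejected; key sk-*** invalid
```

The bearer pattern runs first, so a token that follows `Bearer` is replaced as a whole; the second pattern then catches any remaining bare `sk-` keys.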
@@ -126,6 +126,11 @@ class MiniMaxTTSAdapter(BaseAdapter[bytes]):
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
@property
|
||||
def estimated_cost(self) -> float:
|
||||
"""Estimated cost per short-text speech synthesis call (USD)."""
|
||||
return 0.01
|
||||
|
||||
@retry(
|
||||
stop=stop_after_attempt(3),
|
||||
wait=wait_exponential(multiplier=1, min=1, max=10),
|
||||
|
||||
408
backend/app/services/admin_provider_analytics.py
Normal file
@@ -0,0 +1,408 @@
|
||||
"""Admin-facing provider analytics across generation and voice telemetry."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, timedelta, timezone
|
||||
from typing import Any
|
||||
|
||||
from sqlalchemy import select
|
||||
from sqlalchemy.ext.asyncio import AsyncSession
|
||||
|
||||
from app.db.admin_models import CostRecord
|
||||
from app.db.models import VoiceSession, VoiceSessionEvent, VoiceTurn
|
||||
from app.services.generation_jobs import (
|
||||
_aggregate_provider_events,
|
||||
_as_float,
|
||||
_event_matches_capability,
|
||||
_provider_events_query,
|
||||
)
|
||||
|
||||
|
||||
def _empty_admin_user_bucket(user_id: str) -> dict[str, Any]:
|
||||
return {
|
||||
"user_id": user_id,
|
||||
"call_count": 0,
|
||||
"success_count": 0,
|
||||
"failure_count": 0,
|
||||
"estimated_cost_usd": 0.0,
|
||||
"job_ids": set(),
|
||||
"story_ids": set(),
|
||||
}
|
||||
|
||||
|
||||
def _merge_admin_user_bucket(
|
||||
target: dict[str, Any],
|
||||
source: dict[str, Any],
|
||||
) -> None:
|
||||
target["call_count"] += int(source["call_count"])
|
||||
target["success_count"] += int(source["success_count"])
|
||||
target["failure_count"] += int(source["failure_count"])
|
||||
target["estimated_cost_usd"] += float(source["estimated_cost_usd"])
|
||||
target["job_ids"].update(source["job_ids"])
|
||||
target["story_ids"].update(source["story_ids"])
|
||||
|
||||
|
||||
def _serialize_admin_user_buckets(
|
||||
by_user: dict[str, dict[str, Any]],
|
||||
) -> list[dict[str, Any]]:
|
||||
serialized_users = [
|
||||
{
|
||||
"user_id": user_id,
|
||||
"call_count": bucket["call_count"],
|
||||
"success_count": bucket["success_count"],
|
||||
"failure_count": bucket["failure_count"],
|
||||
"job_count": len(bucket["job_ids"]),
|
||||
"story_count": len(bucket["story_ids"]),
|
||||
"estimated_cost_usd": round(bucket["estimated_cost_usd"], 6),
|
||||
}
|
||||
for user_id, bucket in by_user.items()
|
||||
]
|
||||
serialized_users.sort(
|
||||
key=lambda item: (
|
||||
-int(item["call_count"]),
|
||||
-float(item["estimated_cost_usd"]),
|
||||
str(item["user_id"]),
|
||||
)
|
||||
)
|
||||
return serialized_users
|
||||
|
||||
|
||||
def _merge_provider_analytics(
|
||||
left: dict[str, Any],
|
||||
right: dict[str, Any],
|
||||
) -> dict[str, Any]:
|
||||
provider_buckets: dict[tuple[str, str], dict[str, Any]] = {}
|
||||
latency_totals: dict[tuple[str, str], float] = {}
|
||||
latency_counts: dict[tuple[str, str], int] = {}
|
||||
failure_reasons: dict[str, int] = {}
|
||||
|
||||
for payload in (left, right):
|
||||
for row in payload["by_provider"]:
|
||||
capability_name = str(row["capability"])
|
||||
adapter_name = str(row["adapter"])
|
||||
key = (capability_name, adapter_name)
|
||||
bucket = provider_buckets.setdefault(
|
||||
key,
|
||||
{
|
||||
"capability": capability_name,
|
||||
"adapter": adapter_name,
|
||||
"call_count": 0,
|
||||
"success_count": 0,
|
||||
"failure_count": 0,
|
||||
"estimated_cost_usd": 0.0,
|
||||
},
|
||||
)
|
||||
call_count = int(row["call_count"])
|
||||
bucket["call_count"] += call_count
|
||||
bucket["success_count"] += int(row["success_count"])
|
||||
bucket["failure_count"] += int(row["failure_count"])
|
||||
bucket["estimated_cost_usd"] += float(row["estimated_cost_usd"])
|
||||
|
||||
if row["avg_latency_ms"] is not None and call_count:
|
||||
latency_totals[key] = latency_totals.get(key, 0.0) + (
|
||||
float(row["avg_latency_ms"]) * call_count
|
||||
)
|
||||
latency_counts[key] = latency_counts.get(key, 0) + call_count
|
||||
|
||||
for item in payload["failure_reasons"]:
|
||||
reason = str(item["reason"])
|
||||
failure_reasons[reason] = failure_reasons.get(reason, 0) + int(item["count"])
|
||||
|
||||
by_provider = []
|
||||
total_latency = 0.0
|
||||
latency_count = 0
|
||||
for key, bucket in provider_buckets.items():
|
||||
bucket_latency_count = latency_counts.get(key, 0)
|
||||
bucket_latency_total = latency_totals.get(key, 0.0)
|
||||
if bucket_latency_count:
|
||||
total_latency += bucket_latency_total
|
||||
latency_count += bucket_latency_count
|
||||
by_provider.append(
|
||||
{
|
||||
**bucket,
|
||||
"avg_latency_ms": (
|
||||
round(bucket_latency_total / bucket_latency_count, 2)
|
||||
if bucket_latency_count
|
||||
else None
|
||||
),
|
||||
"estimated_cost_usd": round(bucket["estimated_cost_usd"], 6),
|
||||
}
|
||||
)
|
||||
|
||||
by_provider.sort(
|
||||
key=lambda item: (
|
||||
str(item["capability"]),
|
||||
str(item["adapter"]),
|
||||
)
|
||||
)
|
||||
|
||||
return {
|
||||
"total_calls": int(left["total_calls"]) + int(right["total_calls"]),
|
||||
"successful_calls": int(left["successful_calls"]) + int(right["successful_calls"]),
|
||||
"failed_calls": int(left["failed_calls"]) + int(right["failed_calls"]),
|
||||
"avg_latency_ms": round(total_latency / latency_count, 2) if latency_count else None,
|
||||
"estimated_cost_usd": round(
|
||||
float(left["estimated_cost_usd"]) + float(right["estimated_cost_usd"]),
|
||||
6,
|
||||
),
|
||||
"by_provider": by_provider,
|
||||
"failure_reasons": [
|
||||
{"reason": reason, "count": count}
|
||||
for reason, count in sorted(
|
||||
failure_reasons.items(),
|
||||
key=lambda item: (-item[1], item[0]),
|
||||
)
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
def _voice_asr_provider_from_turn(turn: VoiceTurn) -> str:
|
||||
story_patch = turn.story_patch or {}
|
||||
return str(story_patch.get("transcription_provider") or "unknown")
|
||||
|
||||
|
||||
async def _aggregate_voice_asr_provider_analytics(
|
||||
db: AsyncSession,
|
||||
*,
|
||||
days: int | None = None,
|
||||
) -> dict[str, Any]:
|
||||
"""Aggregate ASR telemetry from voice co-creation sessions."""
|
||||
|
||||
cutoff = datetime.now(timezone.utc) - timedelta(days=days) if days is not None else None
|
||||
|
||||
turn_query = (
|
||||
select(
|
||||
VoiceTurn,
|
||||
VoiceSession.user_id,
|
||||
VoiceSession.final_story_id,
|
||||
VoiceSession.id,
|
||||
)
|
||||
.join(VoiceSession, VoiceTurn.session_id == VoiceSession.id)
|
||||
.where(
|
||||
VoiceTurn.user_audio_path.isnot(None),
|
||||
VoiceTurn.user_transcript.isnot(None),
|
||||
)
|
||||
)
|
||||
failure_query = (
|
||||
select(
|
||||
VoiceSessionEvent,
|
||||
VoiceSession.user_id,
|
||||
VoiceSession.final_story_id,
|
||||
VoiceSession.id,
|
||||
)
|
||||
.join(VoiceSession, VoiceSessionEvent.session_id == VoiceSession.id)
|
||||
.where(VoiceSessionEvent.event_type == "turn_transcription_failed")
|
||||
)
|
||||
cost_query = select(
|
||||
CostRecord.user_id,
|
||||
CostRecord.provider_name,
|
||||
CostRecord.estimated_cost,
|
||||
).where(CostRecord.capability == "asr")
|
||||
|
||||
if cutoff is not None:
|
||||
turn_query = turn_query.where(VoiceTurn.created_at >= cutoff)
|
||||
failure_query = failure_query.where(VoiceSessionEvent.created_at >= cutoff)
|
||||
cost_query = cost_query.where(CostRecord.timestamp >= cutoff)
|
||||
|
||||
turn_rows = (await db.execute(turn_query)).all()
|
||||
failure_rows = (await db.execute(failure_query)).all()
|
||||
cost_rows = (await db.execute(cost_query)).all()
|
||||
|
||||
costs_by_provider: dict[str, float] = {}
|
||||
costs_by_user: dict[str, float] = {}
|
||||
for user_id, provider_name, estimated_cost in cost_rows:
|
||||
cost = float(estimated_cost or 0.0)
|
||||
provider = str(provider_name or "unknown")
|
||||
costs_by_provider[provider] = costs_by_provider.get(provider, 0.0) + cost
|
||||
costs_by_user[str(user_id)] = costs_by_user.get(str(user_id), 0.0) + cost
|
||||
|
||||
provider_buckets: dict[tuple[str, str], dict[str, Any]] = {}
|
||||
failure_reasons: dict[str, int] = {}
|
||||
by_user: dict[str, dict[str, Any]] = {}
|
||||
user_ids: set[str] = set()
|
||||
story_ids: set[int] = set()
|
||||
voice_session_ids: set[str] = set()
|
||||
successful_calls = 0
|
||||
failed_calls = 0
|
||||
|
||||
def provider_bucket(adapter: str) -> dict[str, Any]:
|
||||
return provider_buckets.setdefault(
|
||||
("asr", adapter),
|
||||
{
|
||||
"capability": "asr",
|
||||
"adapter": adapter,
|
||||
"call_count": 0,
|
||||
"success_count": 0,
|
||||
"failure_count": 0,
|
||||
"avg_latency_ms": None,
|
||||
"estimated_cost_usd": 0.0,
|
||||
},
|
||||
)
|
||||
|
||||
for turn, user_id, final_story_id, session_id in turn_rows:
|
||||
user_id = str(user_id)
|
||||
adapter = _voice_asr_provider_from_turn(turn)
|
||||
user_ids.add(user_id)
|
||||
voice_session_ids.add(str(session_id))
|
||||
if final_story_id is not None:
|
||||
story_ids.add(int(final_story_id))
|
||||
|
||||
bucket = provider_bucket(adapter)
|
||||
bucket["call_count"] += 1
|
||||
bucket["success_count"] += 1
|
||||
successful_calls += 1
|
||||
|
||||
user_bucket = by_user.setdefault(user_id, _empty_admin_user_bucket(user_id))
|
||||
user_bucket["call_count"] += 1
|
||||
user_bucket["success_count"] += 1
|
||||
if final_story_id is not None:
|
||||
user_bucket["story_ids"].add(int(final_story_id))
|
||||
|
||||
for provider_name, cost in costs_by_provider.items():
|
||||
key = ("asr", provider_name)
|
||||
if key in provider_buckets:
|
||||
provider_buckets[key]["estimated_cost_usd"] += cost
|
||||
|
||||
for user_id, cost in costs_by_user.items():
|
||||
if user_id in by_user:
|
||||
by_user[user_id]["estimated_cost_usd"] += cost
|
||||
|
||||
for event, user_id, final_story_id, session_id in failure_rows:
|
||||
metadata = event.event_metadata or {}
|
||||
adapter = str(
|
||||
metadata.get("adapter")
|
||||
or metadata.get("transcription_provider")
|
||||
or "unknown"
|
||||
)
|
||||
user_id = str(user_id)
|
||||
reason = str(metadata.get("error") or "unknown_error")
|
||||
user_ids.add(user_id)
|
||||
voice_session_ids.add(str(session_id))
|
||||
if final_story_id is not None:
|
||||
story_ids.add(int(final_story_id))
|
||||
|
||||
bucket = provider_bucket(adapter)
|
||||
bucket["call_count"] += 1
|
||||
bucket["failure_count"] += 1
|
||||
failed_calls += 1
|
||||
failure_reasons[reason] = failure_reasons.get(reason, 0) + 1
|
||||
|
||||
user_bucket = by_user.setdefault(user_id, _empty_admin_user_bucket(user_id))
|
||||
user_bucket["call_count"] += 1
|
||||
user_bucket["failure_count"] += 1
|
||||
if final_story_id is not None:
|
||||
user_bucket["story_ids"].add(int(final_story_id))
|
||||
|
||||
by_provider = [
|
||||
{
|
||||
**bucket,
|
||||
"estimated_cost_usd": round(bucket["estimated_cost_usd"], 6),
|
||||
}
|
||||
for bucket in provider_buckets.values()
|
||||
]
|
||||
by_provider.sort(
|
||||
key=lambda item: (
|
||||
str(item["capability"]),
|
||||
str(item["adapter"]),
|
||||
)
|
||||
)
|
||||
|
||||
return {
|
||||
"total_calls": successful_calls + failed_calls,
|
||||
"successful_calls": successful_calls,
|
||||
"failed_calls": failed_calls,
|
||||
"avg_latency_ms": None,
|
||||
"estimated_cost_usd": round(
|
||||
sum(float(bucket["estimated_cost_usd"]) for bucket in provider_buckets.values()),
|
||||
6,
|
||||
),
|
||||
"by_provider": by_provider,
|
||||
"failure_reasons": [
|
||||
{"reason": reason, "count": count}
|
||||
for reason, count in sorted(
|
||||
failure_reasons.items(),
|
||||
key=lambda item: (-item[1], item[0]),
|
||||
)
|
||||
],
|
||||
"by_user": by_user,
|
||||
"user_ids": user_ids,
|
||||
"story_ids": story_ids,
|
||||
"voice_session_ids": voice_session_ids,
|
||||
"voice_turn_count": successful_calls,
|
||||
}
|
||||
|
||||
|
||||
async def get_admin_provider_analytics(
|
||||
db: AsyncSession,
|
||||
*,
|
||||
days: int | None = None,
|
||||
capability: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
"""Aggregate provider telemetry across every user in the current environment."""
|
||||
|
||||
rows = (await db.execute(_provider_events_query(days=days))).all()
|
||||
events = [event for event, _, _ in rows]
|
||||
filtered_rows = [
|
||||
(event, user_id, story_id)
|
||||
for event, user_id, story_id in rows
|
||||
if _event_matches_capability(event, capability)
|
||||
]
|
||||
|
||||
by_user: dict[str, dict[str, Any]] = {}
|
||||
filtered_job_ids = {event.job_id for event, _, _ in filtered_rows}
|
||||
filtered_story_ids = {
|
||||
story_id for _, _, story_id in filtered_rows if story_id is not None
|
||||
}
|
||||
filtered_user_ids = {user_id for _, user_id, _ in filtered_rows}
|
||||
|
||||
for event, user_id, story_id in filtered_rows:
|
||||
bucket = by_user.setdefault(
|
||||
user_id,
|
||||
_empty_admin_user_bucket(user_id),
|
||||
)
|
||||
bucket["call_count"] += 1
|
||||
bucket["job_ids"].add(event.job_id)
|
||||
if story_id is not None:
|
||||
bucket["story_ids"].add(story_id)
|
||||
|
||||
if event.event_type == "provider_call_succeeded":
|
||||
bucket["success_count"] += 1
|
||||
bucket["estimated_cost_usd"] += (
|
||||
_as_float((event.event_metadata or {}).get("estimated_cost_usd")) or 0.0
|
||||
)
|
||||
else:
|
||||
bucket["failure_count"] += 1
|
||||
|
||||
provider_analytics = _aggregate_provider_events(events, capability=capability)
|
||||
voice_session_count = 0
|
||||
voice_turn_count = 0
|
||||
if capability in {None, "asr"}:
|
||||
asr_analytics = await _aggregate_voice_asr_provider_analytics(db, days=days)
|
||||
provider_analytics = _merge_provider_analytics(
|
||||
provider_analytics,
|
||||
asr_analytics,
|
||||
)
|
||||
filtered_user_ids.update(asr_analytics["user_ids"])
|
||||
filtered_story_ids.update(asr_analytics["story_ids"])
|
||||
voice_session_count = len(asr_analytics["voice_session_ids"])
|
||||
voice_turn_count = int(asr_analytics["voice_turn_count"])
|
||||
|
||||
for user_id, source_bucket in asr_analytics["by_user"].items():
|
||||
target_bucket = by_user.setdefault(
|
||||
user_id,
|
||||
_empty_admin_user_bucket(user_id),
|
||||
)
|
||||
_merge_admin_user_bucket(target_bucket, source_bucket)
|
||||
|
||||
return {
|
||||
"scope": "current_environment",
|
||||
"window_days": days,
|
||||
"capability": capability,
|
||||
**provider_analytics,
|
||||
"user_count": len(filtered_user_ids),
|
||||
"job_count": len(filtered_job_ids),
|
||||
"story_count": len(filtered_story_ids),
|
||||
"voice_session_count": voice_session_count,
|
||||
"voice_turn_count": voice_turn_count,
|
||||
"by_user": _serialize_admin_user_buckets(by_user),
|
||||
}
|
||||
@@ -11,7 +11,11 @@ from sqlalchemy.ext.asyncio import AsyncSession
|
||||
|
||||
from app.core.config import settings
|
||||
from app.core.logging import get_logger
|
||||
from app.db.models import GenerationJob, GenerationJobEvent, Story
|
||||
from app.db.models import (
|
||||
GenerationJob,
|
||||
GenerationJobEvent,
|
||||
Story,
|
||||
)
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
@@ -712,87 +716,6 @@ async def get_user_provider_analytics(
    }


async def get_admin_provider_analytics(
    db: AsyncSession,
    *,
    days: int | None = None,
    capability: str | None = None,
) -> dict[str, Any]:
    """Aggregate provider telemetry across every user in the current environment."""

    rows = (await db.execute(_provider_events_query(days=days))).all()
    events = [event for event, _, _ in rows]
    filtered_rows = [
        (event, user_id, story_id)
        for event, user_id, story_id in rows
        if _event_matches_capability(event, capability)
    ]

    by_user: dict[str, dict[str, Any]] = {}
    filtered_job_ids = {event.job_id for event, _, _ in filtered_rows}
    filtered_story_ids = {
        story_id for _, _, story_id in filtered_rows if story_id is not None
    }
    filtered_user_ids = {user_id for _, user_id, _ in filtered_rows}

    for event, user_id, story_id in filtered_rows:
        bucket = by_user.setdefault(
            user_id,
            {
                "user_id": user_id,
                "call_count": 0,
                "success_count": 0,
                "failure_count": 0,
                "estimated_cost_usd": 0.0,
                "job_ids": set(),
                "story_ids": set(),
            },
        )
        bucket["call_count"] += 1
        bucket["job_ids"].add(event.job_id)
        if story_id is not None:
            bucket["story_ids"].add(story_id)

        if event.event_type == "provider_call_succeeded":
            bucket["success_count"] += 1
            bucket["estimated_cost_usd"] += (
                _as_float((event.event_metadata or {}).get("estimated_cost_usd")) or 0.0
            )
        else:
            bucket["failure_count"] += 1

    serialized_users = [
        {
            "user_id": user_id,
            "call_count": bucket["call_count"],
            "success_count": bucket["success_count"],
            "failure_count": bucket["failure_count"],
            "job_count": len(bucket["job_ids"]),
            "story_count": len(bucket["story_ids"]),
            "estimated_cost_usd": round(bucket["estimated_cost_usd"], 6),
        }
        for user_id, bucket in by_user.items()
    ]
    serialized_users.sort(
        key=lambda item: (
            -int(item["call_count"]),
            -float(item["estimated_cost_usd"]),
            str(item["user_id"]),
        )
    )

    return {
        "scope": "current_environment",
        "window_days": days,
        "capability": capability,
        **_aggregate_provider_events(events, capability=capability),
        "user_count": len(filtered_user_ids),
        "job_count": len(filtered_job_ids),
        "story_count": len(filtered_story_ids),
        "by_user": serialized_users,
    }


async def get_user_generation_ops_summary(
    db: AsyncSession,
    *,
|
||||
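The per-user bucketing and sort order in the hunk above can be tried in isolation. The following is a minimal sketch, not the project's implementation: the `Event` tuple and `summarize_by_user` are hypothetical stand-ins for the ORM rows and the service function, and only the grouping and ordering logic mirrors the diff.

```python
from collections import namedtuple

# Hypothetical stand-in for a provider-call event row; the real code reads ORM objects.
Event = namedtuple("Event", "user_id job_id story_id succeeded cost_usd")


def summarize_by_user(events):
    """Group events per user, then sort by calls desc, cost desc, user_id asc."""
    buckets = {}
    for event in events:
        bucket = buckets.setdefault(
            event.user_id,
            {
                "user_id": event.user_id,
                "call_count": 0,
                "success_count": 0,
                "failure_count": 0,
                "estimated_cost_usd": 0.0,
                "job_ids": set(),
                "story_ids": set(),
            },
        )
        bucket["call_count"] += 1
        bucket["job_ids"].add(event.job_id)
        if event.story_id is not None:
            bucket["story_ids"].add(event.story_id)
        if event.succeeded:
            # Cost is only attributed to successful calls, as in the diff.
            bucket["success_count"] += 1
            bucket["estimated_cost_usd"] += event.cost_usd or 0.0
        else:
            bucket["failure_count"] += 1

    rows = [
        {
            "user_id": b["user_id"],
            "call_count": b["call_count"],
            "success_count": b["success_count"],
            "failure_count": b["failure_count"],
            "job_count": len(b["job_ids"]),      # sets collapse repeated job ids
            "story_count": len(b["story_ids"]),
            "estimated_cost_usd": round(b["estimated_cost_usd"], 6),
        }
        for b in buckets.values()
    ]
    # Negating the numeric keys gives descending order; user_id breaks ties ascending.
    rows.sort(key=lambda r: (-r["call_count"], -r["estimated_cost_usd"], r["user_id"]))
    return rows
```

Keeping `job_ids` and `story_ids` as sets until serialization is what makes `job_count` and `story_count` distinct counts rather than call counts.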
@@ -117,6 +117,7 @@ def _get_default_config(adapter_name: str) -> AdapterConfig | None:
|
||||
if adapter_name == "openai_asr":
|
||||
return AdapterConfig(
|
||||
api_key=settings.openai_api_key,
|
||||
api_base=getattr(settings, "openai_api_base", ""),
|
||||
model=settings.voice_transcription_model,
|
||||
timeout_ms=60000,
|
||||
)
|
||||
@@ -131,6 +132,7 @@ def _get_default_config(adapter_name: str) -> AdapterConfig | None:
|
||||
if adapter_name == "openai":
|
||||
return AdapterConfig(
|
||||
api_key=getattr(settings, "openai_api_key", ""),
|
||||
api_base=getattr(settings, "openai_api_base", ""),
|
||||
model=settings.openai_model,
|
||||
timeout_ms=60000,
|
||||
)
|
||||
|
||||
@@ -1,12 +1,14 @@
|
||||
from datetime import datetime, timedelta, timezone
|
||||
from decimal import Decimal
|
||||
|
||||
from fastapi import FastAPI
|
||||
from httpx import ASGITransport, AsyncClient
|
||||
|
||||
from app.api import admin_providers
|
||||
from app.core.admin_auth import admin_guard
|
||||
from app.db.admin_models import CostRecord
|
||||
from app.db.database import get_db
|
||||
from app.db.models import Story, User
|
||||
from app.db.models import Story, User, VoiceSession, VoiceSessionEvent, VoiceTurn
|
||||
from app.services.generation_jobs import create_generation_job, record_generation_event
|
||||
|
||||
|
||||
@@ -286,3 +288,105 @@ async def test_admin_provider_analytics_support_days_and_capability_filters(
|
||||
|
||||
response = await client.get("/admin/providers/analytics?capability=unknown")
|
||||
assert response.status_code == 422
|
||||
|
||||
|
||||
async def test_admin_provider_analytics_includes_voice_asr_calls(
|
||||
db_session,
|
||||
test_user,
|
||||
):
|
||||
second_user = User(
|
||||
id="google:asr-user",
|
||||
name="ASR User",
|
||||
avatar_url="https://example.com/asr.png",
|
||||
provider="google",
|
||||
)
|
||||
db_session.add(second_user)
|
||||
await db_session.commit()
|
||||
|
||||
successful_session = VoiceSession(user_id=test_user.id, status="active")
|
||||
failed_session = VoiceSession(user_id=second_user.id, status="active")
|
||||
db_session.add_all([successful_session, failed_session])
|
||||
await db_session.commit()
|
||||
await db_session.refresh(successful_session)
|
||||
await db_session.refresh(failed_session)
|
||||
|
||||
db_session.add_all(
|
||||
[
|
||||
VoiceTurn(
|
||||
session_id=successful_session.id,
|
||||
turn_index=1,
|
||||
status="completed",
|
||||
user_audio_path="/tmp/voice-turn.webm",
|
||||
user_audio_mime_type="audio/webm",
|
||||
user_audio_duration_ms=1300,
|
||||
user_transcript="我想听一个星星故事",
|
||||
transcript_confidence=0.96,
|
||||
detected_intent="continue_story",
|
||||
intent_confidence=0.9,
|
||||
story_patch={"transcription_provider": "demo"},
|
||||
),
|
||||
VoiceSessionEvent(
|
||||
session_id=failed_session.id,
|
||||
event_type="turn_transcription_failed",
|
||||
status="failed",
|
||||
message="Voice transcription failed.",
|
||||
event_metadata={"error": "OPENAI_API_KEY 未配置"},
|
||||
),
|
||||
CostRecord(
|
||||
user_id=test_user.id,
|
||||
provider_name="demo",
|
||||
capability="asr",
|
||||
estimated_cost=Decimal("0.002"),
|
||||
),
|
||||
]
|
||||
)
|
||||
await db_session.commit()
|
||||
|
||||
admin_app = _build_admin_test_app(db_session)
|
||||
transport = ASGITransport(app=admin_app)
|
||||
|
||||
async with AsyncClient(transport=transport, base_url="http://test") as client:
|
||||
response = await client.get("/admin/providers/analytics?capability=asr")
|
||||
|
||||
assert response.status_code == 200
|
||||
data = response.json()
|
||||
assert data["capability"] == "asr"
|
||||
assert data["total_calls"] == 2
|
||||
assert data["successful_calls"] == 1
|
||||
assert data["failed_calls"] == 1
|
||||
assert data["user_count"] == 2
|
||||
assert data["job_count"] == 0
|
||||
assert data["story_count"] == 0
|
||||
assert data["voice_session_count"] == 2
|
||||
assert data["voice_turn_count"] == 1
|
||||
assert data["estimated_cost_usd"] == 0.002
|
||||
assert data["failure_reasons"] == [
|
||||
{"reason": "OPENAI_API_KEY 未配置", "count": 1}
|
||||
]
|
||||
assert data["by_provider"] == [
|
||||
{
|
||||
"capability": "asr",
|
||||
"adapter": "demo",
|
||||
"call_count": 1,
|
||||
"success_count": 1,
|
||||
"failure_count": 0,
|
||||
"avg_latency_ms": None,
|
||||
"estimated_cost_usd": 0.002,
|
||||
},
|
||||
{
|
||||
"capability": "asr",
|
||||
"adapter": "unknown",
|
||||
"call_count": 1,
|
||||
"success_count": 0,
|
||||
"failure_count": 1,
|
||||
"avg_latency_ms": None,
|
||||
"estimated_cost_usd": 0.0,
|
||||
},
|
||||
]
|
||||
|
||||
users = {row["user_id"]: row for row in data["by_user"]}
|
||||
assert users[test_user.id]["call_count"] == 1
|
||||
assert users[test_user.id]["success_count"] == 1
|
||||
assert users[test_user.id]["estimated_cost_usd"] == 0.002
|
||||
assert users[second_user.id]["call_count"] == 1
|
||||
assert users[second_user.id]["failure_count"] == 1
|
||||
|
||||
@@ -73,6 +73,7 @@ class TestDevSigninRedirect:
|
||||
|
||||
def test_dev_signin_uses_allowed_next_url(self, client: TestClient, monkeypatch):
|
||||
"""允许的 next 参数应作为登录完成后的回跳地址。"""
|
||||
monkeypatch.setattr(settings, "debug", True)
|
||||
monkeypatch.setattr(settings, "cors_origins", ["http://localhost:5173", "http://localhost:5174"])
|
||||
|
||||
response = client.get(
|
||||
@@ -86,6 +87,7 @@ class TestDevSigninRedirect:
|
||||
|
||||
def test_dev_signin_rejects_untrusted_next_url(self, client: TestClient, monkeypatch):
|
||||
"""不可信的 next 参数应回退到默认前端地址,避免开放重定向。"""
|
||||
monkeypatch.setattr(settings, "debug", True)
|
||||
monkeypatch.setattr(settings, "cors_origins", ["http://localhost:5173", "http://localhost:5174"])
|
||||
|
||||
response = client.get(
|
||||
|
||||
53
backend/tests/test_config.py
Normal file
@@ -0,0 +1,53 @@
"""Tests for configuration-loading conventions."""

from pathlib import Path

from app.core.config import BACKEND_ENV_FILE, Settings


def test_default_env_file_is_backend_env():
    """The default env file should be pinned to the absolute path of backend/.env."""

    configured_env_file = Path(Settings.model_config["env_file"])

    assert configured_env_file == BACKEND_ENV_FILE
    assert configured_env_file.is_absolute()
    assert configured_env_file.parent.name == "backend"
    assert configured_env_file.name == ".env"


def test_explicit_env_file_ignores_current_working_directory_dotenv(monkeypatch, tmp_path):
    """An explicit env file must not be polluted by a .env in the current working directory."""

    root_env = tmp_path / ".env"
    root_env.write_text(
        "\n".join(
            [
                "SECRET_KEY=root-env-should-not-be-used",
                "DATABASE_URL=sqlite+aiosqlite:///root-env.db",
                "DEBUG=false",
            ]
        ),
        encoding="utf-8",
    )
    backend_env = tmp_path / "backend.env"
    backend_env.write_text(
        "\n".join(
            [
                "SECRET_KEY=backend-env-secret",
                "DATABASE_URL=sqlite+aiosqlite:///backend-env.db",
                "DEBUG=true",
            ]
        ),
        encoding="utf-8",
    )

    monkeypatch.chdir(tmp_path)
    monkeypatch.delenv("SECRET_KEY", raising=False)
    monkeypatch.delenv("DATABASE_URL", raising=False)

    settings = Settings(_env_file=backend_env)

    assert settings.database_url == "sqlite+aiosqlite:///backend-env.db"
    assert settings.secret_key == "backend-env-secret"
    assert settings.debug is True
|
||||
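The first test above only checks that the default env file is an absolute path ending in `backend/.env`; the diff does not show how `BACKEND_ENV_FILE` is computed. One plausible way, sketched here as an assumption, is to derive it from the location of `backend/app/core/config.py` so that the result does not depend on the current working directory. The helper name `backend_env_file` is hypothetical.

```python
from pathlib import Path


def backend_env_file(config_module: Path) -> Path:
    """Return <backend>/.env given the path of backend/app/core/config.py.

    Assumption: config.py sits three levels below the backend directory
    (backend/app/core/config.py), so parents[2] of the resolved path is backend/.
    """
    return config_module.resolve().parents[2] / ".env"
```

Inside the real module this would typically be called with `Path(__file__)`. Because the path is resolved from the module's own location, running the server from the repository root or from `backend/` yields the same file.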
@@ -299,6 +299,21 @@ class TestProviderPolicy:
|
||||
assert result.transcript_text == "我想听一个小熊找星星的故事"
|
||||
assert result.confidence == 1.0
|
||||
assert result.provider == "demo"
|
||||
|
||||
def test_openai_asr_default_config_uses_openai_env(self):
|
||||
from app.services.provider_router import _get_default_config
|
||||
|
||||
with patch("app.services.provider_router.settings") as mock_settings:
|
||||
mock_settings.openai_api_key = "openai-key"
|
||||
mock_settings.openai_api_base = "https://api.example.com/v1"
|
||||
mock_settings.voice_transcription_model = "gpt-4o-mini-transcribe"
|
||||
|
||||
config = _get_default_config("openai_asr")
|
||||
|
||||
assert config is not None
|
||||
assert config.api_key == "openai-key"
|
||||
assert config.api_base == "https://api.example.com/v1"
|
||||
assert config.model == "gpt-4o-mini-transcribe"
|
||||
|
||||
|
||||
class TestProviderConfigFromDB:
|
||||
|
||||
@@ -1,14 +1,13 @@
|
||||
name: dreamweaver
|
||||
|
||||
x-backend-env: &backend-env
|
||||
DATABASE_URL: postgresql+asyncpg://${POSTGRES_USER:-dreamweaver}:${POSTGRES_PASSWORD:-dreamweaver_password}@db:5432/${POSTGRES_DB:-dreamweaver_db}
|
||||
CELERY_BROKER_URL: redis://redis:6379/0
|
||||
CELERY_RESULT_BACKEND: redis://redis:6379/0
|
||||
REDIS_URL: redis://redis:6379/0
|
||||
|
||||
services:
|
||||
frontend:
|
||||
build: ./frontend
|
||||
build:
|
||||
context: ./frontend
|
||||
args:
|
||||
NODE_BASE_IMAGE: ${NODE_BASE_IMAGE:-node:18-alpine}
|
||||
NGINX_BASE_IMAGE: ${NGINX_BASE_IMAGE:-nginx:alpine}
|
||||
NPM_REGISTRY: ${NPM_REGISTRY:-https://registry.npmjs.org/}
|
||||
image: dreamweaver-frontend:dev
|
||||
container_name: dreamweaver_frontend
|
||||
restart: unless-stopped
|
||||
@@ -19,7 +18,12 @@ services:
|
||||
condition: service_started
|
||||
|
||||
frontend-admin:
|
||||
build: ./admin-frontend
|
||||
build:
|
||||
context: ./admin-frontend
|
||||
args:
|
||||
NODE_BASE_IMAGE: ${NODE_BASE_IMAGE:-node:18-alpine}
|
||||
NGINX_BASE_IMAGE: ${NGINX_BASE_IMAGE:-nginx:alpine}
|
||||
NPM_REGISTRY: ${NPM_REGISTRY:-https://registry.npmjs.org/}
|
||||
image: dreamweaver-admin-frontend:dev
|
||||
container_name: dreamweaver_frontend_admin
|
||||
restart: unless-stopped
|
||||
@@ -30,14 +34,16 @@ services:
|
||||
condition: service_started
|
||||
|
||||
backend:
|
||||
build: ./backend
|
||||
build:
|
||||
context: ./backend
|
||||
args:
|
||||
PYTHON_BASE_IMAGE: ${PYTHON_BASE_IMAGE:-python:3.11-slim}
|
||||
image: dreamweaver-backend:dev
|
||||
container_name: dreamweaver_backend
|
||||
restart: unless-stopped
|
||||
ports:
|
||||
- "52000:8000"
|
||||
env_file: ./backend/.env
|
||||
environment: *backend-env
|
||||
volumes:
|
||||
- backend_static:/app/static
|
||||
depends_on:
|
||||
@@ -54,7 +60,6 @@ services:
|
||||
ports:
|
||||
- "52800:8001"
|
||||
env_file: ./backend/.env
|
||||
environment: *backend-env
|
||||
volumes:
|
||||
- backend_static:/app/static
|
||||
depends_on:
|
||||
@@ -71,7 +76,6 @@ services:
|
||||
restart: unless-stopped
|
||||
command: celery -A app.core.celery_app worker --loglevel=info
|
||||
env_file: ./backend/.env
|
||||
environment: *backend-env
|
||||
depends_on:
|
||||
backend:
|
||||
condition: service_started
|
||||
@@ -86,7 +90,6 @@ services:
|
||||
restart: unless-stopped
|
||||
command: celery -A app.core.celery_app beat --loglevel=info
|
||||
env_file: ./backend/.env
|
||||
environment: *backend-env
|
||||
depends_on:
|
||||
backend:
|
||||
condition: service_started
|
||||
@@ -98,15 +101,15 @@ services:
|
||||
container_name: dreamweaver_db
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
POSTGRES_USER: ${POSTGRES_USER:-dreamweaver}
|
||||
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-dreamweaver_password}
|
||||
POSTGRES_DB: ${POSTGRES_DB:-dreamweaver_db}
|
||||
POSTGRES_USER: dreamweaver
|
||||
POSTGRES_PASSWORD: dreamweaver_password
|
||||
POSTGRES_DB: dreamweaver_db
|
||||
ports:
|
||||
- "52432:5432"
|
||||
volumes:
|
||||
- postgres_data:/var/lib/postgresql/data
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-dreamweaver} -d ${POSTGRES_DB:-dreamweaver_db}"]
|
||||
test: ["CMD-SHELL", "pg_isready -U \"$${POSTGRES_USER}\" -d \"$${POSTGRES_DB}\""]
|
||||
interval: 10s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
|
||||
@@ -2,6 +2,8 @@
|
||||
|
||||
**目标**: 演示前用 5-10 分钟确认本地 Docker 环境、核心生成链路和讲解材料处于可展示状态。
|
||||
|
||||
**当前演示口径(2026-05-06)**: 主生成链路可作为稳定主线展示;语音共创是 Phase A Alpha,可演示回合式共创、文本降级、上传转写、观测指标和保存为 Story。管理端已能看到 ASR 维度运营摘要。外部 Registry 阻塞已通过可配置基础镜像与 npm registry 修复;当前代码 `docker compose up -d --build` 和 `SMOKE_VOICE=1` 均已通过。
|
||||
|
||||
---
|
||||
|
||||
## 1. 演示前准备
|
||||
@@ -53,6 +55,12 @@ SMOKE_AUDIO=1 ./scripts/demo_smoke.sh
|
||||
SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
```
|
||||
|
||||
需要检查真实 OpenAI ASR Key 环境时:
|
||||
|
||||
```bash
|
||||
SMOKE_REAL_ASR=1 ./scripts/demo_smoke.sh
|
||||
```
|
||||
|
||||
需要同时检查 TTS 和语音共创时:
|
||||
|
||||
```bash
|
||||
@@ -81,8 +89,34 @@ SMOKE_AUDIO=1 SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
- [ ] With `SMOKE_VOICE=1`, a voice co-creation session completes the text fallback, the upload turn, analytics, and finalize into a Story
- [ ] With `SMOKE_VOICE=1`, analytics returns input composition, speech duration, Provider distribution, ASR/TTS success rates, and the low-confidence confirmation rate
- [ ] With `SMOKE_VOICE=1`, analytics can be filtered by `provider` and `session_status`
- [ ] With `SMOKE_REAL_ASR=1`, the upload turn returns `transcription_provider=openai_asr` and a non-empty transcript
- [ ] With `SMOKE_REAL_ASR=1`, `/api/voice-sessions/analytics?provider=openai_asr` shows the upload turn
- [ ] Admin Provider analytics with `capability=asr` shows the voice session count, upload turn count, ASR successes/failures, and failure reasons
- [ ] When the real ASR environment fails, the script output includes the upload response, Voice Session events, and Admin ASR failure reasons
- [ ] Validation results are recorded in `docs/planning/demo-validation-log.md`

Minimal environment variables for real ASR:

```env
ASR_PROVIDERS=["openai_asr", "demo"]
OPENAI_API_KEY=sk-...
OPENAI_API_BASE=
VOICE_TRANSCRIPTION_MODE=provider
VOICE_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
VOICE_TRANSCRIPTION_LANGUAGE=zh
```

After editing `backend/.env`, restart backend/worker. If the ASR configuration was changed in the Admin Provider table, first run `curl -u admin:admin -X POST http://localhost:52800/admin/providers/reload`, then restart the API container/process so that the running cache does not keep pointing at the old provider.

Common real-ASR failure signatures:

- `OPENAI_API_KEY 未配置`: the container or the local API did not read the key.
- `HTTP 401/403`: wrong key, insufficient project permissions, or gateway authentication failure.
- `HTTP 429` / `insufficient_quota`: quota exhausted or rate limited.
- `model_not_found`: `VOICE_TRANSCRIPTION_MODEL` is not available to the current key; switch back to `gpt-4o-mini-transcribe` first.
- Network connection failure: check the proxy, DNS, and whether `OPENAI_API_BASE` must include `/v1`.
- Audio format failure: pass `REAL_ASR_AUDIO_FILE=/path/to/sample.m4a` to retest with a short real recording.
|
||||
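The failure signatures listed above lend themselves to a small triage helper when reading smoke output or Admin failure reasons. This is a rough sketch under stated assumptions: `classify_asr_failure` and its category names are hypothetical, and the matching relies only on the message fragments quoted in the list, not on any documented error format.

```python
import re


def classify_asr_failure(message: str) -> str:
    """Map a failure message to a coarse triage category (hypothetical names)."""
    text = message.lower()
    if "openai_api_key" in text and "未配置" in text:
        return "key_not_loaded"
    if re.search(r"\b(401|403)\b", text):
        return "auth"
    if re.search(r"\b429\b", text) or "insufficient_quota" in text:
        return "quota_or_rate_limit"
    if "model_not_found" in text:
        return "model_unavailable"
    if "网络连接失败" in text or "connection" in text:
        return "network"
    # Anything else, including audio format errors, needs a manual look.
    return "unknown_or_audio_format"
```

The checks run in the same order as the list, so a message that mentions both a missing key and an HTTP status is reported as a key-loading problem first.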
|
||||
---
|
||||
|
||||
## 3. 手动演示路径
|
||||
@@ -121,6 +155,7 @@ SMOKE_AUDIO=1 SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
- Provider: 具体供应商配置
|
||||
- Adapter: API 调用实现
|
||||
- Routing Policy: 优先级/成本/延迟/轮询
|
||||
4. 切到“语音识别”能力,说明 Voice Studio 上传转写的 ASR 调用已进入管理端运营摘要,可看语音会话、上传回合、失败原因和成本归因。
|
||||
|
||||
### 路径 D: 语音共创 Alpha
|
||||
|
||||
@@ -136,6 +171,7 @@ SMOKE_AUDIO=1 SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
4. 演示低置信度确认:说明系统会提示“本轮系统理解为”,家长可选择继续、重说或改成文本。
|
||||
5. 点击结束并保存,确认正式 Story 进入故事库。
|
||||
6. 打开生成轨迹,说明语音共创 finalize 后的封面资产补全已经接回统一 generation job。
|
||||
7. 回到 Admin 的语音识别摘要,说明 Alpha 阶段保留 demo fallback,同时为真实 ASR Provider 验收预留运营视图。
|
||||
|
||||
---
|
||||
|
||||
@@ -157,7 +193,7 @@ DreamWeaver 是面向 3-8 岁亲子场景的个性化 AI 绘本与陪伴式讲
|
||||
|
||||
### 2:20 - 3:00 取舍与下一步
|
||||
|
||||
求职版优先稳定闭环和可解释性,不做支付、多租户和复杂监控。现在 job/event 已能查询 workflow、资产补全、provider 调用轨迹和聚合指标,统一生成已迁移到后台 worker,取消/重试队列也已打通;用户端可看跨故事运营摘要,管理端可看当前环境跨用户 Provider dashboard。下一步应补跨环境汇聚、断点续跑和更完整监控。
|
||||
求职版优先稳定闭环和可解释性,不做支付、多租户和复杂监控。现在 job/event 已能查询 workflow、资产补全、provider 调用轨迹和聚合指标,统一生成已迁移到后台 worker,取消/重试队列也已打通;Voice Studio 已进入 Phase A Alpha,可演示回合式共创和保存为 Story;用户端可看跨故事运营摘要,管理端可看当前环境跨用户 Provider dashboard 和 ASR 摘要。下一步应补真实 ASR Key 环境验收、跨环境 Provider 汇聚、断点续跑和更完整监控。
|
||||
|
||||
---
|
||||
|
||||
@@ -170,6 +206,7 @@ DreamWeaver 是面向 3-8 岁亲子场景的个性化 AI 绘本与陪伴式讲
|
||||
| 图片 provider 失败 | 展示 degraded completed 与 retry 机制 |
|
||||
| 录音或 ASR 不稳定 | 切到文本 fallback,说明 Alpha 阶段已保留降级路径 |
|
||||
| 语音共创低置信度卡住 | 使用“按这个理解继续”或“改成文本输入”完成本轮 |
|
||||
| Docker Hub 拉取超时 | 当前 Dockerfile/Compose 支持基础镜像覆盖;本机 `.env` 已配置代理源,可直接 `docker compose up -d --build` |
|
||||
| Docker 冷启动慢 | 演示前提前运行 smoke 脚本并保持容器运行 |
|
||||
| Admin 页面不适合主展示 | 只用 Provider 分层说明辅助讲系统设计 |
|
||||
| 面试官追问生产部署 | 明确当前是求职版 MVP,本轮重点是产品闭环和系统边界 |
|
||||
|
||||
@@ -17,16 +17,63 @@ docker compose up -d --build
|
||||
./scripts/demo_smoke.sh
|
||||
```
|
||||
|
||||
需要验证语音链路时:
|
||||
需要验证故事 TTS 音频时:
|
||||
|
||||
```bash
|
||||
SMOKE_AUDIO=1 ./scripts/demo_smoke.sh
|
||||
```
|
||||
|
||||
需要验证 Voice Studio Alpha 时:
|
||||
|
||||
```bash
|
||||
SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
```
|
||||
|
||||
需要验证真实 OpenAI ASR Key 环境时:
|
||||
|
||||
```bash
|
||||
SMOKE_REAL_ASR=1 ./scripts/demo_smoke.sh
|
||||
```
|
||||
|
||||
`SMOKE_REAL_ASR=1` automatically includes the Voice Studio Alpha smoke. In the Docker environment, first confirm the following in `backend/.env`:
|
||||
|
||||
```env
|
||||
ASR_PROVIDERS=["openai_asr", "demo"]
|
||||
OPENAI_API_KEY=sk-...
|
||||
OPENAI_API_BASE=
|
||||
VOICE_TRANSCRIPTION_MODE=provider
|
||||
VOICE_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
|
||||
VOICE_TRANSCRIPTION_LANGUAGE=zh
|
||||
```
|
||||
|
||||
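After changing the environment variables, restart backend/worker. If ASR was configured through the Admin Provider table, first run `curl -u admin:admin -X POST http://localhost:52800/admin/providers/reload`, then restart the API container/process. On macOS the script generates a short audio clip automatically with `say`/`afconvert`; in other environments, pass `REAL_ASR_AUDIO_FILE=/path/to/sample.m4a`.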
|
||||
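The `ASR_PROVIDERS=["openai_asr", "demo"]` value above is a JSON list whose order suggests a preference order with `demo` as the fallback. The sketch below illustrates that reading. It is an assumption, not the Provider Router's actual behavior: `parse_providers`, `transcribe_with_fallback`, and the adapter callables are all hypothetical.

```python
import json


def parse_providers(raw, default=("demo",)):
    """Parse a JSON list such as '["openai_asr", "demo"]'; fall back to the default."""
    try:
        value = json.loads(raw)
    except (TypeError, ValueError):
        return list(default)
    if not isinstance(value, list) or not value:
        return list(default)
    return [str(item) for item in value]


def transcribe_with_fallback(providers, adapters, audio):
    """Try each provider in order and return (provider_name, transcript).

    `adapters` maps a provider name to a callable taking audio bytes; these
    callables are placeholders for real adapter objects.
    """
    errors = []
    for name in providers:
        adapter = adapters.get(name)
        if adapter is None:
            errors.append(f"{name}: not registered")
            continue
        try:
            return name, adapter(audio)
        except Exception as exc:  # keep the reason so failures stay explainable
            errors.append(f"{name}: {exc}")
    raise RuntimeError("; ".join(errors))
```

Returning the provider name alongside the transcript is what would let a check such as `transcription_provider=openai_asr` distinguish a real transcription from a silent fallback to `demo`.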
|
||||
当 Docker Hub 网络暂时不可用时,当前 Docker 构建支持通过根 `.env` 覆盖基础镜像与 npm registry。当前机器已配置:
|
||||
|
||||
```bash
|
||||
PYTHON_BASE_IMAGE=docker.m.daocloud.io/library/python:3.11-slim
|
||||
NODE_BASE_IMAGE=docker.1ms.run/library/node:18-alpine
|
||||
NGINX_BASE_IMAGE=docker.m.daocloud.io/library/nginx:alpine
|
||||
NPM_REGISTRY=https://registry.npmmirror.com
|
||||
```
|
||||
|
||||
如果需要绕过 Docker、直接验证当前源码,也可以本机启动当前源码 API/admin/worker,并覆盖登录回跳地址后运行:
|
||||
|
||||
```bash
|
||||
APP_URL=http://localhost:53000 \
|
||||
BACKEND_URL=http://localhost:53000 \
|
||||
ADMIN_BACKEND_URL=http://localhost:53800 \
|
||||
DEV_SIGNIN_URL='http://localhost:53000/auth/dev/signin?next=http://localhost:53000/auth/session' \
|
||||
SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
```
|
||||
|
||||
当前注意:2026-05-06 外部 Registry 阻塞已修复;当前代码 `docker compose up -d --build` 已通过,重建后 `SMOKE_VOICE=1` 也已通过。
|
||||
|
||||
演示入口:
|
||||
|
||||
- 用户端:`http://localhost:52080`
|
||||
- 本地登录:`http://localhost:52080/auth/dev/signin`
|
||||
- 语音共创:`http://localhost:52080/voice-studio`
|
||||
- 管理端:`http://localhost:52888`
|
||||
- 后端健康:`http://localhost:52000/health`
|
||||
|
||||
@@ -41,7 +88,9 @@ SMOKE_AUDIO=1 ./scripts/demo_smoke.sh
|
||||
5. 创建绘本,进入绘本阅读器。
|
||||
6. 刷新页面或重新进入绘本,说明按 ID 恢复和阅读位置恢复。
|
||||
7. 回到故事库,展示跨故事 Provider 运营摘要。
|
||||
8. 打开孩子时间线,展示阅读事件和记忆沉淀。
|
||||
8. 进入 Voice Studio,演示文本 fallback / 上传语音 / 保存为 Story,说明它是 Phase A Alpha。
|
||||
9. 打开管理端 Provider 摘要,切到“语音识别”,展示 ASR 调用、失败原因和语音会话/上传回合。
|
||||
10. 打开孩子时间线,展示阅读事件和记忆沉淀。
|
||||
|
||||
---
|
||||
|
||||
@@ -51,7 +100,8 @@ SMOKE_AUDIO=1 ./scripts/demo_smoke.sh
|
||||
- **AI 不确定性处理**:主内容和资产拆开,图片/音频失败不阻塞阅读。
|
||||
- **Provider 产品化**:用户看到稳定能力,系统内部用 Capability / Provider / Adapter / Routing Policy 管供应链。
|
||||
- **可观测性**:generation job/event 让生成过程、失败恢复和 Provider 成本可解释。
|
||||
- **可继续生产化**:统一生成已迁移到 worker,前端轮询、任务事件模型、取消/重试队列和管理台当前环境 dashboard 也已打通,下一步是补跨环境汇聚、断点续跑和更完整监控。
|
||||
- **语音共创边界**:Voice Studio 是 Phase A Alpha,验证回合式共创、文本降级、上传转写、TTS 回复和保存为 Story,不夸大成实时语音最终形态。
|
||||
- **可继续生产化**:统一生成已迁移到 worker,前端轮询、任务事件模型、取消/重试队列、管理台当前环境 dashboard 和 ASR 摘要已打通;下一步是真实 ASR 环境验收、跨环境汇聚、断点续跑和更完整监控。
|
||||
|
||||
---
|
||||
|
||||
@@ -61,6 +111,9 @@ SMOKE_AUDIO=1 ./scripts/demo_smoke.sh
|
||||
| --- | --- |
|
||||
| TTS 网络失败 | 说明音频是可恢复资产,展示缓存状态或跳过语音 |
|
||||
| 图片生成失败 | 展示 `degraded_completed` 与资源重试 |
|
||||
| 录音或 ASR 不稳定 | 切到文本 fallback,说明 Alpha 已保留降级路径 |
|
||||
| 真实 ASR Key 验收失败 | 看 smoke 输出的上传响应、Voice Session 事件和 Admin ASR failure reasons;优先排查 key 未加载、401/403、429/额度、model_not_found、`OPENAI_API_BASE` 和音频格式 |
|
||||
| Docker Hub 拉取超时 | 使用根 `.env` 的基础镜像覆盖与 npm registry 覆盖,直接重建当前 Docker 栈 |
|
||||
| Docker 冷启动慢 | 演示前先跑 smoke 并保持容器运行 |
|
||||
| Provider 追问过深 | 回到 Capability / Provider / Adapter / Routing Policy 四层解释 |
|
||||
| 生产化追问 | 说明下一步是跨环境 Provider 汇聚、断点续跑、监控告警和密钥治理 |
|
||||
|
||||
@@ -2,6 +2,152 @@
|
||||
|
||||
这份记录用于演示前快速说明“当前本地 Docker 环境已经验证到什么程度”。新的验证记录按时间倒序追加。
|
||||
|
||||
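## 2026-06-01 Acceptance entry point for real ASR key environments

- Reviewed the current `openai_asr` wiring: the ASR capability is registered in the Provider policy, and `ASR_PROVIDERS` still defaults to `["demo"]`; real transcription goes through the `openai_asr` adapter, the Provider Router, and the Voice Session upload turn.
- Added `OPENAI_API_BASE` to settings and to the `openai_asr` default config. Leave it empty for the official OpenAI API, or point it at a compatible gateway's `/v1` URL.
- `openai_asr` failure messages no longer collapse into a generic "service temporarily unavailable". They now keep the HTTP status, connection error, or exception summary, and redact `Bearer` / `sk-` tokens, so key, quota, model, gateway, and audio-format problems can be told apart.
- `scripts/demo_smoke.sh` gains an optional `SMOKE_REAL_ASR=1`. The switch automatically enables `SMOKE_VOICE=1`, uploads real audio, and asserts that `transcription_provider=openai_asr`, that the transcript is non-empty, that user-side analytics can be filtered by `provider=openai_asr`, and that Admin ASR analytics shows `openai_asr`.
- The default smoke, `SMOKE_AUDIO=1`, and `SMOKE_VOICE=1` behave as before; the real ASR path calls the external OpenAI API only when explicitly enabled.
- Real ASR audio source: on macOS, a short m4a is generated by default with `say` + `afconvert`; in other environments, pass `REAL_ASR_AUDIO_FILE=/path/to/sample.m4a`.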
|
||||
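The token redaction mentioned in the notes above can be illustrated with two regular expressions. This is a minimal sketch, assuming that secrets appear either after a `Bearer` scheme or as `sk-` prefixed keys; the function name `redact_secrets` and the `[REDACTED]` marker are hypothetical, and the adapter's real patterns are not shown in the diff.

```python
import re

# "Bearer <token>" as it would appear in an echoed Authorization header.
_BEARER_RE = re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE)
# Bare API keys that start with "sk-".
_SK_RE = re.compile(r"\bsk-[A-Za-z0-9_-]+")


def redact_secrets(message: str) -> str:
    """Replace bearer tokens and sk- keys so error text is safe to log or store."""
    message = _BEARER_RE.sub("Bearer [REDACTED]", message)
    return _SK_RE.sub("[REDACTED]", message)
```

Redacting before the message is persisted keeps the HTTP status and error summary available for triage while ensuring a key echoed back by a gateway never reaches logs or analytics.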
|
||||
Minimal `.env` for real ASR:

```env
ASR_PROVIDERS=["openai_asr", "demo"]
OPENAI_API_KEY=sk-...
OPENAI_API_BASE=
VOICE_TRANSCRIPTION_MODE=provider
VOICE_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
VOICE_TRANSCRIPTION_LANGUAGE=zh
```

Verification commands:

```bash
docker compose up -d --build
docker compose restart backend backend-admin worker celery-beat
SMOKE_REAL_ASR=1 ./scripts/demo_smoke.sh
curl -fsS -u admin:admin 'http://localhost:52800/admin/providers/analytics?days=7&capability=asr'
```

If the ASR configuration was changed through the Admin Provider table, refresh the provider cache first and then restart the API processes:

```bash
curl -fsS -u admin:admin -X POST 'http://localhost:52800/admin/providers/reload'
docker compose restart backend worker
```

Failure triage:

- `OPENAI_API_KEY 未配置`: the container or the local API did not read the key; start with `docker compose exec backend env | rg 'ASR_PROVIDERS|OPENAI|VOICE_TRANSCRIPTION'`.
- `HTTP 401/403`: wrong key, insufficient project permissions, or authentication failure at a compatible gateway.
- `HTTP 429` / `insufficient_quota`: quota exhausted or rate limit hit.
- `model_not_found`: `VOICE_TRANSCRIPTION_MODEL` is not available to the current key; switch back to `gpt-4o-mini-transcribe` first.
- `OpenAI ASR 网络连接失败`: check the proxy, DNS, the gateway address, and whether `OPENAI_API_BASE` needs `/v1`.
- Audio format error or empty transcript: retest with a short real recording via `REAL_ASR_AUDIO_FILE=/path/to/sample.m4a`.
|
||||
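Several of the triage items above hinge on whether `OPENAI_API_BASE` is empty or carries a `/v1` suffix. The sketch below shows one way the transcription endpoint could be derived from that setting. It assumes the base URL already includes `/v1` when set, which the diff does not confirm for the real adapter; `transcription_url` is a hypothetical helper. The default base and the `/audio/transcriptions` path are those of the public OpenAI API.

```python
from __future__ import annotations

DEFAULT_OPENAI_BASE = "https://api.openai.com/v1"


def transcription_url(api_base: str | None) -> str:
    """Build the transcription endpoint from an optional base URL.

    An empty or missing base falls back to the official API; a trailing slash
    on a gateway URL is trimmed so the path is not doubled.
    """
    base = (api_base or "").strip().rstrip("/") or DEFAULT_OPENAI_BASE
    return f"{base}/audio/transcriptions"
```

Under this assumption, a gateway configured as `https://api.example.com` without `/v1` would produce a path the gateway may not serve, which is one concrete way the "does `OPENAI_API_BASE` need `/v1`" failure can arise.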
|
||||
Local verification this round:

- `bash -n scripts/demo_smoke.sh` passes.
- `backend/.venv/bin/python -m pytest backend/tests/test_provider_router.py -q` passes, 13 passed.
- `backend/.venv/bin/python -m ruff check backend/app/core/config.py backend/app/services/provider_router.py backend/app/services/adapters/asr/openai.py backend/tests/test_provider_router.py` passes.
- `git diff --check -- ...` on the files touched this round passes.
- A full `git diff --check` still reports trailing whitespace in the pre-existing, untouched files `backend/app/services/adapters/__init__.py` and `backend/app/services/adapters/tts/minimax.py`; following the "only change what blocks acceptance" rule, this was not cleaned up in this round.
- `SMOKE_REAL_ASR=1` was not run in the current environment because a real `OPENAI_API_KEY` must not be written to the repository; the path is in place as the acceptance entry point for environments that have a key.
|
||||
|
||||
## 2026-05-06 外部 Registry 阻塞修复与重建回归
|
||||
|
||||
- 根因分析:
|
||||
- Docker Hub 失败不是项目 Dockerfile 问题,而是当前网络到 `registry-1.docker.io` / `auth.docker.io` 的 TLS 链路不稳定;`auth.docker.io` token 请求在宿主机 `curl` 下也会 SSL timeout。
|
||||
- 绕开 Docker Hub 后,管理端前端构建又暴露第二层外部依赖问题:容器内访问 `registry.npmjs.org` 触发 `EIDLETIMEOUT`。
|
||||
- 修复方式:
|
||||
- `backend/Dockerfile`、`frontend/Dockerfile`、`admin-frontend/Dockerfile` 改为支持可覆盖基础镜像。
|
||||
- `docker-compose.yml` 新增 `PYTHON_BASE_IMAGE`、`NODE_BASE_IMAGE`、`NGINX_BASE_IMAGE`、`NPM_REGISTRY` build args,默认仍使用官方 Docker Hub / npmjs,不影响其他环境。
|
||||
- 本机 git-ignored 根 `.env` 写入代理源:`docker.m.daocloud.io`、`docker.1ms.run`、`registry.npmmirror.com`。
|
||||
- 两个前端 Dockerfile 从 `npm install` 改为 `npm ci --no-audit --no-fund`,用 lockfile 提高构建确定性。
|
||||
- `docker compose up -d --build` 已用当前代码完整重建 backend、frontend、frontend-admin 镜像并重建容器。
|
||||
- 重建后 `SMOKE_VOICE=1 ./scripts/demo_smoke.sh` 通过,生成本轮故事 ID `56/57/58`。
|
||||
- 重建后管理端 ASR analytics 验证通过:`capability=asr` 返回 `total_calls=3`、`voice_session_count=3`、`voice_turn_count=3`,并按 `demo` Provider 与 `github:dev_user_001` 聚合。
|
||||
- Docker 栈当前服务全部运行,backend、backend-admin、worker、celery-beat、frontend、frontend-admin 均为重建后容器。
|
||||
- 语音共创 PRD #48 已完成;#47/#48/#49/#50 本批 Alpha 演示质量任务收束。
|
||||
|
||||
验证命令:
|
||||
|
||||
```bash
|
||||
curl -Iv --connect-timeout 15 https://registry-1.docker.io/v2/
|
||||
curl -Iv --connect-timeout 15 'https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/python:pull'
|
||||
docker compose config | rg -n "PYTHON_BASE_IMAGE|NODE_BASE_IMAGE|NGINX_BASE_IMAGE|NPM_REGISTRY"
|
||||
docker compose build backend frontend frontend-admin
|
||||
docker compose up -d --build
|
||||
SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
curl -fsS -u admin:admin 'http://localhost:52800/admin/providers/analytics?days=7&capability=asr'
|
||||
docker compose ps
|
||||
```
|
||||
|
||||
结果:
|
||||
|
||||
- Docker Hub 官方链路仍可不稳定,但当前项目构建不再直接依赖它的 auth 链路。
|
||||
- `docker compose up -d --build` 通过。
|
||||
- `SMOKE_VOICE=1 ./scripts/demo_smoke.sh` 通过。
|
||||
- Admin ASR analytics 手动验证通过。
|
||||
|
||||
## 2026-05-06 拉取后 ASR 管理端摘要补齐
|
||||
|
||||
- 已拉取远端 `main` 到 `0ccfd00 chore: update frontend tooling and Chinese copy`。
|
||||
- 管理端 Provider analytics 已补齐 ASR 维度:`/admin/providers/analytics?capability=asr` 会聚合 Voice Session 上传转写成功、转写失败、失败原因、ASR 成本、跨用户分布、语音会话数和上传回合数。
|
||||
- 管理端前端在语音识别筛选下将摘要卡片切换为“语音会话 / 上传回合”,避免沿用 generation job 口径。
|
||||
- 后端开发登录重定向测试已显式打开 debug,避免依赖外部环境变量导致全量测试不稳定。
|
||||
- Docker 镜像重建两次被 Docker Hub TLS handshake timeout 阻塞,失败点在 `python:3.11-slim`、`node:18-alpine`、`nginx:alpine` 元数据解析;本轮未能用当前代码重建容器。
|
||||
- 当前已启动 Docker 栈首次 `SMOKE_VOICE=1` 在登录阶段返回 502,定位为前端 Nginx 解析到旧 backend 容器 IP;重启 `frontend` 后代理恢复。
|
||||
- 当前已启动 Docker 栈下 `SMOKE_VOICE=1 ./scripts/demo_smoke.sh` 通过,覆盖故事生成、Voice Session 文本 fallback、上传回合 demo transcript hint、语音 analytics、finalize 保存 Story、绘本生成与图片补全。
|
||||
- `scripts/demo_smoke.sh` 新增 `DEV_SIGNIN_URL` 覆盖项,支持直接打本机源码 API 时把 dev 登录回跳到 `/auth/session`,避免没有 SPA 页面导致误报。
|
||||
- 当前源码本机 API/admin/worker 连接 Docker Postgres/Redis 后,`SMOKE_VOICE=1` 通过,生成本轮故事 ID `53/54/55`。
|
||||
- 本机源码 admin ASR analytics 手动验证通过:`capability=asr` 返回 `total_calls=2`、`voice_session_count=2`、`voice_turn_count=2`,并按 `demo` Provider 与 `github:dev_user_001` 聚合。
|
||||
- 技术方案已新增服务复杂度自审,列出 `voice_session_service.py`、`generation_jobs.py`、ASR service 和 Voice Studio 的拆分候选与风险信号。
|
||||
- 已按服务复杂度自审开始拆分:管理端跨用户 Provider/ASR 摘要迁移到 `backend/app/services/admin_provider_analytics.py`,`generation_jobs.py` 回到生成任务与用户侧 provider stats 边界。
|
||||
- 演示 checklist、demo package、3 分钟 pitch、PRD 和技术方案已完成口径复核:统一说明 Voice Studio 是 Phase A Alpha,ASR 摘要已进入管理端,当前源码 smoke 已通过。当时 #48 仍待当前代码镜像重建后的 Docker voice smoke。
|
||||
- 后续同日已通过 Registry 绕行修复完成 #48,见上方“外部 Registry 阻塞修复与重建回归”记录。
|
||||
|
||||
验证命令:
|
||||
|
||||
```bash
|
||||
docker compose up -d --build backend backend-admin worker celery-beat frontend-admin
|
||||
docker compose build backend frontend-admin
|
||||
DOCKER_BUILDKIT=0 docker compose build backend
|
||||
docker manifest inspect python:3.11-slim
|
||||
docker compose restart frontend
|
||||
SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
APP_URL=http://localhost:53000 BACKEND_URL=http://localhost:53000 ADMIN_BACKEND_URL=http://localhost:53800 DEV_SIGNIN_URL='http://localhost:53000/auth/dev/signin?next=http://localhost:53000/auth/session' SMOKE_VOICE=1 ./scripts/demo_smoke.sh
|
||||
curl -fsS -u admin:admin 'http://localhost:53800/admin/providers/analytics?days=7&capability=asr'
|
||||
backend/.venv/bin/python -m pytest backend/tests/test_admin_providers.py -q
|
||||
backend/.venv/bin/python -m pytest backend/tests -q
|
||||
backend/.venv/bin/python -m pytest backend/tests/test_auth.py backend/tests/test_admin_providers.py -q
|
||||
backend/.venv/bin/python -m ruff check backend/app/services/generation_jobs.py backend/app/services/admin_provider_analytics.py backend/app/api/admin_providers.py backend/tests/test_admin_providers.py
|
||||
backend/.venv/bin/python -m ruff check backend/app backend/tests
|
||||
cd frontend && npm run build
|
||||
cd admin-frontend && npm run build
|
||||
git diff --check
|
||||
```
|
||||
|
||||
结果:
|
||||
|
||||
- Docker build 未完成,原因是 Docker Hub TLS handshake timeout;legacy builder 同样卡在 `FROM python:3.11-slim`,已手动终止。
|
||||
- `docker manifest inspect python:3.11-slim` 同样因 Docker Hub auth token 请求 TLS handshake timeout 失败,说明当前阻塞在 registry 访问而不是项目 Dockerfile。
|
||||
- `docker compose restart frontend` 后 `/auth/dev/signin` 经前端代理恢复 302。
|
||||
- 当前已启动 Docker 栈 `SMOKE_VOICE=1 ./scripts/demo_smoke.sh` 通过;本结果只能证明运行中栈健康,不能替代当前代码重建后的 Docker smoke。
|
||||
- 当前源码本机 API/admin/worker 下 `SMOKE_VOICE=1 ./scripts/demo_smoke.sh` 通过;当时这验证了当前代码路径,但仍不能替代镜像重建验证。后续同日已完成镜像重建验证,见上方记录。
|
||||
- 本机源码 admin ASR analytics 返回 `voice_session_count=2`、`voice_turn_count=2`,确认管理端 ASR 运营摘要字段可用。
|
||||
- 本地 demo 数据卷由历史 `create_all` 路径创建过 Voice Session 表,直接运行 `alembic upgrade head` 会因 `voice_sessions` 已存在而失败;本轮未修改数据卷版本号,后续可在演示库层面单独处理 stamp 或迁移策略。
|
||||
- `backend/tests/test_admin_providers.py` 通过,3 passed。
|
||||
- `backend/tests/test_auth.py backend/tests/test_admin_providers.py` 通过,12 passed。
|
||||
- 后端全量测试通过,119 passed。
|
||||
- 后端相关文件 ruff 检查通过;全量 `backend/app backend/tests` ruff 检查也通过。
|
||||
- 用户端 `vue-tsc && vite build` 通过。
|
||||
- 管理端 `vue-tsc && vite build` 通过。
|
||||
- `git diff --check` 通过。
|
||||
- 用户端构建仍提示 Browserslist 数据偏旧;管理端构建仍提示 `baseline-browser-mapping` 与 Browserslist 数据偏旧。本轮未处理前端依赖刷新。
|
||||
|
||||
## 2026-04-28 拉取后回归与 Voice Studio 文案收敛
|
||||
|
||||
- 已拉取远端 `main` 到 `55ca098 Add voice analytics filters and metrics` 后完成本地回归。
|
||||
|
||||
@@ -48,20 +48,22 @@ AI 生成产品最大的问题不是“能不能调模型”,而是结果不
|
||||
|
||||
我把它拆成四个概念:
|
||||
|
||||
- Capability:产品需要的 AI 能力,例如文本、图片、语音、绘本结构
|
||||
- Capability:产品需要的 AI 能力,例如文本、图片、语音合成、语音识别、绘本结构
|
||||
- Provider:某个能力下的供应商配置,例如 Gemini、OpenAI、CQTAI、MiniMax
|
||||
- Adapter:具体 API 调用实现
|
||||
- Routing Policy:如何按优先级、成本、延迟或轮询选择 Provider
|
||||
|
||||
这样用户看到的是稳定的产品能力,系统内部再决定具体调用哪个模型或供应商。
|
||||
|
||||
语音共创 Alpha 也沿用这套分层:孩子可以通过 Voice Studio 用文本降级或上传语音参与故事,系统把 ASR、对话生成和 TTS 都当成可观测能力,而不是写死在页面里。
|
||||
|
||||
---
|
||||
|
||||
## 2:35 - 3:00 当前成果和下一步
|
||||
|
||||
目前本地 Docker 可以跑通完整链路,并且有 smoke 脚本验证健康检查、登录、生成、资产重试、故事列表和 Provider 能力分层。
|
||||
目前本地 Docker 运行栈可以跑通完整链路,并且有 smoke 脚本验证健康检查、登录、生成、资产重试、故事列表、Provider 能力分层和 Voice Studio Alpha。之前镜像重建被 Docker Hub / npm registry 链路卡住,我把基础镜像和 npm registry 做成可配置后,当前代码已经完成 `docker compose up -d --build` 和重建后 voice smoke。
|
||||
|
||||
现在 generation job 已经能查询完整事件流,包括 workflow、资产补全和 provider 调用;用户端和管理端都能展示生成轨迹,也能看到 provider 成功率、耗时和成本视角。
|
||||
现在 generation job 已经能查询完整事件流,包括 workflow、资产补全和 provider 调用;用户端和管理端都能展示生成轨迹,也能看到 provider 成功率、耗时和成本视角。Voice Studio 仍定位为 Phase A Alpha:它验证回合式语音共创、文本 fallback、低置信度确认、TTS 回复和保存为正式 Story,不把它包装成实时语音最终形态。
|
||||
|
||||
我希望通过这个项目展示的是:我不只是会接 AI API,而是能把不确定的模型能力收敛成稳定、可解释、可恢复的产品体验。
|
||||
|
||||
@@ -81,6 +83,10 @@ AI 生成产品最大的问题不是“能不能调模型”,而是结果不
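
It means users do not need to understand the model supply chain and only experience stable capabilities, while the product owner can control cost, failure fallback, and vendor switching.

### How far along is voice co-creation?

It is a Phase A Alpha. It can already demo session creation, text fallback, uploaded-speech transcription, the system continuing the story, low-confidence confirmation, TTS replies, session recovery, and finalize saving to the story library. It currently does not support real-time interruption or full-duplex conversation; the next step is acceptance testing in an environment with a real ASR key.

### How does this project go live next?

I have already moved the current lightweight job/event model onto a background worker and connected frontend progress polling, the cancel/retry queue, and the admin console's current-environment operations view; next I will add cross-environment Provider aggregation, resumable runs, and fuller monitoring. Before production launch, real user authentication configuration, secret management, and a deployment strategy still need to be added.
I have already moved the current lightweight job/event model onto a background worker and connected frontend progress polling, the cancel/retry queue, the admin console's current-environment operations view, and the ASR summary; next I will add real ASR environment acceptance, cross-environment Provider aggregation, resumable runs, and fuller monitoring. Before production launch, real user authentication configuration, secret management, and a deployment strategy still need to be added.
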
@@ -13,7 +13,7 @@ DreamWeaver 当前已经具备“输入主题 -> 生成故事/绘本 -> 补全
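
The value of this direction is not in adding one more input method, but in moving DreamWeaver from “generating results” to a “companion-style creative process”. Children do not have to spell out their requirements and then wait for a result; they can say which characters, plot points, and changes they want, as if talking with a storyteller, and the system picks up those expressions in real time or near real time and keeps telling the story.

This incremental PRD was originally written to define voice co-creation as an independent product track that can be evaluated and delivered in phases. As of the 2026-04-24 update, remote `main` has already run Phase A Alpha end to end ahead of schedule: a standalone Voice Studio, voice/text turns, low-confidence confirmation, safety rewriting, TTS replies, session recovery, finalize saving as a Story, and asset completion and trace wired back into the unified generation job. The next step should not be to expand into Phase B real-time work, but to first complete Alpha acceptance, real ASR Provider integration, and the cost/observability gaps, and then return to the main track of cross-environment Provider aggregation, monitoring and alerting, and resumable runs.
This incremental PRD was originally written to define voice co-creation as an independent product track that can be evaluated and delivered in phases. As of the 2026-05-06 update, remote `main` runs Phase A Alpha end to end: a standalone Voice Studio, voice/text turns, low-confidence confirmation, safety rewriting, TTS replies, session recovery, finalize saving as a Story, and asset completion and trace wired back into the unified generation job. ASR is now part of the Provider capabilities and the admin operations summary, and the Docker voice smoke has passed after rebuilding images from the current code; the real-key environment still needs validation. The next step should not be to expand into Phase B real-time work, but to finish real ASR environment acceptance first and then return to the main track of cross-environment Provider aggregation, monitoring and alerting, and resumable runs.

---
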
@@ -31,8 +31,8 @@ DreamWeaver 当前已经具备“输入主题 -> 生成故事/绘本 -> 补全
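
### Proposed Sequencing

1. Close out Phase A Alpha first: regression verification, demo checklist, acceptance matrix, and a record of known limitations.
2. Add a real ASR Provider, turn-level cost/metric attribution, a Voice Studio smoke path, and failure-fallback acceptance.
1. Close out Phase A Alpha first: regression verification, demo checklist, acceptance matrix, service complexity self-review, and a record of known limitations.
2. Complete real ASR key environment acceptance and turn-level dialogue/TTS cost attribution.
3. Return to the production-readiness track: cross-environment Provider aggregation, monitoring and alerting, resumable runs, and finer-grained task control.
4. Once Phase A is stable and has validated product value, evaluate Phase B near-real-time co-creation.
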
@@ -386,7 +386,7 @@ DreamWeaver 的语音共创模式应当成为一种“孩子可以开口参与
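
#### 3. Add ASR / Dialogue Orchestrator capabilities

The current system already has the `text` / `image` / `tts` / `storybook` capabilities, but **has no input-side speech recognition capability**. At minimum, the following will need to be added:
The initial system already had the `text` / `image` / `tts` / `storybook` capabilities, but at that time **had no input-side speech recognition capability**. Phase A Alpha has added the `asr` capability, a demo fallback, and the `openai_asr` adapter; the real-key environment still needs acceptance. The capability layer still includes at least:

- An `asr` or `speech_input` capability
- A session-level story planner / dialogue orchestrator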
@@ -434,15 +434,16 @@ DreamWeaver 的语音共创模式应当成为一种“孩子可以开口参与
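
## Key Gaps vs Current Architecture

The current architecture **can support the voice co-creation direction**, but cannot implement it painlessly as is. The main gaps are:
The initial architecture **could support the voice co-creation direction**, but could not implement it painlessly as is. Of the gaps below, Phase A Alpha has closed the main path, and the remaining focus is production-readiness acceptance:

1. **No voice input capability layer**
   There is only TTS today, with no ASR / STT.
1. **Voice input capability layer**
   The `asr` Provider capability, demo fallback, and `openai_asr` have been added; acceptance is still needed for a real-key environment, latency samples, and more failure reasons.

2. **No session-state story model**
   The flow today is closer to “submit a task -> wait for the result” and lacks a continuous co-creation session.
2. **Session-state story model**
   Voice Session/Turn/Event have been added; service boundaries still need to be split further to reduce turn orchestration complexity.

3. **No plot-correction semantics**
3. **Plot-correction semantics**
   Basic start / continue / correct is supported; coverage should be improved with more samples of real children's expressions.
   Retry and cancel currently target the job, not “the story being rewritten midway”.

4. **No low-latency pipeline design**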
@@ -513,7 +514,7 @@ DreamWeaver 的语音共创模式应当成为一种“孩子可以开口参与
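| FR-008 Branching plots | Deferred | The current state model does not block future extension, but the branching experience is not implemented | Keep at P2; not in Phase A |
| NFR-001 Acceptable response time | Needs Measurement | The turn-based experience is implemented, but no p95 metrics are collected yet | Add latency instrumentation for ASR/TTS/turn orchestration |
| NFR-002 Child content safety | Alpha Done | Added safety checks on user transcripts, soft rewriting of assistant output, and `safety_flags` events | Expand safety samples and false-positive regression |
| NFR-003 Cost observability | Partial | generation job/provider analytics already cover asset completion; voice turn-level ASR/TTS cost still needs refinement | Write ASR/Dialogue/TTS cost into turn/event metadata |
| NFR-003 Cost observability | Partial | generation job/provider analytics already cover asset completion; ASR is now in the admin Provider summary; voice turn-level Dialogue/TTS cost still needs refinement | Write Dialogue/TTS cost into turn/event metadata |
| NFR-004 Session recoverability | Alpha Done | Voice Studio supports restoring the most recent session and querying the active session | Add manual acceptance records for refresh/page switching |
| NFR-005 Pluggable architecture | Alpha Done | ASR is part of the `asr` Provider capability, with demo fallback by default and `openai_asr` configurable | Add more ASR providers and improve the admin experience later |
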
@@ -699,4 +700,27 @@ DreamWeaver 的语音共创模式应当成为一种“孩子可以开口参与
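- Extended the Voice Studio observability card: it supports filtering by transcription source and session status, which makes it easier to explain the demo/fallback/real ASR differences during a demo.
- Extended `SMOKE_VOICE=1`: added provider/status filter assertions so analytics is not verified on the unfiltered path only.

Still outstanding: #47 ASR Provider admin summary, #48 Docker voice smoke regression, #49 service complexity split, #50 final review of demo messaging.
Still outstanding at the time: #47 ASR Provider admin summary, #48 Docker voice smoke regression, #49 service complexity split, #50 final review of demo messaging. #47/#48/#49/#50 were completed on 2026-05-06.

## Phase A Alpha Execution Update (2026-05-06)

This round pulled remote `main` up to `0ccfd00 chore: update frontend tooling and Chinese copy` and continued closing out Alpha operational explainability:

- Completed #47: the admin Provider operations summary now includes ASR successes/failures from Voice Session uploaded-audio transcription in the `capability=asr` aggregation.
- The admin summary adds `voice_session_count` and `voice_turn_count`, so the speech recognition filter directly shows the number of voice sessions and uploaded turns.
- The ASR summary aggregates successful calls by transcription source, aggregates error reasons by failure event, and counts ASR cost records toward the vendor and user dimensions.
- Added backend tests covering ASR success, failure, cost, cross-user aggregation, and the admin API response.
- Completed #48: the external registry blocker was fixed with configurable base images and npm registry; `docker compose up -d --build` passes on the current code, and `SMOKE_VOICE=1 ./scripts/demo_smoke.sh` also passes after the rebuild.
- Completed #49: the technical design adds a service complexity self-review listing split candidates, risk signals, and a suggested order for `voice_session_service.py`, `generation_jobs.py`, the ASR service, and Voice Studio; the admin cross-user Provider/ASR summary has already been moved into `admin_provider_analytics.py`.
- Completed #50: the demo checklist, demo package, 3-minute pitch, PRD, and technical design now share one message: Voice Studio is a Phase A Alpha, the ASR summary is in the admin console, and the Docker rebuild and voice smoke are done on the current code.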
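
The ASR summary described for #47 can be sketched as a small aggregation. This is an illustrative sketch only: the output keys `total_calls`, `successful_calls`, `failed_calls`, `by_provider`, and `failure_reasons` mirror the fields the smoke script reads from the admin analytics response, but the input event shape (`status`, `provider`, `error`, `cost`) and the `summarize_asr` function are assumptions, not the code in `admin_provider_analytics.py`.

```python
from collections import Counter, defaultdict


def summarize_asr(events: list[dict]) -> dict:
    """Aggregate ASR events: successes by provider, failures by error reason.

    Each event is a hypothetical dict with keys:
    status ("succeeded" or "failed"), provider, optional error, optional cost.
    """
    successes: Counter = Counter()
    cost: defaultdict = defaultdict(float)
    failure_reasons: Counter = Counter()

    for event in events:
        if event["status"] == "succeeded":
            successes[event["provider"]] += 1
            cost[event["provider"]] += event.get("cost", 0.0)
        else:
            # Failures are grouped by reason; a missing reason is bucketed as "unknown".
            failure_reasons[event.get("error") or "unknown"] += 1

    successful_calls = sum(successes.values())
    failed_calls = sum(failure_reasons.values())
    return {
        "capability": "asr",
        "total_calls": successful_calls + failed_calls,
        "successful_calls": successful_calls,
        "failed_calls": failed_calls,
        "by_provider": {
            name: {"successful_calls": count, "cost": cost[name]}
            for name, count in successes.items()
        },
        "failure_reasons": dict(failure_reasons),
    }
```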
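
Still outstanding: real ASR key environment acceptance, turn-level Dialogue/TTS cost attribution, cross-environment Provider aggregation, resumable runs, and fuller monitoring.

## Phase A Alpha ASR Key Validation Prep (2026-06-01)

- Checked the `openai_asr` wiring: the adapter is called by Voice Session uploaded turns through the ASR Provider Router, and the default Provider configuration reads `OPENAI_API_KEY`, the optional `OPENAI_API_BASE`, `VOICE_TRANSCRIPTION_MODEL`, and `VOICE_TRANSCRIPTION_LANGUAGE`.
- Added `SMOKE_REAL_ASR=1 ./scripts/demo_smoke.sh`. This path automatically includes the Voice Studio smoke, uploads real audio, and asserts that `transcription_provider=openai_asr`, that the transcript is non-empty, that user-side analytics can be filtered by `provider=openai_asr`, and that Admin ASR analytics shows `openai_asr`.
- The default demo path still keeps the demo fallback; the real ASR path must be turned on explicitly so that a missing key does not affect the regular smoke.
- The docs now cover the real ASR `.env`, the run command, and failure troubleshooting.

Real-key environment acceptance still has to be run on a machine with a usable key; once it passes, remove “real ASR key environment acceptance” from the outstanding items.
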
54
docs/technical/environment-configuration.md
Normal file
@@ -0,0 +1,54 @@
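# Environment variable configuration conventions

DreamWeaver treats only `backend/.env` as the application runtime configuration file. A root `.env` may exist, but it serves Docker Compose only and does not take part in backend configuration loading.

## File responsibilities

| File | Read by | What goes in | What stays out |
| --- | --- | --- | --- |
| `backend/.env` | FastAPI, Admin API, Celery worker, Celery beat, Docker backend services | `SECRET_KEY`, `DATABASE_URL`, Redis/Celery URLs, Provider list, AI keys, OAuth keys, Admin account | Docker image sources, npm registry |
| `.env` | Docker Compose interpolation | Image source/registry overrides such as `PYTHON_BASE_IMAGE`, `NODE_BASE_IMAGE`, `NGINX_BASE_IMAGE`, `NPM_REGISTRY` | AI keys, OAuth keys, `SECRET_KEY`, backend runtime configuration |
| `backend/.env.example` | Humans, as a template to copy | A complete example of `backend/.env` | Real secrets |

## Why the backend does not read the root `.env`

A relative `env_file=".env"` in `pydantic-settings` depends on the current working directory: started from the repository root it reads the root `.env`, and started from `backend/` it reads `backend/.env`. As a result, the same start command would use different configuration depending on the directory it is run from.

The current code in `backend/app/core/config.py` pins the absolute path to `backend/.env`, so the backend reads the same file no matter which working directory it is started from.
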
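One common way to pin such a path is to derive it from the location of the config module instead of the working directory. The sketch below assumes that approach; only the `BACKEND_ENV_FILE` name and the `backend/app/core/config.py` location come from this document, while the `backend_env_file` helper and the `parents[2]` computation are illustrative assumptions about how the constant might be built.

```python
from pathlib import Path


def backend_env_file(config_module_path: Path) -> Path:
    """Return backend/.env for a config module at backend/app/core/config.py.

    parents[0] is core/, parents[1] is app/, parents[2] is backend/.
    """
    return config_module_path.parents[2] / ".env"


# Inside config.py this would typically be anchored on the module's own file,
# which is independent of the directory the process was started from:
# BACKEND_ENV_FILE = backend_env_file(Path(__file__).resolve())
```

By contrast, `Path(".env").resolve()` changes with the current working directory, which is exactly the behavior the absolute path avoids.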
## Docker demo

The Docker backend services read application configuration through `env_file: ./backend/.env`. The default in-container addresses should stay as service names:

```env
DATABASE_URL=postgresql+asyncpg://dreamweaver:dreamweaver_password@db:5432/dreamweaver_db
CELERY_BROKER_URL=redis://redis:6379/0
CELERY_RESULT_BACKEND=redis://redis:6379/0
REDIS_URL=redis://redis:6379/0
```

The Postgres container receives only the demo account and database name fixed in `docker-compose.yml`, which avoids injecting AI/OAuth keys into infrastructure containers. The backend services read `DATABASE_URL` from `backend/.env`. To change the database account for the Docker demo, update both `db.environment` in `docker-compose.yml` and `DATABASE_URL` in `backend/.env`. The Docker demo always exposes `52432:5432` and `52379:6379`; when running the backend directly on the host, connect through these host ports.

## Running the backend directly on the host

When running `uvicorn`, `celery`, or `alembic` directly on the host, still edit only `backend/.env`, and change the database and Redis URLs to the host ports:

```env
DATABASE_URL=postgresql+asyncpg://dreamweaver:dreamweaver_password@localhost:52432/dreamweaver_db
CELERY_BROKER_URL=redis://localhost:52379/0
CELERY_RESULT_BACKEND=redis://localhost:52379/0
REDIS_URL=redis://localhost:52379/0
```

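The difference between the two sets of URLs is mechanical: the service name and container port are replaced by `localhost` and the published host port. A small helper, written here purely as an illustration (the `to_host_url` function and `HOST_PORTS` table are hypothetical and not part of the repository), makes the mapping explicit using only the standard library:

```python
from urllib.parse import urlsplit, urlunsplit

# Container (service, port) -> host (name, port), taken from the published
# mappings 52432:5432 and 52379:6379 described above.
HOST_PORTS = {
    ("db", 5432): ("localhost", 52432),
    ("redis", 6379): ("localhost", 52379),
}


def to_host_url(url: str) -> str:
    """Rewrite an in-container URL to its host-reachable equivalent."""
    parts = urlsplit(url)
    target = HOST_PORTS.get((parts.hostname, parts.port))
    if target is None:
        return url  # not a known container address; leave untouched

    host, port = target
    userinfo = ""
    if parts.username:
        userinfo = parts.username
        if parts.password:
            userinfo += f":{parts.password}"
        userinfo += "@"
    return urlunsplit(parts._replace(netloc=f"{userinfo}{host}:{port}"))
```

For example, passing the container `DATABASE_URL` through `to_host_url` yields the `localhost:52432` form listed in the host block.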
## Check commands

```bash
# Which env file the backend actually reads
backend/.venv/bin/python - <<'PY'
from app.core.config import BACKEND_ENV_FILE
print(BACKEND_ENV_FILE)
PY

# Actual environment of the Docker backend container; do not paste the output into public channels
docker compose exec backend env | sort
```
@@ -25,7 +25,7 @@
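- When `transcript_confidence` or `intent_confidence` is low, the backend returns a confirmation prompt first instead of writing the turn straight into the story text
- The full confirmation flow is in place: it supports “continue with this understanding”, “redo this turn”, and “switch to text input”
- The frontend explicitly shows “The system understood this turn as” and a “We suggest a parent confirms before continuing” hint
- The low-confidence confirmation path has backend test coverage and can serve as the basis for wiring in ASR and finer-grained confirmation interactions in the next phase
- The low-confidence confirmation path has backend test coverage and can serve as the basis for real ASR key environment acceptance and finer-grained confirmation interactions in the next phase
- Added safety checks on user transcripts, soft rewriting of assistant output, and `safety_flags` event records
- finalize generates a more stable title/summary and, when conditions allow, automatically queues a cover completion job
- Added `voice session analytics` aggregate metrics that track turn success rate, ASR/TTS failures, low-confidence triggers, finalize conversion rate, input mix, speech duration, Provider distribution, confirmation rate, and average confidence, with filtering by transcription Provider and session status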
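
The low-confidence rule in the first bullet above can be sketched as a single decision function. This is a hedged illustration only: `transcript_confidence` and `intent_confidence` are named in the document, but the `0.6` threshold, the `TurnDecision` type, and the `decide_turn` function are assumptions and do not describe the real backend.

```python
from dataclasses import dataclass

# Assumed threshold for illustration; the real value is not stated in the document.
CONFIDENCE_THRESHOLD = 0.6


@dataclass(frozen=True)
class TurnDecision:
    action: str           # "confirm" = ask the user first, "append" = write into the story
    understood_text: str  # what the system understood this turn as


def decide_turn(
    transcript: str,
    transcript_confidence: float,
    intent_confidence: float,
    threshold: float = CONFIDENCE_THRESHOLD,
) -> TurnDecision:
    """Ask for confirmation when either confidence score falls below the threshold."""
    if min(transcript_confidence, intent_confidence) < threshold:
        return TurnDecision("confirm", transcript)
    return TurnDecision("append", transcript)
```

Taking the minimum of the two scores means a confident transcript with an uncertain intent still triggers confirmation, which matches the "either one is low" wording.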
@@ -52,7 +52,7 @@ Phase A 明确不做以下内容:
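- No multi-user co-creation
- No storybook co-creation main path
- No instant illustration generation on every turn
- ASR / Realtime capabilities are not merged into the current admin Provider configuration panel right away
- Realtime capabilities are not merged into the current admin Provider configuration panel right away; ASR has already entered the Provider system as an Alpha operations observability capability

In other words, Phase A is a **turn-based voice session MVP**, not the final form.
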
@@ -93,13 +93,13 @@ Phase A 明确不做以下内容:
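- `tts` Provider Router
- The existing story library, story detail page, and downstream asset completion pipeline

### 4.2 Capabilities clearly missing today
### 4.2 Capabilities clearly missing in the initial design and now covered by Alpha

- Speech input recognition (ASR / STT)
- Session-level state model
- “Plot correction” semantic parsing
- Session-level observable events
- A closing service that saves a voice session as a formal Story
- Speech input recognition (ASR / STT): covered by the `asr` Provider capability, demo fallback, and the `openai_asr` adapter; the real-key environment still needs acceptance.
- Session-level state model: implemented as `voice_sessions / voice_turns / voice_session_events`.
- “Plot correction” semantic parsing: Alpha supports turn intents such as start / continue / correct.
- Session-level observable events: voice session analytics, the event list, and the admin ASR summary are supported.
- A closing service that saves a voice session as a formal Story: finalize saves as a Story and hooks back into generation job asset completion.

---
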
@@ -115,7 +115,7 @@ Phase A 明确不做以下内容:
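   `voice_sessions` owns the process and `stories` owns the formal result, so session noise does not pollute the formal story structure.

4. **Reuse the `text` / `tts` backbone first, then decide whether to split out a new capability**
   The first version keeps complexity to a minimum and does not rush to map every new capability into the admin Provider panel.
   The first version keeps complexity to a minimum and does not rush to map new capabilities such as realtime / barge-in into the admin Provider panel. ASR currently enters the Provider system only as a turn-based transcription capability.

5. **The first version allows a “text works but speech failed” fallback**
   This is consistent with DreamWeaver's current principle that the main result should be readable first.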
@@ -440,14 +440,29 @@ Phase A 明确不做以下内容:

**Recommendations**

- Phase A integrates a single stable vendor first
- Not merged into the current admin Provider CRUD for now
- Wrap it through a config file or a separate service first
- Phase A integrates a single stable vendor first and keeps the demo fallback
- Already merged into the current admin Provider CRUD and operations summary, without introducing complex realtime configuration
- Wrap real-key environment differences through a config file or a separate service first
- Run real-key acceptance with `SMOKE_REAL_ASR=1 ./scripts/demo_smoke.sh`, which calls the external ASR only when explicitly enabled

The reasons are:

- The current admin Provider has been extended to `text/image/tts/storybook/asr`
- Phase A Alpha includes ASR as a minimal Provider capability but keeps the demo fallback, so an unavailable real transcription service does not block the demo
- `openai_asr` reads `OPENAI_API_KEY`, the optional `OPENAI_API_BASE`, `VOICE_TRANSCRIPTION_MODEL`, and `VOICE_TRANSCRIPTION_LANGUAGE` by default

Minimal `.env` for real ASR acceptance:

```env
ASR_PROVIDERS=["openai_asr", "demo"]
OPENAI_API_KEY=sk-...
OPENAI_API_BASE=
VOICE_TRANSCRIPTION_MODE=provider
VOICE_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
VOICE_TRANSCRIPTION_LANGUAGE=zh
```

On failure, check three places first: the upload API response, the `turn_transcription_failed` event, and the `capability=asr` failure reasons in Admin Provider analytics. Common causes are the key not reaching the container, 401/403, 429 or exhausted quota, an unavailable model, `OPENAI_API_BASE` pointing to the wrong endpoint, or an audio format that is not accepted.

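Before spending a real API call, the minimal `.env` above can be sanity-checked locally. The sketch below is a hypothetical pre-flight helper, not part of the repository: `check_real_asr_env` is an invented name, and it assumes `ASR_PROVIDERS` is a JSON list, as the example value suggests.

```python
import json


def check_real_asr_env(env: dict[str, str]) -> list[str]:
    """Return a list of problems that would prevent the real ASR path from running."""
    problems: list[str] = []

    # Assumption: ASR_PROVIDERS is JSON, e.g. ["openai_asr", "demo"].
    try:
        providers = json.loads(env.get("ASR_PROVIDERS", "[]"))
    except json.JSONDecodeError:
        return ["ASR_PROVIDERS is not valid JSON"]
    if not isinstance(providers, list):
        return ["ASR_PROVIDERS must be a JSON list"]

    if "openai_asr" not in providers:
        problems.append("ASR_PROVIDERS does not include openai_asr")
    if not env.get("OPENAI_API_KEY", "").strip():
        problems.append("OPENAI_API_KEY is empty")
    return problems
```

An empty result only means the configuration looks complete; it says nothing about whether the key is valid, has quota, or can reach the model, which is what the smoke run itself verifies.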
### B. Dialogue Orchestrator
|
||||
|
||||
@@ -537,7 +552,36 @@ Phase A 就应该按 turn 记录:
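- Dialogue generation cost
- TTS cost

This can later be rolled up into a new voice co-creation analytics view instead of being squeezed into the existing story generation dashboard from the start.
The current Alpha has connected ASR cost and call summaries to the admin Provider analytics. In the short term this gives operations a unified view of text/image/tts/storybook/asr; in the medium term, if voice co-creation keeps growing, voice session analytics should remain the primary view and admin Provider analytics should serve only as a cross-capability summary of cost and failure reasons.

### 13.3 Service complexity self-review (2026-05-06)

The current Alpha has validated the main path, but service boundaries are approaching the point where they need to be split:

| Module | Current responsibilities | Complexity signals | Suggested split |
| --- | --- | --- | --- |
| `voice_session_service.py` | Session CRUD, turn creation, intent recognition, story patch, low-confidence confirmation, safety rewriting, TTS, finalize, analytics | The file is close to 2000 lines; it handles the state machine, AI orchestration, and response serialization together, so a single change easily affects multiple paths | Split out `voice_turn_orchestrator.py`, `voice_session_analytics.py`, and `voice_session_finalizer.py` first |
| `generation_jobs.py` + `admin_provider_analytics.py` | generation job/event, task control, provider stats, ops summary; the admin cross-user Provider/ASR summary has been moved into a separate service | `generation_jobs.py` is still large, but the admin ASR summary is no longer being added to the generation job module | Continue by moving the provider telemetry helpers inside `generation_jobs.py` into a small shared module, keeping the generation job main flow focused on task state |
| `voice_transcription_service.py` | ASR mode resolution and provider router calls | Still small, but failure metadata is insufficient; admin ASR failures can only read `error` from events | Later add a lightweight result structure in the style of `VoiceTranscriptionAttempt` that unifies provider, latency, cost, and error |
| Frontend `VoiceStudio.vue` | Page state, recording upload, session list, turn display, analytics cards, confirm/retry/finalize | The view file carries too many workflow decisions; adding real-time capabilities would make it hard to test | Extract `useVoiceSessionWorkflow`, `VoiceTurnCard`, and `VoiceAnalyticsPanel` |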
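
The "`VoiceTranscriptionAttempt`-style lightweight result structure" suggested for `voice_transcription_service.py` could look roughly like the dataclass below. Only the name and the four unified fields (provider, latency, cost, error) come from the table; the exact field names, types, the `transcript` field, and the `to_event_metadata` method are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceTranscriptionAttempt:
    """One ASR attempt, successful or not, in a single uniform shape."""
    provider: str
    latency_ms: int
    cost: float = 0.0
    error: Optional[str] = None       # None means the attempt succeeded
    transcript: Optional[str] = None  # hypothetical: text returned on success

    @property
    def succeeded(self) -> bool:
        return self.error is None

    def to_event_metadata(self) -> dict:
        """Flatten into the metadata a session event could carry (assumed shape)."""
        return {
            "provider": self.provider,
            "latency_ms": self.latency_ms,
            "cost": self.cost,
            "error": self.error,
        }
```

With such a structure, admin analytics could read provider, latency, and cost from the same place for successes and failures instead of parsing only the `error` string from events.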
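
Suggested split order:

1. Split out read-only analytics first: it carries the lowest risk, and tests can reuse the existing `test_voice_sessions.py` and `test_admin_providers.py`. The admin `admin_provider_analytics.py` was split out first on 2026-05-06.
2. Split finalize next: the boundary is clear, with a session as input and a Story / generation job as output.
3. Split the turn orchestrator last: it couples ASR, intent, story patch, safety, and TTS, and should wait until the regression matrix is more stable.

Large changes not recommended at the end of Phase A Alpha:

- Do not introduce a workflow engine to replace the current state machine.
- Do not fold voice sessions directly into the generation job main model.
- Do not add migrated fields to ASR events unless precise latency distributions and vendor-level SLAs are needed.

Signals that make a split mandatory:

- A single voice turn change requires modifying three or more test files at once.
- Adding one analytics field requires reading and writing several unrelated services.
- There is still no reusable composable by the time Voice Studio is about to introduce real-time or near-real-time capabilities.

---
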
@@ -1,23 +1,26 @@
# Build Stage
FROM node:18-alpine AS build-stage

WORKDIR /app

COPY package*.json ./
RUN npm install

COPY . .
RUN npm run build

# Production Stage
FROM nginx:alpine AS production-stage

# Copy the build output into Nginx
COPY --from=build-stage /app/dist /usr/share/nginx/html

# Copy the custom Nginx config (handles SPA routing)
COPY nginx.conf /etc/nginx/conf.d/default.conf

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]
# Build Stage
ARG NODE_BASE_IMAGE=node:18-alpine
ARG NGINX_BASE_IMAGE=nginx:alpine
FROM ${NODE_BASE_IMAGE} AS build-stage

WORKDIR /app

ARG NPM_REGISTRY=https://registry.npmjs.org/
COPY package*.json ./
RUN npm ci --registry="${NPM_REGISTRY}" --no-audit --no-fund

COPY . .
RUN npm run build

# Production Stage
FROM ${NGINX_BASE_IMAGE} AS production-stage

# Copy the build output into Nginx
COPY --from=build-stage /app/dist /usr/share/nginx/html

# Copy the custom Nginx config (handles SPA routing)
COPY nginx.conf /etc/nginx/conf.d/default.conf

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

@@ -5,13 +5,23 @@ APP_URL="${APP_URL:-http://localhost:52080}"
BACKEND_URL="${BACKEND_URL:-http://localhost:52000}"
ADMIN_BACKEND_URL="${ADMIN_BACKEND_URL:-http://localhost:52800}"
ADMIN_AUTH="${ADMIN_AUTH:-admin:admin}"
DEV_SIGNIN_URL="${DEV_SIGNIN_URL:-$APP_URL/auth/dev/signin}"
SMOKE_AUDIO="${SMOKE_AUDIO:-0}"
SMOKE_VOICE="${SMOKE_VOICE:-0}"
SMOKE_REAL_ASR="${SMOKE_REAL_ASR:-0}"
REAL_ASR_AUDIO_FILE="${REAL_ASR_AUDIO_FILE:-}"
REAL_ASR_EXPECTED_TEXT="${REAL_ASR_EXPECTED_TEXT:-小熊和星星一起找家}"
REAL_ASR_DURATION_MS="${REAL_ASR_DURATION_MS:-2200}"

if [[ "$SMOKE_REAL_ASR" == "1" ]]; then
  SMOKE_VOICE=1
fi

COOKIE_JAR="$(mktemp "${TMPDIR:-/tmp}/dreamweaver-cookie.XXXXXX")"
VOICE_SMOKE_AUDIO="$(mktemp "${TMPDIR:-/tmp}/dreamweaver-voice-audio.XXXXXX")"
REAL_ASR_SMOKE_AUDIO="${TMPDIR:-/tmp}/dreamweaver-real-asr-audio.$$.$RANDOM.m4a"
cleanup() {
  rm -f "$COOKIE_JAR" "$VOICE_SMOKE_AUDIO"
  rm -f "$COOKIE_JAR" "$VOICE_SMOKE_AUDIO" "$REAL_ASR_SMOKE_AUDIO" "$REAL_ASR_SMOKE_AUDIO.caf"
}
trap cleanup EXIT

@@ -57,6 +67,78 @@ assert_jq() {
  fi
}

curl_form_capture() {
  local body_file="$1"
  local status_file="$2"
  local url="$3"
  shift 3

  local http_code
  if http_code="$(curl -sS -b "$COOKIE_JAR" -o "$body_file" -w '%{http_code}' "$@" "$url")"; then
    printf '%s' "$http_code" > "$status_file"
    return 0
  fi

  printf '%s' "${http_code:-curl_failed}" > "$status_file"
  return 1
}

print_json_or_raw() {
  local body_file="$1"
  if jq '.' "$body_file" >&2 2>/dev/null; then
    return 0
  fi
  cat "$body_file" >&2
}

print_real_asr_diagnostics() {
  local session_id="$1"
  local body_file="$2"

  echo "Real ASR smoke failed." >&2
  echo "Required backend env: ASR_PROVIDERS=[\"openai_asr\"] or [\"openai_asr\", \"demo\"], OPENAI_API_KEY, optional OPENAI_API_BASE, VOICE_TRANSCRIPTION_MODEL, VOICE_TRANSCRIPTION_LANGUAGE." >&2
  echo "Upload response:" >&2
  print_json_or_raw "$body_file"

  if [[ -n "$session_id" && "$session_id" != "null" ]]; then
    echo "Voice session events:" >&2
    if voice_diag_json="$(get_json "$APP_URL/api/voice-sessions/$session_id" 2>/dev/null)"; then
      echo "$voice_diag_json" | jq '{id,status,last_error,events:[.events[] | {event_type,status,message,event_metadata}]}' >&2
    fi
  fi

  echo "Admin ASR analytics:" >&2
  if admin_asr_json="$(curl -fsS -u "$ADMIN_AUTH" "$ADMIN_BACKEND_URL/admin/providers/analytics?days=7&capability=asr" 2>/dev/null)"; then
    echo "$admin_asr_json" | jq '{capability,total_calls,successful_calls,failed_calls,voice_session_count,voice_turn_count,by_provider,failure_reasons}' >&2
  fi

  echo "If provider rows were changed in Admin, POST /admin/providers/reload and restart the API container/process before rerunning this smoke." >&2
}

ensure_real_asr_audio() {
  if [[ -n "$REAL_ASR_AUDIO_FILE" ]]; then
    if [[ ! -f "$REAL_ASR_AUDIO_FILE" ]]; then
      echo "REAL_ASR_AUDIO_FILE does not exist: $REAL_ASR_AUDIO_FILE" >&2
      exit 1
    fi
    printf '%s\n' "$REAL_ASR_AUDIO_FILE"
    return 0
  fi

  if command -v say >/dev/null 2>&1 && command -v afconvert >/dev/null 2>&1; then
    if ! say -v Tingting -o "$REAL_ASR_SMOKE_AUDIO.caf" "$REAL_ASR_EXPECTED_TEXT" 2>/dev/null; then
      say -o "$REAL_ASR_SMOKE_AUDIO.caf" "$REAL_ASR_EXPECTED_TEXT"
    fi
    afconvert -f m4af -d aac "$REAL_ASR_SMOKE_AUDIO.caf" "$REAL_ASR_SMOKE_AUDIO" >/dev/null
    rm -f "$REAL_ASR_SMOKE_AUDIO.caf"
    printf '%s\n' "$REAL_ASR_SMOKE_AUDIO"
    return 0
  fi

  echo "SMOKE_REAL_ASR=1 requires REAL_ASR_AUDIO_FILE, or macOS say + afconvert to synthesize a short sample." >&2
  exit 1
}

wait_for_job_story() {
  local job_id="$1"
  local attempts="${2:-60}"
@@ -88,7 +170,7 @@ curl -fsS "$BACKEND_URL/health" | jq -e '.status == "ok"' >/dev/null
curl -fsS "$ADMIN_BACKEND_URL/health" | jq -e '.status == "ok"' >/dev/null

say "Logging in with dev auth"
curl -fsS -c "$COOKIE_JAR" -o /dev/null -L "$APP_URL/auth/dev/signin"
curl -fsS -c "$COOKIE_JAR" -o /dev/null -L "$DEV_SIGNIN_URL"
session_json="$(get_json "$APP_URL/auth/session")"
assert_jq "$session_json" '.user.id == "github:dev_user_001"' "dev session should be active"

@@ -211,6 +293,48 @@ if [[ "$SMOKE_VOICE" == "1" ]]; then
  voice_waiting_analytics_json="$(get_json "$APP_URL/api/voice-sessions/analytics?days=7&session_status=waiting_user")"
  assert_jq "$voice_waiting_analytics_json" '.session_status == "waiting_user" and .total_sessions >= 1' "voice analytics should filter by session status"

  if [[ "$SMOKE_REAL_ASR" == "1" ]]; then
    say "Submitting voice uploaded turn with real OpenAI ASR"
    real_asr_audio_path="$(ensure_real_asr_audio)"
    real_asr_body="$(mktemp "${TMPDIR:-/tmp}/dreamweaver-real-asr-body.XXXXXX")"
    real_asr_status_file="$(mktemp "${TMPDIR:-/tmp}/dreamweaver-real-asr-status.XXXXXX")"
    if ! curl_form_capture "$real_asr_body" "$real_asr_status_file" "$APP_URL/api/voice-sessions/$voice_session_id/turns" \
      -F "audio_file=@${real_asr_audio_path};filename=real-asr.m4a;type=audio/mp4" \
      -F "duration_ms=${REAL_ASR_DURATION_MS}"; then
      print_real_asr_diagnostics "$voice_session_id" "$real_asr_body"
      rm -f "$real_asr_body" "$real_asr_status_file"
      exit 1
    fi

    real_asr_status="$(cat "$real_asr_status_file")"
    if [[ "$real_asr_status" != "202" ]]; then
      echo "Unexpected real ASR upload HTTP status: $real_asr_status" >&2
      print_real_asr_diagnostics "$voice_session_id" "$real_asr_body"
      rm -f "$real_asr_body" "$real_asr_status_file"
      exit 1
    fi

    real_asr_upload_json="$(cat "$real_asr_body")"
    rm -f "$real_asr_body" "$real_asr_status_file"
    real_asr_turn_id="$(jq -r '.turn_id' <<<"$real_asr_upload_json")"
    assert_jq "$real_asr_upload_json" '.status != "failed" and .transcription_provider == "openai_asr"' "real ASR upload turn should use openai_asr"

    real_asr_detail_json="$(get_json "$APP_URL/api/voice-sessions/$voice_session_id/turns/$real_asr_turn_id")"
    assert_jq "$real_asr_detail_json" '.transcription_provider == "openai_asr"' "real ASR turn detail should keep openai_asr provider"
    assert_jq "$real_asr_detail_json" '.user_transcript != null and (.user_transcript | length) > 0' "real ASR turn should expose a non-empty transcript"
    assert_jq "$real_asr_detail_json" '.assistant_text != null and .assistant_text != ""' "real ASR turn should continue the narrative"
    echo "$real_asr_detail_json" | jq '{id,status,transcription_provider,user_transcript,detected_intent,requires_confirmation,assistant_audio_ready,assistant_text}'

    voice_openai_asr_analytics_json="$(get_json "$APP_URL/api/voice-sessions/analytics?days=7&provider=openai_asr")"
    assert_jq "$voice_openai_asr_analytics_json" '.provider == "openai_asr" and .uploaded_audio_turns >= 1 and (.transcription_provider_counts.openai_asr >= 1)' "voice analytics should filter real ASR provider"

    admin_asr_analytics_json="$(curl -fsS -u "$ADMIN_AUTH" "$ADMIN_BACKEND_URL/admin/providers/analytics?days=7&capability=asr")"
    assert_jq "$admin_asr_analytics_json" '.capability == "asr" and .successful_calls >= 1 and ([.by_provider[].adapter] | index("openai_asr")) != null' "admin ASR analytics should include openai_asr"
    echo "$admin_asr_analytics_json" | jq '{capability,total_calls,successful_calls,failed_calls,voice_session_count,voice_turn_count,by_provider,failure_reasons}'
  else
    say "Skipping real ASR smoke; set SMOKE_REAL_ASR=1 with backend OPENAI_API_KEY and ASR_PROVIDERS=[\"openai_asr\", \"demo\"]"
  fi

  say "Finalizing voice session into story"
  voice_finalize_json="$(post_json "$APP_URL/api/voice-sessions/$voice_session_id/finalize" '{
    "save_story": true,