diff --git a/frontend/docs/landing-page-refactor-spec.md b/.claude/specs/design/LANDING-PAGE-REFACTOR-SPEC.md similarity index 100% rename from frontend/docs/landing-page-refactor-spec.md rename to .claude/specs/design/LANDING-PAGE-REFACTOR-SPEC.md diff --git a/.claude/specs/memory-intelligence/MEMORY-INTELLIGENCE-PRD.md b/.claude/specs/memory-intelligence/MEMORY-INTELLIGENCE-PRD.md index d9a28d7..edbf9ab 100644 --- a/.claude/specs/memory-intelligence/MEMORY-INTELLIGENCE-PRD.md +++ b/.claude/specs/memory-intelligence/MEMORY-INTELLIGENCE-PRD.md @@ -3,9 +3,15 @@ ## 概述 **功能名称**: 记忆智能 (Memory Intelligence) -**版本**: v1.0 +**版本**: v1.1 **优先级**: Phase 2.5 (体验增强后、社区化前) **目标用户**: 家长 + 3-8 岁儿童 +**更新记录**: 2025-01-22 合并 `backend/docs/memory_system_prd.md` + +### 核心愿景 + +将当前的"数据存储"升级为有温度的**"情感连接系统"**。 +我们不只是在记住数据,而是在**维护孩子与故事世界的关系**。让每一个故事不再是孤立的碎片,而是构建孩子专属"故事宇宙"的砖瓦。 ### 核心价值 @@ -14,10 +20,39 @@ - **延续故事**: 角色、世界观跨故事延续 - **主动关怀**: 适时推送个性化故事建议 +### 产品痛点与解决方案 + +| 用户角色 | 核心痛点 | 解决方案 | 预期价值 | +|---------|---------|---------|---------| +| **孩子** | "上次的小兔子怎么不认识我了?" 故事之间缺乏连续性。 | **角色一致性与记忆注入** 故事开头主动提及往事,角色性格延续。 | 建立情感依恋,提升沉浸感。 | +| **家长** | "这App除了生成故事还能干嘛?" 
无法感知产品的长期教育价值。 | **显性化成长轨迹** 词汇量统计、主题变化、成就徽章可视化。 | 提高付费意愿,提供社交货币。 | +| **平台** | 用户用完即走,缺乏留存壁垒。 | **沉没成本与情感资产** 积累的记忆越多,越舍不得离开。 | 提升长期留存率 (LTV)。 | + --- ## 一、功能模块 +### 1.0 记忆分层模型 + +#### 层级 1: 核心档案 (Identity Layer) +*性质:永久、静态、显性* +- **数据**: 姓名、年龄、性别 +- **输入**: 家长在 Onboarding 阶段手动输入 +- **作用**: 决定故事的基础适龄性和称呼 + +#### 层级 2: 故事宇宙 (Universe Layer) +*性质:长期、动态积累、半显性* +- **主角设定**: 姓名、性格特征(勇敢/害羞)、外貌特征(戴眼镜/卷发) +- **常驻配角**: 从随机故事中涌现出的固定伙伴(如"爱吃胡萝卜的松鼠奇奇") +- **世界观**: 故事发生的背景(魔法森林、未来城市、海底世界) +- **成就系统**: 孩子获得的虚拟奖励(勇气勋章、小小探险家) + +#### 层级 3: 工作记忆 (Working Memory) +*性质:短期、自动衰减、隐性* +- **关键情节**: 最近 3 个故事的结局和核心冲突 +- **情感标记**: 孩子对特定内容的反应(根据"重播"、"跳过"推断) +- **新学词汇**: 故事中出现的高级词汇 + ### 1.1 孩子档案系统 (Child Profile) | 字段 | 类型 | 说明 | @@ -244,7 +279,13 @@ CREATE TABLE memory_items ( 请创作一个适合{age}岁儿童的故事,约{word_count}字。 ``` -### 5.2 成就提取 Prompt +### 5.2 智能开场白 (Memory Injection) + +在生成新故事时,Prompt 必须包含一段"记忆唤醒"指令: +- **示例**: "小明,还记得上周我们帮小松鼠找回了松果吗?今天,小松鼠带来了一位新朋友..." +- **策略**: 提取权重最高的 Top 3 记忆注入 Prompt + +### 5.3 成就提取 Prompt ``` 请分析以下故事,提取主角获得的成长/成就: @@ -412,7 +453,43 @@ def check_push_notifications(): --- -## 八、里程碑 +## 八、关键功能特性 + +### 8.1 成长时间轴 (Growth Timeline) + +一个可视化的 H5 页面或 App 模块,以时间轴形式展示里程碑: +- 🌟 **初次相遇**: 创建角色的第一天 +- 📖 **阅读打卡**: 累计阅读 10/50/100 本 +- 🏅 **获得成就**: 获得"诚实勋章" +- 🧠 **能力解锁**: 第一次阅读"科幻"题材 + +### 8.2 成就仪式感 (Achievement Ceremony) + +- **触发**: 故事生成并分析后,如果获得新成就 +- **表现**: 弹窗动画 + 音效 + "恭喜获得 [勇气] 徽章" +- **分享**: 允许生成带二维码的成就海报 + +--- + +## 九、记忆类型扩展 + +| 类型 Key | 描述 | 来源 | 过期策略 | +|---------|------|------|---------| +| `recent_story` | 最近读过的故事梗概 | 阅读事件 | 30天衰减 | +| `favorite_character` | 孩子喜欢的角色 | 重播/高评分 | 长期有效 | +| `scary_element` | 孩子害怕/不喜欢的元素 | 跳过/负反馈 | 长期有效 (避雷) | +| `vocabulary_growth` | 新掌握的词汇 | 故事分析 | 90天衰减 | +| `emotional_highlight` | 高光时刻 (如: 特别开心的情节) | 互动数据 | 60天衰减 | + +--- + +## 十、里程碑 + +### Phase 1: 基础建设 (v0.3.0) +- [x] 数据库 `MemoryItem` 表 (已存在) +- [ ] 扩展 `MemoryItem` 类型字段,支持更多维度 +- [ ] 优化 `_build_memory_context`,支持更自然的 Prompt 注入 +- [ ] 前端:简单的"近期回忆"展示列表 ### M1: 孩子档案基础 - [ ] 数据库模型 @@ 
-436,9 +513,18 @@ def check_push_notifications(): - [ ] 偏好学习算法 - [ ] 推荐优化 +### Phase 2: 可视化与成就 (v0.4.0) +- [ ] 实现"成就提取器" (Achievement Extractor) 的闭环通知 +- [ ] 前端:开发"我的成就"和"成长时间轴"页面 +- [ ] 增加故事开场白的动态生成逻辑 + +### Phase 3: 深度智能 (v0.5.0+) +- [ ] 引入向量数据库,实现基于语义的记忆检索 (不仅是时间最近) +- [ ] 情感分析模型:分析用户行为推断情感倾向 + --- -## 九、风险与应对 +## 十一、风险与应对 | 风险 | 影响 | 应对 | |------|------|------| @@ -448,7 +534,7 @@ def check_push_notifications(): --- -## 十、相关文档 +## 十二、相关文档 - [孩子档案数据模型](./CHILD-PROFILE-MODEL.md) - [故事宇宙记忆结构](./STORY-UNIVERSE-MODEL.md) diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml new file mode 100644 index 0000000..5b8e0b2 --- /dev/null +++ b/.github/workflows/build.yml @@ -0,0 +1,189 @@ +# .github/workflows/build.yml +# 构建并推送 Docker 镜像到 GitHub Container Registry +# +# 触发条件: +# - push 到 main 分支 +# - 手动触发 (workflow_dispatch) +# - 创建版本标签 (v*) +# +# 镜像命名: +# ghcr.io//dreamweaver-backend:latest +# ghcr.io//dreamweaver-frontend:latest +# ghcr.io//dreamweaver-admin-frontend:latest + +name: Build and Push Docker Images + +on: + push: + branches: [main] + tags: ['v*'] + paths: + - 'backend/**' + - 'frontend/**' + - 'admin-frontend/**' + - '.github/workflows/build.yml' + workflow_dispatch: + inputs: + force_build: + description: 'Force rebuild all images' + required: false + default: 'false' + +env: + REGISTRY: ghcr.io + IMAGE_PREFIX: ${{ github.repository_owner }}/dreamweaver + +jobs: + # ============================================== + # 检测变更的目录 + # ============================================== + changes: + runs-on: ubuntu-latest + outputs: + backend: ${{ steps.filter.outputs.backend }} + frontend: ${{ steps.filter.outputs.frontend }} + admin-frontend: ${{ steps.filter.outputs.admin-frontend }} + steps: + - uses: actions/checkout@v4 + - uses: dorny/paths-filter@v3 + id: filter + with: + filters: | + backend: + - 'backend/**' + frontend: + - 'frontend/**' + admin-frontend: + - 'admin-frontend/**' + + # ============================================== + # 构建后端镜像 + 
# ============================================== + build-backend: + needs: changes + if: needs.changes.outputs.backend == 'true' || github.event.inputs.force_build == 'true' || startsWith(github.ref, 'refs/tags/') + runs-on: ubuntu-latest + permissions: + contents: read + packages: write + steps: + - uses: actions/checkout@v4 + + - name: Set up Docker Buildx + uses: docker/setup-buildx-action@v3 + + - name: Log in to Container Registry + uses: docker/login-action@v3 + with: + registry: ${{ env.REGISTRY }} + username: ${{ github.actor }} + password: ${{ secrets.GITHUB_TOKEN }} + + - name: Extract metadata + id: meta + uses: docker/metadata-action@v5 + with: + images: ${{ env.REGISTRY }}/${{ env.IMAGE_PREFIX }}-backend + tags: | + type=ref,event=branch + type=semver,pattern={{version}} + type=sha,prefix= + type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }} + + - name: Build and push + uses: docker/build-push-action@v5 + with: + context: ./backend + push: true + tags: ${{ steps.meta.outputs.tags }} + labels: ${{ steps.meta.outputs.labels }} + cache-from: type=gha + cache-to: type=gha,mode=max + + # ============================================== + # 构建前端镜像 + # ============================================== + build-frontend: + needs: changes + if: needs.changes.outputs.frontend == 'true' || github.event.inputs.force_build == 'true' || startsWith(github.ref, 'refs/tags/') + runs-on: ubuntu-latest + permissions: + contents: read + packages: write + steps: + - uses: actions/checkout@v4 + + - name: Set up Docker Buildx + uses: docker/setup-buildx-action@v3 + + - name: Log in to Container Registry + uses: docker/login-action@v3 + with: + registry: ${{ env.REGISTRY }} + username: ${{ github.actor }} + password: ${{ secrets.GITHUB_TOKEN }} + + - name: Extract metadata + id: meta + uses: docker/metadata-action@v5 + with: + images: ${{ env.REGISTRY }}/${{ env.IMAGE_PREFIX }}-frontend + tags: | + type=ref,event=branch + type=semver,pattern={{version}} + 
type=sha,prefix= + type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }} + + - name: Build and push + uses: docker/build-push-action@v5 + with: + context: ./frontend + push: true + tags: ${{ steps.meta.outputs.tags }} + labels: ${{ steps.meta.outputs.labels }} + cache-from: type=gha + cache-to: type=gha,mode=max + + # ============================================== + # 构建管理后台前端镜像 + # ============================================== + build-admin-frontend: + needs: changes + if: needs.changes.outputs.admin-frontend == 'true' || github.event.inputs.force_build == 'true' || startsWith(github.ref, 'refs/tags/') + runs-on: ubuntu-latest + permissions: + contents: read + packages: write + steps: + - uses: actions/checkout@v4 + + - name: Set up Docker Buildx + uses: docker/setup-buildx-action@v3 + + - name: Log in to Container Registry + uses: docker/login-action@v3 + with: + registry: ${{ env.REGISTRY }} + username: ${{ github.actor }} + password: ${{ secrets.GITHUB_TOKEN }} + + - name: Extract metadata + id: meta + uses: docker/metadata-action@v5 + with: + images: ${{ env.REGISTRY }}/${{ env.IMAGE_PREFIX }}-admin-frontend + tags: | + type=ref,event=branch + type=semver,pattern={{version}} + type=sha,prefix= + type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }} + + - name: Build and push + uses: docker/build-push-action@v5 + with: + context: ./admin-frontend + push: true + tags: ${{ steps.meta.outputs.tags }} + labels: ${{ steps.meta.outputs.labels }} + cache-from: type=gha + cache-to: type=gha,mode=max diff --git a/README.md b/README.md index 8886035..3f86259 100644 --- a/README.md +++ b/README.md @@ -51,6 +51,124 @@ npm run dev - 后端 API:http://localhost:8000 - Swagger 文档:http://localhost:8000/docs +## Docker Compose 使用说明 +本项目包含 3 个 compose 文件: + +- `docker-compose.yml`:开发基线,包含本地构建(`build`)配置,适合日常开发调试。 +- `docker-compose.prod.yml`:生产基线,使用预构建镜像(不本地构建),适合部署环境。 +- `docker-compose.ha.yml`:HA 覆盖层,提供 PostgreSQL 主从、Redis 主从 + Sentinel、备份任务。 + +### 
使用选择 +- 本地开发:使用 `docker-compose.yml` +- 生产部署:使用 `docker-compose.prod.yml` +- 需要高可用:在上面任一基线上叠加 `docker-compose.ha.yml` + +> 注意:`docker-compose.ha.yml` 是覆盖文件,不能单独使用。 + +### 常用命令 + +#### 开发模式(单机) +```bash +docker compose -f docker-compose.yml up -d +``` + +#### 开发 + HA(主从/哨兵演练) +```bash +docker compose -f docker-compose.yml -f docker-compose.ha.yml up -d +``` + +#### 生产模式(预构建镜像) +```bash +docker compose -f docker-compose.prod.yml up -d +``` + +#### 生产 + HA +```bash +docker compose -f docker-compose.prod.yml -f docker-compose.ha.yml up -d +``` + +#### 查看状态 / 日志 +```bash +docker compose -f docker-compose.yml -f docker-compose.ha.yml ps +docker compose -f docker-compose.yml -f docker-compose.ha.yml logs -f backend +``` + +#### 停止并清理(含卷) +```bash +docker compose -f docker-compose.yml -f docker-compose.ha.yml down -v +``` + +### `docker-compose.prod.yml` 镜像标签 +`docker-compose.prod.yml` 使用以下镜像格式: +- `${REGISTRY:-}dreamweaver-backend:${TAG:-latest}` +- `${REGISTRY:-}dreamweaver-frontend:${TAG:-latest}` +- `${REGISTRY:-}dreamweaver-admin-frontend:${TAG:-latest}` + +Linux 部署示例(推荐): +```bash +export REGISTRY=my-registry.example.com/ +export TAG=2026.02.12 +docker compose -f docker-compose.prod.yml up -d +``` + +Windows PowerShell 示例: +```powershell +$env:REGISTRY="my-registry.example.com/" +$env:TAG="2026.02.12" +docker compose -f docker-compose.prod.yml up -d +``` + +### Linux 服务器部署流程(推荐) +以下流程适用于 Ubuntu/CentOS 等 Linux 服务器: + +#### 1) 准备配置 +```bash +cp backend/.env.example backend/.env +# 编辑 backend/.env,至少配置 SECRET_KEY、OAuth/API Key 等 +``` + +#### 2) 启动(非 HA) +```bash +export REGISTRY=my-registry.example.com/ +export TAG=2026.02.12 +docker compose -f docker-compose.prod.yml pull +docker compose -f docker-compose.prod.yml up -d +``` + +#### 3) 启动(HA) +```bash +export REGISTRY=my-registry.example.com/ +export TAG=2026.02.12 +docker compose -f docker-compose.prod.yml -f docker-compose.ha.yml pull +docker compose -f docker-compose.prod.yml -f docker-compose.ha.yml up -d +``` + 
+#### 4) 运行状态检查 +```bash +docker compose -f docker-compose.prod.yml ps +docker compose -f docker-compose.prod.yml logs -f backend +``` +HA 场景使用: +```bash +docker compose -f docker-compose.prod.yml -f docker-compose.ha.yml ps +docker compose -f docker-compose.prod.yml -f docker-compose.ha.yml logs -f backend +``` + +#### 5) 版本升级 +```bash +export TAG=2026.02.13 +docker compose -f docker-compose.prod.yml pull +docker compose -f docker-compose.prod.yml up -d +``` +HA 场景同理,在命令中额外叠加 `-f docker-compose.ha.yml`。 + +#### 6) 版本回滚 +```bash +export TAG=2026.02.12 +docker compose -f docker-compose.prod.yml pull +docker compose -f docker-compose.prod.yml up -d +``` + ## 供应商路由与管理后台 - 路由按配置顺序尝试:`TEXT_PROVIDERS`(默认 `text_primary`)、`IMAGE_PROVIDERS`(默认 `image_primary`)、`TTS_PROVIDERS`(默认 `tts_primary`)。失败会自动切换下一个。 - 管理后台(默认关闭):`ENABLE_ADMIN_CONSOLE=true` 时启用,接口在 `/admin/providers`(CRUD)和 `/admin/providers/reload`。鉴权使用 Basic Auth,账号密码由 `ADMIN_USERNAME`/`ADMIN_PASSWORD` 设置(请覆盖默认值)。 diff --git a/backend/app/core/celery_app.py b/backend/app/core/celery_app.py index 5debd5c..18aafc5 100644 --- a/backend/app/core/celery_app.py +++ b/backend/app/core/celery_app.py @@ -5,11 +5,33 @@ from celery.schedules import crontab from app.core.config import settings -celery_app = Celery( - "dreamweaver", - broker=settings.celery_broker_url, - backend=settings.celery_result_backend, -) +if settings.redis_sentinel_enabled and settings.redis_sentinel_urls: + sentinel_broker = ";".join(settings.redis_sentinel_urls) + celery_app = Celery( + "dreamweaver", + broker=sentinel_broker, + backend=sentinel_broker, + ) + celery_app.conf.broker_transport_options = { + "master_name": settings.redis_sentinel_master_name, + "sentinel_kwargs": { + "password": settings.redis_sentinel_password or None, + "socket_timeout": settings.redis_sentinel_socket_timeout, + }, + } + celery_app.conf.result_backend_transport_options = { + "master_name": settings.redis_sentinel_master_name, + "sentinel_kwargs": { + "password": 
settings.redis_sentinel_password or None, + "socket_timeout": settings.redis_sentinel_socket_timeout, + }, + } +else: + celery_app = Celery( + "dreamweaver", + broker=settings.celery_broker_url, + backend=settings.celery_result_backend, + ) celery_app.conf.update( task_track_started=True, diff --git a/backend/app/core/config.py b/backend/app/core/config.py index 8823b15..9fa46e3 100644 --- a/backend/app/core/config.py +++ b/backend/app/core/config.py @@ -55,6 +55,18 @@ class Settings(BaseSettings): # Generic Redis redis_url: str = Field("redis://localhost:6379/0", description="Redis connection URL") + redis_sentinel_enabled: bool = Field(False, description="Whether to enable Redis Sentinel") + redis_sentinel_nodes: str = Field( + "", + description="Comma-separated Redis Sentinel nodes, e.g. host1:26379,host2:26379", + ) + redis_sentinel_master_name: str = Field("mymaster", description="Redis Sentinel master name") + redis_sentinel_password: str = Field("", description="Password for Redis Sentinel (optional)") + redis_sentinel_db: int = Field(0, description="Redis DB index when using Sentinel") + redis_sentinel_socket_timeout: float = Field( + 0.5, + description="Socket timeout in seconds for Sentinel clients", + ) # Admin console enable_admin_console: bool = False @@ -71,9 +83,43 @@ class Settings(BaseSettings): missing.append("SECRET_KEY") if not self.database_url: missing.append("DATABASE_URL") + if self.redis_sentinel_enabled and not self.redis_sentinel_nodes.strip(): + missing.append("REDIS_SENTINEL_NODES") if missing: raise ValueError(f"Missing required settings: {', '.join(missing)}") return self + @property + def redis_sentinel_hosts(self) -> list[tuple[str, int]]: + """Parse Redis Sentinel nodes into (host, port) tuples.""" + nodes = [] + raw = self.redis_sentinel_nodes.strip() + if not raw: + return nodes + + for item in raw.split(","): + value = item.strip() + if not value: + continue + if ":" not in value: + raise ValueError(f"Invalid sentinel node 
format: {value}") + host, port_text = value.rsplit(":", 1) + if not host: + raise ValueError(f"Invalid sentinel node host: {value}") + try: + port = int(port_text) + except ValueError as exc: + raise ValueError(f"Invalid sentinel node port: {value}") from exc + nodes.append((host, port)) + return nodes + + @property + def redis_sentinel_urls(self) -> list[str]: + """Build Celery-compatible Sentinel URLs with DB index.""" + return [ + f"sentinel://{host}:{port}/{self.redis_sentinel_db}" + for host, port in self.redis_sentinel_hosts + ] + settings = Settings() diff --git a/backend/app/core/redis.py b/backend/app/core/redis.py index f8e6962..376a81a 100644 --- a/backend/app/core/redis.py +++ b/backend/app/core/redis.py @@ -1,25 +1,46 @@ """Redis client module.""" -from typing import AsyncGenerator - from redis.asyncio import Redis, from_url +from redis.asyncio.sentinel import Sentinel from app.core.config import settings +from app.core.logging import get_logger _redis_pool: Redis | None = None +_sentinel_pool: Sentinel | None = None +logger = get_logger(__name__) async def get_redis() -> Redis: """Get global Redis client instance.""" - global _redis_pool + global _redis_pool, _sentinel_pool if _redis_pool is None: - _redis_pool = from_url(settings.redis_url, encoding="utf-8", decode_responses=True) + if settings.redis_sentinel_enabled: + _sentinel_pool = Sentinel( + settings.redis_sentinel_hosts, + socket_timeout=settings.redis_sentinel_socket_timeout, + password=settings.redis_sentinel_password or None, + decode_responses=True, + ) + _redis_pool = _sentinel_pool.master_for( + settings.redis_sentinel_master_name, + db=settings.redis_sentinel_db, + decode_responses=True, + ) + logger.info( + "redis_connected_via_sentinel", + master_name=settings.redis_sentinel_master_name, + sentinel_nodes=settings.redis_sentinel_nodes, + ) + else: + _redis_pool = from_url(settings.redis_url, encoding="utf-8", decode_responses=True) return _redis_pool async def close_redis(): """Close 
Redis connection.""" - global _redis_pool + global _redis_pool, _sentinel_pool if _redis_pool: await _redis_pool.close() _redis_pool = None + _sentinel_pool = None diff --git a/backend/docs/ha_runbook.md b/backend/docs/ha_runbook.md new file mode 100644 index 0000000..12893f4 --- /dev/null +++ b/backend/docs/ha_runbook.md @@ -0,0 +1,89 @@ +# HA 部署与验证 Runbook(Phase 3 MVP) + +本文档对应 `docker-compose.ha.yml`,用于本地/测试环境验证高可用基础能力。 + +## 1. 启动方式 + +```bash +docker compose -f docker-compose.yml -f docker-compose.ha.yml up -d +``` + +说明: +- 基础业务服务仍来自 `docker-compose.yml`。 +- `docker-compose.ha.yml` 覆盖了 `db`、`redis`,并新增 `db-replica`、`postgres-backup`、`redis-replica`、`redis-sentinel-*`。 + +## 2. 核心环境变量建议 + +在 `backend/.env`(或 shell 环境)中至少配置: + +```env +# PostgreSQL +POSTGRES_USER=dreamweaver +POSTGRES_PASSWORD=dreamweaver_password +POSTGRES_DB=dreamweaver_db +POSTGRES_REPMGR_PASSWORD=repmgr_password + +# Redis Sentinel +REDIS_SENTINEL_ENABLED=true +REDIS_SENTINEL_NODES=redis-sentinel-1:26379,redis-sentinel-2:26379,redis-sentinel-3:26379 +REDIS_SENTINEL_MASTER_NAME=mymaster +REDIS_SENTINEL_DB=0 +REDIS_SENTINEL_SOCKET_TIMEOUT=0.5 + +# 可选:若 Sentinel/Redis 设置了密码 +REDIS_SENTINEL_PASSWORD= + +# 备份周期,默认 86400 秒(1 天) +BACKUP_INTERVAL_SECONDS=86400 +``` + +## 3. 
健康检查 + +### 3.1 PostgreSQL 主从 + +```bash +docker compose -f docker-compose.yml -f docker-compose.ha.yml ps +docker exec -it dreamweaver_db_primary psql -U dreamweaver -d dreamweaver_db -c "select now();" +docker exec -it dreamweaver_db_replica psql -U dreamweaver -d dreamweaver_db -c "select pg_is_in_recovery();" +``` + +期望: +- 主库可读写; +- 从库 `pg_is_in_recovery()` 返回 `t`。 + +### 3.2 Redis Sentinel + +```bash +docker exec -it dreamweaver_redis_sentinel_1 redis-cli -p 26379 sentinel masters +docker exec -it dreamweaver_redis_sentinel_1 redis-cli -p 26379 sentinel replicas mymaster +``` + +期望: +- `mymaster` 存在; +- 至少 1 个 replica 被发现。 + +### 3.3 备份任务 + +```bash +docker exec -it dreamweaver_postgres_backup sh -c "ls -lh /backups" +``` + +期望: +- `/backups` 下出现 `.dump` 文件; +- 旧于 7 天的备份会被自动清理。 + +## 4. 故障切换演练(最小) + +```bash +# 模拟 Redis 主节点故障 +docker stop dreamweaver_redis_master + +# 等待 Sentinel 选主后查看 +docker exec -it dreamweaver_redis_sentinel_1 redis-cli -p 26379 sentinel get-master-addr-by-name mymaster +``` + +提示:应用与 Celery 已支持 Sentinel 配置。若未启用 Sentinel,仍可回退到 `REDIS_URL` / `CELERY_BROKER_URL` / `CELERY_RESULT_BACKEND` 直连模式。 + +## 5. 
当前已知限制(下一步) + +- PostgreSQL 侧当前仅完成主从拓扑,读写分离(PgBouncer/路由)待后续迭代。 diff --git a/backend/docs/refactoring_plan.md b/backend/docs/refactoring_plan.md index 73a59f2..0aa73f1 100644 --- a/backend/docs/refactoring_plan.md +++ b/backend/docs/refactoring_plan.md @@ -19,25 +19,25 @@ 目前 `backend`, `backend-admin`, `worker`, `celery-beat` 重复构建 4 次,浪费资源且镜像版本可能不一致。 - **Action Items**: - - [ ] 修改 `backend/Dockerfile` 为通用基础镜像。 - - [ ] 更新 `docker-compose.yml`,定义 `backend-base` 服务或使用 `image` 标签共享镜像。 - - [ ] 确保所有 Python 服务共用同一构建产物,仅启动命令不同。 + - [x] 修改 `backend/Dockerfile` 为通用基础镜像。 + - [x] 更新 `docker-compose.yml`,定义 `backend-base` 服务或使用 `image` 标签共享镜像。 + - [x] 确保所有 Python 服务共用同一构建产物,仅启动命令不同。 ### 2.2 修复 Provider 缓存与限流 (High Priority) 内存缓存 (`TTLCache`, `_latency_cache`) 在多进程/多实例下失效。 - **Action Items**: - - [ ] 引入 Redis 作为共享缓存后端。 - - [ ] 重构 `_load_provider_cache`,将 Provider 配置缓存至 Redis。 - - [ ] 重构 `stories.py` 中的限流逻辑,使用 `redis-cell` 或简单的 Redis 计数器替代 `TTLCache`。 + - [x] 引入 Redis 作为共享缓存后端。 + - [x] 重构 `_load_provider_cache`,将 Provider 配置缓存至 Redis。 + - [x] 重构 `stories.py` 中的限流逻辑,使用 `redis-cell` 或简单的 Redis 计数器替代 `TTLCache`。 ### 2.3 拆分 `stories.py` (Medium Priority) `app/api/stories.py` 超过 600 行,包含 API 定义、业务逻辑、验证逻辑,维护困难。 - **Action Items**: - - [ ] 创建 `app/services/story_service.py`,迁移生成、润色、PDF生成等核心逻辑。 - - [ ] 创建 `app/schemas/story_schema.py`,迁移 Pydantic 模型(`GenerateRequest`, `StoryResponse` 等)。 - - [ ] API 层 `stories.py` 仅保留路由定义和依赖注入,调用 Service 层。 + - [x] 创建 `app/services/story_service.py`,迁移生成、润色、PDF生成等核心逻辑。 + - [x] 创建 `app/schemas/story_schemas.py`,迁移 Pydantic 模型(`GenerateRequest`, `StoryResponse` 等)。 + - [x] API 层 `stories.py` 仅保留路由定义和依赖注入,调用 Service 层。 --- @@ -68,6 +68,17 @@ Redis 单点故障将导致 Celery 任务全盘停摆。 - [ ] 部署 Grafana + Prometheus,监控 API 延迟、QPS、Celery 队列积压情况。 - [ ] 完善 `ProviderMetrics`,增加可视化大盘,实时监控 AI 供应商的成本与成功率。 +### 3.4 Phase 3 最小可执行任务清单 (MVP) + +目标:在不大改业务代码的前提下,于一个迭代内完成高可用基础设施闭环。 + +- [x] PostgreSQL 主从:新增 `docker-compose.ha.yml`,包含 1 主 1 从与健康检查。 +- [x] PostgreSQL 
备份:新增每日备份任务(`pg_dump`)与 7 天保留策略。
+- [x] Redis Sentinel:新增 1 主 1 从 3 哨兵最小拓扑,并验证故障切换。
+- [x] Celery 连接:更新 Celery broker/result backend 配置,支持 Sentinel 连接串。
+- [x] 回归验证:执行一次故事生成 + 异步任务链路(worker/beat)冒烟测试。
+- [x] 运行手册:补充故障切换与恢复步骤文档(PostgreSQL/Redis/Celery)。
+
 ---
 
 ## 4. 长期架构演进 (季度规划)
diff --git a/backend/pyproject.toml b/backend/pyproject.toml
index f9f8d05..8e3bf4b 100644
--- a/backend/pyproject.toml
+++ b/backend/pyproject.toml
@@ -20,6 +20,7 @@ dependencies = [
     "sse-starlette>=2.0.0",
     "celery>=5.4.0",
     "redis>=5.0.0",
+    "edge-tts>=6.1.0",
     "openai>=1.0.0",
 ]
diff --git a/docker-compose.ha.yml b/docker-compose.ha.yml
new file mode 100644
index 0000000..fbef8f5
--- /dev/null
+++ b/docker-compose.ha.yml
@@ -0,0 +1,310 @@
+# docker-compose.ha.yml
+# HA 覆盖配置(建议与 docker-compose.yml 叠加使用)
+#
+# 启动示例:
+#   docker compose -f docker-compose.yml -f docker-compose.ha.yml up -d
+
+services:
+  # ==============================================
+  # 应用服务 Sentinel 配置覆盖
+  # ==============================================
+  backend:
+    environment:
+      - DATABASE_URL=postgresql+asyncpg://${POSTGRES_USER:-dreamweaver}:${POSTGRES_PASSWORD:-dreamweaver_password}@db:5432/${POSTGRES_DB:-dreamweaver_db}
+      - REDIS_SENTINEL_ENABLED=true
+      - REDIS_SENTINEL_NODES=redis-sentinel-1:26379,redis-sentinel-2:26379,redis-sentinel-3:26379
+      - REDIS_SENTINEL_MASTER_NAME=mymaster
+      - REDIS_SENTINEL_DB=0
+      - REDIS_SENTINEL_SOCKET_TIMEOUT=0.5
+
+  backend-admin:
+    environment:
+      - DATABASE_URL=postgresql+asyncpg://${POSTGRES_USER:-dreamweaver}:${POSTGRES_PASSWORD:-dreamweaver_password}@db:5432/${POSTGRES_DB:-dreamweaver_db}
+      - REDIS_SENTINEL_ENABLED=true
+      - REDIS_SENTINEL_NODES=redis-sentinel-1:26379,redis-sentinel-2:26379,redis-sentinel-3:26379
+      - REDIS_SENTINEL_MASTER_NAME=mymaster
+      - REDIS_SENTINEL_DB=0
+      - REDIS_SENTINEL_SOCKET_TIMEOUT=0.5
+
+  worker:
+    environment:
+      - DATABASE_URL=postgresql+asyncpg://${POSTGRES_USER:-dreamweaver}:${POSTGRES_PASSWORD:-dreamweaver_password}@db:5432/${POSTGRES_DB:-dreamweaver_db}
+
- REDIS_SENTINEL_ENABLED=true + - REDIS_SENTINEL_NODES=redis-sentinel-1:26379,redis-sentinel-2:26379,redis-sentinel-3:26379 + - REDIS_SENTINEL_MASTER_NAME=mymaster + - REDIS_SENTINEL_DB=0 + - REDIS_SENTINEL_SOCKET_TIMEOUT=0.5 + + celery-beat: + environment: + - DATABASE_URL=postgresql+asyncpg://${POSTGRES_USER:-dreamweaver}:${POSTGRES_PASSWORD:-dreamweaver_password}@db:5432/${POSTGRES_DB:-dreamweaver_db} + - REDIS_SENTINEL_ENABLED=true + - REDIS_SENTINEL_NODES=redis-sentinel-1:26379,redis-sentinel-2:26379,redis-sentinel-3:26379 + - REDIS_SENTINEL_MASTER_NAME=mymaster + - REDIS_SENTINEL_DB=0 + - REDIS_SENTINEL_SOCKET_TIMEOUT=0.5 + + # ============================================== + # PostgreSQL 主库(覆盖默认 db) + # ============================================== + db: + image: postgres:15-alpine + container_name: dreamweaver_db_primary + restart: unless-stopped + environment: + POSTGRES_USER: ${POSTGRES_USER:-dreamweaver} + POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-dreamweaver_password} + POSTGRES_DB: ${POSTGRES_DB:-dreamweaver_db} + command: + - postgres + - -c + - wal_level=replica + - -c + - max_wal_senders=10 + - -c + - max_replication_slots=10 + - -c + - hot_standby=on + - -c + - hba_file=/etc/postgresql/pg_hba.conf + ports: + - "52432:5432" + volumes: + - postgres_primary_data:/var/lib/postgresql/data + - ./ops/postgres-ha/pg_hba.conf:/etc/postgresql/pg_hba.conf:ro + healthcheck: + test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-dreamweaver} -d ${POSTGRES_DB:-dreamweaver_db}"] + interval: 10s + timeout: 5s + retries: 10 + + # ============================================== + # PostgreSQL 从库(基于 pg_basebackup 初始化) + # ============================================== + db-replica: + image: postgres:15-alpine + container_name: dreamweaver_db_replica + restart: unless-stopped + user: "postgres" + environment: + POSTGRES_USER: ${POSTGRES_USER:-dreamweaver} + POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-dreamweaver_password} + POSTGRES_DB: ${POSTGRES_DB:-dreamweaver_db} + 
PGDATA: /var/lib/postgresql/data + depends_on: + db: + condition: service_healthy + volumes: + - postgres_replica_data:/var/lib/postgresql/data + command: + - /bin/sh + - -ec + - | + if [ ! -s "$$PGDATA/PG_VERSION" ]; then + echo "Initializing replica from primary..." + until pg_isready -h db -U "$$POSTGRES_USER" -d "$$POSTGRES_DB"; do sleep 2; done + export PGPASSWORD="$$POSTGRES_PASSWORD" + rm -rf "$$PGDATA"/* + pg_basebackup -h db -D "$$PGDATA" -U "$$POSTGRES_USER" -Fp -Xs -P -R + fi + chmod 700 "$$PGDATA" + exec postgres -c hot_standby=on + healthcheck: + test: + [ + "CMD-SHELL", + "pg_isready -U ${POSTGRES_USER:-dreamweaver} -d ${POSTGRES_DB:-dreamweaver_db} && psql -U ${POSTGRES_USER:-dreamweaver} -d ${POSTGRES_DB:-dreamweaver_db} -tAc 'select pg_is_in_recovery();' | grep -q t", + ] + interval: 10s + timeout: 5s + retries: 10 + + # ============================================== + # PostgreSQL 备份任务(每日一次,保留 7 天) + # ============================================== + postgres-backup: + image: postgres:15-alpine + container_name: dreamweaver_postgres_backup + restart: unless-stopped + environment: + POSTGRES_USER: ${POSTGRES_USER:-dreamweaver} + POSTGRES_DB: ${POSTGRES_DB:-dreamweaver_db} + PGPASSWORD: ${POSTGRES_PASSWORD:-dreamweaver_password} + BACKUP_INTERVAL_SECONDS: ${BACKUP_INTERVAL_SECONDS:-86400} + depends_on: + db: + condition: service_healthy + volumes: + - postgres_backups:/backups + command: + - /bin/sh + - -ec + - | + while true; do + ts=$$(date +%Y%m%d_%H%M%S); + pg_dump -h db -U "$$POSTGRES_USER" -d "$$POSTGRES_DB" -F c -f "/backups/dreamweaver_$${ts}.dump"; + find /backups -type f -name '*.dump' -mtime +7 -delete; + sleep "$$BACKUP_INTERVAL_SECONDS"; + done + + # ============================================== + # Redis 主库(覆盖默认 redis) + # ============================================== + redis: + image: redis:7-alpine + container_name: dreamweaver_redis_master + restart: unless-stopped + ports: + - "52379:6379" + volumes: + - redis_master_data:/data + 
command: ["redis-server", "--appendonly", "yes", "--protected-mode", "no"] + networks: + default: + ipv4_address: 172.29.0.10 + healthcheck: + test: ["CMD", "redis-cli", "ping"] + interval: 10s + timeout: 5s + retries: 10 + + # ============================================== + # Redis 从库 + # ============================================== + redis-replica: + image: redis:7-alpine + container_name: dreamweaver_redis_replica + restart: unless-stopped + depends_on: + redis: + condition: service_healthy + volumes: + - redis_replica_data:/data + command: + [ + "redis-server", + "--appendonly", + "yes", + "--protected-mode", + "no", + "--replicaof", + "172.29.0.10", + "6379", + ] + networks: + default: + ipv4_address: 172.29.0.11 + healthcheck: + test: ["CMD", "redis-cli", "ping"] + interval: 10s + timeout: 5s + retries: 10 + + # ============================================== + # Redis Sentinel (3 节点) + # ============================================== + redis-sentinel-1: + image: redis:7-alpine + container_name: dreamweaver_redis_sentinel_1 + restart: unless-stopped + ports: + - "52631:26379" + depends_on: + redis: + condition: service_healthy + redis-replica: + condition: service_healthy + networks: + default: + ipv4_address: 172.29.0.21 + command: + - /bin/sh + - -ec + - | + cat > /tmp/sentinel.conf < /tmp/sentinel.conf < /tmp/sentinel.conf <nul +if !ERRORLEVEL! equ 0 goto :target_valid + +echo Usage: %0 [stable^|latest^|VERSION] >&2 +echo Example: %0 1.0.58 >&2 +exit /b 1 + +:target_valid + +REM Check for 64-bit Windows +if /i "%PROCESSOR_ARCHITECTURE%"=="AMD64" goto :arch_valid +if /i "%PROCESSOR_ARCHITECTURE%"=="ARM64" goto :arch_valid +if /i "%PROCESSOR_ARCHITEW6432%"=="AMD64" goto :arch_valid +if /i "%PROCESSOR_ARCHITEW6432%"=="ARM64" goto :arch_valid + +echo Claude Code does not support 32-bit Windows. Please use a 64-bit version of Windows. 
>&2 +exit /b 1 + +:arch_valid + +REM Set constants +set "GCS_BUCKET=https://storage.googleapis.com/claude-code-dist-86c565f3-f756-42ad-8dfa-d59b1c096819/claude-code-releases" +set "DOWNLOAD_DIR=%USERPROFILE%\.claude\downloads" +set "PLATFORM=win32-x64" + +REM Create download directory +if not exist "!DOWNLOAD_DIR!" mkdir "!DOWNLOAD_DIR!" + +REM Check for curl availability +curl --version >nul 2>&1 +if !ERRORLEVEL! neq 0 ( + echo curl is required but not available. Please install curl or use PowerShell installer. >&2 + exit /b 1 +) + +REM Always download latest version (which has the most up-to-date installer) +call :download_file "!GCS_BUCKET!/latest" "!DOWNLOAD_DIR!\latest" +if !ERRORLEVEL! neq 0 ( + echo Failed to get latest version >&2 + exit /b 1 +) + +REM Read version from file +set /p VERSION=<"!DOWNLOAD_DIR!\latest" +del "!DOWNLOAD_DIR!\latest" + +REM Download manifest +call :download_file "!GCS_BUCKET!/!VERSION!/manifest.json" "!DOWNLOAD_DIR!\manifest.json" +if !ERRORLEVEL! neq 0 ( + echo Failed to get manifest >&2 + exit /b 1 +) + +REM Extract checksum from manifest +call :parse_manifest "!DOWNLOAD_DIR!\manifest.json" "!PLATFORM!" +if !ERRORLEVEL! neq 0 ( + echo Platform !PLATFORM! not found in manifest >&2 + del "!DOWNLOAD_DIR!\manifest.json" 2>nul + exit /b 1 +) +del "!DOWNLOAD_DIR!\manifest.json" + +REM Download binary +set "BINARY_PATH=!DOWNLOAD_DIR!\claude-!VERSION!-!PLATFORM!.exe" +call :download_file "!GCS_BUCKET!/!VERSION!/!PLATFORM!/claude.exe" "!BINARY_PATH!" +if !ERRORLEVEL! neq 0 ( + echo Failed to download binary >&2 + if exist "!BINARY_PATH!" del "!BINARY_PATH!" + exit /b 1 +) + +REM Verify checksum +call :verify_checksum "!BINARY_PATH!" "!EXPECTED_CHECKSUM!" +if !ERRORLEVEL! neq 0 ( + echo Checksum verification failed >&2 + del "!BINARY_PATH!" + exit /b 1 +) + +REM Run claude install to set up launcher and shell integration +echo Setting up Claude Code... +"!BINARY_PATH!" install "!TARGET!" +set "INSTALL_RESULT=!ERRORLEVEL!" 
+ +REM Clean up downloaded file +REM Wait a moment for any file handles to be released +timeout /t 1 /nobreak >nul 2>&1 +del /f "!BINARY_PATH!" >nul 2>&1 +if exist "!BINARY_PATH!" ( + echo Warning: Could not remove temporary file: !BINARY_PATH! +) + +if !INSTALL_RESULT! neq 0 ( + echo Installation failed >&2 + exit /b 1 +) + +echo. +echo Installation complete^^! +echo. +exit /b 0 + +REM ============================================================================ +REM SUBROUTINES +REM ============================================================================ + +:download_file +REM Downloads a file using curl +REM Args: %1=URL, %2=OutputPath +set "URL=%~1" +set "OUTPUT=%~2" + +curl -fsSL "!URL!" -o "!OUTPUT!" +exit /b !ERRORLEVEL! + +:parse_manifest +REM Parse JSON manifest to extract checksum for platform +REM Args: %1=ManifestPath, %2=Platform +set "MANIFEST_PATH=%~1" +set "PLATFORM_NAME=%~2" +set "EXPECTED_CHECKSUM=" + +REM Use findstr to find platform section, then look for checksum +set "FOUND_PLATFORM=" +set "IN_PLATFORM_SECTION=" + +REM Read the manifest line by line +for /f "usebackq tokens=*" %%i in ("!MANIFEST_PATH!") do ( + set "LINE=%%i" + + REM Check if this line contains our platform + echo !LINE! | findstr /c:"\"%PLATFORM_NAME%\":" >nul + if !ERRORLEVEL! equ 0 ( + set "IN_PLATFORM_SECTION=1" + ) + + REM If we're in the platform section, look for checksum + if defined IN_PLATFORM_SECTION ( + echo !LINE! | findstr /c:"\"checksum\":" >nul + if !ERRORLEVEL! equ 0 ( + REM Extract checksum value + for /f "tokens=2 delims=:" %%j in ("!LINE!") do ( + set "CHECKSUM_PART=%%j" + REM Remove quotes, whitespace, and comma + set "CHECKSUM_PART=!CHECKSUM_PART: =!" + set "CHECKSUM_PART=!CHECKSUM_PART:"=!" + set "CHECKSUM_PART=!CHECKSUM_PART:,=!" + + REM Check if it looks like a SHA256 (64 hex chars) + if not "!CHECKSUM_PART!"=="" ( + call :check_length "!CHECKSUM_PART!" 64 + if !ERRORLEVEL! equ 0 ( + set "EXPECTED_CHECKSUM=!CHECKSUM_PART!" 
+ exit /b 0 + ) + ) + ) + ) + + REM Check if we've left the platform section (closing brace) + echo !LINE! | findstr /c:"}" >nul + if !ERRORLEVEL! equ 0 set "IN_PLATFORM_SECTION=" + ) +) + +if "!EXPECTED_CHECKSUM!"=="" exit /b 1 +exit /b 0 + +:check_length +REM Check if string length equals expected length +REM Args: %1=String, %2=ExpectedLength +set "STR=%~1" +set "EXPECTED_LEN=%~2" +set "LEN=0" +:count_loop +if "!STR:~%LEN%,1!"=="" goto :count_done +set /a LEN+=1 +goto :count_loop +:count_done +if %LEN%==%EXPECTED_LEN% exit /b 0 +exit /b 1 + +:verify_checksum +REM Verify file checksum using certutil +REM Args: %1=FilePath, %2=ExpectedChecksum +set "FILE_PATH=%~1" +set "EXPECTED=%~2" + +for /f "skip=1 tokens=*" %%i in ('certutil -hashfile "!FILE_PATH!" SHA256') do ( + set "ACTUAL=%%i" + set "ACTUAL=!ACTUAL: =!" + if "!ACTUAL!"=="CertUtil:Thecommandcompletedsuccessfully." goto :verify_done + if "!ACTUAL!" neq "" ( + if /i "!ACTUAL!"=="!EXPECTED!" ( + exit /b 0 + ) else ( + exit /b 1 + ) + ) +) + +:verify_done +exit /b 1 diff --git a/ops/postgres-ha/pg_hba.conf b/ops/postgres-ha/pg_hba.conf new file mode 100644 index 0000000..10ba867 --- /dev/null +++ b/ops/postgres-ha/pg_hba.conf @@ -0,0 +1,10 @@ +# Allow local socket access +local all all trust + +# Allow all IPv4/IPv6 client access in local docker network +host all all 0.0.0.0/0 trust +host all all ::/0 trust + +# Allow streaming replication connections +host replication all 0.0.0.0/0 trust +host replication all ::/0 trust
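The `REDIS_SENTINEL_NODES` value wired into the compose overrides above is parsed by the new `redis_sentinel_hosts` property in `backend/app/core/config.py`. The same parsing can be sanity-checked standalone before deployment; the sketch below mirrors the property's logic as a hypothetical free function (it is illustrative only, not the app's `Settings` class):

```python
def parse_sentinel_nodes(raw: str) -> list[tuple[str, int]]:
    """Parse 'host1:26379,host2:26379' into (host, port) tuples.

    Mirrors Settings.redis_sentinel_hosts in backend/app/core/config.py;
    this standalone helper exists only for illustration.
    """
    nodes: list[tuple[str, int]] = []
    for item in raw.strip().split(","):
        value = item.strip()
        if not value:
            continue  # tolerate empty segments / trailing commas
        if ":" not in value:
            raise ValueError(f"Invalid sentinel node format: {value}")
        # rsplit on the last ':' so only the final segment is treated as the port
        host, port_text = value.rsplit(":", 1)
        if not host:
            raise ValueError(f"Invalid sentinel node host: {value}")
        try:
            port = int(port_text)
        except ValueError as exc:
            raise ValueError(f"Invalid sentinel node port: {value}") from exc
        nodes.append((host, port))
    return nodes


print(parse_sentinel_nodes("redis-sentinel-1:26379, redis-sentinel-2:26379,"))
# → [('redis-sentinel-1', 26379), ('redis-sentinel-2', 26379)]
```

Note that the `Settings` validator in the diff only requires `REDIS_SENTINEL_NODES` to be non-empty when Sentinel is enabled; malformed entries surface later, when `redis_sentinel_hosts` is first read, so a quick check like this can catch typos earlier.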