docs: add docs (#115)

* docs: add prd documentation
* docs: add app-flow doc
Yukaii · Feb 22, 2025 · 007f02c
1 parent dc53c70
commit 007f02c

Showing 2 changed files with 302 additions and 0 deletions.
diff --git a/docs/app-flow.md b/docs/app-flow.md
@@ -0,0 +1,195 @@
+# App Flow Documentation
+
+## 1. Core Architecture Overview
+
+Gakuon is a learning system with two interfaces: a web interface (PWA) and a terminal interface (TUI). Both share the same core services but consume them differently: the TUI imports the service modules directly, while the PWA calls them through a REST API served by Express.
+
+```mermaid
+graph TD
+    A[Core Services] --> B[Express Server]
+    A --> C[TUI Interface]
+    D[PWA Client] --> B
+
+    subgraph "Backend Services"
+    E1[Anki Service]
+    E2[OpenAI Service]
+    E3[Content Manager]
+    end
+
+    A --> E1
+    A --> E2
+    A --> E3
+
+    C --> E1
+    C --> E2
+    C --> E3
+
+    subgraph "Service Consumption"
+    C -.- F[Direct Module Import]
+    D -.- G[REST API]
+    end
+```
+
+File Structure:
+```
+src/
+├── client/           # React PWA client
+│   ├── components/   # UI components
+│   ├── api/          # API client
+│   └── views/        # Page components
+├── commands/         # CLI commands
+│   ├── learn.ts      # TUI interface
+│   └── serve.ts      # Web server
+├── services/         # Shared services
+│   ├── anki.ts       # Anki integration
+│   ├── openai.ts     # AI services
+│   └── audio.ts      # Audio handling
+└── utils/            # Shared utilities
+```
+
+## 2. Service Architecture
+
+### Core Services
+1. **Service Layer**: Shared business logic
+   ```typescript
+   // src/services/content-manager.ts
+   export class ContentManager {
+     constructor(
+       private ankiService: AnkiService,
+       private openaiService: OpenAIService,
+       private ttsVoice: string,
+     ) {}
+
+     async getOrGenerateContent(card: Card, config: DeckConfig) {
+       // Shared content generation logic
+     }
+   }
+   ```
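The name `getOrGenerateContent` suggests a get-or-generate pattern: return content that already exists, otherwise generate it and remember the result. The following is a minimal sketch of that pattern, not taken from the Gakuon source. It assumes an in-memory `Map` as the store and a caller-supplied generator function; `GeneratedContent`, `GenerateFn`, and `ContentCache` are hypothetical names.

```typescript
// Hypothetical content shape; the real fields are not shown in this document.
interface GeneratedContent {
  cardId: number;
  text: string;
}

// Hypothetical generator signature, standing in for the AI/TTS work.
type GenerateFn = (cardId: number) => Promise<GeneratedContent>;

class ContentCache {
  // Promises are stored so concurrent requests for one card share a single generation.
  private readonly pending = new Map<number, Promise<GeneratedContent>>();
  private readonly generate: GenerateFn;

  constructor(generate: GenerateFn) {
    this.generate = generate;
  }

  getOrGenerate(cardId: number): Promise<GeneratedContent> {
    const cached = this.pending.get(cardId);
    if (cached) return cached;

    const created = this.generate(cardId).catch((error: unknown) => {
      // Drop failed generations so a later call can retry.
      this.pending.delete(cardId);
      throw error;
    });
    this.pending.set(cardId, created);
    return created;
  }
}
```

Caching the promise rather than the resolved value means two requests arriving at the same time trigger only one generation, which matters when each generation is a paid API call.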
+
+### Service Consumption Methods
+
+1. **TUI Interface**: Direct module imports
+   ```typescript
+   // src/commands/learn.ts
+   const contentManager = new ContentManager(
+     ankiService,
+     openaiService,
+     config.global.ttsVoice,
+   );
+   ```
+
+2. **Web Interface**: REST API abstraction
+   ```typescript
+   // src/client/api/index.ts
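+   export async function fetchCard(cardId: number) {
+     const response = await fetch(`/api/cards/${cardId}`);
+     if (!response.ok) {
+       // Surface HTTP errors instead of parsing an error body as card data
+       throw new Error(`Failed to fetch card ${cardId}: ${response.status}`);
+     }
+     return response.json();
+   }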
+   ```
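The two consumption methods can be read as two implementations of one contract. The sketch below illustrates that idea under stated assumptions and is not the project's code: `CardContent`, `ContentSource`, `DirectSource`, `RestSource`, `HttpResponse`, and `Transport` are hypothetical names, and a `Map` stands in for the real service layer.

```typescript
// Hypothetical content shape; the real response fields may differ.
interface CardContent {
  cardId: number;
  audioUrls: string[];
}

// One contract, two ways to reach the same service.
interface ContentSource {
  getContent(cardId: number): Promise<CardContent>;
}

// TUI path: the command imports the service and calls it in-process.
class DirectSource implements ContentSource {
  private readonly store: Map<number, CardContent>;

  constructor(store: Map<number, CardContent>) {
    this.store = store;
  }

  async getContent(cardId: number): Promise<CardContent> {
    const content = this.store.get(cardId);
    if (!content) throw new Error(`Card ${cardId} not found`);
    return content;
  }
}

// Minimal slice of a fetch-style response, so the transport can be faked.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type Transport = (path: string) => Promise<HttpResponse>;

// Web path: the same contract, fulfilled over the REST API.
class RestSource implements ContentSource {
  private readonly transport: Transport;

  constructor(transport: Transport) {
    this.transport = transport;
  }

  async getContent(cardId: number): Promise<CardContent> {
    const response = await this.transport(`/api/cards/${cardId}`);
    if (!response.ok) throw new Error(`Request failed: ${response.status}`);
    return (await response.json()) as CardContent;
  }
}
```

In a browser, `(path) => fetch(path)` would satisfy `Transport`. Injecting the transport keeps the REST path testable without a running server.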
+
+## 3. Interface Implementations
+
+### Web Interface (Primary)
+1. **Express Server**:
+   ```typescript
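+   app.get("/api/cards/:id", async (req, res) => {
+     const cardId = Number(req.params.id);
+     // getCardsInfo takes an array of IDs; assuming it returns a matching array, unwrap the first entry
+     const [card] = await ankiService.getCardsInfo([cardId]);
+     if (!card) {
+       res.status(404).json({ error: "Card not found" });
+       return;
+     }
+     const content = await contentManager.getExistingContent(card);
+     res.json(content);
+   });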
+   ```
+
+2. **PWA Client**:
+   ```typescript
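+   // src/client/views/DeckView.tsx
+   export function DeckView({ cardId }: { cardId: number }) {
+     // SWR passes the key to the fetcher, so wrap fetchCard to hand it the numeric ID
+     const { data: cardInfo } = useSWR(
+       `/api/cards/${cardId}`,
+       () => fetchCard(cardId)
+     );
+
+     // data is undefined until the first response arrives
+     if (!cardInfo) return <div>Loading...</div>;
+
+     return (
+       <div>
+         <AudioPlayer urls={cardInfo.audioUrls} />
+         <CardControls />
+       </div>
+     );
+   }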
+   ```
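The `/api/cards/:id` handler converts the route parameter with `Number(...)`, which yields `NaN` for non-numeric input and `0` for an empty string. A minimal sketch of stricter parsing follows; `parseCardId` is a hypothetical helper, not part of the Gakuon source.

```typescript
// Parse a route parameter into a card ID, or return null if it is not a positive safe integer.
function parseCardId(raw: string): number | null {
  // Accept plain decimal digits only; this rejects "", "-5", "1.5", and "1e3".
  if (!/^\d+$/.test(raw)) return null;
  const id = Number(raw);
  return Number.isSafeInteger(id) && id > 0 ? id : null;
}
```

In the handler, a `null` result would map to a 400 response before any call reaches `ankiService`.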
+
+### Terminal Interface (Secondary)
+```typescript
+// src/commands/learn.ts
+export async function learn() {
+  const contentManager = new ContentManager(/* ... */);
+
+  // Direct service usage (deckName and deckConfig come from the loaded config, elided here)
+  const cards = await ankiService.getDueCardsInfo(deckName);
+  const card = cards[0]; // first due card, for illustration
+  const content = await contentManager.getOrGenerateContent(card, deckConfig);
+
+  // Terminal-specific UI
+  keyboard.on(KeyAction.PLAY_ALL, async () => {
+    await audioPlayer.play(content.audioFiles);
+  });
+}
+```
+
+## 4. Data Flow
+
+### Web Interface Flow
+```mermaid
+graph TD
+    A[User Action] --> B[API Request]
+    B --> C[Express Handler]
+    C --> D[Service Layer]
+    D --> E[Response]
+    E --> F[UI Update]
+```
+
+### TUI Flow
+```mermaid
+graph TD
+    A[User Input] --> B[Command Handler]
+    B --> C[Direct Service Call]
+    C --> D[Terminal Output]
+```
+
+## 5. API Integration
+
+1. **REST Endpoints**:
+   ```
+   GET    /api/decks
+   GET    /api/decks/:name/cards
+   GET    /api/cards/:id
+   POST   /api/cards/:id/answer
+   GET    /api/audio/:filename
+   ```
+
+2. **Direct Service Methods**:
+   ```typescript
+   ankiService.getDueCardsInfo()
+   contentManager.getOrGenerateContent()
+   audioPlayer.play()
+   ```
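The endpoints above take a deck name and a filename as path segments. Anki deck names can contain spaces and the `::` sub-deck separator, so a client has to encode them. A minimal sketch of URL builders for the five listed endpoints; the `routes` object and its property names are hypothetical.

```typescript
// URL builders for the five endpoints listed above. The helper names are hypothetical.
const routes = {
  decks: (): string => "/api/decks",
  // Deck names are free text, so encode them as a single path segment.
  deckCards: (name: string): string => `/api/decks/${encodeURIComponent(name)}/cards`,
  card: (id: number): string => `/api/cards/${id}`,
  answer: (id: number): string => `/api/cards/${id}/answer`,
  audio: (filename: string): string => `/api/audio/${encodeURIComponent(filename)}`,
};
```

Keeping the paths in one place means the client and any tests share a single definition of each route.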
+
+## 6. Operational Flow
+
+### Web Operations
+1. **Card Management**:
+   - Fetch through REST API
+   - Stream audio via HTTP
+   - Update through POST requests
+
+2. **State Management**:
+   - Client-side data fetching and caching with SWR
+   - Server-side caching of generated content
+   - Audio streamed over HTTP
+
+### TUI Operations
+1. **Card Processing**:
+   - Direct service calls
+   - In-memory audio handling
+   - Keyboard event handling
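Keyboard event handling in the TUI maps key presses to named actions, as in `keyboard.on(KeyAction.PLAY_ALL, ...)`. The sketch below shows one way such a dispatcher could work. It is an assumption for illustration only: `KeyDispatcher` is a hypothetical class, and of the action names only `PLAY_ALL` appears in the source.

```typescript
// Hypothetical action names; only PLAY_ALL appears in the source.
type KeyAction = "PLAY_ALL" | "NEXT" | "QUIT";
type Handler = () => void;

class KeyDispatcher {
  private readonly bindings: Map<string, KeyAction>;
  private readonly handlers = new Map<KeyAction, Handler[]>();

  // bindings: pairs of [key, action], e.g. ["p", "PLAY_ALL"]
  constructor(bindings: Array<[string, KeyAction]>) {
    this.bindings = new Map(bindings);
  }

  on(action: KeyAction, handler: Handler): void {
    const list = this.handlers.get(action) ?? [];
    list.push(handler);
    this.handlers.set(action, list);
  }

  // Runs every handler for the action bound to this key.
  // Returns true if the key was bound to an action.
  press(key: string): boolean {
    const action = this.bindings.get(key);
    if (action === undefined) return false;
    for (const handler of this.handlers.get(action) ?? []) handler();
    return true;
  }
}
```

Separating key bindings from action handlers lets the key map come from configuration while the handlers stay in code.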
+
+Both interfaces run on the same core services and differ only in how they reach them. The web interface goes through the REST API, which makes it usable from any browser; the TUI imports the service modules directly, which keeps it lightweight.
diff --git a/docs/prd.md b/docs/prd.md
@@ -0,0 +1,107 @@
+# Gakuon (学音) - Project Requirements Documentation
+
+## 1. App Overview
+- **Brief Description**: Gakuon is a CLI application with PWA capabilities that enhances Anki flashcard learning through AI-powered audio generation
+- **Project Goals**:
+  - Enable immersive audio learning from existing Anki decks
+  - Provide flexible configuration for AI-driven content generation
+  - Offer both CLI and web-based interfaces for maximum accessibility
+
+## 2. User Flow
+1. **Main Operation Flow**:
+   - User starts Gakuon server or CLI interface
+   - Application connects to AnkiConnect API
+   - User selects target deck
+   - AI generates contextual text based on card content
+   - Text is converted to audio via TTS
+   - User consumes audio content
+
+2. **Development Flow**:
+   ```mermaid
+   graph LR
+   A[Anki Deck] --> B[AnkiConnect]
+   B --> C[Gakuon Server]
+   C --> D[AI Processing]
+   D --> E[TTS Service]
+   E --> F[Audio Output]
+   ```
+
+3. **Runtime Flow**:
+   - Server Mode: REST API + PWA Frontend
+   - CLI Mode: Terminal user interface (TUI)
+   - Both modes share core audio generation logic
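The main operation flow reduces to a pipeline: card text goes in, the AI service produces a script, and the TTS service turns the script into audio. A minimal sketch of that pipeline with both stages injected as functions, so real clients or stubs can be swapped in. All names here (`CardText`, `AudioClip`, `PipelineStages`, `buildAudio`) are hypothetical and not taken from the Gakuon source.

```typescript
interface CardText {
  cardId: number;
  front: string;
  back: string;
}

interface AudioClip {
  cardId: number;
  script: string;
  audio: Uint8Array;
}

interface PipelineStages {
  // Stands in for the AI text generation service.
  generateText(card: CardText): Promise<string>;
  // Stands in for the TTS service.
  synthesize(script: string): Promise<Uint8Array>;
}

async function buildAudio(cards: CardText[], stages: PipelineStages): Promise<AudioClip[]> {
  const clips: AudioClip[] = [];
  // Process cards one at a time to stay gentle on rate-limited services.
  for (const card of cards) {
    const script = await stages.generateText(card);
    const audio = await stages.synthesize(script);
    clips.push({ cardId: card.cardId, script, audio });
  }
  return clips;
}
```

Because the stages are plain functions, server mode and CLI mode could both call `buildAudio` with the same implementations, which matches the note that both modes share the core audio generation logic.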
+
+## 3. Tech Stack & APIs
+- **Core Technology Stack**:
+  - Bun runtime environment
+  - Node.js compatibility layer
+  - AnkiConnect API integration
+  - AI text generation service
+  - TTS service integration
+
+- **Frontend Layer**:
+  - PWA framework
+  - Responsive web interface
+  - TUI library for CLI interface
+
+## 4. Core Features
+1. **Primary Feature Set**:
+   - AnkiConnect API integration
+   - AI prompt configuration system
+   - Text-to-Speech pipeline
+   - Audio playback controls
+   - Deck selection and management
+
+2. **Component System**:
+   - Server component
+   - CLI interface
+   - PWA frontend
+   - Audio processing pipeline
+   - Configuration management
+
+## 5. In-scope and Out-of-scope
+- **In-scope Items**:
+  - Anki deck reading
+  - AI text generation
+  - Audio conversion
+  - Basic playback controls
+  - Configuration management
+
+- **Out-of-scope Items**:
+  - Anki card modification
+  - Direct AnkiWeb integration
+  - Complex audio editing
+  - Multi-user support
+
+## 6. Non-functional Requirements
+1. **Performance Requirements**:
+   - Audio generation under 5 seconds
+   - Smooth playback experience
+   - Efficient deck processing
+
+2. **Security Requirements**:
+   - Local API key storage
+   - Secure API communications
+   - User data protection
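One way to make the 5-second audio generation target observable is to race each generation against a timer. This is a minimal sketch; `withTimeout` is a hypothetical helper, and it only stops waiting for the result: it does not cancel the underlying request.

```typescript
// Rejects if `task` does not settle within `ms` milliseconds.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    task.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error: unknown) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A call such as `withTimeout(contentManager.getOrGenerateContent(card, config), 5000)` would then fail loudly when generation exceeds the budget, which makes slow AI or TTS responses visible instead of leaving the user waiting.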
+
+## 7. Constraints & Assumptions
+- **Technical Constraints**:
+  - AnkiConnect availability
+  - AI service rate limits
+  - TTS service limitations
+
+- **Assumptions**:
+  - User has Anki installed
+  - AnkiConnect is configured
+  - Internet connectivity available
+
+## 8. Known Issues & Potential Pitfalls
+1. **Technical Challenges**:
+   - AnkiConnect API stability
+   - AI response quality
+   - Audio format compatibility
+
+2. **Performance Concerns**:
+   - Large deck processing time
+   - AI service latency
+   - Audio file storage management