docs: add docs (#115)
* docs: add prd documentation
* docs: add app-flow doc
Yukaii authored Feb 22, 2025
1 parent dc53c70 commit 007f02c
Showing 2 changed files with 302 additions and 0 deletions.
195 changes: 195 additions & 0 deletions docs/app-flow.md
@@ -0,0 +1,195 @@
# App Flow Documentation

## 1. Core Architecture Overview

Gakuon is a learning system with dual interfaces: a web interface (PWA) and a terminal interface (TUI). Both interfaces share the same core services but differ in how they consume them.

```mermaid
graph TD
A[Core Services] --> B[Express Server]
A --> C[TUI Interface]
D[PWA Client] --> B
subgraph "Backend Services"
E1[Anki Service]
E2[OpenAI Service]
E3[Content Manager]
end
A --> E1
A --> E2
A --> E3
C --> E1
C --> E2
C --> E3
subgraph "Service Consumption"
C -.- F[Direct Module Import]
D -.- G[REST API]
end
```

File Structure:
```
src/
├── client/ # React PWA client
│ ├── components/ # UI components
│ ├── api/ # API client
│ └── views/ # Page components
├── commands/ # CLI commands
│ ├── learn.ts # TUI interface
│ └── serve.ts # Web server
├── services/ # Shared services
│ ├── anki.ts # Anki integration
│ ├── openai.ts # AI services
│ └── audio.ts # Audio handling
└── utils/ # Shared utilities
```

## 2. Service Architecture

### Core Services
1. **Service Layer**: Shared business logic
```typescript
// src/services/content-manager.ts
export class ContentManager {
  constructor(
    private ankiService: AnkiService,
    private openaiService: OpenAIService,
    private ttsVoice: string,
  ) {}

  async getOrGenerateContent(card: Card, config: DeckConfig) {
    // Shared content generation logic
  }
}
```
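
The name `getOrGenerateContent` suggests a lookup that returns stored content when it exists and generates it otherwise. A minimal sketch of that pattern follows. The `CardLike` and `GeneratedContent` shapes, the in-memory `Map`, and the injected `generate` function are assumptions made for illustration, not Gakuon's actual types or storage.

```typescript
// Hypothetical shapes for illustration; not Gakuon's real types.
interface CardLike {
  cardId: number;
  front: string;
}

interface GeneratedContent {
  sentence: string;
  audioFile: string;
}

type Generate = (card: CardLike) => Promise<GeneratedContent>;

// Returns stored content when present; otherwise generates and stores it.
class GetOrGenerate {
  private readonly store = new Map<number, GeneratedContent>();
  private readonly generate: Generate;

  constructor(generate: Generate) {
    this.generate = generate;
  }

  async get(card: CardLike): Promise<GeneratedContent> {
    const existing = this.store.get(card.cardId);
    if (existing !== undefined) return existing;

    const fresh = await this.generate(card);
    this.store.set(card.cardId, fresh);
    return fresh;
  }
}
```

Injecting the generator keeps the lookup testable without calling an AI or TTS service.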

### Service Consumption Methods

1. **TUI Interface**: Direct module imports
```typescript
// src/commands/learn.ts
const contentManager = new ContentManager(
  ankiService,
  openaiService,
  config.global.ttsVoice,
);
```

2. **Web Interface**: REST API abstraction
```typescript
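// src/client/api/index.ts
export async function fetchCard(cardId: number) {
  const response = await fetch(`/api/cards/${cardId}`);
  // fetch() resolves on HTTP errors too, so check the status explicitly.
  if (!response.ok) {
    throw new Error(`Failed to fetch card ${cardId}: ${response.status}`);
  }
  return response.json();
}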
```
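
Both consumption methods can sit behind one interface so that calling code does not depend on which path it uses. The sketch below is one way this could look. `CardSource`, `DirectCardSource`, `HttpCardSource`, `FetchLike`, and the `CardContent` shape are hypothetical names, not part of the Gakuon codebase.

```typescript
// Hypothetical response shape for illustration.
interface CardContent {
  cardId: number;
  audioUrls: string[];
}

interface CardSource {
  getCard(cardId: number): Promise<CardContent>;
}

// TUI path: call the shared service in-process.
class DirectCardSource implements CardSource {
  private readonly load: (cardId: number) => Promise<CardContent>;

  constructor(load: (cardId: number) => Promise<CardContent>) {
    this.load = load;
  }

  getCard(cardId: number): Promise<CardContent> {
    return this.load(cardId);
  }
}

// Minimal subset of the Fetch API that this sketch relies on.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Web path: reach the same service through the REST endpoint.
class HttpCardSource implements CardSource {
  private readonly fetchFn: FetchLike;

  constructor(fetchFn: FetchLike) {
    this.fetchFn = fetchFn;
  }

  async getCard(cardId: number): Promise<CardContent> {
    const response = await this.fetchFn(`/api/cards/${cardId}`);
    if (!response.ok) {
      throw new Error(`GET /api/cards/${cardId} failed with ${response.status}`);
    }
    return (await response.json()) as CardContent;
  }
}
```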

## 3. Interface Implementations

### Web Interface (Primary)
1. **Express Server**:
```typescript
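app.get("/api/cards/:id", async (req, res) => {
  const cardId = Number(req.params.id);
  // getCardsInfo takes a list of ids, so it presumably returns a list; take the single match.
  const [card] = await ankiService.getCardsInfo([cardId]);
  if (!card) {
    res.status(404).json({ error: "Card not found" });
    return;
  }
  const content = await contentManager.getExistingContent(card);
  res.json(content);
});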
```

2. **PWA Client**:
```typescript
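// src/client/views/DeckView.tsx
import useSWR from "swr";
import { fetchCard } from "../api";

// cardId is taken as a prop here for illustration.
export function DeckView({ cardId }: { cardId: number }) {
  // SWR passes the key to the fetcher, so wrap fetchCard to give it the numeric id it expects.
  const { data: cardInfo, error } = useSWR(
    `/api/cards/${cardId}`,
    () => fetchCard(cardId),
  );

  // data is undefined until the first response arrives.
  if (error) return <div>Failed to load card.</div>;
  if (!cardInfo) return <div>Loading...</div>;

  return (
    <div>
      <AudioPlayer urls={cardInfo.audioUrls} />
      <CardControls />
    </div>
  );
}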
```
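
The logic of the `GET /api/cards/:id` handler shown earlier in this section can be separated from Express so that input validation and the not-found case are easy to test. The following is a sketch under assumptions: `getCardHandler`, `CardDeps`, and `HandlerResult` are hypothetical names, and `CardDeps` only stands in for the `ankiService` and `contentManager` calls used in the handler.

```typescript
interface HandlerResult {
  status: number;
  body: unknown;
}

// Hypothetical dependencies standing in for ankiService and contentManager.
interface CardDeps {
  getCardsInfo(ids: number[]): Promise<Array<{ cardId: number }>>;
  getExistingContent(card: { cardId: number }): Promise<unknown>;
}

async function getCardHandler(idParam: string, deps: CardDeps): Promise<HandlerResult> {
  // Reject non-numeric, fractional, zero, and negative ids before touching Anki.
  const cardId = Number(idParam);
  if (!Number.isInteger(cardId) || cardId <= 0) {
    return { status: 400, body: { error: "Card id must be a positive integer" } };
  }

  const [card] = await deps.getCardsInfo([cardId]);
  if (card === undefined) {
    return { status: 404, body: { error: `Card ${cardId} not found` } };
  }

  return { status: 200, body: await deps.getExistingContent(card) };
}
```

Inside an Express route, the result could be sent with `res.status(result.status).json(result.body)`.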

### Terminal Interface (Secondary)
```typescript
// src/commands/learn.ts
// deckName and deckConfig are taken as parameters here for illustration.
export async function learn(deckName: string, deckConfig: DeckConfig) {
  const contentManager = new ContentManager(/* ... */);

  // Direct service usage
  const cards = await ankiService.getDueCardsInfo(deckName);
  const [card] = cards; // first due card, for brevity
  const content = await contentManager.getOrGenerateContent(card, deckConfig);

  // Terminal-specific UI
  keyboard.on(KeyAction.PLAY_ALL, async () => {
    await audioPlayer.play(content.audioFiles);
  });
}
```

## 4. Data Flow

### Web Interface Flow
```mermaid
graph TD
A[User Action] --> B[API Request]
B --> C[Express Handler]
C --> D[Service Layer]
D --> E[Response]
E --> F[UI Update]
```

### TUI Flow
```mermaid
graph TD
A[User Input] --> B[Command Handler]
B --> C[Direct Service Call]
C --> D[Terminal Output]
```
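
The TUI flow above (user input, command handler, service call, terminal output) can be illustrated with a small key dispatcher. This is a sketch only: `CommandDispatcher` and the `KeyActionName` values other than `PLAY_ALL` are hypothetical, and handlers return a line of text in place of real service calls and terminal rendering.

```typescript
// Hypothetical action names; only PLAY_ALL appears in the document.
type KeyActionName = "PLAY_ALL" | "NEXT" | "QUIT";
type Handler = () => string;

class CommandDispatcher {
  private readonly keymap = new Map<string, KeyActionName>();
  private readonly handlers = new Map<KeyActionName, Handler>();

  // Associate a physical key with a named action.
  bind(key: string, action: KeyActionName): void {
    this.keymap.set(key, action);
  }

  // Register the handler that performs the action.
  on(action: KeyActionName, handler: Handler): void {
    this.handlers.set(action, handler);
  }

  // User input -> command handler -> line of terminal output.
  dispatch(key: string): string {
    const action = this.keymap.get(key);
    if (action === undefined) return `Unknown key: ${key}`;

    const handler = this.handlers.get(action);
    return handler ? handler() : `No handler for ${action}`;
  }
}
```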

## 5. API Integration

1. **REST Endpoints**:
```
GET /api/decks
GET /api/decks/:name/cards
GET /api/cards/:id
POST /api/cards/:id/answer
GET /api/audio/:filename
```

2. **Direct Service Methods**:
```typescript
ankiService.getDueCardsInfo()
contentManager.getOrGenerateContent()
audioPlayer.play()
```
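
Anki deck names often contain spaces and the `::` subdeck separator, so clients of the REST endpoints listed above need to encode path segments. The helper below is a sketch; the `api` object and its method names are hypothetical, and only the paths come from the endpoint list.

```typescript
// URL builders for the documented endpoints. Path segments supplied by the
// caller are percent-encoded so deck names and filenames stay one segment.
const api = {
  decks: (): string => "/api/decks",
  deckCards: (name: string): string => `/api/decks/${encodeURIComponent(name)}/cards`,
  card: (id: number): string => `/api/cards/${id}`,
  answer: (id: number): string => `/api/cards/${id}/answer`,
  audio: (filename: string): string => `/api/audio/${encodeURIComponent(filename)}`,
};

// Example: a subdeck name with a space.
const exampleUrl = api.deckCards("Japanese::Core 2k");
// exampleUrl === "/api/decks/Japanese%3A%3ACore%202k/cards"
```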

## 6. Operational Flow

### Web Operations
1. **Card Management**:
   - Fetch through REST API
   - Stream audio via HTTP
   - Update through POST requests

2. **State Management**:
   - Client-side SWR
   - Server-side caching
   - Audio streaming
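
The document does not say what the server caches or for how long. As one possible reading of "server-side caching", the sketch below shows a small time-to-live cache. `TtlCache`, its injectable clock, and the example TTL are assumptions for illustration.

```typescript
// A minimal in-memory cache whose entries expire after ttlMs milliseconds.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;

    // Drop the entry lazily once it has expired.
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A handler could consult such a cache before asking Anki for deck or card data, and fall back to the service on a miss.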

### TUI Operations
1. **Card Processing**:
   - Direct service calls
   - In-memory audio handling
   - Keyboard event handling

Both interfaces rely on the same core services and differ only in how they reach them. The web interface calls the services through REST APIs, while the TUI imports the modules directly, which keeps it lightweight.
107 changes: 107 additions & 0 deletions docs/prd.md
@@ -0,0 +1,107 @@
# Gakuon (学音) - Project Requirements Documentation

## 1. App Overview
- **Brief Description**: Gakuon is a CLI application with PWA capabilities that enhances Anki flashcard learning through AI-powered audio generation
- **Project Goals**:
  - Enable immersive audio learning from existing Anki decks
  - Provide flexible configuration for AI-driven content generation
  - Offer both CLI and web-based interfaces for maximum accessibility

## 2. User Flow
1. **Main Operation Flow**:
   - User starts Gakuon server or CLI interface
   - Application connects to AnkiConnect API
   - User selects target deck
   - AI generates contextual text based on card content
   - Text is converted to audio via TTS
   - User consumes audio content

2. **Development Flow**:
```mermaid
graph LR
A[Anki Deck] --> B[AnkiConnect]
B --> C[Gakuon Server]
C --> D[AI Processing]
D --> E[TTS Service]
E --> F[Audio Output]
```

3. **Runtime Flow**:
   - Server Mode: REST API + PWA Frontend
   - CLI Mode: Terminal user interface (TUI)
   - Both modes share core audio generation logic
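
The main operation flow in this section (read due cards, generate text, convert it to audio) can be sketched as a pipeline over injected steps. `PipelineDeps` and `runDeck` are hypothetical names, and cards are reduced to plain strings to keep the example self-contained.

```typescript
// Hypothetical step signatures; real services would wrap AnkiConnect, an AI API, and TTS.
interface PipelineDeps {
  listDueCards(deck: string): Promise<string[]>; // card text for each due card
  generateText(cardText: string): Promise<string>; // AI step
  synthesize(text: string): Promise<string>; // TTS step, returns an audio path
}

// Runs the steps in order for every due card and collects the audio paths.
async function runDeck(deck: string, deps: PipelineDeps): Promise<string[]> {
  const audioPaths: string[] = [];
  for (const cardText of await deps.listDueCards(deck)) {
    const sentence = await deps.generateText(cardText);
    audioPaths.push(await deps.synthesize(sentence));
  }
  return audioPaths;
}
```

Because the steps are injected, the server mode and the CLI mode could both call the same function with their own wiring.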

## 3. Tech Stack & APIs
- **Core Technology Stack**:
  - Bun.js runtime environment
  - Node.js compatibility layer
  - AnkiConnect API integration
  - AI text generation service
  - TTS service integration

- **Frontend Layer**:
  - PWA framework
  - Responsive web interface
  - TUI library for CLI interface
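
For the AnkiConnect integration, requests are JSON objects with `action`, `version`, and optional `params`, posted to the add-on's local HTTP server (port 8765 by default), and responses carry `result` and `error` fields. The helper names below (`buildRequest`, `unwrap`, `invoke`) are hypothetical, and the sketch assumes a runtime with a global `fetch`, such as Bun or a recent Node.js.

```typescript
interface AnkiRequest {
  action: string;
  version: number;
  params?: Record<string, unknown>;
}

interface AnkiResponse<T> {
  result: T | null;
  error: string | null;
}

const ANKI_CONNECT_URL = "http://127.0.0.1:8765"; // AnkiConnect's default address

// Builds the request envelope; params is omitted when not provided.
function buildRequest(action: string, params?: Record<string, unknown>): AnkiRequest {
  return params === undefined ? { action, version: 6 } : { action, version: 6, params };
}

// Turns an AnkiConnect error field into an exception and returns the result.
function unwrap<T>(response: AnkiResponse<T>): T {
  if (response.error !== null) {
    throw new Error(`AnkiConnect error: ${response.error}`);
  }
  return response.result as T;
}

// Sends one action to AnkiConnect. Requires Anki to be running with the add-on installed.
async function invoke<T>(action: string, params?: Record<string, unknown>): Promise<T> {
  const response = await fetch(ANKI_CONNECT_URL, {
    method: "POST",
    body: JSON.stringify(buildRequest(action, params)),
  });
  return unwrap((await response.json()) as AnkiResponse<T>);
}
```

With Anki running, `invoke<string[]>("deckNames")` would list decks, and `invoke<number[]>("findCards", { query: "is:due" })` would return the ids of due cards.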

## 4. Core Features
1. **Primary Feature Set**:
   - AnkiConnect API integration
   - AI prompt configuration system
   - Text-to-Speech pipeline
   - Audio playback controls
   - Deck selection and management

2. **Component System**:
   - Server component
   - CLI interface
   - PWA frontend
   - Audio processing pipeline
   - Configuration management

## 5. In-scope and Out-of-scope
- **In-scope Items**:
  - Anki deck reading
  - AI text generation
  - Audio conversion
  - Basic playback controls
  - Configuration management

- **Out-of-scope Items**:
  - Anki card modification
  - Direct AnkiWeb integration
  - Complex audio editing
  - Multi-user support

## 6. Non-functional Requirements
1. **Performance Requirements**:
   - Audio generation under 5 seconds
   - Smooth playback experience
   - Efficient deck processing

2. **Security Requirements**:
   - Local API key storage
   - Secure API communications
   - User data protection
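
One way to enforce a budget such as the 5-second audio generation target above is to race the work against a timer. `withTimeout` is a hypothetical helper, not part of Gakuon. Note that it only stops waiting; it does not cancel the underlying request.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} exceeded ${ms} ms`)), ms);
  });

  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A call such as `withTimeout(generateAudio(card), 5000, "Audio generation")` would apply the budget, where `generateAudio` stands for whatever function produces the audio.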

## 7. Constraints & Assumptions
- **Technical Constraints**:
  - AnkiConnect availability
  - AI service rate limits
  - TTS service limitations

- **Assumptions**:
  - User has Anki installed
  - AnkiConnect is configured
  - Internet connectivity available
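
A common way to cope with rate limits like those listed under the technical constraints is to retry with exponential backoff. The sketch below is an assumption about how that might be handled, not a description of Gakuon's behavior. `retryWithBackoff` and `RetryOptions` are hypothetical, and for simplicity it retries on any error rather than only on rate-limit responses.

```typescript
interface RetryOptions {
  retries: number; // retries after the first attempt
  baseDelayMs: number; // delay before the first retry; doubles each time
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

async function retryWithBackoff<T>(
  task: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 0; ; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      // Give up once the retry budget is spent and surface the last error.
      if (attempt >= options.retries) throw error;
      await sleep(options.baseDelayMs * 2 ** attempt);
    }
  }
}
```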

## 8. Known Issues & Potential Pitfalls
1. **Technical Challenges**:
   - AnkiConnect API stability
   - AI response quality
   - Audio format compatibility

2. **Performance Concerns**:
   - Large deck processing time
   - AI service latency
   - Audio file storage management
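
Processing time for large decks can be reduced by handling several cards at once while capping concurrency, which also helps stay within service rate limits. The helper below is a sketch; `mapWithLimit` is a hypothetical name and the limit value is a tuning assumption.

```typescript
// Maps `worker` over `items` with at most `limit` calls in flight.
// Results keep the order of the input items.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until none remain.
  async function runWorker(): Promise<void> {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => runWorker()));
  return results;
}
```

If any worker call rejects, the returned promise rejects as well, so callers that want partial results would need to catch errors inside `worker`.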
