# Operational Notes

## Data Locations

Each workspace has its own data directory under `~/.knowledge-refinery/workspaces/<id>/`.

| Item | Path |
|------|------|
| Workspace root | `~/.knowledge-refinery/workspaces/<id>/` |
| SQLite DB (metadata + vectors + graph) | `~/.knowledge-refinery/workspaces/<id>/refinery.db` |
| PID file | `~/.knowledge-refinery/workspaces/<id>/daemon.pid` |
| Workspace config | `~/.knowledge-refinery/workspaces.json` |

## Resetting

To start fresh, remove the top-level data directory. This deletes every workspace (databases and PID files) along with `workspaces.json`:
```bash
rm -rf ~/.knowledge-refinery
```

## Monitoring

The Go daemon logs to stdout with structured messages. Key log patterns:
- `[pipeline] Stage: ...` - Pipeline stage progress
- `[embedder] Embedded batch: X chunks` - Embedding progress
- `[error]` - Errors during processing
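As a rough illustration of how these patterns could be consumed, the sketch below tallies captured log output. It assumes each pattern appears at the very start of a line with no timestamp in front, and that `X` is a plain integer; neither is confirmed here. `LogSummary` and `summarizeLog` are hypothetical names, and the sample lines in `main` are invented.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// embeddedBatch matches the embedder pattern and captures the chunk count.
// Assumption: the count is a plain decimal integer.
var embeddedBatch = regexp.MustCompile(`^\[embedder\] Embedded batch: (\d+) chunks`)

// LogSummary holds simple tallies for the three documented patterns.
type LogSummary struct {
	StageLines     int
	EmbeddedChunks int
	ErrorLines     int
}

// summarizeLog scans log text line by line and counts matching lines.
func summarizeLog(log string) LogSummary {
	var s LogSummary
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "[pipeline] Stage:"):
			s.StageLines++
		case strings.HasPrefix(line, "[error]"):
			s.ErrorLines++
		default:
			if m := embeddedBatch.FindStringSubmatch(line); m != nil {
				// The regexp guarantees digits; Atoi can only fail on overflow.
				if n, err := strconv.Atoi(m[1]); err == nil {
					s.EmbeddedChunks += n
				}
			}
		}
	}
	return s
}

func main() {
	// Invented sample output that follows the documented prefixes.
	sample := "[pipeline] Stage: embed\n" +
		"[embedder] Embedded batch: 32 chunks\n" +
		"[embedder] Embedded batch: 8 chunks\n" +
		"[error] example failure\n"
	fmt.Printf("%+v\n", summarizeLog(sample))
}
```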

### Live Pipeline Monitoring

During pipeline execution, real-time progress is available via the enriched `/ingest/status` endpoint. The daemon maintains:

- **Live progress map**: Per-stage status (pending/running/done) with progress percentages
- **Counters**: chunk_count, annotation_count, concept_count, edge_count
- **Activity log**: 200-entry ring buffer; the last 50 events are returned via the API

The SwiftUI app polls `/ingest/status` at 1.5-second intervals and renders a Pipeline Progress Panel with stage checkmarks, animated counters, and an auto-scrolling activity log. Polling stops automatically once the pipeline reaches the idle or done state. While processing is underway, the universe visualization refreshes every 5 seconds, using `mergeUniverse()` to inject new nodes incrementally.

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | /health | Health check |
| POST | /volumes/add | Add watched directory |
| GET | /volumes/list | List watched directories |
| DELETE | /volumes/remove | Remove watched directory |
| POST | /ingest/start | Start pipeline |
| GET | /ingest/status | Pipeline status |
| POST | /search | Vector search |
| GET | /evidence/{asset_id} | Get asset info |
| GET | /evidence/chunk/{chunk_id} | Get chunk details |
| GET | /evidence/assets/all | List all assets |
| GET | /universe/snapshot?lod=macro | Universe snapshot |
| POST | /universe/focus | Focus on node |
| POST | /concepts/refine | Refine concept |
| GET | /concepts/list | List concepts |
| GET | /concepts/{id} | Concept detail |

## Performance Considerations

- Large files (>500 MB) are skipped by default
- Embedding batch size defaults to 32
- SQLite runs in WAL mode, so reads can proceed while a write is in progress
- Pipeline runs in a background goroutine
- Incremental processing skips unchanged files (content hash comparison)
- Vector search: brute-force cosine similarity, all vectors loaded in memory (~150 MB for 50K vectors)
- Go daemon starts in <100 ms, uses ~30 MB base memory