Mirror of https://github.com/saymrwulf/KnowledgeRefinery.git (synced 2026-05-14 20:47:51 +00:00)
Local-first macOS app for corpus ingestion, semantic search & 3D concept visualization powered by local LLMs
Knowledge Refinery
A local-first macOS Tahoe application that ingests heterogeneous document corpora, extracts structured knowledge via local LLMs (LM Studio), and provides semantic search with 3D concept visualization.
Installation
Prerequisites
- macOS Tahoe (26.x) on Apple Silicon
- Xcode or Xcode Command Line Tools (for Swift 6.2+)
- Python 3.12+ (system Python or from python.org)
- LM Studio from lmstudio.ai
One-Line Install
```shell
git clone <repo-url> && cd LongLocalTimeHorizonInfoRetrieval && bash scripts/install.sh
```
This will:
- Check all prerequisites
- Create a Python virtual environment and install dependencies
- Build the SwiftUI application
- Create a proper `.app` bundle
- Install to `/Applications`
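The first step above, prerequisite checking, can be sketched in a few lines of Python. This is an illustration only; the authoritative checks live in `scripts/install.sh` and may differ in detail.

```python
import shutil

# Tools the install steps above depend on (this list is an assumption;
# see scripts/install.sh for the authoritative checks).
REQUIRED_TOOLS = ("git", "python3", "swift", "make")

def check_prereqs(tools=REQUIRED_TOOLS) -> dict[str, bool]:
    """Map each required tool to whether it is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}

if __name__ == "__main__":
    for tool, found in check_prereqs().items():
        print(f"{'ok' if found else 'MISSING'}: {tool}")
```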
Manual Build
```shell
# Set up daemon
cd daemon
python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"

# Build app bundle
cd ..
make build

# Or just run in development mode
make app-run
```
LM Studio Setup
Before launching Knowledge Refinery:
- Open LM Studio
- Load models:
  - Chat: `gemma-3-4b` (or any chat model)
  - Embeddings: `nomic-embed-text-v1.5` (768-dim)
- Start the local server on port 1234
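LM Studio's local server exposes an OpenAI-compatible API, so you can verify step 3 before launching the app by listing the loaded models. A minimal probe, assuming the default port 1234 from the setup above:

```python
import json
import urllib.error
import urllib.request

def lmstudio_models(base_url: str = "http://localhost:1234/v1"):
    """Return the model IDs LM Studio is serving, or None if the local
    server is unreachable (i.e. it was not started on port 1234)."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=2) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError):
        return None
    return [model["id"] for model in data.get("data", [])]

if __name__ == "__main__":
    models = lmstudio_models()
    print("LM Studio offline" if models is None else f"Serving: {models}")
```

If both the chat and embedding models are loaded, both IDs should appear in the returned list.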
Quick Start
- Launch Knowledge Refinery from Applications or Spotlight
- The dashboard shows LM Studio status (green = connected)
- Click New Workspace — name it, add data lake folders
- Click Start All to launch all workspace daemons and auto-start ingestion
- Watch live pipeline progress: stage tracker, animated counters, activity log
- Search, explore the concept universe, browse clusters
Architecture
- SwiftUI Master Control App — Multi-workspace dashboard, LM Studio monitoring, daemon lifecycle, live pipeline visibility
- Python Daemon (FastAPI) — Per-workspace instances with independent ports and data directories (`~/.knowledge-refinery/workspaces/<id>/`)
- Live Pipeline Progress — Fast 1.5s polling during ingestion; an enriched `/ingest/status` endpoint reports per-stage progress, counters, and an activity log
- LanceDB — Embedded vector store for semantic search
- SQLite — Metadata, graph store, pipeline state
- LM Studio — Local LLM inference (embeddings + chat)
- WebGPU — 3D concept universe visualization with auto-refresh during ingestion
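The live-progress loop described above can be sketched as a client of the daemon's `/ingest/status` endpoint. The endpoint path and 1.5s interval come from this README; the JSON field names (`done`, `stage`) are assumptions for illustration.

```python
import json
import time
import urllib.request

def fetch_status(base_url: str) -> dict:
    """GET a workspace daemon's /ingest/status endpoint as JSON."""
    with urllib.request.urlopen(f"{base_url}/ingest/status", timeout=5) as resp:
        return json.load(resp)

def poll_until_done(fetch, interval: float = 1.5):
    """Collect status snapshots every `interval` seconds until the
    (assumed) `done` flag is set; mirrors the app's 1.5s fast polling."""
    snapshots = []
    while True:
        status = fetch()
        snapshots.append(status)
        if status.get("done"):
            return snapshots
        time.sleep(interval)
```

In the app, each snapshot would drive the stage tracker, counters, and activity log; passing `fetch` as a callable keeps the loop testable without a running daemon.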
Project Structure
```
apps/macos/KnowledgeRefinery/   SwiftUI macOS application
daemon/                         Python backend daemon
shared/                         Prompt templates, schemas
docs/                           Architecture and operational docs
scripts/                        Build and install scripts
test_corpus/                    Sample documents for testing
dist/                           Built .app bundle (after make build)
```
Development
```shell
make help        # Show all commands
make test        # Run daemon tests + Swift build check
make app-run     # Run app via swift run (dev mode)
make daemon-run  # Run daemon directly
make clean       # Remove build artifacts
```
Milestones
- M1: Core ingestion + search + evidence
- M2: LLM structured annotation
- M3: Concept clustering + labeling
- M4: WebGPU 3D Universe visualization
- M5: Semantic zoom + lenses
- M6: Extended format support (EPUB, archives, DICOM)
- M7: Master Control App (multi-workspace, LM Studio monitoring, daemon lifecycle)
- M8: Live Pipeline Visibility (real-time progress panel, activity log, universe auto-refresh)