# Running Knowledge Refinery

## Prerequisites

- macOS Tahoe (26.x)
- Python 3.12+
- Xcode 26+
- LM Studio running locally with at least one model loaded

## 1. Start LM Studio

1. Open LM Studio
2. Load an embedding model (e.g., `nomic-embed-text-v1.5` or `text-embedding-3-small`)
3. Load a chat model (e.g., `llama-3.2-3b-instruct` or similar)
4. Start the local server (default port 1234)
5. Verify: `curl http://127.0.0.1:1234/v1/models`

## 2. Start the Daemon

```bash
cd daemon
source .venv/bin/activate
python -m knowledge_refinery.main
```

The daemon will:
- Create data directory at `~/.knowledge-refinery/workspaces/<id>/`
- Initialize SQLite database
- Connect to LM Studio
- Write a PID file to `{data_dir}/daemon.pid` for process detection
- Listen on its assigned port (default `http://127.0.0.1:8742`)

> **Tip**: Use **Start All** in the app toolbar to launch all workspace daemons at once. Each workspace runs an independent daemon with its own port and data directory. After connection, ingestion auto-starts.

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `KR_DATA_DIR` | `~/.knowledge-refinery` | Data directory |
| `KR_LM_STUDIO_URL` | `http://127.0.0.1:1234/v1` | LM Studio API URL |
| `KR_PORT` | `8742` | Daemon port |

### Verify Daemon

```bash
curl http://127.0.0.1:8742/health
```

## 3. Run the macOS App

```bash
cd apps/macos/KnowledgeRefinery
swift run
```

Or open in Xcode:
```bash
open Package.swift
```

The app will:
- Auto-start daemons for all workspaces on launch
- Detect already-running daemons via PID files
- Auto-restart crashed daemons (up to 3 times)
- Show connection status in the toolbar

## 4. Ingest Documents

1. In the app, go to **Volumes** tab
2. Click **Add Folder** and select a directory
3. Go to **Ingest** tab and click **Start Ingestion**, or use **Start All** from the dashboard
4. Watch live pipeline progress in the **Pipeline Progress Panel**:
   - **Stage tracker**: Each of the 6 stages (Scan, Extract, Chunk, Embed, Annotate, Conceptualize) shows a checkmark when complete or an animated progress bar when running
   - **Animated counters**: Live tallies for chunks, vectors, annotations, concepts, and edges
   - **Interaction indicators**: Visual status of App-to-Daemon and Daemon-to-LM Studio connections
   - **Activity log**: Auto-scrolling log of the last 50 pipeline events
5. The dashboard card shows a compact spinner with the current stage name and chunk count
6. The 3D universe auto-refreshes every 5 seconds during ingestion, using incremental node injection

The app polls `/ingest/status` every 1.5 seconds during pipeline execution and automatically stops polling when the pipeline completes.

### Via API

```bash
# Add a volume
curl -X POST http://127.0.0.1:8742/volumes/add \
  -H "Content-Type: application/json" \
  -d '{"path": "/path/to/documents"}'

# Start ingestion
curl -X POST http://127.0.0.1:8742/ingest/start \
  -H "Content-Type: application/json" \
  -d '{}'

# Check status (enriched response with live progress)
curl http://127.0.0.1:8742/ingest/status
```

The enriched `/ingest/status` response includes:

```json
{
  "status": "running",
  "stage": "embed",
  "chunk_count": 142,
  "annotation_count": 87,
  "concept_count": 12,
  "edge_count": 45,
  "live": {
    "scan": {"status": "done", "progress_pct": 100},
    "extract": {"status": "done", "progress_pct": 100},
    "chunk": {"status": "done", "progress_pct": 100},
    "embed": {"status": "running", "progress_pct": 64},
    "annotate": {"status": "pending", "progress_pct": 0},
    "conceptualize": {"status": "pending", "progress_pct": 0}
  },
  "activity_log": [
    {"timestamp": "2026-02-12T10:30:01Z", "message": "Scanning 3 volumes..."},
    {"timestamp": "2026-02-12T10:30:03Z", "message": "Found 47 files, 12 new"},
    "..."
  ]
}
```

## 5. Search

Use the **Search** tab in the app, or:

```bash
curl -X POST http://127.0.0.1:8742/search \
  -H "Content-Type: application/json" \
  -d '{"query": "machine learning", "limit": 10}'
```

## Troubleshooting

- **Daemon won't start**: Check that port 8742 is free
- **LM Studio unavailable**: Ensure LM Studio server is running on port 1234
- **No embeddings**: Verify an embedding model is loaded in LM Studio
- **App can't connect**: Check daemon is running on the expected port