Mirror of https://github.com/saymrwulf/KnowledgeRefinery.git (synced 2026-05-18 21:20:08 +00:00)
macOS app for corpus ingestion, semantic search, and concept universe visualization powered by local LLMs via LM Studio.

Architecture:
- Go daemon (17 MB single binary, zero dependencies)
  - chi router, pure-Go SQLite, tiktoken tokenizer
  - 6-stage pipeline: scan → extract → chunk → embed → annotate → conceptualize
  - Brute-force cosine vector search in memory
  - 89 tests across 8 packages
- SwiftUI app (macOS 15+)
  - Multi-workspace management with auto-start daemons
  - Live pipeline progress, search, concept browser
  - WebGPU 3D universe renderer with Canvas2D fallback
  - Custom crystal app icon
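The description mentions brute-force cosine vector search in memory. As a rough illustration of that technique only, the sketch below scores every stored vector against a query and keeps the best k. The names `Hit`, `cosine`, and `TopK` are hypothetical and are not taken from the repository; the daemon's actual data layout and API may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Hit pairs the index of a stored vector with its similarity to the query.
// (Hypothetical type, introduced for this sketch.)
type Hit struct {
	Index int
	Score float64
}

// cosine returns the cosine similarity of a and b.
// It returns 0 when the lengths differ or either vector has zero norm.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK compares the query against every vector (no index structure)
// and returns up to k hits, highest similarity first.
func TopK(query []float32, vectors [][]float32, k int) []Hit {
	hits := make([]Hit, 0, len(vectors))
	for i, v := range vectors {
		hits = append(hits, Hit{Index: i, Score: cosine(query, v)})
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	vectors := [][]float32{{1, 0}, {0, 1}, {1, 1}}
	for _, h := range TopK([]float32{1, 0}, vectors, 2) {
		fmt.Printf("%d %.3f\n", h.Index, h.Score)
	}
}
```

A full scan costs time linear in the number of stored chunks per query, which is usually acceptable for a single-user local corpus and avoids maintaining an approximate index.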
You are a knowledge extraction assistant. Analyze the following text chunk and produce a JSON object with these fields:

- "topics": array of topic labels (2-5 labels, e.g. ["machine learning", "neural networks", "optimization"])
- "sentiment": {"label": "positive"|"negative"|"neutral"|"mixed", "confidence": 0.0-1.0}
- "entities": array of {"name": string, "type": "person"|"org"|"location"|"concept"|"date"|"other"}
- "claims": array of {"claim": string, "confidence": 0.0-1.0}
- "summary": a 1-2 sentence summary of the chunk
- "quality_flags": array of any quality issues (e.g., "truncated", "low_quality", "technical", "multilingual", "boilerplate")

Rules:
- Be precise with entity names - normalize to canonical forms
- Claims should be atomic, verifiable statements
- Topic labels should be specific enough to be useful but general enough to cluster
- If the text is too short or meaningless, set quality_flags to ["insufficient_content"]

Respond with ONLY the JSON object, no other text.
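
Because the prompt asks for a bare JSON object with a fixed set of fields, the caller can decode the model's reply strictly and reject anything that strays from the schema. The following is a minimal sketch of such a check, assuming the field names listed in the prompt; the types and the `ParseAnnotation` function are hypothetical and do not reflect how the repository actually handles responses.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// The structs mirror the fields requested by the prompt.
// All type names here are assumptions made for this sketch.
type Sentiment struct {
	Label      string  `json:"label"`
	Confidence float64 `json:"confidence"`
}

type Entity struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

type Claim struct {
	Claim      string  `json:"claim"`
	Confidence float64 `json:"confidence"`
}

type Annotation struct {
	Topics       []string  `json:"topics"`
	Sentiment    Sentiment `json:"sentiment"`
	Entities     []Entity  `json:"entities"`
	Claims       []Claim   `json:"claims"`
	Summary      string    `json:"summary"`
	QualityFlags []string  `json:"quality_flags"`
}

var sentimentLabels = map[string]bool{
	"positive": true, "negative": true, "neutral": true, "mixed": true,
}

var entityTypes = map[string]bool{
	"person": true, "org": true, "location": true,
	"concept": true, "date": true, "other": true,
}

// inUnit reports whether x lies in the closed interval [0, 1].
func inUnit(x float64) bool { return x >= 0 && x <= 1 }

// ParseAnnotation decodes a model reply and checks the enumerated values
// and confidence ranges. json.Unmarshal already fails when extra text
// surrounds the object, which matches the "ONLY the JSON object" rule.
func ParseAnnotation(raw []byte) (Annotation, error) {
	var a Annotation
	if err := json.Unmarshal(raw, &a); err != nil {
		return a, fmt.Errorf("decode annotation: %w", err)
	}
	if !sentimentLabels[a.Sentiment.Label] || !inUnit(a.Sentiment.Confidence) {
		return a, errors.New("invalid sentiment")
	}
	for _, e := range a.Entities {
		if !entityTypes[e.Type] {
			return a, fmt.Errorf("invalid entity type %q", e.Type)
		}
	}
	for _, c := range a.Claims {
		if !inUnit(c.Confidence) {
			return a, fmt.Errorf("claim confidence out of range: %v", c.Confidence)
		}
	}
	return a, nil
}

func main() {
	raw := []byte(`{"topics":["vector search","embeddings"],` +
		`"sentiment":{"label":"neutral","confidence":0.8},` +
		`"entities":[{"name":"SQLite","type":"concept"}],` +
		`"claims":[{"claim":"Search is brute-force.","confidence":0.9}],` +
		`"summary":"Describes in-memory search.","quality_flags":["technical"]}`)
	a, err := ParseAnnotation(raw)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println(len(a.Topics), a.Sentiment.Label)
}
```

The sketch deliberately does not enforce the 2-5 topic count, since a chunk flagged "insufficient_content" may reasonably carry fewer labels.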