KnowledgeRefinery/shared/prompts/annotate_chunk.txt

You are a knowledge extraction assistant. Analyze the following text chunk and produce a JSON object with these fields:

- "topics": array of topic labels (2-5 labels, e.g. ["machine learning", "neural networks", "optimization"])
- "sentiment": {"label": "positive"|"negative"|"neutral"|"mixed", "confidence": 0.0-1.0}
- "entities": array of {"name": string, "type": "person"|"org"|"location"|"concept"|"date"|"other"}
- "claims": array of {"claim": string, "confidence": 0.0-1.0}
- "summary": a 1-2 sentence summary of the chunk
- "quality_flags": array of quality issue labels, empty if none apply (e.g., "truncated", "low_quality", "technical", "multilingual", "boilerplate")

Rules:
- Use precise entity names, normalized to their canonical forms
- Claims should be atomic, verifiable statements
- Topic labels should be specific enough to be useful but general enough to cluster
- If the text is too short or too meaningless to annotate, set "quality_flags" to ["insufficient_content"]

Respond with ONLY the JSON object, no other text.
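
Because the prompt asks for a bare JSON object with a fixed set of fields, the caller can check a response before storing it. The following is a minimal sketch of such a check, not part of the prompt itself. The function name `validate_annotation` and the error message wording are hypothetical, and skipping the 2-5 topic count when "insufficient_content" is flagged is an assumption, since the prompt does not say what the other fields should contain in that case.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral", "mixed"}
ALLOWED_ENTITY_TYPES = {"person", "org", "location", "concept", "date", "other"}


def _is_confidence(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and 0.0 <= value <= 1.0


def _is_str_list(value):
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def validate_annotation(raw):
    """Return a list of problems in a model response (empty list if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []

    flags = data.get("quality_flags")
    if not _is_str_list(flags):
        problems.append("quality_flags must be an array of strings")
        flags = []

    topics = data.get("topics")
    if not _is_str_list(topics):
        problems.append("topics must be an array of strings")
    elif "insufficient_content" not in flags and not 2 <= len(topics) <= 5:
        # Assumption: the 2-5 label range is only enforced for usable chunks.
        problems.append("topics must contain 2-5 labels")

    sentiment = data.get("sentiment")
    if not isinstance(sentiment, dict):
        problems.append("sentiment must be an object")
    else:
        if sentiment.get("label") not in ALLOWED_SENTIMENTS:
            problems.append("sentiment.label is not an allowed value")
        if not _is_confidence(sentiment.get("confidence")):
            problems.append("sentiment.confidence must be in [0.0, 1.0]")

    entities = data.get("entities")
    if not isinstance(entities, list):
        problems.append("entities must be an array")
    else:
        for i, entity in enumerate(entities):
            if not isinstance(entity, dict) or not isinstance(entity.get("name"), str):
                problems.append(f"entities[{i}].name must be a string")
            elif entity.get("type") not in ALLOWED_ENTITY_TYPES:
                problems.append(f"entities[{i}].type is not an allowed value")

    claims = data.get("claims")
    if not isinstance(claims, list):
        problems.append("claims must be an array")
    else:
        for i, claim in enumerate(claims):
            if not isinstance(claim, dict) or not isinstance(claim.get("claim"), str):
                problems.append(f"claims[{i}].claim must be a string")
            elif not _is_confidence(claim.get("confidence")):
                problems.append(f"claims[{i}].confidence must be in [0.0, 1.0]")

    if not isinstance(data.get("summary"), str):
        problems.append("summary must be a string")

    return problems
```

A caller might retry the request or drop the chunk when the returned list is non-empty. Collecting every problem, rather than raising on the first one, makes it easier to log how a model tends to deviate from the requested format.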