Standards and Conventions
This document defines the technical standards that tools must follow to interoperate within this automation stack.
Taxonomy
The knowledge base uses a stable set of top-level categories. Do not create new top-level sections unless strictly necessary.
| Category | Location | What belongs here |
|---|---|---|
| AI & Knowledge | `tools/ai_knowledge/` | General AI tools, knowledge management, LLM products |
| Frameworks | `tools/frameworks/` | Libraries for building LLM apps (LangChain, LlamaIndex, etc.) |
| Providers | `tools/providers/` | Companies offering LLM APIs or managed AI services |
| Agents | `tools/agents/` | Agent frameworks and autonomous AI tools |
| Orchestration | `tools/orchestration/` | Workflow automation, multi-agent routing, pipeline tools |
| Infrastructure | `tools/infrastructure/` | Inference engines, vector DBs, serving stacks, quantisation |
| Benchmarking | `tools/benchmarking/` | Eval frameworks, benchmarks, leaderboards |
| Development & Ops | `tools/development_ops/` | AI-assisted coding tools and IDEs |
| Patterns | `knowledge_base/patterns/` | Recurring design patterns (RAG, tool calling, routing, etc.) |
| Playbooks | `playbooks/` | Step-by-step workflow guides |
Deduplication Rules
- One canonical page per tool/framework/provider. All other mentions must link to that canonical page.
- Before creating a new page, search the repo for the tool name, URL, and common aliases.
- If a source maps to an existing page, update that page rather than creating a new one.
- Merge duplicates rather than creating parallel pages.
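The pre-creation search described above can be sketched as a small helper. This is a minimal illustration, assuming pages are Markdown files under a single root directory; the function name `find_existing_pages` is hypothetical and not part of the repo.

```python
import re
from pathlib import Path


def find_existing_pages(root: Path, names: list[str]) -> list[Path]:
    """Return Markdown files under root that mention any given name, URL, or alias."""
    if not names:
        return []
    # Treat every search term as a literal string and ignore case.
    pattern = re.compile("|".join(re.escape(n) for n in names), re.IGNORECASE)
    hits = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if pattern.search(text):
            hits.append(path)
    return hits
```

Any hit is a candidate canonical page to update instead of creating a new one; an empty result suggests a new page is safe to add.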
Source Classification Tags
Items in new-sources.md use these tags: tool · framework · provider · paper/article · tutorial/guide · benchmark/eval · infrastructure · analysis
Naming Conventions
- Tags (Paperless): kebab-case. Lowercase only. Prefix status tags with `s:` and category tags with `c:`.
- Workflows (n8n): `[Trigger Source] -> [Primary Action]`. Example: `IMAP -> Paperless Intake`.
- Prompts: Versioned using SemVer. Store as Markdown files in `reference-implementations/llm-prompts/`.
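The tag and workflow naming rules above lend themselves to a quick automated check. The sketch below is illustrative only: the regular expressions and function names are assumptions, and allowing digits inside tags is a guess the conventions do not state.

```python
import re

# Lowercase kebab-case, with an optional "s:" (status) or "c:" (category) prefix.
# Assumption: digits are allowed alongside lowercase letters.
TAG_RE = re.compile(r"(?:[sc]:)?[a-z0-9]+(?:-[a-z0-9]+)*")

# "[Trigger Source] -> [Primary Action]": two non-empty parts joined by " -> ".
WORKFLOW_RE = re.compile(r"\S.* -> \S.*")


def is_valid_tag(tag: str) -> bool:
    """Check a Paperless tag name against the kebab-case convention."""
    return TAG_RE.fullmatch(tag) is not None


def is_valid_workflow_name(name: str) -> bool:
    """Check an n8n workflow name against the trigger -> action convention."""
    return WORKFLOW_RE.fullmatch(name) is not None
```

For example, `is_valid_tag("s:needs-review")` passes while `is_valid_tag("Needs_Review")` does not.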
Document Lifecycle States
- Ingested: Raw file received by the system.
- OCRed: Searchable layer added (via OCRmyPDF).
- Classified: Assigned a document type and category tags.
- Actioned: Any extracted tasks/events have been synced to external systems.
- Archived: Document is moved to long-term storage or its final tag state.
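One way to make the lifecycle explicit in code is an ordered enumeration. This sketch assumes the states are listed in processing order and that a document only moves forward (possibly skipping a state, for example when nothing needs actioning); neither rule is stated above, and the names `DocState` and `can_transition` are hypothetical.

```python
from enum import IntEnum


class DocState(IntEnum):
    """Lifecycle states, numbered in the order they are listed."""
    INGESTED = 1
    OCRED = 2
    CLASSIFIED = 3
    ACTIONED = 4
    ARCHIVED = 5


def can_transition(current: DocState, target: DocState) -> bool:
    """Assumed rule: a document may only move to a later state."""
    return target > current
```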
Minimal Metadata Schema
For every document it processes, the AI should attempt to populate:
- extraction_date: ISO 8601 timestamp of when the AI extraction ran.
- source_origin: One of Email, Scan, or Webhook.
- action_required: Boolean.
- due_date: ISO 8601 date, if applicable.
- confidence_score: Float from 0.0 to 1.0, as reported by the LLM.
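The schema above could be represented as a small validated record. This is a sketch under stated assumptions: the class name `DocumentMetadata` is hypothetical, the three listed origins are treated as a closed set, and `due_date` is modelled as a plain calendar date.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

# Assumption: the origins listed in the schema form a closed set.
ALLOWED_ORIGINS = {"Email", "Scan", "Webhook"}


@dataclass
class DocumentMetadata:
    extraction_date: datetime
    source_origin: str
    action_required: bool
    confidence_score: float
    due_date: Optional[date] = None

    def __post_init__(self) -> None:
        # Timestamps must carry an offset so they serialise unambiguously.
        if self.extraction_date.tzinfo is None:
            raise ValueError("extraction_date must be timezone-aware")
        if self.source_origin not in ALLOWED_ORIGINS:
            raise ValueError(f"unknown source_origin: {self.source_origin!r}")
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence_score must be between 0.0 and 1.0")

    def to_dict(self) -> dict:
        """Return a JSON-ready dict with ISO 8601 date strings."""
        return {
            "extraction_date": self.extraction_date.isoformat(),
            "source_origin": self.source_origin,
            "action_required": self.action_required,
            "due_date": self.due_date.isoformat() if self.due_date else None,
            "confidence_score": self.confidence_score,
        }
```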
Interoperability
- Data Format: Prefer JSON for all cross-tool communication.
- Dates: Always use ISO 8601 with UTC offsets.
- IDs: Store the internal ID of the source system (e.g. Paperless `document_id`) in the metadata of the destination system (e.g. GCal event description).
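The three interoperability rules can be combined in one small payload builder. This is a sketch only: the field names `summary`, `start`, and `description` and the `paperless_document_id=` marker are illustrative assumptions, not the schema of any particular calendar API.

```python
import json
from datetime import datetime, timezone


def build_event_payload(document_id: int, title: str, due: datetime) -> str:
    """Serialise a calendar-style event as JSON, carrying the source document ID."""
    if due.tzinfo is None:
        raise ValueError("due must be timezone-aware")
    payload = {
        "summary": title,
        # Normalise to UTC so the offset is always explicit (+00:00).
        "start": due.astimezone(timezone.utc).isoformat(),
        # Hypothetical marker that lets the event be traced back to Paperless.
        "description": f"paperless_document_id={document_id}",
    }
    return json.dumps(payload, sort_keys=True)
```

Keeping the source ID in the destination record means a later sync can find the originating document without a separate lookup table.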
What "Done" Means
An automated flow is considered "done" when:
1. The primary action is completed (Event created/Task synced).
2. The source document is updated with a processed or actioned tag.
3. No errors were logged in the orchestration engine (n8n).
4. If a critical failure occurred, a notification was sent to a human review channel.
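The checklist above can be expressed as a single predicate. This sketch reads the four criteria literally as a conjunction, with the fourth applying only when a critical failure occurred; that reading, and the `FlowResult` field names, are assumptions rather than something n8n provides.

```python
from dataclasses import dataclass


@dataclass
class FlowResult:
    primary_action_completed: bool  # criterion 1: event created or task synced
    source_tagged: bool             # criterion 2: processed/actioned tag applied
    error_count: int                # criterion 3: errors logged by the orchestrator
    critical_failure: bool          # criterion 4: did a critical failure occur?
    human_notified: bool            # criterion 4: was the review channel notified?


def is_done(result: FlowResult) -> bool:
    """Return True when all four "done" criteria hold (literal reading)."""
    clean_run = (
        result.primary_action_completed
        and result.source_tagged
        and result.error_count == 0
    )
    # The notification requirement only applies after a critical failure.
    escalated_if_needed = (not result.critical_failure) or result.human_notified
    return clean_run and escalated_if_needed
```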
AI-Authored Documentation Metadata (Required)
For AI-authored updates to knowledge pages (docs/tools/, docs/services/, docs/knowledge_base/, docs/architecture/, docs/playbooks/, and docs/reference-implementations/), include:
- Last reviewed: ISO date (YYYY-MM-DD)
- Confidence: high, medium, or low
- Sources / References: at least one URL
Recommended section format:
## Sources / References
- [Official docs](https://example.com)
## Contribution Metadata
- Last reviewed: 2026-02-25
- Confidence: medium
These requirements are enforced by scripts/check_docs_contract.py in pull-request CI.
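To illustrate what such a check involves, here is a minimal sketch of a validator for the three required fields. It is not the contents of scripts/check_docs_contract.py, which is not shown here; the function name `contract_violations` and the exact matching rules are assumptions based on the recommended format above.

```python
import re
from datetime import date

LAST_REVIEWED_RE = re.compile(r"^- Last reviewed: (\S+)[ \t]*$", re.MULTILINE)
CONFIDENCE_RE = re.compile(r"^- Confidence: (high|medium|low)[ \t]*$", re.MULTILINE)
# Capture the body of the "Sources / References" section up to the next H2 or end of file.
SOURCES_RE = re.compile(
    r"^## Sources / References[ \t]*$(.*?)(?=^## |\Z)", re.MULTILINE | re.DOTALL
)
URL_RE = re.compile(r"https?://\S+")


def contract_violations(markdown: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the page passes."""
    problems = []

    reviewed = LAST_REVIEWED_RE.search(markdown)
    if reviewed is None:
        problems.append("missing 'Last reviewed' line")
    else:
        try:
            date.fromisoformat(reviewed.group(1))
        except ValueError:
            problems.append("'Last reviewed' is not a valid YYYY-MM-DD date")

    if CONFIDENCE_RE.search(markdown) is None:
        problems.append("missing or invalid 'Confidence' (expected high, medium, or low)")

    sources = SOURCES_RE.search(markdown)
    if sources is None or URL_RE.search(sources.group(1)) is None:
        problems.append("'Sources / References' needs at least one URL")

    return problems
```

A CI job could run this over each changed page and fail the pull request when any list is non-empty.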