ai-agent.md
AI Agent ("The Organizer")
The killer feature of Mino β a server-hosted AI knowledge steward with plugins, sandbox, and integrations.
What the Agent Does
The agent is a knowledge steward, not a general-purpose chatbot. It runs server-side (never in the browser) for full access to files, plugins, and local AI tools. Its job:
- Organize β Auto-tag, categorize, and file notes into the right folders
- Connect β Find and create links between related notes (like Obsidian's graph view, but automatic)
- Summarize β Create summaries, indexes, and MOC (map of content) notes
- Clean β Find duplicate notes, merge related ones, archive stale ones
- Answer β When asked a question, search through all notes and synthesize an answer
- Import β Process imports (web pages, emails, voice memos) into properly formatted notes
- Execute β Run plugins (web search, transcription, OCR) via the sandbox
Why Server-Hosted
| Reason | Details |
|---|---|
| LLM API keys stay server-side | No keys leak to the browser |
| File system access | Agent reads/writes .md files directly β no network overhead |
| Sandbox / code execution | Agent can run scripts, install tools, execute plugins in isolation |
| Multi-channel | Agent is accessible from web, mobile, CLI, MCP β all through the same server API |
| Background processing | Auto-organization, auto-tagging, embedding generation run asynchronously |
| Local AI tools | If server has resources, agent can use Whisper, OCR, local embeddings for free |
| Token efficiency | Agent queries local SQLite directly (zero network), reads files locally |
Agent Architecture
ββ Agent Runtime (server-side) βββββββββββββββββββββββββββββ
β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β LLM (Claude / GPT / etc.) β β
β β System prompt + context window β β
β ββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββ β
β β β
β ββββββββββββββββββββββββ΄βββββββββββββββββββββββββββββ β
β β TOOL LAYER β β
β β β β
β β mino.search β full-text + semantic search β β
β β mino.read β read a specific note β β
β β mino.write β create/replace a note β β
β β mino.edit β search-and-replace in a note β β
β β mino.move β move/rename a note β β
β β mino.delete β delete a note (with safeguard) β β
β β mino.tree β get the folder structure β β
β β mino.tags β list/manage tags β β
β β mino.recent β get recently modified notes β β
β β mino.links β find backlinks / forward links β β
β β β β
β β --- PLUGIN TOOLS (dynamically loaded) --- β β
β β web.search β search the web (plugin) β β
β β youtube.transcriptβ get video transcript (plugin) β β
β β whisper.transcribeβ speech-to-text (plugin) β β
β β ocr.extract β extract text from image β β
β β email.fetch β fetch emails (plugin) β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β CONTEXT STRATEGY β β
β β β β
β β 1. Always inject: folder tree (compacted) β β
β β 2. Always inject: recent notes list (last 10) β β
β β 3. On search: return snippets, not full files β β
β β 4. On read: return full file content β β
β β 5. On edit: use search-and-replace (not full file) β β
β β 6. Limit context to ~8K tokens per tool call β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Token Efficiency Strategy
| Strategy | Description | Token Savings |
|---|---|---|
| Compact file tree | Send folder names + file count, not every filename. Expand on demand. | ~80% vs full tree |
| Snippet search | Return 200-char snippets around matches, not full files | ~90% vs full files |
| Search-and-replace edits | Never send the full file to modify. Use mino.edit(path, oldText, newText) | ~95% vs full file |
| Embeddings pre-filter | Use vector similarity to find the right 5 notes out of 10,000 before reading them | ~99% vs reading all |
| Session memory | Maintain a compact summary of what the agent already knows about the vault | Avoids re-reading |
| Tool result caching | Cache file tree and recent search results within a session | ~50% on repeat queries |
How the Agent Should Search
A multi-phase search strategy:
- Phase 1: Title + Tag match β Fast. Search note titles and tags. Costs ~0 tokens (local SQLite query).
- Phase 2: Full-text search β Search note content via FTS5. Return snippets. Costs ~500 tokens.
- Phase 3: Semantic search β If Phase 1+2 don't find it, use embedding similarity. Costs ~200 tokens for the query embedding.
- Phase 4: Read β Only read the full file when the agent is confident it's the right one.
How the Agent Should Modify
Agent decision tree for modifications:
1. Small edit (fix typo, add tag, update a line)?
β Use mino.edit(path, oldText, newText) β PREFERRED
2. Add content to existing note?
β Use mino.edit(path, "<!-- END -->", "new content\n<!-- END -->")
3. Create new note?
β Use mino.write(path, content)
4. Restructure a note completely?
β Use mino.write(path, entireNewContent) β EXPENSIVE, avoid
5. Delete a note?
β Use mino.delete(path) β but ALWAYS confirm with user first
β Unless the agent has explicit "auto-delete" permission
File Tree Context Size
The agent always gets a compacted file tree:
π Projects/ (12 files)
π Alpha/ (5 files)
π Beta/ (3 files)
π Archive/ (4 files)
π Daily Notes/ (89 files)
π References/ (23 files)
π Personal/ (7 files)
Instead of listing all 131 individual files. The agent can "drill down" by calling mino.tree("Projects/Alpha") to see the files in a specific folder.
Agent Permissions Model
| Permission | Description | Default |
|---|---|---|
read | Read any note | β Enabled |
search | Search across notes | β Enabled |
write | Create new notes | β Enabled |
edit | Modify existing notes | β Enabled |
move | Move/rename notes | β Enabled |
delete | Delete notes | β Disabled (requires explicit enable) |
organize | Auto-organize (batch operations) | β Disabled |
plugins | Use installed plugins | β Enabled |
sandbox | Execute code in sandbox | β Disabled |
scope | Restrict to specific folders | All folders |
Plugin System (One-Click Install)
Install Flow (from the Web UI)
Inspired by the OpenClaw plugin architecture:
User opens Settings β Plugins β Marketplace
β Browses available plugins (fetched from registry)
β Clicks "Install" on "whisper-local"
β POST /api/v1/plugins/install { name: "whisper-local" }
β Server downloads from npm/@mino-ink scope or GitHub
β Extracts to /data/plugins/whisper-local/
β Loads plugin, registers tools with Agent Runtime
β WebSocket push: "whisper-local installed β"
β Agent immediately has whisper.transcribe tool
β Works from web, mobile, CLI, MCP β all channels
Plugin Architecture
Plugins extend the agent's capabilities. Each plugin is an npm package that exports tools:
// @mino-ink/plugin-web-search/index.ts
import { definePlugin } from '@mino-ink/plugin-sdk';
export default definePlugin({
name: 'web-search',
description: 'Search the web and save results as notes',
tools: [
{
name: 'web_search',
description: 'Search the web for information',
parameters: {
query: { type: 'string', description: 'Search query' },
maxResults: { type: 'number', default: 5 },
},
execute: async ({ query, maxResults }) => {
// ... implementation
},
},
],
});
Plugin Categories
| Category | Examples | Requires |
|---|---|---|
| Core | Web search, YouTube transcript | API key or free API |
| Local AI | Whisper transcription, OCR, local embeddings | Server resources (auto-detected) |
| Import | Email, RSS, social media, Obsidian vault | API key / OAuth |
| Export | Git sync, backup, static site generation | Config |
| Integrations | Calendar, task manager, bookmarks | API key / OAuth |
Planned Plugins
| Plugin | Description | Priority |
|---|---|---|
| Web Search | Search the web via Perplexity/Google/DuckDuckGo | P0 |
| YouTube Transcript | Import video transcripts as notes | P1 |
| Whisper (local) | Local speech-to-text (if server resources allow) | P1 |
| Whisper (API) | Speech-to-text via OpenAI Whisper API | P1 |
| Image OCR | Extract text from images (local Tesseract or API) | P2 |
| Email Import | Import emails (Gmail API) as notes | P1 |
| RSS/News | Follow feeds and clip articles | P2 |
| Social Media | Save tweets/posts/threads | P2 |
| Calendar | Import calendar events as daily notes | P2 |
| Git Integration | Auto-commit note changes to a git repo | P1 |
| Obsidian Import | Import existing Obsidian vaults | P0 |
Agent Setup Flow (via UI)
Everything is configured through the visual interface β no terminal needed:
Settings β Agent
ββ Provider: [Anthropic βΌ] [OpenAI] [Google] [Local]
ββ Model: [claude-sonnet-4-20250514 βΌ]
ββ API Key: [β’β’β’β’β’β’β’β’β’β’β’β’] [Show] [Test Connection]
ββ Permissions: [β read] [β write] [β edit] [β delete] [β organize]
ββ Status: β Connected β 3 tools + 2 plugins loaded
Settings β Plugins
ββ Installed: web-search β, whisper-local β
ββ [Browse Marketplace]
ββ Plugin config: whisper-local β Language: [en βΌ], Model: [base βΌ]
MCP Integration
Every Mino server can expose its API as an MCP (Model Context Protocol) server, so AI agents like Antigravity, Cursor, and Claude Code can directly read/write notes:
# Start the MCP server
npx @mino-ink/mcp-server \
--endpoint https://your-server.com \
--api-key YOUR_KEY
The MCP server exposes these tools to any connected agent:
mino_searchβ Search notesmino_readβ Read a notemino_writeβ Create/update a notemino_editβ Edit part of a notemino_treeβ Get folder structuremino_moveβ Move/rename a notemino_deleteβ Delete a note