Archive Integration

Bobbin can index structured markdown records alongside code — agent memories, communication logs, HLA records, or any collection of markdown files with YAML frontmatter. Archive records are searchable via the same search and context APIs as code.

Configuration

Archive sources are configured in .bobbin/config.toml:

[archive]
enabled = true
webhook_secret = ""  # Optional: Forgejo webhook auth token

[[archive.sources]]
name = "pensieve"
path = "/var/lib/bobbin/archives/pensieve"
schema = "agent-memory"
name_field = "agent"

[[archive.sources]]
name = "hla"
path = "/var/lib/bobbin/archives/hla"
schema = "human-intent"
name_field = "channel"

Source fields

| Field | Type | Description |
|---|---|---|
| name | string | Source label; used as the language tag in chunks and as a search filter |
| path | string | Filesystem path to the directory of markdown records |
| schema | string | YAML frontmatter value to match (e.g., "agent-memory"); files without this in frontmatter are skipped |
| name_field | string | Optional frontmatter field used to prefix chunk names (e.g., "channel" → "telegram/{record_id}") |

Record Format

Archive records are markdown files with YAML frontmatter:

---
schema: agent-memory
id: mem-2026-0322-abc
timestamp: 2026-03-22T12:00:00Z
agent: stryder
tags: [bobbin, search-quality]
---

## Context

Discovered that tag effects only apply via /context endpoint, not /search.
The CLI returns raw LanceDB scores without boosts.

The frontmatter must contain the schema value matching your source config. Other fields (id, timestamp, etc.) are extracted as metadata.
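
The schema check described above can be approximated as follows. This is a rough sketch under stated assumptions: `split_frontmatter` and `matches_source` are hypothetical helper names, and the hand-rolled parser only understands flat `key: value` lines, so a real YAML parser would be needed for lists and nested blocks.

```python
def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split a record into (frontmatter fields, markdown body).

    Only flat `key: value` lines are handled; values are kept as raw strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text  # unterminated frontmatter: treat as plain markdown
    fields = {}
    for line in lines[1:end]:
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return fields, body


def matches_source(text: str, schema: str) -> bool:
    """True if the record's frontmatter declares the given schema."""
    fields, _ = split_frontmatter(text)
    return fields.get("schema") == schema


RECORD = """---
schema: agent-memory
id: mem-2026-0322-abc
timestamp: 2026-03-22T12:00:00Z
agent: stryder
---

## Context

Discovered that tag effects only apply via /context endpoint, not /search.
"""

fields, body = split_frontmatter(RECORD)
print(fields["id"], matches_source(RECORD, "agent-memory"))
```

A file with no frontmatter, or with a different schema value, fails the check and would be skipped for that source.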

Field handling

  • id: Record identifier (used in chunk ID generation)
  • timestamp: Parsed for date-based file path grouping (YYYY/MM/DD/)
  • source block: Nested keys are flattened into top-level fields (e.g., channel: telegram nested under source: becomes the field channel)
  • Chunk IDs: Generated via SHA256(source:id:timestamp) for deduplication
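
The field handling rules above can be illustrated with a short sketch. Several details here are assumptions rather than documented behavior: that `source` in the hash refers to the configured source name, that the three fields are joined with literal colons and the digest is hex-encoded, and that a nested key overrides a top-level key of the same name. The helper names are hypothetical.

```python
import hashlib
from datetime import datetime


def flatten_source_block(meta: dict) -> dict:
    """Lift keys nested under `source` to the top level.

    Assumption: on a name clash the nested value wins.
    """
    flat = {key: value for key, value in meta.items() if key != "source"}
    nested = meta.get("source")
    if isinstance(nested, dict):
        flat.update(nested)
    return flat


def date_path(timestamp: str) -> str:
    """Turn a UTC timestamp like 2026-03-22T12:00:00Z into YYYY/MM/DD/."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y/%m/%d/")


def chunk_id(source: str, record_id: str, timestamp: str) -> str:
    """Hash the three identifying fields into a stable ID.

    Assumption: fields are joined with ':' and the digest is hex-encoded.
    """
    key = f"{source}:{record_id}:{timestamp}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


meta = {
    "id": "mem-2026-0322-abc",
    "timestamp": "2026-03-22T12:00:00Z",
    "source": {"channel": "telegram"},
}
flat = flatten_source_block(meta)
print(flat["channel"])
print(date_path(flat["timestamp"]))
print(chunk_id("pensieve", flat["id"], flat["timestamp"]))
```

Because the ID depends only on the three inputs, hashing the same record twice yields the same value, which is what makes deduplication possible.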

Searching Archives

Archive records appear in regular search results. To restrict results to one archive, pass its source name to --repo:

bobbin search "agent memory about search quality" --repo pensieve

HTTP API

| Endpoint | Description |
|---|---|
| GET /archive/search?q=<query>&source=<name>&limit=10 | Search archive records |
| GET /archive/entry/{id} | Fetch a single record by ID |
| GET /archive/recent?days=30&source=<name> | Recent records with optional date range |

Web UI

Toggle “Include archive” in the Search tab to merge archive results into code search.

Webhook Integration

To re-index automatically when archive sources are updated via git push, set a webhook secret:

[archive]
webhook_secret = "your-secret-token"

Configure a Forgejo/Gitea push webhook pointing to POST /webhook/push. When a push event matches a configured repo, bobbin triggers an incremental re-index of that source.

Use Cases

  • Agent memories (pensieve): Index agent context snapshots for cross-agent search
  • Communication logs (HLA): Index human-agent interaction records
  • Knowledge bases: Index structured documentation collections
  • Incident records: Index postmortem and investigation reports

Indexing

Archive sources are indexed alongside code during bobbin index. The --force flag re-indexes all records:

bobbin index /var/lib/bobbin --force

Records are chunked like markdown files — headings create chunk boundaries, with frontmatter metadata preserved as chunk attributes.
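
To make the chunking rule concrete, here is a minimal sketch of heading-based splitting. It is not Bobbin's chunker: the function name, the chunk dictionary layout, and the handling of text before the first heading are assumptions, and the sketch does not special-case `#` lines inside fenced code blocks.

```python
import re

# ATX headings: one to six '#' characters followed by whitespace and a title.
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def chunk_by_headings(body: str, metadata: dict) -> list[dict]:
    """Split a markdown body at headings, copying metadata onto each chunk."""
    chunks = []
    heading = None  # text before the first heading has no title
    buffer: list[str] = []

    def emit() -> None:
        content = "\n".join(buffer).strip()
        if content:  # skip empty sections
            chunks.append(
                {"heading": heading, "content": content, "metadata": dict(metadata)}
            )

    for line in body.splitlines():
        match = HEADING.match(line)
        if match:
            emit()  # close the previous chunk at the heading boundary
            heading = match.group(1).strip()
            buffer = [line]
        else:
            buffer.append(line)
    emit()  # close the final chunk
    return chunks


body = "Intro text\n\n## Context\n\nFirst finding.\n\n## Follow-up\n\nNext step."
for chunk in chunk_by_headings(body, {"agent": "stryder"}):
    print(chunk["heading"], "|", chunk["metadata"]["agent"])
```

Each chunk carries its own copy of the frontmatter fields, so a search hit on any section can still be traced back to the record's id, timestamp, and other metadata.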