Chunk Types

Bobbin parses source files into semantic chunks — structural units like functions, classes, and documentation sections. Each chunk is stored with its type, name, line range, content, and embedding vector.
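To make the stored fields concrete, here is a minimal sketch of a chunk record. The class name, field names, and the 1-based inclusive line convention are assumptions made for illustration; they are not taken from Bobbin's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:
    """Hypothetical record mirroring the fields listed above."""
    chunk_type: str   # e.g. "function", "section", "other"
    name: str         # e.g. "API Reference > Authentication"
    start_line: int   # assumed 1-based, inclusive
    end_line: int     # assumed 1-based, inclusive
    content: str
    embedding: List[float] = field(default_factory=list)  # filled in by an embedding step


chunk = Chunk(
    chunk_type="section",
    name="API Reference > Authentication",
    start_line=5,
    end_line=7,
    content="## Authentication\n\nAuth details here.",
)
```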

Code Chunk Types

These chunk types are extracted by tree-sitter from supported programming languages.

| Type | Description | Languages |
|------|-------------|-----------|
| function | Standalone function definitions | Rust (fn), TypeScript, Python (def), Go (func), Java, C++ |
| method | Functions defined inside a class or type | TypeScript, Java, C++ |
| class | Class definitions (including body) | TypeScript, Python, Java, C++ |
| struct | Struct/record type definitions | Rust, Go, C++ |
| enum | Enumeration type definitions | Rust, Java, C++ |
| interface | Interface definitions | TypeScript, Java |
| trait | Trait definitions | Rust |
| impl | Implementation blocks | Rust (impl Type) |
| module | Module declarations | Rust (mod) |

Markdown Chunk Types

These chunk types are extracted by pulldown-cmark from Markdown files.

| Type | Description | Example |
|------|-------------|---------|
| section | Content under a heading (including the heading) | ## Architecture and its body text |
| table | Markdown tables | \| Column \| Column \| |
| code_block | Fenced code blocks | ```rust ... ``` |
| doc | YAML frontmatter blocks | ---\ntitle: "..." |

section

A section chunk captures a heading and all content up to the next heading of the same or higher level. Section names include the full heading hierarchy, so nested headings produce names like "API Reference > Authentication > OAuth Flow".

Given this markdown:

# API Reference

Overview text.

## Authentication

Auth details here.

### OAuth Flow

OAuth steps.

Bobbin produces three section chunks:

  • "API Reference" — contains “Overview text.”
  • "API Reference > Authentication" — contains “Auth details here.”
  • "API Reference > Authentication > OAuth Flow" — contains “OAuth steps.”

Content before the first heading (excluding frontmatter) becomes a doc chunk named “Preamble”.
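The hierarchical naming rule can be sketched with a stack of open headings. This is an illustrative approximation and not Bobbin's implementation (which uses pulldown-cmark): the function name `section_names` is hypothetical, only ATX-style `#` headings are recognized, and `#` lines inside fenced code blocks are not filtered out.

```python
import re
from typing import List, Tuple

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def section_names(markdown: str) -> List[str]:
    """Return one hierarchical name per heading, in document order."""
    stack: List[Tuple[int, str]] = []  # (level, title) of currently open headings
    names: List[str] = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        title = match.group(2)
        # Close every heading at the same or a deeper level.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, title))
        names.append(" > ".join(t for _, t in stack))
    return names


doc = (
    "# API Reference\n\nOverview text.\n\n"
    "## Authentication\n\nAuth details here.\n\n"
    "### OAuth Flow\n\nOAuth steps.\n"
)
for name in section_names(doc):
    print(name)
```

Run against the example document, the sketch prints the same three names listed above.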

Search example:

bobbin search "OAuth authorization" --type section

table

Table chunks capture the full markdown table. They are named after their parent section heading — for example, a table under ## Configuration becomes "Configuration (table)".

Given this markdown:

## Configuration

| Key      | Default | Description          |
|----------|---------|----------------------|
| timeout  | 30      | Request timeout (s)  |
| retries  | 3       | Max retry attempts   |

Bobbin produces one table chunk named "Configuration (table)".
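As a rough sketch of the naming rule, the snippet below labels each table with the nearest preceding heading. The helper `table_chunk_names` is hypothetical, table detection is reduced to "a run of lines starting with `|`", and tables that appear before any heading are skipped because the text does not say how they would be named.

```python
import re
from typing import List, Optional

HEADING = re.compile(r"^#{1,6}\s+(.*?)\s*$")


def table_chunk_names(markdown: str) -> List[str]:
    """Name each pipe table after the most recent heading above it."""
    names: List[str] = []
    current_heading: Optional[str] = None
    in_table = False
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            current_heading = match.group(1)
            in_table = False
            continue
        is_row = line.lstrip().startswith("|")
        # A new table starts on the first row after a non-row line.
        if is_row and not in_table and current_heading is not None:
            names.append(f"{current_heading} (table)")
        in_table = is_row
    return names


sample = (
    "## Configuration\n\n"
    "| Key | Default |\n"
    "|-----|---------|\n"
    "| timeout | 30 |\n"
)
print(table_chunk_names(sample))  # ['Configuration (table)']
```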

Search example:

bobbin search "timeout settings" --type table
bobbin grep "retries" --type table

code_block

Code block chunks capture fenced code blocks. They are named by their language tag — ```bash produces a chunk named "code: bash".

Given this markdown:

## Installation

```bash
pip install mypackage
```

```python
import mypackage
mypackage.init()
```

Bobbin produces two code_block chunks: "code: bash" and "code: python".
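The naming rule for tagged fences can be approximated as follows. The function `code_block_names` is a hypothetical helper, only triple-backtick fences are handled, and untagged fences are skipped because the text does not say how they are named.

```python
from typing import List

FENCE = "`" * 3  # built programmatically so this example stays fence-safe


def code_block_names(markdown: str) -> List[str]:
    """Return "code: <lang>" for every opening fence that carries a language tag."""
    names: List[str] = []
    inside = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped.startswith(FENCE):
            continue
        if inside:
            inside = False  # this fence closes the current block
            continue
        inside = True
        lang = stripped[len(FENCE):].strip()
        if lang:
            names.append(f"code: {lang}")
    return names


sample = "\n".join([
    "## Installation",
    "",
    FENCE + "bash",
    "pip install mypackage",
    FENCE,
    "",
    FENCE + "python",
    "import mypackage",
    FENCE,
])
print(code_block_names(sample))  # ['code: bash', 'code: python']
```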

Search example:

bobbin search "install dependencies" --type code_block
bobbin grep "pip install" --type code_block

doc (frontmatter)

Doc chunks capture YAML frontmatter at the top of a markdown file. The chunk is named "Frontmatter".

Given this markdown:

---
title: Deployment Guide
tags: [ops, deployment]
status: published
---

# Deployment Guide

Bobbin produces one doc chunk named "Frontmatter" containing the YAML block.
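A minimal sketch of frontmatter detection, assuming the block must start on the first line and is delimited by `---` lines. The helper `extract_frontmatter` is hypothetical, and whether Bobbin keeps the delimiters in the chunk content is an assumption.

```python
from typing import Optional


def extract_frontmatter(markdown: str) -> Optional[str]:
    """Return the leading YAML block (delimiters included), or None if absent."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[: i + 1])
    return None  # opening delimiter without a matching close


doc = (
    "---\n"
    "title: Deployment Guide\n"
    "tags: [ops, deployment]\n"
    "status: published\n"
    "---\n\n"
    "# Deployment Guide\n"
)
print(extract_frontmatter(doc))
```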

Search example:

bobbin grep "status: draft" --type doc
bobbin search "deployment guide metadata" --type doc

Special Chunk Types

| Type | Description |
|------|-------------|
| commit | Git commit messages (used internally for history analysis) |
| other | Fallback for line-based chunks from unsupported file types |

Line-Based Fallback

Files that don’t match a supported language are split into line-based chunks: 50 lines per chunk with a 10-line overlap between consecutive chunks. These chunks have type other.
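The windowing arithmetic can be sketched as below. With 50-line chunks and a 10-line overlap, each chunk starts 40 lines after the previous one. The function `line_chunks`, the 1-based inclusive ranges, and the handling of the final short chunk are assumptions; Bobbin's exact boundary behavior may differ.

```python
from typing import List, Tuple


def line_chunks(total_lines: int, size: int = 50, overlap: int = 10) -> List[Tuple[int, int]]:
    """Return (start, end) line ranges, 1-based and inclusive."""
    if size <= overlap:
        raise ValueError("size must be greater than overlap")
    step = size - overlap  # 40 lines with the defaults
    ranges: List[Tuple[int, int]] = []
    start = 1
    while start <= total_lines:
        end = min(start + size - 1, total_lines)
        ranges.append((start, end))
        if end == total_lines:
            break  # the last chunk reached the end of the file
        start += step
    return ranges


print(line_chunks(120))  # [(1, 50), (41, 90), (81, 120)]
```

Under these assumptions, a 120-line file yields three chunks, and lines 41 to 50 and 81 to 90 each appear in two chunks.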

Filtering by Type

Both the CLI and MCP tools support filtering by chunk type:

# CLI
bobbin search "auth" --type function
bobbin grep "TODO" --type struct

# MCP tool
search(query: "auth", type: "function")
grep(pattern: "TODO", type: "struct")

Accepted type values (case-insensitive, with aliases):

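| Value | Aliases |
|-------|---------|
| function | func, fn |
| method | |
| class | |
| struct | |
| enum | |
| interface | |
| module | mod |
| impl | |
| trait | |
| doc | documentation |
| section | |
| table | |
| code_block | codeblock |
| commit | |
| other | |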
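The alias handling described above amounts to lowercasing the input and mapping aliases to canonical names. The sketch below uses only the values from the alias table; the function name `normalize_chunk_type` and the choice to raise `ValueError` on unknown input are assumptions.

```python
CANONICAL = {
    "function", "method", "class", "struct", "enum", "interface", "module",
    "impl", "trait", "doc", "section", "table", "code_block", "commit", "other",
}

ALIASES = {
    "func": "function",
    "fn": "function",
    "mod": "module",
    "documentation": "doc",
    "codeblock": "code_block",
}


def normalize_chunk_type(value: str) -> str:
    """Map a user-supplied type (any case, possibly an alias) to its canonical name."""
    key = value.strip().lower()
    key = ALIASES.get(key, key)
    if key not in CANONICAL:
        raise ValueError(f"unknown chunk type: {value!r}")
    return key


print(normalize_chunk_type("FN"))         # function
print(normalize_chunk_type("CodeBlock"))  # code_block
```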

Language-to-Chunk Mapping

| Language | Extensions | Chunk Types Extracted |
|----------|------------|-----------------------|
| Rust | .rs | function, method, struct, enum, trait, impl, module |
| TypeScript | .ts, .tsx | function, method, class, interface |
| Python | .py | function, class |
| Go | .go | function, method, struct |
| Java | .java | method, class, interface, enum |
| C++ | .cpp, .cc, .hpp | function, method, class, struct, enum |
| Markdown | .md | section, table, code_block, doc |
| Other | * | other (line-based) |
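The mapping above can be read as a two-step lookup: extension to language, then language to chunk types, with everything else falling back to other. The sketch below encodes the table as data. The helper `chunk_types_for` is hypothetical, and exact, case-sensitive extension matching is an assumption.

```python
import os
from typing import List

EXTENSION_LANGUAGE = {
    ".rs": "Rust", ".ts": "TypeScript", ".tsx": "TypeScript", ".py": "Python",
    ".go": "Go", ".java": "Java", ".cpp": "C++", ".cc": "C++", ".hpp": "C++",
    ".md": "Markdown",
}

CHUNK_TYPES = {
    "Rust": ["function", "method", "struct", "enum", "trait", "impl", "module"],
    "TypeScript": ["function", "method", "class", "interface"],
    "Python": ["function", "class"],
    "Go": ["function", "method", "struct"],
    "Java": ["method", "class", "interface", "enum"],
    "C++": ["function", "method", "class", "struct", "enum"],
    "Markdown": ["section", "table", "code_block", "doc"],
}


def chunk_types_for(path: str) -> List[str]:
    """Return the chunk types the table lists for a file, or ["other"] as the fallback."""
    extension = os.path.splitext(path)[1]
    language = EXTENSION_LANGUAGE.get(extension)
    if language is None:
        return ["other"]  # line-based fallback
    return CHUNK_TYPES[language]


print(chunk_types_for("src/main.rs"))
print(chunk_types_for("notes.txt"))  # ['other']
```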