benchmark

Benchmark embedding models to compare load time, single-query latency, and batch embedding latency.

Synopsis

bobbin benchmark -q <QUERY> [OPTIONS] [PATH]

Description

The benchmark command runs timed trials against one or more ONNX embedding models. For each model it measures:

  • Load time — how long it takes to initialize the model.
  • Single embed — per-query embedding latency (mean, min, max, p50, p95).
  • Batch embed — latency for embedding all queries in a single batch.
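The five single-embed statistics can be illustrated with a short Python sketch. This is not bobbin's implementation: the helper names (`percentile`, `summarize`, `time_calls`) are hypothetical, and the nearest-rank percentile method is an assumption, since the percentile definition bobbin uses is not stated here.

```python
import math
import statistics
import time


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an ascending-sorted, non-empty list.

    Assumption: nearest-rank is used purely for illustration.
    """
    rank = math.ceil(pct / 100 * len(sorted_samples))
    return sorted_samples[max(rank, 1) - 1]


def summarize(samples_ms):
    """Reduce a list of latencies (ms) to mean, min, max, p50, and p95."""
    ordered = sorted(samples_ms)
    return {
        "mean_ms": statistics.fmean(ordered),
        "min_ms": ordered[0],
        "max_ms": ordered[-1],
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
    }


def time_calls(fn, arg, iterations=5):
    """Call fn(arg) `iterations` times and return each latency in ms."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn(arg)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples
```

With only a handful of samples (the default is 5 iterations per query), p95 under this method lands on or near the maximum, so more iterations give a more meaningful tail figure.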

If no --model is specified, all three built-in models are tested:

  • all-MiniLM-L6-v2
  • bge-small-en-v1.5
  • gte-small

Models are automatically downloaded if not already cached.

Arguments

| Argument | Default | Description |
|----------|---------|-------------|
| [PATH]   | .       | Directory containing .bobbin/ config |

Options

| Option           | Short | Default      | Description |
|------------------|-------|--------------|-------------|
| --query <TEXT>   | -q    |              | Queries to benchmark (required, can be repeated) |
| --model <NAME>   | -m    | all built-in | Models to compare (can be repeated) |
| --iterations <N> |       | 5            | Number of iterations per query |
| --batch-size <N> |       | 32           | Batch size for embedding |

Examples

Benchmark with a single query:

bobbin benchmark -q "authentication middleware"

Compare two models with multiple queries:

bobbin benchmark \
  -q "error handling" \
  -q "database connection pool" \
  -m all-MiniLM-L6-v2 \
  -m bge-small-en-v1.5 \
  --iterations 10

JSON output for programmatic comparison:

bobbin benchmark -q "test query" --json

JSON Output

{
  "models": [
    {
      "model": "all-MiniLM-L6-v2",
      "dimension": 384,
      "load_time_ms": 45.2,
      "embed_single": {
        "mean_ms": 3.12,
        "min_ms": 2.80,
        "max_ms": 4.01,
        "p50_ms": 3.05,
        "p95_ms": 3.90
      },
      "embed_batch": {
        "mean_ms": 8.45,
        "min_ms": 7.90,
        "max_ms": 9.10,
        "p50_ms": 8.40,
        "p95_ms": 9.05
      }
    }
  ],
  "queries": ["test query"],
  "iterations": 5
}
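
For programmatic comparison, a report in the shape shown above can be ranked with a few lines of Python. This is a minimal sketch: `rank_models` is a hypothetical helper, it relies only on the `models`, `model`, and `embed_single` fields from the sample, and it assumes the report reaches the script as a single JSON document.

```python
import json
import sys


def rank_models(report_text, stat="p50_ms"):
    """Return (model, latency_ms) pairs sorted fastest-first.

    `stat` is any key of the embed_single object in the sample
    report, for example "mean_ms" or "p95_ms".
    """
    report = json.loads(report_text)
    rows = [(m["model"], m["embed_single"][stat]) for m in report["models"]]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Read the JSON report from standard input and print one line per model.
    for name, ms in rank_models(sys.stdin.read()):
        print(f"{name}\t{ms:.2f} ms")
```

Saved under a name of your choosing (say, rank_models.py), the script could be fed by piping the output of `bobbin benchmark -q "test query" --json` into it, provided nothing other than the JSON report is written to standard output.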

See Also

  • Embedding Settings — configure which model bobbin uses
  • index — build the search index using the configured model