API Reference

HTTP API for the Vectoria server. All endpoints except /health require Authorization: Bearer <api_key>. Base URL: http://localhost:7700 by default.
To embed Vectoria directly in a Rust application without a server, see the Embedded library (Rust) section below.
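To make the header requirement concrete, the sketch below builds the raw HTTP/1.1 text of an authenticated JSON request using only the Rust standard library. The helper name build_request is hypothetical, and the Content-Type header is an assumption based on the JSON bodies shown in this reference. In practice an HTTP client crate would normally assemble this for you.

```rust
/// Builds the raw HTTP/1.1 text for a JSON request carrying a bearer token.
/// `build_request` is a hypothetical helper for illustration, not part of Vectoria.
fn build_request(method: &str, path: &str, host: &str, api_key: &str, body: &str) -> String {
    format!(
        "{method} {path} HTTP/1.1\r\n\
         Host: {host}\r\n\
         Authorization: Bearer {api_key}\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {len}\r\n\
         Connection: close\r\n\
         \r\n\
         {body}",
        // Content-Length is the body size in bytes, not characters.
        len = body.len(),
    )
}
```

Calling build_request("POST", "/search", "localhost:7700", "my-key", body) yields text that can be written to a TCP connection to the server.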

Health

GET /health

No authentication required. Use this to wait for the server to be ready.

{"status": "ok", "version": "0.1.8"}
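One way to wait for readiness is to poll /health until it answers 200. The sketch below uses only the Rust standard library. The names probe_health and wait_until_ready are hypothetical, and the address, attempt count, and delay are assumptions to adjust for your deployment.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::thread::sleep;
use std::time::Duration;

/// Sends a plain HTTP/1.1 GET /health and reports whether the status line is 200.
/// No Authorization header is sent because /health does not require one.
fn probe_health(addr: &str) -> bool {
    let Ok(mut stream) = TcpStream::connect(addr) else {
        return false;
    };
    let _ = stream.set_read_timeout(Some(Duration::from_secs(2)));
    let request = format!("GET /health HTTP/1.1\r\nHost: {addr}\r\nConnection: close\r\n\r\n");
    if stream.write_all(request.as_bytes()).is_err() {
        return false;
    }
    let mut response = String::new();
    // Read errors are ignored here; the status-line check below decides the result.
    let _ = stream.read_to_string(&mut response);
    response.starts_with("HTTP/1.1 200")
}

/// Calls `probe` up to `attempts` times, sleeping `delay` after each failed attempt
/// except the last. Returns true as soon as one probe succeeds.
fn wait_until_ready(mut probe: impl FnMut() -> bool, attempts: u32, delay: Duration) -> bool {
    for attempt in 0..attempts {
        if probe() {
            return true;
        }
        if attempt + 1 < attempts {
            sleep(delay);
        }
    }
    false
}
```

For example, wait_until_ready(|| probe_health("localhost:7700"), 30, Duration::from_secs(1)) polls once per second for up to 30 attempts.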

Products

POST /products

Index a product. Generates an embedding from text and stores the product.

{
  "id": "sku-001",
  "text": "Blue trail running shoe waterproof breathable",
  "metadata": {
    "title": "Trail X Pro",
    "brand": "Merrell",
    "price": 149.99,
    "in_stock": true
  }
}

Field	Type	Description
`id`	string, required	Unique product identifier
`text`	string	Indexed for BM25 and used to generate the embedding vector
`metadata`	JSON object	Arbitrary key/value pairs stored and returned with search hits. Supports filter matching.

Returns 201 Created.

PUT /products/{id}

Replace an existing product and re-embed. Same body as POST /products.

DELETE /products/{id}

Remove a product from the index. Returns 200 OK.

GET /products/{id}/similar?limit=10

Return products whose vectors are nearest to the given product. Useful for "customers also viewed" widgets.

POST /products/similar

Find similar products by text or raw vector.

{ "text": "lightweight running shoe", "limit": 10 }

Or pass "vector": [0.12, -0.34, ...] directly to skip embedding generation.

Search

POST /search

{
  "q": "waterproof trail shoes",
  "limit": 20,
  "offset": 0,
  "mode": "hybrid",
  "filters": {"in_stock": true, "brand": "Merrell"},
  "aggregate": ["brand", "category"],
  "ranking_weights": {"semantic": 0.7, "bm25": 0.3},
  "explain": false,
  "rerank": false
}

Field	Default	Description
`q`	—	Query string, required
`limit`	20	Results per page
`offset`	0	Pagination offset
`mode`	`hybrid`	`hybrid`, `semantic`, or `bm25`
`filters`	null	Key/value pairs matched against metadata. Special keys: `price_min`, `price_max`
`aggregate`	null	Metadata fields to facet-count across all matched candidates (e.g. `["brand","category"]`)
`ranking_weights`	see below	Override score fusion weights for this request
`explain`	false	Include per-result score breakdown in response
`rerank`	false	Apply cross-encoder reranking (requires `index.enable_reranker = true`)

Default ranking weights: semantic=0.7, bm25=0.3, popularity=0.2, query_ctr=0.15, availability=0.05, margin=0.05.

Response:

{
  "hits": [
    {
      "id": "sku-001",
      "score": 0.87,
      "metadata": {"title": "Trail X Pro", "price": 149.99},
      "explain": null
    }
  ],
  "total": 42,
  "offset": 0,
  "limit": 20,
  "processing_time_ms": 3,
  "query": "waterproof trail shoes",
  "aggregations": {
    "brand": {"Merrell": 8, "Salomon": 5},
    "category": {"Trail Running": 10}
  }
}

total counts matched candidates only up to a cap of (limit + offset) × 5, so it is not an exact index-wide count.
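The cap on total is simple arithmetic, sketched below. The function names are hypothetical, and the truncation check is a heuristic assumption rather than documented server behavior: a total that reaches the cap suggests more matches may exist than were counted.

```rust
/// Upper bound on `total` for a request: (limit + offset) × 5.
fn candidate_cap(limit: usize, offset: usize) -> usize {
    (limit + offset) * 5
}

/// Heuristic (an assumption, not documented behavior): a `total` that reaches
/// the cap may have been truncated, so the index could hold more matches.
fn total_may_be_truncated(total: usize, limit: usize, offset: usize) -> bool {
    total >= candidate_cap(limit, offset)
}
```

With the defaults limit=20 and offset=0 the cap is 100, so the total of 42 in the response above sits well below it. Requesting the third page (offset=40) raises the cap to 300.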

Score explanation (`explain: true`)

The examples below come from a real test (test_explain_score_breakdown_anatomy): three products indexed, three clicks recorded on shoe1 for the query "running shoe", then a hybrid search with explain:true.

Why shoe1 ranked first (score: 0.7112):

{
  "id": "shoe1",
  "score": 0.7112,
  "explain": {
    "match_sources": ["bm25", "vector"],
    "query_context": {
      "original_query": "running shoe",
      "effective_query": "running shoe",
      "spell_corrected": false,
      "query_expanded": false
    },
    "factors": [
      {"factor": "semantic_similarity", "score": 0.016, "weight": 0.70, "contribution": 0.011},
      {"factor": "bm25",               "score": 1.000, "weight": 0.30, "contribution": 0.300},
      {"factor": "popularity",         "score": 1.000, "weight": 0.20, "contribution": 0.200},
      {"factor": "query_ctr",          "score": 1.000, "weight": 0.15, "contribution": 0.150},
      {"factor": "availability",       "score": 1.000, "weight": 0.05, "contribution": 0.050},
      {"factor": "margin",             "score": 0.000, "weight": 0.05, "contribution": 0.000}
    ]
  }
}

bm25=1.0: top BM25 result (score normalized to the maximum across candidates).
popularity=1.0: most clicks globally relative to views.
query_ctr=1.0: the only product clicked for this exact query (normalized to the maximum).
semantic_similarity=0.016: low here because this test uses a hash-based stub embedder. With multilingual-e5-small, semantic scores are typically 0.7–0.95 for relevant products.

Why shoe2 ranked second (score: 0.1780):

{
  "id": "shoe2",
  "score": 0.1780,
  "explain": {
    "match_sources": ["bm25", "vector"],
    "factors": [
      {"factor": "bm25",        "score": 0.363, "weight": 0.30, "contribution": 0.109},
      {"factor": "query_ctr",   "score": 0.000, "weight": 0.15, "contribution": 0.000},
      {"factor": "availability","score": 1.000, "weight": 0.05, "contribution": 0.050},
      ...
    ]
  }
}

bm25=0.363: partial match ("running" appears but not "shoe"). query_ctr=0 and popularity=0: never clicked.

Why mat1 ranked last (score: 0.0566):

match_sources: ["vector"] — only returned via semantic search, no BM25 match. Score is almost entirely availability (0.05) plus a tiny semantic contribution (0.007). Not a text match; only present because the vector index found it as a distant neighbor.
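The bm25 and query_ctr factors above are described as normalized to the maximum across candidates. A minimal sketch of that kind of normalization follows. The function name is hypothetical, the handling of an all-zero candidate set is an assumption, and the server's actual implementation may differ.

```rust
/// Scales every score by the largest one so the top candidate gets 1.0.
/// Returns all zeros when no candidate has a positive score (an assumed convention).
fn normalize_to_max(raw: &[f64]) -> Vec<f64> {
    let max = raw.iter().copied().fold(0.0_f64, f64::max);
    if max <= 0.0 {
        return vec![0.0; raw.len()];
    }
    raw.iter().map(|s| s / max).collect()
}
```

With made-up raw BM25 scores of 2.0, 0.726, and 0.0, this gives 1.0 for the best text match, roughly 0.363 for the partial match, and 0.0 for a candidate with no text match, which mirrors the shape of the three results above.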

Field	Description
`match_sources`	How this product entered the candidate set: subset of `["bm25", "vector"]`
`query_context.original_query`	Query as submitted
`query_context.effective_query`	Query actually used for BM25 (differs when spell-corrected or expanded)
`query_context.spell_corrected`	`true` if BM25 returned no results and a corrected query was tried
`query_context.query_expanded`	`true` if semantically-similar terms were appended to improve recall
`factors[].score`	Signal value in the range 0.0–1.0 (bm25 and query_ctr are normalized to the maximum across candidates)
`factors[].weight`	Configured weight for this factor
`factors[].contribution`	`score × weight` — actual contribution to `hit.score`

sum(contribution) equals hit.score. To diagnose a ranking:

Check match_sources — if bm25 is absent, product wasn't a text match (semantic retrieval only).
Check query_ctr contribution — if it dominates, ranking is driven by past clicks on this query.
Check bm25 contribution — if low, the product doesn't contain the query terms verbatim.
Check query_context.spell_corrected — if true, BM25 found no results for the original query.
Check semantic_similarity — if the only non-zero signal, the product is semantically related but not a keyword match.
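The first diagnostic steps can be automated once the explain payload is parsed. The sketch below recomputes score × weight for the shoe1 factors shown earlier, sums them, and picks the dominant factor. The Factor struct and function names are hypothetical stand-ins for whatever type you deserialize the response into.

```rust
/// One entry of `explain.factors`; field names follow the response shown earlier.
struct Factor {
    factor: &'static str,
    score: f64,
    weight: f64,
}

impl Factor {
    /// contribution = score × weight
    fn contribution(&self) -> f64 {
        self.score * self.weight
    }
}

/// Sum of all contributions, which should match `hit.score`.
fn total_score(factors: &[Factor]) -> f64 {
    factors.iter().map(Factor::contribution).sum()
}

/// Name of the factor with the largest contribution, or None for an empty list.
fn dominant_factor(factors: &[Factor]) -> Option<&'static str> {
    factors
        .iter()
        .max_by(|a, b| a.contribution().total_cmp(&b.contribution()))
        .map(|f| f.factor)
}

/// The shoe1 factors copied from the explain output.
fn shoe1_factors() -> Vec<Factor> {
    vec![
        Factor { factor: "semantic_similarity", score: 0.016, weight: 0.70 },
        Factor { factor: "bm25", score: 1.000, weight: 0.30 },
        Factor { factor: "popularity", score: 1.000, weight: 0.20 },
        Factor { factor: "query_ctr", score: 1.000, weight: 0.15 },
        Factor { factor: "availability", score: 1.000, weight: 0.05 },
        Factor { factor: "margin", score: 0.000, weight: 0.05 },
    ]
}
```

For shoe1 the contributions sum to 0.7112, matching hit.score, and bm25 is the dominant factor with 0.300.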

GET /autocomplete?q=runn&limit=5

Returns query completions based on indexed product text. Useful for search-as-you-type inputs.

Events

POST /events

Record a behavioral signal to improve ranking. Two distinct signals are derived:

Signal	How computed	Effect
Global popularity	click_count / view_count per product (all queries)	Products with high overall CTR rank slightly higher everywhere
Query CTR	click + purchase count per (query, product) pair	Products clicked for this exact query rank higher for future searches

Query CTR is the stronger signal. It captures "users who searched this chose that" — something BM25 and vectors cannot provide. Always include query in click events to activate it.

{
  "product_id": "sku-001",
  "event_type": "click",
  "query": "waterproof trail shoes",
  "user_id": "u-abc",
  "session_id": "sess-xyz"
}

Field	Description
`product_id`	Required. The product that was interacted with.
`event_type`	`view`, `click`, `add_to_cart`, `wishlist`, `purchase`
`query`	The search query that led to this product. Required for query-CTR to work.
`user_id`	Optional. Reserved for future per-user personalization.
`session_id`	Optional. Groups events within a browsing session.

Feedback loop:

User searches "running shoes" → results returned
User clicks sku-001 → POST /events with event_type=click, query="running shoes"
Background aggregation (every aggregation_interval_secs, default 300s) computes per-product signals
Next search for "running shoes" → sku-001 gets a query_ctr boost in the score formula
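The aggregation step in this loop can be pictured as a pass over recorded events that tallies the raw counts behind both signals. This is only a sketch under assumptions: the Event and RawSignals types and the aggregate function are hypothetical, and the conversion of raw counts into 0.0–1.0 scores is left out because the server's normalization details are not covered here.

```rust
use std::collections::HashMap;

/// A recorded event; fields mirror the POST /events body.
struct Event {
    product_id: String,
    event_type: String,
    query: Option<String>,
}

/// Raw counts derived from events, before any normalization.
#[derive(Default)]
struct RawSignals {
    /// product_id -> (click_count, view_count), across all queries
    clicks_views: HashMap<String, (u64, u64)>,
    /// (query, product_id) -> click + purchase count
    query_clicks: HashMap<(String, String), u64>,
}

/// Tallies global click/view counts and per-(query, product) click + purchase counts.
fn aggregate(events: &[Event]) -> RawSignals {
    let mut signals = RawSignals::default();
    for e in events {
        let entry = signals
            .clicks_views
            .entry(e.product_id.clone())
            .or_insert((0, 0));
        match e.event_type.as_str() {
            "click" => entry.0 += 1,
            "view" => entry.1 += 1,
            _ => {}
        }
        // Query CTR only sees clicks and purchases that carry a query.
        if matches!(e.event_type.as_str(), "click" | "purchase") {
            if let Some(q) = &e.query {
                *signals
                    .query_clicks
                    .entry((q.clone(), e.product_id.clone()))
                    .or_insert(0) += 1;
            }
        }
    }
    signals
}
```

A click sent without a query still raises the global click count but never reaches query_clicks, which is why the query field matters for the query-CTR signal.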

Score formula (all weights configurable in [ranking]):

score = semantic    × w_semantic
      + bm25        × w_bm25
      + popularity  × w_popularity     // global click/view ratio
      + query_ctr   × w_query_ctr      // clicks for this exact query (default 0.15)
      + availability × w_availability
      + margin      × w_margin

Pass "explain": true in search to see per-factor scores in each hit.
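As a worked instance of the formula, the sketch below applies the default weights listed in this document to one product before and after it gains a full query_ctr signal. Weights and FactorScores are hypothetical stand-in types, not the library's RankingWeights, and the signal values are made up for illustration.

```rust
/// Stand-in for the configurable weights; defaults follow this document.
struct Weights {
    semantic: f64,
    bm25: f64,
    popularity: f64,
    query_ctr: f64,
    availability: f64,
    margin: f64,
}

impl Default for Weights {
    fn default() -> Self {
        Weights {
            semantic: 0.7,
            bm25: 0.3,
            popularity: 0.2,
            query_ctr: 0.15,
            availability: 0.05,
            margin: 0.05,
        }
    }
}

/// Per-product signal values, each expected in the range 0.0 to 1.0.
struct FactorScores {
    semantic: f64,
    bm25: f64,
    popularity: f64,
    query_ctr: f64,
    availability: f64,
    margin: f64,
}

/// Weighted sum of all six factors, as in the score formula.
fn score(s: &FactorScores, w: &Weights) -> f64 {
    s.semantic * w.semantic
        + s.bm25 * w.bm25
        + s.popularity * w.popularity
        + s.query_ctr * w.query_ctr
        + s.availability * w.availability
        + s.margin * w.margin
}
```

With semantic=0.8, bm25=0.5, availability=1.0, and every other signal at 0, the score is 0.56 + 0.15 + 0.05 = 0.76. Once query_ctr reaches 1.0 for that query, the same product scores 0.91, a boost equal to w_query_ctr.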

Admin

GET /stats

Returns index size, vector count, storage backend, and embedding model info.

POST /admin/reindex

Re-embeds all products using the current embedding model and rebuilds the HNSW index. Use after:

Changing the embedding model or provider
Switching the vector_backend to edgestore-hnsw
Bulk-importing products that were added with PendingVector status

{"reindexed": 5000, "errors": 0, "duration_ms": 12400}

Embedded library (Rust)

vectoria-core is published on crates.io and can be embedded directly in any Rust application — no HTTP server required.

[dependencies]
vectoria-core = "0.1.8"

Async API (Tokio)

use vectoria_core::{SearchEngineBuilder, model::{SearchRequest, SearchMode}};

let engine = SearchEngineBuilder::new()
    .query_cache(300, 1_000)   // TTL seconds, max cached queries
    .build()
    .await?;

engine.index(product).await?;

let results = engine.search(SearchRequest {
    q: "running shoes".into(),
    mode: SearchMode::Hybrid,
    limit: 10,
    ..Default::default()
}).await?;

Sync API (no async runtime required)

use vectoria_core::{SearchEngineSync, model::SearchRequest};

let engine = SearchEngineSync::new()?;
engine.index(product)?;

let results = engine.search(SearchRequest {
    q: "running shoes".into(),
    ..Default::default()
})?;

engine.reindex()?;   // re-embed all products
engine.stats()?;     // index stats

SearchEngineSync creates a single-threaded Tokio runtime internally. Safe to use from synchronous callers with no existing runtime.

Preloading an existing database

Both persistent storage backends accept a file path. Point them at an existing database and the engine loads from it at startup. Call reindex_all() after opening to rebuild the in-memory BM25 index and spell corrector from stored products.

use std::{path::Path, sync::Arc};
use vectoria_core::{
    SearchEngineBuilder,
    storage::sqlite::SqliteStorage,
};

let storage = Arc::new(SqliteStorage::open(Path::new("./vectoria.db"))?);

let engine = SearchEngineBuilder::new()
    .storage(storage)
    .build()
    .await?;

// Rebuild BM25 + spell corrector from stored products
engine.reindex_all().await?;

For HNSW persistence, use EdgeStoreStorage::open(path) paired with EdgeStoreVectorIndex::open(path, model_id, dims). The HNSW graph is persisted to disk and loaded automatically — no rebuild needed unless the embedding model changed.

Bulk indexing

There is no single batch call — index products individually in a loop. If products already have pre-computed vectors, set product.vector and the embedding step is skipped (only storage + BM25 + spell corrector are updated). After bulk loading with HNSW, call reindex_all() once to flush the graph.

// Fast path: skip embedding if vectors already computed
for p in products {
    let product = Product {
        text: Some(p.text),
        vector: Some(p.embedding),  // skip embed call
        // id and metadata come from Product::new, so p.id and p.meta are moved only once
        ..Product::new(p.id, p.meta)
    };
    engine.index(product).await?;
}

// Flush HNSW graph after bulk load
engine.reindex_all().await?;

For concurrent bulk loading in async code, spawn tasks and join:

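let engine = Arc::new(engine);
// One task per product; each task holds its own handle to the shared engine
let handles: Vec<_> = products.into_iter().map(|p| {
    let e = Arc::clone(&engine);
    tokio::spawn(async move { e.index(p).await })
}).collect();

// First ? surfaces a panicked or cancelled task (JoinError), second ? the index error
for h in handles { h.await??; }
engine.reindex_all().await?;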

Builder options

Method	Default	Description
`.storage(arc)`	`MemoryStorage`	Persistent backends: `SqliteStorage`, `EdgeStoreStorage`
`.vector_index(arc)`	`MemoryVectorIndex`	Persistent: `EdgeStoreVectorIndex` (HNSW)
`.embedding(arc)`	Local `multilingual-e5-small`	Any `EmbeddingProvider` impl, incl. OpenAI-compatible
`.weights(RankingWeights)`	semantic=0.7, bm25=0.3, popularity=0.2, query_ctr=0.15, availability=0.05, margin=0.05	Score fusion weights
`.query_cache(ttl, max)`	disabled	LRU cache for repeated queries (TTL in seconds)
`.reranker()`	disabled	Enable cross-encoder reranking (requires feature flag)

All types implement Send + Sync. Storage and vector index can be wrapped in Arc and shared across threads.

Publishing — run make publish after cargo login (or set CARGO_REGISTRY_TOKEN). Tag a release with make tag to trigger the GitHub Actions release workflow.