Vectoria — Embedded Hybrid Search for Ecommerce

100% Query Coverage

2ms Semantic p50

500K Products / Node

40MB Model Size

// quick start

Up in three commands.

Clone, build, start. The embedding model downloads automatically on first run. No Python, no Docker, no search cluster to provision.

Works with NDJSON, CSV, and Parquet out of the box. POST products to the HTTP API — they're immediately searchable. BM25, semantic, and hybrid search on the same index with a single flag.

bash

# clone and build (~60s) $ git clone github.com/gleicon/vectoria $ cargo build --release Compiling vectoria-core v0.1.8 Finished release target in 58.4s # start server $ ./target/release/vectoria-server INFO vectoria v0.1.8 api_key: a1b2c3d4e5f6... INFO listening on http://0.0.0.0:7700 # index a product $ curl -X POST http://localhost:7700/products \ -H "Authorization: Bearer $KEY" \ -d '{"id":"p1","text":"Nike Air Max running shoe"}' 201 Created # search $ curl -X POST http://localhost:7700/search \ -d '{"q":"jogging footwear","mode":"hybrid"}' {"hits":[{"id":"p1","score":0.91,...}]}

// features

Everything in one binary.

No microservices. No managed search cluster. No per-query API billing. Ship search on your own infrastructure.

⚡

BM25 + Vector Hybrid

Full-text precision for exact queries. Semantic fallback for concept queries ("jogging footwear" finds "Nike Air Max"). Scores are fused and re-ranked in a single pass.

◈

Zero-Result Elimination

100% query coverage on ESCI benchmark. Spell correction fires on zero-result queries only — compound splits ("whiteshoes" → "white shoes"), typo correction, space joining — without hurting precision on well-formed queries.

▣

HNSW Vector Index

EdgeStore HNSW built automatically after reindex. Activates on POST /admin/reindex. Approximate nearest-neighbor search at 2ms p50 for a 5K product catalog.

↑

Behavioral Signals

Record clicks, purchases, views, and cart events via POST /events. Popularity and conversion signals are aggregated and fold into ranking scores automatically.

◎

Local ONNX Embeddings

Ships with multilingual-e5-small (~40 MB, 100+ languages). Swap to any OpenAI-compatible embedding provider via config — no code changes required.

≡

Persistent Storage

Three backends: in-memory (dev), SQLite + EdgeStore flat scan (light prod), EdgeStore HNSW (production). Switch with one config line, no data migration.

⊕

Catalog-Seeded Spell Correction

SymSpell seeded from your product catalog as items are indexed. Vocabulary grows with the catalog. No dictionary files to manage. Skips numeric tokens and product codes automatically.

⌖

Cross-Encoder Reranking

Optional rerank: true flag in any search request. Top-50 results re-scored with a cross-encoder model. Enable with VECTORIA_ENABLE_RERANKER=1.

⊡

Bulk Import CLI

vectoria import products.ndjson handles NDJSON, CSV, and Parquet. Batch size configurable. Progress reported per batch. Re-embed all products after a model change with vectoria reindex.

// how it works

The query pipeline.

Every search request goes through the same pipeline regardless of mode.

Receive query

The original query string is used as-is. No pre-processing that could hurt precision.

BM25 search

Full-text search over the indexed catalog using BM25 scoring. If zero results, spell correction fires: compound splits + typo correction on alphabetic tokens, then BM25 retries with the corrected query.

Semantic search hybrid / semantic only

Query embedded via multilingual-e5-small (or configured provider). HNSW approximate nearest-neighbor search over product vectors. Semantic candidates merged with BM25 candidates.

Query expansion automatic

If BM25 returns fewer than limit/2 results, top semantic hits are mined for expansion terms. BM25 retries with the enriched query. Narrows the zero-result gap further without increasing latency for normal queries.

Score fusion and ranking

Final score = semantic × 0.6 + bm25 × w + popularity × 0.2 + availability × 0.1 + margin × 0.1. All weights configurable in vectoria.toml or per-request.

Optional cross-encoder reranking

With "rerank": true, top-50 results re-scored. Adds ~20–50ms latency. Off by default.

// benchmark

Measured on real ecommerce data.

Amazon ESCI dataset — 5,000 US products, 107–119 judged queries, multilingual-e5-small embedding. ESCI labels represent three levels of query hardness.

Label Set	Queries	BM25 MRR	Hybrid MRR	Semantic MRR	Coverage (all modes)
E — exact match	107	0.5842	0.5882	0.4922	100%
E+S — exact + concept	117	0.6595	0.6576	0.5690	100%
E+S+C — all labels	119	0.6835	0.6803	0.5797	100%

E = exact product name match (BM25-optimal). S = substitute/concept (keyword overlap low). C = complement (e.g. "camera" → "camera bag").
Previous BM25 coverage before spell corrector: 96.6%. All modes now reach 100%. Spell correction fires only on zero-result queries — no precision loss on normal queries.

Reproduce: make esci-import && make esci-judges && make bench

// comparison

vs. the alternatives.

Vectoria's niche is the gap between "just add SQLite FTS5" and "stand up Elasticsearch."

SQLite FTS5

✓ Zero-dependency
✓ Fast BM25
✗ No semantic fallback
✗ Zero-result gaps
✗ No behavioral ranking
✗ No compound spell correction
✗ No cross-encoder reranking

      Vectoria ← you are here
      ✓ Single binary, no external services
✓ BM25 + HNSW hybrid
✓ 100% query coverage
✓ Behavioral signals (click, purchase)
✓ Catalog-seeded spell correction
✓ Local ONNX embeddings (no API key)
✓ Cross-encoder reranking optional

    

Elasticsearch / Algolia

✓ Mature ecosystem
✓ Multi-tenant / SaaS
✗ External service required
✗ Per-query billing (Algolia)
✗ Cluster to operate (ES)
~ Semantic: add-on / extra cost
~ BYO embedding model complexity

// installation

Docker Compose or from source.

quickest start — docker compose (no Rust required)

# clone and start — ONNX model downloads automatically on first run $ git clone github.com/gleicon/vectoria && cd vectoria $ VECTORIA_API_KEY=my-secret-key docker compose up INFO vectoria v0.1.8 api_key: my-secret-key INFO listening on http://0.0.0.0:7700 # background $ VECTORIA_API_KEY=my-secret-key docker compose up -d $ curl http://localhost:7700/health {"status":"ok","version":"0.1.8"}

DOCKER — FULL IMAGE (~400 MB)

local ONNX embeddings, model cached in volume

$ docker build --target vectoria-full \ -t vectoria:full . $ docker run -p 7700:7700 \ -v vectoria-data:/data \ -v fastembed-cache:/root/.cache/fastembed \ -e VECTORIA_API_KEY=my-secret-key \ vectoria:full

DOCKER — SLIM IMAGE (~50 MB)

requires external OpenAI-compatible embedding API

$ docker build --target vectoria-slim \ -t vectoria:slim . $ docker run -p 7700:7700 \ -v vectoria-data:/data \ -e VECTORIA_API_KEY=my-secret-key \ -e VECTORIA_EMBEDDING_BASE_URL=https://api.openai.com/v1 \ -e VECTORIA_EMBEDDING_MODEL=text-embedding-3-small \ vectoria:slim

FROM SOURCE (Rust 1.80+)

bash

$ cargo build --release $ ./target/release/vectoria-server INFO listening on http://0.0.0.0:7700

Both images include a HEALTHCHECK and the vectoria CLI binary. The /data volume holds the persistent index. Mount /root/.cache/fastembed to avoid re-downloading the ONNX model on container restart.

// configuration

One TOML file. All optional.

Place vectoria.toml in the working directory. Every field has a sensible default and an environment variable override.

vectoria.toml

[server] host = "0.0.0.0" port = 7700 api_key = "your-key" # auto-generated if absent [storage] path = "./vectoria.db" [embedding] provider = "local" # or "openai-compatible" model = "multilingual-e5-small" [index] vector_backend = "edgestore-hnsw" [ranking] semantic = 0.6 popularity = 0.2 availability = 0.1 margin = 0.1

VECTOR BACKENDS

MEM

`memory`

Everything in-memory. Lost on restart. Development only.

SQL

`sqlite`

SQLite metadata + EdgeStore flat vector scan. Light production use.

HNSW

`edgestore-hnsw` — recommended

Persistent HNSW index. Activates after POST /admin/reindex. Recommended for production.

ENV OVERRIDES

VECTORIA_HOST VECTORIA_PORT VECTORIA_API_KEY VECTORIA_STORAGE_PATH VECTORIA_EMBEDDING_PROVIDER VECTORIA_EMBEDDING_BASE_URL VECTORIA_EMBEDDING_MODEL VECTORIA_SKIP_CONSENT=1 # skip download prompt VECTORIA_ENABLE_RERANKER=1

// documentation

API reference.

Core endpoints. All except /health require Authorization: Bearer <api_key>.

POST /products — index a product

{ "id": "sku-001", "text": "Nike Air Max running shoe", "metadata": { "title": "Nike Air Max", "price": 120.00, "in_stock": true } } → 201 Created

POST /search

{ "q": "jogging footwear", "limit": 20, "mode": "hybrid", // bm25 | semantic | hybrid "filters": {"in_stock": true}, "explain": false, "rerank": false }

POST /events — behavioral signal

{ "product_id": "sku-001", "event_type": "click" } // event_type: click | purchase | view | cart // aggregated every 5min → popularity score

POST /admin/reindex

# Re-embeds all products. Builds HNSW index. # Use after model change or backend switch. { "reindexed": 5000, "errors": 0 }

CLI

# bulk import $ vectoria import products.ndjson \ --server http://localhost:7700 \ --api-key $KEY # benchmark $ vectoria bench judges.ndjson \ --mode all # reindex after model change $ vectoria reindex

// embedded library

Use as a Rust library.

No HTTP server needed. Add vectoria-core to your Cargo.toml and get the full hybrid search engine in-process. Available on crates.io.

Cargo.toml

[dependencies] vectoria-core = "0.1.8"

async (tokio)

use vectoria_core::{SearchEngineBuilder, model::{SearchRequest, SearchMode}}; let engine = SearchEngineBuilder::new() .query_cache(300, 1_000) .build() .await?; engine.index(product).await?; let results = engine.search(SearchRequest { q: "running shoes".into(), mode: SearchMode::Hybrid, limit: 10, ..Default::default() }).await?;

sync (no async runtime required)

use vectoria_core::{SearchEngineSync, model::SearchRequest}; let engine = SearchEngineSync::new()?; engine.index(product)?; let results = engine.search(SearchRequest { q: "running shoes".into(), ..Default::default() })?; engine.reindex()?; engine.stats()?;

DEF

Defaults out of the box

In-memory storage + multilingual-e5-small ONNX — zero config needed for prototyping.

OVR

Override anything

Swap storage (Sqlite, EdgeStore), vector index, embedding provider, or ranking weights via builder methods.

PUB

`make publish`

Publish to crates.io with cargo login + make publish. Tag release with make tag.

// demo

Try the ESCI demo store.

Five minutes to a running demo with real Amazon product data and three search modes side by side. Requires accepting the ESCI dataset license.

make demo

# start server in background, wait until healthy $ make server-bg ✓ server healthy at http://localhost:7700 # download ESCI data and import 5000 products (~5 min first run) $ make esci-import 5000 products loaded Imported 5000 products (0 errors) # open demo storefront at http://localhost:8080 $ make webstore INFO serving at http://localhost:8080

ESCI data requires a separate license: github.com/amazon-science/esci-data

// algolia-compatible api

Algolia-compatible API. Drop-in demo.

vectoria-algolia speaks the same protocol as Algolia Search. Point any InstantSearch or algoliasearch v5 app at localhost:8108 — no code changes beyond the host override. Ships with a React demo and 550 sample products.

github.com/gleicon/vectoria-algolia

bash

# clone and start (builds + loads 550 products automatically) $ git clone https://github.com/gleicon/vectoria-algolia $ cd vectoria-algolia $ docker compose up --build ✓ search healthy at http://localhost:8108 ✓ 550 products loaded # open the demo storefront $ open http://localhost:8108

Wire up your InstantSearch app

TypeScript

import { liteClient } from 'algoliasearch/lite' const searchClient = liteClient('local', 'local', { hosts: [{ url: 'localhost:8108', protocol: 'http', accept: 'readWrite' }], }) // drop this into any <InstantSearch> tree

API compatibility

Feature	Status
Single-index search	✓
Multi-search (InstantSearch)	✓
`facets` / `facetFilters`	✓
`_highlightResult` + AIS tags	✓
`filters` string syntax	✓
`numericFilters`	✓
Batch index / delete	✓
React demo (InstantSearch UI)	✓

View on GitHub → Full API reference → Live API at a.vectoriasearch.com →

Search that finds what customers mean, not just what they type.

Up in three commands.

Everything in one binary.

BM25 + Vector Hybrid

Zero-Result Elimination

HNSW Vector Index

Behavioral Signals

Local ONNX Embeddings

Persistent Storage

Catalog-Seeded Spell Correction

Cross-Encoder Reranking

Bulk Import CLI

The query pipeline.

Receive query

BM25 search

Semantic search hybrid / semantic only

Query expansion automatic

Score fusion and ranking

Optional cross-encoder reranking

Measured on real ecommerce data.

vs. the alternatives.

SQLite FTS5

Vectoria ← you are here

Elasticsearch / Algolia

Docker Compose or from source.

One TOML file. All optional.

memory

sqlite

edgestore-hnsw — recommended

API reference.

Use as a Rust library.

Defaults out of the box

Override anything

make publish

Try the ESCI demo store.

Algolia-compatible API. Drop-in demo.

Search that finds what
customers mean,
not just what they type.

`memory`

`sqlite`

`edgestore-hnsw` — recommended

`make publish`