Announcing ZeroEntropy's First Reranker: zerank-1-small and zerank-1-large

4 min read · Jun 13, 2025


ZeroEntropy is launching two new rerankers: zerank-1-small (open-source, optimized for local use) and zerank-1-large (our flagship production model).


In benchmarks, zerank-1 outperforms Voyage rerank-2 by x% and Cohere’s v3 reranker by x%, with gains up to 20% on private customer data.


Both models support long context (up to 16K tokens), run fast, and are available now via our API. If you’re already using ZeroEntropy, no changes are needed; zerank-1 now runs automatically on every query you send.


If you’re building anything retrieval-heavy, this will make your search results significantly more accurate.

Today, we’re excited to launch the first two models in our new reranker series: zerank-1-small and zerank-1-large. These models dramatically improve search quality across RAG and AI agent applications by refining first-pass results with deep reasoning and context awareness.

zerank-1-small is fully open-source and optimized for local deployment. It’s fast, light, and easy to integrate with existing stacks.

zerank-1-large is our flagship proprietary model, offering best-in-class accuracy and built for production-grade workloads.

In benchmark testing across public evaluation sets, zerank-1 outperforms Voyage rerank-2 by an average of x% and Cohere’s rerank-english-v3 by x%. On private customer data, we’ve seen even larger gains, up to 20% over the best existing rerankers. These results are consistent across technical, legal, and long-form document domains.


Why We Built It


Our team has been working in retrieval for years, and the pattern was always the same: once a system was in production, accuracy problems would creep in. Developers would try to fix them with glue code, custom reranking hacks, and fragile heuristics. But a dedicated reranker wasn’t just a nice-to-have; it was the only way to make results reliable.


With zerank, we wanted to build rerankers that could:

  • Handle longer context windows (up to 16K tokens)

  • Reason across complex, multi-step queries

  • Support real-world latency requirements

  • Work well across technical, legal, and long-form domains


Both zerank models were trained using retrieval-specific objectives and evaluated across real production workloads, including legal queries, software documentation, financial filings, and internal knowledge bases. The results have consistently shown a clear lift in recall, precision, and overall relevance.


How It Works


Both zerank-1-small and zerank-1-large follow a standard two-stage architecture. A fast first-pass search retrieves top candidates using embeddings, hybrid search, or BM25. Our reranker then reorders the top N candidates based on fine-grained semantic relevance to the query.
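To make the two-stage flow concrete, here is a minimal Python sketch. The lexical first-pass scorer and the bigram-matching "reranker" below are toy stand-ins (a real pipeline would use embeddings or BM25 for stage one and a zerank-1 call for stage two); all function names here are illustrative, not part of our API.

```python
def first_pass(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Stage 1: cheap candidate retrieval.
    Toy term-overlap scorer standing in for embeddings / hybrid search / BM25."""
    q_terms = set(query.lower().split())
    return sorted(docs,
                  key=lambda d: len(q_terms & set(d.lower().split())),
                  reverse=True)[:k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    """Stage 2: reorder the top-N candidates with a finer-grained score.
    This crude phrase-match heuristic stands in for a reranker model call."""
    q = query.lower().split()
    bigrams = [" ".join(q[i:i + 2]) for i in range(len(q) - 1)]

    def score(doc: str) -> float:
        d = doc.lower()
        phrase_hits = sum(d.count(b) for b in bigrams)   # word-order signal
        overlap = len(set(q) & set(d.split())) / len(q)  # term-coverage signal
        return phrase_hits + overlap

    return sorted(candidates, key=score, reverse=True)

docs = [
    "password reset steps for my account",
    "billing and invoices",
    "to reset my password click settings",
    "vpn setup guide",
]
candidates = first_pass("reset my password", docs)
ranked = rerank("reset my password", candidates)
```

Stage one surfaces both password-related documents, but only the reranker’s word-order signal promotes the one that actually matches the query’s phrasing, which is the kind of fine-grained distinction a first-pass retriever misses.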


You send us a query and a list of candidate documents, and we return a ranked list. It’s simple, fast, and makes your search feel like magic.
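Concretely, the exchange looks something like the sketch below. The field names (`query`, `documents`, `relevance_score`) and the response shape are illustrative assumptions, not the documented API schema; see docs.zeroentropy.dev for the real request format.

```python
# Hypothetical request payload -- the field and model names here are
# assumptions for illustration, not the documented ZeroEntropy schema.
payload = {
    "model": "zerank-1-large",
    "query": "how do I rotate an API key?",
    "documents": [
        "Rotating keys: generate a new key, then revoke the old one.",
        "Billing FAQ and invoice history.",
        "API keys are managed under account settings.",
    ],
}

# Mocked response: one relevance score per candidate, indexed by input order.
response = {
    "results": [
        {"index": 0, "relevance_score": 0.93},
        {"index": 1, "relevance_score": 0.04},
        {"index": 2, "relevance_score": 0.61},
    ]
}

# Reorder the original documents by the reranker's scores.
ranked = sorted(response["results"],
                key=lambda r: r["relevance_score"], reverse=True)
ranked_docs = [payload["documents"][r["index"]] for r in ranked]
```

The client keeps the document text; the service only needs to return scores keyed by index, which keeps response payloads small even for long documents.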


Try It Now

Both rerankers are live today via the ZeroEntropy API.

  • zerank-1-small is open-source and can be used freely in local or private cloud environments

  • zerank-1-large is hosted and available for production use with pricing included in all retrieval plans

If you’re already using ZeroEntropy, no changes are needed—just turn on reranking in your query options.


If you’re not yet using ZeroEntropy, you can get started with just a few lines of code. Upload your documents, query in natural language, and let our retrieval pipeline (now powered by zerank) return better answers immediately.


We believe great retrieval should be accurate, fast, and easy. With zerank, that’s finally possible.


Learn more at docs.zeroentropy.dev or reach out to chat with our team.