FAISS is not suitable for production. Dedicated vector search solutions solve the issues you mentioned: you store the metadata alongside the vectors as a JSON payload. At least with Qdrant, it works like this: https://qdrant.tech/documentation/concepts/payload/
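Roughly like this, as a minimal sketch (the collection name, payload fields, and dimensionality are made up for illustration; the in-memory client is just for trying it out):

from qdrant_client import QdrantClient, models

client = QdrantClient(":memory:")  # throwaway in-memory instance, illustration only

client.create_collection(
    collection_name="docs",  # placeholder collection name
    vectors_config=models.VectorParams(size=4, distance=models.Distance.COSINE),
)

# Metadata travels with the vector as an arbitrary JSON payload
client.upsert(
    collection_name="docs",
    points=[
        models.PointStruct(
            id=1,
            vector=[0.1, 0.2, 0.3, 0.4],  # placeholder embedding
            payload={"source": "manual.pdf", "customer": "acme", "page": 12},
        )
    ],
)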
Thanks, that makes sense, and it never even crossed my mind. FAISS has been great for prototyping, but I'm definitely hitting its limits around metadata, updates, and operational overhead.
One thing I'm exploring now is Qdrant in embedded mode, since the tool has to run in fully air-gapped environments (no internet, no external services, distributed on a portable SSD). The embedded version just persists to a local directory, similar to SQLite:
from qdrant_client import QdrantClient
client = QdrantClient(path="./qdrant_data") # local-only, no server
If that model works reliably, it would solve several problems FAISS creates for my use case (rough sketch after the list):
incremental updates instead of full index rebuilds
storing metadata as payloads instead of a 1.5GB pickle
much easier filtering (e.g., per-source, per-customer, per-tool)
better concurrency under load
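For the filtering point in particular, here's the kind of query I have in mind, reusing the embedded client from above; the collection name, field names, and dimensionality are all placeholders, not settled choices:

from qdrant_client import models

# Restrict the ANN search to one customer's documents at query time,
# instead of over-fetching and post-filtering like with FAISS
hits = client.search(
    collection_name="docs",    # placeholder collection name
    query_vector=[0.1] * 384,  # stand-in for a real query embedding
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="customer",
                match=models.MatchValue(value="acme"),
            )
        ]
    ),
    limit=5,
)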
I’m still benchmarking, but curious about your experience:
Have you used Qdrant’s embedded mode in a production/offline scenario? And if so, how does it behave with larger collections (500k–1M vectors) on consumer hardware?
Not dismissing FAISS — just trying to pick the right long-term architecture for an offline tool that gets updated via USB and needs to stay lightweight for the end user.
miniCOIL is a contextualized per-word embedding model. It generates extremely small embeddings (8-dimensional or even 4-dimensional) for each word in a sentence while still preserving that word's context.
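If it helps, my understanding is that miniCOIL is exposed through fastembed's sparse-embedding interface; a sketch, assuming the model is published under the name Qdrant/minicoil-v1 (I haven't double-checked the exact identifier):

from fastembed import SparseTextEmbedding

# Assumed model identifier; check fastembed's model list before relying on it
model = SparseTextEmbedding(model_name="Qdrant/minicoil-v1")

# Each document becomes a sparse vector: one entry per word, carrying a handful
# of small context-aware weight components instead of one big dense vector
sparse = list(model.embed(["vector search on an air-gapped laptop"]))
print(sparse[0].indices, sparse[0].values)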
I can answer how it would behave in Qdrant, if you're interested. The index would take around 70GB of RAM. New vectors are first placed in a non-indexed segment and are immediately available for search while the index is being built. Both the vectors and the index can be offloaded to disk. Search takes a few milliseconds.
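Concretely, offloading both to disk is a per-collection setting; a minimal sketch (the server address, collection name, and dimensionality are placeholders):

from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # placeholder address

# Keep both the stored vectors and the HNSW index on disk instead of in RAM
client.create_collection(
    collection_name="docs",  # placeholder collection name
    vectors_config=models.VectorParams(
        size=768,                        # placeholder dimensionality
        distance=models.Distance.COSINE,
        on_disk=True,                    # memmap the stored vectors
    ),
    hnsw_config=models.HnswConfigDiff(on_disk=True),  # HNSW index on disk too
)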