- Source: Milvus (vector database)
Milvus is a distributed vector database developed by Zilliz. It is available as both open-source software and a cloud service.
Milvus is an open-source project of the LF AI & Data Foundation and is distributed under the Apache License 2.0.
History
Milvus has been developed by Zilliz since 2017.
Milvus joined the Linux Foundation as an incubation project in January 2020 and graduated in June 2021. Details of its architecture and possible applications were presented at the ACM SIGMOD Conference in 2021.
Milvus 2.0, a major redesign of the whole product with a new architecture, was released in January 2022.
Features
Similarity search
Major similarity-search features available in the active Milvus 2.4.x branch:
In-memory, on-disk and GPU indices
Single-query, batch-query and range-query search
Support for sparse vectors, binary vectors, JSON and arrays
FP32, FP16 and BF16 data types
Euclidean distance, inner product distance and cosine distance for floating-point data
Hamming distance and Jaccard distance for binary data
Graph indices (including HNSW), inverted-list-based indices and brute-force search
Vector quantization for lossy compression of input data, including product quantization (PQ) and scalar quantization (SQ), which reduces stored data size at the cost of accuracy
Re-ranking
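The distance metrics and the brute-force search named in the list above can be illustrated with a minimal Python sketch. It uses only the standard library and is not Milvus code; the function names are chosen for illustration, and binary vectors are represented as lists of 0/1 values for readability rather than as packed bits.

```python
import math

def euclidean(a, b):
    """L2 distance between two float vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Dot product; a larger value means the vectors are more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Inner product of the two vectors after length normalization."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def jaccard_distance(a, b):
    """1 - |intersection| / |union|, computed over the set bits."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0

def brute_force_search(query, vectors, k):
    """Indices of the k vectors nearest to query by Euclidean distance.

    Every stored vector is compared with the query, so the result is
    exact but the cost grows linearly with the collection size.
    """
    ranked = sorted(range(len(vectors)),
                    key=lambda i: euclidean(query, vectors[i]))
    return ranked[:k]
```

Graph and inverted-list indices exist to avoid the full scan performed by `brute_force_search`, returning approximate results in exchange for lower query cost.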
The Milvus similarity search engine relies on heavily modified forks of third-party open-source similarity search libraries, such as Faiss, DiskANN and hnswlib.
Milvus includes optimizations for I/O data layout, specific to graph search indices.
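The size-for-accuracy trade-off of scalar quantization, listed among the features above, can be sketched as follows. This is a simplified 8-bit scheme with a single global value range, written for illustration only; it is an assumption of this example and not the scheme used by Milvus or its underlying libraries, and the `sq_*` names are hypothetical.

```python
def sq_train(vectors):
    """Find the global minimum and maximum over all training values."""
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    return lo, hi

def sq_encode(vector, lo, hi):
    """Map each float to one of 256 levels and store it as a single byte."""
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if all values are equal
    return bytes(min(255, max(0, round((x - lo) / scale))) for x in vector)

def sq_decode(codes, lo, hi):
    """Reconstruct approximate floats from the stored byte codes."""
    scale = (hi - lo) / 255 or 1.0
    return [lo + c * scale for c in codes]
```

Each value occupies one byte instead of the four bytes of an FP32 value, and for inputs inside the trained range the reconstruction error per value is at most half a quantization step, `(hi - lo) / 510`.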
Database
As a database, Milvus provides the following features:
Column-oriented database
Four supported data consistency levels, including strong consistency and eventual consistency.
Data sharding
Streaming data ingestion, which allows data to be processed and ingested in real time as it arrives
A dynamic schema, which allows data to be inserted without a predefined schema
Independent storage and compute layers
Multi-tenancy scenarios (database-oriented, collection-oriented, partition-oriented)
Memory-mapped data storage
Role-based access control
Multi-vector and hybrid search
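Hybrid search, the last item in the list above, combines several ranked result lists (for example one from a dense-vector query and one from a sparse-vector query) into a single ranking. One widely used fusion method is reciprocal rank fusion (RRF). The sketch below is a generic standard-library illustration of that method, not Milvus code; the function name and the smoothing constant `k=60` are assumptions of this example.

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of IDs into one ranking.

    Each ID receives 1 / (k + rank) from every list it appears in,
    with ranks starting at 1; IDs are returned by descending total score.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
```

Because only ranks are used, the method does not require the scores of the individual searches to be on comparable scales, which is useful when the searches use different distance metrics.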
Deployment options
Milvus can be deployed as an embedded database, a standalone server, or a distributed cluster. Zilliz Cloud offers a fully managed version.
GPU support
Milvus provides GPU-accelerated index building and search using Nvidia CUDA technology via the Nvidia RAFT library, including CAGRA, a recent GPU-based graph indexing algorithm from Nvidia.
Integration
Milvus provides official SDK clients for Java, Node.js, Python and Go. An additional C# SDK client was contributed by Microsoft. The database can integrate with Prometheus and Grafana for monitoring and alerts, with the Haystack and LangChain frameworks, and with IBM Watsonx and OpenAI models.
See also
Nearest neighbor search
Similarity search
Vector database
Vector quantization
Vector embedding