Vexless: A Serverless Vector Data Management System Using Cloud Functions

June 2024 | YONGYE SU, YINQI SUN, MINJIA ZHANG, JIANGUO WANG
The paper introduces Vexless, a serverless vector data management system optimized for cloud functions. Cloud functions, such as AWS Lambda and Azure Functions, offer elastic, serverless, and cost-effective computing, making them suitable for bursty and sparse workloads. The authors focus on vector databases, which have gained attention due to large language models, and address challenges in sharding, communication overhead, and cold-start times. Vexless employs a global coordinator (orchestrator) to assign workloads to cloud function instances based on available hardware resources. It uses stateful cloud functions to reduce communication overhead and introduces a workload-aware strategy to minimize cold-start times. Experimental results show that Vexless significantly reduces costs, especially for bursty and sparse workloads, while achieving similar or higher query performance and accuracy compared to cloud VM instances. The system design includes a purpose-built sharding-based index and search strategy, an optimized communication mechanism, and a novel approach to reduce cold-start latency. Vexless is evaluated using synthetic and real-world workloads, demonstrating its effectiveness in various scenarios. The results highlight Vexless's superior performance-to-cost ratio and latency compared to other solutions.
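To make the coordinator-plus-shards idea concrete, the following is a minimal sketch of scatter-gather top-k search, not the paper's implementation. It assumes round-robin sharding and exact brute-force search inside each shard, whereas the summary only says Vexless uses a purpose-built sharding-based index. The names `make_shards`, `search_shard`, and `coordinator_search` are hypothetical, and the "function instances" are simulated as plain in-process calls rather than real cloud functions.

```python
import heapq
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_shards(vectors, num_shards):
    """Round-robin partition of (id, vector) pairs.

    Assumption: a stand-in for a real sharding policy, chosen for brevity.
    """
    shards = [[] for _ in range(num_shards)]
    for vec_id, vec in enumerate(vectors):
        shards[vec_id % num_shards].append((vec_id, vec))
    return shards


def search_shard(shard, query, k):
    """Work done by one (hypothetical) function instance: exact top-k over its shard."""
    scored = ((squared_l2(vec, query), vec_id) for vec_id, vec in shard)
    return heapq.nsmallest(k, scored)


def coordinator_search(shards, query, k):
    """Send the query to every shard, then merge the partial results into a global top-k."""
    partials = [search_shard(shard, query, k) for shard in shards]
    return heapq.nsmallest(k, (hit for part in partials for hit in part))


random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(200)]
query = [random.random() for _ in range(8)]

shards = make_shards(vectors, num_shards=4)
top5 = coordinator_search(shards, query, k=5)  # list of (distance, vector_id) pairs
```

Because each shard returns its own exact top-k, merging the partial lists yields the same result as searching the whole collection at once. With an approximate per-shard index, the merged result would be approximate as well.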