Announcing Mosaic AI Vector Search General Availability in Azure Databricks (2024)

Today, at Microsoft Build, we are thrilled to announce the general availability of Mosaic AI Vector Search in Azure Databricks. Vector Search is a serverless vector database that helps customers build high-quality Generative AI applications using Retrieval Augmented Generation (RAG). With its native integration in Azure Databricks, Vector Search supports automatic data synchronization from source to index, eliminating complex and costly pipeline maintenance. It also leverages the same security and data governance tools organizations have already built for peace of mind.

Challenges with Building Production-quality Generative AI Applications

The top challenge with deploying Generative AI is getting these applications to production quality. Traditionally, the primary focus for quality was finding the best Large Language Model (LLM). However, recent research shows that the LLM is only one of the factors that determine the quality of a GenAI-powered application. To achieve high quality, developers need to build an AI system that takes into account data preparation, the LLM, ranking and post-processing pipelines, prompt engineering, and more. The most popular type of AI system is Retrieval Augmented Generation, or RAG. RAG is an architectural approach that can improve the efficacy of large language models by augmenting the prompt with relevant data retrieved from a vector database.
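To make the RAG pattern concrete, here is a minimal, self-contained sketch of the retrieve-then-augment step. It is an illustration only, not the Vector Search API: the corpus, the bag-of-words `embed` function, and the prompt template are all hypothetical stand-ins, and a real application would use an embedding model and a vector index instead.

```python
import math

# Hypothetical toy corpus; in practice these would be chunks of your own documents.
DOCS = [
    "Vector Search synchronizes a Delta Table into a vector index.",
    "Unity Catalog governs access to data and indexes.",
    "Serverless infrastructure scales without configuring instances.",
]


def embed(text):
    """Toy embedding (assumption): word counts. A real system calls an embedding model."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,?!")
        if word:
            counts[word] = counts.get(word, 0) + 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(value * b.get(word, 0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, docs, k=1):
    """Return the k documents most similar to the question."""
    query_vec = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, docs, k=1):
    """Augment the question with retrieved context before sending it to an LLM."""
    context = "\n".join(retrieve(question, docs, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


prompt = build_prompt("Who governs access to indexes?", DOCS)
```

The resulting `prompt` string is what would be sent to the LLM, so the model answers from the retrieved context rather than from its training data alone.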

Vector Search in Azure Databricks enables developers to improve the accuracy of their RAG and generative AI applications through similarity search over unstructured documents such as PDFs, Office documents, wikis, and more. This enriches LLM queries with context and domain knowledge, improving the accuracy and quality of results.

Vector Search is part of the Azure Databricks Data Intelligence Platform, making it easy for your RAG and Generative AI applications to use the proprietary data stored in your data lakes in a fast and secure manner and deliver accurate responses. Unlike other databases, Vector Search supports automatic data synchronization from source to index, eliminating complex and costly pipeline maintenance. It leverages the same security and data governance tools organizations have already built for peace of mind. With its serverless design, Vector Search easily scales to support billions of embeddings and thousands of real-time queries per second.

Why do customers love Mosaic AI Vector Search in Azure Databricks?

We designed Vector Search to be fast, secure and easy to use.

Fast with low TCO - Vector Search is designed to deliver high performance at lower TCO, with up to 5x faster performance than other providers.
Automated data ingestion - Vector Search can synchronize any Delta Table into a vector index with one click, with no need for complex, custom-built data ingestion or sync pipelines.
Built-in governance - Vector Search uses the same Unity Catalog-based security and data governance tools that already power your Data Intelligence Platform, so you do not have to build and maintain a separate set of data governance policies for your unstructured data.
Best-in-class retrieval quality - Vector Search has been engineered to provide the highest recall out of the box compared to other providers.
Serverless scaling - Our serverless infrastructure scales automatically to match your workflows, without the need to configure instances and server types.

Automated Data Ingestion

Before a vector database can store information, it requires a data ingestion pipeline in which raw, unprocessed data from various sources is cleaned, processed (parsed and chunked), and embedded with an AI model before being stored as vectors in the database. Building and maintaining yet another set of data ingestion pipelines is expensive and time-consuming, and it diverts valuable engineering resources. Vector Search is fully integrated with the Azure Databricks Data Intelligence Platform, enabling it to automatically pull and embed data without the need to build and maintain new data pipelines.
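For a sense of what such a hand-built pipeline involves, the sketch below walks through the clean, chunk, and embed steps in plain Python. Everything here is an assumption made for illustration: the function names, the chunk sizes, and especially `toy_embed`, which is a deterministic placeholder rather than a real embedding model. It does not show how Vector Search performs these steps internally.

```python
def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping character windows (sizes are arbitrary here)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(chunk, dims=4):
    """Placeholder embedding (NOT a real model): bucketed character codes, normalized."""
    vec = [0.0] * dims
    for i, ch in enumerate(chunk):
        vec[i % dims] += ord(ch)
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def ingest(doc_id, raw_text):
    """Clean, chunk, and embed one document into rows ready for a vector store."""
    cleaned = " ".join(raw_text.split())  # collapse runs of whitespace and newlines
    return [
        {"id": f"{doc_id}-{i}", "text": chunk, "embedding": toy_embed(chunk)}
        for i, chunk in enumerate(chunk_text(cleaned))
    ]


rows = ingest(
    "doc1",
    "Vector   Search keeps an index in sync\nwith its source table, "
    "so no custom pipeline is needed.",
)
```

Each row carries an identifier, the chunk text, and its vector. Keeping such a pipeline correct as source data changes is the maintenance burden that automatic synchronization is meant to remove.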

Built-In Governance

Vector Search leverages the same security controls and data governance that already protect the rest of the Data Intelligence Platform, enabled by its integration with Unity Catalog. Vector indexes are stored as entities within your Unity Catalog and use the same unified interface to define policies on data, with fine-grained control over embeddings.

Best-in-Class Retrieval Quality

In any Retrieval-Augmented Generation (RAG) application, the cornerstone of delivering relevant and precise answers is the retrieval quality of the underlying search engine. Central to evaluating this quality is the metric known as recall. Recall measures the fraction of all relevant documents in a dataset that the search engine actually retrieves. Vector Search has been engineered to provide the highest recall out of the box compared to other providers. Vector Search leverages state-of-the-art machine learning models, optimized indexing strategies, and advanced query understanding techniques to ensure that every search captures the complete range of relevant documents. This capability sets Vector Search apart, offering our users an unmatched level of retrieval quality that enhances the overall effectiveness of their RAG applications.
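As a rough illustration of how recall is typically computed for a single query, the sketch below implements recall at k: the share of known-relevant documents that appear in the top k results. The function name, document IDs, and relevance labels are hypothetical, and this is not the benchmark methodology behind the comparison above.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k retrieved results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must contain at least one document")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical ranking returned by a search engine, best match first.
retrieved = ["d3", "d7", "d1", "d9", "d4"]
# Hypothetical ground-truth set of documents that are actually relevant.
relevant = ["d1", "d3", "d5"]

score = recall_at_k(retrieved, relevant, k=3)
```

Here two of the three relevant documents (`d1` and `d3`) appear in the top three results, so `score` is two thirds. In practice the metric is averaged over many queries with labeled relevant documents.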

GA Pricing

Vector Search will continue to be priced at $0.28/hr (eastus2), with a capacity of 2 million vectors per unit.
Each unit of Vector Search will include 30 GB of storage at no additional cost:
- Any usage above 30 GB will be charged at $0.23/GB/month (eastus2).
- Storage overage billing starts on August 1, 2024.
Capacity and storage prices vary by region.
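The eastus2 figures above can be turned into a back-of-the-envelope monthly estimate, as in the sketch below. The rates come from this post; the rest is assumption: that units are provisioned in whole numbers, that a month has 730 billable hours, and that the included 30 GB per unit is pooled across units before overage applies. Actual billing may differ, so treat this as a rough planning aid only.

```python
import math

# Figures quoted in this post for eastus2.
UNIT_PRICE_PER_HOUR = 0.28
VECTORS_PER_UNIT = 2_000_000
INCLUDED_GB_PER_UNIT = 30
OVERAGE_PRICE_PER_GB_MONTH = 0.23

# Assumption: average number of hours in a month, with the index running continuously.
HOURS_PER_MONTH = 730


def estimate_monthly_cost(num_vectors, storage_gb):
    """Rough monthly estimate; assumes whole units and pooled included storage."""
    units = max(1, math.ceil(num_vectors / VECTORS_PER_UNIT))
    compute = units * UNIT_PRICE_PER_HOUR * HOURS_PER_MONTH
    overage_gb = max(0.0, storage_gb - units * INCLUDED_GB_PER_UNIT)
    storage = overage_gb * OVERAGE_PRICE_PER_GB_MONTH
    return {
        "units": units,
        "compute_usd": round(compute, 2),
        "storage_usd": round(storage, 2),
        "total_usd": round(compute + storage, 2),
    }


# Hypothetical workload: 5 million vectors and 120 GB of stored data.
estimate = estimate_monthly_cost(5_000_000, 120)
```

For this hypothetical workload the estimate needs 3 units, which include 90 GB of storage, leaving 30 GB billed as overage.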

Next Steps

Get started by reading our documentation, in particular the guide to creating a Vector Search index.