Announcing Vector Search for Amazon MemoryDB: Revolutionizing Real-Time ML and Generative AI Applications
Today, Amazon Web Services (AWS) announced the general availability of vector search for Amazon MemoryDB. This new capability lets you store, index, retrieve, and search vectors, enabling the development of real-time machine learning (ML) and generative artificial intelligence (AI) applications with in-memory performance and multi-Availability Zone (multi-AZ) durability.
Key Highlights of the Vector Search Feature
With this launch, Amazon MemoryDB offers the fastest vector search performance with the highest recall rates among popular vector databases on AWS. This means you no longer have to compromise on throughput, recall, and latency—three key factors that often conflict with one another in traditional database systems.
You can now use a single MemoryDB database to store your application data and millions of vectors, achieving single-digit millisecond query and update response times at the highest levels of recall. This simplifies your generative AI application architecture, delivering peak performance while reducing licensing costs, operational burdens, and the time required to derive insights from your data.
Use Cases for Vector Search in MemoryDB
The new vector search capability can be implemented for various use cases, including:
- Real-Time Semantic Search for Retrieval-Augmented Generation (RAG):
- Vector search can be used to retrieve relevant passages from a large corpus of data to augment a large language model (LLM). This involves splitting your document corpus into discrete chunks of text, generating vector embeddings for each chunk, and then loading those embeddings into Amazon MemoryDB.
- With RAG and MemoryDB, you can develop real-time generative AI applications that find similar products or content by representing items as vectors, or that search documents by representing text as dense vectors capturing semantic meaning.
- Low Latency Durable Semantic Caching:
- Semantic caching helps reduce computational costs by storing previous results from the foundation model (FM) in memory. You can store previously inferred answers alongside the vector representation of the question in MemoryDB and reuse them instead of generating new answers from the LLM.
- If a user’s query is semantically similar to a prior question, MemoryDB returns the answer to that prior question, providing a quicker response and reducing costs (see the sketch after this list).
- Real-Time Anomaly (Fraud) Detection:
- Vector search can supplement your rule-based and batch ML processes by storing transactional data represented as vectors, alongside metadata indicating whether those transactions were identified as fraudulent or valid.
- New transactions can then be flagged when they have high similarity to vectors representing known fraudulent transactions. With vector search for MemoryDB, you can detect fraud in real time by modeling fraudulent transactions with your batch ML models and loading both normal and fraudulent transactions into MemoryDB.
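As a rough illustration of the semantic-caching pattern, here is a minimal sketch in Python. It assumes a MemoryDB connection (`client`), a Bedrock embedding model (`embedding`), and a vector index named `cache_idx` over an `embed` vector field and an `answer` text field; all of these names, and the distance threshold, are hypothetical.

```python
import numpy as np
from redis.commands.search.query import Query

def cached_answer(client, embedding, question, radius=0.2):
    """Return a cached answer for a semantically similar prior question, or None."""
    vec = np.array(embedding.embed_query(question), dtype=np.float32).tobytes()
    query = (
        Query("@embed:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
        .sort_by("score")
        .return_fields("answer", "score")
        .paging(0, 1)
        .dialect(2)
    )
    docs = client.ft("cache_idx").search(query, {"radius": radius, "vec": vec}).docs
    # Cache hit: reuse the stored answer instead of invoking the LLM again.
    # (The right threshold depends on the distance metric and your data.)
    return docs[0].answer if docs else None
```

On a cache miss, the application would call the LLM and write the new question embedding and answer back into MemoryDB with HSET, as shown in the steps below.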
Getting Started with Vector Search for Amazon MemoryDB
To help you get started, here is a simple guide to implementing a semantic search application using vector search for MemoryDB.
Step 1: Create a Cluster to Support Vector Search
- You can create a MemoryDB cluster with vector search enabled in the MemoryDB console: choose "Enable vector search" under "Cluster settings" when creating or updating a cluster. Vector search is available for MemoryDB version 7.1 and requires a single-shard configuration.
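If you script your infrastructure, the same can be done with the AWS SDK. Below is a minimal boto3 sketch; the cluster name, node type, ACL, and especially the parameter group name (assumed here to be the vector-search-enabled default group for the 7.1 engine) are placeholder assumptions, so check the MemoryDB documentation for the exact values in your account.

```python
import boto3

memorydb = boto3.client("memorydb", region_name="us-east-1")

# Hypothetical names; the parameter group is assumed to be the
# vector-search-enabled default group for engine version 7.1.
memorydb.create_cluster(
    ClusterName="my-vector-cluster",
    NodeType="db.r7g.xlarge",
    EngineVersion="7.1",
    NumShards=1,  # vector search requires a single-shard configuration
    ParameterGroupName="default.memorydb-redis7.search",
    ACLName="open-access",
)
```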
Step 2: Create Vector Embeddings Using the Amazon Titan Embeddings Model
- Use Amazon Titan Text Embeddings or another embedding model available in Amazon Bedrock to create vector embeddings. Load your PDF file, split the text into chunks, and generate the embeddings with a single API call using the LangChain libraries that integrate with AWS services.
```python
import numpy as np
from redis import RedisCluster
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import BedrockEmbeddings

# Load a PDF file (pdf_path points to your source document) and split it into chunks
loader = PyPDFLoader(file_path=pdf_path)
text_splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", ".", " "],
    chunk_size=1000,
    chunk_overlap=200,
)
chunks = loader.load_and_split(text_splitter)

# Connect to the MemoryDB cluster that will store the chunks and embeddings
client = RedisCluster(
    host="mycluster.memorydb.us-east-1.amazonaws.com",
    port=6379,
    ssl=True,
    ssl_cert_reqs="none",
    decode_responses=True,
)

# Create the embedding client backed by Amazon Bedrock
embedding = BedrockEmbeddings(
    region_name="us-east-1",
    endpoint_url="https://bedrock-runtime.us-east-1.amazonaws.com",
)

# Save each embedding and its source text into MemoryDB using HSET
for id, chunk in enumerate(chunks):
    y = embedding.embed_documents([chunk.page_content])
    j = np.array(y, dtype=np.float32).tobytes()
    client.hset(f"oakDoc:{id}", mapping={"embed": j, "text": chunk.page_content})
```
Once you generate the vector embeddings using the Amazon Titan Text Embeddings model, connect to your MemoryDB cluster and save these embeddings using the MemoryDB HSET command.
Step 3: Create a Vector Index
- To query your vector data, create a vector index using the FT.CREATE command. Vector indexes are constructed and maintained over a subset of the MemoryDB keyspace. Vectors can be saved in JSON or HASH data types, and any modifications to the vector data are automatically reflected in the index.
```python
from redis.commands.search.field import TextField, VectorField

client.ft("testIndex").create_index([
    VectorField(
        "embed",
        "FLAT",
        {
            "TYPE": "FLOAT32",
            "DIM": 1536,
            "DISTANCE_METRIC": "COSINE",
        },
    ),
    TextField("text"),
])
```
MemoryDB supports four types of fields: number fields, tag fields, text fields, and vector fields. Vector fields support K-nearest neighbor searching (KNN) of fixed-sized vectors using the flat search (FLAT) and hierarchical navigable small worlds (HNSW) algorithms. The feature supports various distance metrics, such as Euclidean, cosine, and inner product.
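For larger vector collections where approximate nearest-neighbor search is acceptable, the same index could instead be declared with the HNSW algorithm. A minimal sketch follows; the M and EF_CONSTRUCTION values are illustrative assumptions, not tuned recommendations.

```python
from redis.commands.search.field import TextField, VectorField

client.ft("hnswIndex").create_index([
    VectorField(
        "embed",
        "HNSW",  # approximate KNN via a hierarchical navigable small world graph
        {
            "TYPE": "FLOAT32",
            "DIM": 1536,
            "DISTANCE_METRIC": "COSINE",
            "M": 16,                 # maximum graph edges per node
            "EF_CONSTRUCTION": 512,  # candidate-list size during graph construction
        },
    ),
    TextField("text"),
])
```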
Step 4: Search the Vector Space
- You can use FT.SEARCH and FT.AGGREGATE commands to query your vector data. Each operator uses one field in the index to identify a subset of the keys in the index. You can query and find filtered results by the distance between a vector field in MemoryDB and a query vector based on a predefined threshold (RADIUS).
```python
import numpy as np
from redis.commands.search.query import Query

VECTOR_DIMENSIONS = 1536  # must match the DIM of the vector index

# Build a range query over the vector field, yielding the distance as "score"
query = (
    Query("@embed:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
    .paging(0, 3)
    .sort_by("score")
    .return_fields("id", "score")
    .dialect(2)
)

# Find all vectors within a RADIUS of 0.8 of the query vector
query_params = {
    "radius": 0.8,
    "vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes(),
}
results = client.ft("testIndex").search(query, query_params).docs
```
For example, when using cosine similarity, the RADIUS value ranges from 0 to 1, with a value closer to 1 indicating vectors more similar to the search center.
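Besides range queries, the index also supports plain K-nearest-neighbor queries. Here is a minimal sketch reusing the client, index, and VECTOR_DIMENSIONS from the steps above (the random query vector stands in for a real embedding):

```python
import numpy as np
from redis.commands.search.query import Query

# Retrieve the 3 nearest neighbors to the query vector, closest first.
knn_query = (
    Query("*=>[KNN 3 @embed $vec AS score]")
    .sort_by("score")
    .return_fields("text", "score")
    .dialect(2)
)
docs = client.ft("testIndex").search(
    knn_query,
    {"vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes()},
).docs
```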
What’s New at General Availability (GA)
At re:Invent 2023, AWS released vector search for MemoryDB in preview. Based on customer feedback, several new features and improvements are now available:
- VECTOR_RANGE: Allows MemoryDB to operate as a low-latency, durable semantic cache, optimizing costs and performance for generative AI applications.
- SCORE: Enhances filtering on similarity during vector search.
- Shared Memory: Prevents duplication of vectors in memory by storing vectors within the MemoryDB keyspace and storing pointers to the vectors in the vector index.
- Performance Improvements: Enhances performance at high filtering rates, powering the most performance-intensive generative AI applications.
Availability and Next Steps
Vector search is now available in all AWS Regions where Amazon MemoryDB is available. To learn more, see the AWS documentation on vector search for Amazon MemoryDB.
You can try it out in the MemoryDB console and provide feedback through the AWS re:Post for Amazon MemoryDB or your usual AWS Support contacts.
By integrating vector search into Amazon MemoryDB, AWS continues to enhance its services, providing users with powerful tools to develop cutting-edge AI and ML applications efficiently. This new feature is set to revolutionize how businesses leverage in-memory databases for real-time data processing and AI-driven insights.