Amazon MemoryDB Introduces General Availability of Vector Search Feature

Announcing Vector Search for Amazon MemoryDB: Revolutionizing Real-Time ML and Generative AI Applications

Today, Amazon Web Services (AWS) announced the general availability of vector search for Amazon MemoryDB. This new capability allows you to store, index, retrieve, and search vectors, enabling the development of real-time machine learning (ML) and generative artificial intelligence (AI) applications with in-memory performance and multi-Availability Zone (multi-AZ) durability.

Key Highlights of the Vector Search Feature

With this launch, Amazon MemoryDB offers the fastest vector search performance with the highest recall rates among popular vector databases on AWS. This means you no longer have to trade off among throughput, recall, and latency, three key factors that often conflict with one another in traditional database systems.

You can now use a single MemoryDB database to store your application data and millions of vectors, achieving single-digit millisecond query and update response times at the highest levels of recall. This simplifies your generative AI application architecture, delivering peak performance while reducing licensing costs, operational burdens, and the time required to derive insights from your data.

Use Cases for Vector Search in MemoryDB

The new vector search capability can be implemented for various use cases, including:

  1. Real-Time Semantic Search for Retrieval-Augmented Generation (RAG):
    • Vector search can be used to retrieve relevant passages from a large corpus of data to augment a large language model (LLM). This involves chunking your document corpus into discrete text buckets, generating vector embeddings for each chunk, and then loading these embeddings into Amazon MemoryDB.
    • With RAG and MemoryDB, you can develop real-time generative AI applications to find similar products or content by representing items as vectors, or search documents by representing text documents as dense vectors that capture semantic meaning.
  2. Low Latency Durable Semantic Caching:
    • Semantic caching helps reduce computational costs by storing previous results from the foundation model (FM) in memory. You can store previously inferred answers alongside the vector representation of the question in MemoryDB and reuse them instead of generating new answers from the LLM (see the sketch after this list).
    • If a user's query is semantically similar to a prior question, MemoryDB will return the answer to that prior question, providing a quicker response and reducing costs.
  3. Real-Time Anomaly (Fraud) Detection:
    • Vector search can supplement your rule-based and batch ML processes by storing transactional data represented by vectors, alongside metadata indicating whether those transactions were identified as fraudulent or valid.
    • The ML processes can detect fraudulent transactions when new transactions have high similarity to vectors representing fraudulent transactions. With vector search for MemoryDB, you can quickly detect fraud by modeling fraudulent transactions based on your batch ML models and loading normal and fraudulent transactions into MemoryDB.
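
To make the semantic-caching pattern concrete, here is a minimal sketch in Python. It assumes a vector index named cacheIndex over hashes that hold an embed vector field and an answer text field, and it uses two hypothetical helpers, embed_text() and call_llm(), as stand-ins for your embedding model and foundation model calls; adapt the names and the distance threshold to your setup.

```python
import hashlib

import numpy as np
from redis.cluster import RedisCluster
from redis.commands.search.query import Query


def cached_answer(client: RedisCluster, question: str, radius: float = 0.2) -> str:
    """Answer a question, reusing a cached answer for semantically similar questions."""
    vec = np.array(embed_text(question), dtype=np.float32).tobytes()  # embed_text() is hypothetical

    # VECTOR_RANGE finds prior questions whose embeddings fall within `radius`
    # of the new question's embedding (smaller distance = more similar).
    query = (
        Query("@embed:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
        .sort_by("score")
        .return_fields("answer", "score")
        .paging(0, 1)
        .dialect(2)
    )
    docs = client.ft("cacheIndex").search(query, {"radius": radius, "vec": vec}).docs
    if docs:
        return docs[0].answer  # cache hit: skip the LLM call entirely

    # Cache miss: call the foundation model, then store question vector + answer.
    answer = call_llm(question)  # call_llm() is hypothetical
    key = "cache:" + hashlib.sha256(question.encode()).hexdigest()
    client.hset(key, mapping={"embed": vec, "answer": answer})
    return answer
```

Tuning radius controls how aggressively the cache matches paraphrased questions: a smaller value only reuses answers for near-identical queries.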

Getting Started with Vector Search for Amazon MemoryDB

To help you get started, here is a simple guide to implementing a semantic search application using vector search for MemoryDB.

Step 1: Create a Cluster to Support Vector Search

You can create a MemoryDB cluster with vector search enabled from the MemoryDB console: choose "Enable vector search" under "Cluster settings" when creating or updating a cluster. Vector search is available for MemoryDB version 7.1 and a single-shard configuration.
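
If you prefer to script cluster creation, the boto3 call below is a minimal sketch. The node type, ACL name, and especially the parameter group name are placeholders, not confirmed values: check the MemoryDB documentation for the exact search-enabled parameter group in your Region. The console checkbox described above is the documented path.

```python
import boto3

memorydb = boto3.client("memorydb", region_name="us-east-1")

# Assumption: vector search is enabled through a search-enabled parameter
# group; the group name below is a placeholder to verify in your Region.
response = memorydb.create_cluster(
    ClusterName="my-vector-cluster",
    NodeType="db.r7g.xlarge",             # placeholder node type
    EngineVersion="7.1",
    NumShards=1,                          # vector search requires a single shard
    ACLName="open-access",                # placeholder ACL
    ParameterGroupName="default.memorydb-redis7.search",
    TLSEnabled=True,
)
print(response["Cluster"]["Status"])
```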

Step 2: Create Vector Embeddings Using the Amazon Titan Embeddings Model

Use Amazon Titan Text Embeddings or other embedding models available in Amazon Bedrock to create vector embeddings. Load your PDF file, split the text into chunks, and get vector data using a single API with LangChain libraries integrated with AWS services.

```python
import numpy as np
from redis.cluster import RedisCluster
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import BedrockEmbeddings

# Load a PDF file and split the document into overlapping text chunks
pdf_path = "document.pdf"  # placeholder: path to your source PDF
loader = PyPDFLoader(file_path=pdf_path)
text_splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", ".", " "],
    chunk_size=1000,
    chunk_overlap=200,
)
chunks = loader.load_and_split(text_splitter)

# Connect to the MemoryDB cluster that stores the chunks and embeddings
client = RedisCluster(
    host="mycluster.memorydb.us-east-1.amazonaws.com",
    port=6379,
    ssl=True,
    ssl_cert_reqs="none",
    decode_responses=True,
)

# Amazon Titan Text Embeddings via Amazon Bedrock
embedding = BedrockEmbeddings(
    region_name="us-east-1",
    endpoint_url="https://bedrock-runtime.us-east-1.amazonaws.com",
)

# Save each embedding and its source text into MemoryDB using HSET
for i, chunk in enumerate(chunks):
    vec = embedding.embed_documents([chunk.page_content])
    buf = np.array(vec, dtype=np.float32).tobytes()
    client.hset(f"oakDoc:{i}", mapping={"embed": buf, "text": chunk.page_content})
```

Once you generate the vector embeddings using the Amazon Titan Text Embeddings model, connect to your MemoryDB cluster and save these embeddings using the MemoryDB HSET command.

Step 3: Create a Vector Index

To query your vector data, create a vector index using the FT.CREATE command. Vector indexes are constructed and maintained over a subset of the MemoryDB keyspace. Vectors can be saved in JSON or HASH data types, and any modifications to the indexed data are automatically reflected in the index.

```python
from redis.commands.search.field import TextField, VectorField

# Create a FLAT vector index over the "embed" field, plus a text field
client.ft("testIndex").create_index([
    VectorField(
        "embed",
        "FLAT",
        {
            "TYPE": "FLOAT32",
            "DIM": 1536,
            "DISTANCE_METRIC": "COSINE",
        },
    ),
    TextField("text"),
])
```

MemoryDB supports four types of fields: number fields, tag fields, text fields, and vector fields. Vector fields support K-nearest neighbor (KNN) searching of fixed-size vectors using the flat search (FLAT) and hierarchical navigable small worlds (HNSW) algorithms. The feature supports various distance metrics, such as Euclidean, cosine, and inner product.
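
For larger collections where exact FLAT search becomes expensive, the same schema can be built with the HNSW algorithm instead. The sketch below assumes a hypothetical index name, hnswIndex; the M and EF_CONSTRUCTION build parameters are illustrative values that trade memory and build time for recall.

```python
from redis.commands.search.field import TextField, VectorField

# Approximate KNN index using HNSW; M and EF_CONSTRUCTION are illustrative,
# not tuned values. Higher values generally improve recall at extra cost.
client.ft("hnswIndex").create_index([
    VectorField(
        "embed",
        "HNSW",
        {
            "TYPE": "FLOAT32",
            "DIM": 1536,
            "DISTANCE_METRIC": "COSINE",
            "M": 16,
            "EF_CONSTRUCTION": 200,
        },
    ),
    TextField("text"),
])
```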

Step 4: Search the Vector Space

You can use the FT.SEARCH and FT.AGGREGATE commands to query your vector data. Each operator uses one field in the index to identify a subset of the keys in the index. You can query and filter results by the distance between a vector field in MemoryDB and a query vector, based on a predefined threshold (RADIUS).

```python
import numpy as np
from redis.commands.search.query import Query

VECTOR_DIMENSIONS = 1536  # must match the DIM used when creating the index

# Range query: filter by distance to the query vector, yielding it as "score"
query = (
    Query("@embed:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
    .paging(0, 3)
    .sort_by("score")
    .return_fields("id", "score")
    .dialect(2)
)

# Find all vectors within 0.8 of the query vector (random here, for illustration)
query_params = {
    "radius": 0.8,
    "vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes(),
}

results = client.ft("testIndex").search(query, query_params).docs
```

For example, when using cosine similarity, the RADIUS value ranges from 0 to 1, with a value closer to 1 indicating vectors more similar to the search center.
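
Alongside range queries, the KNN syntax mentioned in Step 3 returns a fixed number of nearest neighbors regardless of distance. Here is a minimal sketch against the same testIndex, using a random query vector for illustration:

```python
import numpy as np
from redis.commands.search.query import Query

# KNN query: return the 3 nearest neighbors of the query vector
knn_query = (
    Query("*=>[KNN 3 @embed $vec AS score]")
    .sort_by("score")
    .return_fields("text", "score")
    .dialect(2)
)

params = {"vec": np.random.rand(1536).astype(np.float32).tobytes()}
docs = client.ft("testIndex").search(knn_query, params).docs
for doc in docs:
    print(doc.score, doc.text[:80])  # distance, then a snippet of the matched chunk
```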

What's New at General Availability (GA)

At re:Invent 2023, AWS released vector search for MemoryDB in preview. Based on customer feedback, several new features and improvements are now available:

    • VECTOR_RANGE: Allows MemoryDB to operate as a low latency durable semantic cache, optimizing costs and performance for generative AI applications.
    • SCORE: Enhances filtering on similarity during vector search.
    • Shared Memory: Prevents duplication of vectors in memory by storing vectors within the MemoryDB keyspace and storing pointers to the vectors in the vector index.
    • Performance Improvements: Enhances performance at high filtering rates, powering the most performance-intensive generative AI applications.

Availability and Next Steps

Vector search is now available in all Regions where MemoryDB is currently available. To learn more, visit the AWS documentation on vector search for Amazon MemoryDB.

You can try it out in the MemoryDB console and provide feedback through AWS re:Post for Amazon MemoryDB or your usual AWS Support contacts.

By integrating vector search into Amazon MemoryDB, AWS continues to enhance its services, providing users with powerful tools to develop cutting-edge AI and ML applications efficiently. This new feature is set to revolutionize how businesses leverage in-memory databases for real-time data processing and AI-driven insights.

For more information, refer to this article.

Neil S
Neil is a highly qualified Technical Writer with an M.Sc (IT) degree and an impressive range of IT and support certifications, including MCSE, CCNA, ACA (Adobe Certified Associate), and PG Dip (IT). With over 10 years of hands-on experience as an IT support engineer across Windows, Mac, iOS, and Linux Server platforms, Neil possesses the expertise to create comprehensive and user-friendly documentation that simplifies complex technical concepts for a wide audience.