Overview of Vertex AI Vector Search


Vector Search is based on vector search technology developed by Google Research, the same technology that underpins Google products such as Google Search, YouTube, and Play. It can search billions of items for those that are semantically similar or related to a query, which makes it a good fit for recommendation engines, search engines, chatbots, and text classification.

What is Vector Search?

Vector Search relies on embeddings: vector representations that capture the semantic meaning of data such as text, images, audio, and video. Items with similar meaning map to vectors that lie close together, so Vector Search answers a query by finding the stored vectors nearest to the query's embedding. It does this quickly even when the index holds billions of vectors.
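
To make "matching vectors" concrete, here is a minimal sketch that ranks a few items by cosine similarity to a query. The three-dimensional vectors and item names are invented for illustration; real embedding models produce hundreds of dimensions, and cosine similarity is only one of several distance measures a vector index might use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, hand-written for this example only.
items = {
    "yellow sundress": [0.9, 0.8, 0.1],
    "lemon maxi dress": [0.8, 0.9, 0.2],
    "black winter coat": [0.1, 0.2, 0.9],
}

# Stand-in for the embedding of a text query.
query = [0.85, 0.85, 0.1]

# Most similar item first.
ranked = sorted(
    items,
    key=lambda name: cosine_similarity(query, items[name]),
    reverse=True,
)
print(ranked)
```

The two dresses rank ahead of the coat because their vectors point in nearly the same direction as the query vector, regardless of the words used to describe them.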

Use Case: Online Retail

Imagine an online retailer with a vast inventory of clothing items. By using the multimodal embeddings API, the retailer can generate an embedding for each item and use Vector Search to match those items to text queries. For instance, a search for “yellow summer dress” would return images of the most semantically similar dresses, even if the exact phrase doesn’t appear in any item description. Shoppers get more relevant results without the retailer having to tag every item by hand.

How to Use Vector Search for Semantic Matching

  1. Generate Embeddings: Create embeddings for your dataset, either with external tools or with Generative AI on Vertex AI, which supports both text and multimodal embeddings.
  2. Add Your Embeddings to Cloud Storage: Upload the embeddings to Google Cloud Storage so that Vector Search can read them.
  3. Upload to Vector Search: Connect the embeddings to Vector Search, create an index from them, and deploy the index to an endpoint so that you can run queries and obtain results.
  4. Evaluate the Results: After retrieving the approximate nearest neighbors, check how well they meet your needs. If the results are not accurate enough, adjust the algorithm parameters; if you need to serve more queries, scale the deployment.
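
As a rough illustration of step 2, the sketch below serializes embeddings as JSON Lines before upload. It assumes an input layout of one JSON object per line with an `id`, an `embedding` array, and an optional `restricts` list; confirm these field names against the current Vector Search input-format documentation. The helper `to_jsonl`, the item IDs, and the vectors are hypothetical.

```python
import json


def to_jsonl(records):
    """Serialize (id, embedding, restricts) tuples as one JSON object per line."""
    lines = []
    for item_id, embedding, restricts in records:
        row = {"id": str(item_id), "embedding": [float(x) for x in embedding]}
        if restricts:
            # Assumed layout: a list of {"namespace": ..., "allow": [...]} objects.
            row["restricts"] = [
                {"namespace": namespace, "allow": list(values)}
                for namespace, values in restricts.items()
            ]
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"


# Hypothetical records: (id, embedding, restricts or None).
records = [
    ("dress-001", [0.9, 0.8, 0.1], {"category": ["dresses"]}),
    ("coat-001", [0.1, 0.2, 0.9], None),
]

payload = to_jsonl(records)
print(payload, end="")
```

The resulting text can be saved to a file and copied to a Cloud Storage bucket, from which the index is then built.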

Key Terminology

  • Vector: A list of float values that, taken together, has a magnitude and a direction.
  • Embedding: A vector that captures the semantic meaning of data, often created using machine learning techniques.
  • Index: A collection of vectors used for similarity search, which can be queried to find the nearest neighbors.
  • Ground Truth: Real-world reference data used to verify the accuracy of machine learning results. For nearest neighbor search, this is the set of exact nearest neighbors for a query.
  • Recall: The percentage of the true nearest neighbors (according to the ground truth) that the index actually returns.
  • Restrict: Functionality that limits a search to a subset of the index by applying Boolean rules; also known as filtering.
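
The definition of recall can be sketched in a few lines. The function name `recall_at_k` and the item IDs are hypothetical; the ground truth list is assumed to come from an exact (brute-force) search over the same data.

```python
def recall_at_k(returned_ids, true_ids):
    """Percentage of the true nearest neighbors that appear in the returned set."""
    true_set = set(true_ids)
    if not true_set:
        raise ValueError("true_ids must not be empty")
    hits = len(true_set & set(returned_ids))
    return 100.0 * hits / len(true_set)


# Ground truth: the 20 exact nearest neighbors for one query.
ground_truth = [f"item-{i}" for i in range(20)]

# The approximate search found 19 of them plus one item that does not belong.
returned = ground_truth[:19] + ["item-99"]

recall = recall_at_k(returned, ground_truth)
print(f"recall: {recall:.1f}%")  # prints "recall: 95.0%"
```

In practice, recall is averaged over many queries, since a single query says little about the index as a whole.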

Example Workflow

  1. Generate Embeddings: Use Generative AI on Vertex AI to create embeddings for your items, ensuring they capture the semantic essence of the data.
  2. Upload to Cloud Storage: Store these embeddings in Google Cloud Storage for easy access by the Vector Search service.
  3. Create and Deploy an Index: Use the embeddings to create an index. Deploy this index to an endpoint to enable querying.
  4. Run and Evaluate Queries: Execute queries against the index to find semantically similar items. Evaluate and refine the results to meet your specific needs.
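
The shape of this workflow can be tried locally with a toy stand-in. The sketch below is not the Vertex AI SDK: `ToyIndex` and its methods are hypothetical names, and it performs an exact search in memory, whereas the managed service uses approximate nearest neighbor search behind a deployed endpoint. The vectors and tags are invented.

```python
import math


class ToyIndex:
    """Exact, in-memory stand-in for a deployed index (illustration only)."""

    def __init__(self):
        self._rows = {}  # item id -> (unit-length vector, tags)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    def upsert(self, item_id, vector, tags=()):
        """Add or overwrite one item."""
        self._rows[item_id] = (self._normalize(vector), frozenset(tags))

    def query(self, vector, k=2, allow=None):
        """Return the k most similar ids; `allow` keeps only items with a matching tag."""
        q = self._normalize(vector)
        scored = []
        for item_id, (unit, tags) in self._rows.items():
            if allow is not None and not tags & set(allow):
                continue  # filtered out by the restrict
            score = sum(a * b for a, b in zip(q, unit))  # cosine similarity
            scored.append((score, item_id))
        scored.sort(reverse=True)
        return [item_id for _, item_id in scored[:k]]


index = ToyIndex()
index.upsert("dress-1", [0.9, 0.8, 0.1], tags=["dresses"])
index.upsert("dress-2", [0.8, 0.9, 0.2], tags=["dresses"])
index.upsert("coat-1", [0.1, 0.2, 0.9], tags=["outerwear"])
index.upsert("scarf-1", [0.7, 0.7, 0.3], tags=["outerwear"])

query = [0.85, 0.85, 0.1]
top = index.query(query, k=2)
filtered = index.query(query, k=2, allow=["outerwear"])
print(top, filtered)
```

Because this search is exact, its output can also serve as ground truth when measuring the recall of an approximate index built over the same data.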


Vector Search on Vertex AI brings semantic search to a wide range of applications. By combining semantic embeddings with Google’s infrastructure, businesses can run similarity search that is fast, scalable, and accurate. The same workflow of embedding, indexing, querying, and evaluating applies whether the use case is retail, content recommendation, or complex data analysis.