Build a Google Drive RAG System in Apps Script

๐Ÿš€ Apps Script + Gemini Mastery โ€” Issue #4

https://github.com/lsvekis/rag-script-Gemini-Rag

Bring context-aware AI answers to your own documents โ€” summaries, explanations, knowledge extraction, and more.


โญ What You Will Build in Issue #4

A fully functioning RAG pipeline, completely in Apps Script:

  1. ๐Ÿ—‚ Read files from Google Drive (Docs, Sheets, PDFs, text files)
  2. โœ‚๏ธ Chunk content into manageable segments
  3. ๐Ÿ”ข Embed content using Gemini embedding models
  4. ๐Ÿงฎ Store embeddings in a Sheet for fast retrieval
  5. ๐Ÿ” Search for relevant chunks using cosine similarity
  6. ๐Ÿค– Generate an AI answer with live context
  7. ๐Ÿ“„ Insert results into a Doc, or display in a sidebar

This is the same technique used in advanced AI apps โ€” now fully achievable inside Apps Script.


๐Ÿง  Learning Objectives

By the end of this issue, you will know how to:

โœ” Build embeddings in Apps Script
โœ” Store and index vector embeddings
โœ” Search a vector store using cosine similarity
โœ” Create a context-aware RAG prompt
โœ” Combine retrieved content with user questions
โœ” Answer questions grounded in real Drive documents


๐Ÿ”ง System Overview

Your workflow will look like this:

Drive Files โ†’ Extract Text โ†’ Chunk โ†’ Embed โ†’ Store in Sheet
User Query โ†’ Embed Query โ†’ Similarity Search โ†’ Top Chunks โ†’ Gemini Answer

๐Ÿงฉ EXERCISE โ€” Build the RAG System

Weโ€™ll implement this in phases.


1๏ธโƒฃ Create the Vector Store Sheet

Open a new Google Sheet titled:

RAG_Embeddings

Add headers in Row 1:

| id | chunk | embedding |

Apps Script will populate the rest automatically.


2๏ธโƒฃ Add Menu for RAG Tools

Code.gs

function onOpen() {
  DocumentApp.getUi()
    .createMenu("AI Tools")
    .addItem("RAG: Index Drive Folder", "ragIndexFolder")
    .addItem("RAG: Ask Question", "showRagSidebar")
    .addToUi();
}

function showRagSidebar() {
  const html = HtmlService.createHtmlOutputFromFile("RagSidebar")
    .setTitle("RAG Question Answering");
  DocumentApp.getUi().showSidebar(html);
}

3๏ธโƒฃ Sidebar UI for Asking Questions

RagSidebar.html

<div style="font-family: Arial; padding: 14px;">
  <h2>Ask Your Documents</h2>

  <label>Enter your question:</label><br>
  <input id="question" style="width:100%; margin-bottom:10px;" />

  <button onclick="ask()">Ask</button>

  <pre id="output" style="white-space: pre-wrap; margin-top:20px;"></pre>

  <script>
    function ask() {
      google.script.run.withSuccessHandler(function(res) {
        document.getElementById("output").textContent = res;
      }).ragAnswer(
        document.getElementById("question").value
      );
    }
  </script>
</div>

4๏ธโƒฃ Extract Text From Drive Files

RagExtract.gs

function extractTextFromFile(file) {
  const mime = file.getMimeType();

  if (mime === MimeType.GOOGLE_DOCS) {
    return DocumentApp.openById(file.getId()).getBody().getText();
  }

  if (mime === MimeType.GOOGLE_SHEETS) {
    const sheet = SpreadsheetApp.openById(file.getId()).getSheets()[0];
    return sheet.getDataRange().getDisplayValues().flat().join(" ");
  }

  if (mime === MimeType.PDF) {
    return DriveApp.getFileById(file.getId()).getBlob().getDataAsString();
  }

  return file.getBlob().getDataAsString();
}

5๏ธโƒฃ Chunking Function

RagChunks.gs

function chunkText(text, size = 700) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.substring(i, i + size));
  }
  return chunks;
}

6๏ธโƒฃ Generate Embeddings Using Gemini

We will use text-embedding-004 from your model list.

GeminiEmbedding.gs

const EMBEDDING_MODEL = "text-embedding-004";

function embedText(text) {
  const url = `https://generativelanguage.googleapis.com/v1/models/${EMBEDDING_MODEL}:embedContent?key=${GEMINI_API_KEY}`;

  const payload = { content: { parts: [{ text }] } };

  const res = UrlFetchApp.fetch(url, {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify(payload),
    muteHttpExceptions: true
  });

  const json = JSON.parse(res.getContentText());
  return json.embedding?.values || [];
}

7๏ธโƒฃ Store Embeddings in Sheet

RagIndex.gs

function ragIndexFolder() {
  const folder = DriveApp.getFolderById("YOUR_FOLDER_ID");
  const files = folder.getFiles();

  const sheet = SpreadsheetApp.openById("YOUR_SHEET_ID").getSheetByName("RAG_Embeddings");
  sheet.clear();
  sheet.appendRow(["id", "chunk", "embedding"]);

  while (files.hasNext()) {
    const file = files.next();
    const text = extractTextFromFile(file);
    const chunks = chunkText(text);

    chunks.forEach((chunk, idx) => {
      const vector = embedText(chunk);
      sheet.appendRow([`${file.getName()}_${idx}`, chunk, JSON.stringify(vector)]);
    });
  }
}

8๏ธโƒฃ Cosine Similarity

RagSimilarity.gs

function cosine(v1, v2) {
  let dot = 0, a = 0, b = 0;
  for (let i = 0; i < v1.length; i++) {
    dot += v1[i] * v2[i];
    a += v1[i] * v1[i];
    b += v2[i] * v2[i];
  }
  return dot / (Math.sqrt(a) * Math.sqrt(b));
}

9๏ธโƒฃ Retrieve Relevant Chunks

RagRetrieve.gs

function ragRetrieve(queryVector, topK = 5) {
  const sheet = SpreadsheetApp.openById("YOUR_SHEET_ID").getSheetByName("RAG_Embeddings");
  const data = sheet.getDataRange().getValues().slice(1);

  const scored = data.map(row => {
    const embedding = JSON.parse(row[2]);
    return { chunk: row[1], score: cosine(queryVector, embedding) };
  });

  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, topK).map(x => x.chunk).join("\n\n");
}

๐Ÿ”Ÿ Generate the Grounded Answer

RagAnswer.gs

function ragAnswer(question) {
  const queryVector = embedText(question);
  const context = ragRetrieve(queryVector);

  const prompt = `
You are a helpful assistant.
Use ONLY the context below to answer the user's question.

Context:
${context}

Question:
${question}

Answer using clear, grounded reasoning.
`;

  return callGemini(prompt, "");
}

๐ŸŽ‰ What You Built in Issue #4

You now have a working RAG engine inside Apps Script, including:

  • Text extraction
  • Chunking
  • Embeddings
  • Vector storage
  • Semantic search
  • Context retrieval
  • Grounded Gemini answers

This is the foundation for:

๐Ÿ”ฅ Enterprise search
๐Ÿ”ฅ Course material Q&A
๐Ÿ”ฅ Knowledge extraction
๐Ÿ”ฅ Drive-based chatbots

And it’s all running directly inside Workspace.

Single-file RAG Script (Drive + Gemini)

/************************************************************
 *  Apps Script + Gemini RAG over Google Drive (Single File)
 *  - Index a Drive folder
 *  - Store embeddings in a Sheet
 *  - Ask questions from a Doc sidebar
 *
 *  BEFORE USE:
 *    1) Set GEMINI_API_KEY
 *    2) Set RAG_SHEET_ID  (Spreadsheet that stores embeddings)
 *    3) Set RAG_FOLDER_ID (Drive folder to index)
 ************************************************************/

// === CONFIG =================================================

// Gemini API key (do NOT commit real keys)
const GEMINI_API_KEY   = "YOUR_API_KEY_HERE";

// Text model for answering questions
const GEMINI_MODEL     = "gemini-2.5-flash";

// Embedding model for RAG
const EMBEDDING_MODEL  = "text-embedding-004";

// Spreadsheet that stores embeddings
const RAG_SHEET_ID     = "YOUR_SHEET_ID_HERE";   // Spreadsheet ID from URL
const RAG_SHEET_NAME   = "RAG_Embeddings";

// Drive folder to index
const RAG_FOLDER_ID    = "YOUR_FOLDER_ID_HERE";  // Folder ID from URL


// === UI: MENU + SIDEBAR =====================================

function onOpen() {
  DocumentApp.getUi()
    .createMenu("AI Tools")
    .addItem("RAG: Index Drive Folder", "ragIndexFolder")
    .addItem("RAG: Ask Question", "showRagSidebar")
    .addToUi();
}

function showRagSidebar() {
  const html = HtmlService.createHtmlOutput(ragSidebarHtml())
    .setTitle("RAG Question Answering");
  DocumentApp.getUi().showSidebar(html);
}

function ragSidebarHtml() {
  return `
  <div style="font-family: Arial, sans-serif; padding: 14px;">
    <h2>Ask Your Drive Docs</h2>
    <p>Select <b>AI Tools โ†’ RAG: Index Drive Folder</b> first if you haven't indexed yet.</p>

    <label for="question"><b>Your question</b></label><br/>
    <input id="question"
           style="width: 100%; margin-bottom: 10px;"
           placeholder="e.g., What are the key policies in this folder?" />

    <button onclick="ask()" style="margin-top: 6px;">Ask</button>

    <pre id="output" style="white-space: pre-wrap; margin-top: 16px;"></pre>

    <script>
      function ask() {
        var q = document.getElementById('question').value;
        document.getElementById('output').textContent = "Thinking...";
        google.script.run
          .withSuccessHandler(function(res) {
            document.getElementById('output').textContent = res;
          })
          .ragAnswer(q);
      }
    </script>
  </div>
  `;
}


// === GEMINI HELPERS ========================================

function callGemini(prompt, text) {
  if (!GEMINI_API_KEY || GEMINI_API_KEY === "YOUR_API_KEY_HERE") {
    return "Gemini API key is not configured. Set GEMINI_API_KEY at the top of the script.";
  }

  const url = "https://generativelanguage.googleapis.com/v1/models/" +
              GEMINI_MODEL + ":generateContent?key=" + GEMINI_API_KEY;

  const payload = {
    contents: [{
      parts: [{
        text: prompt + (text ? "\n\n---\n\n" + text : "")
      }]
    }]
  };

  const res = UrlFetchApp.fetch(url, {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify(payload),
    muteHttpExceptions: true
  });

  const body = res.getContentText();
  const json = JSON.parse(body);

  if (json.candidates &&
      json.candidates[0] &&
      json.candidates[0].content &&
      json.candidates[0].content.parts &&
      json.candidates[0].content.parts[0].text) {
    return json.candidates[0].content.parts[0].text;
  }

  if (json.error) {
    return "Gemini API error: " + json.error.message +
           (json.error.status ? " (" + json.error.status + ")" : "");
  }

  return "Unexpected Gemini response:\n" + body;
}

function embedText(text) {
  if (!GEMINI_API_KEY || GEMINI_API_KEY === "YOUR_API_KEY_HERE") {
    throw new Error("Gemini API key is not configured. Set GEMINI_API_KEY at the top of the script.");
  }

  const url = "https://generativelanguage.googleapis.com/v1/models/" +
              EMBEDDING_MODEL + ":embedContent?key=" + GEMINI_API_KEY;

  const payload = {
    content: {
      parts: [{ text: text }]
    }
  };

  const res = UrlFetchApp.fetch(url, {
    method: "post",
    contentType: "application/json",
    payload: JSON.stringify(payload),
    muteHttpExceptions: true
  });

  const body = res.getContentText();
  const json = JSON.parse(body);

  if (json.error) {
    throw new Error("Embedding API error: " + json.error.message +
                    (json.error.status ? " (" + json.error.status + ")" : ""));
  }

  if (json.embedding && json.embedding.values) {
    return json.embedding.values;
  }

  throw new Error("Unexpected embedding response: " + body);
}


// === TEXT EXTRACTION & CHUNKING ============================

function extractTextFromFile(file) {
  const mime = file.getMimeType();

  if (mime === MimeType.GOOGLE_DOCS) {
    return DocumentApp.openById(file.getId()).getBody().getText();
  }

  if (mime === MimeType.GOOGLE_SHEETS) {
    const sheet = SpreadsheetApp.openById(file.getId()).getSheets()[0];
    return sheet.getDataRange().getDisplayValues().join(" ");
  }

  if (mime === MimeType.PDF) {
    return file.getBlob().getDataAsString();
  }

  // Fallback: treat as plain text
  return file.getBlob().getDataAsString();
}

function chunkText(text, size) {
  const chunkSize = size || 700;
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.substring(i, i + chunkSize));
  }
  return chunks;
}


// === INDEXING (BUILD VECTOR STORE) =========================

function ragIndexFolder() {
  if (!RAG_FOLDER_ID || RAG_FOLDER_ID === "YOUR_FOLDER_ID_HERE") {
    throw new Error("RAG_FOLDER_ID is not set. Update it at the top of the script.");
  }
  if (!RAG_SHEET_ID || RAG_SHEET_ID === "YOUR_SHEET_ID_HERE") {
    throw new Error("RAG_SHEET_ID is not set. Update it at the top of the script.");
  }

  const folder = DriveApp.getFolderById(RAG_FOLDER_ID);
  const files = folder.getFiles();

  const ss = SpreadsheetApp.openById(RAG_SHEET_ID);
  let sheet = ss.getSheetByName(RAG_SHEET_NAME);
  if (!sheet) {
    sheet = ss.insertSheet(RAG_SHEET_NAME);
  }
  sheet.clear();
  sheet.appendRow(["id", "chunk", "embedding"]);

  while (files.hasNext()) {
    const file = files.next();
    const text = extractTextFromFile(file);
    const chunks = chunkText(text);

    for (let i = 0; i < chunks.length; i++) {
      const chunk = chunks[i];
      if (!chunk || !chunk.trim()) continue;

      const vector = embedText(chunk);
      sheet.appendRow([file.getName() + "_" + i, chunk, JSON.stringify(vector)]);
    }
  }
}


// === SIMILARITY + RETRIEVAL ================================

function cosineSimilarity(v1, v2) {
  let dot = 0;
  let mag1 = 0;
  let mag2 = 0;

  const len = Math.min(v1.length, v2.length);
  for (let i = 0; i < len; i++) {
    dot  += v1[i] * v2[i];
    mag1 += v1[i] * v1[i];
    mag2 += v2[i] * v2[i];
  }

  if (mag1 === 0 || mag2 === 0) return 0;
  return dot / (Math.sqrt(mag1) * Math.sqrt(mag2));
}

function ragRetrieve(queryVector, topK) {
  const k = topK || 5;

  const ss = SpreadsheetApp.openById(RAG_SHEET_ID);
  const sheet = ss.getSheetByName(RAG_SHEET_NAME);
  if (!sheet) {
    throw new Error("Sheet " + RAG_SHEET_NAME + " not found. Run ragIndexFolder() first.");
  }

  const data = sheet.getDataRange().getValues();
  if (data.length <= 1) {
    throw new Error("No embeddings found. Run ragIndexFolder() first.");
  }

  const rows = data.slice(1); // drop header

  const scored = rows.map(function(row) {
    const chunk = row[1];
    const embedding = JSON.parse(row[2]);
    const score = cosineSimilarity(queryVector, embedding);
    return { chunk: chunk, score: score };
  });

  scored.sort(function(a, b) { return b.score - a.score; });

  const top = scored.slice(0, k);
  return top.map(function(entry) { return entry.chunk; }).join("\n\n");
}


// === RAG ANSWER ============================================

function ragAnswer(question) {
  if (!question || !question.trim()) {
    return "Please enter a question.";
  }

  const queryVector = embedText(question);
  const context = ragRetrieve(queryVector, 5);

  const prompt = [
    "You are a helpful assistant.",
    "Use ONLY the context below to answer the user's question.",
    "",
    "Context:",
    context,
    "",
    "Question:",
    question,
    "",
    "If the context does not contain the answer, say that you do not know.",
    "Answer clearly and concisely."
  ].join("\n");

  return callGemini(prompt, "");
}


// === TESTING & SETUP HELPERS ===============================

/**
 * Simple connectivity test to the text model.
 */
function testGeminiText() {
  const result = callGemini(
    "Summarize this in one sentence:",
    "Google Apps Script lets you automate tasks across Google Workspace."
  );
  Logger.log("Text test result:\n" + result);
}

/**
 * Simple embedding test to verify EMBEDDING_MODEL works.
 */
function testEmbedding() {
  const vec = embedText("Hello from Apps Script RAG.");
  Logger.log("Embedding length: " + vec.length);
}

/**
 * Creates a demo RAG sheet + folder and logs their IDs.
 * You can use these to quickly wire up a test environment.
 */
function ragCreateDemoEnvironment() {
  const ss = SpreadsheetApp.create("RAG_Embeddings_Demo");
  const sheet = ss.getActiveSheet();
  sheet.setName(RAG_SHEET_NAME);
  sheet.appendRow(["id", "chunk", "embedding"]);

  const folder = DriveApp.createFolder("RAG_Demo_Folder");

  Logger.log("Demo Sheet ID: " + ss.getId());
  Logger.log("Demo Folder ID: " + folder.getId());
}

/**
 * Creates a few sample Docs in the configured RAG_FOLDER_ID.
 */
function ragCreateSampleDocs() {
  if (!RAG_FOLDER_ID || RAG_FOLDER_ID === "YOUR_FOLDER_ID_HERE") {
    throw new Error("Set RAG_FOLDER_ID, or run ragCreateDemoEnvironment and copy its folder ID.");
  }

  const folder = DriveApp.getFolderById(RAG_FOLDER_ID);

  const doc1 = DocumentApp.create("RAG Sample โ€“ Policies");
  doc1.getBody().appendParagraph(
    "This document describes the organization's remote work policy, core hours, and communication expectations."
  );
  folder.addFile(DriveApp.getFileById(doc1.getId()));

  const doc2 = DocumentApp.create("RAG Sample โ€“ Onboarding");
  doc2.getBody().appendParagraph(
    "New employees should complete onboarding within two weeks, including security training, tool access setup, and team introductions."
  );
  folder.addFile(DriveApp.getFileById(doc2.getId()));
}

/**
 * Quick end-to-end test:
 *  1) Index folder
 *  2) Ask a sample question
 *  3) Log the answer
 */
function ragTestEndToEnd() {
  ragIndexFolder();
  const answer = ragAnswer("What are the key onboarding expectations?");
  Logger.log("RAG answer:\n" + answer);
}

โœ… How to Use / Test

  1. Create a Doc-bound Apps Script project
    • Open any Google Doc โ†’ Extensions โ†’ Apps Script.
    • Replace all existing code with the script above.
  2. Set up config
    • Generate a new Gemini API key and paste into GEMINI_API_KEY.
    • Either:
      • Run ragCreateDemoEnvironment() โ†’ copy the logged Sheet + Folder IDs and paste into RAG_SHEET_ID and RAG_FOLDER_ID,
        then run ragCreateSampleDocs() to generate sample Docs inside the folder.
      • Or use your own existing sheet + folder IDs.
  3. Authorize & test
    • From the Script Editor, run:
      • testGeminiText() โ†’ confirm text model works.
      • testEmbedding() โ†’ confirm embeddings work.
      • ragIndexFolder() โ†’ builds the vector store.
      • ragTestEndToEnd() โ†’ logs a full RAG answer.
  4. Use the UI
    • Reload the Doc.
    • Use AI Tools โ†’ RAG: Ask Question.
    • Ask natural language questions about the content in your indexed folder.