Lesson 4 – Leveling Up Your Gemini RAG Assistant

GitHub: https://github.com/lsvekis/Google-Apps-Script-APIs-and-Gemini

4A: RAG + API Data · 4B: Sheet Vector Index · 4C: Chat History

By now, you’ve already built a powerful foundation:

  • Lesson 1 – Call free public APIs from Apps Script (DATA_API_URL)
  • Lesson 2 – Combine API data + folder text in a single Gemini prompt
  • Lesson 3 – Add embeddings + cosine similarity to build a basic RAG system over a Drive folder

In Lesson 4, we’re going to level this up in three big steps:

  1. 4A – Combine RAG + API data in one Gemini answer
  2. 4B – Store embeddings in Google Sheets (a simple vector database)
  3. 4C – Add chat history on top of RAG (conversation + documents)

By the end, you’ll have an assistant that:

  • Reads your Drive documents
  • Uses a persistent vector index in Sheets
  • Enriches answers with live API data
  • Remembers the last few turns of conversation

📘 Lesson 4A — Combine RAG + API Data in a Single Gemini Answer

So far, your assistant can read docs and talk to APIs. Now we’ll blend those two worlds together.

What Lesson 4A Does

In Lesson 4A, your assistant:

  • Runs RAG over a Drive folder (Docs and text files)
  • Fetches live JSON data from a public API (DATA_API_URL)
  • Sends both the retrieved chunks and the API JSON into a single Gemini request
  • Returns a combined answer grounded in your documents and enriched with live data

Example question:

“Using the onboarding notes in the folder and today’s weather from the API, suggest a welcoming message for new students.”

Gemini will:

  • Use the Drive onboarding docs as the source of truth
  • Pull current weather (or other data) from the API
  • Produce a single, combined answer

High-Level Flow

Compared to Lesson 3, we’re just adding the API branch:

  • RAG part (same as Lesson 3)
    • Read folder → chunk → embed chunks
    • Embed question → cosine similarity → top-K chunks
  • API part (from Lessons 1 & 2)
    • Call DATA_API_URL
    • Parse the JSON
  • Gemini prompt
    • Include:
      • User question
      • API JSON
      • Retrieved top-K document chunks
    • Ask Gemini to use both the API data and the document chunks.

Diagram

           Drive Folder                       Public API
         (Docs / Text files)           (cat facts, weather, etc.)
                 │                               │
            Read & chunk                     Fetch JSON
                 │                               │
          Embed each chunk                      ▼
                 │                       API JSON object
                 ▼                               │
     Question ──► Embed question                 │
                 │                               │
                 └─► Similarity search ◄─────────┘
                         │ (top K chunks)
                         ▼
               Build combined prompt:
           question + API data + doc chunks
                         │
                         ▼
                      Gemini
                         │
                         ▼
                      Answer

Full Lesson 4A Code (Apps Script)

Create a new file in your Apps Script project, for example:

Lesson4A_RagAndApi.gs

Paste this full script and replace PASTE_YOUR_FOLDER_ID_HERE with your real Drive folder ID:

/**
 * Lesson 4A — Combine RAG (embeddings over Drive) + API data in one Gemini answer
 *
 * What this file does:
 *  - Reads & chunks text from a Drive folder
 *  - Creates embeddings for each chunk (with caching)
 *  - Computes cosine similarity to the user question
 *  - Retrieves the top-K relevant chunks (RAG)
 *  - Fetches JSON from an external API (DATA_API_URL)
 *  - Sends question + API JSON + retrieved chunks to Gemini
 *  - Returns a single, combined answer
 */

/***************************************
 * CONFIG
 ***************************************/

// 1) Replace with your own Drive folder ID
const DEFAULT_FOLDER_ID = 'PASTE_YOUR_FOLDER_ID_HERE';

// 2) Gemini models
const EMBEDDING_MODEL = 'models/gemini-embedding-001';
const GENERATION_MODEL = 'gemini-2.5-flash';

// 3) RAG parameters
const CHUNK_SIZE = 1200;   // characters per chunk
const TOP_K = 5;           // how many chunks to keep
const MAX_OUTPUT_TOKENS = 1024;

/***************************************
 * MAIN ENTRY — LESSON 4A
 ***************************************/

function answerWithRagAndApi(question) {
  const q = (question || '').trim();
  if (!q) {
    throw new Error('Question cannot be empty.');
  }

  // 1) RAG over Drive folder
  const chunks = getFolderChunks_(DEFAULT_FOLDER_ID);
  if (chunks.length === 0) {
    return 'No readable files found in the folder.';
  }

  const chunkEmbeddings = chunks.map(c => ({
    chunk: c,
    embedding: embedTextCached_(c.text)
  }));

  const queryEmbedding = embedText_(q);

  const ranked = chunkEmbeddings
    .map(item => ({
      chunk: item.chunk,
      score: cosineSimilarity_(queryEmbedding, item.embedding)
    }))
    .sort((a, b) => b.score - a.score);

  const selected = ranked.slice(0, TOP_K);

  const ragContext = selected
    .map((c, i) =>
      `[#${i + 1} — ${c.chunk.fileName} — score=${c.score.toFixed(3)}]\n` +
      c.chunk.text
    )
    .join('\n\n');

  // 2) Fetch external API data
  const apiData = fetchDataFromApi_();
  const apiJsonPretty = JSON.stringify(apiData, null, 2);

  // 3) Build combined prompt
  const prompt =
    'You are an AI assistant that must answer using BOTH of the following sources:\n' +
    '1) JSON data from an external API\n' +
    '2) Retrieved text chunks from a Google Drive folder (RAG)\n\n' +
    'Guidelines:\n' +
    '- Use the API data for up-to-date or numeric information.\n' +
    '- Use the retrieved chunks for policies, descriptions, and longer explanations.\n' +
    '- If information is missing or unclear, say so politely.\n\n' +
    '--- USER QUESTION ---\n' +
    q + '\n\n' +
    '--- API DATA (JSON) ---\n' +
    apiJsonPretty + '\n\n' +
    '--- RETRIEVED DOCUMENT CHUNKS ---\n' +
    ragContext + '\n\n' +
    'Now write a single, clear answer that blends the API information and the document content where relevant.';

  return callGeminiText_(prompt);
}

function testAnswerWithRagAndApi() {
  const question =
    'Using both the document content and the API data, ' +
    'summarize the key ideas and give one practical suggestion.';
  const ans = answerWithRagAndApi(question);
  Logger.log('AI Answer:\n' + ans);
}

(Keep the rest of your helper functions from your draft: getFolderChunks_, embedText_, embedTextCached_, cosineSimilarity_, callGeminiText_, fetchDataFromApi_, getGeminiKey_ — they can stay in the same file.)
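If you are rebuilding those helpers from scratch, here is a minimal sketch of two of them — a chunker and `cosineSimilarity_` — written as plain JavaScript with no Apps Script services. The exact signatures are assumptions based on how the functions are used above; your draft versions may differ.

```javascript
// Split a long string into fixed-size chunks (CHUNK_SIZE characters each).
// Simple character slicing, matching the CHUNK_SIZE config above; a more
// careful version might split on paragraph boundaries instead.
function chunkText_(text, chunkSize) {
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// Cosine similarity between two equal-length embedding vectors:
// dot(a, b) / (|a| * |b|). Returns 0 if either vector has zero length.
function cosineSimilarity_(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}
```

Identical vectors score 1.0 and orthogonal vectors score 0, which is why sorting by descending score surfaces the most relevant chunks first.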

If you want to connect this to your existing web app UI, add:

function chatWithRagAndApi(question) {
  const answer = answerWithRagAndApi(question);
  return { answer: answer || '' };
}

And in your Index.html, change:

google.script.run.chatWithFolderAndApi(text);

to:

google.script.run.chatWithRagAndApi(text);

📘 Lesson 4B — Store Embeddings in Google Sheets (DIY Vector Database)

In Lessons 3 and 4A, embeddings were recreated in memory every time you ran the script. That’s fine for tiny folders, but it doesn’t scale.

In Lesson 4B, you’ll:

  • Store chunk + embedding data in a Google Sheet
  • Use that Sheet as a persistent vector index
  • Rebuild embeddings only when content changes
  • Inspect and debug the index visually

Concept: Sheet as a Vector Database

We’ll use a sheet with columns like this:

| chunkId  | fileId | fileName      | chunkIndex | text            | embeddingJson            |
|----------|--------|---------------|------------|-----------------|--------------------------|
| 1Abc…:0  | 1Abc…  | Onboarding.md | 0          | “Welcome to…”   | “[0.0123, -0.09, 0.45…]” |
| 1Abc…:1  | 1Abc…  | Onboarding.md | 1          | “In this…”      | “[0.083, 0.11, -0.02…]”  |
| 2Xyz…:0  | 2Xyz…  | Policy.docx   | 0          | “All students…” | “[0.004, -0.22, 0.19…]”  |

Each row is one chunk of text plus its embedding.
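Converting between in-memory chunk objects and flat sheet rows can be sketched like this. The `fileId:chunkIndex` convention for `chunkId` and the JSON-serialized embedding column are assumptions that match the table above; adjust the field names to your own draft.

```javascript
// Convert chunk + embedding objects into flat rows for the RAG_Index tab.
// Each row: [chunkId, fileId, fileName, chunkIndex, text, embeddingJson].
function toIndexRows_(items) {
  return items.map(item => [
    item.fileId + ':' + item.chunkIndex, // chunkId convention: "fileId:chunkIndex"
    item.fileId,
    item.fileName,
    item.chunkIndex,
    item.text,
    JSON.stringify(item.embedding)       // store the vector as a JSON string
  ]);
}

// Reverse: parse one sheet row back into a usable object.
function fromIndexRow_(row) {
  return {
    chunkId: row[0],
    fileId: row[1],
    fileName: row[2],
    chunkIndex: row[3],
    text: row[4],
    embedding: JSON.parse(row[5])        // back to a numeric array
  };
}
```

Storing the embedding as a JSON string keeps each vector in a single cell, which is what makes a plain spreadsheet workable as a vector index.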

Setup Overview

New Script Property:

  • RAG_SHEET_ID → holds the ID of the spreadsheet that stores your index

The script will:

  • Create the spreadsheet automatically if needed
  • Store its ID in RAG_SHEET_ID
  • Populate a tab called RAG_Index

You still need:

  • GEMINI_API_KEY in Script Properties
  • A DEFAULT_FOLDER_ID
  • Docs / text files in your folder

Full Lesson 4B Code (Apps Script)

Create a new file:

Lesson4B_SheetIndex.gs

Paste your full Lesson 4B script (as you wrote it). The key public functions are:

function buildRagIndex() { … }

function answerWithRagFromSheet(question) { … }

function testAnswerWithRagFromSheet() {
  const question = 'What do these documents say about onboarding?';
  const answer = answerWithRagFromSheet(question);
  Logger.log('AI Answer:\n' + answer);
}
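The heart of `answerWithRagFromSheet` is a ranking loop over the stored rows. Here is a service-free sketch, assuming each row object carries the JSON-encoded embedding from the `embeddingJson` column; the helper names are illustrative, not your exact draft.

```javascript
// Rank stored index rows against a query embedding and keep the top K.
// Each row is assumed to look like { text, fileName, embeddingJson }.
function topKFromIndex_(rows, queryEmbedding, k) {
  return rows
    .map(row => ({
      text: row.text,
      fileName: row.fileName,
      score: cosine_(queryEmbedding, JSON.parse(row.embeddingJson))
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Standard cosine similarity, same math as in Lesson 3.
function cosine_(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const d = Math.sqrt(na) * Math.sqrt(nb);
  return d === 0 ? 0 : dot / d;
}
```

Only the question gets embedded at query time; every chunk embedding is read back from the Sheet, which is exactly the cost saving Lesson 4B is after.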

Don’t forget to set:

const DEFAULT_FOLDER_ID = 'YOUR_REAL_FOLDER_ID_HERE';

Then:

  1. Run buildRagIndex() once to create the spreadsheet and index.
  2. Run testAnswerWithRagFromSheet() or call answerWithRagFromSheet("…").

If you want to wire this into your chatbot UI, add:

function chatWithRagFromSheet(question) {
  const answer = answerWithRagFromSheet(question);
  return { answer: answer || '' };
}

Then in Index.html:

google.script.run.chatWithRagFromSheet(text);

Now your web app uses a Sheet-backed index and does not recompute embeddings each time.


📘 Lesson 4C — Add Chat History on Top of RAG

(Conversation + Documents)

Right now, your assistant treats every question as a one-off. It doesn’t remember previous answers or follow-up questions.

In Lesson 4C, we’ll add short-term chat memory:

  • The assistant remembers the last few user + assistant messages
  • That history is included in the prompt alongside the RAG chunks
  • You get more natural, conversational replies

We’ll still keep it simple:

  • Memory is short-term only (last N turns)
  • Stored using CacheService.getUserCache()
  • No external database required

What “Chat History” Means Here

For each user:

  • We keep a small array:
[
  { role: 'user', text: '…' },
  { role: 'assistant', text: '…' },
  …
]

On each new question:

  1. Load this history from the cache
  2. Run RAG just for the current question
  3. Build a prompt that contains:
    • The chat history
    • The retrieved chunks
    • The latest question
  4. Call Gemini
  5. Append the new Q & A back into history
  6. Save it into the cache (capped at ~8 turns)
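Step 3 — folding the history, the retrieved chunks, and the new question into one prompt — can be sketched like this. The section labels are assumptions that follow the style of the Lesson 4A prompt; use whatever labels your draft prompt uses.

```javascript
// Build a prompt combining recent chat history, retrieved RAG chunks,
// and the latest question. historyTurns: [{ role, text }, …].
function buildChatPrompt_(historyTurns, ragContext, question) {
  const historyText = historyTurns
    .map(t => (t.role === 'user' ? 'User: ' : 'Assistant: ') + t.text)
    .join('\n');
  return (
    'You are an AI assistant. Use the conversation so far and the ' +
    'retrieved document chunks to answer the latest question.\n\n' +
    '--- CONVERSATION SO FAR ---\n' + historyText + '\n\n' +
    '--- RETRIEVED DOCUMENT CHUNKS ---\n' + ragContext + '\n\n' +
    '--- LATEST QUESTION ---\n' + question
  );
}
```

Putting the history before the chunks keeps the model's attention on the freshest material — the question — at the end of the prompt.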

This gives you:

  • Document-grounded answers (from RAG)
  • Conversational continuity (from chat history)

Where We Store History

We’ll use:

const CHAT_HISTORY_KEY = 'CHAT_HISTORY';

and store the JSON-serialized array in:

CacheService.getUserCache()
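Because the user cache only stores strings, the history array has to be JSON-serialized on the way in and parsed on the way out. The sketch below uses any object with `get(key)` / `put(key, value)` string methods so it runs outside Apps Script too — in your real script that object is `CacheService.getUserCache()`. The message cap of 16 (8 question/answer turns) is an assumption matching the "~8 turns" above.

```javascript
const CHAT_HISTORY_KEY = 'CHAT_HISTORY';
const MAX_MESSAGES = 16; // 8 turns, each turn = 1 user + 1 assistant message

// cache: any object with get(key) and put(key, value) for strings.
// In Apps Script, pass CacheService.getUserCache().
function loadHistory_(cache) {
  const raw = cache.get(CHAT_HISTORY_KEY);
  return raw ? JSON.parse(raw) : [];
}

// Append the latest question/answer pair and keep only the newest messages.
function saveTurn_(cache, userText, assistantText) {
  let history = loadHistory_(cache);
  history.push({ role: 'user', text: userText });
  history.push({ role: 'assistant', text: assistantText });
  history = history.slice(-MAX_MESSAGES); // drop the oldest messages past the cap
  cache.put(CHAT_HISTORY_KEY, JSON.stringify(history));
  return history;
}
```

Note that `CacheService` entries expire on their own (after about 10 minutes by default), which is exactly the short-term behavior we want here.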

Full Lesson 4C Code (Chat + RAG + Sheets)

Create a new file:

Lesson4C_ChatRag.gs

Paste your Lesson 4C script. The key public functions are:

function answerWithChatRag(question) { … }

function chatWithChatRag(question) {
  const answer = answerWithChatRag(question);
  return { answer: answer || '' };
}

Make sure:

  • Lesson 4B is already in your project (for loadRagIndex_, embedText_, cosineSimilarity_, callGeminiText_, getGeminiKey_).
  • You’ve already run buildRagIndex() once to build the Sheet index.

Then, test from the editor:

Logger.log(answerWithChatRag("Give me a quick summary of the documents."));
Logger.log(answerWithChatRag("Can you suggest one practical action from that?"));
Logger.log(answerWithChatRag("Rewrite that suggestion in a friendlier tone."));

Because chat history is stored in the per-user cache, those three calls share context.

Wiring Chat History into Your Web App

If your UI previously called:

google.script.run.chatWithRagFromSheet(text);

just switch it to:

google.script.run.chatWithChatRag(text);

No UI change required — your chatbot now has:

  • RAG over your documents
  • Persistent vector index in Sheets
  • Short-term chat memory

✅ Recap: What Lesson 4 Gives You

By the end of Lesson 4, you’ve turned a simple AI script into a mini AI platform:

  • 4A – Combined RAG + API data in one Gemini call
  • 4B – Added a Sheet-based vector store for persistent embeddings
  • 4C – Layered on chat history so conversations feel natural