AI Data Cleaner for Google Sheets Google Apps Script and Gemini

🚀 Apps Script + Gemini Mastery — Issue #8

https://github.com/lsvekis/Apps-Script-Code-Snippets/tree/main/apps_script_gemini_ai_data_cleaner_sheets

AI Data Cleaner for Google Sheets

Automatically clean, normalize, and prepare messy spreadsheet data using Gemini + Apps Script.


⭐ What You Will Build in This Issue

In this lesson, you’ll build an AI-powered data cleaning assistant for Google Sheets that can:

🧹 Detect messy or inconsistent data
🔎 Identify formatting issues and anomalies
📐 Normalize values (dates, casing, categories, numbers)
🧠 Suggest fixes using Gemini
⚡ Apply transformations automatically
📊 Prepare sheets for analysis, reporting, or dashboards

This tool works directly inside Google Sheets via a custom menu and sidebar.


🧠 Learning Objectives

By the end of this issue, readers will know how to:

✔ Read and analyze tabular data in Apps Script
✔ Convert sheet data into AI-friendly summaries
✔ Prompt Gemini to identify data quality issues
✔ Generate structured “cleaning instructions” in JSON
✔ Apply transformations back to the Sheet
✔ Build safe, testable automation workflows

This issue introduces AI-guided data operations, not just text generation.


🧩 Exercise — Build the AI Data Cleaner

You’ll create:

  1. A custom menu in Google Sheets
  2. A sidebar UI for selecting ranges and rules
  3. A Gemini-powered data analyzer
  4. A rule-based cleaner that applies fixes automatically

1️⃣ Add Menu + Sidebar

Code.gs

function onOpen() {
  SpreadsheetApp.getUi()
    .createMenu("AI Tools")
    .addItem("AI Data Cleaner", "showCleaner")
    .addToUi();
}

function showCleaner() {
  const html = HtmlService.createHtmlOutputFromFile("Sidebar")
    .setTitle("AI Data Cleaner");
  SpreadsheetApp.getUi().showSidebar(html);
}

2️⃣ Sidebar UI

Sidebar.html

<div style="font-family: Arial; padding: 14px;">
  <h2>AI Data Cleaner</h2>

  <p>Select a range, then describe how you'd like the data cleaned.</p>

  <label><b>Cleaning Instructions</b></label>
  <textarea id="instructions"
            style="width:100%; height:80px;"
            placeholder="Normalize dates, fix capitalization, remove duplicates, standardize categories">
  </textarea>

  <button onclick="runCleaner()">Analyze & Clean</button>

  <pre id="output"
       style="white-space: pre-wrap; margin-top:12px; max-height:260px; overflow:auto;"></pre>

  <script>
    function runCleaner() {
      document.getElementById("output").textContent = "Analyzing data...";
      google.script.run
        .withSuccessHandler(msg => document.getElementById("output").textContent = msg)
        .cleanSelectedRange(
          document.getElementById("instructions").value
        );
    }
  </script>
</div>

3️⃣ Read and Summarize Sheet Data

DataReader.gs

function getSelectedData_() {
  const range = SpreadsheetApp.getActiveRange();
  if (!range) throw new Error("No range selected.");

  return {
    range: range.getA1Notation(),
    values: range.getValues()
  };
}

4️⃣ Gemini Prompt → Cleaning Rules (JSON)

Gemini won’t modify your data directly.
Instead, it returns instructions.

DataCleaner.gs

function cleanSelectedRange(instructions) {
  if (!instructions) return "Please enter cleaning instructions.";

  const data = getSelectedData_();

  const prompt = `
Analyze the following spreadsheet data and return cleaning instructions in pure JSON.

Data:
${JSON.stringify(data.values)}

User instructions:
${instructions}

Return JSON only using this structure:
{
  "actions": [
    {
      "type": "lowercase" | "uppercase" | "trim" | "normalize_date" | "remove_duplicates",
      "column": 0
    }
  ]
}
`;

  const jsonText = callGemini(prompt, "");
  const clean = JSON.parse(jsonText);

  applyCleaning_(clean.actions, data);
  return "✅ Data cleaned for range " + data.range;
}

5️⃣ Apply Cleaning Actions

ApplyCleaner.gs

function applyCleaning_(actions, data) {
  const values = data.values;

  actions.forEach(action => {
    const col = action.column;

    for (let r = 1; r < values.length; r++) {
      let cell = values[r][col];
      if (typeof cell !== "string") continue;

      switch (action.type) {
        case "lowercase":
          values[r][col] = cell.toLowerCase();
          break;
        case "uppercase":
          values[r][col] = cell.toUpperCase();
          break;
        case "trim":
          values[r][col] = cell.trim();
          break;
      }
    }
  });

  SpreadsheetApp.getActiveRange()
    .setValues(values);
}

🧪 Testing Strategy

Readers should test in stages:

1️⃣ testGeminiConnection()
2️⃣ Analyze a small sample range
3️⃣ Review Gemini’s JSON output in logs
4️⃣ Apply cleaning only after validation


🎯 What This Unlocks

This issue teaches how to build:

📊 Smart data prep tools
🤖 AI-assisted ETL workflows
🧼 Data quality pipelines
📈 Analytics-ready Sheets
🧠 Safe AI automation patterns

It’s a huge step up from text generation.


🔜 Coming Next (Issue #9)

AI Report Generator for Google Docs
Turn Sheets data into executive summaries, charts, and insights — automatically.