Source code https://github.com/lsvekis/Google-Apps-Script-Gemini-Projects/tree/main/AI%20Data%20Cleaner
Managing messy product data is a constant headache — especially when you’re working with spreadsheets full of inconsistent titles, missing brands, and vague descriptions.
In this exercise, we’ll build an AI-powered Google Sheets add-on that automatically cleans, classifies, and enriches product information using the Gemini API and Google Apps Script.
By the end, you’ll be able to:
✅ Normalize messy product names
✅ Detect brands and categories automatically
✅ Generate short marketing descriptions
✅ Tag items with AI-suggested keywords
✅ Run it directly from your Sheet — no coding knowledge required
🚀 Overview
We’ll connect Google Sheets to the Gemini 2.5 Flash model through a simple Apps Script project.
Once set up, you can select any range of rows and the script will:
- Send each product name + description to Gemini.
- Ask for a JSON response matching a strict schema.
- Parse and display clean data in the next columns.
The result is a ready-to-use AI data-cleaning tool built right into Google Sheets.
🧩 Step 1 — Set up the Script Project
- Open a Google Sheet.
- Go to Extensions → Apps Script.
- Create three files:
File name | Purpose |
---|---|
Code.gs | The full Apps Script logic |
Sidebar.html | A small UI for testing single products |
appsscript.json | Manifest with required scopes |
Paste in the full code from this project (see below).
- In Project Settings → Script properties, add your API key:
- Name:
GEMINI_API_KEY
- Value: your Gemini API key from ai.google.dev
- Name:
- Save and run the
onOpen()
function once to authorize the script.
🧠 Step 2 — How the AI Cleaner Works
The core function is cleanOneRecord(name, desc)
:
- Prompt construction:
It tells Gemini exactly which fields to return (clean name, brand, category, tags, short description, confidence, notes).
It also warns the model to return “RAW JSON ONLY on a single line” — no markdown fences. - Gemini request:
The function usesUrlFetchApp.fetch()
to callhttps://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
. - Robust JSON parsing:
A helpersafeParseJsonLoose()
cleans up stray “`json fences and safely recovers incomplete objects if Gemini truncates output. - Normalization:
The code trims strings, clamps the confidence 0-1, shortens the description ≤ 120 characters, and joins tags into a readable list. - Output:
Each cleaned record writes into seven new columns:- Clean Name
- Brand
- Category
- Tags
- Short Description
- Confidence
- Notes
⚙️ Step 3 — Test It with Sample Data
From the new 🧠 AI Tools menu in your Sheet:
- Choose Insert Sample Data — this populates demo rows (iPhone, Nike shoes, Sony headphones…).
- Select columns A and B (Product Name and Description).
- Choose Clean Selection → next columns.
- Watch Gemini fill in clean, organized data instantly.
You can also open Open Sidebar and test individual items interactively.
💡 How the Prompt Keeps Gemini Consistent
The trick is the strict JSON schema and explicit rules:
{
"clean_name": "string",
"brand": "string",
"category": "string",
"tags": ["string"],
"short_description": "string",
"confidence": "number (0..1)",
"notes": "string"
}
By enforcing this structure and rejecting anything that isn’t valid JSON, you get predictable, machine-readable output every time — perfect for automation.
🔍 Example Result
Input:
Product Name | Description |
---|---|
Apple iPhone 15 Pro Max 256 GB – Titan Grey | Flagship phone with A17 Pro chip, 6.7″ display, excellent camera. Unlocked. |
Output:
Clean Name | Brand | Category | Tags | Short Description | Confidence | Notes |
---|---|---|---|---|---|---|
Apple iPhone 15 Pro Max 256 GB | Apple | Smartphone | smartphone, apple, iphone, flagship, 256gb | Flagship phone with A17 Pro chip and excellent camera. | 0.97 | — |
🧰 Step 4 — Optional: Add the Custom Function
You can also clean individual cells directly:
=AI_CLEAN_ROW(A2,B2)
This returns a full JSON string for the given product row.
📦 Download & Reuse
All ready-to-use files are available in this repo (or ZIP):
Code.gs
Sidebar.html
appsscript.json
→ Import them into a new Google Sheet’s Apps Script editor
→ Add your Gemini API key
→ You’re ready to build your own AI data-cleaning pipeline!
✨ What You Learned
- How to integrate Gemini with Google Sheets using REST calls
- How to design strict, self-validating JSON prompts
- How to build a sidebar UI and custom functions in Apps Script
- How to safely parse imperfect AI responses
This technique is a foundation for more complex automations — from cleaning e-commerce listings to categorizing support tickets or research data.
