GitHub https://github.com/lsvekis/Google-Apps-Script-Gemini-Projects/tree/main/summarizer
🌐 Building a Web Article Summarizer & Citation Extractor with Google Sheets + Gemini AI
Have you ever wished you could drop a list of article URLs into Google Sheets and instantly get structured summaries — complete with titles, key quotes, named entities, and estimated reading time?
In this exercise, we’ll create exactly that using Google Apps Script and Gemini AI.
🧠 Project Overview
This project transforms your spreadsheet into a lightweight research assistant.
You’ll feed it article URLs, and the script will:
- Fetch and clean each webpage
- Summarize its main points in bullet form
- Extract people, organizations, and locations
- Highlight key quotes
- Suggest topic tags and reading time
- Output everything into the next columns automatically
It’s perfect for anyone who wants quick, AI-powered overviews of online content — students, journalists, analysts, or content creators.
⚙️ How It Works
The magic happens inside Google Apps Script, which acts as a bridge between Google Sheets and Gemini’s REST API.
Step 1 — Fetch the Web Page
Each URL is retrieved with UrlFetchApp.fetch()
.
Scripts, styles, and HTML tags are stripped away, leaving clean readable text.
body = body
.replace(/<script[\s\S]*?<\/script>/gi, ' ')
.replace(/<style[\s\S]*?<\/style>/gi, ' ')
.replace(/<[^>]+>/g, ' ')
.trim();
The result is cached for six hours to avoid hitting rate limits.
Step 2 — Chunk and Summarize
Long pages are split into ~6000-character chunks.
Each chunk is sent to Gemini with a simple instruction:
“Summarize this web page chunk in 4–6 concise bullet points focusing on facts.”
This ensures we don’t exceed token limits and still get accurate partial summaries.
Step 3 — Merge and Structure
Once all chunk summaries return, we send them again to Gemini — this time asking for a strict JSON object:
{
"title": "string",
"summary": ["string"],
"key_quotes": ["string"],
"entities": {
"people": ["string"],
"organizations": ["string"],
"locations": ["string"]
},
"topics": ["string"],
"reading_time_minutes": 3,
"source_url": "string"
}
The AI consolidates everything into a clean data structure ready for Sheets.
Step 4 — Display Results in Sheets
Each row of URLs expands into seven columns:
| Title | Summary (bullets) | Entities (People | Orgs | Locs) | Key Quotes | Topics | Reading Minutes | Source URL |
|:——|:——————|:—————–|:————-|:——–|:—————-|:———–|
| AI Updates from Google | • Gemini 2.5 launch … | Sundar Pichai | Google | “AI for everyone” | ai, research | 6 | https://blog.google/… |
🧩 Code Highlights
fetchUrlText(url)
— cleans HTML into readable textsummarizeUrl(url)
— handles chunking, API calls, and JSON consolidationAI_URL_SUMMARY(url)
— custom sheet function returning the full JSONsummarizeUrlSelectionToRight()
— bulk action for selected URLs- Sidebar UI — allows quick one-off summarization inside a small HTML panel
🧪 Try It Yourself
- Open a new Google Sheet → Extensions → Apps Script
- Create these files:
Code.gs
Sidebar.html
appsscript.json
- Paste in the code from the Exercise 7 GitHub package
- In Script Properties, add your Gemini API Key (
GEMINI_API_KEY
). - Run
onOpen()
once and reload the Sheet.
Then, use the menu:
🧠 AI Tools → Insert Sample URLs
Select column A → Summarize URL Selection → next columns
or try a single-cell formula:
=AI_URL_SUMMARY(A2)
🧩 Tips & Troubleshooting
- If you see
❌ Set GEMINI_API_KEY…
→ add your key in Script Properties. - Some news sites block fetch requests — try public blogs first.
- Long pages? The script summarizes the first four chunks (≈ 24 K chars).
- Need finer control? Adjust
CHUNK_CHAR_LIMIT
orMERGE_TOKENS
.
🎯 Why This Matters
This exercise demonstrates multi-stage AI orchestration inside a low-code environment.
You’re combining:
- Data fetching (UrlFetchApp)
- Pre-processing and caching
- Multi-prompt reasoning (chunk → merge)
- Structured JSON outputs
- Instant visualization in Sheets
In short — a real-world example of AI-assisted knowledge extraction without any external server.
