Automating Document Cleanup: Consolidate Duplicate Categories in Google Docs with Apps Script

Managing content in a Google Doc can be tedious, especially when dealing with repetitive headings and disorganized structure. If you’ve ever faced the challenge of consolidating duplicate categories while retaining their content, this blog post is for you. We’ll explore a simple Google Apps Script that automates this task, ensuring your document stays clean, organized, and easy to read.


The Problem: Duplicate Categories in Google Docs

Imagine a scenario where a document has multiple Heading 2 elements with the same name, such as:

Category: JavaScript Basics
- What is JavaScript?
Category: JavaScript Basics
- What are variables in JavaScript?

This redundancy makes the document harder to navigate. Consolidating these duplicate categories manually can be time-consuming and error-prone. Automating this process not only saves time but ensures consistency.


The Solution: Google Apps Script

Google Apps Script provides a powerful way to automate tasks in Google Docs. Here’s a script that identifies duplicate Heading 2 categories, combines their content, and rewrites the document with a clean, organized structure.


The Code

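function combineDuplicateCategories() {
  const DOCID = 'YOUR_DOCUMENT_ID_HERE'; // Replace with your Document ID
  const doc = DocumentApp.openById(DOCID);
  const body = doc.getBody();
  const paragraphs = body.getParagraphs();

  // Maps each category name to the text lines found under it.
  // A Map keeps insertion order, so categories are written back
  // in the order they first appear.
  const categoryMap = new Map();
  let currentCategory = null;

  for (let i = 0; i < paragraphs.length; i++) {
    const paragraph = paragraphs[i];
    const heading = paragraph.getHeading();
    const text = paragraph.getText().trim();

    if (heading === DocumentApp.ParagraphHeading.HEADING2) {
      // A Heading 2 starts (or resumes) a category
      currentCategory = text;
      if (!categoryMap.has(currentCategory)) {
        categoryMap.set(currentCategory, []);
      }
    } else if (currentCategory && text) {
      categoryMap.get(currentCategory).push(text); // Add content under the current category
    }
  }

  // No Heading 2 found: leave the document untouched instead of clearing it
  if (categoryMap.size === 0) {
    Logger.log("No Heading 2 categories found; document left unchanged.");
    return;
  }

  // Clear the document content to rewrite the consolidated content
  body.clear();

  // Write consolidated categories and their content
  categoryMap.forEach((content, category) => {
    body.appendParagraph(category).setHeading(DocumentApp.ParagraphHeading.HEADING2);
    content.forEach(line => {
      // Set the style explicitly so content is written as normal text
      body.appendParagraph(line).setHeading(DocumentApp.ParagraphHeading.NORMAL);
    });
  });

  // body.clear() leaves one empty paragraph at the top; remove it
  const first = body.getChild(0);
  if (body.getNumChildren() > 1 &&
      first.getType() === DocumentApp.ElementType.PARAGRAPH &&
      first.asParagraph().getText() === '') {
    body.removeChild(first);
  }

  Logger.log("Duplicate categories have been combined.");
}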
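
Because the script clears the document body before rewriting it, it can help to preview which headings would be merged before running it. The snippet below is a minimal sketch of such a check, not part of the script above. findDuplicateCategories is a hypothetical helper that takes an array of heading strings and returns the ones that occur more than once. In Apps Script, the input could be collected from the Heading 2 paragraphs in the same way the main function reads them.

```javascript
// Hypothetical helper (not part of the script above): reports heading
// texts that occur more than once, in order of first appearance.
function findDuplicateCategories(headingTexts) {
  const counts = new Map();
  for (const raw of headingTexts) {
    const text = raw.trim(); // same trimming as the main script
    if (!text) continue;     // ignore empty headings
    counts.set(text, (counts.get(text) || 0) + 1);
  }
  // Keep only headings seen at least twice.
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([text, count]) => ({ text, count }));
}

// Plain strings standing in for Heading 2 text.
const sampleHeadings = [
  'Category: JavaScript Basics',
  'Category: Functions',
  'Category: JavaScript Basics ', // trailing space still counts as a duplicate
];
const duplicates = findDuplicateCategories(sampleHeadings);
console.log(JSON.stringify(duplicates));
// [{"text":"Category: JavaScript Basics","count":2}]
```

If the returned array is empty, there is nothing to consolidate and the main script does not need to run.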

How It Works

  1. Identify Categories:
    • The script scans all paragraphs in the document and identifies those marked as Heading 2.
    • These are treated as categories.
  2. Store Content:
    • A Map is used to store categories as keys and their associated content as values.
  3. Combine Duplicates:
    • If a category is encountered again, its content is added to the existing entry in the Map.
  4. Rewrite the Document:
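    • The original content is cleared, and the script rewrites the document with consolidated categories. Only plain text is carried over: anything before the first Heading 2 is discarded, and formatting, list bullets, images, and tables are not preserved, so try the script on a copy of the document first.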
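
The grouping steps above can be sketched on plain data, without DocumentApp, which makes the logic easy to try outside a document. This is a simplified illustration under stated assumptions: groupByCategory and the { heading2, text } item shape are hypothetical stand-ins for the paragraphs the real script reads.

```javascript
// Sketch of the grouping step on plain data (no DocumentApp needed).
// Each item mimics a paragraph: { heading2: boolean, text: string }.
function groupByCategory(items) {
  const categoryMap = new Map(); // category name -> array of content lines
  let currentCategory = null;
  for (const item of items) {
    const text = item.text.trim();
    if (item.heading2) {
      // A heading starts (or resumes) a category
      currentCategory = text;
      if (!categoryMap.has(currentCategory)) {
        categoryMap.set(currentCategory, []);
      }
    } else if (currentCategory && text) {
      // Non-empty content goes under the most recent category
      categoryMap.get(currentCategory).push(text);
    }
  }
  return categoryMap;
}

const sampleItems = [
  { heading2: true, text: 'Category: JavaScript Basics' },
  { heading2: false, text: '- What is JavaScript?' },
  { heading2: true, text: 'Category: JavaScript Basics' },
  { heading2: false, text: '- What are variables in JavaScript?' },
];
const grouped = groupByCategory(sampleItems);
console.log(JSON.stringify([...grouped]));
// [["Category: JavaScript Basics",["- What is JavaScript?","- What are variables in JavaScript?"]]]
```

Because a Map preserves insertion order, each category appears once in the output, at the position where it was first seen.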

How to Use

  1. Set Up Apps Script:
    • Go to script.google.com and create a new project, or open the Google Doc and choose Extensions > Apps Script.
    • Paste the code into the editor.
  2. Provide Document ID:
    • Replace YOUR_DOCUMENT_ID_HERE with your Google Doc’s ID. You can find it in the document’s URL: it is the long string between /d/ and /edit.
  3. Run the Script:
    • Click the Run button. You might need to authorize the script the first time.
  4. Check the Output:
    • Open the Google Doc to see the cleaned-up and consolidated categories.
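
For step 2, copying the ID out of the URL by hand is easy to get wrong. The helper below is a small sketch for pulling it out programmatically. extractDocId is a hypothetical name, the example URL contains a placeholder ID, and the pattern assumes the common https://docs.google.com/document/d/<ID>/edit form.

```javascript
// Hypothetical helper: pulls the document ID out of a Google Docs URL.
// Assumes the common form https://docs.google.com/document/d/<ID>/edit
function extractDocId(url) {
  const match = /\/document\/d\/([A-Za-z0-9_-]+)/.exec(url);
  return match ? match[1] : null; // null when the URL has no /document/d/<ID> part
}

const exampleUrl = 'https://docs.google.com/document/d/EXAMPLE_DOC_ID_123/edit';
console.log(extractDocId(exampleUrl)); // EXAMPLE_DOC_ID_123
```

The returned string is what would go in place of YOUR_DOCUMENT_ID_HERE.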

Why Automate?

This script can replace tedious manual copying and pasting, reduce the chance of errors, and help produce a consistent, polished final document. Whether you’re organizing meeting notes, educational materials, or a manuscript, automating this process can save you time and effort.


Takeaway

By leveraging Google Apps Script, you can automate repetitive tasks like consolidating duplicate categories in Google Docs. This script demonstrates the power of automation in document management, showcasing how simple code can make a big difference in productivity.