Streamlining Your Google Docs: Automate the Removal of Code Snippets with Google Apps Script

Removes both multi-paragraph and single-paragraph language-specific text patterns from a specified Google Document

In the ever-evolving landscape of content creation, efficiency is paramount. Whether you’re a developer documenting your code, a blogger crafting tutorials, or a student compiling research, managing and formatting content within Google Docs can sometimes become a tedious task. One common challenge is the repetitive removal of specific code snippets or formatting patterns, such as language identifiers followed by a “Copy code” prompt. Fortunately, with the power of Google Apps Script, you can automate this process, saving time and ensuring consistency across your documents.

In this blog post, we’ll walk you through creating a Google Apps Script that automatically scans your Google Document and removes unwanted patterns like 'html\nCopy code', 'css\nCopy code', and more. This script is particularly useful for cleaning up documents that include multiple code blocks in various programming languages, ensuring your content remains clean and professional.

Understanding the Problem

When documenting code or creating tutorials, it’s common to include code snippets alongside descriptive text. Often, these snippets are preceded by language identifiers (e.g., HTML, CSS, JavaScript) and a prompt like Copy code to facilitate easy copying for readers. While helpful, these patterns can clutter your document, especially if they are numerous and need to be removed before sharing or publishing.

Manually searching for and deleting these patterns is time-consuming and prone to errors, particularly in lengthy documents. Automating this process not only enhances efficiency but also ensures consistency and accuracy in your document cleanup.

Introducing Google Apps Script

Google Apps Script is a powerful tool that allows you to extend and automate functionalities across Google Workspace applications, including Google Docs, Sheets, Slides, and more. It leverages JavaScript, making it accessible to developers familiar with the language while offering extensive documentation and a supportive community for those new to scripting.

With Google Apps Script, you can automate repetitive tasks, create custom functions, and integrate with external APIs, among other capabilities. In our case, we’ll harness its power to scan a Google Document for specific text patterns and remove them systematically.

The Solution: Automating Text Removal

To address the challenge of removing specific text patterns like 'html\nCopy code', 'css\nCopy code', etc., we can create a Google Apps Script that:

  1. Identifies Patterns Across Paragraphs: Scans the document for a language keyword followed by the phrase Copy code.
  2. Removes Matching Patterns: Deletes both the language keyword paragraph and the subsequent Copy code paragraph.
  3. Handles Case Insensitivity: Ensures that the script recognizes variations in casing (e.g., HTML, Html, html).
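  4. Manages Quota Limits: Saves and reopens the document in batches so that pending edits are flushed regularly during a long run, which is intended to reduce the risk of hitting Google Apps Script’s limits.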

By automating this process, you eliminate the need for manual intervention, streamline your workflow, and maintain a clean and professional document.
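The detection rule described above can be tried outside Google Docs with a minimal sketch that works on plain strings. The helper name findPatternPairs and the sample data are hypothetical and only illustrate the matching logic; they are not part of the script presented later.

```javascript
// Hypothetical helper: takes paragraph texts as plain strings and returns the
// index of every keyword paragraph that is directly followed by "Copy code".
function findPatternPairs(paragraphTexts, keywords) {
  var keywordsLower = keywords.map(function (word) {
    return word.toLowerCase();
  });
  var matches = [];
  // Walk from the end to the start, the same direction the script uses.
  for (var i = paragraphTexts.length - 2; i >= 0; i--) {
    var current = paragraphTexts[i].trim().toLowerCase();
    var next = paragraphTexts[i + 1].trim().toLowerCase();
    if (keywordsLower.indexOf(current) !== -1 && next === 'copy code') {
      matches.push(i);
    }
  }
  return matches;
}

// Made-up sample document: two keyword/"Copy code" pairs with mixed casing.
var sampleTexts = ['Intro', 'HTML', 'Copy code', '<p>Hi</p>', 'css', 'copy code', 'p { margin: 0; }'];
var pairStarts = findPatternPairs(sampleTexts, ['html', 'css']);
```

Because both the keyword and the paragraph text are lowercased before comparison, HTML and css are both detected.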

Step-by-Step Guide to Implementing the Script

Follow these steps to create and deploy the script in your Google Document:

1. Access Google Apps Script

  1. Open Your Google Document:
    • Navigate to the Google Document where you want to implement the script.
  2. Open Apps Script Editor:
    • Click on Extensions in the top menu.
    • Select Apps Script. This action will open the Google Apps Script editor in a new browser tab.

2. Create a New Project

  1. Start a New Project:
    • If prompted, click on New project to create a fresh script.
  2. Name Your Project:
    • Click on Untitled project at the top left and rename it to something meaningful, like “Remove Code Patterns”.

3. Paste the Script

  1. Delete Existing Code:
    • In the script editor, delete any existing boilerplate code to start with a clean slate.
  2. Paste the Provided Script:
    • Insert the following script into the editor:
/**
 * Removes multi-paragraph language-specific text patterns from a specified Google Document.
 * Patterns removed:
 *   - A paragraph containing one of the specified language keywords,
 *     immediately followed by a paragraph containing 'Copy code'.
 * The removal is case-insensitive.
 */
function removeMultiParagraphTextPatterns() {
  try {
    // Replace 'YOUR_DOCUMENT_ID_HERE' with your actual Document ID
    var DOCID = 'YOUR_DOCUMENT_ID_HERE';
    
    // Open the document by ID
    var doc = DocumentApp.openById(DOCID);
    var body = doc.getBody();
    
    // Define the list of languages/keywords
    var keywords = [
      'makefile',
      'bash',
      'css',
      'html',
      'javascript',
      'arduino',
      'php',
      'scss',
      'kotlin',
      'csharp',
      'sql',
      'lua',
      'python'
    ];
    
    // Convert keywords to lowercase for case-insensitive comparison
    var keywordsLower = keywords.map(function(word) {
      return word.toLowerCase();
    });
    
    // Get all paragraphs in the document
    var paragraphs = body.getParagraphs();
    
    // Counter for removals
    var removalCount = 0;
    
    // Iterate through paragraphs from the end to the beginning
    for (var i = paragraphs.length - 2; i >= 0; i--) {
      var currentParaText = paragraphs[i].getText().trim().toLowerCase();
      var nextParaText = paragraphs[i + 1].getText().trim().toLowerCase();
      
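      // Check if current paragraph matches any keyword and next paragraph is 'copy code'
      if (keywordsLower.includes(currentParaText) && nextParaText === 'copy code') {
        // Remove the 'Copy code' paragraph first. Docs refuses to remove the
        // final paragraph of a document section, so clear its text in that case.
        if (i + 1 === paragraphs.length - 1) {
          paragraphs[i + 1].clear();
        } else {
          // removeFromParent() also works when the paragraph is not a direct child of the body
          paragraphs[i + 1].removeFromParent();
        }
        // Then remove the keyword paragraph
        paragraphs[i].removeFromParent();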
        
        removalCount += 2;
        
        // Log the removal
        Logger.log('Removed patterns at paragraphs ' + (i + 1) + ' and ' + (i + 2));
        
        // To avoid quota limits, save and close the document periodically
        if (removalCount % 20 === 0) { // Adjust the batch size as needed
          doc.saveAndClose();
          // Reopen the document to continue making changes
          doc = DocumentApp.openById(DOCID);
          body = doc.getBody();
          paragraphs = body.getParagraphs(); // Refresh the paragraphs after reopening
        }
      }
    }
    
    // Final save after all removals
    doc.saveAndClose();
    
    // Log a confirmation message
    Logger.log("Specified multi-paragraph text patterns have been removed successfully. Total removals: " + removalCount);
    
  } catch (e) {
    Logger.log('An error occurred: ' + e.message);
  }
}

/**
 * Adds a custom menu to the Google Docs UI for easy access to the script.
 */
function onOpen() {
  DocumentApp.getUi()
    .createMenu('Custom Scripts')
    .addItem('Remove Text Patterns', 'removeMultiParagraphTextPatterns')
    .addToUi();
}

4. Configure the Script

  1. Set the Document ID:
    • Locate the line: var DOCID = 'YOUR_DOCUMENT_ID_HERE';
    • Replace 'YOUR_DOCUMENT_ID_HERE' with the actual ID of your Google Document.
    How to Find the Document ID:
    • Open your Google Document and look at the URL in your browser’s address bar. It will resemble: https://docs.google.com/document/d/ABC123XYZ456/edit
    • The Document ID is the part between /d/ and /edit, e.g., ABC123XYZ456.
    var DOCID = 'ABC123XYZ456'; // Replace with your actual Document ID
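
If you would rather paste the whole URL than pick out the ID by hand, a small helper along these lines could extract it. The function name extractDocId is hypothetical, and the sketch assumes the URL has the /document/d/<ID>/ shape shown above. Apps Script also provides DocumentApp.openByUrl if you prefer to skip the ID entirely.

```javascript
// Hypothetical helper, not part of the original script.
function extractDocId(url) {
  // Capture everything between "/document/d/" and the next "/".
  var match = /\/document\/d\/([^\/]+)/.exec(url);
  return match ? match[1] : null;
}

// Uses the placeholder URL from the step above.
var DOCID = extractDocId('https://docs.google.com/document/d/ABC123XYZ456/edit');
```

The helper returns null when the URL does not contain a document ID, so it is worth checking the result before passing it to DocumentApp.openById.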

5. Save and Authorize the Script

  1. Save the Script:
    • Click on the floppy disk icon or press Ctrl + S (Windows) / Cmd + S (Mac) to save your project.
  2. Authorize Permissions:
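    • In the function dropdown next to the Run button, select removeMultiParagraphTextPatterns, then click Run (▶️). Note that this runs the cleanup on the configured document right away.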
    • A dialog will appear prompting you to authorize the script to access your Google Document.
    • Follow the on-screen instructions to grant the necessary permissions.
    Note: Google Apps Script requires authorization to interact with your documents. Ensure you’re comfortable granting these permissions and understand that the script will have the ability to modify your document.

6. Run the Script

  1. Use the Custom Menu:
    • Return to your Google Document and reload the page so that the onOpen function runs.
    • You’ll notice a new menu called Custom Scripts in the top menu bar.
    • Click on Custom Scripts and select Remove Text Patterns.
  2. Monitor Execution:
    • The script will execute, scanning your document for the specified patterns and removing them.
    • Depending on the size of your document and the number of patterns, this process may take a few moments.
  3. Check the Logs:
    • To verify the script’s actions, go back to the Apps Script editor.
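    • Open Executions in the left sidebar and expand the latest run to see detailed logs of the removals made. When you run the function from the editor itself, the Execution log panel shows the same output. (In the legacy editor, this was View > Logs.)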

7. Verify the Changes

  1. Review Your Document:
    • Open your Google Document.
    • Ensure that all instances of patterns like 'HTML\nCopy code', 'CSS\nCopy code', etc., have been successfully removed.
  2. Check for Unintended Modifications:
    • Ensure that only the targeted patterns were removed and that other content remains unaffected.

How the Script Works

Understanding the mechanics of the script enhances your ability to customize and troubleshoot it. Here’s a breakdown of its functionality:

1. Initialization

  • Document Access:
    • The script accesses your Google Document using the provided Document ID.
    • var doc = DocumentApp.openById(DOCID); opens the document for editing.
  • Keyword Definition:
    • An array of language keywords (e.g., html, css, javascript) is defined.
    • These keywords represent the patterns you want to target for removal.
  • Case-Insensitive Matching:
    • The keywords are converted to lowercase to facilitate case-insensitive comparisons.
    • This ensures that variations like HTML, Html, or html are all recognized.

2. Paragraph Iteration

  • Reverse Traversal:
    • The script iterates through the document’s paragraphs from the end to the beginning.
    • This approach prevents issues related to shifting indices when removing elements.
  • Pattern Detection:
    • For each paragraph, the script checks if it matches any of the specified keywords.
    • If a match is found, it then checks if the immediately following paragraph contains the phrase Copy code.
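
To see why the direction of traversal matters, here is a plain-array analogy. It is only a sketch: the script itself works on Google Docs paragraph elements rather than on an array with splice, and the function names are made up for illustration.

```javascript
// Forward deletion: after splice, later items shift left and one gets skipped.
function removeForward(items, unwanted) {
  var copy = items.slice();
  for (var i = 0; i < copy.length; i++) {
    if (copy[i] === unwanted) {
      copy.splice(i, 1);
    }
  }
  return copy;
}

// Backward deletion: only positions that were already visited shift.
function removeBackward(items, unwanted) {
  var copy = items.slice();
  for (var i = copy.length - 1; i >= 0; i--) {
    if (copy[i] === unwanted) {
      copy.splice(i, 1);
    }
  }
  return copy;
}

var forwardResult = removeForward(['a', 'x', 'x', 'b'], 'x');   // ['a', 'x', 'b']
var backwardResult = removeBackward(['a', 'x', 'x', 'b'], 'x'); // ['a', 'b']
```

The forward loop leaves one 'x' behind because the second 'x' moves into the slot that was just checked, while the backward loop removes both.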

3. Pattern Removal

  • Removing Matched Patterns:
    • If both conditions are satisfied (keyword followed by Copy code), the script removes both paragraphs.
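    • The Copy code paragraph is removed first, mirroring the end-to-start traversal. Because the paragraphs are removed by element reference rather than by position, removing one does not invalidate the reference to the other.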
  • Quota Management:
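    • The script saves and closes the document after every 20 removed paragraphs (10 keyword/Copy code pairs) so that pending edits are flushed in batches. This is intended to reduce the risk of failures on long runs; it does not extend the per-execution time limit.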
    • This is managed by the condition if (removalCount % 20 === 0).
    • The document is then reopened to continue processing any remaining patterns.

4. Finalization

  • Saving Changes:
    • After all removals are completed, the script performs a final save and closes the document.
  • Logging:
    • Throughout the process, the script logs each removal action and provides a summary upon completion.
    • This logging is invaluable for verifying the script’s effectiveness and troubleshooting any issues.

5. Custom Menu Integration

  • Ease of Access:
    • The onOpen function adds a Custom Scripts menu to your Google Document.
    • This menu provides a convenient interface to run the Remove Text Patterns function without navigating back to the Apps Script editor.

Best Practices and Considerations

To ensure smooth operation and prevent unintended consequences, keep the following best practices in mind:

1. Always Back Up Your Document

Before running any script that modifies your document’s content, create a backup copy. This precaution safeguards against accidental data loss or unintended modifications.

2. Test on a Sample Document

Before deploying the script on your primary document, test it on a sample document containing the patterns you wish to remove. This ensures the script behaves as expected and allows you to make necessary adjustments without affecting critical content.

3. Adjust Batch Sizes as Needed

The script saves and closes the document after every 20 removals to manage execution quotas. Depending on your document’s size and the number of patterns, you may need to adjust this number:

if (removalCount % 20 === 0) { // Adjust the batch size as needed
  // Save and close logic
}

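Reducing this number flushes changes more often, which can help when a large document with numerous patterns fails partway through, at the cost of more save-and-reopen cycles. Because the counter grows by two for every removed pair, use an even batch size.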
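
One detail worth checking when you change the batch size: the counter goes up by two per removed pair, so the modulo test only fires on even totals. The following sketch simulates the counter without touching a document; flushPoints is a hypothetical name used only for this illustration.

```javascript
// Simulates removalCount for a given number of removed pairs and records
// the counter values at which the save-and-reopen branch would run.
function flushPoints(totalPairs, batchSize) {
  var removalCount = 0;
  var points = [];
  for (var pair = 0; pair < totalPairs; pair++) {
    removalCount += 2; // the script counts both paragraphs of each pair
    if (removalCount % batchSize === 0) {
      points.push(removalCount);
    }
  }
  return points;
}

var evenBatch = flushPoints(30, 20); // [20, 40, 60]
var oddBatch = flushPoints(30, 15);  // [30, 60]
```

With an odd batch size such as 15, the save happens only every 30 removals, twice as rarely as the number suggests.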

4. Monitor Logs for Insights

Regularly check the logs (the Execution log panel or the Executions page in the Apps Script editor; View > Logs in the legacy editor) to monitor the script’s actions. Logs provide detailed information about each removal, aiding in verification and troubleshooting.

5. Extend the Script for Additional Patterns

If you need to target more patterns or modify existing ones, simply update the keywords array with the new terms:

var keywords = [
  'makefile',
  'bash',
  'css',
  'html',
  'javascript',
  'arduino',
  'php',
  'scss',
  'kotlin',
  'csharp',
  'sql',
  'lua',
  'python',
  'ruby', // New keyword added
  'go' // Another new keyword
];

6. Understand Google Apps Script Quotas

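Be aware of Google Apps Script’s quotas and limitations, such as the maximum runtime of a single execution (about six minutes at the time of writing). If a run stops partway through a large document, check the current quota documentation and simply run the script again; patterns that were already removed will not be matched a second time.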
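
One way to stay clear of the runtime limit is to stop the loop early once a time budget is used up. The sketch below assumes a limit of roughly six minutes per execution and keeps a one-minute margin; TIME_BUDGET_MS and makeBudget are hypothetical names, and the current limit should be confirmed against Google’s quota documentation.

```javascript
// Assumption: about six minutes per execution, so stop after five.
var TIME_BUDGET_MS = 5 * 60 * 1000;

// Returns a function that reports whether the budget is still available.
// Timestamps are passed in explicitly so the logic is easy to try out.
function makeBudget(startMs, budgetMs) {
  return function (nowMs) {
    return nowMs - startMs < budgetMs;
  };
}

// Example with fixed timestamps instead of the real clock.
var hasTimeLeft = makeBudget(0, TIME_BUDGET_MS);
var earlyCheck = hasTimeLeft(1000);          // true: one second in
var lateCheck = hasTimeLeft(5 * 60 * 1000);  // false: budget used up
```

In the script, this could be created with makeBudget(Date.now(), TIME_BUDGET_MS) before the loop, and the loop could break when hasTimeLeft(Date.now()) returns false. A second run then picks up the remaining patterns.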

Extending the Script

While the current script effectively removes multi-paragraph patterns, you may encounter documents where the keyword and Copy code sit inside a single paragraph, separated by a manual line break (Shift + Enter). To handle such cases, you can enhance the script to detect and remove these patterns as well.

Handling Single-Paragraph Patterns

Example Pattern:

HTML
Copy code

Enhanced Script Segment:

// Build the regex once, outside the loop. getText() may return a manual line
// break as \r (or \v) rather than \n, so the character class accepts any of them.
var patternRegex = new RegExp('^(' + keywords.join('|') + ')\\s*[\\r\\n\\v]\\s*Copy code$', 'i');

// Iterate through paragraphs from the end to the beginning
for (var i = paragraphs.length - 1; i >= 0; i--) {
  var para = paragraphs[i];
  var paraText = para.getText().trim();

  // Check for the keyword and 'Copy code' within the same paragraph
  if (patternRegex.test(paraText)) {
    body.removeChild(para);
    removalCount += 1;
    Logger.log('Removed pattern within paragraph ' + (i + 1));

    // Save periodically
    if (removalCount % 20 === 0) {
      doc.saveAndClose();
      doc = DocumentApp.openById(DOCID);
      body = doc.getBody();
      paragraphs = body.getParagraphs();
    }
  }
}

Integration:

You can incorporate this segment into your main loop or create a separate function to handle single-paragraph patterns, ensuring comprehensive cleanup of your document.
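
If you later add keywords that contain regex metacharacters (for example a hypothetical 'c++' entry), joining them directly into a pattern would break the expression. The following sketch escapes each keyword first and can be tested on plain strings; escapeRegExp and buildSingleParagraphRegex are illustrative names, not part of the original script.

```javascript
// Escapes characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Builds a case-insensitive pattern: keyword, a line break, then "Copy code".
// The break may be \r, \n, or \v depending on how the text was produced.
function buildSingleParagraphRegex(keywords) {
  var alternatives = keywords.map(escapeRegExp).join('|');
  return new RegExp('^(?:' + alternatives + ')\\s*[\\r\\n\\v]\\s*copy code$', 'i');
}

var singleParagraphRegex = buildSingleParagraphRegex(['html', 'c++']);
var matchesCarriageReturn = singleParagraphRegex.test('HTML\rCopy code'); // true
var matchesEscapedKeyword = singleParagraphRegex.test('c++\nCopy code');  // true
var matchesSameLine = singleParagraphRegex.test('html Copy code');        // false
```

Requiring an explicit line-break character keeps ordinary sentences such as "html Copy code" on one line from being removed by accident.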

Conclusion

Automating the removal of specific text patterns in Google Docs using Google Apps Script can significantly enhance your productivity and maintain the professionalism of your documents. Whether you’re cleaning up code snippets, removing repetitive prompts, or managing complex formatting, this script provides a robust solution tailored to your needs.

By following the step-by-step guide outlined in this post, you can implement and customize the script to fit various scenarios, ensuring your documents remain clean, organized, and free from unwanted clutter. Embrace the power of automation with Google Apps Script and transform the way you manage your Google Docs today!


Pro Tip: Always remember to back up your documents before running scripts that alter their content. This simple step can save you from potential data loss and provide peace of mind as you automate your workflows.