Content generation with Google Workspace Add-ons

Oct 21, 2020

Back when G Suite, err… Google Workspace Add-ons launched, support for Docs, Sheets, and Slides was conspicuously absent. Sure, they’ve long supported their own flavor of add-ons, but the idea of building an add-on two different ways didn’t sit well with me. Thankfully I didn’t have to wait too long for the editors to catch up. I know it felt like an eternity, but let’s just chalk that up to 2020 being what it is.

Now that add-ons work in the editors, I had the chance to build a few small demos to try them out. The two samples I built both fall into the category of content generation: one inserts text, the other images. There are lots of potential add-ons that do exactly this: inserting generated charts or diagrams for a report or sales pitch, grabbing images from a stock photography catalog for a slide deck, adding canned snippets of text for an RFP, or using AI-driven text generation and summarization to accelerate writing.

As fun as building some of those might be, I wanted to build something simple that shows what could be done without going down too deep a hole. For text, what better way to show inserting text than the classic Lorem Ipsum! Ok ok ok… To be fair, Pirate Ipsum, Zombie Ipsum, Cat Ipsum or one of the many other variants would be better, but I’m sticking with the classics for this one.

For images, I went with something more useful — QR Codes. They’re simple to generate and there are plenty of good reasons for adding them to a document or presentation, even emails. But even if you don’t care about random placeholder text or QR codes, it’s just a small jump from these to inserting whatever text or image content matters to you.

Let’s jump into how they’re built.

Enable the editors in the Apps Script manifest

All add-ons start with a manifest that describes the features and entry points. For the editors, only the homepageTrigger entry points are supported. Since the home page runs in the context of an active editing session, there’s naturally more context available than in other hosts — the document itself.

"docs": {
  "homepageTrigger": {
    "runFunction": "onHomepage"
  }
},
"sheets": {
  "homepageTrigger": {
    "runFunction": "onHomepage"
  }
},
"slides": {
  "homepageTrigger": {
    "runFunction": "onHomepage"
  }
},
"gmail": {
  "composeTrigger": {
    "draftAccess": "NONE",
    "selectActions": [{
      "runFunction": "onHomepage",
      "text": "Insert QR Code"
    }]
  }
}

As an aside, there’s a bit of duplication here and I could just as easily have used a single homepage trigger for the entire add-on:

"common": {
  "homepageTrigger": {
    "runFunction": "onHomepage"
  }
},
"docs": {},
"sheets": {},
"slides": {},
"gmail": {
  ...
}

But since these two add-ons only work contextually, I thought it best to define the entry points on a per-host basis.
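
To make the difference between the two manifest layouts concrete, here is a minimal sketch of how the fallback is understood to work: a host-specific homepageTrigger wins, otherwise the common one applies, and a host with no section at all is not enabled. The platform does this resolution itself; `resolveHomepageFunction`, the `addOns` object, and the `onSheetsHomepage` name are hypothetical and exist only for illustration.

```javascript
// Hypothetical model of the manifest's add-on section as a plain object.
const addOns = {
  common: { homepageTrigger: { runFunction: 'onHomepage' } },
  docs: {}, // Enabled, no trigger of its own: falls back to common.
  sheets: { homepageTrigger: { runFunction: 'onSheetsHomepage' } },
  // No "slides" entry: assumed to mean the add-on is not enabled there.
};

// Returns the name of the function that would build the home page for a
// host, or null when the host is not enabled or no trigger is defined.
function resolveHomepageFunction(addOns, host) {
  const hostConfig = addOns[host];
  if (!hostConfig) {
    return null;
  }
  const common = addOns.common || {};
  const trigger = hostConfig.homepageTrigger || common.homepageTrigger;
  return trigger ? trigger.runFunction : null;
}
```

With this model, `resolveHomepageFunction(addOns, 'docs')` resolves to the common trigger while Sheets keeps its own, which is the trade-off between the two manifest styles shown above.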

Request permissions

The other important part is listing the scopes the add-on needs.

"oauthScopes": [
  "https://www.googleapis.com/auth/documents.currentonly",
  "https://www.googleapis.com/auth/presentations", 
  "https://www.googleapis.com/auth/spreadsheets",
  "https://www.googleapis.com/auth/script.external_request",
  "https://www.googleapis.com/auth/gmail.addons.execute",
  "https://www.googleapis.com/auth/gmail.addons.current.action.compose"
],

In this case the add-on needs access to update the documents. Gmail needs some special scopes unique to compose-time add-ons as well. Lastly, the external_request scope allows calling the underlying web APIs that generate the lorem ipsum text and QR codes. I’m lazy and even with the new V8 runtime, it’s often a chore trying to get Node.js packages to run in Apps Script. Yes, they’re both JavaScript and both use V8, but the environments are very different. Calling an API with UrlFetchApp is much easier, so https://loripsum.net/ and http://goqr.me/api/ to the rescue!
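
As a rough idea of what calling the text API with UrlFetchApp can look like, here is a minimal sketch of a `generateText` helper. The URL layout (`/api/<paragraphs>/<length>/plaintext`) is an assumption based on the loripsum.net documentation, and `buildLoremIpsumUrl` plus the injectable `fetcher` parameter are hypothetical names added for illustration; the add-ons on GitHub may do this differently.

```javascript
// Builds the request URL, e.g. https://loripsum.net/api/2/short/plaintext
// (URL layout assumed from the loripsum.net docs).
function buildLoremIpsumUrl({ paragraphs = 1, length = 'short', style = 'html' } = {}) {
  const parts = [
    'https://loripsum.net/api',
    encodeURIComponent(String(paragraphs)),
    encodeURIComponent(length),
  ];
  if (style === 'plaintext') {
    parts.push('plaintext'); // Without this flag the API returns HTML.
  }
  return parts.join('/');
}

// Fetches the generated text. `fetcher` defaults to Apps Script's
// UrlFetchApp; passing a stand-in makes the helper testable elsewhere.
function generateText(options, fetcher = UrlFetchApp) {
  const response = fetcher.fetch(buildLoremIpsumUrl(options));
  return response.getContentText();
}
```

The default parameter is only evaluated when no fetcher is passed, so the same function runs unchanged inside Apps Script and in a local test with a fake fetcher.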

There’s more to the manifest, but it’s mostly cosmetic details. You can view the full version for both add-ons on GitHub.

Display the home card

Both add-ons require the user to enter some information and confirm inserting content: how much text to generate, or the data to encode in the QR code. They’re very similar, so let’s just take a look at one of them, the lorem ipsum generator.

// The name must match the "runFunction" value in the manifest exactly.
function onHomepage(event) {
  let builder = CardService.newCardBuilder();

  let numParagraphsInput = CardService.newTextInput()
    .setTitle('Number of paragraphs')
    .setFieldName('paragraphs')
    .setHint('Enter # of paragraphs to generate')
    .setValue('1');
  
  let lengthInput = CardService.newSelectionInput()
    .setTitle('Average length of paragraph')
    .setFieldName('length')
    .setType(CardService.SelectionInputType.DROPDOWN)
    .addItem('Short', 'short', true)
    .addItem('Medium', 'medium', false)
    .addItem('Long', 'long', false)
    .addItem('Very long', 'verylong', false);
           
  let submitAction = CardService.newAction()
    .setFunctionName('onGenerateText')
    .setLoadIndicator(CardService.LoadIndicator.SPINNER);
  let submitButton = CardService.newTextButton()
    .setText('Generate and insert text')
    .setOnClickAction(submitAction);
  
  let optionsSection = CardService.newCardSection()
    .addWidget(numParagraphsInput)
    .addWidget(lengthInput)
    .addWidget(submitButton);

  builder.addSection(optionsSection);
  return builder.build();
}

When the user clicks the button, the action callback runs and receives the form values in the event object.

function onGenerateText(e) {
  let options = {
    paragraphs: e.formInput.paragraphs,
    length: e.formInput.length
  };
  // Insert content....
}
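
Form values arrive as strings, so it can be worth normalizing them before calling out to an API. The following is a minimal sketch of that step; `parseOptions`, `ALLOWED_LENGTHS`, and the cap of 20 paragraphs are hypothetical choices for illustration and are not taken from the add-on's source.

```javascript
const ALLOWED_LENGTHS = ['short', 'medium', 'long', 'verylong'];
const MAX_PARAGRAPHS = 20; // Arbitrary cap chosen for this sketch.

// Turns raw form input into safe options: a paragraph count between 1 and
// MAX_PARAGRAPHS, and a length taken from the dropdown's known values.
function parseOptions(formInput) {
  const input = formInput || {};
  const parsed = parseInt(input.paragraphs, 10);
  const paragraphs = Number.isNaN(parsed)
    ? 1
    : Math.min(Math.max(parsed, 1), MAX_PARAGRAPHS);
  const length = ALLOWED_LENGTHS.includes(input.length) ? input.length : 'short';
  return { paragraphs, length };
}
```

Inside the callback this would be used as `let options = parseOptions(e.formInput);`, falling back to one short paragraph when the input is missing or not a number.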

Again, the QR code version is very similar, just a different set of options for the data to encode and requested image size.

Insert text and images

Here’s where things get a little tricky. Inserting content varies a lot depending on the host application. The add-on event indicates which application is the host for the add-on and the add-on can switch implementations on that value. Inserting text in any individual host isn’t a lot of code, but to be honest, having to write so many variations of nearly identical code bothers me. There certainly are use cases where I’d want the ability to precisely control the behavior for each host, but for this one? Nope, rather have some little helpers do the work for me.

For text insertion, it ends up looking like this.

let content = '';
switch (e.hostApp) {
case 'docs':
  let doc = DocumentApp.getActiveDocument();
  let cursor = doc.getCursor();
  if (!cursor) {
    return notify('Unable to insert text, no cursor.');
  }
  content = generateText({ style: 'plaintext', ...options});
  cursor.insertText(content);
  return notify('Text inserted');
case 'slides':
  let slides = SlidesApp.getActivePresentation();
  let textRange = slides.getSelection().getTextRange();
  if (!textRange) {
    return notify('Unable to insert text, no cursor.');
  }
  content = generateText({ style: 'plaintext', ...options});
  textRange.insertText(0, content);
  return notify('Text inserted');
case 'sheets':
  let sheets = SpreadsheetApp.getActiveSpreadsheet();
  let cell = sheets.getCurrentCell();
  if (!cell) {
    return notify('Unable to insert text, no cursor.');
  }
  content = generateText({ style: 'plaintext', ...options});
  cell.setValue(content);
  return notify('Text inserted');
case 'gmail':
  content = generateText({style: 'html', ...options});
  let updateAction = CardService.newUpdateDraftBodyAction()
    .addUpdateContent(content, CardService.ContentType.MUTABLE_HTML)
    .setUpdateType(CardService.UpdateDraftBodyType.IN_PLACE_INSERT);
  return CardService.newUpdateDraftActionResponseBuilder()
      .setUpdateDraftBodyAction(updateAction)
      .build();
default:
  return notify('Host app not supported.');
}
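
The notify helper used in each branch isn't shown here. A minimal sketch, assuming it simply wraps the message in a notification and returns it as the action response, might look like the following. The optional `cardService` parameter is a hypothetical addition so the helper can be exercised outside Apps Script.

```javascript
// Builds an action response that shows `message` as a notification.
// `cardService` defaults to the Apps Script CardService global.
function notify(message, cardService = CardService) {
  const notification = cardService.newNotification().setText(message);
  return cardService.newActionResponseBuilder()
    .setNotification(notification)
    .build();
}
```

Because every non-Gmail branch returns the result of notify, the user gets feedback whether the insert succeeded or was skipped.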

Images are very similar:

let imageUrl = generateQrCodeUrl(data, width);
let image;
switch (e.hostApp) {
case 'docs':
  let doc = DocumentApp.getActiveDocument();
  let cursor = doc.getCursor();
  if (!cursor) {
    return notify('Unable to insert image, no cursor.');
  }
  image = UrlFetchApp.fetch(imageUrl);
  cursor.insertInlineImage(image);
  return notify('QR code inserted');
case 'slides':
  let slides = SlidesApp.getActivePresentation();
  let page = slides.getSelection().getCurrentPage();
  if (!page) {
    return notify('Unable to insert image, no page selected.');
  }
  image = UrlFetchApp.fetch(imageUrl);
  page.insertImage(image);
  return notify('QR code inserted');
case 'sheets':
  let sheets = SpreadsheetApp.getActiveSpreadsheet();
  let cell = sheets.getCurrentCell();
  if (!cell) {
    return notify('Unable to insert image, no cursor.');
  }
  image = UrlFetchApp.fetch(imageUrl);
  cell.getSheet().insertImage(
      image, 
      cell.getColumn(),
      cell.getRow());
  return notify('QR code inserted');
case 'gmail':
  let html = `<img style="display: block" src="${imageUrl}"/>`;
  let updateAction = CardService.newUpdateDraftBodyAction()
    .addUpdateContent(html, CardService.ContentType.MUTABLE_HTML)
    .setUpdateType(CardService.UpdateDraftBodyType.IN_PLACE_INSERT);
  return CardService.newUpdateDraftActionResponseBuilder()
      .setUpdateDraftBodyAction(updateAction)
      .build();
default:
  return notify('Host app not supported.');
}

In both cases, Gmail is a bit of an outlier. Interaction isn’t via the built-in APIs in Apps Script, but rather through returning a response from the action instructing the host app to do the work for us.

Similarly, while the other hosts expect the actual image content as a Blob, Gmail requires an HTML <img> tag. Compose-time actions in Gmail don’t support adding attachments and data URIs aren’t allowed in images either. This means the image must be served from a publicly accessible URL. That’s not a problem in this case since the QR code API can be used as the source for the image, but it’s not hard to imagine cases where add-ons will need to host their generated images in a GCS bucket or elsewhere.
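
Since the QR code API doubles as the image host, the URL builder is the piece that matters for Gmail. Here is a minimal sketch of what `generateQrCodeUrl` could look like. The endpoint and its `size` and `data` parameters are assumptions based on the goqr.me API documentation, and the default width of 200 is a hypothetical choice.

```javascript
// Builds a URL that renders `data` as a square QR code image.
// Endpoint and parameter names assumed from the goqr.me API docs.
function generateQrCodeUrl(data, width = 200) {
  const size = `${width}x${width}`;
  return 'https://api.qrserver.com/v1/create-qr-code/' +
      `?size=${size}&data=${encodeURIComponent(data)}`;
}
```

Encoding the data with encodeURIComponent matters here because the payload is often itself a URL with its own query string.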

Watch out for icebergs

Easy, right? Sort of.

Yes, the implementations here are straightforward. With the exception of Gmail, which contains a tiny bit of HTML, this is all plain text and single images. Inserting richly formatted text is more complicated, particularly when the text is already in a format like HTML or Markdown. That means parsing and transforming the markup into the right set of API calls for the host app.
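
To give a feel for that extra work, here is a deliberately tiny sketch that handles a single kind of markup, `**bold**`, in Docs. It strips the markers, records which character ranges were bold, inserts the plain text, and then applies the formatting. `parseBoldRanges` and `insertBoldMarkdown` are hypothetical helpers; real HTML or Markdown needs a proper parser and many more cases.

```javascript
// Splits text such as "plain **bold** plain" into the unformatted string
// plus the character ranges that should be bold. Only "**" pairs are handled.
function parseBoldRanges(markdown) {
  const pattern = /\*\*(.+?)\*\*/g;
  const ranges = [];
  let text = '';
  let lastIndex = 0;
  let match;
  while ((match = pattern.exec(markdown)) !== null) {
    text += markdown.slice(lastIndex, match.index);
    const start = text.length;
    text += match[1];
    ranges.push({ start, endInclusive: text.length - 1 });
    lastIndex = pattern.lastIndex;
  }
  text += markdown.slice(lastIndex);
  return { text, ranges };
}

// Inserts the plain text at a Docs cursor position, then bolds each range.
// Offsets are relative to the newly inserted Text element.
function insertBoldMarkdown(cursor, markdown) {
  const { text, ranges } = parseBoldRanges(markdown);
  const inserted = cursor.insertText(text);
  if (!inserted) {
    return false; // The cursor position does not allow inserting text.
  }
  ranges.forEach(({ start, endInclusive }) => {
    inserted.setBold(start, endInclusive, true);
  });
  return true;
}
```

Slides and Sheets would each need their own version of the second function, which is where the per-host cost of rich text starts to add up.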

Similarly, replacing selected text is slightly different from inserting text at the cursor. It’s doable, just another use case to be aware of when building and testing an add-on. Neither of these two add-ons demanded that as a feature, but yours might.
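
For Docs, the difference looks roughly like the sketch below. It assumes the simplest case, a selection that covers part of a single text run, and gives up on anything more complex. `replaceSelectionOrInsert` is a hypothetical helper, and `doc` is expected to be the result of DocumentApp.getActiveDocument().

```javascript
// Replaces the selected text when there is a selection, otherwise inserts
// at the cursor. Returns true if the document was changed.
function replaceSelectionOrInsert(doc, content) {
  const selection = doc.getSelection();
  if (selection) {
    const elements = selection.getRangeElements();
    const first = elements[0];
    if (elements.length !== 1 || !first.isPartial()) {
      return false; // Multi-element or whole-element selections need more care.
    }
    const text = first.getElement().asText();
    const start = first.getStartOffset();
    text.deleteText(start, first.getEndOffsetInclusive());
    text.insertText(start, content);
    return true;
  }
  const cursor = doc.getCursor();
  if (!cursor) {
    return false;
  }
  return cursor.insertText(content) !== null;
}
```

Selections that span several paragraphs, or that include images and tables, are the cases most likely to need host-specific handling and extra testing.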

Next up

While inserting content covers a broad range of add-ons for editors, it’s just part of the overall landscape. Consuming and transforming content are also powerful patterns that fit a wide variety of use cases — readability scoring, toxicity or tone analysis, data loss prevention, publishing, and so on. Stay tuned for more, and in the meantime learn more about how to build an add-on here.