Codementor Events

Create PDF files from templates with Python and Google Scripts

Published Dec 09, 2018Last updated Jun 06, 2019
Create PDF files from templates with Python and Google Scripts

Often, it's useful to create PDF files from your Python scripts. Whether you're creating invoices, letters, reports, or any other documents that contain a lot of formatting repetition but only a little bit of dynamic content, adding some automation can save you many hours.

You have a few options for this. The usual ones are:

  1. Use a PDF library like reportlab to generate PDF files directly (e.g. https://www.blog.pythonlibrary.org/2010/03/08/a-simple-step-by-step-reportlab-tutorial/)
  2. Use an HTML templating library like Jinja2 and convert from HTML to PDF (e.g. see http://pbpython.com/pdf-reports.html)
  3. Use a third-party API like https://pdfgeneratorapi.com/.

For Option 1, generating PDFs directly from inside Python can make formatting very difficult. You have to draw anything you need element by element, using code, and even once you've got a template looking the way you want it, it's difficult to maintain.

Option 2 can often work better, but you still have to build the Jinja HTML boilerplate, and sometimes the HTML to PDF conversion doesn't come out quite as you expected.

Option 3 requires you to build the template first using an online service's web interface. Although you get a drag-and-drop interface, it's quite clunky and difficult to make your template look as you want. Usually, you also have to pay to use the service.

While one of the options above may work for you, if you don't like any of them, you can also hack together a document generation API based on Google Drive. You'll get a free API, and you'll be able to use Google Docs as your templating tool, which is quite powerful and has many pre-existing templates for things like invoices, letters, and CVs.

I started off with an invoice template that I found online. It looks like this:

Google Docs Invoice Template

In this tutorial, we'll walk through creating an API that generates these invoices and lets you programmatically insert the invoice number from an external Python script. In reality, you'd need to do the same for a lot of the other fields, but we'll start with a simple example for demonstration reasons.

We'll be writing a few lines of Google App Script code, and a few lines of Python code.

Creating a template document

Use one of the built in Google Document templates, search online for one that matches your needs, or build your own over at docs.google.com. (You'll need a Google Account).

Add placeholders where you need dynamic information. In the example below, I've added INVOICE NO {invoice_id} in place of the "456" id that I had on the original document. There is nothing special about this syntax -- we'll be using a basic search and replace function later to swap this out for the real information, so use something that's unlikely to actually appear in the final document.

Take a note of your document id, which is the highlighted part in the URL bar.

Invoice template with placeholder

Setting up a custom Google Script

Go to Google Drive, press "New" in the top left corner. Under "More", select "Google Apps Script" if it is available or "Connect more apps" if you don't see it.

Connecting more apps

Search for "apps script" and choose to connect it. You might see some warning messages asking if you trust yourself. Say that you do.

Adding Apps Script

Once you can create a new App Script, you'll see a default blank script that looks as follows.

Blank Google Apps Script

Delete the code that you see there, and replace it with a createDocument function that looks as follows.

function createDocument(invoice_id) {
  var TEMPLATE_ID = '1Ybq8r_SiWu4Z4-_Z6S0IW1L8FJywfpjPAATPCvvkKk8';  
  var documentId = DriveApp.getFileById(TEMPLATE_ID).makeCopy().getId();
  
  drivedoc = DriveApp.getFileById(documentId);
  drivedoc.setName("Invoice " + invoice_id);
  
  doc = DocumentApp.openById(documentId);
  
  var body = doc.getBody();
  
  body.replaceText('{invoice_id}', invoice_id);
  drivedoc.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.EDIT);

  return "https://docs.google.com/document/d/" + documentId + "/export?format=pdf";
}

On line 2, switch out the TEMPLATE_ID with the document ID that you copied from the URL bar on your templated Google Doc.

This code finds the templated doc, creates a copy of it and sets the file name to "Invoice " plus whatever invoice_id we pass in. It then opens the new file through the DocumentApp (instead of the Drive App, so that we can actually get the contents of the file and edit them). It searches the doc for the placeholder we added ({invoice_id}) and replaces it with the actual invoice_id that the function takes as input. It then sets the document to be publicly accessible and returns a URL that will go directly to a PDF export for that document.

Below this function, add another one called doGet. While the previous function can be named anything, doGet is a special function in Google Apps Scripts, so you'll need to name it exactly doGet. This function will handle incoming web requests after we've deployed our app.

The code for the doGet function is as follows. Paste this below the previous createDocument() function.

function doGet(e) {
  var invoice_id = e.parameter.invoice_id;
  var url = createDocument(invoice_id);
  return ContentService.createTextOutput(url);
}

This takes in the invoice_id as a URL parameter, passes this along to our createDocument function that we just wrote, and returns the URL of the created document as plain text.

Publishing our API

From the "Publish" menu, select "Deploy as web app"

Deploy as web app

You'll be asked to name the project. Give it a name like "PDF API" or anything else you want.

Naming the project

You'll see a new menu to set the options for deploying your web app.

Deploy options

Add a message like "initial deploy" under where it says "New" and choose "Anyone, even anonymous" from the access settings. Leave the Execution settings as "Me".

Warning: If you share the link in a public place, people may abuse the service and spam it with automatic requests. Google may lock your account for abuse if this happens, so keep the link safe.

Hit the Deploy button and make a note of the URL that you see on the next pop up.

Your app's URL

Add "?invoice_id=1" to the end of the URL and visit it in your browser. It should look something like

https://script.google.com/macros/s/AKfycbxDiKpTGqMijZmU8-0cPj06DBFjDOPYZJ9IFvhcO111GCh8jqxC/exec?invoice_id=1

If everything went well, you should see a Google Docs link displayed.

Response from our web application

If you visit the URL, a PDF of the invoice with the placeholder switched out with 1 should be downloaded.

Updating the application

If you see an error instead, or don't get a response, you probably made a mistake in the code. You can change it and update the deployment in the same way as you initially deployed it. The update screen is only slightly different from the deploy screen.

Update deployment options

The only tricky thing is that you have to select "New" as the version for every change you make. If you make changes to the code and Update a previous version, the changes won't take effect, which is not obvious from the UI. (You can see it took me a few tries to get this right.).

Creating our invoices from Python

We can now create invoices and save them locally from a Python script. The following code shows how to generate three invoices in a for loop.

import requests

url = "https://script.google.com/macros/s/AKfycbyYL5jhEstkuzZAmZjo0dUIyAmzUc1XL5B-01fHRHx8h63cieXc/exec?invoice_id={}"

invoice_ids = ["123", "456", "789"]

for invoice_id in invoice_ids:
    print("processing ", invoice_id)
    response = requests.get(url.format(invoice_id))
    print("file generated")
    response = requests.get(response.content)
    print("file downloaded")
    with open("invoice{}.pdf".format(invoice_id), "wb") as f:
        f.write(response.content)

Note that the creation and download process is quite slow, so it'll take a few seconds for each invoice you create.

You've probably noticed that this is quite a "hacky" solution to generate PDF files from inside Python. The "replace" functionality is quite limited compared to a proper templating language, and passing data through a get request also has limitations. If you pass through anything more complicated than an invoice ID, you'll to URL encode the data first. You can do this in Python using the urllib.parse module. An example modification of the Python script to deal with more complicated data is as follows.

import requests
import urllib.parse

url = "https://script.google.com/macros/s/AKfycbyYL5jhEstkuzZAmZjo0dUIyAmzUc1XL5B-01fHRHx8h63cieXc/exec?"

invoice_ids = ["A longer ID with special characters $% ! --*?+"]

for invoice_id in invoice_ids:
    print("processing ", invoice_id)
    payload = {"invoice_id": invoice_id}
    u = url + urllib.parse.urlencode(payload)
    response = requests.get(u)
    print("file generated")
    response = requests.get(response.content)
    print(response.content)
    print("file downloaded")
    with open("invoice{}.pdf".format(invoice_id), "wb") as f:
        f.write(response.content)

But there are still limitations of what kind of data and how much you can pass just using URLs, so you would need to modify the script to use POST requests instead if you were sending a lot of dynamic data.

It's also quite slow compared to some of the other methods we discussed at the beginning, and Google has some limitations on how many files you can create automatically in this way.

That said, being able to generate templates using Google Docs can be quick and powerful, so you'll need to assess the tradeoffs for yourself.

Also note that this is quite a contrived example, where we could have run the Python script from within the Google Ecosystem, and avoided needing to set up a public facing API that could potentially be abused if other people discovered the URL. However, you might have an existing Python application, not hosted on Google, that you need to connect with auto generated PDF files, and this method still allows you set up a self-contained "microservice" within the Google ecosystem that allows for easy PDF generation.

Conclusion

If you had any problems with the set up, spot any errors, or know of a better way to generate PDF files in Python, please leave a comment below or ping me on Twitter. You might also enjoy my other tutorials.

Discover and read more posts from Gareth Dwyer
get started
post comments7Replies
Игорь Хижняк
3 years ago

Right now I’m looking for the best approach to create a set of reports containing tables with some data. It looks like I’ll have to generate requests using the Google Docs API.
Does anyone have better ideas?

Bright Uchenna Oparaji
4 years ago

Nice article. If you want to add an image how do you change the file. Thanks.

chin woo
5 years ago

Your article is great and rewarding
https://wordcounter.tools/

Show more replies