March 9, 2010

Google AppEngine + Google Docs + Some Python = Simple CMS

I was looking for a CMS that could run on the Google App Engine lately. As the free quotas are pretty high, a GAE powered CMS would make sense for small personal web pages that do not get too much traffic and have to run on a tiny budget. There are some CMSes (e.g. the App Engine Site Creator) but I didn't like them a lot. They are either too restricted or don't seem to have a large user base which makes their future uncertain. Also, as the GAE datastore is not a standard database, you have additional issues: How to backup? Possibility to migrate to another host? How to import or export of data?
My search ended without having found something I really liked...

But while searching, I had an idea: Why not just use a spreadsheet hosted on Google Docs as the base for a CMS? It has a couple of advantages:
  • Simple, everybody knows how to use a spreadsheet application
  • you get an online editor for free
  • Google Docs does versioning, so it's easy to roll back changes
  • Backup/restore and import/export is simple
The GAE app would only need to pull the data from the spreadsheet, apply some post-processing and output it. Not really much for the app to do, so I decided to write a little prove-of-concept application. I was using the following ingredients
  • Google App Engine SDK for Python
  • gdata-python-client, the Python version of the Google Data API to access the spreadsheet
  • Mako, a templating system for Python
  • and some code to glue these together
The app expects 2 tables to be present in the spreadsheet: pages and templates. Both need a content column, the templates table also needs a name column. That's it. It could look like this:

templates:


pages:


There are 2 templates defined, main and defs. The template main is the base template for the html output, defs holds some Mako style methods (more on whose later).

The pages table holds 3 pages. The first one is used as the root document, the others can be got via http://<host>/page/<id>. The content is treated as a Mako template, references to other templates are resolved by looking up the name in the templates table. Therefore the first two lines load methods from the defs template and make the page inherit from main:


<%namespace import="*" file="defs" />
<%inherit file="main"/>


The page content invokes 2 methods, image() and page(). The methods make use of global objects that are injected into the global Mako template context:

  • pageid, which is the id of the page currently rendered
  • page, representing the current page row, a dict with the column names as keys
  • db, the Database object. Row objects can be retrieved like this:
    db.<table>[<row_no>]['<col_name>']

Using the db object any table of the spreadsheet can be accessed. This way the system can be extended in a really easy way. In out little example this is used for the image() method. References to images are stored in a table called images and are referenced by an id:




<%def name="image(id)">
<img src="${db.images[id]['url']}" alt=""/>
</%def>


When rendering the root url, this now results in:


<html>
<head><title>Welcome!</title></head>
<body>
<h1>Welcome!</h1>
<p>

<p>Some text</p>
<img src="http://www.andreaskahler.com/me.png" alt=""/>
<br/>
<a href="page/2">link to page 2</a>

</p>
this is the footer
</body>
</html>

Not really a useful page, but you get the idea.

The python code for the whole thing looks like this:

#!/usr/bin/env python

import sys
import logging
import cgi
from google.appengine.ext import webapp
from google.appengine.ext.webapp import util
import gdata.spreadsheet.text_db

from mako.template import Template
from mako.lookup import TemplateLookup
from mako.runtime import Context
from StringIO import StringIO

email = 'your.google.account@gmail.com'
password = 'yourpasswd'
spreadsheet = 'spreadcms' #name of spreadsheet
table_pages = 'pages'
table_templates = 'templates'
col_name = 'name'
col_content = 'content'

class Db:
""" wrapper for the spreadsheet. places the tables into its __dict__ """
def __init__(self):
self.client = gdata.spreadsheet.text_db.DatabaseClient(username=email, password=password)
self.db = self.client.GetDatabases(name=spreadsheet)[0]
for table in self.db.GetTables():
name = table.entry.title.text
self.__dict__[name] = Table(self, name)

def getTableRecord(self, table, row_no):
return self.db.GetTables(name=table)[0].GetRecord(row_number=row_no).content


class Table:
""" wraps a spreadsheet table. access rows with their index (starting with 1) """
def __init__(self, db, table):
self.db = db
self.table = table

def __getitem__(self, index):
return self.db.getTableRecord(self.table, index)



class MainHandler(webapp.RequestHandler):
""" the main web request handler """
def connect(self):
try:
self.db = Db()
except gdata.spreadsheet.text_db.Error, err:
self.response.out.write(err)
return False
return True

def get(self):
if not self.connect():
return

# get page id, default to 1
pageid = 1
parts = self.request.path.strip('/').split('/')
if len(parts)==2 and parts[0]=="page":
try: pageid = int(parts[1])
except ValueError: pass

page = self.db.__dict__[table_pages][pageid]

mylookup = TemplateLookup(filesystem_checks=False)

#load all templates
i = 1
while True:
try:
template = self.db.__dict__[table_templates][i]
mylookup.put_template(template[col_name], Template(template[col_content], lookup=mylookup))
except:
break
i+=1

content = page[col_content] or ""

# invoke Mako
try:
t = Template(content, lookup=mylookup)
mylookup.put_template("_page"+str(pageid), t)
s = t.render(page=page, pageid=pageid, db=self.db)
self.response.out.write(s)
except Exception,e:
s = "<h1>Error</h1>"
s += "<pre>" + cgi.escape(str(e)) + "</pre>"
s += "<h1>Page</h1>"
s += "<pre>" + cgi.escape(page[col_content]) + "</pre>"
self.response.out.write(s)


def main():
application = webapp.WSGIApplication([('/', MainHandler),
('/page/[0-9]+', MainHandler)],
debug=True)
logging.getLogger().setLevel(logging.DEBUG)
#logging.getLogger().setLevel(logging.WARNING)
util.run_wsgi_app(application)


if __name__ == '__main__':
main()



Of course this is not production-ready code. A lot of error handling is missing and one needs a proper way to handle the gdata login captcha requests (I guess storing a valid token would work). There is also no caching implemented.

This of course is not intended to replace a full-blown CMS. But I guess if you would spend some time fixing these things, this could be suitable for small web sites, where all content is more or less static.