Saturday, August 28, 2010

Closing a Google Apps project: downloading sites

When MN Special Hockey signed up with a web hosting project, I needed to transfer the domain.

The domain was managed through my Dreamhost account (love 'em - click here to evaluate my discount on their ISP services), where it had Google Apps services associated with it. I really didn't know what would happen to the account data when I switched the host name away, so I decided to see what I could back up.

Naturally I turned to Google's Data Liberation Front - my heroes. They provide a Java app for this purpose. It can allegedly be used to move a site from one location to another! 
Sites - the Data Liberation Front
A site can be downloaded using the open-source tool available at the Google Sites Liberation Project Page. Note that this tool requires Java 1.5 or later. Once the tool has been downloaded, double-click on the JAR file to launch the application. 

It opened easily on OS X 10.6.

You will need the documentation. Instead of www.mnspecialhockey.org (didn't fit their model) I tried sites.mnspecialhockey.org and I found two sites to copy:
  • http://sites.google.com/a/mnspecialhockey.org/www/
  • http://sites.google.com/a/mnspecialhockey.org/mn-special-hockey-site/
From their documentation I think the values are:
  • host: sites.google.com (optional)
  • domain: mnspecialhockey.org (optional)
  • webspace: www and mn-special-hockey-site (two webspaces)
There's an option to download revisions, but I didn't care about that. I was fine with the latest version.
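
To make the mapping concrete, here is a minimal Python sketch of how I read those values out of a Sites URL. It is not part of Google's tool; the function name split_sites_url is my own invention, and it assumes the URL follows the /a/<domain>/<webspace>/ shape of the two sites above.

```python
from urllib.parse import urlparse


def split_sites_url(url):
    """Split an Apps-domain Google Sites URL into host, domain, and webspace.

    Hypothetical helper: assumes a path shaped like /a/<domain>/<webspace>/...
    """
    parsed = urlparse(url)
    # Drop empty segments produced by leading and trailing slashes.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 3 or parts[0] != "a":
        raise ValueError("not an Apps-domain Sites URL: " + url)
    return {"host": parsed.netloc, "domain": parts[1], "webspace": parts[2]}


if __name__ == "__main__":
    for url in (
        "http://sites.google.com/a/mnspecialhockey.org/www/",
        "http://sites.google.com/a/mnspecialhockey.org/mn-special-hockey-site/",
    ):
        print(split_sites_url(url))
```

Each result gives the three values to type into the app, one run per webspace.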

It worked, but the result was not pretty! Without style sheets it renders poorly in Safari, and some of the character encoding was off. Still, the bulk of the content was there, including associated PDFs that were a part of the site.

The Java app is pretty simple. You have to quit and restart to do a second site. Still, did I mention it worked?

2 comments:

Nick said...

Is the same true today (several days later) for your Google Docs showing up via the URL? I'm curious to see if you were able to access them because propagation hadn't fully happened yet. After the standard "72-hour" wait, it would be nice to know if they're still there. Please let us know.

- NICK (WHO'S still holding on for a YES answer so he can get his own docs back)

John Gordon said...

Nick, it's been several days and I can still access the pages. It is a bit weird; I do wonder if Google will eventually clear everything out.