
typepad export

typepad, rss, and atom

So, my efforts to simplify the maintenance of my website and the archiving of my blog contents are snowballing, as any good programming project should. My ultimate objective: to write a set of scripts (I don't care what language - I would prefer Python, Java, or Perl, in that order) that extract content (posts and photo gallery links) from my typepad blog and archive it to my static website, so that I don't have to edit any HTML files or transfer files manually. Ideally, I want to run one program that finds the latest photo galleries on my blog, extracts all new (or all, period) posts from my blogs, writes HTML files in the format my static website uses, and even uploads the files to the site.

Typepad provides an export facility that seems to give me all the posts ever made to each blog, written to a flat file in a parseable format (I've written a simple parser in Python).
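For illustration, here is a minimal sketch of what such a parser might look like. It assumes the export file uses the Movable Type style layout (single-line "KEY: value" metadata, a line of five dashes closing each section, a line of eight dashes closing each entry); that layout is an assumption on my part and should be checked against a real export. The function name parse_export and the sample text are hypothetical.

```python
def parse_export(text):
    """Parse an export file into a list of dicts, one per post.

    Assumed layout (not confirmed against a real export):
      KEY: value        single-line metadata at the top of each entry
      -----             closes a section
      KEY:              starts a multi-line section such as BODY
      --------          closes an entry
    """
    entries = []
    entry, section = {}, []
    in_metadata = True
    for line in text.splitlines():
        if line == "--------":            # end of the current entry
            if entry:
                entries.append(entry)
            entry, section, in_metadata = {}, [], True
        elif line == "-----":             # end of the current section
            if not in_metadata and section:
                key = section[0].rstrip(":").strip().lower()
                entry[key] = "\n".join(section[1:]).strip()
            section, in_metadata = [], False
        elif in_metadata:
            # Split on the first colon only, so times like 03:04:05 survive.
            key, sep, value = line.partition(":")
            if sep:
                # Repeated keys overwrite each other in this sketch.
                entry[key.strip().lower()] = value.strip()
        else:
            section.append(line)
    return entries


if __name__ == "__main__":
    sample = (
        "TITLE: First post\n"
        "DATE: 01/02/2006 03:04:05 PM\n"
        "-----\n"
        "BODY:\n"
        "Hello world.\n"
        "-----\n"
        "--------\n"
    )
    for post in parse_export(sample):
        print(post["title"], "|", post["date"])
```

Keeping each post as a plain dict makes the later step (writing HTML files for the static site) a matter of filling a template from the keys.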

My typepad blog displays a list of photo albums along with thumbnails - I want to add similar links to the "Recent Photos" section on my static website without manually editing the HTML. One way to accomplish this would be to write a script that opens the URL for the blog and parses the resulting HTML to extract the thumbnail links - this is doable in Python using the XML DOM parser (I don't want to use an event-driven SAX parser).
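A rough sketch of the DOM approach, using xml.dom.minidom from the standard library. It assumes the page is well-formed XHTML (minidom raises an error on anything that is not valid XML, which plenty of real pages are not) and that each thumbnail is an img nested inside an a element. The markup in the sample, the example.com URLs, and the function name thumbnail_links are all made up for illustration.

```python
from xml.dom import minidom


def thumbnail_links(xhtml):
    """Return (album_url, thumbnail_url) pairs for every <img> inside an <a>.

    Assumes well-formed XHTML; the real blog markup may differ.
    """
    doc = minidom.parseString(xhtml)
    pairs = []
    for anchor in doc.getElementsByTagName("a"):
        for img in anchor.getElementsByTagName("img"):
            pairs.append((anchor.getAttribute("href"), img.getAttribute("src")))
    return pairs


if __name__ == "__main__":
    # Hypothetical fragment standing in for the album list on the blog page.
    sample = (
        '<div class="photo-albums">'
        '<a href="http://example.com/photos/trip/">'
        '<img src="http://example.com/photos/trip/thumb.jpg" alt="Trip"/></a>'
        '<a href="http://example.com/about.html">About</a>'
        "</div>"
    )
    for album, thumb in thumbnail_links(sample):
        print(album, thumb)
```

Links without an image inside them are skipped, so ordinary navigation links do not end up in the "Recent Photos" list.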

There are drawbacks to both of these approaches, unfortunately. 

Typepad's export feature is only reachable by logging in to the admin tool and manually navigating to the export link. Maybe I can write a Python script that handles the admin login and then requests the export URL for each blog, capturing the output of each to a file.

Parsing the HTML for my blog to find image thumbnail links is doable, but not very pretty. If typepad changes the HTML structure, my parsing program must change too.

In both cases, it would be nicer to query typepad directly for the posts in each blog and for the current list of photo albums. Typepad provides some support for a publishing API called Atom, and for publishing blog entries as RSS. I'm trying to determine how I can use one or both of these APIs to accomplish the objectives described above.
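As a starting point for the RSS side, here is a small sketch that pulls the title, link, and date out of each item in an RSS 2.0 document using xml.etree.ElementTree. It only relies on the standard RSS 2.0 element names (item, title, link, pubDate); whether the typepad feed carries every post or only the most recent ones is something I have not verified. The sample feed and the function name rss_items are invented for the example.

```python
import xml.etree.ElementTree as ET


def rss_items(rss_xml):
    """Return a list of dicts with title, link and pubDate for each <item>."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "pubDate": item.findtext("pubDate", default=""),
        })
    return items


if __name__ == "__main__":
    # Hypothetical feed; a real one would be fetched from the blog's feed URL.
    sample = (
        '<rss version="2.0"><channel><title>My blog</title>'
        "<item><title>First post</title>"
        "<link>http://example.com/2006/01/first.html</link>"
        "<pubDate>Mon, 02 Jan 2006 15:04:05 GMT</pubDate></item>"
        "</channel></rss>"
    )
    for post in rss_items(sample):
        print(post["pubDate"], post["title"], post["link"])
```

Compared with scraping the blog's HTML, this depends only on the feed format, so a change to the blog's page layout should not break it.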

Here are some relevant links to help me decide:
