[wp-hackers] Porting static content
shacker at birdhouse.org
Tue Feb 22 22:42:08 UTC 2011
I have a client (an Ethiopian in exile) who has created a very popular static site comprising 6,000 (!) pages... all hand-created in Notepad (yes, the wheels turn differently in some parts of the world). Amazing, I know.
I'm building a WordPress site for him, but the question is how to get all that old static content into the site. Fortunately he's based all the old articles on the same original file, so the document structure is highly regular. In Python/Django I'd write a BeautifulSoup script to crawl the directory, scrape content into objects, and pump it in through the Django API. I'm sure similar solutions exist for PHP/WordPress, but I don't know where to start. Has anyone done a project like this? Do you have a skeleton script to share, or pointers on the best way to proceed?
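To make the scraping half concrete, here is a rough Python sketch of the crawl-and-scrape step. It is only an illustration under assumptions: it uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra packages, it assumes each page keeps its headline in <title> and its article text in a <div id="content"> (a made-up id; the real template will differ), it assumes the files are UTF-8, and it keeps only the text, dropping inner markup. The names ArticleParser, parse_article and crawl are hypothetical.

```python
import os
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Collects the <title> text and the text inside <div id="content">.

    The "content" id is an assumption about the page template.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.body_parts = []
        self._in_title = False
        self._div_depth = 0  # greater than 0 while inside the content div

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "div":
            if self._div_depth > 0:
                self._div_depth += 1  # a div nested inside the content div
            elif dict(attrs).get("id") == "content":
                self._div_depth = 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "div" and self._div_depth > 0:
            self._div_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._div_depth > 0:
            self.body_parts.append(data)


def parse_article(html):
    """Return a dict with the title and whitespace-collapsed body text."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps paragraphs apart, at the cost of a stray
    # space wherever an inline tag splits a word.
    return {
        "title": "".join(parser.title_parts).strip(),
        "content": " ".join(" ".join(parser.body_parts).split()),
    }


def crawl(root):
    """Walk root and parse every .html/.htm file into a record."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith((".html", ".htm")):
                path = os.path.join(dirpath, name)
                # UTF-8 is an assumption; undecodable bytes are replaced.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    record = parse_article(fh.read())
                record["source"] = os.path.relpath(path, root)
                records.append(record)
    return records
```

The resulting list of dicts would still have to be pushed into WordPress somehow, for example by dumping it to a file and looping over it in a small PHP script that calls wp_insert_post(), or by generating an import file; that half is exactly what I'm asking about.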