[wp-hackers] Porting static content
Paul
paul at codehooligans.com
Tue Feb 22 22:49:58 UTC 2011
Scot,
I've used http://sourceforge.net/projects/simplehtmldom/ a number of times.
I'll send you a working script I use to suck down a site.
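
In the meantime, here is a rough sketch of the crawl-and-scrape step, written in Python since that is what you mentioned being comfortable with. It is not the script I referred to above and it does not use simplehtmldom; it only uses the standard library's html.parser. The names ArticleParser and scrape_directory are made up for illustration, and it assumes each page keeps its headline in <title>, its article markup in <body>, and is saved as UTF-8. Adjust those assumptions to match his template.

```python
from html.parser import HTMLParser
from pathlib import Path


class ArticleParser(HTMLParser):
    """Collect the <title> text and the inner HTML of <body> (assumed layout)."""

    def __init__(self):
        # Keep entities such as &amp; as written so the body markup stays valid.
        super().__init__(convert_charrefs=False)
        self.title_parts = []
        self.body_parts = []
        self._in_title = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif self._in_body:
            # Re-emit the tag exactly as it appeared in the source file.
            self.body_parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/> are copied once, with no extra end tag.
        if self._in_body:
            self.body_parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False
        elif self._in_body:
            self.body_parts.append(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def _emit(self, text):
        if self._in_title:
            self.title_parts.append(text)
        elif self._in_body:
            self.body_parts.append(text)


def scrape_directory(root):
    """Yield one dict per file under root whose name matches *.htm* (.htm, .html)."""
    for path in sorted(Path(root).rglob("*.htm*")):
        parser = ArticleParser()
        # Assumption: files are UTF-8; undecodable bytes are replaced, not fatal.
        parser.feed(path.read_text(encoding="utf-8", errors="replace"))
        parser.close()
        yield {
            "path": str(path),
            "title": "".join(parser.title_parts).strip(),
            "content": "".join(parser.body_parts).strip(),
        }
```

Each dict maps naturally onto a post: on the WordPress side you would pass the title and content to wp_insert_post() as post_title and post_content, one call per page. With 6,000 pages I would run it over a few dozen files first and eyeball the output before importing anything.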
P-
On Feb 22, 2011, at 5:42 PM, Scot Hacker wrote:
> I have a client (an Ethiopian in exile) who has created a very popular static site made up of 6,000 (!) pages... all hand-created in Notepad (yes, the wheels turn differently in some parts of the world). Amazing, I know.
>
> I'm building a WordPress site for him, but the question is how to get all that old static content into the site. Fortunately he's based all the old articles on the same original file, so the document structure is highly regular. In Python/Django I'd write a BeautifulSoup script to crawl the directory, scrape content into objects, and pump it in through the Django API. I'm sure similar solutions exist for PHP/WordPress, but I don't know where to start. Has anyone done a project like this? Do you have a skeleton script to share, or pointers on the best way to proceed?
>
> Thanks,
> Scot