This program parses raw HTML pages of Kuro5hin diaries and posts them to a
WordPress blog using the WordPress XML-RPC API.

Requirements
============

To use the program you will need the following software:

* Python 3
* The python-wordpress-xmlrpc package:
  https://pypi.python.org/pypi/python-wordpress-xmlrpc

Of course, you will also need the Kuro5hin diary entries you want to import.
I grabbed mine from "What's Left of K5, AKA Mumble's Archive", described
here:

  https://kr5ddit.com/post/754

Using the Program
=================

This is how I used the data:

1. I downloaded and unzipped this file:

   http://k5.semantic-db.org/diary-slurp/161942--archive-diaries--html-diaries--nested-format.zip

2. My username is "makohill", so I searched through the unzipped entries
   and copied my diary entries into the current directory with a command
   like this one:

   grep -l -r 'HREF="/user/makohill">makohill' LOCATION_OF_ENTRIES|xargs -i cp {} .

3. Once I did that, I modified and imported the data with a command like:

   ./diary_parser.py 2002-12-26-9150-8083.html

By default, the entries are posted with "pending" status so I could check
them first. If you have many entries, you might want to tweak this.

Details on the WordPress XML-RPC API and the Python module I use are here:

  https://codex.wordpress.org/XML-RPC_WordPress_API/Posts
  https://python-wordpress-xmlrpc.readthedocs.io/en/latest/index.html

Copyright and License
=====================

© Benjamin Mako Hill, 2018

This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation, either version 3 of the License, or (at your option)
any later version.
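
Example Sketch
==============

For readers who would rather script the selection step (step 2 under "Using
the Program") than run grep and xargs, and who want to see roughly what
posting with "pending" status looks like, here is a minimal sketch. It is
not the code in diary_parser.py. The function names copy_user_entries and
post_as_pending, their parameters, and the lenient UTF-8 decoding are
assumptions made for illustration. The posting half uses the Client,
WordPressPost, and NewPost names from the python-wordpress-xmlrpc
documentation linked above.

```python
import shutil
from pathlib import Path


def copy_user_entries(source_dir, dest_dir, username):
    """Copy files under source_dir that contain the user's byline link.

    Hypothetical helper that mirrors the grep | xargs cp command in
    step 2. Files are copied flat into dest_dir, so entries that share a
    file name overwrite one another, just as with cp. Returns the names
    of the copied files.
    """
    # Same search string as the grep command, with the username filled in.
    marker = 'HREF="/user/{0}">{0}'.format(username)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(source_dir).rglob("*")):
        if not path.is_file():
            continue
        # Assumption: the archive's encoding is not stated, so decode
        # leniently instead of failing on unexpected bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        if marker in text:
            shutil.copy(path, dest / path.name)
            copied.append(path.name)
    return copied


def post_as_pending(xmlrpc_url, wp_user, wp_password, title, body):
    """Create one WordPress post with "pending" status.

    Hypothetical helper; title and body are whatever was extracted from
    a diary page. Returns the result of the NewPost call, which the
    library documents as the ID of the new post.
    """
    # Imported here so copy_user_entries works without the package.
    from wordpress_xmlrpc import Client, WordPressPost
    from wordpress_xmlrpc.methods.posts import NewPost

    post = WordPressPost()
    post.title = title
    post.content = body
    post.post_status = "pending"  # held for review, not published
    client = Client(xmlrpc_url, wp_user, wp_password)
    return client.call(NewPost(post))
```

Calling copy_user_entries("LOCATION_OF_ENTRIES", "my-entries", "makohill")
would gather the matching files into a separate directory. Keeping the
destination outside the source tree avoids trying to copy a file onto
itself. Changing post_status to "publish" would skip the review step.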