NOTE: The latest version of this file is online here:
http://snarfed.org/pyblosxom2wxr
pyblosxom2wxr.sh is a shell script that migrates content from PyBlosxom
to WordPress. It converts PyBlosxom posts and comments into a WXR
(WordPress eXtensible RSS) file that can be imported into WordPress.
- The post file extension is hard-coded to .txt, since that’s what
- Pages are supported as well as posts. pyblosxom2wxr assumes that post
filenames start with the date in YYYY-MM-DD format, e.g.
2010-07-28_my_post.txt. Files without a prefix in that format are
assumed to be pages. (This is hard-coded but would be easy to change.
Search for the date_re variable.)
- The filename is used as the WordPress post/page GUID, and the first
line of the file is extracted and used as the title. The second line
is assumed to be blank. If your files don’t follow that format, you’ll
want to preprocess them or tweak the script.
- Categories are not (yet) supported. All posts and pages are assigned
to the “uncategorized” category in WordPress.
- WordPress limits import files to 2MB, but pyblosxom2wxr can generate
output files larger than that. If that happens, you can split the file
manually or with a tool like ChoppedPress.
- By default, the last modified time of post and page files is used as
their timestamp. However, if you have a timestamps file from the
hardcodedates PyBlosxom plugin, it will be used instead. The default
path is ../timestamp; you can customize this by editing the
timestamp_file variable in the script.
- If you use Markdown or another markup language where line breaks and
whitespace are meaningful, you’ll want to apply this patch to the
- pyblosxom2wxr doesn’t assign post ids. It omits <wp:post_id> elements
in the output file. This makes WordPress allocate post ids itself.
- However, WordPress won’t allocate comment ids itself, so pyblosxom2wxr
has to do that and populate them in <wp:comment_id> elements. This
means that importing a WXR file generated by pyblosxom2wxr may
overwrite any existing comments!
- If you use PyBlosxom’s compact_comments.sh, comments imported from
-all.cmt files may not be ordered by date. See my page on extracting
compacted PyBlosxom comments for a workaround.
- Posts with more than 256 comments are not supported well. Only the
last 256 comments will be imported, and they will likely be ordered
incorrectly. See the TODO near the end of the script.
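
The filename and title conventions described above can be illustrated with
a rough sketch. This is not the code from pyblosxom2wxr.sh: the function
names entry_type, entry_title, and entry_body are hypothetical, and the
date check uses a shell glob pattern rather than the script's date_re
regular expression. It assumes the first line is the title and the second
line is blank, as noted above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; these names do not come
# from pyblosxom2wxr.sh.

# Print "post" if the file's basename starts with a YYYY-MM-DD date,
# otherwise print "page".
entry_type() {
  case "${1##*/}" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*) echo post ;;
    *) echo page ;;
  esac
}

# Print the title, i.e. the first line of the file.
entry_title() {
  head -n 1 "$1"
}

# Print the body: everything after the title line and the (assumed)
# blank second line.
entry_body() {
  tail -n +3 "$1"
}
```

For example, entry_type 2010-07-28_my_post.txt prints "post", while
entry_type about.txt prints "page". Files that don't follow the
title-then-blank-line layout would need preprocessing before helpers like
these give sensible results.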