X-Git-Url: https://projects.mako.cc/source/pyblosxom2wxr/blobdiff_plain/9066cfd27ee72987b460abed6f27c5b820915c6c..63b7950374b42e5bb351e534dfa3c150c3d5a285:/README.snarfed

diff --git a/README.snarfed b/README.snarfed
new file mode 100644
index 0000000..c655079
--- /dev/null
+++ b/README.snarfed
@@ -0,0 +1,58 @@
+NOTE: The latest version of this file is online here:
+  http://snarfed.org/pyblosxom2wxr
+
+pyblosxom2wxr.sh is a shell script that migrates content from PyBlosxom
+to WordPress. It converts PyBlosxom posts and comments into a WXR
+(WordPress eXtensible RSS) file that can be imported into WordPress.
+
+Notes:
+
+- The post file extension is hard-coded to .txt, since that’s what
+  PyBlosxom expects.
+
+- Pages are supported as well as posts. pyblosxom2wxr assumes that post
+  filenames start with the date, in YYYY-MM-DD format, e.g.
+  2010-07-28_my_post.txt. Files without a prefix in that format are
+  assumed to be pages. (This is hard coded but would be easy to change.
+  Search for the date_re variable.)
+
+- The filename is used as the WordPress post/page GUID, and the first
+  line of the file is extracted and used as the title. The second line
+  is assumed to be blank. If your files don’t follow that format, you’ll
+  want to preprocess them or tweak the script.
+
+- Categories are not (yet) supported. All posts and pages are assigned
+  to the “uncategorized” category in WordPress.
+
+- WordPress limits import files to 2MB, but pyblosxom2wxr can generate
+  output files larger than that. If that happens, you can split it
+  manually or with a tool like ChoppedPress.
+
+- By default, the last modified time of post and page files is used as
+  their timestamp. However, if you have a timestamps file from the
+  hardcodedates PyBlosxom plugin, it will be used instead. The default
+  path is ../timestamp; you can customize this by editing the
+  timestamp_file variable in the script.
+
+- If you use Markdown or another markup language where line breaks and
+  whitespace are meaningful, you’ll want to apply this patch to the
+  WordPress importer.
+
+- pyblosxom2wxr doesn’t assign post ids. It omits the post id elements
+  in the output file. This makes WordPress allocate post ids itself.
+
+- However, WordPress won’t allocate comment ids itself, so pyblosxom2wxr
+  has to do that and populate them in the comment id elements. This
+  means that importing a WXR file generated by pyblosxom2wxr may
+  overwrite any existing comments!
+
+- If you use PyBlosxom’s compact_comments.sh, comments imported from
+  -all.cmt files may not be ordered by date. See my page on extracting
+  compacted PyBlosxom comments for a workaround.
+
+Known bugs:
+
+- Posts with more than 256 comments are not supported well. Only the
+  last 256 comments will be imported, and will likely be ordered wrong.
+  See the TODO near the end of the script.
+
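
The file layout described in the notes (date-prefixed filenames for posts, title on the first line, blank second line) can be illustrated with a minimal sketch. This is not code from pyblosxom2wxr.sh: the function names classify, entry_title and entry_body are hypothetical, and a shell glob stands in for the script's date_re variable, whose actual value this README does not show.

```shell
#!/bin/sh
# Hypothetical sketch of the file conventions described in the notes.
# Not taken from pyblosxom2wxr.sh; all function names are made up.

# Print "post" if the basename starts with a YYYY-MM-DD prefix,
# otherwise "page". A glob is used here as an assumed stand-in for
# the script's date_re regular expression.
classify() {
  case $(basename "$1") in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*) echo post ;;
    *) echo page ;;
  esac
}

# Print the title, i.e. the first line of the file.
entry_title() {
  head -n 1 "$1"
}

# Print the body: everything after the title line and the
# (assumed blank) second line.
entry_body() {
  tail -n +3 "$1"
}
```

If your files keep the title somewhere else or skip the blank second line, a preprocessing step along these lines is one place to normalize them before running the converter.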