Migrating from Wordpress to Pelican on PaaS - Part 2

Posted by Dave on 12 February 2014

Part 2 of this 3-part series examines how I created my Pelican blog and migrated my Wordpress content to it.

Part 2: The Wordpress to Pelican Migration

The Plan

If you haven't read Part 1 already, it will give you some background as to what I'm doing and why I'm doing it.

Starting the Pelican Project

Assuming you already have a working Python installation, starting a new blog is as easy as installing a few dependencies and running pelican-quickstart:

:::bash
pip install pelican Markdown
mkdir blog
cd blog
pelican-quickstart
Welcome to pelican-quickstart v3.3.0.

This script will help you create a new Pelican-based website.

Please answer the following questions so this script can generate the files
needed by Pelican.


> Where do you want to create your new web site? [.]
> What will be the title of this web site? Dave's Blog
> Who will be the author of this web site? Dave Tucker
> What will be the default language of this web site? [en]
> Do you want to specify a URL prefix? e.g., http://example.com   (Y/n) Y
> What is your URL prefix? (see above example; no trailing slash) http://dtucker.co.uk
> Do you want to enable article pagination? (Y/n) Y
> How many articles per page do you want? [10]
> Do you want to generate a Fabfile/Makefile to automate generation and publishing? (Y/n)
> Do you want an auto-reload & simpleHTTP script to assist with theme and site development? (Y/n)
> Do you want to upload your website using FTP? (y/N)
> Do you want to upload your website using SSH? (y/N)
> Do you want to upload your website using Dropbox? (y/N)
> Do you want to upload your website using S3? (y/N)
> Do you want to upload your website using Rackspace Cloud Files? (y/N)
Done. Your new project is available at /Users/dave/dev/blog

Now that your project is set up, let's get it under source control:

:::bash
git init
echo "*.pyc
*.pid
output" > .gitignore
git add --all
git commit -a -m "Initial Commit"

Now we have a backup.

Exporting from Wordpress

Getting your blog posts from Wordpress is pretty easy if you follow the instructions here

Once you have your posts in XML format you can convert them using the pelican-import tool. This tool has some additional dependencies that you need to install. First install Pandoc following the instructions for your OS. Then:

:::bash
pip install BeautifulSoup4 lxml
pelican-import --wpfile -m markdown -o content <your_wordpress-file>.xml

You might need to tweak your pages by hand once the import has completed.
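
To find the posts that need hand-tweaking, one option is to scan the imported files for missing metadata. The following is only a rough sketch and not part of pelican-import: it assumes each imported Markdown file starts with `Key: value` header lines followed by a blank line, and the helper names (`missing_metadata`, `check_content_dir`) and the list of required keys are hypothetical.

```python
from pathlib import Path

REQUIRED_KEYS = ("title", "date")  # assumption: the fields you care about


def missing_metadata(text, required=REQUIRED_KEYS):
    """Return the required metadata keys absent from a Markdown post's header."""
    found = set()
    for line in text.splitlines():
        if not line.strip():
            break  # the first blank line ends the metadata header
        key, sep, _value = line.partition(":")
        if sep:
            found.add(key.strip().lower())
    return [key for key in required if key not in found]


def check_content_dir(content_dir="content"):
    """Map each imported .md file to its missing keys (only files with gaps)."""
    report = {}
    for path in sorted(Path(content_dir).rglob("*.md")):
        gaps = missing_metadata(path.read_text(encoding="utf-8"))
        if gaps:
            report[str(path)] = gaps
    return report


if __name__ == "__main__":
    for name, gaps in check_content_dir().items():
        print(f"{name}: missing {', '.join(gaps)}")
```

Run from the blog directory, this prints one line per post that lacks a title or date, which gives you a short list of files to open by hand.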

Tweaking the Pelican

Now that you have your site and content in order, it's time to get the site themed and to make some customizations. pelicanconf.py is home to pretty much all of the configuration settings. publishconf.py adds or overrides settings that are only necessary when publishing to the web. I'm not going to cover this in detail; instead I'll show you the tweaks I've made to my site...

Settings

The settings that can appear in your pelicanconf.py are documented here. Here is what I changed:

I adjusted Pelican to match my permalink structure on Wordpress:

:::python
ARTICLE_URL = '{category}/{slug}.html'
ARTICLE_SAVE_AS = '{category}/{slug}.html'
YEAR_ARCHIVE_SAVE_AS = 'archives/{date:%Y}/index.html'
MONTH_ARCHIVE_SAVE_AS = 'archives/{date:%Y}/{date:%b}/index.html'
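
To see what these patterns produce, it helps to treat them as ordinary Python format strings with a date format spec after the colon. This is a minimal sketch under that assumption; the article metadata below is made up for illustration, and the month abbreviation produced by `%b` depends on your locale.

```python
from datetime import datetime

# Hypothetical article metadata, for illustration only.
article = {
    "category": "networking",
    "slug": "my-first-post",
    "date": datetime(2014, 2, 12),
}

ARTICLE_SAVE_AS = "{category}/{slug}.html"
YEAR_ARCHIVE_SAVE_AS = "archives/{date:%Y}/index.html"
MONTH_ARCHIVE_SAVE_AS = "archives/{date:%Y}/{date:%b}/index.html"

# str.format fills {category} and {slug} from the metadata and applies
# the strftime-style spec (%Y, %b) to the datetime object.
article_path = ARTICLE_SAVE_AS.format(**article)
year_path = YEAR_ARCHIVE_SAVE_AS.format(date=article["date"])
month_path = MONTH_ARCHIVE_SAVE_AS.format(date=article["date"])

print(article_path)
print(year_path)
print(month_path)
```

Comparing the printed paths with your old Wordpress permalinks is a quick way to check that existing links will keep working.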

I changed the LINKS and SOCIAL settings:

:::python
LINKS =  (('Networkstatic', 'http://networkstatic.net/'),
          ('Etherealmind', 'http://etherealmind.com/'),
          ('Greg Ferro', 'http://gregferro.com/'),
          ('Brett Terpstra', 'http://brettterpstra.com/'),)

SOCIAL = (('Twitter', 'http://twitter.com/dave_tucker'),
          ('LinkedIn', 'http://www.linkedin.com/in/davetucker'),
          ('GitHub', 'http://github.com/dave-tucker'),
          ('Stack-Overflow', 'http://careers.stackoverflow.com/davetucker'),
          ('RSS', 'http://feedpress.me/davetucker'))

Finally, I made some changes to my publishconf.py to set up Disqus, Google Analytics and my RSS Feeds.

:::python
FEED_ALL_RSS = 'feed/all.rss.xml'
CATEGORY_FEED_RSS = 'feed/%s.rss.xml'
DELETE_OUTPUT_DIRECTORY = True
DISQUS_SITENAME = ""
GOOGLE_ANALYTICS = ""

Themes

The Pelican team maintain a list of themes on GitHub. I'm very fond of Bootstrap and so I picked the pelican-bootstrap3 theme. The README explains how to install and customize this theme so I'm not going to go into too much detail here. Here is the relevant section of my pelicanconf.py for reference:

:::python
THEME = 'themes/pelican-bootstrap3'
BOOTSTRAP_NAVBAR_INVERSE = True
DISPLAY_CATEGORIES_ON_MENU = True
DISPLAY_CATEGORIES_ON_SIDEBAR = False
DISPLAY_TAGS_ON_SIDEBAR = True
TAG_CLOUD = True
TAG_CLOUD_MAX_ITEMS = 40
PLUGIN_PATH = 'plugins'
PLUGINS = ['summary']
PYGMENTS_STYLE = 'monokai'
CC_LICENSE = 'CC-BY-NC'
BOOTSTRAP_THEME = 'simplex'

Plugins

There is a plugin repository on GitHub full of useful Pelican plugins. As I was using Markdown in my Wordpress blog (and will continue using Markdown here), all of my posts have a <!-- more --> tag to end the summary. Adding the summary plugin from this repo and setting SUMMARY_END_MARKER in my pelicanconf.py means that I don't need to change my workflow!
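
The idea behind the end marker can be sketched in a few lines of Python. This is an illustration of the concept only, not the summary plugin's actual code; the `split_summary` helper is hypothetical, and the marker value is the <!-- more --> tag mentioned above.

```python
# Setting as it would appear in pelicanconf.py; the value is the marker
# carried over from the Wordpress posts.
SUMMARY_END_MARKER = "<!-- more -->"


def split_summary(content, end_marker=SUMMARY_END_MARKER):
    """Return (summary, rest); summary is None when the marker is absent."""
    head, sep, tail = content.partition(end_marker)
    if not sep:
        return None, content
    return head.strip(), tail.strip()


summary, rest = split_summary("Intro paragraph.\n<!-- more -->\nFull article body.")
print(summary)
```

Everything before the marker becomes the summary shown on index pages, and posts without the marker are left untouched.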

Testing it out

Before you commit your changes, you might want to test them out. Pelican has a development server that you can start by running ./develop_server.sh start.

You can then browse to http://127.0.0.1:8000 and see your new blog in all its glory!

Once you are happy with your changes remember to commit them.

:::bash
git add --all
git commit -m "Awesome New Feature"

Conclusion

Now that I have my blog looking nice, Pelican-powered, and running locally, I need to think about how to publish it on my Docker-powered mini-Heroku (dokku). Join me in Part 3 for the finale!

@dave_tucker