Jon Thysell

Father. Engineer. Retro games. Ukuleles. Nerd.

Month: June, 2012

The evolution of Pawsgaard

It’s been several months since I published Pawsgaard, the first story in the Guineawick Tales universe. Since then I’ve been heads down editing the third draft of its sequel, Hester and the Kookaburra King. I got to thinking about all the drafts I go through before publishing, and thought it might be fun to revisit some of my earlier Pawsgaard revisions.

Here’s how the story started in the original draft back in 2009:

Autumnal clouds blanketed the skies over Guineawick, thick and white and holding back the valiant efforts of the midday sun. The town bustled with a crowd of farmer-mice: the squeaks and chatter announced harvest time had come at last. A steady stream of strapping young mice marched in from the outer fields, passing through the heavy doors of the East Gate. Some carried bundles on their backs, others pulled wood carts; but collectively they bore the smiles of a good day’s work and the promise of a comfortable winter.

In the next draft, I dropped the occupation-mice formation, and massaged some of the sentence structures, but not much else changed.

Thick white clouds blanketed the skies over Guineawick; holding the midday sun at bay. The town bustled with a crowd of mice: their squeaks and chatter proclaimed the beginning of the harvest. A steady stream of strapping young farmers marched in from the outer fields, passing through the heavy doors of the East Gate. Some carried bundles on their backs, others pulled wood carts; but collectively they bore the smiles of a good day’s work and the promise of a comfortable winter.

And here’s how the final draft of Pawsgaard started:

Thick white clouds blanketed the sky, blocking the hot noon sun. The walled mousetown bustled with twittering whiskers, bouncing tails, and the rapid chatter of hundreds of mice. Merchants shouted from the shade of their stalls; mothers ran errands with little ones circling their feet. A constant stream of farmers returned from the fields, marching in from the East Gate with carts overstuffed. All bore the smiles of a good day’s work and the promise of a comfortable winter.

Harvest had come to Guineawick.

This time, I focused on smoothing out the flow of the scene, as well as boosting the imagery with the shouting merchants, the mothers, and the children. I also pushed the mention of harvest and the name Guineawick to their own single-line paragraph. This helps emphasize them, without requiring the reader to remember those details from the dense first paragraph.

It’s just a peek into the process, but I know I enjoy reading about how others write and edit their work, so I hope someone else finds this interesting. You can download Pawsgaard for free at Smashwords and wherever finer ebooks are sold.

/jon

Retrieve your blog posts from a WordPress eXtended RSS file with WXR to HTML

If you’ve ever migrated or retired a WordPress blog, you’re probably familiar with WordPress eXtended RSS (WXR) files. There’s a thorough summary here, but basically a WXR file is a copy of all of the textual content on your site: pages, blog posts, and comments. You can use one to migrate that content from blog to blog, or just to archive it for your own backups.

The problem is, there’s not much you can do with the file other than import it back into WordPress or another blogging system. But what if you just want to read the content? What if it’s been years since that blog was live, and you just want to rescue a favorite post?

The file is XML, so it is technically human-readable, but there’s a lot of ugly markup to sift through as well. Not fun. You could import the file back into a WordPress install, but that seems a tad overkill.

I’ve been hit with this exact problem myself. I have the WXR files of two old WordPress sites, with years of content dating back to the early days of blogging. I’d like to do something with that content.

Though WXR files may be a pain for us to read, as XML they’re a breeze for software to parse. With that in mind, I wrote WXR to HTML. It’s a short and sweet Python script for converting a WXR file into a plain, easy-to-read HTML file.
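To give a sense of how little code the parsing takes, here is a minimal sketch, separate from the full script. It uses the standard library's xml.etree.ElementTree rather than lxml, and the sample document and the list_titles helper are made up for illustration; a real export carries many more fields per item.

```python
import xml.etree.ElementTree as ET

# Made-up sample data standing in for a real WXR export.
SAMPLE_WXR = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Hello world</title>
      <link>http://example.com/hello-world</link>
      <content:encoded><![CDATA[First post.]]></content:encoded>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second-post</link>
      <content:encoded><![CDATA[More text.]]></content:encoded>
    </item>
  </channel>
</rss>
"""


def list_titles(wxr_text):
    """Return the title of every <item> in a WXR document (hypothetical helper)."""
    root = ET.fromstring(wxr_text)  # root is the <rss> element
    return [item.findtext("title") for item in root.findall("./channel/item")]


print(list_titles(SAMPLE_WXR))
```

Each post or page is one item element under channel, so everything else is a matter of picking which child elements to read.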

Note, the goal is not to recreate the original sites in all their former glory. I don’t even care about comments all that much; I just want the blog posts in a form that I can read, search, and copy/paste from. Also, it’s not really possible to rebuild the site; WXR files only contain the raw text: no images, no style, and no layout information. With those limitations in mind, here’s the script:

#!/usr/bin/env python

"""
WXR to HTML <https://jonthysell.com/>

Copyright 2012 Jon Thysell <thysell@gmail.com>

This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.

Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
"""

import sys
import codecs
import string
from lxml import etree

_header_html = """
<html>
<head>
<title>%s</title>
</head>
<body>
<h1>%s</h1>
<p>%s</p>
<p><em>Exported from <a href="%s">%s</a> %s</em></p>
"""

_footer_html = """
<p><em>HTML generated by WXR to HTML &lt;<a href="http://jonthysell.com">http://jonthysell.com</a>&gt;</em></p>
</body>
</html>
"""

_item_html = u"""
<h2><a href="%s">%s</a></h2>
<p>Published: %s</p>
<p>%s</p>
"""

_title = ""
_link = ""
_desc = ""
_pubdate = ""
_items = []

# This controls whether to add in paragraph tags. Most likely you want this on. Only change this to False if for some reason your posts are already valid HTML.
_autop = True

def autop(s):
    """Turn blank lines into paragraph breaks and single newlines into <br /> tags."""
    # Normalize Windows line endings first so the rules below see only "\n".
    s = s.replace("\r\n", "\n")
    s = s.replace("\n\n", "</p><p>")
    s = s.replace("\n", "<br />")
    return s

def main(input_file):
    """Take a WXR XML file and export an HTML file."""
    global _title, _link, _desc, _pubdate, _items, _autop
    print("Reading from %s" % input_file)
    with codecs.open(input_file, 'r') as wxr_file:
        tree = etree.parse(wxr_file)
        _title = tree.xpath('/rss/channel/title')[0].text
        _link = tree.xpath('/rss/channel/link')[0].text
        _desc = tree.xpath('/rss/channel/description')[0].text
        _pubdate = tree.xpath('/rss/channel/pubDate')[0].text
        xml_items = tree.xpath('/rss/channel/item')
        for xml_item in xml_items:
            t = xml_item.xpath('title')[0].text
            l = xml_item.xpath('link')[0].text
            p = xml_item.xpath('pubDate')[0].text
            # An empty <content:encoded> element has no text, so fall back to an empty string.
            c = xml_item.xpath('content:encoded', namespaces={'content': 'http://purl.org/rss/1.0/modules/content/'})[0].text or u""
            if _autop:
                c = autop(c)
            _items.append((l, t, p, c))

    # Swap the extension for "html"; this assumes a three-letter extension such as .xml.
    output_file = input_file[:-3] + "html"
    print("Writing to %s" % output_file)
    with codecs.open(output_file, encoding='utf-8', mode='w') as html_file:
        p = (_title, _title, _desc, _link, _link, _pubdate)
        html_file.write(_header_html % p)
        for _item in _items:
            html_file.write(_item_html % _item)
        html_file.write(_footer_html)

if __name__ == "__main__":
    main(sys.argv[1])

To run this, you’ll need Python and the lxml module installed. The script takes one parameter, the WXR file, and exports a single HTML file with all of your posts and pages, including titles, original links, and timestamps. It will not export your comments, tags, categories, etc. If you need that, feel free to tweak the script.
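As a starting point for that kind of tweak, here is a rough sketch of pulling categories, tags, and comments out of a single item. It is not part of the script above: it uses the standard library's xml.etree.ElementTree instead of lxml, the extract_extras helper and the sample item are hypothetical, and the element names and the wp namespace URI reflect my understanding of a 1.2 export, so check them against the top of your own file.

```python
import xml.etree.ElementTree as ET

# Assumption: the wp namespace URI depends on the export version (1.2 here).
# Look at the xmlns:wp attribute in your own file and adjust to match.
NS = {"wp": "http://wordpress.org/export/1.2/"}

# Made-up sample item for illustration only.
SAMPLE_ITEM = """
<item xmlns:wp="http://wordpress.org/export/1.2/">
  <title>Hello world</title>
  <category domain="category" nicename="news"><![CDATA[News]]></category>
  <category domain="post_tag" nicename="intro"><![CDATA[Intro]]></category>
  <wp:comment>
    <wp:comment_author><![CDATA[Alice]]></wp:comment_author>
    <wp:comment_content><![CDATA[Nice post!]]></wp:comment_content>
  </wp:comment>
</item>
"""


def extract_extras(item):
    """Return (categories, tags, comments) for one <item> element."""
    # Categories and tags share the <category> element and differ by "domain".
    categories = [c.text for c in item.findall("category")
                  if c.get("domain") == "category"]
    tags = [c.text for c in item.findall("category")
            if c.get("domain") == "post_tag"]
    # Each comment is collected as an (author, content) pair.
    comments = [(c.findtext("wp:comment_author", namespaces=NS),
                 c.findtext("wp:comment_content", namespaces=NS))
                for c in item.findall("wp:comment", NS)]
    return categories, tags, comments


print(extract_extras(ET.fromstring(SAMPLE_ITEM)))
```

The same lookups should carry over to lxml with little change, since its xpath calls accept a namespaces mapping in the same way the script already does for content:encoded.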

Now finally I can delve into my own personal back-catalog. It’s rather exciting to look at my posts from so long ago.

Do you find this script useful? Say so in the comments!

/jon

Call me Kamalani

My name is Jon, I introduce myself as Jon, so it only makes sense that most people who meet me, know me as Jon. Which begs the question, why do half the people who know me, know me only as Kamalani?

One day, when I was five, maybe six, I remember playing in the front yard at our home in Newark. Grandma sat on a wood bench with iron armrests near the house, and she called me over to sit with her. When I finally caught my breath, she asked me if I wanted her to call me Jon, or if she could call me Kamalani. That’s my Hawaiian name, the name she gave me. She asked what to use, because she didn’t want me to feel embarrassed in front of my friends.

Grandma in 2006

I told her, “Grandma, you can call me Kamalani.”

Since that day, she never called me by anything else. That’s why her friends all know me as Shirley’s grandson, Kamalani.

I have a wealth of cherished memories with my grandmother, more than I have time to tell. We visited her house so often, to this day I still have more dreams set on Moyers Road than any place I’ve ever lived.

When I was probably seven, Grandma taught me that good things sometimes come in small packages. My birthday gift that year was my first nice watch, a Timex Ironman Triathlon, which was a Rolex compared to the sea of cheap Casio kid watches.

In the third grade, she said she would buy me a video game system if I got straight A’s. As you can imagine, the bribe worked. Now, nineteen years later, I work at Xbox, which I think officially makes that bribe an early investment in my future career.

When I hit my teens, she pestered me about when I was going to get my ears pierced. All the kids are doing it, she said. She let me stay with them over the summers when I worked at the CoCo Hut. I didn’t drink caffeine growing up, so working at that coffee cart was a crash-course in workplace stimulants. So yet again we have another investment in my future career.

I could go on and on, and I’m still only talking about what I remember, what I saw in the last third of her life. I mean, she remembered surviving Pearl Harbor; I can’t even begin to catalog the amazing life she had. I only know that she was one of the toughest, most generous, and most loving women I know, and that I’m going to miss her with all of my heart.

My name is Jon Pekele Kamalani Thysell, but for you Grandma, you can call me Kamalani. You can always call me Kamalani.

Until we meet again,

/kamalani

In memory of Shirley K. Jones
September 20, 1935 — May 9, 2012