<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>brool &#187; coding</title>
	<atom:link href="http://www.brool.com/index.php/category/coding/feed" rel="self" type="application/rss+xml" />
	<link>http://www.brool.com</link>
	<description>brool \brool\ (n.) : a low roar; a deep murmur or humming</description>
	<lastBuildDate>Fri, 20 Jan 2012 07:58:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Using Markdown with Mutt</title>
		<link>http://www.brool.com/index.php/using-markdown-with-mutt</link>
		<comments>http://www.brool.com/index.php/using-markdown-with-mutt#comments</comments>
		<pubDate>Wed, 24 Aug 2011 08:28:19 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=584</guid>
		<description><![CDATA[This took me a while to figure out, so in the hopes that I can save someone some time, here&#8217;s how to use markdown with Mutt: Step 1: Install msmtp (or any other program) On OS X you can do this with &#8220;brew install msmtp&#8221;. (You can use sendmail or whatever, but msmtp was easy [...]]]></description>
			<content:encoded><![CDATA[<p>This took me a while to figure out, so in the hopes that I can save someone some time, here&#8217;s how to use markdown with Mutt:</p>
<h2>Step 1: Install msmtp (or any other program)</h2>
<p>On OS X you can do this with &#8220;brew install msmtp&#8221;.  (You can use sendmail or whatever, but msmtp was easy to set up.)</p>
<h2>Step 2: Set up msmtp</h2>
<p>Set up the appropriate configuration file.  I ran into a problem wherein the program would just hang;  turning on debug showed that it hung after &#8220;Reading recipients from command line.&#8221;  This turned out to be an incorrect SSL configuration &#8212; I needed tls_starttls to be off. </p>
<p>My particular .msmtprc configuration:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">defaults
tls on
tls_starttls off
tls_certcheck off
&nbsp;
account Work
host mail.host.com
domain domain.com
auth on
port <span class="nu0">465</span>
protocol smtp
from from-name
user user-name
password password</pre></div></div></div></div></div></div></div>



<h2>Step 3: install sendmail replacement</h2>
<p>For me, this was two things:   a small shell script and a Python program that will convert any plaintext e-mail into a plaintext + HTML e-mail.</p>
<p>Shell script mark_send:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1"><span class="co0">#!/bin/sh</span>
<span class="sy0">/</span>Users<span class="sy0">/</span>tim<span class="sy0">/</span>mutt<span class="sy0">/</span>mark_and_send.py <span class="st0">&quot;/usr/local/bin/pandoc -s&quot;</span> <span class="sy0">|</span> <span class="sy0">/</span>usr<span class="sy0">/</span>local<span class="sy0">/</span>bin<span class="sy0">/</span>msmtp <span class="re5">-a</span> Work $<span class="sy0">@</span></pre></div></div></div></div></div></div></div>



<p>Python script mark_and_send.py:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="python"><pre class="de1"><span class="co1">#!/usr/bin/python</span>
<span class="kw1">import</span> <span class="kw3">os</span>
<span class="kw1">import</span> <span class="kw3">sys</span>
<span class="kw1">import</span> <span class="kw3">subprocess</span>
<span class="kw1">import</span> <span class="kw3">email</span>
<span class="kw1">import</span> <span class="kw3">email</span>.<span class="kw3">parser</span>
<span class="kw1">import</span> <span class="kw3">tempfile</span>
<span class="kw1">from</span> <span class="kw3">email</span>.<span class="me1">mime</span>.<span class="me1">multipart</span> <span class="kw1">import</span> MIMEMultipart
<span class="kw1">from</span> <span class="kw3">email</span>.<span class="me1">mime</span>.<span class="me1">text</span> <span class="kw1">import</span> MIMEText
&nbsp;
<span class="kw1">if</span> <span class="kw2">len</span><span class="br0">&#40;</span><span class="kw3">sys</span>.<span class="me1">argv</span><span class="br0">&#41;</span> <span class="sy0">==</span> <span class="nu0">1</span>:
    <span class="kw1">print</span> <span class="st0">&quot;usage: mark_and_send.py markdown_program [markdown_flag]&quot;</span>
    <span class="kw1">print</span> <span class="st0">&quot;where:&quot;</span>
    <span class="kw1">print</span> <span class="st0">&quot;    markdown_program is the name of the program to translate markdown.&quot;</span>
    <span class="kw1">print</span> <span class="st0">&quot;    markdown_flag is an optional indicator in the first line that determines whether to run the markdown program&quot;</span>
    <span class="kw3">sys</span>.<span class="me1">exit</span><span class="br0">&#40;</span><span class="nu0">0</span><span class="br0">&#41;</span>
&nbsp;
markdown_program <span class="sy0">=</span> <span class="kw3">sys</span>.<span class="me1">argv</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span>
&nbsp;
p <span class="sy0">=</span> <span class="kw3">email</span>.<span class="kw3">parser</span>.<span class="me1">Parser</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
m <span class="sy0">=</span> p.<span class="me1">parsestr</span><span class="br0">&#40;</span><span class="kw3">sys</span>.<span class="me1">stdin</span>.<span class="me1">read</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
&nbsp;
<span class="co1"># is this a single header?</span>
do_markdown <span class="sy0">=</span> <span class="kw2">False</span>
<span class="kw1">if</span> <span class="kw1">not</span> m.<span class="me1">is_multipart</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="kw1">and</span> m.<span class="me1">get_content_maintype</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="sy0">==</span> <span class="st0">'text'</span> <span class="kw1">and</span> m.<span class="me1">get_content_subtype</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="sy0">==</span> <span class="st0">'plain'</span>:
    do_markdown <span class="sy0">=</span> <span class="kw2">True</span>
    payload <span class="sy0">=</span> m.<span class="me1">get_payload</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
    <span class="co1"># optionally check the markdown flag</span>
    <span class="kw1">if</span> <span class="kw2">len</span><span class="br0">&#40;</span><span class="kw3">sys</span>.<span class="me1">argv</span><span class="br0">&#41;</span> <span class="sy0">==</span> <span class="nu0">3</span>:
        lines <span class="sy0">=</span> payload.<span class="me1">split</span><span class="br0">&#40;</span><span class="st0">&quot;<span class="es0">\n</span>&quot;</span><span class="br0">&#41;</span>
        do_markdown <span class="sy0">=</span> lines<span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span> <span class="sy0">==</span> <span class="kw3">sys</span>.<span class="me1">argv</span><span class="br0">&#91;</span><span class="nu0">2</span><span class="br0">&#93;</span>
        <span class="kw1">if</span> do_markdown:
            payload <span class="sy0">=</span> <span class="st0">&quot;<span class="es0">\n</span>&quot;</span>.<span class="me1">join</span><span class="br0">&#40;</span>lines<span class="br0">&#91;</span><span class="nu0">1</span>:<span class="br0">&#93;</span><span class="br0">&#41;</span>
&nbsp;
<span class="kw1">if</span> do_markdown:
    tf <span class="sy0">=</span> <span class="kw3">tempfile</span>.<span class="me1">NamedTemporaryFile</span><span class="br0">&#40;</span>delete<span class="sy0">=</span><span class="kw2">False</span><span class="br0">&#41;</span>
    tf.<span class="me1">write</span><span class="br0">&#40;</span>payload<span class="br0">&#41;</span>
    tf.<span class="me1">close</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
&nbsp;
    <span class="br0">&#40;</span>markdown<span class="sy0">,</span> error<span class="br0">&#41;</span> <span class="sy0">=</span> <span class="kw3">subprocess</span>.<span class="me1">Popen</span><span class="br0">&#40;</span>markdown_program.<span class="me1">split</span><span class="br0">&#40;</span><span class="br0">&#41;</span> + <span class="br0">&#91;</span>tf.<span class="me1">name</span><span class="br0">&#93;</span><span class="sy0">,</span> stdout<span class="sy0">=</span><span class="kw3">subprocess</span>.<span class="me1">PIPE</span><span class="br0">&#41;</span>.<span class="me1">communicate</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
&nbsp;
    <span class="co1"># create a new message with the exact same headers</span>
    new_message <span class="sy0">=</span> MIMEMultipart<span class="br0">&#40;</span><span class="st0">'alternative'</span><span class="br0">&#41;</span>
    <span class="kw1">for</span> <span class="br0">&#40;</span>k<span class="sy0">,</span>v<span class="br0">&#41;</span> <span class="kw1">in</span> m._headers:
        new_message<span class="br0">&#91;</span>k<span class="br0">&#93;</span> <span class="sy0">=</span> v
&nbsp;
    text_plain <span class="sy0">=</span> MIMEText<span class="br0">&#40;</span>payload<span class="sy0">,</span> <span class="st0">'plain'</span><span class="br0">&#41;</span>
    new_message.<span class="me1">attach</span><span class="br0">&#40;</span>text_plain<span class="br0">&#41;</span>
&nbsp;
    text_html <span class="sy0">=</span> MIMEText<span class="br0">&#40;</span>markdown<span class="sy0">,</span> <span class="st0">'html'</span><span class="br0">&#41;</span>
    new_message.<span class="me1">attach</span><span class="br0">&#40;</span>text_html<span class="br0">&#41;</span>
&nbsp;
    <span class="kw1">print</span> new_message.<span class="me1">as_string</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
    <span class="kw3">os</span>.<span class="me1">unlink</span><span class="br0">&#40;</span>tf.<span class="me1">name</span><span class="br0">&#41;</span>
<span class="kw1">else</span>:
    <span class="kw1">print</span> m.<span class="me1">as_string</span><span class="br0">&#40;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>



<p>The mark_and_send.py program takes one required argument (the name of the program that converts markdown to HTML, possibly quoted) and an optional markdown indicator.  If the indicator is given and the first line of the message matches it, that line is stripped and the rest of the message is converted to HTML.  If the indicator is given and the first line does not match, the message is passed through unchanged.</p>
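<p>The indicator handling can be sketched in isolation. This is a minimal Python 3 illustration, not the script itself; the helper name <code>split_markdown_flag</code> and the <code>!m</code> indicator are assumptions made up for the example:</p>
<pre>def split_markdown_flag(payload, flag=None):
    """Return (do_markdown, body) for a plain-text message body.

    With no flag configured, every body is converted. With a flag,
    only bodies whose first line equals the flag are converted, and
    that first line is stripped.
    """
    if flag is None:
        return True, payload
    lines = payload.split("\n")
    if lines[0] == flag:
        return True, "\n".join(lines[1:])
    return False, payload


# "!m" is a hypothetical indicator chosen for this example
print(split_markdown_flag("!m\n# Hello\nSome *text*", "!m"))
print(split_markdown_flag("Just plain text", "!m"))
</pre>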

<h2>Step 4: set up mutt</h2>
<p>You need to tell mutt to use your shell script instead of sendmail.  I have it inside a folder hook:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">folder-hook +work<span class="sy0">/</span>.<span class="sy0">*</span> <span class="st_h">'set sendmail=&quot;/bin/sh /Users/tim/mutt/mark_send&quot;'</span></pre></div></div></div></div></div></div></div>



<p>Now, whenever you send a plain-text email, mutt will automatically run markdown on it and attach it as the HTML view of the message.</p>

<h3>Results</h3>
So, after running with this for a few days, here are my impressions:

<ul>
<li>Markdown is not ideal for e-mail, at least not without some work.  For example, most e-mails that are quoted by default will not work properly in markdown, because there are no blank lines in the message to set off the blockquote.  This could possibly be improved by post-processing the file that Mutt generates on a reply, but I don&#8217;t think there&#8217;s a hook for that.</li>
<li>Code coloring is nice</li>
<li>Math formulas, if you need them</li>
</ul>

<p>An example is shown below, as rendered by Zimbra. It isn&#8217;t perfect. Google will unfortunately strip the colors from the code coloring, Zimbra loses leading indents, and both readers will require the user to load images. Nonetheless, in a pinch&#8230;</p>

<div style="border: 1px solid black !important; margin: 10px"><img src="http://images.brool.com/blog/coding/mail-example.png"/></div>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/using-markdown-with-mutt/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ubuntu 11.04 / Buffalo WLI-UC-G300HP</title>
		<link>http://www.brool.com/index.php/ubuntu-11-04-buffalo-wli-uc-g300hp</link>
		<comments>http://www.brool.com/index.php/ubuntu-11-04-buffalo-wli-uc-g300hp#comments</comments>
		<pubDate>Tue, 28 Jun 2011 07:39:14 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[linux]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=571</guid>
		<description><![CDATA[Super quick, just thrown out in the hopes it may save some Googlers some time: I was getting really slow wireless connections on my Buffalo WLI-UC-G300HP adapter with intermittent disconnects. This link had the clearest directions, although: usb_buffer_free and usb_buffer_alloc had changed in the most recent kernel needed to add a device code for my [...]]]></description>
			<content:encoded><![CDATA[Super quick, just thrown out in the hopes it may save some Googlers some time:  I was getting really slow wireless connections on my Buffalo WLI-UC-G300HP adapter with intermittent disconnects.  <a href="http://www.cyberciti.biz/tips/linux-install-rt2870-chipset-based-usb-wireless-adapter.html">This link</a> had the clearest directions, although:
<ul>
<li>usb_buffer_free and usb_buffer_alloc had changed in the most recent kernel</li>
<li>needed to add a device code for my adapter</li>
<li>did not need to update the firmware or the USB list</li>
</ul>

Anyway, the <a href="http://images.brool.com/blog/coding/buffalo-wli-uc-g300hp.diff">patch is here</a> if you need it.  If you&#8217;re absolutely desperate and you&#8217;re running my configuration (Ubuntu 11.04 64-bit on kernel 2.6.38-8) and you trust that I have no nefarious intent, then you can pull <a href="http://images.brool.com/blog/coding/rt2870sta.ko">rt2870sta.ko</a> from here and put it into <b>/lib/modules/2.6.38-8-generic/kernel/drivers/net/wireless</b>.

]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/ubuntu-11-04-buffalo-wli-uc-g300hp/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Editing WordPress Locally</title>
		<link>http://www.brool.com/index.php/editing-wordpress-locally</link>
		<comments>http://www.brool.com/index.php/editing-wordpress-locally#comments</comments>
		<pubDate>Tue, 14 Jun 2011 19:40:38 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[writing]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=563</guid>
		<description><![CDATA[I&#8217;ve written before on editing WordPress locally, but recent circumstances (moving my blog to another server) made me take another look at it. I had written a utility previously that was based on git, but on reflection git is unnecessary. So, stripped out most of the code and moved it into wordpress-shuffle, that allow you [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve <a href="http://www.brool.com/index.php/posting-to-wordpress-from-git">written before on editing WordPress locally</a>, but recent circumstances (moving my blog to another server) made me take another look at it.  I had previously written a utility that was based on git, but on reflection git is unnecessary.  So I stripped out most of the code and moved it into wordpress-shuffle, which allows you to:</p>
<ul>
<li>download files from the blog</li>
<li>show differences</li>
<li>copy changes wholesale from the blog to local, or vice versa</li>
<li>add new posts</li>
</ul>

<p><a href="http://www.github.com/brool/wordpress-shuffle">It&#8217;s all on GitHub</a>.</p>

<h1>Setting Up</h1>
<p>Assuming that your blog is set up at <a href="http://www.yourblog.com">http://www.yourblog.com</a> (you can create an account at wordpress.com to test this out), all you&#8217;ll need to do is:</p>
<pre>-- make a directory for the blog
mkdir blog
cd blog

-- download everything
python wp.py --user=yourname --password=yourpass --url=http://www.yourblog.com/xmlrpc.php init
(wait a bit)

-- now set up so we don't have to specify --user, --password, and --url every time (optional)
python wp.py --user=yourname --password=yourpass --url=http://www.yourblog.com/xmlrpc.php defaults
-- you can skip specifying the password, and it'll prompt you when you run it
</pre>
<p>The files are downloaded in the appropriate YYYY/MM directories, with the draft directory being used for all of your unpublished drafts.</p>
<p>All the drafts are stored in plain text, but you&#8217;ll see some lines starting with periods &#8212; these are various WordPress variables that are associated with the file.  You can change them, as well;  for example, to change the title of the post, just change the line that begins with &#8220;.title&#8221;.</p>
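<p>As a rough illustration of that layout, here is a minimal Python sketch that separates the leading period-prefixed lines from the body. The exact syntax that wordpress-shuffle uses is not shown in this post, so the &#8220;.name value&#8221; form, the assumption that these lines come first, and the function name <code>split_post</code> are all hypothetical:</p>
<pre>def split_post(text):
    """Split a post file into (fields, body).

    Assumes (hypothetically) that metadata lines look like
    ".name value" and appear before the body.
    """
    fields = {}
    lines = text.split("\n")
    index = 0
    while index != len(lines) and lines[index].startswith("."):
        name, _, value = lines[index][1:].partition(" ")
        fields[name] = value.strip()
        index += 1
    body = "\n".join(lines[index:])
    return fields, body


sample = ".title My Final Draft\n.post_status draft\nHello, world."
fields, body = split_post(sample)
print(fields["title"])   # My Final Draft
print(body)              # Hello, world.
</pre>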


<h1>Seeing What&#8217;s Different</h1>
<p>You can use the status command to see differences between the local file system and your blog.</p>
<pre>python wp.py status
</pre>
<p>Note that only the most recent files are checked.  If you want to really check every single file for changes, do:</p>
<pre>python wp.py status all
</pre>
<p>You can also use the --diff command line option to see the differences between local and server:</p>
<pre>python wp.py --diff status all
</pre>


<h1>Updating From The Blog</h1>
<p>If you&#8217;ve made changes through the web interface and you&#8217;d like to bring them down, you don&#8217;t have to download everything again, but can instead just update.</p>
<pre>python wp.py pull
</pre>
<p>Again, only the most recent changes are brought down.  If you want to check every post on the blog, do:</p>
<pre>python wp.py pull all
</pre>


<h1>Posting To The Blog</h1>
<p>If you&#8217;ve made changes to files and you&#8217;d like to post them back, do:</p>
<pre>python wp.py push
</pre>
<p>To push everything (and not just the most recent files), do:</p>
<pre>python wp.py push all
</pre>
<p>Note that push only changes those files that exist in both spots.  If you&#8217;re adding a new post, use the &#8220;post&#8221; command.</p>


<h1>Posting/Editing</h1>
<p>If you&#8217;d like to add a new post, put it in the drafts folder, and then do:</p>
<pre>python wp.py post drafts/filename
</pre>
<p>Note that post can actually take existing posts, as well &#8212; it just forces an update of that one file, rather than running through all changes like push.</p>

<h1>Publishing</h1>
<p>To publish a file, just change the .post_status field from &#8216;draft&#8217; to &#8216;published&#8217;.  Note that doing this will cause a copy to move from the drafts folder to the appropriate year/month.</p>

<h1>Gotchas</h1>
<p>There are some gotchas due to the fact that the filename can change on you.  There are cases where the filename that will be brought down is different than the one that you send up:</p>
<ul>
<li>You post a file without a .title or .wp_slug line</li>
<li>You post a file with a different file name than the slug that is generated (i.e., &#8220;my-first-draft&#8221; when the title is actually &#8220;my final draft&#8221;)</li>
</ul>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/editing-wordpress-locally/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using Google Authenticator For Your Website</title>
		<link>http://www.brool.com/index.php/using-google-authenticator-for-your-website</link>
		<comments>http://www.brool.com/index.php/using-google-authenticator-for-your-website#comments</comments>
		<pubDate>Sat, 26 Feb 2011 18:41:44 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=465</guid>
		<description><![CDATA[Google has started offering two-factor authentication for Google logins, using Google Authenticator. They have applications available for iPhone, Android, and Blackberry that give time-based passwords based on the proposed TOTP (Time-based One Time Password) draft standard. The Google code provides a command line program that can generate secret keys as well as a PAM module, [...]]]></description>
			<content:encoded><![CDATA[<p>Google has started offering two-factor authentication for Google logins, using <a href="https://code.google.com/p/google-authenticator/">Google Authenticator</a>.  They have applications available for iPhone, Android, and Blackberry that give time-based passwords based on the proposed <a href="http://tools.ietf.org/html/draft-mraihi-totp-timebased">TOTP (Time-based One Time Password)</a> draft standard.</p>

<p>The Google code provides a command line program that can generate secret keys as well as a PAM module, but it turns out to be very little code to authenticate a TOTP, thereby providing two-factor authentication to your website very easily.</p>

<p>To give the user the key, you&#8217;ll need to generate a cryptographically-secure 10 byte random key, presented to the user as a base32 16-character string.  They can either enter this string directly, or you can use Google charts to provide a barcode that they can scan into the Google Authenticator application:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="python"><pre class="de1"><span class="kw1">def</span> get_barcode_image<span class="br0">&#40;</span>username<span class="sy0">,</span> domain<span class="sy0">,</span> secretkey<span class="br0">&#41;</span>:
    url <span class="sy0">=</span> <span class="st0">&quot;https://www.google.com/chart&quot;</span>
    url +<span class="sy0">=</span> <span class="st0">&quot;?chs=200x200&amp;chld=M|0&amp;cht=qr&amp;chl=otpauth://totp/&quot;</span>
    url +<span class="sy0">=</span> username + <span class="st0">&quot;@&quot;</span> + domain + <span class="st0">&quot;%3Fsecret%3D&quot;</span> + secretkey
    <span class="kw1">return</span> url</pre></div></div></div></div></div></div></div>
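<p>Generating the secret key itself takes only a couple of lines. A minimal sketch in Python 3, assuming the standard library <code>secrets</code> module as the random source (the function name <code>generate_secret_key</code> is made up for this example):</p>
<pre>import base64
import secrets


def generate_secret_key():
    """Return a random 10-byte key as a 16-character base32 string."""
    raw = secrets.token_bytes(10)  # 10 bytes = 80 bits of randomness
    return base64.b32encode(raw).decode("ascii")


key = generate_secret_key()
print(key, len(key))
</pre>
<p>Ten bytes is exactly 80 bits, and base32 packs 5 bits into each character, so the encoded key is always 16 characters with no padding.</p>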




<p>For an example of what a code looks like, <a href="https://www.google.com/chart?chs=200x200&#038;chld=M|0&#038;cht=qr&#038;chl=otpauth://totp/me@brool.com%3Fsecret%3DZVMDU4NOTXEJGGET">click here</a>, or, look below:</p>

<img src="http://images.brool.com/blog/coding/totp_code.png"/>

<p>After the user has a secret key from you and has entered it into Google Authenticator either by typing it in directly or scanning in the barcode, you have to be able to verify the key during login (for example).  The code to authenticate is only a few lines in Python:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="python"><pre class="de1"><span class="kw1">import</span> <span class="kw3">time</span>
<span class="kw1">import</span> <span class="kw3">struct</span>
<span class="kw1">import</span> <span class="kw3">hmac</span>
<span class="kw1">import</span> hashlib
<span class="kw1">import</span> <span class="kw3">base64</span>
&nbsp;
<span class="kw1">def</span> authenticate<span class="br0">&#40;</span>secretkey<span class="sy0">,</span> code_attempt<span class="br0">&#41;</span>:
    tm <span class="sy0">=</span> <span class="kw2">int</span><span class="br0">&#40;</span><span class="kw3">time</span>.<span class="kw3">time</span><span class="br0">&#40;</span><span class="br0">&#41;</span> / <span class="nu0">30</span><span class="br0">&#41;</span>
&nbsp;
    secretkey <span class="sy0">=</span> <span class="kw3">base64</span>.<span class="me1">b32decode</span><span class="br0">&#40;</span>secretkey<span class="br0">&#41;</span>
&nbsp;
    <span class="co1"># try 30 seconds behind and ahead as well</span>
    <span class="kw1">for</span> ix <span class="kw1">in</span> <span class="br0">&#91;</span>-<span class="nu0">1</span><span class="sy0">,</span> <span class="nu0">0</span><span class="sy0">,</span> <span class="nu0">1</span><span class="br0">&#93;</span>:
        <span class="co1"># convert timestamp to raw bytes</span>
        b <span class="sy0">=</span> <span class="kw3">struct</span>.<span class="me1">pack</span><span class="br0">&#40;</span><span class="st0">&quot;&gt;q&quot;</span><span class="sy0">,</span> tm + ix<span class="br0">&#41;</span>
&nbsp;
        <span class="co1"># generate HMAC-SHA1 from timestamp based on secret key</span>
        hm <span class="sy0">=</span> <span class="kw3">hmac</span>.<span class="me1">HMAC</span><span class="br0">&#40;</span>secretkey<span class="sy0">,</span> b<span class="sy0">,</span> hashlib.<span class="me1">sha1</span><span class="br0">&#41;</span>.<span class="me1">digest</span><span class="br0">&#40;</span><span class="br0">&#41;</span>
&nbsp;
        <span class="co1"># extract 4 bytes from digest based on LSB</span>
        offset <span class="sy0">=</span> <span class="kw2">ord</span><span class="br0">&#40;</span>hm<span class="br0">&#91;</span>-<span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#41;</span> &amp; <span class="nu0">0x0F</span>
        truncatedHash <span class="sy0">=</span> hm<span class="br0">&#91;</span>offset:offset+<span class="nu0">4</span><span class="br0">&#93;</span>
&nbsp;
        <span class="co1"># get the code from it</span>
        <span class="kw3">code</span> <span class="sy0">=</span> <span class="kw3">struct</span>.<span class="me1">unpack</span><span class="br0">&#40;</span><span class="st0">&quot;&gt;L&quot;</span><span class="sy0">,</span> truncatedHash<span class="br0">&#41;</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span>
        <span class="kw3">code</span> &amp;<span class="sy0">=</span> <span class="nu0">0x7FFFFFFF</span><span class="sy0">;</span>
        <span class="kw3">code</span> %<span class="sy0">=</span> <span class="nu0">1000000</span><span class="sy0">;</span>
&nbsp;
        <span class="kw1">if</span> <span class="br0">&#40;</span><span class="st0">&quot;%06d&quot;</span> % <span class="kw3">code</span><span class="br0">&#41;</span> <span class="sy0">==</span> <span class="kw2">str</span><span class="br0">&#40;</span>code_attempt<span class="br0">&#41;</span>:
            <span class="kw1">return</span> <span class="kw2">True</span>
&nbsp;
    <span class="kw1">return</span> <span class="kw2">False</span></pre></div></div></div></div></div></div></div>
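<p>To exercise <code>authenticate</code> you need something that produces codes the way the phone does. The listing above is written for Python 2; the sketch below is a Python 3 take on the same computation for a single time value. The function name <code>totp_code</code> is an assumption, and the secret and timestamp in the usage lines are the published SHA-1 test values from RFC 6238, not anything from this site:</p>
<pre>import base64
import hashlib
import hmac


def totp_code(secretkey_b32, unix_time, step=30):
    """Return the 6-digit code for a base32 secret at a given time."""
    key = base64.b32decode(secretkey_b32)
    counter = int(unix_time) // step

    # 8-byte big-endian counter, then HMAC-SHA1 keyed with the secret
    digest = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha1).digest()

    # low 4 bits of the last byte pick the offset (same as masking with 0x0F)
    offset = digest[-1] % 16
    chunk = int.from_bytes(digest[offset:offset + 4], "big")

    # drop the top bit (same as masking with 0x7FFFFFFF), keep 6 digits
    code = (chunk % 2**31) % 1000000
    return "%06d" % code


# the RFC 6238 test secret is the ASCII string "12345678901234567890"
secret = base64.b32encode(b"12345678901234567890").decode("ascii")
print(totp_code(secret, 59))  # 287082, the last six digits of the RFC value 94287082
</pre>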


]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/using-google-authenticator-for-your-website/feed</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Hadoop Shim To Clojure</title>
		<link>http://www.brool.com/index.php/hadoop-shim-to-clojure</link>
		<comments>http://www.brool.com/index.php/hadoop-shim-to-clojure#comments</comments>
		<pubDate>Mon, 08 Nov 2010 09:47:12 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=452</guid>
		<description><![CDATA[I&#8217;ve been working with Hadoop a lot lately in order to do some exploratory data analysis on traffic logs. Hadoop is great; it makes things that were taking 30 minutes run 10x faster, which means that I can iterate a lot faster and experiment with more ways to slice the data. I wanted an easy [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been working with Hadoop a lot lately in order to do some exploratory data analysis on traffic logs.  Hadoop is great;  it makes things that were taking 30 minutes run 10x faster, which means that I can iterate a lot faster and experiment with more ways to slice the data.</p>

<p>I wanted an easy way of running Clojure programs under Hadoop, and ended up writing a silly, simple little shim that would simply take a Clojure file with a mapper and reducer function.  It means that there is no .JAR building and no AOT compiling &#8212; just write your mapper and reducer function, bundle up the file, and go.  Note that the JAR is built for Hadoop 0.20 and later.</p>

<h2>Getting The Jar</h2>

<p>The quickest way of getting the .JAR is to just <a href="http://images.brool.com/blog/files/coding/shim.jar">download it</a>.  </p>

<h2>Building The Jar</h2>

<p>Build the shim:</p>

<pre>javac -cp /usr/lib/hadoop/hadoop-core.jar:/usr/lib/hadoop/lib/*:lib/* -d classes Shim.java</pre>

<p>Create a lib directory and add the clojure classes:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1"><span class="kw2">mkdir</span> lib
<span class="kw2">cp</span> <span class="sy0">/</span>from<span class="sy0">/</span>wherever<span class="sy0">/</span>clojure-1.2.0.jar lib
<span class="kw2">cp</span> <span class="sy0">/</span>from<span class="sy0">/</span>wherever<span class="sy0">/</span>clojure-contrib-1.2.0.jar lib</pre></div></div></div></div></div></div></div>




<p>Bundle it together into one .JAR file:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">jar <span class="re5">-cvf</span> shim.jar <span class="re5">-C</span> classes<span class="sy0">/</span> . lib<span class="sy0">/*</span></pre></div></div></div></div></div></div></div>




The directory of your .JAR should look something like this when you&#8217;ve finished:


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1"><span class="br0">&#91;</span>~<span class="sy0">/</span>github<span class="sy0">/</span>hadoop-shim<span class="br0">&#93;</span> $ jar tf shim.jar
META-INF<span class="sy0">/</span>
META-INF<span class="sy0">/</span>MANIFEST.MF
com<span class="sy0">/</span>
com<span class="sy0">/</span>brool<span class="sy0">/</span>
com<span class="sy0">/</span>brool<span class="sy0">/</span>Shim.class
com<span class="sy0">/</span>brool<span class="sy0">/</span>Shim<span class="re1">$Reduce</span>.class
com<span class="sy0">/</span>brool<span class="sy0">/</span>Shim<span class="re1">$Map</span>.class
lib<span class="sy0">/</span>clojure-1.2.0.jar
lib<span class="sy0">/</span>clojure-contrib-1.2.0.jar</pre></div></div></div></div></div></div></div>




<h2>Using The Jar</h2>

<p>To run the example wordcount:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">hadoop jar shim.jar com.brool.Shim <span class="re5">-files</span> wordcount.clj input-file output-file</pre></div></div></div></div></div></div></div>




<p>The output from the run will be in output-file/part-r-00000</p>

<h2>More Details</h2>

<p>The Clojure file that you provide should be in the user namespace, and must provide two functions named mapper and reducer.</p>

<h3>Mapper</h3>

<p>The mapper function takes a string representing one line of the input file and returns a list of [ key value ] pairs. A given input line can generate any number of pairs.</p>

<h3>Reducer</h3>

<p>The reducer is given a key and all values that were associated with that key. The reducer&#8217;s function is to consolidate all of that into one output line.</p>

<h2>Example</h2>

<p>Since word count is the canonical example, let&#8217;s do that. The mapper for word count is:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">defn</span> mapper <span class="br0">&#91;</span>v<span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span>words <span class="br0">&#40;</span><span class="kw1">enumeration-seq</span> <span class="br0">&#40;</span>StringTokenizer<span class="sy0">.</span> v<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#93;</span>
    <span class="br0">&#40;</span><span class="kw1">map</span> #<span class="br0">&#40;</span><span class="kw1">do</span> <span class="br0">&#91;</span> <span class="sy0">%</span> <span class="st0">&quot;1&quot;</span> <span class="br0">&#93;</span> <span class="br0">&#41;</span> words<span class="br0">&#41;</span>
<span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>As an example, for an input line of &#8220;This is a test, is it not?&#8221;, the mapper will generate the following pairs:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="br0">&#91;</span><span class="st0">&quot;This&quot;</span> <span class="st0">&quot;1&quot;</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;is&quot;</span> <span class="st0">&quot;1&quot;</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;a&quot;</span> <span class="st0">&quot;1&quot;</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;test,&quot;</span> <span class="st0">&quot;1&quot;</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;is&quot;</span> <span class="st0">&quot;1&quot;</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;it&quot;</span> <span class="st0">&quot;1&quot;</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;not?&quot;</span> <span class="st0">&quot;1&quot;</span><span class="br0">&#93;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Before handing them to the reducer, Hadoop groups all like keys together, resulting in:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="st0">&quot;This&quot;</span> <span class="sy0">=&gt;</span> <span class="br0">&#91;</span> <span class="st0">&quot;1&quot;</span> <span class="br0">&#93;</span>
<span class="st0">&quot;is&quot;</span>     <span class="sy0">=&gt;</span> <span class="br0">&#91;</span> <span class="st0">&quot;1&quot;</span> <span class="st0">&quot;1&quot;</span> <span class="br0">&#93;</span></pre></div></div></div></div></div></div></div>




<p>And so on. The reducer is given the key and the list of all values that had that key, and then emits the final result. For a word count, the code would be:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">defn</span> reducer <span class="br0">&#91;</span>k v<span class="br0">&#93;</span>
  <span class="br0">&#91;</span> k <span class="br0">&#40;</span><span class="kw1">reduce</span> <span class="sy0">+</span> <span class="br0">&#40;</span><span class="kw1">map</span> #<span class="br0">&#40;</span>Integer<span class="sy0">/</span>parseInt <span class="sy0">%</span><span class="br0">&#41;</span> v<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#93;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This simply adds up the counts.</p>
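<p>The grouping step that sits between the mapper and the reducer can be imitated in a few lines. This is only an in-memory sketch to make the data shapes concrete; <code>group-values</code> is a hypothetical name, not part of the shim, and real Hadoop also sorts and distributes the keys across machines.</p>

```clojure
;; Pairs as the word-count mapper would emit them for the example line.
(def pairs
  [["This" "1"] ["is" "1"] ["a" "1"] ["test," "1"]
   ["is" "1"] ["it" "1"] ["not?" "1"]])

;; Hypothetical stand-in for Hadoop's grouping step: collect the values
;; for each key into a vector, keeping them in arrival order.
(defn group-values [pairs]
  (reduce (fn [acc [k v]]
            (assoc acc k (conj (get acc k []) v)))
          {}
          pairs))

(def grouped (group-values pairs))

(get grouped "is")    ;; returns ["1" "1"]
(get grouped "This")  ;; returns ["1"]
```

<p>Each entry of <code>grouped</code> is what a single call to the reducer would receive: one key plus every value that was emitted for it.</p>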

<h2>Testing</h2>

<p>In wordcount.clj there are two handy functions that allow for debugging of the mapper and reducer before submitting it to Hadoop.</p>

<p>Given a local file, the test-mapper will load the file and run it through the mapper; if your file is large you may just want to (take 20 (test-mapper &#8220;/my/filename&#8221;)).</p>

<p>The test-reducer function will load the file and run it through both the mapper and the reducer. Taking the example sentence above:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">user<span class="sy0">&gt;</span> <span class="br0">&#40;</span>test<span class="sy0">-</span>reducer <span class="st0">&quot;/tmp/one-sentence&quot;</span><span class="br0">&#41;</span>
<span class="br0">&#40;</span><span class="br0">&#91;</span><span class="st0">&quot;This&quot;</span> <span class="nu0">1</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;a&quot;</span> <span class="nu0">1</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;is&quot;</span> <span class="nu0">2</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;it&quot;</span> <span class="nu0">1</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;not?&quot;</span> <span class="nu0">1</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;test,&quot;</span> <span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<h2>Example #2</h2>

<p>As another example: let&#8217;s say that you have a collection of log entries, and would like to record the first and last log entry for every user. Assume that the files are in CSV format, with the fields in the order timehit, userid. Example:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1"><span class="nu0">2010</span>-<span class="nu0">10</span>-04 <span class="nu0">13</span>:04:<span class="nu0">22</span>,<span class="nu0">112334</span>
<span class="nu0">2010</span>-<span class="nu0">10</span>-04 <span class="nu0">10</span>:04:<span class="nu0">22</span>,<span class="nu0">182994</span>
<span class="nu0">2010</span>-<span class="nu0">10</span>-04 <span class="nu0">10</span>:05:<span class="nu0">18</span>,<span class="nu0">182994</span>
<span class="nu0">2010</span>-<span class="nu0">10</span>-04 <span class="nu0">10</span>:07:<span class="nu0">19</span>,<span class="nu0">182994</span>
<span class="nu0">2010</span>-<span class="nu0">10</span>-04 <span class="nu0">13</span>:<span class="nu0">28</span>:<span class="nu0">41</span>,<span class="nu0">112334</span>
<span class="nu0">2010</span>-<span class="nu0">10</span>-04 <span class="nu0">10</span>:09:<span class="nu0">22</span>,<span class="nu0">182994</span>
<span class="nu0">2010</span>-<span class="nu0">10</span>-04 <span class="nu0">13</span>:<span class="nu0">56</span>:<span class="nu0">22</span>,<span class="nu0">112334</span>
<span class="nu0">2010</span>-<span class="nu0">10</span>-04 <span class="nu0">11</span>:<span class="nu0">30</span>:01,<span class="nu0">182994</span></pre></div></div></div></div></div></div></div>




<p>The mapper for this:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">defn</span> mapper <span class="br0">&#91;</span>v<span class="br0">&#93;</span>
    <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span><span class="br0">&#91;</span>timehit userid<span class="br0">&#93;</span> <span class="br0">&#40;</span><span class="sy0">.</span>split v <span class="st0">&quot;,&quot;</span><span class="br0">&#41;</span><span class="br0">&#93;</span>
        <span class="br0">&#91;</span> <span class="br0">&#91;</span> userid timehit <span class="br0">&#93;</span> <span class="br0">&#93;</span>
<span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>The reducer:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">defn</span> reducer <span class="br0">&#91;</span>k v<span class="br0">&#93;</span>
<span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span>s <span class="br0">&#40;</span><span class="kw1">sort</span> v<span class="br0">&#41;</span><span class="br0">&#93;</span>
<span class="br0">&#91;</span>k <span class="br0">&#40;</span><span class="kw1">str</span> <span class="br0">&#40;</span><span class="kw1">first</span> s<span class="br0">&#41;</span> <span class="st0">&quot;,&quot;</span> <span class="br0">&#40;</span>last s<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>The source is <a href="https://github.com/brool/hadoop-shim">all on Github</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/hadoop-shim-to-clojure/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Beaujiful Soup</title>
		<link>http://www.brool.com/index.php/beaujiful-soup</link>
		<comments>http://www.brool.com/index.php/beaujiful-soup#comments</comments>
		<pubDate>Thu, 28 Oct 2010 05:11:16 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[beautiful]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[parsing]]></category>
		<category><![CDATA[soup]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=441</guid>
		<description><![CDATA[Horrible name, isn&#8217;t it? Beautiful Soup is a really nice Python library for extracting content from possibly-sloppy HTML, and I wanted some reasonably close Clojure equivalent. Unfortunately, the standard classes don&#8217;t work well with malformed HTML; as an example: =&#62; &#40;require '&#40;clojure &#91;xml :as xml&#93;&#41;&#41; =&#62; &#40;xml/parse &#34;http://www.google.com&#34;&#41; org.xml.sax.SAXParseException: The markup in the document preceding [...]]]></description>
			<content:encoded><![CDATA[<p>Horrible name, isn&#8217;t it? </p>

<p><a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a> is a really nice Python library for extracting content from possibly-sloppy HTML, and I wanted some reasonably close Clojure equivalent.  Unfortunately, the standard classes don&#8217;t work well with malformed HTML;  as an example:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="sy0">=&gt;</span> <span class="br0">&#40;</span>require '<span class="br0">&#40;</span>clojure <span class="br0">&#91;</span>xml :<span class="me1">as</span> xml<span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="sy0">=&gt;</span> <span class="br0">&#40;</span>xml<span class="sy0">/</span>parse <span class="st0">&quot;http://www.google.com&quot;</span><span class="br0">&#41;</span>
    org<span class="sy0">.</span>xml<span class="sy0">.</span>sax<span class="sy0">.</span>SAXParseException: <span class="me1">The</span> markup in the document preceding the root element must be well<span class="sy0">-</span>formed<span class="sy0">.</span> <span class="br0">&#40;</span>NO_SOURCE_FILE:<span class="nu0">0</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Fortunately, there is already a <a href="http://home.ccil.org/~cowan/XML/tagsoup/">TagSoup</a> library that can parse non-perfect HTML, and it is very <a href="http://markmail.org/message/2e7i72y4cg36wqdx">easy to integrate</a> TagSoup into xml/parse.  This module hardly does anything; it simply adds a few helper routines and brings the most-used calls into one amazingly bad namespace name.</p>

<h1>Examples</h1>

<p>Building your soup:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="br0">&#40;</span>use beaujiful<span class="sy0">-</span>soup<span class="sy0">.</span>core<span class="br0">&#41;</span>
&nbsp;
    <span class="co1">; build soup from URL</span>
    <span class="br0">&#40;</span><span class="kw1">def</span> t <span class="br0">&#40;</span>build<span class="sy0">-</span>soup <span class="st0">&quot;http://www.google.com&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
&nbsp;
    <span class="co1">; build soup from (deliberately malformed) string</span>
    <span class="br0">&#40;</span><span class="kw1">def</span> t2 <span class="br0">&#40;</span>build<span class="sy0">-</span>string<span class="sy0">-</span>soup <span class="st0">&quot;&lt;html&gt;&lt;body&gt;&lt;ul&gt;&lt;li&gt;One&lt;li&gt;Two&lt;/ul&gt;&lt;/body&gt;&lt;/html&gt;&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Extracting information is done with the xml-> call. Oftentimes the last thing you do will be a node or text or (attr :attribute) call, in order to convert the results into a more workable type:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="co1">; you can &quot;walk&quot; down the tree with successive tag names.  For</span>
    <span class="co1">; example, get every list item inside the unordered list</span>
    <span class="co1">; immediately inside the body.</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t2 :<span class="me1">body</span> :<span class="me1">ul</span> :<span class="me1">li</span> node<span class="br0">&#41;</span>
    <span class="co1">; =&gt; ({:tag :li, :attrs nil, :content [&quot;One&quot;]} {:tag :li, :attrs nil, :content [&quot;Two&quot;]})</span>
&nbsp;
    <span class="co1">; get the text for the list items</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t2 :<span class="me1">body</span> :<span class="me1">ul</span> :<span class="me1">li</span> text<span class="br0">&#41;</span>
    <span class="co1">; =&gt; (&quot;One&quot; &quot;Two&quot;)</span>
&nbsp;
    <span class="co1">; Get textareas immediately inside the body.</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t :<span class="me1">body</span> :<span class="me1">textarea</span> node<span class="br0">&#41;</span>
    <span class="co1">; =&gt; ({:tag :textarea, :attrs {:id &quot;csi&quot;, :style &quot;display:none&quot;}, :content nil})</span>
&nbsp;
    <span class="co1">; use descendants to iterate through all nodes, not just the immediate children.</span>
    <span class="co1">; Get the text from all &lt;a&gt; tags anywhere in the body.</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t descendants :<span class="me1">a</span> text<span class="br0">&#41;</span>
    <span class="co1">; =&gt; (&quot;Images&quot; &quot;Videos&quot; &quot;Maps&quot; ...)</span>
&nbsp;
    <span class="co1">; Get the href attribute from all &lt;a&gt; tags</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t descendants :<span class="me1">a</span> <span class="br0">&#40;</span>attr :<span class="me1">href</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="co1">; =&gt; (&quot;http://www.google.com/imghp?hl=en&amp;tab=wi&quot; ... )</span></pre></div></div></div></div></div></div></div>




<p>Use the (attr=) predicate to match an attribute value:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="co1">; find invisible stuff</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t2 descendants <span class="br0">&#40;</span>attr<span class="sy0">=</span> :<span class="me1">style</span> <span class="st0">&quot;display:none&quot;</span><span class="br0">&#41;</span> tag<span class="br0">&#41;</span>
    <span class="co1">; =&gt; (:textarea :iframe)</span></pre></div></div></div></div></div></div></div>




<p>Strings match the text inside nodes:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="co1">; find the link for the &lt;a&gt; that has &quot;Videos&quot; for content</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t descendants :<span class="me1">a</span> <span class="st0">&quot;Videos&quot;</span> <span class="br0">&#40;</span>attr :<span class="me1">href</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="co1">; =&gt; (&quot;http://video.google.com/?hl=en&amp;tab=wv&quot;)</span></pre></div></div></div></div></div></div></div>




<p>Arbitrary predicates can be used as well.  They will take a loc (location), and are usually converted to a node before being used:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="co1">; find any :p or :div</span>
    <span class="br0">&#40;</span><span class="kw1">defn</span> p<span class="sy0">-</span>or<span class="sy0">-</span>div <span class="br0">&#91;</span>loc<span class="br0">&#93;</span> <span class="br0">&#40;</span>contains? #<span class="br0">&#123;</span>:<span class="me1">p</span> :<span class="me1">div</span><span class="br0">&#125;</span> <span class="br0">&#40;</span>:<span class="me1">tag</span> <span class="br0">&#40;</span>node loc<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t descendants p<span class="sy0">-</span>or<span class="sy0">-</span>div tag<span class="br0">&#41;</span>
    <span class="co1">; =&gt; (:div :div :div :div :div :div :div :div :div :div :div :p :div :div)</span>
&nbsp;
    <span class="co1">; find the link for &lt;a&gt; that has case-insensitive &quot;Videos&quot; for content</span>
    <span class="br0">&#40;</span>require 'clojure<span class="sy0">.</span>string<span class="br0">&#41;</span>
    <span class="br0">&#40;</span><span class="kw1">defn</span> f <span class="br0">&#91;</span>loc<span class="br0">&#93;</span> 
      <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span>n <span class="br0">&#40;</span>node loc<span class="br0">&#41;</span><span class="br0">&#93;</span>
       <span class="br0">&#40;</span><span class="kw1">and</span> <span class="br0">&#40;</span><span class="sy0">=</span> <span class="br0">&#40;</span>:<span class="me1">tag</span> n<span class="br0">&#41;</span> :<span class="me1">a</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="sy0">=</span> <span class="br0">&#40;</span>clojure<span class="sy0">.</span>string<span class="sy0">/</span>upper<span class="sy0">-</span>case <span class="br0">&#40;</span><span class="kw1">first</span> <span class="br0">&#40;</span>:<span class="me1">content</span> n<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="st0">&quot;VIDEOS&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t descendants f <span class="br0">&#40;</span>attr :<span class="me1">href</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="co1">; =&gt; (&quot;http://video.google.com/?hl=en&amp;tab=wv&quot;)</span></pre></div></div></div></div></div></div></div>




<p>Fundamentally, the xml-> call returns a list of locations, and you can apply arbitrary transforms as necessary.  For example, let&#8217;s say that you want to build a map of href => text for all of the links:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="br0">&#40;</span><span class="kw1">defn</span> loc<span class="sy0">-</span>to<span class="sy0">-</span>pair <span class="br0">&#91;</span>loc<span class="br0">&#93;</span>
        <span class="br0">&#91;</span> <span class="br0">&#40;</span>attr loc :<span class="me1">href</span><span class="br0">&#41;</span>, <span class="br0">&#40;</span>text loc<span class="br0">&#41;</span> <span class="br0">&#93;</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span><span class="kw1">apply</span> <span class="kw1">hash-map</span> <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t descendants :<span class="me1">a</span> loc<span class="sy0">-</span>to<span class="sy0">-</span>pair<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="co1">; =&gt; {&quot;/services/&quot; &quot;Business Solutions&quot;,  ... }</span></pre></div></div></div></div></div></div></div>




<p>Having a vector in the chain applies all the predicates within the vector, and filters out anything that doesn&#8217;t match.  It acts a little like a lookahead in a regex.  For example:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="co1">; Find the IDs of all divs that contain an href immediately within them</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t descendants :<span class="me1">div</span> <span class="br0">&#91;</span> :<span class="me1">a</span> <span class="br0">&#93;</span> <span class="br0">&#40;</span>attr :<span class="me1">id</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="co1">; =&gt; (&quot;fll&quot;)</span>
&nbsp;
    <span class="co1">; Find the IDs of all divs that contain an href anywhere within them</span>
    <span class="br0">&#40;</span>xml<span class="sy0">-&gt;</span> t descendants :<span class="me1">div</span> <span class="br0">&#91;</span> descendants :<span class="me1">a</span> <span class="br0">&#93;</span> <span class="br0">&#40;</span>attr :<span class="me1">id</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="co1">; =&gt; (&quot;ghead&quot; &quot;gbar&quot; &quot;guser&quot; &quot;fll&quot;)</span></pre></div></div></div></div></div></div></div>




<h1>Source</h1>

<p><a href="https://github.com/brool/beaujiful-soup">It&#8217;s all on Github</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/beaujiful-soup/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Setting Up Incanter and MySQL</title>
		<link>http://www.brool.com/index.php/setting-up-incanter-and-mysql</link>
		<comments>http://www.brool.com/index.php/setting-up-incanter-and-mysql#comments</comments>
		<pubDate>Thu, 21 Jan 2010 19:16:02 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=420</guid>
		<description><![CDATA[Okay, Lein really does make stuff pretty easy. Rather than wrestling with eleventybillion classpaths, just install Lein. Create a new project directory with lein new mydirectory Change the project.clj file that is autogenerated with: &#40;defproject mydirectory &#34;1.0.0-SNAPSHOT&#34; :description &#34;FIXME: write&#34; :dependencies &#91;&#91;org.clojure/clojure &#34;1.1.0-alpha-SNAPSHOT&#34;&#93; &#91;org.clojure/clojure-contrib &#34;1.0-SNAPSHOT&#34;&#93; &#91;mysql/mysql-connector-java &#34;5.1.6&#34;&#93; &#91;incanter/incanter &#34;1.0-master-SNAPSHOT&#34;&#93;&#93;&#41; (that is, add the mysql and [...]]]></description>
			<content:encoded><![CDATA[<p>Okay, Lein really does make stuff pretty easy.  Rather than wrestling with eleventybillion classpaths, just <a href="http://github.com/technomancy/leiningen">install Lein</a>.</p>

<p>Create a new project directory with <code>lein new mydirectory</code>.</p>

<p>Change the autogenerated project.clj file to:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>defproject mydirectory <span class="st0">&quot;1.0.0-SNAPSHOT&quot;</span> :<span class="me1">description</span> <span class="st0">&quot;FIXME: write&quot;</span> 
  :<span class="me1">dependencies</span> <span class="br0">&#91;</span><span class="br0">&#91;</span>org<span class="sy0">.</span>clojure<span class="sy0">/</span>clojure <span class="st0">&quot;1.1.0-alpha-SNAPSHOT&quot;</span><span class="br0">&#93;</span> 
    <span class="br0">&#91;</span>org<span class="sy0">.</span>clojure<span class="sy0">/</span>clojure<span class="sy0">-</span>contrib <span class="st0">&quot;1.0-SNAPSHOT&quot;</span><span class="br0">&#93;</span>
    <span class="br0">&#91;</span>mysql<span class="sy0">/</span>mysql<span class="sy0">-</span>connector<span class="sy0">-</span>java <span class="st0">&quot;5.1.6&quot;</span><span class="br0">&#93;</span>
    <span class="br0">&#91;</span>incanter<span class="sy0">/</span>incanter <span class="st0">&quot;1.0-master-SNAPSHOT&quot;</span><span class="br0">&#93;</span><span class="br0">&#93;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>(that is, add the mysql and incanter dependencies).</p>

<p>Download all the dependencies with <code>lein deps</code>.</p>

<p>Start up a REPL with <i>everything in the classpath</i> by just using <code>lein repl</code>.</p>
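<p>One quick way to convince yourself that the dependencies really are on the classpath is to try loading a class from one of them. The helper below is a sketch with a made-up name (<code>on-classpath?</code>); com.mysql.jdbc.Driver is the driver class shipped in the 5.1 series of the MySQL connector.</p>

```clojure
;; Hypothetical helper: true if the named class can be loaded from the
;; current classpath, false if it cannot be found.
(defn on-classpath? [class-name]
  (try
    (Class/forName class-name)
    true
    (catch ClassNotFoundException _ false)))

(on-classpath? "java.lang.String")       ; always loadable
(on-classpath? "com.mysql.jdbc.Driver")  ; true only if the connector jar was fetched
```

<p>If the second call returns false inside the REPL, the connector jar most likely did not make it into the project.</p>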

<p>Wow, that&#8217;s kind of nice.</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/setting-up-incanter-and-mysql/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tethering an iPhone with SSH and Windows 7</title>
		<link>http://www.brool.com/index.php/tethering-an-iphone-with-ssh-and-windows-7</link>
		<comments>http://www.brool.com/index.php/tethering-an-iphone-with-ssh-and-windows-7#comments</comments>
		<pubDate>Sun, 17 Jan 2010 04:59:10 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[iphone ssh tether jailbreak]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=415</guid>
		<description><![CDATA[I don&#8217;t need to tether through my iPhone &#8212; I always seem to be near a hotspot &#8212; but nonetheless I thought that spending 10 minutes to set it up was worth it, because when you need tethering, you really need tethering. If you&#8217;re willing to drop $10 bucks there are a couple of apps [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t need to tether through my iPhone (I always seem to be near a hotspot), but I thought that spending 10 minutes to set it up was worth it, because when you need tethering, you <i>really</i> need tethering.  If you&#8217;re willing to spend $10 there are a couple of apps in Cydia that can do this easily (PhoneModem and MyWi, I think), but I just wanted something that I could use in an emergency.</p>

<p>Unfortunately, despite the fact that there were some <a href="http://lifehacker.com/398961/get-your-computer-online-using-your-iphones-data-connection">good guides</a> on how to do it, it turned out there were a few gotchas, and it took significantly longer than 10 minutes.  So, in the hopes that I can save someone else the time:</p>

<p>Use <a href="http://lifehacker.com/398961/get-your-computer-online-using-your-iphones-data-connection">this guide</a>, but:</p>

<ul>
<li>In Windows 7, you&#8217;ll want to use the &#8220;Network and Sharing Center&#8221; (right click on your network icon) to create a new &#8220;computer-to-computer&#8221; network.  This is the one the iPhone will connect to.</li>
<li>When you configure PuTTY, you&#8217;ll want to make sure you go to the SSH/Tunnels page and a) add a dynamic tunnel to a local port (assume 8080) &mdash; you won&#8217;t need to specify a destination and can leave that blank, and b) go to the &#8220;Connection&#8221; page and specify a keepalive of 5 seconds or so.  This keeps the SSH connection between the computer and the iPhone from dying every 30 seconds.</li>
<li>You <i>cannot</i> use Chrome right now, because it does not properly forward DNS requests to the proxy (it took me hours to figure this out).  Use Firefox and specify a SOCKS v5 proxy at port 8080 (or whatever) for the proxy, but you&#8217;ll also need to make sure to go to the about:config page and change the network.proxy.socks_remote_dns to &#8220;true&#8221;.</li>
<li>Doing things in a particular order makes this work better.  If the 3G connection on the iPhone is not active, then there doesn&#8217;t seem to be a way to activate it after the phone links to your ad hoc connection.  So:  turn off wireless, go to a web page (this activates the 3G), turn on wireless, connect to your computer.  This way both the 3G and wireless connections are established.  An easy way to see if the 3G is active is to run MobileTerminal (or just SSH into the iPhone) and run ifconfig;  if the pdp_ip0 device is pointing to a real IP, then your 3G is active.</li>
</ul>

]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/tethering-an-iphone-with-ssh-and-windows-7/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A Modest Proposal</title>
		<link>http://www.brool.com/index.php/a-modest-proposal</link>
		<comments>http://www.brool.com/index.php/a-modest-proposal#comments</comments>
		<pubDate>Tue, 01 Sep 2009 16:21:25 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[clojure compojure php]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=373</guid>
		<description><![CDATA[Warning: a very rambling article; less a solid proposal than me just exploring an idea that may lead to a dead end in a week or two after I&#8217;ve thought about it. I really enjoy Clojure. Everything seems so well thought out and well designed; in a lot of ways it reminds me of Python, [...]]]></description>
			<content:encoded><![CDATA[<p><i>Warning: a very rambling article; less a solid proposal than me just exploring an idea that may lead to a dead end in a week or two after I&#8217;ve thought about it.</i></p>

<p>I really enjoy Clojure. Everything seems so well thought out and well designed; in a lot of ways it reminds me of Python, which is three ways to ironic because back in the day I started using Python because it was very Lispy. The cycle of life turns even on computer languages.</p>

<p>But unfortunately, I get the sense that there&#8217;s probably only room for one JVM language to make it big.  As much as I like Clojure, I think that Scala will probably be the one that succeeds.  It&#8217;s got the static typing, it&#8217;s got the Java-like performance, and it&#8217;s got a syntax that&#8217;s close-ish to Java, so it fills the niche that Java fills so well: a programming language that is ideal for large groups of developers.</p>

<p>However, it&#8217;s early enough in the development of Clojure that there&#8217;s the sense that maybe individual efforts could make a difference.  (As a counterexample, Haskell is too far along &mdash; it seems like everything has been done already, or anything that needs to be done requires a deep knowledge of category theory and the latest functional design papers.)</p>

<p>Assuming that I would want to help out Clojure, what could be done?</p>

<p>Donate? Actually, I did that. I figured that I got as many hours of enjoyment out of Clojure as I did out of a typical game, so I contributed the price of a game.</p>

<p>Libraries?  Well, I&#8217;ve written one or two.  Will probably write more.</p>

<p>If Clojure is going to succeed, it needs a <i>niche</i>, an area in which it performs so well that it blows everything else away.  Concurrent programming is one possibility, but there is a lot of competition there.</p>

<p>I have a modest proposal:</p>

<p>Clojure should be more like PHP.</p>

<p>Yes, yes, I know that programming in PHP is socially reprehensible, just one step above car salesmen and people who shop shirtless at Walmart.</p>

<p>But what if we were to give Clojure the trappings that make starting developers try it?  Metaphorically, it&#8217;s like <i>Gulliver&#8217;s Travels</i>; it <i>looks</i> like a humorous, light-hearted adventure story, but underneath it&#8217;s biting political satire and deep thoughts.  We lure developers in with the possibility of an easy, light-hearted way to write web sites&#8230; and then once they&#8217;re in there, once they&#8217;ve progressed from manipulating web sites and start to realize the benefits of advanced languages, they can deal with a sane language, built on top of the JVM and with the full power of a Lisp.</p>

<p>So call this Gulliver.</p>

<h2>Why Did PHP Succeed?</h2>

<p>My impression, based on absolutely no citations or hard evidence, is that PHP became popular primarily because it is:</p>

<ul>
<li>Easy to start trying &mdash; start with an HTML page, and just add one statement to start playing around with PHP itself</li>
<li>Easy to access data &mdash; typically, with MySQL</li>
<li>Easy to deploy &mdash; just upload a file and go</li>
</ul>

<p>So, continuing the gedankenexperiment: if we were designing something that would be very approachable and very easy to use, what would it look like?</p>

<h2>Making Compojure More Approachable</h2>

<p>Compojure is an excellent base to start with, but it strays away from templates, and instead tries to do everything with compojure.html, a library that allows you to form everything with vectors and lists:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="lisp"><pre class="de1"><span class="br0">&#40;</span>html <span class="br0">&#91;</span><span class="sy0">:</span><span class="me1">p</span> <span class="st0">&quot;2 + 2 = &quot;</span> <span class="br0">&#40;</span>+ <span class="nu0">2</span> <span class="nu0">2</span><span class="br0">&#41;</span><span class="br0">&#93;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Handy? Yes, if you&#8217;re a developer. But if a web designer is doing your HTML, or if you&#8217;re just starting out programming and you already know HTML, it seems an additional barrier to entry.  So what if we turn it inside out and do it like PHP does it?</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="html4strict"><pre class="de1"><span class="sc2">&lt;<span class="kw2">p</span>&gt;</span>2 + 2 = <span class="sc2">&lt;?<span class="sy0">=</span> <span class="br0">&#40;</span>+ <span class="nu0">2</span> <span class="nu0">2</span><span class="br0">&#41;</span> ?&gt;</span></pre></div></div></div></div></div></div></div>




<p>Trying something more complicated, in compojure.html this:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="lisp"><pre class="de1"><span class="br0">&#40;</span>html <span class="br0">&#91;</span><span class="sy0">:</span><span class="me1">ul</span> <span class="br0">&#40;</span>map <span class="br0">&#40;</span>fn <span class="br0">&#91;</span>x<span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="sy0">:</span><span class="me1">li</span> x<span class="br0">&#93;</span><span class="br0">&#41;</span> <span class="br0">&#40;</span>range <span class="nu0">10</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#93;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>becomes:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="html4strict"><pre class="de1"><span class="sc2">&lt;<span class="kw2">ul</span>&gt;</span>
  <span class="sc2">&lt;? <span class="br0">&#40;</span>dotimes <span class="br0">&#91;</span>x <span class="nu0">10</span><span class="br0">&#93;</span> ?&gt;</span>
  <span class="sc2">&lt;<span class="kw2">li</span>&gt;&lt;?<span class="sy0">=</span> x ?&gt;</span>
  <span class="sc2">&lt;? <span class="br0">&#41;</span> ?&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">ul</span>&gt;</span></pre></div></div></div></div></div></div></div>




<p>A little bit longer, but treating it as HTML makes it conceptually easier for beginning programmers to understand, especially if you want to do something like changing the id or class with every row so your CSS can make it all pretty:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="html4strict"><pre class="de1"><span class="sc2">&lt;<span class="kw2">ul</span>&gt;</span>
  <span class="sc2">&lt;? <span class="br0">&#40;</span>dotimes <span class="br0">&#91;</span>x <span class="nu0">10</span><span class="br0">&#93;</span> ?&gt;</span>
  <span class="sc2">&lt;<span class="kw2">li</span> <span class="kw3">id</span><span class="sy0">=</span><span class="st0">&quot;list-&lt;?= x ?&gt;</span></span>&quot; class=&quot;<span class="sc2">&lt;?<span class="sy0">=</span> <span class="br0">&#40;</span>if <span class="br0">&#40;</span>even? x<span class="br0">&#41;</span> <span class="st0">&quot;even&quot;</span> <span class="st0">&quot;odd&quot;</span><span class="br0">&#41;</span> ?&gt;</span>&quot;&gt;<span class="sc2">&lt;?<span class="sy0">=</span> x ?&gt;&lt;<span class="sy0">/</span><span class="kw2">li</span>&gt;</span>
  <span class="sc2">&lt;? <span class="br0">&#41;</span> ?&gt;</span>
<span class="sc2">&lt;<span class="sy0">/</span><span class="kw2">ul</span>&gt;</span></pre></div></div></div></div></div></div></div>




<p>Doing this in compojure.html is trickier:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="lisp"><pre class="de1"><span class="br0">&#40;</span>html 
  <span class="br0">&#91;</span><span class="sy0">:</span><span class="me1">ul</span> 
    <span class="br0">&#40;</span>map <span class="br0">&#40;</span>fn <span class="br0">&#91;</span>x<span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="sy0">:</span><span class="me1">li</span> <span class="br0">&#123;</span><span class="sy0">:</span><span class="me1">id</span> <span class="br0">&#40;</span>str <span class="st0">&quot;list-&quot;</span> x<span class="br0">&#41;</span> <span class="sy0">:</span><span class="me1">class</span> <span class="br0">&#40;</span><span class="kw1">if</span> <span class="br0">&#40;</span>even? x<span class="br0">&#41;</span> <span class="st0">&quot;even&quot;</span> <span class="st0">&quot;odd&quot;</span><span class="br0">&#41;</span><span class="br0">&#125;</span> x<span class="br0">&#93;</span><span class="br0">&#41;</span> 
    <span class="br0">&#40;</span>range <span class="nu0">10</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#93;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>So, let&#8217;s lure people in with ease of use, and they can pick up vectors and S-exprs and notational convenience later.  Converting a template file into the basic code is easy:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="lisp"><pre class="de1"><span class="br0">&#40;</span>defn- transform-string
  <span class="st0">&quot;Internal.  Transform a string into the appropriate echo statement.
An empty string translates into a single space.&quot;</span>  
  <span class="br0">&#91;</span>s<span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">if</span> <span class="br0">&#40;</span><span class="sy0">&gt;</span> <span class="br0">&#40;</span><span class="sy0">.</span><span class="kw1">length</span> s<span class="br0">&#41;</span> <span class="nu0">0</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>format <span class="st0">&quot;(gulliver/echo %s)&quot;</span> <span class="br0">&#40;</span>pr-str s<span class="br0">&#41;</span><span class="br0">&#41;</span>
  <span class="st0">&quot; &quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
&nbsp;
<span class="br0">&#40;</span>defn- transform-code
  <span class="st0">&quot;Internal.  Transform code so that it can run as a render routine.&quot;</span>
  <span class="br0">&#91;</span>s<span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">if</span> <span class="br0">&#40;</span><span class="sy0">=</span> <span class="br0">&#40;</span>first s<span class="br0">&#41;</span> \<span class="sy0">=</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>format <span class="st0">&quot;(gulliver/echo %s)&quot;</span> <span class="br0">&#40;</span><span class="sy0">.</span>substring s <span class="nu0">1</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
    s<span class="br0">&#41;</span><span class="br0">&#41;</span>
&nbsp;
<span class="br0">&#40;</span>defn- convert* 
  <span class="st0">&quot;Internal.  Given a string that represents a Clojure template,
produce a flat list of the statements necessary to produce that template.&quot;</span>
  <span class="br0">&#91;</span>s<span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span>m <span class="br0">&#40;</span>re-matcher #<span class="st0">&quot;(?s)(.*?)&lt;<span class="es0">\?</span>(.+?)<span class="es0">\?</span>&gt;|(.+?)$&quot;</span> s<span class="br0">&#41;</span><span class="br0">&#93;</span>
    <span class="br0">&#40;</span>loop <span class="br0">&#91;</span>next <span class="br0">&#40;</span>re-find m<span class="br0">&#41;</span> buffer <span class="br0">&#91;</span><span class="st0">&quot;'(&quot;</span><span class="br0">&#93;</span><span class="br0">&#93;</span>
      <span class="br0">&#40;</span><span class="kw1">cond</span>
       <span class="br0">&#40;</span><span class="kw1">not</span> <span class="br0">&#40;</span><span class="kw1">nil</span>? <span class="br0">&#40;</span><span class="kw1">nth</span> next <span class="nu0">3</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> 
           <span class="br0">&#40;</span><span class="kw1">eval</span> <span class="br0">&#40;</span>read-string <span class="br0">&#40;</span><span class="kw1">apply</span> str <span class="br0">&#40;</span>conj buffer <span class="br0">&#40;</span>transform-string <span class="br0">&#40;</span><span class="kw1">nth</span> next <span class="nu0">3</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="st0">&quot;)&quot;</span> <span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
       <span class="sy0">:</span><span class="me1">otherwise</span> <span class="br0">&#40;</span>recur 
                   <span class="br0">&#40;</span>re-find m<span class="br0">&#41;</span> 
                   <span class="br0">&#40;</span>conj buffer 
                         <span class="br0">&#40;</span>transform-string <span class="br0">&#40;</span><span class="kw1">nth</span> next <span class="nu0">1</span><span class="br0">&#41;</span><span class="br0">&#41;</span> 
                         <span class="br0">&#40;</span>transform-code <span class="br0">&#40;</span><span class="kw1">nth</span> next <span class="nu0">2</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Resulting in&#8230;</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="lisp"><pre class="de1"><span class="sy0">==&gt;</span> <span class="br0">&#40;</span>convert* <span class="st0">&quot;&lt;ul&gt;&lt;? (dotimes [x 10] ?&gt;&lt;li&gt;&lt;?= x ?&gt;&lt;/li&gt;&lt;? ) ?&gt;&lt;/ul&gt;&quot;</span><span class="br0">&#41;</span>
&nbsp;
<span class="br0">&#40;</span><span class="br0">&#40;</span>gulliver/echo <span class="st0">&quot;&lt;ul&gt;&quot;</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="kw1">dotimes</span> <span class="br0">&#91;</span>x <span class="nu0">10</span><span class="br0">&#93;</span> <span class="br0">&#40;</span>gulliver/echo <span class="st0">&quot;&lt;li&gt;&quot;</span><span class="br0">&#41;</span> <span class="br0">&#40;</span>gulliver/echo x<span class="br0">&#41;</span> <span class="br0">&#40;</span>gulliver/echo <span class="st0">&quot;&lt;/li&gt;&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#40;</span>gulliver/echo <span class="st0">&quot;&lt;/ul&gt;&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Once you have the basic code generation, it&#8217;s easy enough to hook it up to a default servlet that automatically invokes it when necessary. There&#8217;s a fair bit of handwaving going on here &mdash; regexes are questionable for this, there are some interesting issues with namespaces, and it would require some method of resolving dependencies between files, but heck, it&#8217;s only 100 lines of code;  it was really quick to put together and gives us a sense of whether this can work out.</p>
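
<p>As a rough picture of what &#8220;invoking&#8221; the generated forms could mean, here is a minimal sketch.  It assumes that <code>echo</code> simply writes to <code>*out*</code>; the namespace <code>gulliver-sketch</code> and the function <code>render</code> are hypothetical names for illustration, not necessarily how the code on github does it.</p>

<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">;; Hypothetical namespace and names; the real gulliver/echo may differ.
(ns gulliver-sketch)

(defn echo
  "Assumed behaviour: write the value to whatever *out* is currently bound to."
  [x]
  (print (str x)))

(defn render
  "Evaluate each generated form in order and return everything echoed as one string."
  [forms]
  (with-out-str
    (doseq [form forms]
      (eval form))))

(render '((gulliver-sketch/echo "&lt;ul&gt;")
          (dotimes [x 3]
            (gulliver-sketch/echo "&lt;li&gt;")
            (gulliver-sketch/echo x)
            (gulliver-sketch/echo "&lt;/li&gt;"))
          (gulliver-sketch/echo "&lt;/ul&gt;")))
;; =&gt; "&lt;ul&gt;&lt;li&gt;0&lt;/li&gt;&lt;li&gt;1&lt;/li&gt;&lt;li&gt;2&lt;/li&gt;&lt;/ul&gt;"</pre></div></div></div></div></div></div></div>

<p>Capturing output with <code>with-out-str</code> keeps the sketch short; a real servlet would more likely bind <code>*out*</code> to the response writer so nothing has to be buffered.</p>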

<p>So, with the <a href="http://github.com/brool/gulliver/tree/master">source on github</a>, your basic server becomes&#8230;</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>use 'gulliver<span class="br0">&#41;</span>
<span class="br0">&#40;</span>defserver web<span class="sy0">-</span>server
  <span class="br0">&#123;</span>:<span class="me1">port</span> <span class="nu0">8080</span><span class="br0">&#125;</span>
  <span class="st0">&quot;/*&quot;</span>  <span class="br0">&#40;</span>servlet gulliver<span class="sy0">/</span>template<span class="sy0">-</span>servlet<span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Easy!  As a test, an example file:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="sy0">&lt;</span>?
<span class="br0">&#40;</span><span class="kw1">import</span> 'java<span class="sy0">.</span>util<span class="sy0">.</span>Date<span class="br0">&#41;</span>
?<span class="sy0">&gt;</span>
&nbsp;
<span class="sy0">&lt;</span>html<span class="sy0">&gt;</span>
  <span class="sy0">&lt;</span>body<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>h1<span class="sy0">&gt;</span>Delivered via Gulliver<span class="sy0">&lt;/</span>h1<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>p<span class="sy0">&gt;</span>Expressions at work: <span class="sy0">&lt;</span>?<span class="sy0">=</span> <span class="br0">&#40;</span><span class="sy0">+</span> <span class="nu0">2</span> <span class="nu0">2</span><span class="br0">&#41;</span> ?<span class="sy0">&gt;&lt;/</span>p<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>p<span class="sy0">&gt;</span>Looping:<span class="sy0">&lt;/</span>p<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>ul<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>? <span class="br0">&#40;</span><span class="kw1">dotimes</span> <span class="br0">&#91;</span>x <span class="nu0">10</span><span class="br0">&#93;</span> ?<span class="sy0">&gt;</span>
      <span class="sy0">&lt;</span>li<span class="sy0">&gt;&lt;</span>?<span class="sy0">=</span> x ?<span class="sy0">&gt;&lt;/</span>li<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>? <span class="br0">&#41;</span> ?<span class="sy0">&gt;</span>
    <span class="sy0">&lt;/</span>ul<span class="sy0">&gt;</span>
&nbsp;
    <span class="sy0">&lt;</span>p<span class="sy0">&gt;</span>Here's a date:  <span class="sy0">&lt;</span>?<span class="sy0">=</span> <span class="br0">&#40;</span>Date<span class="sy0">.</span><span class="br0">&#41;</span> ?<span class="sy0">&gt;&lt;/</span>p<span class="sy0">&gt;</span>
  <span class="sy0">&lt;/</span>body<span class="sy0">&gt;</span>
<span class="sy0">&lt;/</span>html<span class="sy0">&gt;</span></pre></div></div></div></div></div></div></div>




<p>&#8230; does the right thing. How about a time test?</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="sy0">&lt;</span>html<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>body<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>h1<span class="sy0">&gt;</span>Starting <span class="kw1">time</span> test<span class="sy0">...&lt;/</span>h1<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>? <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span> start <span class="br0">&#40;</span>System<span class="sy0">/</span>currentTimeMillis<span class="br0">&#41;</span> <span class="br0">&#93;</span> ?<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>p<span class="sy0">&gt;</span>Adding <span class="nu0">1</span><span class="sy0">..</span><span class="nu0">100000</span>:  <span class="sy0">&lt;</span>?<span class="sy0">=</span> <span class="br0">&#40;</span><span class="kw1">reduce</span> <span class="sy0">+</span> <span class="nu0">0</span> <span class="br0">&#40;</span><span class="kw1">range</span> <span class="nu0">1</span> <span class="nu0">1000001</span><span class="br0">&#41;</span><span class="br0">&#41;</span> ?<span class="sy0">&gt;&lt;/</span>p<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>p<span class="sy0">&gt;</span><span class="br0">&#40;</span>took <span class="sy0">&lt;</span>?<span class="sy0">=</span> <span class="br0">&#40;</span><span class="sy0">-</span> <span class="br0">&#40;</span>System<span class="sy0">/</span>currentTimeMillis<span class="br0">&#41;</span> start<span class="br0">&#41;</span> ?<span class="sy0">&gt;</span> ms<span class="br0">&#41;</span><span class="sy0">&lt;/</span>p<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>? <span class="br0">&#41;</span> ?<span class="sy0">&gt;</span> 
    <span class="sy0">&lt;/</span>body<span class="sy0">&gt;</span>
<span class="sy0">&lt;/</span>html<span class="sy0">&gt;</span></pre></div></div></div></div></div></div></div>




<p>&#8230; runs in about 50ms.  PHP takes about 130ms on the same sequence on my laptop, so, while there are tons of optimizations we could do (like not doing a stat on the file every time the page is rendered (!)), it nonetheless performs well enough that maybe this is a viable option. Finally, as a special bonus, a stupid eval trick that evaluates Clojure code:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="sy0">&lt;</span>html<span class="sy0">&gt;</span>
    <span class="sy0">&lt;</span>body<span class="sy0">&gt;</span>
        <span class="sy0">&lt;</span>? <span class="br0">&#40;</span><span class="kw1">when</span> <span class="br0">&#40;</span><span class="br0">&#40;</span>request :<span class="me1">params</span><span class="br0">&#41;</span> :<span class="me1">code</span><span class="br0">&#41;</span> ?<span class="sy0">&gt;</span>
        <span class="sy0">&lt;</span>pre style<span class="sy0">=</span><span class="st0">&quot;border: 1px dotted&quot;</span><span class="sy0">&gt;</span>
        <span class="sy0">&lt;</span>?<span class="sy0">=</span> <span class="br0">&#40;</span><span class="kw1">str</span> <span class="br0">&#40;</span>eval <span class="br0">&#40;</span>read<span class="sy0">-</span>string <span class="br0">&#40;</span><span class="br0">&#40;</span>request :<span class="me1">params</span><span class="br0">&#41;</span> :<span class="me1">code</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> ?<span class="sy0">&gt;</span>
        <span class="sy0">&lt;/</span>pre<span class="sy0">&gt;</span>
        <span class="sy0">&lt;</span>? <span class="br0">&#41;</span> ?<span class="sy0">&gt;</span>
        <span class="sy0">&lt;</span>form method<span class="sy0">=</span><span class="st0">&quot;post&quot;</span><span class="sy0">&gt;</span>
        <span class="sy0">&lt;</span>textarea name<span class="sy0">=</span><span class="st0">&quot;code&quot;</span> rows<span class="sy0">=</span><span class="nu0">10</span> cols<span class="sy0">=</span><span class="nu0">80</span><span class="sy0">&gt;&lt;/</span>textarea<span class="sy0">&gt;&lt;</span>br<span class="sy0">/&gt;</span>
        <span class="sy0">&lt;</span>input type<span class="sy0">=</span><span class="st0">&quot;submit&quot;</span> value<span class="sy0">=</span><span class="st0">&quot;Evaluate&quot;</span> <span class="sy0">/&gt;</span>
        <span class="sy0">&lt;/</span>form<span class="sy0">&gt;</span>
    <span class="sy0">&lt;/</span>body<span class="sy0">&gt;</span>
<span class="sy0">&lt;/</span>html<span class="sy0">&gt;</span></pre></div></div></div></div></div></div></div>




<p>Take that, non-dynamic languages!</p>

<p><a href="http://github.com/brool/gulliver/tree/master">Github copy of the source</a></p>

<p>Next part:  Schema-ing Against MySQL</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/a-modest-proposal/feed</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Snippet: Automatic Proxy Creation in Clojure</title>
		<link>http://www.brool.com/index.php/snippet-automatic-proxy-creation-in-clojure</link>
		<comments>http://www.brool.com/index.php/snippet-automatic-proxy-creation-in-clojure#comments</comments>
		<pubDate>Fri, 21 Aug 2009 22:20:42 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[macro]]></category>
		<category><![CDATA[proxy]]></category>
		<category><![CDATA[snippet]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=366</guid>
		<description><![CDATA[The proxy function makes it easy for Clojure to interface with the Java layer, but I was dealing with an interface (the AIM Java API) that had a punitive number of things that needed to be overridden&#8230; public void OnIdleStateChange&#40;AccSession arg0, int arg1&#41; &#123; &#125; &#160; public void OnInstanceChange&#40;AccSession arg0, AccInstance arg1, AccInstance arg2, AccInstanceProp [...]]]></description>
			<content:encoded><![CDATA[<p>The proxy function makes it easy for Clojure to interface with the Java layer, but I was dealing with an interface (the AIM Java API) that had a punitive number of things that needed to be overridden&#8230;</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="java"><pre class="de1"><span class="kw1">public</span> <span class="kw4">void</span> OnIdleStateChange<span class="br0">&#40;</span>AccSession arg0, <span class="kw4">int</span> arg1<span class="br0">&#41;</span> <span class="br0">&#123;</span>
<span class="br0">&#125;</span>
&nbsp;
<span class="kw1">public</span> <span class="kw4">void</span> OnInstanceChange<span class="br0">&#40;</span>AccSession arg0, AccInstance arg1, AccInstance arg2, AccInstanceProp arg3<span class="br0">&#41;</span> <span class="br0">&#123;</span>
<span class="br0">&#125;</span>
&nbsp;
<span class="kw1">public</span> <span class="kw4">void</span> OnLookupUsersResult<span class="br0">&#40;</span>AccSession arg0, <span class="kw3">String</span><span class="br0">&#91;</span><span class="br0">&#93;</span> arg1, <span class="kw4">int</span> arg2, AccResult arg3, AccUser<span class="br0">&#91;</span><span class="br0">&#93;</span> arg4<span class="br0">&#41;</span> <span class="br0">&#123;</span>
<span class="br0">&#125;</span>
&nbsp;
<span class="kw1">public</span> <span class="kw4">void</span> OnSearchDirectoryResult<span class="br0">&#40;</span>AccSession arg0, <span class="kw4">int</span> arg1, AccResult arg2, AccDirEntry arg3<span class="br0">&#41;</span> <span class="br0">&#123;</span>
<span class="br0">&#125;</span>
&nbsp;
<span class="co1">// ... go on like this for pages</span></pre></div></div></div></div></div></div></div>




<p>The Java code is <a href="http://dev.aol.com/aimclient/OpenAIM182/samples/accjsample/AccJSample.java">here</a>, if you&#8217;re interested in the entire set of calls.  Now, I didn&#8217;t care about most of those events, but I had to override them, since they didn&#8217;t have a default implementation.  What made this seem painful was that I was really only interested in two of the callbacks.  So I started to record an Emacs macro to convert the Java code to the equivalent Clojure proxy statement, and then I realized that I didn&#8217;t have to &mdash; <i>I was using a Lisp</i>.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">defmacro</span> auto<span class="sy0">-</span><span class="kw1">proxy</span> <span class="br0">&#91;</span>interfaces variables <span class="sy0">&amp;</span> args<span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span>defined <span class="br0">&#40;</span><span class="kw1">set</span> <span class="br0">&#40;</span><span class="kw1">map</span> #<span class="br0">&#40;</span><span class="kw1">str</span> <span class="br0">&#40;</span><span class="kw1">first</span> <span class="sy0">%</span><span class="br0">&#41;</span><span class="br0">&#41;</span> args<span class="br0">&#41;</span><span class="br0">&#41;</span>
        names <span class="br0">&#40;</span><span class="kw1">fn</span> <span class="br0">&#91;</span>i<span class="br0">&#93;</span> <span class="br0">&#40;</span><span class="kw1">map</span> #<span class="br0">&#40;</span><span class="sy0">.</span>getName <span class="sy0">%</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="sy0">.</span>getMethods i<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
        all<span class="sy0">-</span>names <span class="br0">&#40;</span><span class="kw1">into</span> #<span class="br0">&#123;</span><span class="br0">&#125;</span> <span class="br0">&#40;</span><span class="kw1">apply</span> <span class="kw1">concat</span> <span class="br0">&#40;</span><span class="kw1">map</span> names <span class="br0">&#40;</span><span class="kw1">map</span> resolve interfaces<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
        undefined <span class="br0">&#40;</span>difference all<span class="sy0">-</span>names defined<span class="br0">&#41;</span> 
        auto<span class="sy0">-</span>gen <span class="br0">&#40;</span><span class="kw1">map</span> <span class="br0">&#40;</span><span class="kw1">fn</span> <span class="br0">&#91;</span>x<span class="br0">&#93;</span> `<span class="br0">&#40;</span>~<span class="br0">&#40;</span>symbol x<span class="br0">&#41;</span> <span class="br0">&#91;</span><span class="sy0">&amp;</span> ~'args<span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> undefined<span class="br0">&#41;</span><span class="br0">&#93;</span>
    `<span class="br0">&#40;</span><span class="kw1">proxy</span> ~interfaces ~variables ~@args ~@auto<span class="sy0">-</span>gen<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Auto-proxy works just like proxy, but it makes an empty implementation for any call that wasn&#8217;t defined.  So, suddenly, what would have been a bunch of lines collapsed into just:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">defn</span> create<span class="sy0">-</span>aim<span class="sy0">-</span><span class="kw1">proxy</span> <span class="br0">&#91;</span><span class="br0">&#93;</span>
  <span class="br0">&#40;</span>auto<span class="sy0">-</span><span class="kw1">proxy</span> <span class="br0">&#91;</span>com<span class="sy0">.</span>aol<span class="sy0">.</span>acc<span class="sy0">.</span>AccEvents<span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="br0">&#93;</span>
     <span class="br0">&#40;</span>OnImReceived <span class="br0">&#91;</span>session imSession participant im<span class="br0">&#93;</span> 
        <span class="br0">&#40;</span>handle<span class="sy0">-</span>im session imSession participant im<span class="br0">&#41;</span><span class="br0">&#41;</span>
     <span class="br0">&#40;</span>OnStateChange <span class="br0">&#91;</span>arg0 arg1 arg2<span class="br0">&#93;</span>
        <span class="br0">&#40;</span>handle<span class="sy0">-</span>state<span class="sy0">-</span>change arg0 arg1 arg2<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>
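<p>For trying the macro without the AIM SDK, here is a minimal sketch against a standard JDK interface. It assumes <code>difference</code> is referred from <code>clojure.set</code> (the macro above uses it unqualified); the namespace <code>auto-proxy-demo</code> and the names <code>clicks</code> and <code>listener</code> are made up for illustration.</p>

```clojure
(ns auto-proxy-demo
  (:require [clojure.set :refer [difference]]))

;; Same macro as above, repeated so this snippet is self-contained.
(defmacro auto-proxy [interfaces variables & args]
  (let [defined   (set (map #(str (first %)) args))
        names     (fn [i] (map #(.getName %) (.getMethods i)))
        all-names (into #{} (apply concat (map names (map resolve interfaces))))
        undefined (difference all-names defined)
        ;; one empty, variadic method body per method that was not supplied
        auto-gen  (map (fn [x] `(~(symbol x) [& ~'args])) undefined)]
    `(proxy ~interfaces ~variables ~@args ~@auto-gen)))

(def clicks (atom 0))

;; MouseListener declares five methods; only mouseClicked is written by hand.
(def listener
  (auto-proxy [java.awt.event.MouseListener] []
    (mouseClicked [event] (swap! clicks inc))))

;; nil stands in for a real MouseEvent here.
(.mouseClicked ^java.awt.event.MouseListener listener nil) ; runs the handler
(.mouseExited  ^java.awt.event.MouseListener listener nil) ; generated no-op
```

<p>The generated bodies are empty, so they return nil. That is fine for void callbacks like these, but an interface method that returns a primitive would probably need a hand-written implementation.</p>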




<p>Macros ftw.  The nice thing about Clojure/Lisp is that it makes coding up this kind of reusable framework stuff really easy.</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/snippet-automatic-proxy-creation-in-clojure/feed</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Pattern Matching In Clojure</title>
		<link>http://www.brool.com/index.php/pattern-matching-in-clojure</link>
		<comments>http://www.brool.com/index.php/pattern-matching-in-clojure#comments</comments>
		<pubDate>Wed, 12 Aug 2009 09:38:57 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[ocaml]]></category>
		<category><![CDATA[pattern]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=351</guid>
		<description><![CDATA[Updated: Note that this is available as a clojars module. Clojure code density seems to be pretty good. There are a fair number of convenient shortforms in the language; for example, associative datatypes all act as a function &#8212; so given a hash map you can reference it with (my-hashmap :key). The base language itself [...]]]></description>
			<content:encoded><![CDATA[<p><b>Updated</b>:  Note that this is available as a <a href="http://clojars.org/pattern-match">clojars module</a>.</p>

<p>Clojure code density seems to be pretty good. There are a fair number of convenient shortforms in the language; for example, associative datatypes all act as a function &mdash; so given a hash map you can reference it with (my-hashmap :key).  The base language itself is probably about as expressive as Python (or a bit better), but you have the added advantage of being able to use macros as needed to really get the code density up.</p>
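<p>A quick illustration of that shortform, using only core Clojure (the name <code>my-hashmap</code> and its contents are just placeholders):</p>

```clojure
(def my-hashmap {:key "value"})

(my-hashmap :key)            ; the map called as a function of its key
(:key my-hashmap)            ; the keyword called as a function of the map
(my-hashmap :missing :none)  ; optional default when the key is absent
([10 20 30] 1)               ; vectors are functions of their index
```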

<p>Nonetheless, I really wanted something like OCaml&#8217;s / Haskell&#8217;s pattern matching; it makes some code wonderfully concise.</p>

<p>Accordingly, I hacked something up, based on Clojure&#8217;s built-in destructuring. Some examples:</p>

<p>Literal values match against the same value, while _ matches against any non-nil value (and nil matches only nil).  Additionally, :when clauses can be used for conditional checks.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">; simple recursive evaluator</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> arithmetic <span class="br0">&#91;</span>lst<span class="br0">&#93;</span>
  <span class="br0">&#40;</span>match lst
    v  :<span class="kw1">when</span> <span class="br0">&#40;</span>number? v<span class="br0">&#41;</span>  v
    <span class="br0">&#91;</span> _ <span class="st0">&quot;error&quot;</span> _<span class="br0">&#93;</span>     <span class="st0">&quot;error&quot;</span>
    <span class="br0">&#91;</span> _ _ <span class="st0">&quot;error&quot;</span><span class="br0">&#93;</span>     <span class="st0">&quot;error&quot;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;add&quot;</span> a b <span class="br0">&#93;</span>      <span class="br0">&#40;</span><span class="sy0">+</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic b<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;sub&quot;</span> a b <span class="br0">&#93;</span>      <span class="br0">&#40;</span><span class="sy0">-</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic b<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;mul&quot;</span> a b <span class="br0">&#93;</span>      <span class="br0">&#40;</span><span class="sy0">*</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic b<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;div&quot;</span> a b <span class="br0">&#93;</span>      <span class="br0">&#40;</span><span class="sy0">/</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic b<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;squared&quot;</span> a <span class="br0">&#93;</span>    <span class="br0">&#40;</span>arithmetic <span class="br0">&#91;</span><span class="st0">&quot;mul&quot;</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span><span class="br0">&#93;</span><span class="br0">&#41;</span>
    _                  <span class="st0">&quot;error&quot;</span> <span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Both collections and single values can be used:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">;; return the signum of a number</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> signum <span class="br0">&#91;</span>x<span class="br0">&#93;</span>
  <span class="br0">&#40;</span>match x 
     <span class="nu0">0</span> <span class="nu0">0</span>
     n :<span class="kw1">when</span> <span class="br0">&#40;</span><span class="sy0">&lt;</span> n <span class="nu0">0</span><span class="br0">&#41;</span> <span class="sy0">-</span><span class="nu0">1</span>
     _ <span class="nu0">1</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>The pattern matching is stricter than the typical destructure;  whereas [ a b ] will destructure against a list of any number of elements, [ a b ] will pattern match only against a list of two elements.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>match x 
    <span class="br0">&#91;</span><span class="br0">&#93;</span>    <span class="st0">&quot;empty&quot;</span>
    <span class="br0">&#91;</span>_<span class="br0">&#93;</span>   <span class="st0">&quot;one element&quot;</span>
    <span class="br0">&#91;</span>a a<span class="br0">&#93;</span> <span class="st0">&quot;two identical elements&quot;</span>
    <span class="br0">&#91;</span>_ _<span class="br0">&#93;</span> <span class="st0">&quot;two elements&quot;</span>
    _     <span class="st0">&quot;three or more&quot;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>
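<p>For comparison, here is how lenient plain destructuring is, next to a hand-written length check. This is a sketch using only core Clojure; <code>lenient-pair</code> and <code>strict-pair</code> are illustrative names, not part of the matcher.</p>

```clojure
;; Plain destructuring never checks the length of the collection.
(defn lenient-pair [coll]
  (let [[a b] coll]
    [a b]))

(lenient-pair [1 2 3]) ; extra element is silently ignored
(lenient-pair [1])     ; missing element is bound to nil

;; A hand-written strict check: only sequential collections of exactly
;; two elements get through, anything else yields nil.
(defn strict-pair [coll]
  (when (and (sequential? coll) (= 2 (count coll)))
    (let [[a b] coll]
      [a b])))

(strict-pair [1 2])   ; destructures normally
(strict-pair [1 2 3]) ; rejected
```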




<p>If the same variable occurs in multiple locations in the parameter list, it will be checked for equality.  The &#038; tail form can be used to specify the rest of the list.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">;; count identical elements in the same location in two lists:</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> <span class="kw1">count</span><span class="sy0">=</span> <span class="br0">&#91;</span> lst1 lst2 <span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">loop</span> <span class="br0">&#91;</span> a lst1 b lst2 <span class="kw1">count</span> <span class="nu0">0</span> <span class="br0">&#93;</span>
    <span class="br0">&#40;</span>match <span class="br0">&#91;</span>a b<span class="br0">&#93;</span>
      <span class="br0">&#91;</span><span class="br0">&#91;</span>e <span class="sy0">&amp;</span> at<span class="br0">&#93;</span> <span class="br0">&#91;</span>e <span class="sy0">&amp;</span> bt<span class="br0">&#93;</span><span class="br0">&#93;</span>  <span class="br0">&#40;</span><span class="kw1">recur</span> at bt <span class="br0">&#40;</span><span class="kw1">inc</span> <span class="kw1">count</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
      <span class="br0">&#91;</span><span class="br0">&#91;</span>_ <span class="sy0">&amp;</span> at<span class="br0">&#93;</span> <span class="br0">&#91;</span>_ <span class="sy0">&amp;</span> bt<span class="br0">&#93;</span><span class="br0">&#93;</span>  <span class="br0">&#40;</span><span class="kw1">recur</span> at bt <span class="kw1">count</span><span class="br0">&#41;</span>
      _                    <span class="kw1">count</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Note that this is slightly more flexible than Haskell / ML, in that a variable of the same name can appear in multiple places in the pattern.</p>
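<p>As a cross-check for count=, the same number can be computed without pattern matching. This is a sketch using only core functions; <code>count-equal</code> is a made-up name, and it assumes the lists contain no nil elements (the matcher treats nil specially).</p>

```clojure
;; Compare the two lists position by position and count the matches.
;; map stops at the end of the shorter list, just as the loop above does.
(defn count-equal [lst1 lst2]
  (count (filter true? (map = lst1 lst2))))

(count-equal [1 2 3 4] [1 9 3 7]) ; positions 0 and 2 agree
(count-equal [:a :b] [:a :b :c])  ; the trailing :c has no partner
```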

<h3>Defining</h3>

<p>You can use the defnp macro to define a function that is pattern matched; it defines a function that takes one argument and has an implicit match statement.  For example, the signum function can be written:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="br0">&#40;</span>defnp signum
       <span class="nu0">0</span> <span class="nu0">0</span>
       n :<span class="kw1">when</span> <span class="br0">&#40;</span><span class="sy0">&lt;</span> n <span class="nu0">0</span><span class="br0">&#41;</span> <span class="sy0">-</span><span class="nu0">1</span>
       _ <span class="nu0">1</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>(Thanks to <a href="http://infolace.blogspot.com/">Tom Faulhaber</a> for suggesting this)</p>

<h3>Gotchas</h3>

<p>Clojure&#8217;s destructuring will throw an exception if you try to destructure a non-collection value (such as a number) against a collection pattern.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span><span class="br0">&#91;</span>a b<span class="br0">&#93;</span> <span class="nu0">10</span><span class="br0">&#93;</span> a<span class="br0">&#41;</span>
java<span class="sy0">.</span>lang<span class="sy0">.</span>UnsupportedOperationException: <span class="me1">nth</span> <span class="kw1">not</span> supported on this type: <span class="me1">Integer</span> <span class="br0">&#40;</span>NO_SOURCE_FILE:<span class="nu0">0</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>&#8230; so be sure to check such cases early in your match statement, if they are possible.</p>
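<p>The arithmetic example earlier does this by putting the number? clause first. In plain Clojure the same guard could look roughly like this (<code>first-of-pair</code> and the <code>:not-a-collection</code> marker are illustrative only):</p>

```clojure
;; Check the shape of the value before destructuring it, so a bare
;; number never reaches the vector pattern.
(defn first-of-pair [x]
  (if (sequential? x)
    (let [[a b] x]
      a)
    :not-a-collection))

(first-of-pair [1 2]) ; safe to destructure
(first-of-pair 10)    ; caught by the guard instead of throwing
```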

<h3>How It Works</h3>

<p>The pattern matcher uses the built-in Clojure destructuring as the main mechanism, but adorns it so that the pattern can be verified.  For example, the code:


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>match x <span class="br0">&#91;</span>a a<span class="br0">&#93;</span> <span class="st0">&quot;two identical&quot;</span> <span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




turns into essentially the following:


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span> <span class="br0">&#91;</span> a g0001 <span class="sy0">&amp;</span> g0002 <span class="br0">&#93;</span> x <span class="br0">&#93;</span> 
     <span class="br0">&#40;</span><span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw1">and</span> <span class="br0">&#40;</span><span class="kw1">not</span> <span class="br0">&#40;</span><span class="kw1">nil?</span> a<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="sy0">=</span> g0001 a<span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="kw1">nil?</span> g0002<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="st0">&quot;two identical&quot;</span> nil<span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>That is, the destructuring is done, but then the two variables are checked to make sure that they are equal, and the list is checked to make sure it is only two elements long.</p>
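<p>Written out by hand as an ordinary function, the expansion above looks roughly like this. It is a sketch: readable names stand in for the generated symbols, and <code>two-identical</code> is just an illustrative name.</p>

```clojure
(defn two-identical [x]
  ;; b and more play the roles of g0001 and g0002 in the expansion
  (let [[a b & more] x]
    (if (and (not (nil? a))  ; first element must be present
             (= b a)         ; repeated variable: both elements equal
             (nil? more))    ; nothing may follow the second element
      "two identical"
      nil)))

(two-identical [1 1])   ; matches
(two-identical [1 2])   ; elements differ
(two-identical [1 1 1]) ; too long
```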

<h3>Source</h3>

<a href="http://github.com/brool/clojure-misc/tree">It&#8217;s all on Github</a>.]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/pattern-matching-in-clojure/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>aset is Faster Than aset-int</title>
		<link>http://www.brool.com/index.php/aset-is-faster-than-aset-int</link>
		<comments>http://www.brool.com/index.php/aset-is-faster-than-aset-int#comments</comments>
		<pubDate>Sat, 08 Aug 2009 21:51:13 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[array]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=345</guid>
		<description><![CDATA[Clojure isn&#8217;t the fastest functional language &#8212; that title seems to go to Haskell these days, at least for the stuff that I do &#8212; but it nonetheless is usually fast enough. It&#8217;s a dynamic language, so it is perhaps cursed to always be somewhat slower, but nonetheless for the things that I do, it seems [...]]]></description>
			<content:encoded><![CDATA[<p>Clojure isn&#8217;t the fastest functional language &#8212; that title seems to go to Haskell these days, at least for the stuff that I do &#8212; but it nonetheless is usually fast enough.  It&#8217;s a dynamic language, so it is perhaps cursed to always be somewhat slower, but nonetheless for the things that I do, it seems to be about 2-4x slower than OCaml/Haskell and substantially faster than Python.</p>

<p>Nonetheless, I ran into a situation where it seemed to be 100x slower than Java because of an erroneous assumption on my part. After profiling and stripping out everything extraneous, it finally came down to array access. Ignoring the uselessness of this snippet for a moment, the issue is how to translate something like the following code:</p>

[code language="java"]
int v[] = new int[1000];
java.util.Arrays.fill(v, 0);
for (int i = 0; i < 100000; i++)
    for (int ix = 0; ix < 1000; ix++) {
        v[ix]++;
        v[ix]--;
    }
[/code]

<p>... which runs in about 350ms on my MacBook.  My pass at the Clojure equivalent, with the type hinting:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">;; this is 100x slower than the equivalent in Java</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> useless<span class="sy0">-</span>array<span class="sy0">-</span>manipulation <span class="br0">&#91;</span><span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span>v <span class="br0">&#40;</span>int<span class="sy0">-</span>array <span class="nu0">1000</span><span class="br0">&#41;</span><span class="br0">&#93;</span>
    <span class="br0">&#40;</span>java<span class="sy0">.</span>util<span class="sy0">.</span>Arrays<span class="sy0">/</span>fill v <span class="nu0">0</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span><span class="kw1">dotimes</span> <span class="br0">&#91;</span>_ <span class="br0">&#40;</span>int <span class="nu0">100000</span><span class="br0">&#41;</span><span class="br0">&#93;</span>
      <span class="br0">&#40;</span><span class="kw1">dotimes</span> <span class="br0">&#91;</span>ix <span class="br0">&#40;</span>int <span class="nu0">1000</span><span class="br0">&#41;</span><span class="br0">&#93;</span>
        <span class="br0">&#40;</span>aset<span class="sy0">-</span>int v ix <span class="br0">&#40;</span>unchecked<span class="sy0">-</span>add <span class="br0">&#40;</span>int <span class="nu0">1</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="kw1">aget</span> v ix<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
        <span class="br0">&#40;</span>aset<span class="sy0">-</span>int v ix <span class="br0">&#40;</span>unchecked<span class="sy0">-</span>subtract <span class="br0">&#40;</span><span class="kw1">aget</span> v ix<span class="br0">&#41;</span> <span class="br0">&#40;</span>int <span class="nu0">1</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>... which takes about 40692ms -- more than 100x slower.  What the hell?  After many hours of Googling, I finally discovered from <a href="http://www.fatvat.co.uk/2009/01/ray-tracing-in-clojure-part-ii.html">the comments on this blog post</a> that aset-int is slower than aset, which I don't recall seeing anywhere else. Does it really make a difference?</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">;; much faster</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> useless<span class="sy0">-</span>array<span class="sy0">-</span>manipulation <span class="br0">&#91;</span><span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span>v <span class="br0">&#40;</span>int<span class="sy0">-</span>array <span class="nu0">1000</span><span class="br0">&#41;</span><span class="br0">&#93;</span>
    <span class="br0">&#40;</span>java<span class="sy0">.</span>util<span class="sy0">.</span>Arrays<span class="sy0">/</span>fill v <span class="nu0">0</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span><span class="kw1">dotimes</span> <span class="br0">&#91;</span>_ <span class="br0">&#40;</span>int <span class="nu0">100000</span><span class="br0">&#41;</span><span class="br0">&#93;</span>
      <span class="br0">&#40;</span><span class="kw1">dotimes</span> <span class="br0">&#91;</span>ix <span class="br0">&#40;</span>int <span class="nu0">1000</span><span class="br0">&#41;</span><span class="br0">&#93;</span>
        <span class="br0">&#40;</span><span class="kw1">aset</span> v ix <span class="br0">&#40;</span>unchecked<span class="sy0">-</span>add <span class="br0">&#40;</span><span class="kw1">aget</span> v ix<span class="br0">&#41;</span> <span class="br0">&#40;</span>int <span class="nu0">1</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
        <span class="br0">&#40;</span><span class="kw1">aset</span> v ix <span class="br0">&#40;</span>unchecked<span class="sy0">-</span>subtract <span class="br0">&#40;</span><span class="kw1">aget</span> v ix<span class="br0">&#41;</span> <span class="br0">&#40;</span>int <span class="nu0">1</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>... runs in about 1000ms, or about 3x slower than the Java variant, and is definitely something that I can live with. I had been assuming that aset-int was of course faster because it was more specific, but in fact (aset v ix (int val)) is <i>much faster</i> than (aset-int v ix val).</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/aset-is-faster-than-aset-int/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tokyo Cabinet API for Clojure</title>
		<link>http://www.brool.com/index.php/tokyo-cabinet-api-for-clojure</link>
		<comments>http://www.brool.com/index.php/tokyo-cabinet-api-for-clojure#comments</comments>
		<pubDate>Fri, 07 Aug 2009 01:51:56 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[kvstore]]></category>
		<category><![CDATA[nonsql]]></category>
		<category><![CDATA[tokyo cabinet]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=333</guid>
		<description><![CDATA[I&#8217;ve been playing with Tokyo Cabinet and Clojure for a bit, and while I will go on about both of them in another blog post (or not), I have to mention that Clojure is such a well-designed language that it&#8217;s a pleasure to play with. It has much of the same intrinsic power as [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been playing with Tokyo Cabinet and Clojure for a bit, and while I will go on about both of them in another blog post (or not), I have to mention that Clojure is such a well-designed language that it&#8217;s a pleasure to play with.  It has much of the same intrinsic power as Haskell, but in a fashion that might be more approachable for people coming from Python or Ruby.</p>

<p>At any rate, I made a small, thin layer around the Tokyo Cabinet API, and <a href="http://github.com/brool/tokyo-cabinet/tree/master">put it on Github</a>.  Another thin wrapper can be found <a href="http://justin.harmonize.fm/index.php/tag/tokyo-cabinet/">at this blog</a>.</p>

<p>Copy of the README is below (the ultimate in lazy!).</p>

<h3>Introduction</h3>

<p>This is a simple interface to the Tokyo Cabinet libraries.  Tokyo Cabinet is a very cool, very high performing key-value store.  This library supports table mode, which essentially means that arbitrary hashmaps can be stored in the cabinet.</p>

<p>Note that this is appropriate for local storage only &#8212; if you&#8217;re looking to share a Tokyo Cabinet across multiple computers, you actually want Tokyo Tyrant.</p>

<h3>Basic Usage</h3>

<p>The with-cabinet call creates/opens a cabinet and allows the use of the various access routines within the scope of the call.  For example, here&#8217;s how to create a cabinet with three entries.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">ns</span> user <span class="br0">&#40;</span>:<span class="me1">use</span> tokyo<span class="sy0">-</span>cabinet<span class="br0">&#41;</span><span class="br0">&#41;</span>  <span class="co1">;; bring into our namespace</span>
&nbsp;
<span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet <span class="br0">&#123;</span> :<span class="me1">filename</span> <span class="st0">&quot;test.tokyo&quot;</span> :<span class="me1">mode</span> <span class="br0">&#40;</span><span class="sy0">+</span> OWRITER OCREAT<span class="br0">&#41;</span> <span class="br0">&#125;</span> 
    <span class="br0">&#40;</span><span class="kw1">doseq</span> <span class="br0">&#91;</span><span class="br0">&#91;</span>name val<span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="br0">&#91;</span><span class="st0">&quot;1&quot;</span> <span class="st0">&quot;one&quot;</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;2&quot;</span> <span class="st0">&quot;two&quot;</span><span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="st0">&quot;3&quot;</span> <span class="st0">&quot;three&quot;</span><span class="br0">&#93;</span><span class="br0">&#93;</span><span class="br0">&#93;</span>
        <span class="br0">&#40;</span>put<span class="sy0">-</span>value name val<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This creates a Tokyo Cabinet <i>hash table</i>, which allows one value per key.  Now query an entry:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet <span class="br0">&#123;</span> :<span class="me1">filename</span> <span class="st0">&quot;test.tokyo&quot;</span> :<span class="me1">mode</span> OREADER <span class="br0">&#125;</span> 
    <span class="br0">&#40;</span>get<span class="sy0">-</span>value <span class="st0">&quot;1&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
<span class="st0">&quot;one&quot;</span></pre></div></div></div></div></div></div></div>




<h3>Tables</h3>

<p>A <i>table</i> in Tokyo Cabinet can be used to store arbitrary hash maps.  For example:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">def</span> params <span class="br0">&#123;</span> :<span class="me1">filename</span> <span class="st0">&quot;test-table.tokyo&quot;</span> :<span class="me1">mode</span> <span class="br0">&#40;</span><span class="sy0">+</span> OWRITER OCREAT<span class="br0">&#41;</span> :<span class="me1">type</span> :<span class="me1">table</span> <span class="br0">&#125;</span> <span class="br0">&#41;</span>
<span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet params
    <span class="br0">&#40;</span>put<span class="sy0">-</span>value nil <span class="br0">&#123;</span> :<span class="me1">name</span> <span class="st0">&quot;John Doe&quot;</span> :<span class="me1">hobbies</span> <span class="st0">&quot;rowing fishing skiing&quot;</span> :<span class="me1">age</span> <span class="nu0">28</span> :<span class="me1">gender</span> <span class="st0">&quot;M&quot;</span> <span class="br0">&#125;</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>put<span class="sy0">-</span>value nil <span class="br0">&#123;</span> :<span class="me1">name</span> <span class="st0">&quot;Melissa Swift&quot;</span> :<span class="me1">hobbies</span> <span class="st0">&quot;soccer tennis books&quot;</span> :<span class="me1">age</span> <span class="nu0">33</span> :<span class="me1">gender</span> <span class="st0">&quot;F&quot;</span><span class="br0">&#125;</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>put<span class="sy0">-</span>value nil <span class="br0">&#123;</span> :<span class="me1">name</span> <span class="st0">&quot;Tom Swift&quot;</span> :<span class="me1">hobbies</span> <span class="st0">&quot;inventing exploring&quot;</span> :<span class="me1">gender</span> <span class="st0">&quot;M&quot;</span> <span class="br0">&#125;</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>put<span class="sy0">-</span>value nil <span class="br0">&#123;</span> :<span class="me1">name</span> <span class="st0">&quot;Harry Potter&quot;</span> :<span class="me1">hobbies</span> <span class="st0">&quot;magic quidditch flying&quot;</span> :<span class="me1">gender</span> <span class="st0">&quot;M&quot;</span> :<span class="me1">age</span> <span class="nu0">9</span> <span class="br0">&#125;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<h3>Queries</h3>

<p>You can run queries against a table, and use (hint) to see how each query is being performed:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">; show a hint and all rows matching</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> showrows <span class="br0">&#91;</span>query<span class="br0">&#93;</span>
    <span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span>showhint <span class="br0">&#40;</span>atom false<span class="br0">&#41;</span><span class="br0">&#93;</span> 
        <span class="br0">&#40;</span>with<span class="sy0">-</span>query<span class="sy0">-</span>results row query
            <span class="br0">&#40;</span><span class="kw1">when</span> <span class="br0">&#40;</span>compare<span class="sy0">-</span>and<span class="sy0">-</span><span class="kw1">set</span><span class="sy0">!</span> showhint false true<span class="br0">&#41;</span>
                  <span class="br0">&#40;</span>println <span class="st0">&quot;Query: &quot;</span> query<span class="br0">&#41;</span>
                  <span class="br0">&#40;</span>println <span class="st0">&quot;Hint: &quot;</span> <span class="br0">&#40;</span>hint<span class="br0">&#41;</span><span class="br0">&#41;</span>
                  <span class="br0">&#40;</span>println <span class="st0">&quot;Results:&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
            <span class="br0">&#40;</span>println row<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
        <span class="br0">&#40;</span>println<span class="br0">&#41;</span><span class="br0">&#41;</span>
&nbsp;
<span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet params
    <span class="br0">&#40;</span>showrows <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="me1">age</span> <span class="st0">&quot;&gt;=&quot;</span> <span class="nu0">30</span><span class="br0">&#93;</span><span class="br0">&#93;</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>showrows <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="me1">hobbies</span> <span class="st0">&quot;any-token&quot;</span> <span class="st0">&quot;soccer&quot;</span><span class="br0">&#93;</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This produces the following output:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">Query:  <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="me1">age</span> <span class="sy0">&gt;=</span> <span class="nu0">30</span><span class="br0">&#93;</span><span class="br0">&#93;</span>
Hint:  <span class="me1">scanning</span> the whole table
result <span class="kw1">set</span> size: <span class="nu0">1</span>
leaving the natural order
&nbsp;
Results:
<span class="br0">&#123;</span>:<span class="me1">gender</span> F, :<span class="me1">hobbies</span> soccer tennis books, :<span class="me1">name</span> Melissa Swift, :<span class="me1">age</span> <span class="nu0">33</span><span class="br0">&#125;</span>
&nbsp;
Query:  <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="me1">hobbies</span> any<span class="sy0">-</span>token soccer<span class="br0">&#93;</span><span class="br0">&#93;</span>
Hint:  <span class="me1">scanning</span> the whole table
result <span class="kw1">set</span> size: <span class="nu0">1</span>
leaving the natural order
&nbsp;
Results:
<span class="br0">&#123;</span>:<span class="me1">gender</span> F, :<span class="me1">hobbies</span> soccer tennis books, :<span class="me1">name</span> Melissa Swift, :<span class="me1">age</span> <span class="nu0">33</span><span class="br0">&#125;</span></pre></div></div></div></div></div></div></div>




<h3>Indexes</h3>

<p>Indexes, which help optimize particular queries, can be added with create-index and removed with delete-index.</p>

<p>The different index types:</p>

<ul>
<li>INDEX-DECIMAL</li>
<li>INDEX-LEXICAL</li>
<li>INDEX-TOKEN</li>
<li>INDEX-QGRAM</li>
</ul>

<p>Some optional specifiers can be added (ORed) in:</p>

<ul>
<li>INDEX-KEEP &#8212; keep the index if it already exists</li>
<li>INDEX-OPTIMIZE</li>
</ul>

<p>Running the queries again, with indexes:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">; indexes are persistent</span>
<span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet params
    <span class="br0">&#40;</span>create<span class="sy0">-</span>index :<span class="me1">hobbies</span> INDEX<span class="sy0">-</span>TOKEN<span class="br0">&#41;</span>
    <span class="br0">&#40;</span>create<span class="sy0">-</span>index :<span class="me1">age</span> INDEX<span class="sy0">-</span>DECIMAL<span class="br0">&#41;</span><span class="br0">&#41;</span>
&nbsp;
<span class="co1">; try the queries again with the indexes in place</span>
<span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet params
    <span class="br0">&#40;</span>showrows <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="me1">age</span> <span class="st0">&quot;&gt;=&quot;</span> <span class="nu0">30</span><span class="br0">&#93;</span><span class="br0">&#93;</span><span class="br0">&#41;</span>
    <span class="br0">&#40;</span>showrows <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="me1">hobbies</span> <span class="st0">&quot;any-token&quot;</span> <span class="st0">&quot;soccer&quot;</span><span class="br0">&#93;</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This time the hints show that the indexes are being used:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">Query:  <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="me1">age</span> <span class="sy0">&gt;=</span> <span class="nu0">30</span><span class="br0">&#93;</span><span class="br0">&#93;</span>
Hint:  <span class="me1">using</span> an index: <span class="st0">&quot;:age&quot;</span> asc <span class="br0">&#40;</span>NUMGT<span class="sy0">/</span>NUMGE<span class="br0">&#41;</span>
result <span class="kw1">set</span> size: <span class="nu0">1</span>
leaving the natural order
&nbsp;
Results:
<span class="br0">&#123;</span>:<span class="me1">gender</span> F, :<span class="me1">hobbies</span> soccer tennis books, :<span class="me1">name</span> Melissa Swift, :<span class="me1">age</span> <span class="nu0">33</span><span class="br0">&#125;</span>
&nbsp;
Query:  <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="me1">hobbies</span> any<span class="sy0">-</span>token soccer<span class="br0">&#93;</span><span class="br0">&#93;</span>
Hint:  <span class="me1">using</span> an index: <span class="st0">&quot;:hobbies&quot;</span> inverted <span class="br0">&#40;</span>STROR<span class="br0">&#41;</span>
token occurrence: <span class="st0">&quot;soccer&quot;</span> <span class="nu0">1</span>
result <span class="kw1">set</span> size: <span class="nu0">1</span>
leaving the natural order
&nbsp;
Results:
<span class="br0">&#123;</span>:<span class="me1">gender</span> F, :<span class="me1">hobbies</span> soccer tennis books, :<span class="me1">name</span> Melissa Swift, :<span class="me1">age</span> <span class="nu0">33</span><span class="br0">&#125;</span></pre></div></div></div></div></div></div></div>
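<p>To get a feel for why the token index helps, here is a rough sketch in plain Python of the general idea behind an inverted index.  It is only an illustration, not how Tokyo Cabinet implements it: the row keys, build_token_index, and any_token are made-up names, and tokens are assumed to be separated by whitespace.</p>

```python
from collections import defaultdict

# Rows keyed by a made-up primary key; the data mirrors the table above.
rows = {
    "1": {"name": "John Doe", "hobbies": "rowing fishing skiing"},
    "2": {"name": "Melissa Swift", "hobbies": "soccer tennis books"},
    "3": {"name": "Tom Swift", "hobbies": "inventing exploring"},
    "4": {"name": "Harry Potter", "hobbies": "magic quidditch flying"},
}


def build_token_index(rows, field):
    """Map each whitespace-separated token in `field` to the keys of rows containing it."""
    index = defaultdict(set)
    for key, row in rows.items():
        for token in row.get(field, "").split():
            index[token].add(key)
    return index


def any_token(index, tokens):
    """Return the keys of rows containing at least one of the given tokens."""
    matches = set()
    for token in tokens.split():
        matches |= index.get(token, set())
    return matches


hobby_index = build_token_index(rows, "hobbies")
print(sorted(any_token(hobby_index, "soccer")))        # ['2']
print(sorted(any_token(hobby_index, "soccer magic")))  # ['2', '4']
```

<p>With the index in place, a lookup only touches the rows listed under each token instead of scanning the whole table.</p>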




<h3>Optional Search Parameters</h3>

<p>You can further control what&#8217;s fetched by using a number of optional specifiers in the query:</p>

<ul>
<li>:limit nnn &#8212; limits the number of rows returned</li>
<li>:skip  nnn &#8212; skips the first nnn rows</li>
<li>:sort  fieldname &#8212; sorts by the given field</li>
<li>:order val &#8212; the specific ordering, one of SORT-NUM-ASC, SORT-NUM-DESC, SORT-TEXT-ASC, or SORT-TEXT-DESC</li>
</ul>

<p>For example:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet params <span class="br0">&#40;</span>with<span class="sy0">-</span>query<span class="sy0">-</span>results row <span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="br0">&#40;</span>println <span class="br0">&#40;</span>:<span class="me1">name</span> row<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>    
John Doe
Melissa Swift
Tom Swift
Harry Potter
&nbsp;
<span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet params <span class="br0">&#40;</span>with<span class="sy0">-</span>query<span class="sy0">-</span>results row <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="kw1">sort</span> :<span class="me1">name</span><span class="br0">&#93;</span><span class="br0">&#93;</span> <span class="br0">&#40;</span>println <span class="br0">&#40;</span>:<span class="me1">name</span> row<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
Harry Potter
John Doe
Melissa Swift
Tom Swift
&nbsp;
<span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet params <span class="br0">&#40;</span>with<span class="sy0">-</span>query<span class="sy0">-</span>results row <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="kw1">sort</span> :<span class="me1">name</span><span class="br0">&#93;</span> <span class="br0">&#91;</span>:<span class="me1">order</span> SORT<span class="sy0">-</span>TEXT<span class="sy0">-</span>DESC<span class="br0">&#93;</span><span class="br0">&#93;</span> <span class="br0">&#40;</span>println <span class="br0">&#40;</span>:<span class="me1">name</span> row<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
Tom Swift
Melissa Swift
John Doe
Harry Potter
&nbsp;
<span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet params <span class="br0">&#40;</span>with<span class="sy0">-</span>query<span class="sy0">-</span>results row <span class="br0">&#91;</span><span class="br0">&#91;</span>:<span class="kw1">sort</span> :<span class="me1">name</span><span class="br0">&#93;</span> <span class="br0">&#91;</span>:<span class="me1">order</span> SORT<span class="sy0">-</span>TEXT<span class="sy0">-</span>DESC<span class="br0">&#93;</span> <span class="br0">&#91;</span>:<span class="me1">limit</span> <span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#93;</span> <span class="br0">&#40;</span>println <span class="br0">&#40;</span>:<span class="me1">name</span> row<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
Tom Swift</pre></div></div></div></div></div></div></div>
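<p>As a mental model for how these options combine, here is a small plain-Python sketch.  It is not the library&#8217;s code: run_query and its keyword arguments are hypothetical, sorting is treated as text only, and it assumes the options are applied in the order sort, then skip, then limit.</p>

```python
people = [
    {"name": "John Doe", "age": 28},
    {"name": "Melissa Swift", "age": 33},
    {"name": "Tom Swift"},
    {"name": "Harry Potter", "age": 9},
]


def run_query(rows, sort=None, descending=False, skip=0, limit=None):
    """Apply the options in a fixed order: sort, then skip, then limit."""
    result = list(rows)
    if sort is not None:
        # Text ordering only; missing fields sort as the empty string.
        result.sort(key=lambda row: str(row.get(sort, "")), reverse=descending)
    result = result[skip:]
    if limit is not None:
        result = result[:limit]
    return result


names = [row["name"] for row in run_query(people, sort="name", descending=True, limit=1)]
print(names)  # ['Tom Swift']
```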




<h3>Lower Level</h3>

<p>Depending on your application, it may be inconvenient to wrap everything in with-cabinet, since each use opens and closes the cabinet.  You can instead use the lower-level open-cabinet and close-cabinet calls, along with the &#8220;with&#8221; statement.  This is also an easier way to use it at the command line.  For example:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">def</span> test<span class="sy0">-</span>database <span class="br0">&#40;</span>open<span class="sy0">-</span>cabinet <span class="br0">&#123;</span> :<span class="me1">filename</span> <span class="st0">&quot;test-open.tokyo&quot;</span> :<span class="me1">mode</span> <span class="br0">&#40;</span><span class="sy0">+</span> OWRITER OCREAT<span class="br0">&#41;</span> <span class="br0">&#125;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
<span class="br0">&#40;</span>with test<span class="sy0">-</span>database <span class="br0">&#40;</span>put<span class="sy0">-</span>value <span class="st0">&quot;1&quot;</span> <span class="st0">&quot;one&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
<span class="br0">&#40;</span>with test<span class="sy0">-</span>database <span class="br0">&#40;</span>get<span class="sy0">-</span>value <span class="st0">&quot;1&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
<span class="br0">&#40;</span>with test<span class="sy0">-</span>database <span class="br0">&#40;</span><span class="kw1">print</span> <span class="br0">&#40;</span>primary<span class="sy0">-</span><span class="kw1">keys</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
<span class="br0">&#40;</span>close<span class="sy0">-</span>cabinet test<span class="sy0">-</span>database<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<h3>Miscellaneous</h3>

<p>Use (primary-keys) to return a lazy list of primary keys.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>with<span class="sy0">-</span>cabinet <span class="br0">&#123;</span> :<span class="me1">filename</span> <span class="st0">&quot;test.tokyo&quot;</span> :<span class="me1">mode</span> <span class="br0">&#40;</span><span class="sy0">+</span> OWRITER OCREAT<span class="br0">&#41;</span> :<span class="me1">type</span> :<span class="me1">table</span> <span class="br0">&#125;</span>
    <span class="br0">&#40;</span><span class="kw1">print</span> <span class="br0">&#40;</span>primary<span class="sy0">-</span><span class="kw1">keys</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<h3>Links</h3>

<ul>
<li><a href="http://tokyocabinet.sourceforge.net/">Tokyo Cabinet</a></li>
<li><a href="http://tokyocabinet.sourceforge.net/javadoc/">Tokyo Cabinet / Java API</a></li>
</ul>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/tokyo-cabinet-api-for-clojure/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Posting To WordPress From Git</title>
		<link>http://www.brool.com/index.php/posting-to-wordpress-from-git</link>
		<comments>http://www.brool.com/index.php/posting-to-wordpress-from-git#comments</comments>
		<pubDate>Mon, 27 Jul 2009 20:57:20 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=302</guid>
		<description><![CDATA[I&#8217;ve found WordPress to be pretty decent, aside from the security updates every other week, but for me writing is a very spur of the moment thing; I prefer to be able to go into Emacs and just immediately type anything without having to log into my blog, create a new post, and then suffer [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve found WordPress to be pretty decent, aside from the security updates every other week, but for me writing is a very spur of the moment thing;  I prefer to be able to go into Emacs and just immediately type anything without having to log into my blog, create a new post, and then suffer through a web editor.  Basically, I want to do it in Emacs!  Now!  And maybe I&#8217;m offline!</p>

<p>So, I wrote a tiny little program to help facilitate using git and WordPress together through the XMLRPC API, and it&#8217;s helped me write&#8230; well, if not <i>more</i>, then at least <i>less painfully</i>.</p>

<h3>Setting Up</h3>
<p>Assuming that your blog is set up at http://www.yourblog.com (you can create an account at wordpress.com to test this out), all you&#8217;ll need to do is:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1"># make a directory for the blog
<span class="kw2">mkdir</span> blog
<span class="kw3">cd</span> blog
&nbsp;
# download everything (this can take a while)
python wp.py download <span class="re5">--user</span>=yourname <span class="re5">--password</span>=yourpass <span class="re5">--url</span>=http:<span class="sy0">//</span>www.yourblog.com<span class="sy0">/</span>xmlrpc.php
<span class="kw2">git init</span>
<span class="kw2">git add</span> .
<span class="kw2">git commit</span> <span class="re5">-m</span> <span class="st0">&quot;first version&quot;</span>
&nbsp;
# optional: store the settings so --user, --password, and --url are not needed every time
git wp config wp.url http://www.yourblog.com/xmlrpc.php
git wp config wp.user yourname
git wp config wp.password yourpass</pre></div></div></div></div></div></div></div>




<p>The files are downloaded in the appropriate YYYY/MM directories, with the draft directory being used for all of your unpublished drafts.</p>

<p>All the drafts are stored in plain text, but you&#8217;ll see some lines starting with periods &#8212; these are various WordPress variables that are associated with the file.  You can change them, as well;  for example, to change the title of the post, just change the line that begins with &#8220;.title&#8221;. </p>
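<p>If you want to script against these files, a minimal parsing sketch might look like the following.  The exact layout is an assumption here: it treats each variable as a &#8220;.key value&#8221; line at the top of the file, and parse_post is a made-up helper, not part of wp.py.</p>

```python
def parse_post(text):
    """Split a post file into (metadata, body).

    Assumes metadata lines look like ".key value" and appear before the body;
    the real format written by wp.py may differ.
    """
    metadata = {}
    lines = text.splitlines()
    body_start = 0
    for i, line in enumerate(lines):
        if not line.startswith("."):
            body_start = i
            break
        key, _, value = line[1:].partition(" ")
        metadata[key] = value.strip()
    else:
        # Every line was a metadata line, so there is no body.
        body_start = len(lines)
    return metadata, "\n".join(lines[body_start:]).strip()


sample = """.title My First Post
.wp_slug my-first-post
.post_status draft

Hello from Emacs!
"""

meta, body = parse_post(sample)
print(meta["title"])        # My First Post
print(meta["post_status"])  # draft
print(body)                 # Hello from Emacs!
```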

<h3>Seeing What&#8217;s Different</h3>
<p>You can use the status command to see differences between the local file system and your blog.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">python wp.py status</pre></div></div></div></div></div></div></div>




<h3>Updating From The Blog</h3>
<p>If you&#8217;ve made changes through the web interface and you&#8217;d like to bring them down, you don&#8217;t have to download everything again, but can instead just update.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">python wp.py update</pre></div></div></div></div></div></div></div>




<h3>Posting/Editing</h3>
<p>If you&#8217;d like to edit a post, just edit it, and then use wp post to push it back to the blog:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">python wp.py post changed-file</pre></div></div></div></div></div></div></div>




<p>To create a new post, just create a new file in the draft folder &#8212; I like to specify the .title and .wp_slug parameters, as well &#8212; and then post it.  You can also publish it by changing the .post_status line from draft to publish.</p>

<h3>Gotchas</h3>

<p>While this program <i>requires</i> git, it doesn&#8217;t automatically check anything in &#8212; so you&#8217;ll need to make sure you do git add / git commit or whatnot as necessary.</p>

<p>There are also some gotchas due to the fact that the filename can change on you.  There are cases where the filename that will be brought down is different than the one that you send up:</p>
<ul>
<li>You post a file without a .title or .wp_slug line</li>
<li>You post a file with a different file name than the slug that is generated (i.e., &#8220;my-first-draft&#8221; when the title is actually &#8220;my final draft&#8221;)</li>
<li>Something moves from &#8220;draft&#8221; to &#8220;publish&#8221;</li>
</ul>

<p>When you see a message of the form &#8220;changed: fn1 -> fn2&#8221;, it means that a rename has occurred, and you&#8217;ll need to do the git rm/git add or git mv by hand.</p>
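<p>If the renames pile up, something like the sketch below could turn those messages into git commands.  It only assumes the message format quoted above and file names without spaces; rename_commands and the sample paths are made up, and whether git mv or git rm plus git add is the right follow-up depends on whether the file has already been renamed on disk.</p>

```python
import re

# Matches messages of the form quoted above: "changed: fn1 -> fn2".
RENAME = re.compile(r"^changed:\s*(\S+)\s*->\s*(\S+)\s*$")


def rename_commands(output):
    """Return one ["git", "mv", old, new] command per rename message in `output`."""
    commands = []
    for line in output.splitlines():
        match = RENAME.match(line.strip())
        if match:
            commands.append(["git", "mv", match.group(1), match.group(2)])
    return commands


sample_output = "changed: draft/my-first-draft -> draft/my-final-draft\n"
for command in rename_commands(sample_output):
    print(" ".join(command))  # git mv draft/my-first-draft draft/my-final-draft
```

<p>Each command is returned as a list, so it can be reviewed first and then passed to subprocess.run if it looks right.</p>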

<h3>Source</h3>
<a href="http://github.com/brool/git-wordpress/tree/master">Source is available on github</a>.]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/posting-to-wordpress-from-git/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Hacks: Python Calling PHP</title>
		<link>http://www.brool.com/index.php/hacks-python-calling-php</link>
		<comments>http://www.brool.com/index.php/hacks-python-calling-php#comments</comments>
		<pubDate>Wed, 22 Jul 2009 00:57:26 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[bridge]]></category>
		<category><![CDATA[hacks]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=313</guid>
		<description><![CDATA[(This is almost too stupid to post, but on the off chance that someone actually needs something like this&#8230;) I needed to interface with a bunch of data that had PHP wrapper classes, and needed a quick way of being able to interface with PHP from Python. (At this point, you might find it hard [...]]]></description>
			<content:encoded><![CDATA[<p>(This is almost too stupid to post, but on the off chance that someone actually needs something like this&#8230;) </p>

<p>I needed to interface with a bunch of data that had PHP wrapper classes, and needed a quick way of being able to interface with PHP from Python.  (At this point, you might find it hard to believe that the PHP wrapper classes were worth trying to reuse, but the wrapper classes took care of partitioning and caching and everything, so it seemed silly to not reuse them.)  I considered using the PHP API and ctypes to make a Python-to-PHP bridge, but then decided that was <i>entirely</i> too much work, so I just hacked together a stupid class that invokes PHP with arbitrary code and returns the output in a number of ways.</p>

<h3>Basic usage</h3>

<p>Create the PHP class (giving it an optional prefix and postfix &#8212; prefix is the most useful, as it allows you to specify requires), and then invoke a code block with get_raw, get, or get_one.   Note that every &#8220;call&#8221; into PHP will create and destroy a PHP process.</p>

<p>Some examples:</p>

<h3>Reading raw output</h3>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="python"><pre class="de1">php <span class="sy0">=</span> PHP<span class="br0">&#40;</span><span class="st0">&quot;require '../code/private/common.php';&quot;</span><span class="br0">&#41;</span>
<span class="kw3">code</span> <span class="sy0">=</span> <span class="st0">&quot;&quot;&quot;for ($i = 1; $i &lt;= 10; $i++) { echo &quot;$i<span class="es0">\n</span>&quot;; }&quot;&quot;&quot;</span>
<span class="kw1">print</span> php.<span class="me1">get_raw</span><span class="br0">&#40;</span><span class="kw3">code</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This prints a string containing the numbers 1 through 10, one per line.</p>

<h3>Reading one big JSON value</h3>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="python"><pre class="de1">php <span class="sy0">=</span> PHP<span class="br0">&#40;</span><span class="st0">&quot;require '../code/private/common.php';&quot;</span><span class="br0">&#41;</span>
<span class="kw3">code</span> <span class="sy0">=</span> <span class="st0">&quot;&quot;&quot;                                                                                          
$a = array();                                                                                       
for ($i = 1; $i &lt;= 10; $i++) { $a[] = $i; }                                                         
echo json_encode($a);&quot;&quot;&quot;</span>
<span class="kw1">print</span> php.<span class="me1">get</span><span class="br0">&#40;</span><span class="kw3">code</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This code would return a Python list with the numbers 1..10.</p>

<h3>Reading many JSON values</h3>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="python"><pre class="de1">    php <span class="sy0">=</span> PHP<span class="br0">&#40;</span><span class="st0">&quot;require '../code/private/common.php';&quot;</span><span class="br0">&#41;</span>
    <span class="kw3">code</span> <span class="sy0">=</span> <span class="st0">&quot;&quot;&quot;                                                                                          
    for ($i = 1; $i &lt;= 10; $i++) {                                                                      
        echo json_encode(array($i =&gt; $i * $i)) . &quot;<span class="es0">\n</span>&quot;;                                                  
    }&quot;&quot;&quot;</span>
    <span class="kw1">for</span> row <span class="kw1">in</span> php.<span class="me1">get_one</span><span class="br0">&#40;</span><span class="kw3">code</span><span class="br0">&#41;</span>:
        <span class="kw1">print</span> row</pre></div></div></div></div></div></div></div>




<p>Given a PHP snippet that prints one JSON value per line, get_one iterates through them as Python values.</p>
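<p>For the curious, the three accessors could be sketched roughly as below.  This is not the actual php.py, just a minimal version under a few assumptions: the PHP command line binary is on the PATH as &#8220;php&#8221;, the snippet is passed with the -r option, and the executable parameter is my own addition.</p>

```python
import json
import subprocess


class PHP:
    """Run PHP snippets in a fresh `php` process and decode their output."""

    def __init__(self, prefix="", postfix="", executable="php"):
        self.prefix = prefix
        self.postfix = postfix
        self.executable = executable  # assumes the PHP CLI is on PATH

    def get_raw(self, code):
        """Return whatever the snippet writes to stdout, as a string."""
        source = "\n".join([self.prefix, code, self.postfix])
        # `php -r` runs code given on the command line without opening tags.
        result = subprocess.run(
            [self.executable, "-r", source],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    def get(self, code):
        """Decode the entire output as one JSON value."""
        return json.loads(self.get_raw(code))

    def get_one(self, code):
        """Yield one decoded JSON value per non-empty line of output."""
        for line in self.get_raw(code).splitlines():
            if line.strip():
                yield json.loads(line)
```

<p>Note that this version waits for the PHP process to finish before get_one yields anything; a streaming version would read the output line by line instead.</p>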

<p>Find it on <a href="http://github.com/brool/util/blob/master/php.py">Github</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/hacks-python-calling-php/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Recovering From A --hard Reset In Git</title>
		<link>http://www.brool.com/index.php/recovering-from-a-hard-reset-in-git</link>
		<comments>http://www.brool.com/index.php/recovering-from-a-hard-reset-in-git#comments</comments>
		<pubDate>Sun, 19 Apr 2009 06:37:09 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[hard]]></category>
		<category><![CDATA[recovering]]></category>
		<category><![CDATA[reset]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=303</guid>
		<description><![CDATA[I was switching between git repositories the other day, and managed to do a &#8220;git reset &#8211;hard HEAD^&#8221; in the wrong repository. Which wasn&#8217;t bad, since I had most of the files already open in Emacs&#8230; but then Emacs calmly told me that it was re-reading the files from disk. But, git had everything still [...]]]></description>
			<content:encoded><![CDATA[I was switching between git repositories the other day, and managed to do a &#8220;git reset --hard HEAD^&#8221; in the wrong repository.  Which wasn&#8217;t bad, since I had most of the files already open in Emacs&#8230; but then Emacs calmly told me that it was re-reading the files from disk.

But, git had everything still around &#8212; it turns out to be pretty easy to get it back.  The magic command turned out to be git reflog.


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">$ <span class="kw2">git reflog</span>
aba2b93... HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">0</span><span class="br0">&#125;</span>: reset <span class="re5">--hard</span> HEAD^
28a0c01... HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">1</span><span class="br0">&#125;</span>: commit: <span class="kw2">more</span> work on pre-receive
1c4a3af... HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">2</span><span class="br0">&#125;</span>: merge tmp: Fast forward
84d69cb... HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">3</span><span class="br0">&#125;</span>: checkout: moving to commit_hooks
1c4a3af... HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">4</span><span class="br0">&#125;</span>: commit: commit hooks
a489ebd... HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">5</span><span class="br0">&#125;</span>: checkout: moving to tmp
a489ebd... HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">6</span><span class="br0">&#125;</span>: checkout: moving to a489ebd</pre></div></div></div></div></div></div></div>




So I had lost everything in HEAD@{1}, but you can get it back by just checking out that particular commit.


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1">$ <span class="kw2">git checkout</span> HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">1</span><span class="br0">&#125;</span>
Note: moving to <span class="st0">&quot;28a0c01&quot;</span> <span class="kw2">which</span> isn<span class="st_h">'t a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b &lt;new_branch_name&gt;
HEAD is now at 28a0c01... commit: more work on pre-receive
&nbsp;
$ git checkout -b tmp
... do whatever you need to do to get tmp to the right state ...
&nbsp;
$ git checkout master
$ git merge tmp</span></pre></div></div></div></div></div></div></div>




Note that if I didn&#8217;t need to fiddle around with stuff, but had just wanted everything in the commit, I could simply have merged the entire commit:


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="bash"><pre class="de1"><span class="co4">$ </span><span class="kw2">git merge</span> HEAD<span class="sy0">@</span><span class="br0">&#123;</span><span class="nu0">1</span><span class="br0">&#125;</span></pre></div></div></div></div></div></div></div>




&#8230; and that would have brought everything back to the state I needed.]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/recovering-from-a-hard-reset-in-git/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Stupid Haskell Tricks</title>
		<link>http://www.brool.com/index.php/stupid-haskell-tricks</link>
		<comments>http://www.brool.com/index.php/stupid-haskell-tricks#comments</comments>
		<pubDate>Mon, 13 Apr 2009 10:25:30 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[oo]]></category>
		<category><![CDATA[stupid]]></category>
		<category><![CDATA[tricks]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=294</guid>
		<description><![CDATA[Let&#8217;s say that you really, really want some notion of object-oriented programming. So let&#8217;s make a class that represents a name, and some simple method calls on it: data S = S &#123; name :: String &#125; deriving &#40;Show&#41; firstname s = &#40;words &#40;name s&#41;&#41;!!0 lastname s = &#40;words &#40;name s&#41;&#41;!!1 But, dammit, you [...]]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s say that you really, <i>really</i> want some notion of object-oriented programming.  So let&#8217;s make a class that represents a name, and some simple method calls on it:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">data</span> S <span class="sy0">=</span> S <span class="br0">&#123;</span> name <span class="sy0">::</span> <span class="kw4">String</span> <span class="br0">&#125;</span> <span class="kw1">deriving</span> <span class="br0">&#40;</span><span class="kw4">Show</span><span class="br0">&#41;</span>
firstname s <span class="sy0">=</span> <span class="br0">&#40;</span><span class="kw3">words</span> <span class="br0">&#40;</span>name s<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">!!</span><span class="nu0">0</span>
lastname  s <span class="sy0">=</span> <span class="br0">&#40;</span><span class="kw3">words</span> <span class="br0">&#40;</span>name s<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">!!</span><span class="nu0">1</span></pre></div></div></div></div></div></div></div>



<p>But, dammit, you want to invoke it like you would in C++.  So define a function:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="kw1">let</span> <span class="br0">&#40;</span><span class="co2">--&gt;</span><span class="br0">&#41;</span> x f <span class="sy0">=</span> f x
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="kw1">let</span> test <span class="sy0">=</span> S <span class="st0">&quot;George Washington&quot;</span>
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> firstname
<span class="st0">&quot;George&quot;</span>
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> lastname
<span class="st0">&quot;Washington&quot;</span></pre></div></div></div></div></div></div></div>



<p>(It&#8217;s tempting to use `.`, but it conflicts with the Prelude&#8217;s function composition operator.  Also note that you could define it as &#8220;(--&gt;) = flip ($)&#8221;).  But what if the method takes more than one parameter?</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="kw1">let</span> flip<span class="sy0">_</span>concat t s <span class="sy0">=</span> intercalate t <span class="sy0">$</span> <span class="kw3">reverse</span><span class="sy0">.</span><span class="kw3">words</span> <span class="sy0">$</span> name s
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> flip<span class="sy0">_</span>concat <span class="st0">&quot;, &quot;</span>
<span class="st0">&quot;Washington, George&quot;</span>
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> flip<span class="sy0">_</span>concat<span class="br0">&#40;</span><span class="st0">&quot;, &quot;</span><span class="br0">&#41;</span>
<span class="st0">&quot;Washington, George&quot;</span></pre></div></div></div></div></div></div></div>
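
<p>For anyone who would rather keep this in a source file than retype it in GHCi, here is a minimal sketch of the point-free definition mentioned above.  The name flipConcat and the type signatures are my additions for illustration; the second call uses (&amp;) from Data.Function, which recent versions of base export with the same meaning as flip ($).</p>

<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1">import Data.Function ((&amp;))
import Data.List (intercalate)

data S = S { name :: String } deriving (Show)

-- Point-free spelling of the operator from the session above.
(--&gt;) :: a -&gt; (a -&gt; b) -&gt; b
(--&gt;) = flip ($)

-- flipConcat is flip_concat renamed, with an explicit signature (illustrative).
flipConcat :: String -&gt; S -&gt; String
flipConcat sep s = intercalate sep (reverse (words (name s)))

main :: IO ()
main = do
    let test = S "George Washington"
    putStrLn (test --&gt; flipConcat ", ")  -- prints Washington, George
    putStrLn (test &amp; flipConcat ", ")    -- same result via Data.Function</pre></div></div></div></div></div></div></div>

<p>Function application binds tighter than any operator, which is why the extra argument to flipConcat needs no parentheses.</p>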



<p>You might not like that anything can be applied to (--&gt;), though.  Wrapping the operator in a type class restricts it to the types that have an instance:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span><span class="nu0">2</span><span class="sy0">,</span><span class="nu0">3</span><span class="br0">&#93;</span> <span class="co2">--&gt;</span> <span class="kw3">length</span>
<span class="nu0">3</span>
&nbsp;
<span class="kw1">class</span> Deref a <span class="kw1">where</span>
    <span class="br0">&#40;</span><span class="co2">--&gt;</span><span class="br0">&#41;</span> <span class="sy0">::</span> a <span class="sy0">-&gt;</span> <span class="br0">&#40;</span>a <span class="sy0">-&gt;</span> b<span class="br0">&#41;</span> <span class="sy0">-&gt;</span> b
    x <span class="co2">--&gt;</span> f <span class="sy0">=</span> f x 
&nbsp;
<span class="kw1">instance</span> Deref S
&nbsp;
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span><span class="nu0">2</span><span class="sy0">,</span><span class="nu0">3</span><span class="br0">&#93;</span> <span class="co2">--&gt;</span> <span class="kw3">length</span>
&nbsp;
<span class="sy0">&lt;</span>interactive<span class="sy0">&gt;</span>:<span class="nu0">1</span>:<span class="nu0">0</span>:
    No <span class="kw1">instance</span> for <span class="br0">&#40;</span>Deref <span class="br0">&#91;</span>t<span class="br0">&#93;</span><span class="br0">&#41;</span>
      arising from a use <span class="kw1">of</span> `<span class="co2">--&gt;</span>' at <span class="sy0">&lt;</span>interactive<span class="sy0">&gt;</span>:<span class="nu0">1</span>:<span class="nu0">0</span><span class="sy0">-</span><span class="nu0">17</span>
    Possible fix: add an <span class="kw1">instance</span> declaration for <span class="br0">&#40;</span>Deref <span class="br0">&#91;</span>t<span class="br0">&#93;</span><span class="br0">&#41;</span>
    In the expression: <span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span> <span class="nu0">2</span><span class="sy0">,</span> <span class="nu0">3</span><span class="br0">&#93;</span> <span class="co2">--&gt;</span> <span class="kw3">length</span>
    In the definition <span class="kw1">of</span> `it': it <span class="sy0">=</span> <span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span> <span class="nu0">2</span><span class="sy0">,</span> <span class="nu0">3</span><span class="br0">&#93;</span> <span class="co2">--&gt;</span> <span class="kw3">length</span></pre></div></div></div></div></div></div></div>
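
<p>The listing above mixes GHCi input with declarations that have to live in a source file.  As a rough sketch of the file version (the list instance is my addition, following the &#8220;Possible fix&#8221; in the error message), a type opts in simply by declaring an empty instance:</p>

<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1">data S = S { name :: String } deriving (Show)

class Deref a where
    (--&gt;) :: a -&gt; (a -&gt; b) -&gt; b
    x --&gt; f = f x          -- default method, shared by every instance

instance Deref S           -- S opts in
instance Deref [t]         -- lists opt in too (illustrative addition)

firstname :: S -&gt; String
firstname s = head (words (name s))

main :: IO ()
main = do
    putStrLn (S "George Washington" --&gt; firstname)  -- prints George
    print ([1, 2, 3] --&gt; length)                    -- prints 3</pre></div></div></div></div></div></div></div>

<p>Leaving out the list instance brings back the &#8220;No instance&#8221; error shown above.</p>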



<p>You can also pass the arguments as a tuple to make it look even more like a typical call.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="kw1">let</span> pretty <span class="br0">&#40;</span>pre<span class="sy0">,</span> mid<span class="sy0">,</span> post<span class="br0">&#41;</span> s <span class="sy0">=</span> pre <span class="sy0">++</span> <span class="br0">&#40;</span>firstname s<span class="br0">&#41;</span> <span class="sy0">++</span> mid <span class="sy0">++</span> <span class="br0">&#40;</span>lastname s<span class="br0">&#41;</span> <span class="sy0">++</span> post
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> pretty <span class="br0">&#40;</span><span class="st0">&quot;&lt;&quot;</span><span class="sy0">,</span> <span class="st0">&quot;, &quot;</span><span class="sy0">,</span> <span class="st0">&quot;&gt;&quot;</span><span class="br0">&#41;</span>
<span class="st0">&quot;&lt;George, Washington&gt;&quot;</span></pre></div></div></div></div></div></div></div>



<p>&#8230; although that makes it harder to curry. </p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/stupid-haskell-tricks/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Haskell To WordPress (Snippet)</title>
		<link>http://www.brool.com/index.php/haskell-to-wordpress-snippet</link>
		<comments>http://www.brool.com/index.php/haskell-to-wordpress-snippet#comments</comments>
		<pubDate>Fri, 10 Apr 2009 08:25:00 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[haxr]]></category>
		<category><![CDATA[wordpress]]></category>
		<category><![CDATA[xmlrpc]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=291</guid>
		<description><![CDATA[A small snippet of code that demonstrates calling into a WordPress XML-RPC server with Haskell and HaxR. import qualified Data.Map as Map import Data.Maybe import Network.XmlRpc.Client import Network.XmlRpc.Internals &#160; server = &#34;http://yourserver.wordpress.com/xmlrpc.php&#34; &#160; -- extract multiple posts from the XML response extract :: Value -&#62; &#91;Map String Value&#93; extract xmlresp = let ValueArray rs = [...]]]></description>
			<content:encoded><![CDATA[<p>A small snippet of code that demonstrates calling into a WordPress XML-RPC server with Haskell and <a href="http://www.haskell.org/haxr/">HaxR</a>.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>Map <span class="kw1">as</span> Map
<span class="kw1">import</span> Data<span class="sy0">.</span><span class="kw4">Maybe</span>
<span class="kw1">import</span> Network<span class="sy0">.</span>XmlRpc<span class="sy0">.</span>Client
<span class="kw1">import</span> Network<span class="sy0">.</span>XmlRpc<span class="sy0">.</span>Internals
&nbsp;
server <span class="sy0">=</span> <span class="st0">&quot;http://yourserver.wordpress.com/xmlrpc.php&quot;</span>
&nbsp;
<span class="co1">-- extract multiple posts from the XML response</span>
extract <span class="sy0">::</span> Value <span class="sy0">-&gt;</span> <span class="br0">&#91;</span>Map<span class="sy0">.</span>Map <span class="kw4">String</span> Value<span class="br0">&#93;</span>
extract xmlresp <span class="sy0">=</span> 
    <span class="kw1">let</span> ValueArray rs <span class="sy0">=</span> xmlresp <span class="kw1">in</span>
    <span class="kw3">map</span> <span class="br0">&#40;</span>\v <span class="sy0">-&gt;</span> <span class="kw1">case</span> v <span class="kw1">of</span> 
                  ValueStruct vs <span class="sy0">-&gt;</span> Map<span class="sy0">.</span>fromList vs
                  <span class="sy0">_</span>              <span class="sy0">-&gt;</span> Map<span class="sy0">.</span>fromList <span class="br0">&#91;</span><span class="br0">&#93;</span><span class="br0">&#41;</span> rs
&nbsp;
getRecentPosts <span class="sy0">::</span> <span class="kw4">Int</span> <span class="sy0">-&gt;</span> <span class="br0">&#91;</span><span class="kw4">Char</span><span class="br0">&#93;</span> <span class="sy0">-&gt;</span> <span class="br0">&#91;</span><span class="kw4">Char</span><span class="br0">&#93;</span> <span class="sy0">-&gt;</span> <span class="kw4">Int</span> <span class="sy0">-&gt;</span> <span class="kw4">IO</span> Value
getRecentPosts <span class="sy0">=</span> remote server <span class="st0">&quot;metaWeblog.getRecentPosts&quot;</span>
&nbsp;
<span class="co1">-- print out the five most recent posts</span>
main <span class="sy0">=</span> <span class="kw1">do</span> result <span class="sy0">&lt;-</span> getRecentPosts <span class="nu0">1</span> <span class="st0">&quot;yourname&quot;</span> <span class="st0">&quot;yourpass&quot;</span> <span class="nu0">5</span>
          <span class="kw1">let</span> posts <span class="sy0">=</span> extract result
          <span class="kw3">mapM_</span> <span class="br0">&#40;</span>\p <span class="sy0">-&gt;</span> <span class="kw3">print</span> <span class="sy0">$</span> fromJust <span class="sy0">$</span> Map<span class="sy0">.</span><span class="kw3">lookup</span> <span class="st0">&quot;title&quot;</span> p<span class="br0">&#41;</span> posts</pre></div></div></div></div></div></div></div>
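
<p>The snippet will crash if the response is not an array or if a post has no title.  Below is a hedged sketch of a more forgiving variant.  So that it compiles without HaxR installed, it declares a local stand-in Value type with only the three constructors used here; with the real library you would import Network.XmlRpc.Internals instead.  The titles helper and the sample data are hypothetical.  Note that, unlike the original, non-struct array elements are skipped rather than turned into empty maps.</p>

<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1">import qualified Data.Map as Map

-- Local stand-in for the parts of HaxR's Value used below (an assumption
-- made so the sketch is self-contained).
data Value
    = ValueString String
    | ValueStruct [(String, Value)]
    | ValueArray [Value]
    deriving (Show)

-- Total version of extract: a non-array response yields no posts, and
-- array elements that are not structs are skipped.
extract :: Value -&gt; [Map.Map String Value]
extract (ValueArray rs) = [Map.fromList vs | ValueStruct vs &lt;- rs]
extract _               = []

-- Keep only the titles that are present and are strings (no fromJust).
titles :: Value -&gt; [String]
titles resp =
    [t | Just (ValueString t) &lt;- map (Map.lookup "title") (extract resp)]

main :: IO ()
main = mapM_ putStrLn (titles sample)  -- prints Stupid Haskell Tricks
  where
    sample = ValueArray
        [ ValueStruct [("title", ValueString "Stupid Haskell Tricks")]
        , ValueString "not a post"
        , ValueStruct [("link", ValueString "no title here")]
        ]</pre></div></div></div></div></div></div></div>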


]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/haskell-to-wordpress-snippet/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Haskell Performance: Array Creation</title>
		<link>http://www.brool.com/index.php/haskell-performance-array-creation</link>
		<comments>http://www.brool.com/index.php/haskell-performance-array-creation#comments</comments>
		<pubDate>Mon, 30 Mar 2009 03:00:20 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=274</guid>
		<description><![CDATA[Ran into another interesting performance problem while converting a small test program over to Haskell. Let&#8217;s say that you want to walk through every line of a text file, collate character frequencies, and return anything that maps to a particular frequency. For purposes of explanation we&#8217;ll do something really silly like look for lines with [...]]]></description>
			<content:encoded><![CDATA[<p>Ran into another interesting performance problem while converting a small test program over to Haskell. Let&#8217;s say that you want to walk through every line of a text file, collate character frequencies, and return anything that maps to a particular frequency.  For purposes of explanation we&#8217;ll do something really silly like look for lines with 10 capital &#8216;A&#8217;s.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="coMULTI">{-# LANGUAGE BangPatterns #-}</span>
<span class="kw1">import</span> System<span class="sy0">.</span><span class="kw4">IO</span>
<span class="kw1">import</span> System<span class="sy0">.</span>Environment
<span class="kw1">import</span> Data<span class="sy0">.</span>Word
<span class="kw1">import</span> Data<span class="sy0">.</span>Array<span class="sy0">.</span>Unboxed
<span class="kw1">import</span> Control<span class="sy0">.</span><span class="kw4">Monad</span><span class="sy0">.</span>ST
<span class="kw1">import</span> Data<span class="sy0">.</span>Array<span class="sy0">.</span>ST
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString <span class="kw1">as</span> B
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Internal <span class="kw1">as</span> BI
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Char8 <span class="kw1">as</span> C
&nbsp;
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> newArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="nu0">0</span> <span class="sy0">::</span> ST s <span class="br0">&#40;</span>STUArray s Word8 <span class="kw4">Int</span><span class="br0">&#41;</span>
                   <span class="co1">-- collate character counts here</span>
                   <span class="kw3">return</span> arr
counts <span class="sy0">!</span>line <span class="sy0">=</span> runSTUArray <span class="br0">&#40;</span>counts' line<span class="br0">&#41;</span>
&nbsp;
hit <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">let</span> freq <span class="sy0">=</span> counts line <span class="kw1">in</span> freq<span class="sy0">!</span><span class="nu0">65</span> <span class="sy0">==</span> <span class="nu0">10</span>
&nbsp;
main <span class="sy0">=</span>
    <span class="kw1">do</span> args <span class="sy0">&lt;-</span> getArgs
       f <span class="sy0">&lt;-</span> openFile <span class="st0">&quot;wordlist&quot;</span> ReadMode
       text <span class="sy0">&lt;-</span> B<span class="sy0">.</span>hGetContents f
       <span class="kw3">print</span> <span class="sy0">$</span> <span class="kw3">length</span> <span class="sy0">$</span> <span class="kw3">filter</span> hit <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">lines</span> text<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This is fast (0.07s on a 300K test file, compiled with -O2 on a Macbook Pro), but there&#8217;s no actual collation going on.  It suddenly becomes two orders of magnitude slower as soon as you start to do anything based on the line:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- version 1 : 9.5 seconds</span>
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> newArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="nu0">0</span> <span class="sy0">::</span> ST s <span class="br0">&#40;</span>STUArray s Word8 <span class="kw4">Int</span><span class="br0">&#41;</span>
                   readArray arr <span class="br0">&#40;</span>B<span class="sy0">.</span><span class="kw3">head</span> line<span class="br0">&#41;</span> <span class="sy0">&gt;&gt;=</span> \v <span class="sy0">-&gt;</span> writeArray arr <span class="br0">&#40;</span>B<span class="sy0">.</span><span class="kw3">head</span> line<span class="br0">&#41;</span> <span class="br0">&#40;</span>v<span class="sy0">+</span><span class="nu0">1</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr
counts <span class="sy0">!</span>line <span class="sy0">=</span> runSTUArray <span class="br0">&#40;</span>counts' line<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>&#8230; takes 9.54 seconds just to collate the first character of each line, which led me to believe that ByteStrings were slow, especially since a constant change to the array was fast (0.07 seconds):</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- version 2 : 0.07 seconds</span>
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> newArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="nu0">0</span> <span class="sy0">::</span> ST s <span class="br0">&#40;</span>STUArray s Word8 <span class="kw4">Int</span><span class="br0">&#41;</span>
                   readArray arr <span class="nu0">0</span> <span class="sy0">&gt;&gt;=</span> \v <span class="sy0">-&gt;</span> writeArray arr <span class="nu0">0</span> <span class="br0">&#40;</span>v<span class="sy0">+</span><span class="nu0">1</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr</pre></div></div></div></div></div></div></div>




<p>(Here I&#8217;ll elide the hours of me messing around with profiling and whatnot to try and figure out why B.head was so slow, or how to make B.foldl&#8217; update a state variable, or sprinkling strictness bangs all over the place, or the other tons of false trails that I went down.)</p>

<p>It turns out that the slow portion in all of this is the newArray, not the character collation itself, which shows up in the profile once the array creation is moved into its own routine:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- version 3</span>
<span class="co1">-- compile with: ghc -package bytestring stuarray.hs -prof -auto-all -O2 -o stuarray.out</span>
<span class="co1">-- run with: ./stuarray.out +RTS -p </span>
initial<span class="sy0">_</span>array <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> newArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="nu0">0</span> <span class="sy0">::</span> ST s <span class="br0">&#40;</span>STUArray s Word8 <span class="kw4">Int</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr
&nbsp;
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> initial<span class="sy0">_</span>array
                   <span class="kw1">let</span> ix <span class="sy0">=</span> B<span class="sy0">.</span><span class="kw3">head</span> line
                   readArray arr ix <span class="sy0">&gt;&gt;=</span> \v <span class="sy0">-&gt;</span> writeArray arr ix <span class="br0">&#40;</span>v<span class="sy0">+</span><span class="nu0">1</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr
counts <span class="sy0">!</span>line <span class="sy0">=</span> runSTUArray <span class="br0">&#40;</span>counts' line<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>



<p>&#8230; and the relevant part of the profile:</p>
<pre>
MAIN                     MAIN                     1           0   0.0    0.0   100.0  100.0
 main                    Main                   240           1   0.7    0.3   100.0  100.0
  hit                    Main                   242      335075   0.0    0.0    99.3   99.7
   counts                Main                   243      335075   0.0    0.2    99.3   99.7
    counts'              Main                   244      335075   0.0    0.0    99.3   99.5
     initial_array       Main                   245      335075  99.3   99.5    99.3   99.5
</pre>

<p>Which brings up two interesting points:</p>
<ul>
<li>The Haskell optimizer was so smart that it was able to figure out that the code in version 2 built a constant array&#8230; and so only called the newArray once.  (Confirmed with a profile).</li>
<li>New array creation seems to be so slow that it <i>dominates a benchmark that has file I/O.</i></li>
</ul>

<p>The Ocaml version (listed below) took about 0.720 seconds, compared to a Haskell time of about 8.5 seconds.  Reducing the array size to 128 in both cases reduced it to 0.112 seconds for Ocaml and 4.3 seconds for Haskell.</p>

<table border="1px dotted">
<tr><th>Array size</th><th>Ocaml (seconds)</th><th>Haskell (seconds)</th></tr>
<tr><td>128</td><td>0.112</td><td>4.3</td></tr>
<tr><td>255</td><td>0.184</td><td>8.5</td></tr>
<tr><td>256</td><td>0.720</td><td>8.5</td></tr> 
</table>

<p>Note the discontinuity in Ocaml between 255 and 256 elements, which I find interesting.  The nice people in #haskell suggested that I stop constructing/destructing the array and instead just null it out between each line, and it turned out that the best way to do that was just to thaw a starter array.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- version 4: 0.11 seconds</span>
initial<span class="sy0">_</span>array <span class="sy0">=</span> listArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="kw3">repeat</span> <span class="nu0">0</span><span class="br0">&#41;</span> <span class="sy0">::</span> UArray Word8 <span class="kw4">Int</span>
&nbsp;
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> thaw initial<span class="sy0">_</span>array
                   <span class="kw1">let</span> ix <span class="sy0">=</span> B<span class="sy0">.</span><span class="kw3">head</span> line
                   readArray arr ix <span class="sy0">&gt;&gt;=</span> \v <span class="sy0">-&gt;</span> writeArray arr ix <span class="br0">&#40;</span>v<span class="sy0">+</span><span class="nu0">1</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr
counts <span class="sy0">!</span>line <span class="sy0">=</span> runSTUArray <span class="br0">&#40;</span>counts' line<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This is still kind of strange to me (because an object is still getting constructed/destructed &#8212; thaw guarantees a copy) but you can&#8217;t argue with a performance increase.  (Interestingly, using Array.copy or Array.fill in Ocaml is <i>slower</i> than just using Array.make).  Final times?  0.11 seconds in Haskell vs. the best of <strike>0.720 second</strike> UPDATED: 0.12 seconds in Ocaml&#8230; and the Haskell version is just 50% slower than a C implementation with a gratuitous calloc instead of a memset. </p>

<p><b>UPDATE</b>:  An pointed out that zeroing the Ocaml array with a for loop is much faster than Array.copy or Array.fill, bringing it to about the same speed as Haskell.</p>
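
<p>The versions above only touch the first byte of each line, so that the timings isolate array creation.  For completeness, here is a sketch of how version 4 might be extended to collate every byte.  It is untimed, the in-memory sample stands in for the wordlist file, and the name initialArray and the type signatures are my additions.</p>

<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1">import Data.Array.ST (readArray, runSTUArray, thaw, writeArray)
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Word (Word8)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C

-- Immutable all-zero starter array, built once and thawed per line.
initialArray :: UArray Word8 Int
initialArray = listArray (0, 255) (repeat 0)

-- Frequency table for one line: thaw a copy, then bump one slot per byte.
counts :: B.ByteString -&gt; UArray Word8 Int
counts line = runSTUArray $ do
    arr &lt;- thaw initialArray
    mapM_ (\w -&gt; readArray arr w &gt;&gt;= writeArray arr w . (+ 1)) (B.unpack line)
    return arr

-- True when the line contains exactly ten capital 'A's (byte 65).
hit :: B.ByteString -&gt; Bool
hit line = counts line ! 65 == 10

main :: IO ()
main = do
    -- Hypothetical sample input in place of the wordlist file.
    let sample = C.unlines [C.replicate 10 'A', C.pack "abc", C.replicate 11 'A']
    print (length (filter hit (C.lines sample)))  -- prints 1</pre></div></div></div></div></div></div></div>

<p>Going through B.unpack builds an intermediate list, so a tuned version would probably walk the ByteString directly; the sketch favours readability.</p>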

<p>Lessons learned?</p>
<ul>
<li>Performance is a treacherous mistress</li>
<li>The Haskell optimizer is awesome but you have to be wary when trying to narrow down performance problems</li>
<li>#haskell is always full of useful suggestions</li>
<li>Array creation seems to be slow enough that alternatives should be explored.</li>
</ul>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="ocaml"><pre class="de1"><span class="kw1">open</span> <span class="kw2">Char</span>
<span class="kw1">open</span> <span class="kw2">String</span>
&nbsp;
<span class="kw1">let</span> all_lines fn filename <span class="sy0">=</span> 
    <span class="kw1">let</span> chan <span class="sy0">=</span> <span class="kw3">open_in</span> filename <span class="kw1">in</span>
        <span class="kw1">try</span>
            <span class="kw1">while</span> <span class="kw1">true</span> <span class="kw1">do</span> 
                <span class="kw1">let</span> line <span class="sy0">=</span> <span class="kw3">input_line</span> chan <span class="kw1">in</span> 
                    fn line
            <span class="kw1">done</span>
        <span class="kw1">with</span> End_of_file <span class="sy0">-&gt;</span>
            <span class="kw3">close_in</span> chan
&nbsp;
<span class="kw1">let</span> initial_array <span class="sy0">=</span> <span class="kw2">Array</span><span class="sy0">.</span>make <span class="nu0">256</span> <span class="nu0">0</span>
&nbsp;
<span class="kw1">let</span> count line <span class="sy0">=</span> 
    <span class="kw1">let</span> freq <span class="sy0">=</span> initial_array <span class="kw1">in</span>
    <span class="kw1">let</span> ix <span class="sy0">=</span> <span class="kw3">int_of_char</span> line<span class="sy0">.</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span> <span class="kw1">in</span> 
        <span class="kw1">for</span> i <span class="sy0">=</span> <span class="nu0">0</span> <span class="kw1">to</span> <span class="nu0">255</span> <span class="kw1">do</span> freq<span class="sy0">.</span><span class="br0">&#40;</span>i<span class="br0">&#41;</span> <span class="sy0">&lt;-</span> <span class="nu0">0</span> <span class="kw1">done</span><span class="sy0">;</span>
        freq<span class="sy0">.</span><span class="br0">&#40;</span>ix<span class="br0">&#41;</span> <span class="sy0">&lt;-</span> freq<span class="sy0">.</span><span class="br0">&#40;</span>ix<span class="br0">&#41;</span> <span class="sy0">+</span> <span class="nu0">1</span><span class="sy0">;</span>
        freq
&nbsp;
<span class="kw1">let</span> hit line <span class="sy0">=</span> 
    <span class="kw1">let</span> freq <span class="sy0">=</span> count line <span class="kw1">in</span>
        freq<span class="sy0">.</span><span class="br0">&#40;</span><span class="nu0">65</span><span class="br0">&#41;</span> <span class="sy0">=</span> <span class="nu0">10</span>
&nbsp;
<span class="kw1">let</span> _ <span class="sy0">=</span> 
    <span class="kw1">let</span> linecount <span class="sy0">=</span> <span class="kw1">ref</span> <span class="nu0">0</span> <span class="kw1">in</span> 
        all_lines <span class="br0">&#40;</span><span class="kw1">fun</span> line <span class="sy0">-&gt;</span> <span class="kw1">if</span> hit line <span class="kw1">then</span> linecount <span class="sy0">:=</span> <span class="sy0">!</span>linecount <span class="sy0">+</span> <span class="nu0">1</span> <span class="kw1">else</span> <span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="st0">&quot;wordlist&quot;</span><span class="sy0">;</span>
        <span class="kw3">print_int</span> <span class="sy0">!</span>linecount</pre></div></div></div></div></div></div></div>


]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/haskell-performance-array-creation/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Haskell Performance: Lowercase</title>
		<link>http://www.brool.com/index.php/haskell-performance-lowercase</link>
		<comments>http://www.brool.com/index.php/haskell-performance-lowercase#comments</comments>
		<pubDate>Sun, 29 Mar 2009 02:05:25 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=266</guid>
		<description><![CDATA[I was trying to track down some issues with some text processing programs that I was writing in Haskell, and ran into an interesting problem. I made one small change and my program ended up being 5 times slower, and I had to backtrack to try and find out what it was. So, given a [...]]]></description>
			<content:encoded><![CDATA[<p>I was trying to track down some issues with some text processing programs that I was writing in Haskell, and ran into an interesting problem.  I made one small change and my program ended up being 5 times slower, and I had to backtrack to find out what it was. So, given a simple Haskell program that counts how many lines of a wordlist match a given word:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> System<span class="sy0">.</span><span class="kw4">IO</span>
<span class="kw1">import</span> System<span class="sy0">.</span>Environment
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString <span class="kw1">as</span> B
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Internal <span class="kw1">as</span> BI
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Char8 <span class="kw1">as</span> C
&nbsp;
main <span class="sy0">=</span> <span class="kw1">do</span> args <span class="sy0">&lt;-</span> getArgs
          <span class="kw1">let</span> searchfor <span class="sy0">=</span> C<span class="sy0">.</span>pack <span class="sy0">$</span> <span class="kw3">head</span> args
          f <span class="sy0">&lt;-</span> openFile <span class="st0">&quot;wordlist&quot;</span> ReadMode
          text <span class="sy0">&lt;-</span> B<span class="sy0">.</span>hGetContents f 
          <span class="kw3">print</span> <span class="sy0">$</span> <span class="kw3">length</span> <span class="sy0">$</span> <span class="kw3">filter</span> <span class="br0">&#40;</span><span class="br0">&#40;</span><span class="sy0">==</span><span class="br0">&#41;</span> searchfor<span class="br0">&#41;</span> <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">lines</span> text<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Searching a smallish list of about 300K words takes 0.040 seconds on my computer, compared to 0.200 seconds for Python and 0.210 seconds for a naive Haskell implementation that does not use ByteStrings.  However, let&#8217;s just add lowercase to the equation:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> System<span class="sy0">.</span><span class="kw4">IO</span>
<span class="kw1">import</span> System<span class="sy0">.</span>Environment
<span class="kw1">import</span> Data<span class="sy0">.</span><span class="kw4">Char</span>
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString <span class="kw1">as</span> B
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Internal <span class="kw1">as</span> BI
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Char8 <span class="kw1">as</span> C
&nbsp;
main <span class="sy0">=</span> <span class="kw1">do</span> args <span class="sy0">&lt;-</span> getArgs
          <span class="kw1">let</span> searchfor <span class="sy0">=</span> C<span class="sy0">.</span>pack <span class="sy0">$</span> <span class="kw3">head</span> args
          f <span class="sy0">&lt;-</span> openFile <span class="st0">&quot;wordlist&quot;</span> ReadMode
          text <span class="sy0">&lt;-</span> B<span class="sy0">.</span>hGetContents f 
          <span class="kw3">print</span> <span class="sy0">$</span> <span class="kw3">length</span> <span class="sy0">$</span> <span class="kw3">filter</span> <span class="br0">&#40;</span>\x <span class="sy0">-&gt;</span> <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">map</span> toLower x<span class="br0">&#41;</span> <span class="sy0">==</span> searchfor<span class="br0">&#41;</span> <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">lines</span> text<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Suddenly, the ByteString version becomes about 34% <i>slower</i> than the naive version (0.337 seconds vs. 0.251 seconds) and is even slower than the Python version.  What the heck is going on here?  Trying an empty map (i.e., C.map id x) resulted in something fast, so I suspect that the lowercase function itself is slow.</p>
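<p>To make that comparison concrete, here is a rough sketch of the two predicates side by side.  The names matchesLower, matchesId and countWith are made up for illustration; only C.map, toLower and id come from the programs above, and nothing here has been timed.</p>

```haskell
import Data.Char (toLower)
import qualified Data.ByteString.Char8 as C

-- Hypothetical helper names; only C.map, toLower and id appear in the post.

-- Predicate from the slow version: lowercase each line with Data.Char.toLower.
matchesLower :: C.ByteString -> C.ByteString -> Bool
matchesLower searchfor x = C.map toLower x == searchfor

-- Control experiment: the same traversal, but the mapped function does nothing.
matchesId :: C.ByteString -> C.ByteString -> Bool
matchesId searchfor x = C.map id x == searchfor

-- Count the lines of the text that satisfy the given predicate.
countWith :: (C.ByteString -> C.ByteString -> Bool)
          -> C.ByteString -> C.ByteString -> Int
countWith p searchfor = length . filter (p searchfor) . C.lines
```

<p>Swapping matchesLower for matchesId in countWith changes only the function handed to C.map, so any difference in running time between the two should come from toLower rather than from the map itself.</p>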

<p>Unfortunately, there doesn&#8217;t seem to be a lowercase available in ByteString; at the moment it seems that you need to set up your own ctype table and use that.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> System<span class="sy0">.</span><span class="kw4">IO</span>
<span class="kw1">import</span> System<span class="sy0">.</span>Environment
<span class="kw1">import</span> Data<span class="sy0">.</span><span class="kw4">Char</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>Word
<span class="kw1">import</span> Data<span class="sy0">.</span>Array<span class="sy0">.</span>Unboxed
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString <span class="kw1">as</span> B
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Internal <span class="kw1">as</span> BI
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Char8 <span class="kw1">as</span> C
&nbsp;
ctype<span class="sy0">_</span>lower <span class="sy0">=</span> listArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="kw3">map</span> <span class="br0">&#40;</span>BI<span class="sy0">.</span>c2w <span class="sy0">.</span> toLower<span class="br0">&#41;</span> <span class="br0">&#91;</span>'\<span class="nu0">0</span>'<span class="sy0">..</span>'\<span class="nu0">255</span>'<span class="br0">&#93;</span><span class="br0">&#41;</span> <span class="sy0">::</span> UArray Word8 Word8
lowercase <span class="sy0">=</span> B<span class="sy0">.</span><span class="kw3">map</span> <span class="br0">&#40;</span>\x <span class="sy0">-&gt;</span> ctype<span class="sy0">_</span>lower<span class="sy0">!</span>x<span class="br0">&#41;</span>
&nbsp;
main <span class="sy0">=</span> <span class="kw1">do</span> args <span class="sy0">&lt;-</span> getArgs
          <span class="kw1">let</span> searchfor <span class="sy0">=</span> C<span class="sy0">.</span>pack <span class="sy0">$</span> <span class="kw3">head</span> args
          f <span class="sy0">&lt;-</span> openFile <span class="st0">&quot;wordlist&quot;</span> ReadMode
          text <span class="sy0">&lt;-</span> B<span class="sy0">.</span>hGetContents f 
          <span class="kw3">print</span> <span class="sy0">$</span> <span class="kw3">length</span> <span class="sy0">$</span> <span class="kw3">filter</span> <span class="br0">&#40;</span>\x <span class="sy0">-&gt;</span> <span class="br0">&#40;</span>lowercase x<span class="br0">&#41;</span> <span class="sy0">==</span> searchfor<span class="br0">&#41;</span> <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">lines</span> text<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>&#8230; which turns out to run really quickly at 0.070 seconds, about the same as a C program doing the same task.</p>
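<p>If only ASCII matters, the lookup table could probably be replaced with plain arithmetic on the bytes.  This is just a sketch, not something I have timed: toLowerAscii and lowercaseAscii are made-up names, and unlike the table built with toLower this version leaves Latin-1 uppercase letters (bytes above 127) untouched.</p>

```haskell
import Data.Word (Word8)
import qualified Data.ByteString as B

-- ASCII-only lowercase on raw bytes (hypothetical helper, not from the post).
-- 'A'..'Z' are bytes 65..90, and each lowercase letter is 32 higher.
-- Every other byte, including anything above 127, is returned unchanged.
toLowerAscii :: Word8 -> Word8
toLowerAscii w
  | w >= 65 && w <= 90 = w + 32
  | otherwise          = w

-- Lowercase a whole ByteString byte by byte, never going through Char.
lowercaseAscii :: B.ByteString -> B.ByteString
lowercaseAscii = B.map toLowerAscii
```

<p>It would drop into the program above in place of lowercase, working directly on Word8 just as the table lookup does.</p>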

<p><b>Update</b>:  See dons&#8217;s comments below: Data.Char operates on Unicode, which makes it slow.  I wonder if a ctype.h-type library for ByteString makes sense?</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/haskell-performance-lowercase/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

