<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>brool &#187; haskell</title>
	<atom:link href="http://www.brool.com/index.php/tag/haskell/feed" rel="self" type="application/rss+xml" />
	<link>http://www.brool.com</link>
	<description>brool \brool\ (n.) : a low roar; a deep murmur or humming</description>
	<lastBuildDate>Fri, 20 Jan 2012 07:58:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Pattern Matching In Clojure</title>
		<link>http://www.brool.com/index.php/pattern-matching-in-clojure</link>
		<comments>http://www.brool.com/index.php/pattern-matching-in-clojure#comments</comments>
		<pubDate>Wed, 12 Aug 2009 09:38:57 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[clojure]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[matching]]></category>
		<category><![CDATA[ocaml]]></category>
		<category><![CDATA[pattern]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=351</guid>
		<description><![CDATA[Updated: Note that this is available as a clojars module. Clojure code density seems to be pretty good. There are a fair number of convenient shortforms in the language; for example, associative datatypes all act as a function &#8212; so given a hash map you can reference it with (my-hashmap :key). The base language itself [...]]]></description>
			<content:encoded><![CDATA[<p><b>Updated</b>:  Note that this is available as a <a href="http://clojars.org/pattern-match">clojars module</a>.</p>

<p>Clojure code density seems to be pretty good. There are a fair number of convenient shorthand forms in the language; for example, associative data types all act as functions, so given a hash map you can look up a key with (my-hashmap :key).  The base language itself is probably about as expressive as Python (or a bit better), but you have the added advantage of being able to use macros as needed to really get the code density up.</p>

<p>Nonetheless, I really wanted something like OCaml&#8217;s / Haskell&#8217;s pattern matching; it makes some code wonderfully concise.</p>

<p>Accordingly, I hacked something up, based on Clojure&#8217;s built-in destructuring. Some examples:</p>

<p>Literal values match only the same value, _ matches any non-nil value, and nil matches only nil.  Additionally, :when clauses can be used for conditional checks.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">; simple recursive evaluator</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> arithmetic <span class="br0">&#91;</span>lst<span class="br0">&#93;</span>
  <span class="br0">&#40;</span>match lst
    v  :<span class="kw1">when</span> <span class="br0">&#40;</span>number? v<span class="br0">&#41;</span>  v
    <span class="br0">&#91;</span> _ <span class="st0">&quot;error&quot;</span> _<span class="br0">&#93;</span>     <span class="st0">&quot;error&quot;</span>
    <span class="br0">&#91;</span> _ _ <span class="st0">&quot;error&quot;</span><span class="br0">&#93;</span>     <span class="st0">&quot;error&quot;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;add&quot;</span> a b <span class="br0">&#93;</span>      <span class="br0">&#40;</span><span class="sy0">+</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic b<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;sub&quot;</span> a b <span class="br0">&#93;</span>      <span class="br0">&#40;</span><span class="sy0">-</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic b<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;mul&quot;</span> a b <span class="br0">&#93;</span>      <span class="br0">&#40;</span><span class="sy0">*</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic b<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;div&quot;</span> a b <span class="br0">&#93;</span>      <span class="br0">&#40;</span><span class="sy0">/</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic b<span class="br0">&#41;</span><span class="br0">&#41;</span>
    <span class="br0">&#91;</span> <span class="st0">&quot;squared&quot;</span> a <span class="br0">&#93;</span>    <span class="br0">&#40;</span>arithmetic <span class="br0">&#91;</span><span class="st0">&quot;mul&quot;</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span> <span class="br0">&#40;</span>arithmetic a<span class="br0">&#41;</span><span class="br0">&#93;</span><span class="br0">&#41;</span>
    _                  <span class="st0">&quot;error&quot;</span> <span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Both collections and single values can be used:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">;; return signum for a number</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> signum <span class="br0">&#91;</span>x<span class="br0">&#93;</span>
  <span class="br0">&#40;</span>match x 
     <span class="nu0">0</span> <span class="nu0">0</span>
     n :<span class="kw1">when</span> <span class="br0">&#40;</span><span class="sy0">&lt;</span> n <span class="nu0">0</span><span class="br0">&#41;</span> <span class="sy0">-</span><span class="nu0">1</span>
     _ <span class="nu0">1</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>The pattern matching is stricter than the typical destructure: whereas [ a b ] in a let will destructure a list of any number of elements, [ a b ] as a pattern will match only a list of exactly two elements.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>match x 
    <span class="br0">&#91;</span><span class="br0">&#93;</span>    <span class="st0">&quot;empty&quot;</span>
    <span class="br0">&#91;</span>_<span class="br0">&#93;</span>   <span class="st0">&quot;one element&quot;</span>
    <span class="br0">&#91;</span>a a<span class="br0">&#93;</span> <span class="st0">&quot;two identical elements&quot;</span>
    <span class="br0">&#91;</span>_ _<span class="br0">&#93;</span> <span class="st0">&quot;two elements&quot;</span>
    _     <span class="st0">&quot;three or more&quot;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>If the same variable occurs in multiple locations in the parameter list, it will be checked for equality.  The &#038; tail form can be used to specify the rest of the list.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="co1">;; count identical elements in the same location in two lists:</span>
<span class="br0">&#40;</span><span class="kw1">defn</span> <span class="kw1">count</span><span class="sy0">=</span> <span class="br0">&#91;</span> lst1 lst2 <span class="br0">&#93;</span>
  <span class="br0">&#40;</span><span class="kw1">loop</span> <span class="br0">&#91;</span> a lst1 b lst2 <span class="kw1">count</span> <span class="nu0">0</span> <span class="br0">&#93;</span>
    <span class="br0">&#40;</span>match <span class="br0">&#91;</span>a b<span class="br0">&#93;</span>
      <span class="br0">&#91;</span><span class="br0">&#91;</span>e <span class="sy0">&amp;</span> at<span class="br0">&#93;</span> <span class="br0">&#91;</span>e <span class="sy0">&amp;</span> bt<span class="br0">&#93;</span><span class="br0">&#93;</span>  <span class="br0">&#40;</span><span class="kw1">recur</span> at bt <span class="br0">&#40;</span><span class="kw1">inc</span> <span class="kw1">count</span><span class="br0">&#41;</span><span class="br0">&#41;</span>
      <span class="br0">&#91;</span><span class="br0">&#91;</span>_ <span class="sy0">&amp;</span> at<span class="br0">&#93;</span> <span class="br0">&#91;</span>_ <span class="sy0">&amp;</span> bt<span class="br0">&#93;</span><span class="br0">&#93;</span>  <span class="br0">&#40;</span><span class="kw1">recur</span> at bt <span class="kw1">count</span><span class="br0">&#41;</span>
      _                    <span class="kw1">count</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Note that this is slightly more flexible than Haskell / ML, in that a variable of the same name can appear in multiple places in the pattern.</p>
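<p>For comparison, Haskell rejects a pattern that binds the same variable twice, so the equality test has to move into a guard. A minimal sketch of the same count in Haskell follows; the name countEq is made up for this illustration and is not part of the Clojure library.</p>

```haskell
-- Count positions at which two lists hold equal elements.
-- Haskell patterns are linear (each variable may be bound once), so the
-- "same variable twice" test from the Clojure version becomes a guard.
-- countEq is a hypothetical name used only for this sketch.
countEq :: Eq a => [a] -> [a] -> Int
countEq (x:xs) (y:ys)
  | x == y    = 1 + countEq xs ys
  | otherwise = countEq xs ys
countEq _ _ = 0   -- either list exhausted: nothing left to compare

main :: IO ()
main = print (countEq [1, 2, 3, 4] [1, 0, 3 :: Int])   -- prints 2
```

<p>As in the Clojure version, counting stops as soon as either list runs out.</p>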

<h3>Defining</h3>

<p>You can use the defnp macro to define a function that is pattern matched; it defines a function that takes one argument and has an implicit match statement.  For example, the signum function can be written:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1">    <span class="br0">&#40;</span>defnp signum
       <span class="nu0">0</span> <span class="nu0">0</span>
       n :<span class="kw1">when</span> <span class="br0">&#40;</span><span class="sy0">&lt;</span> n <span class="nu0">0</span><span class="br0">&#41;</span> <span class="sy0">-</span><span class="nu0">1</span>
       _ <span class="nu0">1</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>(Thanks to <a href="http://infolace.blogspot.com/">Tom Faulhaber</a> for suggesting this)</p>

<h3>Gotchas</h3>

<p>Clojure&#8217;s destructuring will throw an exception if you try to destructure a non-collection value (such as a number) with a collection pattern.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span><span class="br0">&#91;</span>a b<span class="br0">&#93;</span> <span class="nu0">10</span><span class="br0">&#93;</span> a<span class="br0">&#41;</span>
java<span class="sy0">.</span>lang<span class="sy0">.</span>UnsupportedOperationException: <span class="me1">nth</span> <span class="kw1">not</span> supported on this type: <span class="me1">Integer</span> <span class="br0">&#40;</span>NO_SOURCE_FILE:<span class="nu0">0</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>&#8230; so be sure to check such cases early in your match statement, if they are possible.</p>

<h3>How It Works</h3>

<p>The pattern matcher uses the built-in Clojure destructuring as the main mechanism, but adorns it so that the pattern can be verified.  For example, the code:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span>match x <span class="br0">&#91;</span>a a<span class="br0">&#93;</span> <span class="st0">&quot;two identical&quot;</span> <span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>turns into essentially the following:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="clojure"><pre class="de1"><span class="br0">&#40;</span><span class="kw1">let</span> <span class="br0">&#91;</span> <span class="br0">&#91;</span> a g0001 <span class="sy0">&amp;</span> g0002 <span class="br0">&#93;</span> x <span class="br0">&#93;</span> 
     <span class="br0">&#40;</span><span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw1">and</span> <span class="br0">&#40;</span><span class="kw1">not</span> <span class="br0">&#40;</span><span class="kw1">nil?</span> a<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="sy0">=</span> g0001 a<span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="kw1">nil?</span> g0002<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="st0">&quot;two identical&quot;</span> nil<span class="br0">&#41;</span><span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>That is, the destructuring is done, but then the two variables are checked to make sure that they are equal, and the list is checked to make sure it is only two elements long.</p>
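<p>The same bind-then-verify idea can be written out by hand in Haskell, purely as an illustration of the checks involved. This is a sketch, not the macro&#8217;s actual output: twoIdentical is a made-up name, and Nothing stands in for the nil that the Clojure expansion returns on a failed match.</p>

```haskell
-- Hand-written analogue of the expansion: bind first, then verify.
-- twoIdentical is a hypothetical name used only for this sketch.
twoIdentical :: Eq a => [a] -> Maybe String
twoIdentical xs =
  case xs of
    (a : b : rest)
      | a == b, null rest -> Just "two identical"  -- equal, and no tail left
    _ -> Nothing                                   -- any other shape fails

main :: IO ()
main = do
  print (twoIdentical [7, 7 :: Int])      -- Just "two identical"
  print (twoIdentical [7, 8 :: Int])      -- Nothing
  print (twoIdentical [7, 7, 7 :: Int])   -- Nothing
```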

<h3>Source</h3>

<a href="http://github.com/brool/clojure-misc/tree">It&#8217;s all on GitHub</a>.]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/pattern-matching-in-clojure/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Stupid Haskell Tricks</title>
		<link>http://www.brool.com/index.php/stupid-haskell-tricks</link>
		<comments>http://www.brool.com/index.php/stupid-haskell-tricks#comments</comments>
		<pubDate>Mon, 13 Apr 2009 10:25:30 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[oo]]></category>
		<category><![CDATA[stupid]]></category>
		<category><![CDATA[tricks]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=294</guid>
		<description><![CDATA[Let&#8217;s say that you really, really want some notion of object-oriented programming. So let&#8217;s make a class that represents a name, and some simple method calls on it: data S = S &#123; name :: String &#125; deriving &#40;Show&#41; firstname s = &#40;words &#40;name s&#41;&#41;!!0 lastname s = &#40;words &#40;name s&#41;&#41;!!1 But, dammit, you [...]]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s say that you really, <i>really</i> want some notion of object-oriented programming.  So let&#8217;s make a class that represents a name, and some simple method calls on it:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">data</span> S <span class="sy0">=</span> S <span class="br0">&#123;</span> name <span class="sy0">::</span> <span class="kw4">String</span> <span class="br0">&#125;</span> <span class="kw1">deriving</span> <span class="br0">&#40;</span><span class="kw4">Show</span><span class="br0">&#41;</span>
firstname s <span class="sy0">=</span> <span class="br0">&#40;</span><span class="kw3">words</span> <span class="br0">&#40;</span>name s<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">!!</span><span class="nu0">0</span>
lastname  s <span class="sy0">=</span> <span class="br0">&#40;</span><span class="kw3">words</span> <span class="br0">&#40;</span>name s<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="sy0">!!</span><span class="nu0">1</span></pre></div></div></div></div></div></div></div>



<p>But, dammit, you want to invoke it like you would in C++.  So define a function:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="kw1">let</span> <span class="br0">&#40;</span><span class="co2">--&gt;</span><span class="br0">&#41;</span> x f <span class="sy0">=</span> f x
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="kw1">let</span> test <span class="sy0">=</span> S <span class="st0">&quot;George Washington&quot;</span>
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> firstname
<span class="st0">&quot;George&quot;</span>
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> lastname
<span class="st0">&quot;Washington&quot;</span></pre></div></div></div></div></div></div></div>



<p>(It&#8217;s tempting to use `.`, but it conflicts with the Prelude.  Also note that you could define it as `(--&gt;) = flip ($)`.)  But what if the function takes more than one parameter?</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="kw1">let</span> flip<span class="sy0">_</span>concat t s <span class="sy0">=</span> intercalate t <span class="sy0">$</span> <span class="kw3">reverse</span><span class="sy0">.</span><span class="kw3">words</span> <span class="sy0">$</span> name s
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> flip<span class="sy0">_</span>concat <span class="st0">&quot;, &quot;</span>
<span class="st0">&quot;Washington, George&quot;</span>
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> flip<span class="sy0">_</span>concat<span class="br0">&#40;</span><span class="st0">&quot;, &quot;</span><span class="br0">&#41;</span>
<span class="st0">&quot;Washington, George&quot;</span></pre></div></div></div></div></div></div></div>



<p>&#8230; although you might not like that (--&gt;) can be applied to a value of any type. Making it a class method restricts it to types that have an instance:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span><span class="nu0">2</span><span class="sy0">,</span><span class="nu0">3</span><span class="br0">&#93;</span> <span class="co2">--&gt;</span> <span class="kw3">length</span>
<span class="nu0">3</span>
&nbsp;
<span class="kw1">class</span> Deref a <span class="kw1">where</span>
    <span class="br0">&#40;</span><span class="co2">--&gt;</span><span class="br0">&#41;</span> <span class="sy0">::</span> a <span class="sy0">-&gt;</span> <span class="br0">&#40;</span>a <span class="sy0">-&gt;</span> b<span class="br0">&#41;</span> <span class="sy0">-&gt;</span> b
    x <span class="co2">--&gt;</span> f <span class="sy0">=</span> f x 
&nbsp;
<span class="kw1">instance</span> Deref S
&nbsp;
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span><span class="nu0">2</span><span class="sy0">,</span><span class="nu0">3</span><span class="br0">&#93;</span> <span class="co2">--&gt;</span> <span class="kw3">length</span>
&nbsp;
<span class="sy0">&lt;</span>interactive<span class="sy0">&gt;</span>:<span class="nu0">1</span>:<span class="nu0">0</span>:
    No <span class="kw1">instance</span> for <span class="br0">&#40;</span>Deref <span class="br0">&#91;</span>t<span class="br0">&#93;</span><span class="br0">&#41;</span>
      arising from a use <span class="kw1">of</span> `<span class="co2">--&gt;</span>' at <span class="sy0">&lt;</span>interactive<span class="sy0">&gt;</span>:<span class="nu0">1</span>:<span class="nu0">0</span><span class="sy0">-</span><span class="nu0">17</span>
    Possible fix: add an <span class="kw1">instance</span> declaration for <span class="br0">&#40;</span>Deref <span class="br0">&#91;</span>t<span class="br0">&#93;</span><span class="br0">&#41;</span>
    In the expression: <span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span> <span class="nu0">2</span><span class="sy0">,</span> <span class="nu0">3</span><span class="br0">&#93;</span> <span class="co2">--&gt;</span> <span class="kw3">length</span>
    In the definition <span class="kw1">of</span> `it': it <span class="sy0">=</span> <span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span> <span class="nu0">2</span><span class="sy0">,</span> <span class="nu0">3</span><span class="br0">&#93;</span> <span class="co2">--&gt;</span> <span class="kw3">length</span></pre></div></div></div></div></div></div></div>
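<p>The session above mixes GHCi input with source-file declarations. Collected into one file, the class-based version might look like the sketch below. The fixity declaration and the name flipConcat (standing in for flip_concat) are assumptions added for this illustration.</p>

```haskell
import Data.List (intercalate)

data S = S { name :: String } deriving (Show)

-- Only types with a Deref instance may appear on the left of (-->).
class Deref a where
  (-->) :: a -> (a -> b) -> b
  x --> f = f x        -- default method: plain reverse application

infixl 1 -->           -- assumed fixity; low, so the right side groups first

instance Deref S       -- empty body: the default method is used

firstname, lastname :: S -> String
firstname s = words (name s) !! 0
lastname  s = words (name s) !! 1

-- flipConcat is a hypothetical rename of flip_concat from the session above.
flipConcat :: String -> S -> String
flipConcat sep s = intercalate sep (reverse (words (name s)))

main :: IO ()
main = do
  let test = S "George Washington"
  putStrLn (test --> firstname)         -- George
  putStrLn (test --> lastname)          -- Washington
  putStrLn (test --> flipConcat ", ")   -- Washington, George
```

<p>With no instance declared for lists, an expression such as [1,2,3] --> length in this file is rejected at compile time, as in the error shown above.</p>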



<p>You can even use tuple passing to make it look even more like a typical call.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="sy0">*</span>Main<span class="sy0">&gt;</span> <span class="kw1">let</span> pretty <span class="br0">&#40;</span>pre<span class="sy0">,</span> mid<span class="sy0">,</span> post<span class="br0">&#41;</span> s <span class="sy0">=</span> pre <span class="sy0">++</span> <span class="br0">&#40;</span>firstname s<span class="br0">&#41;</span> <span class="sy0">++</span> mid <span class="sy0">++</span> <span class="br0">&#40;</span>lastname s<span class="br0">&#41;</span> <span class="sy0">++</span> post
<span class="sy0">*</span>Main<span class="sy0">&gt;</span> test <span class="co2">--&gt;</span> pretty <span class="br0">&#40;</span><span class="st0">&quot;&lt;&quot;</span><span class="sy0">,</span> <span class="st0">&quot;, &quot;</span><span class="sy0">,</span> <span class="st0">&quot;&gt;&quot;</span><span class="br0">&#41;</span>
<span class="st0">&quot;&lt;George, Washington&gt;&quot;</span></pre></div></div></div></div></div></div></div>



<p>&#8230; although that makes it harder to curry. </p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/stupid-haskell-tricks/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Haskell To WordPress (Snippet)</title>
		<link>http://www.brool.com/index.php/haskell-to-wordpress-snippet</link>
		<comments>http://www.brool.com/index.php/haskell-to-wordpress-snippet#comments</comments>
		<pubDate>Fri, 10 Apr 2009 08:25:00 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[haxr]]></category>
		<category><![CDATA[wordpress]]></category>
		<category><![CDATA[xmlrpc]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=291</guid>
		<description><![CDATA[A small snippet of code that demonstrates calling into a WordPress XML-RPC server with Haskell and HaxR. import qualified Data.Map as Map import Data.Maybe import Network.XmlRpc.Client import Network.XmlRpc.Internals &#160; server = &#34;http://yourserver.wordpress.com/xmlrpc.php&#34; &#160; -- extract multiple posts from the XML response extract :: Value -&#62; &#91;Map String Value&#93; extract xmlresp = let ValueArray rs = [...]]]></description>
			<content:encoded><![CDATA[<p>A small snippet of code that demonstrates calling into a WordPress XML-RPC server with Haskell and <a href="http://www.haskell.org/haxr/">HaxR</a>.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>Map <span class="kw1">as</span> Map
<span class="kw1">import</span> Data<span class="sy0">.</span><span class="kw4">Maybe</span>
<span class="kw1">import</span> Network<span class="sy0">.</span>XmlRpc<span class="sy0">.</span>Client
<span class="kw1">import</span> Network<span class="sy0">.</span>XmlRpc<span class="sy0">.</span>Internals
&nbsp;
server <span class="sy0">=</span> <span class="st0">&quot;http://yourserver.wordpress.com/xmlrpc.php&quot;</span>
&nbsp;
<span class="co1">-- extract multiple posts from the XML response</span>
extract <span class="sy0">::</span> Value <span class="sy0">-&gt;</span> <span class="br0">&#91;</span>Map<span class="sy0">.</span>Map <span class="kw4">String</span> Value<span class="br0">&#93;</span>
extract xmlresp <span class="sy0">=</span> 
    <span class="kw1">let</span> ValueArray rs <span class="sy0">=</span> xmlresp <span class="kw1">in</span>
    <span class="kw3">map</span> <span class="br0">&#40;</span>\v <span class="sy0">-&gt;</span> <span class="kw1">case</span> v <span class="kw1">of</span> 
                  ValueStruct vs <span class="sy0">-&gt;</span> Map<span class="sy0">.</span>fromList vs
                  <span class="sy0">_</span>              <span class="sy0">-&gt;</span> Map<span class="sy0">.</span>fromList <span class="br0">&#91;</span><span class="br0">&#93;</span><span class="br0">&#41;</span> rs
&nbsp;
getRecentPosts <span class="sy0">::</span> <span class="kw4">Int</span> <span class="sy0">-&gt;</span> <span class="br0">&#91;</span><span class="kw4">Char</span><span class="br0">&#93;</span> <span class="sy0">-&gt;</span> <span class="br0">&#91;</span><span class="kw4">Char</span><span class="br0">&#93;</span> <span class="sy0">-&gt;</span> <span class="kw4">Int</span> <span class="sy0">-&gt;</span> <span class="kw4">IO</span> Value
getRecentPosts <span class="sy0">=</span> remote server <span class="st0">&quot;metaWeblog.getRecentPosts&quot;</span>
&nbsp;
<span class="co1">-- print out the five most recent posts</span>
main <span class="sy0">=</span> <span class="kw1">do</span> result <span class="sy0">&lt;-</span> getRecentPosts <span class="nu0">1</span> <span class="st0">&quot;yourname&quot;</span> <span class="st0">&quot;yourpass&quot;</span> <span class="nu0">5</span>
          <span class="kw1">let</span> posts <span class="sy0">=</span> extract result
          <span class="kw3">mapM_</span> <span class="br0">&#40;</span>\p <span class="sy0">-&gt;</span> <span class="kw3">print</span> <span class="sy0">$</span> fromJust <span class="sy0">$</span> Map<span class="sy0">.</span><span class="kw3">lookup</span> <span class="st0">&quot;title&quot;</span> p<span class="br0">&#41;</span> posts</pre></div></div></div></div></div></div></div>


]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/haskell-to-wordpress-snippet/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Haskell Performance: Array Creation</title>
		<link>http://www.brool.com/index.php/haskell-performance-array-creation</link>
		<comments>http://www.brool.com/index.php/haskell-performance-array-creation#comments</comments>
		<pubDate>Mon, 30 Mar 2009 03:00:20 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=274</guid>
		<description><![CDATA[Ran into another interesting performance problem while converting a small test program over to Haskell. Let&#8217;s say that you want to walk through every line of a text file, collate character frequencies, and return anything that maps to a particular frequency. For purposes of explanation we&#8217;ll do something really silly like look for lines with [...]]]></description>
			<content:encoded><![CDATA[<p>Ran into another interesting performance problem while converting a small test program over to Haskell. Let&#8217;s say that you want to walk through every line of a text file, collate character frequencies, and return anything that maps to a particular frequency.  For purposes of explanation we&#8217;ll do something really silly like look for lines with 10 capital &#8216;A&#8217;s.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="coMULTI">{-# LANGUAGE BangPatterns #-}</span>
<span class="kw1">import</span> System<span class="sy0">.</span><span class="kw4">IO</span>
<span class="kw1">import</span> System<span class="sy0">.</span>Environment
<span class="kw1">import</span> Data<span class="sy0">.</span>Word
<span class="kw1">import</span> Data<span class="sy0">.</span>Array<span class="sy0">.</span>Unboxed
<span class="kw1">import</span> Control<span class="sy0">.</span><span class="kw4">Monad</span><span class="sy0">.</span>ST
<span class="kw1">import</span> Data<span class="sy0">.</span>Array<span class="sy0">.</span>ST
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString <span class="kw1">as</span> B
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Internal <span class="kw1">as</span> BI
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Char8 <span class="kw1">as</span> C
&nbsp;
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> newArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="nu0">0</span> <span class="sy0">::</span> ST s <span class="br0">&#40;</span>STUArray s Word8 <span class="kw4">Int</span><span class="br0">&#41;</span>
                   <span class="co1">-- collate character counts here</span>
                   <span class="kw3">return</span> arr
counts <span class="sy0">!</span>line <span class="sy0">=</span> runSTUArray <span class="br0">&#40;</span>counts' line<span class="br0">&#41;</span>
&nbsp;
hit <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">let</span> freq <span class="sy0">=</span> counts line <span class="kw1">in</span> freq<span class="sy0">!</span><span class="nu0">65</span> <span class="sy0">==</span> <span class="nu0">10</span>
&nbsp;
main <span class="sy0">=</span>
    <span class="kw1">do</span> args <span class="sy0">&lt;-</span> getArgs
       f <span class="sy0">&lt;-</span> openFile <span class="st0">&quot;wordlist&quot;</span> ReadMode
       text <span class="sy0">&lt;-</span> B<span class="sy0">.</span>hGetContents f
       <span class="kw3">print</span> <span class="sy0">$</span> <span class="kw3">length</span> <span class="sy0">$</span> <span class="kw3">filter</span> hit <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">lines</span> text<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This is fast (0.07s on a 300K test file, compiled with -O2 on a Macbook Pro), but there&#8217;s no actual collation going on.  It suddenly becomes two orders of magnitude slower as soon as you do anything that depends on the line:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- version 1 : 9.5 seconds</span>
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> newArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="nu0">0</span> <span class="sy0">::</span> ST s <span class="br0">&#40;</span>STUArray s Word8 <span class="kw4">Int</span><span class="br0">&#41;</span>
                   readArray arr <span class="br0">&#40;</span>B<span class="sy0">.</span><span class="kw3">head</span> line<span class="br0">&#41;</span> <span class="sy0">&gt;&gt;=</span> \v <span class="sy0">-&gt;</span> writeArray arr <span class="br0">&#40;</span>B<span class="sy0">.</span><span class="kw3">head</span> line<span class="br0">&#41;</span> <span class="br0">&#40;</span>v<span class="sy0">+</span><span class="nu0">1</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr
counts <span class="sy0">!</span>line <span class="sy0">=</span> runSTUArray <span class="br0">&#40;</span>counts' line<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>&#8230; takes 9.54 seconds just to collate the first character of each line, which led me to believe that ByteStrings were slow, especially since updating a constant index of the array was fast (0.07 seconds):</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- version 2 : 0.07 seconds</span>
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> newArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="nu0">0</span> <span class="sy0">::</span> ST s <span class="br0">&#40;</span>STUArray s Word8 <span class="kw4">Int</span><span class="br0">&#41;</span>
                   readArray arr <span class="nu0">0</span> <span class="sy0">&gt;&gt;=</span> \v <span class="sy0">-&gt;</span> writeArray arr <span class="nu0">0</span> <span class="br0">&#40;</span>v<span class="sy0">+</span><span class="nu0">1</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr</pre></div></div></div></div></div></div></div>




<p>(Here I&#8217;ll elide the hours of me messing around with profiling and whatnot to try and figure out why B.head was so slow, or how to make B.foldl&#8217; update a state variable, or sprinkling strictness bangs all over the place, or the other tons of false trails that I went down.)</p>

<p>It turns out that the slow part in all of this is the newArray, not the character collation itself.  This shows up in the profile once the array creation is moved into its own routine:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- version 3</span>
<span class="co1">-- compile with: ghc -package bytestring stuarray.hs -prof -auto-all -O2 -o stuarray.out</span>
<span class="co1">-- run with: ./stuarray.out +RTS -p </span>
initial<span class="sy0">_</span>array <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> newArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="nu0">0</span> <span class="sy0">::</span> ST s <span class="br0">&#40;</span>STUArray s Word8 <span class="kw4">Int</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr
&nbsp;
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> initial<span class="sy0">_</span>array
                   <span class="kw1">let</span> ix <span class="sy0">=</span> B<span class="sy0">.</span><span class="kw3">head</span> line
                   readArray arr ix <span class="sy0">&gt;&gt;=</span> \v <span class="sy0">-&gt;</span> writeArray arr ix <span class="br0">&#40;</span>v<span class="sy0">+</span><span class="nu0">1</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr
counts <span class="sy0">!</span>line <span class="sy0">=</span> runSTUArray <span class="br0">&#40;</span>counts' line<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>



<p>&#8230; and the relevant part of the profile:</p>
<pre>
MAIN                     MAIN                     1           0   0.0    0.0   100.0  100.0
 main                    Main                   240           1   0.7    0.3   100.0  100.0
  hit                    Main                   242      335075   0.0    0.0    99.3   99.7
   counts                Main                   243      335075   0.0    0.2    99.3   99.7
    counts'              Main                   244      335075   0.0    0.0    99.3   99.5
     initial_array       Main                   245      335075  99.3   99.5    99.3   99.5
</pre>

<p>Which brings up two interesting points:</p>
<ul>
<li>The Haskell optimizer was smart enough to figure out that the code in version 2 built a constant array&#8230; and so called newArray only once.  (Confirmed with a profile.)</li>
<li>New array creation seems to be so slow that it <i>dominates a benchmark that has file I/O.</i></li>
</ul>

<p>The Ocaml version (below) took about 0.720 seconds, compared to about 8.5 seconds for Haskell.  Reducing the array size to 128 brought that down to 0.112 seconds for Ocaml and 4.3 seconds for Haskell.</p>

<table border="1px dotted">
<tr><th>Array size (elements)</th><th>Ocaml (seconds)</th><th>Haskell (seconds)</th></tr>
<tr><td>128</td><td>0.112</td><td>4.3</td></tr>
<tr><td>255</td><td>0.184</td><td>8.5</td></tr>
<tr><td>256</td><td>0.720</td><td>8.5</td></tr> 
</table>

<p>Note the discontinuity in Ocaml between 255 and 256 elements, which I find interesting.  The nice people in #haskell suggested that I stop constructing/destructing the array and instead just null it out between each line, and it turned out that the best way to do that was just to thaw a starter array.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- version 4: 0.11 seconds</span>
initial<span class="sy0">_</span>array <span class="sy0">=</span> listArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="kw3">repeat</span> <span class="nu0">0</span><span class="br0">&#41;</span> <span class="sy0">::</span> UArray Word8 <span class="kw4">Int</span>
&nbsp;
counts' <span class="sy0">!</span>line <span class="sy0">=</span> <span class="kw1">do</span> arr <span class="sy0">&lt;-</span> thaw initial<span class="sy0">_</span>array
                   <span class="kw1">let</span> ix <span class="sy0">=</span> B<span class="sy0">.</span><span class="kw3">head</span> line
                   readArray arr ix <span class="sy0">&gt;&gt;=</span> \v <span class="sy0">-&gt;</span> writeArray arr ix <span class="br0">&#40;</span>v<span class="sy0">+</span><span class="nu0">1</span><span class="br0">&#41;</span>
                   <span class="kw3">return</span> arr
counts <span class="sy0">!</span>line <span class="sy0">=</span> runSTUArray <span class="br0">&#40;</span>counts' line<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>This is still kind of strange to me (because an object is still getting constructed/destructed &#8212; thaw guarantees a copy) but you can&#8217;t argue with a performance increase.  (Interestingly, using Array.copy or Array.fill in Ocaml is <i>slower</i> than just using Array.make).  Final times?  0.11 seconds in Haskell vs. the best of <strike>0.720 second</strike> UPDATED: 0.12 seconds in Ocaml&#8230; and the Haskell version is just 50% slower than a C implementation with a gratuitous calloc instead of a memset. </p>
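<p>For reference, here is a minimal, untimed sketch of what the full collation could look like on top of the thaw approach from version 4, counting every byte of the line rather than just the first one.  The names zeroes, countBytes, and histogram are made up for illustration and are not part of the benchmark above.</p>

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (ST)
import Data.Array.ST (STUArray, readArray, runSTUArray, thaw, writeArray)
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Word (Word8)
import qualified Data.ByteString as B

-- Immutable all-zero template (hypothetical name); thawed, i.e. copied, once per line.
zeroes :: UArray Word8 Int
zeroes = listArray (0, 255) (replicate 256 0)

-- Copy the template, then bump the counter of every byte in the line.
countBytes :: B.ByteString -> ST s (STUArray s Word8 Int)
countBytes line = do
    arr <- thaw zeroes
    forM_ (B.unpack line) $ \w -> do
        v <- readArray arr w
        writeArray arr w (v + 1)
    return arr

-- Freeze the mutable counters into an immutable frequency table.
histogram :: B.ByteString -> UArray Word8 Int
histogram line = runSTUArray (countBytes line)

-- Same test as hit in the listings above: does byte 65 ('A') occur exactly 10 times?
hit :: B.ByteString -> Bool
hit line = histogram line ! 65 == 10
```

<p>Giving countBytes an explicit signature replaces the inline type annotation on newArray, and B.unpack keeps the loop simple at the cost of building an intermediate list of bytes.</p>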

<p><b>UPDATE</b>:  An pointed out that zeroing the Ocaml array with a for loop is much faster than Array.copy or Array.fill, bringing it to about the same speed as Haskell.</p>
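<p>The same zero-in-place idea can be sketched in Haskell too: allocate one mutable array, clear it before each line, and keep everything inside a single ST computation.  This is only an illustration under my own assumptions (the names newCounts, step, and countHits are hypothetical, and the sketch is untimed), not the code that was benchmarked above.</p>

```haskell
import Control.Monad (foldM, forM_)
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, newArray, readArray, writeArray)
import Data.Word (Word8)
import qualified Data.ByteString as B

-- One mutable table of 256 counters, allocated a single time.
newCounts :: ST s (STUArray s Word8 Int)
newCounts = newArray (0, 255) 0

-- Clear the table, count the bytes of one line, and bump the running
-- total if byte 65 ('A') occurred exactly 10 times in that line.
step :: STUArray s Word8 Int -> Int -> B.ByteString -> ST s Int
step arr total line = do
    forM_ [minBound .. maxBound] $ \i -> writeArray arr i 0
    forM_ (B.unpack line) $ \w -> do
        v <- readArray arr w
        writeArray arr w (v + 1)
    a <- readArray arr 65
    return $! if a == 10 then total + 1 else total

-- Number of lines that pass the test, reusing the same array throughout.
countHits :: [B.ByteString] -> Int
countHits ls = runST (newCounts >>= \arr -> foldM (step arr) 0 ls)
```

<p>Usage would be along the lines of countHits (C.lines text) in main.  The trade-off is that the frequency table never escapes the ST block, so the per-line test has to live inside step.</p>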

<p>Lessons learned?</p>
<ul>
<li>Performance is a treacherous mistress</li>
<li>The Haskell optimizer is awesome but you have to be wary when trying to narrow down performance problems</li>
<li>#haskell is always full of useful suggestions</li>
<li>Array creation seems to be slow enough that alternatives should be explored.</li>
</ul>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="ocaml"><pre class="de1"><span class="kw1">open</span> <span class="kw2">Char</span>
<span class="kw1">open</span> <span class="kw2">String</span>
&nbsp;
<span class="kw1">let</span> all_lines fn filename <span class="sy0">=</span> 
    <span class="kw1">let</span> chan <span class="sy0">=</span> <span class="kw3">open_in</span> filename <span class="kw1">in</span>
        <span class="kw1">try</span>
            <span class="kw1">while</span> <span class="kw1">true</span> <span class="kw1">do</span> 
                <span class="kw1">let</span> line <span class="sy0">=</span> <span class="kw3">input_line</span> chan <span class="kw1">in</span> 
                    fn line
            <span class="kw1">done</span>
        <span class="kw1">with</span> End_of_file <span class="sy0">-&gt;</span>
            <span class="kw3">close_in</span> chan
&nbsp;
<span class="kw1">let</span> initial_array <span class="sy0">=</span> <span class="kw2">Array</span><span class="sy0">.</span>make <span class="nu0">256</span> <span class="nu0">0</span>
&nbsp;
<span class="kw1">let</span> count line <span class="sy0">=</span> 
    <span class="kw1">let</span> freq <span class="sy0">=</span> initial_array <span class="kw1">in</span>
    <span class="kw1">let</span> ix <span class="sy0">=</span> <span class="kw3">int_of_char</span> line<span class="sy0">.</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span> <span class="kw1">in</span> 
        <span class="kw1">for</span> i <span class="sy0">=</span> <span class="nu0">0</span> <span class="kw1">to</span> <span class="nu0">255</span> <span class="kw1">do</span> freq<span class="sy0">.</span><span class="br0">&#40;</span>i<span class="br0">&#41;</span> <span class="sy0">&lt;-</span> <span class="nu0">0</span> <span class="kw1">done</span><span class="sy0">;</span>
        freq<span class="sy0">.</span><span class="br0">&#40;</span>ix<span class="br0">&#41;</span> <span class="sy0">&lt;-</span> freq<span class="sy0">.</span><span class="br0">&#40;</span>ix<span class="br0">&#41;</span> <span class="sy0">+</span> <span class="nu0">1</span><span class="sy0">;</span>
        freq
&nbsp;
<span class="kw1">let</span> hit line <span class="sy0">=</span> 
    <span class="kw1">let</span> freq <span class="sy0">=</span> count line <span class="kw1">in</span>
        freq<span class="sy0">.</span><span class="br0">&#40;</span><span class="nu0">65</span><span class="br0">&#41;</span> <span class="sy0">=</span> <span class="nu0">10</span>
&nbsp;
<span class="kw1">let</span> _ <span class="sy0">=</span> 
    <span class="kw1">let</span> linecount <span class="sy0">=</span> <span class="kw1">ref</span> <span class="nu0">0</span> <span class="kw1">in</span> 
        all_lines <span class="br0">&#40;</span><span class="kw1">fun</span> line <span class="sy0">-&gt;</span> <span class="kw1">if</span> hit line <span class="kw1">then</span> linecount <span class="sy0">:=</span> <span class="sy0">!</span>linecount <span class="sy0">+</span> <span class="nu0">1</span> <span class="kw1">else</span> <span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="st0">&quot;wordlist&quot;</span><span class="sy0">;</span>
        <span class="kw3">print_int</span> <span class="sy0">!</span>linecount</pre></div></div></div></div></div></div></div>


]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/haskell-performance-array-creation/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Haskell Performance: Lowercase</title>
		<link>http://www.brool.com/index.php/haskell-performance-lowercase</link>
		<comments>http://www.brool.com/index.php/haskell-performance-lowercase#comments</comments>
		<pubDate>Sun, 29 Mar 2009 02:05:25 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=266</guid>
		<description><![CDATA[I was trying to track down some issues with some text processing programs that I was writing in Haskell, and ran into an interesting problem. I made one small change and my program ended up being 5 times slower, and I had to backtrack to try and find out what it was. So, given a [...]]]></description>
			<content:encoded><![CDATA[<p>I was trying to track down some issues with some text processing programs that I was writing in Haskell, and ran into an interesting problem.  I made one small change and my program ended up being 5 times slower, and I had to backtrack to try and find out what it was. So, given a simple Haskell program that sees if a word is in a wordlist:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> System<span class="sy0">.</span><span class="kw4">IO</span>
<span class="kw1">import</span> System<span class="sy0">.</span>Environment
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString <span class="kw1">as</span> B
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Internal <span class="kw1">as</span> BI
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Char8 <span class="kw1">as</span> C
&nbsp;
main <span class="sy0">=</span> <span class="kw1">do</span> args <span class="sy0">&lt;-</span> getArgs
          <span class="kw1">let</span> searchfor <span class="sy0">=</span> C<span class="sy0">.</span>pack <span class="sy0">$</span> <span class="kw3">head</span> args
          f <span class="sy0">&lt;-</span> openFile <span class="st0">&quot;wordlist&quot;</span> ReadMode
          text <span class="sy0">&lt;-</span> B<span class="sy0">.</span>hGetContents f 
          <span class="kw3">print</span> <span class="sy0">$</span> <span class="kw3">length</span> <span class="sy0">$</span> <span class="kw3">filter</span> <span class="br0">&#40;</span><span class="br0">&#40;</span><span class="sy0">==</span><span class="br0">&#41;</span> searchfor<span class="br0">&#41;</span> <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">lines</span> text<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>To search a smallish list of about 300K words takes 0.040 seconds on my computer, compared to 0.200 seconds for Python and 0.210 seconds for a naive Haskell implementation that is not using ByteStrings.  However, let&#8217;s just add lowercase to the equation:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> System<span class="sy0">.</span><span class="kw4">IO</span>
<span class="kw1">import</span> System<span class="sy0">.</span>Environment
<span class="kw1">import</span> Data<span class="sy0">.</span><span class="kw4">Char</span>
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString <span class="kw1">as</span> B
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Internal <span class="kw1">as</span> BI
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Char8 <span class="kw1">as</span> C
&nbsp;
main <span class="sy0">=</span> <span class="kw1">do</span> args <span class="sy0">&lt;-</span> getArgs
          <span class="kw1">let</span> searchfor <span class="sy0">=</span> C<span class="sy0">.</span>pack <span class="sy0">$</span> <span class="kw3">head</span> args
          f <span class="sy0">&lt;-</span> openFile <span class="st0">&quot;wordlist&quot;</span> ReadMode
          text <span class="sy0">&lt;-</span> B<span class="sy0">.</span>hGetContents f 
          <span class="kw3">print</span> <span class="sy0">$</span> <span class="kw3">length</span> <span class="sy0">$</span> <span class="kw3">filter</span> <span class="br0">&#40;</span>\x <span class="sy0">-&gt;</span> <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">map</span> toLower x<span class="br0">&#41;</span> <span class="sy0">==</span> searchfor<span class="br0">&#41;</span> <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">lines</span> text<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>Suddenly, the ByteString version becomes about 30% <i>slower</i> than the naive version (0.337 seconds vs. 0.251 seconds) and is even slower than the Python version.  What the heck is going on here?  Trying an empty map (i.e., C.map id x) resulted in something fast, so I suspect that the lowercase function itself is slow.</p>

<p>Unfortunately, there doesn&#8217;t seem to be a lowercase available in ByteString; at the moment it seems that you need to set up your own ctype table and use that.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> System<span class="sy0">.</span><span class="kw4">IO</span>
<span class="kw1">import</span> System<span class="sy0">.</span>Environment
<span class="kw1">import</span> Data<span class="sy0">.</span><span class="kw4">Char</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>Word
<span class="kw1">import</span> Data<span class="sy0">.</span>Array<span class="sy0">.</span>Unboxed
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString <span class="kw1">as</span> B
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Internal <span class="kw1">as</span> BI
<span class="kw1">import</span> <span class="kw1">qualified</span> Data<span class="sy0">.</span>ByteString<span class="sy0">.</span>Char8 <span class="kw1">as</span> C
&nbsp;
ctype<span class="sy0">_</span>lower <span class="sy0">=</span> listArray <span class="br0">&#40;</span><span class="nu0">0</span><span class="sy0">,</span><span class="nu0">255</span><span class="br0">&#41;</span> <span class="br0">&#40;</span><span class="kw3">map</span> <span class="br0">&#40;</span>BI<span class="sy0">.</span>c2w <span class="sy0">.</span> toLower<span class="br0">&#41;</span> <span class="br0">&#91;</span>'\<span class="nu0">0</span>'<span class="sy0">..</span>'\<span class="nu0">255</span>'<span class="br0">&#93;</span><span class="br0">&#41;</span> <span class="sy0">::</span> UArray Word8 Word8
lowercase <span class="sy0">=</span> B<span class="sy0">.</span><span class="kw3">map</span> <span class="br0">&#40;</span>\x <span class="sy0">-&gt;</span> ctype<span class="sy0">_</span>lower<span class="sy0">!</span>x<span class="br0">&#41;</span>
&nbsp;
main <span class="sy0">=</span> <span class="kw1">do</span> args <span class="sy0">&lt;-</span> getArgs
          <span class="kw1">let</span> searchfor <span class="sy0">=</span> C<span class="sy0">.</span>pack <span class="sy0">$</span> <span class="kw3">head</span> args
          f <span class="sy0">&lt;-</span> openFile <span class="st0">&quot;wordlist&quot;</span> ReadMode
          text <span class="sy0">&lt;-</span> B<span class="sy0">.</span>hGetContents f 
          <span class="kw3">print</span> <span class="sy0">$</span> <span class="kw3">length</span> <span class="sy0">$</span> <span class="kw3">filter</span> <span class="br0">&#40;</span>\x <span class="sy0">-&gt;</span> <span class="br0">&#40;</span>lowercase x<span class="br0">&#41;</span> <span class="sy0">==</span> searchfor<span class="br0">&#41;</span> <span class="br0">&#40;</span>C<span class="sy0">.</span><span class="kw3">lines</span> text<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<p>&#8230; which turns out to run really quickly at 0.070 seconds, about the same as a C program doing the same task.</p>
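<p>If only ASCII matters, a table-free variant is also possible by doing the arithmetic directly on the bytes.  This is a sketch under that assumption, with hypothetical names (asciiLower, lowercaseAscii) and no timings; unlike the table built from toLower above, it leaves Latin-1 capitals such as byte 192 untouched.</p>

```haskell
import Data.Word (Word8)
import qualified Data.ByteString as B

-- Lowercase a single byte if it is an ASCII capital (65..90, i.e. 'A'..'Z').
asciiLower :: Word8 -> Word8
asciiLower w
    | w >= 65 && w <= 90 = w + 32
    | otherwise          = w

-- Lowercase a whole ByteString without ever converting to Char.
lowercaseAscii :: B.ByteString -> B.ByteString
lowercaseAscii = B.map asciiLower
```

<p>It would drop in where lowercase is used in the filter, e.g. filter (\x -&gt; lowercaseAscii x == searchfor) (C.lines text).</p>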

<p><b>Update</b>:  See dons&#8217;s comments below: toLower from Data.Char operates on Unicode, which makes it slow.  I wonder if a ctype.h-type library for ByteString makes sense?</p>]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/haskell-performance-lowercase/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Using MissingPy</title>
		<link>http://www.brool.com/index.php/using-missingpy</link>
		<comments>http://www.brool.com/index.php/using-missingpy#comments</comments>
		<pubDate>Fri, 20 Mar 2009 02:15:00 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[imap]]></category>
		<category><![CDATA[missingpy]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=209</guid>
		<description><![CDATA[For comparison purposes, I was rewriting the silly little e-mail digest program in Haskell, using Python libraries for the IMAP interface (it&#8217;s available on Github). It&#8217;s hard to beat Python&#8217;s amazing collection of libraries. Cabal / hackage isn&#8217;t bad, but it doesn&#8217;t yet approach Python&#8217;s &#8220;batteries included&#8221; philosophy and ease of use. Anyway, I couldn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>For comparison purposes, I was rewriting the <a href="http://www.brool.com/index.php/creating-mail-digests-with-python-and-imap">silly little e-mail digest program</a> in Haskell, using Python libraries for the IMAP interface (it&#8217;s <a href="http://github.com/brool/digest/tree/master">available on GitHub</a>). It&#8217;s hard to beat Python&#8217;s amazing collection of libraries.  Cabal / Hackage isn&#8217;t bad, but it doesn&#8217;t yet approach Python&#8217;s &#8220;batteries included&#8221; philosophy and ease of use.  Anyway, I couldn&#8217;t find an IMAP library for Haskell, so I decided instead to try <a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/MissingPy">MissingPy</a> to interface with the Python IMAP library.  A quick overview/tutorial:</p>

<h3>Installing MissingPy</h3>
<p>I had an issue using Cabal to install MissingPy into my GHC 6.10 installation (Mac OS X).  I went into the directory that had been unpacked and modified Setup.hs:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="kw1">import</span> Distribution<span class="sy0">.</span>PackageDescription
<span class="kw1">import</span> Distribution<span class="sy0">.</span>PackageDescription<span class="sy0">.</span>Parse  <span class="co1">-- added this line</span>
<span class="kw1">import</span> Distribution<span class="sy0">.</span>Simple</pre></div></div></div></div></div></div></div>




<p>A cabal configure, build, and install of the package then installed everything.</p>

<h3>Basic Usage</h3>
<p>It&#8217;s easiest to play around with MissingPy in ghci.  Before you do anything you&#8217;ll need to initialize the interpreter with py_initialize and import any necessary modules with pyImport.  You can test by using pyRun_SimpleString:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"># :m Python<span class="sy0">.</span>Interpreter Python<span class="sy0">.</span>Objects Python<span class="sy0">.</span>Utils Python<span class="sy0">.</span>Exceptions
# py<span class="sy0">_</span>initialize
# pyRun<span class="sy0">_</span>SimpleString <span class="st0">&quot;print 'hello'&quot;</span>
hello</pre></div></div></div></div></div></div></div>




<p>The simplest useful way to execute Python code is to use pyRun_String.  For example:</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- add 1 and 2 in Python</span>
<span class="co1">-- can also use Py_file_input or Py_single_input for different handling of the context</span>
addem <span class="sy0">=</span> pyRun<span class="sy0">_</span>String <span class="st0">&quot;1+2&quot;</span> Py<span class="sy0">_</span>eval<span class="sy0">_</span>input <span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="sy0">&gt;&gt;=</span> fromPyObject <span class="sy0">::</span> <span class="kw4">IO</span> <span class="kw4">Integer</span></pre></div></div></div></div></div></div></div>




<p>There are also the callByName and pyObject_Call methods for more complicated scenarios.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1">callByName <span class="sy0">::</span> <span class="kw4">String</span>            <span class="co1">-- ^ Object\/function name</span>
           <span class="sy0">-&gt;</span> <span class="br0">&#91;</span>PyObject<span class="br0">&#93;</span>        <span class="co1">-- ^ List of non-keyword parameters</span>
           <span class="sy0">-&gt;</span> <span class="br0">&#91;</span><span class="br0">&#40;</span><span class="kw4">String</span><span class="sy0">,</span> PyObject<span class="br0">&#41;</span><span class="br0">&#93;</span> <span class="co1">-- ^ List of keyword parameters</span>
           <span class="sy0">-&gt;</span> <span class="kw4">IO</span> PyObject
&nbsp;
pyObject<span class="sy0">_</span>Call <span class="sy0">::</span> PyObject       <span class="co1">-- ^ Object to call</span>
              <span class="sy0">-&gt;</span> <span class="br0">&#91;</span>PyObject<span class="br0">&#93;</span>     <span class="co1">-- ^ List of non-keyword parameters (may be empty)</span>
              <span class="sy0">-&gt;</span> <span class="br0">&#91;</span><span class="br0">&#40;</span><span class="kw4">String</span><span class="sy0">,</span> PyObject<span class="br0">&#41;</span><span class="br0">&#93;</span> <span class="co1">-- ^ List of keyword parameters (may be empty)</span>
              <span class="sy0">-&gt;</span> <span class="kw4">IO</span> PyObject    <span class="co1">-- ^ Return value</span>
&nbsp;
<span class="co1">-- example 1: create an IMAP instance</span>
imap <span class="sy0">&lt;-</span> callByName <span class="st0">&quot;imaplib.IMAP4_SSL&quot;</span> <span class="br0">&#91;</span>imap<span class="sy0">_</span>hostname<span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="br0">&#93;</span>
&nbsp;
<span class="co1">-- call &quot;foo.bar(p1,p2)&quot;</span>
bar <span class="sy0">&lt;-</span> getattr foo <span class="st0">&quot;bar&quot;</span>
result <span class="sy0">&lt;-</span> pyObject<span class="sy0">_</span>Call bar <span class="br0">&#91;</span>p1<span class="sy0">,</span>p2<span class="br0">&#93;</span> <span class="br0">&#91;</span><span class="br0">&#93;</span></pre></div></div></div></div></div></div></div>




<h3>Manipulating Python Objects</h3>
<p>The fromPyObject and toPyObject functions handle marshaling of objects from Python to Haskell and from Haskell to Python, respectively.  There are also some handy functions for common operations on Python objects that are especially useful from the REPL: reprOf (like Python repr), strOf (like Python str), showPyObject (shows the type and repr), and dirPyObject (like Python dir()).</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"># g <span class="sy0">&lt;-</span> pyRun<span class="sy0">_</span>String <span class="st0">&quot;(1,2,'hello')&quot;</span> Py<span class="sy0">_</span>eval<span class="sy0">_</span>input <span class="br0">&#91;</span><span class="br0">&#93;</span>
# showPyObject g
<span class="st0">&quot;&lt;type 'tuple'&gt;: (1, 2, 'hello')&quot;</span>
# reprOf g
<span class="st0">&quot;(1, 2, 'hello')&quot;</span>
# strOf g
<span class="st0">&quot;(1, 2, 'hello')&quot;</span>
# dirPyObject g
<span class="br0">&#91;</span><span class="st0">&quot;__add__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__class__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__contains__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__delattr__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__doc__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__eq__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__ge__&quot;</span><span class="sy0">,</span>
<span class="st0">&quot;__getattribute__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__getitem__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__getnewargs__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__getslice__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__gt__&quot;</span><span class="sy0">,</span>
<span class="st0">&quot;__hash__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__init__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__iter__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__le__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__len__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__lt__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__mul__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__ne__&quot;</span><span class="sy0">,</span>
<span class="st0">&quot;__new__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__reduce__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__reduce_ex__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__repr__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__rmul__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__setattr__&quot;</span><span class="sy0">,</span><span class="st0">&quot;__str__&quot;</span><span class="br0">&#93;</span></pre></div></div></div></div></div></div></div>




<h3>Dealing with Data</h3>
<p>Tuples look like lists when they are returned.  The toPyObject call does not convert tuples, so you&#8217;ll need to convert a list with toPyObject and then turn the result into a tuple with pyList_AsTuple.  Passing &#8220;None,&#8221; &#8220;True,&#8221; or &#8220;False&#8221; is trickier; the only way I found to do it was to create an instance of the value and use it in subsequent calls.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"># a
<span class="br0">&#91;</span><span class="nu0">1</span><span class="sy0">,</span><span class="nu0">2</span><span class="br0">&#93;</span>
# b <span class="sy0">&lt;-</span> toPyObject a
# reprOf b
<span class="st0">&quot;[1L, 2L]&quot;</span>
# pyList<span class="sy0">_</span>AsTuple b <span class="sy0">&gt;&gt;=</span> reprOf
<span class="st0">&quot;(1L, 2L)&quot;</span>
&nbsp;
<span class="co1">-- get &quot;True,&quot; &quot;False,&quot; and &quot;None&quot;</span>
none <span class="sy0">&lt;-</span> pyRun<span class="sy0">_</span>String <span class="st0">&quot;None&quot;</span> Py<span class="sy0">_</span>eval<span class="sy0">_</span>input <span class="br0">&#91;</span><span class="br0">&#93;</span>
pyTrue <span class="sy0">&lt;-</span> pyRun<span class="sy0">_</span>String <span class="st0">&quot;True&quot;</span> Py<span class="sy0">_</span>eval<span class="sy0">_</span>input <span class="br0">&#91;</span><span class="br0">&#93;</span>
pyFalse <span class="sy0">&lt;-</span> pyRun<span class="sy0">_</span>String <span class="st0">&quot;False&quot;</span> Py<span class="sy0">_</span>eval<span class="sy0">_</span>input <span class="br0">&#91;</span><span class="br0">&#93;</span>
&nbsp;
<span class="co1">-- try it out</span>
# r <span class="sy0">&lt;-</span> pyRun<span class="sy0">_</span>String <span class="st0">&quot;1==1&quot;</span> Py<span class="sy0">_</span>eval<span class="sy0">_</span>input <span class="br0">&#91;</span><span class="br0">&#93;</span>
# r <span class="sy0">==</span> pyTrue
True</pre></div></div></div></div></div></div></div>




<h3>Passing Haskell Objects</h3>
<p>If you want to call into functions without having to deal with Python objects, there are several calls that automatically marshal the objects back and forth for you (most of them suffixed with &#8220;Hs&#8221;).  They turn out to be less than useful in most situations, though, since a) the parameters passed in must all be the same type, and b) extracting most return objects from Python is painful.</p>

<table border="1px solid" width="100%">
<tr><th>call</th><th>invoked upon</th><th>returns</th></tr>
<tr><td>callByNameHs</td><td>Object/function name</td><td>Haskell</td></tr>
<tr><td>pyObject_CallHs</td><td>PyObject</td><td>Haskell</td></tr>
<tr><td>pyObject_Hs</td><td>PyObject</td><td>Python</td></tr>
<tr><td>pyObject_RunHs</td><td>PyObject</td><td>nothing</td></tr>
<tr><td>callMethodHs</td><td>PyObject + method name</td><td>Haskell</td></tr>
<tr><td>runMethodHs</td><td>PyObject + method name</td><td>nothing</td></tr>
</table>
<br/>

<h3>Exceptions</h3>
<p>Python exceptions can be caught and handled; probably the easiest default way to do this is to use handlePy in conjunction with the exc2ioerror handler, which turns the Python exception into an IO error that reports the exception and fails.</p>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1">addem <span class="sy0">=</span> handlePy exc2ioerror <span class="sy0">$</span> 
           <span class="kw1">do</span> r <span class="sy0">&lt;-</span> pyRun<span class="sy0">_</span>String <span class="st0">&quot;1/0&quot;</span> Py<span class="sy0">_</span>eval<span class="sy0">_</span>input <span class="br0">&#91;</span><span class="br0">&#93;</span>
              fromPyObject r <span class="sy0">::</span> <span class="kw4">IO</span> <span class="kw4">Integer</span>
&nbsp;
# addem
<span class="sy0">***</span> Exception: user <span class="kw3">error</span> <span class="br0">&#40;</span>Python <span class="sy0">&lt;</span><span class="kw1">type</span> 'exceptions<span class="sy0">.</span>ZeroDivisionError'<span class="sy0">&gt;</span>: integer division <span class="kw3">or</span> modulo by zero<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>


]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/using-missingpy/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Python to Haskell (String Functions)</title>
		<link>http://www.brool.com/index.php/python-to-haskell-string-functions</link>
		<comments>http://www.brool.com/index.php/python-to-haskell-string-functions#comments</comments>
		<pubDate>Thu, 26 Feb 2009 06:29:40 +0000</pubDate>
		<dc:creator>tim</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.brool.com/?p=191</guid>
		<description><![CDATA[I will be updating this as I go&#8230; but see also the Wikipedia page on string functions in various languages. Joining a string list -- Python: &#34;,&#34;.join(lst) -- either... import Data.List intercalate &#34;,&#34; lst -- or... import Data.List concat &#40;intersperse &#34;,&#34; lst&#41; Splitting on a string -- a.split(ss) -- You might need to get Data.List.Split [...]]]></description>
			<content:encoded><![CDATA[<p>I will be updating this as I go&#8230;  but see also <a href="http://en.wikipedia.org/wiki/String_manipulation_algorithm">the Wikipedia page on string functions in various languages</a>.</p>

<b>Joining a string list</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python:  &quot;,&quot;.join(lst)</span>
<span class="co1">-- either...</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>List
intercalate <span class="st0">&quot;,&quot;</span> lst
<span class="co1">-- or...</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>List
<span class="kw3">concat</span> <span class="br0">&#40;</span>intersperse <span class="st0">&quot;,&quot;</span> lst<span class="br0">&#41;</span></pre></div></div></div></div></div></div></div>




<b>Splitting on a string</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- a.split(ss)</span>
<span class="co1">-- Data.List.Split comes from the split package on Hackage (cabal install split)</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>List<span class="sy0">.</span>Split
splitOn ss a 
<span class="co1">-- using regexes</span>
<span class="kw1">import</span> Text<span class="sy0">.</span>Regex
splitRegex <span class="br0">&#40;</span>mkRegex ss<span class="br0">&#41;</span> a</pre></div></div></div></div></div></div></div>
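
<p>If you would rather not pull in an extra package, splitting on a literal separator can be sketched with nothing but base.  The name splitOnStr is made up for this example (it is not part of Data.List.Split), and an empty separator is simply treated as &#8220;no split&#8221; where Python would raise an error.</p>

```haskell
import Data.List (isPrefixOf)

-- | Split a string on every occurrence of a literal separator.
-- Hypothetical helper for illustration; mirrors Python's a.split(sep)
-- for non-empty separators.
splitOnStr :: String -> String -> [String]
splitOnStr sep = go
  where
    n = length sep
    -- Collect one chunk, then continue after the separator (if one was found).
    go s = case breakOn s of
      (chunk, Nothing)   -> [chunk]
      (chunk, Just rest) -> chunk : go rest
    -- Scan forward until the separator sits at the head of the input.
    breakOn s
      | null sep           = (s, Nothing)          -- empty separator: do not split
      | sep `isPrefixOf` s = ([], Just (drop n s))
      | otherwise = case s of
          []     -> ([], Nothing)
          (c:cs) -> let (chunk, rest) = breakOn cs in (c : chunk, rest)
```

<p>Like the Python version, this keeps empty fields, so consecutive separators produce empty strings in the result.</p>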




<b>Does the string start with the prefix?</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: a.startswith(b)</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>List
isPrefixOf b a
<span class="co1">-- ByteString</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>ByteString
isPrefixOf b a</pre></div></div></div></div></div></div></div>




<b>Does the string end with the suffix?</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: a.endswith(b)</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>List
isSuffixOf b a
<span class="co1">-- ByteString</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>ByteString
isSuffixOf b a</pre></div></div></div></div></div></div></div>




<b>Does one string contain another?</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: a.find(b) != -1</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>List
isInfixOf b a
<span class="co1">-- ByteString</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>ByteString
isInfixOf b a 
<span class="co1">-- ByteString, when you also want the position</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>ByteString
findSubstring b a <span class="co1">-- returns Just index or Nothing</span></pre></div></div></div></div></div></div></div>




<b>Reverse a string</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: a[::-1]</span>
<span class="kw3">reverse</span> a
<span class="co1">-- ByteString</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>ByteString
<span class="kw3">reverse</span> bs</pre></div></div></div></div></div></div></div>




<b>Access one character</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: a[pos]</span>
a <span class="sy0">!!</span> pos    <span class="co1">-- slow O(n)</span>
<span class="co1">-- ByteString</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>ByteString
index a pos</pre></div></div></div></div></div></div></div>




<b>Where does one string contain another?</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: a.find(b)</span>
<span class="kw1">import</span> Data<span class="sy0">.</span>ByteString
findSubstring b a  <span class="co1">-- pattern first; Just index, or Nothing where Python returns -1</span></pre></div></div></div></div></div></div></div>
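
<p>For a plain String, the same lookup can be sketched with Data.List alone.  The name findString is hypothetical, chosen for this example; it returns Nothing where Python&#8217;s find would return -1.</p>

```haskell
import Data.List (findIndex, isPrefixOf, tails)

-- | Index of the first occurrence of needle in haystack, if any.
-- Hypothetical helper for illustration; plays the role of Python's
-- haystack.find(needle).
findString :: String -> String -> Maybe Int
findString needle haystack =
  -- tails yields every suffix of the haystack in order, so the index of the
  -- first suffix that starts with the needle is the match position.
  findIndex (needle `isPrefixOf`) (tails haystack)
```

<p>This is the simple quadratic approach, which is fine for short strings but not something to run over large inputs.</p>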




<b>Replace substrings</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: b = a.replace(src,dest)</span>
<span class="co1">-- note: src is compiled as a regular expression, not matched as a literal string</span>
<span class="kw1">import</span> Text<span class="sy0">.</span>Regex
b <span class="sy0">=</span> subRegex <span class="br0">&#40;</span>mkRegex src<span class="br0">&#41;</span> a dest</pre></div></div></div></div></div></div></div>
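
<p>Because subRegex interprets src as a regular expression, a source string such as &#8220;a.b&#8221; matches more than Python&#8217;s replace would.  Here is a minimal sketch of a literal replacement using only base; replaceAll is a hypothetical name for this example, and an empty src leaves the input untouched (unlike Python, which inserts dest between every character).</p>

```haskell
import Data.List (isPrefixOf)

-- | Replace every literal occurrence of src with dest.
-- Hypothetical helper for illustration; closer to Python's
-- a.replace(src, dest) than a regex substitution.
replaceAll :: String -> String -> String -> String
replaceAll src dest = go
  where
    n = length src
    go [] = []
    go s@(c:cs)
      | null src           = s                      -- nothing to search for
      | src `isPrefixOf` s = dest ++ go (drop n s)  -- emit dest, skip the match
      | otherwise          = c : go cs              -- copy one character and move on
```

<p>Matches are found left to right and do not overlap, and the replacement text itself is never rescanned.</p>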




<b>Replace regex, calling function on substitution</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: b = re.sub(&quot;[0-9]+&quot;, lambda x: str(int(x.group(0)) + 1), a)</span>
<span class="kw1">import</span> Text<span class="sy0">.</span>Regex <span class="br0">&#40;</span>matchRegexAll<span class="sy0">,</span> mkRegex<span class="br0">&#41;</span>
&nbsp;
subRegexFn fn re s <span class="sy0">=</span> <span class="kw3">concat</span> <span class="sy0">$</span> <span class="kw3">reverse</span> <span class="sy0">$</span> sub s <span class="br0">&#91;</span><span class="br0">&#93;</span>
    <span class="kw1">where</span> sub s accum <span class="sy0">=</span> <span class="kw1">case</span> matchRegexAll re s <span class="kw1">of</span>
                            Nothing <span class="sy0">-&gt;</span> s:accum
                            Just <span class="br0">&#40;</span>pre<span class="sy0">,</span> mid<span class="sy0">,</span> post<span class="sy0">,</span> <span class="sy0">_</span><span class="br0">&#41;</span> <span class="sy0">-&gt;</span> sub post <span class="sy0">$</span> <span class="br0">&#40;</span>fn mid<span class="br0">&#41;</span>:pre:accum 
&nbsp;
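b <span class="sy0">=</span> subRegexFn <span class="br0">&#40;</span><span class="kw3">show</span> <span class="sy0">.</span> <span class="br0">&#40;</span><span class="sy0">+</span> <span class="nu0">1</span><span class="br0">&#41;</span> <span class="sy0">.</span> <span class="br0">&#40;</span><span class="kw3">read</span> <span class="sy0">::</span> <span class="kw4">String</span> <span class="sy0">-&gt;</span> <span class="kw4">Integer</span><span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="br0">&#40;</span>mkRegex <span class="st0">&quot;[0-9]+&quot;</span><span class="br0">&#41;</span> a</pre></div></div></div></div></div></div></div>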




<b>Strip leading/trailing spaces</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python: a.strip()</span>
<span class="kw1">import</span> Data<span class="sy0">.</span><span class="kw4">Char</span> <span class="br0">&#40;</span>isSpace<span class="br0">&#41;</span>
trim      <span class="sy0">::</span> <span class="kw4">String</span> <span class="sy0">-&gt;</span> <span class="kw4">String</span>
trim      <span class="sy0">=</span> f <span class="sy0">.</span> f
   <span class="kw1">where</span> f <span class="sy0">=</span> <span class="kw3">reverse</span> <span class="sy0">.</span> <span class="kw3">dropWhile</span> isSpace</pre></div></div></div></div></div></div></div>
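
<p>An alternative sketch that avoids reversing the string twice uses dropWhileEnd from Data.List (available in newer versions of base).  The names strip, lstrip and rstrip are only borrowed from Python for this example.</p>

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Drop leading whitespace, like Python's a.lstrip().
lstrip :: String -> String
lstrip = dropWhile isSpace

-- | Drop trailing whitespace, like Python's a.rstrip().
rstrip :: String -> String
rstrip = dropWhileEnd isSpace

-- | Drop whitespace on both ends, like Python's a.strip().
strip :: String -> String
strip = rstrip . lstrip
```

<p>Whitespace inside the string is left alone; only the two ends are trimmed.</p>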




<b>Loop through every line in stdin</b>


<div class="wp-geshi-highlight-wrap5"><div class="wp-geshi-highlight-wrap4"><div class="wp-geshi-highlight-wrap3"><div class="wp-geshi-highlight-wrap2"><div class="wp-geshi-highlight-wrap"><div class="wp-geshi-highlight"><div class="haskell"><pre class="de1"><span class="co1">-- Python:  loop through every line in stdin, applying a function</span>
<span class="co1">-- for line in sys.stdin:</span>
<span class="co1">--      print fn(line)</span>
<span class="kw1">let</span> main <span class="sy0">=</span> <span class="kw3">interact</span> <span class="br0">&#40;</span><span class="kw3">unlines</span> <span class="sy0">.</span> <span class="kw3">map</span> fn <span class="sy0">.</span> <span class="kw3">lines</span><span class="br0">&#41;</span> 
<span class="co1">-- or...</span>
<span class="kw1">let</span> main <span class="sy0">=</span> <span class="kw1">do</span> <span class="br0">&#123;</span> a <span class="sy0">&lt;-</span> <span class="kw3">getContents</span>; <span class="kw3">mapM_</span> <span class="kw3">putStrLn</span> <span class="br0">&#40;</span><span class="kw3">map</span> fn <span class="br0">&#40;</span><span class="kw3">lines</span> a<span class="br0">&#41;</span><span class="br0">&#41;</span>; <span class="br0">&#125;</span></pre></div></div></div></div></div></div></div>
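
<p>One way to keep this testable is to pull the pure part out of the IO wrapper, sketched below.  Both processLines and shout are hypothetical names for this example, and shout just stands in for whatever fn you want to apply to each line.</p>

```haskell
import Data.Char (toUpper)

-- | Pure core of the stdin loop: apply fn to every line of the input.
-- Hypothetical helper for illustration.
processLines :: (String -> String) -> String -> String
processLines fn = unlines . map fn . lines

-- | Example per-line function (an assumption, standing in for fn).
shout :: String -> String
shout = map toUpper

-- In a program this would be wired up as:
--   main = interact (processLines shout)
```

<p>Note that unlines always terminates the last line with a newline, even if the input did not end with one.</p>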


]]></content:encoded>
			<wfw:commentRss>http://www.brool.com/index.php/python-to-haskell-string-functions/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

