<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Patrick Tulskie &#187; gems</title>
	<atom:link href="http://www.patricktulskie.com/tag/gems/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.patricktulskie.com</link>
	<description>Building a Better Internet</description>
	<lastBuildDate>Wed, 16 Jun 2010 19:12:25 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>libxml-ruby vs nokogiri vs hpricot</title>
		<link>http://www.patricktulskie.com/2009/03/libxml-ruby-vs-nokogiri-vs-hpricot/</link>
		<comments>http://www.patricktulskie.com/2009/03/libxml-ruby-vs-nokogiri-vs-hpricot/#comments</comments>
		<pubDate>Wed, 18 Mar 2009 05:05:26 +0000</pubDate>
		<dc:creator>Patrick Tulskie</dc:creator>
				<category><![CDATA[Comedy]]></category>
		<category><![CDATA[New Stuff]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[gems]]></category>
		<category><![CDATA[hpricot]]></category>
		<category><![CDATA[libxml]]></category>
		<category><![CDATA[libxml-ruby]]></category>
		<category><![CDATA[nokogiri]]></category>
		<category><![CDATA[parsing]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://www.patricktulskie.com/?p=115</guid>
		<description><![CDATA[Patrick talks about Ruby XML parser testing with libxml-ruby, nokogiri, hpricot, and rexml.  Includes new test results from the test suite written by Tenderlove (Aaron Patterson), modified to address some of Why the Lucky Stiff's complaints about the tests.]]></description>
			<content:encoded><![CDATA[<p><em><strong>Update: Aaron told me that he is going to be re-running the benchmarks this weekend so we&#8217;ll get a more complete set of data from the machine that originally ran the tests.</strong></em></p>
<p>If you&#8217;re into parsing XML or HTML with Ruby, chances are you&#8217;re familiar with the various gems out there for getting the job done.  Lately, there has been a lot of debate about which one is the fastest, and to settle it, Aaron Patterson (author of Nokogiri and Mechanize) wrote a test suite.</p>
<p>After its release, RubyInside posted about how the tests showed how fast Nokogiri was compared to Hpricot: <a title="Ruby XML Performance Shootout: Nokogiri vs LibXML vs Hpricot vs REXML - RubyInside" href="http://www.rubyinside.com/ruby-xml-performance-benchmarks-1641.html">Ruby XML Performance Shootout: Nokogiri vs LibXML vs Hpricot vs REXML</a>.  Later in the day, I saw Why&#8217;s post about the release of Hpricot (<a title="hpricot 0.7" href="http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/331411">hpricot 0.7</a>) and decided to modify Aaron&#8217;s tests to use Hpricot.XML.  Here are the results:<br />
<span id="more-115"></span></p>
<pre><code>Tests were run at N=5 to get a clearer picture of the differences between the various gems.  At N=2, tests were pretty close, which indicated that a larger sample was needed.

test_IO_parsing(XmlTruth::DOM::XML::LargeDocumentParsingTest) N=5
                  user     system      total        real   kBps
null          0.690000   0.070000   0.760000 (  0.768641) 46343.68
nokogiri      2.790000   0.130000   2.920000 (  3.015303) 11813.62
libxml-ruby   2.970000   0.140000   3.110000 (  3.130175) 11380.08
hpricot      13.660000   0.370000  14.030000 ( 14.088780) 2528.37
.
test_in_memory_parsing(XmlTruth::DOM::XML::LargeDocumentParsingTest) N=5
                  user     system      total        real   kBps
null          1.240000   0.010000   1.250000 (  1.260841) 28252.30
nokogiri      4.360000   0.060000   4.420000 (  4.444468) 8014.83
libxml-ruby   4.570000   0.050000   4.620000 (  4.641338) 7674.87
hpricot      13.750000   0.210000  13.960000 ( 14.045647) 2536.13
.
test_simple_xpath(XmlTruth::DOM::XML::LargeDocumentXPathSearchTest) N=5
                  user     system      total        real   kBps
nokogiri     44.430000   0.300000  44.730000 ( 44.972003) 792.09
libxml-ruby  40.950000   0.210000  41.160000 ( 41.300780) 862.49
hpricot      18.410000   0.090000  18.500000 ( 18.540239) 1921.32
.
test_IO_parsing(XmlTruth::DOM::XML::SmallDocumentParsingTest) N=1944
                  user     system      total        real   kBps
null          8.150000   0.130000   8.280000 (  8.326070) 4278.17
nokogiri     17.850000   0.100000  17.950000 ( 17.950534) 1984.36
libxml-ruby  19.010000   0.260000  19.270000 ( 19.370769) 1838.87
hpricot      25.320000   0.460000  25.780000 ( 25.827516) 1379.16
.
test_in_memory_parsing(XmlTruth::DOM::XML::SmallDocumentParsingTest) N=1944
                  user     system      total        real   kBps
null          3.960000   0.030000   3.990000 (  4.005522) 8892.82
nokogiri     18.140000   0.200000  18.340000 ( 18.403396) 1935.53
libxml-ruby  19.760000   0.230000  19.990000 ( 19.999905) 1781.03
hpricot      15.980000   0.150000  16.130000 ( 16.133157) 2207.90
.
Finished in 426.233021 seconds.

5 tests, 0 assertions, 0 failures, 0 errors</code></pre>
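<p>For anyone reading the kBps column: multiplying each row&#8217;s kBps by its real time gives roughly the same total (about 35,600 kB) in every test, so the column looks like total data processed divided by wall-clock time.  Here is a minimal sketch of that calculation using Ruby&#8217;s standard Benchmark module.  The <code>measure_throughput</code> helper, the stand-in payload, the word-splitting workload, and the choice of 1024 bytes per kB are assumptions made for illustration and are not taken from the test suite.</p>

```ruby
require "benchmark"

# Hypothetical helper: run the block n times over the same payload and
# report wall-clock seconds plus throughput in kilobytes per second.
# Assumes 1 kB = 1024 bytes; the original suite may define it differently.
def measure_throughput(payload, n)
  real = Benchmark.realtime do
    n.times { yield payload }
  end
  total_kb = payload.bytesize * n / 1024.0
  { real: real, kbps: total_kb / real }
end

# Stand-in document and workload; a real run would call a parser here.
document = "item " * 200_000
result = measure_throughput(document, 5) { |text| text.split.size }

printf("%-12s %10.6f s %12.2f kBps\n", "split", result[:real], result[:kbps])
```

<p>Swapping the block for a call into a real parser would give a row comparable in spirit to the ones above, though the exact numbers depend on the machine and the document.</p>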
<p>You can find my fork of the test suite on GitHub here: <a title="Patrick Tulskie's fork of XMLTruth on Github" href="http://github.com/PatrickTulskie/xml_truth/tree/master">Patrick Tulskie&#8217;s Fork of XMLTruth</a></p>
<p>From this small sample of tests, Nokogiri and libxml-ruby appear similar in performance across the board, which makes sense since Nokogiri uses the native libxml of the current operating environment.  Both clearly excel at parsing the large document, finishing in roughly 3 to 4.6 seconds where Hpricot needed about 14.  Hpricot, on the other hand, was the fastest at the XPath search on the large document (about 18.5 seconds versus 41 to 45) and at parsing small, in-memory documents (about 16.1 seconds versus 18.4 and 20.0).</p>
<p>In real-world scenarios, one might expect Nokogiri to be the better choice for parsing large XML or HTML documents from disk into a database, whereas Hpricot might be a better fit for a web crawler, where a page&#8217;s DOM is rarely larger than 1 MB.</p>
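<p>To put numbers on that comparison, here is a small sketch that takes the real (wall-clock) times from the results above and expresses each library as a multiple of the fastest one in the same test.  The <code>slowdown_vs_fastest</code> helper and the hash layout are hypothetical names used only for this illustration; the timings themselves are copied from the output.</p>

```ruby
# Wall-clock ("real") seconds copied from the results above.
REAL_SECONDS = {
  large_io:        { nokogiri: 3.015303,  libxml: 3.130175,  hpricot: 14.088780 },
  large_in_memory: { nokogiri: 4.444468,  libxml: 4.641338,  hpricot: 14.045647 },
  large_xpath:     { nokogiri: 44.972003, libxml: 41.300780, hpricot: 18.540239 },
  small_io:        { nokogiri: 17.950534, libxml: 19.370769, hpricot: 25.827516 },
  small_in_memory: { nokogiri: 18.403396, libxml: 19.999905, hpricot: 16.133157 }
}.freeze

# Hypothetical helper: how many times longer each library took than the
# fastest library in the same test (1.0 means it was the fastest).
def slowdown_vs_fastest(times)
  fastest = times.values.min
  times.transform_values { |seconds| (seconds / fastest).round(2) }
end

REAL_SECONDS.each do |test, times|
  puts format("%-16s %s", test, slowdown_vs_fastest(times))
end
```

<p>Read this way, Hpricot took between four and five times as long as Nokogiri on the large-document IO test, while Nokogiri and libxml-ruby took more than twice as long as Hpricot on the XPath search.</p>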
<p>Please post any other thoughts you might have in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.patricktulskie.com/2009/03/libxml-ruby-vs-nokogiri-vs-hpricot/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>you need to write me [updated]</title>
		<link>http://www.patricktulskie.com/2009/03/you-need-to-write-me/</link>
		<comments>http://www.patricktulskie.com/2009/03/you-need-to-write-me/#comments</comments>
		<pubDate>Fri, 06 Mar 2009 23:18:40 +0000</pubDate>
		<dc:creator>Patrick Tulskie</dc:creator>
				<category><![CDATA[ruby]]></category>
		<category><![CDATA[gems]]></category>
		<category><![CDATA[os x]]></category>
		<category><![CDATA[you need to write me]]></category>

		<guid isPermaLink="false">http://www.patricktulskie.com/?p=109</guid>
		<description><![CDATA[How to fix the "you need to write me" problem with Ruby.]]></description>
			<content:encoded><![CDATA[<p>About 20 minutes ago I ran into a situation where every time I ran a script on my machine, the only output would be &#8220;you need to write me.&#8221;</p>
<p>Naturally I was a little freaked out.</p>
<p>After retracing my steps over the past hour, I remembered I had updated my Ruby gems with a good ol&#8217; &#8220;sudo gem update.&#8221;  I do it all the time, so I didn&#8217;t see any cause for concern.  I looked at the newly installed gems and saw that libxml-ruby-1.0.0 was among them.  I browsed inside the gem and saw that it had a bin directory with a ruby executable in it.  Cute.  Whoever released that needs to pay super close attention to what they are doing in the future.</p>
<p>Anyhow, I uninstalled the gem, and when it asked if I wanted to remove the ruby executable, I said yes.  This, of course, trashed the ruby executable in my /usr/bin.  Luckily I was able to retrieve a copy from Jay Amster and all was well.  If I were to do things over, I&#8217;d say no to removing the executable and just delete the gem and all of its files.</p>
<p>Having that broken ruby executable in my path devastated my system, though.  Half of my TextMate scripts no longer worked, none of my Rails apps would execute, etc.  It was awful.  Thankfully I was able to figure it out quickly, and hopefully if you search for &#8220;you need to write me&#8221; you&#8217;ll stumble upon this post and know what to do to fix your machine.</p>
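<p>If you suspect the same thing has happened to you, one way to check is to ask RubyGems which installed gems declare an executable named ruby.  This is only a sketch: the <code>gems_shipping_executable</code> helper is a hypothetical name, and it assumes the offending gem lists the file in its gemspec&#8217;s executables, which may not hold for every broken release.</p>

```ruby
require "rubygems"

# Hypothetical helper: full names of gems whose specification declares an
# executable called `name`. Any collection of objects responding to
# `executables` and `full_name` works; installed gems are the default.
def gems_shipping_executable(name, specs = Gem::Specification)
  specs.select { |spec| spec.executables.include?(name) }
       .map { |spec| spec.full_name }
end

offenders = gems_shipping_executable("ruby")
if offenders.empty?
  puts "No installed gem declares a ruby executable."
else
  puts "Gems declaring a ruby executable: #{offenders.join(', ')}"
end
```

<p>Anything this prints is worth a closer look before the next uninstall prompt asks about removing executables.</p>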
<p><strong>UPDATE:</strong></p>
<p><strong>It would appear as though this problem is now resolved.  Maybe I got a bad install of the gem?  Maybe it was just a fluke?  Who knows?  It appears safe to install the latest libxml-ruby now though.</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.patricktulskie.com/2009/03/you-need-to-write-me/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
