I finally found a good test case to compare YARV's execution speed with the plain Ruby interpreter -- and the results are quite satisfying.
The script I benchmarked reads and parses 3,300 small XML files, extracts data, and writes the result to a tab-separated file.
(You can find a description of how to install YARV in the comments of why's recent article "YARV Merged Matz".)
About the Test Case
The benchmarked script, export.rb, is a real-life script that I wrote a couple of days ago: a friend is working on a profile matching algorithm for a community site, and to test his approach he needed some real-world data. I jump at any chance to write a screen scraper, so I offered to help. I fetched 3,300 XML files from Last.FM user profiles using the Audioscrobbler API, and then extracted data from them. Here is an example file from my own Last.FM user profile: topartists.xml.
The latter part of this process, extracting data from the XML, takes quite a while; REXML is convenient, but it's certainly not the fastest way to parse and query an XML file. But this is what makes it a meaningful benchmark to me: yeah, it's quite limited in scope, but it is a good example of the things I do in my daily life as a Ruby developer. Anything that speeds up REXML makes me more productive.
What follows is the core of the benchmarked code, cleaned up for clarity. It loads all XML files in a directory, parses their content, extracts some data using an XPath expression and writes the result to an IO stream.
    Dir.glob("#{CACHE_DIR}*.xml").each do |cachefile|
      name = File.basename(cachefile, ".xml")
      items = []
      xmldata = File.read(cachefile)
      REXML::Document.new(xmldata).elements.each('topartists/artist/name') do |el|
        items << el.text
      end
      out << items.unshift(name).join("\t")
    end
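If you want to try the same loop without a directory full of scraped profiles, here is a minimal self-contained sketch. It is not the original export.rb: the sample users and artists, the simplified XML layout, and the method names `export_profiles` and `demo_export` are all made up for illustration, and it assumes each row should end with a newline.

```ruby
require 'rexml/document'
require 'stringio'
require 'tmpdir'

# Parse every *.xml file in cache_dir and write one tab-separated row
# per file (file base name first, then the artist names) to `out`.
def export_profiles(cache_dir, out)
  Dir.glob(File.join(cache_dir, '*.xml')).sort.each do |cachefile|
    name = File.basename(cachefile, '.xml')
    items = [name]
    doc = REXML::Document.new(File.read(cachefile))
    doc.elements.each('topartists/artist/name') { |el| items << el.text }
    # Assumption: one row per line, so a newline is appended here.
    out << items.join("\t") << "\n"
  end
  out
end

# Build a throwaway cache directory with hypothetical sample data
# and return the exported text.
def demo_export
  samples = {
    'alice' => ['Radiohead', 'Portishead'],
    'bob'   => ['Boards of Canada']
  }
  Dir.mktmpdir do |dir|
    samples.each do |user, artists|
      # Simplified stand-in for a real topartists.xml file.
      body = artists.map { |a| "<artist><name>#{a}</name></artist>" }.join
      File.write(File.join(dir, "#{user}.xml"),
                 "<topartists user=\"#{user}\">#{body}</topartists>")
    end
    export_profiles(dir, StringIO.new).string
  end
end

puts demo_export
```

Writing to a StringIO instead of a file keeps the sketch easy to inspect; any object that responds to `<<` (a File, `$stdout`) would work the same way.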
The benchmark results
This is not a very scientific setup; it's just a quick test to satisfy my own curiosity -- note, for example, that the Ruby versions don't match. So simply take it as an illustration of the potential speedup YARV can provide, and not much else.
Using Ruby's default interpreter:
    $ ruby -v
    ruby 1.8.4 (2005-12-24) [powerpc-darwin8.5.0]

    $ time ruby export.rb
    real    13m56.559s
    user    10m42.359s
    sys     0m15.955s

    $ time ruby export.rb
    real    13m54.813s
    user    10m58.716s
    sys     0m14.627s

    $ time ruby export.rb
    real    14m0.648s
    user    10m55.555s
    sys     0m14.307s
And when using YARV:
    $ ruby_yarv -v
    ruby 2.0.0 (Base: Ruby 1.9.0 2006-02-14) [powerpc-darwin8.5.0]
    YARVCore 0.4.0 Rev: 475 (2006-02-23) [opts: ]

    $ time ruby_yarv export.rb
    real    6m0.519s
    user    4m36.774s
    sys     0m8.683s

    $ time ruby_yarv export.rb
    real    6m0.070s
    user    4m32.151s
    sys     0m7.609s

    $ time ruby_yarv export.rb
    real    6m1.309s
    user    4m32.323s
    sys     0m7.487s
So the speedup is not as impressive as in other tests, but at roughly 6 minutes instead of roughly 14, a bit more than twice as fast, it's a great start.
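To put a number on it, the timings above can be averaged and compared. The following is only a quick sketch of that arithmetic; the helper names (`to_seconds`, `mean`, `speedup`) and the constants are made up for this example, and the durations are copied from the `time` output above.

```ruby
# Convert a `time`-style duration such as "13m56.559s" to seconds.
def to_seconds(duration)
  match = duration.match(/\A(\d+)m(\d+(?:\.\d+)?)s\z/)
  raise ArgumentError, "unexpected duration: #{duration}" unless match
  match[1].to_i * 60 + match[2].to_f
end

# Arithmetic mean of a list of numbers.
def mean(values)
  values.sum / values.size
end

# Timings of the three runs per interpreter, as reported by `time`.
RUBY_184_TIMES = {
  real: %w[13m56.559s 13m54.813s 14m0.648s],
  user: %w[10m42.359s 10m58.716s 10m55.555s]
}.freeze

YARV_TIMES = {
  real: %w[6m0.519s 6m0.070s 6m1.309s],
  user: %w[4m36.774s 4m32.151s 4m32.323s]
}.freeze

# Ratio of mean Ruby 1.8.4 time to mean YARV time for :real or :user.
def speedup(kind)
  ruby = mean(RUBY_184_TIMES.fetch(kind).map { |d| to_seconds(d) })
  yarv = mean(YARV_TIMES.fetch(kind).map { |d| to_seconds(d) })
  ruby / yarv
end

printf("real: %.2fx faster\n", speedup(:real))  # prints "real: 2.32x faster"
printf("user: %.2fx faster\n", speedup(:user))  # prints "user: 2.38x faster"
```

With only three runs per interpreter and mismatched Ruby versions, these ratios are a rough indication rather than a precise measurement.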