Idiomatic Ruby

Sunday, June 22nd, 2008

In my last post about python, I included the following ruby code to give an example of how to get a list of all the classes named “Test.*” but not “TestSlow”:

fastTests = []
ObjectSpace.each_object( Class ) { |x| fastTests << x.to_s if x.to_s =~ /^Test/ }
fastTests.delete("TestSlow")

But this is ugly. The Ruby Way(tm) should be to do it in one line, right? Surely there is some idiomatic way to filter a collection — .select should do it.

But you can’t apply .select to the result of each_object(). In fact the return value of each_object is a Fixnum (the number of objects found), not an array, and you only have access to the object list through the block.

But you can’t return a value out of a block, either. You can alter existing variables in scope, but the routine you passed the block to gets to determine its own return value.

I spent a little while hacking on a way around this limitation using exceptions as an additional return mechanism. It was evil and it didn’t work.

So I created the following function, which calls a method that takes a block and collects everything the method yields into an array:


def unpack( method, args )
  a  = []
  method.call( args ) { |x| a << x}
  return a
end

With that in hand, you can get the available classes into an array, then filter them like this:

  fastTests = unpack( ObjectSpace.method( :each_object ), Class ).select do |x|
    # starts with Test but is not TestSlow
    x.to_s =~ /^Test/ and not x.to_s == "TestSlow"
  end


Or in many other clever ruby ways, surely.

But why isn’t that functionality built in? If you don’t pass an argument to each_object, it will iterate over all the currently active objects, which could yield a very large array. Forcing the user to use a block instead of blindly dumping the object references into an array is probably an attempt to be kind to the user, to prevent his shooting himself in the foot. But it yields the decidedly un-rubylike code that starts off this entry, with the explicit initialization of the target array.

By the way, I checked to see how it’s done in Test::Unit, written by someone who’s at least ten times more expert in ruby than I am, and it’s the same idiom:

        def collect(name=NAME)
          suite = TestSuite.new(name)
          sub_suites = []
          @source.each_object(Class) do |klass|
            if(Test::Unit::TestCase > klass)
              add_suite(sub_suites, klass.suite)
            end
          end

Maybe you could come up with something evil by creating a custom filter class and overriding the === method — which is how the each_object method is filtering what gets passed to the block. But it’s a little disappointing that there’s no simple builtin way to turn an iterator into a list.

Why I hate python

Saturday, June 21st, 2008

I recently wrote a small internal program in python to automate our client’s builds: run everything overnight, collect the output, and generate some rudimentary web pages. It wound up being 250 loc for the program, split into about three files, and 500 loc for the tests. I spent about 2-5 days on it, part time. Just enough to give me a feel for what the language is like.

And I don’t like the language. I know that python is supposed to be the cool language, and perl is ugly line noise. And ruby? I don’t know what people have against ruby. But python is so sexy they even use it to teach the Intro to Computer Science course at Harvey Mudd now.

I hate the way formatting structures control flow. I hate that even in XEmacs in python-mode, I was fighting with it to get the right indentation (which means the right semantics). I hate that I can’t define an empty class or method. Several times I started to write a test, realized I wanted to run something first, and left it stubbed out as


  def testHTMLLine(self):

Well, you can’t do that. You need to do:


  def testHTMLLine(self):
      pass

At that point I’ll bring my own curly braces.

I hate that it’s dynamic, but not dynamic enough. I had a set of tests, collected into about five test cases. One set of tests was slow (10 seconds) because it actually tested connecting to the CVS server, checking out a project, building it, etc. The rest of the tests finished in under a second. I wanted to define a test suite that contained all the fast tests only, so I could shorten my test-code-test cycles.

Simple and evil solution: enumerate the test cases that are fast. Evil because I have to remember to update the list if I add a new fast test case. So create a suite at module level like this:


fastTests = ["test.TestParseOutputLog", "test.TestDirClass", "test.TestSystem" ]
fastSuite = unittest.TestLoader().loadTestsFromNames( fastTests )

Clever and dynamic solution: enumerate all the tests but remove the slow ones.

# NOTE: not actually python, this DOESN'T WORK
fastTests = __thismodule__.Classes.remove( "SlowTest" )
fastSuite = unittest.TestLoader().loadTestsFromNames( fastTests )

I couldn’t figure out how to easily enumerate all the classes in the module I was in. While loading the module, some of the things which get set up eventually (the module object, for example) are not yet available to me. Look, I know it’s possible because the unittest.main() routine knows how to enumerate all the classes which are descended from unittest.TestCase. But it’s not trivial to do. In ruby you could do something like

fastTests = []
ObjectSpace.each_object( Class ) { |x| fastTests << x.to_s if x.to_s =~ /^Test/ }
fastTests.delete("TestSlow")

where you could filter on name (starts with Test) and file (if necessary), and then remove the slow one. This isn’t about dynamic vs. static languages, it’s about user experience. In C you know you can’t do reflection, and when a unit test framework like cxxtest depends on perl or Python being installed, that’s fine. But because it’s possible in Python — but obscure and ill-documented — it becomes an annoyance.
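One way to get the dynamic version in Python, sketched under the assumption that the test cases are defined in the same module that builds the suite. The helper name fast_suite and the placeholder test bodies are made up for illustration, and TestSlow stands in for the slow test case; only the unittest calls are real.

```python
import unittest


class TestParseOutputLog(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)


class TestDirClass(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)


class TestSlow(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)


def fast_suite(namespace, slow_names=("TestSlow",)):
    """Collect every TestCase subclass found in namespace, skipping the slow ones.

    fast_suite is a hypothetical helper; namespace is a dict such as globals().
    """
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for name in sorted(namespace):
        obj = namespace[name]
        is_test_case = (
            isinstance(obj, type)
            and issubclass(obj, unittest.TestCase)
            and obj is not unittest.TestCase
        )
        if is_test_case and name not in slow_names:
            suite.addTests(loader.loadTestsFromTestCase(obj))
    return suite


# globals() is already usable while the module is still loading
fastSuite = fast_suite(globals())
```

The call has to come after the class definitions, since globals() only holds what has been defined so far while the module loads.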

The documentation is horrible. Over the last ten years my first recourse when looking for API documentation has been my friend Google. The Python documentation, even though it’s all free and available online, seems to be deliberately poorly organized and indexed. What I was hoping for was a quick cheat sheet on how to do a common thing (string interpolation). What I found was proposals to change or extend Python in varying mutually incompatible ways, in order to make it easier to do string interpolation. And this experience was repeated over and over as I dug through the language, library, and ancillary documentation looking for the Python way to do the things I needed.
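For reference, a short sketch of the interpolation options that ship with Python 2.5 and later: the % operator and string.Template. The variable names and values are made up for illustration.

```python
from string import Template

project = "client-build"  # hypothetical values for illustration
failures = 3

# printf-style formatting with positional values
line1 = "%s: %d failures" % (project, failures)

# printf-style formatting with named values taken from a dict
line2 = "%(project)s: %(failures)d failures" % {"project": project, "failures": failures}

# string.Template gives shell-like $name substitution
line3 = Template("$project: $failures failures").substitute(project=project, failures=failures)

print(line1)
print(line2)
print(line3)
```

All three lines produce the same string; which one reads best is mostly a matter of taste.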

Finally, Python doesn’t live up to its hype. How can it support functional programming when expression if (as opposed to statement if) was only added as a language feature in version 2.5? Even C has it!
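For reference, the 2.5 conditional expression looks like the sketch below. The function label and the sample timings are hypothetical. The last line shows the and/or workaround commonly used before 2.5, which misbehaves when the "true" value is itself falsy.

```python
def label(seconds):
    # conditional expression: value_if_true if condition else value_if_false
    return "slow" if seconds > 1 else "fast"


# because it is an expression, it can sit inside a comprehension or a lambda
labels = [("slow" if s > 1 else "fast") for s in (0.2, 10)]
pick = lambda s: "slow" if s > 1 else "fast"

# pre-2.5 workaround; gives the wrong answer if the "true" value is falsy
old_style = (10 > 1) and "slow" or "fast"
```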

Perhaps I have this all wrong, and Python is the best language EVAH. In the spirit of Strong Opinions, Weakly Held I welcome corrections and arguments in the comments.