Programming

New Post at the DevBlog

I've got a new post up at the DevBlog, discussing Enhancements:

  
http://guidewiredevelopment.wordpress.com/2008/03/18/enhancements-in-gscript/

Pretty cool feature of GScript.
|

Dynamic Languages are Wrong

No one is ever going to read this post, so it doesn't matter if I take an absurdly extreme position. I'm going to come right out and say it:

Dynamic Languages Are Just Wrong For 99% of All Development


There has been a flurry of excitement over ruby and a few other dynamically typed languages in the last few years, driven mainly by rails on the server side and javascript on the browser. Rails is a great project and is far better for most websites than the J2EE stack, but that unfortunately obscures the fact that the language it is built on, ruby, while superior to Java in many ways, just isn't the right thing for most developers.

This is a relatively new opinion of mine. You can see previous posts of mine where I'm very enthusiastic about ruby. And, at some level, I'm still enthusiastic about it. Many of the features Ruby offers we've translated into GScript at Guidewire and I'm forever in debt to it for that. But that doesn't change the fact that I think it is wrong for most developers.

In order to prove this somewhat ridiculous claim, I'll compare what I consider the key features of ruby and how GScript matches up in its statically typed world.

Terseness

Ruby is incredibly terse when compared with many statically typed languages. As a motivating example, a simple enterprise-y method definition might look like this (Note: I intentionally include an assignment to a local var in order to contrast with GScript):

  def employees_over_age( age )
    emps = @employees.find_all { | e | e.age > age }
    emps
  end


Compare that with the five to ten lines of java you would have to write to accomplish the same task, with all the generics and types you would have to annotate. I can't even bear to write it all out.

But let's look at the same function defined in GScript:

  function employeesOverAge( age : int ) : Employee[] {
    var emps = _employees.findAll( \ e -> e.age > age )
    return emps
  }


I'll admit, it is more code. But not a ton more and I think most of the additional code is pretty reasonable: you have to annotate in and out types at the method level, which is good because you can restrict your implementation details from leaking out of the method. You have to put an explicit return statement in the code which I actually find more readable. And you have a slightly more verbose but also more consistent syntax for blocks.

So ruby wins in terseness but GScript gets pretty darned close even though it is statically typed.

Open Classes

Ruby elegantly (or hackily, according to taste) solves another common problem: what if someone hasn't designed a class to your liking, omitting an obvious method or two, and you want to add this functionality to it? In java, this has led to a proliferation of *Util classes: StringUtil, ObjectUtil, DateUtil, FileUtil, etc. There are thousands of these util classes filled with static methods floating around java code bases. Some code bases are so large (*cough* Guidewire *cough*) that there are multiple different versions of these utility classes, often with subtly different names.

In ruby, you can simply add a method to a class like so:

  class String
    def my_method()
      puts( "Holy Crap!!! I've added a method to Strings!" )
    end
  end

These are referred to as "Open Classes." Pretty neat, eh? Well, it's so neat that we decided we needed something like that in GScript, so we added something called Enhancements. Here is an equivalent Enhancement:

  enhancement MyStrEnhancement : String {
     function myMethod() {
       print( "Holy Crap!!! I've added a method to Strings!")
     }
  }

Not bad, eh? And because GScript is statically typed and because we provide an IDE for it, you will get very nice code completion when you hit '.' after a string object, with your shiny new method available and quite discoverable.

So I think GScript actually wins here by a hair because it formalizes the class extension mechanism in a new language construct, but both are just about equivalent. The point is that you don't need dynamic typing for this very useful feature.


MetaProgramming

The really killer aspect of ruby and what lets rails clean J2EE's clock in terms of ease-of-development is the metaprogramming ability you have available. This allows you to dynamically generate classes on the fly based on, well, whatever you damned well please. This is how ActiveRecord builds classes based on your database schema, with no code-gen phase in the middle to gunk up the works. You change the schema and, *bam*, your class is updated.
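To make that concrete, here is a minimal sketch of the idea in plain Ruby. It is not how ActiveRecord is actually implemented: the SCHEMA hash and the class names are made up for illustration and stand in for metadata that would really come from the database.

```ruby
# Hypothetical schema metadata (an assumption for this sketch);
# ActiveRecord would read the equivalent from the database.
SCHEMA = {
  "Employee" => [:name, :age],
  "Office"   => [:city]
}

# Build one class per table at runtime, with a reader and writer per column.
SCHEMA.each do |class_name, columns|
  klass = Class.new do
    columns.each do |col|
      define_method(col) { @attrs[col] }
      define_method("#{col}=") { |value| @attrs[col] = value }
    end

    def initialize(attrs = {})
      @attrs = attrs
    end
  end
  # Register the anonymous class under its name, e.g. Employee.
  Object.const_set(class_name, klass)
end

emp = Employee.new(name: "Ada", age: 36)
emp.age = 37
puts emp.age
```

Add a column to the hash and the accessor simply exists the next time the code loads, with no generated source file in between.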

It's hard to contrast this with GScript's alternative in a succinct, blog-friendly way, but I'll try. GScript has an "open" type system allowing anyone to implement a TypeLoader and custom Types in java (which underlies GScript). That TypeLoader can construct its types based on whatever metadata it wants to, just like in ruby. At Guidewire we use this feature to build type systems on top of our web-UI files, on top of our internationalization properties files and on top of our OR layer, to name just a few. This allows GScript code to access these resources in a typesafe way but without any sort of a code-gen step.

GScript's Type System, therefore, is very flexible in much the same way that Ruby's is: developers can implement their APIs in terms of types that they create dynamically rather than statically, based on whatever metadata they like.

An Anti-Feature: DSLs

A lot of developers are excited about DSLs in ruby that take advantage of the flexible nature of the ruby language. I'm not sure they are such a good idea. I think developers, on balance, would prefer to program in one sufficiently powerful language. I would imagine this is especially true in the enterprise space. I also think language design is pretty difficult and putting a bunch of people to work churning out specialized languages wouldn't turn out as well as we might hope. I think there may even be a biblical story about that sort of thing.

Rather than domain specific languages, I think there should be domain specific type systems: as I mentioned above we have type systems for our web layer, our OR layer, our permissions layer, etc. and it all works out grand. You have access to all these resources in a single language, GScript, presented in (one hopes) a nice API shaped by the dynamically generated types of the particular TypeLoader.

No new syntax to learn, just more libraries. Nice.

So Why Are Dynamic Languages Wrong?

Really it boils down to two reasons: tools and static verification. The first reason is far more important than the second one.

Being able to hit '.' and see what the hell you can do with an object is priceless, particularly on larger projects. I know some people say "don't get involved in larger projects" but, well, sometimes it happens. Refactoring tools (yeah, yeah, Smalltalk, blah blah blah) are far easier to implement correctly for statically typed languages than for dynamically typed languages. And whole-program analysis tools become possible. If the syntactic and expressive price is low enough (and in GScript, it is) then there is no reason to give up all this functionality for a dynamic language.

Static verification has gotten a bit of a bad name lately and we often joke at Guidewire that "well, it compiled, it must be right." Still, when you are making big changes and you have tens of thousands of tests to run, it is really nice to have something relatively fast (a compiler) point out things you have obviously missed at compilation time rather than waiting to run a series of test suites (even on our distributed testing cluster, it often takes up to an hour to hear back about every test after a checkin.)

But really, I could have stopped at '.'

Good code completion pretty much QED's the argument in my book.

|

GScript...

I'm doing a series of blog posts over at the Guidewire DevBlog on why GScript is a more enjoyable programming language than Java. I've just put up a recent post that shows some of the special sauce we have added to the language using Enhancements and Generics. (I'll discuss them later.)

Check it out.

GScript rocks.
|

Java and XML: Let's not use them together

My dissatisfaction with both java and XML is fairly well documented in previous posts. To recap two major points:

  • Java isn't flexible enough, both syntactically and with respect to its type system. (Let's leave aside the lack of a reasonable lambda-style syntax for the moment, which is higher on my List of Things That Make Me Cuss When I'm Programming In Java, but is not as relevant for this article.)
  • XML is, by design, horrifically redundant. (This is almost acceptable for what it was intended to be: a non-human readable data format that didn't have the negative connotations associated with s-expressions. It is totally unacceptable now that people are forced to look at it all day long.)

What I'd like to look at in this post is why I think XML has become such a huge part of java development and why I think that is unfortunate. The short answer to the first part is:


People need Domain Specific Languages (DSLs)



Why do people need DSLs? Because there are whole chunks of applications that don't need a full, general programming language, and for which a full, general programming language is poorly suited. O/R mapping is a good and common enough example. Build tools are another: you want a syntax that encapsulates the common operations so you don't end up generating reams of general code to do basic activities.

As noted in this paper, there is a continuum between libraries and DSLs. So, why have DSLs at all? Why not simply design libraries?

The answer to that question is: syntax. As much as academics might scoff at it, syntax matters, and it matters a lot.

This is precisely why ruby is enjoying so much success right now: ruby's syntax and evaluation rules are so flexible that they allow you to create minimal, expressive DSLs with very little effort. You simply have to get your head around how meta-programming in ruby works, and you are off to the races. Ruby on rails is a DSL for building web applications, and a pretty darned good one.
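As a rough illustration of how little machinery this takes, here is a sketch of a toy build-tool DSL. Everything in it (BuildSpec, task, depends_on) is invented for this example and is not taken from rake, rails, or any real library. The trick is instance_eval, which runs a block with self swapped out so that bare method calls read like declarations.

```ruby
# Hypothetical mini-DSL for describing build tasks; all names are made up.
class BuildSpec
  attr_reader :tasks

  def initialize
    @tasks = {}
  end

  # Looks like a declaration in use, but is an ordinary method call.
  def task(name, depends_on: [], &action)
    @tasks[name] = { depends_on: depends_on, action: action }
  end

  # Evaluate the block with self set to a fresh spec, so bare `task`
  # calls inside the block resolve to the method above.
  def self.define(&block)
    spec = new
    spec.instance_eval(&block)
    spec
  end

  # Run a task after its dependencies; returns names in execution order.
  def run(name, done = [])
    return done if done.include?(name)
    entry = @tasks.fetch(name)
    entry[:depends_on].each { |dep| run(dep, done) }
    entry[:action]&.call
    done << name
  end
end

spec = BuildSpec.define do
  task(:compile) { puts "compiling" }
  task(:test, depends_on: [:compile]) { puts "testing" }
end

spec.run(:test)
```

The same information could live in a plain library call with a hash of hashes; the DSL version differs only in syntax, which is exactly the point.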

Java is, of course, much more locked down than ruby. And this isn't necessarily a bad thing. The java designers were coming from a world replete with horrible C macro-kludges, so it's understandable that they decided to leave out syntactic extensions. If every man is a language designer, you end up with a ton of badly designed languages. But you also end up with a few very well designed ones. And I'm not entirely convinced that the vast majority of useful, small DSLs aren't simply badly designed languages that answer a particular specific need, akin to German's relationship with soldiering.

In any event, that's wandering a bit off point. The facts on the ground today are that java developers have been in need of a way to design and implement DSLs for a while now (even when they don't call them that) and the accepted way to do it has become XML. Why?

My theory is this: in java, DSLs usually start out as a library, then progress to a library with a smidgen of configuration. XML became the de facto standard for config files during the .com boom, property files apparently not being cool enough, so config information ended up in XML files. Additionally, XSDs give us rudimentary tools for verifying a language's syntax (though not its semantics).

All fine and well. I might pick another syntax for structured configuration (say, YAML), but whatever. XML is reasonably suited for simple declarative programming.

But then we java developers started doing more and more in those config files and, at some point, they began to cross over that invisible line and become semantically crucial parts of our applications. They no longer simply contained a few flags used to slightly modify runtime behavior. They became an XML-based programming language for crucial subsystems.

This is unfortunate, for many reasons. Among them:

  • We have traded java, a language that, while certainly not beautiful, is at least plausible, for one that was never designed for human consumption. XML is utterly miserable to use in large quantities. See ant build files, and weep.
  • We now have to think in two different syntaxes. I maintain that this is a difficult transition for a significant portion of the programming populace.
  • We cannot have any sort of locality with related java code. Again, the syntax is so utterly foreign that it is like mixing Japanese and English. Even if we could put it in the same file, or add IDE support to navigate from one to the other, it wouldn't work well.
  • And, most interestingly to me, it becomes difficult to communicate whatever type information we have built into our DSL to Java. We have two choices I can see:


So, that outlines why I think we ended up with so much XML in our java applications, and why I view that as an unfortunate thing. Now the hard part: what can be done about it.

Frankly, I have no idea.

My first reaction is that we need to open up java with a type-safe macro language to allow for syntactic extensions. But as nonchalant as ruby has made me about language extensions, it still seems insane in java. The macro (meta?) language needs the ability to communicate with the java type system easily, making it easy to generate coherent error messages.

I realize, of course, that I may simply be saying something as absurd as "let's make hard problems easy," but I have to believe that there is a better way than the current state of things.

I'm going to spend some quality time with O'caml/camlp4 over the next month and see if I get anything out of it.

Related Links:

  • Wikipedia link on DSLs
  • Camlp4 - O'caml's DSL support framework
|

All Ordered Combos

Got a bit obsessed with this last night. The wife is gone, so I have to find something to do with myself.

Pasted Graphic

It's a bit convoluted, although not bad and nowhere near what the java implementation would look like.
|

OK, OK, OK, last one, I promise

Pasted Graphic 1

Nowhere near as elegant as the inject method below, but this covers both assignment and block usages, so you can pick your poison. If you pass in a block then you have linear rather than exponential memory usage, although the run time is of course equivalent.
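The listing itself was posted as an image, so what follows is only a sketch of the shape described here and not the original code. The name power_set_with_block and the bitmask approach are assumptions. With a block, each subset is yielded and then dropped; without one, the subsets are collected into an array.

```ruby
# Sketch only (not the original listing): yields each subset when given a
# block, otherwise returns all subsets in an array.
def power_set_with_block(items)
  unless block_given?
    # Assignment usage: collect every subset, which takes exponential memory.
    result = []
    power_set_with_block(items) { |subset| result << subset }
    return result
  end

  # Each integer from 0 to 2**n - 1 is a bitmask choosing one subset.
  (0...(1 << items.size)).each do |mask|
    yield items.select.with_index { |_, i| mask[i] == 1 }
  end
end

all = power_set_with_block([1, 2, 3])          # assignment usage
power_set_with_block([1, 2, 3]) { |s| p s }    # block usage, one subset at a time
```

Either way the loop body runs 2**n times, so only the memory profile changes between the two usages.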

OK, I'm done. This is my final answer.

Wait... Maybe we should add an optional argument with a default value that limits the output, to prevent inadvertently calling it with an array that will take forever to return...

|

You know what this blog needs? More power_set().

As luck would have it, an email got sent out to our coders list today asking about power set functionality in java. I flippantly replied with my implementation, saying that I was sure that the java equivalent would be just as elegant.

A witty exchange of emails followed (the sort of thing that makes you love working at an engineering oriented company), and Jim made the point that all the offered implementations were horribly memory inefficient. He offered an iterator-based solution in java.

Well, I for one am not going to stand here and let our favorite little programming language have its name dragged through the mud. So here was the iterator-based solution I came up with:

Pasted Graphic 2
|

power_set()

Poked around a bit and didn't see this implementation, which actually came to me in bed last night. (Don't ask.)

Pasted Graphic 3
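The listing was posted as an image. A later post refers to it as "the inject method", so a minimal sketch along those lines, which may differ from the original, could look like this:

```ruby
# Sketch of an inject-based power set; an assumption about the original.
# Start with just the empty set, then for each item keep every existing
# subset and also add a copy of each with the item appended.
def power_set(items)
  items.inject([[]]) do |subsets, item|
    subsets + subsets.map { |subset| subset + [item] }
  end
end

p power_set([1, 2, 3])
# prints [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```

The accumulator doubles in size at every step, which is where the exponential memory use comes from.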
|

Dynamic Languages

I gave a Ruby talk here at Guidewire a few weeks ago, and it has been fun to see all the discussion that it has generated. I don't think there is much chance of us using Ruby in any serious way in our core applications, but it could make inroads in the periphery (support code, example integration applications, etc.)

Despite how much I love Ruby, I'm still skeptical of how well it will perform in a large system. I just don't have experience with large, dynamically typed systems, and a lot of older engineers I respect shudder at the idea. Maybe the prevalence of unit-testing will change this (Martin Fowler seems to think so.) I guess we will have to wait and see how the Rails projects turn out to provide evidence one way or the other.
|

Testing and Change

Guidewire is an XP-ish shop, and one of the delights of working there is the incredible test infrastructure that they have set up. It has been immensely educational to work with people who take testing so seriously. Now that I have some first-hand experience with this sort of an environment, I have a few tentative observations:

  • Test-first development is hard when a GUI layer is involved
  • End-to-end tests are not worth the effort until you have a 1.0 product. And perhaps they aren't even worth the effort until you have a 2.0 product, where certain application paths have been established and need to be maintained.
  • Unit testing can get really nasty when there are elaborate dependencies between classes
      ◦ It is very hard to keep dependencies low. It requires effort at every step. If you aren't constantly watching it, you will introduce them.
  • A flexible sample data and configuration generation platform is crucial for a good test environment
  • If tests aren't easy to write, they won't get written or they will be written poorly
|

Why *their* programming language is cooler than *my* programming language

Some haskell code that makes me cry:

  --Good ol' quicksort
  quicksort [] = []
  quicksort (x:xs) = quicksort [ y | y <- xs, y < x ]
                     ++ [x]
                     ++ quicksort [ y | y <- xs, y >= x ]

  --The list of the Fibonacci numbers
  fib = 1 : 1 : [ a+b | (a,b) <- zip fib (tail fib) ]

Holy. Crap. Too bad I had to get a master's in CS to understand what the hell is going on here.

: /
|