New Post at the DevBlog
http://guidewiredevelopment.wordpress.com/2008/03/18/enhancements-in-gscript/
Pretty cool feature of GScript.
Dynamic Languages are Wrong
Dynamic Languages Are Just Wrong For 99% of All Development
There has been a flurry of excitement over ruby and a
few other dynamically typed languages in the last few
years, driven mainly by rails on the server side and
javascript on the browser. Rails is a great project
and is far better for most websites than the J2EE
stack, but that unfortunately obscures the fact that
the language it is built on, ruby, while superior to
Java in many ways, just isn't the right thing for
most developers.
This is a relatively new opinion of mine. You can
see previous posts of mine where I'm very
enthusiastic about ruby. And, at some level, I'm
still enthusiastic about it. Many of the features
Ruby offers we've translated into GScript at
Guidewire and I'm forever in debt to it for that.
But that doesn't change the fact that I think it
is wrong for most developers.
In order to prove this somewhat ridiculous claim,
I'll compare what I consider the key features of ruby
and how GScript matches up in its statically typed
world.
Terseness
Ruby is incredibly terse when compared with many
statically typed languages. As a motivating example,
a simple enterprise-y method definition might look
like this (Note: I intentionally include
an assignment to a local var in order to contrast
with GScript):
    def employees_over_age( age )
      emps = @employees.find_all { |e| e.age > age }
      emps
    end
Compare that with the five to ten lines of java you
would have to write to accomplish the same task, with
all the generics and types you would have to
annotate. I can't even bear to write it all out.
But let's look at the same function defined in
GScript:
    function employeesOverAge( age : int ) : Employee[] {
      var emps = _employees.findAll( \ e -> e.age > age )
      return emps
    }
I'll admit, it is more code. But not a ton more and I
think most of the additional code is pretty
reasonable: you have to annotate in and out types at
the method level, which is good because you can
restrict your implementation details from leaking out
of the method. You have to put an explicit return
statement in the code which I actually find more
readable. And you have a slightly more verbose but
also more consistent syntax for blocks.
So ruby wins in terseness but GScript gets pretty
darned close even though it is statically typed.
Open Classes
Ruby elegantly (or hackily, according to taste) solves
another common problem: what if someone hasn't
designed a class to your liking, omitting an obvious
method or two, and you want to add this functionality
to it? In java, this has led to a proliferation of
*Util classes: StringUtil, ObjectUtil, DateUtil,
FileUtil, etc. There are thousands of these util
classes filled with static methods floating around
java code bases. Some code bases are so large
(*cough* Guidewire *cough*) that there are many
different versions of these utility classes, often
with subtly different names.
In ruby, you can simply add a method to a class like so:
    class String
      def my_method()
        puts( "Holy Crap!!! I've added a method to Strings!" )
      end
    end
These are
referred to as "Open Classes." Pretty neat, eh? Well,
it's so neat that we decided we needed something like
that in GScript, so we added something called
Enhancements. Here is an equivalent Enhancement:
    enhancement MyStrEnhancement : String {
      function myMethod() {
        print( "Holy Crap!!! I've added a method to Strings!" )
      }
    }
Not bad, eh? And because GScript is statically typed
and because we provide an IDE for it, you will get
very nice code completion when you hit '.' after a
string object, with your shiny new method available
and quite discoverable.
So I think GScript actually wins here by a hair
because it formalizes the class extension mechanism
in a new language construct, but both are just about
equivalent. The point is that you don't need dynamic
typing for this very useful
feature.
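To make the Ruby half of that comparison concrete, here is a minimal sketch of an open class standing in for a *Util helper. The names StringUtil and shout are hypothetical, made up purely for illustration:

```ruby
# Hypothetical *Util-style helper: callers have to know this module exists.
module StringUtil
  def self.shout(str)
    str.upcase + "!"
  end
end

# Open-class version: reopen String and add the (hypothetical) method directly.
class String
  def shout
    upcase + "!"
  end
end

puts StringUtil.shout("holy crap")  # util style
puts "holy crap".shout              # open-class style, same result
```

Both calls return the same string; the difference is only where you have to go looking for the method.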
MetaProgramming
The really killer aspect of ruby and what lets rails
clean J2EE's clock in terms of ease-of-development is
the metaprogramming ability you have available. This
allows you to dynamically generate classes on the fly
based on, well, whatever you damned well please. This
is how ActiveRecord builds classes based on your
database schema, with no code-gen phase in the middle
to gunk up the works. You change the schema
and, *bam*, your class is updated.
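As a rough sketch of the idea (not how ActiveRecord is actually implemented), the following builds classes at runtime from a hash that stands in for a database schema. The schema contents and the class names Employee and Office are assumptions for the example:

```ruby
# A stand-in for schema metadata; a real ORM would read this from the database.
schema = {
  "Employee" => [:name, :age],
  "Office"   => [:city]
}

# Build one class per "table" at runtime, with one accessor per "column".
schema.each do |class_name, columns|
  klass = Class.new do
    attr_accessor(*columns)

    # Accept a hash of column values and assign each through its setter.
    define_method(:initialize) do |attrs = {}|
      attrs.each { |column, value| public_send("#{column}=", value) }
    end
  end
  Object.const_set(class_name, klass)
end

emp = Employee.new(name: "Pat", age: 42)
puts "#{emp.name} is #{emp.age}"
```

Add a column to the hash and the class picks up the new accessor the next time the code runs, with no generated source files in between.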
It's hard to contrast this with GScript's alternative
in a succinct, blog-friendly way, but I'll try.
GScript has an "open" typesystem allowing anyone to
implement a TypeLoader and custom Types in java
(which underlies GScript). That TypeLoader can
construct its types based on whatever metadata it
wants to, just like in ruby. At Guidewire we use this
feature to build type systems on top of our web-UI
files, on top of our internationalization properties
files and on top of our OR layer, to name just a few.
This allows GScript code to access these resources in
a typesafe way but without any sort of a code-gen
step.
GScript's Type System, therefore, is very flexible in
much the same way that Ruby's is: developers can
implement their APIs in terms of types that they
create dynamically rather than statically, based on
whatever metadata they like.
An Anti-Feature: DSLs
A lot of developers are excited about DSL's in ruby
that take advantage of the flexible nature of the
ruby language. I'm not sure they are such a good
idea. I think developers, on balance, would prefer to
program in one sufficiently powerful language. I
would imagine this is especially true in the
enterprise space. I also think language design is
pretty difficult and putting a bunch of people to
work churning out specialized languages wouldn't turn
out as well as we might hope. I think there may even
be a biblical story about that sort of thing.
Rather than domain specific languages, I think there
should be domain specific type systems: as I
mentioned above we have type systems for our web
layer, our OR layer, our permissions layer, etc. and
it all works out grand. You have access to all these
resources in a single language, GScript, presented in
(one hopes) a nice API shaped by the dynamically
generated types of the particular TypeLoader.
No new syntax to learn, just more libraries. Nice.
So Why Are Dynamic Languages Wrong?
Really it boils down to two reasons: tools and static
verification. The first reason is far more important
than the second one.
Being able to hit '.' and see what the hell you can
do with an object is priceless, particularly on
larger projects. I know some people say "don't get
involved in larger projects" but, well, sometimes it
happens. Refactoring tools (yeah, yeah, Smalltalk, blah
blah blah) are far easier to implement correctly with
statically typed languages than dynamically typed
languages. And total-program analysis tools become
possible. If the syntactic and expressive price is
low enough (and in GScript, it is) then there is no
reason to give up all this functionality for a
dynamic language.
Static verification has gotten a bit of a bad name
lately and we often joke at Guidewire that "well, it
compiled, it must be right." Still, when you are
making big changes and you have tens of thousands of
tests to run, it is really nice to have something
relatively fast (a compiler) point out things you
have obviously missed at compilation time rather than
waiting to run a series of test suites (even on our
distributed testing cluster, it often takes up to an
hour to hear back about every test after a checkin.)
But really, I could have stopped at '.'
Good code completion pretty much QED's the argument
in my book.
GScript...
Check it out.
GScript rocks.
Java and XML: Let's not use them together
- Java isn't flexible enough, both syntactically and with respect to its type system. (Let's leave aside the lack of a reasonable lambda-style syntax for the moment, which is higher on my List of Things That Make Me Cuss When I'm Programming In Java, but is not as relevant for this article.)
- XML is, by design, horrifically redundant. (This is almost acceptable for what it was intended to be: a non-human readable data format that didn't have the negative connotations associated with s-expressions. It is totally unacceptable now that people are forced to look at it all day long.)
What I'd like to look at in this post is why I think XML has become such a huge part of java development and why I think that is unfortunate. The short answer to the first part is:
People need Domain Specific Languages (DSLs)
Why do people need DSL's? Because there are whole
chunks of applications that don't need a full,
general programming language, and for which a full,
general programming language is poorly suited. O/R
mapping is a good and common enough example. Build
tools are another: you want a syntax that
encapsulates the common operations so you don't end
up generating reams of general code to do basic
activities.
As noted in
this paper,
there is a continuum between libraries and DSLs. So,
why have DSLs at all? Why not simply design
libraries?
The answer to that question is: syntax. As much as
academics might scoff at it, syntax matters, and it
matters a lot.
This is precisely why ruby is enjoying so much
success right now: ruby's syntax and evaluation rules
are so flexible that they allow you to create minimal,
expressive DSL's with very little effort. You simply
have to get your head around how meta-programming in
ruby works, and you are off to the races. Ruby on
rails is a DSL for building web applications, and a
pretty darned good one.
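To show how little it takes, here is a minimal sketch of a build-tool-flavored DSL using instance_eval. BuildFile, task, and depends_on are hypothetical names invented for this example and do not come from any real tool:

```ruby
# Hypothetical mini build DSL; none of these names come from a real tool.
class BuildFile
  attr_reader :tasks

  def initialize
    @tasks = {}
  end

  # `task :name, depends_on: [...] do ... end` records a task definition.
  def task(name, depends_on: [], &action)
    @tasks[name] = { deps: depends_on, action: action }
  end

  # Run a task's dependencies first, then the task itself.
  # Returns the list of task names in the order they ran.
  def run(name, done = [])
    return done if done.include?(name)
    entry = @tasks.fetch(name)
    entry[:deps].each { |dep| run(dep, done) }
    entry[:action]&.call
    done << name
  end

  # Evaluate the block with self set to a new BuildFile, so bare `task`
  # calls inside the block read like declarations in a dedicated language.
  def self.define(&block)
    new.tap { |build| build.instance_eval(&block) }
  end
end

build = BuildFile.define do
  task(:compile) { puts "compiling" }
  task(:test, depends_on: [:compile]) { puts "testing" }
  task :package, depends_on: [:test] do
    puts "packaging"
  end
end

build.run(:package)
```

The block passed to BuildFile.define is ordinary Ruby, but it reads like a declarative build file, which is exactly the syntactic wiggle room java doesn't give you.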
Java is, of course, much more locked down than ruby.
And this isn't necessarily a bad thing. The java
designers were coming from a world replete with
horrible C macro-kludges, so it's understandable that
they decided to leave out syntactic extensions. If
every man is a language designer, you end up with a
ton of badly designed languages. But you also end up
with a few very well designed ones. And I'm not
entirely convinced that the vast majority of useful,
small DSL's aren't simply badly designed languages
that answer a particular specific need, akin to
German's relationship with soldiering.
In any event, that's wandering a bit off point. The
facts on the ground today are that java developers
have been in need of a way to design and implement
DSL's for a while now (even when they don't call it
that) and the accepted way to do it has become XML.
Why?
My theory is this: in java, DSL's usually start out
as a library, then progress to a library with a
smidgen of configuration. XML became the de facto
standard for config files during the .com boom,
property files apparently not being cool enough, so
config information ended up in XML files.
Additionally, XSDs give us rudimentary tools for verifying
a language's syntax (though not its semantics).
All fine and well. I might pick another syntax for
structured configuration (say, YAML), but whatever.
XML is reasonably suited for simple
declarative programming.
But then we java developers started doing more and
more in those config files and, at some point, they
began to cross over that invisible line and become
semantically crucial parts of our applications.
They no longer simply contained a few flags used to
slightly modify runtime behavior. They became an
XML-based programming language for crucial
subsystems.
This is unfortunate, for many reasons. Among them:
- We have traded java, a language that, while certainly not beautiful, is at least plausible, for one that was never designed for human consumption. XML is utterly miserable to use in large quantities. See ant build files, and weep.
- We now have to think in two different syntaxes. I maintain that this is a difficult transition for a significant portion of the programming populace.
- We cannot have any sort of locality with related java code. Again, the syntax is so utterly foreign that it is like mixing Japanese and English. Even if we could put it in the same file, or add IDE support to navigate from one to the other, it wouldn't work well.
- And, most interestingly to me, it becomes difficult to communicate whatever type information we have built into our DSL to Java. We have two choices I can see:
So, that outlines why I think we ended up with so much XML in our java applications, and why I view that as an unfortunate thing. Now the hard part: what can be done about it?
Frankly, I have no idea.
My first reaction is that we need to open up java with a type-safe macro language to allow for syntactic extensions. But as nonchalant as ruby has made me about language extensions, it still seems insane in java. The macro (meta?) language needs the ability to communicate with the java type system easily, making it easy to generate coherent error messages.
I realize, of course, that I may simply be saying something as absurd as "let's make hard problems easy," but I have to believe that there is a better way than the current state of things.
I'm going to spend some quality time with OCaml/Camlp4 over the next month and see if I get anything out of it.
Related Links:
- Wikipedia link on DSLs
- Camlp4 - OCaml's DSL support framework
All Ordered Combos
OK, OK, OK, last one, I promise
Nowhere near as elegant as the inject method below, but this covers both assignment and block usages, so you can pick your poison. If you pass in a block then you have linear rather than exponential memory usage, although the run time is of course equivalent.
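A minimal sketch of that trade-off, assuming a bitmask-based approach rather than whatever the method in question actually did. The name power_set matches these posts, but the body is an assumption:

```ruby
# Sketch of a power set that either yields subsets or collects them.
# Each integer from 0 to 2**n - 1 is read as a bitmask selecting elements.
def power_set(items)
  # Assignment usage: no block given, so collect every subset into an array.
  unless block_given?
    all = []
    power_set(items) { |subset| all << subset }
    return all
  end

  # Block usage: only one subset is held in memory at a time.
  (0...(2**items.size)).each do |mask|
    indexes = items.each_index.select { |i| mask[i] == 1 }
    yield indexes.map { |i| items[i] }
  end
end

subsets = power_set([1, 2, 3])                    # assignment usage
power_set(%w[a b]) { |subset| puts subset.inspect } # block usage
```

Either way the loop runs 2**n times; the block form just avoids holding all 2**n subsets at once.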
OK, I'm done. This is my final answer.
Wait... Maybe we should add an optional argument with a default value that limits the output, to prevent inadvertently calling it with an array that will take forever to return...
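One way such a guard could look, as a sketch only: the limit parameter, its default of 1024, and the doubling strategy are all assumptions made for this example.

```ruby
# Hypothetical guard: stop once `limit` subsets exist (default is arbitrary).
def power_set(items, limit = 1024)
  subsets = [[]]
  items.each do |item|
    # Every existing subset spawns a copy that also contains `item`.
    subsets += subsets.map { |subset| subset + [item] }
    # Bail out early instead of doubling all the way to 2**n subsets.
    return subsets.first(limit) if subsets.size >= limit
  end
  subsets
end

p power_set([1, 2, 3]).size
p power_set((1..40).to_a, 5).size
```

The second call returns after three doublings instead of trying to build 2**40 subsets.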
You know what this blog needs? More power_set().
A witty exchange of emails followed (the sort of thing that makes you love working at an engineering oriented company), and Jim made the point that all the offered implementations were horribly memory inefficient. He offered an iterator-based solution in java.
Well, I for one am not going to stand here and let our favorite little programming language have its name dragged through the mud. So here was the iterator-based solution I came up with:
power_set()
Dynamic Languages
Despite how much I love Ruby, I'm still skeptical of how well it will perform in a large system. I just don't have experience with large, dynamically typed systems, and a lot of older engineers I respect shudder at the idea. Maybe the prevalence of unit-testing will change this (Martin Fowler seems to think so.) I guess we will have to wait and see how the Rails projects turn out to provide evidence one way or the other.
Testing and Change
* Test-first development is hard when a GUI layer is involved
* End-to-end tests are not worth the effort until you have a 1.0 product. And perhaps they aren't even worth the effort until you have a 2.0 product, where certain application paths have been established and need to be maintained.
* Unit testing can get really nasty when there are elaborate dependencies between classes
  * It is very hard to keep dependencies low. It requires effort at every step. If you aren't constantly watching it, you will introduce them.
* A flexible sample data and configuration generation platform is crucial for a good test environment
* If tests aren't easy to write, they won't get written or they will be written poorly
Why *their* programming language is cooler than *my* programming language
    -- Good ol' quicksort
    quicksort [] = []
    quicksort (x:xs) = quicksort [ y | y <- xs, y < x ]
                       ++ [x]
                       ++ quicksort [ y | y <- xs, y >= x ]

    -- The list of the Fibonacci numbers
    fib = 1 : 1 : [ a+b | (a,b) <- zip fib (tail fib) ]
Holy. Crap. Too bad I had to get a masters in CS to understand what the hell is going on here.
: /
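For anyone without the masters degree, here is roughly what those two definitions are doing, sketched in Ruby. This is a loose translation, not an exact equivalent: Ruby has no lazy lists, so the Fibonacci stream is approximated with an Enumerator.

```ruby
# Quicksort: the first element is the pivot; the rest are split into
# elements smaller than the pivot and elements greater than or equal to it.
def quicksort(list)
  return [] if list.empty?

  pivot, *rest = list
  smaller, larger_or_equal = rest.partition { |y| y < pivot }
  quicksort(smaller) + [pivot] + quicksort(larger_or_equal)
end

# Fibonacci as an endless stream: Haskell's laziness lets `fib` refer to
# itself; here an Enumerator produces values only when they are requested.
fib = Enumerator.new do |out|
  a, b = 1, 1
  loop do
    out << a
    a, b = b, a + b
  end
end

p quicksort([3, 1, 4, 1, 5, 9, 2, 6])  # => [1, 1, 2, 3, 4, 5, 6, 9]
p fib.take(10)                         # => [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

The Haskell version says the same thing in a third of the space, which is rather the point.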