On Language Polyglotism

I believe being a polyglot is nothing but an advantage, and that polyglots are normally the best programmers. As a matter of fact, I challenge myself to learn a new programming language every year. Last year it was Scala; this year I've started learning and writing some Erlang (and yes, there is a pattern here: functional programming languages).

One of the reasons I like learning programming languages is a drive to reach the Holy Programming Grail: what I find most fascinating is discovering which language is better suited to which task. Some languages excel at tasks where other languages fail. An organization could decide to allow all those languages to coexist, but if we picked the best of breed in every problem domain, we would end up with a language soup. So the question is whether such a soup is good or bad. In other words, when we develop Internet applications, which languages, and how many of them, should an organization use?

There are two factors to consider when choosing a language (assuming the language is functionally the best fit for the problem at hand): talent and operations.

Talent is truly an interesting one. A startup can afford to pick more exotic languages on their merits; a large organization much less so, because of talent. A-class developers will excel in any language, and they are usually polyglots. But one can only hire so many A-class developers, and as you grow you eventually end up hiring B- and C-class developers. Your ability to maintain software at that point really depends on your C-class hiring pipeline. Pick a rare language and you are asking for trouble.

Operations is a very different matter, and here too the size of the company matters. Developer cost is the main cost at the start, but as the business becomes successful, operational cost becomes the key driver (especially in consumer Internet applications). In a way, you want to start with the languages that drive your productivity higher, and later shift gradually toward reducing your operational cost.

If you follow me, you'll notice that talent and operations are in conflict today. Java is the mainstream programming language worldwide, so A-, B- and C-class talent is widely available, yet scaling Java horizontally at linear cost is difficult. At the opposite end, Erlang is a good choice for distributed concurrency and linear scaling, yet B- and C-class talent is difficult to find.

I used to advocate the JVM as the minimum host environment while allowing language choice across the stack: as long as a language ran on the JVM and I could manage it, it was fine. One could run the complete stack on a JVM: the presentation tier in JRuby, Jython, P8 or Caucho's PHP implementation; the application logic and persistence in Java or Scala; the batch computing in Scala. But it just does not work that well. Even though Scala has the artifacts to make it work, the JVM and its runtime libraries don't. Shared memory, native threads and locks mean the JVM will be more expensive to scale and operate in the long run. The JVM is a feasible, stable and proven alternative, yet it is an expensive choice (in operational cost, and probably in opportunities lost to slower agility).
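
To make the contrast between the two concurrency styles concrete, here is a minimal sketch in Python (chosen only for brevity; it is not one of the stack languages discussed here). The function names `locked_counter` and `mailbox_counter` are hypothetical. The first has several native threads mutate shared state under a lock. The second hands the state to a single owner that receives messages through a queue, loosely imitating an Erlang process and its mailbox. It illustrates the programming model only and makes no claim about the relative cost of either.

```python
import queue
import threading


def locked_counter(workers: int, increments: int) -> int:
    """Shared-memory style: every thread mutates one counter under a lock."""
    total = 0
    lock = threading.Lock()

    def work() -> None:
        nonlocal total
        for _ in range(increments):
            with lock:  # all threads contend for this single lock
                total += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def mailbox_counter(workers: int, increments: int) -> int:
    """Message-passing style: one owner holds the state, others send messages."""
    mailbox: queue.Queue = queue.Queue()
    result: queue.Queue = queue.Queue()

    def owner() -> None:
        total = 0  # private to the owner, so no lock is needed
        while True:
            msg = mailbox.get()
            if msg is None:  # sentinel: no more messages will arrive
                break
            total += msg
        result.put(total)

    def work() -> None:
        for _ in range(increments):
            mailbox.put(1)

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()
    senders = [threading.Thread(target=work) for _ in range(workers)]
    for t in senders:
        t.start()
    for t in senders:
        t.join()
    mailbox.put(None)  # sent only after every sender has finished
    owner_thread.join()
    return result.get()
```

Both functions return the same total for the same arguments, for example `locked_counter(4, 1000)` and `mailbox_counter(4, 1000)`. The difference is where coordination lives: in the first, every worker must know about the lock; in the second, workers only know how to send a message, which is the property that makes the model easier to spread across machines.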

Given this context, my language choices would be:

Conservative stack, which gets the job done well enough, can be operated, and draws on accessible talent:
- Frontend: PHP
- Application Logic: Java
- Persistence: Java and C/C++
- Batch: Java

Medium-risk stack, which addresses some of today's shortcomings but reduces the talent pool:
- Frontend: JRuby on a JVM, with Merb
- Application Logic: Scala on a JVM, with Jersey
- Persistence (non-relational): Scala on a JVM
- Batch: Scala, with Hadoop

High-risk stack, which looks ahead to the highly multi-core CPU roadmaps announced by Intel and AMD but reduces the talent pool even further:
- Frontend: JRuby on a JVM, with Merb
- Application Logic: Erlang. Actually, I'd love to see a Scala-like language targeting the Erlang VM to make it more palatable. But for now, I'd settle for Erlang.
- Persistence (non-relational): Erlang
- Batch: Erlang