This post is adapted from a talk I gave at the Cygni tech summit.
I have always held that programming is done in an abstract realm and only later translated into a programming language, and that it should involve much more thinking than actual writing.
I mostly still stand by that but my years as a readability reviewer at Google raised my awareness of the fact that it takes time and practice to use a language well. Many were the hopeful C++ programmers who wrote fully functional and reasonably object-oriented code in Java but with a heavy C++ accent that couldn't quite meet the bar of elegance in Java (even if a C++ programmer might think it mostly elegant apart from the "flaws" of Java).
So how much does the choice of programming language matter? Is there or could there be a language that could be considered perfect?
What makes a good language?
Academia seems pretty quiet on the subject of what makes a good language, but Tony Hoare delivered an interesting keynote called "Hints on Programming Language Design" in 1973.
If you don't know who Tony Hoare is, you might remember him as the person who apologized in public a few years ago for having invented NULL. He had intended to avoid it, but it was just so easy to implement that he couldn't resist. He also invented a lot of good things, like switch statements and the idea of Communicating Sequential Processes, which is the concurrency paradigm used in Go and Ada. Most of his work was spent in the pursuit of error-free programming, but he wryly concluded that it was wasted because the industry simply isn't interested.
The Hints on PL design start out proposing that a good language should support the programmer in the work that needs to be done, mainly design, documentation and debugging. The programming language design should therefore guide the programmer into breaking up the program in manageable parts, help the programmer make the code mostly self-explanatory, help avoid errors, but when errors do enter the code, it should not be unduly difficult to find them and fix them. I think that sounds like a good approach, although I would also like to add that programming should also be fun to some extent.
Turning the basic proposals around a bit, Tony Hoare then comes up with some properties that a good programming language should have:
- The foremost is simplicity: a language should be simple enough that the programmer can know everything about it. There seems to be some evidence to support this idea. Smalltalk is simple enough to describe completely on a postcard and is maybe the most productive general programming language. A blog I found once concluded that the perfect programming language is the one you already know, which raises the question of whether it is better to stick with a "worse" language that you know well than to switch to a "better" language. Well, first of all, it seems that even though languages have statistically significant measurable differences, the cost of switching always seems to take much longer to pay back than we like to imagine. It can also be very costly not to know everything about a language. But these are boring answers and in this post we seek perfection.
- Security is important: the programs produced should run in a predictable way and preferably error-free. It is especially bad when programming errors simply give a wrong answer, as when, in assembler, you perform a floating-point operation on a fixed-point number. It would also be desirable to attach units to numbers, but as far as I know very few languages allow this easily.
- Fast compilation is necessary to iterate on a solution. Anyone who has experienced a tight TDD-cycle with sub-second test runs knows that the importance of this property cannot be overstated. Flutter's hot reload or REPL-like exploratory programming really does take productivity to another level.
- The code should run fast and efficiently. No matter how fast processing power increases, there are still bigger problems we want to solve. Tony Hoare proposes that a good language should mirror how a computer actually works and that the optimization step should emit new source code. This may not be entirely practical, but I think the least we can ask is that it should be possible to know the performance characteristics of the code constructs available.
- Readability: the ability to understand what someone else's code does, or even your own code three months down the line. Obviously hugely important if your code is to live and evolve for any amount of time when code is read orders of magnitude more often than it is written.
The market can stay irrational longer than you can stay solvent.
This is a quote by the economist John Maynard Keynes. It serves to illustrate the idea that we don't always recognize the best product and it is entirely possible that the perfect language has already been created but we haven't adopted it yet.
First candidate language
The above language was very hyped in the 1980s and the designer, Jean Ichbiah, thought that within ten years there would be only two programming languages in the world, one was Lisp and the other was this language, Ada. It wasn't an entirely unreasonable thought. Ada was the result of a very solid gathering of requirements and a design competition with a large number of teams, since computer languages were big business at that time. Tony Hoare, together with Niklaus Wirth, was actually a consultant on another semi-finalist team and he was trying to get their submission simplified, although the winning team had an even more complex language. What Ichbiah hadn't counted on was that first Ada was so complex that even the compilers had performance problems, so there was a bit of a delay getting it out, and then, second, C and Unix happened.
But Ada was an initiative by the US Department of Defense, who wanted to create one language to replace the 160 or so they had, and they are not easily deterred, so in 1991 Ada became compulsory for NATO systems (although exceptions were often granted). It turns out that Ada was actually a well-designed language and besides military systems it has proven itself in space missions, air traffic control and French high-speed trains. When human lives are at stake, Ada should be the go-to choice. For every 100 errors in a C program, there will be about 50 in the corresponding Java program and only 4 in the Ada version. There is even a dialect of Ada, SPARK, where you can formally prove the correctness of your program. Productivity-wise Ada is about the same as Java. So, apart from the complexity, Ada seems to pretty much tick all the boxes. Even the complexity turns out to be OK because Ada has a consistent design, so you can pretty much get the idea of what the code does even when you don't know the exact details of the particular construct, and there is excellent documentation, including motivations for why each feature exists. Those who write code in Ada also seem to enjoy it.
Ada has continued to develop and in the 2012 spec it gained contracts (à la Eiffel) and better support for concurrency on multiprocessor systems (tasking itself has been part of Ada since the first standard). Unfortunately it seems unlikely that we will have an Ada revival, unless perhaps the almost error-free code turns out to be desirable for small devices that are difficult to update code in.
Second candidate language
Above we see six different ways of implementing the factorial function in our second candidate for language perfection. Every year there seems to be at least one blogger who asks if this might finally be the year when the world comes to its senses and Haskell finally takes off. Search results for Haskell seem to unanimously praise the language. Are the rest of us programmers simply being irrational?
Haskell is, of course, a very interesting language, but it is very abstract, for better and for worse. Do you really need six different ways to implement factorial? I would like to suggest that perhaps Haskell is mainly interesting in the same way that abstract algebra is interesting in mathematics. There are people who do use Haskell in production, so I've looked at what they say as well:
- It is difficult to create binary-compatible libs, which means you basically want to compile everything from scratch, which can take hours on a big project.
- It is difficult to predict performance, a small refactoring can lead to the code running orders of magnitude slower and even experienced programmers have difficulty diagnosing it.
- You need to monitor your memory usage because it can suddenly explode on some inputs.
- Even though it is claimed that when Haskell code compiles it tends to be correct, this seems not to be as true as the proponents would like to believe.
- Everything is side-effect free by default, so if you ever do need to add a side effect deep inside your code, you have to rewrite all the layers above it (so no debug-print-statements, for example)
- Haskell suffers from a version of "The mysterious tuple problem" in that although types are very strictly defined as to what types they are composed of, the components can be named differently in every function.
Those who use Haskell claim to be very happy about it, but one production user said they preferred to use Python when they coded for their own private use.
Apart from the type safety, Haskell really doesn't seem to tick any of our boxes, so we probably haven't missed out on perfection after all. Onward!
What features would I want in a language?
To try to begin to answer that, I've taken a look at languages I've used and tried to nominate features that I appreciated and would consider using if I designed a language.
Cobol was originally a stop-gap solution for the US Department of Defense to manage the administrative side of things. Like all temporary solutions it is still going strong 60 years later, mainly in the banking and government sectors. It has developed steadily and the latest spec is from 2014. Cobol is very verbose as you can see from the for-loop. The idea was that the code should be readable by non-programming business experts. While I appreciate that goal, I won't be emulating the verbosity. The record definition, on the other hand, is awesome! You simply specify what fields you have and what pattern they are written in and the computer takes care of all the reading and writing for you. This is an example of a declarative syntax and I especially appreciate that there is a bit of a visual component to the patterns as well.
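To give a feel for the record idea without reproducing Cobol syntax, here is a minimal Python sketch. The layout, the field names and the "X"/"9" kinds (loosely echoing Cobol's alphanumeric and numeric pictures) are all assumptions made up for illustration. The point is only that the layout is declared once as data and both reading and writing are derived from it.

```python
# Hypothetical fixed-width layout: (field name, width, kind).
# "X" means text padded with spaces, "9" means a zero-padded integer.
LAYOUT = [("account", 8, "X"), ("name", 12, "X"), ("balance", 7, "9")]


def parse_record(line):
    """Slice one fixed-width line into a dict, driven only by LAYOUT."""
    record, pos = {}, 0
    for name, width, kind in LAYOUT:
        raw = line[pos:pos + width]
        record[name] = int(raw) if kind == "9" else raw.rstrip()
        pos += width
    return record


def format_record(record):
    """Write a dict back out in the same fixed-width layout."""
    parts = []
    for name, width, kind in LAYOUT:
        value = record[name]
        if kind == "9":
            parts.append(f"{value:0{width}d}")   # zero-padded number
        else:
            parts.append(f"{value:<{width}}")    # left-aligned text
    return "".join(parts)
```

Changing the file format then means editing `LAYOUT` and nothing else, which is roughly the appeal of the declarative record definition.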
We programmers might laugh a little at FORTRAN as being old-fashioned, but as soon as you enter a physics institution anywhere in the world, FORTRAN likely reigns supreme. It turns out that FORTRAN is a very good match for the way physicists think about their work and it also tends to turn out the most efficient executables. One reason for the efficiency is the lack of pointers, which allows compilers to do more aggressive cache optimizations. FORTRAN has of course also developed over the years and the latest spec is from 2018.
Another language used a fair amount in physics departments is Python, which unfortunately is very slow. To improve matters, an effort started in 2009 has come up with the programming language Julia, which aims to be as easy as Python and as fast as FORTRAN. By and large it succeeds well in that and it also turns out to be a really pleasant language to use.
What I would take from FORTRAN is the array/vector capabilities. Indexing starts at 1 by default (Yay!) but can be defined to start anywhere (sometimes very handy). In the code above we see vector b with indices from -5 to 5, and vector c is a 5-element vector where the elements are the sum of the corresponding elements from the slices of the a and b vectors.
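For readers who don't know FORTRAN, the index behaviour described here can be approximated in Python. This is only a sketch: the `Vector` class, its `section` method and the sample values are hypothetical, and real FORTRAN does all of this with built-in array syntax.

```python
class Vector:
    """A 1-D array whose first index is `lower` (1 unless stated otherwise)."""

    def __init__(self, values, lower=1):
        self.values = list(values)
        self.lower = lower

    def __getitem__(self, index):
        offset = index - self.lower
        if not 0 <= offset < len(self.values):
            raise IndexError(index)
        return self.values[offset]

    def section(self, first, last):
        """Inclusive slice from index `first` to index `last`."""
        if first < self.lower or last - self.lower >= len(self.values):
            raise IndexError((first, last))
        return self.values[first - self.lower:last - self.lower + 1]


a = Vector(range(10, 110, 10))        # indices 1..10, values 10, 20, ..., 100
b = Vector(range(-5, 6), lower=-5)    # indices -5..5, here b[i] == i
# c is a 5-element vector: element-wise sum of a slice of a and a slice of b.
c = Vector([x + y for x, y in zip(a.section(1, 5), b.section(-2, 2))])
```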
I have used a number of other languages and dabbled in even more, so I'll quickly run through a few here, perhaps some of them had some inspirational effects:
- BASIC was available on the Apple II and is pretty much a simplified FORTRAN, pre-vectors.
- Pascal came a little after that. The poster-child of structured programming, it was rather pleasant to use. I remember that the REPEAT..UNTIL structure was often more logical than a do..while. The record structures and ability to restrict numeric values to ranges are other useful features.
- Forth was fun to play with a bit, but it didn't really take. What I remember most is that pretty much nothing was defined up front and you kind of defined your own language in a way.
- PostScript was used on the Irix windowing system, NeWS, and I used it a fair amount to try and customize that, mostly just for fun. I still think it is fun. The PostScript stack turns out to be really useful in some algorithms and writing code is a little like doing a puzzle. It has postfix notation to manipulate the top elements on the stack, so "2 5 3 add mul" would become 16.
- Tcl was in part a reaction to the illogical aberrations of csh, so it is designed to have a very logical and uniform syntax, which is a good thing to have. It can be used as a shell with tclsh, but I think it is more common to use it with the awesome windowing toolkit Tk in wish. I still carry around a lightweight but fairly well-featured editor written in 427 spacious lines of wish code. In 1998 I used the tclplugin for the browser to create an SPA.
- Scheme I learned because everyone should try a version of Lisp at some point. It was interesting but I didn't really have any reason to use it for anything substantial. The parentheses were too many and I didn't have any great epiphany. Now I'm exploring Shen which has some really nice syntax features, embedded Prolog and an optional sequent-logic based type system.
- At one point when I was struggling to process some text with AWK, a colleague suggested I try Perl and the first example in the book did the trick, which started a brief love affair. It fairly soon wore thin when I tried to understand some of the programs I had written previously. In Perl, anything is a one-liner if you know the magic operator. So Perl will serve as a reminder to NOT invent a lot of operators for various problems.
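The PostScript arithmetic in the list above can be traced with a tiny stack evaluator. This is a sketch in Python, not PostScript: it only understands integers and three operators, and the function name `eval_postfix` is made up for the illustration.

```python
def eval_postfix(program):
    """Evaluate whitespace-separated postfix tokens; return the final stack."""
    operators = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mul": lambda x, y: x * y,
    }
    stack = []
    for token in program.split():
        if token in operators:
            y = stack.pop()            # top of stack is the second operand
            x = stack.pop()
            stack.append(operators[token](x, y))
        else:
            stack.append(int(token))   # operands are simply pushed
    return stack
```

For "2 5 3 add mul" the stack goes 2, then 2 5, then 2 5 3; `add` replaces 5 and 3 with 8, and `mul` replaces 2 and 8 with 16.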
C - worse is better
As I previously mentioned, programming languages used to be big business. Someone would create a computer, design an operating system for it and then compilers for languages were optional add-ons. But then Unix came along and all you had to do after building your computer was to create a fairly simple C-compiler and get the Unix sources at copy-cost and you were done. Since you have the C-compiler, you might as well throw it in for free. Thereby C became the world's most successful computer virus.
People will go to great lengths to convince themselves that there are free lunches, but just because Unix was written in C it doesn't imply it would be good to write your applications in C. Compared to FORTRAN, Pascal and other languages available at the time, C was a fantastic tool if you wanted to shoot yourself in the foot.
On the up-side, C was the closest thing to a universal language we were ever likely to get and in retrospect it seems difficult to imagine programming languages, operating systems and lots of software being available free of charge without C having happened.
I've mostly managed to avoid C++ which I am rather grateful for. It is loads of lipstick on the pig and layers upon layers of unnecessary complexity. Interestingly, when Go was created explicitly to replace C++, it turned out that the programmers who love C++ get high on the complexity that makes them feel like masters of the universe and they will never change to any other language except possibly Haskell.
DSL - parser/generator
At the end of the 1990s there was a bit of a move to create domain-specific languages and for that it was very helpful to have a parser/generator. I have used yacc/lex (and the GNU equivalents bison/flex) as well as JavaCC. I wasn't too thrilled with previous versions of ANTLR but the latest ANTLR4 is crazy good; it just handles the way you write the rules. The big idea is of course to describe the grammar for a syntax in a descriptive declarative way, so a syntax for describing something similar to a JSON object could look like this:
At some point I expect that it will start to wear thin to code everything as XML, YAML or JSON just to get the parsing for free and there will be a need to create more expressive syntaxes, so I definitely think a language should contain a parser/generator of sorts.
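To make concrete what a parser/generator saves you from, here is a hand-written Python sketch of a parser for a small JSON-like object syntax. The toy grammar in the comment is my own assumption for this illustration (it is not ANTLR syntax and not the grammar from the talk); note how three lines of grammar turn into a page of code.

```python
import re

# Hypothetical toy grammar, in an EBNF-like notation:
#   object : '{' (pair (',' pair)*)? '}'
#   pair   : NAME ':' value
#   value  : NUMBER | STRING | object
TOKEN = re.compile(
    r'(?P<NUMBER>\d+)|"(?P<STRING>[^"]*)"|(?P<NAME>[A-Za-z_]\w*)|(?P<PUNCT>\S)')


def tokenize(text):
    """Turn text into (kind, text) pairs, ending with an END marker."""
    tokens = [(m.lastgroup, m.group(m.lastgroup)) for m in TOKEN.finditer(text)]
    return tokens + [("END", "")]


def expect(tokens, pos, symbol):
    """Check for one punctuation token and step past it."""
    if tokens[pos] != ("PUNCT", symbol):
        raise SyntaxError(f"expected {symbol!r} at token {pos}")
    return pos + 1


def parse_value(tokens, pos):
    kind, text = tokens[pos]
    if kind == "NUMBER":
        return int(text), pos + 1
    if kind == "STRING":
        return text, pos + 1
    if (kind, text) == ("PUNCT", "{"):
        return parse_object(tokens, pos)
    raise SyntaxError(f"unexpected {text!r} at token {pos}")


def parse_object(tokens, pos):
    """Parse one object; return (dict, position after the closing brace)."""
    pos = expect(tokens, pos, "{")
    result = {}
    while tokens[pos] != ("PUNCT", "}"):
        if result:                      # every pair after the first needs a comma
            pos = expect(tokens, pos, ",")
        kind, key = tokens[pos]
        if kind != "NAME":
            raise SyntaxError(f"expected a name at token {pos}")
        pos = expect(tokens, pos + 1, ":")
        result[key], pos = parse_value(tokens, pos)
    return result, pos + 1


def parse(text):
    tokens = tokenize(text)
    result, pos = parse_object(tokens, 0)
    if tokens[pos][0] != "END":
        raise SyntaxError("trailing input after object")
    return result
```

With this in place, `parse('{ name: "Ada", year: 1983 }')` yields an ordinary dict. Every change to the syntax means touching this code by hand, which is exactly the work a built-in parser/generator would take over.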
SQL - a necessary evil?
I used to have a copy of "SQL for Dummies" at hand whenever I, with dread in my heart, needed to write some SQL, especially if there was any funky stuff like joins and such going on. But you can do pretty powerful things with SQL; just consider trying to do the same thing with an "ordinary" programming language. I got inspired by a colleague at Cygni who sometimes uses SQL for application code and the above code is from Advent of Code 2018 where I started out using SQL for the problems. I gave up after a while because SQL is bad at iterating, especially the MariaDB version I was using, but I also got an appreciation of the strengths of SQL.
There are some criticisms of SQL, most notably "The Third Manifesto", which describes a family of true/good database languages called "D", languages that also extend to general programming. One version of a "D" is "Tutorial D" that was developed for teaching purposes and is being implemented in reldb.
I think it may be a good idea to have some relational concepts or data structures in the language, even doing something like LINQ in C#, but the thing I will first take from SQL is the treatment of NULLs as being mostly non-existent values.
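What "mostly non-existent" means in practice can be seen with a few aggregates. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the `reading` table and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (sensor TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?)",
    [("a", 10), ("b", None), ("c", 20)],   # Python None is stored as SQL NULL
)

# Aggregates over a column skip NULLs; only COUNT(*) counts the row itself.
total, counted, rows, average = conn.execute(
    "SELECT SUM(value), COUNT(value), COUNT(*), AVG(value) FROM reading"
).fetchone()

# A comparison with NULL is never true, so the NULL row is filtered out too.
large = [sensor for (sensor,) in conn.execute(
    "SELECT sensor FROM reading WHERE value > 5 ORDER BY sensor")]
```

Here `total` is 30 and `average` is 15.0, computed over the two rows that actually have a value, with no null check anywhere in the query.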
I really love Java. It is 30-200% more productive than C with half as many errors. If we look at the criteria we set out in the beginning from the "Hints on PL design", I think it pretty much fills them.
It is difficult to pick specific features because I think it is the combined package that does it and I think a lot of the things necessary for success are the very things many people like to complain about. If I were to pick one feature, I think it is great that my sportscar in the example above will have a top speed of 350 even when I just call it a car. That's very logical and always true for Java and Smalltalk but not for some other languages. Another thing to think about is the package structure.
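The sports car point is about dynamic dispatch: the method that runs is chosen by what the object actually is, not by what the calling code calls it. A minimal sketch follows, in Python rather than Java (Python, like Java, always dispatches this way); the class names and the 180 figure are assumptions for illustration, and only the 350 comes from the text.

```python
class Car:
    def top_speed(self):
        return 180        # made-up default for an ordinary car


class SportsCar(Car):
    def top_speed(self):
        return 350        # the overriding definition


def describe(car: Car) -> str:
    """This code only knows it has 'a car', yet the override is used."""
    return f"top speed {car.top_speed()}"
```

Calling `describe(SportsCar())` reports 350. In a language where methods are statically bound unless marked otherwise, the same call through a base-class reference could report the base-class value instead.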
One thing I think Java gets wrong is that it doesn't have a record or structure type for simple data objects.
When I've done this talk, someone always asks me why I didn't mention C#. The simple reason is that I haven't used C# enough to understand how it is different from Java (except in all the ways it grates on me). Putting my preferences aside, they seem to have very similar productivity profiles and I know no strong enough objective reason to prefer one over the other (except, perhaps, for the platform issue, since nobody uses .NET Core anyway). Visual Basic, on the other hand, seems to be 30% more productive. Go figure!
XSLT 💕 💡
XSLT is my favourite language of all time and it made all sorts of lights go on in my head. This is what sparked my analysis of programming languages, I wanted to program generally in an XSLT-like style, whatever that really means, and more. Of course, all data should really be in XML at all times, but that got killed with the whole XMLSchema nonsense and whatever else was done to replace freedom with vendor products.
Back to XSLT, look at the power of that code. Actually, there really isn't much code at all. The match statement tersely states that whenever we encounter a "rule" element that is the child of a "section" element that has an attribute with the value "top", we should get a result from this template. Now how much code would you normally write to determine something like that? And there is almost no code to produce the result either, it is just written literally right there.
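To get a feel for how much that match pattern expresses, here is roughly the same selection done from Python with the standard `xml.etree.ElementTree` module. The sample document and the attribute name `level` are assumptions (the text only says the section has some attribute with the value "top"), and this only shows the selection part, not the template output.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<manual>
  <section level="top"><rule>Keep it simple</rule><note>ignored</note></section>
  <section level="sub"><rule>Not matched</rule></section>
</manual>
""")

# Path expression: any <rule> that is a direct child of a <section>
# whose (assumed) level attribute has the value "top".
matched = [rule.text for rule in doc.findall(".//section[@level='top']/rule")]
```

Even with a path expression to lean on, the calling code still has to decide when to run the query and what to do with the result. In XSLT the template is simply applied wherever the pattern matches.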
One "lights on" moment I had using XSLT was that my perspective changed. I wasn't writing a program with commands, I was instead setting up a machine to run the input, so the input was actually the program controlling what the output would be.
Another thing about XSLT is that it is very obviously homoiconic, that is, the program itself is just data that could have been output by another program. It is an interesting property, but if you actually use it to write programs with programs, things can get difficult to maintain.
Go is perhaps the complete opposite of Haskell in that it lacks pretty much every feature that a language theoretically must have. It is a rigid, boring and unimaginative language that turns out to be really great for efficient development of efficient programs. Despite the lack of features, it seems that for every problem that exists in practice, there is an elegant practical solution in Go.
Go was developed in order to better suit the type of development done at Google, mainly as a reaction to the insane complexity and slowness of compilation of C++, but also a dissatisfaction with the alternatives. It compiles blazingly fast, has garbage collection and utilizes the CSP concurrency model to allow easy and fairly safe use of concurrency.
Colleagues who are programming in Go have stated that they like it even more the longer they use it and learn to stop obsessing about how "elegantly" they would have done things in a different language.
My take-away from Go is that it can be OK to not aim for some sort of theoretical "feature completeness".
Looking at Dart brings yet another perspective on what might be good parts of its parents. The "batteries included" policy of providing a large and capable standard library is my main take-away so far.
Which thread-safety model?
Every modern language needs a good way to handle concurrency, we can't just leave it to luck any longer. So which to choose?
- Immutable; pure functions, like Haskell
- A sane, regulated Memory Model, like Java
- Communicating Sequential Processes, like Go and Ada
- A borrow checker with safe reference rules, like Rust
- Transactional memory, which exists as an add-on for many languages and is built into Clojure
I don't know which of these I'd choose, they all have their advantages. Currently I'm leaning toward a combination of immutability and software transactional memory.
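As one illustration of the options above, the CSP style can be imitated in Python with threads that share nothing and only talk through queues. This is a rough sketch, not how Go or Ada implement it: the stage names are hypothetical, `None` is used as an end-of-stream marker, and a bounded `queue.Queue` stands in for a channel.

```python
import queue
import threading


def producer(out, items):
    """Send every item down the channel, then signal the end."""
    for item in items:
        out.put(item)
    out.put(None)


def squarer(inbox, out):
    """Receive until the end marker, passing on the square of each item."""
    for item in iter(inbox.get, None):
        out.put(item * item)
    out.put(None)


def run_pipeline(items):
    numbers = queue.Queue(maxsize=1)   # small buffers keep the stages in step
    squares = queue.Queue(maxsize=1)
    threads = [
        threading.Thread(target=producer, args=(numbers, items)),
        threading.Thread(target=squarer, args=(numbers, squares)),
    ]
    for thread in threads:
        thread.start()
    results = list(iter(squares.get, None))
    for thread in threads:
        thread.join()
    return results
```

No stage touches another stage's data directly, so there is nothing to lock. The price is that all coordination has to be expressed as messages.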
My humble attempt, Tailspin
Now that I've been thinking about this for about 15 years, I thought it was time to try to actually create a language. Hopefully it brings a different enough spin to be interesting. Here are a few code samples.
First up is an implementation of FizzBuzz:
We start by defining what I suppose is really a function, but I decided to switch out names to avoid getting stuck in preconceived notions. So we define a templates section called "fizz" which simply takes the input modulo 3 and sends it off to be matched. If it is zero, it outputs the string "Fizz", otherwise nothing happens at all. We do the same for "buzz".
In the final line, we first create a list/array whose contents are generated by taking the stream/range of integers from 1 to 100 and transforming each into a string where the first part is the output of the fizz templates and the second part is the output of the buzz templates. The whole array is then sent into an array-templates that provides the index i, where we match each element. If it is the empty string, we output the index, else we output the string. And then we stream out all the array elements and convert them to a string with a newline at the end and send it to stdout. Note that "$" changes meaning at each transform step to mean the current value entering that step.
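For readers who want to follow the data flow without knowing Tailspin, here is the same sequence of steps paraphrased in Python. It is not Tailspin code and makes no claim about Tailspin syntax; it assumes that "buzz" tests divisibility by 5, as in the usual FizzBuzz.

```python
def fizz(n):
    """'Fizz' when n is divisible by 3, otherwise nothing (empty string)."""
    return "Fizz" if n % 3 == 0 else ""


def buzz(n):
    """'Buzz' when n is divisible by 5, otherwise nothing (empty string)."""
    return "Buzz" if n % 5 == 0 else ""


def fizzbuzz(limit=100):
    # Step 1: build the array of fizz-part + buzz-part strings for 1..limit.
    combined = [fizz(n) + buzz(n) for n in range(1, limit + 1)]
    # Step 2: walk the array with its 1-based index; empty strings become the index.
    return [text if text else str(i) for i, text in enumerate(combined, start=1)]


def render(items):
    """Step 3: stream out the elements, each ended by a newline."""
    return "".join(f"{item}\n" for item in items)
```

Printing `render(fizzbuzz())` with `end=""` gives the familiar 100 lines.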
Next we have a little program to take the words on a line and print them in the reverse order:
Here we have a different type of function, a composer, which takes a stream of unicode characters and parses them into the production on the first line, an array of "word"-productions.
The rule for "word" produces one element of consecutive non-whitespace characters, followed by an optional element of consecutive whitespace characters. If we wanted to ignore/discard the whitespace, we could have put that production in parentheses, like "(<ws>?)", but in the reversal we want whitespace between the words anyway, so why not just keep it?
In the final line we read a stream of lines from stdin, and for each we create a new string ended by newline where the content is the original line parsed into an array which is then reversed and streamed out. The line string is then printed out.
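The same idea can be sketched in Python, with a regular expression standing in for the composer. This is an approximation rather than a translation: the function names are made up, and unlike the "word" rule described above, the regular expression also accepts a line that starts with whitespace.

```python
import re


def compose_words(line):
    """Split a line into elements: runs of non-whitespace and runs of whitespace."""
    return re.findall(r"\S+|\s+", line)


def reverse_line(line):
    """Reverse the elements (words and the gaps between them), ending with a newline."""
    return "".join(reversed(compose_words(line))) + "\n"
```

Because the whitespace elements are kept in the array, the original spacing between words survives the reversal.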
The final example is a templates section to calculate the Nth fibonacci number:
In the templates, we first create an object that represents the current state, so we have the N as the input and n0 and n1 as the seeds of the fibonacci function. This object is sent to the matchers.
- If N is zero, we are done and n0 is the value we are looking for.
- If N is 1 or more, we create a new state object where N is decreased and the fibonacci relation is calculated forwards one step. This new object is then sent back to the matchers.
- If N is negative, we increase N and do the fibonacci step in reverse and send to matchers.
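The three cases translate almost directly into a loop over a state object. Here is a Python sketch of that logic (again not Tailspin). It assumes the seeds are n0 = 0 and n1 = 1, and that the reverse step uses the fibonacci relation rearranged as F(k-1) = F(k+1) - F(k).

```python
def fibonacci(n):
    """Nth fibonacci number, also defined for negative N."""
    state = {"N": n, "n0": 0, "n1": 1}
    while state["N"] != 0:
        if state["N"] >= 1:
            # Step forwards: (n0, n1) becomes (n1, n0 + n1).
            state = {"N": state["N"] - 1,
                     "n0": state["n1"],
                     "n1": state["n0"] + state["n1"]}
        else:
            # Step backwards: (n0, n1) becomes (n1 - n0, n0).
            state = {"N": state["N"] + 1,
                     "n0": state["n1"] - state["n0"],
                     "n1": state["n0"]}
    return state["n0"]   # N is zero, so n0 is the value we are looking for
```

Each pass builds a fresh state instead of mutating the old one, mirroring the way a new object is sent back to the matchers.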
So is this the perfect language? I don't know and it's certainly not there yet, but so far I'm really having fun coding in it and developing it. If you're interested, it's available on GitHub.