Ruby: The Last Eight Years
If you're a software developer, in particular one skilled in Ruby, you probably have noticed a great many changes over the last eight years. But why should we care? Shouldn't we be concerned about where things are now, rather than where they used to be?
I'd argue we need to be aware of both. In a fast-paced world of changing technology, history is still important—perhaps more important than ever. We need to know how we got where we are, how long it took, what blind alleys we investigated, how much effort really was involved. So, yes, I claim it's worth a little time reflecting on the last eight years in the Ruby world.
Why specifically eight years? That's roughly the length of time between the second and third editions of The Ruby Way. The third edition hit the shelves in March 2015 (in electronic form as well, naturally). During those intervening years, computing in general has seen significant changes. In the case of Ruby, I'd argue that the language's evolution has been almost overwhelming.
This book's history began with the first edition in 2001, when it was the second English-language Ruby book ever published, following Programming Ruby: The Pragmatic Programmer's Guide (the beloved "Pickaxe book"). Early in the decade, Ruby was barely known in the U.S.; I had known of it for only three years. The book was written roughly to version 1.6.2 of Ruby. In the entire world, only one Ruby conference had been held at that point.
By 2006, Rails had been around for about a year. At that time, we were using version 1.8.4 of Ruby. In October 2006, someone (not I) posted on ruby-talk that the second edition of The Ruby Way was "coming soon."
At the time, I thought moving from the first edition to the second was painful. Overhauling the book was quite a bit of work. But moving from the second edition to the third was much, much harder. What changed? Well, consider where we were in 2006—not just with regard to the Ruby language itself, but also the community, the libraries and tools, and the overall ecosystem.
Ruby itself, born in 1993, was finally a teenager. With the sixth international Ruby conference behind us, Ruby 1.9 was still a dream of the near future. The comp.lang.ruby newsgroup (at that time mirrored to ruby-talk) was six years old, and the web-based Ruby Forum was less than a year old. Ruby groups were scattered around the world—England, Korea, Australia, Canada, the U.S., and others. There were "hoedowns" and codefests here and there, as well as several nascent Rails groups. Jim Weirich announced version 0.9 of RubyGems.
We had our second Google Summer of Code. A new tool called "Mongrel" was in version 0.3 by the end of the year. There was no real Rubinius nor even a real JRuby yet. The libraries and tools that have arisen (and even died out) since then are numerous—templating solutions, web-related libraries and frameworks, database adapters and NoSQL solutions and ORMs, as well as bindings to popular applications and many other useful creations. I can't get into all of those changes here.
What about the language itself? The changes since then have encompassed core methods and classes (both added and deleted), updated behavior of the interpreter, and actual syntax revisions. Consider how things stood in 2006:
- A "character" in Ruby was basically an integer ASCII code. If you indexed into a string, you also got an integer.
- Character encodings were crude, and Japanese was understandably treated specially (honoring the $KCODE variable). Unicode support was spotty, especially in regular expressions.
- Block formal parameters didn't have to be local variables; for example, they could be instance variables. Block-local variables specified with the semicolon notation (;) had not been introduced yet.
- It was permissible to use a colon to separate a when-clause from its expression, instead of a semicolon, then, or a newline. The same was true of the if statement.
- Strings didn't have each_line, each_char, or each_byte. They had the each iterator, which iterated by lines, historically confusing some people.
- Block parameters worked differently then; at one point, Matz said that variable shadowing was his "biggest regret" in the design of Ruby. If you gave a block parameter the same name as an existing local variable, the local variable changed along with the parameter.
- Hashes were hashes in a stricter sense. Iterating over the keys produced a sequence that was reproducible but not predictable. Today, of course, the core data structure is still a true hash, but the keys are maintained in insertion order.
- In older Ruby versions, a symbol was actually stored as a Fixnum (and internally it still is). For a while after that, Fixnum had a to_sym method, which we no longer have.
- We didn't have BasicObject then. Sometimes there was a need for a more "barebones" object without so many methods coming from Object and Kernel; BasicObject grew from Jim Weirich's creation BlankSlate, which used undef_method to strip methods from an ordinary object.
- Instance method names returned by instance_methods were strings, not symbols. We verified the existence of a method not by method_defined? but rather by calling include? on the array of method names.
- We don't use the encoding comment now because it's a little old-fashioned. We didn't use it then because it hadn't been introduced yet. All program text was assumed to be ASCII.
- Thread safety was even more problematic then than it is now. Ruby had strictly green (non-native) threads. Ruby threads were originally based on continuations, which are no longer a feature. There were no fibers in Ruby then.
- The so-called "hash rocket" (=>) was the only notation for hashes. Symbols were not a special case, as they are now with the colon (:) transposed to the end of the symbol. The "comma notation" for hashes was syntactically legal at that time, though it was rarely used.
- A common hack was to add a to_proc method to Symbol. It wasn't part of the core. The same is true of the tap method in Object.
- Calling to_s on an array gave a string with the elements jammed together. It didn't punctuate with commas.
- The "stabby lambda" syntax (->) didn't exist yet. Ordinary blocks did not accept default parameters or the "star" operator (also called "splat" or "unary unarray").
- Lambdas could be called via call or [] but not by yield or by the .() notation. Blocks could not be passed into lambdas at all.
- In method formal parameter lists, a "splatted" parameter had to appear at the end. The same was true of parameters with default values. Now the parser and interpreter are smarter about figuring out the edge cases.
- Method definitions could not be nested. Now, in effect, an ordinary method definition can happen at runtime (without define_method).
- Enumerators (for "on demand" iteration) didn't exist, and naturally there was no to_enum method. An iterator called without a block didn't return an enumerator object. Methods such as each_with_index were commonly used.
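To make a few of the items above concrete, here is a minimal sketch of how the corresponding features behave today. It assumes a reasonably current Ruby (2.x or later); the variable names and the `ends` method are purely illustrative and do not come from the book.

```ruby
# Indexing a string yields a one-character String (1.8 gave an Integer code).
first_char = "ruby"[0]              # "r"
first_code = first_char.ord         # 114, the explicit way to get the code

# Hashes keep insertion order; symbol keys can use the trailing-colon form.
langs     = { ruby: 1993, perl: 1987, python: 1991 }
key_order = langs.keys              # [:ruby, :perl, :python]

# A block parameter shadows an outer local instead of overwriting it.
x = 10
[1, 2, 3].each { |x| x * 2 }        # the outer x is untouched

# Block-local variables are declared after a semicolon.
total = 0
[1, 2].each { |n; total| total = n }   # the outer total is untouched

# Strings iterate explicitly by line, character, or byte.
lines = "ab\ncd".each_line.to_a     # ["ab\n", "cd"]
chars = "ab".each_char.to_a         # ["a", "b"]

# Array#to_s now matches inspect rather than jamming elements together.
printed = [1, 2, 3].to_s            # "[1, 2, 3]"

# Stabby lambda with a default parameter, called three different ways.
add    = ->(a, b = 1) { a + b }
sums   = [add.(2), add[2, 3], add.call(2, 3)]   # [3, 5, 5]

# Symbol#to_proc is part of the core.
shouted = %w[a b].map(&:upcase)     # ["A", "B"]

# An iterator called without a block returns an Enumerator.
enum       = %w[x y].each_with_index
first_pair = enum.next              # ["x", 0]

# Method names are symbols, and method_defined? checks for them directly.
has_upcase = String.instance_methods.include?(:upcase) &&
             String.method_defined?(:upcase)

# A splatted parameter may now appear in the middle of the list.
def ends(first, *middle, last)
  [first, middle, last]
end
split_args = ends(1, 2, 3, 4)       # [1, [2, 3], 4]
```

Running the same lines under Ruby 1.8 would fail or give different results at nearly every step, which is a fair measure of how far the language moved.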
Are you impressed yet? We haven't finished. Those were just the changes that went into the 1.9 versions, and the list isn't exhaustive. Here are some more:
- Version 2.0 introduced keyword arguments, %i for creating arrays of symbols, the prepend method (for extending classes in a new way), and the __dir__ method, which returns the directory name of the currently executed file.
- "Lazy evaluation" features were added to Enumerable, Enumerator, and Range. The to_h method became a standard way of converting to a hash. The regex engine was changed to Onigmo (a fork of Oniguruma) for improved features and performance. For debugging, DTrace and TracePoint were added.
- In 2.1, keyword arguments were improved a little. Refinements (introduced experimentally in 2.0) became a full-fledged language feature. Garbage collection was greatly optimized and became controllable with more precision. For reflection purposes, ObjectSpace saw numerous improvements. There were small changes in the String, Bignum, Rational, and Symbol classes.
- The 2.2 enhancements were much more incremental in nature, with changes in Binding, Dir, Enumerable, Float, File, String, and other classes.
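Several of the 2.x additions are easiest to appreciate in code. The following is a rough sketch rather than anything taken from the book: `connect`, `Loud`, and `Greeter` are hypothetical names chosen for illustration, and the example assumes Ruby 2.1 or later (required keyword arguments without defaults arrived in that release).

```ruby
# Keyword arguments: host is required, port has a default.
def connect(host:, port: 80)
  "#{host}:#{port}"
end
address = connect(host: "example.org")          # "example.org:80"

# %i builds an array of symbols.
states = %i[open closed]                        # [:open, :closed]

# prepend puts a module ahead of the class in method lookup,
# so the module's method can wrap the class's own via super.
module Loud
  def greet
    super.upcase
  end
end

class Greeter
  prepend Loud

  def greet
    "hello"
  end
end
greeting = Greeter.new.greet                    # "HELLO"

# Lazy enumerators make it safe to chain over an infinite range.
evens = (1..Float::INFINITY).lazy.select(&:even?).first(3)   # [2, 4, 6]

# to_h converts an array of key/value pairs into a hash.
pairs = [[:a, 1], [:b, 2]].to_h
```

Note that `Greeter.ancestors` lists `Loud` before `Greeter` itself, which is what distinguishes prepend from include.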
When I began the third edition of The Ruby Way, I was targeting Ruby 1.9. This in itself was an incredible chore, since the changes were so massive and all-encompassing. But while this revision was in progress, Ruby 2.0 was released, and I tried to update the book again (with help from Russ Olsen and Andre Arko). When Ruby 2.1 was imminent, I tried to update yet again. Obviously, such updates will never end unless Ruby someday stagnates completely.
As Dave Thomas said, "If it's in print, it's out of date." So although I tried hard to modernize the book, it still contains errors and omissions. Please help identify them!
I hope this article doesn't sound like just a shameless plug to buy the book; it's also my effort to instill a greater sense of history into the Ruby community. We live in a society with many people who care nothing about history, whether we are talking about a hundred years or a hundred days. That issue goes far beyond the computing world.
In the real world, existing applications and legacy code can't keep pace with our tools and environments as well as we might like. We need to understand the requirements and expectations under which code was written—even if it was years ago. We need to understand the hacks and workarounds so that we can avoid such things in the future.
I also would argue that we need to recognize fashions and trends in the computing world that don't always correlate to efficiency, practicality, or necessity. If we remember the last twenty times we jumped onto a bandwagon that went nowhere, we might be more cautious next time. In short, we need to understand the past in order to cope with the present and create the future.
In the computing world, we don't really measure time in years so much as months. What tools will we be using in 2023? Either we learn to keep up with new technology and trends, or we become irrelevant. The future isn't entirely in our control, but parts of it are. Let's learn from the past in order to help shape the future properly.