This chapter is from the book

Iterators in the Wild

Iterators, mostly internal but occasionally external, are so common in Ruby that it is hard to know where to start. Ruby arrays actually have two other internal iterators besides each: reverse_each cycles through the array elements from the end of the array to the beginning, while each_index calls the block with each index in the array instead of each element.
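As a quick illustration of those two array iterators, here is a minimal sketch. The helper names reversed_elements and indexes_of are made up for this example; they simply collect whatever each iterator hands to the block.

```ruby
# Hypothetical helpers: gather what each iterator yields into a new array.
def reversed_elements(array)
  result = []
  array.reverse_each { |element| result << element }
  result
end

def indexes_of(array)
  result = []
  array.each_index { |index| result << index }
  result
end

colors = ['red', 'green', 'blue']
puts(reversed_elements(colors).inspect)   # ["blue", "green", "red"]
puts(indexes_of(colors).inspect)          # [0, 1, 2]
```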

The String class has an each method that cycles through each line (yes, each line, not each character) in the string as well as each_byte. Strings also have a wonderful scan method, which takes a regular expression and iterates over each match that is found in the string. For example, we might search for all words that begin with the letter 'p' in a well-known tongue twister:

s = 'Peter Piper picked a peck of pickled peppers'
s.scan(/[Pp]\w*/) {|word| puts("The word is #{word}")}

If you run this code, you will get lots of 'p' words:

The word is Peter
The word is Piper
The word is picked
The word is peck
The word is pickled
The word is peppers
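The line and byte iterators on strings can be sketched the same way. One caveat: the plain each method on String belongs to Ruby 1.8; later versions dropped it, so this sketch uses each_line, which is available in both. The helper names lines_of and bytes_of are assumptions made for the example.

```ruby
# Hypothetical helpers that collect what the string iterators yield.
def lines_of(text)
  lines = []
  # each_line yields one line at a time, newline included; chomp strips it.
  text.each_line { |line| lines << line.chomp }
  lines
end

def bytes_of(text)
  bytes = []
  # each_byte yields each byte as an integer, not as a one-character string.
  text.each_byte { |byte| bytes << byte }
  bytes
end

puts(lines_of("first\nsecond\n").inspect)   # ["first", "second"]
puts(bytes_of('hi').inspect)                # [104, 105]
```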

Unsurprisingly, the Hash class supports a rich assortment of iterators. We have each_key, which calls the code block for each key in the hash:

h = {'name'=>'russ', 'eyes'=>'blue', 'sex'=>'male'}
h.each_key {|key| puts(key)}

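This code produces the following output (the order shown reflects Ruby 1.8, where hashes are unordered; Ruby 1.9 and later preserve insertion order):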

name
sex
eyes

The Hash class also has an each_value method:

h.each_value {|value| puts(value)}

This code produces the following output:

russ
male
blue

Finally, a vanilla each method is also available:

h.each {|key, value| puts("#{key} #{value}")}

The each method iterates over every key/value pair in the hash, so this code will output

name russ
sex male
eyes blue

External iterators are harder to find in Ruby, but the IO object presents an interesting example. The IO class is the base class for input and output streams. The neat thing about the IO object is that it is amphibious—it does both internal and external iterators. You can open a file and read each line in a very traditional style by using the open file handle as an external iterator:

f = File.open('names.txt')
while not f.eof?
  puts(f.readline)
end
f.close

The IO object also has an each method (also known as each_line) that implements an internal iterator over the lines in the file:

f = File.open('names.txt')
f.each {|line| puts(line)}
f.close

For the non-line-oriented files, IO supplies an each_byte iterator method:

f.each_byte {|byte| puts(byte)}
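To see the external and internal styles side by side, here is a self-contained sketch. It writes a throwaway file with Tempfile so it does not depend on a names.txt existing; the helper names (read_lines_externally, read_lines_internally, with_sample_file) are invented for the example. It also uses the block form of File.open, which closes the file for you when the block ends.

```ruby
require 'tempfile'

# External iteration: the caller drives the loop and asks for each line.
def read_lines_externally(path)
  lines = []
  File.open(path) do |f|
    lines << f.readline until f.eof?
  end
  lines
end

# Internal iteration: the IO object drives the loop and calls the block.
def read_lines_internally(path)
  lines = []
  File.open(path) do |f|
    f.each_line { |line| lines << line }
  end
  lines
end

# Hypothetical helper: builds a temporary file so the sketch stands alone.
def with_sample_file(contents)
  Tempfile.create('names') do |tmp|
    tmp.write(contents)
    tmp.flush
    yield tmp.path
  end
end

with_sample_file("russ\nalice\n") do |path|
  puts(read_lines_externally(path).inspect)
  puts(read_lines_internally(path).inspect)
end
```

Both methods return the same lines; the only difference is who owns the loop.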

If your programs do a lot of IO, you will probably want to know about the Pathname class. Pathname tries to offer one-stop shopping for all your directory and path manipulation needs. You create a Pathname by supplying the constructor with a path:

require 'pathname'

pn = Pathname.new('/usr/local/lib/ruby/1.8')

Along with a raft of useful methods that have nothing to do with iterators, Pathname supplies the each_filename iterator, which cycles through the components of the path that you supplied. So if you run

pn.each_filename {|file| puts("File: #{file}")}

you will get something like this:

File: usr
File: local
File: lib
File: ruby
File: 1.8

But you can also go in the other dimension—the each_entry method will iterate over the contents of the directory pointed at by the Pathname. So if you run

pn.each_entry {|entry| puts("Entry: #{entry}")}

you will see the contents of /usr/local/lib/ruby/1.8:

Entry: .
Entry: ..
Entry: i686-linux
Entry: shellwords.rb
Entry: mailread.rb
...
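If you do not have that exact directory on your machine, the two Pathname iterators can be tried against a temporary directory instead. This is only a sketch: path_components and entry_names are hypothetical helpers, and the file name is borrowed from the listing above purely for illustration.

```ruby
require 'pathname'
require 'tmpdir'

# each_filename walks the components of the path itself.
def path_components(path)
  parts = []
  Pathname.new(path).each_filename { |part| parts << part }
  parts
end

# each_entry walks the contents of the directory the path points at.
# Each entry arrives as a Pathname, so convert it to a string.
def entry_names(dir)
  names = []
  Pathname.new(dir).each_entry { |entry| names << entry.to_s }
  names.sort
end

puts(path_components('/usr/local/lib').inspect)   # ["usr", "local", "lib"]

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'shellwords.rb'), '')
  puts(entry_names(dir).inspect)
end
```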

Finally, my very favorite internal iterator in Ruby is one supplied by the ObjectSpace module. ObjectSpace provides a window into the complete universe of objects that exist within your Ruby interpreter. The fundamental iterator supplied by ObjectSpace is the each_object method. It iterates across all of the Ruby objects—everything that is loaded into your Ruby interpreter:

ObjectSpace.each_object {|object| puts("Object: #{object}")}

The each_object method takes an optional argument that can be either a class or a module. If you supply the argument, each_object will iterate over only the instances of that class or module. And yes, subclasses count. So if I wanted to print out all of the numbers known to my Ruby interpreter, I might do this:

ObjectSpace.each_object(Numeric) {|n| puts("The number is #{n}")}

This level of introspection is reasonably breathtaking. You can, for example, use ObjectSpace to implement your own memory profiling system: Simply start a thread that looks for the objects of interest and prints a report about them. A class might use ObjectSpace to hunt down all instances of itself, for example. Rails uses ObjectSpace to build a method that finds all of the subclasses of a given class:

def subclasses_of(superclass)
  subclasses = []
  ObjectSpace.each_object(Class) do |k|
    next if !k.ancestors.include?(superclass) || superclass == k ||
             k.to_s.include?('::') || subclasses.include?(k.to_s)
    subclasses << k.to_s
  end
  subclasses
end

If we call

subclasses_of(Numeric)

we will get back an array containing 'Bignum', 'Float', 'Fixnum', and 'Integer'. As I say, reasonably breathtaking.
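The idea of a class hunting down all instances of itself can be sketched in a few lines. Widget and live_instances are names invented for this example, and the sketch assumes the standard C Ruby interpreter, where ObjectSpace.each_object is enabled. Keep in mind that objects awaiting garbage collection may still show up in the results.

```ruby
# Hypothetical class used only to illustrate the technique.
class Widget
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Ask ObjectSpace for every Widget (or subclass instance) still alive.
  def self.live_instances
    found = []
    ObjectSpace.each_object(self) { |widget| found << widget }
    found
  end
end

first = Widget.new('first')
second = Widget.new('second')
puts(Widget.live_instances.map { |w| w.name }.sort.inspect)
```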
