Last updated on June 30th, 2024 at 03:05 pm
Table of Contents
- Enumerable
- A short digression regarding iterator naming
- for
- while, until
- loop
- each, each_with_index
- map
- each_with_index.map
- inject
- reduce
- Exiting and Skipping in Iterators
- The magic of Symbol to Proc
- Lazy Loading
Enumerable
In Ruby, you iterate over an Enumerable. An Enumerable can be a hash, an array, a set, a range, or in Rails, a collection of records.
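As a quick illustration of the point above, the following sketch checks that several core collection types include the Enumerable module. The `collections` variable and its sample values are made up for this example, and the Rails case is left out because it needs a running application.

```ruby
require "set" # needed on older Rubies; harmless on newer ones

# Hypothetical sample data: one instance of each core collection type.
collections = [
  [1, 2, 3],          # Array
  { a: 1, b: 2 },     # Hash
  Set.new([1, 2, 3]), # Set
  (1..3)              # Range
]

collections.each do |c|
  # Every one of these mixes in Enumerable, so they all respond to
  # methods such as each, map, and first.
  puts "#{c.class}: Enumerable? #{c.is_a?(Enumerable)}, first element: #{c.first.inspect}"
end
```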
A short digression regarding iterator naming
It is a common rule in any programming language that a variable name should always try to express its purpose in life. A variable name like xyz, which has absolutely no meaning, has absolutely no place in your code. Using a instead of address or pn instead of phone_number is heavily frowned upon. The iterator in a loop appears to be an exception to this rule.
It is very common to find single-character iterators when using loops. The justification is that a loop is generally only a few lines long, so the meaning and intent of the iterator stay clear because you are always within eyeshot of its declaration. With that in mind, I will use single-character iterators in all examples.
for
Ruby has the common iteration statements of for, while and until, but they operate in a way that is unique to Ruby. If your programming background is C-based, or in just about any derivative of C, then you are used to a for loop that operates something like:
for (int i = 0; i < max_value; i++) {
    statements;
}
Ruby's for loop is only superficially similar. Instead of an initializer, a condition, and an increment, it takes a variable and an enumerable:
for object in enumerable do
  expressions
end
The enumerable can be just about anything: a range, an array, a hash, a comma-separated list of objects. Whatever. This gives rise to things like:
for o in [1, "a string", Time.new, 42.0] do
  puts o.inspect
end
You will notice that instead of counting up or down on an index, the for loop supplies the actual object itself to the interior.
When the for loop terminates, it returns the expression that was iterated over.
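To make the return value concrete, here is a minimal sketch that captures the result of a for loop. The names `numbers`, `total`, and `result` are invented for this illustration.

```ruby
numbers = [10, 20, 30]
total = 0

# A for loop is an expression, so its value can be assigned.
result = for n in numbers do
  total += n
end

puts total                  # the sum of the elements
puts result.equal?(numbers) # true: the same array object comes back
```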
while, until
The while loop operates the way you would expect: it keeps iterating as long as its expression evaluates to a truthy value (anything other than false or nil).
counter = 0
while counter < 2
  counter += 1
  puts "The counter is #{counter}"
end
The until loop is the inverse: it keeps executing until its expression becomes truthy.
counter = 0
until counter > 2
  counter += 1
  puts "The counter is #{counter}"
end
Unlike the for loop, when the while and until loops terminate, they return nil.
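The sketch below captures the value of a while loop and an equivalent until loop to show the nil result. It also shows one exception that is not covered above: a break with an argument makes the loop evaluate to that argument. All variable names here are made up for the example.

```ruby
counter = 0
while_result = while counter < 2
  counter += 1
end

counter = 0
# until is the mirror image of while: this runs the same number of times.
until_result = until counter >= 2
  counter += 1
end

# Passing a value to break overrides the usual nil.
early = while true
  break "stopped early"
end

puts while_result.inspect # nil
puts until_result.inspect # nil
puts early                # stopped early
```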
loop
I’ve split the loop iterator out because it is a bit different: it has no conditional associated with it. Instead of a conditional terminating the loop, you need to explicitly break out of it (or interrupt the program with CTRL-C).
counter = 0
loop do
  counter += 1
  puts "The counter is #{counter}"
  break if counter > 5
end
Similar to the while and until loops, loop returns nil when it is exited with a plain break.
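Two further ways out of loop may be worth knowing, sketched here with made-up variable names. A break with an argument makes loop return that argument, and loop also stops quietly when a StopIteration is raised inside it, which is what an external enumerator does once it runs out of elements.

```ruby
counter = 0
result = loop do
  counter += 1
  break counter if counter > 5 # the break value becomes loop's result
end
puts result # 6

letters = %w[a b c].each # no block given, so this is an Enumerator
seen = []
loop do
  seen << letters.next # next raises StopIteration after "c"; loop rescues it
end
puts seen.inspect # ["a", "b", "c"]
```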
each, each_with_index
Although the for loop iterates over an enumerable, each and each_with_index are often a more readable approach and are more commonly used.
[1, "a string", Time.new, 42.0].each do |o|
  puts o.inspect
end
You will notice the |o| after the do, which supplies the name of the object to be used in the iteration.
[1, "a string", Time.new, 42.0].each_with_index do |o, i|
  puts "The item #{i} has a value of #{o.inspect}"
end
As with the for loop, each and each_with_index return the enumerable operated upon when they complete.
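A short sketch of that return value, plus each_with_index on a hash, where each element arrives as a key/value pair that can be unpacked with parentheses. The `items` and `prices` data are invented for this example.

```ruby
items = [1, "a string", 42.0]
returned = items.each { |o| o.inspect }
puts returned.equal?(items) # true: each hands back the receiver itself

prices = { apple: 3, pear: 5 }
lines = []
# For a hash, each element is a [key, value] pair; (name, price) unpacks it.
prices.each_with_index do |(name, price), i|
  lines << "#{i}: #{name} costs #{price}"
end
puts lines
```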
map
Very often, the purpose of iterating over an enumerable is to generate a new enumerable based on the original. For this purpose, we have the map method.
[1, "a string", Time.new, 42.0].map do |o|
  puts "The item has a value of #{o.inspect}"
end
Unlike for, each, and each_with_index, map returns an array of the results of each iteration. Because the puts method always returns nil, the example above will return an array of nils.
To return an enumerable that actually conveys information, we could do something like the following:
[1, "a string", Time.new, 42.0].map do |o|
  o.class
end
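To show what comes back, here is a small sketch with made-up data. It assumes Ruby 2.4 or later, where whole numbers report their class as Integer. Note that map builds a new array and leaves the original untouched, and that it returns an array even when the receiver is a range.

```ruby
items = [1, "a string", 42.0]

classes = items.map { |o| o.class }
puts classes.inspect # [Integer, String, Float]
puts items.inspect   # the original array is unchanged

squares = (1..4).map { |n| n * n }
puts squares.inspect # [1, 4, 9, 16], an Array even though the receiver is a Range
```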
each_with_index.map
What do you do if for some reason you need the index of each object? Try this:
[1, "a string", Time.new, 42.0].each_with_index.map do |o, i|
  "Item #{i} has a class of #{o.class}"
end
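An alternative spelling, offered here as a sketch rather than a recommendation, is map.with_index. Calling map without a block returns an Enumerator, and Enumerator#with_index accepts an optional starting offset, which is handy for one-based numbering. The sample data is made up.

```ruby
items = ["a", "b", "c"]

zero_based  = items.each_with_index.map { |o, i| "#{i}: #{o}" }
also_zero   = items.map.with_index { |o, i| "#{i}: #{o}" }
one_based   = items.map.with_index(1) { |o, i| "#{i}: #{o}" }

puts zero_based.inspect # ["0: a", "1: b", "2: c"]
puts one_based.inspect  # ["1: a", "2: b", "3: c"]
```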
inject
If you need to iterate over an enumerable and none of the aforementioned loop constructs will do what you want, Ruby has inject as the Swiss Army knife of iterators. Because it can do so much, it is a little more complicated to use.
values = { a: 'a', b: 'b', c: 'c' }
hash = values.inject({}) do |accumulator, (key, value)|
  accumulator[key] = value.upcase
  puts "The accumulator is #{accumulator}"
  accumulator
end
First, you need to pass inject
the hash that will be used at the start of the iteration. Normally, this is an empty hash, but it doesn’t have to be. Then for each iteration you will have access to what is called the accumulator, a key and an index. As the final expression of the loop you will need to put the updated value of the accumulator.
To iterate over an array, you can do something like the following:
values = ['a', 'b', 'c']
hash = values.inject({}) do |accumulator, value|
  accumulator[value] = value.upcase
  puts "The accumulator is #{accumulator}"
  accumulator
end
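The accumulator does not have to be a hash. The sketch below, using invented sample data, sums numbers with inject and then rebuilds the array example with each_with_object, a related Enumerable method that passes the same object to every iteration so the block does not need to return it. Note the reversed block argument order.

```ruby
# A numeric accumulator: the block's result is the next accumulator.
sum = [1, 2, 3, 4].inject(0) { |acc, n| acc + n }
puts sum # 10

values = %w[a b c]

by_inject = values.inject({}) do |acc, v|
  acc[v] = v.upcase
  acc # must be returned explicitly
end

# each_with_object yields (element, accumulator) and always returns the accumulator.
by_each_with_object = values.each_with_object({}) { |v, acc| acc[v] = v.upcase }

puts by_inject == by_each_with_object # true
```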
reduce
reduce is an alias for inject; the two can be used interchangeably.
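A minimal sketch of the equivalence, with made-up numbers. The last line uses the form that takes a symbol naming the operation instead of a block.

```ruby
numbers = [1, 2, 3, 4]

with_inject = numbers.inject(1) { |product, n| product * n }
with_reduce = numbers.reduce(1) { |product, n| product * n }
with_symbol = numbers.reduce(:*) # same operation, named by symbol

puts with_inject # 24
puts with_reduce # 24
puts with_symbol # 24
```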
Exiting and Skipping in Iterators
It is quite common that we will want to either prematurely exit an iterator or skip an iteration while keeping the loop going.
The break statement can be used to terminate the execution of a block, and the next statement can be used to skip to the next iteration.
As well, the return statement can be used to not only exit the loop, but also exit the method.
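The three statements can be sketched as follows. The `first_negative` method and the sample numbers are hypothetical, written only for this illustration.

```ruby
evens = []
(1..10).each do |n|
  next if n.odd? # skip this iteration, carry on with the next one
  break if n > 6 # leave the iterator entirely
  evens << n
end
puts evens.inspect # [2, 4, 6]

# Hypothetical helper: return exits the each block AND the method.
def first_negative(numbers)
  numbers.each do |n|
    return n if n < 0
  end
  nil # only reached when no negative number was found
end

puts first_negative([3, -1, -5]).inspect # -1
puts first_negative([1, 2]).inspect      # nil
```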
The magic of Symbol to Proc
One common coding requirement is to iterate over a collection and apply a single function to each object in the collection. Ruby has a very attractive shortcut to allow this, &:, which is called Symbol to Proc.
["1","2","3"].map(&:to_i)
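The shortcut is equivalent to writing out a block that calls the named method on each element. The sketch below, with made-up variable names, shows both forms side by side and the Proc that Symbol#to_proc produces behind the scenes.

```ruby
strings = ["1", "2", "3"]

with_block  = strings.map { |s| s.to_i } # the long form
with_symbol = strings.map(&:to_i)        # the shortcut
puts with_block == with_symbol           # true

# & calls to_proc on the symbol; the resulting Proc sends :to_i to its argument.
converter = :to_i.to_proc
puts converter.call("42") # 42
```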
Lazy Loading
Method chaining is one of Ruby’s most powerful and attractive features. However, if you are not careful, you can waste a lot of CPU time unnecessarily.
A common task in programming is to process large amounts of data, often looking for a needle in a haystack. If you are performing such a task, you will normally want to terminate operations as soon as your needle is found, instead of continuing to process the remaining haystack.
Using an admittedly contrived example to keep things simple, think about a large array that needs to be processed, with the results of that processing then searched for the existence of a given value.
extremely_large_array.map { |n| process(n) }.any? { |n| evaluate_criteria(n) }
Normally, map will process the full array and, only once it is done, pass the results on to any?. If you prefix the map method with the lazy method, however, this creates what is called an Enumerator::Lazy object, which feeds objects from the array to map one at a time; each result is then passed straight to any?. As soon as any? finds a result for which the block evaluates to true, the process stops.
extremely_large_array.lazy.map { |n| process(n) }.any? { |n| evaluate_criteria(n) }
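One way to see the difference is to count how often the processing step runs. In this sketch the `process` lambda, the `calls` counter, and the data are stand-ins invented for the illustration, and any? is used as the short-circuiting search.

```ruby
calls = 0
# Stand-in for an expensive step; it counts its own invocations.
process = ->(n) { calls += 1; n * 2 }

data = (1..1_000).to_a

# Eager: map processes every element before any? sees anything.
eager_found = data.map { |n| process.call(n) }.any? { |n| n == 6 }
eager_calls = calls

calls = 0
# Lazy: elements flow through one at a time, so work stops at the first match.
lazy_found = data.lazy.map { |n| process.call(n) }.any? { |n| n == 6 }
lazy_calls = calls

puts "eager: found=#{eager_found}, process called #{eager_calls} times" # 1000 times
puts "lazy:  found=#{lazy_found}, process called #{lazy_calls} times"   # 3 times
```

The saving depends on where the match sits: if it is near the end of the array, or absent, the lazy version does the same amount of processing as the eager one.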