
Performance of Ruby Array, Set, and Hash

While reviewing some Ruby code, I noticed that all the data were stored in Sets rather than Arrays. I asked the original author why. The answer was "because sets are faster than arrays."

The Ruby documentation seems to imply this as well:

"Set implements a collection of unordered values with no duplicates. This is a hybrid of Array's intuitive inter-operation facilities and Hash's fast lookup."

I rewrote the code to use arrays, and found the whole program was faster with arrays. Odd. 
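The "fast lookup" in that description is about membership tests (asking whether a value is present), not about walking every element. As a rough sketch of the difference, with illustrative sizes and a hypothetical `elapsed` helper that are not part of the original program, the following times `include?` on an Array and on a Set holding the same values:

```ruby
require 'set'

# Hypothetical helper: seconds taken by the block, using a monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

n = 20_000                     # illustrative size, not from the original code
array = (0...n).to_a
set   = Set.new(array)
needles = (n - 500...n).to_a   # values near the end: the slow case for a linear scan

array_hits = 0
set_hits = 0
# Array#include? compares element by element; Set#include? does a hash lookup.
array_time = elapsed { array_hits = needles.count { |x| array.include?(x) } }
set_time   = elapsed { set_hits   = needles.count { |x| set.include?(x) } }

puts "Array#include? x500: #{array_time}s (#{array_hits} hits)"
puts "Set#include?   x500: #{set_time}s (#{set_hits} hits)"
```

Both collections find the same values; only the cost of finding them differs. Code that mostly iterates, rather than looks things up, does not exercise that advantage.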

I decided to test it with a quick program. It adds 10 million numbers to an Array, a Set, and a Hash, then iterates over each one. To measure the speed of the structure rather than the speed of the CPU, the loop body does almost no work: it simply squares the value.

require 'set'
testarray = []
testset = Set[]
testhash = {}

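10_000_000.times do |i|
  testarray.push(i)
  testset.add(i)
  # Key on the integer i so the hash gets one entry per number. Keying on the
  # symbol :i (testhash[:i] = i) would overwrite a single entry on every pass,
  # and iterating that one-entry hash takes only microseconds.
  testhash[i] = i
end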

array_start = Time.now
testarray.each do |num|
  num*num
end
array_finish = Time.now

set_start = Time.now
testset.each do |num|
  num*num
end
set_finish = Time.now

hash_start = Time.now
testhash.each do |k,v|
  v*v
end
hash_finish = Time.now

puts "Array timing: #{array_finish - array_start}
Set timing: #{set_finish - set_start}
Hash timing: #{hash_finish - hash_start}"

The results, in seconds, averaged over 10 runs of each, are:

Array timing: 0.273671
Set timing: 0.314005
Hash timing: 2.0e-06

In plain decimal notation, the hash timing is 0.000002 seconds. I'm using wall-clock time for the start and finish marks, which is not as accurate as CPU time, but it's good enough for this experiment. The tests were run on a 2019 MacBook Pro:

[Image: MacBook Pro test machine specifications]
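On the wall-clock point: Time.now follows the system clock, which can be adjusted while a program is running. Ruby also exposes a monotonic clock and a per-process CPU clock through Process.clock_gettime. A minimal sketch, assuming a hypothetical `measure` helper and a smaller array than the original test, that reports both for the same loop:

```ruby
# Hypothetical helper (not from the original program): runs the block and
# returns elapsed wall time (monotonic clock) and CPU time for this process.
def measure
  wall_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cpu_start  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
  yield
  wall = Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_start
  cpu  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - cpu_start
  { wall: wall, cpu: cpu }
end

numbers = Array.new(1_000_000) { |i| i }   # illustrative size
timing = measure { numbers.each { |num| num * num } }

puts "wall: #{timing[:wall]}s  cpu: #{timing[:cpu]}s"
```

For a single-threaded loop like this one, the two numbers should usually be close; a large gap suggests the process spent time waiting rather than computing.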

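For this experiment, sets are slower than arrays: iterating the Set took about 15% longer than iterating the Array (0.314 versus 0.274 seconds). The hash number can't be compared with the other two, though. A time of 2 microseconds means the loop visited a single entry, and that is what a fill loop written as testhash[:i] = i produces: :i is a fixed symbol, not the loop variable, so every write replaces the same key. Keyed on the integer i, the hash holds all 10 million entries, and its iteration time has to be measured again before it can be ranked. Until then, this test gives no reason to prefer a hash for iteration. The fast lookup that the documentation credits to Set and Hash is about finding a value, not about walking the whole collection.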
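Before trusting a timing, it helps to confirm that every collection holds the same number of elements. The sketch below is an assumption-laden variant of the test program, not the original: it uses a much smaller count, hypothetical variable names, and a monotonic clock. It shows how the symbol key :i and the integer key i fill a hash differently, then times iteration over three same-sized collections.

```ruby
require 'set'

# Hypothetical helper: seconds taken by the block, using a monotonic clock.
def seconds_for
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

n = 100_000   # far smaller than the original 10 million, to keep the sketch quick

by_symbol  = {}
by_integer = {}
n.times do |i|
  by_symbol[:i] = i   # :i is one fixed symbol, so each write replaces the last
  by_integer[i] = i   # i is the loop variable, so each write adds a new key
end

puts "symbol-keyed size:  #{by_symbol.size}"
puts "integer-keyed size: #{by_integer.size}"

array = (0...n).to_a
set   = Set.new(array)

timings = {
  'Array' => seconds_for { array.each { |num| num * num } },
  'Set'   => seconds_for { set.each { |num| num * num } },
  'Hash'  => seconds_for { by_integer.each { |_key, value| value * value } }
}
timings.each { |name, secs| puts "#{name} timing: #{secs}" }
```

The size check is the important part: a hash that reports one entry is not measuring the same work as an array of n elements, whatever its timing says.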