Performance of Ruby Array, Set, and Hash
While reviewing some Ruby code, I noticed the data were all in Sets rather than Arrays. I asked the original author why they used Sets instead of Arrays. The answer was "because sets are faster than arrays."
Ruby documentation implies this as well:
"Set implements a collection of unordered values with no duplicates. This is a hybrid of Array's intuitive inter-operation facilities and Hash's fast lookup."
I rewrote the code to use arrays, and found the whole program was faster with arrays. Odd.
I decided to test it out with a quick test program. It adds 10 million numbers to a Set, an Array, and a Hash. Then, to measure the speed of the structure rather than the speed of the CPU, each loop simply squares every value.
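require 'set'

testarray = []
testset = Set[]
testhash = {}

# Fill all three structures with the numbers 0..9,999,999.
10000000.times do |i|
  testarray.push(i)
  testset.add(i)
  # Note: :i is a symbol literal, so every pass writes to the same key
  # and testhash ends up holding a single entry.
  testhash[:i] = i
end

# Time a full pass over the array, squaring each value.
array_start = Time.now
testarray.each do |num|
  num*num
end
array_finish = Time.now

# Time a full pass over the set.
set_start = Time.now
testset.each do |num|
  num*num
end
set_finish = Time.now

# Time a full pass over the hash.
hash_start = Time.now
testhash.each do |k,v|
  v*v
end
hash_finish = Time.now

puts "Array timing: #{array_finish - array_start}
Set timing: #{set_finish - set_start}
Hash timing: #{hash_finish - hash_start}"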
The results, in seconds, for an average of 10 runs each are:
Array timing: 0.273671
Set timing: 0.314005
Hash timing: 2.0e-06
The hash value is 0.0000020 seconds if you don't like the scientific notation. I'm using wall-clock time for the start and finish stamps, which is less accurate than CPU time, but it's good enough for this experiment. These were run on a 2019 MacBook Pro.
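To make the wall-clock versus CPU-clock distinction concrete, here is a minimal sketch that reads both clocks around the same kind of loop. It is not what the test program above does (that uses Time.now). The helper name measure and the smaller array size are assumptions made for illustration.

```ruby
# Hypothetical helper: run a block and report elapsed wall-clock time
# (monotonic clock) alongside CPU time consumed by this process.
def measure
  wall_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cpu_start  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
  yield
  wall = Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_start
  cpu  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - cpu_start
  { wall: wall, cpu: cpu }
end

# A smaller array than the 10 million used above, to keep the sketch quick.
numbers = (0...100_000).to_a

timing = measure { numbers.each { |num| num * num } }
puts "wall: #{timing[:wall]}s, cpu: #{timing[:cpu]}s"
```

The monotonic clock is not affected by system clock adjustments, which Time.now can be, so it tends to be the safer choice for measuring short intervals.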

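For this experiment, sets are slower than arrays. In rewriting the code, I'd look at a hash: in this test it was vastly faster than both arrays and sets, with the caveat that the test program writes every value to the same key (:i), so the hash being timed holds a single entry rather than 10 million.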
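For a like-for-like comparison, the hash would need one entry per number. The following is a sketch of the same experiment with the integer i as the hash key, under a few assumptions: it uses Benchmark.realtime from the standard library in place of the Time.now pairs, a smaller count than 10 million to keep it quick, and variable names of my own choosing. I have not included timings, since they depend on the machine and Ruby version.

```ruby
require 'set'
require 'benchmark'

# Assumption: 1 million entries instead of the 10 million used in the post.
n = 1_000_000

test_array = []
test_set = Set[]
test_hash = {}

n.times do |i|
  test_array.push(i)
  test_set.add(i)
  test_hash[i] = i # integer key i, not the symbol :i, so every entry is kept
end

# Each block makes one full pass over the structure, squaring every value.
array_time = Benchmark.realtime { test_array.each { |num| num * num } }
set_time   = Benchmark.realtime { test_set.each { |num| num * num } }
hash_time  = Benchmark.realtime { test_hash.each { |_key, value| value * value } }

# Print the sizes first to confirm all three hold the same number of entries.
puts "Entries: array=#{test_array.size} set=#{test_set.size} hash=#{test_hash.size}"
puts "Array timing: #{array_time}"
puts "Set timing: #{set_time}"
puts "Hash timing: #{hash_time}"
```

Printing the sizes alongside the timings is a cheap sanity check that all three structures are doing the same amount of work.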