When should I use a Set in Ruby?

This is an archive of a blog post I wrote during my third venture (PullReview).

You are developing a small contact manager for a client.

Contact = Struct.new(:name, :email)

One important feature is the ability to define a list of contacts.

granny = Contact.new('granny', 'granny@weatherwax.me')
bill = Contact.new('bill', 'bill@door.me')

At first, you start with an array,

contacts = []

but you quickly realize that you have to check for duplicates. You end up with something like this in many places:

contacts << granny unless contacts.include? granny



Last time you were working with the list, you needed to send a campaign email to each contact:

contacts << granny
# => [granny]

contacts << granny
# => [granny, granny]

contacts << bill
# => [granny, granny, bill]

# …

contacts.each do |contact|
  contact.send_campaign # oops!
end

You forgot to check for duplicates, you shipped it, and the campaign was sent twice to granny!

You don't like that, and you're right. The code is fragile: you shouldn't have to enforce the uniqueness constraint by hand at every call site; the collection itself should guarantee it.

If you need a collection with uniqueness guaranteed, use a Set:

require 'set'


contacts = Set.new

# ...

contacts << granny
# => {granny}

contacts << granny
# => {granny}

contacts << bill
# => {granny, bill}

contacts.each do |contact|
  contact.send_campaign # yeah!
end
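
For readers who want to run the example end to end, here is a minimal sketch. The snippets above never define `send_campaign`, so the version below is a hypothetical stand-in that only records each delivery in a `DELIVERIES` array (also an assumption of this sketch). It also shows that, because a Struct compares and hashes by its member values, a second `Contact` object with the same name and email is treated as a duplicate.

```ruby
require 'set'

# Hypothetical delivery log, used only to observe who got the campaign.
DELIVERIES = []

# Same Contact as in the post, plus a stand-in send_campaign method
# (an assumption: the real one would send an email).
Contact = Struct.new(:name, :email) do
  def send_campaign
    DELIVERIES << email
  end
end

granny = Contact.new('granny', 'granny@weatherwax.me')
bill   = Contact.new('bill', 'bill@door.me')

contacts = Set.new
contacts << granny
contacts << granny                                         # same object: ignored
contacts << Contact.new('granny', 'granny@weatherwax.me')  # equal members: ignored too
contacts << bill

contacts.each(&:send_campaign)

puts contacts.size       # 2
puts DELIVERIES.inspect  # one entry per distinct contact
```

If `Contact` were a plain class rather than a Struct, you would have to define `eql?` and `hash` yourself to get the same behaviour.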

Of course, Array is a fine structure too, if duplicates are allowed or you need to access the nth element. In the end, you should work with classes that properly represent your data: they will behave as you expect. In 99% of cases, that matters more than performance considerations; otherwise you'll end up with fragile code.
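
To make the trade-off concrete, here is a small sketch of moving between the two structures. It uses plain email strings instead of contacts to stay short; the variable names are only for illustration. A Set has no positional access, so you convert to an Array when you need an index, and which element ends up at which position depends on the Set's iteration order.

```ruby
require 'set'

emails = Set.new
emails << 'granny@weatherwax.me'
emails << 'bill@door.me'

# Set#add? returns nil when the element is already present, so the caller
# can tell whether the collection actually changed.
added_again = emails.add?('granny@weatherwax.me')

# No emails[1] on a Set: convert to an Array when an index is needed.
second = emails.to_a[1]

# Going the other way, to_set drops the duplicates of an existing Array.
legacy  = ['granny@weatherwax.me', 'granny@weatherwax.me', 'bill@door.me']
deduped = legacy.to_set

puts added_again.inspect  # nil
puts second
puts deduped.size         # 2
```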
