Did you say Complexity, eh?
This is an archive of blog post I wrote during my third venture (PullReview).
TL;DR After introducing what complexity is, how it can be measured, and how to monitor it in Ruby - I'll get to the difficult part: how to reduce it in Ruby. The post is concluded with some thoughts and recap.
Intro๐
Even Vader does not know what are those buttons for.
Developing a feature requires some attention on code quality - for the sake of reliability, efficiency, security, and maintainability. If you have been following us, you know that we are not satisfied with code that just works - and nor should you be. One of the ways to achieve good quality is keeping the complexity in check. After introducing what complexity is, how it can be measured, and how to monitor it in Ruby - I'll get to the difficult part: how to reduce it in Ruby.
Disclaimer: As complexity is not a simple topic and deserves to be illustrated - this post is a bit longer than usual - I hope youโll bear with me. Bookmark it, you can always read it in pieces.
Easy, isn't it? Eek!๐
Complexity is a term covering several issues - including time and space complexity of an algorithm. In this case, we're talking about code complexity i.e. how complex it is to understand and modify your code. For a machine, it makes no difference. For another human being, the case is entirely different. Complex code is hard to read - hard to modify - and hard to debug (yep, we're writing code for humans). It impacts the maintainability and reliability of the software directly .
As code complexity has many facets, it is not so easy to summarize it with a single measure. Several measures have been developed, each one allowing to grab one aspect of complexity.As an introduction, weโll be covering the Cyclomatic Complexity metric and the ABC metric.
Cyclomatic Complexity๐
The most known metric is the cyclomatic complexity indicator. It measures how many paths of execution are possible through the code. If there is no branch (neither loop nor exception rescue), then there is only one path (i.e. the code is purely sequential) and the complexity equals to 1.
def hello_world
puts "Hello world"
end
Each time you add a branch or a loop, you create new paths - and complexity increases. For instance, adding one if-clause results in two paths, and a complexity equaling 2.
def hello(name)
if name
puts "Hello #{name}"
end
end
Its evaluation is easy, especially when considering only one possible exit point, i.e. one return:
NB + 1
where NB is the number of branch and loop keywords. If you consider multiple exit points, it is:
NB - NR + 2
where NR is the number of returns. For instance, if we're rewriting the previous example with a guard clause, we have now a complexity equals to 1 as we have 2 exit points.
def hello(name)
return unless name
puts "Hello #{name}"
end
By nesting more and more branches, weโll get the Arrow Anti-Pattern with a high cyclomatic complexity.
Really sizing up its meaning needs some experience. However, according to the community and according to some studies, 11 seems to be a good sanity threshold at a module/class level. This is not just a number, it has been observed that defects and cyclomatic complexity are correlated!
The ABC Metric๐
ABC is not per se a complexity metric. It is a measure of number statements and structures. Still, it is accepted that functional size has an impact on the complexity: a method with a lot of instructions, assignments, calls, branches, and other structures will be more complex and harder to understand - due to its size. ABC is computed as a function distance of the number of: assignments (A), branches (B), and conditions (C):
|ABC| = sqrt( A * A + B * B + C * C )
Each language having its peculiarities, the counting of A, B, and C must be adjusted accordingly. Basically, it consists in deciding to which category A, B, and C a keyword and other language structure belongs - and how to count it. It's a good complementary metric to cyclomatic complexity as it takes into account more than just branches.
For instance, if we count unless for 1 as branch, assignment for 1, and function call for 1 as branch, the following code has a ABC metric of sqrt(1ยฒ + 2ยฒ + 0ยฒ) = ~2.2:
def hello(name)
return unless name
greetings = "Hello #{name}"
puts greetings
end
You-hoo! Gi'me a ruler!๐
"Give them a tool, the use of which will lead to new ways of thinking." Richard Buckminster Fuller
Metrics without measuring tools - rulers - are useless. In Ruby, Flog or Excellent are among the best known and most actively used tools.
Flog๐
Flog is a kind of extended ABC metrics tool for Ruby. The C stands for call - not condition - in this case - as Flog puts a particular accent on it. For instance, a call to metaprogramming methods has a higher weight than to a normal method.
As with every metric, you need some experience to really weigh a Flog score for a single method. The range [20, 60] is considered as a gray zone, needing some personal contemplation: it could be due to the business domain - or it's a true candidate for refactoring.
Excellent๐
Excellent is not a single โrulerโ, it's a set of checks and metrics that you can easily apply together on a codebase. As for complexity metrics, it includes ABC metric, an approximated cyclomatic complexity, and a Flog-like score.
Aye! It's high, but how to reduce it?๐
A metric (tool) is a way to check and evaluate how critical the situation is. It will get your attention to some piece of code and push you to ask yourself: do I need to refactor? Using tools like Flog and Excellent, this is only the easy part.
A score - while interesting - does not give any clue how to fix your code. One cannot say: when you have a score A you need to do this, and with a score B you have to do that. You need to investigate the situation to understand what's wrong and how you can refactor your code.
To address these doubts it might be useful following common guidelines as the SOLID ones. In the next paragraphs, Iโll be offering several solutions in different typical use cases. The examples are intentionally stupid to clearly illustrate the idea.
Conditional execution of a method๐
class Book
def buy
if available?
puts 'buy'
end
end
#
def available?
false
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
buy | 2.3 | 2 |
available? | 0 | 1 |
It's impossible to buy a Book
if it's not available?
. It's very common that a method canโt be run under some conditions - preconditions. Wrapping the block with an if-clause makes the normal path unclear - especially if there are several ones. In order to isolate business logic and preconditions, it's better to use guard clauses to stop execution in the beginning if the conditions are met. This can be done with Exception
or by directly return
.
class Book
def buy
# with Exception
raise 'The book is not available' unless available?
# without
return unless available?
puts 'buy'
end
#
def available?
false
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
buy (without exception) | 2.3 | 1 |
buy (with exception) | 3.3 | 1 |
available? | 0 | 1 |
Conditional behavior of a method๐
class Book
def price(number_of_copies)
total = @price
if number_of_copies > 50
total = @price * 0.75
elsif number_of_copies > 20
total = @price * 0.85
elsif number_of_copies > 10
total = @price * 0.95
end
total
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
price | 9.7 | 4 |
The price
of the Book
depends on the number of purchased copies. The method price
returns conditionally a value for different special cases. It has been implemented by following the school of one single exit point. By doing this, you enforce dealing with multiple cases at the same time. It makes the reading more difficult. Sometimes, it's still better to keep a single exit point, other times not. If you can delimit a clear behavioral scope for each special case and assign an exit gate, directly returning the value will generally make the code clearer and simpler (I let the reader check the impact on cyclomatic complexity). In our case, it's better to exit as soon as the price
is calculated.
class Book
def price(number_of_copies)
return @price * 0.75 if number_of_copies > 50
return @price * 0.85 if number_of_copies > 20
return @price * 0.95 if number_of_copies > 10
@price
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
price | 7.8 | 1 |
One other option is to extract the condition into a dedicated method get_percentage(number_of_copies)
and use a Hash
to store the mapping between the percentage and the conditional number_of_copies
(I'll use this approach below in the section "Repetitive conditional behavior").
Condition depending on types๐
class Book
end
#
class DVD
end
#
class Shop
def how_much_does_it_cost(product)
price = 0
if product.is_a? Book
price = 5
elsif product.is_a? DVD
price = 10
end
price
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
how_much_does_it_cost | 5.2 | 3 |
You can buy two kinds of product in the Shop
: Book
and DVD
. That Shop
practices a pricing only depending on the kind of product. At first, in the method how_much_does_it_cost
, the product
type is tested to decide the price. If you add a new product type, you'll need to add other if-clauses and make the method more complex (and lead you to the God Object anti-pattern). In fact, polymorphism is an object oriented mechanism that allows you to exactly do that: implementing different behaviors for different types. Using it helps drastically reducing complexity. In our case, we just need to put the price
in the product classes Book
and DVD
.
class Book
def price
5
end
end
#
class DVD
def price
10
end
end
#
class Shop
def how_much_does_it_cost?(product)
product.price
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
Shop#how_much_does_it_cost | 1.1 | 1 |
DVD#price | 0.3 | 1 |
Book#price | 0.3 | 1 |
Total | 1.7 | 3 |
Nil guard clause๐
# somewhere in you Shop code
# โฆ
if product
puts product.name
end
# โฆ
method | Flog Score | Cyclomatic Complexity |
---|---|---|
Shop#somewhere | 2.6 | 2 |
How many times did you not guard a call on a nil
object? Whatever the reason, if you have a nil
object, you cannot call a method. Each time you can have a nil
, it is necessary to guard a call on it.
This could be resolved with the Null Object pattern (another option with Ruby on Rails consists in using the method try). The idea is simply to provide a model of the absence of object with a compatible interface with the used type, i.e. calling a method on an instance of it won't throw a NoMethodError
exception and won't do anything. You can implement your own or require a Gem providing it such as the excellent Naught by Avdi Grimm. Once done, you can remove the guard clauses.
# somewhere in you Shop code
# โฆ
puts product.name
# โฆ
method | Flog Score | Cyclomatic Complexity |
---|---|---|
main | 2.2 | 1 |
Multiple branches leading to a same statement or return value๐
class Book
def price
if @pocket
return 5
else
if @second_hand
return 5
else
return 10
end
end
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
price | 2.3 | 1 |
For two different conditions ( @pocket and @second_hand), the same price
value is returned: 5. This can at least be combined as a single conditional expression thanks to Boolean algebra (this is called ConsolidateConditional refactoring).
class Book
def price
return 5 if @pocket || @second_hand
10
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
price | 2.1 | 1 |
It is also a good practice to extract complex Boolean expression into a dedicated method (this is called DecomposeConditional refactoring). It will be easier to modify and to reuse.
class Book
def price
return 5 if reduced_price?
10
end
#
private
def reduced_price?
@pocket || @second_hand
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
price | 1.8 | 1 |
reduced_price? | 1.0 | 1 |
Total | 2.8 | 2 |
If you need to implement a deep decision tree with multiple conditions and possible outputs (for instance when dealing with taxes in accounting), you'll probably need to use a dedicated library.
Long method๐
class Shop
def inventory_missing_products
# calculates the number of sold
nbr_of_sold = Hash.new(0)
receipts.each do |receipt|
nbr_of_sold[receipt.product.name] += 1
end
#
# calculates the number of stock
nbr_in_stock = Hash.new(0)
products_in_stock.each do |product|
nbr_in_stock[product.name] += 1
end
#
# calculate missing products
nbr_of_bought.inject(Hash.new(0)) do |missing_of, (name, nbr)|
missing_of[name] = nbr - nbr_of_sold - nbr_in_stock
end
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
inventory_missing_products | 18.6 | 4 |
The method inventory_missing_products
couldn't calculate the number of missing products without knowing two other amounts nbr_of_sold
and nbr_in_stock
. In our case, we calculated them in the same method. As you read, we needed to make clear that we have 3 different goals to achieve - by putting comments and adding separation white line. For the sake of clarity, it would be better to extract those preliminary calculations into dedicated methods. By doing so, we'll have smaller methods - with well defined goal and scope - and lower complexity. Of course, the total complexity won't be reduced (it could even be increased), but the complexity of each method will be now โdecentโ.
class Shop
def inventory_missing_products
nbr_of_sold = how_many_sold_products
nbr_in_stock = how_many_products_in_stock
nbr_of_bought.inject(Hash.new(0)) do |missing_of, (name, nbr)|
missing_of[name] = nbr - nbr_of_sold - nbr_in_stock
end
end
#
private
def how_many_sold_products
receipts.inject(Hash.new(0)) do |nbr_of_sold, receipt|
nbr_of_sold[receipt.product.name] += 1
nbr_of_sold
end
end
#
def how_many_products_in_stock
products_in_stock.inject(Hash.new(0)) do |nbr_in_stock, product|
nbr_in_stock[product.name] += 1
nbr_in_stock
end
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
inventory_missing_products | 10.9 | 2 |
how_many_sold_products | 7.4 | 2 |
how_many_products_in_stock | 6.0 | 2 |
Total | 25.3 | 6 |
This is also valid for branch and loop blocks and their expression. When the block becomes too long or the expression contains too many clauses, it's always better to extract them into a separate method. A method has a great benefit over a block that shouldn't be underestimated: it has a name.
Repetitive conditional behaviors๐
class Shop
def calculate_total_price(price, country)
case country
when 'BE'
return 0.7 * price + price * 0.21
when 'FR'
return price + price * 0.20
when 'UK'
return [0.9 * price - 1.0, 5.0].max + price * 0.20
end
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
calculate_total_price | 14.6 | 1 |
The Shop
is spread over Europe: this means a different VAT per country. The Shop
also applies different promotions depending on the country
. There is clearly a pattern common to each country: the price
with promo + the VAT (that needs to be applied on the original price
). We have duplicated codes and conditionals. We can do better.
If we want to keep the freedom to change the promo rules, we have a good candidate to apply the Strategy pattern, i.e. to extract the calculation of the promo as a Method Object that could be specialized if needed. In order to illustrate another option, we'll use another similar functional approach: using lambdas. It's especially suited for small functions. Finally, to get rid of the case branches, we'll use a Hash.
class Shop
PROMO_RULE_PER_COUNTRY = {
'BE' => ->(price) { price * 0.7 },
'UK' => ->(price) { [price * 0.9 - 1.0, 5.0].max }
}
PROMO_RULE_PER_COUNTRY.default = ->(price) { price }
#
VAT_PER_COUNTRY = {
'BE' => 0.21,
'FR' => 0.20,
'UK' => 0.20
}
VAT_RULE_PER_COUNTRY.default = 0.0
#
def calculate_total_price(price, country)
return PROMO_RULE_PER_COUNTRY[country].call(price) + VAT_RULE_PER_COUNTRY[country] * price
end
end
method | Flog Score | Cyclomatic Complexity |
---|---|---|
calculate_total_price | 6.2 | 1 |
Shop#constants | 8.8 | - |
Total | 15.0 | 1 |
Other cases๐
There are too many possible cases to be broken down in a simple - be it long - blog post like this one - to discuss different design pattern and refactoring methods. Enough material to write ten other posts. We'll surely keep them into our list of potential topics!
Some References๐
- Flattening Arrow Code by Jeff Atwood
- Refactorings catalog maintained by Martin Fowler
- How To Score Your Rails App's Complexity Before Refactoring by Eric Davis
- Through the eyes of Sonar: Complexity by Stรฉphan Mestach
- Ruby Science by Thoughtbot
Closing Thoughts, hmm.๐
Without any external point of view, it's difficult to realize that - while some piece of code seems easy to you - it's complex for others. Indeed, we don't necessarily think in the same way. Even if you take care, your code will become more complex quickly. A code base is a living thing - and evolves through the changes made. What was simple can become progressively complex - one little change after another. A Code Analysis Tool or - better - a Code Review (Tool) can underline the complexity and catch your attention. It is then possible - and necessary - to act.
Case | Refactoring |
---|---|
conditional execution of a method | Guard clauses |
conditional behavior of a method | Opportunistic return |
condition depending on types | Polymorphism |
nil guard clause | Null Object Pattern |
multiple branches leading to same statement or return | ConsolidateConditional and DecomposeConditional |
long method | ExtractMethod |
repetitive conditional behaviors | Strategy Pattern, lambda |
other cases | Refactoring, Design Pattern, etc. |
In the end, all those refactoring methods allow to reduce, move, or split the complexity. The goal is to make the work units - like functions, methods, classes, and modules - easier: easier to read - easier to understand - easier to change - easier to debug and to maintain. Sometimes - however - a higher score is better:
- the tackled business domain is complex - full of exceptions and conditions, or
- the simpler alternative uses more advanced language structure and libraries, making it more complex.
Conclusion. Whoa!๐
"Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it." Alen Perlis
Each metric grabs a different aspect of complexity. In the end, the devs are the only ones that can correctly interpret each case. Anyway, objectively pointing to complex locations is a good way to catch the attention - not only yours but also the code reviewerโs. Complexity is effectively a complex question - bringing you back to the basics. It initiates discussion with the code reviewer. Automated tools and code review are complementary approaches that will significantly help you to control code complexity.
If you have any comment, question, or feedback, please share them with me.