This Q&A is part of a weekly series of posts highlighting common questions encountered by technophiles and answered by users at Stack Exchange, a free, community-powered network of 100+ Q&A sites.
Lately I’ve been trying to split long methods into several short ones.
For example: I have a `process_url()` function which splits URLs into components and then assigns them to some objects via their methods. Instead of implementing all this in one function, I only prepare the URL for splitting in `process_url()` and then pass it to the `process_components()` function, which in turn passes the components to the `assign_components()` function.
At first, this seemed to improve readability. Instead of huge ‘God’ methods and functions, I had smaller ones with more descriptive names. However, looking through some code I’ve written that way, I’ve found that I now have no idea whether these smaller functions are called by any other functions or methods.
To continue the previous example, someone looking at the code might think that the functionality of `process_components()` is abstracted into a function because it’s called by various methods and functions, when in fact it’s only called by `process_url()`.
This seems somewhat wrong. The alternative is to still write long methods and functions but indicate their sections with comments.
Is the function-splitting technique I described wrong? What is the preferred way of managing large functions and methods?
Answer: A message from MichaelT (135 votes)
Testing code that does lots of things is difficult. Debugging code that does lots of things is difficult.
The solution to both of these problems is to write code that doesn’t do lots of things. Write each function so that it does one thing and only one thing. This makes them easy to test with a unit test (one doesn’t need umpteen dozen unit tests).
A co-worker of mine uses this phrase when judging if a given method needs to be broken up into smaller ones: “If, when describing the activity of the code to another programmer you use the word ‘and,’ the method needs to be split into at least one more part.”
I have a `process_url()` function which splits URLs into components and then assigns them to some objects via their methods.
This should be at least two methods. It is OK to wrap them in one publicly facing method, but the workings should be two different methods.
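A minimal sketch of that shape in Python, assuming the standard library's `urlsplit` does the splitting. The `Link` class and the helper name `split_url()` are hypothetical, chosen only for illustration; the question does not show its real objects.

```python
from urllib.parse import urlsplit


class Link:
    """Hypothetical target object; stands in for the asker's real objects."""


def split_url(url):
    """Do one thing: break a URL string into named components."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "host": parts.hostname, "path": parts.path}


def assign_components(components, target):
    """Do one thing: copy each component onto the target object."""
    for name, value in components.items():
        setattr(target, name, value)


def process_url(url, target):
    """Public wrapper: the only place where the two steps are combined."""
    assign_components(split_url(url), target)
    return target
```

Each helper can now be unit-tested on its own: `split_url()` needs no target object, and `assign_components()` needs no real URL.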
Related: “One-line functions that are called only once…”
Answer: A message from Sebastian (5 votes)
If anyone is interested in some literature on this question, this is exactly what Joshua Kerievsky refers to as “Compose Method” in his Refactoring to Patterns (Addison-Wesley):
Transform the logic into a small number of intention-revealing steps at the same level of detail.
I believe the correct nesting of methods according to their “detail level” is important here. See an excerpt on the publisher’s site:
Much of the code we write doesn’t start out being simple. To make it simple, we must reflect on what isn’t simple about it and continually ask, “How could it be simpler?” We can often simplify code by considering a completely different solution. The refactorings in this chapter present different solutions for simplifying methods, state transitions, and tree structures.
_Compose Method (123)_ is about producing methods that efficiently communicate what they do and how they do what they do. A Composed Method [Beck, SBPP] consists of calls to well-named methods that are all at the same level of detail. If you want to keep your system simple, endeavor to apply _Compose Method (123)_ everywhere…
Addendum: Kent Beck (Implementation Patterns) refers to it as “Composed Method.” He advises you to:
[C]ompose methods out of calls to other methods, each of which is at roughly the same level of abstraction
One of the signs of a poorly composed method is a mixture of abstraction levels[.]
Here, again, is the warning not to mix different abstraction levels (emphasis mine).
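To make the "same level of detail" idea concrete, here is a rough Python sketch. It is not taken from either book; all function names are hypothetical. The first function mixes low-level string handling with a high-level call, while the second is composed only of named steps.

```python
from urllib.parse import urlsplit


def describe_url_mixed(url):
    # Mixed levels: raw string fiddling sits next to a high-level call.
    url = url.strip()
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    return format_summary(parts.hostname, parts.path or "/")


def describe_url(url):
    # Composed: each line is one named step at the same level of detail.
    prepared = prepare_url(url)
    host, path = split_host_and_path(prepared)
    return format_summary(host, path)


def prepare_url(url):
    """Trim whitespace and add a default scheme if none is present."""
    url = url.strip()
    return url if "://" in url else "http://" + url


def split_host_and_path(url):
    """Return the host and the path, using "/" for an empty path."""
    parts = urlsplit(url)
    return parts.hostname, parts.path or "/"


def format_summary(host, path):
    return f"{host} -> {path}"
```

Both versions return the same result; the difference is that `describe_url()` reads as a short list of intentions, and the details live one level down.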
Answer: A message from Ryan Rodemoyer (9 votes)
I’ve never taken issue with other developers splitting larger methods into smaller methods, as it’s a pattern that I follow myself. The “God” method is a terrible trap to fall into and others who are less experienced or simply don’t care tend to get caught more often than not. That being said…
It’s incredibly important to use appropriate access modifiers on the smaller methods. It’s frustrating to find a class littered with small public methods, because then I lose all confidence that I can find where and how each method is used throughout the application.
I live in C#-land so we have “public,” “private,” “protected,” “internal,” and seeing those words shows me beyond a shadow of a doubt the scope of the method and where I must look for calls. If it’s private, I know the method is used in only one class and I have full confidence when refactoring.
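Python, which the `process_url()` naming in the question suggests, has no enforced equivalent of these keywords. As a loose analogue only, a leading underscore marks a method as internal by convention. The class and method names in this sketch are hypothetical.

```python
from urllib.parse import urlsplit


class UrlProcessor:
    """Only process() is meant to be called from outside the class."""

    def __init__(self):
        self.seen_hosts = []

    def process(self, url):
        host = self._extract_host(url)
        self._record(host)
        return host

    # Leading underscore: internal by convention, not enforced by Python.
    def _extract_host(self, url):
        return urlsplit(url).hostname

    def _record(self, host):
        self.seen_hosts.append(host)


def public_methods(cls):
    """List the names a reader should treat as the class's public surface."""
    return [name for name in dir(cls) if not name.startswith("_")]
```

Unlike `private` in C#, the underscore only signals intent; callers outside the class can still reach `_extract_host()`, so the confidence it gives when refactoring is weaker.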
In the Visual Studio world, having multiple solutions (`.sln`) exacerbates this anti-pattern because the IDE and ReSharper “Find Usages” helpers will not find usages outside of the open solution.
Answer: A message from Scott Whitlock (43 votes)
Yes, splitting long functions is normal. This is a way of doing things that’s encouraged by Robert C. Martin in his book Clean Code. Particularly, you should be choosing very descriptive names for your functions as a form of self-documenting code.