In computer programming, a code smell is any characteristic in the source code of a program that possibly indicates a deeper problem. Determining what is and is not a code smell is subjective, and varies by language, developer, and development methodology.
The term was popularised by Kent Beck on WardsWiki in the late 1990s. Usage of the term increased after it was featured in the 1999 book Refactoring: Improving the Design of Existing Code by Martin Fowler. It is also a term used by agile programmers.
Definition
One way to look at smells is with respect to principles and quality: "Smells are certain structures in the code that indicate violation of fundamental design principles and negatively impact design quality". Code smells are usually not bugs; they are not technically incorrect and do not prevent the program from functioning. Instead, they indicate weaknesses in design that may slow down development or increase the risk of bugs or failures in the future. Bad code smells can be an indicator of factors that contribute to technical debt. Robert C. Martin calls a list of code smells a "value system" for software craftsmanship.
Often the deeper problem hinted at by a code smell can be uncovered when the code is subjected to a short feedback cycle, where it is refactored in small, controlled steps, and the resulting design is examined to see if there are any further code smells that in turn indicate the need for more refactoring. From the point of view of a programmer charged with performing refactoring, code smells are heuristics to indicate when to refactor, and what specific refactoring techniques to use. Thus, a code smell is a driver for refactoring.
A 2015 study utilizing automated analysis for half a million source code commits and the manual examination of 9,164 commits determined to exhibit "code smells" found that:
- There exists empirical evidence for the consequences of "technical debt", but there exists only anecdotal evidence as to how, when, or why this occurs.
- Common wisdom suggests that urgent maintenance activities and pressure to deliver features while prioritizing time-to-market over code quality are often the causes of such smells.
Tools such as Checkstyle, PMD, FindBugs, and SonarQube can automatically identify code smells.
Common code smells
Application-level smells
- Mysterious name: functions, modules, variables or classes that are named in a way that does not communicate what they do or how to use them.
- Duplicated code: identical or very similar code that exists in more than one location.
- Contrived complexity: forced usage of overcomplicated design patterns where simpler design patterns would suffice.
- Shotgun surgery: a single change that needs to be applied to multiple classes at the same time.
- Uncontrolled side effects: side effects of coding that commonly cause runtime exceptions, with unit tests unable to capture the exact cause of the problem.
- Variable mutations: mutations that vary widely enough that refactoring the code becomes increasingly difficult, due to the actual value's status as unpredictable and hard to reason about.
- Boolean blindness: easy to assert on the opposite value and still type checks.
Class-level smells
- Large class: a class that contains too many types or contains many unrelated methods
- Feature envy: a class that uses methods of another class excessively.
- Inappropriate intimacy: a class that has dependencies on implementation details of another class
- Refused bequest: a class that overrides a method of a base class in such a way that the contract of the base class is not honored by the derived class
- Lazy class/freeloader: a class that does too little.
- Excessive use of literals: these should be coded as named constants, to improve readability and to avoid programming errors. Additionally, literals can and should be externalized into resource files/scripts, or other data stores such as databases where possible, to facilitate localization of software if it is intended to be deployed in different regions.
- Cyclomatic complexity: too many branches or loops; this may indicate a function needs to be broken up into smaller functions, or that it has potential for simplification/refactoring.
- Downcasting: a type cast which breaks the abstraction model; the abstraction may have to be refactored or eliminated.
- Orphan variable or constant class: a class that typically has a collection of constants which belong elsewhere where those constants should be owned by one of the other member classes.
- Data clump: Occurs when a group of variables are passed around together in various parts of the program. In general, this suggests that it would be more appropriate to formally group the different variables together into a single object, and pass around only the new object instead.
Method-level smells
- Too many parameters: a long list of parameters is hard to read, and makes calling and testing the function complicated. It may indicate that the purpose of the function is ill-conceived and that the code should be refactored so responsibility is assigned in a more clean-cut way.
- Long method: a method, function, or procedure that has grown too large.
- Excessively long identifiers: in particular, the use of naming conventions to provide disambiguation that should be implicit in the software architecture.
- Excessively short identifiers: the name of a variable should reflect its function unless the function is obvious.
- Excessive return of data: a function or method that returns more than what each of its callers needs.
- Excessive comments: a class, function or method has irrelevant or trivial comments. A comment on an attribute setter/getter is a good example.
- Excessively long line of code (or God Line): A line of code which is too long, making the code difficult to read, understand, debug, refactor, or even identify possibilities of software reuse.