Coupling, Cohesion & Connascence
The foundational measurements of structural software quality
🌱 This wiki entry hasn't fully bloomed. It's very likely to change over the next little while.
Entropy is a fancy word for complexity, randomness, uncertainty or a state of disorder.
It was first recognized in the study of thermodynamics, but for us programmers, it's at the heart of everything we're working to do with software.
We're in the business of creating things that solve problems for people in the real world.
And you don't have to be Albert Camus to realize that:
The world is entropy. It is complex, random, uncertain, and disorderly.
The role of software is to cut through that chaos and produce a simplified (or good enough) model that operates on top of that complexity.
This is hard. This is where all our efforts go.
- Learn the domain with Agile planning & discovery practices: If we're going to get anywhere close to building something useful and maintainable, we'd better start by learning the language of the domain. Use Event Storming (or Event Modeling), BDD, and user stories to build a shared understanding and map out what we want to build.
- Use Agile technical practices: DDD, BDD, TDD, acceptance tests, and pair programming to model the domain, catch regressions, and ensure quick feedback loops.
- Design patterns and principles: SOLID, YAGNI, Simple Design, etc.
- Clean code: Style, formatting, conventions, object calisthenics, etc.
There's one foundational technique we use to solve complex problems. That's the technique of decomposing the problem into smaller pieces.
It's a great technique, but it can have both positive and negative impacts on coupling & cohesion: the best measures of structural software quality there are.
Coupling is a measure of how intertwined two components (and by components, we mean: routines, methods, functions, classes, modules, etc) are.
It's said that we should strive for loose coupling.
Loose coupling is good because it works to reduce maintenance costs by making code more testable and flexible.
- Loose coupling (ideal): Components are pretty independent and not tied to others. This helps testability and flexibility.
- Tight coupling: Components are inseparable. Changes made to one component in the group will likely ripple into the other dependent components. This hurts maintainability, testability, and flexibility.
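To make the two bullets above concrete, here's a minimal sketch. Every name in it (`Notifier`, `SmtpNotifier`, the two signup classes, `FakeNotifier`) is hypothetical and made up for illustration, and `SmtpNotifier` only pretends to send mail. The tightly coupled version builds its own dependency, while the loosely coupled one accepts anything with a matching `send` method.

```python
from typing import Protocol


class Notifier(Protocol):
    """Hypothetical abstraction: anything that can deliver a message."""

    def send(self, recipient: str, message: str) -> None: ...


class SmtpNotifier:
    """Stand-in for a real mail client; it only records what it would send."""

    def __init__(self) -> None:
        self.outbox: list[tuple[str, str]] = []

    def send(self, recipient: str, message: str) -> None:
        self.outbox.append((recipient, message))


class TightlyCoupledSignup:
    """Tight coupling: the class constructs its own concrete dependency."""

    def __init__(self) -> None:
        # Swapping the notifier means editing this class.
        self.notifier = SmtpNotifier()

    def register(self, email: str) -> None:
        self.notifier.send(email, "Welcome!")


class LooselyCoupledSignup:
    """Looser coupling: the dependency is passed in; only its shape matters."""

    def __init__(self, notifier: Notifier) -> None:
        self.notifier = notifier

    def register(self, email: str) -> None:
        self.notifier.send(email, "Welcome!")


class FakeNotifier:
    """Test double that satisfies Notifier without any mail infrastructure."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))
```

In a test, `LooselyCoupledSignup(FakeNotifier())` can be exercised without touching the real notifier, which is the testability and flexibility benefit described above. The two classes are still coupled to the `send` signature, just less strongly.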
It's not really possible to have zero coupling because the components we write need to be hooked up to each other in order for our software to do anything meaningful.
That being said, loosely coupled code generally correlates with low cohesion (bad): split things apart aggressively enough and the pieces that belong together end up scattered.
Cohesion is a measure of how related components are within a particular module. It's a measure of how much they belong together. Do you have stuff in here mostly focused on the topic at hand? Or is there stuff in here that probably belongs elsewhere?
You want to strive for high cohesion.
Tip: If you have classes that you call helpers, it likely means that your cohesion is low.
High cohesion is good because it works to reduce maintenance costs by making code more understandable.
- High cohesion (ideal): When a component (like a class, for example) has methods that all appear to be related to an underlying concept, then the class is said to have high cohesion. This helps understandability.
- Low cohesion: Many classes with only a single method each (methods that would be better off together) or one class with lots of unrelated methods are both examples of low cohesion.
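Here's a rough sketch of the difference, with hypothetical names throughout (`Helpers`, `Money`, and their methods are invented for this example). The `Helpers` class is the kind of grab bag the tip above warns about, while every method on `Money` relates to one underlying concept.

```python
from dataclasses import dataclass


class Helpers:
    """Low cohesion: unrelated utilities that merely share a class."""

    @staticmethod
    def format_price(cents: int) -> str:
        return f"${cents / 100:.2f}"

    @staticmethod
    def slugify(title: str) -> str:
        return "-".join(title.lower().split())


@dataclass(frozen=True)
class Money:
    """Higher cohesion: every method is about one concept, an amount of money."""

    cents: int

    def add(self, other: "Money") -> "Money":
        return Money(self.cents + other.cents)

    def is_zero(self) -> bool:
        return self.cents == 0

    def format(self) -> str:
        return f"${self.cents / 100:.2f}"
```

Notice that `format_price` has a natural home on `Money`, whereas `slugify` has nothing to do with prices and probably belongs near whatever deals with titles or URLs.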
The problem with high cohesion is that it typically correlates with tight coupling.
You see the conundrum we're in?
If we optimize for cohesion, we lose on coupling.
If we optimize for coupling, we lose on cohesion.
The doctrine of the mean: The most virtuous act is not the extreme - one side or the other. Instead, it's the middle path; this is what the best spot is. ━ Me, paraphrasing Aristotle's ethics
It turns out that there's another mental model we can use to balance dependencies.
It's called connascence: a generalization of coupling and cohesion into one "holistic approach" according to connascence.io.
Also according to connascence.io,
Connascence is a metric, and like all metrics, it is an imperfect measure. However, connascence takes a more holistic approach, where each instance of connascence in a codebase must be considered on three separate axes:
- Strength. Stronger connascence is harder to discover or harder to refactor.
- Degree. An entity that is connascent with thousands of other entities is likely to be a larger issue than one that is connascent with only a few.
- Locality. Connascent elements that are close together in a codebase are better than ones that are far apart.
The three properties of Strength, Degree, and Locality give the programmer all the tools they need in order to make informed decisions about when they will permit certain types of coupling, and when the code ought to be refactored.
Well that's powerful.
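The following image lists the different forms of connascence, ordered by strength. From weakest to strongest, as listed on connascence.io, they are the static forms (name, type, meaning, position, algorithm) followed by the dynamic forms (execution, timing, value, identity).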
You want to prefer refactorings that push you towards the weaker connascence types (where a change to one component is less likely to ripple into another) over the stronger ones (where that ripple is more likely).
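As a small sketch of what such a refactoring might look like, here are two cases that each move from a stronger form to connascence of name. The functions and the `OrderStatus` enum are hypothetical, invented for this example. The first pair goes from connascence of position to name, the second from connascence of meaning to name.

```python
from enum import Enum


# Connascence of position: every caller must agree on the argument order.
def make_transfer_positional(source: str, target: str, cents: int) -> dict:
    return {"source": source, "target": target, "cents": cents}


# Weaker: connascence of name. The bare * makes the arguments keyword-only,
# so callers must agree on the names but not on the order.
def make_transfer_named(*, source: str, target: str, cents: int) -> dict:
    return {"source": source, "target": target, "cents": cents}


# Connascence of meaning: callers must know that 2 means "shipped".
def is_shipped_by_code(status: int) -> bool:
    return status == 2


class OrderStatus(Enum):
    PENDING = 1
    SHIPPED = 2


# Weaker: connascence of name. Callers refer to OrderStatus.SHIPPED instead
# of remembering what a bare number stands for.
def is_shipped(status: OrderStatus) -> bool:
    return status is OrderStatus.SHIPPED
```

With the positional version, swapping `source` and `target` at a call site is a silent bug. With the keyword-only version, the call `make_transfer_named(cents=500, target="checking", source="savings")` reads the same in any order, and passing the arguments positionally raises a `TypeError`.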
Apparently this concept goes way back to 1992, when Meilir Page-Jones first introduced it in "Comparing Techniques by Means of Encapsulation and Connascence". He wrote about it again in Fundamentals of Object-Oriented Design in UML in 1999, and Jim Weirich expanded on it in 2009 in a presentation called "The Grand Unified Theory of Software Design".
I have more to say about this, but I'm still learning about it, so for now, I'll recommend you read the official site on the topic and check out the examples. I'll update this page at some point.