
March 15, 2007
Everybody agrees that avoiding dependency cycles between components is a central design principle. While developing NDepend, we had the opportunity to dig deeper into this subject. This article explains the interesting and non-trivial principles we noticed along the way. After a few reminders, we’ll look at several concrete cases showing where and how these principles can lead to a better architecture. The message is: you can keep a clean architecture by controlling dependencies between components.
Note: In this article we’ll consider only static dependencies, the ones that are created at compile time, and thus focus on code design. In this context, dynamic dependencies, created at runtime with delegates or reflection, are irrelevant.
Dependency cycles between components lead to what is commonly called spaghetti code or tangled code. If component A depends on B, which depends on C, which depends on A, then A can’t be developed and tested independently of B and C. A, B and C form an indivisible unit, a kind of super-component. This super-component costs more than the sum of the costs of A, B and C because of the diseconomy of scale phenomenon (well documented in Software Estimation: Demystifying the Black Art by Steve McConnell). Basically, it holds that the cost of developing an indivisible piece of code grows faster than linearly with its size. This suggests that developing and maintaining 1,000 LOC (Lines Of Code) will likely cost three or four times more than developing and maintaining 500 LOC, unless the code can be split into two independent lumps of 500 LOC each. Hence the comparison with spaghetti, which describes tangled code that can’t be maintained. To rationalize an architecture, one must ensure that there are no dependency cycles between components, and also check that the size of each component is acceptable (500 to 1,000 LOC).
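To make the notion of super-component concrete, here is a minimal sketch in C# that groups components able to reach each other through dependencies. The class name SuperComponents and the dictionary-based representation of the dependency graph are assumptions made for illustration; the code is not taken from NDepend, and it favors readability over speed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SuperComponents
{
    // Returns every component reachable from 'start' through one or more dependency edges.
    // 'start' itself is in the result only if it belongs to a cycle.
    public static HashSet<string> Reachable(IDictionary<string, string[]> uses, string start)
    {
        var seen = new HashSet<string>();
        var stack = new Stack<string>();
        stack.Push(start);
        while (stack.Count > 0)
        {
            var current = stack.Pop();
            string[] targets;
            if (!uses.TryGetValue(current, out targets)) continue;
            foreach (var target in targets)
                if (seen.Add(target)) stack.Push(target);
        }
        return seen;
    }

    // Groups components that mutually reach each other.
    // Only groups with more than one member are reported as super-components.
    public static List<List<string>> Find(IDictionary<string, string[]> uses)
    {
        var nodes = uses.Keys.Concat(uses.Values.SelectMany(v => v))
                        .Distinct()
                        .OrderBy(n => n, StringComparer.Ordinal)
                        .ToList();
        var reach = nodes.ToDictionary(n => n, n => Reachable(uses, n));
        var assigned = new HashSet<string>();
        var result = new List<List<string>>();
        foreach (var a in nodes)
        {
            if (assigned.Contains(a)) continue;
            var group = nodes
                .Where(b => b == a || (reach[a].Contains(b) && reach[b].Contains(a)))
                .ToList();
            foreach (var member in group) assigned.Add(member);
            if (group.Count > 1) result.Add(group);
        }
        return result;
    }
}
```

With the graph A uses B, B uses C, C uses A and D uses A, the sketch reports a single super-component made of A, B and C; D merely depends on the cycle without being part of it.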
Interestingly enough, dependency cycles between classes are not problematic. As long as there are no dependency cycles between components, dependency cycles between classes are de facto bounded. The same goes for dependency cycles between methods (i.e., recursive calls). Several GoF design patterns foster dependency cycles between classes.
Assemblies are a common form for packaging .NET components. The more I analyze real-world applications, the more I notice that assemblies are over-used. Assemblies are costly during development because they are physical units. They have multiple drawbacks:
- Cost at development and compile time: Every .NET developer knows that the C# and VB.NET compilers are efficient. However, we also know that when multiple projects are opened simultaneously in a solution, compilation performance drops dramatically. A quick benchmark reveals that a single project of 25,900 LOC compiles in 2.3 seconds, while 14 projects (18,500 LOC in total) compile in 6.2 seconds. From real-world observations, solutions with numerous projects seem to be the rule, not the exception. Developers get frustrated rebuilding their solutions several times a day. To minimize this problem, one can use several solutions that target the same set of projects, but doing so makes the work environment more complex. A better alternative lies in the relatively unknown Visual Studio solution folders. This great feature helps when dealing with solutions involving numerous projects, but it doesn’t reduce the time needed to rebuild all projects.
- Cost at deployment time: Since Visual Studio can’t build multi-module assemblies, we tend to consider that an assembly is a single file. The more assemblies you have, the more problems you will encounter at deployment time due to file management issues. The Agile trend fosters automatic and daily builds; still, problems are looming. These include incoherent versions, out-of-sync assembly PDB files and source code, file loss, assembly referencing issues, obfuscation issues, mess on production machines… You can still use the ILMerge tool to merge several assemblies into a single file, but doing so won’t solve all problems and can introduce tricky bugs.
- Cost at runtime: Each assembly demands work from the CLR. When the CLR loads an assembly, it resolves the version and the location, loads the file, checks CAS (Code Access Security) evidence and permissions, and reserves some memory to store the assembly manifest. This burden significantly increases the startup time of an application when dozens of assemblies must be loaded in a row.
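As a rough way to see how many assemblies an application has made the CLR load, the sketch below lists the assemblies present in the current application domain. The class name LoadedAssemblies is hypothetical; AppDomain.CurrentDomain.GetAssemblies() is the standard .NET call. It only shows what is loaded at the moment of the call and says nothing about load time.

```csharp
using System;
using System.Linq;

public static class LoadedAssemblies
{
    // Returns the simple names of the assemblies currently loaded in this application domain.
    public static string[] Names()
    {
        return AppDomain.CurrentDomain.GetAssemblies()
            .Select(a => a.GetName().Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
    }
}
```

Calling LoadedAssemblies.Names() once at startup and once after the main window is shown gives a quick idea of how many files the CLR had to resolve and load in between.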
The .NET platform comes with a lightweight construct to build components: namespaces. Namespaces are generally under-used. Most assemblies I analyzed contain just one or two namespaces. Why do developers tend to prefer assemblies over namespaces?
Reason 1: Build technologies such as MSBuild and NAnt automatically detect and pinpoint dependency cycles between assemblies.
Reason 2: Assemblies come with the internal visibility (C# keyword: internal, VB.NET keyword: Friend), which in effect enforces encapsulation. While the internal visibility is definitely a great feature, you can also harness the namespace hierarchy to encapsulate and organize your classes at a finer grain. For example, thanks to the namespace hierarchy you can limit your application to the System.Data and System.Data.Common namespaces and avoid being coupled with DB providers such as System.Data.SqlClient or System.Data.OleDb.
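The System.Data example can be sketched as follows. The namespace MyApp.DataAccess and the class ScalarQuery are hypothetical names; the code refers only to types from System.Data and System.Data.Common, so the choice of a concrete provider is left to the caller. It is a minimal illustration, assuming the caller already holds a DbProviderFactory for its provider.

```csharp
using System.Data;
using System.Data.Common;

namespace MyApp.DataAccess
{
    public static class ScalarQuery
    {
        // Runs a query and returns its first column of the first row.
        // No type from System.Data.SqlClient or System.Data.OleDb is mentioned here.
        public static object Execute(DbProviderFactory factory, string connectionString, string sql)
        {
            using (IDbConnection connection = factory.CreateConnection())
            {
                connection.ConnectionString = connectionString;
                connection.Open();
                using (IDbCommand command = connection.CreateCommand())
                {
                    command.CommandText = sql;
                    return command.ExecuteScalar();
                }
            }
        }
    }
}
```

Only the code that picks the factory needs to know which provider is in use; everything written against MyApp.DataAccess stays independent of it.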
Reason 3: An assembly, being a file, allows the CLR to load code on demand. This is why the .NET framework is made of around 30 assemblies. If it were released as a single assembly, the CLR would load the WindowsForms code for all applications, wasting 5MB of precious memory each time an ASP.NET application ran. However, there is still room for big assemblies such as the WindowsForms one (it is made of 559,301 IL instructions, which means around 90,000 LOC given the roughly constant ratio of 6 to 7 IL instructions per line of C# that we noticed). The problem is that in the real world we tend to use numerous assemblies even for 20,000 LOC applications.
Reason 4: Assemblies are black boxes. The agile trend fosters collective code ownership, which leans toward fewer black boxes. Big applications still need black boxes to clearly separate the main parts developed by different teams, but this still argues for fewer, bigger assemblies.
We conclude that controlling dependencies between namespaces is the key to reducing the number of assemblies and increasing the number of namespaces. Benefits are:
- Lighter build process
- Optimized compilation time
- Lighter deployment
- Better startup time for our applications
- Facilities for hierarchical components
- Facilities for more finely-grained components
In this section we will show some dependency cycles in some well-known open-source programs. We are not judging the quality of the code. The goal is to explain that it is impossible to manually control dependencies between namespaces.
The picture below shows a Dependency Structure Matrix (DSM). It represents the internal dependencies of the assembly Spring.Core.dll found in the Spring.NET open-source project. This picture and the following ones were generated by NDepend:

A blue cell shows a dependency between the two namespaces on the horizontal and vertical axes. Here, we are in indirect dependency mode. For example, the bottom-left cell [0,12] reveals that the namespace Spring.Proxy is indirectly using the namespace Spring.Threading with a shortest path of length 5. The picture below should make things clear:

A black cell shows that there is a dependency cycle between the two namespaces on the horizontal and vertical axes. For example, the pair of cells [9,7] and [7,9] reveals that the namespaces Spring.Validation and Spring.Core.* belong to a dependency cycle whose shortest length is five.
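The lengths shown in indirect dependency mode can be approximated with a breadth-first search. The sketch below is an assumption about how such values could be computed, not a description of NDepend internals; the names DependencyPaths, ShortestPathLength and RoundTripLength are hypothetical. The round trip is the shortest path from one component to the other plus the shortest path back, which is one reasonable reading of the length of the shortest cycle containing both.

```csharp
using System.Collections.Generic;

public static class DependencyPaths
{
    // Number of dependency edges on the shortest path from 'from' to 'to', or -1 if none exists.
    public static int ShortestPathLength(IDictionary<string, string[]> uses, string from, string to)
    {
        var visited = new HashSet<string>();
        var queue = new Queue<KeyValuePair<string, int>>();
        queue.Enqueue(new KeyValuePair<string, int>(from, 0));
        while (queue.Count > 0)
        {
            var item = queue.Dequeue();
            string[] targets;
            if (!uses.TryGetValue(item.Key, out targets)) continue;
            foreach (var target in targets)
            {
                // Breadth-first order guarantees the first hit is a shortest path.
                if (target == to) return item.Value + 1;
                if (visited.Add(target))
                    queue.Enqueue(new KeyValuePair<string, int>(target, item.Value + 1));
            }
        }
        return -1;
    }

    // Length of the shortest round trip a -> b -> a, or -1 if a and b are not in a common cycle.
    public static int RoundTripLength(IDictionary<string, string[]> uses, string a, string b)
    {
        int forward = ShortestPathLength(uses, a, b);
        int backward = ShortestPathLength(uses, b, a);
        return (forward < 0 || backward < 0) ? -1 : forward + backward;
    }
}
```

For a chain A, B, C, D, E where each component uses the next one and E uses A, the path from A to E has length 4 and the round trip between A and E has length 5.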

The red square around the black cells means that each of the six namespaces concerned is using, directly or indirectly, the other five. In other words, these six namespaces form a super-component. The size of this super-component is 10,244 LOC. If there were no dependency cycles, the biggest namespace would be Spring.Objects.* with 3,989 LOC. To be precise, this namespace is itself componentized and contains 11 sub-namespaces.
Notice that, according to the article on Spring’s architecture listed in the references, the Java Spring framework doesn’t contain any dependency cycle.
The same phenomenon can be observed with the namespaces of the assembly ICSharpCode.SharpDevelop.dll found in the SharpDevelop open-source project. The eight namespaces entangled together form a super-component of 25,854 LOC.

And here is another occurrence of this super-component phenomenon, found in the assembly ThoughtWorks.CruiseControl.Core.dll from the CruiseControl.NET open-source project.

Let’s have a glance at the namespaces of the assembly mscorlib found in the .NET framework 2.0 (the same holds for .NET 3.0!). Almost all namespaces are entangled together, for a total of 435,444 IL instructions (which should be around 65,000 LOC of C#).

Let us remember that mscorlib is the most used assembly because it contains essential types such as int, string or Thread. All namespaces of mscorlib are essential, and this explains why they need each other. Would it be possible to break the cycle between System and System.Threading? The following matrix can help in answering this question. It represents the dependencies between the 280 types of System and the 68 types of System.Threading. Blue cells show dependencies from System.Threading to System, green cells show dependencies from System to System.Threading, black cells show bidirectional dependencies, and hashed cells are irrelevant here (they just indicate that some types are invisible to others because of private and protected visibility).

The fact that there are far fewer green cells than blue cells means that it would be easier to break the dependency from System to System.Threading than the other direction. Would it be a good thing? Not really. In the picture below, we selected the types from System that are using something from System.Threading. Most of them use System.Threading to be thread-safe, through a massive use of the Monitor and Interlocked classes.

Let’s be precise and emphasize that using DSM to represent dependencies is a better choice than using a ‘boxes-and-arrows’ diagram. Below you’ll find such a diagram representing the dependencies shown in the above DSM. Which is clearer?

The assembly VisualNDepend.exe found in the NDepend application contains around 25,000 C# LOC that handle all UI features. Once the DSM was coded, it was interesting to harness it to analyze the architecture of VisualNDepend.exe. We then figured out that our architecture was completely entangled, as shown in the picture below (taken in indirect dependency mode).

The picture below is taken in direct dependency mode. Row and column 13 reveal that the namespace VisualNDepend.Kernel is directly used by, and is directly using, almost all other namespaces. This is because of the mediator design pattern used for inter-panel communication: the VisualNDepend.Kernel namespace contains the mediator class.

It took three days to get rid of the dependency cycles between the namespaces of VisualNDepend. The matrix below shows the result in direct dependency mode. For example, cells [15,9] and [9,15] reveal that eight types of the namespace VisualNDepend.GraphLoader are using 23 types of the namespace VisualNDepend.GraphObjectModel.

The first refactoring task was to split the namespace VisualNDepend.Kernel into two namespaces. The first one is the low-level namespace VisualNDepend.KernelInterface, which is heavily used (see the blue cells of row 14) and uses almost nothing (see the green cells of row 14). The second one is the high-level namespace VisualNDepend.KernelImpl, which uses almost all other namespaces (see the green cells of row 2) but is almost never used (see the blue cells of row 2). Our mediator class is a singleton class named MediatorImpl, defined in the namespace VisualNDepend.KernelImpl. It implements an interface named IMediator, defined in the namespace VisualNDepend.KernelInterface. The only place where the class MediatorImpl is referenced is the Main() method defined in the top-level VisualNDepend namespace. The point is to obtain a reference of type IMediator to the only instance of MediatorImpl, in order to make this object widely accessible without referencing the class MediatorImpl. The code looks like the following line:
Mediator.Initialize(MediatorImpl.Instance);
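A minimal sketch of what sits behind this line might look as follows. The names IMediator, MediatorImpl, Mediator.Initialize and the two namespaces come from the text above; everything else is an assumption made for illustration: the placement of the static Mediator class in VisualNDepend.KernelInterface, its Instance property, and the Notify and LastMessage members. The real NDepend code is not shown in this article.

```csharp
using System;

namespace VisualNDepend.KernelInterface
{
    // Low-level contract; Notify is a hypothetical member used only for this sketch.
    public interface IMediator
    {
        void Notify(string message);
    }

    // Static access point: other namespaces reach the mediator through IMediator only.
    public static class Mediator
    {
        private static IMediator s_Instance;

        public static void Initialize(IMediator mediator)
        {
            if (mediator == null) throw new ArgumentNullException("mediator");
            s_Instance = mediator;
        }

        public static IMediator Instance
        {
            get
            {
                if (s_Instance == null)
                    throw new InvalidOperationException("Mediator.Initialize() has not been called.");
                return s_Instance;
            }
        }
    }
}

namespace VisualNDepend.KernelImpl
{
    using VisualNDepend.KernelInterface;

    // High-level singleton; it may use any other namespace, but nothing except Main() names it.
    public sealed class MediatorImpl : IMediator
    {
        public static readonly MediatorImpl Instance = new MediatorImpl();

        private MediatorImpl() { }

        // Hypothetical state, kept only so that the effect of Notify can be observed.
        public string LastMessage { get; private set; }

        public void Notify(string message)
        {
            LastMessage = message;
        }
    }
}
```

With this layout, a panel calls Mediator.Instance.Notify(...) and depends only on VisualNDepend.KernelInterface, so the dependency from the panels to the high-level namespace disappears.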
Once this refactoring was done, it was easy to get rid of the remaining dependency cycles, such as the bidirectional dependency between the namespaces VisualNDepend.QueryPanel.* and VisualNDepend.GraphObjectModel.*. One cool thing about bidirectional dependencies is that most of the time, one of the two directions is privileged. It is not by chance.
While coding, we gain an intuition regarding the lower and higher levels of our application. While analyzing the bidirectional dependency between the namespaces System and System.Threading, we figured out that the direction System.Threading to System is privileged because the level of the namespace System is naturally lower than the level of the namespace System.Threading.
Things are less obvious when analyzing the bidirectional dependency between VisualNDepend.QueryPanel.* and VisualNDepend.GraphObjectModel.*. The matrix below makes us lean toward getting rid of the direction VisualNDepend.GraphObjectModel.* to VisualNDepend.QueryPanel.*, because the two green cells tell us that VisualNDepend.QueryPanel.* is using VisualNDepend.GraphObjectModel.* on a massive scale:

In order to plan the impact of the required refactoring, we need to know why the namespace VisualNDepend.GraphObjectModel is using the namespace VisualNDepend.QueryPanel.Compilation. The following matrix tells us that this dependency is only due to the use of the class MetricMinMax by the namespace VisualNDepend.GraphObjectModel. There are different ways to get rid of this dependency. We could have used the inversion of control principle (IoC), but here we preferred to move the class MetricMinMax from VisualNDepend.QueryPanel.Compilation to VisualNDepend.GraphObjectModel, because a quick impact analysis told us that this refactoring wouldn’t add new cycles.

We say that we are levelizing a program when we get rid of dependency cycles between its components. As far as we know, this term was coined 10 years ago by John Lakos in Large-Scale C++ Software Design. In this book, Lakos shows that the need for levelization is even more critical for C++ programs, where entangled header files (.h files) dramatically damage compiler performance. Indeed, most large-scale C++ programs (> 1M LOC) still take hours to compile on powerful machines.
The term levelization refers to the notion of level. The idea is that once all dependency cycles have been broken, each component has a level defined as:
- Level(C) = 0: if the component C isn’t using any other component.
- Level(C) = 1: if the component C is directly using only components of level 0 or third-party components (such as the .NET framework assemblies and namespaces).
- Level(C) = 1 + Max(level over the set of components that C is using directly), otherwise.
- Level(C) = N/A: if the component C belongs to a dependency cycle, or if C is using, directly or indirectly, a component that belongs to a dependency cycle.
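The definition above can be sketched in C# as follows. The class name Levels and the dictionary-based graph are assumptions made for illustration, and the code is not NDepend’s implementation. In this sketch a null level stands for N/A, and third-party components are simply leaves of the graph, so they get level 0 and a component using only them gets level 1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Levels
{
    // Returns the level of each component; null stands for N/A.
    public static Dictionary<string, int?> Compute(IDictionary<string, string[]> uses)
    {
        var nodes = uses.Keys.Concat(uses.Values.SelectMany(v => v)).Distinct().ToList();
        var level = new Dictionary<string, int?>();
        var pending = new HashSet<string>(nodes);
        bool progress = true;
        while (progress)
        {
            progress = false;
            foreach (var node in pending.ToList())
            {
                string[] targets;
                if (!uses.TryGetValue(node, out targets)) targets = new string[0];
                // Wait until every directly used component has received its level.
                if (targets.Any(t => pending.Contains(t))) continue;
                level[node] = targets.Length == 0 ? 0 : 1 + targets.Max(t => level[t].Value);
                pending.Remove(node);
                progress = true;
            }
        }
        // Whatever is left belongs to a cycle or depends on one.
        foreach (var node in pending) level[node] = null;
        return level;
    }
}
```

For the graph A uses B, B uses C, the levels are C = 0, B = 1 and A = 2. If X and Y use each other and Z uses X, all three end up with N/A.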
All this is illustrated below:

The NDepend tool supports the level metric for namespaces but also for assemblies, types and methods. Since we levelized the namespaces of VisualNDepend, let’s have a glance at their level values. To help developers answer questions about their architecture, NDepend provides a SQL-like language named CQL (Code Query Language) that allows querying the structure of code. A further article will be dedicated to CQL. A 3-minute online intro demo of CQL and the CQL 1.2 specification are listed in the references. The following CQL query selects all namespaces of the assembly VisualNDepend ordered by their level:
SELECT NAMESPACES FROM ASSEMBLIES "VisualNDepend" ORDER BY NamespaceLevel DESC
Namespaces | Namespace Level
VisualNDepend | 14
VisualNDepend.KernelImpl | 13
VisualNDepend.MainPanel | 12
VisualNDepend.MatrixCodePanel | 11
VisualNDepend.MatrixCodePanel.Matrix | 10
VisualNDepend.QueryPanel | 10
VisualNDepend.MatrixCodePanel.Header | 10
VisualNDepend.GraphDependencyModel.GraphMatrixAlgorithm | 9
VisualNDepend.QueryResultPanel | 9
VisualNDepend.MatrixCodePanel.BoxAndArrowGraphDrawing | 9
VisualNDepend.MatrixCodePanel.Base | 9
VisualNDepend.QueryPanel.QueryEdit | 9
VisualNDepend.GraphDependencyModel | 8
VisualNDepend.QueryPanel.Base | 8
VisualNDepend.QueryPanel.Compilation | 7
VisualNDepend.QueryPanel.QueryCompiledObjectModel.OrderBy | 6
VisualNDepend.QueryPanel.QueryCompiledObjectModel.Condition | 5
VisualNDepend.TreemapCodePanel | 5
VisualNDepend.GraphLoader | 4
VisualNDepend.InfoPanel | 4
VisualNDepend.TreeCodePanel | 4
VisualNDepend.QueryPanel.QueryCompiledObjectModel.RelevantProperty | 4
VisualNDepend.Async | 4
VisualNDepend.TreemapCodePanel.Drawer | 4
VisualNDepend.Tooltip | 3
VisualNDepend.KernelInterface | 3
VisualNDepend.QueryPanel.QueryCompiledObjectModel.Top | 3
VisualNDepend.Helpers | 2
VisualNDepend.QueryPanel.QueryCompiledObjectModel.Warn | 2
VisualNDepend.GraphObjectModel | 2
VisualNDepend.GraphObjectModel.SourceCode | 1
VisualNDepend.QueryPanel.QueryCompiledObjectModel.Comparison | 1
VisualNDepend.Base | 1
VisualNDepend.Properties | 1
Sum: | 207
Average: | 6.0882
Minimum: | 1
Maximum: | 14
Standard deviation: | 3.7445
Variance: | 14.022
The results in this table confirm what the DSM snapshot told us about the namespaces of VisualNDepend. Below, the DSM gives an accurate idea of the level of each component. Notice also that a DSM can be arranged in triangular form if and only if its components are levelized.
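The link between levelization and a triangular matrix can be checked with a small sketch. The class Dsm and its conventions are assumptions made for illustration: cell [i, j] is filled when the component of row i directly uses the component of column j, and the components are listed from the highest level to the lowest. NDepend’s own matrix may use a different orientation.

```csharp
using System.Collections.Generic;

public static class Dsm
{
    // cell[i, j] is true when order[i] directly uses order[j].
    public static bool[,] Build(IList<string> order, IDictionary<string, string[]> uses)
    {
        var index = new Dictionary<string, int>();
        for (int i = 0; i < order.Count; i++) index[order[i]] = i;

        var cell = new bool[order.Count, order.Count];
        foreach (var pair in uses)
            foreach (var target in pair.Value)
                if (index.ContainsKey(pair.Key) && index.ContainsKey(target))
                    cell[index[pair.Key], index[target]] = true;
        return cell;
    }

    // True when every filled cell lies strictly above the diagonal,
    // that is, when each component only uses components listed after it.
    public static bool IsUpperTriangular(bool[,] cell)
    {
        int n = cell.GetLength(0);
        for (int i = 0; i < n; i++)
            for (int j = 0; j <= i; j++)
                if (cell[i, j]) return false;
        return true;
    }
}
```

With the order A, B, C and the dependencies A uses B, A uses C and B uses C, the matrix is triangular. Adding the dependency C uses A creates a cycle, and no ordering of the three components can make the matrix triangular any more.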

What we need now is to be warned automatically each time a dependency cycle appears between our components. The CQL language allows you to write such a constraint. The following CQL constraint is based on the fact that some namespaces have a level equal to N/A if and only if a dependency cycle exists. The !IsInFrameworkAssembly condition avoids matching third-party components.
WARN IF Count > 0 IN SELECT NAMESPACES WHERE !HasLevel AND !IsInFrameworkAssembly
This CQL constraint can be rewritten as follows thanks to the ContainsNamespaceDependencyCycle CQL condition:
WARN IF Count > 0 IN SELECT ASSEMBLIES WHERE ContainsNamespaceDependencyCycle
We can infer three simple rules that help you create a clear and maintainable architecture:
- Where possible, use namespaces instead of assemblies. If possible, only use assemblies for coarse partitioning (> 20,000 LOC) to avoid loading too much code at runtime.
- Use the namespace hierarchy in order to have small, fine-grained components (500 to 1,000 LOC).
- Levelize your current architecture and continuously check for new dependency cycles.
Interestingly enough, Microsoft has internally developed a tool to rationalize dependencies between the components of Vista. More information is available on Larry Osterman’s blog. A 5-minute online demo that shows how to use NDepend in practice to break dependency cycles is listed in the references.
http://www.NDepend.com
Software Estimation: Demystifying the Black Art by Steve McConnell, (Microsoft Press 2006)
Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, (Addison-Wesley Professional Computing Series 1995)
Large-Scale C++ Software Design by John Lakos (Addison-Wesley 1996)
The tool ILMerge
VisualStudio’s Solutions Folders
The NAnt open-source project
The NUnit open-source project
The Spring.NET open-source project
Spring’s Architecture – Not a Single Dependency Cycle
The SharpDevelop open-source project
The CruiseControl.NET open-source project
The mediator design pattern
The inversion of control principle (IoC)
The definition of the Level metric
A 3mn30 online demo intro to the CQL language
CQL 1.2 specification
Larry goes to Layer Court (Larry Osterman's blog)
A 5mn online demo that shows how to practically use NDepend to break dependency cycles
Authors
Patrick Smacchia is a .NET MVP involved in software development for over 15 years. He is the author of Practical .NET2 and C#2, a .NET book conceived from real world experience with 647 compilable code listings. After graduating in mathematics and computer science, he has worked on software in a variety of fields including stock exchange at Société Générale, airline ticket reservation system at Amadeus as well as a satellite base station at Alcatel. He's currently a software consultant and trainer on .NET technologies as well as the author of the freeware NDepend which provides numerous metrics and caveats on any compiled .NET application.