The process of creating a software bill of materials (SBOM) is called software composition analysis (SCA). A software composition analysis first creates the so-called dependency graph of your software and then derives the SBOM from it.
A dependency is a software component that some other component depends on. A component depends on another component if it can’t perform its function without that other component. In the common case, this is a code dependency, such as calling functions of the depended-on component.
A dependency graph is a graph with software components as its nodes and depends-on (dependency) relationships as its edges (links). In any modern software, most of these components will be third-party components, including open source components, that is, components owned by someone else and licensed to you.
There are many different types of components that can become nodes in a dependency graph, depending on how broadly or narrowly the dependency graph is to be used.
- In the original narrow sense, the components in a dependency graph are all code components. There are two types of components:
  - Traditional standalone components or libraries. These are components that have a clear boundary with their context (they come as their own package, ideally with a well-defined interface).
  - Code snippets. Code snippets are pieces of code that have been copied and pasted into your code by your developers or into open source dependencies by the open source developers. Legally speaking, such code snippets are components separate from the embedding component, because they usually have a different copyright holder and a different license.
- In a more recent, broader sense, where the goal is to document everything that goes into building a piece of software, components can also be tools that build the software, resources that provide necessary information, etc.
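To make the narrow and broad senses concrete, the following is a minimal sketch of how component types might be modeled. The names `ComponentKind`, `Component`, and `is_code_component`, as well as all example components, licenses, and copyright holders, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ComponentKind(Enum):
    LIBRARY = "library"        # narrow sense: standalone component with a clear boundary
    SNIPPET = "snippet"        # narrow sense: copied-and-pasted code inside another component
    BUILD_TOOL = "build-tool"  # broad sense: tool used to build the software
    RESOURCE = "resource"      # broad sense: resource providing necessary information


@dataclass(frozen=True)
class Component:
    name: str
    kind: ComponentKind
    license: str
    copyright_holder: str


def is_code_component(component: Component) -> bool:
    """True for the narrow-sense kinds (libraries and code snippets)."""
    return component.kind in (ComponentKind.LIBRARY, ComponentKind.SNIPPET)


# Hypothetical example data: a snippet keeps its own license and copyright
# holder, even though it lives inside another component.
library = Component("example-lib", ComponentKind.LIBRARY, "MIT", "Example Lib Authors")
snippet = Component("copied-helper", ComponentKind.SNIPPET, "Apache-2.0", "Someone Else")
compiler = Component("example-compiler", ComponentKind.BUILD_TOOL, "GPL-3.0-only", "Tool Vendor")

# The narrow view keeps only code components; the broad view keeps everything.
narrow_view = [c.name for c in (library, snippet, compiler) if is_code_component(c)]
broad_view = [c.name for c in (library, snippet, compiler)]
```

In this sketch, switching between the narrow and the broad dependency graph is just a matter of which component kinds are admitted as nodes.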
A dependency graph is a directed graph: Incoming links to a component originate from other components that depend on this component, and outgoing links from a component go to the other components this component depends on. As a matter of good software architecture, the graph is ideally also an acyclic graph.
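The directed and acyclic properties can be sketched with a plain adjacency mapping. This is an illustration under assumptions, not how any particular tool stores its graph: the component names and the helper functions `outgoing`, `incoming`, and `has_cycle` are hypothetical.

```python
# Hypothetical graph: each key depends on the components in its list,
# so every list entry is an outgoing link of the key.
depends_on = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["http-client", "json-parser"],
    "http-client": [],
    "json-parser": [],
}


def outgoing(graph, component):
    """Components that the given component depends on."""
    return list(graph.get(component, []))


def incoming(graph, component):
    """Components that depend on the given component."""
    return [source for source, targets in graph.items() if component in targets]


def has_cycle(graph):
    """Depth-first search: a cycle exists if a node is reached again
    while it is still on the current search path."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(target) for target in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


print(outgoing(depends_on, "web-framework"))
print(incoming(depends_on, "json-parser"))
print(has_cycle(depends_on))
```

Here `json-parser` has two incoming links and no outgoing links, and the graph as a whole contains no cycle.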
Dependencies have levels. The level number is the number of steps removed from the root of the graph. This leads to the following definitions:
- The root component of a dependency graph has level zero and is usually your own original code. There may be one or more root components.
- The first-level dependencies are the immediate dependencies of the root component. They are noteworthy, because they are present in the minds of your developers and they are explicitly specified in your build system instructions. They are also often called the direct dependencies.
- Second- and higher-level dependencies are the dependencies of your first-level dependencies, the dependencies of those dependencies, and so on. They are also called indirect dependencies. They are noteworthy, because they are not present in the minds of your developers and they are not very visible in their day-to-day work. Yet they constitute the largest part of the code your project or product is built from.
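The level definitions above can be sketched as a breadth-first traversal from the root components. The graph and the function name `dependency_levels` are hypothetical. The sketch also assumes that a component reachable over several paths gets the level of its shortest path, which the definitions above do not spell out.

```python
from collections import deque

# Hypothetical graph: each key depends on the components in its list.
depends_on = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["http-client", "json-parser"],
    "http-client": ["tls-lib"],
    "json-parser": [],
    "tls-lib": [],
}


def dependency_levels(graph, roots):
    """Map each reachable component to its number of steps from the nearest root."""
    levels = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        component = queue.popleft()
        for dependency in graph.get(component, []):
            if dependency not in levels:  # first visit is the shortest path in BFS
                levels[dependency] = levels[component] + 1
                queue.append(dependency)
    return levels


levels = dependency_levels(depends_on, ["my-app"])
direct = sorted(name for name, level in levels.items() if level == 1)
indirect = sorted(name for name, level in levels.items() if level >= 2)
print(levels)
print("direct:", direct)
print("indirect:", indirect)
```

Passing several names in `roots` covers the case of more than one root component.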
As a rule of thumb, the size ratio of your original code to your direct dependencies to your indirect dependencies is 1 to 9 to 90. In other words, if vulnerabilities are spread roughly evenly across the code, about 90% of your vulnerabilities stem from code you are not thinking much about. The indirect dependencies are the proverbial iceberg under the waterline.
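The arithmetic behind this rule of thumb is short enough to spell out. The part sizes are the ones stated above. The variable names are illustrative, and the step from code share to vulnerability share rests on the assumption that vulnerabilities are distributed roughly in proportion to code size.

```python
# Rule-of-thumb sizes in parts, as stated in the text.
parts = {
    "original code": 1,
    "direct dependencies": 9,
    "indirect dependencies": 90,
}

total = sum(parts.values())

# Share of each category as a percentage of the whole code base.
shares = {name: 100 * size / total for name, size in parts.items()}

# Everything that is not your own original code.
third_party_share = shares["direct dependencies"] + shares["indirect dependencies"]

for name, share in shares.items():
    print(f"{name}: {share:.0f}%")
print(f"third-party code overall: {third_party_share:.0f}%")
```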
SBOMs are created from a dependency graph. The nodes of a dependency graph correspond to the component entries in the SBOM. While the dependency graph remains a graph structure, the SBOM drops the relationships and is (mostly) a flat list of components. For this reason, a dependency graph and an SBOM are not the same thing.
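As a rough illustration of this difference, the sketch below flattens a dependency graph into a list of component entries. It does not follow any standard SBOM format. The function `flatten_to_sbom`, the entry fields, the component names, and the version numbers are all hypothetical.

```python
# Hypothetical graph: each key depends on the components in its list.
depends_on = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["http-client", "json-parser"],
    "http-client": [],
    "json-parser": [],
}

# Hypothetical version data; a real analysis would take this from package metadata.
versions = {"my-app": "1.0.0", "web-framework": "4.2.0", "json-parser": "2.1.3"}


def flatten_to_sbom(graph, versions):
    """Return one entry per node of the graph, sorted by name.
    The edges (who depends on whom) are deliberately dropped."""
    components = set(graph)
    for targets in graph.values():
        components.update(targets)
    return [
        {"name": name, "version": versions.get(name, "unknown")}
        for name in sorted(components)
    ]


sbom = flatten_to_sbom(depends_on, versions)
for entry in sbom:
    print(entry)
```

Each node appears exactly once in the result, no matter how many components depend on it, and nothing in the list records which component pulled it in.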
Figure X illustrates a dependency graph including our term definitions.
Figure X. Illustration of a dependency graph for demonstration purposes
© 2024 Dirk Riehle, used with permission.