What are the Best Tools for Generating SBOM (Software Bill Of Materials)?

Introduction

I needed some SBOM samples to test an upcoming MergeBase feature (we plan to offer SBOM import in addition to our existing SBOM export feature). So I grabbed 5 tools for generating SBOM, more or less randomly, and tried them out. Here’s what I found!

Methodology

I whipped up a small Java Maven project (see: log4j-transitive-example.git) that transitively depended on Log4J. That way I could assess tools both for how well they generated SBOM when source code was available and when it wasn’t. I also put together a small Docker image in which I manually added a single vulnerable “jsch-0.1.38.jar” library under /opt/jsch/, since a lot of these tools offer SBOM generation from Docker images, and I wanted to exercise this particular use case (software added using Docker’s “COPY” command instead of via package management).

With these samples ready, I essentially ran 3 tests against each SBOM generation tool:

  1. Generate SBOM from source code (the Log4J transitive project, pre-build).
  2. Generate SBOM from a deployed system (again, the Log4J transitive project, but post-build).
  3. Generate SBOM from a Docker image.

Here’s how each tool fared!

Generating SBOM – A Quick Bakeoff


CycloneDX – cyclonedx-maven-plugin

1. Can it generate SBOM from something I acquired (no source code)?
No.

2. Can it generate SBOM from something I built (full source code)?
Yes.

3. Can it generate SBOM from a Docker image?
No.

4. Are transitive dependencies included in the SBOM?
Yes, beautifully. Multi-module builds and Maven profiles are handled just as well.

5. Ease Of Deployment
Difficult to start, but wonderful after that point! Imagine a car where it’s *very* difficult to open the door the first time, but once you’re in the driver’s seat, it drives beautifully and intuitively. You need to know how to edit your project’s Maven pom.xml to activate the CycloneDX generation for your project. The documentation is not great for a newcomer (it assumes Maven expertise). But once you get it working, the default behaviour generates excellent SBOM files.

6. Comments
CycloneDX founder (Steve Springett) is clearly deeply (and by that, I mean *deeply profoundly*) proficient with Maven and Java. The resulting SBOM is ideal (as good as is possible). Anyone building a Java/Maven project should immediately enable this and start including it in their builds! Looking at the git commit history for this project – yup, Mr. Springett is definitely to blame for a work of beauty here (commit-counts by author)!

    1 Author: aalzate
    1 Author: Gregory Anne
    1 Author: iabudiab
    1 Author: Jonas Arnold Clasen
    1 Author: Mirko Friedenhagen
    1 Author: Robert Klaus
    1 Author: Robert Varga
    3 Author: Hervé Boutemy
    4 Author: Prabhu Subramanian
    5 Author: Thomas Gaskell
    5 Author: M. Scott Ford
   55 Author: dependabot
  324 Author: Steve Springett

The problem, though, is that it requires a reasonable understanding of Maven to enable.

Note: Based on our own Maven and software-composition expertise, here at MergeBase we recommend the following configuration when using cyclonedx-maven-plugin:

    <configuration>
        <includeCompileScope>true</includeCompileScope>
        <includeProvidedScope>false</includeProvidedScope>
        <includeRuntimeScope>true</includeRuntimeScope>
        <includeSystemScope>false</includeSystemScope>
        <includeTestScope>false</includeTestScope>
    </configuration>

Syft (by Anchore)

1. Can it generate SBOM from something I acquired (no source code)?
Yes.

2. Can it generate SBOM from something I built (full source code)?
Yes, but not as well (e.g., for the log4j-transitive sample, it only listed the single top-level dep, “twilio-8.6.1.jar”).

3. Can it generate SBOM from a Docker image?
Yes, but in its “Docker” mode it seems to only query the package-manager. It did not notice that I had copied in /opt/jsch/jsch-0.1.38.jar (no mention of “jsch” in the resulting SBOM).

4. Are transitive dependencies included in the SBOM?
Yes, but ONLY in the post-build directory scan mode, and as a flat list, not as interconnected relationships (this makes sense, since they are all lying together on the disk).

5. Ease of Deployment
Very easy.  But somewhat single-minded in its approach. By this, I mean this tool is very easy to operate, and it will do exactly what you tell it to do, but not necessarily what you wanted!  In a way, I appreciate Syft’s “just scan everything” philosophy. E.g., “syft -o json /” – it’s gonna go for it (scan my complete file-system from root).

This contrasts with cyclonedx-maven-plugin’s approach, which is more:  “if you configure me incorrectly, I will do nothing, but if you do manage to get me actually working, from then on I will provide perfect output, more perfect than you ever even knew or dreamed or thought possible.”

6. Comments
I suspect Syft is looking at whatever it can find inside the files on disk (e.g., inside “commons-io.jar” it must be quickly considering ./META-INF/maven/commons-io/commons-io/pom.properties). I didn’t look at Syft’s internal logic, but the reason I think it’s working this way is because jsch-0.1.38.jar does not have a pom.properties inside, and so Syft messed it up ("bom-ref": "pkg:maven/jsch/jsch@0.1.38"). Assuming that’s a heuristic, that’s a damn fine heuristic, though! It would actually work for jsch-0.1.29 and older!
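
To make the suspected heuristic concrete, here is a minimal Python sketch of it. This is only a guess at the general idea, not Syft’s actual logic (which I did not read): look for a pom.properties file under META-INF/maven/ inside the jar, and if there isn’t one, fall back to guessing from the file name. The function names (guess_purl, make_jar) and the commons-io version number are made up for illustration.

```python
import io
import re
import zipfile

# Location of the Maven metadata that many (not all) jars carry.
POM_PROPERTIES = re.compile(r"^META-INF/maven/[^/]+/[^/]+/pom\.properties$")
# Fallback: split "name-1.2.3.jar" into a name and a version.
FILENAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[^/]*)\.jar$")


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def guess_purl(jar_name, jar_bytes):
    """Guess a Maven PURL for a jar, preferring pom.properties over the file name."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for entry in jar.namelist():
            if POM_PROPERTIES.match(entry):
                props = parse_properties(jar.read(entry).decode("utf-8"))
                if {"groupId", "artifactId", "version"} <= props.keys():
                    return "pkg:maven/{groupId}/{artifactId}@{version}".format(**props)
    match = FILENAME.match(jar_name)
    if match:
        # No metadata: assume groupId == artifactId (often wrong for modern jars).
        artifact = match["artifact"]
        return f"pkg:maven/{artifact}/{artifact}@{match['version']}"
    return None


def make_jar(entries):
    """Build a tiny in-memory jar (a zip file) for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        for name, content in entries.items():
            jar.writestr(name, content)
    return buffer.getvalue()


with_metadata = make_jar({
    "META-INF/maven/commons-io/commons-io/pom.properties":
        "groupId=commons-io\nartifactId=commons-io\nversion=2.11.0\n",
})
without_metadata = make_jar({"com/jcraft/jsch/JSch.class": b""})

print(guess_purl("commons-io-2.11.0.jar", with_metadata))
print(guess_purl("jsch-0.1.38.jar", without_metadata))
```

The second call falls through to the file-name guess and prints pkg:maven/jsch/jsch@0.1.38, the same identifier that showed up in the Syft output.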

Nonetheless, Syft is a pretty amazing tool, especially considering it’s free/open-source.

One minor criticism – what’s with all these spurious “cpe23” entries in the JSON?

{
  "name": "syft:cpe23",
  "value": "cpe:2.3:a:apache_software_foundation:log4j:2.14.0:*:*:*:*:*:*:*"
},
{
  "name": "syft:cpe23",
  "value": "cpe:2.3:a:apache_software_foundation:api:2.14.0:*:*:*:*:*:*:*"
},
{
  "name": "syft:cpe23",
  "value": "cpe:2.3:a:Activator:log4j_api:2.14.0:*:*:*:*:*:*:*"
}
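
For anyone wanting to sift through these entries, here is a small Python sketch that splits each “syft:cpe23” value into its fields and collects the distinct vendor/product guesses. It assumes the properties have already been loaded from the JSON as a list of name/value dicts, and it does a naive split on “:” (so it would mis-handle CPEs containing escaped colons). The helper names are my own.

```python
CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]


def parse_cpe23(cpe):
    """Split a CPE 2.3 formatted string into its 11 named fields."""
    parts = cpe.split(":")
    if len(parts) != 13 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, parts[2:]))


def vendor_product_guesses(properties):
    """Return the distinct (vendor, product) pairs among syft:cpe23 properties."""
    guesses = set()
    for prop in properties:
        if prop.get("name") == "syft:cpe23":
            fields = parse_cpe23(prop["value"])
            guesses.add((fields["vendor"], fields["product"]))
    return guesses


properties = [
    {"name": "syft:cpe23",
     "value": "cpe:2.3:a:apache_software_foundation:log4j:2.14.0:*:*:*:*:*:*:*"},
    {"name": "syft:cpe23",
     "value": "cpe:2.3:a:apache_software_foundation:api:2.14.0:*:*:*:*:*:*:*"},
    {"name": "syft:cpe23",
     "value": "cpe:2.3:a:Activator:log4j_api:2.14.0:*:*:*:*:*:*:*"},
]

for vendor, product in sorted(vendor_product_guesses(properties)):
    print(vendor, product)
```

Run against the three entries above, this yields three different vendor/product pairs for what is a single library, which is what makes the entries look like guesses.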

 

Microsoft (Microsoft.Sbom.Tool)

1. Can it generate SBOM from something I acquired (no source code)?
It depends on your definition of “SBOM.” Yes, this tool is willing to run “ls” or “dir” recursively and re-assemble the output into a file that is technically a valid SBOM file. But I need a “packages” section in my SBOM, and it didn’t create one. The “files” section it did create and fill with data is mostly useless for the problems SBOM is supposed to help solve.

2. Can it generate SBOM from something I built (full source code)?
Yes. This tool automatically realised that it needed to run “mvn dependency:tree”, and it knew how to reformat that into a useful SBOM file, including a “packages” section.

3. Can it generate SBOM from a Docker image?
Yes? No? I’m not sure. It says it can, and when I asked it to do this, it obviously did something and even correctly printed the number of Alpine packages in my Docker image. But there was no resulting SBOM file.

4. Are transitive dependencies included in the SBOM?
Ha! Very funny! Not as a tree. SPDX’s “DEPENDS_ON” relationship *IS* used to list all the Maven references, but flattened, with each one attached directly to the root package. Like so:

{
  "relationshipType": "DEPENDS_ON",
  "relatedSpdxElement": "SPDXRef-Package-AC15BDA5FB1FE5FB5C52F9BB784F7FA5FDB04D590DFD57EC899C932579A8B4B1",
  "spdxElementId": "SPDXRef-RootPackage"
},
{
  "relationshipType": "DEPENDS_ON",
  "relatedSpdxElement": "SPDXRef-Package-E4453FF31A893CF22D5BDFCD71D0BB2A98C2554822A100A1AEB5C944E44DDF8D",
  "spdxElementId": "SPDXRef-RootPackage"
},
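
A quick way to check an SPDX JSON file for both problems (no “packages” section, or a flattened dependency list) is sketched below in Python. It assumes the document has already been loaded with json.load and uses the top-level “packages” and “relationships” keys of the SPDX 2.x JSON format. The function name, the sample documents and the shortened SPDXRef IDs are hypothetical.

```python
def summarize_spdx(doc):
    """Report package count and whether DEPENDS_ON edges all share one parent."""
    packages = doc.get("packages", [])
    depends_on = [
        rel for rel in doc.get("relationships", [])
        if rel.get("relationshipType") == "DEPENDS_ON"
    ]
    parents = {rel["spdxElementId"] for rel in depends_on}
    return {
        "package_count": len(packages),
        "depends_on_count": len(depends_on),
        # Flattened: there are edges, but every edge starts at the same element.
        "flattened": len(depends_on) > 0 and len(parents) == 1,
    }


flat_doc = {
    "packages": [{"SPDXID": "SPDXRef-Package-A"}, {"SPDXID": "SPDXRef-Package-B"}],
    "relationships": [
        {"relationshipType": "DEPENDS_ON",
         "spdxElementId": "SPDXRef-RootPackage",
         "relatedSpdxElement": "SPDXRef-Package-A"},
        {"relationshipType": "DEPENDS_ON",
         "spdxElementId": "SPDXRef-RootPackage",
         "relatedSpdxElement": "SPDXRef-Package-B"},
    ],
}

# Same packages, but B hangs off A instead of the root.
tree_doc = {
    "packages": flat_doc["packages"],
    "relationships": [
        flat_doc["relationships"][0],
        {"relationshipType": "DEPENDS_ON",
         "spdxElementId": "SPDXRef-Package-A",
         "relatedSpdxElement": "SPDXRef-Package-B"},
    ],
}

files_only_doc = {"files": [{"fileName": "./lib/example.jar"}]}

for doc in (flat_doc, tree_doc, files_only_doc):
    print(summarize_spdx(doc))
```

Note that a project with a single dependency would also be reported as flattened; the check is only meaningful when there are several edges.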

5. Ease of Deployment
Medium difficulty.  The tool requires a lot of command-line arguments, but the error messages and help-text guided me reasonably well.

6. Comments
I was disappointed by this tool, mainly for three reasons:

– Not including a “packages” section (when source code is not available) really defeats the point of SBOM.  So I do worry that people might be happily using this tool and thinking they are getting good SBOM files out of it, when they really are essentially getting garbage.

– Also, when the source code is available, I was a little disappointed that the full dependency-tree was flattened instead of keeping its original structure. But this was a relatively small disappointment – at least all the 3rd party libraries were accurately recorded!

– Finally, I was not able to generate SBOM for a Docker image, even though it seemed to happily process the image I provided.


 

Fossa

1. Can it generate SBOM from something I acquired (no source code)?
No (or at least I could not find any way to do it!).

2. Can it generate SBOM from something I built (full source code)?
Yes.

3. Can it generate SBOM from a Docker image?
Yes, but on the paid plan (I only tried the free plan).

4. Are transitive dependencies included in the SBOM?
Yes, but on the paid plan (I only tried the free plan).

5. Ease Of Deployment
Excellent.

6. Comments
I was surprised to see they use SPDX’s plain-text format (not XML and not JSON):

SPDXVersion: SPDX-2.1
DataLicense: CC0-1.0
SPDXID: SPDXRef-DOCUMENT
Creator: Organization: FOSSA, INC.
Created: 2022-08-09T10:45:17Z

Unfortunately, the generated SBOM file was not particularly useful. A natural-language name is not really a useful way to identify a library (uhh, natural language?!):

PackageName: Apache Log4j API
PackageVersion: 2.14.0

I would much prefer something that could be mapped back to Maven-Central coordinates (e.g., repo1.maven.org/maven2/org/apache/logging/log4j/log4j-api/2.14.0/ or, if you prefer, groupId=org.apache.logging.log4j and artifactId=log4j-api) since both of these are unambiguous and both are essentially how developers think and talk about these dependencies in the first place! Another great option: use PURLs!
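
Both identifiers are trivial to derive once you have the Maven coordinates, which is part of why they make better SBOM keys than a display name. Here is a minimal Python sketch; the function names are my own, it uses log4j-api’s Maven Central groupId (org.apache.logging.log4j), and it skips the percent-encoding the PURL spec requires for unusual characters.

```python
def maven_purl(group_id, artifact_id, version):
    """Build a Package URL (PURL) of type 'maven' from Maven coordinates."""
    return f"pkg:maven/{group_id}/{artifact_id}@{version}"


def maven_central_path(group_id, artifact_id, version):
    """Build the Maven Central directory path for the same coordinates."""
    group_path = group_id.replace(".", "/")
    return f"repo1.maven.org/maven2/{group_path}/{artifact_id}/{version}/"


coords = ("org.apache.logging.log4j", "log4j-api", "2.14.0")
print(maven_purl(*coords))
print(maven_central_path(*coords))
```

The first line printed is pkg:maven/org.apache.logging.log4j/log4j-api@2.14.0, and the second is the repo1.maven.org path quoted above.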

Finally, the license info included in the SBOM output made no sense to me:

PackageLicenseDeclared: Apache-2.0
PackageCopyrightText: ownership.
PackageLicenseInfoFromFiles: Apache-2.0
PackageLicenseInfoFromFiles: MIT
PackageLicenseInfoFromFiles: Apache-1.1

I quickly ran my own manual analysis of “Log4J-API-2.14.0” to verify these claimed licenses (note: I’m not a lawyer, but I have published academic papers on this problem, and I’m also an Apache committer). I could find ZERO pieces of evidence suggesting that MIT or Apache-1.1 were remotely appropriate here. Flagging “Apache-1.1” as a possible license is not only inaccurate, it could also cause bigger problems, since Apache-1.1 is famously incompatible when combined with GPL software (whereas MIT is a benign and highly cross-compatible license).
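
If you want to spot this kind of discrepancy automatically, a rough Python sketch follows. It reads an SPDX tag-value snippet and lists any “PackageLicenseInfoFromFiles” entries that differ from “PackageLicenseDeclared”. It assumes the text describes a single package (a real file would need to be split per “PackageName” first), and the function name is made up. A mismatch is only a prompt to go look, not proof of a problem.

```python
def license_mismatches(tag_value_text):
    """Return licenses found in files that differ from the declared license."""
    declared = None
    from_files = []
    for line in tag_value_text.splitlines():
        tag, separator, value = line.partition(":")
        if not separator:
            continue  # not a "Tag: value" line
        tag, value = tag.strip(), value.strip()
        if tag == "PackageLicenseDeclared":
            declared = value
        elif tag == "PackageLicenseInfoFromFiles":
            from_files.append(value)
    return [lic for lic in from_files if lic != declared]


snippet = """\
PackageName: Apache Log4j API
PackageVersion: 2.14.0
PackageLicenseDeclared: Apache-2.0
PackageCopyrightText: ownership.
PackageLicenseInfoFromFiles: Apache-2.0
PackageLicenseInfoFromFiles: MIT
PackageLicenseInfoFromFiles: Apache-1.1
"""

print(license_mismatches(snippet))
```

For the Fossa output above, this flags MIT and Apache-1.1 as the entries worth verifying by hand.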


MergeBase

1. Can it generate SBOM from something I acquired (no source code)?
Yes. MergeBase is able to analyse binary applications, including Java and .NET, and subsequently generate an SBOM with the click of a button or in an automated workflow.

2. Can it generate SBOM from something I built (full source code)?
Yes. Same as answer #1.

3. Can it generate SBOM from a Docker image?
Yes.

4. Are transitive dependencies included in the SBOM?
Yes.

5. Ease Of Deployment
Excellent.

6. Comments

Not only is MergeBase the most comprehensive SCA tool for creating SBOMs, it is also able to import SBOMs and automatically run a full risk analysis on them for security, legal, and technical (debt) risks.


Snyk (via snyk2spdx optional tool)

1. Can it generate SBOM from something I acquired (no source code)?
See answer #2.

2. Can it generate SBOM from something I built (full source code)?
I guess technically the result is a valid SBOM file….

3. Can it generate SBOM from a Docker image?
Again, see answer #2 (above).

4. Are transitive dependencies included in the SBOM?
Yes, except Snyk’s SBOM implementation here has a game-over flaw (see answer #2, and see comments below).

5. Ease Of Deployment
Excellent:  npm install -g snyk2spdx.

6. Comments
One has to wonder what they were thinking! Yes, the end result is a valid SBOM file, but it lacks a “packages” section. It even lacks the useless “files” section.

What does it have instead?  A “vulnerabilities” section !!!

To bring this back to SBOM’s original shipping metaphor (a “bill of materials”), this would be like asking IKEA to send you the list of parts for a particular bunkbed. And IKEA answering with:  “We looked up all the parts for the bunkbed. Please find enclosed a description of 3 of those parts (out of an unknown total). These particular 3 parts contributed to serious bunkbed failures in the last 5 years. We’ve included descriptions of how they failed for your convenience.  Enjoy your SBOM!”

If I am misunderstanding this tool, or if Snyk has a different, more appropriate SBOM offering, I would love an email from Snyk to help me understand this better (you can reach me at julius@mergebase.com).

Update from Snyk!

Gareth Rushgrove (VP, Product Management) sent me an email. In his own words:

The snyk2spdx tool is not meant to generate an SBOM. It was an experiment looking at the WIP vulnerability extension in the draft SPDX 3.0 spec. We built the snyk2spdx tool to kick the tires there. Technically it’s generating a basic VEX.

This is very helpful information – thank you Snyk! This is a good point: I mistakenly assumed that “snyk2spdx” was an SBOM generator (since SBOM is SPDX’s primary use case these days), but SPDX can be used for other purposes.

Gareth also mentioned that Snyk is working on bringing SBOM generation to their main “snyk” command-line tool, and that they are also piloting an ability right now to work with CycloneDX data directly from Git repositories. Working directly from Git repositories (without invoking build tools) is a whole other kettle of fish that I did not touch on here! Very exciting!


 

Conclusion

If you can, use CycloneDX native plugins for your software systems. These produce perfect SBOMs for your software that you can confidently share with your customers. But you must remember to enable CycloneDX for each language in your software (which can require a bit of work).  

However, if you do not have access to source code, we recommend Syft (free) or MergeBase (free trial), since both of these products are able to produce accurate, useful SBOMs when source code is not available.

When generating SBOMs from Docker images, it’s very important to understand that many SBOM generators tend to only query the package-management system, and so Docker-derived SBOMs will often miss the most important part of the image: the actual software you are trying to run inside Docker!
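
One cheap safeguard is to check each Docker-derived SBOM for the components you know you copied into the image. Below is a minimal Python sketch, assuming a CycloneDX-style JSON document (a top-level “components” list whose entries have a “name”) that has already been loaded into a dict. The function name and the sample package names are hypothetical.

```python
def missing_components(bom, expected_names):
    """Return the expected component names that do not appear in the SBOM."""
    present = {
        component.get("name", "").lower()
        for component in bom.get("components", [])
    }
    return sorted(name for name in expected_names if name.lower() not in present)


# An SBOM that only reflects what the package manager knows about.
docker_bom = {
    "components": [
        {"name": "musl", "version": "1.2.3"},
        {"name": "busybox", "version": "1.35.0"},
    ]
}

print(missing_components(docker_bom, ["jsch", "musl"]))
```

Here the check reports “jsch” as missing, which is the gap a jar added with “COPY” leaves when the generator only asks the package manager.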

Julius Musseau

About the Author

Julius Musseau, co-founder & CTO. Senior architect and developer with strong academic background and roots in the open source community. Contributor to a number of important open source projects.