A star is born

For the last six months or so (maybe more), Brandon Lum of Google has occasionally participated in two or three of the CISA SBOM workgroups, especially the VEX workgroup. His title has something to do with open source software, and since a lot of people in the workgroups are involved with OSS, I wasn’t surprised that he would be attending these meetings.

A little more than a month ago, he started making announcements in some of the meetings about a new Google project called GUAC and asking for people to participate in it. I didn’t pay a lot of attention to the details, but I knew it had something to do with software supply chain security, and with SBOMs in particular. Since open source software depends on volunteers to develop and maintain the projects, this wasn’t unusual, either.

Moreover, there was one reason why I deliberately didn’t pay a lot of attention to what Brandon was saying about GUAC: Last year, Google announced another project aimed at software supply chain security called SLSA, which has been very well received by the developer community. It’s essentially a framework that will allow developers to identify and take steps to prevent attacks on the software build process, which nobody (that I know of, anyway) even thought of before SolarWinds.

(SolarWinds fell victim to an extremely sophisticated 15-month attack conducted by – according to Microsoft’s estimate – about 1,000 people working out of Russia. There’s a really fascinating article on CrowdStrike’s website about SUNSPOT, the malware that the Russians purpose-built for this attack. In fact, they tested it during a three-month proof of concept conducted inside the SolarWinds software build environment, then deployed the malware for 5 or 6 months, without ever being detected. Along with Stuxnet, this was easily the most sophisticated malware ever developed. If only the Russians would start putting all that great expertise toward a good use, for a change! BTW, don’t try to understand everything in the article. It’s just amazing to see what SUNSPOT was able to do, all without any direct Russian control.)

But I digress. When I heard Google was following a project called SLSA with one called GUAC, I found this a little too cute for my taste (can Google CHIPS be far behind?). So, frankly, I tuned Brandon out when he brought this up.

However, two weeks ago I saw a good article in Dark Reading – which linked to a great Google blog post – about GUAC. I also found out that my good friend Jonathan Meadows of Citi in London – a real software supply chain guru, although very focused on how ordinary schlumps like you and me (OK, maybe not you) can secure our software supply chains without having all of his knowledge and experience – was involved with GUAC from the get-go. These two datapoints convinced me that I should be paying a lot more attention to GUAC.

So I did. And this is what I found:

The project intends to present to users a “graph database”, which in principle links every software product or intelligent device with all of its components, both hardware and software components, at all “levels”. You might think of the database as being based on a gigantic SBOM dependency tree that goes in all directions – i.e., each product is linked with all its upstream dependencies, as well as the downstream products in which it is a component (or a component of a component).
One of the important functions of this database is to provide a fixed location in “GUAC space” (my term) for software products and their components. Artifacts necessary for supply chain analysis, like SBOMs and VEX documents, can be attached to these locations, making it easy for the user of a software product to learn what new artifacts are available for the product (actually, the nodes of the database are versions of products, not the products themselves).
While the database will incorporate any artifacts created in the software supply chain, the three types of artifacts incorporated initially will be SBOMs, Google SLSA attestations, and OSSF Scorecards. The idea is that, ultimately, all the documents required for an organization (either a developer or an end user organization) to assess their software supply chain security will be available at a single internet location (and I don’t think the location would change – just its attributes).
The artifacts can be retrieved by GUAC itself – the supplier of the artifact won’t have to put it in place “manually”. Google says, “From its upstream data sources, GUAC imports data on artifacts, projects, resources, vulnerabilities, repositories, and even developers.”
Artifacts like SBOMs can be contributed and made available for free, but they don’t have to be. The Google blog post says, “Some sources may be open and public (e.g., OSV); some may be first-party (e.g., an organization’s internal repositories); some may be proprietary third-party (e.g., from data vendors).” In other words, a vendor that has prepared documents or artifacts related to security of a product can attach a link to the product in GUAC space. Someone interested in one of those artifacts can follow the link and, if they agree to the price, purchase it from the vendor.
Thus, one function of GUAC can be enabling a huge online marketplace. However, unlike most markets related to software, the user won’t have to search on the product name to find what’s available for it. Instead, the user will just “visit” the fixed location for the product and look through what’s available there.
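
To make the idea of a dependency graph with artifacts attached to fixed locations more concrete, here is a minimal sketch in Python. It is only an illustration of the concept as I've described it, not GUAC's actual schema or API: the class name, the method names, the product names and the example.com URL are all made up for this example.

```python
from collections import defaultdict, deque


class SupplyChainGraph:
    """Toy graph: nodes are (product, version) pairs. Hypothetical, not GUAC's API."""

    def __init__(self):
        self.depends_on = defaultdict(set)  # node -> its direct components
        self.used_by = defaultdict(set)     # node -> products that directly include it
        self.artifacts = defaultdict(list)  # node -> [(kind, location), ...]

    def add_dependency(self, product, component):
        # Record the link in both directions so we can walk either way.
        self.depends_on[product].add(component)
        self.used_by[component].add(product)

    def attach(self, node, kind, location):
        # Attach an artifact (SBOM, VEX document, etc.) to a node's fixed location.
        self.artifacts[node].append((kind, location))

    def _reach(self, start, edges):
        # Breadth-first walk that collects everything reachable from start.
        seen, queue = set(), deque([start])
        while queue:
            for nxt in edges.get(queue.popleft(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def upstream(self, node):
        # All components at every level below this product version.
        return self._reach(node, self.depends_on)

    def downstream(self, node):
        # Every product version that contains this one, directly or indirectly.
        return self._reach(node, self.used_by)


# Made-up product versions, purely for illustration.
app = ("billing-app", "2.1.0")
http = ("http-lib", "1.4.2")
tls = ("tls-lib", "3.0.7")
firmware = ("router-firmware", "9.0")

graph = SupplyChainGraph()
graph.add_dependency(app, http)
graph.add_dependency(http, tls)
graph.add_dependency(firmware, tls)
graph.attach(tls, "SBOM", "https://example.com/tls-lib-3.0.7.spdx.json")

print(sorted(graph.upstream(app)))    # components of billing-app at all levels
print(sorted(graph.downstream(tls)))  # everything that ultimately contains tls-lib
print(graph.artifacts[tls])           # what is available at that node
```

The point of storing each link in both directions is that the same node answers two different questions: "what is inside this product?" and "which products are affected if this component has a problem?"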

I can imagine the inspiration for GUAC might have occurred when some Google employee involved in software supply chain security grew frustrated at the number of different internet locations they had to visit to get the artifacts they needed to analyze just a single product. Instead of a person running ragged while searching for the most up-to-date artifacts, how about having the computer do the legwork in advance? The user would just have to go to the right location to find everything they need, in one place and with one search.

In fact, Google has done this before. Those of you who were involved with computers in the late 1990s (once the internet had supposedly arrived, but was often proving to be slower than just doing some things by hand) may remember that the search engines – Yahoo, DEC’s AltaVista, etc. – just searched for character strings. If you were good (plus lucky) and entered a string that would return you just the items you were interested in but no others, searching was a pleasant experience.

However, if you weren’t good and lucky, and just searched on a topic like “mountain vacation”, you would get hundreds of pages of results, with no assurance that what you were really looking for wasn’t on the very last of those pages. Google came out with a very intelligent search engine that used all sorts of tricks – like ranking results by their popularity with others – to make it more likely that what you were looking for would be on the first page or two. The rest, as they say, is history.

I certainly don’t think GUAC will have anywhere near the success that the search engine had. After all, just about everyone in the world can use a general search engine, but only a small fraction of the world’s inhabitants are involved with software supply chain security – although given how I spend my time nowadays and the people I meet online, I’m sometimes tempted to think it’s actually a very large percentage.

On the other hand, a look at the GUAC project’s statistics on GitHub shows that GUAC has already become at least a minor phenomenon in the few short months of its existence (if even that): over 10,000 code repositories, close to 700 users, 2,000 issues under discussion, and 31,000 commits (i.e., code insertions or revisions). These indicate a huge level of early interest.

My guess is that in five years, anyone involved in software supply chain security will be spending a lot of their time navigating the highways and byways of the GUAC world. For some, the need to do this will become apparent sooner rather than later. For example, if you’re involved with one of the companies for which distribution of SBOMs is an important part of the business model, you should be figuring out how you can incorporate GUAC into that model – although I’m certainly not saying you should abandon whatever you’re doing, or planning to do, now.

Another idea: While I haven’t tried to look at tech specs on the project yet, I’m sure there must be some sort of fixed address for a software product (or a component, which of course is just a product that’s been incorporated into another product) within GUAC world. I can see that address becoming a kind of universal name repository for a software product, which of course can have multiple names over its lifecycle. Currently, if you have an old version of a product whose name has subsequently changed, and you want to learn about vulnerabilities that have recently been reported for the product, you’re out of luck, unless you happen to know the current name of the product. That (and a lot of other things) may change with GUAC.
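
Here is a rough sketch of how such a name repository might work, again in Python. This is pure speculation on my part and is not based on the GUAC specs: the `NameRegistry` class, the `node-0001` identifier and the product names are all hypothetical.

```python
class NameRegistry:
    """Maps every name a product has ever had to one stable ID. Hypothetical sketch."""

    def __init__(self):
        self._id_by_name = {}   # lowercased name -> stable ID
        self._names_by_id = {}  # stable ID -> names in the order they were registered

    def register(self, stable_id, name):
        # Lowercase the key so lookups do not depend on capitalization.
        self._id_by_name[name.lower()] = stable_id
        self._names_by_id.setdefault(stable_id, []).append(name)

    def resolve(self, name):
        # Returns the stable ID for any known name, or None if it is unknown.
        return self._id_by_name.get(name.lower())

    def names(self, stable_id):
        # Returns every name recorded for this ID.
        return list(self._names_by_id.get(stable_id, []))


# A made-up product that was renamed at some point in its lifecycle.
registry = NameRegistry()
registry.register("node-0001", "AcmeSync")
registry.register("node-0001", "Acme Cloud Sync")

old_id = registry.resolve("acmesync")
new_id = registry.resolve("Acme Cloud Sync")
print(old_id == new_id)             # both names lead to the same fixed address
print(registry.names("node-0001"))  # every name the product has had
```

With something like this in place, a user holding an old version would start from the old name, land on the same fixed address as everyone else, and find whatever vulnerability information is attached there.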

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at [email protected].