Really cool to see all the hard work on Trusted Publishing and Sigstore pay off here. As a reminder, these tools were never meant to prevent attacks like this, only to make them easier to detect, harder to hide, and easier to recover from.
Just getting around to looking at this. There is a certificate in Sigstore for 8.3.41 that claims the package is a build of cb260c243ffa3e0cc84820095cd88be2f5db86ca -- https://search.sigstore.dev/?logIndex=153415340. But it isn't: the package contents differ from the contents of that commit. This doesn't seem like something that's working all that well.
There's no defeating of scanners or even static linking. It's all automation, dynamic linking and patching to make the scanners happy. We go to great lengths to make sure that the scanners actually find everything so the results are accurate.
The second is this blog post which highlights trust and again Docker Hub features (e.g. rate limit removal, metrics, badging) and links back to [2] for more details.
The third link is [1].
So, I'm still left curious about the grandparent comment's questions. It seems these pages are sparse on what the verification process looks like and if it might just be a "pay to play" relationship.
I read the blog, then I clicked on the big "back" button at the top labeled "Unchained" and read that, then I went to your homepage and read that, then I clicked "get started" and read that page too.
I still have no idea what Chainguard is, or what those images do. All I know is those images are "hardened", is that the only thing they're for? Is that Chainguard's product?
In a nutshell, we produce minimal container images with a low CVE count. In many cases they should be drop-in replacements for the containers you are currently using.
This is particularly useful if your team uses a scanner like trivy/snyk/grype/Docker Scout and spends time investigating CVEs. Fewer CVEs == less time investigating. It can also be critical in regulated environments.
I think this comes down to audience. To a lot of engineers it's just like ... "OK, that's nice. What else?"
But for security teams in large enterprises, Chainguard is like manna from heaven. They immediately understand what is really being sold: the elimination of enormous amounts of compulsory toil due to upgrading vulnerable software -- or having to nag other teams to do it.
It's a bit like visiting the site of a medical devices manufacturer. I probably don't know what the device does, but the target audience sure do.
I just heard of this today and I was like OMG this will save me so. much. time. chasing engineers and teams and creating workarounds for dumb stuff that the base images refuse to fix because it's a "false positive". I unfortunately HAVE to fix all high CVEs, regardless of people's opinions.
> But for security teams in large enterprises, Chainguard is like manna from heaven. They immediately understand what is really being sold: the elimination of enormous amounts of compulsory toil due to upgrading vulnerable software -- or having to nag other teams to do it.
Explain to me how Chainguard helps with this. Everywhere I've worked, this process has very specific needs depending on the company's internal and regulatory requirements. Chainguard may help with proof of origin/base imaging, but it doesn't do much beyond what container registries and tools like dependabot/snyk/dependency track already provide (not saying they're directly related), which doesn't really reduce that much toil.
The big ones that help are SBOMs, STIGs, FIPS, and CVE reduction. The images and the paperwork we provide make it so they can be dropped into even the most regulated environments without toil.
Most of our customers use them for FedRAMP or IL 5/6 stuff out of the box.
As someone who has been watching Chainguard since they were "spun out" of Google, they started out trying to be the de facto container supply chain security company, realized everyone else was already doing that and well ahead of them, and have done a few pivots trying to find PMF. I think they've found more success being consultants, which is probably not what they hoped for.
Many organizations pay people (or entire teams) to maintain a suite of hardened images, either for device/firmware applications, or because they use many languages in-house, etc. This is definitely one of those business models I thought "oh, of course" as soon as I saw it.
EDIT: upon using dockerhub’s organization page for a bit, and realizing there’s no search on the organization page (I swear there was?), I now understand.
Why does the article present this bizarre set of instructions for grabbing the image instead of linking directly? You could just link to your organization, no?
> Getting started with Chainguard Developer Images in Docker Hub is easy. Follow these simple steps:
> Look up the Image you want.
> Select ‘Recently Updated’ from the dropdown menu on the right.
> Filter out the community images by selecting the filter ‘Verified Publisher.’
> Copy the pull command, paste it into your terminal, and you are all set.
If a primary goal of a consumer of the images is security, how can we trust the images not to have backdoors or virusesesses [extra s added for comedy]?
Great question! We take hardening of our build infrastructure very seriously, and helped build many of the OSS technologies in this space like the SLSA framework and the Sigstore project.
We produce SBOMs during the build process, and cryptographically sign SLSA-formatted provenance artifacts depicting the entire build process so you can trace a built container all the way back to the sources it was built from.
We also try to make as much of our build system reproducible as possible (but we're not all the way there yet), so you can audit or rebuild the process yourself.
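To make the "trace back to source" part concrete, here is a minimal sketch of reading an in-toto statement that carries SLSA v0.2 provenance. The statement below is a toy illustration (the image name, URIs, and digests are placeholders, and real statements carry far more fields), not our actual tooling:

```python
# Toy in-toto statement with a SLSA v0.2 provenance predicate.
# All names and digests here are illustrative placeholders.
statement = {
    "_type": "https://in-toto.io/Statement/v0.1",
    "predicateType": "https://slsa.dev/provenance/v0.2",
    "subject": [
        {"name": "cgr.dev/example/image", "digest": {"sha256": "aaaa..."}}
    ],
    "predicate": {
        "materials": [
            {
                "uri": "git+https://github.com/example/source",
                "digest": {"sha1": "bbbb..."},
            }
        ]
    },
}

# Subjects identify what was built; materials identify what it was built from.
for subject in statement["subject"]:
    print("built:", subject["name"], subject["digest"])
for material in statement["predicate"].get("materials", []):
    print("from:", material["uri"], material["digest"])
```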
Yep - a new version of the image spec and the distribution spec (not the runtime spec).
This version allows for formalized ways to store other types of content in registries (think Helm Charts, OPA policies, etc.), as well as a way to "attach" arbitrary content to registries and then retrieve it later.
Both of these are powerful and will have lots of use cases, but the primary ones at this point are focused on supply chain security - storing content like SBOMs, digital signatures and attestations.
> This version allows for formalized ways to store other types of content in registries (think Helm Charts, OPA policies, etc.), as well as a way to "attach" arbitrary content to registries and then retrieve it later.
First part seems right. I think, though, that part two is maybe misworded or something? It allows artifacts of any kind to be attached/related to other artifacts.
From the OCI blog post, the example is uploading a software bill of materials (SBOM) and having it attached to the container it represents, such that a user can then query whether there is an SBOM for their container and get a list of such SBOMs (you can also ask for an index of all related content).
SBOMs may be an important short-term use-case, but ultimately the spec will now provide a "blessed" way to attach even a cat picture. It can be uploaded standalone or pointing to a container image.
The way to indicate a non-container artifact type is to use the "artifactType" field. Your client may choose whether or not to support older <=v1.0 registries by falling back to use "config.mediaType". The way to attach the artifact to the image is to use the "subject" field, pointing "subject.digest" to the proper digest of the thing you're pointing to (e.g. "sha256:..." ). There are no limitations there either, you may point cat pictures to cat pictures.
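For anyone who wants to see the shape of this, here's a sketch of an OCI v1.1 image manifest that attaches an SPDX SBOM to an image. The digests and sizes marked with angle brackets are placeholders; the empty-config descriptor uses the well-known digest of the two-byte `{}` blob:

```python
import json

# Sketch of an OCI image manifest (spec v1.1) that attaches an SBOM to an
# existing image. Bracketed digests/sizes are illustrative placeholders.
sbom_manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    # artifactType marks this as a non-container artifact (here, an SPDX SBOM).
    "artifactType": "application/spdx+json",
    "config": {
        "mediaType": "application/vnd.oci.empty.v1+json",
        "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
        "size": 2,
    },
    "layers": [
        {
            "mediaType": "application/spdx+json",
            "digest": "sha256:<digest-of-the-sbom-blob>",
            "size": 12345,
        }
    ],
    # subject points at the manifest of the image this SBOM describes;
    # registries index it so clients can list everything attached to that image.
    "subject": {
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "digest": "sha256:<digest-of-the-image-manifest>",
        "size": 6789,
    },
}

print(json.dumps(sbom_manifest, indent=2))
```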
The gap between knowing what a CWE is and actually knowing, at the code level, how it manifests and how you avoid these things is very large. Given how much the software industry has grown in the past 10 years, it's not particularly surprising.
> and actually knowing, at the code level, how it manifests and how you avoid these things
You avoid them by using tools that make it difficult or impossible to introduce such vulnerabilities to begin with. Such as modern, memory safe programming languages.
For many decades, carpenters have been educated about table saw safety. But what finally stopped thousands of fingers getting chopped off every year was the introduction of the SawStop, and similar technologies.
Safety is a matter of using the right tools, not of "taking better care".
> For many decades, carpenters have been educated about table saw safety. But what finally stopped thousands of fingers getting chopped off every year was the introduction of the SawStop, and similar technologies.
AFAIK the technology isn't widespread and there are still tens of thousands of injuries per year.
You mean technology like bounds checking, invented during the 1950s with the creation of Fortran, Lisp and Algol, and present in every other language derived from them, with the exception of C, C++ and Objective-C?
"Although we entertained occasional thoughts about implementing one of the major languages of the time like Fortran, PL/I, or Algol 68, such a project seemed hopelessly large for our resources: much simpler and smaller tools were called for. All these languages influenced our work, but it was more fun to do things on our own."
Then there is the whole issue of making it more interesting to look elsewhere instead.
When a door is locked I can still break in by throwing a rock through the window, yet most people lock the door nonetheless, while most thieves only bother to break the window if there is anything actually valuable in doing so.
Yeah, at least in the US it looks like table saw accidents that put people in the ER are about as common as they were 15 years ago. I have a buddy who just lost 6 months of work because of a table saw accident.
Those two things aren't mutually exclusive. I'll bet a non-trivial number of XSS and SQL injection vulnerabilities came from people disabling input and output sanitation on solid frameworks and libraries because they didn't know why they shouldn't. Tools won't solve all of your problems-- you need knowledge, diligence, and tools that make doing the right thing easy.
> I'll bet a non-trivial number of XSS and SQL injection vulnerabilities came from people disabling input and output sanitation on solid frameworks and libraries because they didn't know why they shouldn't.
Searching Google for disabled sanitation "vulnerability", the first two hits are articles admonishing developers not to do it, and the third is a CVE, CVE-2023-1159, from a month ago that affects WordPress installations on which the developer disabled unfiltered_html, which is its built-in sanitation functionality.
>Those things are stopped by other tools, such as query builders and web frameworks.
No. All tools can be used with an improper attitude which leads to the creation of weak points.
The proper way is to have a deep understanding of the role of design rules.
A programmer who does not pay attention to design (very basic principles of the design process) can create a good game, and even if this game contains weaknesses, the associated risk isn't a reason not to use it. The same programmer, when creating critical infrastructure software, is a source of potential nightmares.
Unfortunately, the software business accepts such specialists for projects of both kinds. Why? Who knows?
Perhaps because of legal regulations?
Why is it that when an engineer designs a car, they don't try to "move fast and break things"?
XSS and SQLi can happen independently of the memory safety of your chosen programming language. You can use relatively safe frameworks or ORMs to generate HTML and interact with your DB, but there will sometimes be complex use cases that require you to extend or otherwise not use those safeguards.
Similarly, I imagine that there are cases where someone needs to do complex woodworking tasks that involve dangers which are less obvious than with a table saw.
XSS is a great example of that. On paper a ton of people know exactly what XSS is and does. In practice... simply don't allow user-controlled input to be emitted unescaped, ever. Good luck!
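A toy Python sketch of the point, using the standard library's html.escape (real apps should get this from an auto-escaping template engine rather than hand-rolling it):

```python
import html

author = "mallory"
body = "<script>steal(document.cookie)</script>"  # user-controlled input

# Unsafe: user-controlled input lands in the page verbatim and executes.
unsafe = f"<p><b>{author}</b>: {body}</p>"

# Safe: escape anything user-controlled before it reaches the markup.
safe = f"<p><b>{html.escape(author)}</b>: {html.escape(body)}</p>"

print(unsafe)  # the script tag survives intact
print(safe)    # renders as inert text: &lt;script&gt;...
```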
The reason XSS (and CORS) mitigations are tricky is that domain-based scoping fundamentally doesn't work in a world where a website may be spread over a couple of different domains. I get a taste of this in my day job, where we have to manage cookie scoping across a couple of different region domains and have several different subdomains for different cookie behaviors. It's easy to be clean on paper up until you need to interface with some piece of software that insists on doing it its own way - for example, the Azure Excel embed functionality requires the ID token to be passed in the request body, meaning you have to pull in the request body and parse it in your gateway layer (or delegate that to a microservice)... potentially with multi-GB files being sent in the body as well!
It's super easy on paper to start from greenfield and design something that is sane and clean, bing boom so simple. But once you acquire a couple of these fixed requirements, the cleanliness of the system degrades quite a bit, because that domain uses a format that's not shared by anything else in the system, and it's a bad one, and we can't do anything about it, and now that's a whole separate identity token that has to be managed in parallel.
Anyway, you could say that buffer overflow or use-after-free are kind of an impedance mismatch for memory management/ownership in C. Well, XSS and CORS are an impedance mismatch for domain-based scoping models in a REST-based world. Obviously the correct answer is to simply not write vulnerable systems, but is domain-based scoping making that easier or harder?
Great examples. 1) You have to deal with your own complex systems where it becomes difficult and 2) you have to deal with external complex systems which enforce bad practice on you. One can see how it becomes borderline impossible not to slip once in a while.
Two of those four are things there's no need to make easy to do by mistake, but two popular programming languages choose to do so anyway and they reap the consequences.
Actually the SQL one is arguably in that category too, to a lesser extent. Libraries could, and should, make it obvious how to do parametrized SQL queries in your language. I would guess that for every extra minute of their day a programmer in your language must spend to get the parametrized version to work over just lazy string mangling, you're significantly adding to the resulting vulnerability count because some of them won't bother.
Bonus points if your example code, which people will copy-paste, just uses a fixed query string because it was only an example and surely they'll change that.
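As a sketch of how small that "extra minute" can be, here's the contrast using Python's built-in sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

name = "' OR '1'='1"  # attacker-controlled input

# Lazy string mangling: the input rewrites the query and matches every row.
rows = conn.execute(
    f"SELECT * FROM users WHERE name = '{name}'"
).fetchall()
print(rows)  # [('alice', 0)] -- injection succeeded

# Parametrized: the driver passes the value out-of-band; no injection.
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (name,)
).fetchall()
print(rows)  # []
```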
I feel there would be some value in SQL client libraries that just flat out ban all literals.
I know it's the nuclear option, but decades of experience has shown that the wider industry just cannot be trusted. People won't ever change[1], so the tools must change to account for that.
[1] Unfortunately, LLMs learned from people... so... sigh.
While I agree that the software industry suffers from ageism and anti-intellectualism, these vulnerabilities are actually the symptoms of elitism, cargo culting, and traditionalism, which it also suffers from.
Maybe not ageist, but I do think it's easier to get younger people to work slavishly and pay them relatively less (on average, not everywhere pays like Bay area).
Is there some authoritative source for what is considered modern C++ and what is old? Most projects I've seen use a wide mix of C++ features of varying age. Using some C++23 features doesn't make a project modern if it still uses C++98 features you're not supposed to use.
The term predates the standard (Andrei Alexandrescu was already using it in 2001's Modern C++ Design), but when ISO C++11 came to be, many re-used it to mean C++11 or higher.
Given that many keep updating this to mean ever more modern versions, a well-known developer in the community (Tony Van Eerd) made the joke that by C++17 we were in Postmodern C++.
No idea what kind of modernism to call C++23 when C++17 was already postmodern; maybe Revivalist C++.
However, it basically comes back to Andrei Alexandrescu's original idea of programming in C++ as its own language: leave the C ways and pitfalls of resource management behind, and learn to embrace a modern language for systems programming.
I should also note that there are developers against this philosophy; they advocate that the C++ as understood by CFront is what one should care about, and thus the Orthodox C++ movement was born.
> programming in C++ as its own language, leave the C ways
I'm with Kate Gregory on the "Stop teaching C" (actually Kate specifically means in order to then teach C++ but I also think it's probably fine to stop teaching C outright)
But whilst Kate is right in terms of pedagogy, as a larger philosophy this is inadequate. As a language C++ is obviously defective and the explanation is almost invariably "Because C" which only makes sense once you appreciate C++ in terms of C.
The built-in array type in C++ is garbage. Why is it garbage? This is a language with all these powerful features, why doesn't its array type leverage any of them? It's because this is actually the array type from C.
OK, maybe just the array type is trash, that's obviously not good, but it's one defect. How about string literals? Oops. C++ does sort of technically have the string literals you actually wanted, but the syntax for them is weird and you need the standard library, not the core language... the ones you get for "Some text" are C's constant strings: an array of bytes with an extra zero byte, and, well, the array type sucks.
This carries on: the language doesn't provide real tuples, it doesn't provide a real sum type, its built-in types don't believe in methods but user types do, and everywhere there are weird choices which are nonsensical except for the reality that it's what C does.
And then at the end of that, the language isn't actually compatible with C. It's close, a lot of stuff works, and more stuff kinda-sorta works enough that you may be surprised when it fails, but there isn't the sort of robust compatibility you might expect given the enormous sacrifices made for this goal.
The issue is how "worse is better" culture tends to win, and if the option is between C and C++ for a given scenario, then I definitely take C++.
However if the option pool is widened to more alternatives, then yeah, there should be a sound reason to still pick them for greenfield development, e.g. CUDA, a language toolchain based on LLVM, ...
No, there's no such authoritative source - depending on context C++ fans will mix and match what is 'modern'.
It's somewhat similar to the C/C++ split. When it is convenient it's "C/C++" because "you can easily migrate your old C codebase to C++". But in other situations it's "C++", because C is old and more error prone and "we no longer manipulate raw pointers".
“Modern C++” is not necessarily tied to any specific standard, it is more a collection of ideas and philosophies. Although if I had to pick I’d say it really started with C++11.
Not authoritative, but the really big C++ change was with C++11; changes after that have been important, but perhaps more or less transparent to the average C++ user. And compiler support for C++11 is very good.
Static analysis as a bugfinding tool has proven to be insufficient, especially for large C++ binaries and JS programs. Both languages are nightmares for precise and scalable analysis.
Coverity exists. They've got a great product. But it doesn't solve the problem.
Of course. But these issues will remain near the top of the list indefinitely if people just leverage traditional analysis tools.
I love static analysis. I did my PhD in it. But we'll still be talking about use after free in 2073 if we just try to chase higher k (deeper context sensitivity) in our analysis implementations.
Naturally, static analysis alone doesn't fix use after free in all possible cases; however, it already does catch several of them when the analyser can see all of the existing source code.
The main issue is the community sub-culture of not adopting tooling as it isn't perfect 100% of the time.
Many of the C++ security conscious folks end up being polyglot, as this subculture eventually wears one out.
Having spent well over a decade in this space, I assure you that the root cause of limitations for static analyzers doing lifetime analysis in C++ is not separate compilation or partial program analysis caused by shared libraries.
Nowhere did I suggest that we shouldn’t use these tools or spend time improving them. UX for tools more powerful than local AST matching indeed tends to be quite bad because explaining the chain of reasoning for an alarm is difficult.
My only point is that without a different approach we will continue to have the same problems in 2073.
I agree that in principle the neutralization bugs aren't something C++ is necessarily making worse than, say, Python. But it'd be fascinating to see a study to figure out whether C++ programmers make these mistakes more often, or less often, or roughly the same.
An argument for more often: C++ is so complicated, maybe you're too busy with other problems to address the neutralization issue
An argument for less often: C++ teaches you to be careful and check everything to avoid nasty outcomes so that carries over to neutralization
It's somewhat disheartening as a security enthusiast that people only focus on "popular" security bugs and ignore the rest. The other 21 bug classes in the top 25 aren't as "cool", but they will let me hack your app just the same.
SQL injection is weird because it's been known for so long, and modern frameworks usually have so many ways of avoiding it by default, that one has to go out of their way to create an injection vulnerability, but it still happens often with greenfield code.
Then ask yourself: how much have you done to prevent people from choosing the wrong programming language? Because the PL has such a major influence, it's by far the lowest-hanging fruit for tackling many of those issues.
Personally? I've done quite a bit here although there's always more. I worked at Google to fund Rust development internally and externally, helped sponsor the work that eventually led to getting Rust adopted in the Linux kernel, and now run a company that's building a new Linux distribution that prioritizes shipping code written in memory safe languages.
SQL injection and XSS are typically solved at a library/framework level instead of a programming language one, although type systems can help make those frameworks usable and work well.
Either way, they're effectively "solved" from a programmer's perspective if you're willing to adopt modern frameworks instead of string-concatenating HTML or SQL manually.
Judging from my limited experience the first and fourth are either caught by the compiler or at least result in a panic in some cases.
The middle two are out of reach of a typical PL or type system (there are exceptions like Ur, but I don't think it's adopted widely). It's a problem that is typically solved via libraries and Rust is not unique in terms of providing safe libraries around generating SQL or HTML.
With a bit of creativity, you can use static typing systems to at least slant the table in your favor with SQL, HTML, and in general, structured text output. It's hard to completely ban string concatenation because you will eventually need it, but you can make it so doing the right thing is easier than the wrong thing.
However, existing libraries for statically-typed languages often don't do the work or apply the creativity and end up roughly as unsafe as the dynamically typed languages.
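A minimal Python sketch of that table-slanting, with invented names; the point is that the only path from a raw string into output goes through the escaping constructor, which a type checker (e.g. mypy) can enforce:

```python
import html
from dataclasses import dataclass

@dataclass(frozen=True)
class SafeHtml:
    value: str

def text(user_input: str) -> SafeHtml:
    # The only way to turn a plain string into SafeHtml is through escaping.
    return SafeHtml(html.escape(user_input))

def tag(name: str, *children: SafeHtml) -> SafeHtml:
    # Composing SafeHtml values never re-escapes; raw str isn't accepted.
    inner = "".join(c.value for c in children)
    return SafeHtml(f"<{name}>{inner}</{name}>")

page = tag("p", text("<script>alert(1)</script>"))
print(page.value)  # <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>

# tag("p", "<script>...</script>")  # a type checker rejects this: str is not SafeHtml
```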
It could, but it will be decades before Rust adoption is where C/C++ is today, so in the meantime it would be nice to see some other, more practical and short-term solution to these problems. Otherwise I can predict at least 50% of the top 4 a decade ahead.
Static detection of UAF is grossly incapable of actually protecting real C++ applications. It can find some bugs, sure. But a sound analysis is going to just throw red all over a codebase and get people to disable it immediately.
Changing everything to take lengths is definitely a good change - but challenging to retrofit into existing codebases. Apple has a neat idea for automatically passing lengths along via compilation changes rather than source changes, but if you want to do things in source you have to deal with the fact that there is some function somewhere that takes a void*, increments it locally, reinterpret_casts it to some type, and then accesses one of its fields and you've got a fucking mess of a refactor on your hands.
We're trying to fix this problem at Chainguard. We have our own Linux distro that packages modern versions of software (like minutes or hours after it's released), as well as older versions.
We're also working on FIPS 140-2 and 140-3, and support pretty much every compliance framework we can find.
I work at Chainguard. We don't guarantee zero active exploits, but we do have a contractual SLA we offer around CVE scan results (those aren't quite the same thing unfortunately).
We do publish an advisory feed, in a few format versions, that scanners integrate with. The traditional format we used (which is what most scanners supported at the time) didn't have a way to include pending information, so we couldn't include it there.
The basic flow was: the scanner finds a CVE and alerts; we issue a statement showing when and where we fixed it; the scanner understands that and doesn't show it in versions after that.
So there wasn't really a spot to put "this is present"; that was the scanner's job. Not all scanners work that way though, and some just rely on our feed and don't do their own homework, so it's hit or miss.
We do have another feed now that uses the newer OSV format; in that feed we have all the info around when we detect it, when we patch it, etc.
All this info is available publicly and shown in our console; much of it you can see here: https://github.com/wolfi-dev/advisories
You can take this example: https://github.com/wolfi-dev/advisories/blob/main/amass.advi... and see the timestamps for when we detected CVEs, in what version, and how long it took us to patch.
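As a sketch of what you can do with those timestamps, here's a few lines computing time-to-patch from one advisory entry. The event shape below mirrors those files, but treat the CVE ID, field names, and values as assumptions rather than the exact schema:

```python
from datetime import datetime

# Toy advisory entry; the ID, timestamps, and field names are placeholders.
advisory = {
    "id": "CVE-2023-XXXXX",
    "events": [
        {"timestamp": "2023-04-01T02:00:00Z", "type": "detection"},
        {"timestamp": "2023-04-01T14:30:00Z", "type": "fixed"},
    ],
}

def ts(event: dict) -> datetime:
    # Parse RFC 3339 timestamps; the replace() keeps older Pythons happy.
    return datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))

by_type = {e["type"]: ts(e) for e in advisory["events"]}
print("time to patch:", by_type["fixed"] - by_type["detection"])
```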