A postmortem about the incident that could have affected artifacts on repo.eclipse.org
What happened?
On February 16th, 2021, we received a security report about secrets in the main Jiro repository. The report was correct: the secrets had been committed to the repository on March 18th, 2020.
What was leaked?
The secrets were deployment credentials for the Nexus application running on repo.eclipse.org. Although the credentials themselves were encrypted, the master password was also part of the leak. This master password was not in clear text, but it is fairly easy to decode and then use to decrypt the credentials.
What were the threats?
The leaked credentials had full control (read/write/delete) over all Maven repositories stored at https://repo.eclipse.org. The threats we identified are:
- Removal of published items. This is destructive, but not too malicious as we have regular backups.
- Some jars could be tampered with to add classes with malicious code that can run on the systems where they are deployed.
- Some pom.xml files could be modified to add/change dependencies so that downstream consumers would fetch those dependencies (with potentially malicious code).
How was it mitigated?
The credentials were revoked immediately. Shortly after, we deployed new credentials to all Jenkins instances requiring deployment capabilities. The leaked credentials have been removed from the git repository. We understand that they remain in the git history, but there is little value in removing them from there now. Rewriting up to a year of the repository’s history would be a destructive operation: it would confuse contributors and force them to do a lot of work to rebase their work in progress onto the rewritten history.
Has there been any malicious usage of those credentials?
We’ve done a thorough audit and we’re confident that no release artifacts have been tainted. Snapshot artifacts are a tad more complicated to audit. We found no evidence that any of them have been tainted, but we cannot prove that none were.
What was audited?
The leaked credentials granted full control over all Maven repositories, but only through the REST API. This means that the last modified time (mtime) of the files on the file system could not be forged by a potential malicious user. We based most of our audit on that fact:
If a file was tainted, its mtime can only be between the leak date and the time we revoked the credentials.
As an approximation, this means the 350 days preceding the revocation of the credentials. We found about 100k such files.
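As an illustration of that first filter, here is a minimal Python sketch that walks a directory tree and keeps only the files whose mtime falls inside a given window. It is not the script we ran: the function name is hypothetical, and the times of day on the two boundary dates are assumptions, since only the dates are stated above.

```python
import os
from datetime import datetime, timezone

# Dates taken from the timeline above; the time of day is an assumption.
LEAK = datetime(2020, 3, 18, tzinfo=timezone.utc)
REVOKED = datetime(2021, 2, 17, tzinfo=timezone.utc)


def files_modified_between(root, start, end):
    """Yield paths under `root` whose mtime lies within [start, end]."""
    start_ts, end_ts = start.timestamp(), end.timestamp()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # st_mtime is the file system's last modification time.
            if start_ts <= os.stat(path).st_mtime <= end_ts:
                yield path
```

Calling `list(files_modified_between(storage_root, LEAK, REVOKED))` on the repository storage directory (`storage_root` being whatever path holds it) would give the candidate list to audit.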
In this list, there are some files coming from our maven_central proxy, which were not modifiable by the leaked credentials, so we can already exclude all of them: about 10k files.
Within this set of files, the primary artifacts are jar files. Luckily, our community is used to signing its jars, and such signatures cannot be forged. So, if a jar file is signed with the Foundation certificate, then it’s not tainted. We found 21168 signed jars and 26076 regular (not signed) jars.
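A first triage of jars into "carries a signature" and "does not" can be sketched in Python as below. This is only an assumption about how such a split could be done, not the tooling we used, and the function name is hypothetical. It merely detects the presence of a signature block file in META-INF. Presence alone proves nothing: confirming that a jar is validly signed with a given certificate requires real verification, for example with the JDK’s `jarsigner -verify`.

```python
import zipfile

# Signature block files live directly under META-INF and use these extensions.
SIGNATURE_SUFFIXES = (".RSA", ".DSA", ".EC")


def has_signature_block(jar_path):
    """Return True if the jar contains a META-INF signature block file.

    This is a presence check only; it does NOT validate the signature.
    """
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            upper = name.upper()
            if (upper.startswith("META-INF/")
                    and upper.count("/") == 1
                    and upper.endswith(SIGNATURE_SUFFIXES)):
                return True
    return False
```

Jars for which this returns False go straight to the unsigned bucket; the others still need their signature verified before they can be trusted.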
Some projects publish to both repo.eclipse.org and Maven Central. Maven Central is mostly immutable and can be considered trustworthy: once something is published there, it cannot be tainted. So, if we find that an unsigned jar is identical in both locations, then it’s not tainted. Unfortunately, there are no snapshots on Maven Central, so we can only check release jars: we found 1548 unsigned release jars.
Unfortunately, projects usually don’t publish the exact same jar to both locations. They often publish as part of 2 different build steps, which leads to jar files that are not binary identical. We therefore cannot rely on binary comparison to check whether a jar has been modified. To take those slight changes into account, we need to do a deep comparison of the jars’ contents, ignoring the changes that can be ignored (signatures, some metadata…). We used jardiff for that purpose.
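To give an idea of what such a deep comparison involves, here is a minimal Python sketch. It is not jardiff and does not reproduce our ignore list: it compares two jars entry by entry using SHA-256 digests and skips signature files and the manifest, which is a coarse assumption made for illustration. All names are hypothetical.

```python
import hashlib
import zipfile

# Entries assumed safe to ignore for this sketch: signature files and the
# manifest (signing adds digest entries to it). The real ignore list differed.
IGNORED_SUFFIXES = (".SF", ".RSA", ".DSA", ".EC")
IGNORED_NAMES = {"META-INF/MANIFEST.MF"}


def _is_ignored(name):
    upper = name.upper()
    if name.endswith("/"):  # directory entries carry no content
        return True
    if upper in IGNORED_NAMES:
        return True
    return upper.startswith("META-INF/") and upper.endswith(IGNORED_SUFFIXES)


def content_digests(jar_path):
    """Map each non-ignored entry name to the SHA-256 of its content."""
    with zipfile.ZipFile(jar_path) as jar:
        return {
            name: hashlib.sha256(jar.read(name)).hexdigest()
            for name in jar.namelist()
            if not _is_ignored(name)
        }


def diff_jars(left, right):
    """Report entries present on one side only, or with different content."""
    a, b = content_digests(left), content_digests(right)
    return {
        "only_left": sorted(a.keys() - b.keys()),
        "only_right": sorted(b.keys() - a.keys()),
        "changed": sorted(n for n in a.keys() & b.keys() if a[n] != b[n]),
    }
```

Two jars are considered identical under this sketch when all three lists are empty. Any entry reported as changed would still need a closer look, for instance at the byte code level.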
We found 692 unsigned jars only published to repo.eclipse.org, 840 unsigned jars published to both locations and identical (excluding our ignore list of differences), and 16 jars published to both locations but with non-identical content. We went deeper and analyzed the byte code differences of those 16, and found that 4 of them contained only minor changes that could easily be considered safe (field ordering changes, timestamp changes in debug comments). The remaining 12 were compared to other artifacts with the same groupId and the same version in their respective repositories. The mtimes of those jars were similar to those of the other artifacts, so they can be considered safe.
We also need to check other file types (pom, xml, etc.). There is no signature for them, so we cannot use the same trick as for jar files to reduce the number of files to check. The first thing we can do is check whether the same files exist on Maven Central and whether they differ. Here we mean exact comparison, as there is no easy way to smartly compare the content of arbitrary file types. Again, we had to exclude snapshots, as there are none on Maven Central. We found 2518 identical files in both locations, 5183 files only in repo.eclipse.org and 25 files with different contents (mostly pom files and p2 artifact xml). We analyzed the differences and did not identify any malicious changes. The bulk of the changes were in file size properties in p2 artifact xml and in the Maven dependencies declared in pom.xml for publishing to Maven Central.
With the various heuristics applied above, we were able to determine a list of 24526 untainted files. We still have about 65k files to verify. 6100 of those are release artifacts. For those, we can check if all the artifacts with the same GAV (GroupId, ArtifactId, Version) triplet, the same GV pair, and the same version all have a similar mtime. Indeed, it’s unlikely that a malicious user would modify all the artifacts at once. We decided on the following parameters for our heuristics:
- The mtimes of all files with the same GAV triplet must not differ from each other by more than 1 hour.
- The mtimes of all files with the same groupId and version must not differ from each other by more than 6 hours.
- The mtimes of all files with the same version must not differ from each other by more than 2 days.
With this heuristic, we found 15 groups of outliers. After more investigation, they were all false positives.
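The three rules above can be sketched in Python as follows. This is a hedged illustration rather than our actual script: it assumes the files have already been reduced to `(group_id, artifact_id, version, mtime)` records, and the function and field names are hypothetical. Only the three thresholds come from the rules themselves.

```python
from collections import defaultdict
from datetime import timedelta

# Maximum allowed mtime spread per grouping level, as listed above.
MAX_SPREAD = {
    "gav": timedelta(hours=1),
    "gv": timedelta(hours=6),
    "v": timedelta(days=2),
}


def _key(level, group_id, artifact_id, version):
    if level == "gav":
        return (group_id, artifact_id, version)
    if level == "gv":
        return (group_id, version)
    return (version,)


def find_outlier_groups(records):
    """Return (level, key) for every group whose mtime spread exceeds its limit.

    `records` is an iterable of (group_id, artifact_id, version, mtime)
    tuples, where mtime is a datetime.
    """
    records = list(records)  # iterated once per grouping level
    outliers = []
    for level, limit in MAX_SPREAD.items():
        groups = defaultdict(list)
        for group_id, artifact_id, version, mtime in records:
            groups[_key(level, group_id, artifact_id, version)].append(mtime)
        for key, mtimes in groups.items():
            if max(mtimes) - min(mtimes) > limit:
                outliers.append((level, key))
    return outliers
```

Each returned group is only a candidate for manual review. A large spread can have benign causes, such as a release being re-published in several steps.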
After removing all the checked outliers’ artifacts, we still had about 3000 release artifacts to check. We grouped those by version and by the day of their mtime, and ended up with a list of about 150 items to check. For each item, we had a project, a day and a version string. We checked the CI and the project metadata (PMI) for each of them to validate that the mtime matched an actual CI job or, when no CI job was available, a declared release date.
We managed to validate — to the best of our knowledge — that no release artifacts were tainted because of this leak. Unfortunately, we can’t do much for the snapshot artifacts. We know that about 13k of them are signed jars, but for the rest, it’s impossible to deny or confirm anything.
Is there anything I need to do for my project?
As far as your release bits are concerned, you are safe and do not have to do anything. Regarding your snapshots, we’ve been pruning snapshots that have not been used for more than 60 days from the repositories.
We suggest you start building new snapshot versions of all used artifacts. Feel free to reach out to us if you want to have a list of those.
What are the plans to prevent this from happening again?
- We will stop generating secrets inside the git repo folder so that such a file can never be committed again. Note that we already had a .gitignore with the proper rules, but for some reason it was not enough.
- We will enforce code reviews for all code submissions to sensitive CBI repositories.
- We will grant permissions to projects only on repositories associated with the projects. This will help contain the potential blast radius of such a leak, should it happen again in the future.
Originally published at mikael-barbero.medium.com.