What the recent npm and GitHub breach means for you

Last week GitHub and npm announced the details of an investigation into an attack that targeted the npm organisation on GitHub. In this article we explore the impact on the average organisation and discuss some steps you can take to minimise the effect of this attack and similar future attacks.

How did it happen?

You can read the full details in the post-mortem published by GitHub, but in brief: an attacker obtained OAuth tokens issued to authorised consumers of the GitHub API and used those tokens to selectively target a number of GitHub organisations. By cloning private repositories from the npm GitHub organisation, the attacker gained access to sensitive AWS access keys and secrets that allowed access to npm's cloud infrastructure. From there the attacker was able to exfiltrate (steal) data, including a database backup dating from April 2021.

What was the impact?

The attackers were able to exfiltrate data from internal npm systems. Exactly what was stolen is detailed in the GitHub analysis, so we won't cover it in depth here. Instead we focus on the part that's most likely to affect you or your organisation.

One of the pieces of exfiltrated data was a backup from April 2021 of "skimdb", a public mirror of the CouchDB database behind the npm package registry itself. According to the GitHub announcement, the stolen backup includes the following items of particular concern:

  • An archive of user information from 2015. This contained npm usernames, password hashes, and email addresses for roughly 100k npm users.
  • All private npm package manifests and package metadata as of April 7, 2021.

If your npm user account was one of those 100,000, you should have been notified last week by npm. Affected accounts have had their passwords forcibly reset, and you'll have to go through the reset process to regain access to your account.

Screenshot of the breach notification email from npm

The second point, regarding private package manifests, is perhaps more concerning. The GitHub post-mortem goes on to explain in more detail:

This exfiltrated data includes READMEs, package version histories, maintainer email addresses, and package install scripts, but does NOT include the actual package artifacts, i.e., the tarballs themselves.

It's particularly important to note that, while the public registry mirror should not make data associated with private packages available to the public, the data stolen in this breach does appear to include such information.

Finally, it's worth mentioning that two specific unnamed organisations were further targeted and suffered a theft of actual package artefacts. These organisations have been notified by GitHub. Had the attackers chosen to do so they would have been in a position to steal this additional, and likely far more sensitive, data from any private package in the registry.

What can you do about it?

Fortunately, in the vast majority of cases the actual package artefacts (the code itself) were not included in the exfiltrated data. It is still quite likely, however, that attackers could make use of sensitive information in the README files and package install scripts that were included. We recommend the following steps at a minimum to reduce possible further impact on your organisation:

  • Audit the codebases from which all of your private npm packages are published.
  • Pay close attention to any README files. Packages use a file in the root directory (relative to the package.json file) named README (with any case in the file name and with any extension), and we can safely assume it is the contents of this file that are included in the stolen dataset. Search these files for hostnames, access keys, passwords or any other potentially sensitive data.
  • Check the author and contributors fields in the package.json files. Any names and email addresses included in these fields were included in the stolen dataset. It may be sensible to proactively engage with any members of your organisation that have access to inboxes associated with those email addresses to reduce the risk of future targeted phishing attacks.
  • Check the scripts field in the package.json files. We can fairly safely assume that the stolen dataset did not include any files within the package referenced by these scripts, but any inline scripts within package.json itself will have been included.
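
The checks above lend themselves to a small script. The following is a minimal Python sketch, not an official npm tool: it pulls email addresses out of the author and contributors fields, lists the inline scripts, and locates README files next to package.json. The function names (person_email, audit_manifest, find_readmes) and the example manifest are hypothetical, and the email pattern is a deliberate simplification of the "Name <email> (url)" string form that package.json allows.

```python
import json
import re
from pathlib import Path

# Matches the <email> part of a person string such as
# "Jane Doe <jane@example.com> (https://example.com)".
EMAIL_IN_STRING = re.compile(r"<([^<>\s]+@[^<>\s]+)>")


def person_email(person):
    """Return the email from a package.json 'person' entry, or None.

    A person is either an object with an optional "email" key or a
    single string of the form "Name <email> (url)".
    """
    if isinstance(person, dict):
        return person.get("email")
    if isinstance(person, str):
        match = EMAIL_IN_STRING.search(person)
        return match.group(1) if match else None
    return None


def audit_manifest(manifest):
    """Collect the manifest data this article suggests reviewing."""
    people = [manifest.get("author")] + list(manifest.get("contributors") or [])
    emails = sorted({e for e in map(person_email, people) if e})
    scripts = dict(manifest.get("scripts") or {})
    return {"emails": emails, "scripts": scripts}


def find_readmes(package_dir):
    """List files named README (any case, any extension) beside package.json."""
    return sorted(
        p for p in Path(package_dir).iterdir()
        if p.is_file() and p.name.split(".")[0].lower() == "readme"
    )


# Hypothetical manifest, used only to illustrate the output shape.
example_manifest = {
    "name": "@example/private-pkg",
    "author": "Jane Doe <jane@example.com> (https://example.com)",
    "contributors": [{"name": "Sam", "email": "sam@example.com"}, "No Email"],
    "scripts": {"postinstall": "node ./scripts/setup.js --token=abc123"},
}

print(json.dumps(audit_manifest(example_manifest), indent=2))
```

To audit a real package, load its manifest with json.loads(Path("package.json").read_text()) and pass the result to audit_manifest. Review every inline script by hand: a script like the postinstall example above embeds a token directly in package.json, which is exactly the kind of value that would have been exposed.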

How can you minimise the impact of similar future attacks?

  • Take care to avoid committing sensitive data such as access keys or passwords to source control repositories.
  • If you use GitHub, consider enabling secret scanning on your organisation's repositories. If you use a different platform, there are numerous third-party security scanning tools available.
  • If you discover secrets committed in plain text to source control repositories, ensure your teams understand that the policy is to rotate those keys immediately, rather than attempting to rewrite source control history to remove them from the codebase.
  • Consider setting up email groups or mailing lists that can be used as contact details for authors and contributors of packages instead of those of individuals. This may reduce the chance of targeted "spear" phishing attacks.
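
To make the idea of scanning for committed secrets concrete, here is a rough Python sketch of a pattern-based check that could be run over README files or other text before it is committed or published. It is an illustration only and not how GitHub's secret scanning or any particular third-party tool works. The scan_text function and the three patterns are assumptions chosen for the example: the AWS pattern relies on the commonly seen AKIA/ASIA prefixes of access key IDs, and the password pattern is a crude heuristic that will produce false positives.

```python
import re

# Illustrative patterns only; a real scanner covers far more formats.
SECRET_PATTERNS = {
    "aws-access-key-id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private-key-block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "password-assignment": re.compile(
        r"(?i)\b(?:password|passwd|secret)\b\s*[:=]\s*\S+"
    ),
}


def scan_text(text):
    """Return (line_number, pattern_label) pairs for suspicious lines."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "Setup notes\nAWS_KEY=AKIAIOSFODNN7EXAMPLE\npassword: hunter2\nNothing here"
for line_no, label in scan_text(sample):
    print(f"line {line_no}: possible {label}")
```

A match is a prompt for a human to look, not proof of a leak. Any real key found this way should be rotated immediately, in line with the policy described above.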

Want to know more?

Get in touch.