Preface

(日本語版も公開されています。)

Cloudflare, which runs cdnjs, is running a “Vulnerability Disclosure Program” on HackerOne, which allows hackers to perform vulnerability assessments.
This article describes vulnerabilities reported through this program and published with the permission of the Cloudflare security team.
So this article is not intended to recommend you to perform an unauthorized vulnerability assessment.
If you found any vulnerabilities in Cloudflare’s product, please report it to Cloudflare’s vulnerability disclosure program.

TL;DR

There was a vulnerability in the cdnjs library update server that could execute arbitrary commands, and as a result, cdnjs could be completely compromised.
This allows an attacker to tamper 12.7%1 of all websites on the internet once caches are expired.

About cdnjs

cdnjs is a JavaScript/CSS library CDN that is owned by Cloudflare, which is used by 12.7% of all websites on the internet as of 15 July 2021.
This is the second most widely used library CDN after 12.8%2 of Google Hosted Libraries, and considering the current usage rate, it will be the most used JavaScript library CDN in the near future.

Usage rate of cdnjs on the internet

Usage graph of cdnjs from W3Techs, as of 15 July 2021

Reason for investigation

A few weeks before my last investigation into “Remote code execution in Homebrew by compromising the official Cask repository”, I was investigating supply chain attacks.
While finding a service that many software depends on, and is allowing users to perform the vulnerability assessment, I found cdnjs. So I decided to investigate it.

Initial investigation

While browsing the cdnjs website, I found the following description.

Couldn’t find the library you’re looking for?
You can make a request to have it added on our GitHub repository.

I found out that the library information is managed on the GitHub repository, so I checked the repositories of the GitHub Organization that is used by cdnjs.

As a result, it was found that the repository is used in the following ways.

As you can see from these repositories, most of the cdnjs infrastructure is centralized in this GitHub Organization.
I was interested in cdnjs/bot-ansible and cdnjs/tools because it automates library updates.
After reading codes of these 2 repositories, it turned out cdnjs/bot-ansible executes autoupdate command of cdnjs/tools in the cdnjs library update server periodically, to check updates of library from cdnjs/packages by downloading npm package / Git repository.

Investigation of automatic update

The automatic update function updates the library by downloading the user-managed Git repository / npm package and copying the target file from them.
And npm registry compress libraries into .tgz to make it downloadable.
Since the tool for this automatic update is written in Go, I guessed that it may use Go’s compress/gzip and archive/tar to extract the archive file.
Go’s archive/tar returns the filename contained in the archive without sanitizing3, so if the archive is extracted into the disk based on the filename returned from archive/tar, archives that contain filename like ../../../../../../../tmp/test may overwrite arbitrary files on the system. 4
From the information in cdnjs/bot-ansible, I knew that some scripts were running regularly and the user that runs the autoupdate command had write permission for them, so I focused on overwriting files via path traversal.

Image of crafted tgz file to perform path traversal

Path traversal

To find path traversal, I started reading the main function of the autoupdate command.

func main() {
        [...]   
        switch *pckg.Autoupdate.Source {
        case "npm":
            {
                util.Debugf(ctx, "running npm update")
                newVersionsToCommit, allVersions = updateNpm(ctx, pckg)
            }
        case "git":
            {
                util.Debugf(ctx, "running git update")
                newVersionsToCommit, allVersions = updateGit(ctx, pckg)
            }
        [...]
}

As you can see from the code snippet above, if npm is specified as a source of auto-update, it passes package information to the updateNpm function.

func updateNpm(ctx context.Context, pckg *packages.Package) ([]newVersionToCommit, []version) {
        [...]
        newVersionsToCommit = doUpdateNpm(ctx, pckg, newNpmVersions)
        [...]
}

Then, updateNpm passes information about the new library version to doUpdateNpm function.

func doUpdateNpm(ctx context.Context, pckg *packages.Package, versions []npm.Version) []newVersionToCommit {
    [...]
    for _, version := range versions {
        [...]
        tarballDir := npm.DownloadTar(ctx, version.Tarball)
        filesToCopy := pckg.NpmFilesFrom(tarballDir)
        [...]
}

And doUpdateNpm passes the URL of .tgz file into npm.DownloadTar.

func DownloadTar(ctx context.Context, url string) string {
    dest, err := ioutil.TempDir("", "npmtarball")
    util.Check(err)

    util.Debugf(ctx, "download %s in %s", url, dest)

    resp, err := http.Get(url)
    util.Check(err)

    defer resp.Body.Close()

    util.Check(Untar(dest, resp.Body))
    return dest
}

Finally, pass the .tgz file obtained using http.Get to the Untar function.

func Untar(dst string, r io.Reader) error {
    gzr, err := gzip.NewReader(r)
    if err != nil {
        return err
    }
    defer gzr.Close()
    tr := tar.NewReader(gzr)
    for {
        header, err := tr.Next()
        [...]
        // the target location where the dir/file should be created
        target := filepath.Join(dst, removePackageDir(header.Name))
        [...]
        // check the file type
        switch header.Typeflag {
        [...]
        // if it's a file create it
        case tar.TypeReg:
            {
                [...]
                f, err := os.OpenFile(target, os.O_CREATE|os.O_RDWR, os.FileMode(header.Mode))
                [...]
                // copy over contents
                if _, err := io.Copy(f, tr); err != nil {
                    return err
                }
            }
        }
    }
}

As I guessed, compress/gzip and archive/tar were used in Untar function to extract .tgz file.
At first, I thought that it’s sanitizing the path in the removePackageDir function, but when I checked the contents of the function, I noticed that it’s just removing package/ from the path.
From these code snippets, I confirmed that arbitrary code can be executed after performing path traversal from the .tgz file published to npm and overwriting the script that is executed regularly on the server.

Demonstration of vulnerability

Because Cloudflare is running a vulnerability disclosure program on HackerOne, it’s likely that HackerOne’s triage team won’t forward the report to Cloudflare unless it indicates that the vulnerability is actually exploitable.
Therefore, I decided to do a demonstration to show that vulnerability can actually be exploited.

The attack procedure is as follows.

  1. Publish the .tgz file that contains the crafted filename to the npm registry.
  2. Wait for the cdnjs library update server to process the crafted .tgz file.
  3. The contents of the file that is published in step 1 are written into a regularly executed script file and arbitrary command is executed.

… and after writing the attack procedure into my notepad, for some reason, I started wondering how automatic updates based on the Git repository works.
So, I read codes a bit before demonstrating the vulnerability, and it seemed that the symlinks aren’t considered when copying files from the Git repository.

func MoveFile(sourcePath, destPath string) error {
	inputFile, err := os.Open(sourcePath)
	if err != nil {
		return fmt.Errorf("Couldn't open source file: %s", err)
	}
	outputFile, err := os.Create(destPath)
	if err != nil {
		inputFile.Close()
		return fmt.Errorf("Couldn't open dest file: %s", err)
	}
	defer outputFile.Close()
	_, err = io.Copy(outputFile, inputFile)
	inputFile.Close()
	if err != nil {
		return fmt.Errorf("Writing to output file failed: %s", err)
	}
	// The copy was successful, so now delete the original file
	err = os.Remove(sourcePath)
	if err != nil {
		return fmt.Errorf("Failed removing original file: %s", err)
	}
	return nil
}

As Git supports symbolic links by default, it may be possible to read arbitrary files from the cdnjs library update server by adding symlink into the Git repository.

If the regularly executed script file is overwritten to execute arbitrary commands, the automatic update function may be broken, so I decided to check the arbitrary file reading first.
Along with this, the attack procedure was changed as follows.

  1. Add a symbolic link that points harmless file (Assumed /proc/self/maps here) into the Git repository.
  2. Publish a new version in the repository.
  3. Wait for the cdnjs library update server to process the crafted repository.
  4. The specified file is published on cdnjs.

It was around 20:00 at this point, but what I have to do was creating a symlink, so I decided to eat dinner after creating the symbolic link and publishing it.5

ln -s /proc/self/maps test.js

Incident

Once I finished the dinner and returning to my PC desk, I was able to confirm that cdnjs has released a version containing symbolic links.

After checking the contents of the file to send the report, I was surprised.
Surprisingly, clearly sensitive information such as GITHUB_REPO_API_KEY and WORKERS_KV_API_TOKEN was displayed.
I couldn’t understand what happened for a moment, and when I checked the command log, I found that I accidentally put a link to /proc/self/environ instead of /proc/self/maps.6
As mentioned earlier, if cdnjs’ GitHub Organization is compromised, it’s possible to compromise most of the cdnjs infrastructure.
I needed to take immediate action, so I sent the report that only contains a link that shows the current situation, and requested them to revoke all credentials.

At this point, I was very confused and hadn’t confirmed it, but in fact, these tokens were invalidated before I sent the report.
It seems that GitHub notified Cloudflare immediately because GITHUB_REPO_API_KEY (API key of GitHub) was included in the repository, and Cloudflare started incident response immediately after the notification.
I felt that they’re a great security team because they invalidated all credentials within minutes after cdnjs processed the specially crafted repository.

Determinate impact

After the incident, I investigated what could be impacted.
GITHUB_REPO_API_KEY was an API key for robocdnjs, which belongs to cdnjs organization, and had write permission against each repository.
This means it was possible to tamper arbitrary libraries on the cdnjs or tamper the cdnjs.com itself.

Also, WORKERS_KV_API_TOKEN had permission against KV of Cloudflare Workers that is used in the cdnjs, it could be used to tamper the libraries on the KV cache.
By combining these permissions, the core part of cdnjs, such as the origin data of cdnjs, the KV cache, and even the cdnjs website, could be completely tampered.

Conclusion

In this article, I described the vulnerability that was existed in cdnjs.
While this vulnerability could be exploited without any special skills, it could impact many websites.
Given that there are many vulnerabilities in the supply chain, which are easy to exploit but have a large impact, I feel that it’s very scary.
If you have any questions/comments about this article, please send a message to @ryotkak on Twitter.

Timeline

Date (JST)Event
April 6, 2021 19:00Found a vulnerability
April 6, 2021 20:00Published a crafted symlink
April 6, 2021 20:30cdnjs processed the file
At the same timeGitHub sent an alert to Cloudflare
At the same timeCloudflare started an incident response
Within minutesCloudflare finished revocation of credentials
April 6, 2021 20:40I sent an initial report
April 6, 2021 21:00I sent detailed report
April 7, 2021-Secondary fix has applied
June 3, 2021Complete fix has applied
July 16, 2021Published this article

  1. Quoted from W3Techs as of 15 July 2021. Due to the presence of SRI / cache, fewer websites could tamper immediately. ↩︎

  2. Quoted from W3Techs as of 15 July 2021. ↩︎

  3. https://github.com/golang/go/issues/25849 ↩︎

  4. Archives like this can be created by using tools such as evilarc↩︎

  5. I don’t know if this is correct, but I remember that the dinner on that day was frozen gyoza (dumplings). (It was yummy!) ↩︎

  6. Because I was tired from work and I was hungry, I ran the command completed by shell without any confirmation. ↩︎