DOM-Based XSS Attacks and Protecting Against Them

XSS and Dangerous Links

Cross-Site Scripting (XSS) attacks are a family of techniques for injecting malicious code into a web application by abusing its normal functionality, so that a victim's browser takes unwanted actions or runs code written by the hacker. There are many ways this can be done, but we'll focus on one that involves sharing links with modified querystrings, so that clicking on those links causes unintended consequences for an unsuspecting user.

These links don't go to bad websites. They go to a website the user knows, uses regularly, and trusts. The links could appear in emails, on third-party sites, or in user comments on the site itself. The difference is that although these links take the user to a trusted website, the manipulated querystring causes malicious JavaScript to run when they get there. If the user is already logged in to that website, this can lead to disaster: the code could take unauthorized actions in their name, or send their session token to a hacker who could then completely take over their account.

A Simple HTML Example

Let's build a simple web page to demonstrate this concept. If you are following along, you can copy this HTML and the corresponding javascript to practice penetration testing skills for web applications. DOM-Based XSS attacks work in essentially the same way as reflected (non-persistent) XSS attacks that take advantage of server-side flaws, except in the DOM-based variant the attack payload executes as a result of a flaw in the client-side code. This is a very simple example, and it is rarely this easy to launch an XSS attack on most production sites, but not all sites take proper measures to prevent these issues. I hope this goes without saying, but this is for educational purposes only. Use this knowledge to improve your security standards and client-side code.

Create a new folder on your computer called "xss-example". Create a new index.html file inside of xss-example/ and copy the code below to index.html:

<!DOCTYPE html>
<html>
    <body>
        <h2>Profile Post <span id="post_id"></span></h2>
        <div id="post_content"></div><br/>
        <button onclick="createNewPost()">Create New Post</button>
        <button onclick="deleteAccount()">Delete Account</button>
        <script src="main.js"></script>
    </body>
</html>

The Corresponding Javascript

Now add a "main.js" file to xss-example/ and copy the code below into main.js:

/* Logged in User Profile Actions */
function createNewPost(newpost='My new post!') {
    alert("Publicly posting:\n\n" + newpost);
}

function deleteAccount() {
    alert("Deleting your account...");
}

/* Fake Backend Call: Retrieve Post by ID */
function getPost(post_id) {
    if (post_id == "1") return "A post about XSS attacks...";
    else if (post_id == "2") return "Another post about XSS...";
    else return "ERROR: NO MATCHING POST";
}

/* Use Querystring to Retrieve Post and Populate Frontend */

/* Step 1: Find desired Post ID in querystring */
var queryString = window.location.search;
var postId = new URLSearchParams(queryString).get('post_id');
if (!postId) window.location = '/?post_id=1'; // Default to 1st

/* Step 2: Get Contents of Post from 'Backend' with ID */
var post = getPost(postId);

/* Step 3: Populate DOM with Post Contents and ID */
document.getElementById("post_content").innerHTML = post;
document.getElementById("post_id").innerHTML = postId;

Now cd into xss-example/ and run a local server. If you have Python 3, the simplest way to do that is shown below (on some systems the command is python3 rather than python):

$ cd xss-example
$ python -m http.server 8080

Open your favorite browser and go to localhost:8080?post_id=1

You should now see a page headed "Profile Post 1", with the text "A post about XSS attacks..." and the two buttons below it.

This is a very simple page. You can think of it as a very pared down version of a profile page on a webapp. There are two buttons you can press: one to create a new post, and one to delete your account. Both just call dummy alert functions. You can imagine many more options and functionality on a fully built out website/webapp.

If you change the querystring to ?post_id=2 you will see the second profile post, "Another post about XSS...", instead.

Only posts 1 and 2 exist for this dummy example. If you use ?post_id=3 or ?post_id=200 you will see the title change accordingly, but the text for the post will read "ERROR: NO MATCHING POST".

Spend 30 seconds to play around with different values for post_id and click around the buttons to get a basic feel for it.

Exploiting innerHTML

Notice how in the last part of the javascript code, we populate 2 HTML elements by setting innerHTML. One is just the post_id value we pulled from the querystring which is normally just the number of the profile post we want to look at. We were able to use that value to query our 'backend' for the corresponding post entry we're looking for. With the post ID and post contents in hand, the javascript uses innerHTML to set both elements to the correct values and everything appears to function as it should.

If we give the number of a post that doesn't exist, we just get back the message "ERROR: NO MATCHING POST" on the page, which is what we want. What is a little concerning is that we can also set the post_id value to anything we want, and whatever text we put there shows up in the page heading.

That could be confusing if someone accidentally or intentionally shared a link with the querystring set to something misleading. You could imagine the headaches it could cause large companies or politicians if people were sharing links to their social media profiles and were changing post_id to something that mischaracterized them or even made it look like a more serious hack had taken place.

This could definitely be a problem, but it's still mostly surface level. It turns out this seemingly straightforward example with innerHTML can be exploited a lot more.

Leveraging Query Params for Code Execution

The problems don't just stop at misleading page titles. The problem gets significantly worse when the strings we write after ?post_id= are actually HTML elements, and those injected HTML elements are constructed to execute malicious javascript code.

One of the easiest ways to do this is with an <img> tag. (A plain <script> tag won't work here, because browsers do not execute scripts inserted through innerHTML.) An <img> tag can be given an "onerror" attribute, which runs a block of javascript when the image in the "src" attribute fails to load, for example because it doesn't exist. Here's an example we could use on our sample web page:

?post_id=<img src='notanimage.png' onerror='deleteAccount()'>

If someone shares a link to our sample website with that querystring, someone else clicking on it will have the deleteAccount() function run when they reach the site. Try it for yourself: With the sample app still running, in your browser go to

localhost:8080?post_id=<img src='notanimage.png' onerror='deleteAccount()'>

Notice how by just navigating to the page with this modified querystring, deleteAccount() ran automatically, even though the querystring was only intended to let us get and display a post, and deleteAccount() was only intended to run if we pressed the "Delete Account" button. Now, usually the process of deleting a profile/account on a website takes multiple steps, but I hope you get the picture. If that was a site you had an account on and you clicked on a bad link thinking it was okay because it went to a site you trust, you are now running code from within your account that you did not want to run.
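To see why the page behaves this way, it can help to look at the value that Step 1 of main.js actually hands to innerHTML. The snippet below is a small sketch that can be run in Node or a browser console. The variable names payload and sharedQuery are ours, and it assumes the shared link carries the payload percent-encoded, as most browsers and email clients would send it:

```javascript
// The payload from the example above, exactly as typed after "?post_id=".
var payload = "<img src='notanimage.png' onerror='deleteAccount()'>";

// A shared link would typically carry the payload percent-encoded.
var sharedQuery = "?post_id=" + encodeURIComponent(payload);

// Step 1 of main.js: URLSearchParams decodes the value back into raw markup.
var postId = new URLSearchParams(sharedQuery).get("post_id");

console.log(sharedQuery);        // the encoded form that travels in the link
console.log(postId === payload); // whether decoding restored the original HTML string
```

Percent-encoding does not neutralize anything here. By the time the value reaches innerHTML it is the original HTML string again, and the browser parses it as markup.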

Stealing the Logged in User's Session Token

A nefarious hacker constructing these types of links could easily change the querystring to include createNewPost(<MSG>) in order to get a logged in user to inadvertently post sensitive information publicly by just clicking on the link. Even worse for the unsuspecting user, the hacker could steal their session token and use that to gain full control over their account. Let's add a cookie to our javascript to mimic the kind of session token cookie you find in many web applications that manage logged in user sessions. Add the following lines to the beginning of the main.js file:

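/* Set Session Token Cookies for Logged in User */
var inTenMin = new Date();
inTenMin.setMinutes(inTenMin.getMinutes() + 10);

// Each assignment to document.cookie sets exactly one cookie
document.cookie =
    "user_name=Butters97; " +
    "expires=" + inTenMin.toUTCString();
document.cookie =
    "session_token=41MvL4zR7eAPafZqFukWrIQpimFHIgq9; " +
    "expires=" + inTenMin.toUTCString();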

This will create a temporary 10 minute mock session token cookie. Restart your local server and empty cache and hard reload on localhost:8080/. Now try the following querystring:

?post_id=<img src='notanimage.png' onerror='createNewPost(document.cookie);'>

Using these query params in the URL now causes createNewPost() to run with document.cookie as its argument, so the alert "publicly posts" the user's session token.

That's not good. If the hacker is fast enough, they can hijack the logged in user's session to gain full unauthorized access to their account. No code injection is necessary after that: the hijacker now has complete control, and can potentially lock the real user out of their account.

It's not even necessary to run functions the developers of the site added themselves. The onerror attribute accepts (somewhat) arbitrary javascript, and the curly brackets below simply group several statements into one block, for example:

?post_id=<img src='notanimage.png' onerror='{a=3;[a,6].forEach((i)=>{alert(i);});}'>

Try modifying the querystring yourself. See what you can get this example page to do by injecting your own code.

The Moral of the Story

The moral of the story is this: be extra careful with user submitted values, including querystring/url params and form input fields. If they can affect what shows up on the webpage or what gets stored in the database, they are a potential vulnerability. Input validation and sanitization are very important on the frontend, and the same is true before storing user submitted values to a database. Also, be cautious with innerHTML. Avoid using it when possible (textContent is enough when you only need to display text), but if you absolutely have to, make sure to validate and sanitize the values before adding something to the screen or saving to the database.

On the flip side, it's the blocks of code that don't properly handle user controlled parameters that hackers and penetration testers target to develop these kinds of exploits. Remember, client-side javascript is available for everyone to study.
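As a minimal sketch of that advice applied to our sample page, Steps 1 and 3 of main.js could be rewritten along the following lines. The helpers parsePostId and renderPost are hypothetical names introduced here, not part of the original example. Falling back to post "1" for anything that isn't a positive integer is our assumption, not a rule from the sample app:

```javascript
// Hypothetical helper: accept only positive integer IDs from the querystring,
// and fall back to "1" for anything else (including injected markup).
function parsePostId(queryString) {
    var raw = new URLSearchParams(queryString).get("post_id");
    return /^[1-9][0-9]*$/.test(raw || "") ? raw : "1";
}

// Hypothetical replacement for Step 3: textContent treats the value as plain
// text, so any markup in it is displayed literally instead of being parsed.
function renderPost(doc, postId, post) {
    doc.getElementById("post_content").textContent = post;
    doc.getElementById("post_id").textContent = postId;
}

// In the browser, main.js would then call:
//   var postId = parsePostId(window.location.search);
//   renderPost(document, postId, getPost(postId));
console.log(parsePostId("?post_id=2"));
console.log(parsePostId("?post_id=<img src='notanimage.png' onerror='deleteAccount()'>"));
```

Either measure on its own would stop the attacks shown in this tutorial. Using both means that a mistake in one is still caught by the other.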

More Types of XSS and Injection Attacks

In this tutorial we only went over DOM-based XSS attacks using querystrings. This is just one of many XSS attack vectors a hacker can use to get a victim's computer to execute their code. If a website does not properly sanitize user submitted values that are stored in and retrieved from the backend, it's possible that someone simply commenting on a victim's page with malicious elements in the comment could trigger malicious code when the victim logs in, regardless of the querystring. In cases like that there are no warnings: the victim logs in normally and doesn't click on any questionable links, and malicious code still gets executed from their own browser.
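When stored values really do have to be placed into markup, one common approach is to escape the HTML special characters first. The helper below is a hypothetical sketch (escapeHtml is not part of the sample app). It only covers text placed between tags or inside quoted attribute values, so a vetted sanitization library is usually the safer choice for anything richer:

```javascript
// Hypothetical helper: replace the characters that HTML treats specially
// with their entity equivalents. "&" must be handled first so the entities
// produced by the later replacements are not escaped a second time.
function escapeHtml(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}

// A stored comment containing the same payload used earlier in this tutorial.
var comment = "<img src='notanimage.png' onerror='deleteAccount()'>";
console.log(escapeHtml(comment));
```

After escaping, the comment no longer contains a literal "<" or quote character, so a browser shows it as text rather than building an <img> element from it.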

Hopefully you learned something about how XSS attacks are implemented, and perhaps more importantly, how not to write client-side code for a web application. Failing to sanitize and validate parameters that users can modify opens up your app and its users to malicious attacks. If you liked this, check out our other tutorials.