.+:::::::::::::::::::::::::::::::::::::::::/
.//`+++++++++++++++++++++++++++++++++++++// s
-+/.h+..................................-os s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h/                                   ss s
-+/.h+                                   ss s
-++-yo////////////////////////////////////y`s
-+o-:--------------------------::::-:://:---s
-+/                            :-.: +:oo-   s
-+/```````````````````````````````````:-````s
./o//////////+o:::::::::::::::::s+/////////::
``://////////+s:::::::::::::::::y+/////////:.
  s                        .::::/osso::::. .+
`.y                        +yyyyhNNNmyyyy+ .+
`:h:::::/+:::::::::::::+:///////////////////+--------------------
 .o                    s s::::::::::::s.+ - +
-/y     .o             s ::::::::::::oss+::.+
`-y     .o             y            `+//:  .+
`.+/y/::/+:::::::::::::+:::::::::::::::::++:.
    `////////////////////////////////////:/.       


    

Caution! It's gonna be long and very technical...

No reason to fear though, long and technical is the most fun!

And thinking about it, I really cannot imagine anyone who wouldn't want to find out about a piece of software developed in the year 2000 by some artists and how it works at the assembly level ;)

Before we start, a few words about my background. Until now I haven't had any reversing experience worth mentioning. It is in no way something I do regularly. This work came out of curiosity, experimentation, a lot of patience and some research in books and on the internet. Sometimes I just had an idea of something to try that may seem senseless to pros, because I don't know all the techniques. Looking back, I surely could have saved some hours here and there that I spent clicking "next instruction" in the debugger and noting memory addresses I didn't understand... But what if exactly this is what leads to learning? At least it makes it possible to recognize connections and structures later on and to use that knowledge in new ways.

Sometimes you might have to go down the rabbit hole even when it seems overwhelming at first sight. With this in mind, let's again explore the world of bits and bytes.

            ((`\
         ___ \\ '--._
      .'`   `'    o  )
     /    \   '. __.'
    _|    /_  \ \_\_
   {_\______\-'\__\_\

Knowbotic Research

First of all, what is *Knowbotic Research*?

Traveling back in time with the Wayback Machine (archive.org), a short self-description can be found that no longer exists today in its original form. On their website http://www.krcf.org back in 2001 they wrote the following about themselves:

Knowbotic Research, KR+cF (Yvonne Wilhelm, Alexander Tuchacek, Christian Huebler); was established in 1991, and since then the media art group has been experimenting with formations of information, interface and networked agency. Their more recent projects present artistic practice with media as an attempt to find viable forms of intervention in the new public domain. Since 98 KR+cF is teaching New Media at University of Art and Design Zurich. KR+cF has got major international Awards.

https://web.archive.org/web/20011017023211/http://www.krcf.org/krcfhome/bio.htm

More recent links may be:

https://archive.knowbotiq.net/bio/ and https://tuchacek.net/

Connective Force Attack

But what will the following be all about?

In the year 2001 Knowbotic Research published their work Connective Force Attack. The software was written in collaboration with Gideon May and Thomas Rehaag.

On their 2001 website they wrote about this work:

xxxxx connective force attack: open way to the public is an urban system of action designed for working through the various conditions and potentials of a mediatized public domain. The project addresses the issues of insecure data networks, paranoia about hackers, privacy, electronic. The project foresaw the mass distribution of free software that would invite Internet users to join forces in cracking ('brute force attack') an Internet server in Hamburg in order to infiltrate the city's public information system 'Infoscreen'. Three times daily via mobile and ISDN networks, the information entered by the trespassers was to be transferred uncensored to the screens of the some thousand monitors installed in Hamburg's tube trains. Hacking and the controversies surrounding privacy, the public domain and data security would have been brought closer - metaphorically and literally - to some 800,000 passengers a week. It foresaw the deployment of 'brute force attacks' based on algorithmic data-descrambling strategies in order to gain access to an Internet server.

The software allows participants to get together in a chat environment and so heighten the efficiency of the attacks. The goal is to infiltrate an information medium and/or territory, to allocate a new password to a purpose-created, password-protected area and occupy it with new content - until the next group cracks the password, alters it, and either adds its own comments to the preceding group's content, or else deletes it or overwrite it. The available quantity of time and computer power decides whether, how and when a group succeeds in cracking a password. In other words, a brute force attack's chance of success is directly dependent on the connective efficiency - the number of people who use the Internet to channel and join up their PCs to form a distributed, shared unit of action. Any new content generated is simultaneously displayed in a public location without being censored, namely on the large-scale data display in the 'Jungfernsteig' station, where it remains visible round-the-clock.

https://web.archive.org/web/20011207193241/http://www.krcf.org/krcfhome/1crackit.htm

Hackers, brute force, networks, online participants, public display and Hamburg (my hometown)... this sounds extremely interesting and almost begs to be analysed by me.

Luckily there is another website about this project still online today, describing what they did in more detail.

http://krcf.knowbotiq.net/cfa/intro.html

And now the best part:

Their software is still available to download!!!

http://krcf.knowbotiq.net/cfa/cfa.zip

In the following article we will analyse the software cfa.exe in great detail. It will be deeply technical but also super interesting. While analyzing, I will try to explain all the techniques and methods used.

The goal is to understand what the program does, explore its functionality and maybe create a little hack here and there. In the end the program should come back to life and interact with our input.

Simple Static

Alright. The program is downloaded. Time for a first impression. Let's take a look at what information is hidden between the bits and bytes.

First we have to unzip the downloaded archive. This already reveals some other files, but let's begin with cfa.exe.

To find out the file type we can use the Unix program "file".

  $ file cfa.exe
  cfa.exe: PE32 executable (GUI) Intel 80386, for MS Windows

Let's check what that means:

The Portable Executable (PE) format is a file format for executables, object code, DLLs and others used in 32-bit and 64-bit versions of Windows operating systems.

https://en.wikipedia.org/wiki/Portable_Executable
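
What "file" reports here can be checked by hand. The following is a minimal sketch, assuming the standard PE layout: the file starts with the DOS magic "MZ", the 4-byte little-endian value at offset 0x3C points to the "PE\0\0" signature, and the 2-byte machine field right after that signature is 0x014C for Intel 386. The function name describe_pe and the small MACHINES table are made up for this illustration.

```python
import struct

# Machine values from the PE/COFF format (only two common ones listed here).
MACHINES = {0x014C: "Intel 80386", 0x8664: "x86-64"}

def describe_pe(data: bytes) -> str:
    """Return a short description of a PE image, or raise ValueError."""
    if data[:2] != b"MZ":
        raise ValueError("no MZ header, not a DOS/PE executable")
    # e_lfanew: offset of the PE signature, stored at 0x3C in the DOS header
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("MZ header present, but no PE signature")
    # The COFF header follows the signature; its first field is the machine type
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return "PE executable, " + MACHINES.get(machine, hex(machine))
```

To try it, read the file with open("cfa.exe", "rb").read() and pass the bytes to describe_pe.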

On a Windows OS we can right-click and view the file properties.



View cfa.exe file properties

This shows "MFC Application". Decompiling would make reaching our goal much easier, so let's check if this is possible with "MFC" applications.

Unfortunately, a Stack Overflow post says the following:

MFC is compiled from C++ source, so it can't be recovered. [...]

https://stackoverflow.com/questions/5934606/open-mfc-application-to-get-source-code

Seems like we have to go the hard way and dig through the disassembly.

A short reminder:

1. Decompiling: means restoring human-readable high-level source code. This is possible, for example, with Java applications.
This results in something like:

public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
    
2. Disassembly: means translating the zeroes and ones executed by the CPU into a slightly more human-recognizable form: assembly instructions.
For example, it could result in this:

0000000000001149 <main>:
  1149:       f3 0f 1e fa             endbr64
  114d:       55                      push   rbp
  114e:       48 89 e5                mov    rbp,rsp
  1151:       48 8d 3d ac 0e 00 00    lea    rdi,[rip+0xeac]
  1158:       e8 f3 fe ff ff          call   1050 <puts@plt>
  115d:       b8 00 00 00 00          mov    eax,0x0
  1162:       5d                      pop    rbp
  1163:       c3                      ret
    

But back to the first cfa.exe impression.

Other tools for collecting basic information about software are for example:

PEView - this makes the PE header structures visible. The following screenshot shows, among other things, the exact date and time the file was created.



View timestamp in the headers with PEView
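
The value PEView shows here is the TimeDateStamp field of the COFF file header. As a rough sketch, assuming the standard PE layout, it can be read directly: it is a 4-byte little-endian count of seconds since 1970-01-01 UTC, located 8 bytes after the start of the "PE\0\0" signature. Strictly speaking it records when the linker produced the file. The function name pe_link_time is invented for this example.

```python
import struct
from datetime import datetime, timezone

def pe_link_time(data: bytes) -> datetime:
    """Read the TimeDateStamp field of a PE image as a UTC datetime."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE image")
    # After the 4-byte signature: Machine (2 bytes), NumberOfSections (2 bytes),
    # then TimeDateStamp (4 bytes).
    (stamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

Calling pe_link_time on the bytes of cfa.exe should give the same date as the screenshot, shifted to UTC.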

Another tool goes even further into the technical details.

Dependency Walker shows the system libraries used. These are part of the Windows operating system and are not stored in cfa.exe itself. For example, user32.dll provides functionality for graphical interfaces, and wsock32.dll can be used to connect at the network layer.



View imported libraries with Dependency Walker

But the one tool that must not be missing during the first static analysis is Strings. This just prints out all readable text in the binary.
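
What Strings does is simple enough to sketch in a few lines. The following is not the actual tool, just a minimal approximation that assumes "readable text" means runs of printable ASCII bytes of a minimum length. The function name extract_strings and the default of 4 characters are my own choices.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return all runs of at least min_len printable ASCII characters."""
    # 0x20-0x7e is the printable ASCII range (space through tilde)
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]
```

Run on the bytes of cfa.exe, this should produce a list similar to the real tool's output, although the real tools also know about other encodings such as UTF-16.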

The strings found in cfa.exe:

GetProfileStringA
IsWindowUnicode
DefDlgProcA
DrawFocusRect
ExcludeUpdateRgn
UnregisterClassA
GetTextExtentPointA
CreateDIBitmap
cZC
password
login to area
EmperorEight
waiting for billboard data
write
send
message board
write
previous
next
TimerEnd
TimerStart
command
open
shell
http
Classes
DefaultIcon
HtmlDefBrowser
Iexplore.exe
Netscape.exe
App Paths
CurrentVersion
Windows
Microsoft
SOFTWARE
the new password has to be different
old
new
change password!
please
only 0 - 9 and A - Z (big)
only  A - Z (big)
only 0 - 9
not accepted, password length = %d, 
h---h.de
defuser
nohost
waiting for 
chatserver 
response
stop 
start 
memo
login
crack control / chat  area %d
NNNN
GGGG
crack:
from:
Empereig.ttf
io.krcf.org
offline
OFFLINE
cshh
area %d           last crack: %s
connective force attack: open way to public
xxxxx
no connection!
check settings in the dialup dialog
connecting (%d/10)
defaultuser
------ AREA CRACKED ------
%02d/%02d/%04d %02d:%02d 
password
an alphanumeric (big letters)
an alpha only (big letters)
a numeric
Password Error
retry with 
NuNicks.txt
CONNECTION FAILED
http://h---h.de/intro.html
Hanging up 
RAS Connection
%02d.%02d.%02d %02d:%02d:%02d 
snm.hgkz.ch
provider
username
phone
already online / via LAN
remember password
dialup
Disconnected
Authenticated
Modem busy!
\hh.ctrl
cfa.info
INFO
NO_LEGAL_INFO
TRUE
\info\index.htm
START_INFO
MAINCLIENT
SUPER
AUTORUN
TESTSHOW
NODIAL
CREATEUID
OFFLINE
TEST
CONNECT
UID
CRACKRIGHT%d
CRACKLEFT%d
REMEMBER_PASS
LAN
HHNICK
HHUSER
PASSWORD
USERNAME_PROVIDER
MAIL
TEL
PROVIDER
FALSE
\hh.info
\cfa
Programs
Programme
xtra
XXXXXXXX
USER %s %s %s :hhCrackUser
NICK %s
PRIVMSG
Welcome
PONG
PING
JOIN #%d
PART #%d
PRIVMSG #%d :%s
QUIT 
HELLO "%s" "%s" "%s" "%s" 
BYE
ALPHANUM
ALPHA
NUM
LOGIN %d %s 
TRY %d %s 
TEXT %d "%s"
PASSWORD %d "%s" "%s" 
SUPER  %s
LOGOUT  
POST  "%s" 
BILLBOARD
ENDTRY
LOAD ERROR
SAVE ERROR
HELP ERROR bad number of arguments
HELP ERROR unknown command
GETPASSWORD ERROR bad area:  
GETPASSWORD ERROR bad number of arguments:  
GETPASSWORD ERROR no arguments
GETPASSWORD ERROR no superuser status
WHO ERROR nick  not known
WHO ERROR bad number of arguments
WHO ERROR no superuser status
NEWS ERROR no arguments
TEXT ERROR no permission
TEXT ERROR bad area
TEXT ERROR bad number of arguments:  
TEXT ERROR no arguments
LOGOUT ERROR not logged in
LOGIN ERROR bad password
LOGIN ERROR invalid password
LOGIN ERROR bad area:  
LOGIN ERROR bad number of arguments:  
LOGIN ERROR no arguments
LOGIN ERROR already logged in
SUPER ERROR bad super user password
SUPER ERROR bad number of arguments
SUPER ERROR no arguments
PASSWORD ERROR Not allowed
PASSWORD ERROR bad password
PASSWORD ERROR already changed by 
PASSWORD ERROR same password
PASSWORD ERROR Invalid password
PASSWORD ERROR Not logged in
PASSWORD ERROR bad area
PASSWORD ERROR bad number of arguments: 
PASSWORD ERROR no arguments
HELLO ERROR nickname  exists
HELLO ERROR bad number of arguments 
HELLO ERROR registered already
HELLO ERROR no arguments
ERROR command not understood
ERROR internal error; report to administrator
ERROR Please log in first
LOAD OK
SAVE OK
NEWTEXT  
CRACKED  
HELP OK
GETPASSWORD OK 
WHO OK 
NEWS OK
TEXT OK 
LOGOUT OK
LOGIN OK cracked
SUPER OK super user status
PASSWORD OK set password
QUIT OK
BYE OK
EXIT OK
HELLO OK           
000 Crack Server at  (V [experimental]) ready.
Unknown Error
PING
Courier New
format = 3
format = 2
format = 1
send
offset = %d, del = %d
lastlog.txt
MOXX
nickname
GROON
User
please enter nickname
first character of nickname must be alphabetic
login 
Connected
PasswordExpired
RetryAuthentication
CallbackSetByCaller
Interactive/PAUSED
SubEntryDisconnected
SubEntryConnected
Projected
WaitForCallback
WaitForModemReset
PrepareForCallback
ReAuthenticate
AuthAck
AuthLinkSpeed
AuthProject
AuthChangePassword
AuthCallback
AuthRetry
AuthNotify
Authenticate
AllDevicesConnected
DeviceConnected
ConnectDevice
PortOpened
OpenPort
RasGetErrorStringA
RasHangUpA
RasDialA
RasGetEntryDialParamsA
RasEnumConnectionsA
RasGetConnectStatusA
RasEnumEntriesA

With all this information, quite a few assumptions about the program can be made.

But actually in this case the best method to get an overview what cfa.exe does is to take a look into the manual...

Within the ZIP archive is a file called "h---h manual.doc". This is a manual explaining the program step by step. Also available at http://krcf.knowbotiq.net/cfa/intro.html

Now putting all this together we can make some assumptions about possible functionality in the cfa.exe.

  • The program communicates via network
  • A series of "passwords" are tested
  • There is a "crack server" to which the passwords are sent
  • It needs some kind of protocol for communication with the crack server
  • There is a chat with other participants
  • If a password is correct, a message box becomes available to post text
  • and so on...
With this in mind, it is interesting how some words in the Strings output look like commands or code fragments.

LOGIN, GETPASSWORD, PING, etc.

These strings suggest that cfa and its server counterpart communicate using some kind of protocol.
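
To get a first overview of the possible command set, the Strings output can be filtered mechanically. The sketch below assumes that strings of the form "<COMMAND> OK ..." and "<COMMAND> ERROR ..." are replies to a command of that name. That is only a guess from the look of the list, and the function name group_replies is made up.

```python
import re
from collections import defaultdict

def group_replies(strings: list[str]) -> dict[str, set[str]]:
    """Group '<COMMAND> OK ...' / '<COMMAND> ERROR ...' strings by command."""
    reply = re.compile(r"^([A-Z]+) (OK|ERROR)\b")
    commands = defaultdict(set)
    for s in strings:
        m = reply.match(s)
        if m:
            # m.group(1) is the command word, m.group(2) is OK or ERROR
            commands[m.group(1)].add(m.group(2))
    return dict(commands)
```

Fed with the list above, this yields names such as HELLO, LOGIN, PASSWORD, TEXT, SUPER and WHO, which gives a rough vocabulary to look for later.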

So maybe it is possible to provide our own server that sends commands to the cfa and interacts with it.

This would be an important step to bring the cfa back to life.

In the following, we will therefore focus on learning as much as possible about the protocol. More questions are:
  • At which particular points is input processed?
  • What input does the program expect?
  • Can we write a simple server substitute PoC to interact with the program?
To answer these questions we have to bring our analysis to the next level. Let's take a look at what cfa.exe tries to send and receive while running.

Simple Dynamic

Dynamic analysis means running the program and observing its behaviour and interactions, for example whether the program tries to read or write files on the system.

Another important behaviour is connection establishment in the network. Does the program connect to a specific host or URL on the internet? If so, which protocols are used? Is it calling a website, or is it connecting to an FTP server, for example?

Besides the raw data that is sent through the observed interfaces and over the wire, we can also learn about the program itself and its functionality. It is also very interesting which functions are not (yet) reachable, maybe because more information is needed to understand how to get to a specific code path.
A simplified setup could look like this:
      ____________________________
      | Windows                    |
      |  _____________             |
      | | cfa.exe     |            |
      | |             |---+        |
      | |             |   |        |
      | |_____________|   |        |
      |                   |        |
      |      _____________V______  |
      |     |Analysing Programs  | |
      |     | - Wireshark        | |
      |     | - Procmon          | |
      |     | - Debugger (later) | |
      |     | - etc.             | |
      |     |_____________V______| |
      |___________________|________|
                          ^
                          |
                      network
                          |
                          |
            +------------>+<-----------+
            |                          |
       _____V_____                _____V_____
      |Linux      |              |Linux      |
      |___________|              |___________|

Basically there is a Windows system running the cfa.exe and some other systems in the network to interact with. The analysis tools are installed on Windows. The following programs are used:
  • Wireshark
  • Procmon
  • x32dbg (more about this later)
More tools could for example be:
  • Regshot (compare Registry before and after the start of a program)
  • Process Explorer (Running Process relationships)
But they probably wouldn't add much value here and therefore weren't used.

The other systems shown in the network are jumping ahead a bit at this point. To find out what network connections are needed, we should first run the cfa and observe its connection attempts.

The setup shown could be implemented quickly and easily with a virtualization technology like VirtualBox.
Now it is almost time for a first test run of cfa.exe.

To have some info to analyse right away, we let Procmon run in the background. Process Monitor is an advanced monitoring tool for Windows that shows real-time file system, Registry and process/thread activity.

When starting Procmon, it shows a lot of processes. We can let it run for a few seconds and then stop the capture. None of the captured processes have anything to do with cfa.exe, because we haven't even started it yet. This is all just Windows stuff. With a right-click we can choose and exclude all processes we don't want to see.

Second, we use Wireshark to capture network traffic. Wireshark also displays some noise traffic we are not interested in, because Windows always does some surveillance shit in the background. But we can filter this traffic out later on.

Finally when Wireshark and Procmon are ready we start the cfa.exe.


Choose connect via LAN
When trying to connect to a non-existent or unreachable server, the cfa gets stuck in a loop.

In Wireshark we can use the "dns" filter (just type dns in the filter) and see that it tries to connect to io.krcf.org which is resolved to 183.181.86.71.

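If we call this IP or domain directly in the browser, it responds with a short text saying that this URL is invalid, in Japanese (according to Google Translate).
  無効なURLです。
  プログラム設定の反映待ちである可能性があります。
  しばらく時間をおいて再度アクセスをお試しください。 
  (Roughly: "This URL is invalid. The program settings may not have taken effect yet. Please wait a while and try again.")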
Doesn't look like io.krcf.org or the resolved IP still belongs to Knowbotic Research.

Nevertheless, this is an important hint for us, because now we can redirect the DNS name to our own server.

Another puzzle piece to be found in Wireshark is an attempt to connect to port 6666.


First Wireshark capture: cfa tries to connect to some host on port 6666
But let's first also take a look at the system interactions captured by Procmon.

Again, some by-catch has to be excluded, then it can be analyzed what exactly cfa.exe did. A lot is happening with the Windows Registry but also some interaction with the filesystem. For example the program tries to open a file named "hh.ctrl" and a file named "hh.info".

The "hh.info" file can be found at "C:\Users\yawe\AppData\Local\VirtualStore\Program Files\cfa\" and contains the following configuration:
[INFO]
UID=TJ39EDM43U
USERNAME_PROVIDER=test
TEL=test
REMEMBER_PASS=FALSE
LAN=TRUE
HHUSER=defaultuser
HHNICK=yawe
CRACKLEFT4=250000
CRACKRIGHT4=750000
CRACKLEFT1=283918
CRACKRIGHT1=750000
CRACKLEFT2=123456
CRACKRIGHT2=163390
CRACKLEFT3=260358
CRACKRIGHT3=750000
NO_LEGAL_INFO=TRUE
If we want to exploit bugs in the program this may be an insertion point. But let's for now focus again on the networking aspects.
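
The hh.info file shown above has the shape of a classic INI file, so it can be read with Python's configparser. This is a small sketch for inspecting (or later modifying) the values. What the CRACKLEFT<n>/CRACKRIGHT<n> numbers mean is not known at this point; treating them as one pair of numbers per area is purely my assumption, and both function names are made up.

```python
import configparser

def load_hh_info(text: str) -> dict[str, str]:
    """Parse the contents of hh.info and return the [INFO] section as a dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names upper-case as in the file
    parser.read_string(text)
    return dict(parser["INFO"])

def crack_ranges(info: dict[str, str]) -> dict[int, tuple[int, int]]:
    """Collect the CRACKLEFT<n>/CRACKRIGHT<n> pairs per area number n."""
    ranges = {}
    for key, value in info.items():
        if key.startswith("CRACKLEFT"):
            area = int(key[len("CRACKLEFT"):])
            ranges[area] = (int(value), int(info[f"CRACKRIGHT{area}"]))
    return ranges
```

For the file above, crack_ranges would return four entries, one for each of the areas 1 to 4.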
Similar to Linux, there is a "hosts" file on Windows. Domain names and corresponding IP addresses can be written into this file manually. Before the system performs a DNS lookup, it checks whether there is an entry in the "hosts" file. If it finds the name to be looked up in there, it takes the IP address from the file and connects to it instead of the "real" one from a DNS server on the internet.

On Windows the file can be found here: C:\Windows\System32\drivers\etc\hosts (the file has no extension)

To edit it, admin privileges are required.

Now we can add IPs from our local network and let the needed domains point to them.

HINT: The second host below can also be found with Wireshark once the server at "io.krcf.org" is reachable (meaning a port is open, e.g. a netcat listener is running on 6666). The cfa then tries to reach "h---h.de" on port 6667.
The content of our "hosts" file now should look like the following. Both domain names point to a VM with the IP 192.168.0.130 in our local network.
# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

192.168.0.130	io.krcf.org
192.168.0.130	h---h.de

# localhost name resolution is handled within DNS itself.
#	127.0.0.1       localhost
#	::1             localhost
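
To make the format of this file concrete, here is a minimal parser sketch. It only illustrates how the entries above are read (comments stripped, first column is the IP, the remaining columns are names) and is not how Windows implements the lookup. The function name parse_hosts and the first-entry-wins rule are assumptions of mine.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name found in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # assumption: if a name appears twice, keep the first mapping
            table.setdefault(name.lower(), ip)
    return table
```

With the file above, parse_hosts returns exactly two entries, io.krcf.org and h---h.de, both pointing to 192.168.0.130. Whether the redirect is really active can then be checked on the Windows machine with ping or with Python's socket.gethostbyname.
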
On the server where the traffic should be received, we start a simple netcat listener for a quick first test. The needed ports are 6666 and 6667.

With this setup we start the cfa one more time. If we are fortunate, the program already reacts differently when we send back some data after receiving the connection.

Indeed! We enter the interface of the cfa with four windows. There we can click and open the chat window. Let's just write something in a few places.
$ nc -lvnp 6666
Listening on 0.0.0.0 6666
Connection received on 192.168.0.227 61283
HELLO "defaultuser" "yawe" "TJ39M43U" "2" 
TRY 1 9000 
ENDTRY
LOGIN 1 asd 
BILLBOARD
POST  "asdasdasd" 
BILLBOARD
PING
Data is received at our server. Part of it is the strings written in the cfa, other parts belong to the protocol. What it all means we will hopefully find out soon.

In the cfa interface, however, all windows still show the word "offline". Just listening on a port does not seem to be sufficient. The cfa needs a valid answer to its requests.
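
The lines captured on port 6666 already show a pattern: a command word followed by arguments, some of them in double quotes. A small sketch for splitting such a line could look like this. It assumes the quoting works like shell-style double quotes, which the capture suggests but does not prove, and the function name parse_crack_line is invented.

```python
import shlex

def parse_crack_line(line: str) -> tuple[str, list[str]]:
    """Split one protocol line into (command, arguments)."""
    tokens = shlex.split(line)  # handles the double-quoted arguments
    if not tokens:
        raise ValueError("empty line")
    return tokens[0], tokens[1:]
```

For the first captured line this gives the command HELLO with four arguments, and ENDTRY comes back with an empty argument list.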

Also at port 6667 some data was received and we can see the protocol strings sent.
$ nc -lvnp 6667
Listening on 0.0.0.0 6667
Connection received on 192.168.0.227 61285
NICK yawe
USER defuser nohost h---h.de :hhCrackUser
PART #1
QUIT 
NICK, USER, PART, QUIT

Searching for these strings in our favourite online search engine reveals that this is almost certainly IRC.

At https://datatracker.ietf.org/doc/html/rfc1459 it is easy to search for "Command: " using the full text search (Ctrl-f), which quickly reveals the possible IRC commands. Comparing these with the strings received from the cfa shows some overlap.

This means we already know one protocol the cfa uses. The program implements the IRC protocol at least partially, and it should be possible to chat with others this way.
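
To read the captured IRC lines more systematically, they can be split according to the message format in RFC 1459: an optional prefix starting with ":", a command, and parameters, where a final parameter introduced by " :" may contain spaces. The following is a simplified sketch of that format, not a complete IRC parser, and the name parse_irc is made up.

```python
from typing import Optional

def parse_irc(line: str) -> tuple[Optional[str], str, list[str]]:
    """Split a raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # the sender prefix runs up to the first space
        prefix, line = line[1:].split(" ", 1)
    # everything after " :" is one trailing parameter that may contain spaces
    if " :" in line:
        line, trailing = line.split(" :", 1)
        params = line.split() + [trailing]
    else:
        params = line.split()
    command = params.pop(0)
    return prefix, command.upper(), params
```

Applied to the USER line above, this returns the command USER with the parameters defuser, nohost, h---h.de and hhCrackUser.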

Knowing this, we can install our own IRC server and try to chat with the cfa.

One of the first search results for setting up an IRC server was this: https://ubuntu.com/tutorials/irc-server#1-overview. There is no particular reason why I chose this tutorial; you could use any other.

Installing inspircd we have to make a few configuration changes which are documented here

And after a little bit of trial and error (for example the server's hostname has to be "h---h.de") we can actually chat with the cfa. The following is a quick proof of concept:


A simple chat with netcat, IRC Server and cfa.exe
There is another interaction that can be found in the cfa. With a click on the X in the bottom right corner, the cfa opens a web browser and tries to load the manual webpage. This is the request it sent:
GET /intro.html HTTP/1.1
Accept: image/gif, image/jpeg, image/pjpeg, application/x-ms-application, application/xaml+xml, application/x-ms-xbap, */*
Accept-Language: de-DE,de;q=0.5
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729)
Host: h---h.de
Connection: Keep-Alive
We can also redirect this to our own web server as mentioned before and host intro.html to display our own page in the cfa. Unfortunately it looks like the cfa is using the system browser for this, meaning it is probably not easily exploitable (like webstalker).
Now the main thing that is still missing is the so-called "crack server", which seems to be able to unlock most of the functionality in the cfa.

First Server PoC

Communication already works via the IRC server. However, we haven't gotten far with deciphering the other protocol, the one for the crack server. So let's start writing a simple Python script that listens on the port and responds with some test data. With a little trick we can work on part of the server without having to stop and restart it all the time: we simply import a second file as a module (File 2: myfuzz.py), in which we define different responses to test manually, and reload it on every request.

Don't take this PoC code too seriously please... :)
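import socket
import importlib
import myfuzz  # File 2 below; reloaded on every request

host = '0.0.0.0'
port = 6666

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# allow quick restarts without waiting for the old socket to time out
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind((host, port))
s.listen(2)
c, addr = s.accept()  # serve a single client: the cfa

while True:
    # errors='replace' keeps the loop alive if the cfa sends non-UTF-8 bytes
    data = c.recv(1024).decode('utf-8', errors='replace')
    if not data:
        break
    print("from connected user: " + data)

    # pick up edits to myfuzz.py without restarting the server
    importlib.reload(myfuzz)

    data = myfuzz.test(data)

    print("sending: " + data)
    c.sendall(data.encode('utf-8'))

c.close()
s.close()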


########## File 2: myfuzz.py ##########

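def test(data):
    # Checked in order; the first keyword found in the request decides the reply.
    if "PING" in data:
        return "PONG\r\n"

    if "HELLO" in data:
        return "HELLO OK\r\n"

    # note: "TRY" also matches "ENDTRY"
    if "TRY" in data:
        return "CRACKED\r\n"

    if "LOGIN" in data:
        return "CRACKED\r\n"

    return "TEST\r\n"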

As a test we send back more or less randomly chosen strings from the Strings output of our first static analysis. For example, we can answer with: HELLO, OK, LOGIN, LOAD, Connected etc.

But somehow the cfa doesn't really seem to care what strings we send in a response...

We therefore need to take another step in our analysis and understand the inner workings of the program.
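
A side note on the PoC server: TCP is a byte stream, so one recv() call may return half a command or several commands at once. This is probably not the reason the cfa ignores our replies, but it is worth ruling out. Here is a minimal sketch of a line buffer, assuming the cfa ends every command with a newline (the netcat output, with one command per line, suggests that). The class name LineBuffer is made up.

```python
class LineBuffer:
    """Collect raw socket chunks and hand out complete lines."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, chunk: bytes) -> list[str]:
        """Add a chunk; return the lines it completed (without line endings)."""
        self._pending += chunk
        # everything before the last newline is complete, the rest is kept
        *lines, self._pending = self._pending.split(b"\n")
        # latin-1 maps every byte to a character, so decoding never fails
        return [line.rstrip(b"\r").decode("latin-1") for line in lines]
```

In the server loop one would pass the result of c.recv(1024) to feed() and call myfuzz.test() once per returned line.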

Advanced Static

Before we get into the assembly confusion, it helps to summarize once more at a high level what we already know. That way we can better understand and guess which code paths we should go down and what they are for.

The following image is a flowchart with the most important functions we know so far. Some pictures are from the online manual. The interactions with the right image, where the crack server is online, are "educated guesses".



Flow Graph of the cfa menus and functions
So far all analysis techniques are like looking at something from the outside. During the advanced analysis we will cut open the application as in an autopsy. This will give us a deeper understanding of the program's inner workings.

This type of reverse engineering requires very specialized knowledge. With little prior knowledge, it is not easy to decipher the code. But with a little patience it is possible, especially by reading everything you do not understand again and again. That's why I wrote a little x86 cheat sheet, for example, where I can quickly look up the basics.

In the end, it is absolutely not necessary to understand every single instruction. The goal is mainly to understand the connections and structures and to get a general impression of the functionalities and possibilities.

Several tools can be used for this. Some may prefer to use IDA Pro, but that's too expensive for me.

Let's start by opening cfa.exe in Binary Ninja.



The Binary Ninja interface
One of the most important things about this type of reverse engineering is to understand the functions involved. The problem here is that the original names of the functions have unfortunately not survived. With meaningful function names the whole thing would be much easier, because you would already know what a function is about when you look at the assembly code and could make assumptions more easily. But unfortunately we are not so lucky here. Anyway.

The functions that Binary Ninja recognizes are all named after the memory address where they start.
sub_42d96e
sub_42dada
sub_42daf0
sub_42db06
[...]
etc.

Binary Ninja finds a _start function, but I'm not sure if that is the main function or if the main function is called there. You should probably know that when you reverse engineer... but who cares. By the way if anyone ever reads this and knows a decent Binary Ninja manual, please let me know.

00426245  _start:

With small binaries you would normally start at main and follow the code execution from there. Examining the called functions will eventually lead to interesting code. Unfortunately, I don't really know where to start with this rather extensive program.

Theoretically, we could also go through all the functions one by one, but that would be a real Sisyphean task and would take forever. In addition, we do not know whether all recognized functions are actually called and needed.

It would also be possible to browse through the functions at random and look for particularly large ones, because very large functions are likely to contain interesting functionality. Rather small functions are probably wrappers or library functions that the high-level language uses by default. However, since there are simply too many functions, this is not really practical either.

The question is, what are we trying to accomplish with reversing? The goal at the moment is to understand the custom protocol. Maybe we could search for functions that interact via the network.

That means we could look at where exactly the program calls functions from a library like WSOCK32.

Binary Ninja automatically names the functions that are imported from the standard Windows libraries. If we scroll down to the bottom of the symbols window, we will see these functions. Another helpful feature of Binary Ninja is that it automatically detects where a function is called and displays those cross references.

So let's see if WSOCK32 is called somewhere. Wait, there are quite a few of them?

00445548  int32_t (* const WSOCK32:Ordinal_WSOCK32_4)() = 0x80000004
0044554c  int32_t (* const WSOCK32:Ordinal_WSOCK32_17)() = 0x80000011
00445550  int32_t (* const WSOCK32:Ordinal_WSOCK32_23)() = 0x80000017
00445554  int32_t (* const WSOCK32:Ordinal_WSOCK32_12)() = 0x8000000c
00445558  int32_t (* const WSOCK32:Ordinal_WSOCK32_101)() = 0x80000065
0044555c  int32_t (* const WSOCK32:Ordinal_WSOCK32_19)() = 0x80000013
00445560  int32_t (* const WSOCK32:Ordinal_WSOCK32_16)() = 0x80000010
00445564  int32_t (* const WSOCK32:Ordinal_WSOCK32_52)() = 0x80000034
00445568  int32_t (* const WSOCK32:Ordinal_WSOCK32_3)() = 0x80000003
0044556c  int32_t (* const WSOCK32:Ordinal_WSOCK32_8)() = 0x80000008
00445570  int32_t (* const WSOCK32:Ordinal_WSOCK32_20)() = 0x80000014
00445574  int32_t (* const WSOCK32:Ordinal_WSOCK32_10)() = 0x8000000a
00445578  int32_t (* const WSOCK32:Ordinal_WSOCK32_1)() = 0x80000001
0044557c  int32_t (* const WSOCK32:Ordinal_WSOCK32_112)() = 0x80000070
00445580  int32_t (* const WSOCK32:Ordinal_WSOCK32_115)() = 0x80000073
00445584  int32_t (* const WSOCK32:Ordinal_WSOCK32_116)() = 0x80000074
00445588  int32_t (* const WSOCK32:Ordinal_WSOCK32_13)() = 0x8000000d
0044558c  int32_t (* const WSOCK32:Ordinal_WSOCK32_111)() = 0x8000006f
00445590  int32_t (* const WSOCK32:Ordinal_WSOCK32_9)() = 0x80000009
00445594  int32_t (* const WSOCK32:Ordinal_WSOCK32_2)() = 0x80000002

When going through it, you notice that Ordinal_WSOCK32_112 and Ordinal_WSOCK32_111 are called by various other functions. Unfortunately, this does not really help either.
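A side note on the ordinal names: an import by ordinal carries the ordinal number in its low 16 bits, so the list above can be turned back into function names with a lookup table. The following is a minimal sketch. The table assumes the standard Winsock 1.1 export ordinals and should be checked against the wsock32.dll on your own system, and the helper names are made up for illustration. If the assumption holds, 111 and 112 would be WSAGetLastError and WSASetLastError, which would explain why they are referenced from so many places, while 19 and 16 would be send and recv.

```python
# ASSUMPTION: standard Winsock 1.1 export ordinals, limited to the ones
# that appear in the import list above. Verify against your own wsock32.dll.
WSOCK32_ORDINALS = {
    1: "accept", 2: "bind", 3: "closesocket", 4: "connect",
    8: "htonl", 9: "htons", 10: "inet_addr", 12: "ioctlsocket",
    13: "listen", 16: "recv", 17: "recvfrom", 19: "send",
    20: "sendto", 23: "socket", 52: "gethostbyname",
    101: "WSAAsyncSelect", 111: "WSAGetLastError",
    112: "WSASetLastError", 115: "WSAStartup", 116: "WSACleanup",
}


def ordinal_from_thunk(value: int) -> int:
    """Import-by-ordinal entries have the top bit set; the low 16 bits are the ordinal."""
    return value & 0xFFFF


def name_for(value: int) -> str:
    """Translate a thunk value such as 0x80000013 into a readable name."""
    ordinal = ordinal_from_thunk(value)
    return WSOCK32_ORDINALS.get(ordinal, f"Ordinal_{ordinal}")


for thunk in (0x80000004, 0x80000013, 0x80000010, 0x8000006F):
    print(hex(thunk), name_for(thunk))
```

With names like these, the cross references of the send and recv candidates are probably a better starting point than the two error helpers.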

Another method is to search for strings and see if they are compared somewhere or if other operations are done with them.

If you scroll to the bottom of the disassembler view in Binary Ninja, you get to a section where most of the strings are located. Here you can also click on the memory address to display where this address is referenced, i.e. where the string is processed.

Let's take 00452930.

00452930  data_452930:
00452930  48 45 4c 50 20 45 52 52 4f 52 20 62 61 64 20 6e  HELP ERROR bad n
00452940  75 6d 62 65 72 20 6f 66 20 61 72 67 75 6d 65 6e  umber of argumen
00452950  74 73 00 00                                      ts..

The cross-referencing leads to a function 00413f80 which processes a lot of the strings. Alright, what is happening here?

  [...]
{Case 0x0}
00413fd9  68142f4500         push    data_452f14 {__saved_esi_1}  {"000 Crack Server at  (V [experim…"}
00413fde  e93a020000         jmp     0x41421d

{Case 0x1}
00413fe3  68002f4500         push    data_452f00 {__saved_esi_1}  {"HELLO OK           "}
00413fe8  e930020000         jmp     0x41421d

{Case 0x2}
00413fed  68f82e4500         push    data_452ef8 {__saved_esi_1}  {"EXIT OK"}
00413ff2  e926020000         jmp     0x41421d

{Case 0x3}
00413ff7  68f02e4500         push    data_452ef0 {__saved_esi_1}  {"BYE OK"}
00413ffc  e91c020000         jmp     0x41421d
  [...]

The {Case 0x?} labels indicate that this is one huge switch statement, probably related to communication and the protocol somehow. Besides the huge switch, however, the function itself doesn't have much more functionality.

But from here we can look where this function is called and end up in 0040b740. Oh, that looks like an interesting function. Let's zoom out a bit in the graph view.



Hint: this text was written after a few hours of reversing. That is why there are my comments in the graph view, and some functions are already renamed. Just ignore that for now.

Well, that looks like a lot of functionality. And as we know there is some kind of parsing of the protocol strings in at least one place. Don't worry, of course we'll take a look at it in detail.

But first, we might find other potentially interesting functions with this technique. In other words, loop through the strings and see which functions process them. If more than one string is accessed by the same function, maybe even thematically matching somehow, we may be able to make some first guesses about what the function is actually doing.

The strings "start" and "crack control / chat area %d" lead to the function 00406570.
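The bookkeeping behind this technique can be sketched in a few lines. This is only an illustration: the (string, function address) pairs are typed in by hand from the findings in this text rather than exported from a tool, and the function name is made up.

```python
from collections import defaultdict


def group_strings_by_function(xrefs):
    """Map each function address to the strings it references.

    xrefs is an iterable of (string, function_address) pairs. Only functions
    that touch more than one string are returned, since those are the most
    promising candidates for a first guess.
    """
    groups = defaultdict(list)
    for text, func_addr in xrefs:
        groups[func_addr].append(text)
    return {addr: texts for addr, texts in groups.items() if len(texts) > 1}


# Hand-collected sample pairs from this write-up (not tool output)
sample = [
    ("HELP ERROR bad number of arguments", 0x413F80),
    ("HELLO OK           ", 0x413F80),
    ("EXIT OK", 0x413F80),
    ("BYE OK", 0x413F80),
    (" start", 0x406570),
    ("crack control / chat  area %d", 0x406570),
]

for addr, texts in sorted(group_strings_by_function(sample).items()):
    print(f"{addr:08x}: {texts}")
```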



In the graph view a long sequential series of instructions becomes visible. The code blocks repeat generally in the following schema:

In the first block a few subroutines are called. They mainly do memory allocation stuff (I didn't fully understand it, to be honest).

In the second block something happens with "EmperorEight". The string is pushed onto the stack and at the end of the block another function is called that deals with fonts. "EmperorEight" is the font displayed in the cfa.

The third block pushes the string " from:" onto the stack and calls a function right after. The called function probably does some interface things because it calls things like SetTextColor, SetBkColor, IsWindow, InvalidateRect, CreateSolidBrush, FillRect etc.

These blocks repeat. All processed strings are the following:

"from: "
"to: "
"crack: "
"GGGG"
"NNNN"
"crack control / chat  area %d"
" start"
" login"
" memo"

It is therefore highly probable that this code is related to the crack control window that can be opened in the cfa. Because it looks like this in the interface:



Some buttons in the crack control window

In Binary Ninja functions can be renamed. In a function right click on the name and then on change symbol. This makes the reversing process much easier. Because now we always know which function we are dealing with, should it be called again elsewhere. For example, I simply renamed the above function to "CRACK CONTROL / CHAT WINDOW" because that's probably what it is doing.

With this approach, some more interesting functions can be identified. Unfortunately, it is still not easy, because called subroutines are often not very easy to understand and you quickly find yourself in a completely different part of the code.

Other functions that gave more or less good hints about the functionality of the program were among others the following:

Memory Address   -   Renamed Function

0040fe00  loading config files cfa.info & hh.ctrl:
00402360  interface-window-something
00402810  MESSAGE BOARD MENU
00403510  timer-end-timer-start
00403cc0  windows system stuff - loads exes & shell
00404520  change password error??
00404920  change password
00405340  password policy
004053b0  connect to h---h.de
004056b0  IRC USER login
00408f30  connect crackserver @ io.krcf.org
0040a0b0  load area
0040a380  loading right side menue bar
0040b290  log in at Crack Server
0040b740  connect and interact with CrackServer
0040c7a0  load manual browser
0040d670  first-login-form-pass-username-provider
004100c0  SELECT AREA MENU
00410970  loading config values??
00411350  searches config files in directorys?
00411b90  log in IRC Server
00411f70  IRC_get_msg: PING/Welcome/PRIVMSG
00412280  IRC JOIN
004123e0  IRC PART
00412360  IRC PRIVMSG
004126f0  sending-strings-to-socket
004128b0  wasistdasschonwiederfüreinegeisteskrankefunktion?
00413950  LOGIN to area??
00413a30  TRY password - crack attempt
00413be0  send TEXT?
00413c50  send PASSWORD?
00413d40  LOGOUT
00413d70  POST message
00413de0  BILLBOARD
00413e10  ENDTRY
00413f80  parse-protocoll-errors
00414410  send PING:
00414440  Something with font emperoreight:
00418750  something nickname
00418b30  enter-nickname
0041ba10  agree-to-terms-and-license

Yes, that's how I renamed them. Anything is easier to remember than hex addresses.

Noticeable: the memory addresses at which the functions are located. Perhaps certain memory areas can also be assigned to specific stages in the program. This would mean functions that are close to each other in memory also belong to a specific set of functions. In this way, the many small functions could be assigned conceptually without going into more detail about each of them individually. If in doubt, of course, a function must still be examined to understand exactly how it works.
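To make this idea a bit more tangible, here is a small sketch that buckets some of the renamed functions by address range. The entries are copied from the list above. The bucket size of 0x1000 is an arbitrary choice of mine and not something the binary dictates.

```python
from collections import defaultdict

# A few (address, label) pairs copied from the list of renamed functions
RENAMED = [
    (0x411B90, "log in IRC Server"),
    (0x411F70, "IRC_get_msg: PING/Welcome/PRIVMSG"),
    (0x412280, "IRC JOIN"),
    (0x412360, "IRC PRIVMSG"),
    (0x4123E0, "IRC PART"),
    (0x413950, "LOGIN to area??"),
    (0x413A30, "TRY password - crack attempt"),
    (0x413D40, "LOGOUT"),
    (0x413D70, "POST message"),
    (0x413E10, "ENDTRY"),
]


def bucket_by_range(functions, size=0x1000):
    """Group labels by the start of the address range they fall into."""
    buckets = defaultdict(list)
    for addr, label in functions:
        buckets[addr - addr % size].append(label)
    return dict(buckets)


for base, labels in sorted(bucket_by_range(RENAMED).items()):
    print(f"{base:08x}: {labels}")
```

In this sample the IRC functions end up in the 0x411000 and 0x412000 buckets, while the crack server commands sit together in 0x413000, which fits the guess that neighbouring functions belong together.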

All the functions located significantly further back in the code are probably standard C functions. This can help to estimate how much time is worth putting into reversing a function.

Another way to find out function names is to use Ghidra. Some standard functions are recognized by Ghidra but not by Binary Ninja.

It would be cool to have a way to compare this disassembly with a disassembly where the function names are known. Maybe you could disassemble an old libc and compare the functions?

Later in the process I found out that IDA Pro can actually do just that.

There is a technology called "FLIRT" (https://hex-rays.com/products/ida/tech/flirt/: FLIRT stands for Fast Library Identification and Recognition Technology). So I downloaded an IDA Pro demo version for testing purposes. It recognizes significantly more functions. To name just one example, the function sub_426e05 turns out to be _mbsstr (locates the first occurrence of a sequence of characters, excluding the terminating ASCII NUL byte, in a string).

With certain restrictions, the same thing can also be set up in Binary Ninja. For this purpose, the plugin "nampa" can be installed. To access Binary Ninja's plugin installer, press "CTRL + SHIFT + M". Then download function signatures, for example from https://github.com/push0ebp/sig-database/tree/master/windows, and let "nampa" compare them against the binary. This gives partially broken output, but it's better than nothing and may save some time at one point or another.

Back to reversing.

Let's take a closer look at a few functions that are very similar. For example:

00413950, 00413a30, 00413be0, 00413c50



These functions probably send data to the crack server. The strings "LOGIN", "TRY", "TEXT" and "PASSWORD" indicate this.

These functions, for example, each call the same two functions after they have pushed their string onto the stack: 00434889 and 0041d3f0.



In the disassembled code of 0041d3f0 we can see:

0041d413  e8f8370100         call    Ordinal_WSOCK32_111

A call to Ordinal_WSOCK32_111. That means some socket-related sending stuff is happening here. Let's rename the function to socket_send or something.

I can't understand 00434889 in detail. Maybe we'll come back to it later. Nevertheless, we can see that, for example, a "PASSWORD %d "%s" "%s" \r\n" string is given as an argument to this function, as well as some value in EDI. My guess is that the function is something like a format string routine that inserts values in the places of the % placeholders. Let's just rename it to: processing-string-before-sending or something.

With this approach, some functions can be revealed step by step. For further analysis, it greatly helps to rename the functions and add comments in key places.

At this point, however, we will not go through every single function. Reviewing all the other interesting functions is left to the inclined reader.

However, static reverse engineering is not very satisfying in the long run, because above all it is slow. Going into each subroutine, trying to understand each loop and condition, and finally making a basic and meaningful assumption about how it works is not straightforward.

Let's add another technique.

Advanced Dynamic

Aaaaalright.

We will use the "x32dbg" debugger, which is the 32 bit version of x64dbg.

During the static analysis of the assembler code we could already gain a little insight into internal functions of the cfa. We have discovered a few potentially interesting functions and noted their memory addresses. Now we can set breakpoints at these addresses in the debugger and see what the program does at runtime.

Keep the following in mind:
  • eax is the return value of a function
  • parameters of a function are pushed to the stack before the call (possibly also passed in ECX & EDX)
    ---> Binary Ninja shows which calling convention is used
    ---> for example __stdcall or __fastcall
The goal is to understand what input data is processed and how.

Which branches are executed in the program flow?

What data is needed to change the program flow?

The first thing we want to look at in more detail is why the crack server is shown as "offline" even when we have a simple Python server running. The cfa apparently needs a valid response.

To find out more about this, we first have to decide where it makes the most sense to set breakpoints. Maybe the following functions could give us more clues:

00408f30 connect crackserver @ io.krcf.org
0040a0b0 load area
0040b290 log in at Crack Server
0040b740 connect and interact with CrackServer

We can simply attach the debugger to a running cfa process. For this, the cfa has to be running. It will wait for a user input in the first dialog.

If the program cannot be minimized we can start the task manager with right-ctrl-del and then get to x32dbg in the Windows task bar. There press Alt+A or click File > Attach and select the cfa.exe process.

The disassembly in the debugger does not show the correct memory addresses we want. It will show some code in a section at memory address 0x770000. But we can easily find our code by going to the Memory Map tab and double-clicking in the ".text" section.



Memory Map in x32dbg

There we can now scroll around and set the needed breakpoints at the relevant positions. Just click at the dot on the left side and it will mark the breakpoint in red.



Set a breakpoint at some function

In x32dbg we can also write comments in the disassembly, which, if they are on the same instruction as the breakpoint, will also be displayed in the breakpoint tab. This way we can name our breakpoints and find them again later.

While using the debugger, we should continue to have Binary Ninja and Ghidra open in the background, because we have a better overview of the code there and can note our findings.

So the "dynamic" approach is really just an extension of the static approach. Both work best in parallel.

The first breakpoint triggers at 0040b290 [log in at Crack Server]. If we do a few steps through the code we arrive at: 0040b34e call sub_4197f0

When debugging, there are two important steps you can do: step into and step over.

Stepping into means following the call and jumping to the start of the called function. This could be somewhere else in the code at a completely different memory address. We then have to step through it until the return instruction to get back to where we came from. This allows you to go through each function in the program, but it takes a long time.

Stepping over means we just step to the next instruction after the call. The called function is executed, as well as all other functions that are called within it. This can lead to interesting functions being overlooked because they are "skipped".

In the end, you may have to decide a bit by intuition whether to use step into or step over.

Anyway. If we step over sub_4197f0, a window pops up in the cfa asking for our nickname. Let's choose one or confirm the saved one.




Continuing to step, we arrive at: 0040b3fc call sub_40cc00. This makes the "connection (1/10)" window appear.




Let's move on.

Before the call at 0040b4cf some strings are placed on the stack. In this case: "io.krcf.org", "defaultuser", "yawe", "TJ39M43U".



At 004126f0 these strings are sent.

What's interesting is that an additional "HELLO" is received at the server.



If we take a look at 004126f0 in the graph view we can see the push data_452840 {var_28} {"HELLO "%s" "%s" "%s" "%s" \r\n"} before the already mentioned "process strings before sending" function. Our assumption seems correct. It joins these strings before the data is sent to the server.

push    data_452840 {var_28}  {"HELLO "%s" "%s" "%s" "%s" \r\n"}
push    edi {var_2c}
mov     byte [esp+0x28 {var_4_1}], 0x3
call    processing-string-before-sending
add     esp, 0x18
mov     ecx, esi
push    edi {var_18_2}
call    socket-sending
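In Python terms, the joining step would roughly look like the sketch below. The format string is the one from data_452840. The parameter names and their order are guesses of mine based on the four strings seen on the stack, and build_hello is a made-up name.

```python
# Format string as found at data_452840
HELLO_FORMAT = 'HELLO "%s" "%s" "%s" "%s" \r\n'


def build_hello(host, user, nickname, token):
    """Fill the four %s placeholders.

    ASSUMPTION: names and order of the parameters are guessed, only the
    format string itself comes from the binary.
    """
    return HELLO_FORMAT % (host, user, nickname, token)


line = build_hello("io.krcf.org", "defaultuser", "yawe", "TJ39M43U")
print(repr(line))
```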

"HELLO" is probably used as a command or part of the protocol that the server needs in order to know how to handle the received data.

Let's view 0040b4cf in the graph view and examine what happens after the call to send the strings.



We can see it sleeping and looping if the server can't be reached. At the following instruction it jumps all the way back and tries to connect again.

test    ebp, ebp
jne     0x40b3a7

If the server is online the program continues and the block showing "no connection!\check settings..." is skipped.
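As a rough mental model, the behaviour could be written like this. It is a sketch under assumptions: the limit of 10 attempts is only inferred from the "connection (1/10)" window, the delay is invented, and try_connect stands in for whatever the cfa really does to open the connection.

```python
import time


def connect_with_retries(try_connect, attempts=10, delay=1.0, sleep=time.sleep):
    """Call try_connect() until it succeeds or the attempts are used up.

    ASSUMPTION: attempts=10 is inferred from the "connection (1/10)" window,
    the delay value is made up.
    """
    for attempt in range(1, attempts + 1):
        print(f"connection ({attempt}/{attempts})")
        if try_connect():
            return True   # server reachable: the error block is skipped
        sleep(delay)      # sleep, then jump back and try again
    print("no connection, check settings")
    return False


# Example with a fake connect function that succeeds on the third try
state = {"calls": 0}

def fake_connect():
    state["calls"] += 1
    return state["calls"] == 3

print(connect_with_retries(fake_connect, sleep=lambda seconds: None))
```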

Following the code flow while stepping in the debugger, we return two times and land into a function that actually calls "load-area" at our next breakpoint.

A short reminder: according to our simple server Python script, the server should respond with "HELLO OK\r\n"

  if("HELLO" in data):
    return "HELLO OK\r\n"

Until now, however, this was not seen anywhere. So let's continue stepping through until we find it.

At some instruction the string "offline" suddenly appears.



When did the program decide that our server is offline? We clearly sent a response back. But maybe it is a default value and gets overwritten in the future?

The rest of "load-area" seems to be just graphical interface things, so let's skip to return at the end.

After returning we get to a function I called "loading right side menu bar". It has a loop that calls "load-area" multiple times, probably because there are four areas. Let's skip to return here too.

Now after stepping through another few functions and returns doing just interface stuff at first sight, we get to a very big function at 00436d81. It has a big decision switch case. We land in the following part of the code right after the call:

  {Case 0xb}
  0043708a  8bcf               mov     ecx, edi
  0043708c  ffd3               call    ebx
  0043708e  e988010000         jmp     0x43721b  

This probably means our interface loading function was called dynamically, because EBX is only set at runtime.

If we now step further we eventually get to the memory address space at 0x77-something instead of our known 0x4-something.

Maybe we are now in some kind of Windows library code. If we skip to the end we get directly to our next breakpoint at 0040b740. This is the already known "connect and interact with CrackServer" function we saw in the advanced static analysis.

And now, finally we can see our server response!!!



0040b75c mov eax, dword [esp+0xc6c {arg4}] loads the string into EAX. Maybe what we suspected to be a Windows library before gets our string from the socket and passes it to this function.

At 0040b790 it calls a function with our string as a parameter. Is this where the areas get unlocked? In the following we see the call and the next two instructions.

0040b790  e8db160000         call     sub_0040ce70
0040b795  85c0               test    eax, eax
0040b797  0f84850d0000       je      0x40c522

The je instruction would jump near the end of the whole function in case EAX is zero. This indicates that this is more likely a check of whether the server responded with a valid string at all.

Looking quickly into sub_0040ce70, it pushes "\x0d\x0a" to the stack and calls "_strstr", which is a C function that returns the address of a substring in a string. This means it is searching for a carriage return followed by a line feed (CRLF), which is a usual line termination in network protocols. As expected, it's a check of whether we are dealing with a valid string.
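The same check is quickly written down in Python. The first function mirrors the strstr idea. The second one is an addition of mine that shows how a buffer with several CRLF-terminated lines could be split; whether the cfa splits its input exactly like this is not confirmed.

```python
CRLF = b"\r\n"


def has_complete_line(buffer: bytes) -> bool:
    """bytes.find returns -1 when nothing is found, much like strstr returning NULL."""
    return buffer.find(CRLF) != -1


def split_lines(buffer: bytes):
    """Return (complete_lines, unfinished_rest).

    ASSUMPTION: illustrative only, not taken from the disassembly.
    """
    *lines, rest = buffer.split(CRLF)
    return lines, rest


print(has_complete_line(b"HELLO OK\r\n"))   # True
print(has_complete_line(b"HELLO OK"))       # False
print(split_lines(b"000 ready.\r\nHELLO OK\r\n"))
```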

If EAX is not zero and the function continues we get to another very big function at 004128b0. We will investigate that further in a minute.

For now we can see that it also does some checks and jumps before splitting into two main code paths.

call    sub_00413e40
mov     esi, eax

004128fe  81fe97000000       cmp     esi, 0x97
00412904  0f842d0c0000       je      0x413537
                             // 152
0041290a  81fe98000000       cmp     esi, 0x98
00412910  0f84210c0000       je      0x413537
                             // 155
00412916  81fe9b000000       cmp     esi, 0x9b
0041291c  0f84150c0000       je      0x413537
                             // 156
00412922  81fe9c000000       cmp     esi, 0x9c
00412928  0f84090c0000       je      0x413537
                             // 158
0041292e  81fe9e000000       cmp     esi, 0x9e
00412934  0f84fd0b0000       je      0x413537
                             // 100
0041293a  83fe64             cmp     esi, 0x64
0041293d  7c26               jl      0x412965
                             // 296
0041293f  81fe28010000       cmp     esi, 0x128
00412945  741e               je      0x412965

There is a function sub_00413e40 whose return value is compared to different values. We probably need to understand these numbers.

Let's set a breakpoint to come back to it later with different inputs.

In our current case the return value for the "HELLO OK" input is just 0xFFFFFFFF. This leads to a conditional jump that brings us right to the end of the function without touching its interesting looking code paths.

Back in 0040b740 (connect and interact with CrackServer), EAX is compared to 0x7d0 (2000), but in our case it is 0x3e7 (999).



We are transported to the end of the function without going through interesting functionality one more time.

Looks like our input is not right... cfa is just displaying "offline".



Server is up and running but cfa needs special treatment

At least we have already stumbled over functions that obviously process the input from our server and compare it with some values. That means we need to find out how exactly the response is processed by the cfa and send responses that match the values of the cmp instructions.

We can guess a few possible answers for the server, but if we look again in the Strings output there are not too many candidates either. One stands out at least a little bit, because the word "server" appears.

"000 Crack Server at (V [experimental]) ready."

Let's try responding with this one. After "at" we just write the hostname cfa wants: io.krcf.org. And V, if we just guess, could maybe stand for version. Let's try V0.1. We end this with \r\n and write another test string afterwards.

  if("HELLO" in data):
    return "000 Crack Server at io.krcf.org (V0.1 [experimental]) ready.\r\nHELLO OK\r\n"

After restarting the cfa and attaching x32dbg we hit the breakpoint at 004128b0. Let's now take a closer look at this monster.



At the end of the first block it calls sub_00413e40 and does its compares right after. A hint against possible confusion: in the graph view the function is already renamed to "Berechnet den Response Code" (German for "calculates the response code"). I named the function this way because I suspect it calculates a value from the server's response.

Anyway, our string "000 Crack Server at io.krcf.org (V0.1 [experimental]) ready." is passed to 00413e40 as an argument. Stepping into it, we see two interesting things.

First, it again searches for a substring and separates "000" from the rest for further processing. Then it compares against 0x3, probably to check whether it is dealing with three chars.

Second, it goes through an interesting loop, shown in the following:

00413f26  33c9               xor     ecx, ecx  {0x0}

00413f28  8a0411             mov     al, byte [ecx+edx] 
00413f2b  3c30               cmp     al, 0x30   ; 0x30 = "0"
00413f2d  7cc3               jl      0x413ef2
       
00413f2f  3c39               cmp     al, 0x39   ; 0x39 = "9"
00413f31  7fbf               jg      0x413ef2

00413f33  41                 inc     ecx
00413f34  83f903             cmp     ecx, 0x3
00413f37  7cef               jl      0x413f28

Let's go through this backwards. The instruction at the end jl 0x413f28 jumps to the start of the loop at 00413f28. Before that it increments ecx (inc ecx) and compares against 3 (cmp ecx, 0x3). Also note that before the loop starts, ECX was set to zero (xor ecx, ecx). This means it is likely something like:

  for(int i = 0; i < 3; i++ ){

  }

Now, at the start, it moves a byte from [ecx+edx] into AL and jumps out of the loop if AL is less than 0x30 or greater than 0x39. Why? At [ecx+edx] we could see 0x303030, which is equal to "000" because 0x30 is "0" in ASCII.

After some time I realized 0x30 and 0x39 both are in ASCII range and in fact they are the hex numbers of "0" and "9". This means, the function is most likely checking if the first three characters in our input are ASCII numbers between 0 and 9. A high level function may look like this pseudo code:

 
  for( int i = 0; i < 3; i++ ){

    if( str[i] < '0' || str[i] > '9' ){
       return false;
    }

  }

Could it be that the server never was intended to respond with strings like "HELLO" etc., but with numeric codes like 404 or 007 ???????

This would explain why the cfa did not react to any random strings... Maybe we can even enumerate all the functions by counting up the numbers.

But let's continue exploring for now to see where we get with "000".

After this passes all three ASCII-number checks, the function at 004261af is called.



Simply based on its clear structure, I would guess that this is a standard function optimized for exactly one small task.

I really tried to understand the function and went through it instruction by instruction, but didn't get it at first. It is shifting some bytes and stuff; I don't know enough to make sense of it. So my next approach was to set a breakpoint and go through it again later with different numbers.

That being said let's return to 004128b0.

In our current case EAX is 0x00000000 and is first compared with 0x97, 0x98, 0x9b, 0x9c and 0x9e. A match with any of these would make us jump straight to the end of this function and afterwards lead to different error messages in the parse-protocoll-errors function. But our "000" passes these checks and then, because 0 is less than 0x64, takes the jl 0x412965 after cmp esi, 0x64.

Following the code path, even more checks are passed until the function splits into two main paths. The one on the left is a bit chunkier compared to the slimmer linear path on the right. We jump to the left part where again are a few compares and branches.

Since the compares do not trigger a jump to another area, we just keep sliding to the end of the function. And finally end up back in "0040b740 connect and interact with CrackServer".

This run was a little bit better, but still didn't end up in one of the interesting branches. Nevertheless, we have made a very exciting new discovery. Because based on the assumption that the server responds with numbers, we can now investigate whether we take a different branch if we respond with different numbers.

Let's try for example "001 HELLO OK TEST".

if("HELLO" in data):
  # return "HELLO OK\r\n"
  # return "000 Crack Server at io.krcf.org (V0.1 [experimental]) ready.\r\nHELLO OK\r\n"
  return "000 Crack Server at io.krcf.org (V0.1 [experimental]) ready.\r\n001 HELLO OK TEST\r\n"

After restarting everything, we hit the breakpoint at 00413f3a in 00413e40. This was the unknown function doing something with our response numbers.

Let's just step over to get its return value.

Surprisingly, it is different this time. EAX is 0x00000001. Maybe the function is translating ASCII numbers to real numbers somehow. Let's test this really quick with a few more server responses like "002 HELLO OK TEST\r\n" or "016 HELLO OK TEST\r\n".

And it works as expected. "002" becomes 0x00000002 and with "016" as input EAX becomes 0x00000010. Therefore, we can call the function "char_to_hex" with good confidence.
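Put together, the observed behaviour of 00413e40 can be imitated with a few lines of Python. This is a sketch of what we saw from the outside, not a translation of the assembly: the function name is made up, and I assume that the code is separated from the rest by a space. The -1 corresponds to the 0xFFFFFFFF we got for "HELLO OK".

```python
def parse_response_code(line: str) -> int:
    """Return the numeric code at the start of a response line, or -1.

    ASSUMPTION: the code is the text before the first space; the checks
    (exactly three chars, each between '0' and '9') follow the observations
    made in the debugger.
    """
    token = line.split(" ", 1)[0]
    if len(token) != 3:
        return -1
    for ch in token:
        if ch < "0" or ch > "9":
            return -1
    return int(token)


for response in ("000 Crack Server", "016 HELLO OK TEST\r\n", "HELLO OK\r\n"):
    code = parse_response_code(response)
    # & 0xFFFFFFFF shows the value the way a 32 bit register would hold it
    print(repr(response), hex(code & 0xFFFFFFFF))
```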

We now also know which numbers we should respond with to reach different parts of the program.

004128fe  cmp     esi, 0x97  ---> "151"
0041290a  cmp     esi, 0x98  ---> "152"
00412916  cmp     esi, 0x9b  ---> "155"
00412922  cmp     esi, 0x9c  ---> "156"
0041292e  cmp     esi, 0x9e  ---> "158"
0041293a  cmp     esi, 0x64  ---> "100"
0041293f  cmp     esi, 0x128 ---> "296"
0041299a  cmp     esi, 0x6f  ---> "111"
004129ce  cmp     esi, 0x1   ---> "001" 
00412ee4  cmp     esi, ebx   ---> "019"
00412eec  cmp     esi, 0x7   ---> "007"
00413004  cmp     esi, 0x17  ---> "023"
004132a6  cmp     esi, 0x16  ---> "022"
004133d1  cmp     esi, 0x10  ---> "016"
00413481  cmp     esi, 0xf   ---> "015"
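The right-hand column is nothing more than the compared constant written as a zero-padded, three-digit decimal number. A tiny helper (the name is made up) saves the mental arithmetic:

```python
# Constants taken from the cmp instructions listed above
COMPARED_VALUES = [0x97, 0x98, 0x9B, 0x9C, 0x9E, 0x64, 0x128,
                   0x6F, 0x1, 0x7, 0x17, 0x16, 0x10, 0xF]


def as_wire_code(value: int) -> str:
    """Format a compared constant the way it has to appear in the response."""
    return f"{value:03d}"


for value in COMPARED_VALUES:
    print(f"cmp esi, {value:#x}  --->  {as_wire_code(value)!r}")
```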

But if we just continue our run with "016" the program crashes :(

EXCEPTION_DEBUG_INFO:
           dwFirstChance: 0
           ExceptionCode: C0000005 (EXCEPTION_ACCESS_VIOLATION)
          ExceptionFlags: 00000000
        ExceptionAddress: 00438FDF cfa.00438FDF
        NumberParameters: 2
ExceptionInformation[00]: 00000000 Read
ExceptionInformation[01]: 0002C4A3 Inaccessible Address
Last chance exception at 00438FDF (C0000005, EXCEPTION_ACCESS_VIOLATION)!

C000000D (STATUS_INVALID_PARAMETER)
STATUS_INVALID_PARAMETER: An invalid parameter was passed to a service or function.

Does this mean that the program needs not only the correct numerical code, but also the correct number of parameters?

Actually, time to consider how to fuzz this efficiently.
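A first, naive idea for such a fuzzer is to generate one response line per code and per number of parameters and feed them to the cfa one after another. The generator below is only a sketch: the filler value, the quoting and the upper limit of parameters are assumptions, and since the cfa crashes on bad input, every candidate would probably need a fresh start of the program.

```python
import itertools


def candidate_responses(codes=range(1000), max_params=3, filler="TEST"):
    """Yield response lines like '016 "TEST" "TEST"\\r\\n'.

    ASSUMPTION: parameters are quoted strings separated by spaces; the
    filler value and max_params are arbitrary choices.
    """
    for code, count in itertools.product(codes, range(max_params + 1)):
        params = " ".join(f'"{filler}"' for _ in range(count))
        yield f"{code:03d} {params}".rstrip() + "\r\n"


for line in candidate_responses(codes=[16], max_params=2):
    print(repr(line))
```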

But first I want to go through it with "001" again to maybe finally get these areas working. This means we go into the long right-hand branch of 004128b0.



When zooming out, you will notice that there are 6 similar loops that follow each other. Are the parameters processed here sequentially?

At 004129f5 EAX is &"HELLO". Considering our input "001 HELLO OK TEST\r\n", this is probably where the first parameter is parsed.

At 00412a6a "char_to_hex" is called again. Do we need another number here?

The called function at 00412a92 checks if 0x22 (the double quote character " in ASCII) is present in the string. If no 0x22 was found the loop breaks and the function ends. Since it is a loop it can be assumed that it searches at least one more time for another quotation mark. Probably it looks for one opening and one closing. As in "TEST". Let's just call it check_if_string_with_quotes.
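My understanding of check_if_string_with_quotes, expressed in Python, would be something like the following. It is a guess at the behaviour, not a translation of the assembly, and the example inputs are made up.

```python
def find_quoted(text: str, start: int = 0):
    """Find the next "..." pair at or after start.

    Returns (content, position_after_closing_quote), or None if an opening
    or closing quote is missing.
    """
    opening = text.find('"', start)
    if opening == -1:
        return None
    closing = text.find('"', opening + 1)
    if closing == -1:
        return None
    return text[opening + 1:closing], closing + 1


def quoted_parameters(text: str):
    """Collect all quoted parameters of a line, in order."""
    params, pos = [], 0
    while (found := find_quoted(text, pos)) is not None:
        value, pos = found
        params.append(value)
    return params


print(quoted_parameters('001 "HELLO" "OK" TEST'))  # the unquoted TEST is not picked up
print(quoted_parameters("001 HELLO OK TEST"))      # no quotes at all
```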

Now it somehow tries to compare the input string with "NUM", "ALPHA" and "ALPHANUM". But I am not sure.

Then comes another block with check_if_string_with_quotes.

Then another loop, after which the function splits again and either comes to its end, or goes into two more loops.

At the address data_452528, which is accessed at 00412c73 push data_452528 {var_3c}, just an "x" is stored. After that the function _mbsstr is called. Unfortunately our input breaks the code flow after _mbsstr, ending the whole thing.

So something is still wrong with our input.

With the knowledge gained, and after a while of trial and error, we could come up with the following.

'001 1 "NUM" "1x1" "HELLO" "OK" TEST'

The number at the beginning makes the code take the corresponding path. The first parameter is a number for test purposes, because there seems to be another char_to_hex conversion. Then a search is done for NUM, ALPHA or ALPHANUM. Therefore, we pass NUM as a test first. But in double quotes, because they probably play some role, too. After that a string with an x and finally some additional test data. Note the TEST without quotes at the end. Now stepping through it again most parts of the function are processed, however the TEST at the end still seems to cause problems...

But what is this????

If we click back into the cfa, we see that the areas no longer show OFFLINE but TEST.



Areas showing "TEST" !!! finallyyyyyyyyyyyyyyyy

That is a great success.

If we look closely, we can see that the first area at the top of last crack displays "HELLO", the second one displays "OK" and all the following ones display TEST.

And if we take a closer look at the two bigger loops at the end of the large function we just went through, we realize the following: it first tried to process "HELLO" and then TEST, where it failed. But if the data is valid it loops four times. And the next loop is also repeated four times before it comes to its end.

In the interface we have exactly four areas. So we can assume that it needs four times something written at "last crack" and then four times the actual area content.

Knowing this we can start to experiment with actual input :)

if "HELLO" in data:
  # Earlier attempts, kept for reference:
  #   return "HELLO OK\r\n"
  #   return "000 Crack Server at io.krcf.org (V0.1 [experimental]) ready.\r\nHELLO OK\r\n"
  # Greeting banner that is sent before the actual reply.
  ready = "000 Crack Server at io.krcf.org (V0.1 [experimental]) ready.\r\n"

  asciiart  = "#____#___##____###__#____#_______"
  asciiart += "#____#__#__#__#____#_#__#________"
  asciiart += "#####_#____#_#______###_________"
  asciiart += "#____#_#####_#______#__#________"
  asciiart += "#____#_#____#_#____#_#___#_______"
  asciiart += "#____#_#____#__###__#____#______"

  a1 = asciiart
  a2 = asciiart
  a3 = asciiart
  a4 = asciiart
  
  return ready + f"001 \"NUM\" \"1x1\" \"1\" \"2\" \"3\" \"4\" \"{a1}\" \"{a2}\" \"{a3}\" \"{a4}\"\r\n"
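The reply line above is built by hand. As a rough sketch, the same line could be assembled by a small helper that also checks for the four "last crack" values and four area contents we assumed from the two loops. The names quote and build_area_reply are made up for this example, and the defaults simply mirror the handcrafted reply.

```python
def quote(value):
    # Wrap a value in double quotes, as in the handcrafted reply.
    return '"' + value + '"'

def build_area_reply(last_cracks, areas, charset="NUM", size="1x1"):
    """Build the 001 line: four "last crack" values, then four area contents."""
    if len(last_cracks) != 4 or len(areas) != 4:
        raise ValueError("expected four 'last crack' values and four areas")
    fields = [charset, size] + list(last_cracks) + list(areas)
    return "001 " + " ".join(quote(field) for field in fields) + "\r\n"

reply = build_area_reply(["1", "2", "3", "4"], ["A", "B", "C", "D"])
print(repr(reply))
```

With a helper like this, each of the four areas can get its own content instead of four copies of the same ASCII art.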


hack hack

We just unlocked the main part of the cfa!

With our gained knowledge we could do this again and again until we unlock all features of the cfa and extend our server script accordingly. There are a few other interesting functionalities like all of the actual cracking stuff or posting a billboard message etc.

But let's take this reversing thing to the next level once more, shall we?

The next technique I will show here should be used extremely carefully.

The technique we will see in a second is (if all goes well and it succeeds) one of the most powerful and efficient techniques that ever existed.

If you like, it's basically an unhackable, military grade, blockchain, AI, Cyber, advanced hell of a technique.

I simply name it: "contact the developer and ask for source code" 🤯🤯🤯

CRACKSERVERHH

After a ton of reversing (shown above) it really became a bit tedious and a bit of a pain in the ass to single step through every piece of the cfa. So I thought: what if I just had the source code? That would probably improve my learning curve a lot. Or at least some kind of working server to get all of the missing parts at a network protocol level.

And so I just tried my luck and sent an email to knowbotic research with exactly these questions.

And I was lucky. Alexander Tuchaček answered my email and we arranged a video call. During the call, Alex wasn't sure if the 20-year-old code still existed, but he wanted to take a look at an old hard drive.

And indeed. After our conversation, another email arrived with a ZIP file. This contained the complete source code of the server! How cool is that?

And now follows a short manual for: RUNNING CODE FROM YEAR 2000!!!

It's been 22 years since the code was written. Getting it to work on a modern system would mean endless fiddling and might even be a completely hopeless project. Rewriting it from scratch could well be faster.

So what to do?

God bless Linux!
  • 1. Download an ancient Debian ISO
  • 2. Install it (as a VM)
  • 3. Add the archive sources (in /etc/apt/sources.list):
    deb http://archive.debian.org/debian/ sarge main non-free contrib
    deb-src http://archive.debian.org/debian/ sarge main non-free contrib
  • 4. Install Python:
    apt-get update
    apt-get install python
  • 5. Make two minor adjustments in the server code:
    start.py line 38: IP_ADDRESS = '192.168.0.31'
    start.py line 204: # os.setegid (gid)   (comment out this line)
  • 6. Run it:
    python start.py
debian:~/CrackServerHH/secret# python start.py
python start.py
/root/CrackServerHH/secret/http_server.py:13: DeprecationWarning: the regex module is deprecated; please use the re module
  import regex
/usr/lib/python2.3/FCNTL.py:7: DeprecationWarning: the FCNTL module is deprecated; please use fcntl
  DeprecationWarning)
/usr/lib/python2.3/regsub.py:15: DeprecationWarning: the regsub module is deprecated; please use re.sub()
  DeprecationWarning)
warning: Cannot do reverse lookup
info: Medusa (V0.1) started at Tue Feb 22 19:51:28 2022
        Hostname: 192.168.0.31
        Port:8080

info: FTP server started at Tue Feb 22 19:51:28 2022
        Authorizer:
        Hostname: debian
        Port: 8021
info: Monitor Server (V0.1) started on port 9999
Chat Server (V0.1) started on port 8888
Exception exceptions.AttributeError: "DbfilenameShelf instance has no attribute 'writeback'" in  ignored
Crack Server (V0.8) started at Tue Feb 22 19:51:28 2022  Hostname: debian  Port: 6666

Crack Server (V0.8) started at Tue Feb 22 19:51:28 2022
Hostname: debian Port: 6666
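
The log lists five listening ports. A quick way to check from the host machine that they are really reachable could look like the sketch below. It is meant for a modern Python 3 on the host, not for the Python 2.3 inside the VM. The port numbers are taken from the log above. The service labels and the function names are just mine.

```python
import socket

# Ports as printed in the startup log; the keys are only labels for this sketch.
SERVICES = {"http": 8080, "ftp": 8021, "monitor": 9999, "chat": 8888, "crack": 6666}

def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def probe(host, services=SERVICES):
    """Map each service label to True (reachable) or False (not reachable)."""
    return {name: is_listening(host, port) for name, port in services.items()}
```

Calling probe("192.168.0.31") returns a dictionary telling which of the services accept connections on the VM's address.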


Let's test the server real quick and send some data.

echo -e "HELLO a b c d\r\n" | nc 192.168.0.31 6666
000 Crack Server at debian (V0.8 [experimental]) ready.

Ok, another quick test sending valid data:

echo -e "HELLO a b c d\r\nLOGIN 1 123456\r\nPOST ~~~hacking.art~~~\r\nBILLBOARD\r\n" | nc 192.168.0.31 6666
000 Crack Server at debian (V0.8 [experimental]) ready.
HELLO OK 6 "NUM" "45x20" "22/02/2022 19:51" "22/02/2022 19:51" "22/02/2022 19:51" "22/02/2022 19:51" "none defined" "none defined" "none defined" "none defined"
LOGIN OK "cracked"
POST OK
~~~hacking.art~~~" "22/02 20:06:42
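
The same session can also be driven from Python instead of nc, which is handy for later experiments. This is only a minimal sketch for Python 3 on the host: run_session is a name I made up, and I assume nothing about the protocol beyond what the nc test shows, namely that commands are lines terminated by \r\n.

```python
import socket

def run_session(host, port, commands, timeout=2.0):
    """Send each command terminated by CRLF and return everything the server sent."""
    payload = "".join(command + "\r\n" for command in commands).encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)  # signal that we are done sending
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # server kept the connection open; return what arrived so far
    return b"".join(chunks).decode("ascii", errors="replace")
```

For example, run_session("192.168.0.31", 6666, ["HELLO a b c d", "LOGIN 1 123456", "POST ~~~hacking.art~~~", "BILLBOARD"]) sends the same four commands as the nc test above.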

Insane! It works like a charm! I never thought getting the server up and running would work on the first try... wow!

Now let's do this with the real cfa...

Boom! Area Cracked!

P.S. I have a lot of ideas about what to do from here. For example, modify the cfa, review & hack the server, write my own faster crack-client, fuzz and exploit the binary and so on and so forth. But those will probably be separate articles, because this one is already long enough :)