banner
CedricXu

CedricXu

计科学生 / 摄影爱好者

[CSAPP]BombLab

Let's defuse the bomb together!

It has been two semesters since the last Data Lab, and I finally remembered that I have time to do the CSAPP lab. This lab is called Bomb Lab, and the story background is that Dr. Evil designed a bomb that can only be defused by correctly entering six strings. We need to reverse engineer and disassemble the bomb's executable file to figure out the six strings that can defuse the bomb.

Lab Overview#

Lab materials can be found on the CMU website: https://csapp.cs.cmu.edu/3e/labs.html

There are a total of three files in the compressed package:

.
├── README    
├── bomb      // Bomb executable file
└── bomb.c    // Source code for the bomb

Let's first take a look at the bomb.c file.

hkybb

From this, we can see the general working process of the bomb:

  • Read a line of string
  • Use the phase_i function to determine if the input is valid

So, by studying the verification process of each phase function, we can deduce the string content.

Use objdump -d bomb > bomb.s to obtain the disassembled file of the bomb.

4hxfc

Use tmux on the left side for GDB debugging and view the assembly code on the right side.

Next, let's start defusing the bomb!

Phase_1#

The assembly code for phase_1 is as follows:

2wf9q

string_not_equal takes two string pointer inputs, edi and esi, compares the two strings, and if they are equal, eax outputs 0.

So the purpose of phase_1 is to compare the input string with the string at 0x402400. Then it uses test to check if the output is 0; if it is zero, it passes; otherwise, it calls explode_bomb.

Thus, our answer is the string at 0x404200. Let's check it using GDB.

aul0o

For information on using GDB, you can refer to this CMU manual: https://csapp.cs.cmu.edu/2e/docs/gdbnotes-x86-64.pdf

Tips: Dr. Evil allows you to use file input: ./bomb <file>, to avoid the tedious repetitive input every time you try to defuse the bomb.

Let's try to defuse the first bomb. First, write the string we just obtained in answer.txt.

sx06g

The first string has been successfully entered!

ud239

Phase_2#

The assembly code for phase_2 is as follows:

4st2b

This function is significantly longer than the previous one and also calls read_six_numbers, with many branching jump instructions.

  400efe:	48 83 ec 28          	sub    $0x28,%rsp
  400f02:	48 89 e6             	mov    %rsp,%rsi
  400f05:	e8 52 05 00 00       	call   40145c <read_six_numbers>

It allocates 0x28 space on the stack and stores six numbers in the stack.

  400f0a:	83 3c 24 01          	cmpl   $0x1,(%rsp)
  400f0e:	74 20                	je     400f30 <phase_2+0x34>
  400f10:	e8 25 05 00 00       	call   40143a <explode_bomb>

If the first number is not 1, the bomb explodes.

  400f17:	8b 43 fc             	mov    -0x4(%rbx),%eax
  400f1a:	01 c0                	add    %eax,%eax
  400f1c:	39 03                	cmp    %eax,(%rbx)
  400f1e:	74 05                	je     400f25 <phase_2+0x29>
  400f20:	e8 15 05 00 00       	call   40143a <explode_bomb>

If the next number is not double the current number, the bomb explodes.

  400f25:	48 83 c3 04          	add    $0x4,%rbx
  400f29:	48 39 eb             	cmp    %rbp,%rbx
  400f2c:	75 e9                	jne    400f17 <phase_2+0x1b>
  400f2e:	eb 0c                	jmp    400f3c <phase_2+0x40>

rbx points to the position after the last number. If it has reached this point, it means all six numbers have been processed, and the judgment ends.

From the above analysis, the second string needs to meet the following requirements:

  • Composed of 6 numbers
  • The first number is 1
  • Each number from the second one onwards is double the previous one

So the string is 1 2 4 8 16 32.

Phase_3#

The assembly code for phase_3 is as follows:

h3g6b

We can see that there are many regular jump statements, and there is an sscanf at the beginning. The 0x4025cf that appears earlier should store the format being read. Let's print it out using GDB.

h2i7l

We can see that the string we input should be two numbers.

  400f6a:	83 7c 24 08 07       	cmpl   $0x7,0x8(%rsp)
  400f6f:	77 3c                	ja     400fad <phase_3+0x6a>

0x8(%rsp) is the first number. If it is greater than 7, it jumps to bomb explosion.

  400f71:	8b 44 24 08          	mov    0x8(%rsp),%eax
  400f75:	ff 24 c5 70 24 40 00 	jmp    *0x402470(,%rax,8)
  400f7c:	b8 cf 00 00 00       	mov    $0xcf,%eax
  400f81:	eb 3b                	jmp    400fbe <phase_3+0x7b>
  400f83:	b8 c3 02 00 00       	mov    $0x2c3,%eax
  400f88:	eb 34                	jmp    400fbe <phase_3+0x7b>
  400f8a:	b8 00 01 00 00       	mov    $0x100,%eax
  400f8f:	eb 2d                	jmp    400fbe <phase_3+0x7b>
  400f91:	b8 85 01 00 00       	mov    $0x185,%eax
  400f96:	eb 26                	jmp    400fbe <phase_3+0x7b>
  400f98:	b8 ce 00 00 00       	mov    $0xce,%eax
  400f9d:	eb 1f                	jmp    400fbe <phase_3+0x7b>
  400f9f:	b8 aa 02 00 00       	mov    $0x2aa,%eax
  400fa4:	eb 18                	jmp    400fbe <phase_3+0x7b>
  400fa6:	b8 47 01 00 00       	mov    $0x147,%eax
  400fab:	eb 11                	jmp    400fbe <phase_3+0x7b>

Based on the first number, it selects the second number, similar to a switch statement. Assuming the first number is 0, it jumps to *0x02470. Let's check what is stored there using GDB.

qv677

It turns out to jump to 0x400f7c, where %eax=0xcf, which is 207.

  400fbe:	3b 44 24 0c          	cmp    0xc(%rsp),%eax
  400fc2:	74 05                	je     400fc9 <phase_3+0x86>
  400fc4:	e8 71 04 00 00       	call   40143a <explode_bomb>

It checks whether the second number is equal to %eax. If not, the bomb explodes.

So one form of this string is 0 207. Depending on the first number, the second number can have other different values.

Phase_4#

The assembly code for phase_4 is as follows:

4gwzt

We can see that besides calling sscanf, it also calls the function func4.

As usual, let's first take a look at the format received by sscanf.

yis7d

We can see that there are still two numbers.

  40102e:	83 7c 24 08 0e       	cmpl   $0xe,0x8(%rsp)
  401033:	76 05                	jbe    40103a <phase_4+0x2e>
  401035:	e8 00 04 00 00       	call   40143a <explode_bomb>

And the first number must be less than or equal to 14 (0xe).

  40103a:	ba 0e 00 00 00       	mov    $0xe,%edx
  40103f:	be 00 00 00 00       	mov    $0x0,%esi
  401044:	8b 7c 24 08          	mov    0x8(%rsp),%edi
  401048:	e8 81 ff ff ff       	call   400fce <func4>

Pass 14, 0, and the first number as parameters to func4. Next, let's analyze what happens in func4.

jp6wb

There are two branch jumps and two recursive calls. We suspect it is a program similar to binary search. Let's try to write it in C format.

int func4(int i, int j, int target) {
    int mid = ((i - j + ((i - j) >> 31)) >> 1) + j;
    if (mid == target) {
        return 0;
    } else if (mid < target) {
        return 2 * func4(i, mid + 1, target) + 1;
    } else {
        return 2 * func4(mid - 1, j, target);
    }
}

To make the output 0, we just need to keep searching to the left without ever finding mid on the right side. The input is i=14, j=0, so the possible values for the first number are 7, 3, 1, 0.

  401051:	83 7c 24 0c 00       	cmpl   $0x0,0xc(%rsp)
  401056:	74 05                	je     40105d <phase_4+0x51>
  401058:	e8 dd 03 00 00       	call   40143a <explode_bomb>

Finally, it checks whether the second number is zero. So this string can take the form 3 0.

Phase_5#

The assembly code for phase_5 is as follows:

8dfcd

We see two string constants 0x4024b0 and 0x40245e. Let's check what they are using GDB.

17fa2

"maduiersnfotvbylSo you think you can stop the bomb with ctrl-c, do you"

"flyers"

I speculate that the first string is a dictionary since there is an offset 0x4024b0(%rdx) when used; the second string is the target string, which is compared with the user input string after being translated through the dictionary using <string_not_equal>.

Based on this assumption, let's carefully observe the translation process through the dictionary:

  40108b:	0f b6 0c 03          	movzbl (%rbx,%rax,1),%ecx
  40108f:	88 0c 24             	mov    %cl,(%rsp)
  401092:	48 8b 14 24          	mov    (%rsp),%rdx
  401096:	83 e2 0f             	and    $0xf,%edx
  401099:	0f b6 92 b0 24 40 00 	movzbl 0x4024b0(%rdx),%edx

Here, %cl is the low 8 bits of the %ecx register, and taking it with 0xf means taking the low four bits, so when translating, we take the low four bits of the character byte as the offset!

Thus, "flyers" corresponds to the dictionary offsets "9 15 14 5 6 7". We can write a Python script to generate some valid strings.

spell = [9, 15, 14, 5, 6, 7]

for offset in range(8):
    result = "".join(chr(num + 16 * offset) for num in spell)
    print(f"Offset {offset * 16} : {result}")

The output is:

ryghc

So the fifth string can be IONEFG. It seems the guess was correct, and we arrived at the answer without carefully reading the assembly code line by line.

Phase_6#

This function is significantly longer than the previous ones. Here, I will present the most important part of the code.

upap4

I will briefly explain what this code does without going into detailed analysis. Interested readers can download the source code to check it out.

In memory, there is a linked list.

rcr1i

First, read in 6 numbers, each ranging from 1 to 6.

Then take the complement of these 6 numbers with respect to 7, i.e., i=7-i.

Using the number i as an index, find the i-th node of the linked list and push the pointer to this node onto the stack.

By rewriting each node's nextnode pointer according to the order in the stack, the linked list nodes' order is rearranged to match the stack order.

Finally, check whether the data in the linked list nodes is in descending order.

To make the numbers in the nodes sorted in descending order, the order should be 3 4 5 6 1 2. After taking the complement with respect to 7, it becomes 4 3 2 1 6 5, which is the string we need.

Bomb Defused Successfully#

Thus, we have obtained all six strings needed to defuse the bomb and written them into answer.txt.

h0218

Running ./bomb answer.txt, we get:

ed5zn

Although this lab is at a hello world level for true reverse engineering, it familiarized me with x86-64 assembly and basic GDB commands.

Overall, this was a very interesting experiment, and I couldn't stop!

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.