angr 入门介绍（二）_学逆向论坛|免费的CTF在线练习平台|ctf攻防训练靶场|网安夺旗竞赛系统|软件破解|病毒分析|游戏外挂|视频教程|xuenixiang.com

xuenixiang 发表于 2020-6-15 00:50:32

angr 入门介绍（二）

part 2.1　　这一篇中采用的例子不是angr_ctf 仓库里面的了，而是 Enigma 2017 Crackme 0 （writeup），在看下面的内容之前，可以先看一下这个writeup，现在我将会使用angr解决它。
Enigma 2017 Crackme0　　红色部分表示我们不感兴趣的路径，因为它们执行了wrong函数。，绿色部分才是我们感兴趣的代码路径。蓝色部分是angr开始执行分析的指令地址。与其记录下每个 call wrong 指令的地址，我们可以直接丢弃所有到达 wrong 函数的状态。然后看一下 fromhex() 函数，看看是否能排除一些不感兴趣的路径。我们也希望避免 0x8048670 这条路径，因为它让我们必须提供输入，否则关闭程序。
　　正如我们先前看到的， fromhex 函数将会根据输入返回一个不同的值，但是通过对main函数的分析可以知道，我们对能使这个函数返回0的状态感兴趣：
　　基本上，jz（0时跳转）与 je 指令相同。如果eax为0， test eax, eax 指令将会设置 eflags 的零标志位
　　，如果设置了零标志位，那么 je跟 jz 将跳转到指定的地址。我们只对设置零标志位的状态感兴趣，了解这一点后，再回到 fromhex 函数中，记录下除返回零以外的其他代码的路径。
　　这样，我们就有了所有我们认为感兴趣的和要避免的代码路径，我们来考虑一下应该把符号缓冲区放在哪？
　　从上面的截图可以看出，指向我们输入的字符串的指针在调用 fromhex() 之前被直接压到栈上，这意味着我们可以保存我们输入的字符串到任何地方，然后把该地址赋给 eax ，程序将会负责剩下的工作。接下来就是讲如何实现。
　　先回顾一下我们现在所了解的：

[*]我们将会从 0x8048692这个地址开始分析，是在调用 fromhex()函数之前的 push eax 指令。
[*]我们希望到达的地址是 0x80486d3 。
[*]不感兴趣的程序路径是：。
[*]我们知道我们输入的字符串的地址保存在 eax
　　知道这些以后，开始构建我们的脚本，先导入必要的库
import angr
import claripy

　　我们定义了main函数和必要的变量，随后定义了初始状态。
def main():
path_to_binary = "./crackme_0"
project = angr.Project(path_to_binary)

start_addr = 0x8048692 # address of "PUSH EAX" right before fromhex()
avoid_addr = # addresses we want to avoid
success_addr= 0x80486d3 # address of code block leading to "That is correct!"
initial_state = project.factory.blank_state(addr=start_addr)

　　现在就开始制作我们的符号位向量，并选择把他们保存到哪，我选择栈上的一个地址 0xffffcc80 ，这个地址可以是任意的，那不重要。我们初始化了 password_length 为32，因为从前面的分析可知程序需要32个字节的字符串，记住符号位向量长度是用bit描述的。对于符号字符串，位向量长度将是字符串的长度（32个字节）乘以8。
password_length = 32 # amount of characters that compose the string
password = claripy.BVS("password", password_length * 8) # create a symbolic bitvector
fake_password_address = 0xffffcc80 # random address in the stack where we will store our string

　　现在就把符号位向量保存到内存中并把该内存地址放入 eax ，angr用下面的方法可以很容易的做到这一点。
initial_state.memory.store(fake_password_address, password) # store symbolic bitvector to the address we specified before
initial_state.regs.eax = fake_password_address # put address of the symbolic bitvector into eax

　　之后开始仿真，让angr查找我们指定的代码路径。
simulation = project.factory.simgr(initial_state)
simulation.explore(find=success_addr, avoid=avoid_addr)

　　现在查看是否有结果，并打印它
if simulation.found:
solution_state = simulation.found

solution = solution_state.solver.eval(password, cast_to=bytes) # concretize the symbolic bitvector
print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))

else: print("[-] Bro, try harder.")

　　下面是完整的脚本
import angr
import claripy

def main():
path_to_binary = "./crackme_0"
project = angr.Project(path_to_binary)

start_addr = 0x8048692 # address of "PUSH EAX" right before fromhex()
avoid_addr = # addresses we want to avoid
success_addr = 0x80486d3 # address of code block leading to "That is correct!"
initial_state = project.factory.blank_state(addr=start_addr)

password_length = 32 # amount of characters that compose the string
password = claripy.BVS("password", password_length * 8) # create a symbolic bitvector
fake_password_address = 0xffffcc80 # random address in the stack where we will store our string

initial_state.memory.store(fake_password_address, password) # store symbolic bitvector to the address we specified before
initial_state.regs.eax = fake_password_address # put address of the symbolic bitvector into eax

simulation = project.factory.simgr(initial_state)
simulation.explore(find=success_addr, avoid=avoid_addr)

if simulation.found:
solution_state = simulation.found

solution = solution_state.solver.eval(password, cast_to=bytes) # concretize the symbolic bitvector
print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))

else: print("[-] Bro, try harder.")

if __name__ == '__main__':
main()

　　现在测试运行它
part 3　　上面我们学习了如何用angr操纵内存，下面我们将学习如何处理 malloc 函数。
05_angr_symbolic_memory　　先看一下main函数
　　看上去并不复杂，开始解析它。可以看到第一个代码块设置了栈，并调用了 scanf 函数。可以看到它接收一个格式化字符串跟依赖于格式化字符串的参数作为输入。这里使用的调用约定(cdcel) 将参数从右至左压到栈上，因此我们知道压倒栈上的最后一个参数应该是格式化字符串本身（%8s%8s%8s%8s）。
　　根据格式化字符串，我们可以推断出需要四个参数，实际上，有四个地址在格式化字符串之前压到栈上。
　　先记下这四个地址（ user_input :0xA1BA1C0 ）,现在我们知道程序用4个8字节长的字符串作为输入，我们看一下是如何操作他们的。
　　红框内是一个类似于 for 循环的开始，将变量内容与 0x1f 作比较，如果变量小于等于 0x1f，将会跳转到 0x804860A 。
　　在这个循环里，变量将会加一。这就意味着循环从0开始，到0x1f结束，将会迭代32次。这是有道理的，因为我们输入了32个字节（4个八字节字符串）。循环的时候，把我们输入的每个字节传入 complex_function 函数，现在看一下 complex_function 函数。
　　不必花费太多时间在逆向上面，可以看出这个函数对输入做了一系列变换操作，如果注意到高亮部分的代码块，你就会发现它打印了 Try Again ，并结束了进程。我们不希望这样，因此我们需要记住避免这个分支。再回到 main函数，看看循环以后发生了什么。
　　输入在经过程序操作后，会与 NJPURZPCDYEAXCSJZJMPSOMBFDDLHBVN 作比较。如果两个字符串匹配则会打印 good job ，否则打印 Try Again 。现在先概括一下我们所知道的：

[*]程序接收4个8字节字符串输入。
[*]输入保存在一下四个位置：
[*]complex_function 函数循环操作字符串。
[*]循环结束后变异后的字符串与 NJPURZPCDYEAXCSJZJMPSOMBFDDLHBVN 作比较
[*]匹配则打印 Good Job
[*]complex_function跟 main 函数都会到达路径 Try again
　　现在我们知道了足够的信息，打开 scaffold05.py 。
import angr
import claripy
import sys

def main(argv):
path_to_binary = argv
project = angr.Project(path_to_binary)

start_address = ???
initial_state = project.factory.blank_state(addr=start_address)

password0 = claripy.BVS('password0', ???)
...

password0_address = ???
initial_state.memory.store(password0_address, password0)
...

simulation = project.factory.simgr(initial_state)

def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return ???

def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return ???

simulation.explore(find=is_successful, avoid=should_abort)

if simulation.found:
solution_state = simulation.found

solution0 = solution_state.se.eval(password0,cast_to=str)
...
solution = ???

print solution
else:
raise Exception('Could not find the solution')

if __name__ == '__main__':
main(sys.argv)

　　从main函数开始改
def main():
path_to_binary = "05_angr_symbolic_memory"
project = angr.Project(path_to_binary) # (1)

start_address = 0x8048601
initial_state = project.factory.blank_state(addr=start_address) # (2)

password0 = claripy.BVS('password0', 64) # (3)
password1 = claripy.BVS('password1', 64)
password2 = claripy.BVS('password2', 64)
password3 = claripy.BVS('password3', 64)

　　在（1）建立对象，在（2）处初始化初始状态。注意我们从调用 scanf 后面的 MOV DWORD , 0x0 指令开始。之后在（3）生成4个符号位向量，注意size是64bit，因为每个字符串是8字节长的。
password0_address = 0xa1ba1c0 # (1)
initial_state.memory.store(password0_address, password0) # (2)
initial_state.memory.store(password0_address + 0x8,password1) # (3)
initial_state.memory.store(password0_address + 0x10, password2)
initial_state.memory.store(password0_address + 0x18, password3)

simulation = project.factory.simgr(initial_state) # (4)

　　我们定义了输入字符串的起始地址，第一个符号位向量将保存在这，其他3个顺序保存在后面。
def is_successful(state): # (1)
stdout_output = state.posix.dumps(sys.stdout.fileno())
if b'Good Job.\n' in stdout_output:
return True
else: return False

def should_abort(state): # (2)
stdout_output = state.posix.dumps(sys.stdout.fileno())
if b'Try again.\n' in stdout_output:
return True
else: return False

simulation.explore(find=is_successful, avoid=should_abort) # (3)

　　这里跟以前的例子一样，通过判断输出，确定是否触发了正确的路径。
if simulation.found:
solution_state = simulation.found # (1)

solution0 = solution_state.solver.eval(password0,cast_to=bytes) # (2)
solution1 = solution_state.solver.eval(password1,cast_to=bytes)
solution2 = solution_state.solver.eval(password2,cast_to=bytes)
solution3 = solution_state.solver.eval(password3,cast_to=bytes)

solution = solution0 + b" " + solution1 + b" " + solution2 + b" " + solution3 # (3)

print("[+] Success! Solution is: {}".format(solution.decode("utf-8"))) # (4)
else:
raise Exception('Could not find the solution')

　　查看是否有状态能触发我们感兴趣的路径，并将符号位向量转换为具体的字符串，然后打印
import angr
import claripy
import sys

def main():
path_to_binary = "05_angr_symbolic_memory"
project = angr.Project(path_to_binary)

start_address = 0x8048601
initial_state = project.factory.blank_state(addr=start_address)

password0 = claripy.BVS('password0', 64)
password1 = claripy.BVS('password1', 64)
password2 = claripy.BVS('password2', 64)
password3 = claripy.BVS('password3', 64)

password0_address = 0xa1ba1c0
initial_state.memory.store(password0_address, password0)
initial_state.memory.store(password0_address + 0x8,password1)
initial_state.memory.store(password0_address + 0x10, password2)
initial_state.memory.store(password0_address + 0x18, password3)

simulation = project.factory.simgr(initial_state)

def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
if b'Good Job.\n' in stdout_output:
return True
else: return False

def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
if b'Try again.\n' in stdout_output:
return True
else: return False

simulation.explore(find=is_successful, avoid=should_abort)

if simulation.found:
solution_state = simulation.found

solution0 = solution_state.solver.eval(password0,cast_to=bytes)
solution1 = solution_state.solver.eval(password1,cast_to=bytes)
solution2 = solution_state.solver.eval(password2,cast_to=bytes)
solution3 = solution_state.solver.eval(password3,cast_to=bytes)

solution = solution0 + b" " + solution1 + b" " + solution2 + b" " + solution3

print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))
else:
raise Exception('Could not find the solution')

if __name__ == '__main__':
main()

　　运行成功
06_angr_symbolic_dynamic_memory　　这个挑战跟上一个没有太大区别，知识保存字符串的内存使用 malloc 分配，而不是保存在栈上。
　　下面我们逐块分析。
　　蓝绿两个色块里面用 malloc 分配了两块内存，大小都为9个字节。并将返回的地址分别保存在 buffer0 , buffer1 ，这两个buffer地址分别是 0xABCC8A4 和 0xABCC8AC。
　　可以看到调用 scanf ，并向两个地址写入两个8字节字符串（因为还有一个 null,所以malloc申请9个字符，并初始化为0）。
　　可以看到这里与上一个题目非常像：一个局部变量保存在，然后判断它是否等于7.

[*]两个输入的字符串全是8字节。
[*]从0到7有8次迭代。
[*]　　下面的代码块每执行一次都会加1.

　　我们可以假设这里是一个 for 循环，它每次循环都会分别访问两个字符串的1个字节。，如果您仔细看前面的代码块，就会发现，每次迭代都会以作为索引字符串的第n个字节。每访问一个字节就会执行一次 complex_function 。
　　看一下 complex_function 函数：
　　进行了通常的数学变换操作。，像先前的挑战一样，有一个 Try Again 代码块。在回头看main函数，看看循环以后发生了什么。
　　在这一块，buffer0跟buffer1的内容会分别跟两个不同的字符串作比较，都一样的话，才会输出 Good Job 。在解决这个问题之前，我们先概括一下我们知道的：

[*]程序申请了两个9字节的buffer。
[*]输入两个8字节字符串。
[*]循环了8次。
[*]每次迭代都会用 complex_function 函数”加密“ 输入的两个字符串。
[*]之后分别将这两个字符串与程序内置字符串作比较。
[*]如果全相同就成功了。
　　看一下scaffold06.py 文件：
import angr
import claripy
import sys

def main(argv):
path_to_binary = argv
project = angr.Project(path_to_binary)

start_address = ???
initial_state = project.factory.blank_state(addr=start_address)

password0 = claripy.BVS('password0', ???)
...

fake_heap_address0 = ???
pointer_to_malloc_memory_address0 = ???
initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)
...

initial_state.memory.store(fake_heap_address0, password0)
...

simulation = project.factory.simgr(initial_state)

def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return ???

def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return ???

simulation.explore(find=is_successful, avoid=should_abort)

if simulation.found:
solution_state = simulation.found

solution0 = solution_state.se.eval(password0,cast_to=str)
...
solution = ???

print solution
else:
raise Exception('Could not find the solution')

if __name__ == '__main__':
main(sys.argv)

def main():
path_to_binary = "./06_angr_symbolic_dynamic_memory"
project = angr.Project(path_to_binary) # (1)

start_address = 0x8048699
initial_state = project.factory.blank_state(addr=start_address) # (2)

password0 = claripy.BVS('password0', 64) # (3)
password1 = claripy.BVS('password1', 64)

　　在（1）建立一个项目。然后我们决定从哪里开始并相应地设置一个状态（2）。注意，我们从地址0x8048699 开始，该地址指向调用scanf 之后的指令mov dword，0x0 。我们跳过了所有的malloc 函数，稍后我们将在脚本中处理它们。之后，我们初始化两个64位的符号位向量（3）。
　　下一部分：
fake_heap_address0 = 0xffffc93c # (1)
pointer_to_malloc_memory_address0 = 0xabcc8a4 # (2)
fake_heap_address1 = 0xffffc94c # (3)
pointer_to_malloc_memory_address1 = 0xabcc8ac # (4)

initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness) # (5)
initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness) # (6)

initial_state.memory.store(fake_heap_address0, password0) # (7)
initial_state.memory.store(fake_heap_address1, password1) # (8)

　　这里是关键，angr并不会真正的运行程序，所以不需要一定把内存分配到堆上，可以伪造任何地址，我们在栈上分配了两个地址，如（1）（3）所示。并将 buffer0,buffer1地址分别赋值给 pointer_malloc_memory_address0,pointer_malloc_memory_address1，如（2）（4）。之后我们告诉angr，将两个fake address分别保存到 buffer0,buffer1 ，因为程序实际执行的时候就会把 malloc返回的地址保存到这里。最后我们把符号位向量保存到伪造的地址里。下面捋一下这几个地址的关系：
BEFORE:
buffer0 -> malloc()ed address 0 -> string 0
buffer1 -> malloc()ed address 1 -> string 1

AFTER:
buffer0 -> fake address 0 -> symbolic bitvector 0
buffer1 -> fake address 1 -> symbolic bitvector 1　　我们用伪造的地址取代了malloc返回的地址，并保存符号位向量。
simulation = project.factory.simgr(initial_state) # (1)

def is_successful(state): # (2)
stdout_output = state.posix.dumps(sys.stdout.fileno())
if b'Good Job.\n' in stdout_output:
return True
else: return False

def should_abort(state): # (3)
stdout_output = state.posix.dumps(sys.stdout.fileno())
if b'Try again.\n' in stdout_output:
return True
else: return False

simulation.explore(find=is_successful, avoid=should_abort) # (4)

if simulation.found:
solution_state = simulation.found

solution0 = solution_state.solver.eval(password0, cast_to=bytes) # (5)
solution1 = solution_state.solver.eval(password1, cast_to=bytes)

print("[+] Success! Solution is: {0} {1}".format(solution0.decode('utf-8'), solution1.decode('utf-8'))) # (6)
else:
raise Exception('Could not find the solution')

　　在（1）初始化仿真，在（2）（3）像以前一样定义两个函数，用于选出我们感兴趣的路径。然后探索代码路径，如找到解，就将对应的符号位向量转换为对应的字符串，然后打印出来，下面是完整的脚本。
import angr
import claripy
import sys

def main():
path_to_binary = "./06_angr_symbolic_dynamic_memory"
project = angr.Project(path_to_binary)

start_address = 0x8048699
initial_state = project.factory.blank_state(addr=start_address)

password0 = claripy.BVS('password0', 64)
password1 = claripy.BVS('password1', 64)

fake_heap_address0 = 0xffffc93c
pointer_to_malloc_memory_address0 = 0xabcc8a4
fake_heap_address1 = 0xffffc94c
pointer_to_malloc_memory_address1 = 0xabcc8ac
initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)
initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)

initial_state.memory.store(fake_heap_address0, password0)
initial_state.memory.store(fake_heap_address1, password1)

simulation = project.factory.simgr(initial_state)

def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
if b'Good Job.\n' in stdout_output:
return True
else: return False

def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
if b'Try again.\n' in stdout_output:
return True
else: return False

simulation.explore(find=is_successful, avoid=should_abort)

if simulation.found:
solution_state = simulation.found

solution0 = solution_state.solver.eval(password0, cast_to=bytes)
solution1 = solution_state.solver.eval(password1, cast_to=bytes)

print("[+] Success! Solution is: {0} {1}".format(solution0.decode('utf-8'), solution1.decode('utf-8')))
else:
raise Exception('Could not find the solution')

if __name__ == '__main__':
main()

页: [1]

学逆向论坛's Archiver

angr 入门介绍（二）