Fixing A Rust Binary (Hack-A-Sat 2022 Once Unop a Djikstar)
文章目录
Description
The challange is a Djikstar algorithm problem. We are given 3 CSV files which contain distances between the stars, and we are asked to find a path from our ship “ShippyMcShipFace” to the destination “Honolulu”. Normal Dijkstra seems not working on this challenge, maybe the cost is not simply the sum of all distances.
However, the challange also provided us with a Rust solver for this problem, but some parts of the code are patched by NOPs, making this program unable to run. Our task there is to fix this program.
Fixes
The main logic is at function once_unop_a_dijkstar::run
. There are 5 locations replaced by NOPs, we need to figure out what the original instructions were.
Location #1 0x16BA0
The program first opens 3 CSV files to read the distance data. The first read operation is corrupted:
|
|
We should not call same rax for 2 times. Apparently, we need a lea rax to set rax to another function. We can look at the other 2 read operations, and find out that a _$LT$core..result..Result$LT$T$C$E$GT$$u20$as$u20$core..ops..try_trait..Try$GT$::branch
should be called after csv::reader::Reader$LT$csv..reader..Reader$LT$std..fs..File$GT$$GT$::from_path
.
The first patch is changing 0x16BA0
to lea rax, [0x30EA0]
.
Location #2 0x16DB7
|
|
A similar pattern of mov rax, cs:memcpy_ptr
and call rax
can be found at 0x16D9E
and 0x16F15
.
The second patch is changing 0x16DB7
to mov rdx, [rsp+0xb8]
and mov rax, [rsp+0xc0]
.
Location #3 0x18AF8
After reading 3 CSV data files, we find a crash inside a small function, the call trace is once_unop_a_dijkstar::run
-> once_unop_a_dijkstar::parse_reader_file
-> once_unop_a_dijkstar::get_target_traversal_cost
.
Inside once_unop_a_dijkstar::get_target_traversal_cost
we can see that the function epilogue is missing, so we just put add rsp 0x28
, then ret
.
Location #4 0x1721E
|
|
We set rax to some function pointer, so the next step should be call rax
, but call rax
is only 4 bytes long, there are still 2 bytes missing.
This is the most tricky point in this challenge, the argument var_1B8
is the result object of the previous call std::sync::mutex::Mutex$LT$T$GT$::lock
, which, if we look at the Rust document, is a LockResult<MutexGuard<'_, T>>
. For Result<T, E>
type in Rust, we know that Rust uses a 1 byte flag to indicate whether this Result
is T(Successful)
or E(Error)
.
An object bigger than 8 bytes but smaller than 16 bytes can be returned in rax and rdx. For our case, the actual object is in rax, and the flag byte is in dl. This way of returning a Result
object can also be seen at 0x17F5B and 0x1836B.
We place a mov [rsp+0x37], dl
after call rax
to make things work.
Location #5 0x172FE
This one requires the most imagination:
|
|
We don’t know what function to call, so I tried putting a breakpoint to see what arguments are passed to it:
We see our source and destination node in rsi and rdx. And we know all node distances are in our hashmap, and that it should return a Option<Vec<once_unop_a_dijkstar::Node>>
for the code that follows. By looking at the list of all functions in this binray, I found a group of pathfinding::directed::*
functions which seem to be the implementation of Dijkstra algorithm.
I tried to call the top level function, pathfinding::directed::dijkstra::dijkstra
, luckily, it worked. So for this place, we need a lea rax, [0x23D40]
Extracting the result
Unfortunately, there is still a big hole at 0x173CF. But we have already got the result in a vector, the code left is just printing them out. So the easiest way is to extract the result from the memory. I put a breakpoint at 0x17382, right after a unwrap
which gives us the Vec<once_unop_a_dijkstar::Node>
. We can see that a 6-element vector is in rdi after call rax
:
|
|
Conclusion
A deep, solid unstanding of Rust ABI is required for this challenge, especially for location #4. I want to thank Jay Bosamiya for giving me ideas on location #4 and #5.
C and C++ binaries plant stereotype of assembly into our brain. When we are faced with Rust and Go binaries, which have different memory layout, data structure, runtime environment and calling convention, we start to question about the essence of an architecture and its ABI. Only after this exploration can a reverse-engineer enter his next stage of understanding computers.