A blog for Confluence group.

Saturday, 29 August 2020

Reversing Rust String And Str Datatypes

Lets build an app that uses several data-types in order to see how is stored from a low level perspective.

Rust string data-types

The two first main objects are "str" and String, lets check also the constructors.




Imports and functions

Even such a basic program links several libraries and occupy 2,568Kb,  it's really not using the imports and expots the runtime functions even the main. 


Even a simple string operation needs 544 functions on rust:


Main function

If you expected see a clear main function I regret to say that rust doesn't seem a real low-level language In spite of having a full control of the memory.


Ghidra turns crazy when tries to do the recursive parsing of the rust code, and finally we have the libc _start function, the endless loop after main is the way Ghidra decompiles the HLT instruction.


If we jump to main, we see a function call, the first parameter is rust_main as I named it below:



If we search "hello world" on the Defined Strings sections, matches at the end of a large string


After doing "clear code bytes" we can see the string and the reference:


We can see that the literal is stored in an non null terminated string, or most likely an array of bytes. we have a bunch of byte arrays and pointed from the code to the beginning.
Let's follow the ref.  [ctrl]+[shift]+[f] and we got the references that points to the rust main function.


After several naming thanks to the Ghidra comments that identify the rust runtime functions, the rust main looks more understandable.
See below the ref to "hello world" that is passed to the string allocated hard-coding the size, because is non-null terminated string and there is no way to size this, this also helps to the rust performance, and avoid the c/c++ problems when you forgot the write the null byte for example miscalculating the size on a memcpy.


Regarding the string object, the allocator internals will reveal the structure in static.
alloc_string function call a function that calls a function that calls a function and so on, so this is the stack (also on static using the Ghidra code comments)

1. _$LT$alloc..string..String$u20$as$u20$core..convert..From$LT$$RF$str$GT$$GT$::from::h752d6ce1f15e4125
2. alloc::str::_$LT$impl$u20$alloc..borrow..ToOwned$u20$for$u20$str$GT$::to_owned::h649c495e0f441934
3. alloc::slice::_$LT$impl$u20$alloc..borrow..ToOwned$u20$for$u20$$u5b$T$u5d$$GT$::to_owned::h1eac45d28
4. alloc::slice::_$LT$impl$u20$$u5b$T$u5d$$GT$::to_vec::h25257986b8057640
5. alloc::slice::hack::to_vec::h37a40daa915357ad
6. core::slice::_$LT$impl$u20$$u5b$T$u5d$$GT$::len::h2af5e6c76291f524
7. alloc::vec::Vec$LT$T$GT$::extend_from_slice::h190290413e8e57a2
8. _$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$alloc..vec..SpecExtend$LT$$RF$T$C$core..slice..Iter$LT$T$GT$$GT$$GT$::spec_extend::h451c2f92a49f9caa
...


Well I'm not gonna talk about the performance impact on stack but really to program well reusing code grants the maintainability and its good, and I'm sure that the rust developed had measured that and don't compensate to hardcode directly every constructor.

At this point we have two options, check the rust source code, or try to figure out the string object in dynamic with gdb.

Source code

Let's explain this group of substructures having rust source code in the hand.
The string object is defined at string.rs and it's simply an u8 type vector.



And the definition of vector can be found at vec.rs  and is composed by a raw vector an the len which is the usize datatype.



The RawVector is a struct that helds the pointer to the null terminated string stored on an Unique object, and also contains the allocation pointer, here raw_vec.rs definition.



The cap field is the capacity of the allocation and a is the allocator:



Finally the Unique object structure contains a pointer to the null terminated string, and also a one byte marker core::marker::PhantomData



Dynamic analysis

The first parameter of the constructor is the interesting one, and in x64 arch is on RDI register, the extrange sequence RDI,RSI,RDX,RCX it sounds like ACDC with a bit of imagination (di-si-d-c)

So the RDI parĂ¡meter is the pointer to the string object:



So RDI contains the stack address pointer that points the the heap address 0x5578f030.
Remember to disable ASLR to correlate the addresses with Ghidra, there is also a plugin to do the synchronization.

Having symbols we can do:
p mystring

and we get the following structure:

String::String {
  vec: alloc::vec::Vec {
    buf: alloc::raw_vec::RawVec {
      ptr: core::ptr::unique::Unique {
        pointer: 0x555555790130 "hello world\000",
        _marker: core::marker::PhantomData
     },
     cap: 11,
     a: alloc::alloc::Global
   },
   len: 11
  }
}

If the binary was compiled with symbols we can walk the substructures in this way:

(gdb) p mystring.vec.buf.ptr
$6 = core::ptr::unique::Unique {pointer: 0x555555790130 "hello world\000", _marker: core::marker::PhantomData}

(gdb) p mystring.vec.len

$8 = 11

If we try to get the pointer of each substructure we would find out that the the pointer is the same:


If we look at this pointer, we have two dwords that are the pointer to the null terminated string, and also 0xb which is the size, this structure is a vector.


The pionter to the c string is 0x555555790130




This seems the c++ string but, let's look a bit deeper:

RawVector
  Vector:
  (gdb) x/wx 0x7fffffffdf50
  0x7fffffffdf50: 0x55790130  -> low dword c string pointer
  0x7fffffffdf54: 0x00005555  -> hight dword c string pointer
  0x7fffffffdf58: 0x0000000b  -> len

0x7fffffffdf5c: 0x00000000
0x7fffffffdf60: 0x0000000b  -> low cap (capacity)
0x7fffffffdf64: 0x00000000  -> hight cap
0x7fffffffdf68: 0xf722fe27  -> low a  (allocator)
0x7fffffffdf6c: 0x00007fff  -> hight a
0x7fffffffdf70: 0x00000005 

So in this case the whole object is in stack except the null-terminated string.




Related word

NcN 2015 CTF - theAnswer Writeup


1. Overview

Is an elf32 static and stripped binary, but the good news is that it was compiled with gcc and it will not have shitty runtimes and libs to fingerprint, just the libc ... and libprhrhead
This binary is writed by Ricardo J Rodrigez

When it's executed, it seems that is computing the flag:


But this process never ends .... let's see what strace say:


There is a thread deadlock, maybe the start point can be looking in IDA the xrefs of 0x403a85
Maybe we can think about an encrypted flag that is not decrypting because of the lock.

This can be solved in two ways:

  • static: understanding the cryptosystem and programming our own decryptor
  • dynamic: fixing the the binary and running it (hard: antidebug, futex, rands ...)


At first sight I thought that dynamic approach were quicker, but it turned more complex than the static approach.


2. Static approach

Crawling the xrefs to the futex, it is possible to locate the main:



With libc/libpthread function fingerprinting or a bit of manual work, we have the symbols, here is the main, where 255 threads are created and joined, when the threads end, the xor key is calculated and it calls the print_flag:



The code of the thread is passed to the libc_pthread_create, IDA recognize this area as data but can be selected as code and function.

This is the thread code decompiled, where we can observe two infinite loops for ptrace detection and preload (although is static) this antidebug/antihook are easy to detect at this point.


we have to observe the important thing, is the key random?? well, with the same seed the random sequence will be the same, then the key is "hidden" in the predictability of the random.

If the threads are not executed on the creation order, the key will be wrong because is xored with the th_id which is the identify of current thread.

The print_key function, do the xor between the key and the flag_cyphertext byte by byte.


And here we have the seed and the first bytes of the cypher-text:



With radare we can convert this to a c variable quickly:


And here is the flag cyphertext:


And with some radare magics, we have the c initialized array:


radare, is full featured :)

With a bit of rand() calibration here is the solution ...



The code:
https://github.com/NocONName/CTF_NcN2k15/blob/master/theAnswer/solution.c





3. The Dynamic Approach

First we have to patch the anti-debugs, on beginning of the thread there is two evident anti-debugs (well anti preload hook and anti ptrace debugging) the infinite loop also makes the anti-debug more evident:



There are also a third anti-debug, a bit more silent, if detects a debugger trough the first available descriptor, and here comes the fucking part, don't crash the execution, the execution continues but the seed is modified a bit, then the decryption key will not be ok.





Ok, the seed is incremented by one, this could be a normal program feature, but this is only triggered if the fileno(open("/","r")) > 3 this is a well known anti-debug, that also can be seen from a traced execution.

Ok, just one byte patch,  seed+=1  to  seed+=0,   (add eax, 1   to add eax, 0)

before:


after:



To patch the two infinite loops, just nop the two bytes of each jmp $-0



Ok, but repairing this binary is harder than building a decryptor, we need to fix more things:

  •  The sleep(randInt(1,3)) of the beginning of the thread to execute the threads in the correct order
  •  Modify the pthread_cond_wait to avoid the futex()
  • We also need to calibrate de rand() to get the key (just patch the sleep and add other rand() before the pthread_create loop
Adding the extra rand() can be done with a patch because from gdb is not possible to make a call rand() in this binary.

With this modifications, the binary will print the key by itself. 

Related posts

Workshop And Presentation Slides And Materials

All of our previous workshop and presentation slides and materials are available in one location, from Google Drive.

From now on, we are only going to keep the latest-greatest version of each talk/workshop and announce changes on Twitter.
Read more