15 March 2017

Nobody expected 64 bits

Apparently if you are not mortally embarrassed by the quality of your code you are releasing it too late [(tm) Silicon Valley]. But to use another only-too-often-used-quote - "Release early, release often". I've made mistakes of hoarding my tools and code for too long, not releasing them because they weren't perfect. This was obviously road to nowhere because if I don't release, nobody uses it. And if nobody uses it I have no motivation to develop it anymore. So, to break this circle I present you a new version of function Annotator for Binary Ninja.

First thing worth mentioning is a new database of functions prototypes. To be exact we now have 4728 prototypes. From this place big thanks to Zach Riggle for his functions project - this update would not be here if not for him.

Next thing is virtual stack for x64 platform - from now on you can also annotate 64-bit applications for Intel/AMD processors.

One small thing that I still need to properly implement is full support for functions operating on floating point types (float, double and long double). Right now they are not properly annotated and there are two important reasons for that:
For 32 bit platform floating point arguments are pushed on the stack using instructions like fstp or fst. Sadly, Binary Ninja right now does not have a corresponding Low Level Instructions for those. They are just showing as unimplemented(). The moment Binary Ninja starts supporting them I just add some more parsing code and everything will be fine.
64 bit platform is slightly more complicated. First of all, arguments to functions are passed via registers. Integers, pointers and such are passed through 6 registers - RDI, RSI, RDX, RCX, R8 and R9 and order matters. Floating point arguments are passed via XMM0-7 registers. Now, let's imagine that we have two functions f1(int, float) and f2(float, int). What will compiler do? Well, on Linux, in case of f1() first argument will end up in RDI and second in XMM0, but in f2 first argument will end up in XMM0 and second one in RDI.
"Wait a minute" - you will say - "but this is exactly the same". I'm glad you are seeing the same problem. Just having state of registers won't tell us what the first and the second argument is unless you know types in the first place. Virtual Stack does not know types, so until I refactor my code FP types won't be supported.

New updates are planned so stay tuned! And of course, please let me know what you think about it and report all bugs.

21 February 2017

Annotate all the things

I don't do reverse engineering for a living but I still like to peek under the hood of binaries from time to time. Either because of testing, looking for bugs or just for fun. Problem is, that IDA Pro, de-facto standard tool for any Reverse Engineer is prohibitively expensive for most of the people. On top of that, licensing policy is very annoying and illogical. But enough about IDA Pro - let's talk about new contender on this field - Binary Ninja.

I'm not going to repeat all the praises that this tool is receiving. Instead, you may for example read how you can use it to automatically reverse 2000 binaries or maybe how the underlying Low Level Instrumentation Language works. All in all platform looks very promising and I couldn't wait to try it after seeing it for the first time. Couple of months ago I was playing with the Beta and pretty much bought it first day it was released.

There is one tiny problem with Binary Ninja however - IDA Pro was here for years, therefore it is both feature rich and ecosystem around it is pretty robust. Binja still has a long way to go in this department - there are not that many useful plugins and some features are missing. One thing I've noticed for example is that while reversing basic libc functions and system calls are not annotated in any way. There is no prototype of them and arguments are not marked in any way. So instead of complaining I've decided to utilize available API and just fix that.

Let's start by defining a problem. For example we have a listing like this:

Not terribly descriptive, right? Well, at least for strcpy() we roughly remember the prototype so we can quickly find where arguments are being pushed on the stack. But what about fchmodat() or sigaction(). Yeah, you need to get back to man page. How cool would be to open a binary and get this:

This is exactly what Annotator plugin does - it iterates through all instruction in the code building a virtual stack as it goes, but instead of variables it tracks instructions that pushed a given variable on to the stack. Upon encountering a call of known function it uses this virtual stack to annotate it with a proper argument prototype.

This is a very first release so it is probably riddled with bugs. Not to mention some features are missing. Right now not all glibc function prototypes are present because I haven't found a good and reliable way to extract them from headers - instead I'm using a combination of grep, regex and cut with some manual cleanup effort. That unfortunately takes time. Same goes for system calls, but I should be able to put all Linux 32bit ones today. Ah, and you have to run plugin manually in every function you view - right now there is no way to automatically apply it to all the functions - I'm contemplating to write one method allowing user to apply it to whole underlying call graph, but we will see about that.

Another thing is quite naive virtual stack implementation - for sure it requires more work to track stack growth more accurately and for example track number of arguments for functions with va_arg type of arguments. Right now I'm also scanning blocks of code in linear manner, but for future version I will probably switch to recursive mode with stack isolation for each path (well, right now I haven't encountered situation where functions arguments are done in different code block than the call itself, but better safe than sorry). Last thing to improve is number of virtual stacks - first for x64 platforms and later for ARM architecture.

Please, let me know what do you think about the extension and report all the bugs.