Linux’s ptrace API sucks!

Jan 29 2008

I love Linux. As a developer, I find that the available tools suit my style of work perfectly. Sometimes the tool that I want isn’t available. That’s OK though, because whenever I can, I try to contribute.

I do a lot of reverse engineering work, and the lack of anything like Ollydbg led me to start my EDB project. It’s a debugger designed to focus on applications at the machine code level. This project is coming along nicely, but there is one thing that I really wish I could change…ptrace sucks, and it sucks a lot.

First of all, it has no inherent support for threads. This is a huge problem, as many modern applications are multithreaded. Instead, since Linux treats threads as independent processes which happen to share the majority of their address space, you are supposed to ptrace each thread individually. So, you need to attach to all “pids” found in “/proc//task/”. On top of this, you have to set a ptrace option so that you are attached to new threads as they are created with the clone system call. It is entirely undocumented whether you need to do this on a per-process basis or a per-thread basis. Finally, you need to track threads exiting so you know to stop looking for events on those. That’s an awful lot of effort just to be able to debug threads. By the way, the information necessary to know which bits of the status code tell you that it was a clone event (new thread) is also entirely undocumented.
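To make the clone tracking concrete, here is a minimal sketch of the pieces involved. It assumes a glibc system where `<sys/ptrace.h>` exposes `PTRACE_O_TRACECLONE`, `PTRACE_EVENT_CLONE` and `PTRACE_GETEVENTMSG`; the helper names (`trace_new_threads`, `is_clone_event`, `new_thread_id`) are made up for illustration and are not taken from EDB. Setting the option on every thread you attach to is the conservative choice.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Ask the kernel to auto-attach any thread this tracee creates with clone().
 * The tracee must already be attached and stopped. */
static long trace_new_threads(pid_t tid)
{
    return ptrace(PTRACE_SETOPTIONS, tid, NULL,
                  (void *)(long)PTRACE_O_TRACECLONE);
}

/* Returns 1 if a waitpid() status describes a clone event: the tracee is
 * stopped with SIGTRAP and the event code sits in the next 8 bits up. */
static int is_clone_event(int status)
{
    return WIFSTOPPED(status) &&
           (status >> 8) == (SIGTRAP | (PTRACE_EVENT_CLONE << 8));
}

/* After a clone event on `tid`, fetch the id of the newly created thread.
 * Returns -1 if the request fails. */
static pid_t new_thread_id(pid_t tid)
{
    unsigned long msg = 0;
    if (ptrace(PTRACE_GETEVENTMSG, tid, NULL, &msg) == -1)
        return -1;
    return (pid_t)msg;
}
```

In an event loop, you would call `is_clone_event()` on each status returned by `waitpid()` and, when it matches, add the result of `new_thread_id()` to the set of threads being watched.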

Did I mention the potential race condition when attaching to all threads in the /proc//task/ directory? The threads could be spawning more threads while you are enumerating the directory, and threads could even exit during this time. So you have to keep looping and attaching until you are sure the thread count has stabilized and every thread is attached. The only saving grace here is that attaching stops a thread, so it is possible to get them all if you think it through.
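The loop described above might look roughly like the following sketch. It assumes the per-thread directory is `/proc/<pid>/task` and that the caller supplies a fixed-size array for the thread ids; `attach_all_threads` and `already_attached` are hypothetical names. The directory is rescanned until a full pass attaches nothing new.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Returns 1 if `tid` is already in the list of attached threads. */
static int already_attached(const pid_t *tids, size_t n, pid_t tid)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (tids[i] == tid)
            return 1;
    return 0;
}

/* Attach to every thread of `pid`, rescanning until a pass finds nothing new.
 * Returns the number of attached threads, or -1 if the task directory
 * cannot be opened. */
static int attach_all_threads(pid_t pid, pid_t *tids, size_t max)
{
    char path[64];
    size_t count = 0;
    int found_new;

    snprintf(path, sizeof path, "/proc/%d/task", (int)pid);
    do {
        DIR *dir = opendir(path);
        struct dirent *ent;

        if (dir == NULL)
            return -1;
        found_new = 0;
        while ((ent = readdir(dir)) != NULL) {
            pid_t tid = (pid_t)atoi(ent->d_name); /* "." and ".." become 0 */

            if (tid <= 0 || count == max || already_attached(tids, count, tid))
                continue;
            if (ptrace(PTRACE_ATTACH, tid, NULL, NULL) == -1)
                continue; /* the thread may have exited in the meantime */
            waitpid(tid, NULL, __WALL); /* wait for the attach stop */
            tids[count++] = tid;
            found_new = 1;
        }
        closedir(dir);
    } while (found_new);

    return (int)count;
}
```

Because every attached thread is stopped, it can no longer spawn new threads, which is why the loop eventually terminates.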

Next, I wish that the PTRACE_PEEK* and PTRACE_POKE* request types had support for non-word-width granularity. It makes setting a breakpoint more annoying than it has to be, since you really only want to read/write a single byte in that case. Not only that, but reading/writing near the edges of a memory region is equally annoying. A much better interface would have been similar to the file API, where you can specify an address and a length. In addition to this, you need to pay careful attention to the various gotchas due to the fact that the return value is both an error code and a result. So if a peek returns (long)-1, you need to check errno to find out whether that is an error or simply the value that was read.
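For illustration, here is a sketch of what a one-byte breakpoint write ends up looking like with the word-sized requests. It assumes x86, where the breakpoint instruction is the single byte 0xCC; the helper names are hypothetical. Note how errno has to be cleared before the peek so that a legitimate value of -1 can be told apart from a failure.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

#define INT3 0xCC /* x86 breakpoint opcode */

/* The byte of `word` that lives at the lowest address. */
static uint8_t first_byte(long word)
{
    uint8_t b;
    memcpy(&b, &word, 1);
    return b;
}

/* Copy of `word` with its lowest-addressed byte replaced by `b`. */
static long patch_byte(long word, uint8_t b)
{
    memcpy(&word, &b, 1);
    return word;
}

/* Read one word from the tracee. Returns 0 on success, -1 on error. */
static int peek_word(pid_t pid, void *addr, long *out)
{
    long word;

    errno = 0; /* -1 is a valid result, so errno is the only error signal */
    word = ptrace(PTRACE_PEEKTEXT, pid, addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *out = word;
    return 0;
}

/* Write a breakpoint at `addr`, saving the byte it replaces in `*saved`. */
static int set_breakpoint(pid_t pid, void *addr, uint8_t *saved)
{
    long word;

    if (peek_word(pid, addr, &word) == -1)
        return -1;
    *saved = first_byte(word);
    if (ptrace(PTRACE_POKETEXT, pid, addr,
               (void *)patch_byte(word, INT3)) == -1)
        return -1;
    return 0;
}
```

That is a read-modify-write of a whole word just to change one byte, and it still does not handle an address whose containing word runs past the end of a mapped region.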

The usage of wait for debug events is just awkward. It works great for single-threaded command line debuggers like gdb, but for a GUI, where you want things to be interactive while the debugger is waiting for the next event, it is a disaster. Sure, you can use a separate thread to capture events and deliver them to the GUI, but then you have issues properly shutting down that thread, since it will pretty much always be blocked! Also, wait has no timeout, so if you aren’t careful it is possible to get hung forever waiting for an event that will never happen. There is SIGCHLD, which sounds promising at first, but the fun part is that without sigprocmask trickery, you can’t predict which of your threads will get the signal. Grr!
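One possible form of that sigprocmask trickery is sketched below: block SIGCHLD early (before any other threads are created, so they inherit the mask), then combine a non-blocking waitpid with sigtimedwait to get a wait with a timeout. The function names are hypothetical, and this is a simplified sketch rather than a complete event loop; in particular, an unrelated SIGCHLD ends the wait early without an event.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Block SIGCHLD so it stays pending until sigtimedwait() collects it.
 * Call this before creating any other threads. */
static int block_sigchld(void)
{
    sigset_t chld;

    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    return sigprocmask(SIG_BLOCK, &chld, NULL);
}

/* Wait up to `timeout_ms` for a debug event from `pid` (-1 for any child).
 * Returns the id of the thread that has an event, 0 if nothing happened in
 * time, or -1 on error. */
static pid_t wait_event_timeout(pid_t pid, int *status, int timeout_ms)
{
    sigset_t chld;
    struct timespec ts;
    pid_t r;

    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);

    /* An event may already be queued, so check before sleeping. */
    r = waitpid(pid, status, __WALL | WNOHANG);
    if (r != 0)
        return r;

    /* Sleep until SIGCHLD arrives or the timeout expires. */
    if (sigtimedwait(&chld, NULL, &ts) == -1)
        return 0;

    return waitpid(pid, status, __WALL | WNOHANG);
}
```

A GUI could call `wait_event_timeout()` from a worker thread with a short timeout and check a shutdown flag between calls, which avoids the thread being blocked forever.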

Finally, there is a lot of information that would be better off as part of the debugging API. A great example of this is x86 segments. You can get the segment values from the user area or even a PTRACE_GETREGS request, but the segment values are nearly worthless without being able to look at things like the segment base and limit. That information really should be in the user area as well. I understand not all platforms have this data, and x86-64 makes much less use of segments, but that’s why it should be in the user area.

A better API would, first of all, work at the process level. I don’t care how it works under the hood, I want to attach to “processes.” There should also be a function to enumerate threads, which would only be valid while the process is stopped. This way you could get/set the context of each thread by passing a tid. Just these changes would make things much easier.

Overall, the user space API provided by ptrace could use a large overhaul. I understand the desire to be consistent with the debugging APIs of other Unixes, but this should not get in the way of making something usable.

utrace sounds OK, but as far as I know, it is designed as a set of kernel-level changes. In fact, it appears there are plans to have ptrace implemented on top of utrace in the future. That’s great and all, but the user space API needs an update! I can only hope that utrace brings along a new user space API as well.