I think this sounds interesting, but it's hard to tell because the page does such a poor job of explaining what the software _actually_ does. I can't seem to find an example use case that would explain to me why I need this. I think the site may benefit from adding a page that shows use cases.
You're definitely right about that. I'm afraid the website was rushed due to lack of time. :/ I'll try to clarify here with some use-cases:
- There's this new hot app everybody's so excited about, but it's only available for iOS and you'd love to interop with it. You realize it's relying on encrypted network protocols and tools like Wireshark just won't cut it. You pick up Frida and use it for API tracing. (frida-trace)
- You're building a desktop app which has been deployed at a customer's site. There's a problem but the built-in logging code just isn't enough. You need to send your customer a custom build with lots of expensive logging code. Then you realize you could just use Frida and build an application-specific tool that will add all the diagnostics you need, and in just a few lines of Python. No need to send the customer a new custom build - you just send the tool which will work on many versions of your app.
- You'd like to build a Wireshark on steroids with support for sniffing encrypted protocols. It could even manipulate function calls to fake network conditions that would otherwise require you to set up a test lab.
- Your in-house app could use some black-box tests without polluting your production code with logic required only for exotic testing.
This is a great overview. The iOS interop you just outlined would be a huge use case for me personally, and I'd love to see a future tutorial that walks through the specifics of making that work.
Awesome! I'll try to get a tutorial on that written tomorrow evening. By the way, feel free to drop by #frida on FreeNode if you have any questions or would like to bounce ideas. :)
1. All the examples show attaching to an already running process. Is there a way to inject into a process from the moment it starts? I've got something I've been wanting to look at that is packed and the part I want to analyze happens right after unpacking but before UI comes up.
2. It looks like this would work even with software that uses some of the anti-debugging techniques like checking for an attached debugger etc... Is that right?
1. This is supported through frida.spawn() and frida.resume(). This is however only implemented in the Mac backend for now.
2. Yes this is indeed the case - and using Frida's Stalker (code tracer) you can even trace code without software/hardware breakpoints or any modifications to the original code.
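For point 1, the spawn-then-attach flow might look roughly like this from Python. It is only a sketch: frida.spawn() and frida.resume() are the calls named above, but the exact argument shapes, the `instrument` callback and the `api` parameter (there so the ordering can be exercised without Frida installed) are assumptions on my part.

```python
def spawn_and_instrument(argv, instrument, api=None):
    """Start a program suspended, hook it, then let it run.

    `instrument` is a hypothetical callback that receives the attached
    session and installs whatever hooks are needed.
    """
    if api is None:
        import frida  # imported lazily; assumes the frida package is installed
        api = frida

    pid = api.spawn(argv)       # process is created but left suspended
    session = api.attach(pid)   # inject while nothing has executed yet
    instrument(session)         # install hooks before the unpacker runs
    api.resume(pid)             # only now does the target start executing
    return pid
```

The point of the ordering is that the hooks are in place before the first instruction of the target runs, which is what you want for the "right after unpacking, before the UI" case.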
Sorry, I'm at work so I don't have much time to investigate, but how does it know what the arguments of the functions are? Is it just walking the stack until it finds something that looks like a return address, then assume everything before that is an argument?
Also, what kind of hooking does it support? I am only really familiar with windows DLLs, but is it able to hook the Import Address Table? How does this work with statically linked binaries when it can't just walk the IAT?
This stuff can probably be answered by looking at the source, but my curiosity currently is overruled by deadlines.
No worries, the Frida website could definitely have explained things better!
Regarding the number of function arguments: Frida doesn't know that, so it's something you need to know at a higher level. It is, however, safe to access arguments out of bounds; just expect "garbage" values if you read beyond the Nth argument of an N-argument function. Also, using Frida's Memory API (Memory.readUtf8String(), for example) you'll get a JS exception thrown if you attempt to access an invalid address.
As for IAT, Frida is agnostic as far as hooking goes, it only needs the address of the function in memory (which you can obtain through for example Module.enumerateExports("kernel32.dll", ...)). How it works is that it dynamically rewrites/relocates the start of the function with a JMP to a trampoline that executes your JS hook, then the overwritten instruction(s) and finally jumps back to the next instruction after the overwritten ones.
Regarding functions in static libraries you can use Process.enumerateRanges() or Module.enumerateRanges() and Memory.scan() together to find the functions you're interested in, and then just Interceptor.attach() to them.
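To make the static-library answer a bit more concrete: the idea is to search the process's memory ranges for a byte signature of the function you want, then hook whatever address matches. The sketch below shows only the matching step, in plain Python over an ordinary byte buffer. It does not call Frida, and the `scan` and `parse_pattern` helpers as well as the "??" wildcard notation are assumptions made for illustration.

```python
def parse_pattern(pattern):
    """Turn "55 8b ?? ec" into [0x55, 0x8b, None, 0xec]; None is a wildcard."""
    return [None if tok == "??" else int(tok, 16) for tok in pattern.split()]


def scan(buffer, pattern):
    """Return every offset in `buffer` where `pattern` matches."""
    wanted = parse_pattern(pattern)
    if not wanted:
        return []
    data = bytearray(buffer)  # gives integer indexing on Python 2 and 3
    hits = []
    for offset in range(len(data) - len(wanted) + 1):
        if all(w is None or data[offset + i] == w for i, w in enumerate(wanted)):
            hits.append(offset)
    return hits
```

For example, scan(b"\x90\x55\x8b\xec\x83\x55\x8b\xec", "55 ?? ec") returns [1, 5]. In a real tool each hit would be added to the base address of the scanned range to get the address to hook.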
Awesome, thanks for the answer. That makes sense to just leave indexing the stack up to the user. Same with letting the user do some legwork to find the address of the function.
Have you considered the (sort of edge) case when arguments are passed in registers? For example (in the VC compiler, as far as I remember) the "this" pointer is generally passed in the register ECX instead of being pushed onto the stack. In those cases it might be useful to have access to the register contents inside the JS callback.
Along a similar line, the EAX register is generally used for return values. It might be useful to provide access to that as well (through another callback?), although that would probably require changing how your trampoline works so that it replaces the return address on the stack... (I miss working with low-level things sometimes.)
Ahh yes, exposing registers is planned and very trivial to implement, as it's available at the C code level, just needs to be exposed to the script runtime (V8). Future versions may also allow specifying an ABI, so args[0] maps to ecx for thiscall in 32-bit mode.
My explanation of the hooking was a bit oversimplified, there is indeed support for hooking the return. Just implement onLeave(retval) in addition to onEnter(args), and you'll have access to the return value (coming from EAX/RAX).
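As a mental model of the onEnter/onLeave pair, here is a plain-Python analogy. It is not how Frida is implemented (the real thing rewrites machine code and runs JS), and the `attach` helper and its callback names are made up for illustration; it only shows the order in which the two callbacks see the arguments and the return value.

```python
def attach(original, on_enter=None, on_leave=None):
    """Wrap `original` so callbacks run before and after it."""
    def hooked(*args):
        if on_enter is not None:
            on_enter(args)            # sees the arguments, like onEnter(args)
        retval = original(*args)      # the untouched original still runs
        if on_leave is not None:
            on_leave(retval)          # sees the result, like onLeave(retval)
        return retval
    return hooked
```

Wrapping an add function this way and calling it with (2, 3) would invoke on_enter with (2, 3), run the original, then invoke on_leave with 5.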
Feedback is most appreciated, so please let me know if there are any issues. :)
It seems like a neat idea however I have encountered some problems.
- I can't install it with `pip` instead of `easy_install`. It would be nice to support this as well.
$ sudo pip install frida
[sudo] password for ***:
Downloading/unpacking frida
Could not find any downloads that satisfy the requirement frida
No distributions at all found for frida
Storing complete log in /home/***/.pip/pip.log
- After installing with `easy_install` I now have problems with running a simple example:
$ frida-trace -i 'recv*' ls
Failed to attach: libffi.so.5: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/usr/local/bin/frida-trace", line 9, in <module>
load_entry_point('frida==1.0.5', 'console_scripts', 'frida-trace')()
File "/usr/local/lib/python2.7/dist-packages/frida-1.0.5-py2.7-linux-x86_64.egg/frida/tracer.py", line 525, in main
frida.shutdown()
File "/usr/local/lib/python2.7/dist-packages/frida-1.0.5-py2.7-linux-x86_64.egg/frida/__init__.py", line 13, in shutdown
get_device_manager()._manager.close()
File "/usr/local/lib/python2.7/dist-packages/frida-1.0.5-py2.7-linux-x86_64.egg/frida/__init__.py", line 22, in get_device_manager
import _frida
ImportError: libffi.so.5: cannot open shared object file: No such file or directory
- `View on Github` links to the organization page and I'm not sure where to submit these issues. (frida-core? frida-python? frida-tracer?)
Thanks a lot! This appears to be an issue with the Frida build system and the Linux buildbot slave. We're investigating and will hopefully have a bugfix release out in the next few hours.
Hope we'll get the pip issue sorted in the near future, need to do some more research into what's possible when not relying on eggs for binary releases (ran into issues when we tried to use bdist_dumb, so we postponed it for later).
As general feedback: the current trend of showing on the frontpage how easy it is to install things is, in my opinion, largely irrelevant. Rather, show a simple use case, be it code snippets or other scenarios, to help the audience understand what the thing is and why it is useful.
UPDATE: Think "show less, explain more". That might help many.
It's a valid question. It seems to be a JavaScript interface written in Python that parasitically ties into existing executables. Why not write the whole thing in JS?
Frida's core is written in C and injects Google's V8 engine into the target processes, where your JS gets executed with full access to memory, hooking functions and even calling native functions inside the process. There's a bi-directional communication channel that is used to talk between your app (Python?) and the JS running inside the target process.
On top of this C core there are multiple language bindings (Python, .NET and a browser plugin), and it is very easy to build further bindings for other languages and environments (Node.js could be a future binding if anyone's interested in helping out with that).
I was thinking this exactly, it would be great to have it in node.js. I'd love to dedicate the time, but I've got my own open source project I'm trying to get going at the moment.
For anybody that does want to try this before the node-bindings are available, maybe package it with node-python? https://npmjs.org/package/node-python
The one-line blurb sounds a lot like my project, Cycript, so I figure I'll mention it here. I've been working on a new website recently with documentation, and have a new release coming very soon with iOS 7 and ARM64 support (along with a ton of new features) as well. (All of this likely to be released quite soon, but many developers have been actively using Cycript since I first released it in 2009 regardless: it is one of the more important tools for iOS introspection in the jailbreak community.)
Cycript is an Objective-C/JavaScript bridge that allows you to inject code into native applications on Mac OS X or iOS (either using a jailbroken device, into the simulator, or using an SDK to embed into your application). (I do not currently support Windows, or runtimes like .NET.) It features a highly-interactive REPL with runtime grammar-assisted tab completion and live syntax highlighting. Users type an enhanced JavaScript syntax that includes a number of features from C and Objective-C.
Here is a link to a talk I gave at 360|iDev demonstrating its usage, as well as how to download the current beta version for Mac OS X 10.9 and the iOS 7 version of Xcode. I am also here linking to the Cycript website (which still sucks, but gets the gist across of what Cycript is and how one interacts with it; the talk is better, though, and more up-to-date) as well as the iPhone Dev Wiki (which has a lot of third-party documentation).
Cycript lets you swap out implementations of native code in an Objective-C application, and can do so at the C function level quite easily when combined with Substrate (which is very easy to call out to, as Cycript has quite sophisticated support for FFI due to the large amounts of C-specific syntax and bridging it has; the next release is going to be even better at this, with the syntax now supporting C-style type signatures, and with C++11 lambda syntax for more direct construction).
To compare, Frida looks a little more like a scriptable debugger, whereas Cycript is designed more to interact with applications as they are being used. With Frida, you use a Python API to inject JavaScript snippets into the other app, and then do IPC back and forth with your code. Cycript instead gives you an interactive console to a JavaScript environment running in the other application, with IPC implicit in the REPL. (FWIW, there are advantages to the kind of approach Frida has, and adding features to Cycript to make it a little more like a debugger has been on my todo list.)
It sounds like Frida and Cycript have similar goals - a REPL and other cool applications are planned and meant to be built on top of the core that is currently exposed to Python, .NET and a browser plugin. (I built a cheesy WebGL network sniffer running in the browser; the basic idea was to build an online collaborative reverse-engineering app where users with the plugin installed can live stream data and everybody collaborates on analyzing it online. Pure crack, I know, but I would personally love to have such a tool.)
On the topic of "in browser", Cycript is actually used by another project of mine called Cydget, which uses Substrate to modify WebKit running inside of SpringBoard to add support for type="text/cycript" script elements. Developers can then use Cycript's Objective-C syntax and FFI to interact directly with native iOS libraries from HTML pages that are rendered on the user's lock screen. (This project somewhat languished in popularity for a few years until a resurgence earlier this year when I started making a concerted effort on reddit to get theme artists to stop using WinterBoard, another project of mine, and instead switch to Cydget). (Neither WinterBoard nor Cydget are yet ported to iOS 7, but WinterBoard is just about done, at which point I start working on Cydget; the last weeks since evasi0n7's release have been quite hectic with porting ;P.) What I don't do is make it easy to inject from that process back into another one (which I can see being sort of interesting for visualizations).
This is why I love HN. Thanks for sharing! After reading about Frida and watching the Cycript video, I'm sort of ashamed of being amazed at the AST lint rules Facebook is building into Phabricator. Both are awesome, but these are like 10x more.
That's spot on. Regarding license protection logic that is also true, but maybe this could help with further nails in the DRM coffin — software running on the end-user's hardware is never safe against tampering. :)
This is a bit over my head but I am intrigued. Can someone give a few examples of what this would let you accomplish and how it differs from current methods?
From what I've seen, they're making function hooking as easy as possible by exposing it in a language plenty of developers are already familiar with (JavaScript). Previously, it would take assembly code or C++ to accomplish this.
As to what you can do with this; since a program basically consists of logic (function calls) + data, you can change any behavior of a program by modifying the function calls and return values. For example, I have previously used these techniques to extend MSN (Windows Live) Messenger to support third party chat networks (Facebook, Google Talk), even though the program didn't expose any APIs to extend functionality (but by simply hooking and injecting data into network calls such as recv and send).
Interesting. This reminded me of old-time cracking methods. One tutorial (which I unfortunately can't find now) taught how to add functionality to an application. The tutorial used notepad.exe as an example, and after following it, you had added line numbering (at the left of the text area) and an entry in the menu bar to show/hide it.
In those old times, this was done using a combination of W32dasm, SoftIce and Hiew (a hex editor that allowed you to view the instructions in a PE exe file).
The most impressive crack of this type I remember seeing was one where the "demo" version of a 3D-modelling program did not have the "Save file" routine (it was completely absent from the exe) and which was implemented as part of a crack (including GetSaveFileName dialog and all). That was some amazing stuff there.
If I understand correctly, this program offers a 'dedicated' process to achieve similar functionality.
This is unfortunate but no secret (have a look at frida-website's README); it was the best I could do with my non-existent design skills. :/ This is one area where I hope someone with such skills could help us out (it's mentioned in the `Contributing` section on the website).
Documentation is coming soon, I'm sorry the website was a bit rushed. If you want a head start just add http://ospy.org/ as a Source in Cydia and install the Frida package on your jailbroken device. Then from Python use frida.get_device_manager().enumerate_devices() and with the iOS Device object just call .enumerate_processes() and .attach() as you would on the root "frida" API.
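Spelled out, that head start might look something like the sketch below. The get_device_manager(), enumerate_devices(), enumerate_processes() and attach() calls are the ones named above; the `.name` and `.pid` attributes, the `attach_on_device` helper and the optional `manager` parameter are assumptions for illustration and may not match the installed version.

```python
def attach_on_device(device_name, process_name, manager=None):
    """Find a process by name on a named device and attach to it."""
    if manager is None:
        import frida  # imported lazily; assumes the frida package is installed
        manager = frida.get_device_manager()

    for device in manager.enumerate_devices():
        if device.name != device_name:      # assumed attribute
            continue
        for process in device.enumerate_processes():
            if process.name == process_name:  # assumed attribute
                return device.attach(process.pid)
        raise LookupError("no process named %r on %r" % (process_name, device_name))
    raise LookupError("no device named %r" % (device_name,))
```

The returned session would then be used the same way as one obtained from attaching to a local process.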
Oops, that's a silly bug in frida-gum (sorry!), it should strip off the "L" suffix when calling toString() on a NativePointer object. As a band-aid you can either try with a 32-bit target process or locally hack it in your installed frida/core.py (just strip off the trailing "L" anywhere there's an int(a, 16) call). Expect a bugfix release tomorrow evening to fix this properly, so don't hesitate to let me know of any other issues.
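The local band-aid could be as small as the helper below. `parse_pointer` is a made-up name, and where exactly it would be called from inside frida/core.py is not something this sketch claims to know; it only shows the stripping step around int(a, 16).

```python
def parse_pointer(text):
    """Parse a hex pointer string, tolerating a trailing "L" suffix."""
    # "L" is not a hex digit, so stripping it can never eat part of the value.
    return int(text.rstrip("L"), 16)
```

With this, parse_pointer("0x7fff5fbff8a0L") and parse_pointer("0x7fff5fbff8a0") both give the same integer, whereas a bare int("0x7fff5fbff8a0L", 16) raises ValueError.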
Could someone explain what has made it possible to do this now rather than before? Is it V8? What does it give developers that lets them debug and trace anything?
This looks awesome. Can anyone do a writeup on how this would be used to, for instance, reverse engineer a random program's serial verification routine?
frida-gum is LGPL and the rest of the code is GPLv3+. This is subject to change - if it's an issue for someone I'm all for changing to a more liberal license (Frida has very few authors so relicensing is trivial).