SilentMoonwalk
PoC Implementation of a fully dynamic call stack spoofer
TL;DR
SilentMoonwalk is a PoC implementation of a fully dynamic call stack spoofer, implementing a technique to remove the original caller from the call stack, using ROP to desynchronize unwinding from control flow.
Authors
The authors of the research are:
I want to stress that this work would have been impossible without the work of Waldo-IRC and Trickster0, which both contributed to the early stages of the PoC, and to the research behind the PoC.
Overview
This repository demonstrates a PoC implementation to spoof the call stack when calling arbitrary Windows APIs.
This attempt was inspired by this Twitter thread, and this Twitter thread, where sensei namazso showed and suggested to extend the stack unwinding approach with a ROP chain to both desynchronize the unwinding from real control flow and restore the original stack afterwards.
This PoC attempts to do something similar to the above, and uses a desync stack to completely hide the original call stack, also removing the EXE image base from it. Upon return, a ROP gadget is invoked to restore the original stack. In the code, this process is repeated 10 times in a loop, using different frames at each iteration, to prove stability.
Supported Modes
The tool currently supports 2 modes, where one is actually a wrong patch to a non-working pop RBP frame identified, which operates by shifting the current RSP and adding two fake frames to the call stack. As it operates using synthetic frames, I refer to this mode as “SYNTHETIC”.
When selecting the frame that unwinds by popping the RBP register from the stack, the tool might select an unsuitable frame, ending up in an abruptly cut call stack, as observable below.
Synthetic Call Stack Mode
A silly solution to the problem would be to create two fake frames and link them back to the cut call stack. This would create a sort of apparently legit call stack, even without a suitable frame which unwinds calling POP RBP, but:
- You would lose the advantage of the desync technique
- The stack would be still unwindable
- The resulting call stack could seem legit just on the first glance, but it would probably not pass a strict check
The result of the _synthetic spoof can be observed in the image below:
Note: This operation mode is disabled by default. To enable this mode, change the CALLSTACK_TYPE to 1
Desync Stack Mode
This mode is the right solution to the above problem, whereby the non-suitable frame is simply replaced by another, suitable one.
Utility
In the repository, you can find also a little util to inspect runtime functions, which might be useful to analyse runtime function entries.
UnwindInspector.exe -h
Unwind Inspector v0.100000
Mandatory args:
-m <module>: Target DLL
-f <function>: Target Function
-a <function-address>: Target Function Address
Sample Output:
UnwindInspector.exe -m kernelbase -a 0x7FFAAE12182C
[*] Using function address 0x7ffaae12182c
Runtime Function (0x000000000000182C, 0x00000000000019ED)
Unwind Info Address: 0x000000000026AA88
Version: 0
Ver + Flags: 00000000
SizeOfProlog: 0x1f
CountOfCodes: 0xc
FrameRegister: 0x0
FrameOffset: 0x0
UnwindCodes:
[00h] Frame: 0x741f - 0x04 - UWOP_SAVE_NONVOL (RDI, 0x001f)
[01h] Frame: 0x0015 - 0x00 - UWOP_PUSH_NONVOL (RAX, 0x0015)
[02h] Frame: 0x641f - 0x04 - UWOP_SAVE_NONVOL (RSI, 0x001f)
[03h] Frame: 0x0014 - 0x00 - UWOP_PUSH_NONVOL (RAX, 0x0014)
[04h] Frame: 0x341f - 0x04 - UWOP_SAVE_NONVOL (RBX, 0x001f)
[05h] Frame: 0x0012 - 0x00 - UWOP_PUSH_NONVOL (RAX, 0x0012)
[06h] Frame: 0xb21f - 0x02 - UWOP_ALLOC_SMALL (R11, 0x001f)
[07h] Frame: 0xf018 - 0x00 - UWOP_PUSH_NONVOL (R15, 0x0018)
[08h] Frame: 0xe016 - 0x00 - UWOP_PUSH_NONVOL (R14, 0x0016)
[09h] Frame: 0xd014 - 0x00 - UWOP_PUSH_NONVOL (R13, 0x0014)
[0ah] Frame: 0xc012 - 0x00 - UWOP_PUSH_NONVOL (R12, 0x0012)
[0bh] Frame: 0x5010 - 0x00 - UWOP_PUSH_NONVOL (RBP, 0x0010)
Build
In order to build the POC and observe a similar behaviour to the one in the picture, ensure to:
- Disable GS (
/GS-
) - Disable Code Optimisation (
/Od
) - Disable Whole Program Optimisation (Remove
/GL
) - Disable size and speed preference (Remove
/Os
,/Ot
) - Enable intrinsic if not enabled (
/Oi
)
Previous Work
It’s worth mentioning previous work done on this topic, which built the foundation of this work.
- Return Address Spoofing: Original technique and idea, by Namaszo. Every other PoC I’m aware of was built on top of that.
- YouMayPasser: This amazing work by Arash is the first properly done extension of the Return Address Spoofing PoC by Namaszo.
- VulcanRaven: A call stack spoofer that operates the spoofing by synthetically creating a Thread Stack mirroring another real call stack.
- Unwinder: A very nice Rust PoC implementation of a call stack spoofer which operates by parsing unwind code information to replace frames in the call stack.
Credits
- Huge shoutout to waldo-irc and trickster0, which collaborated with me on this research. I owe everything to them.
- All the credit for the idea behind this goes to namaszo, which I personally consider a genius. He also cross checked this PoC before release, so huge thanks to him.
Notes
- [SYNTHETIC STACK ONLY]: For a limitation in the way I’m locating the gadgets, the maximum number of arguments is 8 for now (it is TRIVIAL to modify and add more params, but I couldn’t bother).
- [DSESYNC STACK ONLY]: For a limitation in how I’m setting up the spoofer, the maximum number of supported arguments is 4 for now.
- Testing on this one was pretty limited. There might be exceptions I’m not aware of at the moment.
- Unwinding involving 128-bit registers was no tested.
- Calling functions that use 128-bit registers is not officially supported.