
Stack dump on Win32: how to get API addresses?

Started by December 18, 2005 11:40 PM
6 comments, last by Jan Wassenberg 18 years, 10 months ago
I've implemented an exception handler in my program which is able to dump the stack and list the name of any function pointer it finds. It does this by comparing pointers:

1) At the beginning of execution, record the stack pointer immediately, to know where the stack begins.
2) Save each function's name and pointer in an array. It's important to save them in the order they appear.
3) On an exception, get the current stack pointer, and read 32-bit words from there to the initial stack pointer (grabbed in step 1).
4) If the word read is between the pointer to one function and the next, print that function's name.

(If anyone knows a better way, I'd love to hear it; this is rather hackish.)

This seems to work well, but there is one problem - it won't list functions in the Windows API. I've seen other programs do this, so there must be some way; how would I go about doing it?
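A minimal sketch of step 4 in plain C, assuming the array from step 2 is sorted by ascending address. FuncEntry and lookup_function are names made up for illustration, and the last table entry is assumed to be an end marker rather than a real function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per known function, sorted by ascending start address.
   (Hypothetical type, for illustration only.) */
typedef struct {
    uintptr_t   addr;  /* start address of the function */
    const char *name;  /* NULL for the end marker */
} FuncEntry;

/* Return the name of the function whose range contains `word`, or NULL.
   A function's range is assumed to run from its own start address up to
   the start of the next entry; the last entry only marks the end. */
const char *lookup_function(const FuncEntry *table, size_t count, uintptr_t word)
{
    assert(table != NULL && count >= 2);
    if (word < table[0].addr || word >= table[count - 1].addr)
        return NULL;

    /* invariant: table[lo].addr <= word < table[hi].addr */
    size_t lo = 0, hi = count - 1;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].addr <= word)
            lo = mid;
        else
            hi = mid;
    }
    return table[lo].name;
}
```

Any stack word that happens to fall inside the table's address range gets a name, so stale data and plain integers can produce false hits. This has the same limitation as the scan described above.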
---------------------------------dofile('sig.lua')
You should be using the StackWalk64 API to walk the stack, because looking at the current stack frame can be somewhat hit-and-miss...

Here's a good sample on using it; it also shows how to use imagehlp.dll to get symbol names and such. Though it uses the now-obsolete StackWalk API and friends, it should be a good starting point.
Here's an approach that I figured out last year. It may not work for every situation and won't work on a 64-bit system. YMMV.

// ---------------------------------------------------------------------------
/*
When executing a near call, the processor does the following (see Figure 6-4):
1. Pushes the current value of the EIP register on the stack.
2. Loads the offset of the called procedure in the EIP register.
3. Begins execution of the called procedure.

When executing a near return, the processor performs these actions:
1. Pops the top-of-stack value (the return instruction pointer) into the EIP register.
2. (If the RET instruction has an optional n argument.) Increments the stack pointer by the
   number of bytes specified with the n operand to release parameters from the stack.
3. Resumes execution of the calling procedure.

// ---------------------------------------------------------------------------
E8 cd CALL rel32 Call near, relative, displacement relative to next instruction

The function address can be determined by adding the displacement
found in the call instruction to the return address on the stack.

// ---------------------------------------------------------------------------
FF /2 CALL r/m32 Call near, absolute indirect, address given in r/m32

For a near call, an absolute offset is specified indirectly in a general-purpose register or a
memory location (r/m16 or r/m32). The operand-size attribute determines the size of the target
operand (16 or 32 bits). The target operand specifies an absolute offset in the code segment
(that is an offset from the base of the code segment)

// ---------------------------------------------------------------------------
FF /3 CALL m16:32 Call far, absolute indirect, address given in m16:32

disassembly examples from kernel32.dll
FF154C134E7C   CALL     DWORD PTR [0x7C4E134C]
FF15FC102D7C   CALL     DWORD PTR [0x7C2D10FC]
FF151014547C   CALL     DWORD PTR [0x7C541410]

if 5th byte back == E8 ==> near relative call
if 5th byte back == 15 (with FF before it) ==> absolute indirect call through memory
    (this is the near FF /2 form, not the far FF /3 form: ModRM 15 has reg field 2)
check if always 15 - it looks to be the case
a ModRM byte of 15 indicates a 32 bit absolute displacement follows
*/
DWORD _stdcall GetFunctionAddress(DWORD dwReturn)
{
    if ( 0 != dwReturn ) {
        // near relative call
        if ( 0xE8 == *(((BYTE *)(DWORD *)dwReturn) - 5) ) {
            // return address + displacement == function address
            return dwReturn + *(((DWORD *)dwReturn) - 1);
        }
        // absolute indirect call through memory (FF 15 disp32), typical of ntdll.dll
        if ( 0x15FF == *(((WORD *)(DWORD *)dwReturn) - 3) ) {
            // disp32 is the address of the pointer that was called through
            // (e.g. an import table slot); dereference it to get the function itself
            return *(((DWORD *)dwReturn) - 1);
        }
        // near absolute indirect
        // instruction could be two, three or four bytes
        // return *(((DWORD *)dwReturn) - 1);
        // just fall through (a length disassembler is overkill)
    }
    return 0;
}


To use that function, pass it a return address stored on the stack. To get the name of the API, you'll have to cross-reference the address it returns with the information stored in the import section of the PE file.
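The E8 case can be tried out without touching a live stack. Below is a minimal sketch in portable C that decodes a CALL rel32 from a byte buffer. decode_call_rel32, the buffer and the load address `base` are all made up for illustration and are not part of the function above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* `code` holds machine code that is assumed to be loaded at address `base`.
   `ret_off` is the offset of the return address within the buffer.
   If the five bytes before the return address form a CALL rel32 (E8 cd),
   store the call target in *target and return 1; otherwise return 0. */
int decode_call_rel32(const uint8_t *code, size_t size, uint32_t base,
                      size_t ret_off, uint32_t *target)
{
    assert(code != NULL && target != NULL);
    if (ret_off < 5 || ret_off > size)
        return 0;
    if (code[ret_off - 5] != 0xE8)
        return 0;

    /* the displacement is stored little-endian right after the opcode */
    const uint8_t *d = code + ret_off - 4;
    uint32_t disp = (uint32_t)d[0]
                  | (uint32_t)d[1] << 8
                  | (uint32_t)d[2] << 16
                  | (uint32_t)d[3] << 24;

    /* unsigned arithmetic wraps modulo 2^32, so backward (negative)
       displacements come out right as well */
    *target = base + (uint32_t)ret_off + disp;
    return 1;
}
```

An E8 byte five bytes back can also be the tail of some other instruction, so a match is a strong hint rather than proof that a call happened there.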
"I thought what I'd do was, I'd pretend I was one of those deaf-mutes." - the Laughing Man
[all code below is under GPL license]

Quote: 1) At the beginning of execution, record the stack pointer immediately, to know where the stack begins.

Somewhat safer (in case you reuse the code somewhere you don't know who your caller is):
static NT_TIB* get_tib()
{
	NT_TIB* tib;
	__asm
	{
		mov		eax, fs:[NT_TIB.Self]
		mov		[tib], eax
	}
	return tib;
}

..

	// out of bounds (note: IA32 stack grows downwards)
	NT_TIB* tib = get_tib();
	if(!(tib->StackLimit < p && p < tib->StackBase))
		return false;


LessBread: yes, disassembling the CALL instruction that came before the return address is a good way to go about walking the stack. Here's "overkill" (*g*) that handles all addressing modes and returns the call target:
// checks if there is an IA-32 CALL instruction right before ret_addr.
// returns ERR_OK if so and ERR_FAIL if not.
// also attempts to determine the call target. if that is possible
// (directly addressed relative or indirect jumps), it is stored in
// target, which is otherwise 0.
//
// this is useful for walking the stack manually.
LibError ia32_get_call_target(void* ret_addr, void** target)
{
	*target = 0;

	// points to end of the CALL instruction (which is of unknown length)
	const u8* c = (const u8*)ret_addr;
	// this would allow for avoiding exceptions when accessing ret_addr
	// close to the beginning of the code segment. it's not currently set
	// because this is really unlikely and not worth the trouble.
	const size_t len = ~0u;

	// CALL rel32 (E8 cd)
	if(len >= 5 && c[-5] == 0xE8)
	{
		*target = (u8*)ret_addr + *(i32*)(c-4);
		return ERR_OK;
	}

	// CALL r/m32 (FF /2)
	// .. CALL [r32 + r32*s]          => FF 14 SIB
	if(len >= 3 && c[-3] == 0xFF && c[-2] == 0x14)
		return ERR_OK;
	// .. CALL [disp32]               => FF 15 disp32
	// (the opcode byte is 6 bytes before ret_addr, hence c[-6])
	if(len >= 6 && c[-6] == 0xFF && c[-5] == 0x15)
	{
		void* addr_of_target = *(void**)(c-4);
		if(!debug_is_pointer_bogus(addr_of_target))
		{
			*target = *(void**)addr_of_target;
			return ERR_OK;
		}
	}
	// .. CALL [r32]                  => FF 00-3F(!14/15)
	if(len >= 2 && c[-2] == 0xFF && c[-1] < 0x40 && c[-1] != 0x14 && c[-1] != 0x15)
		return ERR_OK;
	// .. CALL [r32 + r32*s + disp8]  => FF 54 SIB disp8
	if(len >= 4 && c[-4] == 0xFF && c[-3] == 0x54)
		return ERR_OK;
	// .. CALL [r32 + disp8]          => FF 50-57(!54) disp8
	if(len >= 3 && c[-3] == 0xFF && (c[-2] & 0xF8) == 0x50 && c[-2] != 0x54)
		return ERR_OK;
	// .. CALL [r32 + r32*s + disp32] => FF 94 SIB disp32
	if(len >= 7 && c[-7] == 0xFF && c[-6] == 0x94)
		return ERR_OK;
	// .. CALL [r32 + disp32]         => FF 90-97(!94) disp32
	if(len >= 6 && c[-6] == 0xFF && (c[-5] & 0xF8) == 0x90 && c[-5] != 0x94)
		return ERR_OK;
	// .. CALL r32                    => FF D0-D7
	if(len >= 2 && c[-2] == 0xFF && (c[-1] & 0xF8) == 0xD0)
		return ERR_OK;

	return ERR_FAIL;
}
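The byte patterns in that table follow from the ModRM layout: mod in bits 7..6, reg in bits 5..3 (the /2 opcode extension) and r/m in bits 2..0. Below is a minimal sketch in portable C that derives the instruction lengths from those fields. ModRM, decode_modrm and call_rm32_length are hypothetical names, and 32-bit addressing with no prefixes is assumed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned mod;  /* bits 7..6: addressing form */
    unsigned reg;  /* bits 5..3: opcode extension when the opcode is FF */
    unsigned rm;   /* bits 2..0: register or memory operand selector */
} ModRM;

ModRM decode_modrm(uint8_t b)
{
    ModRM m = { (unsigned)(b >> 6), (unsigned)((b >> 3) & 7), (unsigned)(b & 7) };
    return m;
}

/* Length in bytes of the near indirect CALL (FF /2) starting at insn[0],
   or 0 if the bytes are not such an instruction or the buffer is too short. */
size_t call_rm32_length(const uint8_t *insn, size_t avail)
{
    assert(insn != NULL);
    if (avail < 2 || insn[0] != 0xFF)
        return 0;
    ModRM m = decode_modrm(insn[1]);
    if (m.reg != 2)
        return 0;                       /* some other FF /n instruction */

    size_t len = 2;                     /* opcode + ModRM */
    if (m.mod != 3) {                   /* mod 3 is CALL r32: nothing follows */
        if (m.rm == 4) {                /* a SIB byte follows */
            if (avail < 3)
                return 0;
            len += 1;
            if (m.mod == 0 && (insn[2] & 7) == 5)
                len += 4;               /* SIB with no base register: disp32 */
        } else if (m.mod == 0 && m.rm == 5) {
            len += 4;                   /* [disp32], the FF 15 case */
        }
        if (m.mod == 1)
            len += 1;                   /* disp8 */
        else if (m.mod == 2)
            len += 4;                   /* disp32 */
    }
    return len <= avail ? len : 0;
}
```

For example, 0x15 decodes to mod 0, reg 2, r/m 5, which is why FF 15 is a six-byte near call through a memory operand.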


And complete stack walk code while at it:

/*
Subroutine linkage example code:

	push	param2
	push	param1
	call	func
ret_addr:
	[..]

func:
	push	ebp
	mov		ebp, esp
	sub		esp, local_size
	[..]

Stack contents (down = decreasing address)
	[param2]
	[param1]
	ret_addr
	prev_ebp         (<- current ebp points at this value)
	[local_variables]
*/

/*
	call	func1
ret1:

func1:
	push	ebp
	mov		ebp, esp
	call	func2
ret2:

func2:
	push	ebp
	mov		ebp, esp
	STARTHERE
*/

#if CPU_IA32 && !CONFIG_OMIT_FP

static LibError ia32_walk_stack(STACKFRAME64* sf)
{
	// read previous values from STACKFRAME64
	void* prev_fp  = (void*)sf->AddrFrame .Offset;
	void* prev_ip  = (void*)sf->AddrPC    .Offset;
	void* prev_ret = (void*)sf->AddrReturn.Offset;
	if(!debug_is_stack_ptr(prev_fp))
		return ERR_11;
	if(prev_ip && !debug_is_code_ptr(prev_ip))
		return ERR_12;
	if(prev_ret && !debug_is_code_ptr(prev_ret))
		return ERR_13;

	// read stack frame
	void* fp       = ((void**)prev_fp)[0];
	void* ret_addr = ((void**)prev_fp)[1];
	if(!debug_is_stack_ptr(fp))
		return ERR_14;
	if(!debug_is_code_ptr(ret_addr))
		return ERR_15;

	void* target;
	LibError err = ia32_get_call_target(ret_addr, &target);
	RETURN_ERR(err);
	if(target)	// were able to determine it from the call instruction
		debug_assert(debug_is_code_ptr(target));

	sf->AddrFrame .Offset = (DWORD64)fp;
	sf->AddrPC    .Offset = (DWORD64)target;
	sf->AddrReturn.Offset = (DWORD64)ret_addr;

	return ERR_OK;
}

#endif	// #if CPU_IA32 && !CONFIG_OMIT_FP


// called for each stack frame found by walk_stack, passing information
// about the frame and <user_arg>.
// return INFO_CB_CONTINUE to continue, anything else to stop immediately
// and return that value to walk_stack's caller.
//
// rationale: we can't just pass function's address to the callback -
// dump_frame_cb needs the frame pointer for reg-relative variables.
typedef LibError (*StackFrameCallback)(const STACKFRAME64*, void*);

// iterate over a call stack, calling back for each frame encountered.
// if <pcontext> != 0, we start there; otherwise, at the current context.
// return an error if callback never succeeded (returned 0).
//
// lock must be held.
static LibError walk_stack(StackFrameCallback cb, void* user_arg = 0, uint skip = 0, const CONTEXT* pcontext = 0)
{
	// to function properly, StackWalk64 requires a CONTEXT on
	// non-x86 systems (documented) or when in release mode (observed).
	// exception handlers can call walk_stack with their context record;
	// otherwise (e.g. dump_stack from debug_assert), we need to query it.
	CONTEXT context;
	// .. caller knows the context (most likely from an exception);
	//    since StackWalk64 may modify it, copy to a local variable.
	if(pcontext)
		context = *pcontext;
	// .. need to determine context ourselves.
	else
	{
		skip++;	// skip this frame

		// there are 4 ways to do so, in order of preference:
		// - asm (easy to use but currently only implemented on IA32)
		// - RtlCaptureContext (only available on WinXP or above)
		// - intentionally raise an SEH exception and capture its context
		//   (spams us with "first chance exception")
		// - GetThreadContext while suspended* (a bit tricky + slow).
		//
		// * it used to be common practice to query the current thread's context,
		// but WinXP SP2 and above require it be suspended.
		//
		// this MUST be done inline and not in an external function because
		// compiler-generated prolog code trashes some registers.

#if CPU_IA32
		ia32_get_current_context(&context);
#else
		// try to import RtlCaptureContext (available on WinXP and later)
		HMODULE hKernel32Dll = LoadLibrary("kernel32.dll");
		// note: RtlCaptureContext takes a PCONTEXT (not a PCONTEXT*)
		VOID (WINAPI *pRtlCaptureContext)(PCONTEXT);
		*(void**)&pRtlCaptureContext = GetProcAddress(hKernel32Dll, "RtlCaptureContext");
		FreeLibrary(hKernel32Dll);	// doesn't actually free the lib
		if(pRtlCaptureContext)
			pRtlCaptureContext(&context);
		// not available: raise+handle an exception; grab the reported context.
		else
		{
			__try
			{
				RaiseException(0xF001, 0, 0, 0);
			}
			// ContextRecord is a pointer, so dereference it to copy the CONTEXT
			__except(context = *(GetExceptionInformation())->ContextRecord, EXCEPTION_CONTINUE_EXECUTION)
			{
			}
		}
#endif
	}
	pcontext = &context;

	STACKFRAME64 sf;
	memset(&sf, 0, sizeof(sf));
	sf.AddrPC.Offset    = pcontext->PC_;
	sf.AddrPC.Mode      = AddrModeFlat;
	sf.AddrFrame.Offset = pcontext->FP_;
	sf.AddrFrame.Mode   = AddrModeFlat;
	sf.AddrStack.Offset = pcontext->SP_;
	sf.AddrStack.Mode   = AddrModeFlat;

	// for each stack frame found:
	LibError ret = ERR_SYM_NO_STACK_FRAMES_FOUND;
	for(;;)
	{
		// rationale:
		// - provide a separate ia32 implementation so that simple
		//   stack walks (e.g. to determine callers of malloc) do not
		//   require firing up dbghelp. that takes tens of seconds when
		//   OS symbols are installed (because symserv is wanting to access
		//   inet), which is entirely unacceptable.
		// - VC7.1 sometimes generates stack frames despite /Oy ;
		//   ia32_walk_stack may appear to work, but it isn't reliable in
		//   this case and therefore must not be used!
		// - don't switch between ia32_walk_stack and StackWalk64 when one
		//   of them fails: this needlessly complicates things. the ia32
		//   code is authoritative provided its prerequisite (FP not omitted)
		//   is met, otherwise totally unusable.
		LibError err;
#if CPU_IA32 && !CONFIG_OMIT_FP
		err = ia32_walk_stack(&sf);
#else
		sym_init();
		// note: unfortunately StackWalk64 doesn't always SetLastError,
		// so we have to reset it and check for 0. *sigh*
		SetLastError(0);
		const HANDLE hThread = GetCurrentThread();
		BOOL ok = StackWalk64(machine, hProcess, hThread, &sf, (PVOID)pcontext,
			0, SymFunctionTableAccess64, SymGetModuleBase64, 0);
		err = LibError_from_win32(ok);
#endif

		// no more frames found - abort. note: also test FP because
		// StackWalk64 sometimes erroneously reports success.
		void* fp = (void*)(uintptr_t)sf.AddrFrame .Offset;
		if(err < 0 || !fp)
			return ret;

		if(skip)
		{
			skip--;
			continue;
		}

		ret = cb(&sf, user_arg);
		// callback reports it's done; stop calling it and return that value.
		// (can be 0 for success, or a negative error code)
		if(ret != INFO_CB_CONTINUE)
		{
			debug_assert(ret <= 0);	// shouldn't return > 0
			return ret;
		}
	}
}
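The frame layout in the comment at the top (prev_ebp at [ebp], ret_addr right above it) can be exercised without a real stack. Below is a minimal sketch in portable C that walks such a chain through a simulated stack stored in an array. walk_frames and the slot-index representation are assumptions made for illustration and are not part of the code above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk a chain of saved frame pointers in a simulated stack.
   Frame pointers are slot indices into `stack`:
     stack[fp]     = caller's frame pointer (prev_ebp)
     stack[fp + 1] = return address into the caller
   Higher indices stand for higher addresses, i.e. older frames.
   Stores up to max_out return addresses in `out`, returns how many. */
size_t walk_frames(const uintptr_t *stack, size_t slots, size_t fp,
                   uintptr_t *out, size_t max_out)
{
    assert(stack != NULL && out != NULL);
    size_t n = 0;
    /* a frame needs two slots, so fp must be at most slots - 2 */
    while (n < max_out && slots >= 2 && fp <= slots - 2) {
        out[n++] = stack[fp + 1];
        uintptr_t next = stack[fp];
        /* the stack grows downwards, so the caller's frame must lie at a
           higher address; anything else means the chain ended or is corrupt */
        if (next <= fp)
            break;
        fp = (size_t)next;
    }
    return n;
}
```

The bounds test and the "must move upwards" test play the role that debug_is_stack_ptr plays above: they keep a corrupt chain from running off the stack or looping forever.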
E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3
Nice work Jan.

With the complete stackwalk, why call LoadLibrary on k32?

// try to import RtlCaptureContext (available on WinXP and later)
HMODULE hKernel32Dll = LoadLibrary("kernel32.dll");
VOID (WINAPI *pRtlCaptureContext)(PCONTEXT);	// takes a PCONTEXT, not a PCONTEXT*
*(void**)&pRtlCaptureContext = GetProcAddress(hKernel32Dll, "RtlCaptureContext");
FreeLibrary(hKernel32Dll);	// doesn't actually free the lib


Why not use GetModuleHandle instead and skip the FreeLibrary call?

HMODULE hKernel32Dll = GetModuleHandle("kernel32.dll");
VOID (WINAPI *pRtlCaptureContext)(PCONTEXT);	// takes a PCONTEXT, not a PCONTEXT*
*(void**)&pRtlCaptureContext = GetProcAddress(hKernel32Dll, "RtlCaptureContext");


Have you ever encountered a Windows exe that didn't link with k32?

"I thought what I'd do was, I'd pretend I was one of those deaf-mutes." - the Laughing Man
Ah, thanks for pointing that out. This code was copied from another import where the DLL actually needed to be loaded.
Kernel32 is guaranteed to be loaded into every process because the loader calls kernel32!_BaseProcessStart to kick off execution, so this is safe.
I've made the change locally.
E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3
These all look like nice ideas (thanks for the help), but they still require me to manually record each function's name and address in some array. If I compile my program in debug mode (using the -g option with gcc) and check the EXE in a hex editor, I can see a long list of all the function and variable names already made up. There must be some way I can use that instead? (And if anyone knows how to get VC++ in Visual Studio 6 to read this info and display actual function names, that'd be awesome too.)
---------------------------------dofile('sig.lua')
Quote: (And if anyone knows how to get VC++ in Visual Studio 6 to read this info and display actual function names, that'd be awesome too.)

See dbghelp SymGetTypeInfo(..TI_GET_SYMNAME..).
E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3

This topic is closed to new replies.
