memory corruption WideCharToMultiByte

Recently I was debugging a memory corruption that showed up only after a certain piece of code had executed 64 times, at which point free complained that memory had been corrupted. So began the usual debugging routine: I started Application Verifier, added the application, and enabled the basic checks.

Application Verifier- Basic Checks

Soon my application started failing with an access violation during startup, even before it reached the code where the memory was being corrupted. Now I had two problems to investigate. This time the crash was inside a call to WideCharToMultiByte, so I started debugging to see exactly where it was failing:

  ntdll.dll!RtlUnicodeToUTF8N() Unknown
  KernelBase.dll!WideCharToMultiByte() Unknown
  vfbasics.dll!000007fef081cd0c() Unknown
> heapcorruption.exe!dllNotificationFunction(unsigned long NotificationReason, const _LDR_DLL_NOTIFICATION_DATA * NotificationData, void * Context) Line 64 C++
  ntdll.dll!string "Enabling heap debug options\n"() Unknown
  ntdll.dll!LdrpFindOrMapDll() Unknown
  ntdll.dll!LdrpLoadDll() Unknown
  ntdll.dll!LdrLoadDll() Unknown
  vfbasics.dll!000007fef08074de() Unknown
  KernelBase.dll!LoadLibraryExW() Unknown
  heapcorruption.exe!main() Line 94 C++
  heapcorruption.exe!invoke_main() Line 75 C++

It was failing deep in the guts of ntdll code, so surely something invalid had been passed to WideCharToMultiByte. I started the application under windbg and hit g. Once it crashed inside ntdll it was time to analyze what went wrong with !analyze -v.

windbg output:

0:000> !analyze -v
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************
FAULTING_IP: 
ntdll!RtlUnicodeToUTF8N+149
00007ffb`06cb74e9 410fb709        movzx   ecx,word ptr [r9]

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 00007ffb06cb74e9 (ntdll!RtlUnicodeToUTF8N+0x0000000000000149)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000021c29c68000
Attempt to read from address 0000021c29c68000

FAULTING_THREAD:  0000000000002748
PROCESS_NAME:  heapcorruption.exe

ADDITIONAL_DEBUG_TEXT:  
FAULTING_MODULE: 00007ffb06c50000 ntdll
DEBUG_FLR_IMAGE_TIMESTAMP:  584119c4
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.
EXCEPTION_PARAMETER1:  0000000000000000
EXCEPTION_PARAMETER2:  0000021c29c68000
READ_ADDRESS:  0000021c29c68000 

FOLLOWUP_IP: 
heapcorruption!dllNotificationFunction+ae [f:\code\blogposts\heapcorruption\heapcorruption\heapcorruption.cpp @ 70]
00007ff7`c351181e 898534030000    mov     dword ptr [rbp+334h],eax
BUGCHECK_STR:  APPLICATION_FAULT_INVALID_POINTER_READ_WRONG_SYMBOLS
PRIMARY_PROBLEM_CLASS:  INVALID_POINTER_READ
DEFAULT_BUCKET_ID:  INVALID_POINTER_READ
LAST_CONTROL_TRANSFER:  from 00007ffb04001c4f to 00007ffb06cb74e9

One thing we know for sure: we are reading invalid memory, which points to a possible read past the end of a string.

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

Let’s roll up our sleeves and look at what is being passed to WideCharToMultiByte and whether it is valid. This is what the disassembly looks like in VS (interleaved with the source):

00007FF7C3511770  mov         qword ptr [rsp+18h],r8  
00007FF7C3511775  mov         qword ptr [rsp+10h],rdx  
00007FF7C351177A  mov         dword ptr [rsp+8],ecx  
00007FF7C351177E  push        rbp  
00007FF7C351177F  push        rdi  
00007FF7C3511780  sub         rsp,548h  
00007FF7C3511787  lea         rbp,[rsp+40h]  
00007FF7C351178C  mov         rdi,rsp  
00007FF7C351178F  mov         ecx,152h  
00007FF7C3511794  mov         eax,0CCCCCCCCh  
00007FF7C3511799  rep stos    dword ptr [rdi]  
00007FF7C351179B  mov         ecx,dword ptr [rsp+568h]  
00007FF7C35117A2  mov         rax,qword ptr [__security_cookie (07FF7C351C000h)]  
00007FF7C35117A9  xor         rax,rbp  
00007FF7C35117AC  mov         qword ptr [rbp+4F0h],rax  
    //printf("length of string is- %d chars, and Length in Bytes- %d\n", (int)wcslen(NotificationData->Loaded.FullDllName->Buffer), 
    //                                          NotificationData->Loaded.FullDllName->Length);
    char utf8Str[3 * _MAX_PATH + 1] = "\0";
00007FF7C35117B3  movzx       eax,word ptr [string "\0" (07FF7C3519D24h)]  
00007FF7C35117BA  mov         word ptr [utf8Str],ax  
00007FF7C35117BE  lea         rax,[rbp+12h]  
00007FF7C35117C2  mov         rdi,rax  
00007FF7C35117C5  xor         eax,eax  
00007FF7C35117C7  mov         ecx,30Bh  
00007FF7C35117CC  rep stos    byte ptr [rdi]  
    int nBytes = WideCharToMultiByte(CP_UTF8, 0, 
00007FF7C35117CE  mov         rax,qword ptr [NotificationData]  
00007FF7C35117D5  mov         rax,qword ptr [rax+8]  
00007FF7C35117D9  movzx       eax,word ptr [rax]  
00007FF7C35117DC  mov         rcx,qword ptr [NotificationData]  
00007FF7C35117E3  mov         rcx,qword ptr [rcx+8]  
00007FF7C35117E7  mov         qword ptr [rsp+38h],0  
00007FF7C35117F0  mov         qword ptr [rsp+30h],0  
00007FF7C35117F9  mov         dword ptr [rsp+28h],30Ch  
00007FF7C3511801  lea         rdx,[utf8Str]  
00007FF7C3511805  mov         qword ptr [rsp+20h],rdx  
00007FF7C351180A  mov         r9d,eax  
00007FF7C351180D  mov         r8,qword ptr [rcx+8]  
00007FF7C3511811  xor         edx,edx  
00007FF7C3511813  mov         ecx,0FDE9h  
00007FF7C3511818  call        qword ptr [__imp_WideCharToMultiByte (07FF7C3520018h)]  
00007FF7C351181E  mov         dword ptr [nBytes],eax  
                                     NotificationData->Loaded.FullDllName->Buffer, 
                                     NotificationData->Loaded.FullDllName->Length,
                                     //(int)wcslen(NotificationData->Loaded.FullDllName->Buffer),
                                     utf8Str, (int)sizeof(utf8Str) - 1, nullptr, nullptr);
    

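Things to note: the first rep stos dword ptr [rdi] fills the function's stack area with 0CCCCCCCCh (the debug-build fill pattern), and the later rep stos byte ptr [rdi] zeroes the rest of utf8Str. According to the x64 calling convention, the first 4 integer or pointer parameters are passed in registers and the remaining ones on the stack.
The parameters passed in registers go, in order, into: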

1. rcx
2. rdx
3. r8
4. r9

Thus for the following function call (the compiler sets up the parameters in reverse order here, so the last ones are put in place first)

    int nBytes = WideCharToMultiByte(CP_UTF8, 0, 
                                     NotificationData->Loaded.FullDllName->Buffer, 
                                     NotificationData->Loaded.FullDllName->Length,
                                     utf8Str, (int)sizeof(utf8Str) - 1, nullptr, nullptr);

first the two nullptr (0) arguments are stored on the stack

00007FF7C35117E7  mov         qword ptr [rsp+38h],0  
00007FF7C35117F0  mov         qword ptr [rsp+30h],0  

followed by (int)sizeof(utf8Str) - 1, whose value is 30Ch

00007FF7C35117F9  mov         dword ptr [rsp+28h],30Ch 

then utf8Str

00007FF7C3511801  lea         rdx,[utf8Str]  
00007FF7C3511805  mov         qword ptr [rsp+20h],rdx  

then followed by first 4 parameters:

; NotificationData->Loaded.FullDllName->Length
00007FF7C351180A  mov         r9d,eax         
        
; NotificationData->Loaded.FullDllName->Buffer
00007FF7C351180D  mov         r8,qword ptr [rcx+8]    

00007FF7C3511811  xor         edx,edx                 ; 0
00007FF7C3511813  mov         ecx,0FDE9h              ; CP_UTF8
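
The constants in the disassembly above can be cross-checked against the source with a little arithmetic. The following is a minimal sketch, not part of the original project: kMaxPath and kCpUtf8 are stand-in names that assume the usual Windows values of _MAX_PATH (260) and CP_UTF8 (65001), so that the checks compile without the Windows headers.

```cpp
#include <cstddef>

// Stand-ins for the Windows constants (assumed values, see lead-in).
constexpr std::size_t kMaxPath = 260;    // assumed value of _MAX_PATH
constexpr unsigned    kCpUtf8  = 65001;  // assumed value of CP_UTF8

// Same expression as the declaration: char utf8Str[3 * _MAX_PATH + 1]
constexpr std::size_t kUtf8BufSize = 3 * kMaxPath + 1;

// mov dword ptr [rsp+28h],30Ch  ->  (int)sizeof(utf8Str) - 1
static_assert(kUtf8BufSize - 1 == 0x30C, "sixth argument should be 30Ch");

// mov ecx,30Bh / rep stos byte  ->  the first 2 bytes are copied from the
// "\0" literal, the remaining sizeof(utf8Str) - 2 bytes are zero-filled
static_assert(kUtf8BufSize - 2 == 0x30B, "zero-fill count should be 30Bh");

// mov ecx,0FDE9h  ->  CP_UTF8
static_assert(kCpUtf8 == 0xFDE9, "code page argument should be 0FDE9h");
```

All three checks pass at compile time, so the values seen in the registers and stack slots match the arguments written in the source.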

And here is the disassembly from windbg as the code is being executed:

   64 00007ff7`c35117cc f3aa            rep stos byte ptr [rdi]
   70 00007ff7`c35117ce 488b8528050000  mov     rax,qword ptr [rbp+528h]
   70 00007ff7`c35117d5 488b4008        mov     rax,qword ptr [rax+8]
   70 00007ff7`c35117d9 0fb700          movzx   eax,word ptr [rax]
   70 00007ff7`c35117dc 488b8d28050000  mov     rcx,qword ptr [rbp+528h]
   70 00007ff7`c35117e3 488b4908        mov     rcx,qword ptr [rcx+8]
   70 00007ff7`c35117e7 48c744243800000000 mov   qword ptr [rsp+38h],0
   70 00007ff7`c35117f0 48c744243000000000 mov   qword ptr [rsp+30h],0
   70 00007ff7`c35117f9 c74424280c030000 mov     dword ptr [rsp+28h],30Ch
   70 00007ff7`c3511801 488d5510        lea     rdx,[rbp+10h]
   70 00007ff7`c3511805 4889542420      mov     qword ptr [rsp+20h],rdx
   70 00007ff7`c351180a 448bc8          mov     r9d,eax
   70 00007ff7`c351180d 4c8b4108        mov     r8,qword ptr [rcx+8]
   70 00007ff7`c3511811 33d2            xor     edx,edx
   70 00007ff7`c3511813 b9e9fd0000      mov     ecx,0FDE9h
   70 00007ff7`c3511818 ff15fae70000    call    qword ptr [heapcorruption!_imp_WideCharToMultiByte (00007ff7`c3520018)]

Now the most important things to track here are the Length and Buffer values. Buffer was passed in r8, which was loaded from [rcx+8].

Now I want to see the register values just before the call to heapcorruption!_imp_WideCharToMultiByte, so I set a breakpoint at that address with the command bp 00007FF7C3511818, followed by g. Once windbg breaks, I can inspect the registers with the r command:

0:000> bp 00007FF7C3511818
0:000> g
Breakpoint 1 hit
heapcorruption!dllNotificationFunction+0xa8:
00007ff7`c3511818 ff15fae70000    call    qword ptr [heapcorruption!_imp_WideCharToMultiByte (00007ff7`c3520018)] ds:00007ff7`c3520018=00007ffaddfbcc94
0:000> r
rax=0000000000000076 rbx=0000021c29c5afe0 rcx=000000000000fde9
rdx=0000000000000000 rsi=00007ffb06d9c5d0 rdi=000000cb396ff2ed
rip=00007ff7c3511818 rsp=000000cb396fef90 rbp=000000cb396fefd0
 r8=0000021c29c67f80  r9=0000000000000076 r10=0000000000000000
r11=000000cb396ff548 r12=0000021c29c63fb0 r13=0000000000000000
r14=0000000000000000 r15=0000021c29c61f80
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
heapcorruption!dllNotificationFunction+0xa8:
00007ff7`c3511818 ff15fae70000    call    qword ptr [heapcorruption!_imp_WideCharToMultiByte (00007ff7`c3520018)] ds:00007ff7`c3520018=00007ffaddfbcc94

Note that the buffer was passed in r8, whose contents are 0000021c29c67f80, and the size was passed in r9, whose contents are 0000000000000076 (118 in decimal). You can look at what is at that address with this command:

; we are dealing with unicode so better turn it on
0:000> .enable_unicode 1     
 
; interpret data at this address as strings- luckily for us it is null terminated
0:000> du 0000021c29c67f80    
0000021c`29c67f80  "F:\code\blogposts\heapcorruption"
0000021c`29c67fc0  "\x64\Debug\ClassLibrary.dll"

; du prints at most 32 chars per line

After this observation I hit g, and there comes the access violation:

0:000> g
(2728.2748): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll!RtlUnicodeToUTF8N+0x149:
00007ffb`06cb74e9 410fb709        movzx   ecx,word ptr [r9] ds:0000021c`29c68000=????

The complaint from windbg (also seen in the !analyze -v output earlier) is that this address cannot be read, so I set out to print what is in that memory:

0:000> du 0000021c29c68000
0000021c`29c68000  "????????????????????????????????"
0000021c`29c68040  "????????????????????????????????"
0000021c`29c68080  "????????????????????????????????"
0000021c`29c680c0  "????????????????????????????????"
0000021c`29c68100  "????????????????????????????????"
0000021c`29c68140  "????????????????????????????????"
0000021c`29c68180  "????????????????????????????????"
0000021c`29c681c0  "????????????????????????????????"
0000021c`29c68200  "????????????????????????????????"
0000021c`29c68240  "????????????????????????????????"
0000021c`29c68280  "????????????????????????????????"
0000021c`29c682c0  "????????????????????????????????"

; there is nothing here!- so why are we even trying to copy this mem?

Now I have a strong suspicion that we are reading past the end of the Buffer passed in earlier: 0000021c29c68000 - 0000021c29c67f80 = 80h.
Here is the problem: we are reading past the buffer! r9 held 76h, yet the faulting read is at offset 80h, beyond the 76h bytes of string data.
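
The address arithmetic can be spelled out as a small sketch. The constants are copied from the register dump and the fault address above; the names are made up for illustration, and the 2-byte character size assumes the Windows WCHAR.

```cpp
#include <cstdint>

constexpr std::uint64_t kBufferAddr  = 0x0000021c29c67f80ULL;  // r8 at the call
constexpr std::uint64_t kFaultAddr   = 0x0000021c29c68000ULL;  // READ_ADDRESS
constexpr std::uint64_t kLengthField = 0x76;                   // r9 at the call
constexpr std::uint64_t kWcharSize   = 2;                      // assumed sizeof(WCHAR)

// How far into the buffer the faulting read was.
constexpr std::uint64_t kOffsetAtFault = kFaultAddr - kBufferAddr;

// What the Length field really describes: 118 bytes, i.e. 59 characters.
constexpr std::uint64_t kActualChars = kLengthField / kWcharSize;

// What the call asked for: 118 characters, i.e. 236 bytes.
constexpr std::uint64_t kBytesRequested = kLengthField * kWcharSize;

static_assert(kOffsetAtFault == 0x80, "fault is 128 bytes into the buffer");
static_assert(kActualChars == 59, "the path is 59 characters long");
static_assert(kBytesRequested == 236, "the call asked for 236 bytes");
static_assert(kOffsetAtFault > kLengthField, "fault lies past the string data");
static_assert(kOffsetAtFault < kBytesRequested, "fault lies inside the requested range");
```

So the string really holds 59 characters, but the call asked for 118 characters (236 bytes), and the read ran off the end of the readable memory at offset 128.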

With this information, I looked back at this call:

    int nBytes = WideCharToMultiByte(CP_UTF8, 0, 
                                     NotificationData->Loaded.FullDllName->Buffer, 
                                     NotificationData->Loaded.FullDllName->Length,
                                     utf8Str, (int)sizeof(utf8Str) - 1, nullptr, nullptr);

I am passing NotificationData->Loaded.FullDllName->Length to the function as is, but it is actually a count of bytes, not of WCHARs! The fix is to pass the number of characters instead:

    //  in case your string is always null terminated:
    //  int numChars = (int)wcslen(NotificationData->Loaded.FullDllName->Buffer);
    //
    //  Length is in bytes; 2 is sizeof(WCHAR)
    int numChars = NotificationData->Loaded.FullDllName->Length/2;  
    
    int nBytes = WideCharToMultiByte(CP_UTF8, 0, 
                                     NotificationData->Loaded.FullDllName->Buffer, numChars,
                                     utf8Str, (int)sizeof(utf8Str) - 1, nullptr, nullptr);

and everything works great. This is also spelled out in the documentation for UNICODE_STRING (Length is in bytes) and WideCharToMultiByte (cchWideChar is in characters).
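
To keep the bytes-versus-characters conversion in one place, it could be wrapped in a small helper. This is only a sketch: CountedWideString is a hypothetical stand-in that mirrors the fields of UNICODE_STRING so the example compiles without the Windows headers, and CharCount is a made-up name.

```cpp
#include <cstdint>

// Hypothetical stand-in mirroring the fields of UNICODE_STRING.
struct CountedWideString {
    std::uint16_t   Length;         // bytes in Buffer, terminator not included
    std::uint16_t   MaximumLength;  // bytes allocated for Buffer
    const char16_t* Buffer;         // UTF-16 data, not necessarily terminated
};

// Converts the byte length into a count of 16-bit characters, which is
// the unit a character-count parameter expects.
constexpr int CharCount(const CountedWideString& s) {
    return static_cast<int>(s.Length / sizeof(char16_t));
}

// The DLL path from the debugging session: 118 bytes are 59 characters.
static_assert(CharCount(CountedWideString{118, 120, nullptr}) == 59,
              "118 bytes of UTF-16 are 59 characters");
```

With the real UNICODE_STRING, the same division gives the value to pass as the cchWideChar argument of WideCharToMultiByte.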

Anyway, with this problem fixed I went back to reviewing the code to find out why the original problem happened only at the 65th iteration. It turned out to be due to a bitfield: an int was used to mark 32 bits for some processing and we overflowed it. Since the binaries were 64-bit, thanks to padding it survived until the 64th iteration and then luckily corrupted memory.
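
The offending code is not shown here, but the general hazard can be sketched: shifting a 32-bit value by 32 or more is undefined behavior in C++, so a bit index has to be range-checked before it is used. Everything below (kMaskBits, WithBitSet) is hypothetical and only illustrates one possible guard, not the fix that went into the project.

```cpp
#include <cstdint>

// Number of usable bits in the hypothetical 32-bit mask.
constexpr unsigned kMaskBits = 32;

// Returns `mask` with bit `index` set. An out-of-range index leaves the
// mask unchanged instead of performing an undefined shift.
constexpr std::uint32_t WithBitSet(std::uint32_t mask, unsigned index) {
    return index < kMaskBits ? (mask | (std::uint32_t{1} << index)) : mask;
}

static_assert(WithBitSet(0, 0) == 0x1u, "lowest bit");
static_assert(WithBitSet(0, 31) == 0x80000000u, "highest valid bit");
static_assert(WithBitSet(0x5u, 32) == 0x5u, "out-of-range index is ignored");
```

Whether to ignore, assert on, or report an out-of-range index is a design choice; the point is only that the index is checked before the mask is touched.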

I have uploaded heapcorruption to GitHub.