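April 2017

Internet Explorer: WMF Image Parsing Analysis

Reverse engineering the Internet Explorer/Edge code to look for ideas. The tentative plan is one post per week.

Edge does not support WMF, so this post covers IE only. IE version: IE 11.0.9600.18499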

Windows Metafile (WMF) is a graphics file format on Microsoft Windows systems, originally designed in the early 1990s. Windows Metafiles are intended to be portable between applications and may contain both vector graphics and bitmap components.

Essentially, a WMF file stores a list of function calls that have to be issued to the Windows Graphics Device Interface (GDI) layer in order to display an image on screen.

Enhanced Metafile (EMF), a newer 32-bit version with additional commands, was introduced with Win32 in 1993. EMF is also used as a graphics language for printer drivers. The last(?) version of EMF, 4.0, appeared in 2008.

With the release of Windows XP, the Enhanced Metafile Format Plus Extensions (EMF+) format was introduced. EMF+ provides a way to serialize calls to the GDI+ API in the same way that WMF/EMF stores calls to GDI.
  1. WMF Parser

When IE loads a WMF file, execution reaches the parser method CImgTaskWmf::ReadImage() through the call stack below.

0:023> bp mshtml!CImgTaskWmf::ReadImage
0:023> g
Breakpoint 0 hit
eax=00000000 ebx=0a370a00 ecx=0a370a00 edx=02d7e000 esi=0a370a00 edi=0fb8fbc0
eip=5bd84df1 esp=0fb8fb8c ebp=0fb8fbb4 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
MSHTML!CImgTaskWmf::ReadImage:
5bd84df1 8bff            mov     edi,edi
0:023> kvn
 # ChildEBP RetAddr  Args to Child              
00 0fb8fb88 5bd84da2 5bd84d10 00000016 9ac6cdd7 MSHTML!CImgTaskWmf::ReadImage (FPO: [Non-Fpo])
01 0fb8fbb4 5b9d9639 0fb8fbd8 0a39af00 0a370a00 MSHTML!CImgTaskWmf::Decode+0x92 (FPO: [Non-Fpo])
02 0fb8fbf8 5b9d94f4 00000000 5b9d94d0 0fb8fc44 MSHTML!CImgTask::Exec+0x120 (FPO: [Non-Fpo])
03 0fb8fc08 74d259d8 0a39af00 5225dee8 00000000 MSHTML!CImgTaskExec::FiberProc+0x24 (FPO: [1,0,0])
04 0fb8fc44 74d25986 ffffffff 77562f03 00000000 KERNELBASE!_BaseFiberStart+0x49 (FPO: [Non-Fpo])
05 0fb8fc54 00000000 00000000 00000000 00000000 KERNELBASE!BaseFiberStart+0x16 (FPO: [Non-Fpo])

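Let's start with this small piece. When IE loads an image, CImgHelper calls void WINAPI CImgTaskExec::FiberProc(void * pv). FiberProc effectively performs the load asynchronously. The pointer pv passed in is a PFIBERINFO; FiberProc creates a CImgTask object, hands the "ImgTask" member of that PFIBERINFO to the CImgTask, and calls its Exec function.

CImgTask::Exec calls Decode. Decode is a virtual function, so the call is dispatched by task type. Here the task is a WMF task, so execution lands in CImgTaskWmf::Decode.

Let's look at CImgTaskWmf::Decode. Below is the almost unprocessed decompiler output.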

__int32 __thiscall CImgTaskWmf::Decode(CImgTaskWmf *this, int *a2)
{
  CImgTaskWmf *pImgTaskWmf; // esi@1
  int val1; // eax@3
  int v4; // ecx@3
  int v5; // ST04_4@3
  int v6; // ST00_4@3
  __int32 result; // eax@3
  int v8; // [sp+4h] [bp-20h]@1
  char v9; // [sp+8h] [bp-1Ch]@1
  __int16 v10; // [sp+Eh] [bp-16h]@3
  __int16 v11; // [sp+10h] [bp-14h]@3
  __int16 v12; // [sp+12h] [bp-12h]@3
  __int16 v13; // [sp+14h] [bp-10h]@3
  unsigned __int16 v14; // [sp+16h] [bp-Eh]@3

  v8 = 0;
  *a2 = 1;
  pImgTaskWmf = this;
  if ( CImgTask::Read(this, &v9, 0x16u, (unsigned __int32 *)&v8, (unsigned __int32)this) < 0 || v8 != 22 )
  {
    result = 0x80004005;
  }
  else
  {
    val1 = MulDiv(v12 - v10, 96, v14);
    v4 = v13;
    *((_DWORD *)pImgTaskWmf + 29) = abs(val1);
    v5 = abs(MulDiv(v4 - v11, 96, v14));
    v6 = *((_DWORD *)pImgTaskWmf + 29);
    *((_DWORD *)pImgTaskWmf + 30) = v5;
    CImgTask::OnSize(pImgTaskWmf, v6, v5, 0, -1);
    result = CImgTaskWmf::ReadImage(pImgTaskWmf);
  }
  return result;
}

Set this aside for a moment: there is an important call here, CImgTask::Read, and it is not yet clear what it reads. Look at CImgTask::Read first.

__int32 __thiscall CImgTask::Read(CImgTask *this, void *Dst, unsigned __int32 a3, unsigned __int32 *a4, unsigned __int32 a5)
{
  CDwnMemStream *v5; // ecx@1
  __int32 result; // eax@1
  unsigned __int32 v7; // [sp+4h] [bp-4h]@1

  v5 = (CDwnMemStream *)*((_DWORD *)this + 301);
  v7 = 0;
  result = CDwnMemStream::Read(v5, Dst, a3, a3, &v7);
  if ( a4 )
    *a4 = v7;
  return result;
}

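This brings in yet another function, CDwnMemStream::Read. That is how analysis goes: there is always something new to pull in. Still, the CDwn prefix suggests we are close to the bottom.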

unsigned __int32 __thiscall CDwnMemStream::Read(CDwnMemStream *this, void *Dst, unsigned __int32 a3, unsigned __int32 a4, unsigned __int32 *a5)
{
  int v5; // ebx@1
  CDwnMemStream *v6; // esi@1
  unsigned __int32 v7; // edi@1
  int v8; // edx@1
  int v9; // eax@1
  int v10; // ecx@2
  int v11; // eax@3
  unsigned __int32 v12; // edi@4
  bool v13; // zf@7
  unsigned __int32 v15; // ecx@13
  int v16; // [sp+Ch] [bp-Ch]@1
  unsigned __int32 v17; // [sp+10h] [bp-8h]@6
  int v18; // [sp+14h] [bp-4h]@3

  v5 = a3;
  v6 = this;
  v7 = 0;
  v8 = *((_DWORD *)this + 3);
  v9 = *((_DWORD *)this + 5) << 14;
  v16 = v8 + v9;
  if ( *((_BYTE *)this + 24)
    || (v15 = *((_DWORD *)this + 8) - v8 - v9, a3 <= v15)
    || a4 <= v15
    || (v17 = 0,
        v7 = CImgTask::ReadIntoStream(*((CImgTask **)v6 + 9), v6, a3 - v15, a4 - v15, &v17),
        (v7 & 0x80000000) == 0) )
  {
    v10 = *((_DWORD *)v6 + 5) << 14;
    if ( a3 >= *((_DWORD *)v6 + 8) - v10 - *((_DWORD *)v6 + 3) )
      v5 = *((_DWORD *)v6 + 8) - v10 - *((_DWORD *)v6 + 3);
    v11 = v5;
    v18 = v5;
    if ( v5 )
    {
      while ( 1 )
      {
        v12 = 0x4000 - *((_DWORD *)v6 + 3);
        if ( v11 < v12 )
          v12 = v11;
        v17 = CMemPage::Read(
                *(CMemPage **)(*(_DWORD *)(*((_DWORD *)v6 + 7) + 4) + 4 * *((_DWORD *)v6 + 5)),
                Dst,
                *((_DWORD *)v6 + 3),
                v12);
        if ( (v17 & 0x80000000) != 0 )
          break;
        v13 = (((_WORD)v12 + (unsigned __int16)*((_DWORD *)v6 + 3)) & 0x3FFF) == 0;
        *((_DWORD *)v6 + 3) = ((_WORD)v12 + (unsigned __int16)*((_DWORD *)v6 + 3)) & 0x3FFF;
        if ( v13 )
          ++*((_DWORD *)v6 + 5);
        Dst = (char *)Dst + v12;
        v11 = v18 - v12;
        v18 = v11;
        if ( !v11 )
        {
          v7 = v17;
          goto LABEL_10;
        }
      }
      CDwnMemStream::SetReadPosition(v6, v16);
      v7 = v17;
    }
    else
    {
LABEL_10:
      if ( a5 )
        *a5 = v5;
    }
  }
  return v7;
}

The code of CDwnMemStream::Read is shown above. It can be read in a few small parts.

A. First, the chain of || conditions

  if ( *((_BYTE *)this + 24)
    || (v15 = *((_DWORD *)this + 8) - v8 - v9, cb <= v15)
    || pcbRead <= v15
    || (v17 = 0,
        v7 = CImgTask::ReadIntoStream(*((CImgTask **)_this + 9), _this, cb - v15, pcbRead - v15, &v17),
        (v7 & 0x80000000) == 0) )

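IDA's rendering here is a bit painful. || short-circuits on the first true operand, but IDA can easily get conditions this complex wrong, so it is better to trust yourself and read the assembly directly.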

I.

mov     esi, ecx         ;esi = this
push    edi
xor     edi, edi         ;edi = 0
mov     eax, [esi+14h]   ;eax = *(this+0x14)
mov     edx, [esi+0Ch]   ;edx = *(this+0xc)
shl     eax, 0Eh         ;eax <<= 0xe, i.e. eax *= 0x4000
cmp     byte ptr [esi+18h], 0 ; if (*(BYTE *)(this+0x18) == 0)
lea     ecx, [edx+eax]   ;ecx = edx + eax (lea does not dereference), so both values are offsets
mov     [ebp+var_C], ecx ;var_C = ecx
jz      loc_638F41B2     ;jump if the byte at this+0x18 is zero

II.

loc_638F41B2: (true)
mov     ecx, [esi+20h]   ; ecx = *(this + 0x20)
sub     ecx, edx         ; ecx -= *(this+0xc)
sub     ecx, eax         ; ecx -= eax, leaving the number of bytes not yet read
cmp     ebx, ecx         ; if (arg2 <= ecx), unsigned   [note: ebx is arg2]
jbe     loc_638F412D

III.

(false)
mov     eax, [ebp+arg3] ; eax = arg3
cmp     eax, ecx        ; if (arg3 <= ecx), unsigned
jbe     loc_638F412D

IV.

(false)
sub     eax, ecx         ; eax = arg3 - ecx
mov     [ebp+var_8], edi ; var_8 = 0   [note: edi == 0 here]
lea     edx, [ebp+var_8] ; edx = &var_8
push    edx              
push    eax              ; eax is arg3 - ecx at this point
mov     eax, ebx         ; eax = arg2
sub     eax, ecx         ; eax -= ecx
mov     ecx, [esi+24h]   ; ecx = *(this+0x24)   [!!]
push    eax             
push    esi              ; struct CDwnMemStream *
call    ?ReadIntoStream@CImgTask@@QAEJAAVCDwnMemStream@@KKPAK@Z ; CImgTask::ReadIntoStream(CDwnMemStream &,ulong,ulong,ulong *)
mov     edi, eax          ; edi = return value
test    edi, edi          ; if (retValue >= 0), i.e. the call succeeded
jns     loc_638F412D
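The four checks I to IV can be condensed into one small predicate. The following is only a sketch of how I read the control flow: the function name, the parameter names, and the interpretation of the byte at this+0x18 as a "content is final" flag are assumptions, not names taken from MSHTML.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decides whether more data has to be pulled into the stream before the
   copy loop runs. `avail` is *(this+0x20) minus the current read position.
   All names are hypothetical. */
bool needs_fill(bool content_final, uint32_t avail,
                uint32_t cb_req, uint32_t cb_min,
                uint32_t *fill_req, uint32_t *fill_min)
{
    assert(fill_req && fill_min);
    if (content_final)          /* I:   byte at this+0x18 is non-zero */
        return false;
    if (cb_req <= avail)        /* II:  the whole request is already buffered */
        return false;
    if (cb_min <= avail)        /* III: the minimum is already buffered */
        return false;
    *fill_req = cb_req - avail; /* IV:  only the shortfall is requested */
    *fill_min = cb_min - avail;
    return true;
}
```

With 10 bytes buffered and a 22-byte request, the sketch asks for the missing 12 bytes, which mirrors the two `sub eax, ecx` instructions before the call.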

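The key point is CImgTask::ReadIntoStream. It is a thiscall, so ecx holds "this" (a CImgTask*). That means the value loaded from this+0x24 by mov ecx, [esi+24h] is a CImgTask*; in other words, the field at CDwnMemStream* + 0x24 holds a CImgTask*.

The second parameter of CImgTask::ReadIntoStream has type CDwnMemStream &, so esi, which is this, is a CDwnMemStream*. No doubt about that.

What is the third parameter? That requires stepping into CImgTask::ReadIntoStream. Analyzing its code shows that the third parameter is the number of bytes still to be fetched, so I named it cbReq. I worked out its role from this fragment: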

while(not_finished)
{
    ...
    cbPos += cbGot;
    cbReq -= cbGot;
    if ( !cbReq )
      goto Finish;
    ...
}

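The third parameter is passed by push eax. Given eax = arg2; eax -= ecx above, it is fairly clear that arg2 is the size of the resource being requested, also named cbReq, and ecx is the size already obtained, named cbGot.

What about the fourth parameter, the other push eax? At that point eax holds arg3 reduced by ecx (the sub eax, ecx comes before the push). Another look at CImgTask::ReadIntoStream shows that arg3 is the minimum amount that must be fetched per call, named cbMin. I worked that out from this fragment: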

...

DoTimeOut:
      if ( CImgTask::IsDwnBindEof(_pBindData) || !cbGot && cbTot >= a4 )
        goto Finish;
...

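The fifth parameter is a pointer to a DWORD. Going by common programming habits, we can tentatively name it DWORD* ccbGot. Still, let's go into CImgTask::ReadIntoStream to confirm.

Wow, it is exactly as expected. Let's name it that way.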

  _ccbGot = ccbGot;
  if ( CImgTask::IsDwnBindEof(this) )
  {
    *ccbGot = 0;

    ……

   *_ccbGot = cbTot;
  return result_1;

Here is the complete CImgTask::ReadIntoStream:

signed int __thiscall CImgTask::ReadIntoStream(CImgTask *this, struct CDwnMemStream *pDwnMemStream, unsigned __int32 cbReq, unsigned __int32 cbMin, unsigned __int32 *ccbGot)
{
  CImgTask *_pBindData; // esi@1
  signed int maxSize; // edi@5
  __int32 result_1; // ebx@6
  unsigned __int32 cbGot; // edi@8
  unsigned __int32 cch; // [sp+Ch] [bp-2014h]@7
  unsigned __int32 cbTot; // [sp+10h] [bp-2010h]@3
  unsigned __int32 *_ccbGot; // [sp+14h] [bp-200Ch]@1
  char Src; // [sp+18h] [bp-2008h]@7

  _pBindData = this;
  _ccbGot = ccbGot;
  if ( CImgTask::IsDwnBindEof(this) )
  {
    *ccbGot = 0;
    CDwnMemStream::FinalizeContent(*((CDwnMemStream **)_pBindData + 301));
    return 0;
  }
  if ( pDwnMemStream != *((struct CDwnMemStream **)_pBindData + 301) )
    return 0x80004005;
  cbTot = 0;
  while ( !*((_DWORD *)_pBindData + 25) )
  {
    maxSize = cbReq;
    if ( cbReq > 0x2000 )
      maxSize = 0x2000;
    result_1 = CDwnMemStream::EnsureCanAppend(*((CDwnMemStream **)_pBindData + 301), maxSize);
    if ( result_1 < 0 )
      goto Finish;
    result_1 = CDwnBindData::Read(*((CDwnBindData **)_pBindData + 23), &Src, maxSize, &cch);
    if ( result_1 < 0 )
      goto Finish;
    cbGot = cch;
    if ( cch )
    {
      result_1 = CDwnMemStream::Append(*((CDwnMemStream **)_pBindData + 301), &Src, cch);
      if ( result_1 < 0 )
        goto Finish;
      cbGot = cch;
    }
    cbTot += cbGot;
    cbReq -= cbGot;
    if ( !cbReq )
      goto Finish;
    if ( cbGot )
    {
      if ( CDwnTask::IsTimeout(_pBindData) )
      {
        cbGot = cch;
        goto DoTimeOut;
      }
    }
    else
    {
DoTimeOut:
      if ( CImgTask::IsDwnBindEof(_pBindData) || !cbGot && cbTot >= cbMin )
        goto Finish;
      if ( Microsoft_IEEnableBits & 2 )
      {
        Template_pt(Microsoft_IEHandle, dword_64751CDC, cbGot);
        cbGot = cch;
      }
      CImgTaskExec::YieldTask(*((CImgTaskExec **)_pBindData + 4), _pBindData, cbGot == 0);
      if ( Microsoft_IEEnableBits & 2 )
        Template_pt(Microsoft_IEHandle, dword_64751CDC, cch);
    }
  }
  result_1 = 0x80004004;
Finish:
  if ( CImgTask::IsDwnBindEof(_pBindData) )
    CDwnMemStream::FinalizeContent(*((CDwnMemStream **)_pBindData + 301));
  *_ccbGot = cbTot;
  return result_1;
}
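Stripped of the stream bookkeeping, tracing, and fiber yielding, the loop in ReadIntoStream is a capped, chunked read. The sketch below is a simplified model under stated assumptions: `mem_source`, `source_read`, and `read_into_stream` are hypothetical names, an in-memory buffer replaces CDwnBindData::Read, and the yield-and-retry path is left out because an in-memory source never has to wait.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_MAX 0x2000u   /* per-iteration cap seen in the listing */

/* Hypothetical in-memory source that hands out at most `burst` bytes per
   call, imitating a download that returns partial data. */
struct mem_source {
    const unsigned char *data;
    size_t len, pos, burst;
};

static size_t source_read(struct mem_source *s, unsigned char *buf, size_t max)
{
    size_t n = s->len - s->pos;
    if (n > s->burst) n = s->burst;
    if (n > max) n = max;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Appends up to cb_req bytes to sink and returns the total appended. */
size_t read_into_stream(struct mem_source *s, unsigned char *sink,
                        size_t cb_req, size_t cb_min)
{
    assert(s && sink && s->burst > 0);
    unsigned char chunk[CHUNK_MAX];     /* plays the role of Src */
    size_t total = 0;
    while (cb_req) {
        size_t want = cb_req > CHUNK_MAX ? CHUNK_MAX : cb_req;
        size_t got = source_read(s, chunk, want);
        memcpy(sink + total, chunk, got);   /* stands in for Append */
        total += got;
        cb_req -= got;
        /* Nothing arrived: stop at end of data or once the minimum is met.
           The original would otherwise yield the task and retry. */
        if (got == 0 && (s->pos == s->len || total >= cb_min))
            break;
    }
    return total;
}
```

The point of the model is the 0x2000 cap: even if the source could deliver more at once, no single iteration copies more than the size of the stack buffer.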

B. Back up one level to CDwnMemStream::Read. After renaming, the complete code is:

unsigned __int32 __thiscall CDwnMemStream::Read(CDwnMemStream *this, void *dataDestination, unsigned __int32 cbReq_1, unsigned __int32 cbMinReq, unsigned __int32 *ccbGot)
{
  int cbReq; // ebx@1
  CDwnMemStream *_this; // esi@1
  unsigned __int32 SubResult; // edi@1
  int someValue; // edx@1
  int v9; // eax@1
  int dwSomeValue; // ecx@2
  int cbRemain; // eax@3
  unsigned __int32 stepSize; // edi@4
  bool v13; // zf@7
  unsigned __int32 v15; // ecx@13
  int pos; // [sp+Ch] [bp-Ch]@1
  unsigned __int32 ReadResult; // [sp+10h] [bp-8h]@6
  int cbFullReqValue; // [sp+14h] [bp-4h]@3

  cbReq = cbReq_1;
  _this = this;
  SubResult = 0;
  someValue = *((_DWORD *)this + 3);
  v9 = *((_DWORD *)this + 5) << 14;
  pos = someValue + v9;
  if ( *((_BYTE *)this + 24)
    || (v15 = *((_DWORD *)this + 8) - someValue - v9, cbReq_1 <= v15)
    || cbMinReq <= v15
    || (ReadResult = 0,
        SubResult = CImgTask::ReadIntoStream(
                      *((CImgTask **)_this + 9),
                      _this,
                      cbReq_1 - v15,
                      cbMinReq - v15,
                      &ReadResult),
        (SubResult & 0x80000000) == 0) )
  {
    dwSomeValue = *((_DWORD *)_this + 5) << 14;
    if ( cbReq_1 >= *((_DWORD *)_this + 8) - dwSomeValue - *((_DWORD *)_this + 3) )
      cbReq = *((_DWORD *)_this + 8) - dwSomeValue - *((_DWORD *)_this + 3);
    cbRemain = cbReq;
    cbFullReqValue = cbReq;
    if ( cbReq )
    {
      while ( 1 )
      {
        stepSize = 0x4000 - *((_DWORD *)_this + 3);
        if ( cbRemain < stepSize )
          stepSize = cbRemain;
        ReadResult = CMemPage::Read(
                       *(CMemPage **)(*(_DWORD *)(*((_DWORD *)_this + 7) + 4) + 4 * *((_DWORD *)_this + 5)),
                       dataDestination,
                       *((_DWORD *)_this + 3),
                       stepSize);
        if ( (ReadResult & 0x80000000) != 0 )   // FAILED(...), actually a macro
          break;
        v13 = (((_WORD)stepSize + (unsigned __int16)*((_DWORD *)_this + 3)) & 0x3FFF) == 0;
        *((_DWORD *)_this + 3) = ((_WORD)stepSize + (unsigned __int16)*((_DWORD *)_this + 3)) & 0x3FFF;
        if ( v13 )
          ++*((_DWORD *)_this + 5);
        dataDestination = (char *)dataDestination + stepSize;
        cbRemain = cbFullReqValue - stepSize;
        cbFullReqValue = cbRemain;
        if ( !cbRemain )
        {
          SubResult = ReadResult;
          goto Exit;
        }
      }
      CDwnMemStream::SetReadPosition(_this, pos);
      SubResult = ReadResult;
    }
    else
    {
Exit:
      if ( ccbGot )
        *ccbGot = cbReq;
    }
  }
  return SubResult;
}

C. We now have a basic understanding of what this function does: it reads from the stream one page-sized chunk at a time.
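The copy loop can be modeled with a small paged reader. This is a sketch, not the MSHTML layout: `paged_stream` and its field names are hypothetical, and only the arithmetic (0x4000-byte pages, position = page index << 14 plus offset, wrap with & 0x3FFF) is taken from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 0x4000u   /* from the `<< 14` and `& 0x3FFF` in the listing */

/* Hypothetical stand-in for the stream state. */
struct paged_stream {
    unsigned char **pages;  /* array of page buffers */
    size_t total;           /* bytes stored so far   (this+0x20) */
    size_t page_index;      /* current page          (this+0x14) */
    size_t page_offset;     /* offset inside page    (this+0x0C) */
};

/* Copies up to `want` bytes to dst and returns the number copied. */
size_t paged_read(struct paged_stream *s, void *dst, size_t want)
{
    assert(s && dst);
    size_t pos = s->page_index * PAGE_SIZE + s->page_offset;
    size_t avail = s->total - pos;
    if (want > avail)
        want = avail;                       /* clamp to what is buffered */
    unsigned char *out = dst;
    size_t left = want;
    while (left) {
        size_t step = PAGE_SIZE - s->page_offset;   /* rest of this page */
        if (left < step)
            step = left;
        memcpy(out, s->pages[s->page_index] + s->page_offset, step);
        s->page_offset = (s->page_offset + step) & (PAGE_SIZE - 1);
        if (s->page_offset == 0)            /* wrapped: move to next page */
            s->page_index++;
        out += step;
        left -= step;
    }
    return want;
}
```

A read that starts 4 bytes before a page boundary is therefore split into a 4-byte copy from the first page and a second copy from the start of the next page.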


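So the first part of Decode is now understood. Look at what follows the Read: two MulDiv calls compute the width and height, then OnSize is triggered, and finally ReadImage is entered. Let's look at ReadImage.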

int __thiscall CImgTaskWmf::Decode(CImgTaskWmf *this, int *a2)
{
  CImgTaskWmf *pThis; // esi@1
  int v3; // eax@3
  int v4; // ecx@3
  int v5; // ST04_4@3
  int v6; // ST00_4@3
  int result; // eax@3
  int bytesRead; // [sp+4h] [bp-20h]@1
  char v9; // [sp+8h] [bp-1Ch]@1
  __int16 v10; // [sp+Eh] [bp-16h]@3
  __int16 v11; // [sp+10h] [bp-14h]@3
  __int16 v12; // [sp+12h] [bp-12h]@3
  __int16 v13; // [sp+14h] [bp-10h]@3
  unsigned __int16 v14; // [sp+16h] [bp-Eh]@3

  bytesRead = 0;
  *a2 = 1;
  pThis = this;
  if ( CImgTask::Read(this, &v9, 0x16u, (unsigned __int32 *)&bytesRead, (unsigned __int32)this) < 0 || bytesRead != 0x16 )// read failed, or the byte count read is not 0x16
  {
    result = 0x80004005;
  }
  else
  {
    v3 = MulDiv(v12 - v10, 0x60, v14);
    v4 = v13;
    *((_DWORD *)pThis + 29) = abs(v3);
    v5 = abs(MulDiv(v4 - v11, 0x60, v14));
    v6 = *((_DWORD *)pThis + 29);
    *((_DWORD *)pThis + 30) = v5;
    CImgTask::OnSize(pThis, v6, v5, 0, -1);
    result = CImgTaskWmf::ReadImage(pThis);
  }
  return result;
}
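The 0x16-byte read and the two MulDiv calls fit the 22-byte placeable WMF header: the stack offsets of v10..v13 match the bounding box at bytes 6..13 and v14 matches the units-per-inch field at byte 14. The sketch below reproduces the size computation under that assumption. `placeable_size_px` is a hypothetical name, and `muldiv_round` is only a portable approximation of Win32 MulDiv (rounded a*b/c, -1 for a zero divisor).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounded (a * b) / c; returns -1 when c is 0, as MulDiv is documented to. */
static int muldiv_round(int a, int b, int c)
{
    if (c == 0)
        return -1;
    int64_t prod = (int64_t)a * (int64_t)b;
    int negative = (prod < 0) != (c < 0);
    uint64_t n = (uint64_t)(prod < 0 ? -prod : prod);
    uint64_t d = (uint64_t)(c < 0 ? -(int64_t)c : (int64_t)c);
    uint64_t q = (n + d / 2) / d;           /* round half away from zero */
    return negative ? -(int)q : (int)q;
}

static int rd_i16(const uint8_t *p) { return (int16_t)(uint16_t)(p[0] | (p[1] << 8)); }
static int rd_u16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

/* Computes the pixel size at 96 DPI from a 22-byte placeable header. */
int placeable_size_px(const uint8_t hdr[22], int *width, int *height)
{
    assert(hdr && width && height);
    int left   = rd_i16(hdr + 6);    /* v10 */
    int top    = rd_i16(hdr + 8);    /* v11 */
    int right  = rd_i16(hdr + 10);   /* v12 */
    int bottom = rd_i16(hdr + 12);   /* v13 */
    int inch   = rd_u16(hdr + 14);   /* v14, unsigned in the listing */
    *width  = abs(muldiv_round(right - left, 96, inch));
    *height = abs(muldiv_round(bottom - top, 96, inch));
    return 0;
}
```

One detail worth noting: with a zero units-per-inch field, MulDiv returns -1, so the decompiled code would end up with abs(-1) = 1 for both dimensions rather than failing.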

ReadImage also starts with a complicated if condition, so take it apart the same way. Since the function is a thiscall:

I.

mov     esi, ecx                        ;esi = `this`
lea     eax, [ebp+bytesRead]            ;eax = &bytesRead
xor     ecx, ecx            ;ecx = 0
mov     [ebp+var_438], esi      ;var_438 = `this`
push    ecx
push    eax
push    12h
lea     eax, [ebp+Src]          ;eax = &Src
mov     [ebp+var_428], ecx      ;var_428 = 0
mov     [ebp+var_424], ecx      ;var_424 = 0
mov     edi, ecx            ;edi = 0
mov     [ebp+hmf], ecx          ;hmf = 0, handle to the metafile
mov     [ebp+h], ecx            ;h = 0
mov     [ebp+bytesRead], ecx        ;bytesRead = 0
mov     ecx, esi            ;ecx = `this`
push    eax             ;push &Src
call    ?Read@CImgTask@@IAEJPAXKPAKK@Z ; CImgTask::Read(void *,ulong,ulong *,ulong)
test    eax, eax            ;if (eax < 0), i.e. if (FAILED(eax))
js      loc_64151BA6

The four parameters of CImgTask::Read that matter here are this (ecx), buf, cbMin, and cbGot. We already analyzed this earlier.

__int32 __thiscall CImgTask::Read(CImgTask *this, char *buf, unsigned int cbMin, unsigned __int32 *_cbGot)
{
  CDwnMemStream *v5; // ecx@1
  __int32 result; // eax@1
  unsigned __int32 cbGot; // [sp+4h] [bp-4h]@1

  v5 = (CDwnMemStream *)*((_DWORD *)this + 301);
  cbGot = 0;
  result = CDwnMemStream::Read(v5, buf, cbMin, cbMin, &cbGot);
  if ( _cbGot )
    *_cbGot = cbGot;
  return result;
}

Here is the code above again, annotated with those names.

mov     esi, ecx                        ;esi = `this`
lea     eax, [ebp+bytesRead]            ;eax = &bytesRead
xor     ecx, ecx            ;ecx = 0
mov     [ebp+var_438], esi      ;var_438 = `this`
push    ecx             ;last argument: 0
push    eax             ;&bytesRead --> cbGot
push    12h             ;cbMin(cbReq)
lea     eax, [ebp+Src]          ;eax = &Src --> buffer
mov     [ebp+var_428], ecx      ;var_428 = 0
mov     [ebp+var_424], ecx      ;var_424 = 0
mov     edi, ecx            ;edi = 0
mov     [ebp+hmf], ecx          ;hmf = 0, handle to the metafile
mov     [ebp+h], ecx            ;h = 0
mov     [ebp+bytesRead], ecx        ;bytesRead = 0
mov     ecx, esi            ;ecx = `this`
push    eax             ;push &Src --> buffer (first argument, pushed last)
call    ?Read@CImgTask@@IAEJPAXKPAKK@Z ; CImgTask::Read(void *,ulong,ulong *,ulong)
test    eax, eax            ;if (eax < 0), i.e. if (FAILED(eax))
js      loc_64151BA6

II.

(false)
cmp     [ebp+bytesRead], 12h
jnz     loc_64151BA6

This checks whether the number of bytes read is 0x12, which is the length of a header. If it is not, something is definitely wrong.

III.

(false)
mov     ecx, [ebp+var_412]
xor     eax, eax
shld    eax, ecx, 1
add     ecx, ecx
push    eax
push    ecx
lea     ecx, [ebp+bufSize]
call    _ULongLongToULong@12 ; ULongLongToULong(x,x,x)
test    eax, eax
js      loc_64151BA6

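The 32-bit value at var_412 is doubled into a 64-bit quantity (the shld/add pair), then converted from ULONGLONG to ULONG into bufSize; the code checks that the conversion succeeded, that is, that no overflow occurred.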

IV.

(false)
mov     eax, [ebp+bufSize]
cmp     eax, 12h
jb      loc_64151BA6

Check whether bufSize is < 0x12.
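Steps III and IV together form a guarded size computation. A minimal sketch, assuming the value at var_412 is the header's size field expressed in 16-bit words (the doubling suggests this, but the listing does not prove it); `wmf_total_bytes` is a hypothetical name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WMF_HEADER_BYTES 0x12u

/* Converts a word count to a byte count, rejecting values that overflow
   32 bits or that are smaller than the header itself. */
bool wmf_total_bytes(uint32_t size_in_words, uint32_t *out_bytes)
{
    assert(out_bytes);
    uint64_t bytes = (uint64_t)size_in_words * 2;   /* shld/add pair */
    if (bytes > UINT32_MAX)                         /* ULongLongToULong fails */
        return false;
    if (bytes < WMF_HEADER_BYTES)                   /* cmp eax, 12h / jb */
        return false;
    *out_bytes = (uint32_t)bytes;
    return true;
}
```

Doing the multiplication in 64 bits first is what keeps a size field such as 0x80000000 from wrapping to 0 and passing a later check.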

V.

(false)
mov     ecx, _g_hProcessHeap
mov     edx, eax
call    ??$HeapAllocClear@$00@MemoryProtection@@YGPAXPAXI@Z ; MemoryProtection::HeapAllocClear<1>(void *,uint)
mov     ebx, eax
test    ebx, ebx
jz      loc_64151BA6

If bufSize is not less than 0x12, allocate a block of memory from the process heap and check whether the allocation succeeded.

Author: blast
Source: nul.pw/2017/04/18/218.html

VI.

(false)
push    12h             ; MaxCount
lea     eax, [ebp+buf]
push    eax             ; Src
push    [ebp+bufSize]   ; DstSize
push    ebx             ; Dst
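call    ds:__imp__memcpy_s ; copies the header bytes that were read into the buffer; not sure whether this can go wrong, worth another look

Processing of the metafile header ends here. Next comes the data part.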

mov     eax, [ebp+bufSize]
lea     ecx, [ebp+bytesRead]
and     [ebp+bytesRead], edi
add     esp, 0Ch
add     eax, 0FFFFFFEEh    ; effectively -0x12: drop the header just read, then Read the remainder in one more call
mov     [ebp+var_42C], eax 
push    ecx
push    eax
lea     eax, [ebx+12h]
mov     ecx, esi
push    eax
call    ?Read@CImgTask@@IAEJPAXKPAKK@Z ; CImgTask::Read(void *,ulong,ulong *,ulong)
test    eax, eax
js      loc_64151B60

What follows is the logic that handles the metafile itself.

Cleaned up, this stretch of logic is:

 if ( CImgTask::Read(this, &buf, 0x12u, (unsigned __int32 *)&bytesRead, 0) < 0// Read failed
    || bytesRead != 0x12                        // did not read exactly 0x12 bytes
    || ULongLongToULong(2 * v32, (unsigned __int64)v32 >> 31) < 0// ULONGLONG -> ULONG, fails on integer overflow
    || bufSize < 0x12                           // size in the header is invalid
    || (pMem = MemoryProtection::HeapAllocClear<1>(g_hProcessHeap, bufSize)) == 0 )// heap allocation failed
  {
    result_1 = 0x80004005;                      // any failure makes this function return 0x80004005
    goto LABEL_28;
  }
  _memcpy_s((void *)pMem, bufSize, &buf, 0x12u);
  bytesRead = 0;
  remaining = bufSize - 0x12;
  if ( CImgTask::Read(v1, (char *)(pMem + 18), bufSize - 0x12, (unsigned __int32 *)&bytesRead, (unsigned __int32)v19) < 0
    || bytesRead != remaining )
  {
    result_1 = 0x80004005;
    goto LABEL_19;
  }
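Put together, the loading sequence is: read the 0x12-byte header, validate the declared size, allocate a zeroed buffer, copy the header in, then read the rest. The sketch below models it against an in-memory source. `load_metafile`, `src`, and `src_len` are hypothetical, calloc stands in for MemoryProtection::HeapAllocClear, and the size field is assumed to be the little-endian 32-bit word count at offset 6 of the header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDR_BYTES 0x12u

/* Returns a heap copy of the metafile (caller frees), or NULL on failure. */
unsigned char *load_metafile(const unsigned char *src, size_t src_len,
                             uint32_t *out_len)
{
    assert(src && out_len);
    if (src_len < HDR_BYTES)            /* first Read must yield 0x12 bytes */
        return NULL;
    /* Assumed: size field at offset 6, counted in 16-bit words. */
    uint32_t words = (uint32_t)src[6] | (uint32_t)src[7] << 8 |
                     (uint32_t)src[8] << 16 | (uint32_t)src[9] << 24;
    uint64_t total = (uint64_t)words * 2;
    if (total > UINT32_MAX || total < HDR_BYTES)
        return NULL;
    if (src_len < total)                /* second Read would come up short */
        return NULL;
    unsigned char *mem = calloc(1, (size_t)total);
    if (!mem)
        return NULL;
    memcpy(mem, src, HDR_BYTES);                        /* header */
    memcpy(mem + HDR_BYTES, src + HDR_BYTES,
           (size_t)total - HDR_BYTES);                  /* remaining records */
    *out_len = (uint32_t)total;
    return mem;
}
```

Because the destination is sized from the validated total and the first copy is a fixed 0x12 bytes, the header copy cannot overrun the allocation; that matches the conclusion that this memcpy_s is not a problem.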

Next, the metafile is drawn. The pseudocode follows, and it is another long stretch.

 hmf = SetMetaFileBitsEx(bufSize, (const BYTE *)pMem);
  if ( hmf )
  {
    MemoryProtection::HeapFree(v20, v21, v22);
    pMem = 0;
    bufSize = 0;
    v5 = (struct ColorPalette **)TSmartPointer<ColorPalette>::operator&(&bufSize);
    if ( ColorPaletteInternal::GetColorPalette(v6, v5) < 0 )
    {
      result_1 = 0x80004005;
      TSmartPointer<ColorPalette>::~TSmartPointer<ColorPalette>(&bufSize);
      goto LABEL_23;
    }
    v7 = (const void *)(bufSize + 1036);
    CopyColorsFromPaletteEntries((struct tagRGBQUAD *)0x100, (const struct tagPALETTEENTRY *)v20, (unsigned int)v21);
    v8 = v23;
    _memcpy_s((char *)v23 + 152, 0x400u, v7, 0x400u);
    TSmartPointer<ColorPalette>::~TSmartPointer<ColorPalette>(&bufSize);
    LOBYTE(remaining) = *((_BYTE *)v23 + 112);
    v19 = (void *)TSmartPointer<IWICProgressiveLevelControl>::operator&(&v27);
    v9 = TSmartPointer<CDCompLayer>::operator&(&v28);
    LOBYTE(v10) = 1;
    result_1 = CImgCacheEntry::Create(
                 v10,
                 8,
                 *((_DWORD *)v8 + 29),
                 *((_DWORD *)v8 + 30),
                 &v33,
                 256,
                 1,
                 255,
                 0,
                 remaining,
                 0,
                 0,
                 v9,
                 v19);
    if ( result_1 < 0 )
    {
LABEL_23:
      if ( hmf )
        DeleteMetaFile(hmf);
      goto LABEL_25;
    }
    v11 = v28;
    *((_DWORD *)v23 + 33) = 255;
    *((_DWORD *)v8 + 32) = 1;
    v12 = (int)v11 + 48;
    v19 = (void *)(*((_DWORD *)v8 + 30) * CImgBits::CbLine(v11));
    v18 = *((_BYTE *)v8 + 132);
    v13 = *(_DWORD *)v12;
    v14 = *(int (**)(void))(*(_DWORD *)v12 + 16);
    __guard_check_icall_fptr(*(_DWORD *)(v13 + 16));
    v15 = (void *)v14();
    if ( &v18 != &v18 )
      __fastfail(4u);
    memset(v15, v18, (size_t)v19);
    v2 = GetMemoryDC();
    if ( !v2 )
    {
      result_1 = 0x80004005;
      goto LABEL_23;
    }
    v19 = (void *)*((_DWORD *)v28 + 19);
    *((_DWORD *)v28 + 23) |= 2u;
    h = SelectObject(v2, v19);
    SaveDC(v2);
    SetMapMode(v2, 8);
    v16 = v23;
    v19 = 0;
    SetViewportExtEx(v2, *((_DWORD *)v23 + 29), *((_DWORD *)v23 + 30), 0);
    PlayMetaFile(v2, hmf);
    RestoreDC(v2, -1);
    TSmartPointer<IWICBitmapSource>::operator=<IWICFormatConverter>(&v27);
    *((_DWORD *)v16 + 31) = -1;
    LogSqmIncrement(0x7Au, 1u);
    result_1 = 0;
LABEL_19:
    if ( h )
      SelectObject(v2, h);
    if ( v2 )
      DeleteDC(v2);
    goto LABEL_23;
  }

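I am going to set half of this code aside for now, because those calls are APIs provided by GDI32.DLL. If I find the time, I will trace the GDI32.DLL implementation later.

That part of the code is: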

hmf = SetMetaFileBitsEx(dwSize, pBuf); 
if (!hmf) ...

CopyColorsFromPaletteEntries(arg, globalArg, 256); 
memcpy(...); 

// create memory dc and rendering meta file into this bitmap.     
hdc = GetMemoryDC(); 

hOld = SelectObject(hdc, ...); 
SaveDC(hdc); 

SetMapMode(hdc, MM_ANISOTROPIC); 
SetViewportExtEx(hdc, width, height, NULL); 
PlayMetaFile(hdc, hmf); 
RestoreDC(hdc, -1); 

We will skip the calls here that have no "..." for now and focus on a few things:

I.

dwhcb = (void *)(*((_DWORD *)v8 + 30) * CImgBits::CbLine(v11));// height * cbLine actually..
dwTrans = *((_BYTE *)v8 + 132);
v13 = *(_DWORD *)v12;
getBits = *(int (**)(void))(*(_DWORD *)v12 + 16);
__guard_check_icall_fptr(*(_DWORD *)(v13 + 16));
bits = (void *)getBits();
if ( &dwTrans != &dwTrans )
  __fastfail(4u);
memset(bits, dwTrans, (size_t)dwhcb);
hdc = GetMemoryDC();
if ( !hdc )
{
  result_1 = 0x80004005;
  goto LABEL_23;
}

In this part, dwTrans is puzzling at first, but the fastfail check settles it: if ( &dwTrans != &dwTrans ) __fastfail(4u);. Not exploitable.

II. Before exit

LABEL_25:
  if ( pMem )
    MemoryProtection::HeapFree(ghalfTone_ape, val256, v22);
LABEL_28:
  TSmartPointer<CImgBits>::~TSmartPointer<CImgBits>(&v28);
  TSmartPointer<CImgCacheEntry>::~TSmartPointer<CImgCacheEntry>(&v27);
  return result_1;
}

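Looks quite safe.

With that, we have gone through everything on the MSHTML.DLL side. The handling is indeed safe, and this round of reversing did not turn up anything usable. If I find more time, I will look at the implementation inside GDI32.DLL.

03 - Automating the extraction of dump information

Another week has passed and, honestly, I forgot about this again, so I am filling it in over the weekend. Once you know how to pull URI information out of a dump file, the rest is just automation. Archive the dumps under a directory (work_path), then use a batch file, or any language or script you like, to invoke cdb and save the parsed output to txt files.

Without further ado, since I like getting straight to the point, here is the batch file. If you want to use it, you will surely know how to adapt it to your own machine. If you do not, you are probably better off not hunting for vulnerabilities :).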

@echo off 
set work_path=E:\wowowowow\dump
E: 
cd %work_path% 
for /R %%s in (*.dmp) do ( 
    echo "*************************"
    echo NOW PROCESSING: %%s
    echo "*************************"    
    echo  .ecxr;k;.foreach^(place ^{s -^[1^]u 0 L^?80000000 ^"http^"^}^)^{du ^/c100 ^$^{place^}^}| "C:\Program Files (x86)\Windows Kits\8.1\Debuggers\x86\cdb.exe" -i srv*E:\symbols\*http://msdl.microsoft.com/download/symbols -z %%s >> %%s.out.txt
) 
echo "DONE."

After some waiting, the output looks like this:

outp1.png

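This is not archaeology, so frankly the dmp files can be deleted at this point. Use a tool that can preview document contents, such as FileLocator Pro (a Linux alternative is searchmonkey), to search "*.txt" for "://", "http://", or anything else of interest, and reconstruct the piece of history preserved in the dump files.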
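If no search tool is at hand, the same filtering is easy to script. The helper below is a hypothetical sketch, not part of the original workflow: it assumes each `du` result line has the form address, whitespace, then a double-quoted string, and it keeps only strings that contain "://".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the quoted string of one output line into out.
   Returns 1 if a quoted string containing "://" was found, else 0. */
int extract_url(const char *line, char *out, size_t out_size)
{
    assert(line && out && out_size > 0);
    const char *start = strchr(line, '"');
    if (!start)
        return 0;
    start++;                                /* skip the opening quote */
    const char *end = strrchr(start, '"');  /* closing quote */
    if (!end)
        return 0;
    size_t len = (size_t)(end - start);
    if (len >= out_size)                    /* would not fit with the NUL */
        return 0;
    memcpy(out, start, len);
    out[len] = '\0';
    return strstr(out, "://") != NULL;
}
```

Feeding every line of the *.out.txt files through such a function and printing the hits gives a plain list of URLs to skim.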

Searching for "WAV" yields this pair of URLs:

00234824  "http://DOGE/main.php?DOGEDOGE&A=ListView"
051239e8  "http://DOGE/play.php?file=DOGEDOGEDOGEDOGEDOGEDOGE-015800405922.WAV"

outp1.png

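From what remains in the URL, we can conclude that a WAV file was being previewed in a web page when the crash occurred, and the crash location looks promising too: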

0:006> Unable to load image C:\WINDOWS\system32\wmp.dll, Win32 error 0n2
*** WARNING: Unable to verify timestamp for wmp.dll
*** ERROR: Module load completed but symbols could not be loaded for wmp.dll
eax=00000001 ebx=00000000 ecx=00000000 edx=003e953c esi=00000000 edi=00000001
eip=082a7e7f esp=03b9f950 ebp=03b9f960 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
wmp+0x37e7f:
082a7e7f ??              ???
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr  
WARNING: Stack unwind information not available. Following frames may be wrong.
03b9f94c 06b1ded0 wmp+0x37e7f

From here, just save the file and try to reproduce the crash.

(Author: blast; end of CH 03)