
Inside the Executable: An Introduction to the Portable Executable Format for VB Programmers


7 May 2003

CPOL


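Describes the layout of a Windows executable file and how to read it.

Introduction

The Portable Executable (PE) format is the data structure that describes how the parts of a Win32 executable file fit together. It allows the operating system to load the executable, to locate the dynamic link libraries it needs in order to run, and to navigate the code, data and resource sections compiled into it.

Getting past DOS

The PE format was created for Windows, but Microsoft had to make sure that running such an executable under DOS produced a meaningful error message and then exited. To achieve this, the very start of a Windows executable is in fact a DOS executable (sometimes known as the stub) that displays "This program requires Windows" or a similar message and then exits.

The DOS stub begins with a header in the following format: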

Private Type IMAGE_DOS_HEADER
    e_magic As Integer   ''\\ Magic number
    e_cblp As Integer    ''\\ Bytes on last page of file
    e_cp As Integer      ''\\ Pages in file
    e_crlc As Integer    ''\\ Relocations
    e_cparhdr As Integer ''\\ Size of header in paragraphs
    e_minalloc As Integer ''\\ Minimum extra paragraphs needed
    e_maxalloc As Integer ''\\ Maximum extra paragraphs needed
    e_ss As Integer    ''\\ Initial (relative) SS value
    e_sp As Integer    ''\\ Initial SP value
    e_csum As Integer  ''\\ Checksum
    e_ip As Integer  ''\\ Initial IP value
    e_cs As Integer  ''\\ Initial (relative) CS value
    e_lfarlc As Integer ''\\ File address of relocation table
    e_ovno As Integer ''\\ Overlay number
    e_res(0 To 3) As Integer ''\\ Reserved words
    e_oemid As Integer ''\\ OEM identifier (for e_oeminfo)
    e_oeminfo As Integer ''\\ OEM information; e_oemid specific
    e_res2(0 To 9) As Integer ''\\ Reserved words
    e_lfanew As Long ''\\ File address of new exe header
End Type

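Apart from the MZ signature in e_magic, the only field in this structure that matters to Windows is e_lfanew, which is the file pointer to the new Windows executable header. To skip over the DOS part of the program, set the file pointer to the value held in this field: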

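Private Sub SkipDOSStub(ByVal hfile As Long)

    '\\ SetFilePointer, ReadFileLong, FILE_BEGIN and LastSystemError are
    '\\ assumed to be declared elsewhere (they are not shown in this article)
    Dim BytesRead As Long
    Dim stub As IMAGE_DOS_HEADER

    '\\ Go to start of file...
    Call SetFilePointer(hfile, 0, 0, FILE_BEGIN)
    If Err.LastDllError Then
        Debug.Print LastSystemError
    End If

    '\\ ...read the DOS header...
    Call ReadFileLong(hfile, VarPtr(stub), Len(stub), BytesRead, ByVal 0&)
    '\\ ...and move to the new executable header that it points to
    Call SetFilePointer(hfile, stub.e_lfanew, 0, FILE_BEGIN)

End Sub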
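
The same step can also be sketched with VB's native binary file I/O rather than the API calls used above. This is only an illustration: the function name GetPEHeaderOffset is hypothetical, and it relies on e_lfanew sitting at byte offset &H3C of the header (which follows from the field sizes in IMAGE_DOS_HEADER). It also checks the MZ signature before trusting the offset.

```vb
'\\ Hypothetical helper (not part of the original article's code):
'\\ returns the file offset of the new executable header, or -1 if the
'\\ file does not start with a DOS "MZ" header.
Public Function GetPEHeaderOffset(ByVal sFilename As String) As Long

    Const IMAGE_DOS_SIGNATURE As Integer = &H5A4D  '\\ "MZ"
    Const OFFSET_E_LFANEW As Long = &H3C           '\\ byte offset of e_lfanew

    Dim hFile As Integer
    Dim nMagic As Integer
    Dim lOffset As Long

    GetPEHeaderOffset = -1

    hFile = FreeFile
    Open sFilename For Binary Access Read As #hFile
    '\\ The file must be big enough to hold the whole DOS header
    If LOF(hFile) >= OFFSET_E_LFANEW + 4 Then
        '\\ Get uses 1-based byte positions, hence the "+ 1" below
        Get #hFile, 1, nMagic
        If nMagic = IMAGE_DOS_SIGNATURE Then
            Get #hFile, OFFSET_E_LFANEW + 1, lOffset
            GetPEHeaderOffset = lOffset
        End If
    End If
    Close #hFile

End Function
```

A caller would treat a return value of -1 as "not an executable" and otherwise seek to the returned offset to read the PE signature.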

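The NT header

The NT header holds the information that the Windows program loader needs in order to load the program. It consists of the PE file signature followed by an IMAGE_FILE_HEADER record and an IMAGE_OPTIONAL_HEADER record.

For applications designed to run under Windows (that is, not OS/2 or VxD files), the four bytes of the PE file signature should equal &H4550. The other defined signatures are: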

Public Enum ImageSignatureTypes
    IMAGE_DOS_SIGNATURE = &H5A4D     ''\\ MZ
    IMAGE_OS2_SIGNATURE = &H454E     ''\\ NE
    IMAGE_OS2_SIGNATURE_LE = &H454C  ''\\ LE
    IMAGE_VXD_SIGNATURE = &H454C     ''\\ LE
    IMAGE_NT_SIGNATURE = &H4550      ''\\ PE00
End Enum

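The PE file signature is followed by the IMAGE_FILE_HEADER structure (the first part of IMAGE_NT_HEADERS), which stores information about the target environment of the executable. The structure is: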

Private Type IMAGE_FILE_HEADER
    Machine As Integer
    NumberOfSections As Integer
    TimeDateStamp As Long
    PointerToSymbolTable As Long
    NumberOfSymbols As Long
    SizeOfOptionalHeader As Integer
    Characteristics As Integer
End Type

The Machine member describes the CPU that the executable was compiled for. It can be one of:

Public Enum ImageMachineTypes
    IMAGE_FILE_MACHINE_I386 = &H14C   ''\\ Intel 386.
    ''\\ MIPS little-endian (&H160 is big-endian)
    IMAGE_FILE_MACHINE_R3000 = &H162
    IMAGE_FILE_MACHINE_R4000 = &H166  ''\\ MIPS little-endian
    IMAGE_FILE_MACHINE_R10000 = &H168  ''\\ MIPS little-endian
    IMAGE_FILE_MACHINE_WCEMIPSV2 = &H169  ''\\ MIPS little-endian WCE v2
    IMAGE_FILE_MACHINE_ALPHA = &H184      ''\\ Alpha_AXP
    IMAGE_FILE_MACHINE_POWERPC = &H1F0    ''\\ IBM PowerPC Little-Endian
    IMAGE_FILE_MACHINE_SH3 = &H1A2   ''\\ SH3 little-endian
    IMAGE_FILE_MACHINE_SH3E = &H1A4  ''\\ SH3E little-endian
    IMAGE_FILE_MACHINE_SH4 = &H1A6   ''\\ SH4 little-endian
    IMAGE_FILE_MACHINE_ARM = &H1C0   ''\\ ARM Little-Endian
    IMAGE_FILE_MACHINE_IA64 = &H200  ''\\ Intel 64
End Enum
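
As a small illustration of how the Machine value might be used, the sketch below turns it into a readable description. The function name MachineName is hypothetical; the numeric values are simply those of the enumeration above, written out as literals so that the snippet stands on its own.

```vb
'\\ Hypothetical helper: describes the Machine field of IMAGE_FILE_HEADER
Public Function MachineName(ByVal nMachine As Integer) As String

    Select Case nMachine
        Case &H14C
            MachineName = "Intel 386"
        Case &H162, &H166, &H168, &H169
            MachineName = "MIPS"
        Case &H184
            MachineName = "Alpha AXP"
        Case &H1F0
            MachineName = "PowerPC"
        Case &H1A2, &H1A4, &H1A6
            MachineName = "Hitachi SH"
        Case &H1C0
            MachineName = "ARM"
        Case &H200
            MachineName = "Intel IA64"
        Case Else
            '\\ Values not in the list above are reported in hex
            MachineName = "Unknown (&H" & Hex$(nMachine) & ")"
    End Select

End Function
```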

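The SizeOfOptionalHeader member gives the size, in bytes, of the IMAGE_OPTIONAL_HEADER structure that immediately follows. In practice this structure is not optional (an executable image always has one), so the name is something of a misnomer. The structure is defined as: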

Private Type IMAGE_OPTIONAL_HEADER
    Magic As Integer
    MajorLinkerVersion As Byte
    MinorLinkerVersion As Byte
    SizeOfCode As Long
    SizeOfInitializedData As Long
    SizeOfUninitializedData As Long
    AddressOfEntryPoint As Long
    BaseOfCode As Long
    BaseOfData As Long
End Type

This is immediately followed by the IMAGE_OPTIONAL_HEADER_NT structure:

Private Type IMAGE_OPTIONAL_HEADER_NT
    ImageBase As Long
    SectionAlignment As Long
    FileAlignment As Long
    MajorOperatingSystemVersion As Integer
    MinorOperatingSystemVersion As Integer
    MajorImageVersion As Integer
    MinorImageVersion As Integer
    MajorSubsystemVersion As Integer
    MinorSubsystemVersion As Integer
    Win32VersionValue As Long
    SizeOfImage As Long
    SizeOfHeaders As Long
    CheckSum As Long
    Subsystem As Integer
    DllCharacteristics As Integer
    SizeOfStackReserve As Long
    SizeOfStackCommit As Long
    SizeOfHeapReserve As Long
    SizeOfHeapCommit As Long
    LoaderFlags As Long
    NumberOfRvaAndSizes As Long
    DataDirectory(0 To 15) As IMAGE_DATA_DIRECTORY
End Type

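The most useful fields in this structure (to me, at least) are the 16 IMAGE_DATA_DIRECTORY entries. These describe where particular parts of the executable are located, if they are present. The structure is defined as: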

Private Type IMAGE_DATA_DIRECTORY
    VirtualAddress As Long
    Size As Long
End Type

The directories are held in the following order:

Public Enum ImageDataDirectoryIndexes
    IMAGE_DIRECTORY_ENTRY_EXPORT = 0  ''\\ Export Directory
    IMAGE_DIRECTORY_ENTRY_IMPORT = 1  ''\\ Import Directory
    IMAGE_DIRECTORY_ENTRY_RESOURCE = 2 ''\\ Resource Directory
    IMAGE_DIRECTORY_ENTRY_EXCEPTION = 3   ''\\ Exception Directory
    IMAGE_DIRECTORY_ENTRY_SECURITY = 4   ''\\ Security Directory
    IMAGE_DIRECTORY_ENTRY_BASERELOC = 5  ''\\ Base Relocation Table
    IMAGE_DIRECTORY_ENTRY_DEBUG = 6   ''\\ Debug Directory
    IMAGE_DIRECTORY_ENTRY_ARCHITECTURE = 7   ''\\ Architecture Specific Data
    IMAGE_DIRECTORY_ENTRY_GLOBALPTR = 8  ''\\ RVA of GP
    IMAGE_DIRECTORY_ENTRY_TLS = 9  ''\\ TLS Directory
    ''\\ Load Configuration Directory
    IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG = 10    
    ''\\ Bound Import Directory in headers
    IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT = 11   
    IMAGE_DIRECTORY_ENTRY_IAT = 12  ''\\ Import Address Table
    ''\\ Delay Load Import Descriptors
    IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT = 13   
End Enum

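Note that if an executable does not contain one of these parts (as is often the case), it still has an IMAGE_DATA_DIRECTORY entry for it, but the address and size are both zero.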
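
Putting the header sizes together, the position of any data directory entry in the file can be worked out by simple arithmetic. The sketch below assumes a 32-bit image laid out exactly as in the structures above: a 4-byte signature, a 20-byte IMAGE_FILE_HEADER, then 96 bytes of optional header fields (28 in IMAGE_OPTIONAL_HEADER plus 68 in IMAGE_OPTIONAL_HEADER_NT) before the DataDirectory array. The names DataDirectoryOffset and ReadDataDirectory are hypothetical, and lPEOffset is the value of e_lfanew.

```vb
'\\ Sketch only: assumes a 32-bit image laid out as described in the text
Private Const SIZEOF_PE_SIGNATURE As Long = 4
Private Const SIZEOF_FILE_HEADER As Long = 20     '\\ IMAGE_FILE_HEADER
Private Const SIZEOF_OPTIONAL_FIXED As Long = 96  '\\ fields before DataDirectory
Private Const SIZEOF_DATA_DIRECTORY As Long = 8   '\\ two Longs per entry

'\\ File offset (zero-based) of data directory entry lIndex
Public Function DataDirectoryOffset(ByVal lPEOffset As Long, _
                                    ByVal lIndex As Long) As Long
    DataDirectoryOffset = lPEOffset + SIZEOF_PE_SIGNATURE + _
            SIZEOF_FILE_HEADER + SIZEOF_OPTIONAL_FIXED + _
            (lIndex * SIZEOF_DATA_DIRECTORY)
End Function

'\\ Reads one entry; returns True only if the directory is present
Public Function ReadDataDirectory(ByVal sFilename As String, _
        ByVal lPEOffset As Long, ByVal lIndex As Long, _
        ByRef lRVA As Long, ByRef lSize As Long) As Boolean

    Dim hFile As Integer

    lRVA = 0
    lSize = 0
    If lIndex < 0 Or lIndex > 15 Then Exit Function

    hFile = FreeFile
    Open sFilename For Binary Access Read As #hFile
    '\\ Get uses 1-based byte positions, hence the "+ 1"
    Get #hFile, DataDirectoryOffset(lPEOffset, lIndex) + 1, lRVA
    '\\ The size follows the address immediately
    Get #hFile, , lSize
    Close #hFile

    '\\ An absent directory has a zero address and a zero size
    ReadDataDirectory = (lRVA <> 0) And (lSize <> 0)

End Function
```

For example, if e_lfanew is 128, the import directory entry (index 1) starts at file offset 128 + 4 + 20 + 96 + 8 = 256.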

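The image data directories

The export directory

The export directory holds details of the functions exported by this executable. For example, if you look at the export directory of MSVBVM50.dll, it lists all the functions it exports, which together make up the Visual Basic 5 runtime environment.

The directory starts with some information that tells you how many functions are exported, followed by three arrays that give the addresses, names and ordinals of the functions. The name and ordinal arrays run in parallel (NumberOfNames entries each), and each ordinal entry is an index into the address array (NumberOfFunctions entries). The structure is defined as: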

Private Type IMAGE_EXPORT_DIRECTORY
    Characteristics As Long
    TimeDateStamp As Long
    MajorVersion As Integer
    MinorVersion As Integer
    lpName As Long
    Base As Long
    NumberOfFunctions As Long
    NumberOfNames As Long
    lpAddressOfFunctions As Long    '\\ Three parallel arrays...(LONG)
    lpAddressOfNames As Long        '\\ (LONG)
    lpAddressOfNameOrdinals As Long '\\ (INTEGER)
End Type

You can read this information from the executable as follows:

Private Sub ProcessExportTable(ExportDirectory As IMAGE_DATA_DIRECTORY)

Dim deThis As IMAGE_EXPORT_DIRECTORY
Dim lBytesWritten As Long
Dim lpAddress As Long

Dim nName As Long
Dim lIndex As Long

'\\ DebugProcess, image, fExport and the ...OutOfProcessPointer helpers
'\\ are assumed to be defined elsewhere (not shown in this article)

If ExportDirectory.VirtualAddress > 0 And ExportDirectory.Size > 0 Then
    '\\ Get the true address from the RVA
    lpAddress = AbsoluteAddress(ExportDirectory.VirtualAddress)
    '\\ Copy the image_export_directory structure...
    Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
                   VarPtr(deThis), Len(deThis), lBytesWritten)
    With deThis
        If .lpName <> 0 Then
            image.Name = StringFromOutOfProcessPointer(DebugProcess.Handle, _
                   image.AbsoluteAddress(.lpName), 32, False)
        End If
        '\\ Entry n of the names array pairs with entry n of the
        '\\ name ordinals array, so walk the NumberOfNames entries
        For nName = 0 To .NumberOfNames - 1
            '\\ RVA of this function's name
            lpAddress = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                   image.AbsoluteAddress(.lpAddressOfNames) + (nName * 4))
            fExport.Name = StringFromOutOfProcessPointer( _
                   DebugProcess.Handle, _
                   image.AbsoluteAddress(lpAddress), 64, False)
            '\\ Unsigned 16-bit index into the functions array
            lIndex = IntegerFromOutOfprocessPointer(DebugProcess.Handle, _
                   image.AbsoluteAddress(.lpAddressOfNameOrdinals) + _
                   (nName * 2)) And &HFFFF&
            fExport.Ordinal = .Base + lIndex
            '\\ The index (not the name position) selects the address
            fExport.ProcAddress = LongFromOutOfprocessPointer( _
                   DebugProcess.Handle, _
                   image.AbsoluteAddress(.lpAddressOfFunctions) + _
                   (lIndex * 4))
        Next nName
    End With
End If

End Sub
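
The way the three export arrays relate to one another is easier to see with the tables already copied into ordinary VB arrays. The sketch below is an illustration only: the function name ExportAddressByName is hypothetical, and the arrays are assumed to be zero-based copies of the tables in the same order as they appear in the file.

```vb
'\\ Hypothetical helper: looks up an exported function's address by name.
'\\ sNames() and nNameOrdinals() run in parallel; each name ordinal is an
'\\ index into lFunctions(). Returns 0 if the name is not exported.
Public Function ExportAddressByName(ByVal sName As String, _
        sNames() As String, nNameOrdinals() As Integer, _
        lFunctions() As Long) As Long

    Dim i As Long
    Dim lIndex As Long

    For i = LBound(sNames) To UBound(sNames)
        If sNames(i) = sName Then
            '\\ The stored value is an unsigned 16-bit number, so widen
            '\\ it to a Long before using it as an array index
            lIndex = nNameOrdinals(i) And &HFFFF&
            ExportAddressByName = lFunctions(lIndex)
            Exit Function
        End If
    Next i

End Function
```

Note that the position of a name in the name table says nothing about where its address is stored; only the matching name ordinal does. The ordinal reported to callers is that index plus the Base member.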

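The import directory

The import directory lists the dynamic link libraries that this executable depends on and the functions it imports from each of them. It consists of an array of IMAGE_IMPORT_DESCRIPTOR structures, terminated by an instance of the structure whose lpName member is zero. The structure is defined as: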

Private Type IMAGE_IMPORT_DESCRIPTOR
    lpImportByName As Long ''\\ 0 for terminating null import descriptor
    TimeDateStamp As Long  ''\\ 0 if not bound,
                           ''\\ -1 if bound, and real date\time stamp
                   ''\\ in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                   ''\\ O.W. date/time stamp of DLL bound to (Old BIND)
    ForwarderChain As Long ''\\ -1 if no forwarders
    lpName As Long
    ''\\ RVA to IAT (if bound this IAT has actual addresses)
    lpFirstThunk As Long 
End Type

You can walk the import directory as follows:

Private Sub ProcessImportTable(ImportDirectory As IMAGE_DATA_DIRECTORY)

Dim lpAddress As Long
Dim diThis As IMAGE_IMPORT_DESCRIPTOR
Dim byteswritten As Long
Dim sName As String
Dim sFunctionName As String
Dim lpNextName As Long
Dim lpNextThunk As Long

Dim lImportEntryIndex As Long

Dim nOrdinal As Integer
Dim lpFuncAddress As Long

'\\ DebugProcess, image and the ...OutOfProcessPointer helpers are
'\\ assumed to be defined elsewhere (not shown in this article)

'\\ If the image has an imports section...
If ImportDirectory.VirtualAddress > 0 And ImportDirectory.Size > 0 Then
    '\\ Get the true address from the RVA
    lpAddress = AbsoluteAddress(ImportDirectory.VirtualAddress)
    Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
             VarPtr(diThis), Len(diThis), byteswritten)

    While diThis.lpName <> 0
        '\\ Process this import directory entry (the DLL's name)
        sName = StringFromOutOfProcessPointer(DebugProcess.Handle, _
             image.AbsoluteAddress(diThis.lpName), 32, False)

        '\\ Process the import file's functions list
        If diThis.lpImportByName <> 0 Then
            '\\ Each DLL's tables are indexed from zero
            lImportEntryIndex = 0
            lpNextName = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                     image.AbsoluteAddress(diThis.lpImportByName))
            lpNextThunk = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                     image.AbsoluteAddress(diThis.lpFirstThunk))
            '\\ Note: entries with the top bit set are imports by ordinal
            '\\ and are not handled by this loop
            While (lpNextName <> 0) And (lpNextThunk <> 0)
                '\\ In a loaded image the IAT entry read above already
                '\\ holds the function address
                lpFuncAddress = lpNextThunk
                '\\ The name record starts with a two-byte ordinal hint
                nOrdinal = IntegerFromOutOfprocessPointer( _
                         DebugProcess.Handle, _
                         image.AbsoluteAddress(lpNextName))
                '\\ Skip the two-byte ordinal hint
                lpNextName = lpNextName + 2
                '\\ Get this function's name
                sFunctionName = StringFromOutOfProcessPointer( _
                     DebugProcess.Handle, _
                     image.AbsoluteAddress(lpNextName), 64, False)
                If Trim$(sFunctionName) <> "" Then
                    '\\ Get the next imported function...
                    lImportEntryIndex = lImportEntryIndex + 1

                    lpNextName = LongFromOutOfprocessPointer( _
                       DebugProcess.Handle, _
                       image.AbsoluteAddress(diThis.lpImportByName _
                       + (lImportEntryIndex * 4)))

                    lpNextThunk = LongFromOutOfprocessPointer( _
                       DebugProcess.Handle, _
                       image.AbsoluteAddress(diThis.lpFirstThunk _
                       + (lImportEntryIndex * 4)))
                Else
                    lpNextName = 0
                End If
            Wend
        End If

        '\\ And get the next one
        lpAddress = lpAddress + Len(diThis)
        Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
                VarPtr(diThis), Len(diThis), byteswritten)
    Wend

End If

End Sub
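
Each Long read from the table that lpImportByName points to is either the RVA of a hint and name record or, when its top bit is set, an import by ordinal. The sketch below shows one way this test might be written for a 32-bit image. The function names are hypothetical and are not part of the article's code.

```vb
'\\ Hypothetical helpers for decoding one 32-bit import table entry

'\\ True if the entry imports by ordinal rather than by name
Public Function IsOrdinalImport(ByVal lThunk As Long) As Boolean
    '\\ &H80000000 is the top bit of a Long (a negative value in VB)
    IsOrdinalImport = ((lThunk And &H80000000) <> 0)
End Function

'\\ The ordinal number held in the low 16 bits of the entry
Public Function ImportOrdinal(ByVal lThunk As Long) As Long
    '\\ The trailing "&" keeps the mask a Long (65535) and not -1
    ImportOrdinal = lThunk And &HFFFF&
End Function
```

A loop such as the one above could call IsOrdinalImport on each entry before treating it as an RVA, and record ImportOrdinal instead of a name when it returns True.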

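The resource directory

The structure of the resource directory is somewhat more involved. It consists of a root directory (defined by the IMAGE_RESOURCE_DIRECTORY structure) immediately followed by a number of resource directory entries (defined by the IMAGE_RESOURCE_DIRECTORY_ENTRY structure). These are defined as: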

Private Type IMAGE_RESOURCE_DIRECTORY
    Characteristics As Long '\\Seems to be always zero?
    TimeDateStamp As Long
    MajorVersion As Integer
    MinorVersion As Integer
    NumberOfNamedEntries As Integer
    NumberOfIdEntries As Integer
End Type

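Private Type IMAGE_RESOURCE_DIRECTORY_ENTRY
    dwName As Long       '\\ Name offset (top bit set) or integer ID
    dwDataOffset As Long '\\ Offset of the data or of a subdirectory
    '\\ Each entry is 8 bytes. CodePage and Reserved are not part of it:
    '\\ they belong to the leaf IMAGE_RESOURCE_DATA_ENTRY record
End Type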

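Each resource directory entry can point either to the actual resource data or to another level of resource directory entries. If the highest bit of dwDataOffset is set, it points to a directory; otherwise it points to the resource data.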
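
The high-bit test can be sketched as follows. The function names are hypothetical; the remaining 31 bits are an offset measured from the start of the resource directory, not an RVA.

```vb
'\\ Hypothetical helpers for decoding dwDataOffset

'\\ True if the entry leads to another level of directory entries
Public Function IsSubdirectory(ByVal dwDataOffset As Long) As Boolean
    IsSubdirectory = ((dwDataOffset And &H80000000) <> 0)
End Function

'\\ The offset with the directory flag masked off
Public Function EntryOffset(ByVal dwDataOffset As Long) As Long
    EntryOffset = dwDataOffset And &H7FFFFFFF
End Function
```

For instance, a dwDataOffset of &H80000018 describes a subdirectory that starts 24 bytes (&H18) into the resource directory.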

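What use is this information?

Once you know how an executable is put together, you can use that knowledge to peek into how it works. You can view the resources compiled into it, the DLLs it depends on and the actual functions it imports from those DLLs. More importantly, you can attach a debugger to the executable and track down those really tricky general protection faults. The next article will cover how to attach a debugger and make use of the PE file format.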
