Inside the Executable: An Introduction to the Portable Executable Format for VB Programmers






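Describes the layout of a Windows executable file and how to read it.
Introduction
The Portable Executable (PE) format is the data structure that describes how the parts of a Win32 executable fit together. It lets the operating system load the executable, locate the dynamic link libraries it needs in order to run, and navigate the code, data, and resource sections compiled into it.
Getting past DOS
The PE format was created for Windows, but Microsoft had to make sure that running such an executable under DOS would produce a meaningful error message and exit. For that reason, the very start of a Windows executable is actually a small DOS executable (sometimes called the stub) that prints "This program requires Windows" or something similar and then exits.
The DOS stub begins with the following header: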
Private Type IMAGE_DOS_HEADER
e_magic As Integer ''\\ Magic number
e_cblp As Integer ''\\ Bytes on last page of file
e_cp As Integer ''\\ Pages in file
e_crlc As Integer ''\\ Relocations
e_cparhdr As Integer ''\\ Size of header in paragraphs
e_minalloc As Integer ''\\ Minimum extra paragraphs needed
e_maxalloc As Integer ''\\ Maximum extra paragraphs needed
e_ss As Integer ''\\ Initial (relative) SS value
e_sp As Integer ''\\ Initial SP value
e_csum As Integer ''\\ Checksum
e_ip As Integer ''\\ Initial IP value
e_cs As Integer ''\\ Initial (relative) CS value
e_lfarlc As Integer ''\\ File address of relocation table
e_ovno As Integer ''\\ Overlay number
e_res(0 To 3) As Integer ''\\ Reserved words
e_oemid As Integer ''\\ OEM identifier (for e_oeminfo)
e_oeminfo As Integer ''\\ OEM information; e_oemid specific
e_res2(0 To 9) As Integer ''\\ Reserved words
e_lfanew As Long ''\\ File address of new exe header
End Type
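Apart from the e_magic signature, the only field in this structure that matters to Windows is e_lfanew, which holds the file offset of the new Windows executable header. To skip the DOS part of the program, set the file pointer to the value in this field.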
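Private Sub SkipDOSStub(ByVal hfile As Long)
    Dim BytesRead As Long
    Dim stub As IMAGE_DOS_HEADER

    '\\ Go to start of file...
    If SetFilePointer(hfile, 0, 0, FILE_BEGIN) = -1 Then
        '\\ -1 (INVALID_SET_FILE_POINTER) signals failure
        Debug.Print LastSystemError
        Exit Sub
    End If
    '\\ Read the DOS header...
    Call ReadFileLong(hfile, VarPtr(stub), Len(stub), BytesRead, ByVal 0&)
    '\\ ...and move to the new exe header only if the header looks valid
    If BytesRead = Len(stub) And stub.e_magic = IMAGE_DOS_SIGNATURE Then
        Call SetFilePointer(hfile, stub.e_lfanew, 0, FILE_BEGIN)
    End If
End Sub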
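The same step can be sketched without any API declarations by using VB's own binary file I/O. The function below is an illustration rather than part of the article's source: the name PEHeaderOffset is made up, error handling is left out, and it reads only the two fields that matter (e_magic at file offset 0 and e_lfanew at file offset 60).

```vb
Option Explicit

'\\ Returns the file offset of the new exe header, or -1 if the
'\\ file does not start with an MZ header. Hypothetical helper.
Public Function PEHeaderOffset(ByVal sFileName As String) As Long
    Dim hFile As Integer
    Dim nMagic As Integer
    Dim lNewHeader As Long

    PEHeaderOffset = -1
    hFile = FreeFile
    Open sFileName For Binary Access Read As #hFile
    '\\ The DOS header is 64 bytes long
    If LOF(hFile) >= 64 Then
        '\\ Get uses 1-based positions: e_magic is at file offset 0...
        Get #hFile, 1, nMagic
        '\\ ...and e_lfanew is at file offset 60 (&H3C)
        Get #hFile, 61, lNewHeader
        If nMagic = &H5A4D Then PEHeaderOffset = lNewHeader
    End If
    Close #hFile
End Function
```

A caller should still check the signature found at the returned offset, because a plain DOS program also starts with MZ but has no meaningful e_lfanew.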
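The NT header
The NT header holds the information that the Windows program loader needs in order to load the program. It consists of the PE file signature followed by the IMAGE_FILE_HEADER and IMAGE_OPTIONAL_HEADER records.
For applications designed to run under Windows (that is, not OS/2 or VxD files), the four-byte PE file signature should equal &H4550: the characters "PE" followed by two zero bytes. The other defined signatures are: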
Public Enum ImageSignatureTypes
IMAGE_DOS_SIGNATURE = &H5A4D ''\\ MZ
IMAGE_OS2_SIGNATURE = &H454E ''\\ NE
IMAGE_OS2_SIGNATURE_LE = &H454C ''\\ LE
IMAGE_VXD_SIGNATURE = &H454C ''\\ LE
IMAGE_NT_SIGNATURE = &H4550 ''\\ PE00
End Enum
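As a rough illustration of how these values might be used, the function below turns a signature read from the file into a readable label. The name SignatureName and the label strings are assumptions made for this sketch. Note the trailing & on the hex literals: in VB, &HFFFF without it is the Integer -1, not 65535.

```vb
Option Explicit

'\\ Describes a signature value. The PE signature is compared as a
'\\ full four-byte Long; the older formats use only the low word.
Public Function SignatureName(ByVal lSignature As Long) As String
    If lSignature = &H4550& Then
        SignatureName = "PE"
    Else
        Select Case (lSignature And &HFFFF&)
        Case &H5A4D&: SignatureName = "MZ"
        Case &H454E&: SignatureName = "NE"
        Case &H454C&: SignatureName = "LE"
        Case Else: SignatureName = "Unknown"
        End Select
    End If
End Function
```

Comparing the PE signature as a whole Long means that a value such as &H10004550, which has the right low word but non-zero upper bytes, is not mistaken for a PE file.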
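The PE file signature is followed by the IMAGE_FILE_HEADER structure, which stores information about the executable's target environment. It is defined as follows: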
Private Type IMAGE_FILE_HEADER
Machine As Integer
NumberOfSections As Integer
TimeDateStamp As Long
PointerToSymbolTable As Long
NumberOfSymbols As Long
SizeOfOptionalHeader As Integer
Characteristics As Integer
End Type
The Machine member describes the CPU the executable was compiled for. It can be one of the following:
Public Enum ImageMachineTypes
IMAGE_FILE_MACHINE_I386 = &H14C ''\\ Intel 386.
IMAGE_FILE_MACHINE_R3000 = &H162 ''\\ MIPS little-endian (&H160 is big-endian)
IMAGE_FILE_MACHINE_R4000 = &H166 ''\\ MIPS little-endian
IMAGE_FILE_MACHINE_R10000 = &H168 ''\\ MIPS little-endian
IMAGE_FILE_MACHINE_WCEMIPSV2 = &H169 ''\\ MIPS little-endian WCE v2
IMAGE_FILE_MACHINE_ALPHA = &H184 ''\\ Alpha_AXP
IMAGE_FILE_MACHINE_POWERPC = &H1F0 ''\\ IBM PowerPC Little-Endian
IMAGE_FILE_MACHINE_SH3 = &H1A2 ''\\ SH3 little-endian
IMAGE_FILE_MACHINE_SH3E = &H1A4 ''\\ SH3E little-endian
IMAGE_FILE_MACHINE_SH4 = &H1A6 ''\\ SH4 little-endian
IMAGE_FILE_MACHINE_ARM = &H1C0 ''\\ ARM Little-Endian
IMAGE_FILE_MACHINE_IA64 = &H200 ''\\ Intel Itanium (IA-64)
End Enum
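To make the file header concrete, here is a small sketch that reads it with VB's Get statement and prints the machine type, section count, and link time. The procedure and function names are invented for this example, only three machine types are named, and lPEOffset is assumed to be the zero-based file offset of the PE signature. TimeDateStamp is the number of seconds since 1 January 1970 (UTC).

```vb
Option Explicit

Private Type IMAGE_FILE_HEADER
    Machine As Integer
    NumberOfSections As Integer
    TimeDateStamp As Long
    PointerToSymbolTable As Long
    NumberOfSymbols As Long
    SizeOfOptionalHeader As Integer
    Characteristics As Integer
End Type

'\\ Converts a TimeDateStamp to a VB Date (still in UTC)
Public Function StampToDate(ByVal lStamp As Long) As Date
    Dim dSeconds As Double
    dSeconds = lStamp
    '\\ VB has no unsigned Long, so large stamps read as negative
    If dSeconds < 0 Then dSeconds = dSeconds + 4294967296#
    StampToDate = DateSerial(1970, 1, 1) + dSeconds / 86400#
End Function

'\\ Names a few of the machine types listed above
Public Function MachineName(ByVal nMachine As Integer) As String
    Select Case (nMachine And &HFFFF&)
    Case &H14C&: MachineName = "Intel 386"
    Case &H1C0&: MachineName = "ARM little-endian"
    Case &H200&: MachineName = "Intel Itanium"
    Case Else: MachineName = "Other (&H" & Hex$(nMachine And &HFFFF&) & ")"
    End Select
End Function

Public Sub PrintFileHeader(ByVal sFileName As String, ByVal lPEOffset As Long)
    Dim hFile As Integer
    Dim fh As IMAGE_FILE_HEADER

    hFile = FreeFile
    Open sFileName For Binary Access Read As #hFile
    '\\ +4 steps over the PE signature, +1 for Get's 1-based positions
    Get #hFile, lPEOffset + 5, fh
    Close #hFile
    Debug.Print MachineName(fh.Machine), fh.NumberOfSections, _
        StampToDate(fh.TimeDateStamp)
End Sub
```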
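The SizeOfOptionalHeader member gives the size, in bytes, of the IMAGE_OPTIONAL_HEADER structure that immediately follows. For an executable image this structure is not actually optional, so the name is something of a misnomer. The structure is defined as follows: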
Private Type IMAGE_OPTIONAL_HEADER
Magic As Integer
MajorLinkerVersion As Byte
MinorLinkerVersion As Byte
SizeOfCode As Long
SizeOfInitializedData As Long
SizeOfUninitializedData As Long
AddressOfEntryPoint As Long
BaseOfCode As Long
BaseOfData As Long
End Type
It is immediately followed by the IMAGE_OPTIONAL_HEADER_NT structure.
Private Type IMAGE_OPTIONAL_HEADER_NT
ImageBase As Long
SectionAlignment As Long
FileAlignment As Long
MajorOperatingSystemVersion As Integer
MinorOperatingSystemVersion As Integer
MajorImageVersion As Integer
MinorImageVersion As Integer
MajorSubsystemVersion As Integer
MinorSubsystemVersion As Integer
Win32VersionValue As Long
SizeOfImage As Long
SizeOfHeaders As Long
CheckSum As Long
Subsystem As Integer
DllCharacteristics As Integer
SizeOfStackReserve As Long
SizeOfStackCommit As Long
SizeOfHeapReserve As Long
SizeOfHeapCommit As Long
LoaderFlags As Long
NumberOfRvaAndSizes As Long
DataDirectory(0 To 15) As IMAGE_DATA_DIRECTORY
End Type
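The two structures above describe the 32-bit (PE32) layout. A reader might want to check the Magic word before trusting them, since a 64-bit (PE32+) image widens ImageBase and the stack and heap sizes to eight bytes and drops BaseOfData. The sketch below assumes lPEOffset is the zero-based file offset of the PE signature; the function names are made up for illustration.

```vb
Option Explicit

'\\ Names the optional header layout indicated by its Magic word
Public Function MagicName(ByVal nMagic As Integer) As String
    Select Case nMagic
    Case &H10B: MagicName = "PE32"
    Case &H20B: MagicName = "PE32+"
    Case &H107: MagicName = "ROM"
    Case Else: MagicName = "Unknown"
    End Select
End Function

'\\ Reads the Magic word that opens the optional header
Public Function OptionalHeaderKind(ByVal sFileName As String, _
        ByVal lPEOffset As Long) As String
    Dim hFile As Integer
    Dim nMagic As Integer

    hFile = FreeFile
    Open sFileName For Binary Access Read As #hFile
    '\\ Skip the signature (4 bytes) and the file header (20 bytes);
    '\\ the extra 1 is for Get's 1-based positions
    Get #hFile, lPEOffset + 4 + 20 + 1, nMagic
    Close #hFile
    OptionalHeaderKind = MagicName(nMagic)
End Function
```

Only when the result is "PE32" do the Long-based types in this article line up with the bytes in the file.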
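The most useful fields in this structure (to me, at least) are the 16 IMAGE_DATA_DIRECTORY entries. They describe where particular parts of the executable are located, if they are present. The structure is defined as follows: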
Private Type IMAGE_DATA_DIRECTORY
VirtualAddress As Long
Size As Long
End Type
The directories are held in the following order:
Public Enum ImageDataDirectoryIndexes
IMAGE_DIRECTORY_ENTRY_EXPORT = 0 ''\\ Export Directory
IMAGE_DIRECTORY_ENTRY_IMPORT = 1 ''\\ Import Directory
IMAGE_DIRECTORY_ENTRY_RESOURCE = 2 ''\\ Resource Directory
IMAGE_DIRECTORY_ENTRY_EXCEPTION = 3 ''\\ Exception Directory
IMAGE_DIRECTORY_ENTRY_SECURITY = 4 ''\\ Security Directory
IMAGE_DIRECTORY_ENTRY_BASERELOC = 5 ''\\ Base Relocation Table
IMAGE_DIRECTORY_ENTRY_DEBUG = 6 ''\\ Debug Directory
IMAGE_DIRECTORY_ENTRY_ARCHITECTURE = 7 ''\\ Architecture Specific Data
IMAGE_DIRECTORY_ENTRY_GLOBALPTR = 8 ''\\ RVA of GP
IMAGE_DIRECTORY_ENTRY_TLS = 9 ''\\ TLS Directory
''\\ Load Configuration Directory
IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG = 10
''\\ Bound Import Directory in headers
IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT = 11
IMAGE_DIRECTORY_ENTRY_IAT = 12 ''\\ Import Address Table
''\\ Delay Load Import Descriptors
IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT = 13
End Enum
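Note that if an executable does not contain a given part (which is common), it still has an IMAGE_DATA_DIRECTORY entry for it, but both the address and the size are zero.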
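As a sketch of how one entry might be fetched straight from the file, the functions below work out where a data directory entry sits and report whether it is present. They assume a PE32 image, where 96 bytes of optional header fields (28 in IMAGE_OPTIONAL_HEADER plus 68 in IMAGE_OPTIONAL_HEADER_NT) come before DataDirectory(0). The function names are illustrative and lPEOffset is again the zero-based file offset of the PE signature.

```vb
Option Explicit

'\\ File offset of data directory entry nIndex (0 to 15), PE32 only
Public Function DataDirectoryOffset(ByVal lPEOffset As Long, _
        ByVal nIndex As Long) As Long
    '\\ 4-byte signature + 20-byte file header + 96 bytes of optional
    '\\ header fields, then 8 bytes per directory entry
    DataDirectoryOffset = lPEOffset + 4 + 20 + 96 + (nIndex * 8)
End Function

'\\ Reads one entry; returns False when the directory is absent
Public Function ReadDataDirectory(ByVal sFileName As String, _
        ByVal lPEOffset As Long, ByVal nIndex As Long, _
        ByRef lRVA As Long, ByRef lSize As Long) As Boolean
    Dim hFile As Integer

    hFile = FreeFile
    Open sFileName For Binary Access Read As #hFile
    '\\ +1 because Get positions are 1-based
    Get #hFile, DataDirectoryOffset(lPEOffset, nIndex) + 1, lRVA
    '\\ The Size member follows immediately
    Get #hFile, , lSize
    Close #hFile
    ReadDataDirectory = (lRVA <> 0 And lSize <> 0)
End Function
```

The address returned in lRVA is relative to the image's load address, not a file offset, which is why the procedures later in this article convert it before reading from it.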
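The image data directories
The export directory
The export directory holds details of the functions that this executable exports. For example, if you look at the export directory of MSVBVM50.dll, it lists all the functions it exports that make up the Visual Basic 5 runtime environment.
The directory starts with some information that tells you how many functions are exported, followed by three arrays that give the functions' addresses, names, and ordinals. The name and ordinal arrays are parallel, with NumberOfNames entries each, and each ordinal entry is an index into the address array, which has NumberOfFunctions entries. The structure is defined as follows: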
Private Type IMAGE_EXPORT_DIRECTORY
Characteristics As Long
TimeDateStamp As Long
MajorVersion As Integer
MinorVersion As Integer
lpName As Long
Base As Long
NumberOfFunctions As Long
NumberOfNames As Long
lpAddressOfFunctions As Long '\\ Array of function RVAs (LONG)
lpAddressOfNames As Long '\\ Array of name RVAs (LONG)
lpAddressOfNameOrdinals As Long '\\ Array of indexes into the function array (INTEGER)
End Type
You can read this information from the executable as follows:
Private Sub ProcessExportTable(ExportDirectory As IMAGE_DATA_DIRECTORY)
    Dim deThis As IMAGE_EXPORT_DIRECTORY
    Dim lBytesRead As Long
    Dim lpAddress As Long
    Dim nName As Long
    Dim nIndex As Long

    If ExportDirectory.VirtualAddress > 0 And ExportDirectory.Size > 0 Then
        '\\ Get the true address from the RVA
        lpAddress = AbsoluteAddress(ExportDirectory.VirtualAddress)
        '\\ Copy the image_export_directory structure...
        Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
            VarPtr(deThis), Len(deThis), lBytesRead)
        With deThis
            If .lpName <> 0 Then
                image.Name = StringFromOutOfProcessPointer(DebugProcess.Handle, _
                    image.AbsoluteAddress(.lpName), 32, False)
            End If
            '\\ Only NumberOfNames of the exports have a name
            For nName = 0 To .NumberOfNames - 1
                '\\ RVA of this export's name
                lpAddress = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                    image.AbsoluteAddress(.lpAddressOfNames) + (nName * 4))
                fExport.Name = StringFromOutOfProcessPointer(DebugProcess.Handle, _
                    image.AbsoluteAddress(lpAddress), 64, False)
                '\\ The matching name ordinal is an index into the address array
                nIndex = IntegerFromOutOfprocessPointer(DebugProcess.Handle, _
                    image.AbsoluteAddress(.lpAddressOfNameOrdinals) + _
                    (nName * 2)) And &HFFFF&
                fExport.Ordinal = .Base + nIndex
                fExport.ProcAddress = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                    image.AbsoluteAddress(.lpAddressOfFunctions) + (nIndex * 4))
            Next nName
        End With
    End If
End Sub
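The way the three export arrays relate is easier to see on in-memory copies. The sketch below is not from the article's source: FindExport is a hypothetical helper, the arrays are assumed to be zero-based copies of the tables already read from the image, and the data in DemoFindExport is invented purely to show the lookup.

```vb
Option Explicit

'\\ Resolves an exported name to the RVA stored for it.
'\\ Returns 0 if the name is not exported.
Public Function FindExport(ByVal sName As String, sNames() As String, _
        nNameOrdinals() As Integer, lFunctions() As Long) As Long
    Dim i As Long
    Dim lIndex As Long

    For i = LBound(sNames) To UBound(sNames)
        If sNames(i) = sName Then
            '\\ The same position in the ordinal array gives the
            '\\ index into the function address array
            lIndex = nNameOrdinals(i) And &HFFFF&
            FindExport = lFunctions(lIndex)
            Exit Function
        End If
    Next i
End Function

Public Sub DemoFindExport()
    Dim sNames(0 To 1) As String
    Dim nOrdinals(0 To 1) As Integer
    Dim lFunctions(0 To 2) As Long

    '\\ Made-up export table: three functions, two of them named
    sNames(0) = "Alpha": nOrdinals(0) = 2
    sNames(1) = "Beta": nOrdinals(1) = 0
    lFunctions(0) = &H1000&
    lFunctions(1) = &H1010&
    lFunctions(2) = &H1020&
    Debug.Print Hex$(FindExport("Alpha", sNames, nOrdinals, lFunctions))
End Sub
```

In this invented table the function in slot 1 has no name at all, so it could only be imported by ordinal (Base + 1). That is why the name count and the function count are stored separately.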
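The import directory
The import directory lists the dynamic link libraries that this executable depends on and the functions it imports from them. It consists of an array of IMAGE_IMPORT_DESCRIPTOR structures, terminated by an instance whose lpName member is zero. The structure is defined as follows: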
Private Type IMAGE_IMPORT_DESCRIPTOR
lpImportByName As Long ''\\ 0 for terminating null import descriptor
TimeDateStamp As Long ''\\ 0 if not bound,
''\\ -1 if bound, and real date\time stamp
''\\ in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
''\\ O.W. date/time stamp of DLL bound to (Old BIND)
ForwarderChain As Long ''\\ -1 if no forwarders
lpName As Long
''\\ RVA to IAT (if bound this IAT has actual addresses)
lpFirstThunk As Long
End Type
You can walk the import directory as follows:
Private Sub ProcessImportTable(ImportDirectory As IMAGE_DATA_DIRECTORY)
    Dim lpAddress As Long
    Dim diThis As IMAGE_IMPORT_DESCRIPTOR
    Dim lBytesRead As Long
    Dim sLibrary As String
    Dim sName As String
    Dim lpNextName As Long
    Dim lpNextThunk As Long
    Dim lImportEntryIndex As Long
    Dim nOrdinal As Long
    Dim lpFuncAddress As Long

    '\\ If the image has an imports section...
    If ImportDirectory.VirtualAddress > 0 And ImportDirectory.Size > 0 Then
        '\\ Get the true address from the RVA
        lpAddress = AbsoluteAddress(ImportDirectory.VirtualAddress)
        Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
            VarPtr(diThis), Len(diThis), lBytesRead)
        While diThis.lpName <> 0
            '\\ Name of the DLL this descriptor imports from
            sLibrary = StringFromOutOfProcessPointer(DebugProcess.Handle, _
                image.AbsoluteAddress(diThis.lpName), 32, False)
            '\\ Process the import file's functions list
            If diThis.lpImportByName <> 0 Then
                '\\ Each descriptor's arrays start again at entry zero
                lImportEntryIndex = 0
                lpNextName = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                    image.AbsoluteAddress(diThis.lpImportByName))
                lpNextThunk = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                    image.AbsoluteAddress(diThis.lpFirstThunk))
                While (lpNextName <> 0) And (lpNextThunk <> 0)
                    '\\ In a loaded image the IAT entry is the function address
                    lpFuncAddress = lpNextThunk
                    If lpNextName < 0 Then
                        '\\ Top bit set: imported by ordinal, so there is no name
                        nOrdinal = lpNextName And &HFFFF&
                        sName = ""
                    Else
                        '\\ A two-byte ordinal hint, then the function's name
                        nOrdinal = IntegerFromOutOfprocessPointer(DebugProcess.Handle, _
                            image.AbsoluteAddress(lpNextName)) And &HFFFF&
                        sName = StringFromOutOfProcessPointer(DebugProcess.Handle, _
                            image.AbsoluteAddress(lpNextName + 2), 64, False)
                    End If
                    '\\ Get the next imported function...
                    lImportEntryIndex = lImportEntryIndex + 1
                    lpNextName = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                        image.AbsoluteAddress(diThis.lpImportByName _
                        + (lImportEntryIndex * 4)))
                    lpNextThunk = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                        image.AbsoluteAddress(diThis.lpFirstThunk _
                        + (lImportEntryIndex * 4)))
                Wend
            End If
            '\\ And get the next one
            lpAddress = lpAddress + Len(diThis)
            Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
                VarPtr(diThis), Len(diThis), lBytesRead)
        Wend
    End If
End Sub
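Each four-byte entry in the array that lpImportByName points to is either the RVA of a hint and name record or, when its top bit is set, an ordinal. A minimal sketch of decoding one entry follows. It assumes a 32-bit image, and the three function names are made up for this example.

```vb
Option Explicit

'\\ True when the entry identifies its function by ordinal.
'\\ Bit 31 is the sign bit of a VB Long, so a set bit reads as negative.
Public Function IsOrdinalImport(ByVal lThunk As Long) As Boolean
    IsOrdinalImport = (lThunk < 0)
End Function

'\\ The ordinal is held in the low 16 bits
Public Function ImportOrdinal(ByVal lThunk As Long) As Long
    ImportOrdinal = lThunk And &HFFFF&
End Function

'\\ RVA of the record holding the two-byte hint and the
'\\ null-terminated function name
Public Function ImportNameRVA(ByVal lThunk As Long) As Long
    ImportNameRVA = lThunk And &H7FFFFFFF
End Function
```

For instance, an entry of &H80000010 is an import by ordinal 16, while an entry of &H2040 says that the hint and name can be found at RVA &H2040.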
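The resource directory
The resource directory has a somewhat more complicated structure. It consists of a root directory (defined by the IMAGE_RESOURCE_DIRECTORY structure) followed immediately by a number of resource directory entries (defined by the IMAGE_RESOURCE_DIRECTORY_ENTRY structure). They are defined as follows: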
Private Type IMAGE_RESOURCE_DIRECTORY
Characteristics As Long '\\Seems to be always zero?
TimeDateStamp As Long
MajorVersion As Integer
MinorVersion As Integer
NumberOfNamedEntries As Integer
NumberOfIdEntries As Integer
End Type
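Private Type IMAGE_RESOURCE_DIRECTORY_ENTRY
dwName As Long '\\ Integer ID, or offset of a name string if the top bit is set
dwDataOffset As Long '\\ Top bit set = subdirectory, clear = data entry
End Type
Private Type IMAGE_RESOURCE_DATA_ENTRY
OffsetToData As Long '\\ RVA of the resource data
Size As Long
CodePage As Long
Reserved As Long
End Type
Each resource directory entry points either to another level of the resource directory or to the resource data. If the most significant bit of dwDataOffset is set, the remaining 31 bits are the offset of a subdirectory. Otherwise they are the offset of an IMAGE_RESOURCE_DATA_ENTRY that describes the data itself. Both offsets are measured from the start of the resource directory.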
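The top-bit test can be written as a pair of small helpers. This is only a sketch with made-up function names. It relies on &H80000000 being the Long with just bit 31 set and on &H7FFFFFFF masking that bit off.

```vb
Option Explicit

'\\ True when a directory entry leads to another directory level
Public Function IsSubdirectory(ByVal dwDataOffset As Long) As Boolean
    IsSubdirectory = ((dwDataOffset And &H80000000) <> 0)
End Function

'\\ Offset of the subdirectory or data entry, measured from the
'\\ start of the resource directory
Public Function EntryOffset(ByVal dwDataOffset As Long) As Long
    EntryOffset = dwDataOffset And &H7FFFFFFF
End Function

'\\ True when dwName refers to a name string rather than an integer ID
Public Function EntryIsNamed(ByVal dwName As Long) As Boolean
    EntryIsNamed = ((dwName And &H80000000) <> 0)
End Function
```

Walking the whole tree is then a matter of reading a directory, looping over its NumberOfNamedEntries + NumberOfIdEntries entries, and recursing whenever IsSubdirectory returns True.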
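What use is this information?
Once you know how an executable is put together, you can use that knowledge to look more deeply into how it works. You can view the resources compiled into it, the DLLs it depends on, and the actual functions it imports from those DLLs. More importantly, you can attach a debugger to the executable and track down those really tricky general protection faults. The next article will describe how to attach a debugger and make use of the PE file format.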