Exploit Development 02 — PE File Format 1

Milad Kahsari Alhadi
Jun 4, 2018

PE (Portable Executable) is the native Windows 32-bit file format, and PE32+ (often written PE+) is its 64-bit counterpart. Every 32-bit Windows executable (except VxDs and 16-bit DLLs) uses the PE file format.

32-bit DLLs, COM files, OCX controls, Control Panel applets (.CPL files), screen savers, and .NET executables are all in PE format. Even NT’s kernel-mode drivers use the PE file format.

So why do we need to know about it? There are several reasons: adding code to an executable, redirecting a program’s original entry point (as a compressor does), manually unpacking an executable, and so on. Nowadays most malware comes packed, both to reduce its size and to add a layer of protection against malware researchers.

In a packed executable, the import tables are usually destroyed and data is often encrypted. The packer inserts code to unpack the file in memory upon execution, and then jumps to the original entry point of the file (where the original program actually starts executing).

If we manage to dump this memory region after the packer finished unpacking the executable, we still need to fix the sections and import tables before our app will run. How will we do that if we don’t even know what the PE format is?

Basic Structure of PE File

The picture above shows the basic structure of a PE file. At a minimum, a PE file has a code section and a data section, but a Windows application can also have other predefined sections: .text (code), .bss (Block Started by Symbol, uninitialized data), .rdata (read-only data), .data (initialized data), .rsrc (resources), .edata (export data), .idata (import data), .pdata (exception-handling information), and .debug (debug information).

A section in a portable executable (PE) file is a named, contiguous block of memory that contains either code or data.

Some sections contain code or data that your program declared and uses directly, while other data sections are created for you by the linker and library manager (lib.exe) and contain information vital to the operating system.
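To make the idea of named sections more concrete, here is a minimal sketch of how the section table could be read with Python’s struct module. The field layouts follow the IMAGE_FILE_HEADER and IMAGE_SECTION_HEADER structures from the PE specification; the helper names list_sections and build_demo_image are my own, and the demo input is a tiny synthetic header blob (not a runnable EXE) with made-up values.

```python
import struct

FILE_HEADER = struct.Struct("<HHIIIHH")         # IMAGE_FILE_HEADER, 20 bytes
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # IMAGE_SECTION_HEADER, 40 bytes


def list_sections(image: bytes):
    """Return (name, rva, virtual_size, file_offset, raw_size) for each section."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew field
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    header = FILE_HEADER.unpack_from(image, pe_offset + 4)
    section_count, optional_size = header[1], header[5]
    # The section table starts right after the optional header.
    table = pe_offset + 4 + FILE_HEADER.size + optional_size
    sections = []
    for i in range(section_count):
        fields = SECTION_HEADER.unpack_from(image, table + i * SECTION_HEADER.size)
        name = fields[0].rstrip(b"\0").decode("ascii", "replace")
        vsize, vaddr, raw_size, raw_ptr = fields[1:5]
        sections.append((name, vaddr, vsize, raw_ptr, raw_size))
    return sections


def build_demo_image() -> bytes:
    """Hypothetical demo input: headers only, with invented section values."""
    dos = b"MZ" + b"\0" * 58 + struct.pack("<I", 64)         # e_lfanew = 64
    file_header = FILE_HEADER.pack(0x14C, 2, 0, 0, 0, 0, 0)  # 2 sections, no optional header
    text = SECTION_HEADER.pack(b".text", 0x1234, 0x1000, 0x1400, 0x400,
                               0, 0, 0, 0, 0x60000020)
    data = SECTION_HEADER.pack(b".data", 0x200, 0x3000, 0x200, 0x1800,
                               0, 0, 0, 0, 0xC0000040)
    return dos + b"PE\0\0" + file_header + text + data


for name, vaddr, vsize, raw_ptr, raw_size in list_sections(build_demo_image()):
    print(f"{name:8} RVA={vaddr:#x} VirtualSize={vsize:#x} "
          f"FileOffset={raw_ptr:#x} RawSize={raw_size:#x}")
```

On a real file you would pass the bytes read from disk instead of the synthetic blob; a real image also carries an optional header, which the code skips using SizeOfOptionalHeader.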

Some applications do not need all of the sections mentioned above, while others define additional sections to suit their specific needs; all of this is optional. Section names can also be changed in Visual Studio via Project Properties -> Linker -> Advanced -> Merge Sections.

The /MERGE:from=to linker option combines the first section (from) with the second section (to) and names the resulting section to. For example, /merge:.text=.Milad. If the second section does not exist, LINK simply renames the section from as to. The /MERGE option is useful for creating VxDs and for overriding the compiler-generated section names.

Another important point is that the structure of a PE file on disk is essentially the same as its structure in memory, apart from minor differences such as alignment. So if you can locate a piece of information in the file on disk, you will be able to find it once the file is loaded into memory.

However, the file is not copied into memory byte for byte. The Windows loader decides which parts need to be mapped in and leaves out the rest. Data that is not mapped in, such as debug information, is placed at the end of the file.

Also, the location of an item in the file on disk will often differ from its location once loaded into memory, because of page-based virtual memory management and Address Space Layout Randomization (ASLR), mechanisms used by modern operating systems such as Windows and Linux.
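As a rough sketch of how the two locations relate, the function below converts a relative virtual address (RVA, an offset from the start of the loaded module) into a file offset by finding the section that contains it. The Section tuple, the function name rva_to_file_offset, and the demo values are hypothetical; the fields mirror VirtualAddress, VirtualSize, PointerToRawData, and SizeOfRawData from the section header.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA of the section once it is mapped
    virtual_size: int     # size of the section in memory
    raw_offset: int       # PointerToRawData: where the section starts on disk
    raw_size: int         # SizeOfRawData: how many bytes exist on disk


def rva_to_file_offset(rva: int, sections: Sequence[Section]) -> Optional[int]:
    """Return the file offset backing rva, or None if no file bytes back it."""
    for section in sections:
        delta = rva - section.virtual_address
        # Only the first raw_size bytes of a section exist in the file;
        # anything beyond that is zero-filled by the loader.
        if 0 <= delta < section.raw_size:
            return section.raw_offset + delta
    return None


# Hypothetical section layout used only for this demo.
DEMO_SECTIONS = [
    Section(".text", 0x1000, 0x1234, 0x400, 0x1400),
    Section(".data", 0x3000, 0x200, 0x1800, 0x200),
]

print(hex(rva_to_file_offset(0x1010, DEMO_SECTIONS)))
print(rva_to_file_offset(0x5000, DEMO_SECTIONS))
```

The same item therefore sits at offset 0x410 in the file but at RVA 0x1010 in memory, which is exactly the kind of shift you have to account for when fixing a dumped executable.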

When the sections are loaded into RAM, they are aligned to 4 KB memory pages (as specified by the SectionAlignment member of the PE optional header), with each section starting on a new page.

This alignment also matters for the operating system’s management of section (segment) privileges: in a program, the .text section must be executable, while a data section does not need to be. Because every section starts on its own page, the attributes the linker assigns to a section can be enforced page by page. File alignment specifies how sections are aligned on disk, while section alignment specifies how they are aligned in memory.
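The per-section attributes mentioned above live in the Characteristics field of each section header. The sketch below decodes the three memory-access bits into a short "rwx" string; the flag values are the IMAGE_SCN_MEM_* constants from the PE specification, while the function name describe_access and the two sample values are my own choices for illustration.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def describe_access(characteristics: int) -> str:
    """Render the memory-access bits of a section as an 'rwx'-style string."""
    flags = (
        ("r", IMAGE_SCN_MEM_READ),
        ("w", IMAGE_SCN_MEM_WRITE),
        ("x", IMAGE_SCN_MEM_EXECUTE),
    )
    return "".join(letter if characteristics & bit else "-" for letter, bit in flags)


# Sample values in the style a linker commonly emits for code and data sections.
print(".text", describe_access(0x60000020))
print(".data", describe_access(0xC0000040))
```

A code section typically decodes as readable and executable but not writable, while an initialized data section is readable and writable but not executable.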

The translation of virtual addresses to physical addresses by the Memory Management Unit (MMU) and the Translation Lookaside Buffer (TLB) is explained below with an illustration.

The concept of virtual memory is that instead of letting software directly access physical memory, the processor and OS create an invisible layer between the two.

Every time an attempt is made to access memory, the processor’s memory management unit, with the help of the translation lookaside buffer, consults a page table that tells the processor which physical memory address to actually use.

It wouldn’t be practical to have a table entry for each byte of memory (the page table would be larger than the total physical memory), so instead, processors divide memory into pages. This has several advantages:

  1. It enables the creation of multiple address spaces with different access and operational levels. An address space is an isolated page table that only allows access to memory that is pertinent to the current program or process. It ensures that programs are completely isolated from one another and that an error causing one program to crash is not able to poison another program’s address space.
  2. It enables the processor to enforce certain rules on how memory is accessed. Sections are needed in PE files because different areas in the file are treated differently by the memory manager when a module is loaded. At load time, the memory manager sets the access rights on memory pages for the different sections based on their settings in the section header. This determines whether a given section is readable, writable, or executable. This means each section must typically start on a fresh page.
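The page-based translation and the per-page access rules described above can be sketched as a toy model. This is a deliberate simplification: a real x86 processor walks multi-level page tables in hardware and caches results in the TLB, whereas here the page table is a plain dictionary. The names translate, PageFault, and DEMO_PAGE_TABLE, as well as the frame numbers, are invented for the example.

```python
PAGE_SIZE = 4096   # default page size on 32-bit Windows
OFFSET_BITS = 12   # log2(PAGE_SIZE)


class PageFault(Exception):
    """Raised when a page is not mapped or the access is not permitted."""


def translate(virtual_address: int, page_table: dict, access: str = "r") -> int:
    """Translate a virtual address using a toy single-level page table.

    page_table maps a virtual page number to (physical_frame, rights),
    where rights is a string such as "rx" or "rw".
    """
    page_number = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    entry = page_table.get(page_number)
    if entry is None:
        raise PageFault(f"page {page_number:#x} is not mapped")
    frame, rights = entry
    if access not in rights:
        raise PageFault(f"'{access}' access denied on page {page_number:#x}")
    return (frame << OFFSET_BITS) | offset


# Hypothetical address space: one code page and one data page.
DEMO_PAGE_TABLE = {
    0x401: (0x2A, "rx"),  # e.g. a page belonging to .text
    0x403: (0x17, "rw"),  # e.g. a page belonging to .data
}

print(hex(translate(0x401010, DEMO_PAGE_TABLE, "x")))
print(hex(translate(0x403004, DEMO_PAGE_TABLE, "w")))
```

Trying to write to the code page, for example translate(0x401010, DEMO_PAGE_TABLE, "w"), raises PageFault in this model, which mirrors how the memory manager rejects writes to a section that was not marked writable.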

Note: The default page size for 32-bit Windows is 4096 bytes, and it would be wasteful to align executables to a 4096-byte boundary on disk, as that would make them significantly bigger than necessary. Because of this, the PE header has two different alignment fields: section alignment and file alignment. Section alignment is how sections are aligned in memory. File alignment, typically 512 bytes, is how sections are aligned in the file on disk; it is a multiple of the disk sector size in order to optimize the loading process. The picture below shows this size.
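The effect of the two alignment values can be checked with a little arithmetic. The sketch below rounds a section size up to each alignment; the helper name align_up and the 0x1234-byte section size are assumptions chosen for the example, while 512 and 4096 are the values quoted in the note.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


FILE_ALIGNMENT = 0x200      # 512 bytes, alignment on disk
SECTION_ALIGNMENT = 0x1000  # 4096 bytes, alignment in memory (one page)

code_size = 0x1234  # hypothetical number of bytes in a section

on_disk = align_up(code_size, FILE_ALIGNMENT)
in_memory = align_up(code_size, SECTION_ALIGNMENT)

print(f"on disk:   {on_disk:#x} bytes ({on_disk - code_size} bytes of padding)")
print(f"in memory: {in_memory:#x} bytes ({in_memory - code_size} bytes of padding)")
```

With these numbers the section occupies 0x1400 bytes on disk but 0x2000 bytes in memory, so aligning to a full page on disk would cost several kilobytes of padding per section.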

When a PE file is loaded into memory by the Windows loader, the in-memory version is known as a module. The starting address where the file mapping begins is called an HMODULE.

A module (object) in memory represents all the code, data and resources from an executable file that is needed for execution whilst the term process basically refers to an isolated address space which can be used for running such a module.
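Because the module’s starting address is the reference point for everything inside it, addresses in a PE file are usually stored relative to that base. A minimal sketch, assuming a hypothetical relocated base of 0x00A30000 next to the traditional default base 0x00400000 of 32-bit executables (the helper names are my own):

```python
def rva_to_va(module_base: int, rva: int) -> int:
    """Virtual address of an item, given the module base and its RVA."""
    return module_base + rva


def va_to_rva(module_base: int, va: int) -> int:
    """Inverse of rva_to_va: recover the RVA from a virtual address."""
    return va - module_base


DEFAULT_BASE = 0x00400000    # traditional default image base for 32-bit EXEs
RELOCATED_BASE = 0x00A30000  # hypothetical base chosen by the loader

rva = 0x1010  # the same item inside the module in both cases
print(hex(rva_to_va(DEFAULT_BASE, rva)))
print(hex(rva_to_va(RELOCATED_BASE, rva)))
```

The RVA stays the same no matter where the module ends up; only the final virtual address changes with the base.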

In the next article, I will continue the discussion about PE Specification. If you have any ideas and comments about these articles please send your message to m.kahsari@gmail.com.
