Notes

Do not read codes but patch binary.

Three chains for writing a linker

This is for organising my knowledge to create a simple linker of PE. Linker is a bit hard target for being written from scratch without being considered its abstraction to me.

What a linker will do is roughly, having one or multiple object files which contains multiple sections and its corresponding data or code, then creating an executable file or static or dynamic load library.

The internal role in terms of its data structure can be split as next 3.

1. Section merging
merging them if the name of a section is identical with the one which you have observed.
2. Static relocation
resolving address of import symbols if it finds corresponding exports symbols on a different object file.
3. Preparation of dynamic relocation
preparing a data for dynamic relocation if you could not do statically.

Three linked lists which I name as "chain" are introduced accordingly.


1. Section Chain ( and Section Container)

To merge sections nicely, a linked list which links same section will be used. When you read an input object file, you assume you mapped all sections with its all tagged data or code pointed by PointerToRawData element on heap of the linker process.(Of course, you can only map sections not data at first stage, but it needs some overhead by reassignment of input file pointer). Merging them between different sections should not involve reallocation of one of data as this is just in vain. You should let them be on a same element of a linked list and do not merge them till you actually write an output file. You need two different stage linked list.

1. Section Container
=> A linked list which links sections that represents the head of the internal linked list(they are not merged).
2. Section Chain
=> An internal linked list for sections which will be merged eventually.

Below diagram will illustrate this scheme.

o -> o -> o
|    |
o1  o1
     | 
    o2

o : Section Container
o1,o2 : Section Chain

SectionChain will refer actual section data(pointer to IMAGE_SECTION_HEADER allocated on a heap), but SectionContainer will not.


2. Symbol Chain (with symbol hash table)

If you put extern symbols on an object file, it lacks elements on actual data which normally kept as blank. They needs to be resolved by linker or dynamic loader. Candidate suppliers of the address have storage class such as export or global not static. Symbol table will hold these information. When you read new object file and its symbol table, if you find a symbol which needs to be resolved, it should look for the candidate which might supply the missing address. Every time you have new import address, you need to compare the ever-exported symbol name if one of them is matched. It should not be done by iterating a loop which contains strcmp as it gets you heavy work. Often, hash table is introduced, and all export/import symbol is coded as hash produced by a hash function. The size of it which often called bucket should be adjusted through the number of given object file or its total number of symbols. Each hash function should contain a heads of linked list to deal with collision. Then, when you compute the hash for a name of import or export, if you find something there on the hash table, you need to confirm if the hash was computed by the same query, and if they are different trace the linked list. Of course, you need to do something equivalent to strcmp at serveral times if it lacks luck. But, it must be far better way rather than computing strcmp from top of the list of export symbol.


3. Object Chain (with SectionChain)

SymbolChain is good for resolving symbols, but not good for a task like collecting all of export/import symbols listed on object file. Such task is required if one of the import symbol cannot be statically linked, but could be done by one of symbols on dynamic library supplied by -l in case of gcc. They cannot provide a decisive address at this stage as they will not be on output file statically. Instead, linker will set an element of entry on image import table which tells indirectly loader to resolve the empty address. After you have done all of static relocation, and each section pointed from SectionChain have estimated virtual address and virtual size when they are executed, you need to resolve non-resolved symbols against list of dynamic link library that loader will load. ObjectChain simply points to head entry of SYMBOL_TABLE of the object file and hold its size and help collecting all of non-symbols. Since not all of entry of object table needs to be the target it is wrapped by a linked list(which I name it SymbolChain as well) where you can traverse just the entries which needs to be checked.

Basic Procedure

Non-Incremental way

  1. Read list of input object file name, and allocate each of them on heap
  2. Check header, Estimate appropriate size of bucket of hash table.
  3. Read list of image section header of each object file(construct SectionChain)
  4. Read a symbol table of each object file(fill SymbolChain in a value of symbol hash table if the entry is import/export)
    Construct ObjectChain with the iteration
  5. Allocate additional section, normally named such as idata or edata and more as you like.
    -> Its existance should not violate virtual address of section which contains export function.
  6. After reading every object files, fill VirtualAddress and estimated pointer on generated output file(PointerToRawData) ( plus VirtualSize not mandatory..) on SectionContainer by summarising SectionChain
  7. Do relocation in between the object file
    7-1. Collect import symbols which its address needs to be resolved from ObjectChain(iterating via SymbolChain on it)
    7-2. Compute the hash and get the pointer of IMAGE_SYMBOL
    7-3. IMAGE_SYMBOL has SectionIndex. Get the section name from ObjectChain.
    7-4. Traverse SectionChain and find a SectionChain whose name is matched with the one that you are looking for, get the virtual address.
    7-5. After having virtual address of the section and Offset of the export function IMAGE_SYMBOL.Value, you can fill virtual address offset of the output section.
  8. Add data for import & export section (it requires its VirtualAddress at this stage not on 7.)
  9. Generate an output file putting additional image dos & optional header.

Position settlement and relocation

What is tricky part for implementing linker is a mutual dependency between settling virtual address/ output file pointer on fixed position(procedure 6) and relocation(procedure 7). If you do not fix virtual address of an exported symbol, you cannot resolve an import symbol by it as the virtual address might be just what you want to fill in(depends on relocation type). On the other hand, linker still needs to generate some codes(such as PLT) which might consequently slide virtual address of following sections.

Some linkers solves this dependency iterating 6,7 again until virtual address settles down, but it causes link time overhead.

Better way should be separating a section which needs to be fixed for relocation and a section which does not require the settlement at the stage. If you are careful about this, you can make sure section address will not be changed during relocation adding some codes. For instance, virtual address of import/export section does not needs to be fixed until relocation. This applies when you need to generate some small codes; it should not violate settlement of virtual address, instead set a new section; somewhere after relocation target area.

Bad example is

a linker prepares a code for relocation of loader

0xff 0x25 
.text

0xe8 xx xx xx xx

....

(bottom on .text section)
0xff 0x25 xx xx xx xx(jump to IAT)

...

.data(relocation target section)
... 

.idata 

IAT

When a compiler generated call by 0xe8 xx xx xx xx you need to set a buffer(often inserting another jump code) for resolving by loader as relative call cannot take a look at values filled by loader on IAT.

Inserted jump code 0xff 0x25 xx xx xx xx is often put on the bottom of the section that the resolved function exists. But, if the section is followed by another section which might supply export symbols, then the virtual address may be slided back for the insertion.

To prevent this, the buffer code needs to be stubbed on a section after sections that requires its virtual address fixed on relocation.

How To Check validness of an output fie

When you generate an executable, you can validate if the output is valid at 3 stage.

  • Is the image successfully mapped before loading? This is done by CreateProcesswith its primary thread suspended.

  • Does loader load it properly? Checked by rewriting entry point as eternal loop such as (0xef 0xee) while you suspend it initially and resume it after rewriting. This is useful for checking optional header

  • Is the image correctly executed?


That's it. The code is on

https://github.com/Hiroshi123/bin_tools/tree/master/src/core/link

still needs more work :()

I added a diagram to illustrate the structure.

f:id:vrodxda:20200511221650p:plain
diagram

This link has some rough documentation.

https://github.com/Hiroshi123/bin_tools/wiki/LDD-documentation