Notes

Do not read codes but patch binary.

About linker development

It has already been half a year since I wrote an article about a draft of my linker for the PE format (http://vrodxda.hatenablog.com/entry/2019/11/30/160029).

Since then, it has progressed slowly, on and off (https://github.com/Hiroshi123/bin_tools/tree/master/src/core/link).

Looking back at the trajectory so far, I have added

Although it is still a work in progress, I will write down some things I have learned from developing my linker, for anybody who plans to write their own.

Necessity of custom Loader

Past a certain point, what needs the most careful consideration is investigating the dynamic loader, because of dynamic API relocation.
In my opinion, this is a real barrier to entry for writing a linker.

A debugger and the default, pre-installed loader are not enough to validate the output of a linker, because some parts cannot be debugged. To iterate quickly on output validation, you need at least one custom loader, ideally one that resembles the default loader.

On Linux, I prefer to build musl libc (https://www.musl-libc.org/) myself and insert printf calls wherever I need to know what is happening. musl often behaves differently from glibc, the default libc. Still, considering the amount of code that has to be read and the build time, it is worth picking one of the lightweight libcs for the job.

On Windows, the default loader lives in ntdll!LdrInitializeThunk (http://vrodxda.hatenablog.com/entry/2019/09/18/085454). Unlike libc, there seem to be no alternatives and no source code available.

Luckily, plenty of custom loaders are available if you search for "reflective PE loader". This is a technique that malware favours. I previously investigated the Emotet loader for my job, and it helped me a lot in validating the intermediate output of my linker, because it has less code and functionality and can run an executable without creating a new process.

Do not stick to what it formally should be

What slowed my progress was insisting on matching the output of the default linker.
There can be various motivations for developing a linker: malware analysis, making CTF challenges, or mere curiosity.
In my case, I did not set any objective at the start and tried to generate the same output as the default linker does. As a result, I ran into many bugs that came from non-essential functionality.

In my case, these were, for example:
1. Multiple loadable program headers, e.g. alignment
2. Preparation of sections and symbols which can be stripped

Regarding 1, the default linker tries to group loadable sections with the same protection into a shared program header. I could not implement this feature in a satisfactory way.
This is because, if you allow multiple program headers, the section order needs to be flexible, and some headers might come after a range whose size is fixed only after relocation. This makes the linking process complex and slow, as you need to repeatedly recompute relocation addresses and redo the relocation itself.

At some point, I decided to put every loadable section into one uniform program header no matter what, and to do extra mapping and protection later on, like UPX does.
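As a rough illustration of what a single uniform program header could look like, here is a minimal sketch that packs one ELF64 PT_LOAD entry covering the whole file with read, write and execute permission. The field layout follows the standard Elf64_Phdr structure; the base address, page size, file size and the helper name `single_load_phdr` are assumptions made for this example, not values taken from the linker described here.

```python
import struct

# Constants from the ELF64 specification.
PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4

# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
#             p_filesz, p_memsz, p_align (little-endian, 56 bytes)
PHDR_FORMAT = "<IIQQQQQQ"


def single_load_phdr(file_size, base_vaddr=0x400000, page_size=0x1000):
    """Pack one PT_LOAD header that maps the whole file as RWX.

    base_vaddr and page_size are illustrative defaults only.
    """
    return struct.pack(
        PHDR_FORMAT,
        PT_LOAD,
        PF_R | PF_W | PF_X,  # one segment, so it carries every permission
        0,                   # p_offset: segment starts at file offset 0
        base_vaddr,          # p_vaddr
        base_vaddr,          # p_paddr
        file_size,           # p_filesz
        file_size,           # p_memsz (no extra zero-filled tail in this sketch)
        page_size,           # p_align
    )


phdr = single_load_phdr(file_size=0x2000)
```

With only one segment, nothing has to be reordered or re-aligned when a section grows; the cost is that finer-grained protection has to be applied at run time by the program itself.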

For instance, .plt and .got are combined and placed with alignment, because I do not want to have to adjust virtual addresses while relocation is in progress.

// .plt
// jump to the address loaded from 2 bytes past the next instruction
// (x86-64: jmp [rip+2], with a 4-byte displacement)
0xff 0x25 0x02 0x00 0x00 0x00
// just for padding
0x00 0x00
// .got comes here (one 8-byte slot)
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
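The byte layout above can be checked with a short sketch. It assumes x86-64, where `FF 25 disp32` encodes `jmp qword ptr [rip + disp32]`, and an 8-byte .got slot; the function names and the resolved address are hypothetical and only serve the illustration.

```python
import struct

# x86-64 encoding: FF 25 disp32 is `jmp qword ptr [rip + disp32]`.
JMP_RIP_REL = b"\xff\x25"
PADDING = b"\x00\x00"
GOT_SLOT_SIZE = 8  # assumed: one 64-bit address per slot


def build_stub(disp=2):
    """Return one .plt jump, its padding, and an empty .got slot."""
    return JMP_RIP_REL + struct.pack("<i", disp) + PADDING + bytes(GOT_SLOT_SIZE)


def got_slot_offset(stub):
    """Offset the jump reads from: end of the instruction plus disp32."""
    (disp,) = struct.unpack_from("<i", stub, 2)
    return 6 + disp  # rip points just past the 6-byte instruction


def patch_got(stub, address):
    """Write a resolved 64-bit address into the slot, as relocation would."""
    off = got_slot_offset(stub)
    return stub[:off] + struct.pack("<Q", address) + stub[off + GOT_SLOT_SIZE:]


stub = build_stub()
patched = patch_got(stub, 0x7FFFF7E0A000)  # hypothetical resolved address
```

Because the displacement is relative to the instruction itself, the stub and its slot can move together to any virtual address without the jump having to be re-encoded.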

Admittedly, you cannot use the default loader's GNU_RELRO protection, as .plt and .got are on the same page. Nevertheless, I did not want to manage the complexity inherited from the original linker design.
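The reason is page granularity: memory protection can only be changed for whole pages, so making the .got read-only after relocation would also change the permissions of any .plt code on the same page. A minimal sketch of that check, assuming 4 KiB pages and hypothetical addresses:

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def page_of(addr):
    """Round an address down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


def shares_page(start_a, size_a, start_b, size_b):
    """True if two byte ranges touch at least one common page."""
    pages_a = range(page_of(start_a), start_a + size_a, PAGE_SIZE)
    pages_b = set(range(page_of(start_b), start_b + size_b, PAGE_SIZE))
    return any(p in pages_b for p in pages_a)


# Hypothetical addresses: an 8-byte .plt stub directly followed by
# an 8-byte .got slot, as in the combined layout.
plt_addr, got_addr = 0x401000, 0x401008
same_page = shares_page(plt_addr, 8, got_addr, 8)
```

Here `same_page` is true, so the two cannot be protected independently; placing the .got on its own page would bring back the alignment bookkeeping that the single-header design avoids.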

Good for making CTF challenge

As an unintended side effect, the generated ELF executable gives objdump nothing and readelf less information, while still being runnable. To some degree, this probably comes from dropping the formality I described above.
Since I love obfuscation and de-obfuscation, this design feels good to me.
These days I enjoy thinking about how challenging a problem could be made with it.
For instance, it is not difficult to prevent debuggers from stepping in if you write some initial routines that run before __libc_start_main. I do not know of many challenges where the problem was implemented at the linking level, but it should be enjoyable for both challenge authors and solvers if a binary is linked in a special way.