Notes

Do not read the code; patch the binary.

Simple overwriting of GOT

GOT overwriting is a good starting point for making sense of what "relocation" is all about. Relocation can be done in two phases: statically or dynamically. Here, I will talk about dynamic relocation.

When you call a function defined in a shared object, something has to decide where the instruction pointer should go to reach it. To get there, a relocated call hops through at least two sections, each mapped separately by a different program header.

The first step is the .plt (procedure linkage table) section, which is usually mapped in the so-called "text segment" together with the .text section.

The second step is the .got (global offset table) section, which is mapped separately in the "data segment". That segment starts at the virtual address offset given by the second program header, which leaves a memory gap after the first mapping.

But note that the jump does not land on the address where the global offset table sits. The procedure linkage table entry decides where the next instruction comes from by reading the value stored in the global offset table, not the address of the table itself in the data segment.

What the dynamic loader does when it relocates is simply rewrite a sequence of bytes in the GOT that denotes a memory address. Before relocation, the slot holds the address of the next instruction in the same .plt entry, which is what makes the relocation optional for the loader. While the slot is unset, the jump simply lands on that next instruction, as if execution had continued normally, and from there goes on to the next jump, which reads the entries at the head of the global offset table where the resolver's address is stored. Once the slot has been set, the jump goes straight to the target function.
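
The fall-through behavior described above can be modeled in plain C, with a function pointer playing the role of one GOT slot. This is only a sketch of the idea, not how any real loader is implemented; every name in it (got_slot, resolver_stub, call_via_plt, real_target, resolve_count) is hypothetical.

```c
/* Model of one PLT entry and one GOT slot; all names are hypothetical. */
typedef int (*func_t)(int);

/* Stands in for the function that lives in the shared library. */
static int real_target(int x) { return x + 1; }

/* Stands in for the rest of the PLT entry plus the loader's resolver. */
static int resolver_stub(int x);

/* The "GOT slot": before relocation it points back at the stub. */
static func_t got_slot = resolver_stub;

/* Counts how many times the slow path (the resolver) was taken. */
static int resolve_count = 0;

static int resolver_stub(int x) {
  /* First call only: patch the slot, then continue to the target. */
  resolve_count++;
  got_slot = real_target;
  return real_target(x);
}

/* The "PLT entry": an indirect jump through the value stored in the slot. */
static int call_via_plt(int x) { return got_slot(x); }
```

The first call to call_via_plt goes through resolver_stub and patches the slot; later calls reach real_target directly. Overwriting got_slot with any other function of the same type is the same move as the GOT overwrite these notes are about.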

Why does it need to be optional? Consider a situation where you load a shared library that depends on many functions in other libraries with heavy dependencies of their own, but you will not use most of them. For example, if you include a header whose implementation lives in libstdc++.so, it refers to many functions in libc.so; a member function of the std::thread class might depend on pthread_create(3) in libc.so. If the main program never uses std::thread, connecting std::thread in libstdc++.so with pthread_create in libc.so is wasted work. In that situation the relocation cost can be postponed until the function is actually called. One example is the RTLD_LAZY flag, which can be passed as the second argument of dlopen(3) and saves computation in the loading phase.
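
As a small illustration of the flag mentioned above, here is a sketch of opening a library with RTLD_LAZY and looking up one symbol. The helper name lookup_lazy is hypothetical. Note that the flag only postpones the binding of function calls made through the PLT of the loaded object; dlsym itself resolves the symbol immediately. On older glibc versions you may need to link with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: open `lib` with lazy binding and look up `sym`.
   Passing NULL as `lib` asks for a handle to the main program itself.
   Returns NULL when either the library or the symbol cannot be found.
   The handle is intentionally left open so that the returned address
   stays valid for the rest of the process. */
static void* lookup_lazy(const char* lib, const char* sym) {
  void* handle = dlopen(lib, RTLD_LAZY);
  if (handle == NULL) {
    return NULL;
  }
  return dlsym(handle, sym);
}
```

For example, lookup_lazy("libm.so.6", "cos") would typically return the address of cos on a glibc system; the library file name is an assumption about the platform.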

Coming back to what the PLT and GOT provide: you can easily hook a function that bridges to another shared object by rewriting (re-relocating) its address in the GOT, unless it is set lazily.

First, you get the address of the PLT entry, then follow it to the GOT slot, where you change the value to wherever you want to jump instead. Note that when there is a GNU_RELRO entry in the program headers, it describes the start and end of the region whose addresses are filled in after loading, including .got, and that region is set to read-only once relocation is done. Even so, after the dynamic loader has filled it in, you can regain write permission by calling mprotect() again with PROT_WRITE.

Here is a short code snippet that does this.


#include <stdio.h>
#include <sys/mman.h>

// This is a simple example that hooks puts(), defined in libc.so,
// through the global offset table and redirects it to f2(), defined in this file.

// What it does is as follows.
// 1st, find the procedure linkage table (PLT) entry of puts by examining part of the text area.
// 2nd, take the pointer to the PLT entry and read the offset to the GOT from it.
// 3rd, compute the pointer to the global offset table (GOT) slot.
// 4th, rewrite it!
// 5th, test whether the hook worked.

// basically this scheme should work out in any 

void f1() {
  // when you call puts,
  // the call instruction emitted here tells you
  // where the PLT entry is.
  puts("hei!");
}

void f2(){
  const char str[] = "world!";
  // by default gcc may convert printf("string\n")
  // to puts. be careful, as puts has been redirected to f2 itself.
  printf("%s \n",str);
}

int main() {
  
  // scan the machine code of f1; this assumes the compiler places f1 directly before f2.
  unsigned char* begin = (unsigned char*)&f1;
  unsigned char* end = (unsigned char*)&f2;
  for (;begin!=end;begin++) {
    // assume call instruction is represented as
    // 0xe8 + 4byte(%rip).
    if (*begin == 0xe8) {
      // cast the pointer so you can grab the 4-byte displacement operand.
      // it is signed: the PLT usually sits below .text, so the value is negative.
      int* offset1 = (int*)(begin+1);
      // calculate the address of the PLT entry, e.g. in assembly: lea offset(%rip), %rax
      // %rip here is the address right after the 5-byte call instruction.
      // reading the displacement as signed lets sign extension handle the
      // negative direction, so no manual masking of the upper bytes is needed.
      unsigned short* plt_addr = (unsigned short*)((size_t)(begin+5) + (long int)(*offset1));
      
      // once you arrive at the PLT, what waits for you is not another call but a jump.
      // And it is not an ordinary jump but a jump to the address stored in the GOT.
      // it starts with 0xff, 0x25, then a 4-byte displacement from %rip.
      // 0xff is the opcode; the second byte selects the addressing mode.
      // this assumes the entry begins directly with the jump (no endbr64 in front of it).
      // what you need is the last 4 bytes: the offset from the current %rip to the GOT slot.

      // since the type is short one step means 2byte forwards.
      plt_addr+=1;
      // get the offset which is 4 byte.
      unsigned int* offset = (unsigned int*)plt_addr;
      // proceed another 2*2(4byte) to reach %rip.
      plt_addr+=2;
      // calculate GOT of this function.
      // it does not hold any instruction but only 8byte address.
      size_t* got_addr = (size_t*)((size_t)plt_addr + (unsigned int)*offset);

      // if there is a GNU_RELRO entry in the program headers,
      // the loader may have made part of the data segment read-only
      // after relocating it.
      // you can ask the kernel again to make the mapping writable.
      // mprotect needs a page-aligned address, so round down (assuming 4096-byte pages).
      mprotect((void*)((size_t)got_addr & ~(size_t)0xfff), 4096, PROT_READ | PROT_WRITE);
      // finally rewrite it to whatever you like to jump.
      *got_addr = (size_t)f2;      
      break;
    }
  }

  // When you call puts after overwriting the GOT,
  // f2 is called instead.
  // make sure you get "world!", not "hello!"
  puts("hello!");
}
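
The address arithmetic in the listing, for both the call (0xe8) and the indirect jump (0xff 0x25), comes down to one rule: take the address of the byte after the instruction and add the signed 32-bit displacement. Below is a minimal sketch of that rule as a standalone helper that can be checked against a plain byte buffer. The name rel32_target is hypothetical, and the code assumes a little-endian host, as on x86-64.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper. `insn` points at the first byte of an instruction
   whose last 4 bytes are a signed displacement; `opcode_len` is the number
   of bytes before that displacement (1 for call 0xe8, 2 for jmp 0xff 0x25).
   Returns the address the displacement refers to. */
static const unsigned char* rel32_target(const unsigned char* insn,
                                         size_t opcode_len) {
  int32_t disp;
  /* memcpy avoids an unaligned pointer cast and keeps the sign intact. */
  memcpy(&disp, insn + opcode_len, sizeof disp);
  /* %rip at execution time: the first byte after this instruction. */
  const unsigned char* next = insn + opcode_len + sizeof disp;
  return next + disp;
}
```

For the call instruction the result is the address of the PLT entry. For the jump it is the address of the GOT slot, which still has to be dereferenced to get the function address.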

That is simple enough to capture the essence of relocation. But if you want a comprehensive understanding, I recommend reading the musl (https://www.musl-libc.org/) source carefully.