Mix-C-and-Assembly

Following a guide from OSdev

The guide

Using gcc in OS dev

  • First, you can’t include any header files that you didn’t write yourself.
    • Most header files have dependencies on the OS for which they were written.
    • For example, when a program that uses printf or cout runs, an OS service (a syscall, I guess) is called to display the text.
    • The only portions of C/C++ that we can use are what we will call the core language: the reserved keywords and expressions that are available when no header files are included.
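As a rough illustration of what "core language only" looks like, here is a minimal sketch of two helpers a kernel might write for itself instead of pulling in a library header. The names k_strlen and k_memset are hypothetical and do not come from the guide.

```c
/* No #include lines: only keywords, operators and plain types are used. */

/* Hypothetical helper: length of a NUL-terminated string. */
unsigned int k_strlen(const char *s)
{
    unsigned int n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hypothetical helper: fill `count` bytes starting at `dst` with `value`. */
void *k_memset(void *dst, int value, unsigned int count)
{
    unsigned char *p = dst;
    while (count--)
        *p++ = (unsigned char)value;
    return dst;
}
```

Neither function calls into an OS, so the same source should compile unchanged with -ffreestanding.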

A simple kernel that does nothing

int main(){
repeat:
    goto repeat;    /* spin forever */
}

$ gcc -ffreestanding -c -o kernel.o kernel.c
# -ffreestanding tells gcc to produce code that is meant to run without an OS

$ ld -Ttext 0x100000 --oformat binary -o kernel.bin kernel.o
ld: warning: cannot find entry symbol _start; defaulting to 0000000000100000
# --oformat specifies the output format; we'll be using binary
# -Ttext specifies the address the code will be loaded at, similar to `org 0x100000` in assembly
# -o specifies the name of the file that is created

$ man gcc
$ man ld

Recreate the kernel in assembly

[BITS 32]
repeat:
    jmp repeat

$ nasm -f coff ker.asm -o ker.o
# -f specifies the output format; we'll be using coff (a type of object file)

Mixing C and Assembly

Functions

The caller pushes the function’s parameters onto the stack one after another in reverse order (right to left, so that the first argument specified to the function is pushed last). After the call returns, the caller also removes them from the stack.

; define C-usable functions in assembly
global _funcname        ; C's funcname is the symbol _funcname in coff/aout
_funcname:
    push ebp            ; set up the stack frame that `leave` undoes
    mov ebp, esp
    ...
    leave
    ret

; call a C function from assembly
extern _Cfun
    push dword [arg2]   ; the second argument
    push dword [arg1]   ; the first argument
    call _Cfun
    add esp, 8          ; clear the argument space on the stack (2 args x 4 bytes)
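To make the push order concrete, here is a toy model of the stack in C. It is only a sketch: it simulates the stack with an array rather than touching the real one, and every name in it (toy_stack, toy_call and so on) is made up for illustration.

```c
/* Toy model of a 32-bit cdecl call. Nothing here touches the real stack. */
typedef unsigned int u32;

enum { TOY_STACK_WORDS = 16 };

u32 toy_stack[TOY_STACK_WORDS];
int toy_sp = TOY_STACK_WORDS;      /* index of the top of stack; grows downward */

void toy_push(u32 value)
{
    toy_stack[--toy_sp] = value;
}

/* Caller side of `f(a, b)`: arguments right to left, then the return address. */
void toy_call(u32 a, u32 b, u32 return_address)
{
    toy_push(b);                   /* second argument is pushed first */
    toy_push(a);                   /* first argument is pushed last */
    toy_push(return_address);      /* what the `call` instruction pushes */
}

/* `ret` pops the return address; the caller then drops the arguments,
   which is what `add esp, 4 * nargs` does in real code. */
void toy_return_and_clean(int nargs)
{
    toy_sp += 1;
    toy_sp += nargs;
}
```

Right after toy_call(1, 2, ...) the slots toy_stack[toy_sp], toy_stack[toy_sp + 1] and toy_stack[toy_sp + 2] correspond to [esp] (return address), [esp+4] (first argument) and [esp+8] (second argument) as seen by the callee on entry.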

Accessing data items

To get at the contents of C variables, or to declare variables which C can access, you need only declare the names as GLOBAL or EXTERN.

; access C data
; int i = 0x7800;
extern _i
    mov eax, [_i]

; export data to C (declared there as `extern char d;`)
SECTION .data
global _d
_d  db 99

Mixed kernel

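extern void sayhi(void);
extern void quit(void);

int main(void)
{
    sayhi();
    quit();
}

[BITS 32]

GLOBAL _sayhi
GLOBAL _quit

SECTION .text

; Write "Hi" into the last two character cells of the 80x25 text screen
; (assumes es is a flat segment with base 0).
_sayhi: mov byte [es:0xb8f9c],'H'
        mov byte [es:0xb8f9e],'i'
        ret

; quit has no frame of its own, so ebp is still main's frame pointer:
; discard main's frame and far-return to whatever called main
; (assumes main uses a standard ebp frame and was entered by a far call).
_quit:  mov esp,ebp
        pop ebp
        retf

$ gcc -ffreestanding -c -o mix_c.o mix_c.c
$ nasm -f coff -o mix_asm.o mix_asm.asm
$ ld -Ttext 0x100000 --oformat binary -o kernel32.bin mix_c.o mix_asm.o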
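The two addresses written by _sayhi can be derived rather than hard-coded. The sketch below assumes the standard 80x25 colour text mode with the buffer at 0xB8000 and two bytes per cell (character, then attribute). The names vga_offset and vga_put are hypothetical, not part of the guide.

```c
/* Assumptions: 80x25 colour text mode, buffer at physical 0xB8000,
   two bytes per cell (character byte, then attribute byte). */
enum { VGA_COLS = 80, VGA_ROWS = 25 };
#define VGA_BASE 0xB8000u

/* Byte offset of the character cell at (row, col) from the buffer start. */
unsigned int vga_offset(unsigned int row, unsigned int col)
{
    return (row * VGA_COLS + col) * 2;
}

/* Write one character into a text buffer. In a kernel, `buf` would be
   (volatile unsigned char *)VGA_BASE; taking it as a parameter lets the
   function be tried out against an ordinary array first. */
void vga_put(volatile unsigned char *buf, unsigned int row,
             unsigned int col, char c)
{
    buf[vga_offset(row, col)] = (unsigned char)c;
}
```

With these definitions, VGA_BASE + vga_offset(24, 78) is 0xb8f9c and VGA_BASE + vga_offset(24, 79) is 0xb8f9e, the same two cells that _sayhi writes to.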

Linux warning: there is a problem with ld on Linux. The ld that ships with Linux distros lists support for the coff object format, but apparently you have to rebuild binutils from gnu.org to get it working. I found two possible solutions: recompile ld, or edit your assembly files to remove all the leading underscores and assemble with nasm’s -f aout option instead of coff. I’ve tested the second method briefly and it works.

Tips

  • If you get this error when linking (ld: i386 architecture of input file xxxx.o is incompatible with i386:x86-64 output), add the argument -m elf_i386 to the ld command.

  • It’s helpful to have the NASM manual on hand.