Design overview of Prodebug
Development Environment and Languages Used
Details on the debugger implementation
The code was developed on an Intel 386 Linux PC. Linux is a free OS; anyone can download it off the web and use it. What's more, it can run on the barest of hardware. Hence I chose Linux over developing in DOS or Windows (which require extensive support).
I have used C and assembly to code this. Assembly language is used primarily in the boot sector code, the interrupt handlers and for accessing i386 structures like the GDT, LDT etc. However, a large part of the debugger is written in C, for the following reasons :
- C code is a lot more readable, and easier to understand and maintain, than asm.
- I do not use any libraries; even the printf and scanf that I use are my own, so there is absolutely no overhead and no dependencies are involved. The code uses pure ANSI C.
- Since a large part of the code sets up data structures, I have used C because it is easier to read that as a struct, whereas in asm it would look like some meaningless moving of bytes here and there.
I have used nasm (a freely available 32 bit assembler) and gcc (the well-known GNU C compiler) to compile the code. Also, sed is used to find the addresses of some symbols in the kernel. These are standard tools that can be found on any Linux distribution, and are downloadable from the net. To test the code, I have used a real PC and an IA32 emulator called bochs.
All the tools used are free software, freely available to anyone. More information about how to install them (if you don't already have them) and the forms they are available in (tarball archive, rpm etc.) can be found on their respective home pages.
To understand the code, you need to understand a bit of the following :
- Protected mode Intel 386 programming
The authoritative source for this would be Volume 3 of the Intel 386 manual.
- Assembly coding. I have used nasm. For those who do not know, nasm is a free, portable x86 assembler that uses Intel syntax, similar to the Microsoft assembler, masm, available on DOS systems.
- Peripheral specific driver information
- PIC : The programmable interrupt controller. This is useful to understand how interrupts are controlled.
- Intel floppy disk controller (FDC) : To understand how the floppy driver can be written.
- Intel 8042 keyboard controller
The debugger is chiefly composed of four main modules, namely
- Boot Up Module
- Kernel Proper
- Device Drivers
- Debugger Module
Their details are given below.
Bootup Module details
This resides in the boot subdirectory. The bootup module consists of the following files :
- bootsect.s : This is executed in real mode and is found on the first sector of the floppy image. It is placed by the BIOS at memory location 0x7c00. It proceeds to load the kernel into main memory, using BIOS routines to read the floppy. It then transfers control to the setup routine, which prepares the system for the switch to protected mode.
- setup.s : The setup code sets up the initial GDT and IDT and enables the A20 gate in preparation for protected mode. The PIC is then reprogrammed so that the IRQs are delivered on vectors above those reserved for processor exceptions. The switch to protected mode is done by setting bit 0 of the CR0 register, then doing a far jump so that the segment registers are reloaded with appropriate protected mode values. Interrupts are disabled, and the PIC is asked to mask all interrupts. It then jumps to the kernel entry point, which is the Main function found in main.c.
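The PIC reprogramming step in setup.s can be illustrated in C. This is only a sketch of the standard 8259A initialization sequence, not the code from setup.s (which is assembly): `port_out`, `pic_log` and `pic_remap` are hypothetical names, and `port_out` merely records each write so the sequence can be inspected on any machine instead of touching real I/O ports. The vector bases 0x20 and 0x28 used in the usage note are an assumption, chosen as the first vectors after the 32 reserved for the processor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a port output routine: it records each write
   instead of executing an `out` instruction. */
struct port_write { uint16_t port; uint8_t value; };
struct port_write pic_log[16];
int pic_log_len = 0;

void port_out(uint16_t port, uint8_t value)
{
    assert(pic_log_len < 16);
    pic_log[pic_log_len].port = port;
    pic_log[pic_log_len].value = value;
    pic_log_len++;
}

#define PIC1_CMD  0x20   /* master PIC command port */
#define PIC1_DATA 0x21   /* master PIC data port    */
#define PIC2_CMD  0xA0   /* slave PIC command port  */
#define PIC2_DATA 0xA1   /* slave PIC data port     */

/* Move IRQ 0-7 to vectors starting at master_base and IRQ 8-15 to vectors
   starting at slave_base, then mask every IRQ line. */
void pic_remap(uint8_t master_base, uint8_t slave_base)
{
    port_out(PIC1_CMD, 0x11);          /* ICW1: begin init, ICW4 follows  */
    port_out(PIC2_CMD, 0x11);
    port_out(PIC1_DATA, master_base);  /* ICW2: vector base of the master */
    port_out(PIC2_DATA, slave_base);   /* ICW2: vector base of the slave  */
    port_out(PIC1_DATA, 0x04);         /* ICW3: slave attached to IRQ2    */
    port_out(PIC2_DATA, 0x02);         /* ICW3: slave cascade identity 2  */
    port_out(PIC1_DATA, 0x01);         /* ICW4: 8086 mode                 */
    port_out(PIC2_DATA, 0x01);
    port_out(PIC1_DATA, 0xFF);         /* OCW1: mask all master IRQs      */
    port_out(PIC2_DATA, 0xFF);         /* OCW1: mask all slave IRQs       */
}
```

Calling `pic_remap(0x20, 0x28)` would place the timer on vector 0x20 and the keyboard on vector 0x21, leaving vectors 0 to 31 to the processor.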
Kernel Proper module details
This resides in the kernel subdirectory. The kernel proper is in charge of initialization and of providing useful routines to the rest of the system. The following files are in this module :
- main.c : This is the kernel entry point. It sets up the IDT, GDT and LDT, sets up the memory manager, and initializes the keyboard, console, floppy driver and other facilities like printk which require initialization. The PIC is then told to allow only keyboard interrupts; other interrupts like floppy and timer (PIT) are enabled and disabled as needed. It also initializes the kernel TSS (which will be used later by the debugger interrupt to transfer control to the kernel). After doing all this, interrupts are enabled and control is passed on to the debugger.
- descriptor.c, idt.c, tss.c : Initialization and helper routines for accessing the GDT, TSS and IDT.
- alloc.c : The kernel's page based memory manager. It handles dynamic allocation of memory in units of one page (4k).
- printk.c : Implements the printk, scanf and gets functions which are used throughout for console I/O.
- int.s : This contains the various interrupt routines for the floppy, debugger, keyboard and timer.
- asm.s : This file contains all the low level assembly code necessary for interfacing with the various devices and the processor. Inline assembly was not used for reasons of portability, although that would have been faster.
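To make the job of the descriptor helper routines more concrete, here is a minimal sketch of how an i386 segment descriptor and a segment selector are packed. The function names `make_descriptor` and `make_selector` are hypothetical and are not taken from descriptor.c; only the bit layout is the one defined by the processor.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an 8 byte i386 segment descriptor.
   base   : 32 bit linear base address of the segment
   limit  : 20 bit segment limit
   access : access byte (present bit, DPL, type)
   flags  : upper nibble of byte 6 (granularity, default operand size) */
uint64_t make_descriptor(uint32_t base, uint32_t limit,
                         uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    assert(limit <= 0xFFFFF);                     /* limit is only 20 bits */
    assert(flags <= 0xF);

    d |= (uint64_t)(limit & 0xFFFF);              /* limit 15:0  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;       /* base 23:0   -> bits 16-39 */
    d |= (uint64_t)access << 40;                  /* access byte -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* limit 19:16 -> bits 48-51 */
    d |= (uint64_t)flags << 52;                   /* flags       -> bits 52-55 */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;   /* base 31:24  -> bits 56-63 */
    return d;
}

/* Build a selector: table index, table indicator (0 = GDT, 1 = LDT) and
   requested privilege level. */
uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192 && use_ldt <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (use_ldt << 2) | rpl);
}
```

For example, a flat 4 GB ring 0 code segment is `make_descriptor(0, 0xFFFFF, 0x9A, 0xC)`, and `make_selector(1, 0, 0)` gives 0x08, the selector of GDT entry 1.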
Device Drivers module details
The device drivers are contained in files like keyboard.c, floppy.c and timer.c in the kernel subdirectory. The int.s file in the same directory has the interrupt routines for the floppy, timer, keyboard and debugger. All the interrupt routines do the following :
- Save state
- Switch to the kernel ds and cs
- Call the main interrupt handler of that particular device, e.g. doDebuggerInt() or doKeyboardInterrupt().
- Send an EOI to the PIC, so that it can deliver further interrupts
- Restore state and do an iret.
These interrupts are initialized in initIDT() in idt.c. Note that the debugger interrupt is explained below.
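Each interrupt routine is reached through a gate in the IDT. As an illustration of what installing such a gate involves, the following sketch packs a 32 bit interrupt gate. The struct and the function `make_gate` are hypothetical and are not the contents of initIDT(); the field layout is the one the i386 defines.

```c
#include <assert.h>
#include <stdint.h>

/* One 8 byte IDT entry as laid out by the i386. */
struct idt_gate {
    uint16_t offset_low;    /* handler address bits 15:0          */
    uint16_t selector;      /* code segment selector of handler   */
    uint8_t  reserved;      /* must be zero                       */
    uint8_t  type_attr;     /* present bit, DPL and gate type     */
    uint16_t offset_high;   /* handler address bits 31:16         */
};

/* Build a gate for a handler at the given offset within the code segment
   named by selector. 0x8E means: present, DPL 0, 32 bit interrupt gate. */
struct idt_gate make_gate(uint32_t handler, uint16_t selector,
                          uint8_t type_attr)
{
    struct idt_gate g;

    assert(type_attr & 0x80);   /* an installed gate must be marked present */
    g.offset_low  = (uint16_t)(handler & 0xFFFF);
    g.selector    = selector;
    g.reserved    = 0;
    g.type_attr   = type_attr;
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}
```

An interrupt gate (as opposed to a trap gate, type 0x8F) clears the interrupt flag on entry, which suits handlers that must not be re-entered before they send the EOI.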
- Keyboard Driver : In file keyboard.c. The keyboard driver is a simple circular queue : when a key is pressed, the keyboard interrupt handler calls the enque method, and to retrieve a key, it is simply dequeued.
- Console Driver : In file console.c. It does the following :
- Keeps track of the cursor position and scrolls if necessary; it also keeps the cursor position in the video controller up to date.
- The string to be printed is just put into video memory, starting at the current cursor location.
- Also maintains a history of four pages and implements scrolling in response to the ctrl+U and ctrl+D keys.
- Floppy Driver : In file floppy.c. It does the following :
- Sets up DMA for data transfer
- Enables the floppy interrupt
- Recalibrates the drive if necessary
- Sends commands to the disk controller to read/write a sector
- Handles floppy drive errors, such as seek errors, by retrying
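The circular queue used by the keyboard driver can be sketched as follows. This is a minimal illustration under assumed names: `struct key_queue`, `KEY_QUEUE_SIZE` and the exact signatures of `enque` and `deque` are hypothetical, and the real keyboard.c may differ. The interrupt handler is the only writer and the debugger's input routine the only reader.

```c
#include <assert.h>
#include <stdint.h>

#define KEY_QUEUE_SIZE 64   /* hypothetical capacity; one slot stays unused */

struct key_queue {
    uint8_t  data[KEY_QUEUE_SIZE];
    unsigned head;          /* index of the next key to read  */
    unsigned tail;          /* index of the next free slot    */
};

/* Called from the keyboard interrupt handler.
   Returns 1 on success, 0 if the queue is full and the key was dropped. */
int enque(struct key_queue *q, uint8_t key)
{
    unsigned next;

    assert(q != 0);
    next = (q->tail + 1) % KEY_QUEUE_SIZE;
    if (next == q->head)
        return 0;                       /* full: drop the key */
    q->data[q->tail] = key;
    q->tail = next;
    return 1;
}

/* Called by the consumer. Returns the key, or -1 if the queue is empty. */
int deque(struct key_queue *q)
{
    uint8_t key;

    assert(q != 0);
    if (q->head == q->tail)
        return -1;                      /* empty */
    key = q->data[q->head];
    q->head = (q->head + 1) % KEY_QUEUE_SIZE;
    return key;
}
```

Keeping one slot unused lets "empty" (head equals tail) be told apart from "full" without a separate counter that both the handler and the reader would have to update.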
Debugger Module details
It consists of the debugger.c file in the kernel subdirectory and the int 1 handler in int.s. The debugger essentially consists of three pieces of code :
- Command Input Loop : This essentially waits for a user command by calling the gets function of the console driver, and then acts on the command. It calls functions like run(), loadProgram(), setTraps(), setBreakPoint() etc. in response to the commands.
- Int 1 : Interrupt 1 is the debug exception that the processor raises in response to single-step tracing or when a hardware breakpoint is reached. Its code is contained in int.s.
- Int 3 : Interrupt 3, on the other hand, is raised in case of a software breakpoint (the int 3 instruction, or the one byte opcode 0xCC). Whenever the processor finds this opcode as the first byte of the instruction being decoded, this handler is invoked. In fact this is how the debugger implements the runTill command : it replaces the first byte of the instruction to run till with the opcode 0xCC. When execution reaches that instruction, the int 3 handler is invoked and, in addition to the steps listed below, puts back the saved original byte.
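The byte patching behind runTill can be sketched as below. The names `struct breakpoint`, `bp_arm` and `bp_hit` are hypothetical and not taken from debugger.c. One detail the sketch makes explicit is an assumption about how the handler resumes: since int 3 is a trap, the eip saved by the processor points just past the 0xCC byte, so it has to be moved back by one before the original instruction can run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

struct breakpoint {
    uint8_t *addr;    /* first byte of the instruction to stop at */
    uint8_t  saved;   /* original byte that 0xCC replaced         */
    int      armed;   /* non-zero while 0xCC is in place          */
};

/* Save the original byte and plant the int 3 opcode. */
void bp_arm(struct breakpoint *bp, uint8_t *addr)
{
    assert(bp != NULL && addr != NULL);
    bp->addr  = addr;
    bp->saved = *addr;
    *addr     = INT3_OPCODE;
    bp->armed = 1;
}

/* To be called when the int 3 handler runs: restore the original byte and
   return the eip at which the program should resume (the saved eip points
   one byte past the 0xCC). */
uint32_t bp_hit(struct breakpoint *bp, uint32_t saved_eip)
{
    assert(bp != NULL);
    if (bp->armed) {
        *bp->addr = bp->saved;
        bp->armed = 0;
    }
    return saved_eip - 1;
}
```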
Both the debugging interrupt handlers are pretty similar, and this is exactly what they do :
- Saves state on the stack
- Moves to the kernel ds and cs
- Saves the linear address of ss:esp into a variable. This then points to the saved state of the user program on the stack, and can later be used for modifying its registers.
- Calls the function doDebuggerInt
- Pops (restores) state, followed by an iret.
- A note on the saved state : The thread state is initially stored in its task state segment. Later, when an interrupt occurs (for example int 1), all the registers including the segment selectors are saved on the stack, except eip, cs, ss, esp and eflags. These five registers are not saved by us, as they are already saved by the 386 when an interrupt occurs. However, the complications are :
- If a stack switch does not occur, ss and esp (of the original stack) are not stored.
- Some interrupts push an error code, so the exact offsets of the saved eip, cs, eflags, esp and ss depend on which interrupt occurred and whether a stack switch occurred.
A stack switch will occur if the cpl of the running program was not equal to 0 (the kernel privilege level). As for the error codes, we simply know which interrupts push an error code, and hence can go ahead and calculate the offsets of ss, esp, eip, cs and eflags. The function doing this is called doFullSaveState, in debugger.c.
Also note that initially, when a program is launched, it takes its register values from those stored in its TSS, whereas when it returns from an interrupt it takes the values stored on the ring 0 stack, as detailed above. For this purpose we need a variable which can tell us whether we are returning from an interrupt handler or have yet to launch the user program. This variable is called fromThread and is set in the doFullSaveState function.
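The offset calculation described in this note can be sketched as follows. It is not the body of doFullSaveState: the names `struct frame_layout`, `pushes_error_code`, `stack_was_switched` and `cpu_frame_layout` are hypothetical. Indices are counted in 32 bit words from the last word the processor pushed (the lowest address of the frame it built), so the registers pushed afterwards by the handler itself lie below index 0. The error code list covers the 386 exceptions only.

```c
#include <assert.h>
#include <stdint.h>

/* Word indices of the values pushed by the processor; -1 means absent. */
struct frame_layout {
    int eip, cs, eflags;
    int esp, ss;          /* present only if a stack switch took place */
};

/* On the 386, exceptions 8 and 10 to 14 push an error code. */
int pushes_error_code(int vector)
{
    return vector == 8 || (vector >= 10 && vector <= 14);
}

/* With the kernel at ring 0, the stack was switched exactly when the
   interrupted code ran at a privilege level other than 0, which can be
   read from the low two bits of the saved cs. */
int stack_was_switched(uint32_t saved_cs)
{
    return (saved_cs & 3) != 0;
}

struct frame_layout cpu_frame_layout(int vector, int stack_switched)
{
    struct frame_layout f;
    int base;

    assert(vector >= 0 && vector <= 255);
    base = pushes_error_code(vector) ? 1 : 0;   /* skip the error code */
    f.eip    = base;
    f.cs     = base + 1;
    f.eflags = base + 2;
    f.esp    = stack_switched ? base + 3 : -1;
    f.ss     = stack_switched ? base + 4 : -1;
    return f;
}
```

For int 1 and int 3 raised in a user program, no error code is pushed and a stack switch occurs, so the frame is simply eip, cs, eflags, esp, ss in ascending order.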
- DoDebuggerRoutine : This routine is contained in debugger.c. It switches the processor back to the kernel state, i.e. it switches to the kernel task. The details of this operation are described later.
Launching of a user program :
When the user does a load program, the following things happen :
- The program is read from the specified floppy location.
- Memory is allocated for the program's stack, code and data.
- 16k is allocated for each of the code, data and stack segments, and 64k for the extended data segment (eds). The segment selectors for these point into the LDT, where the descriptors for these segments are created.
- Once all the memory is allocated, the TSS of the program is created, in which the registers are initialized to 0 and esp to the highest address in the user stack segment.
- The TSS descriptor is created in the GDT, and is made to point to the TSS structure created above. Note that we allocate one PAGE_SIZE (4k) bytes for the TSS structure; however, the TSS struct occupies only 104 bytes, so we use the rest of this page as a ring 0 stack (this is the stack the cpu will switch to if an interrupt occurs). This explains why esp0 is set to PAGE_SIZE. Hence the ring 0 stack to which the cpu switches in case of an interrupt is specified in the TSS.
- To run the program, we simply do a task switch specifying the program TSS created above. This way the cpu automatically stores the kernel context in the kernel TSS (useful later).
Handling of a debugger interrupt in detail :
The DoDebuggerRoutine simply does a task switch to the saved kernel TSS; this way the kernel resumes exactly where it left off. Later, when the program eventually has to be run again, we just do a task switch to the program TSS. Hence the program starts off by returning from the DoDebuggerInt routine, and then proceeds to complete the interrupt 1. Note that the context saved earlier on the stack may have been changed before it does an iret (this is described above).
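The TSS page created when a program is loaded can be pictured with the following sketch. The field layout is the 104 byte 32 bit TSS defined by the processor; the names `struct tss` and `tss_init` and the parameters are hypothetical and not taken from tss.c. The sketch assumes, as the text implies, that ss0 names a segment based at the start of the TSS page, so that an esp0 of PAGE_SIZE is the top of that page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

/* 32 bit task state segment. Selector fields use only their low 16 bits. */
struct tss {
    uint32_t prev_link;
    uint32_t esp0, ss0, esp1, ss1, esp2, ss2;   /* stacks for rings 0 to 2 */
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint16_t trap;
    uint16_t iomap_base;
};

/* Prepare the TSS of a freshly loaded program: all registers zero, the
   user esp at the top of the user stack segment, and the ring 0 stack at
   the top of the page that also holds the TSS itself. */
void tss_init(struct tss *t, uint16_t kernel_ss, uint32_t user_stack_top)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
    t->esp0 = PAGE_SIZE;        /* ring 0 stack grows down from page top */
    t->ss0  = kernel_ss;
    t->esp  = user_stack_top;   /* highest address of the user stack     */
}
```

With a 16k user stack, `tss_init(&t, kernel_ss, 0x4000)` leaves PAGE_SIZE minus 104 bytes of the page for the ring 0 stack, which must be enough for the deepest interrupt frame plus the registers the handlers push.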