ZHAO YANG MIN, June 11, 2017

Lab 1: Booting a PC

The code is available at https://coding.net/u/yangminz/p/MITJOS/git. You can get it with git clone:

git clone https://git.coding.net/yangminz/MITJOS.git

then cd into the directory and switch to the appropriate branch to run it.

Introduction

This lab is split into three parts. The first part concentrates on getting familiarized with x86 assembly language, the QEMU x86 emulator, and the PC's power-on bootstrap procedure. The second part examines the boot loader for our 6.828 kernel, which resides in the boot directory of the lab tree. Finally, the third part delves into the initial template for our 6.828 kernel itself, named JOS, which resides in the kernel directory.

Software Setup

The files you will need for this and subsequent lab assignments in this course are distributed using the Git version control system. To learn more about Git, take a look at the Git user's manual, or, if you are already familiar with other version control systems, you may find this CS-oriented overview of Git useful.

The URL for the course Git repository is https://pdos.csail.mit.edu/6.828/2016/jos.git. To install the files in your Athena account, you need to clone the course repository, by running the commands below. You must use an x86 Athena machine; that is, uname -a should mention i386 GNU/Linux or i686 GNU/Linux or x86_64 GNU/Linux. You can log into a public Athena host with ssh -X athena.dialup.mit.edu.

athena% mkdir ~/6.828
athena% cd ~/6.828
athena% add git
athena% git clone https://pdos.csail.mit.edu/6.828/2016/jos.git lab
Cloning into lab...
athena% cd lab
athena% 

Git allows you to keep track of the changes you make to the code. For example, if you are finished with one of the exercises, and want to checkpoint your progress, you can commit your changes by running:

athena% git commit -am 'my solution for lab1 exercise 9'
Created commit 60d2135: my solution for lab1 exercise 9
 1 files changed, 1 insertions(+), 0 deletions(-)
athena% 

You can keep track of your changes by using the git diff command. Running git diff will display the changes to your code since your last commit, and git diff origin/lab1 will display the changes relative to the initial code supplied for this lab. Here, origin/lab1 is the name of the git branch with the initial code you downloaded from our server for this assignment.

We have set up the appropriate compilers and simulators for you on Athena. To use them, run add -f 6.828. You must run this command every time you log in (or add it to your ~/.environment file). If you get obscure errors while compiling or running qemu, double check that you added the course locker.

If you are working on a non-Athena machine, you'll need to install qemu and possibly gcc following the directions on the tools page. We've made several useful debugging changes to qemu and some of the later labs depend on these patches, so you must build your own. If your machine uses a native ELF toolchain (such as Linux and most BSD's, but notably not OS X), you can simply install gcc from your package manager. Otherwise, follow the directions on the tools page.

Hand-In Procedure

You will turn in your assignments using the submission website. You need to request an API key from the submission website before you can turn in any assignments or labs.

The lab code comes with GNU Make rules to make submission easier. After committing your final changes to the lab, type make handin to submit your lab.

athena% git commit -am "ready to submit my lab"
[lab1 c2e3c8b] ready to submit my lab
 2 files changed, 18 insertions(+), 2 deletions(-)

athena% make handin
git archive --prefix=lab1/ --format=tar HEAD | gzip > lab1-handin.tar.gz
Get an API key for yourself by visiting https://6828.scripts.mit.edu/2016/handin.py/
Please enter your API key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 50199  100   241  100 49958    414  85824 --:--:-- --:--:-- --:--:-- 85986
athena%

make handin will store your API key in myapi.key. If you need to change your API key, just remove this file and let make handin generate it again (myapi.key must not include newline characters).

In the case that make handin does not work properly, try fixing the problem with the curl or Git commands. Or you can run make tarball. This will make a tar file for you, which you can then upload via our web interface.

For Lab 1, you do not need to turn in answers to any of the questions below. (Do answer them for yourself though! They will help with the rest of the lab.)

We will be grading your solutions with a grading program. You can run make grade to test your solutions with the grading program.

Part 1: PC Bootstrap

The purpose of the first exercise is to introduce you to x86 assembly language and the PC bootstrap process, and to get you started with QEMU and QEMU/GDB debugging. You will not have to write any code for this part of the lab, but you should go through it anyway for your own understanding and be prepared to answer the questions posed below.

Getting Started with x86 assembly

If you are not already familiar with x86 assembly language, you will quickly become familiar with it during this course! The PC Assembly Language Book is an excellent place to start. Hopefully, the book contains a mixture of new and old material for you.

Warning: Unfortunately the examples in the book are written for the NASM assembler, whereas we will be using the GNU assembler. NASM uses the so-called Intel syntax while GNU uses the AT&T syntax. While semantically equivalent, an assembly file will differ quite a lot, at least superficially, depending on which syntax is used. Luckily the conversion between the two is pretty simple, and is covered in Brennan's Guide to Inline Assembly.

Exercise 1. Familiarize yourself with the assembly language materials available on the 6.828 reference page. You don't have to read them now, but you'll almost certainly want to refer to some of this material when reading and writing x86 assembly.

We do recommend reading the section "The Syntax" in Brennan's Guide to Inline Assembly. It gives a good (and quite brief) description of the AT&T assembly syntax we'll be using with the GNU assembler in JOS.


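The assembly language materials on the course page mainly cover the x86 architecture and include the Intel 80386 Programmer's Reference Manual. I already had some experience reading the Intel 80386 manual from last semester's ICS course. There are two mainstream assembly syntaxes, AT&T and Intel; the NEMU lab used the AT&T syntax.

Brennan's Guide to Inline Assembly mainly covers the differences between the AT&T and Intel syntaxes: register naming, the order of source and destination operands, the format of immediate values, operand-size suffixes on instructions, how memory is referenced, and so on.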


Certainly the definitive reference for x86 assembly language programming is Intel's instruction set architecture reference, which you can find on the 6.828 reference page in two flavors: an HTML edition of the old 80386 Programmer's Reference Manual, which is much shorter and easier to navigate than more recent manuals but describes all of the x86 processor features that we will make use of in 6.828; and the full, latest and greatest IA-32 Intel Architecture Software Developer's Manuals from Intel, covering all the features of the most recent processors that we won't need in class but you may be interested in learning about. An equivalent (and often friendlier) set of manuals is available from AMD. Save the Intel/AMD architecture manuals for later or use them for reference when you want to look up the definitive explanation of a particular processor feature or instruction.

Simulating the x86

Instead of developing the operating system on a real, physical personal computer (PC), we use a program that faithfully emulates a complete PC: the code you write for the emulator will boot on a real PC too. Using an emulator simplifies debugging; you can, for example, set break points inside of the emulated x86, which is difficult to do with the silicon version of an x86.

In 6.828 we will use the QEMU Emulator, a modern and relatively fast emulator. While QEMU's built-in monitor provides only limited debugging support, QEMU can act as a remote debugging target for the GNU debugger (GDB), which we'll use in this lab to step through the early boot process.

To get started, extract the Lab 1 files into your own directory on Athena as described above in "Software Setup", then type make (or gmake on BSD systems) in the lab directory to build the minimal 6.828 boot loader and kernel you will start with. (It's a little generous to call the code we're running here a "kernel," but we'll flesh it out throughout the semester.)

athena% cd lab
athena% make
+ as kern/entry.S
+ cc kern/entrypgdir.c
+ cc kern/init.c
+ cc kern/console.c
+ cc kern/monitor.c
+ cc kern/printf.c
+ cc kern/kdebug.c
+ cc lib/printfmt.c
+ cc lib/readline.c
+ cc lib/string.c
+ ld obj/kern/kernel
+ as boot/boot.S
+ cc -Os boot/main.c
+ ld boot/boot
boot block is 380 bytes (max 510)
+ mk obj/kern/kernel.img

(If you get errors like "undefined reference to `__udivdi3'", you probably don't have the 32-bit gcc multilib. If you're running Debian or Ubuntu, try installing the gcc-multilib package.)

Now you're ready to run QEMU, supplying the file obj/kern/kernel.img, created above, as the contents of the emulated PC's "virtual hard disk." This hard disk image contains both our boot loader (obj/boot/boot) and our kernel (obj/kern/kernel).

athena% make qemu

This executes QEMU with the options required to set the hard disk and direct serial port output to the terminal. Some text should appear in the QEMU window:

Booting from Hard Disk...
6828 decimal is XXX octal!
entering test_backtrace 5
entering test_backtrace 4
entering test_backtrace 3
entering test_backtrace 2
entering test_backtrace 1
entering test_backtrace 0
leaving test_backtrace 0
leaving test_backtrace 1
leaving test_backtrace 2
leaving test_backtrace 3
leaving test_backtrace 4
leaving test_backtrace 5
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
K>

Everything after 'Booting from Hard Disk...' was printed by our skeletal JOS kernel; the K> is the prompt printed by the small monitor, or interactive control program, that we've included in the kernel. These lines printed by the kernel will also appear in the regular shell window from which you ran QEMU. This is because for testing and lab grading purposes we have set up the JOS kernel to write its console output not only to the virtual VGA display (as seen in the QEMU window), but also to the simulated PC's virtual serial port, which QEMU in turn outputs to its own standard output. Likewise, the JOS kernel will take input from both the keyboard and the serial port, so you can give it commands in either the VGA display window or the terminal running QEMU. Alternatively, you can use the serial console without the virtual VGA by running make qemu-nox. This may be convenient if you are SSH'd into an Athena dialup. To quit qemu, type Ctrl+a x.

There are only two commands you can give to the kernel monitor, help and kerninfo.

K> help
help - display this list of commands
kerninfo - display information about the kernel
K> kerninfo
Special kernel symbols:
  entry  f010000c (virt)  0010000c (phys)
  etext  f0101a75 (virt)  00101a75 (phys)
  edata  f0112300 (virt)  00112300 (phys)
  end    f0112960 (virt)  00112960 (phys)
Kernel executable memory footprint: 75KB
K>

The help command is obvious, and we will shortly discuss the meaning of what the kerninfo command prints. Although simple, it's important to note that this kernel monitor is running "directly" on the "raw (virtual) hardware" of the simulated PC. This means that you should be able to copy the contents of obj/kern/kernel.img onto the first few sectors of a real hard disk, insert that hard disk into a real PC, turn it on, and see exactly the same thing on the PC's real screen as you did above in the QEMU window. (We don't recommend you do this on a real machine with useful information on its hard disk, though, because copying kernel.img onto the beginning of its hard disk will trash the master boot record and the beginning of the first partition, effectively causing everything previously on the hard disk to be lost!)

The PC's Physical Address Space

We will now dive into a bit more detail about how a PC starts up. A PC's physical address space is hard-wired to have the following general layout:

+------------------+  <- 0xFFFFFFFF (4GB)
|      32-bit      |
|  memory mapped   |
|     devices      |
|                  |
/\/\/\/\/\/\/\/\/\/\

/\/\/\/\/\/\/\/\/\/\
|                  |
|      Unused      |
|                  |
+------------------+  <- depends on amount of RAM
|                  |
|                  |
| Extended Memory  |
|                  |
|                  |
+------------------+  <- 0x00100000 (1MB)
|     BIOS ROM     |
+------------------+  <- 0x000F0000 (960KB)
|  16-bit devices, |
|  expansion ROMs  |
+------------------+  <- 0x000C0000 (768KB)
|   VGA Display    |
+------------------+  <- 0x000A0000 (640KB)
|                  |
|    Low Memory    |
|                  |
+------------------+  <- 0x00000000

The first PCs, which were based on the 16-bit Intel 8088 processor, were only capable of addressing 1MB of physical memory. The physical address space of an early PC would therefore start at 0x00000000 but end at 0x000FFFFF instead of 0xFFFFFFFF. The 640KB area marked "Low Memory" was the only random-access memory (RAM) that an early PC could use; in fact the very earliest PCs only could be configured with 16KB, 32KB, or 64KB of RAM!

The 384KB area from 0x000A0000 through 0x000FFFFF was reserved by the hardware for special uses such as video display buffers and firmware held in non-volatile memory. The most important part of this reserved area is the Basic Input/Output System (BIOS), which occupies the 64KB region from 0x000F0000 through 0x000FFFFF. In early PCs the BIOS was held in true read-only memory (ROM), but current PCs store the BIOS in updateable flash memory. The BIOS is responsible for performing basic system initialization such as activating the video card and checking the amount of memory installed. After performing this initialization, the BIOS loads the operating system from some appropriate location such as floppy disk, hard disk, CD-ROM, or the network, and passes control of the machine to the operating system.

When Intel finally "broke the one megabyte barrier" with the 80286 and 80386 processors, which supported 16MB and 4GB physical address spaces respectively, the PC architects nevertheless preserved the original layout for the low 1MB of physical address space in order to ensure backward compatibility with existing software. Modern PCs therefore have a "hole" in physical memory from 0x000A0000 to 0x00100000, dividing RAM into "low" or "conventional memory" (the first 640KB) and "extended memory" (everything else). In addition, some space at the very top of the PC's 32-bit physical address space, above all physical RAM, is now commonly reserved by the BIOS for use by 32-bit PCI devices.

Recent x86 processors can support more than 4GB of physical RAM, so RAM can extend further above 0xFFFFFFFF. In this case the BIOS must arrange to leave a second hole in the system's RAM at the top of the 32-bit addressable region, to leave room for these 32-bit devices to be mapped. Because of design limitations JOS will use only the first 256MB of a PC's physical memory anyway, so for now we will pretend that all PCs have "only" a 32-bit physical address space. But dealing with complicated physical address spaces and other aspects of hardware organization that evolved over many years is one of the important practical challenges of OS development.
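As a rough illustration of the layout described above, the boundaries in the diagram can be turned into a small lookup. This is only a sketch: the enum and the function name classify_phys_addr are hypothetical and do not appear in the lab code, and everything at or above 1MB is lumped into one category because its exact use depends on the amount of RAM installed.

```c
#include <stdint.h>

/* Regions of the legacy PC physical address map, taken from the diagram.
 * The names are illustrative, not part of JOS. */
enum pc_region {
    REGION_LOW_MEMORY,    /* 0x00000000 - 0x0009FFFF: conventional RAM (640KB) */
    REGION_VGA_DISPLAY,   /* 0x000A0000 - 0x000BFFFF */
    REGION_DEVICES_ROMS,  /* 0x000C0000 - 0x000EFFFF: 16-bit devices, expansion ROMs */
    REGION_BIOS_ROM,      /* 0x000F0000 - 0x000FFFFF */
    REGION_ABOVE_1MB      /* extended memory, unused space, or 32-bit devices */
};

/* Hypothetical helper: map a 32-bit physical address to its region. */
enum pc_region classify_phys_addr(uint32_t pa)
{
    if (pa < 0x000A0000u) return REGION_LOW_MEMORY;
    if (pa < 0x000C0000u) return REGION_VGA_DISPLAY;
    if (pa < 0x000F0000u) return REGION_DEVICES_ROMS;
    if (pa < 0x00100000u) return REGION_BIOS_ROM;
    return REGION_ABOVE_1MB;
}
```

For instance, classify_phys_addr(0x000ffff0) falls in REGION_BIOS_ROM, while 0x00100000 is the first address classified as REGION_ABOVE_1MB.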

The ROM BIOS

In this portion of the lab, you'll use QEMU's debugging facilities to investigate how an IA-32 compatible computer boots.

Open two terminal windows. In one, enter make qemu-gdb (or make qemu-nox-gdb). This starts up QEMU, but QEMU stops just before the processor executes the first instruction and waits for a debugging connection from GDB. In the second terminal, from the same directory you ran make, run make gdb. You should see something like this,

athena% make gdb
GNU gdb (GDB) 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
+ target remote localhost:26000
The target architecture is assumed to be i8086
[f000:fff0] 0xffff0:	ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file obj/kern/kernel
(gdb) 

We provided a .gdbinit file that set up GDB to debug the 16-bit code used during early boot and directed it to attach to the listening QEMU. (If it doesn't work, you may have to add an add-auto-load-safe-path in your .gdbinit in your home directory to convince gdb to process the .gdbinit we provided. gdb will tell you if you have to do this.)

The following line:

[f000:fff0] 0xffff0:	ljmp   $0xf000,$0xe05b

is GDB's disassembly of the first instruction to be executed. From this output you can conclude a few things:

  • The IBM PC starts executing at physical address 0x000ffff0, which is at the very top of the 64KB area reserved for the ROM BIOS.
  • The PC starts executing with CS = 0xf000 and IP = 0xfff0.
  • The first instruction to be executed is a jmp instruction, which jumps to the segmented address CS = 0xf000 and IP = 0xe05b.

Why does QEMU start like this? This is how Intel designed the 8088 processor, which IBM used in their original PC. Because the BIOS in a PC is "hard-wired" to the physical address range 0x000f0000-0x000fffff, this design ensures that the BIOS always gets control of the machine first after power-up or any system restart - which is crucial because on power-up there is no other software anywhere in the machine's RAM that the processor could execute. The QEMU emulator comes with its own BIOS, which it places at this location in the processor's simulated physical address space. On processor reset, the (simulated) processor enters real mode and sets CS to 0xf000 and the IP to 0xfff0, so that execution begins at that (CS:IP) segment address. How does the segmented address 0xf000:fff0 turn into a physical address?

To answer that we need to know a bit about real mode addressing. In real mode (the mode that the PC starts off in), address translation works according to the formula: physical address = 16 * segment + offset. So, when the PC sets CS to 0xf000 and IP to 0xfff0, the physical address referenced is:

   16 * 0xf000 + 0xfff0   # in hex multiplication by 16 is
   = 0xf0000 + 0xfff0     # easy--just append a 0.
   = 0xffff0 

0xffff0 is 16 bytes before the end of the BIOS (0x100000). Therefore we shouldn't be surprised that the first thing that the BIOS does is jmp backwards to an earlier location in the BIOS; after all how much could it accomplish in just 16 bytes?
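The real-mode formula above is small enough to write down as a function. The following is a minimal sketch; the name real_mode_phys is hypothetical and is only meant to let you check segment:offset pairs by hand while stepping through the BIOS.

```c
#include <stdint.h>

/* Real-mode translation: physical = 16 * segment + offset.
 * The sum can exceed 16 bits, so it is returned as a 32-bit value. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

With this helper, real_mode_phys(0xf000, 0xfff0) gives 0xffff0, and the target of the first jump, 0xf000:0xe05b, gives 0xfe05b.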

Exercise 2. Use GDB's si (Step Instruction) command to trace into the ROM BIOS for a few more instructions, and try to guess what it might be doing. You might want to look at Phil Storrs I/O Ports Description, as well as other materials on the 6.828 reference materials page. No need to figure out all the details - just the general idea of what the BIOS is doing first.


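Getting GDB running was awkward at first. I could run QEMU, but unless I used the qemu-nox target to run it in the terminal, the QEMU VGA window would appear and the mouse pointer would vanish inside the virtual machine. That made it hard to do anything else, and I could only force-close the virtual machine and restart. After hitting this two or three times in a row I was getting rather frustrated. In short: run QEMU in one terminal first, then open a second terminal window, set GDB's auto-load safe path, and run GDB there.

The correct procedure is actually this: Ctrl+Alt releases the mouse from the QEMU window. To end the session, press Ctrl+a, release, and then press x.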

make qemu-nox-gdb
gdb -q -iex 'add-auto-load-safe-path .' .gdbinit

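In any case, after running, the result is as shown in the figure:

From GDB's disassembly output, QEMU starts executing at physical address 0x000ffff0, which lies at the top of the 64KB region reserved for the BIOS. Single-stepping with si shows what happens after QEMU starts: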

(gdb)si:
[f000:fff0]    0xffff0: ljmp   $0xf000,$0xe05b
[f000:e05b]    0xfe05b: cmpl   $0x0,%cs:0x6ac8
[f000:d161]    0xfd161: mov    $0x8f,%eax
[f000:d167]    0xfd167: out    %al,$0x70
[f000:d169]    0xfd169: in     $0x71,%al
[f000:d16b]    0xfd16b: in     $0x92,%al
[f000:d16d]    0xfd16d: or     $0x2,%al
[f000:d16f]    0xfd16f: out    %al,$0x92
[f000:d171]    0xfd171: lidtw  %cs:0x6ab8
[f000:d177]    0xfd177: lgdtw  %cs:0x6a74
[f000:d17d]    0xfd17d: mov    %cr0,%eax
[f000:d180]    0xfd180: or     $0x1,%eax
[f000:d184]    0xfd184: mov    %eax,%cr0
[f000:d187]    0xfd187: ljmpl  $0x8,$0xfd18f

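QEMU first performs a far jump from f000:fff0 to f000:e05b (physical address 0xfe05b).

It compares the immediate 0 with the value stored in memory at cs:0x6ac8.

It loads the immediate 0x8f (= 1000 1111) into register eax.

It writes the contents of al to port 0x70, then reads from port 0x71 into al. I/O ports 0x70 and 0x71 are the only two ports that give access to the CMOS (and the Real-Time Clock). The CMOS stores BIOS information used during startup and shutdown. Even when the computer is powered off, a separate battery keeps the clock and the CMOS contents alive.

It reads port 0x92 into al, ORs the value with 0x2 (= 0000 0010) to set bit 1, and writes the result back to the port. Port 0x92 is one of the system control ports: System Control Port A.

It loads the interrupt descriptor table register and the global descriptor table register from cs:0x6ab8 and cs:0x6a74 respectively.

It copies control register cr0 into the general-purpose register eax, sets the lowest bit to 1 with an OR, and writes the result back to the control register. The ljmpl that follows then continues in 32-bit protected-mode code.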
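To make the bit manipulation in the stepped-through instructions concrete, here is a small sketch that decodes the values involved. It relies on two conventions not spelled out in the notes above: on PC hardware, bit 7 of the byte written to port 0x70 masks NMI and bits 0-6 select a CMOS register, and bit 1 of port 0x92 is the "fast A20" gate. The function names are hypothetical, and the code only manipulates plain integers; it does not touch any I/O port.

```c
#include <stdbool.h>
#include <stdint.h>

/* Port 0x70: bit 7 set means NMI is masked while the CMOS is accessed. */
bool cmos_nmi_disabled(uint8_t index_byte)
{
    return (index_byte & 0x80) != 0;
}

/* Port 0x70: bits 0-6 select which CMOS register port 0x71 will access. */
uint8_t cmos_register(uint8_t index_byte)
{
    return index_byte & 0x7f;
}

/* Port 0x92 (System Control Port A): setting bit 1 enables the A20 line.
 * Models the "or $0x2,%al" step on a value read from the port. */
uint8_t enable_fast_a20(uint8_t port92_value)
{
    return port92_value | 0x02;
}
```

Applied to the trace, the value 0x8f written to port 0x70 decodes as "NMI disabled, CMOS register 0x0f", and the OR with 0x2 leaves every other bit of port 0x92 unchanged.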


When the BIOS runs, it sets up an interrupt descriptor table and initializes various devices such as the VGA display. This is where the "Starting SeaBIOS" message you see in the QEMU window comes from.

After initializing the PCI bus and all the important devices the BIOS knows about, it searches for a bootable device such as a floppy, hard drive, or CD-ROM. Eventually, when it finds a bootable disk, the BIOS reads the boot loader from the disk and transfers control to it.

Part 2: The Boot Loader

Floppy and hard disks for PCs are divided into 512 byte regions called sectors. A sector is the disk's minimum transfer granularity: each read or write operation must be one or more sectors in size and aligned on a sector boundary. If the disk is bootable, the first sector is called the boot sector, since this is where the boot loader code resides. When the BIOS finds a bootable floppy or hard disk, it loads the 512-byte boot sector into memory at physical addresses 0x7c00 through 0x7dff, and then uses a jmp instruction to set the CS:IP to 0000:7c00, passing control to the boot loader. Like the BIOS load address, these addresses are fairly arbitrary - but they are fixed and standardized for PCs.
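The sector arithmetic in this paragraph can be checked with a few lines of C. This is a sketch under stated assumptions: the helper names are hypothetical, and the 0x55 0xAA signature in the last two bytes of the sector is the conventional PC marker for a bootable sector, which the text above does not mention but which explains why the build output reports a maximum boot block size of 510 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE  512u
#define BOOT_LOAD_PA 0x7c00u

/* Last physical address occupied by the boot sector once loaded. */
uint32_t boot_sector_last_pa(void)
{
    return BOOT_LOAD_PA + SECTOR_SIZE - 1;
}

/* True if a byte offset on disk falls on a sector boundary. */
bool is_sector_aligned(uint32_t byte_offset)
{
    return (byte_offset % SECTOR_SIZE) == 0;
}

/* Conventional check: the sector ends with the bytes 0x55, 0xAA. */
bool has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

boot_sector_last_pa() evaluates to 0x7dff, matching the range given above.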

The ability to boot from a CD-ROM came much later during the evolution of the PC, and as a result the PC architects took the opportunity to rethink the boot process slightly. As a result, the way a modern BIOS boots from a CD-ROM is a bit more complicated (and more powerful). CD-ROMs use a sector size of 2048 bytes instead of 512, and the BIOS can load a much larger boot image from the disk into memory (not just one sector) before transferring control to it. For more information, see the "El Torito" Bootable CD-ROM Format Specification.

For 6.828, however, we will use the conventional hard drive boot mechanism, which means that our boot loader must fit into a measly 512 bytes. The boot loader consists of one assembly language source file, boot/boot.S, and one C source file, boot/main.c. Look through these source files carefully and make sure you understand what's going on. The boot loader must perform two main functions:

  1. First, the boot loader switches the processor from real mode to 32-bit protected mode, because it is only in this mode that software can access all the memory above 1MB in the processor's physical address space. Protected mode is described briefly in sections 1.2.7 and 1.2.8 of PC Assembly Language, and in great detail in the Intel architecture manuals. At this point you only have to understand that translation of segmented addresses (segment:offset pairs) into physical addresses happens differently in protected mode, and that after the transition offsets are 32 bits instead of 16.
  2. Second, the boot loader reads the kernel from the hard disk by directly accessing the IDE disk device registers via the x86's special I/O instructions. If you would like to understand better what the particular I/O instructions here mean, check out the "IDE hard drive controller" section on the 6.828 reference page. You will not need to learn much about programming specific devices in this class: writing device drivers is in practice a very important part of OS development, but from a conceptual or architectural viewpoint it is also one of the least interesting.

After you understand the boot loader source code, look at the file obj/boot/boot.asm. This file is a disassembly of the boot loader that our GNUmakefile creates after compiling the boot loader. This disassembly file makes it easy to see exactly where in physical memory all of the boot loader's code resides, and makes it easier to track what's happening while stepping through the boot loader in GDB. Likewise, obj/kern/kernel.asm contains a disassembly of the JOS kernel, which can often be useful for debugging.

You can set address breakpoints in GDB with the b command. For example, b *0x7c00 sets a breakpoint at address 0x7C00. Once at a breakpoint, you can continue execution using the c and si commands: c causes QEMU to continue execution until the next breakpoint (or until you press Ctrl-C in GDB), and si N steps through the instructions N at a time.

To examine instructions in memory (besides the immediate next one to be executed, which GDB prints automatically), you use the x/i command. This command has the syntax x/Ni ADDR, where N is the number of consecutive instructions to disassemble and ADDR is the memory address at which to start disassembling.

Exercise 3. Take a look at the lab tools guide, especially the section on GDB commands. Even if you're familiar with GDB, this includes some esoteric GDB commands that are useful for OS work.

Set a breakpoint at address 0x7c00, which is where the boot sector will be loaded. Continue execution until that breakpoint. Trace through the code in boot/boot.S, using the source code and the disassembly file obj/boot/boot.asm to keep track of where you are. Also use the x/i command in GDB to disassemble sequences of instructions in the boot loader, and compare the original boot loader source code with both the disassembly in obj/boot/boot.asm and GDB.

Trace into bootmain() in boot/main.c, and then into readsect(). Identify the exact assembly instructions that correspond to each of the statements in readsect(). Trace through the rest of readsect() and back out into bootmain(), and identify the begin and end of the for loop that reads the remaining sectors of the kernel from the disk. Find out what code will run when the loop is finished, set a breakpoint there, and continue to that breakpoint. Then step through the remainder of the boot loader.

Be able to answer the following questions:

  • At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode?
  • What is the last instruction of the boot loader executed, and what is the first instruction of the kernel it just loaded?
  • Where is the first instruction of the kernel?
  • How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?

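Looking at ./obj/boot/boot.asm and ./boot/boot.S, the code is split into two parts, marked by the assembler directives .code16 and .code32. In ./obj/boot/boot.asm you can see that execution starts in the .code16 part: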

.globl start
start:
  .code16                     # Assemble for 16-bit mode

The switch to .code32 happens here:

  # Switch from real to protected mode, using a bootstrap GDT
  # and segment translation that makes virtual addresses 
  # identical to their physical addresses, so that the 
  # effective memory map does not change during the switch.
  lgdt    gdtdesc
  movl    %cr0, %eax
  orl     $CR0_PE_ON, %eax
  movl    %eax, %cr0
  
  # Jump to next instruction, but in 32-bit code segment.
  # Switches processor into 32-bit mode.
  ljmp    $PROT_MODE_CSEG, $protcseg

  .code32                     # Assemble for 32-bit mode

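For the switch to protected mode, see the Intel 80386 manual. On x86, bit 0 of control register CR0 is the PE bit: with PE=0 the processor runs in real mode, and with PE=1 it runs in protected mode: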

10.3 Switching to Protected Mode

Setting the PE bit of the MSW in CR0 causes the 80386 to begin executing in protected mode. The current privilege level (CPL) starts at zero. The segment registers continue to point to the same linear addresses as in real address mode (in real address mode, linear addresses are the same physical addresses).

14.4.1 Switching to Protected Mode

The only way to leave real-address mode is to switch to protected mode. The processor enters protected mode when a MOV to CR0 instruction sets the PE(protection enable) bit in CR0.

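That is why there is an OR instruction: it sets the PE bit, the lowest bit of CR0, to enter protected mode. The processor starts executing 32-bit code at the ljmp that follows, which reloads CS with the 32-bit code segment selector $PROT_MODE_CSEG.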
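The effect of the movl/orl/movl sequence on the register value can be modelled on an ordinary integer. This is only a sketch: CR0 itself can be read and written only by privileged code, so the functions below (whose names are hypothetical) take and return a copy of the value. The constant mirrors the role of CR0_PE_ON in boot.S and is assumed here to be 0x1, consistent with PE being bit 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE 0x1u  /* bit 0 of CR0: protection enable (assumed value of CR0_PE_ON) */

/* Models "orl $CR0_PE_ON, %eax": set PE, leave all other bits alone. */
uint32_t cr0_enable_protected_mode(uint32_t cr0)
{
    return cr0 | CR0_PE;
}

/* True if the given CR0 value has protected mode enabled. */
bool cr0_is_protected_mode(uint32_t cr0)
{
    return (cr0 & CR0_PE) != 0;
}
```

Using OR rather than a plain store matters because the other bits of CR0 must keep whatever values they already had.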

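The boot loader runs first, and then the kernel's instructions. The function void bootmain(void) is never supposed to return, so the code just before the bad label is the last statement of the boot loader: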

  // call the entry point from the ELF header
  // note: does not return!
  ((void (*)(void)) (ELFHDR->e_entry))();

The corresponding instruction in the disassembly file is:

    7d6b: ff 15 18 00 01 00     call   *0x10018

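The program counter reaches 0x7d6b here, which is the last instruction of the boot loader. A single si from this point should show the first instruction of the kernel that was just loaded: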

(gdb) b *0x7d6b
Breakpoint 2 at 0x7d6b
(gdb) c
Continuing.
=> 0x7d6b:  call   *0x10018

Breakpoint 2, 0x00007d6b in ?? ()
(gdb) si
=> 0x10000c:  movw   $0x1234,0x472
0x0010000c in ?? ()
(gdb)

As the output above shows, the first instruction of the kernel is at address 0x10000c.
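The operand of call *0x10018 can be explained from the layout of the ELF header. The sketch below declares only the leading fields of a 32-bit ELF header, in the order the ELF format defines them; the struct name is hypothetical, and the load address 0x10000 for the header is an assumption inferred from the disassembly (0x10018 minus the offset of e_entry), not something stated in the notes above.

```c
#include <stddef.h>
#include <stdint.h>

/* Leading fields of a 32-bit ELF header. Illustrative struct, not the
 * definition from inc/elf.h. */
struct elf32_header_prefix {
    uint32_t e_magic;    /* 0x7f 'E' 'L' 'F' */
    uint8_t  e_elf[12];  /* remaining identification bytes */
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;    /* entry point address */
    uint32_t e_phoff;    /* file offset of the program header table */
};

#define ELFHDR_ADDR 0x10000u  /* assumed address where the header is loaded */

/* Address from which the indirect call fetches its target. */
uint32_t entry_field_address(void)
{
    return ELFHDR_ADDR + (uint32_t)offsetof(struct elf32_header_prefix, e_entry);
}
```

Since e_entry sits 0x18 bytes into the header, entry_field_address() is 0x10018: the call reads the entry point stored there and jumps to it, which is how execution arrives at 0x10000c.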

This is evidently the code that reads in the sectors:

  // load each program segment (ignores ph flags)
  ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
  eph = ph + ELFHDR->e_phnum;
  for (; ph < eph; ph++)
    // p_pa is the load address of this segment (as well
    // as the physical address)
    readseg(ph->p_pa, ph->p_memsz, ph->p_offset);

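According to its comment, void readseg(uint32_t pa, uint32_t count, uint32_t offset) reads count bytes of the kernel, starting at byte offset offset, into physical address pa. The amount to read for each segment is therefore ph->p_memsz, and the loop runs once for each of the ELFHDR->e_phnum program headers. Looking at how the arguments are set up, together with the disassembly and the structure definition in the header file: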

    // p_pa is the load address of this segment (as well
    // as the physical address)
    readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
    7d55: ff 73 04              pushl  0x4(%ebx)
    7d58: ff 73 14              pushl  0x14(%ebx)

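This shows the arguments being pushed just before the call to readseg(). Register %ebx holds the pointer ph, and 0x4(%ebx) and 0x14(%ebx) are ph->p_offset and ph->p_memsz respectively, which matches the offsets of 4 and 20 bytes inside the Proghdr structure in inc/elf.h. These two arguments answer the questions of where to read from and how much to read.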
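The offsets 4 and 20 can be confirmed with offsetof on a struct laid out like a 32-bit ELF program header. The sketch below is an illustration rather than a copy of inc/elf.h: the struct name is hypothetical, though the field names follow the p_* style used above. The helper sectors_to_read is also hypothetical; it shows one plausible way to turn a byte count into a number of 512-byte sectors and is not taken from readseg().

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit ELF program header, fields in the order the ELF format defines. */
struct proghdr32 {
    uint32_t p_type;
    uint32_t p_offset;  /* byte offset of the segment in the file */
    uint32_t p_va;
    uint32_t p_pa;      /* physical load address */
    uint32_t p_filesz;
    uint32_t p_memsz;   /* size of the segment in memory */
    uint32_t p_flags;
    uint32_t p_align;
};

#define SECTSIZE 512u

/* Hypothetical: sectors needed to cover [pa, pa + count) after rounding
 * pa down to a sector boundary. */
uint32_t sectors_to_read(uint32_t pa, uint32_t count)
{
    uint32_t end_pa = pa + count;
    uint32_t start = pa & ~(SECTSIZE - 1);
    return (end_pa - start + SECTSIZE - 1) / SECTSIZE;
}
```

Here offsetof(struct proghdr32, p_offset) is 4 and offsetof(struct proghdr32, p_memsz) is 20 (0x14), the two displacements seen in the pushl instructions.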


Loading the Kernel

We will now look in further detail at the C language portion of the boot loader, in boot/main.c. But before doing so, this is a good time to stop and review some of the basics of C programming.

Exercise 4. Read about programming with pointers in C. The best reference for the C language is The C Programming Language by Brian Kernighan and Dennis Ritchie (known as 'K&R'). We recommend that students purchase this book (here is an Amazon Link) or find one of MIT's 7 copies.

Read 5.1 (Pointers and Addresses) through 5.5 (Character Pointers and Functions) in K&R. Then download the code for pointers.c, run it, and make sure you understand where all of the printed values come from. In particular, make sure you understand where the pointer addresses in lines 1 and 6 come from, how all the values in lines 2 through 4 get there, and why the values printed in line 5 are seemingly corrupted.

There are other references on pointers in C (e.g., A tutorial by Ted Jensen that cites K&R heavily), though not as strongly recommended.

Warning: Unless you are already thoroughly versed in C, do not skip or even skim this reading exercise. If you do not really understand pointers in C, you will suffer untold pain and misery in subsequent labs, and then eventually come to understand them the hard way. Trust us; you don't want to find out what "the hard way" is.


Compiling the program with gcc and running it gives:

gcc -o run x.c
'/home/yangminz/Desktop/run' 
1: a = 0x7fffffffdd40, b = 0x602010, c = (nil)
2: a[0] = 200, a[1] = 101, a[2] = 102, a[3] = 103
3: a[0] = 200, a[1] = 300, a[2] = 301, a[3] = 302
4: a[0] = 200, a[1] = 400, a[2] = 301, a[3] = 302
5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302
6: a = 0x7fffffffdd40, b = 0x7fffffffdd44, c = 0x7fffffffdd41

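Dumping the disassembly with objdump -d run > dump.txt and the ELF information with readelf -a run > elf.txt, and combining both with GDB, makes the program easy to analyze. The address where function f() begins is easy to find in the disassembly. Setting a GDB breakpoint there reveals how the three pointers get their values: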

第1行的指针赋值:

a = 0x7fffffffdd40,实际上是随着不同情况下的帧栈指针而变化的:直接从栈指针%rsp开辟了0x40 = 4 * 16的空间,而a的地址也就是栈指针的地址减去0x40;

b = 0x602010,也随帧栈指针而变:寄存器edi被立即数0x10=16赋值,作为malloc函数的参数传递。中间执行malloc的种种过程我看不下去了,总之一旦从函数返回,寄存器rax就储存着指针b的值;

c = (nil) 这个是未分配空间的野指针。GDB断点运行到printf即将调用之前,查看所有寄存器的信息,可以看到哪些寄存器储存了形参,从而倒推出帧指针3个的偏置位置将数值传送到寄存器,貌似$rcx=0是野指针的地址……

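Line 2 is straightforward: the array a is simply assigned element by element. But because c was pointed at a, c aliases the storage of a, and the write through c changes a[0] to 200.

Line 3 modifies a[1], a[2], and a[3] through array notation and pointer notation respectively.

Line 4 advances c by one element, so the write through c actually modifies a[1].

Line 5 is a little more involved. Since a char is 1 byte, the pointer c is moved to the second byte of int a[1]; there it is cast back to int * and assigned 500, so the 4-byte write straddles a[1] and a[2] and mangles both.

The pointers printed on line 6 follow the same reasoning (b = a + 1 advances by sizeof(int) = 4 bytes, while c = (char *)a + 1 advances by 1 byte), so I will not go into detail.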


To make sense out of boot/main.c you'll need to know what an ELF binary is. When you compile and link a C program such as the JOS kernel, the compiler transforms each C source ('.c') file into an object ('.o') file containing assembly language instructions encoded in the binary format expected by the hardware. The linker then combines all of the compiled object files into a single binary image such as obj/kern/kernel, which in this case is a binary in the ELF format, which stands for "Executable and Linkable Format".

Full information about this format is available in the ELF specification on our reference page, but you will not need to delve very deeply into the details of this format in this class. Although as a whole the format is quite powerful and complex, most of the complex parts are for supporting dynamic loading of shared libraries, which we will not do in this class. The Wikipedia page has a short description.

For purposes of 6.828, you can consider an ELF executable to be a header with loading information, followed by several program sections, each of which is a contiguous chunk of code or data intended to be loaded into memory at a specified address. The boot loader does not modify the code or data; it loads it into memory and starts executing it.

An ELF binary starts with a fixed-length ELF header, followed by a variable-length program header listing each of the program sections to be loaded. The C definitions for these ELF headers are in inc/elf.h. The program sections we're interested in are:

  • .text: The program's executable instructions.
  • .rodata: Read-only data, such as ASCII string constants produced by the C compiler. (We will not bother setting up the hardware to prohibit writing, however.)
  • .data: The data section holds the program's initialized data, such as global variables declared with initializers like int x = 5;.

When the linker computes the memory layout of a program, it reserves space for uninitialized global variables, such as int x;, in a section called .bss that immediately follows .data in memory. C requires that "uninitialized" global variables start with a value of zero. Thus there is no need to store contents for .bss in the ELF binary; instead, the linker records just the address and size of the .bss section. The loader or the program itself must arrange to zero the .bss section.

Examine the full list of the names, sizes, and link addresses of all the sections in the kernel executable by typing:

athena% i386-jos-elf-objdump -h obj/kern/kernel

You can substitute objdump for i386-jos-elf-objdump if your computer uses an ELF toolchain by default like most modern Linuxen and BSDs.

You will see many more sections than the ones we listed above, but the others are not important for our purposes. Most of the others are to hold debugging information, which is typically included in the program's executable file but not loaded into memory by the program loader.

Take particular note of the "VMA" (or link address) and the "LMA" (or load address) of the .text section. The load address of a section is the memory address at which that section should be loaded into memory.

The link address of a section is the memory address from which the section expects to execute. The linker encodes the link address in the binary in various ways, such as when the code needs the address of a global variable, with the result that a binary usually won't work if it is executing from an address that it is not linked for. (It is possible to generate position-independent code that does not contain any such absolute addresses. This is used extensively by modern shared libraries, but it has performance and complexity costs, so we won't be using it in 6.828.)

Typically, the link and load addresses are the same. For example, look at the .text section of the boot loader:

athena% i386-jos-elf-objdump -h obj/boot/boot.out

The boot loader uses the ELF program headers to decide how to load the sections. The program headers specify which parts of the ELF object to load into memory and the destination address each should occupy. You can inspect the program headers by typing:

athena% i386-jos-elf-objdump -x obj/kern/kernel

The program headers are then listed under "Program Headers" in the output of objdump. The areas of the ELF object that need to be loaded into memory are those that are marked as "LOAD". Other information for each program header is given, such as the virtual address ("vaddr"), the physical address ("paddr"), and the size of the loaded area ("memsz" and "filesz").

Back in boot/main.c, the ph->p_pa field of each program header contains the segment's destination physical address (in this case, it really is a physical address, though the ELF specification is vague on the actual meaning of this field).

The BIOS loads the boot sector into memory starting at address 0x7c00, so this is the boot sector's load address. This is also where the boot sector executes from, so this is also its link address. We set the link address by passing -Ttext 0x7C00 to the linker in boot/Makefrag, so the linker will produce the correct memory addresses in the generated code.

Exercise 5. Trace through the first few instructions of the boot loader again and identify the first instruction that would "break" or otherwise do the wrong thing if you were to get the boot loader's link address wrong. Then change the link address in boot/Makefrag to something wrong, run make clean, recompile the lab with make, and trace into the boot loader again to see what happens. Don't forget to change the link address back and make clean again afterward!


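I did not really understand what "break" means here. My sense is that the point of this exercise is to understand how control passes from the boot loader to the kernel.

objdump -h obj/kern/kernel shows the .text section: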

obj/kern/kernel:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00001871  f0100000  00100000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE

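So the link address (VMA, a virtual address) of the kernel's .text section is 0xf0100000, and it is mapped to the load address (LMA) 0x00100000. The kernel's disassembly likewise shows the kernel image starting at virtual address 0xf0100000.

In the same way, objdump -h obj/boot/boot.out shows: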

obj/boot/boot.out:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000186  00007c00  00007c00  00000074  2**2
                  CONTENTS, ALLOC, LOAD, CODE

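So the boot loader is loaded starting at 0x7c00 and the call into the kernel entry point is at 0x7d6b; somewhere in between, the kernel's ELF image must have been loaded at 0x100000. Run to a breakpoint at 0x7c00 in GDB, then set a watchpoint and examine memory to find where the load happens: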

(gdb) b *0x7c00
Breakpoint 1 at 0x7c00
(gdb) watch *0x100000
Hardware watchpoint 2: *0x100000
(gdb) c
Continuing.
[   0:7c00] => 0x7c00:  cli    

Breakpoint 1, 0x00007c00 in ?? ()
(gdb) x/4x 0x100000
0x100000: 0x00000000  0x00000000  0x00000000  0x00000000
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0x7cd7:  repnz insl (%dx),%es:(%edi)

Hardware watchpoint 2: *0x100000

Old value = 0
New value = 464367618
0x00007cd7 in ?? ()
(gdb) x/4x 0x100000
0x100000: 0x1badb002  0x00000000  0x00000000  0x00000000

If the entry in kern/kernel.ld that sets the kernel's load address,

  /* AT(...) gives the load address of this section, which tells
     the boot loader where to load the kernel in physical memory */
  .text : AT(0x100000) {
    *(.text .stub .text.* .gnu.linkonce.t.*)
  }

is changed to a wrong value, an error results:


Look back at the load and link addresses for the kernel. Unlike the boot loader, these two addresses aren't the same: the kernel is telling the boot loader to load it into memory at a low address (1 megabyte), but it expects to execute from a high address. We'll dig in to how we make this work in the next section.

Besides the section information, there is one more field in the ELF header that is important to us, named e_entry. This field holds the link address of the entry point in the program: the memory address in the program's text section at which the program should begin executing. You can see the entry point:

athena% i386-jos-elf-objdump -f obj/kern/kernel

You should now be able to understand the minimal ELF loader in boot/main.c. It reads each section of the kernel from disk into memory at the section's load address and then jumps to the kernel's entry point.

Exercise 6. We can examine memory using GDB's x command. The GDB manual has full details, but for now, it is enough to know that the command x/Nx ADDR prints N words of memory at ADDR. (Note that both 'x's in the command are lowercase.) Warning: The size of a word is not a universal standard. In GNU assembly, a word is two bytes (the 'w' in xorw, which stands for word, means 2 bytes).

Reset the machine (exit QEMU/GDB and start them again). Examine the 8 words of memory at 0x00100000 at the point the BIOS enters the boot loader, and then again at the point the boot loader enters the kernel. Why are they different? What is there at the second breakpoint? (You do not really need to use QEMU to answer this question. Just think.)


This one was effectively already done as part of Exercise 5; the only difference is whether GDB examines 8 words or 4.


Part 3: The Kernel

We will now start to examine the minimal JOS kernel in a bit more detail. (And you will finally get to write some code!). Like the boot loader, the kernel begins with some assembly language code that sets things up so that C language code can execute properly.

Using virtual memory to work around position dependence

When you inspected the boot loader's link and load addresses above, they matched perfectly, but there was a (rather large) disparity between the kernel's link address (as printed by objdump) and its load address. Go back and check both and make sure you can see what we're talking about. (Linking the kernel is more complicated than the boot loader, so the link and load addresses are at the top of kern/kernel.ld.)

Operating system kernels often like to be linked and run at a very high virtual address, such as 0xf0100000, in order to leave the lower part of the processor's virtual address space for user programs to use. The reason for this arrangement will become clearer in the next lab.

Many machines don't have any physical memory at address 0xf0100000, so we can't count on being able to store the kernel there. Instead, we will use the processor's memory management hardware to map virtual address 0xf0100000 (the link address at which the kernel code expects to run) to physical address 0x00100000 (where the boot loader loaded the kernel into physical memory). This way, although the kernel's virtual address is high enough to leave plenty of address space for user processes, it will be loaded in physical memory at the 1MB point in the PC's RAM, just above the BIOS ROM. This approach requires that the PC have at least a few megabytes of physical memory (so that physical address 0x00100000 works), but this is likely to be true of any PC built after about 1990.

In fact, in the next lab, we will map the entire bottom 256MB of the PC's physical address space, from physical addresses 0x00000000 through 0x0fffffff, to virtual addresses 0xf0000000 through 0xffffffff respectively. You should now see why JOS can only use the first 256MB of physical memory.

For now, we'll just map the first 4MB of physical memory, which will be enough to get us up and running. We do this using the hand-written, statically-initialized page directory and page table in kern/entrypgdir.c. For now, you don't have to understand the details of how this works, just the effect that it accomplishes. Up until kern/entry.S sets the CR0_PG flag, memory references are treated as physical addresses (strictly speaking, they're linear addresses, but boot/boot.S set up an identity mapping from linear addresses to physical addresses and we're never going to change that). Once CR0_PG is set, memory references are virtual addresses that get translated by the virtual memory hardware to physical addresses. entry_pgdir translates virtual addresses in the range 0xf0000000 through 0xf0400000 to physical addresses 0x00000000 through 0x00400000, as well as virtual addresses 0x00000000 through 0x00400000 to physical addresses 0x00000000 through 0x00400000. Any virtual address that is not in one of these two ranges will cause a hardware exception which, since we haven't set up interrupt handling yet, will cause QEMU to dump the machine state and exit (or endlessly reboot if you aren't using the 6.828-patched version of QEMU).

Exercise 7. Use QEMU and GDB to trace into the JOS kernel and stop at the movl %eax, %cr0. Examine memory at 0x00100000 and at 0xf0100000. Now, single step over that instruction using the stepi GDB command. Again, examine memory at 0x00100000 and at 0xf0100000. Make sure you understand what just happened.

What is the first instruction after the new mapping is established that would fail to work properly if the mapping weren't in place? Comment out the movl %eax, %cr0 in kern/entry.S, trace into it, and see if you were right.


Running to the specified instruction in the kernel and examining memory gives:

(gdb) x/4x 0x100000
0x100000: 0x1badb002  0x00000000  0xe4524ffe  0x7205c766
(gdb) x/4x 0xf0100000
0xf0100000 <_start+4026531828>: 0x00000000  0x00000000  0x00000000  0x00000000

After executing stepi:

(gdb) stepi
=> 0x100028:  mov    $0xf010002f,%eax
0x00100028 in ?? ()
(gdb) x/4x 0x100000
0x100000: 0x1badb002  0x00000000  0xe4524ffe  0x7205c766
(gdb) x/4x 0xf0100000
0xf0100000 <_start+4026531828>: 0x1badb002  0x00000000  0xe4524ffe  0x7205c766

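stepi executed only a single instruction, movl %eax, %cr0. The instruction GDB prints afterwards is the next one to run:

mov    $0xf010002f,%eax

Nothing was copied from the physical address to the virtual address. Setting CR0_PG turned paging on, so virtual address 0xf0100000 is now translated to physical address 0x00100000 and therefore shows the same contents. The next instruction simply loads the relocated address $relocated into a general-purpose register. I could not work out every detail of how this is implemented and relied on the comments in kern/entry.S; my understanding is as follows: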

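First, the kernel code, i.e. the .text section that kern/entry.S belongs to, has its link address (the virtual memory address, VMA) at roughly KERNBASE + 1 MB, while the boot loader loads it at roughly 1 MB. The macro RELOC(x) maps a symbol from its virtual address to its physical address, i.e. its load address (LMA).

When the boot loader jumps to the ELF entry point, virtual memory has not been set up yet, so the kernel is first loaded at physical address 1MB. But the kernel's C code is linked to run at virtual addresses, so a page table must first be set up to translate the virtual range [KERNBASE, KERNBASE+4MB) to the physical range [0, 4MB). The physical address of the page directory is then loaded into control register CR3, which the processor uses to locate the page tables of the current task. Finally, CR0 is written with the paging flag PG set, which turns paging on.

If no page table maps the virtual pages to physical pages, then past the point where paging would have been enabled, the kernel fails as soon as it needs a virtual address translated. Single-stepping in GDB shows that the failure occurs at 0xf010002c: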

qemu: fatal: Trying to execute code outside RAM or ROM at 0xf010002c

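0xf010002c is the relocated address that was loaded into register eax. The code then jumps to that address, but since no virtual address translation has been set up, the jump fails.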


Formatted Printing to the Console

Most people take functions like printf() for granted, sometimes even thinking of them as "primitives" of the C language. But in an OS kernel, we have to implement all I/O ourselves.

Read through kern/printf.c, lib/printfmt.c, and kern/console.c, and make sure you understand their relationship. It will become clear in later labs why printfmt.c is located in the separate lib directory.

Exercise 8. We have omitted a small fragment of code - the code necessary to print octal numbers using patterns of the form "%o". Find and fill in this code fragment.


Fill in the %o case:

    case 'o':
      num = getuint(&ap, lflag);   /* octal is unsigned, like the 'x' case */
      base = 8;
      goto number;

It simply mirrors the hexadecimal case.


Be able to answer the following questions:

  1. Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?
  2. Explain the following from console.c:
    if (crt_pos >= CRT_SIZE) {
            int i;
            memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
            for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
                    crt_buf[i] = 0x0700 | ' ';
            crt_pos -= CRT_COLS;
    }
    
  3. For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC's calling convention on the x86.

    Trace the execution of the following code step-by-step:

    int x = 1, y = 3, z = 4;
    cprintf("x %d, y %x, z %d\n", x, y, z);
    
    • In the call to cprintf(), to what does fmt point? To what does ap point?
    • List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.
  4. Run the following code.
        unsigned int i = 0x00646c72;
        cprintf("H%x Wo%s", 57616, &i);
    
    What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here's an ASCII table that maps bytes to characters.

    The output depends on the fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

    Here's a description of little- and big-endian and a more whimsical description.

  5. In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?
        cprintf("x=%d y=%d", 3);
    
  6. Let's say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?

Question 1

The interface between console.c and printf.c really comes down to symbol resolution at link time. printf.c contains three functions:

static void putch(int ch, int *cnt)
{
  cputchar(ch);
  *cnt++;
}
int vcprintf(const char *fmt, va_list ap);
int cprintf(const char *fmt, ...);

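They rely on cputchar(), an external strong symbol defined in console.c. Inside printf.c this function is wrapped by putch(), and the other two functions in turn make use of that local wrapper.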

Question 2

  if (crt_pos >= CRT_SIZE) {
    int i;
    memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
    for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
      crt_buf[i] = 0x0700 | ' ';
    crt_pos -= CRT_COLS;
  }

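This handles the screen overflowing. crt_pos is the cursor position and CRT_SIZE is the number of two-byte character cells the screen can display. When the cursor reaches the end of the screen, memmove copies everything from the second row onward up by one row (CRT_COLS two-byte cells), so the top line scrolls off the display. The last row is then filled with spaces, and the cursor position is moved back by one row.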

Question 3

int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);

On debugging this part: kern/entry.S jumps into C code with this line:

  # now to C code
  call  i386_init

So look at kern/init.c in the same directory, where the function void i386_init(void) contains:

cprintf("6828 decimal is %o octal!\n", 6828);

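So the formatted-output code to be debugged can simply be placed here and compiled.

From the parameters themselves, char *fmt is the format string, and ap refers to the variables that the format string consumes. These variables are stored on the stack, and ap (a va_list) points to them.

Following the approach described above, insert the code into kern/init.c, compile, and then look at obj/kern/kernel.asm while stepping through it in GDB. The first few relevant function calls happen in this order:

cprintf ==> vcprintf ==> vprintfmt ==> putch ==> cputchar ==> cons_putc ==> vprintfmt ==> getint ==> va_arg

  int x = 1, y = 3, z = 4;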
  cprintf("x %d, y %x, z %d\n", x, y, z);
f01000c8: 6a 04                 push   $0x4
f01000ca: 6a 03                 push   $0x3
f01000cc: 6a 01                 push   $0x1
f01000ce: 68 12 19 10 f0        push   $0xf0101912
f01000d3: e8 2e 08 00 00        call   f0100906 

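This shows the address of the format string (the char *fmt argument) and how the other arguments are pushed onto the stack. Next, trace the three functions and their arguments in order of execution: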

f0100906 :
    cnt = vcprintf(fmt, ap);
f010090f:   50                      push   %eax
f0100910:   ff 75 08                pushl  0x8(%ebp)
f0100913:   e8 c8 ff ff ff          call   f01008e0 

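GDB shows that the two arguments of vcprintf are the value of register %eax, 0xf010ffd4 (this is ap), and the word at 0x8(%ebp), i.e. at address 0xf010ffc8 + 0x8 = 0xf010ffd0 (this is fmt).

Next, execution reaches cons_putc inside cputchar: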

f0100669 :
void cputchar(int c){
f0100669:   55                      push   %ebp
f010066a:   89 e5                   mov    %esp,%ebp
f010066c:   83 ec 08                sub    $0x8,%esp
    cons_putc(c);
f010066f:   8b 45 08                mov    0x8(%ebp),%eax
f0100672:   e8 89 fc ff ff          call   f0100300 

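Its only argument is an integer passed in register eax. p $eax prints 120, the ASCII code of the printable character x, which is the first character of the format string:

"x %d, y %x, z %d\n"

After this call the stack unwinds level by level back to vprintfmt, which should then handle %d, %x, and %d in turn, as dictated by the format string. Looking at the %d case in printfmt.c, it first calls the function getint, and va_arg appears only inside that. As a shortcut, one can avoid digging into va_arg and simply look at the value of ap before and after getint. The corresponding assembly: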

static long long getint(va_list *ap, int lflag)
{
    if (lflag >= 2)
f0100f60:   83 f9 01                cmp    $0x1,%ecx
f0100f63:   7e 19                   jle    f0100f7e 
        return va_arg(*ap, long long);

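There is no call instruction, so va_arg is evidently not a function; it is probably a macro or a similar expression. Without stepping into va_arg, look at the value of ap before and after getint: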

        case 'd':
            num = getint(&ap, lflag);
f0100fb6:   8b 55 d8                mov    -0x28(%ebp),%edx
f0100fb9:   8b 4d dc                mov    -0x24(%ebp),%ecx

The value is 0x00000000 before the call and 0x00000001 after it, which is exactly the value of x.

Question 4

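Debugging in the same way as in Question 3, the result is:

He110 World

...which reads like a rather lame joke.

The output mechanism is the same as in Question 3, so I will not repeat it. 57616 is 0xe110 in hexadecimal, which gives "He110". On a little-endian machine, i = 0x00646c72 is stored in memory as the bytes 72 6c 64 00, i.e. 'r', 'l', 'd', '\0', which %s prints as "rld". On a big-endian machine i would have to be 0x726c6400 to produce the same byte sequence. 57616 is passed and formatted as a number, a higher-level abstraction than raw bytes, so it does not need to change.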

Question 5

cprintf("x=%d y=%d", 3);

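In one run the output was x=3 y=1604. The reason is that cprintf fetches the arguments named by the format string one after another from the stack through va_list. Only one integer was passed here, but the format string asks for two, so the second %d reads whatever happens to sit in the stack slot next to the real argument. That slot was never an argument, which makes the printed value unpredictable.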

Question 6

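If GCC changed its convention so that the last argument is pushed last, then the order in which cprintf walks va_list ap would also have to be reversed; adding a reversal step would be enough.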

Challenge Enhance the console to allow text to be printed in different colors. The traditional way to do this is to make it interpret ANSI escape sequences embedded in the text strings printed to the console, but you may use any mechanism you like. There is plenty of information on the 6.828 reference page and elsewhere on the web on programming the VGA display hardware. If you're feeling really adventurous, you could try switching the VGA hardware into a graphics mode and making the console draw text onto the graphical frame buffer.

Color is handled in the cga_putc() function in kern/console.c:

static void cga_putc(int c)
{
  // if no attribute given, then use black on white
  if (!(c & ~0xFF))
    c |= 0x0700;
  switch (c & 0xff) { 
    /* ... */
    default:
      crt_buf[crt_pos++] = c;   /* write the character */
      break;
  }

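The default case shows that int c is written straight into the display buffer. Each cell of crt_buf is 16 bits wide, while an ASCII character needs only 8 bits, so besides the ASCII character each cell carries extra information: the color attribute.

The bit operation under the comment masks off the low eight bits of int c and tests whether the result is zero, i.e. whether the bits above the character byte are all zero. If no attribute was supplied there, the default 0x0700 is used; otherwise the supplied bits control the output color.

So this is really a question of encoding.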

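To control the color of printed characters, add a format specifier %C. For example, the call

cprintf("%C Color test", 0x0400);

prints the string Color test in red. The color attribute values are covered in 《汇编语言程序设计》 (Assembly Language Programming), Chapter 9, Section 2, "Display I/O", which includes a table of the 4-bit IRGB colors.

So the idea is to pass the color attribute in through %C: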

#include 
void vprintfmt(void (*putch)(int, void*), void *putdat, 
  const char *fmt, va_list ap){
  switch (ch = *(unsigned char *) fmt++) {
    case 'C':
      csa = getint(&ap, lflag);
      break;
  }
}

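Here the global variable csa holds the color attribute, e.g. 0x0400 for red. It is used from several .c files, so it is declared in a header:

/* inc/csa.h */
extern int csa;   /* declaration only; define `int csa;` in exactly one .c file */

kern/console.c can then use this variable: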

static void cga_putc(int c){
  // if no attribute given, then use black on white
  if (!csa) csa = 0x0700;
  if (!(c & ~0xFF))
    c |= csa;
}

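The output looks like this:

Before that, though, I was puzzled for a long time. I had been starting JOS with make qemu-nox the whole time, and no color ever showed up:

I could not find any bug, and other people's write-ups did not seem to mention this problem, so for a long time I could not get any color. In the end I switched to the VGA window, and the colors appeared. I am still not sure of the exact cause. My first guess was the terminal's encoding; a more likely explanation is that qemu-nox shows only the serial console, which receives the characters but not the CGA attribute byte.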

The Stack

In the final exercise of this lab, we will explore in more detail the way the C language uses the stack on the x86, and in the process write a useful new kernel monitor function that prints a backtrace of the stack: a list of the saved Instruction Pointer (IP) values from the nested call instructions that led to the current point of execution.

Exercise 9. Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which "end" of this reserved area is the stack pointer initialized to point to?


obj/kern/kernel.asm contains this code:

f010002f :
relocated:
  # Clear the frame pointer register (EBP)
  # so that once we get into debugging C code,
  # stack backtraces will be terminated properly.
  movl  $0x0,%ebp     # nuke frame pointer
f010002f: bd 00 00 00 00        mov    $0x0,%ebp
  # Set the stack pointer
  movl  $(bootstacktop),%esp
f0100034: bc 00 00 11 f0        mov    $0xf0110000,%esp

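This is what initializes the frame pointer and the stack pointer; the corresponding source is in kern/entry.S. At this point, inspect the stack registers with GDB: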

(gdb) info reg
esp            0xf0110000 0xf0110000 
ebp            0x0  0x0

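This shows the address held by the stack pointer. The stack space itself is reserved through global symbols in the .data section of kern/entry.S:

.data
  .globl    bootstack
bootstack:
  .space    KSTKSIZE
  .globl    bootstacktop  
bootstacktop:

Together with the kernel code above, bootstacktop marks the top of the stack, which is the high end of the reserved area and the initial value of %esp, and KSTKSIZE is the size of the reserved space. The stack therefore grows downward from 0xf0110000.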


The x86 stack pointer (esp register) points to the lowest location on the stack that is currently in use. Everything below that location in the region reserved for the stack is free. Pushing a value onto the stack involves decreasing the stack pointer and then writing the value to the place the stack pointer points to. Popping a value from the stack involves reading the value the stack pointer points to and then increasing the stack pointer. In 32-bit mode, the stack can only hold 32-bit values, and esp is always divisible by four. Various x86 instructions, such as call, are "hard-wired" to use the stack pointer register.

The ebp (base pointer) register, in contrast, is associated with the stack primarily by software convention. On entry to a C function, the function's prologue code normally saves the previous function's base pointer by pushing it onto the stack, and then copies the current esp value into ebp for the duration of the function. If all the functions in a program obey this convention, then at any given point during the program's execution, it is possible to trace back through the stack by following the chain of saved ebp pointers and determining exactly what nested sequence of function calls caused this particular point in the program to be reached. This capability can be particularly useful, for example, when a particular function causes an assert failure or panic because bad arguments were passed to it, but you aren't sure who passed the bad arguments. A stack backtrace lets you find the offending function.

Exercise 10. To become familiar with the C calling conventions on the x86, find the address of the test_backtrace function in obj/kern/kernel.asm, set a breakpoint there, and examine what happens each time it gets called after the kernel starts. How many 32-bit words does each recursive nesting level of test_backtrace push on the stack, and what are those words?

Note that, for this exercise to work properly, you should be using the patched version of QEMU available on the tools page or on Athena. Otherwise, you'll have to manually translate all breakpoint and memory addresses to linear addresses.


The function test_backtrace is at address 0xf0100040 in kernel.asm. Run to a breakpoint there and look at the state of the stack:

(gdb) info stack
#0  test_backtrace (x=5) at kern/init.c:13
#1  0xf01000e3 in i386_init () at kern/init.c:41
#2  0xf010003e in relocated () at kern/entry.S:80
(gdb) c
Continuing.
=> 0xf0100040 : push   %ebp

Breakpoint 1, test_backtrace (x=4) at kern/init.c:13
13  {
(gdb) info stack
#0  test_backtrace (x=4) at kern/init.c:13
#1  0xf0100068 in test_backtrace (x=5) at kern/init.c:16
#2  0xf01000e3 in i386_init () at kern/init.c:41
#3  0xf010003e in relocated () at kern/entry.S:80

The above shows two hits of the breakpoint. Each call to test_backtrace corresponds to one change in the state of the stack, a push or a pop.

In the recursive call,

    test_backtrace(x-1);
f010005c: 83 ec 0c              sub    $0xc,%esp
f010005f: 8d 43 ff              lea    -0x1(%ebx),%eax
f0100062: 50                    push   %eax
f0100063: e8 d8 ff ff ff        call   f0100040 
f0100068: 83 c4 10              add    $0x10,%esp
f010006b: eb 11                 jmp    f010007e 

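So for each recursive call the stack pointer first moves down by 0xc and the argument x-1 is then pushed, 0x10 bytes in total, which add $0x10,%esp releases after the call returns. The sub $0xc appears to be padding that keeps the stack aligned rather than space the recursion itself needs. The relation 0xf010005c + 0xc = 0xf0100068 is a coincidence between instruction addresses and says nothing about the stack.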


The above exercise should give you the information you need to implement a stack backtrace function, which you should call mon_backtrace(). A prototype for this function is already waiting for you in kern/monitor.c. You can do it entirely in C, but you may find the read_ebp() function in inc/x86.h useful. You'll also have to hook this new function into the kernel monitor's command list so that it can be invoked interactively by the user.

The backtrace function should display a listing of function call frames in the following format:

Stack backtrace:
  ebp f0109e58  eip f0100a62  args 00000001 f0109e80 f0109e98 f0100ed2 00000031
  ebp f0109ed8  eip f01000d6  args 00000000 00000000 f0100058 f0109f28 00000061
  ...

Each line contains an ebp, eip, and args. The ebp value indicates the base pointer into the stack used by that function: i.e., the position of the stack pointer just after the function was entered and the function prologue code set up the base pointer. The listed eip value is the function's return instruction pointer: the instruction address to which control will return when the function returns. The return instruction pointer typically points to the instruction after the call instruction (why?). Finally, the five hex values listed after args are the first five arguments to the function in question, which would have been pushed on the stack just before the function was called. If the function was called with fewer than five arguments, of course, then not all five of these values will be useful. (Why can't the backtrace code detect how many arguments there actually are? How could this limitation be fixed?)

The first line printed reflects the currently executing function, namely mon_backtrace itself, the second line reflects the function that called mon_backtrace, the third line reflects the function that called that one, and so on. You should print all the outstanding stack frames. By studying kern/entry.S you'll find that there is an easy way to tell when to stop.

Here are a few specific points you read about in K&R Chapter 5 that are worth remembering for the following exercise and for future labs.

  • If int *p = (int*)100, then (int)p + 1 and (int)(p + 1) are different numbers: the first is 101 but the second is 104. When adding an integer to a pointer, as in the second case, the integer is implicitly multiplied by the size of the object the pointer points to.
  • p[i] is defined to be the same as *(p+i), referring to the i'th object in the memory pointed to by p. The above rule for addition helps this definition work when the objects are larger than one byte.
  • &p[i] is the same as (p+i), yielding the address of the i'th object in the memory pointed to by p.

Although most C programs never need to cast between pointers and integers, operating systems frequently do. Whenever you see an addition involving a memory address, ask yourself whether it is an integer addition or pointer addition and make sure the value being added is appropriately multiplied or not.

Exercise 11. Implement the backtrace function as specified above. Use the same format as in the example, since otherwise the grading script will be confused. When you think you have it working right, run make grade to see if its output conforms to what our grading script expects, and fix it if it doesn't. After you have handed in your Lab 1 code, you are welcome to change the output format of the backtrace function any way you like.

If you use read_ebp(), note that GCC may generate "optimized" code that calls read_ebp() before mon_backtrace()'s function prologue, which results in an incomplete stack trace (the stack frame of the most recent function call is missing). While we have tried to disable optimizations that cause this reordering, you may want to examine the assembly of mon_backtrace() and make sure the call to read_ebp() is happening after the function prologue.


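The lecture 2 notes of MIT 6.828 describe the layout of the stack:

According to that diagram, the saved eip sits one word above ebp, and the first five arguments occupy the next five words above that. The pointer arithmetic here is a little fiddly: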

int mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
  uint32_t *ebp = (uint32_t *)read_ebp();   /* frame pointer of this function */
  uint32_t eip;
  uint32_t arg0, arg1, arg2, arg3, arg4;
  cprintf("Stack backtrace:\n");
  while (ebp != 0) {          /* the outermost frame has ebp == 0 */
    eip = ebp[1];             /* return address: one word above the saved ebp */
    arg0 = ebp[2];            /* the next five words are the argument slots */
    arg1 = ebp[3];
    arg2 = ebp[4];
    arg3 = ebp[5];
    arg4 = ebp[6];
    cprintf("ebp %08x eip %08x args %08x %08x %08x %08x %08x\n",
     ebp, eip, arg0, arg1, arg2, arg3, arg4);
    ebp = (uint32_t *)ebp[0]; /* follow the saved ebp to the caller's frame */
  }
  return 0;
}

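I originally meant to read eip with an asm volatile statement, the way read_ebp() does, but on reflection the saved eip sits just above ebp on the stack, so that is unnecessary. Besides, like ebp, such a read would only be usable for the initial value. The args could also be accessed through explicit pointers, but that is more verbose to write. The pointer relationships here really are fiddly, and keeping them straight takes some patience.

The loop walks back until ebp is 0 because the kernel's assembly code contains this initialization: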

  # Clear the frame pointer register (EBP)
  # so that once we get into debugging C code,
  # stack backtraces will be terminated properly.
  movl  $0x0,%ebp     # nuke frame pointer

So the frame-pointer chain ends at 0. The output is:

entering test_backtrace 5
entering test_backtrace 4
entering test_backtrace 3
entering test_backtrace 2
entering test_backtrace 1
entering test_backtrace 0
Stack backtrace:
ebp f010ff18 eip f010007b args 00000000 00000000 00000000 00000000 f01008fd
ebp f010ff38 eip f0100068 args 00000000 00000001 f010ff78 00000000 f01008fd
ebp f010ff58 eip f0100068 args 00000001 00000002 f010ff98 00000000 f01008fd
ebp f010ff78 eip f0100068 args 00000002 00000003 f010ffb8 00000000 f01008fd
ebp f010ff98 eip f0100068 args 00000003 00000004 00000000 00000000 00000000
ebp f010ffb8 eip f0100068 args 00000004 00000005 00000000 00010094 00010094
ebp f010ffd8 eip f01000d4 args 00000005 00001aac 00000644 00000000 00000000
ebp f010fff8 eip f010003e args 00111021 00000000 00000000 00000000 00000000
leaving test_backtrace 0
leaving test_backtrace 1
leaving test_backtrace 2
leaving test_backtrace 3
leaving test_backtrace 4
leaving test_backtrace 5

The result of make grade is:

  printf: OK 
  backtrace count: OK 
  backtrace arguments: OK 
  backtrace symbols: FAIL 
    AssertionError: got:
      
    expected:
      test_backtrace
      test_backtrace
      test_backtrace
      test_backtrace
      test_backtrace
      test_backtrace
      i386_init
    
  backtrace lines: FAIL 
    AssertionError: No line numbers
    
Score: 40/50

The symbol and line-number checks fail; they are completed in the next exercise.

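As for why the arguments cannot be printed according to each function's actual number of parameters: we already ran into this last semester in the ICS programming assignments (PA), and it is discussed in PA2 ("The Machine That Never Stops Computing"), section 3.5, "Simple Debugger (2)":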

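Since we lack concrete information about formal parameters and local variables, we only need to print the address, the function name, and the first 4 arguments; the output format can follow that of the bt command in GDB. How do we determine which function a given address falls in? This is where the symbol table comes in.

For an entry whose Type attribute is FUNC, the Value attribute gives the function's start address and the Size attribute gives the function's size; these two attributes determine the function's range. Because function ranges do not overlap, we can scan every entry in the symbol table whose Type is FUNC and uniquely determine the function that contains an address. To get the function name, you only need to use the entry's Name attribute to find the corresponding string in the string table.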
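The scan described in the quoted passage can be sketched as follows. This is a simplified illustration rather than the real ELF layout: struct sym, enum sym_type and find_func are hypothetical names, and the name field is assumed to have been resolved from the string table already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol-table entry (not the real ELF layout). */
enum sym_type { SYM_OBJECT, SYM_FUNC };

struct sym {
    const char   *name;   /* assumed already resolved from the string table */
    enum sym_type type;   /* corresponds to the Type attribute */
    uint32_t      value;  /* Value attribute: start address */
    uint32_t      size;   /* Size attribute: size in bytes */
};

/* Return the FUNC entry whose range [value, value + size) contains addr,
 * or NULL if no function covers it. */
static const struct sym *find_func(const struct sym *tab, size_t n,
                                   uint32_t addr)
{
    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tab[i].type != SYM_FUNC)
            continue;                      /* skip non-function symbols */
        if (addr >= tab[i].value && addr - tab[i].value < tab[i].size)
            return &tab[i];
    }
    return NULL;
}
```

Because function ranges do not overlap, the first match is the only match, so the linear scan can return as soon as it finds one.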


At this point, your backtrace function should give you the addresses of the function callers on the stack that lead to mon_backtrace() being executed. However, in practice you often want to know the function names corresponding to those addresses. For instance, you may want to know which functions could contain a bug that's causing your kernel to crash.

To help you implement this functionality, we have provided the function debuginfo_eip(), which looks up eip in the symbol table and returns the debugging information for that address. This function is defined in kern/kdebug.c.

Exercise 12. Modify your stack backtrace function to display, for each eip, the function name, source file name, and line number corresponding to that eip.

In debuginfo_eip, where do __STAB_* come from? This question has a long answer; to help you to discover the answer, here are some things you might want to do:

  • look in the file kern/kernel.ld for __STAB_*
  • run i386-jos-elf-objdump -h obj/kern/kernel
  • run i386-jos-elf-objdump -G obj/kern/kernel
  • run i386-jos-elf-gcc -pipe -nostdinc -O2 -fno-builtin -I. -MD -Wall -Wno-format -DJOS_KERNEL -gstabs -c -S kern/init.c, and look at init.s.
  • see if the bootloader loads the symbol table in memory as part of loading the kernel binary

Complete the implementation of debuginfo_eip by inserting the call to stab_binsearch to find the line number for an address.

Add a backtrace command to the kernel monitor, and extend your implementation of mon_backtrace to call debuginfo_eip and print a line for each stack frame of the form:

K> backtrace
Stack backtrace:
  ebp f010ff78  eip f01008ae  args 00000001 f010ff8c 00000000 f0110580 00000000
         kern/monitor.c:143: monitor+106
  ebp f010ffd8  eip f0100193  args 00000000 00001aac 00000660 00000000 00000000
         kern/init.c:49: i386_init+59
  ebp f010fff8  eip f010003d  args 00000000 00000000 0000ffff 10cf9a00 0000ffff
         kern/entry.S:70: <unknown>+0
K> 

Each line gives the file name and line within that file of the stack frame's eip, followed by the name of the function and the offset of the eip from the first instruction of the function (e.g., monitor+106 means the return eip is 106 bytes past the beginning of monitor).

Be sure to print the file and function names on a separate line, to avoid confusing the grading script.

Tip: printf format strings provide an easy, albeit obscure, way to print non-null-terminated strings like those in STABS tables. printf("%.*s", length, string) prints at most length characters of string. Take a look at the printf man page to find out why this works.
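As a small illustration of the tip, here is a sketch that runs on an ordinary host with the standard C library rather than cprintf. The helper names stab_name_len and format_prefix are hypothetical, and the string "monitor:F(0,1)" is only an illustrative STABS-style name in which the function name ends at the colon.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Length of the function name in a STABS-style string "name:type";
 * if there is no colon, the whole string counts as the name. */
static int stab_name_len(const char *s)
{
    const char *colon = strchr(s, ':');
    return colon ? (int)(colon - s) : (int)strlen(s);
}

/* Write at most `len` characters of `s` into `out` using the "%.*s"
 * precision form. Returns what snprintf returns: the number of characters
 * the full result needs, excluding the terminator. */
static int format_prefix(char *out, size_t outsz, const char *s, int len)
{
    assert(out != NULL && outsz > 0 && len >= 0);
    return snprintf(out, outsz, "%.*s", len, s);
}
```

With s set to "monitor:F(0,1)", stab_name_len(s) is 7, and format_prefix(buf, sizeof buf, s, 7) leaves "monitor" in buf. The precision tells printf when to stop reading, so the name does not need its own terminator.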

You may find that some functions are missing from the backtrace. For example, you will probably see a call to monitor() but not to runcmd(). This is because the compiler in-lines some function calls. Other optimizations may cause you to see unexpected line numbers. If you get rid of the -O2 from GNUMakefile, the backtraces may make more sense (but your kernel will run more slowly).


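As noted at the end of the previous exercise, and just as in the ICS PA, the symbol table is needed before any further information can be printed for the chain of stack frames.

The essence is always the same: taken as a whole, debuginfo_eip simply fills in the struct Eipdebuginfo *info with data from the symbol table: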

  // Initialize *info
  info->eip_file = "<unknown>";
  info->eip_line = 0;
  info->eip_fn_name = "<unknown>";
  info->eip_fn_namelen = 9;
  info->eip_fn_addr = addr;
  info->eip_fn_narg = 0;

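All of these members except eip_line are already assigned real values by the provided code, so the only task is to assign eip_line in the same manner. Locating an element in the symbol table goes roughly like this:

The only information required is the address. Function address ranges are contiguous and mutually exclusive in the symbol table, so given any address inside a function, and having told stab_binsearch that the symbol type to look for is N_FUN (defined in inc/stab.h), we can find the function's range [lfun, rfun]. All the information we need is defined in the symbol-table entries within this range.

Locating the range and filling in the struct proceeds layer by layer. First find the source file containing the function, which gives the symbol-table range [lfile, rfile]. Then locate the function within that range, giving [lfun, rfun]. Finally, within the function's range [lfun, rfun], find the range for the line number, [lline, rline].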
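The layered narrowing can be sketched on a toy table. This is a deliberately simplified model, not the real stab_binsearch: the type tags, struct entry, narrow and lookup_line are hypothetical names, every entry holds an absolute address, the table is assumed to be sorted by address, and a linear scan stands in for the binary search.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags and entry layout; the real ones live in inc/stab.h. */
enum { T_FILE = 1, T_FUN = 2, T_LINE = 3 };

struct entry {
    int      type;
    uint32_t addr;  /* absolute address in this simplified model */
    int      line;  /* meaningful only for T_LINE entries */
};

/* Narrow [*l, *r] so that it starts at the last entry of `type` with
 * addr <= target and ends just before the next entry of that type.
 * If no entry matches, the range is made empty (*r == *l - 1). */
static void narrow(const struct entry *tab, int *l, int *r,
                   int type, uint32_t target)
{
    int found = -1, next = -1;
    assert(tab != 0 && l != 0 && r != 0);
    for (int i = *l; i <= *r; i++) {
        if (tab[i].type != type)
            continue;
        if (tab[i].addr <= target) {
            found = i;
        } else {
            next = i;   /* first entry of this type past the target */
            break;
        }
    }
    if (found < 0) {
        *r = *l - 1;
        return;
    }
    *l = found;
    if (next >= 0)
        *r = next - 1;
}

/* Layered lookup: file, then function, then line. Returns the line number,
 * or -1 if any layer comes up empty. */
static int lookup_line(const struct entry *tab, int n, uint32_t target)
{
    int l = 0, r = n - 1;
    narrow(tab, &l, &r, T_FILE, target);
    if (l > r) return -1;
    narrow(tab, &l, &r, T_FUN, target);
    if (l > r) return -1;
    narrow(tab, &l, &r, T_LINE, target);
    if (l > r) return -1;
    return tab[l].line;
}
```

Each call to narrow only ever shrinks the range it was handed, which is why the line range ends up inside the function range and the function range inside the file range.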

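The three ranges found this way are nested: [lfile, rfile] contains [lfun, rfun], which in turn contains [lline, rline]. With them in hand, the individual members of the struct can be filled in more precisely, which requires the stabs symbol table itself. The left endpoint of each of the three ranges is a starting position, that is, an index into the array of symbol-table structures, declared as:

struct Stab {
  uint32_t n_strx;    // index into string table of name
  uint8_t n_type;     // type of symbol
  uint8_t n_other;    // misc info (usually empty)
  uint16_t n_desc;    // description field
  uintptr_t n_value;  // value of symbol
};

Indexing the array, as in stabs[lfun], therefore lands exactly on the required symbol-table entry, whose members can then be read. Returning to the line-number problem, only a few lines of code need to be written: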

  // Search within [lline, rline] for the line number stab.
  // If found, set info->eip_line to the right line number.
  // If not found, return -1.
  stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
  if (lline > rline)
    return -1;
  info->eip_line = stabs[lline].n_desc;
  /* ... */
  // If a function stab was found, debuginfo_eip has already rebased addr
  // to the function start, so addr is the offset; otherwise
  // eip_fn_addr == addr and the offset is 0.
  cprintf("    %s:%d: %.*s+%d\n",
    info->eip_file, info->eip_line,
    info->eip_fn_namelen, info->eip_fn_name,
    addr == info->eip_fn_addr ? 0 : addr);

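To avoid the hassle of passing parameters around, the second output line, with the extra function information, is printed directly here in kern/kdebug.c. Then mon_backtrace in kern/monitor.c is adjusted to call it: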

int mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
  uint32_t * ebp = (uint32_t *)read_ebp();
  uint32_t eip;
  uint32_t arg0, arg1, arg2, arg3, arg4;

  /* ********************************** */
  uintptr_t addr;
  struct Eipdebuginfo info;
  /* ********************************** */

  cprintf("Stack backtrace:\n");
  while(ebp != 0){
    eip = *(uint32_t *)(ebp + 1);

    /* ********************************** */
    addr = (uintptr_t)*(ebp + 1);
    /* ********************************** */

    arg0 = ebp[2];
    arg1 = ebp[3];
    arg2 = ebp[4];
    arg3 = ebp[5];
    arg4 = ebp[6];
    cprintf("ebp %08x eip %08x args %08x %08x %08x %08x %08x\n", 
      ebp, eip, arg0, arg1, arg2, arg3, arg4);

    /* ********************************** */
    memset(&info, 0, sizeof(info));
    debuginfo_eip(addr, &info);
    /* ********************************** */

    ebp = (uint32_t *)(*(uint32_t *)ebp);
  }
  return 0;
}

Grading with make grade:

yangminz@yangminz-VM:~/6.828/lab$ make grade
make clean
make[1]: Entering directory '/home/yangminz/6.828/lab'
rm -rf obj .gdbinit jos.in qemu.log
make[1]: Leaving directory '/home/yangminz/6.828/lab'
./grade-lab1 
make[1]: Entering directory '/home/yangminz/6.828/lab'
sh: echo: I/O error
+ as kern/entry.S
+ cc kern/entrypgdir.c
sh: echo: I/O error
+ cc kern/init.c
+ cc kern/console.c
+ cc kern/monitor.c
+ cc kern/printf.c
+ cc kern/kdebug.c
+ cc lib/printfmt.c
+ cc lib/readline.c
+ cc lib/string.c
sh: echo: I/O error
+ ld obj/kern/kernel
+ as boot/boot.S
+ cc -Os boot/main.c
+ ld boot/boot
boot block is 390 bytes (max 510)
+ mk obj/kern/kernel.img
make[1]: Leaving directory '/home/yangminz/6.828/lab'
running JOS: (1.5s) 
  printf: OK 
  backtrace count: OK 
  backtrace arguments: OK 
  backtrace symbols: OK 
  backtrace lines: OK 
Score: 50/50

And with that, the job is done.


This completes the lab. In the lab directory, commit your changes with git commit and type make handin to submit your code.