Android’s kernel for beagleboard-xm

The Rowboat project ports Android’s Linux kernel to TI devices; the source is at http://gitorious.org/rowboat

1, Build

   make CROSS_COMPILE=arm-eabi- distclean

   make CROSS_COMPILE=arm-eabi- omap3_beagle_android_defconfig

   make CROSS_COMPILE=arm-eabi- uImage

The generated uImage is in arch/arm/boot.

2, Modifications

2.1 arm\mach-omap2\board-omap3beagle.c

This file is the main BSP (board support package) file for the beagleboard. It defines __mach_desc_OMAP3_BEAGLE, which describes the architecture features, as follows:

MACHINE_START(OMAP3_BEAGLE, "OMAP3 Beagle Board")
    /* Maintainer: Syed Mohammed Khasim - http://beagleboard.org */
    .phys_io    = 0x48000000,
    .io_pg_offst    = ((0xfa000000) >> 18) & 0xfffc,
    .boot_params    = 0x80000100,
    .map_io     = omap3_beagle_map_io,
    .init_irq   = omap3_beagle_init_irq,
    .init_machine   = omap3_beagle_init,
    .timer      = &omap_timer,
MACHINE_END

This structure is placed in the .arch.info.init section, which is referenced in the linker script arm\kernel\vmlinux.lds.S:

    .init : {           /* Init code and data       */
        _stext = .;
        _sinittext = .;
            HEAD_TEXT
            INIT_TEXT
        _einittext = .;
        __proc_info_begin = .;
            *(.proc.info.init)
        __proc_info_end = .;
        __arch_info_begin = .;
            *(.arch.info.init)
        __arch_info_end = .;
        __tagtable_begin = .;
            *(.taglist.init)
        __tagtable_end = .;

The machine description structure is as follows:

struct machine_desc {
    /*
     * Note! The first four elements are used
     * by assembler code in head.S, head-common.S
     */
    unsigned int        nr;     /* architecture number  */
    unsigned int        phys_io;    /* start of physical io */
    unsigned int        io_pg_offst;    /* byte offset for io
                         * page table entry */

    const char      *name;      /* architecture name    */
    unsigned long       boot_params;    /* tagged list      */

    unsigned int        video_start;    /* start of video RAM   */
    unsigned int        video_end;  /* end of video RAM */

    unsigned int        reserve_lp0 :1; /* never has lp0    */
    unsigned int        reserve_lp1 :1; /* never has lp1    */
    unsigned int        reserve_lp2 :1; /* never has lp2    */
    unsigned int        soft_reboot :1; /* soft reboot      */
    void            (*fixup)(struct machine_desc *,
                     struct tag *, char **,
                     struct meminfo *);
    void            (*map_io)(void);/* IO mapping function  */
    void            (*init_irq)(void);
    struct sys_timer    *timer;     /* system tick timer    */
    void            (*init_machine)(void);
};
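
To make the role of the descriptor table concrete, here is a minimal sketch of looking up a descriptor by machine number. It is not the kernel's code: the `demo_` names and the trimmed-down structure are hypothetical, a plain array stands in for the entries the linker gathers between `__arch_info_begin` and `__arch_info_end`, and 1546 is the board id that the u-boot notes further down report for the beagleboard.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct machine_desc. */
struct demo_machine_desc {
    unsigned int nr;               /* architecture (machine) number */
    const char  *name;             /* architecture name */
    void       (*init_machine)(void);
};

static void demo_beagle_init(void)
{
    /* board-specific setup would go here */
}

/* Stand-in for the descriptors collected in .arch.info.init. */
static const struct demo_machine_desc demo_arch_info[] = {
    { 1546, "OMAP3 Beagle Board", demo_beagle_init },
};

/* Return the descriptor whose number matches nr, or NULL if none does. */
const struct demo_machine_desc *demo_lookup_machine(unsigned int nr)
{
    size_t i;

    for (i = 0; i < sizeof(demo_arch_info) / sizeof(demo_arch_info[0]); i++) {
        if (demo_arch_info[i].nr == nr)
            return &demo_arch_info[i];
    }
    return NULL;
}
```

In the real kernel the walk happens in early boot code over the linker-provided section bounds; the array here only illustrates the matching step.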

 

3, Reference

[1] http://processors.wiki.ti.com/index.php?title=TI-Android-FroYo-DevKit-V2_UserGuide

[2] http://gitorious.org/rowboat

Android’s Binder

 

The Binder communicates between processes using a small custom kernel module. This is used instead of standard Linux IPC facilities so that we can efficiently model our IPC operations as “thread migration”. That is, an IPC between processes looks as if the thread instigating the IPC has hopped over to the destination process to execute the code there, and then hopped back with the result.

Why does Android need IPC/Binder? Although all Android apps are written in Java, Android uses the Dalvik VM and, unlike a traditional OSGi JVM, each Dalvik app resides in its own Linux process. As Radoslav Gerganow said, this prevents all apps from being closed when one VM breaks. IPC is therefore necessary for Android apps to communicate with each other.

The binder in android is based on OpenBinder with some modifications. The binder’s protocol version used in android-kernel 2.6.32 is 7. Binder IPC in android is based on binder driver /drivers/staging/android/binder.c.

1 Workflow

[Figure: Binder workflow diagram]

2 Binder Driver

When a user-space thread wants to participate in Binder IPC (either to send an IPC to another process or to receive an incoming IPC), the first thing it must do is open the driver supplied by the Binder kernel module. This associates a file descriptor with that thread, which the kernel module uses to identify the initiators and recipients of Binder IPCs.

2.1 binder_init()

  • Create procfs /proc/binder and some entries as:
    • state
    • stats
    • transactions
    • transaction_log
    • failed_transaction_log
  • Register binder device via misc_register()

2.2 binder_ioctl()

  • BINDER_WRITE_READ

Sends zero or more Binder operations, then blocks waiting to receive incoming operations and returns with a result. (This is the same as doing a normal write() followed by a read() on the file descriptor, just a little more efficient.)

The ioctl’s data structure is:

struct binder_write_read {
    signed long write_size; /* bytes to write */
    signed long write_consumed; /* bytes consumed by driver */
    unsigned long   write_buffer;
    signed long read_size;  /* bytes to read */
    signed long read_consumed;  /* bytes consumed by driver */
    unsigned long   read_buffer;
};

Upon calling the driver, write_buffer contains a series of commands for it to perform, and upon return read_buffer is filled in with a series of responses for the thread to execute.

Here is a list of the commands that can be sent by a process to the driver, with comments describing the data that follows each command in the buffer:

enum BinderDriverCommandProtocol {
    BC_TRANSACTION = _IOW('c', 0, struct binder_transaction_data),
    BC_REPLY = _IOW('c', 1, struct binder_transaction_data),
    /*
     * binder_transaction_data: the sent command.
     */

    BC_ACQUIRE_RESULT = _IOW('c', 2, int),
    /*
     * not currently supported
     * int:  0 if the last BR_ATTEMPT_ACQUIRE was not successful.
     * Else you have acquired a primary reference on the object.
     */

    BC_FREE_BUFFER = _IOW('c', 3, int),
    /*
     * void *: ptr to transaction data received on a read
     */

    BC_INCREFS = _IOW('c', 4, int),
    BC_ACQUIRE = _IOW('c', 5, int),
    BC_RELEASE = _IOW('c', 6, int),
    BC_DECREFS = _IOW('c', 7, int),
    /*
     * int: descriptor
     */

    BC_INCREFS_DONE = _IOW('c', 8, struct binder_ptr_cookie),
    BC_ACQUIRE_DONE = _IOW('c', 9, struct binder_ptr_cookie),
    /*
     * void *: ptr to binder
     * void *: cookie for binder
     */

    BC_ATTEMPT_ACQUIRE = _IOW('c', 10, struct binder_pri_desc),
    /*
     * not currently supported
     * int: priority
     * int: descriptor
     */

    BC_REGISTER_LOOPER = _IO('c', 11),
    /*
     * No parameters.
     * Register a spawned looper thread with the device.
     */

    BC_ENTER_LOOPER = _IO('c', 12),
    BC_EXIT_LOOPER = _IO('c', 13),
    /*
     * No parameters.
     * These two commands are sent as an application-level thread
     * enters and exits the binder loop, respectively.  They are
     * used so the binder can have an accurate count of the number
     * of looping threads it has available.
     */

    BC_REQUEST_DEATH_NOTIFICATION = _IOW('c', 14, struct binder_ptr_cookie),
    /*
     * void *: ptr to binder
     * void *: cookie
     */

    BC_CLEAR_DEATH_NOTIFICATION = _IOW('c', 15, struct binder_ptr_cookie),
    /*
     * void *: ptr to binder
     * void *: cookie
     */

    BC_DEAD_BINDER_DONE = _IOW('c', 16, void *),
    /*
     * void *: cookie
     */
};

The most interesting commands here are BC_TRANSACTION and BC_REPLY, which initiate an IPC transaction and return a reply for a transaction, respectively. The data structure following these commands is:

enum transaction_flags {
    TF_ONE_WAY  = 0x01, /* this is a one-way call: async, no return */
    TF_ROOT_OBJECT  = 0x04, /* contents are the component's root object */
    TF_STATUS_CODE  = 0x08, /* contents are a 32-bit status code */
    TF_ACCEPT_FDS   = 0x10, /* allow replies with file descriptors */
};

struct binder_transaction_data {
    /* The first two are only used for bcTRANSACTION and brTRANSACTION,
     * identifying the target and contents of the transaction.
     */
    union {
        size_t  handle; /* target descriptor of command transaction */
        void    *ptr;   /* target descriptor of return transaction */
    } target;
    void        *cookie;    /* target object cookie */
    unsigned int    code;       /* transaction command */

    /* General information about the transaction. */
    unsigned int    flags;
    pid_t       sender_pid;
    uid_t       sender_euid;
    size_t      data_size;  /* number of bytes of data */
    size_t      offsets_size;   /* number of bytes of offsets */

    /* If this transaction is inline, the data immediately
     * follows here; otherwise, it ends with a pointer to
     * the data buffer.
     */
    union {
        struct {
            /* transaction data */
            const void  *buffer;
            /* offsets from buffer to flat_binder_object structs */
            const void  *offsets;
        } ptr;
        uint8_t buf[8];
    } data;
};

Thus, to initiate an IPC transaction, you will essentially perform a BINDER_WRITE_READ ioctl with the write buffer containing BC_TRANSACTION followed by a binder_transaction_data. In this structure target is the handle of the object that should receive the transaction, code tells the object what to do when it receives the transaction, and there is a data buffer containing the transaction data, as well as an (optional) additional offsets buffer of meta-data. (The OpenBinder documentation also describes a priority field giving the thread priority to run the IPC at, but the structure shown above has no such field.)
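
The "command followed by its data" layout of the write buffer can be sketched as follows. This is an illustration under stated assumptions, not the driver's API: the `demo_` structures are hypothetical simplifications (the real payload is struct binder_transaction_data and the real command code comes from the `_IOW()` macro), and no ioctl is issued.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the real BC_TRANSACTION command code. */
#define DEMO_BC_TRANSACTION 0x1u

/* Hypothetical, simplified payload. */
struct demo_transaction_data {
    uint32_t handle;   /* target object handle */
    uint32_t code;     /* what the target should do */
    uint32_t flags;
};

/* Mirrors the field layout of struct binder_write_read shown earlier. */
struct demo_write_read {
    signed long   write_size;
    signed long   write_consumed;
    unsigned long write_buffer;
    signed long   read_size;
    signed long   read_consumed;
    unsigned long read_buffer;
};

/* Pack "command, then payload" into buf and describe it in bwr.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t demo_pack_transaction(unsigned char *buf, size_t cap,
                             const struct demo_transaction_data *td,
                             struct demo_write_read *bwr)
{
    uint32_t cmd = DEMO_BC_TRANSACTION;
    size_t need = sizeof(cmd) + sizeof(*td);

    if (cap < need)
        return 0;
    memcpy(buf, &cmd, sizeof(cmd));
    memcpy(buf + sizeof(cmd), td, sizeof(*td));

    memset(bwr, 0, sizeof(*bwr));
    bwr->write_size = (signed long)need;
    bwr->write_buffer = (unsigned long)(uintptr_t)buf;
    /* read_size stays 0 here; a caller that expects a reply would also
     * supply a read buffer before issuing the ioctl. */
    return need;
}
```

A real caller would then pass the filled structure to the BINDER_WRITE_READ ioctl on the file descriptor it obtained by opening the binder device.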

Given the target handle, the driver determines which process that object lives in and dispatches this transaction to one of the waiting threads in its thread pool (spawning a new thread if needed). That thread is waiting in a BINDER_WRITE_READ ioctl() to the driver, and so returns with its read buffer filled in with the commands it needs to execute. These commands are very similar to the write commands, for the most part corresponding to write operations on the other side:

enum BinderDriverReturnProtocol {
    BR_ERROR = _IOR('r', 0, int),
    /*
     * int: error code
     */

    BR_OK = _IO('r', 1),
    /* No parameters! */

    BR_TRANSACTION = _IOR('r', 2, struct binder_transaction_data),
    BR_REPLY = _IOR('r', 3, struct binder_transaction_data),
    /*
     * binder_transaction_data: the received command.
     */

    BR_ACQUIRE_RESULT = _IOR('r', 4, int),
    /*
     * not currently supported
     * int: 0 if the last bcATTEMPT_ACQUIRE was not successful.
     * Else the remote object has acquired a primary reference.
     */

    BR_DEAD_REPLY = _IO('r', 5),
    /*
     * The target of the last transaction (either a bcTRANSACTION or
     * a bcATTEMPT_ACQUIRE) is no longer with us.  No parameters.
     */

    BR_TRANSACTION_COMPLETE = _IO('r', 6),
    /*
     * No parameters... always refers to the last transaction requested
     * (including replies).  Note that this will be sent even for
     * asynchronous transactions.
     */

    BR_INCREFS = _IOR('r', 7, struct binder_ptr_cookie),
    BR_ACQUIRE = _IOR('r', 8, struct binder_ptr_cookie),
    BR_RELEASE = _IOR('r', 9, struct binder_ptr_cookie),
    BR_DECREFS = _IOR('r', 10, struct binder_ptr_cookie),
    /*
     * void *:  ptr to binder
     * void *: cookie for binder
     */

    BR_ATTEMPT_ACQUIRE = _IOR('r', 11, struct binder_pri_ptr_cookie),
    /*
     * not currently supported
     * int: priority
     * void *: ptr to binder
     * void *: cookie for binder
     */

    BR_NOOP = _IO('r', 12),
    /*
     * No parameters.  Do nothing and examine the next command.  It exists
     * primarily so that we can replace it with a BR_SPAWN_LOOPER command.
     */

    BR_SPAWN_LOOPER = _IO('r', 13),
    /*
     * No parameters.  The driver has determined that a process has no
     * threads waiting to service incomming transactions.  When a process
     * receives this command, it must spawn a new service thread and
     * register it via bcENTER_LOOPER.
     */

    BR_FINISHED = _IO('r', 14),
    /*
     * not currently supported
     * stop threadpool thread
     */

    BR_DEAD_BINDER = _IOR('r', 15, void *),
    /*
     * void *: cookie
     */
    BR_CLEAR_DEATH_NOTIFICATION_DONE = _IOR('r', 16, void *),
    /*
     * void *: cookie
     */

    BR_FAILED_REPLY = _IO('r', 17),
    /*
     * The the last transaction (either a bcTRANSACTION or
     * a bcATTEMPT_ACQUIRE) failed (e.g. out of memory).  No parameters.
     */
};

The recipient, in user space, will then hand this transaction over to the target object for it to execute and return its result. Upon getting the result, a new write buffer is created containing the BC_REPLY command with a binder_transaction_data structure containing the resulting data. This is returned with a BINDER_WRITE_READ ioctl() on the driver, sending the reply back to the original process and leaving the thread waiting for the next transaction to perform.

The original thread finally returns back from its own BINDER_WRITE_READ with a brREPLY command containing the reply data.

Note that the original thread may also receive BR_TRANSACTION commands while it is waiting for a reply. This represents a recursion across processes: the receiving thread is making a call on an object back in the original process. It is the responsibility of the driver to keep track of all active transactions, so it can dispatch transactions to the correct thread when recursion happens.
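
On the receiving side, the read buffer is a sequence of return commands, each optionally followed by its data. The loop below is a rough sketch of walking such a buffer. The `demo_` command values are hypothetical placeholders (the real BR_* values are built with `_IO()`/`_IOR()`), and only three commands are handled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical return codes, for illustration only. */
enum demo_return_cmd {
    DEMO_BR_NOOP = 1,                 /* no payload */
    DEMO_BR_TRANSACTION_COMPLETE = 2, /* no payload */
    DEMO_BR_ERROR = 3                 /* followed by a 32-bit error code */
};

/* Walk len bytes of a read buffer and count the commands understood.
 * The payload of the last DEMO_BR_ERROR is stored in *last_error.
 * Returns -1 on a truncated or unknown command. */
int demo_walk_read_buffer(const unsigned char *buf, size_t len,
                          int32_t *last_error)
{
    size_t pos = 0;
    int count = 0;

    while (pos < len) {
        uint32_t cmd;

        if (len - pos < sizeof(cmd))
            return -1;
        memcpy(&cmd, buf + pos, sizeof(cmd));
        pos += sizeof(cmd);

        switch (cmd) {
        case DEMO_BR_NOOP:
        case DEMO_BR_TRANSACTION_COMPLETE:
            break;
        case DEMO_BR_ERROR:
            if (len - pos < sizeof(*last_error))
                return -1;
            memcpy(last_error, buf + pos, sizeof(*last_error));
            pos += sizeof(*last_error);
            break;
        default:
            return -1;
        }
        count++;
    }
    return count;
}
```

A full implementation would also handle the transaction and reply commands, whose payload is a binder_transaction_data structure, and would hand each transaction to the target object.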

  • BINDER_SET_MAX_THREADS

  • BINDER_SET_CONTEXT_MGR

  • BINDER_THREAD_EXIT

  • BINDER_VERSION

  • BINDER_SET_IDLE_TIMEOUT

(Not used in android)

  • BINDER_SET_WAKEUP_TIME

(Not used in android)


Android on my beagleboard-xm

Ha, I got Android to boot on my beagleboard-xm and output to a 24-inch screen at 1920 * 1080.

It seems the startup frame buffer is not correct:

It draws a huge Android logo, too large for the screen.

It takes about 5 minutes to boot from the welcome screen to the Android desktop. I think booting directly from MMC is too slow; I need to try putting the rootfs on a USB disk, but that requires enabling the USB hub early.

[Screenshots: android_calc, android_setting_menu]

beagleboard-xm research(2)–u-boot

1, Build

From http://code.google.com/p/beagleboard/wiki/BeagleSourceCode, download u-boot 1.3.3 for beagleboard.

If you use the latest ARM GCC from CodeSourcery, you may get the following error:

arm-none-linux-gnueabi-gcc -g  -Os   -fno-strict-aliasing  -fno-common -ffixed-r8 -msoft-float  -D__KERNEL__ -DTEXT_BASE=0x80e80000 -I/home/ken/bb/u-boot/u-boot-beagle/include -fno-builtin -ffreestanding -nostdinc -isystem /opt/sourcery_g++/bin/../lib/gcc/arm-none-linux-gnueabi/4.3.3/include -pipe  -DCONFIG_ARM -D__ARM__ -march=armv7a  -Wall -Wstrict-prototypes -c -o hello_world.o hello_world.c
hello_world.c:1: error: bad value (armv7a) for -march= switch

This error is caused by a change in recent GCC versions for the ARMv7-A architecture: the option must be spelled -march=armv7-a, not -march=armv7a.

To fix it, in u-boot\cpu\omap3\config.mk, change the following line:

PLATFORM_CPPFLAGS += -march=armv7a

To:

PLATFORM_CPPFLAGS += -march=armv7-a

Although the u-boot.bin image builds successfully, beagleboard-xm fails to boot it:

a) the serial baud rate changes from 115200 to 57600

b) the system hangs after finding no NAND memory.

The u-boot image built from git mainline git://git.denx.de/u-boot.git with the omap3 patch works correctly; please refer to http://www.elinux.org/BeagleBoard

   git clone git://git.denx.de/u-boot.git u-boot-main

   cd u-boot-main

   git checkout --track -b omap3 origin/master

Build

   make CROSS_COMPILE=arm-none-linux-gnueabi- mrproper

   make CROSS_COMPILE=arm-none-linux-gnueabi- omap3_beagle_config

   make CROSS_COMPILE=arm-none-linux-gnueabi-

As mentioned in the previous discussion, u-boot.bin is loaded near the beginning of SDRAM, at address 0x80008000. So in uboot\board\ti\beagle\config.mk:

   #
   # Physical Address:
   # 8000'0000 (bank0)
   # A000/0000 (bank1)
   # Linux-Kernel is expected to be at 8000'8000, entry 8000'8000
   # (mem base + reserved)

   # For use with external or internal boots.
   CONFIG_SYS_TEXT_BASE = 0x80008000

CONFIG_SYS_TEXT_BASE is passed into the build options as a macro:

arm-none-linux-gnueabi-gcc   -D__ASSEMBLY__ -g  -Os   -fno-common -ffixed-r8 -msoft-float   -D__KERNEL__ -DCONFIG_SYS_TEXT_BASE=0x80008000 -I/home/ken/bb/u-boot/u-boot-mailine/include -fno-builtin -ffreestanding -nostdinc -isystem /opt/sourcery_g++/bin/../lib/gcc/arm-none-linux-gnueabi/4.3.3/include -pipe  -DCONFIG_ARM -D__ARM__ -marm  -mabi=aapcs-linux -mno-thumb-interwork -march=armv5   -o start.o start.S -c

(BTW: some interesting compiler options are used here: -ffreestanding, -isystem, -mabi=aapcs-linux)

It is worth mentioning that u-boot keeps some information in a global_data structure at the top of the stack. The structure is defined in uboot\arch\arm\include\asm\global_data.h:

typedef    struct    global_data {
    bd_t        *bd;
    unsigned long    flags;
    unsigned long    baudrate;
    unsigned long    have_console;    /* serial_init() was called */
    unsigned long    env_addr;    /* Address  of Environment struct */
    unsigned long    env_valid;    /* Checksum of Environment valid? */
    unsigned long    fb_base;    /* base address of frame buffer */
#ifdef CONFIG_VFD
    unsigned char    vfd_type;    /* display type */
#endif
#ifdef CONFIG_FSL_ESDHC
    unsigned long    sdhc_clk;
#endif
#ifdef CONFIG_AT91FAMILY
    /* "static data" needed by at91's clock.c */
    unsigned long    cpu_clk_rate_hz;
    unsigned long    main_clk_rate_hz;
    unsigned long    mck_rate_hz;
    unsigned long    plla_rate_hz;
    unsigned long    pllb_rate_hz;
    unsigned long    at91_pllb_usb_init;
#endif
#ifdef CONFIG_ARM
    /* "static data" needed by most of timer.c on ARM platforms */
    unsigned long    timer_rate_hz;
    unsigned long    tbl;
    unsigned long    tbu;
    unsigned long long    timer_reset_value;
    unsigned long    lastinc;
#endif
    unsigned long    relocaddr;    /* Start address of U-Boot in RAM */
    phys_size_t    ram_size;    /* RAM size */
    unsigned long    mon_len;    /* monitor len */
    unsigned long    irq_sp;        /* irq stack pointer */
    unsigned long    start_addr_sp;    /* start_addr_stackpointer */
    unsigned long    reloc_off;
#if !(defined(CONFIG_SYS_NO_ICACHE) && defined(CONFIG_SYS_NO_DCACHE))
    unsigned long    tlb_addr;
#endif
    void        **jt;        /* jump table */
    char        env_buf[32];    /* buffer for getenv() before reloc. */
} gd_t;

The structure size may differ depending on the configuration macros, so at the beginning of the build a script calculates the current size of the global data:

   arm-none-linux-gnueabi-gcc -DDO_DEPS_ONLY \
           -g  -Os   -fno-common -ffixed-r8 -msoft-float   -D__KERNEL__ -DCONFIG_SYS_TEXT_BASE=0x80008000 -I/home/ken/bb/u-boot/u-boot-mailine/include -fno-builtin -ffreestanding -nostdinc -isystem /opt/sourcery_g++/bin/../lib/gcc/arm-none-linux-gnueabi/4.3.3/include -pipe  -DCONFIG_ARM -D__ARM__ -marm  -mabi=aapcs-linux -mno-thumb-interwork -march=armv5 -Wall -Wstrict-prototypes -fno-stack-protector   \
           -o lib/asm-offsets.s lib/asm-offsets.c -c -S
   Generating include/generated/generic-asm-offsets.h
   tools/scripts/make-asm-offsets lib/asm-offsets.s include/generated/generic-asm-offsets.h

EFI has a similar design: PeiCore’s private data is placed at the top of the stack as global data.

2, Memory Map

0x9fff0000  ~  TLB table

0x9ff7f000 ~ 0x9fff0000 : Reserved for U-boot (449K)

0x9ff1f000 ~ 0x9ff7f000: for malloc(384k)

0x9ff1efe0 ~ 0x9ff1f000: board info (32 bytes)

0x9ff1ef68 ~ 0x9ff1efe0: global data (120 bytes)

0x9ff1ef68: New stack pointer

0x80008000: reset vector

0x80008020 ~ 0x80008038: interrupt vectors

0x80000100: Linux boot parameters

SDRAM #1

0x4020FF80 ~ 0x40210000  global_data

0x4020F800 ~0x4020FF80   stack

 

3, Workflow

  1. uboot\arch\arm\cpu\armv7\start.S
    • Like x-load, start.S provides the first-stage assembly loader for u-boot
    • The first instruction is the reset vector, and the interrupt/exception handlers are close to it, as the System.map file shows:
80008000 T _start

80008020 t _undefined_instruction

80008024 t _software_interrupt

80008028 t _prefetch_abort

8000802c t _data_abort

80008030 t _not_used

80008034 t _irq

80008038 t _fiq

    • Switch CPU to SVC32 mode.
   mrs    r0, cpsr
   bic    r0, r0, #0x1f
   orr    r0, r0, #0xd3
   msr    cpsr,r0

    • Copy interrupt vectors to ROM indirect address: 0x4020F800
    • Because beagleboard-xm has no NAND/OneNAND device, the DPLL initialization code must be copied to the ROM indirect address after the interrupt vectors
    • Init CPU in assembly, like x-load:
      • Setup important registers: mmu, cache
      • Setup memory timing.
    • Setup stack for C code at 0x4020FF80 (uboot\include\configs\omap3_beagle.h):
   #define CONFIG_SYS_INIT_RAM_ADDR    0x4020f800
   #define CONFIG_SYS_INIT_RAM_SIZE    0x800
   #define CONFIG_SYS_INIT_SP_ADDR        (CONFIG_SYS_INIT_RAM_ADDR + \
                        CONFIG_SYS_INIT_RAM_SIZE - \
                        GENERATED_GBL_DATA_SIZE)

 

As mentioned above, global_data is stored just above the initial stack pointer; its size is determined and generated at build time (include\generated\generic-asm-offsets.h):

   #define GENERATED_GBL_DATA_SIZE (128) /* (sizeof(struct global_data) + 15) & ~15 */
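
The address arithmetic can be checked with a few lines of C. This is only a sketch: the `demo_` names are hypothetical, and the constants are the ones quoted from omap3_beagle.h and the generated header.

```c
#include <stdint.h>

/* Values quoted from omap3_beagle.h. */
#define DEMO_INIT_RAM_ADDR 0x4020f800u
#define DEMO_INIT_RAM_SIZE 0x800u

/* Round a structure size up to a multiple of 16, as in the comment
 * next to GENERATED_GBL_DATA_SIZE. */
uint32_t demo_gbl_data_size(uint32_t struct_size)
{
    return (struct_size + 15u) & ~15u;
}

/* Initial stack pointer: top of the init RAM minus the global data. */
uint32_t demo_init_sp_addr(uint32_t gbl_data_size)
{
    return DEMO_INIT_RAM_ADDR + DEMO_INIT_RAM_SIZE - gbl_data_size;
}
```

With the 120-byte global data from the memory map, the rounded size is 128 bytes and the initial stack pointer comes out at 0x4020FF80, matching the SRAM layout listed earlier.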

    • Call C function board_init_f (uboot\arch\arm\lib\board.c):
      • Assign/Init global data structure at 0x4020FF80
      • Insert a compiler memory barrier so that the compiler does not cache or reorder memory accesses across it, just like MemoryFence() used in edk2 MdePkg:
   __asm__ __volatile__("": : :"memory");

Because hardware initialization and I/O code often writes and reads the same MMIO address, the compiler may optimize such accesses away or reorder the read/write sequence, which breaks the code. An asm volatile statement like the one above prevents this.
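
A minimal sketch of how such a barrier is typically used, assuming a GCC-compatible compiler. The `demo_` names are hypothetical, and ordinary pointers stand in for device registers so that the code can run on any machine.

```c
#include <stdint.h>

/* GCC-style compiler barrier: emits no instructions, but the compiler
 * may not keep memory values cached in registers across it or move
 * memory accesses from one side to the other. */
#define demo_barrier() __asm__ __volatile__("" : : : "memory")

/* Write a control value, then read back a status value. Real MMIO code
 * would use fixed device addresses instead of caller-supplied pointers. */
uint32_t demo_write_then_read(volatile uint32_t *ctrl,
                              volatile uint32_t *status, uint32_t value)
{
    *ctrl = value;
    demo_barrier();   /* keep the write ahead of the read in program order */
    return *status;
}
```

Note that this constrains only the compiler. It does not by itself order accesses in hardware, which is a separate concern handled by CPU barrier instructions.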

      • Call all functions defined in the init_sequence array:
        • timer_init (arch/arm/cpu/armv7/omap-common/timer.c)
          • GPTIMER2 is used here (OMAP3 has 12 GP timers); its base address is 0x49032000
        • Initialize the environment. Because there is no NAND, the environment is relocated to RAM, as in arch\arm\include\asm\global_data.h
   #define    GD_FLG_RELOC        0x00001    /* Code was relocated to RAM        */

            By default, some configuration values such as baudrate come from the global variable default_environment in common\env_common.c.

        • Serial initialization

beagleboard-xm uses an NS16550 serial port at COM3 0x49020000, datasheet at http://www.national.com/ds/PC/PC16550D.pdf

        • Init stage1 console for print
        • Print CPU/board information
        • Init I2C device.
        • Init SDRAM device, calculate the bank’s size.
      • Reserve RAM memory for u-boot at the top of RAM1, which starts at 0x80000000
      • Relocate code to the new location at 0x9ff7f000, with the stack at 0x9ff1ef60 (arch\arm\cpu\armv7\start.S, relocate_code())
      • Jump to board_init_r() at the new location in RAM (the sequence is very similar to PeiCore relocation in EFI)
      • In beagleboard’s board_init_r() (board\ti\beagle\beagle.c):
        • Init GPMC
        • Set board id for linux as 1546
        • Set boot parameter address at 0x80000100
        • Init MMC driver
        • Init stdio drivers such as serial, nulldev
        • Init jumptable?
        • Evaluate board version; for the beagleboard-xm board, set VAUX2 to 1.8v for the EHCI PHY, and print the DIE ID at 0x4830A200
        • Init IRQ/FIQ stacks, each 4K in size
        • Change CPSR to enable interrupts
        • Enter main_loop() function to read boot/user script …
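
As a side note on the SVC32 switch near the top of start.S: the mrs/bic/orr/msr sequence is a read-modify-write of the CPSR. The same bit arithmetic is sketched below in C. The `demo_` names are hypothetical; the bit positions are the standard ARM CPSR mode field and the IRQ/FIQ mask bits.

```c
#include <stdint.h>

#define DEMO_CPSR_MODE_MASK 0x1fu  /* M[4:0]: processor mode */
#define DEMO_CPSR_MODE_SVC  0x13u  /* Supervisor mode */
#define DEMO_CPSR_F_BIT     0x40u  /* FIQ disable */
#define DEMO_CPSR_I_BIT     0x80u  /* IRQ disable */

/* Same arithmetic as the bic/orr pair: clear the mode field, then set
 * SVC mode with IRQ and FIQ masked (0x13 | 0x40 | 0x80 == 0xd3). */
uint32_t demo_svc32_cpsr(uint32_t cpsr)
{
    cpsr &= ~DEMO_CPSR_MODE_MASK;
    cpsr |= DEMO_CPSR_MODE_SVC | DEMO_CPSR_F_BIT | DEMO_CPSR_I_BIT;
    return cpsr;
}
```

This explains the constant 0xd3 in the assembly: it selects SVC mode and keeps interrupts disabled until the stacks are set up later in board_init_r().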

beagleboard-xm research(1) — Initialization & x-load

1, General boot process and device

The initialization process for the OMAP DM37x beagleboard:

  • Preinitialization
  • Power/clock/reset ramp sequence
  • Boot ROM
  • Boot Loader
  • OS/application

Six external pins (sys_boot[5:0]) are used to select interfaces or devices for booting. The interfaces are GPMC, MMC1, MMC2, USB and UART.

The ROM code has two booting functions: peripheral booting and memory booting:

  • In peripheral booting, the ROM code polls a selected communication interface such as UART or USB, downloads the executable code over the interface, and executes it in internal SRAM. Downloaded software from an external host can be used to program flash memories connected to the device.
  • In memory booting, the ROM code finds bootstrap in permanent memories such as flash memory or memory cards and executes it. The process is normally performed after cold or warm device reset.

Overall boot sequence is as follows:

[Figure: overall boot sequence]

The following is the 32K SRAM memory map of a GP device, which is used only during the booting process.

[Figure: 32K SRAM memory map]

Beagleboard-xm is OMAP3-based, so it uses 64K of SRAM in the range 40200000-4020FFFF.

2, Boot from MMC/SD card

In general, beagleboard-xm uses memory booting from an MMC/SD card, because this board does not have a NAND device. MMC/SD booting has the following features and limitations:

  • Supports MMC/SD cards compliant with the Multimedia Card System Specification v4.2 from the MMCA Technical Committee and the SD I/O Card Specification v2.0 from the SD Association. Includes high-capacity (size >2GB) cards: HC-SD and HC MMC
  • 3-V power supply, 3-V I/O and 1.8-V I/O voltages on port 1
  • Supports eMMC/eSD (1.8-V I/O voltage and 3.0-V Core voltage) on port 2. The external transceiver mode on port 2 is not supported.
  • Initial 1-bit MMC mode, 4-bit SD mode
  • Clock frequency:
                –   Identification mode: 400 kHz
                –   Data transfer mode: 20 MHz
  • Only one card connected to the bus 
  • Raw mode, image data read directly from card sectors 
  • FAT12/16/32 support, with or without a master boot record (MBR). 
  • For a FAT (12/16/32)-formatted memory card, the booting file must not exceed 128 KB.
  • For a raw-mode memory card, the booting image must not exceed 128 KB.  

The image used by the booting procedure is taken from a booting file named MLO. This file must be in the root directory on an active primary partition of type FAT12/16 or FAT32.

An MMC/SD card can be configured as floppy-like or hard-drive-like:

  • When acting like a floppy, the content of the card is a single FAT12/16/32 file system without an MBR holding a partition table.
  • When acting like a hard drive, an MBR is present in the first sector of the card. This MBR holds a table of partitions, one of which must be FAT12/16/32, primary, and active.

3, MLO image format

For a GP device, the image is simple and must contain a small header having the size of the software to load and the destination address of where to store it when a booting device is other than XIP. The XIP device image is even simpler and starts with executable code.



4, x-load

4.1 Why uses x-load

As mentioned above, the SRAM in beagleboard-xm is only 64K, while the u-boot image is almost 196K, so beagleboard-xm cannot use u-boot as the MLO. x-load is used instead; it can be considered a loader for u-boot, and its size is around 24K.

4.2 How to build x-load

  • Get the mainline x-load source code and build it:

git clone git://gitorious.org/x-load-omap3/mainline.git

make CROSS_COMPILE=arm-none-linux-gnueabi- omap3530beagle_config

make CROSS_COMPILE=arm-none-linux-gnueabi-

Although beagleboard-xm uses the DM3730 processor, the configuration in the x-load mainline tree has been updated for it, so the omap3530beagle_config file can be reused.

  • Generate MLO file

After building, x-load.bin is generated as a raw executable binary. As mentioned above for the non-XIP image format, the size and load address must be added at the start of the image (a 4-byte size followed by a 4-byte destination address). The signGP tool does this; its source code is at http://beagleboard.googlecode.com/files/signGP.c
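
The header step can be sketched in C as below. This is not the signGP source: the `demo_` functions are hypothetical, and the sketch assumes the header is two little-endian 32-bit words, the image size followed by the load address.

```c
#include <stdint.h>

#define DEMO_GP_HEADER_SIZE 8u

/* Store a 32-bit value in little-endian byte order. */
static void demo_put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xffu);
    p[1] = (unsigned char)((v >> 8) & 0xffu);
    p[2] = (unsigned char)((v >> 16) & 0xffu);
    p[3] = (unsigned char)((v >> 24) & 0xffu);
}

/* Fill the header that precedes the raw image:
 * bytes 0..3 = image size, bytes 4..7 = load address. */
void demo_make_gp_header(unsigned char hdr[DEMO_GP_HEADER_SIZE],
                         uint32_t image_size, uint32_t load_addr)
{
    demo_put_le32(hdr, image_size);
    demo_put_le32(hdr + 4, load_addr);
}
```

An MLO file would then be this header followed by the unmodified contents of x-load.bin, with the load address set to the TEXT_BASE that x-load was linked for.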

4.3 x-load’s research

4.3.1 memory map

In beagleboard-xm, the ROM code loads the x-load binary into the 64K SRAM range (0x40200000 ~ 0x4020FFFF). The range is allocated as follows:

Runtime stack: 0x40200000 ~ 0x402007FF

MLO: 0x40200800 ~ 0x4020FFFF

Please refer to board\omap3530beagle\config.mk for the TEXT_BASE setting:

   # For XIP in 64K of SRAM or debug (GP device has it all available)
   # SRAM 40200000-4020FFFF base
   # initial stack at 0x4020fffc used in s_init (below xloader).
   # The run time stack is (above xloader, 2k below)
   # If any globals exist there needs to be room for them also
   TEXT_BASE = 0x40200800

Please refer to cpu\omap3\start.S for the stack pointer setting:

   /* Set up the stack                            */
   stack_setup:
       ldr    r0, _TEXT_BASE        /* upper 128 KiB: relocated uboot   */
       sub    sp, r0, #128        /* leave 32 words for abort-stack   */
       and    sp, sp, #~7        /* 8 byte aligned for (ldr/str)d    */
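
The resulting stack pointer can be worked out with the same arithmetic in C. This is a sketch with hypothetical `demo_` names; the only input is the TEXT_BASE value from config.mk.

```c
#include <stdint.h>

#define DEMO_TEXT_BASE 0x40200800u  /* TEXT_BASE from config.mk */

/* Same arithmetic as stack_setup: step 128 bytes (32 words) below the
 * base, then clear the low three bits for 8-byte alignment. */
uint32_t demo_runtime_sp(uint32_t text_base)
{
    uint32_t sp = text_base - 128u;

    return sp & ~7u;
}
```

For TEXT_BASE = 0x40200800 this gives 0x40200780, so the runtime stack grows downward through the 2K region below the x-load image.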

Because x-load is non-XIP code, TEXT_BASE is passed to the compiler:

arm-none-linux-gnueabi-gcc -Wa,-gstabs -D__ASSEMBLY__ -g  -Os   -fno-strict-aliasing  -fno-common -ffixed-r8  -D__KERNEL__ -DTEXT_BASE=0x40200800 -I/home/ken/bb/x-load/mainline/include -fno-builtin -ffreestanding -nostdinc -isystem /usr/lib/gcc/i486-linux-gnu/4.4.3/include -pipe  -DCONFIG_ARM -D__ARM__ -march=armv7-a  -c -o cpu/omap3/start.o /home/ken/bb/x-load/mainline/cpu/omap3/start.S

 

4.3.2 startup process

  1. Boot starts from cpu\omap3\start.S, and the first instruction is the reset vector.
  2. Set the CPU mode to Supervisor (SVC) 32-bit mode.
  3. Copy vectors to indirect address 0x4020F800 (SRAM_OFFSET0 + SRAM_OFFSET1 + SRAM_OFFSET2)
  4. Relocate the clock code into SRAM, where it is safer to execute
  5. Initialize CPU
    1. Invalidate the instruction cache, L2 cache and TLBs, and disable the MMU
    2. Initialize the SRAM stack at 0x4020FFFC, so C code can be used from now on.
    3. In C code, s_init does some early initialization such as the watchdog and SDRAM configuration.
  6. Relocate code section
  7. Set runtime stack
  8. Clear the bss section, which holds uninitialized data.
  9. Jump to C code start_armboot().
    1. Initialize the serial device
    2. Print version information like
      Texas Instruments X-Loader 1.4.4ss
    3. Initialize I2C, whose base address is 0x48070000 in L4 core.
    4. Read GPIO173, 172, 171 to determine the version of the beagleboard, then print it; for the beagleboard-xm board, the value should be 0, 0, 0
    5. Initialize the MMC card and load u-boot.bin from the MMC card into POP SDRAM at 0x80008000
      1. If no MMC is found, try to boot from OneNAND or NAND, but beagleboard-xm does not have these devices
      2. Try to boot from serial ……
    6. Jump to 0x80008000; x-load is finished.

4.3.3 x-load vs EFI’s SEC phase

As we can see, x-load is very similar to the SEC phase in the UEFI specification. In Intel’s Tiano implementation, the SEC phase mainly does the following:

  • Enter protected mode
  • Prepare an early C stack in CAR (cache-as-RAM): the CAR is cache inside the processor used as temporary stack/heap, just like OMAP’s internal SRAM during the boot phase. Unlike SRAM, the CAR is disabled/destroyed after the SEC phase.
  • Initialize the CPU, such as the MTRRs for the flash range.
  • Initialize an early ACPI timer for performance collection.
  • Find the PeiCore in flash and shadow it into CAR for the PEI phase.

5, Reference
1) DM37x Multimedia Device Silicon Revision 1.x

关于ARM指令集



众所周知.Intel通过发布新版的”多媒体指令集”领跑X86处理器.就是SSE啦..
一旦相关厂商未能及时跟上Intel的脚步,那么最新版的应用程序就无法使用指令集接口做优化…
而我们所看到的ARM处理器也是有指令集的.
同理可得.如果使用的处理器指令集更新了,却不对应用软件做新指令集的优化则无法得到真正的性能提升…
为什么在这里说这个问题.主要是由于嵌入式系统的开发,都必须要过这个坎….选指令集….
通常一个小公司一旦选择择某个指令集的就一直会按照那个指令集走下去…为什么?因为重写代码的成本是很高的.而且软件编译器也得重新开发,否则第三方软件编译之后无法更有效率的运行在使用新指令集的CPU上.
我们可以注意到…iphone是跨指令集的(v6和v7).因为曾经使用过ARM11架构(v6指令集),而3GS是cortex A8(V7指令集),所以个人认为从3G到3Gs的开发是相当费钱的一步..
______________________________________________________________
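Of course, Google's and Nokia's systems also span instruction sets. Nokia is the veteran here, while Google's situation is similar to Apple's.
Whenever Nokia crossed instruction sets it usually released a new FP (Feature Pack) as a transition (for example S60 3rd FP2), so the change was fairly visible. Worst of all, Nokia did a poor job on compatibility (possibly on purpose): S60v2 software is completely incompatible with S60v3. v2 supports CPUs with the v5te and v4t instruction sets, and v3 supports CPUs with the v5te and v6 instruction sets. Regrettably, S60 currently does not support the v7 instruction set at all, which is why Nokia had to produce a Linux device specifically to use the Cortex A8.
Of course, this is not necessarily because Nokia is weak; compatibility really is hard to maintain.
If Apple had also started from the v4t instruction set as Nokia did, could it have guaranteed full compatibility all the way?
Google currently seems to lean toward supporting CPUs with the v6 instruction set in exchange for a more unified platform (the ARM11 CPU of the nexus one).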
________________________________________________________________
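Put this way, the pros and cons of supporting the A8 become clear.
Supporting the A8: development cost rises (and no performance gain is visible until clock frequencies increase) and the development cycle lengthens, but the later move to the A9 becomes painless. It also widens the range of uses for the OS, because there is performance headroom in reserve.
Not supporting the A8: this saves development cost, preserves compatibility, shortens the development cycle, and lowers development difficulty. But it narrows the range of uses for the device, because performance is already near its ceiling.
This is a particularly sensitive question now that GPGPU is being prepared for: if GPGPU can relieve the CPU performance bottleneck, will portable devices still be so hungry for CPU speed? Perhaps the real architecture of the A4 processor will reveal Apple's thinking.
Tips: RTOS stands for real-time OS. It normally does not appear on phone platforms, only on devices with extremely high reliability requirements (medical equipment, life-support equipment).
      The platform OS marked in red is our phone OS, or an embedded system such as a set-top box.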

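My beagleboard-xm has arrived

I have not played with OMAP for a long time. After wandering around for a few years I have finally come back from Intel's architecture to ARM. How fast the world changes.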

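USB driver development based on the Windows WDM model

Translated from Programming the Microsoft Windows Driver Model, Walter Oney, 2nd ed., Chapter 12, Section 2 (Working with the Bus Driver), with some of my own understanding added. I hope it helps those writing USB device drivers; corrections are welcome.

Unlike other device drivers, a USB device driver does not talk to the underlying hardware directly. Instead it builds a data structure called a USB request block (URB) and sends it to the parent driver, which operates on the hardware according to the information in the URB. The parent driver here usually means the USB bus driver. A URB can be sent in an IRP whose major function code is IRP_MJ_INTERNAL_DEVICE_CONTROL, or by directly calling an interface function provided by the parent driver.

1. Initializing a request
A URB is a predefined data structure with many fields. To build a URB, first allocate storage for it, then run an initialization routine that fills in the fields. For example, to let the device respond to an IRP_MN_START_DEVICE request, one of the first tasks is to read the device descriptor, which might use code like the following: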
USB_DEVICE_DESCRIPTOR dd;
URB urb;
UsbBuildGetDescriptorRequest(&urb,
    sizeof(_URB_CONTROL_DESCRIPTOR_REQUEST),
    USB_DEVICE_DESCRIPTOR_TYPE, 0, 0, &dd, NULL,
    sizeof(dd), NULL);

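This declares a local variable named urb of type URB to hold the URB being built. The URB structure is defined in USBDI.H, which can be found under the DDK installation directory. A URB is a union of substructures, each of which holds one particular kind of USB request. It is defined as follows: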

typedef struct _URB {
    union {
            struct _URB_HEADER                           UrbHeader;
            struct _URB_SELECT_INTERFACE                 UrbSelectInterface;
            struct _URB_SELECT_CONFIGURATION             UrbSelectConfiguration;
            struct _URB_PIPE_REQUEST                     UrbPipeRequest;
            struct _URB_FRAME_LENGTH_CONTROL             UrbFrameLengthControl;
            struct _URB_GET_FRAME_LENGTH                 UrbGetFrameLength;
            struct _URB_SET_FRAME_LENGTH                 UrbSetFrameLength;
            struct _URB_GET_CURRENT_FRAME_NUMBER         UrbGetCurrentFrameNumber;
            struct _URB_CONTROL_TRANSFER                 UrbControlTransfer;
            struct _URB_BULK_OR_INTERRUPT_TRANSFER       UrbBulkOrInterruptTransfer;
            struct _URB_ISOCH_TRANSFER                   UrbIsochronousTransfer;

            // for standard control transfers on the default pipe
            struct _URB_CONTROL_DESCRIPTOR_REQUEST       UrbControlDescriptorRequest;
            struct _URB_CONTROL_GET_STATUS_REQUEST       UrbControlGetStatusRequest;
            struct _URB_CONTROL_FEATURE_REQUEST          UrbControlFeatureRequest;
            struct _URB_CONTROL_VENDOR_OR_CLASS_REQUEST UrbControlVendorClassRequest;
            struct _URB_CONTROL_GET_INTERFACE_REQUEST    UrbControlGetInterfaceRequest;
            struct _URB_CONTROL_GET_CONFIGURATION_REQUEST      UrbControlGetConfigurationRequest;
    };
} URB, *PURB;

Names of the form _URB_**** are structure types that are also predefined in USBDI.H. For example, _URB_CONTROL_DESCRIPTOR_REQUEST is defined in USBDI.H as follows:

struct _URB_CONTROL_DESCRIPTOR_REQUEST {
#ifdef OSR21_COMPAT
    struct _URB_HEADER;    
#else
    struct _URB_HEADER Hdr;                 // function code indicates get or set.
#endif   
    PVOID Reserved;
    ULONG Reserved0;
    ULONG TransferBufferLength;
    PVOID TransferBuffer;
    PMDL TransferBufferMDL;             // *optional*
    struct _URB *UrbLink;               // *optional* link to next urb request
                                        // if this is a chain of commands
    struct _URB_HCD_AREA hca;               // fields for HCD use
    USHORT Reserved1;
    UCHAR Index;
    UCHAR DescriptorType;
    USHORT LanguageId;
    USHORT Reserved2;
};

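A URB can be initialized with functions such as UsbBuildGetDescriptorRequest; the following table lists them:

These functions are defined in another header, USBDLIB.H, which can also be found under the DDK installation directory. All of the functions in the table are described in the DDK documentation, and they are what you use to initialize a URB. UsbBuildGetDescriptorRequest serves here as an example of what such a function does. I cannot fully explain the meaning of some parameters, so I have pasted the English descriptions directly.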

VOID 
  UsbBuildGetDescriptorRequest(
    IN OUT PURB  Urb, // Pointer to the URB to be initialized
    IN USHORT  Length, // Size of the URB, in bytes
    IN UCHAR  DescriptorType, // Type of descriptor to retrieve
    IN UCHAR  Index, // Specifies the device-defined index of the descriptor that is to be retrieved
    IN USHORT  LanguageId, // Specifies the language ID of the descriptor to be retrieved when USB_STRING_DESCRIPTOR_TYPE is set in DescriptorType. This parameter must be zero for any other value in DescriptorType.
    IN PVOID  TransferBuffer  OPTIONAL, // Pointer to a resident buffer to receive the descriptor data, or NULL if an MDL is supplied in TransferBufferMDL
    IN PMDL  TransferBufferMDL  OPTIONAL, // Pointer to an MDL that describes a resident buffer to receive the descriptor data, or NULL if a buffer is supplied in TransferBuffer
    IN ULONG  TransferBufferLength, // Specifies the length of the buffer specified in TransferBuffer or described in TransferBufferMDL.
    IN PURB  Link  OPTIONAL // Must be NULL
    );
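
To make the parameters concrete, here is a simplified user-mode model of the fields such a helper fills in. It is not the DDK definition: `MockDescriptorUrb`, `mock_build_get_descriptor`, and `MOCK_FN_GET_DESCRIPTOR` are hypothetical names, the function code is a placeholder value, and the real structure has more fields (see the definition above).

```c
#include <stdint.h>
#include <string.h>

#define MOCK_FN_GET_DESCRIPTOR 1u  /* placeholder function code, not the DDK value */

/* Hypothetical, trimmed-down stand-in for the descriptor request URB. */
typedef struct MockDescriptorUrb {
    uint16_t length;            /* size of the URB itself */
    uint16_t function;          /* what the bus driver is asked to do */
    void    *transfer_buffer;   /* where the descriptor will be written */
    uint32_t transfer_length;   /* capacity of transfer_buffer in bytes */
    uint8_t  descriptor_type;
    uint8_t  index;
    uint16_t language_id;
} MockDescriptorUrb;

/* Zero the URB, then fill in the fields the request needs. */
static void mock_build_get_descriptor(MockDescriptorUrb *urb, uint8_t type,
                                      uint8_t index, uint16_t language_id,
                                      void *buffer, uint32_t buffer_length)
{
    memset(urb, 0, sizeof *urb);
    urb->length = (uint16_t)sizeof *urb;
    urb->function = MOCK_FN_GET_DESCRIPTOR;
    urb->transfer_buffer = buffer;
    urb->transfer_length = buffer_length;
    urb->descriptor_type = type;
    urb->index = index;
    urb->language_id = language_id;
}
```

The point of the sketch is only that building a URB is plain field initialization; nothing is sent to the device until the URB is submitted.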

2. Sending a URB

    Sending a URB requires building an internal IOCTL request and sending it to the parent driver; in many cases you then wait for the device to respond. A function that sends a URB looks like this:

NTSTATUS SendAwaitUrb(PDEVICE_OBJECT fdo, PURB urb)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;

    KEVENT event; // used to make the IRP synchronous
    KeInitializeEvent(&event, NotificationEvent, FALSE);

    IO_STATUS_BLOCK iostatus;

    // Build an internal IOCTL request
    PIRP Irp = IoBuildDeviceIoControlRequest(IOCTL_INTERNAL_USB_SUBMIT_URB,
        pdx->LowerDeviceObject, NULL, 0, NULL, 0, TRUE, &event, &iostatus);
    if (!Irp)
        return STATUS_INSUFFICIENT_RESOURCES;

    PIO_STACK_LOCATION stack = IoGetNextIrpStackLocation(Irp);
    stack->Parameters.Others.Argument1 = (PVOID) urb; // the URB pointer travels in this stack location field

    NTSTATUS status = IoCallDriver(pdx->LowerDeviceObject, Irp);
    if (status == STATUS_PENDING)
    {
        KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);
        status = iostatus.Status;
    }
    return status;
}

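3. URB return status

    After you send a URB to the USB bus driver, you get back a status code of type NTSTATUS that describes the result of the operation. Internally the bus driver uses another kind of status code, of type USBD_STATUS. When the bus driver completes a URB, it sets the URB's UrbHeader.Status field to a USBD_STATUS value. You can examine this field in your own driver to learn more about how the URB was processed. The DDK's URB_STATUS macro reads it: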

NTSTATUS status = SendAwaitUrb(fdo, &urb);

USBD_STATUS ustatus = URB_STATUS(&urb);

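    There is no fixed protocol for preserving the USBD_STATUS value and passing it back to an application; developers have considerable freedom in how they handle it.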
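
One thing a driver can do with the value is sort it into success, pending, or error. The sketch below assumes the layout described in the DDK header comments as I understand them: the top two bits of the 32-bit status are a state code, with 00 meaning success, 01 pending, and 10 or 11 an error. `UrbOutcome` and `classify_usbd_status` are hypothetical names, not DDK APIs, and the layout should be checked against the header in your DDK.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef enum { URB_DONE_OK, URB_PENDING, URB_FAILED } UrbOutcome;

/* Classify a 32-bit USBD_STATUS-style value by its top two bits.
 * Assumed layout: 00 = success, 01 = pending, 10/11 = error. */
static UrbOutcome classify_usbd_status(int32_t ustatus)
{
    uint32_t state = (uint32_t)ustatus >> 30;
    if (state == 0)
        return URB_DONE_OK;
    if (state == 1)
        return URB_PENDING;
    return URB_FAILED;
}
```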

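4. Configuration

     The bus driver automatically detects a newly attached USB device and reads its device descriptor to determine what kind of device it is. The vendor and product identifier fields of the device descriptor, together with some other descriptors, determine which driver to load.

    The configuration manager normally calls the driver's AddDevice function, which creates a device object, links it into the driver stack, and so on. Eventually the configuration manager sends the driver an IRP_MN_START_DEVICE Plug and Play request, which causes the driver to call a function named StartDevice. Its skeleton looks like this:

NTSTATUS StartDevice(PDEVICE_OBJECT fdo)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    <configure device>
    return STATUS_SUCCESS;
}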

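5. How the system loads the driver

    Suppose the USB device has vendor ID 0x0547 and product ID 0x102A. When the device is attached, the PnP manager looks for a registry entry containing the device name USB\VID_0547&PID_102A. If no entry matches, the PnP manager launches the Found New Hardware wizard, which asks you to locate an INF file that describes the device. Following the INF file, the wizard installs the driver from the indicated location and updates the registry. Once the PnP manager has located the registry entry, it can load the driver dynamically.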
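
The device name in this example is just the two 16-bit IDs formatted as four hexadecimal digits each. A minimal sketch of that formatting, with a hypothetical helper name (`format_usb_hardware_id` is not a Windows API):

```c
#include <stddef.h>
#include <stdio.h>

/* Write "USB\VID_xxxx&PID_xxxx" into out.
 * Returns the value snprintf returns (characters needed, excluding the NUL). */
static int format_usb_hardware_id(char *out, size_t out_size,
                                  unsigned vendor_id, unsigned product_id)
{
    return snprintf(out, out_size, "USB\\VID_%04X&PID_%04X",
                    vendor_id & 0xFFFFu, product_id & 0xFFFFu);
}
```

Calling it with 0x0547 and 0x102A yields the string used in the paragraph above.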

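    StartDevice has the following work to do: first choose a configuration for the device, then choose one or more interfaces, then send a select-configuration URB to the bus driver. The bus driver configures the device according to the URB and creates the pipes used to communicate with the endpoints in the selected interfaces, and it provides handles through which the pipes can be accessed. This completes the configuration process.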

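6. Reading the configuration descriptor

    The configuration descriptor must be read into one contiguous block of memory, because the hardware does not allow interface and endpoint descriptors to be accessed directly. The code below shows how to use two URBs to read a configuration descriptor:

ULONG iconfig = 0;
URB urb;
USB_CONFIGURATION_DESCRIPTOR tcd;

// First URB: read only the fixed-size configuration descriptor
UsbBuildGetDescriptorRequest(&urb,
    sizeof(_URB_CONTROL_DESCRIPTOR_REQUEST),
    USB_CONFIGURATION_DESCRIPTOR_TYPE,
    iconfig, 0, &tcd, NULL, sizeof(tcd), NULL);
SendAwaitUrb(fdo, &urb);

// wTotalLength covers the configuration, interface, and endpoint descriptors
ULONG size = tcd.wTotalLength;
PUSB_CONFIGURATION_DESCRIPTOR pcd =
    (PUSB_CONFIGURATION_DESCRIPTOR) ExAllocatePool(NonPagedPool, size);
if (!pcd)
    return STATUS_INSUFFICIENT_RESOURCES;

// Second URB: read the whole descriptor set into the contiguous buffer
UsbBuildGetDescriptorRequest(&urb,
    sizeof(_URB_CONTROL_DESCRIPTOR_REQUEST),
    USB_CONFIGURATION_DESCRIPTOR_TYPE,
    iconfig, 0, pcd, NULL, size, NULL);
SendAwaitUrb(fdo, &urb);

ExFreePool(pcd);

    The first URB reads the configuration descriptor into a temporary descriptor area named tcd. tcd contains the total length (wTotalLength) of the configuration, interface, and endpoint descriptors. Memory (pcd) is allocated according to this length, and the second URB reads the entire descriptor set into that contiguous block.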
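
Once the whole set sits in one contiguous block, it can be walked descriptor by descriptor, because every USB descriptor starts with a length byte followed by a type byte. The following user-mode sketch counts interface and endpoint descriptors in such a block. The names `DescriptorCounts` and `count_descriptors` are hypothetical; the type codes 4 (interface) and 5 (endpoint) are the standard USB values.

```c
#include <stddef.h>
#include <stdint.h>

enum { DESC_TYPE_INTERFACE = 4, DESC_TYPE_ENDPOINT = 5 };

typedef struct {
    unsigned interfaces;
    unsigned endpoints;
} DescriptorCounts;

/* Walk a configuration descriptor block of total_length bytes.
 * Each descriptor begins with bLength, then bDescriptorType. */
static DescriptorCounts count_descriptors(const uint8_t *blob, size_t total_length)
{
    DescriptorCounts counts = {0, 0};
    size_t offset = 0;

    while (offset + 2 <= total_length) {
        uint8_t length = blob[offset];
        uint8_t type = blob[offset + 1];

        /* Stop on a malformed descriptor instead of running off the buffer. */
        if (length < 2 || offset + length > total_length)
            break;

        if (type == DESC_TYPE_INTERFACE)
            counts.interfaces++;
        else if (type == DESC_TYPE_ENDPOINT)
            counts.endpoints++;

        offset += length;
    }
    return counts;
}
```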

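7. Selecting a configuration

The driver configures the device and enables its interfaces by sending it a series of control commands, that is, by selecting a configuration. Use the USBD_CreateConfigurationRequestEx function to build the URB that carries out these commands. It is declared as follows: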

PURB USBD_CreateConfigurationRequestEx( IN PUSB_CONFIGURATION_DESCRIPTOR ConfigurationDescriptor, IN PUSBD_INTERFACE_LIST_ENTRY InterfaceList );

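The parameters are:

ConfigurationDescriptor: pointer to a configuration descriptor that includes all interface, endpoint, vendor, and class-specific descriptors retrieved from the USB device.

InterfaceList: Pointer to the first element in a variable-length array of USBD_INTERFACE_LIST_ENTRY structures

      I feel that translation alone cannot fully explain WDM-based USB driver development, because it involves a good deal of WDM fundamentals and is hard to follow. Moreover, this article gives only the rough steps of USB driver development and does not form a complete example for reference, so I recommend studying the code of a concrete driver. I am currently rereading the other chapters of Programming the Microsoft Windows Driver Model, translating while working through the examples. It is hard going, but persistence should pay off.

       The road ahead is long and far; I shall search high and low. Keep going.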

WinDBG (a brief analysis of how Windows kernel debuggers work) [repost]

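A while ago I suddenly became interested in how kernel debuggers are implemented, so I did a simple analysis of the mainstream kernel debuggers on Windows and wrote an extremely simple debugger of my own modeled on the same principles :)
  WinDBG
  One big difference between WinDBG and a user-mode debugger is that the kernel debugger runs on one machine and debugs, over a serial port, another connected system that was booted in debug mode. That system can be in a virtual machine or on another physical machine. (This is only the approach Microsoft recommends and implements; kernel debuggers such as SoftICE can debug on a single machine.) Many people think the main functionality is implemented in WinDBG, but that is not the case. Windows has built the kernel debugging mechanism into the kernel itself. All that kernel debuggers such as WinDBG and kd have to do is send packets in a specific format over the serial line to interrupt the system, set breakpoints, display memory, and so on, and then process and display the packets they receive.
  Before going further into WinDBG, let me introduce two functions, KdpTrap and KdpStub, which I mentioned briefly in my article on the Windows exception handling flow. To recap: when an exception occurs in kernel mode, KiDebugRoutine is called twice; when it occurs in user mode, KiDebugRoutine is called once. In both cases the first call happens right at the start of exception handling.
  When WinDBG is not attached, KiDebugRoutine is KdpStub. Its handling is simple: for exceptions raised by int 0x2d, such as DbgPrint, DbgPrompt, and loading or unloading SYMBOLS (exceptions raised by int 0x2d are covered in detail later), it adds 1 to Context.Eip to skip the int 0x3 instruction that follows int 0x2d.
  The function that really implements WinDBG's functionality is KdpTrap. It handles all STATUS_BREAKPOINT and STATUS_SINGLE_STEP exceptions. STATUS_BREAKPOINT exceptions include int 0x3, DbgPrint, DbgPrompt, and loading or unloading SYMBOLS. DbgPrint is the simplest: KdpTrap sends a packet containing the string straight to the debugger. DbgPrompt both outputs and receives a string, so it first sends the packet containing the string and then loops waiting for the debugger's reply packet. SYMBOLS loading and unloading go through KdpReportSymbolsStateChange; int 0x3 breakpoint exceptions and int 0x1 single-step exceptions (the two exceptions a kernel debugger handles most often) go through KdpReportExceptionStateChange. These two functions are very similar, and both call KdpSendWaitContinue.
  KdpSendWaitContinue is effectively the steward of the kernel debugger's functionality, dispatching each operation. It sends the kernel debugger the information to be reported, such as the current state of all registers. After every single step we see the register display updated, which is the kernel debugger receiving a packet with the latest machine state. It also reports the state of SYMBOLS, so that loading and unloading SYMBOLS show up in the kernel debugger. Then KdpSendWaitContinue waits for a command packet from the kernel debugger to decide what to do next. Let us see what KdpSendWaitContinue can do: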
  case DbgKdReadVirtualMemoryApi:
      KdpReadVirtualMemory(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdReadVirtualMemory64Api:
      KdpReadVirtualMemory64(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdWriteVirtualMemoryApi:
      KdpWriteVirtualMemory(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdWriteVirtualMemory64Api:
      KdpWriteVirtualMemory64(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdReadPhysicalMemoryApi:
      KdpReadPhysicalMemory(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdWritePhysicalMemoryApi:
      KdpWritePhysicalMemory(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdGetContextApi:
      KdpGetContext(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdSetContextApi:
      KdpSetContext(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdWriteBreakPointApi:
      KdpWriteBreakpoint(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdRestoreBreakPointApi:
      KdpRestoreBreakpoint(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdReadControlSpaceApi:
      KdpReadControlSpace(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdWriteControlSpaceApi:
      KdpWriteControlSpace(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdReadIoSpaceApi:
      KdpReadIoSpace(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdWriteIoSpaceApi:
      KdpWriteIoSpace(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdContinueApi:
      if (NT_SUCCESS(ManipulateState.u.Continue.ContinueStatus) != FALSE) {
          return ContinueSuccess;
      } else {
          return ContinueError;
      }
      break;
  case DbgKdContinueApi2:
      if (NT_SUCCESS(ManipulateState.u.Continue2.ContinueStatus) != FALSE) {
          KdpGetStateChange(&ManipulateState,ContextRecord);
          return ContinueSuccess;
      } else {
          return ContinueError;
      }
      break;
  case DbgKdRebootApi:
      KdpReboot();
      break;
  case DbgKdReadMachineSpecificRegister:
      KdpReadMachineSpecificRegister(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdWriteMachineSpecificRegister:
      KdpWriteMachineSpecificRegister(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdSetSpecialCallApi:
      KdSetSpecialCall(&ManipulateState,ContextRecord);
      break;
  case DbgKdClearSpecialCallsApi:
      KdClearSpecialCalls();
      break;
  case DbgKdSetInternalBreakPointApi:
      KdSetInternalBreakpoint(&ManipulateState);
      break;
  case DbgKdGetInternalBreakPointApi:
      KdGetInternalBreakpoint(&ManipulateState);
      break;
  case DbgKdGetVersionApi:
      KdpGetVersion(&ManipulateState);
      break;
  case DbgKdCauseBugCheckApi:
      KdpCauseBugCheck(&ManipulateState);
      break;
  case DbgKdPageInApi:
      KdpNotSupported(&ManipulateState);
      break;
  case DbgKdWriteBreakPointExApi:
      Status = KdpWriteBreakPointEx(&ManipulateState,
                                    &MessageData,
                                    ContextRecord);
      if (Status) {
          ManipulateState.ApiNumber = DbgKdContinueApi;
          ManipulateState.u.Continue.ContinueStatus = Status;
          return ContinueError;
      }
      break;
  case DbgKdRestoreBreakPointExApi:
      KdpRestoreBreakPointEx(&ManipulateState,&MessageData,ContextRecord);
      break;
  case DbgKdSwitchProcessor:
      KdPortRestore ();
      ContinueStatus = KeSwitchFrozenProcessor(ManipulateState.Processor);
      KdPortSave ();
      return ContinueStatus;
  case DbgKdSearchMemoryApi:
      KdpSearchMemory(&ManipulateState, &MessageData, ContextRecord);
      break;
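Writing memory, searching memory, setting and restoring breakpoints, continuing execution, rebooting, and so on: does that not cover everything WinDBG does? Heh.
  Each time the kernel debugger takes over the system, it does so through the call to KiDebugRoutine (KdpTrap) in KiDispatchException. But for the system to reach KiDispatchException, an exception must occur. The kernel debugger and the debugged system are connected only by a serial port, and a serial port raises interrupts, not exceptions. So how is an exception produced? The answer lies in KeUpdateSystemTime. On every clock interrupt, HalpClockInterrupt does some low-level work and then jumps to this function to update the system time (because it is a jump rather than a call, HalpClockInterrupt's address does not appear in the stack trace after WinDBG breaks in). It is one of the most frequently called functions in the system. KeUpdateSystemTime checks whether KdDebuggerEnabled is TRUE; if so it calls KdPollBreakIn to check whether a break-in packet has arrived from the kernel debugger. If one has, it calls DbgBreakPointWithStatus, which executes an int 0x3 instruction. Once the exception handling flow enters KdpTrap, packets are sent to the kernel debugger as the situation requires, and the system loops indefinitely waiting for the kernel debugger's response. This explains why, after breaking into the system in WinDBG, the stack trace shows KeUpdateSystemTime->RtlpBreakWithStatusInstruction with the system stopped at an int 0x3 instruction (the int 0x3 has in fact already executed; Eip has simply been decremented by 1). Execution has actually entered KiDispatchException->KdpTrap, and control has been handed to the kernel debugger.
  Besides int 0x3, the system interacts with the debugger through DbgPrint, DbgPrompt, and loading and unloading symbols. All of these obtain service by calling DebugService.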
  NTSTATUS DebugService(
      ULONG   ServiceClass,
      PVOID   Arg1,
      PVOID   Arg2
  ){
      NTSTATUS    Status;
      __asm {
          mov     eax, ServiceClass
          mov     ecx, Arg1
          mov     edx, Arg2
          int     0x2d
          int     0x3
          mov     Status, eax
      }
      return Status;
  }
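  ServiceClass can be BREAKPOINT_PRINT (0x1), BREAKPOINT_PROMPT (0x2), BREAKPOINT_LOAD_SYMBOLS (0x3), or BREAKPOINT_UNLOAD_SYMBOLS (0x4). Why is it followed by an int 0x3? Microsoft's explanation is that this shares code with int 0x3 (I never worked out what that means -_-), because the int 0x2d trap handler does some processing and then jumps into the int 0x3 trap handler to continue. In practice nothing is done with this int 0x3 instruction other than adding 1 to Eip to skip it, so the int 0x3 could be replaced by any byte.
  Both int 0x2d and int 0x3 produce an exception record (EXCEPTION_RECORD) whose ExceptionRecord.ExceptionCode is STATUS_BREAKPOINT (0x80000003). The difference is that for an exception raised by int 0x2d, ExceptionRecord.NumberParameters > 0 and ExceptionRecord.ExceptionInformation holds the corresponding ServiceClass, such as BREAKPOINT_PRINT. In fact, once a kernel debugger is attached, DbgPrint and similar calls that send strings to the kernel debugger no longer go through the int 0x2d trap service; they send packets directly. According to Microsoft this is safer, because KdEnterDebugger and KdExitDebugger need not be called.
  Finally, a word about communication between the debugged system and the kernel debugger. The two exchange packets over a serial port; the I/O port address of Com1 is 0x3f8 and that of Com2 is 0x2f8. Before the debugged system sends a packet to the kernel debugger, it calls KdEnterDebugger to pause the other processors and acquire the Com port spin lock (this applies to multiprocessor systems, of course), and sets the port flag to the saved state. After sending, it calls KdExitDebugger to restore. Each packet, like a network packet, consists of a header and a body. The header format is as follows: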
  typedef struct _KD_PACKET {
      ULONG PacketLeader;
      USHORT PacketType;
      USHORT ByteCount;
      ULONG PacketId;
      ULONG Checksum;
  } KD_PACKET, *PKD_PACKET;
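  PacketLeader is an identifier made of four identical bytes that marks an incoming packet: 0x30303030 for ordinary packets, 0x69696969 for control packets, and 0x62626262 for the packet that breaks into the debugged system. The receiver reads one byte at a time, four times in a row, to recognize a packet. The break-in packet is special: its only content is 0x62626262. After the leader come the packet type, size, packet ID, checksum, and so on, and after the header comes the actual data, much like packets on a network. There are other similarities: every packet sent to the debugger is answered with an ACK packet confirming receipt. If a RESEND packet arrives instead, or no reply arrives for a long time, the packet is sent again. Packets that send output strings or report SYMBOL state return as soon as the ACK arrives and the system resumes; the visible effect is a brief stall. Only state-report packets wait for each control packet from the kernel debugger and carry out the corresponding function, until a packet arrives containing the command to continue. Whether sending or receiving, a 0xaa byte is appended to the end of each packet to mark its end.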
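
The leader recognition described here (read one byte at a time until four identical leader bytes arrive) can be sketched as a small scanner over a byte buffer. This is an illustration only, not the kernel's code: `LeaderKind` and `find_packet_leader` are hypothetical names, and a real receiver would read from the serial port and handle timeouts.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { LEADER_NONE, LEADER_DATA, LEADER_CONTROL, LEADER_BREAKIN } LeaderKind;

/* Scan bytes for four identical leader bytes in a row:
 * 0x30 = ordinary packet, 0x69 = control packet, 0x62 = break-in. */
static LeaderKind find_packet_leader(const uint8_t *bytes, size_t count)
{
    size_t run = 0;     /* length of the current run of identical leader bytes */
    uint8_t last = 0;

    for (size_t i = 0; i < count; i++) {
        uint8_t b = bytes[i];
        int is_leader = (b == 0x30 || b == 0x69 || b == 0x62);

        if (is_leader && run > 0 && b == last)
            run++;
        else
            run = is_leader ? 1 : 0;
        last = b;

        if (run == 4) {
            if (b == 0x30)
                return LEADER_DATA;
            if (b == 0x69)
                return LEADER_CONTROL;
            return LEADER_BREAKIN;
        }
    }
    return LEADER_NONE;
}
```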
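Now let us look at the debugging flow through a few examples.
  I remember once asking jiurl why single-stepping in WinDBG is so slow (compared with SoftICE), and he said he had not found it slow, which left me speechless. Now it is clear why single-stepping in WinDBG, and breaking into a normally running operating system, are both so slow. Single-stepping is slow because each step, besides the necessary processing, has to send and receive packets over the serial line. Breaking in is slow because the debugged system accepts WinDBG's break-in packet only when a clock interrupt occurs and execution reaches KeUpdateSystemTime. Next, consider why a breakpoint cannot be set inside KiDispatchException although KiDispatchException can be single-stepped. If a breakpoint is set somewhere in KiDispatchException, then when execution reaches it the system raises an exception and re-enters KiDispatchException, runs to the int 0x3 again, and so on in an endless loop, never able to restore the code that the int 0x3 breakpoint overwrote. int 0x1 is different: it is raised because the TF bit in the EFLAGS register is set, and the bit is cleared automatically each time, so the system can continue without looping forever. Now that we know the internal mechanism, we could call the KdXXX functions to build a kernel debugger similar to WinDBG, or even replace KiDebugRoutine (KdpTrap) with our own function to build a more powerful debugger, heh.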
  SoftICE
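  SoftICE works completely differently from WinDBG. It gains control of the system by replacing the system's normal interrupt handlers, which is exactly why it can debug on a single machine. Its implementation is very low-level and depends little on the interface functions Windows provides; most of its functionality is done through I/O port reads and writes and the like.
  SoftICE replaces the following interrupt (trap) handlers in the IDT: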
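  0x1:    single-step trap handler
  0x2:    NMI (non-maskable interrupt)
  0x3:    debug trap handler
  0x6:    invalid opcode trap handler
  0xb:    segment-not-present trap handler
  0xc:    stack fault trap handler
  0xd:    general protection fault trap handler
  0xe:    page fault trap handler
  0x2d:    debug service trap handler
  0x2e:    system service trap handler
  0x31:    8042 keyboard controller interrupt handler
  0x33:    serial port 2 (Com2) interrupt handler
  0x34:    serial port 1 (Com1) interrupt handler
  0x37:    parallel port interrupt handler
  0x3c:    PS/2 mouse interrupt handler
  0x41:    unused
  (These are the interrupts replaced on a PIC system. On an APIC system the interrupt numbers differ, but the same handlers are replaced.)
  The key replacements are the 0x3 debug trap handler and the 0x31 i8042 keyboard interrupt handler (the keyboard is controlled by the i8042 chip). These are the two places where SoftICE gains control of the system.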
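  Besides replacing the handlers in the IDT, starting the SoftICE service does a few other important things. First, it hooks the READ_PORT_UCHAR function in i8042prt.sys, because reading port 0x60 changes the state of the control register behind port 0x64. So after SoftICE's keyboard interrupt handler has read port 0x60 and handed control back to the normal keyboard interrupt handler, the normal handler must not read the port a second time. Second, it maps the first 1MB of physical memory into virtual address space by calling MmMapIoSpace. This range includes the physical address of display memory, so repainting the screen later only requires modifying the display memory through this mapping.
  In color mode, display memory starts at 0xb8000, the CRT index register is at port 0x3d4, and the CRT data register is at port 0x3d5. In monochrome mode, display memory starts at 0xb0000, the CRT index register is at port 0x3b4, and the CRT data register is at port 0x3b5. To program the controller, first write the index register to select one of the display controller's internal registers (r0-r17), then write the parameter to the data register port.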
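
Repainting the screen through mapped display memory comes down to writing character cells at computed offsets. The sketch below assumes the common 80x25 text mode, in which each cell is two bytes (character, then attribute); the mode is my assumption, since the text does not name it. An ordinary array stands in for the mapped 0xb8000 region so the code can run anywhere, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { TEXT_COLS = 80, TEXT_ROWS = 25 };  /* assumed text mode geometry */

/* Stand-in for the mapped display memory (0xb8000 in color mode). */
static uint8_t fake_video_memory[TEXT_COLS * TEXT_ROWS * 2];

/* Byte offset of the cell at (row, col): two bytes per cell. */
static size_t text_cell_offset(unsigned row, unsigned col)
{
    return ((size_t)row * TEXT_COLS + col) * 2;
}

/* Write one character and its attribute byte into the buffer. */
static void put_text_cell(unsigned row, unsigned col, char ch, uint8_t attribute)
{
    size_t offset = text_cell_offset(row, col);
    fake_video_memory[offset] = (uint8_t)ch;
    fake_video_memory[offset + 1] = attribute;
}
```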
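  The i8042 keyboard controller interrupt handler is triggered on every key press and every key release. After hooking the normal keyboard interrupt handler and gaining control, SoftICE first reads the scan code of the key from port 0x60 and then sends a non-specific EOI (0x20) to port 0x20 to signal that the interrupt has been serviced. If the activation hotkey (ctrl+d) was not pressed, it returns to the normal keyboard interrupt handler. If the hotkey was pressed, it checks whether the console (the black screen that displays code and waits for commands) is active, and activates it if not. It then gives IRQ1, the keyboard interrupt, the highest priority and programs the interrupt mask registers of the two 8259A interrupt controllers (by writing mask bytes to 0x21 and 0xa1; setting a bit to 1 masks that interrupt). Only IRQ1 (keyboard), IRQ2 (the cascade from the second controller; the PS/2 mouse interrupt belongs to 8259A-2, so interrupts managed by 8259A-2 can be serviced only if IRQ2 is open), and IRQ12 (PS/2 mouse, if present) remain enabled, so the system responds to these three interrupts only. The new keyboard and mouse interrupt handlers set up a buffer that holds a certain amount of input scan data. When all of this is done, SoftICE enters a loop that processes the keyboard and mouse scan code buffer and continually updates the mapped display memory to repaint the screen. (This loop works on the same principle as WinDBG's loop waiting for packets from the serial port: both wait in the background for the user's commands.)
  This loop is called from the routine that activates the console, which means that once the console is active the normal flow never enters the loop again (obviously, since re-entering it would hang the system). Each new key press invokes the keyboard interrupt handler again; because the console is already active, the handler simply updates the keyboard input buffer and returns with iret. It does not return to the normal keyboard interrupt handler, because that would give up control. (This is easy to verify: in SoftICE, set a breakpoint on the normal keyboard interrupt handler and type g. A second later execution stops there, and we can press F10. If SoftICE handed control to the normal keyboard interrupt handler, this would long since have become an endless loop.) The mouse interrupt handler behaves the same way. At this point the iret actually returns into the loop, so the code being debugged does not run unless a key such as F10 is pressed. That tells the loop to exit and return to the original interrupt handler, which then irets to the place originally interrupted. Because the TF bit in EFLAGS has been set, after one instruction executes the single-step handler re-enters the loop.
  Handling int 0x3 is much the same: if the console is not active it is activated first, all interrupts other than the keyboard, mouse, and 8259A-2 controller are masked, and the loop is entered.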
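
The interrupt masking described above is simple bit arithmetic: a 1 bit in a mask byte blocks the corresponding IRQ. The sketch below computes the two mask bytes that leave only a chosen set of IRQs enabled. It only does the arithmetic; writing the bytes to ports 0x21 and 0xa1 requires kernel-mode port I/O and is left out. `PicMasks` and `masks_allowing` are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    uint8_t master;  /* mask for IRQ0..7, destined for port 0x21 */
    uint8_t slave;   /* mask for IRQ8..15, destined for port 0xa1 */
} PicMasks;

/* allowed_irqs has bit n set for every IRQ n that should stay enabled.
 * In the resulting mask bytes, a 1 bit means the IRQ is masked. */
static PicMasks masks_allowing(uint16_t allowed_irqs)
{
    uint16_t blocked = (uint16_t)~allowed_irqs;
    PicMasks m;
    m.master = (uint8_t)(blocked & 0xFF);
    m.slave = (uint8_t)(blocked >> 8);
    return m;
}
```

For IRQ1, IRQ2, and IRQ12 this gives 0xF9 for the first controller and 0xEF for the second.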
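  For comparison, here is how SoftICE handles int 0x3 and single-stepping. When an int 0x3 executes, SoftICE activates the console and masks interrupts, disassembles the instructions around the int 0x3 and writes them into the mapped display memory along with the latest register values, and then loops in the background waiting for keyboard commands. When the command is F10, it sets the TF bit in EFLAGS, clears the interrupt mask registers of the 8259A controllers to enable all interrupts, clears the console, and returns from the loop to the new keyboard (or int 0x3) interrupt handler and from there to the normal keyboard (or int 0x3) interrupt handler, which irets to the interrupted code. After one instruction executes, the single-step exception brings execution back into the background loop.
  The reason single-stepping in SoftICE is much faster than in WinDBG is simple: SoftICE only has to lightly process the disassembled code and data and write it into the mapped display memory to refresh the screen before continuing. There is no serial packet exchange to slow it down. Breaking into the system is faster still: the interrupt fires as soon as the key is pressed, with no need to wait for a clock interrupt as WinDBG does.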
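  Postscript:
  This may all sound simple, but implementing a kernel debugger is extremely complex. I did not go into more detail for three reasons: first, the title says "brief analysis", so this is closer to a popular introduction; second, my skill and time are limited (the main reason ^^); third, a detailed account would not fit in a few thousand words. Also, disassembling ntice.sys is a truly arduous task, many times more complex than analyzing a vulnerability, and before I found my bearings it made my head spin. Special thanks to the author of Syser, a real expert, who cleared things up for me with a few words when my understanding of how SoftICE works was still murky ^^. My knowledge is limited, so there are bound to be errors and omissions; I hope experts will point them out :)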