MIPS Exceptions

The MIPS architecture uses a precise exception model.
What does that mean? Here is an excerpt from the book “See MIPS Run”:

“In a precise-exception CPU, on any exception we get pointed at one instruction(the exception victim). All instructions preceding the exception victim in execution sequence are complete; any work done on the victim and on any subsequent instructions (BNN NOTE: pipeline effects) has no side effects that the software need worry about. The software that handles exceptions can ignore all the timing effects of the CPU’s implementations”

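Put simply: every instruction before the one that raised the exception runs to completion and its effects are visible. The faulting instruction itself and everything after it produce no visible effect.
Another way to put it: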

A precise exception is one in which the EPC (CP0, Register 14, Select 0) can be used to identify the instruction that caused the exception. For imprecise exceptions, the instruction that caused the exception cannot be identified. Most exceptions are precise. Bus error exceptions may be imprecise.

General exception processing

With the exception of Reset, Soft Reset, NMI, and Debug exceptions, which have their own special processing as described below, exceptions have the same basic processing flow:
• If the EXL bit in the Status register is cleared, the EPC register is loaded with the PC at which execution will be restarted and the BD bit is set appropriately in the Cause register. If the instruction is not in the delay slot of a branch, the BD bit in Cause will be cleared and the value loaded into the EPC register is the current PC. If the instruction is in the delay slot of a branch, the BD bit in Cause is set and EPC is loaded with PC-4. If the EXL bit in the Status register is set, the EPC register is not loaded and the BD bit is not changed in the Cause register.
• The CE and ExcCode fields of the Cause registers are loaded with the values appropriate to the exception. The CE field is loaded, but not defined, for any exception type other than a coprocessor unusable exception.
• The EXL bit is set in the Status register.
• The processor is started at the exception vector.
The value loaded into EPC represents the restart address for the exception and need not be modified by exception handler software in the normal case. Software need not look at the BD bit in the Cause register unless it wishes to identify the address of the instruction that actually caused the exception. Note that individual exception types may load additional information into other registers. This is noted in the description of each exception type below.

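EPC holds the address of the instruction that was executing when the exception occurred; if that instruction sits in a branch delay slot, EPC holds the address of the branch instead. Either way, execution resumes from EPC when the handler returns. For the delay-slot case the BD flag in the Cause register is set, so software can still work out exactly which instruction raised the exception.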
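The relationship between EPC, Cause BD and the faulting instruction can be sketched in C. This is only an illustration: the function name `victim_address` and the macro `CAUSE_BD` are made up for this example, and it assumes the MIPS32 layout in which BD is bit 31 of Cause.

```c
#include <stdint.h>

/* Assumption: Cause.BD is bit 31, as in the MIPS32 Cause register layout. */
#define CAUSE_BD (UINT32_C(1) << 31)

/*
 * Hypothetical helper: given the saved EPC and Cause values, return the
 * address of the instruction that actually raised the exception.
 * When BD is set, EPC points at the branch and the victim is the
 * delay-slot instruction four bytes further on.
 */
uint32_t victim_address(uint32_t epc, uint32_t cause)
{
    return (cause & CAUSE_BD) ? epc + 4 : epc;
}
```

The restart address is still EPC in both cases; only diagnostic code that wants to name the faulting instruction needs this adjustment.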

Operation:

if Status EXL = 0 then
 if InstructionInBranchDelaySlot then
  EPC <- PC - 4
  Cause BD <- 1
 else
  EPC <- PC
  Cause BD <- 0
 endif
 if ExceptionType = TLBRefill then
  vectorOffset <- 0x000
 elseif (ExceptionType = Interrupt) and (Cause IV = 1) then
  vectorOffset <- 0x200
 else
  vectorOffset <- 0x180
 endif
else
 vectorOffset <- 0x180
endif
Cause CE <- FaultingCoprocessorNumber
Cause ExcCode <- ExceptionType
Status EXL <- 1
if Status BEV = 1 then
 PC <- 0xBFC0_0200 + vectorOffset
else
 PC <- 0x8000_0000 + vectorOffset
endif
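The vector selection in the pseudocode above can be written as a small C function. This is a sketch, not hardware-accurate code: the names `exc_type` and `exception_vector` are invented for the example, and the BEV = 0 base is fixed at 0x8000_0000 as in the pseudocode (on Release 2 cores it would come from EBase).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of the exception being taken. */
enum exc_type { EXC_TLB_REFILL, EXC_INTERRUPT, EXC_OTHER };

/*
 * Compute the address the PC is loaded with, following the pseudocode:
 *   exl - Status.EXL before the exception is taken
 *   iv  - Cause.IV
 *   bev - Status.BEV
 */
uint32_t exception_vector(enum exc_type type, bool exl, bool iv, bool bev)
{
    uint32_t offset = 0x180;                 /* general exception vector */

    if (!exl) {
        if (type == EXC_TLB_REFILL)
            offset = 0x000;                  /* dedicated TLB refill vector */
        else if (type == EXC_INTERRUPT && iv)
            offset = 0x200;                  /* dedicated interrupt vector */
    }

    /* Assumption: fixed base when BEV = 0, as in the pseudocode. */
    uint32_t base = bev ? UINT32_C(0xBFC00200) : UINT32_C(0x80000000);
    return base + offset;
}
```

For example, an interrupt taken with IV = 1 and BEV = 1 lands at 0xBFC0_0400, while a TLB refill taken with EXL already set falls back to the general vector at offset 0x180.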

As with any procedure, the exception handler must save any registers it may modify, and then restore them before returning control to the interrupted program. Saving registers in memory poses a problem in MIPS: addressing the memory requires a register (the base register) in which the address is formed. This means that a register must be modified before any register can be saved! The MIPS register usage convention (see Laboratory 4) reserves registers $26 and $27 ($k0 and $k1) for the use of the interrupt handler. This means that the interrupt handler can use these registers without having to save them first. A user program that uses these registers may find them unexpectedly changed.

The CPU operates in one of the two possible modes, user and kernel. User programs run in user mode. The CPU enters kernel mode when an exception happens. Coprocessor 0 can only be used in kernel mode.

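Note: why does an exception in a branch delay slot restart from the branch? MIPS instructions execute in a pipeline, and the outcome of the branch does not affect whether the delay-slot instruction executes: whichever way the branch goes, the delay-slot instruction always runs. If EPC held the address of the delay-slot instruction, the branch outcome would be lost on return, and execution after the handler would continue down the wrong path.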

Exception entry points (vectors)

The Reset, Soft Reset, and NMI exceptions are always vectored to location 0xBFC0_0000. Debug exceptions are vectored to location 0xBFC0_0480 or to location 0xFF20_0200 if the ProbTrap bit is 0 or 1, respectively, in the EJTAG Control register (ECR).


Addresses for all other exceptions are a combination of a vector offset and a base address.

Table 4-2 gives the base address as a function of the exception and whether the BEV bit is set in the Status register.

Table 4-3 gives the offsets from the base address as a function of the exception.

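Cause IV: setting the IV bit in the CP0 Cause register to 1 makes interrupt exceptions use the dedicated interrupt vector offset (0x200) instead of the general exception vector offset (0x180).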

Table 4-4 combines these two tables into one that contains all possible vector addresses as a function of the state that can affect the vector selection.


In MIPS32® Release 2 and higher architectures, software is allowed to specify the vector base address via the CP0 EBase register for exceptions that occur when CP0 Status BEV equals 0.

Status BEV = 1: Exceptions vector to an uncached entry point in KSEG1: 0xBFC00xxx
Status BEV = 0: Exceptions vector to cached entry points in KSEG0: defined by the CP0 EBase register, plus some offset

Note: Status BEV = 1 at reset. If EBase is to be changed, it must be done with Status BEV = 1 (i.e. at system boot). The operation of the CPU is UNDEFINED if EBase is written when Status BEV = 0. The EBase default is 0x8000_0000 after reset.

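The EBase register is a read/write register that holds the exception vector base address and a read-only CPU number.

The Cache Error exception is a special case: it needs a base address that is always uncached. When this exception is raised the cache can no longer be trusted, so the handler must not run through it.

The entry base address for this exception is therefore:

BEV = 1: 0xBFC0_0300 (boot address space, kseg1)

BEV = 0: [SP]: 0xA000_0000 (physical memory, kseg1)

[MP]: EBase[31:30] || 1 || EBase[28:12] || 0x000 (physical memory, kseg1)

To summarize the above:

Reset, Soft Reset and NMI: unaffected by any configuration; the vector is always 0xBFC0_0000.

General Exception: the vector is 0xBFC0_0200 + 0x180 or EBase + 0x180.

Interrupt: IV selects whether a dedicated vector is used. With IV = 0 interrupts use the General Exception vector; with IV = 1 they use the dedicated interrupt vector.

TLB refill: with EXL = 0 the dedicated TLB refill vector is used; with EXL = 1 the General Exception vector is used.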

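Exception priority

Priority here means: when several exceptions or interrupts are raised at the same moment, the CPU handles them in the priority order given above.

The first column is the exception number; the second column is a description of that exception.

Exception-related registers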

The BadVAddr register

This register (its name stands for Bad Virtual Address) will contain the memory address where the exception has occurred.

An unaligned memory access, for instance, will generate an exception and the address where the access was attempted will be stored in BadVAddr.

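SR (Status Register)

EXL Exception Level; set by the processor when any exception other than Reset, Soft Reset, NMI, or Cache Error is taken.

0: normal 1: exception

While EXL is set, interrupts are disabled. In other words SR[IE] has no effect and all interrupts are effectively masked.

A TLB Refill exception uses the General Exception Vector instead of the default TLB Refill Vector.
If another exception occurs, EPC is not updated automatically. Take great care here: to support nested exceptions the handler must clear EXL, after first saving EPC. Also note that on entry to an exception or interrupt MIPS does not change SR[UX], SR[KX] or SR[SX]; SR[EXL] = 1 by itself puts the CPU in kernel mode.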

ERL Error Level; set by the processor when a Reset, Soft Reset, NMI, or Cache Error exception is taken.

0: normal 1: error

While ERL is set, interrupts are disabled. ERET returns through ErrorEPC rather than EPC; pay close attention to this difference.

kuseg and xkuseg are treated as unmapped and uncached.

One way to think of it: this is the only time a MIPS CPU runs in a kind of real mode, where the kuseg address space can be used directly without TLB mapping.

The ERET instruction to return from exception is used for returning from exception level (Status.EXL) and error level (Status.ERL). If both bits are set however we should be returning from ERL first, as ERL can interrupt EXL, for example when an NMI is taken.

Both levels return through ERET. If EXL and ERL are both set, the return from ERL comes first: PC is set to ErrorEPC and ERL is cleared. Note that EXL is not cleared at this point.

An emulator's implementation of ERET looks roughly like this:

if (kvm_read_c0_guest_status(cop0) & ST0_ERL) {
      kvm_clear_c0_guest_status(cop0, ST0_ERL);
      vcpu->arch.pc = kvm_read_c0_guest_errorepc(cop0);
  } 
else if (kvm_read_c0_guest_status(cop0) & ST0_EXL) {
     kvm_clear_c0_guest_status(cop0, ST0_EXL);
     vcpu->arch.pc = kvm_read_c0_guest_epc(cop0);
}

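IE Interrupt Enable 0: disable interrupts 1: enable interrupts. Remember: when SR[EXL] or SR[ERL] is set, SR[IE] has no effect.
BEV Normal/Bootstrap exception vectors location

BEV=1: exception entry points are fixed in the uncached, boot-safe kseg1 region

BEV=0: exception entry points are not fixed; they can be relocated in software through the EBase register. BEV is 0 during normal operation, with the base given by EBase
SR Soft Reset; set to 1 when the reset was a soft reset
NMI set to 1 when the exception was a non-maskable interrupt
IM[7:0] Interrupt Mask
UM Kernel/User Mode; UM=1 is user mode. Taking an interrupt does not change this bit

UM:EXL:ERL Mode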

100: User

000: Kernel

x10: Kernel (exception handling)

x01: Kernel (error handling)
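The mode table can be expressed as a short decoding function. The sketch below assumes the MIPS32 Status layout (EXL = bit 1, ERL = bit 2, UM = bit 4); the enum and function names are invented for the example. ERL is tested first, matching the ERET ordering discussed above.

```c
#include <stdint.h>

/* Assumed MIPS32 Status register bit positions. */
#define ST_EXL (UINT32_C(1) << 1)
#define ST_ERL (UINT32_C(1) << 2)
#define ST_UM  (UINT32_C(1) << 4)

/* Hypothetical names for the four rows of the mode table. */
enum cpu_mode {
    MODE_KERNEL,            /* UM = 0, EXL = 0, ERL = 0 */
    MODE_KERNEL_EXCEPTION,  /* EXL = 1, UM ignored      */
    MODE_KERNEL_ERROR,      /* ERL = 1, UM ignored      */
    MODE_USER               /* UM = 1, EXL = 0, ERL = 0 */
};

enum cpu_mode decode_mode(uint32_t status)
{
    if (status & ST_ERL)
        return MODE_KERNEL_ERROR;
    if (status & ST_EXL)
        return MODE_KERNEL_EXCEPTION;
    return (status & ST_UM) ? MODE_USER : MODE_KERNEL;
}
```

Note that UM alone never decides the mode: a Status value with UM = 1 and EXL = 1 still decodes as kernel, which is why hardware can leave UM untouched when an exception is taken.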

Cause

When a processor exception occurs, this register identifies its cause. The most important field is the 5-bit exception code in bits 2-6, whose value gives the type of exception that was raised.

BD: Exception happened in a branch delay slot
IV: Use general vs special interrupt vector. With IV set, interrupts use vector offset 0x200 (the vector base is 0xBFC0_0200 or EBase)
IP[7:0]: Interrupt(s) pending
ExcCode: Exception code
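As an illustration, the Cause fields listed above can be unpacked with shifts and masks. The bit positions are assumptions taken from the MIPS32 Cause layout (BD = 31, CE = 29:28, IV = 23, IP = 15:8, ExcCode = 6:2), and the struct and function names are made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical container for the decoded Cause fields. */
struct cause_fields {
    unsigned bd;       /* exception happened in a branch delay slot   */
    unsigned ce;       /* coprocessor number for coprocessor unusable */
    unsigned iv;       /* interrupts use the dedicated 0x200 vector   */
    unsigned ip;       /* IP[7:0], pending interrupt lines            */
    unsigned exccode;  /* 5-bit exception code                        */
};

struct cause_fields decode_cause(uint32_t cause)
{
    struct cause_fields f;

    f.bd      = (cause >> 31) & 0x1;
    f.ce      = (cause >> 28) & 0x3;
    f.iv      = (cause >> 23) & 0x1;
    f.ip      = (cause >> 8)  & 0xff;
    f.exccode = (cause >> 2)  & 0x1f;
    return f;
}
```

A handler would typically read Cause with mfc0, pass the value to something like this, and branch on `exccode` (0 means interrupt).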

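EPC holds the return address

This register's job is simple: it saves the address of the instruction at which the exception occurred. From it you can locate the faulting instruction; combined with BadVAddr, sp and other registers, you can reconstruct the call chain at the time of the exception and track down the root cause.

WatchLo, WatchHi

This register pair sets up a hardware memory watchpoint, that is, it monitors accesses to a chosen memory location. When an accessed address matches the address programmed in these registers, an exception is raised. This is probably different from gdb's watch command; something to try later.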

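CP0

Main CP0 operations

mfc0 rt, rd copies CP0 register rd into general-purpose register rt

mtc0 rt, rd copies general-purpose register rt into CP0 register rd

mfhi/mflo rd copies the HI or LO register into general-purpose register rd

mthi/mtlo rs copies general-purpose register rs into the HI or LO register

Note that HI and LO belong to the multiply/divide unit rather than to CP0.

CP0 hazards

MIPS is a non-interlocked, heavily pipelined five-stage architecture, which means that the next instruction may already be in the fetch or decode stage before the previous one has finished. This can give rise to a CP0 hazard. mfc0 and mtc0 are relatively slow, so by the time the following instruction executes, the CP0 register value may not yet have reached the destination general-purpose register. Reading that register at this point may return a stale value. This is the so-called CP0 hazard.

To avoid CP0 hazards, follow each CP0 access with an instruction that does not depend on the destination general-purpose register of the previous instruction, in effect a delay slot. If performance is not critical, fill that slot with a nop.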

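MIPS CPU interrupt mechanism

MIPS exceptions

In MIPS, interrupts, traps, system calls and anything else that can interrupt the normal flow of a program are all called exceptions.

Precise exceptions: while the instruction that raises the exception is executing, the following instruction has already been fetched and decoded. When the exception is raised this preparatory work is discarded, and the fetch and decode are redone after the CPU returns from the exception.

MIPS handles exceptions by assigning each one a type; software then gives the types priorities. All of them enter a common dispatch routine through the same entry point, and the dispatcher uses the type and priority to decide which handler function to run.

The EPC register in CP0 points to where execution was when the exception occurred, normally the address of the interrupted instruction. On return, execution resumes at that address. If the interrupted instruction is in a branch delay slot, however, the hardware automatically makes EPC point one instruction back (PC-4), at the branch. When the branch is re-executed, the delay-slot instruction is executed again as well.

MIPS exception handling steps

Set EPC to point to the return location
Set the Status register: the EXL bit is set to 1, which forces the CPU into kernel mode and disables interrupts
Set the Cause register so that software can see the reason for the exception. For address exceptions BadVAddr is set as well, and memory-management exceptions also set some MMU registers
The CPU starts fetching instructions from the exception entry point

MIPS exception handling example

Save context: at the entry of the exception routine, save the state of the interrupted program by storing its registers so that no critical state is overwritten
Handle the exception: use Cause ExcCode to determine what kind of exception occurred, then do whatever work is needed
Prepare to return: restore the context and put SR back into the safe state (kernel mode, exceptions disabled) with EXL set to 1, i.e. the state right after the exception was taken
Return from the exception: the eret instruction clears SR EXL to 0 and returns control to the address stored in EPC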

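Interrupt-related registers

Interrupts come from events outside the CPU core, that is, input signals arriving on real hardware wires (external or hardware interrupts).

Cause: ExcCode 00000 means interrupt. These interrupts direct the CPU to some external event. A MIPS CPU has 8 independent interrupt bits (Cause IP7-2 and IP1-0): 6 of them (IP7-2) are external interrupts and 2 (IP1-0) are internal interrupts that software can write. The on-chip counter/timer is also wired to one of the hardware interrupt bits.

MIPS interrupt support involves the SR and Cause registers.

Global interrupt enable (IE: interrupt enable): the IE bit in SR must be 1 for interrupts to be enabled
Interrupt mask (IM: interrupt mask): SR[15-8] are the interrupt mask bits, IM[7-0]. These 8 bits decide which interrupt sources may raise an exception when they have a request pending; they are in effect per-source enable switches. Six of them, IM[7-2], correspond to Cause IP[7-2] and are used for hardware interrupts from external devices; the other two, IM[1-0], correspond to Cause IP[1-0] and mask the software interrupts. An interrupt source is either a device attached to the PIC that generates a hardware interrupt signal, or a software interrupt.
Exception level (EXL: exception level): as soon as an exception is taken the CPU sets SR EXL to 1 and enters exception mode, which forces kernel privilege and masks interrupts regardless of the other SR bits. While EXL is set the CPU is not yet ready to call into the main kernel routines (interrupt return or interrupt hook handling?). In this state the system cannot handle another exception. EXL should stay set long enough to save context and let software decide what the new privilege level and interrupt mask should be
Exception type (ExcCode: Exception Code): each PIC input pin whose IM bit is 1 (not masked) is sampled every cycle and, if enabled, raises an exception. When the handler finds ExcCode = 0 in Cause, the exception is an interrupt and control passes to the common interrupt routine
Interrupt pending (IP: interrupt pending): Cause[15-8] are the pending bits that show which devices have raised an interrupt, specifically which PIC input pin the signal arrived on. IP[7-2] follow the signals on the CPU's hardware input pins, while IP1-0 are the software interrupt bits, which are readable and writable and hold the last value written. When some SR IM[7-0] bits are enabled and a hardware or software interrupt fires, the pending bits in Cause IP identify which source raised it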

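Interrupt handling steps:

Note the difference from other exceptions: an exception may well be unrecoverable, whereas an interrupt takes more work because the interrupted context has to be restored.

An interrupt is one kind of exception, so interrupt handling is just one branch of exception handling. After the top-level exception handler has run, the steps are:
1. AND Cause IP with SR IM to obtain the one or more enabled interrupt requests
2. Choose one enabled interrupt to service, highest priority first
3. Save the interrupt mask bits in SR IM, then change SR IM so that the current interrupt and all interrupts of equal or lower priority cannot occur during handling
4. For nested exceptions, save context at this point
5. Put the CPU into a state suitable for the high-level part of the interrupt handler, which usually allows some nested interrupts or exceptions. Set the global interrupt enable bit SR IE so that higher-priority interrupts can be serviced. Also change the CPU privilege field (SR KSU) so that the CPU is in kernel mode, clear SR EXL to leave exception mode, and write these changes to the Status register
6. Run the interrupt handler and do the required work
7. Restore context and the saved registers, then return to the interrupted instruction (the interrupted program)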
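Steps 1 to 3 are plain bit manipulation and can be sketched in C. Two things here are assumptions rather than anything the hardware dictates: the function names are invented, and the priority policy (a higher-numbered line has higher priority) is a software choice made only for this example.

```c
#include <stdint.h>

/*
 * Steps 1 and 2: AND Cause.IP (bits 15:8) with Status.IM (bits 15:8) and
 * pick the highest-numbered pending, enabled line.
 * Returns the line number 0..7, or -1 if nothing is pending and enabled.
 */
int pick_interrupt(uint32_t cause, uint32_t status)
{
    uint32_t pending = (cause >> 8) & (status >> 8) & 0xffu;

    for (int line = 7; line >= 0; line--) {
        if (pending & (UINT32_C(1) << line))
            return line;
    }
    return -1;
}

/*
 * Step 3: build a new Status value whose IM field keeps only the lines
 * above `line`, so the current interrupt and everything of equal or lower
 * priority stays masked while the handler runs. The caller is expected to
 * have saved the old Status value first.
 */
uint32_t mask_at_or_below(uint32_t status, int line)
{
    uint32_t im     = (status >> 8) & 0xffu;
    uint32_t higher = (0xffu << (line + 1)) & 0xffu;

    return (status & ~UINT32_C(0xff00)) | ((im & higher) << 8);
}
```

With Cause = 0x8400 (IP7 and IP2 pending) and all IM bits enabled, `pick_interrupt` selects line 7; if only IM2 is enabled it selects line 2 instead.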

Common exceptions

Reset Exception

A reset exception occurs when the SI_ColdReset signal is asserted to the processor. This exception is not maskable. When a Reset exception occurs, the processor performs a full reset initialization, including aborting state machines, establishing critical state, and generally placing the processor in a state in which it can execute instructions from uncached, unmapped address space.

For Reset and Soft Reset exceptions, whether ErrorEPC accurately records the address of the instruction that was executing depends on the implementation, so check the manual for the specific processor. There also seems to be little need to record what the processor was executing at reset.

On a Reset exception, the state of the processor is not defined, with the following exceptions:

The Random register is initialized to the number of TLB entries - 1 (4Kc core).
The Wired register is initialized to zero (4Kc core).
The Config register is initialized with its boot state.
The RP, BEV, TS, SR, NMI, and ERL fields of the Status register are initialized to a specified state.
The I, R, and W fields of the WatchLo register are initialized to 0.
The ErrorEPC register is loaded with pc-4 if the state of the processor indicates that it was executing an instruction in the delay slot of a branch. Otherwise, the ErrorEPC register is loaded with pc. Note that this value may or may not be predictable.
PC is loaded with 0xBFC0_0000
Cause Register ExcCode Value: None
Additional State Saved: None
Entry Vector Used: Reset (0xBFC0_0000)

Operation:

Random <- TLBEntries - 1
Wired <- 0
Config <- ConfigurationState
Status RP  <- 0
Status BEV  <- 1
Status TS  <- 0
Status SR  <- 0
Status NMI  <- 0
Status ERL  <- 1
WatchLo I  <- 0
WatchLo R  <- 0
WatchLo W  <- 0
if InstructionInBranchDelaySlot then
ErrorEPC <- PC - 4
else
ErrorEPC <- PC
endif
PC <- 0xBFC0_0000

Non-Maskable Interrupt (NMI) Exception

A non-maskable interrupt exception occurs when the SI_NMI signal is asserted to the processor. SI_NMI is an edge sensitive signal - only one NMI exception will be taken each time it is asserted. An NMI exception occurs only at instruction boundaries, so it does not cause any reset or other hardware initialization. The state of the cache, memory, and other processor states are consistent and all registers are preserved , with the following exceptions:

The BEV, TS, SR, NMI, and ERL fields of the Status register are initialized to a specified state.
The ErrorEPC register is loaded with PC-4 if the state of the processor indicates that it was executing an instruction in the delay slot of a branch. Otherwise, the ErrorEPC register is loaded with PC.
PC is loaded with 0xBFC0_0000.
Cause Register ExcCode Value: None
Additional State Saved: None
Entry Vector Used: Reset (0xBFC0_0000)

Operation:

Status BEV   <- 1
Status TS   <- 0
Status SR   <- 0
Status NMI   <- 1
Status ERL   <- 1
if InstructionInBranchDelaySlot then
ErrorEPC <- PC - 4
else
ErrorEPC <- PC
endif
PC <- 0xBFC0_0000

Machine Check Exception (4Kc core)

A machine check exception occurs when the processor detects an internal inconsistency. The following condition causes a machine check exception;

Cause Register ExcCode Value: MCheck

Additional State Saved: None

Entry Vector Used: General exception vector (offset 0x180)

Interrupt Exception

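External interrupt. This is the only exception that occurs asynchronously. It is asynchronous because, unlike other exceptions, its timing cannot be predicted: there is no way to know which pipeline stage will be active when the interrupt arrives. The MIPS five-stage pipeline is:

IF, RD, ALU, MEM, WB. The interrupt logic of a MIPS processor works as follows: when an interrupt arrives, an instruction that has already completed its MEM stage is guaranteed to finish; otherwise the pipeline discards the work done on that instruction. Apart from NMI, all internal and external hardware interrupts share this one exception vector. The Count/Compare register pair in CP0 also raises a hardware interrupt when the Count value equals the Compare threshold.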

The interrupt exception occurs when one or more of the eight interrupt requests is enabled by the Status register and the interrupt input is asserted. The delay from assertion of an unmasked interrupt to fetch of the first instructions at the exception vector is a minimum of 5 clock cycles. More may be needed if a committed instruction has to complete before the exception can be taken. A SYNC instruction which has already started flushing the cache and write buffers must wait until this is completed before the interrupt exception can be taken.

Cause Register ExcCode Value: Int
Additional State Saved:
Entry Vector Used:

General exception vector (offset 0x180) if the IV bit in the Cause register is 0; interrupt vector (offset 0x200) if the IV bit in the Cause register is 1.

TLB Refill Exception

Instruction Fetch or Data Access (4Kc core)

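TLB miss on load or store: accessing a memory address that has no mapping in the MMU's TLB raises this exception. In an operating system with virtual memory this can trigger paging: the system's exception handler brings the required page into physical memory and updates the corresponding TLB entry.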

During an instruction fetch or data access, a TLB refill exception occurs when no TLB entry in a TLB-based MMU matches a reference to a mapped address space and the EXL bit is 0 in the Status register. Note that this is distinct from the case in which an entry matches but has the valid bit off. In that case, a TLB Invalid exception occurs.

Cause Register ExcCode Value:

TLBL: Reference was a load or an instruction fetch

TLBS: Reference was a store

Additional State Saved:

Entry Vector Used:

TLB refill vector (offset 0x000) if Status EXL = 0 at the time of exception; general exception vector (offset 0x180) if Status EXL = 1 at the time of exception

TLB Invalid Exception — Instruction Fetch or Data Access (4Kc core)

During an instruction fetch or data access, a TLB invalid exception occurs in one of the following cases:

• No TLB entry in a TLB-based MMU matches a reference to a mapped address space; and the EXL bit is 1 in the Status register.

• A TLB entry in a TLB-based MMU matches a reference to a mapped address space, but the matched entry has the valid bit off.

Cause Register ExcCode Value:

TLBL: Reference was a load or an instruction fetch

TLBS: Reference was a store

Additional State Saved:

Entry Vector Used:

General exception vector (offset 0x180)

Bus Error Exception — Instruction Fetch or Data Access

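The usual cause is accessing cached memory before the cache has been initialized. After power-on and before cache initialization, therefore, access only the uncached address space, 0xA0000000-0xBFFFFFFF. By default the power-on entry point 0xBFC00000 lies in this range. (Some MIPS implementations allow the entry point to be changed by external hard-wiring, but to avoid unpredictable problems do not move it outside the uncached segment.)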

A bus error exception occurs when an instruction or data access makes a bus request (due to a cache miss or an uncacheable reference) and that request terminates in an error. The bus error exception can occur on either an instruction fetch or a data access. Bus error exceptions that occur on an instruction fetch have a higher priority than bus error exceptions that occur on a data access.

Bus errors taken on the requested (critical) word of an instruction fetch or data load are precise. Other bus errors, such as stores or non-critical words of a burst read, can be imprecise. These errors are taken when the EB_RBErr or EB_WBErr signals are asserted and may occur on an instruction that was not the source of the offending bus cycle.


Cause Register ExcCode Value:

IBE: Error on an instruction reference

DBE: Error on a data reference

Additional State Saved: None

Entry Vector Used:

General exception vector (offset 0x180)

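Detailed exception handling flowchart

Nested exceptions

Sometimes the system should be able to handle further exceptions or interrupts while already inside an exception or interrupt handler. System software then has to do the following:

On entry to the handler, save Context, EPC, Status, Cause and the other relevant registers to the stack in memory, then set UM = 0 to put the CPU in kernel mode. (Taking an exception does not change UM in Status. The CPU is in kernel mode whenever any one of EXL = 1, ERL = 1 or UM = 0 holds. Kernel mode is required because only kernel mode can access the privileged CP0 resources.) Then clear SR[EXL] so that EPC will be updated again, which is what makes nesting possible. But what if another exception arrives before SR[EXL] has been cleared? (It cannot be an interrupt, only an exception raised by instruction execution.) In that case EPC and the BD bit are not updated and the CPU jumps straight to the exception vector.

Reset, Soft Reset and NMI are always taken unconditionally and set ERL = 1. In other words error handling is always entered unconditionally, and error handling can itself be nested.

While ERL = 1 all interrupts and exceptions are disabled except Reset, Soft Reset and NMI.

SR[IE] is an important bit for handling nested exceptions. One point that deserves attention and is easy to get wrong:

When restoring context, avoid re-entrancy problems. For example, before returning with eret you have to load EPC. Interrupts must be disabled before that, otherwise EPC may be overwritten.

Here is an example of returning from an exception or interrupt: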

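/* Read the current value of SR */
mfc0 t0,C0[SR]
/* Add a delay-slot instruction */
nop
/* Clear SR[IE] to disable interrupts */
li t1,~SR[IE]
and t0,t0,t1
mtc0 t0,C0[SR]
nop
/* EPC can now be restored safely */
ld t1,R_EPC(sp)
mtc0 t1,C0[EPC]
nop
lhu k1, /* restore the old interrupt mask, kept temporarily in k1 */
or t0,t0,k1
/* Set SR[EXL] again; ERET clears it automatically. Make sure you understand why the interrupt routine cleared EXL earlier: without that, nested exceptions cannot be supported. Working out why is left to the reader. Also, before clearing EXL the CPU must first be switched to kernel mode. */
ori t0,t0,SR[EXL]
/* Everything is ready: restore the interrupt mask and set EXL */
mtc0 t0,C0[SR]
nop
ori t0,t0,SR[IE]
/* Set IE */
ori t0,t0,SR[IMASK7 ]
mtc0 t0,C0[SR ]
nop
/* Restore the CPU mode */
ori t0, t0,SR[USERMODE]
mtc0 t0, C0[SR]
eret
/* eret clears EXL. So note: if the handler changed the CPU mode, make sure the original mode is restored after EXL is set again, otherwise the user process will run in kernel mode. */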

Code walkthrough

Exception handling in CFE, FSBL part:

LEAF( do_chip_init )
    move    s0, ra
    li    t0, INITIAL_SR               //#define INITIAL_SR      ((/*CP0_STATUS_SR_MASK |*/ CP0_STATUS_CU1_MASK | CP0_STATUS_BEV_MASK | CP0_STATUS_IE_MASK) & ~( CP0_STATUS_ERL_MASK | CP0_STATUS_EXL_MASK))
    mtc0  t0, CP0_STATUS
    nop
    nop
    mtc0  zero, CP0_CAUSE           # clear software interrupts
    nop
    nop

Here ERL and EXL are both cleared and interrupts are enabled.

#define CP0_STATUS_KSU_MASK _MM_MAKEMASK(2,3)

#define CP0_STATUS_KSU_SHIFT (3)

After reset the CPU is in kernel mode, and CFE stays in kernel mode throughout; the CPU mode never changes.

SSBL part:

void cfe_main(int a,int b)
{
    /*
     * Set up the exception vectors
     */
    cfe_setup_exceptions();
}
void cfe_setup_exceptions(void)
{
    _exc_setvector(XTYPE_TLBFILL,  (void *)   cfe_exception );
    _exc_setvector(XTYPE_XTLBFILL, (void *) cfe_exception);
    _exc_setvector(XTYPE_CACHEERR, (void *) _exc_cache_crash_sim);
    _exc_setvector(XTYPE_EXCEPTION,(void *) cfe_exception);
    _exc_setvector(XTYPE_INTERRUPT,(void *) cfe_exception);
    _exc_setvector(XTYPE_EJTAG,    (void *) cfe_exception);
    exc_handler.catch_exc = 0;
    q_init( &(exc_handler.jmpbuf_stack));
#if (!CFG_BOOTRAM) && (CFG_RUNFROMKSEG0)
    /*
     * Install RAM vectors, and clear the BEV bit in the status
     * register.  Don't do this if we're running from PromICE RAM
     */
    exc_install_ram_vectors();
#endif
}
#define XTYPE_RESET    0
#define XTYPE_TLBFILL    8
#define XTYPE_XTLBFILL    16
#define XTYPE_CACHEERR    24
#define XTYPE_EXCEPTION    32
#define XTYPE_INTERRUPT    40
#define XTYPE_EJTAG    48

        .globl    _exc_vectab
_exc_vectab:    _LONG_    0        # XTYPE_RESET
        _LONG_    0        # XTYPE_TLBFILL  (not used)        
        _LONG_    0        # XTYPE_XTLBFILL
        _LONG_    0        # XTYPE_CACHEERR (not used)
        _LONG_    0        # XTYPE_EXCEPTION
        _LONG_    0        # XTYPE_INTERRUPT
        _LONG_    0        # XTYPE_EJTAG
LEAF(_exc_setvector)

        la    v0,_exc_vectab
        srl    a0,3        /* convert 8-byte index to array index */
        sll    a0,BPWSIZE    /* convert back to index appropriate for word size */
        add    v0,a0
        SR    a1,(v0)
        j    ra

END(_exc_setvector)

void cfe_exception(int code,uint64_t *info)
{
    int idx;
    if(exc_handler.catch_exc == 1) {      // exceptions are allowed to be caught
        /*Deal with exception without restarting CFE.*/
        /*Clear relevant SR bits*/
        _exc_clear_sr_exl();
        _exc_clear_sr_erl();
        /*Reset flag*/
        exc_handler.catch_exc = 0;
        exc_longjmp_handler();       
    }
    // otherwise just print the cause of the exception and related information
    xprintf("**Exception %d: EPC=%08X, Cause=%08X, VAddr=%08X\n",
        code,(uint32_t)info[XCP0_EPC],
        (uint32_t)info[XCP0_CAUSE],(uint32_t)info[XCP0_VADDR]);
    xprintf("RA=%08X, PRID=%08X\n",
        (uint32_t)info[XGR_RA],(uint32_t)info[XCP0_PRID]);
    xprintf("\n");
    for (idx = 0;idx < 32; idx+= 2) {
    xprintf("%2s ($%2d) = %08X %2s ($%2d) = %08X\n",
        regnames+(idx*2),
        idx,(uint32_t)info[XGR_ZERO+idx],
        regnames+((idx+1)*2),
        idx+1,(uint32_t)info[XGR_ZERO+idx+1]);
    }
    xprintf("\n");
    xprintf("\n*** Waiting for system reset ***\n");    // then simply hang
    while(1);
}

The exception vectors are stored in the _exc_vectab table. So who uses this table?

LEAF( _exc_entry )
        .set noreorder
        .set noat
        subu    k1,sp,EXCEPTION_SIZE
        SRL    k1,3
        SLL    k1,3

        SREG    zero,XGR_ZERO(k1)                # save context  #define SREG     sw
        SREG     AT,XGR_AT(k1)

        SREG    v0,XGR_V0(k1)
        SREG    v1,XGR_V1(k1)

        SREG    a0,XGR_A0(k1)
        SREG    a1,XGR_A1(k1)
        SREG    a2,XGR_A2(k1)
        SREG    a3,XGR_A3(k1)

        SREG    t0,XGR_T0(k1)
        SREG    t1,XGR_T1(k1)
        SREG    t2,XGR_T2(k1)
        SREG    t3,XGR_T3(k1)
        SREG    t4,XGR_T4(k1)
        SREG    t5,XGR_T5(k1)
        SREG    t6,XGR_T6(k1)
        SREG    t7,XGR_T7(k1)

        SREG    s0,XGR_S0(k1)
        SREG    s1,XGR_S1(k1)
        SREG    s2,XGR_S2(k1)
        SREG    s3,XGR_S3(k1)
        SREG    s4,XGR_S4(k1)
        SREG    s5,XGR_S5(k1)
        SREG    s6,XGR_S6(k1)
        SREG    s7,XGR_S7(k1)

        SREG    t8,XGR_T8(k1)
        SREG    t9,XGR_T9(k1)

        SREG    gp,XGR_GP(k1)
        SREG    sp,XGR_SP(k1)
        SREG    fp,XGR_FP(k1)
        SREG    ra,XGR_RA(k1)

        mfc0    t0,C0_CAUSE
        mfc0    t1,C0_SR
        mfc0    t2,C0_BADVADDR
        mfc0    t3,C0_EPC
        mfc0    t4,C0_PRID
        mflo    t5
        mfhi    t6    
        SREG    t0,XCP0_CAUSE(k1)
        SREG    t1,XCP0_SR(k1)
        SREG    t2,XCP0_VADDR(k1)
        SREG    t3,XCP0_EPC(k1)
        SREG    t4,XCP0_PRID(k1)
        SREG    t5,XGR_LO(k1)
        SREG    t6,XGR_HI(k1)

#if CFG_EMBEDDED_PIC
        la        gp,PHYS_TO_K0(CFE_LOCORE_GLOBAL_GP)
        LR        gp,0(gp)        # get our GP handle from low memory vector
#else
        la        gp,_gp            # Load up GP, not relocated so it's easy
#endif

        /* Exception occurred in CFE */
        move    a0,k0            # Pass exception type
        move    a1,k1            # Pass frame to exception handler
        la        t0, _exc_vectab         # get base of exception vectors
        srl        k0,3            # convert 8-byte index to array index
        sll        k0,BPWSIZE        # convert back to index appropriate for word size
        addu    t0,k0            # get vector address
        LR        t0,(t0)            # to call handler

        move    sp,k1            # "C" gets fresh stack area
        jalr    t0            # Call exception handler
        nop

        move    k1, sp
        LREG      AT,XGR_AT(k1)

        LREG    t0,XGR_LO(k1)
        LREG    t1,XGR_HI(k1)
        mtlo    t0
        mthi    t1

        LREG    a0,XGR_A0(k1)
        LREG    a1,XGR_A1(k1)
        LREG    a2,XGR_A2(k1)
        LREG    a3,XGR_A3(k1)

        LREG    t0,XGR_T0(k1)
        LREG    t1,XGR_T1(k1)
        LREG    t2,XGR_T2(k1)
        LREG    t3,XGR_T3(k1)
        LREG    t4,XGR_T4(k1)
        LREG    t5,XGR_T5(k1)
        LREG    t6,XGR_T6(k1)
        LREG    t7,XGR_T7(k1)

        LREG    s0,XGR_S0(k1)
        LREG    s1,XGR_S1(k1)
        LREG    s2,XGR_S2(k1)
        LREG    s3,XGR_S3(k1)
        LREG    s4,XGR_S4(k1)
        LREG    s5,XGR_S5(k1)
        LREG    s6,XGR_S6(k1)
        LREG    s7,XGR_S7(k1)

        LREG    t8,XGR_T8(k1)
        LREG    t9,XGR_T9(k1)

        LREG    gp,XGR_GP(k1)
        LREG    sp,XGR_SP(k1)
        LREG    fp,XGR_FP(k1)
        LREG    ra,XGR_RA(k1)

/* do any CP0 cleanup here */

        LREG    v0,XGR_V0(k1)
        LREG    v1,XGR_V1(k1)
    
        ERET

        .set at
        .set reorder

END(_exc_entry)

So how does _exc_entry get called?

static void exc_install_ram_vectors(void)
{
    uint32_t *ptr;
    int idx;
    /* Debug: blow away the vector area so we can see what we did */
    ptr = (uint32_t *) PHYS_TO_K0(0);
    for (idx = 0; idx < 0x1000/sizeof(uint32_t); idx++) *ptr++ = 0;
    /*
     * Set up the vectors.  The cache error handler is set up
     * specially.
     */
    exc_setup_hw_vector(MIPS_RAM_VEC_TLBFILL,  CPUCFG_TLBHANDLER,XTYPE_TLBFILL);
    exc_setup_hw_vector(MIPS_RAM_VEC_XTLBFILL,   _exc_entry ,XTYPE_XTLBFILL);
    exc_setup_hw_vector(MIPS_RAM_VEC_CACHEERR, _exc_entry,XTYPE_CACHEERR);
    exc_setup_hw_vector(MIPS_RAM_VEC_EXCEPTION,_exc_entry,XTYPE_EXCEPTION);
    exc_setup_hw_vector(MIPS_RAM_VEC_INTERRUPT,_exc_entry,XTYPE_INTERRUPT);
    /*
     * Flush the D-cache and invalidate the I-cache so we can start
     * using these vectors.
     */
    cfe_flushcache(CFE_CACHE_FLUSH_D | CFE_CACHE_INVAL_I,0,0);
    /*
     * Write the handle into our low memory space.  If we need to save
     * other stuff down there, this is a good place to do it.
     * This call uses uncached writes - we have not touched the
     * memory in the handlers just yet, so they should not be
     * in our caches.
     */
    _exc_setup_locore((intptr_t) CPUCFG_CERRHANDLER);  // re-install the cache error exception vector
    /*
     * Finally, clear BEV so we'll use the vectors in RAM.
     */
    _setstatus(_getstatus() & ~M_SR_BEV);
    /*
     * XXX There's a hazard here, but we're not going to worry about
     * XXX it.  It is unlikely we'll use the vectors any time soon.
     */
}

BEV is cleared, so apart from Reset, Soft Reset and NMI, all other exception vectors are now in RAM starting at 0x8000_0000.

#define MIPS_ROM_VEC_RESET    0x0000
#define MIPS_ROM_VEC_TLBFILL    0x0200
#define MIPS_ROM_VEC_XTLBFILL    0x0280
#define MIPS_ROM_VEC_CACHEERR    0x0300
#define MIPS_ROM_VEC_EXCEPTION    0x0380
#define MIPS_ROM_VEC_INTERRUPT    0x0400
#define MIPS_ROM_VEC_EJTAG    0x0480

#define MIPS_RAM_VEC_TLBFILL    0x0000                       // TLB refill entry (EXL=0)
#define MIPS_RAM_VEC_XTLBFILL    0x0080                      // XTLB refill entry (64-bit address spaces, EXL=0)
#define MIPS_RAM_VEC_CACHEERR    0x0100                  // cache error entry
#define MIPS_RAM_VEC_EXCEPTION    0x0180                  // general exception entry
#define MIPS_RAM_VEC_INTERRUPT    0x0200                  // interrupt entry
#define MIPS_RAM_VEC_END    0x0300
#define CPUCFG_TLBHANDLER        bcmcore_tlbhandler

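Why is the TLB exception split on EXL? Mainly for efficiency. With EXL = 0 TLB refill exceptions are very frequent, so giving them a dedicated vector greatly improves system performance; the general exception vector has to dispatch on the exception code, which is slower.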

 _exc_setup_locore((intptr_t) CPUCFG_CERRHANDLER);  // re-installs the cache error exception vector; this function does the following:
        li    t0,PHYS_TO_K1(CFE_LOCORE_GLOBAL_CERRH)
        SR    a0,0(t0)
# save the normal handler address in memory at CFE_LOCORE_GLOBAL_CERRH, then
        li    t0,PHYS_TO_K1(MIPS_RAM_VEC_CACHEERR)

        LOADREL(t1,_exc_cerr_htable)
        LR    t2,R_EXC_CERR_TEMPLATE_END(t1)
        LR    t1,R_EXC_CERR_TEMPLATE(t1)
# copy the handler stub referenced by _exc_cerr_htable over the memory that exc_setup_hw_vector(MIPS_RAM_VEC_CACHEERR, _exc_entry,XTYPE_CACHEERR); already installed, i.e. at offset 0x0100
_exc_cerr_htable:
        _LONG_    _exc_cerr_template
        _LONG_    _exc_cerr_template_end

LEAF(_exc_cerr_template)
        LR    k0,CFE_LOCORE_GLOBAL_CERRH(zero)
        jal    k0
         nop
# fetch the saved handler address
/*
* Temporary until all our CPU packages support a cache error handler
*/
#ifndef CPUCFG_CERRHANDLER
#define CPUCFG_CERRHANDLER 0xBFC00000
#else
extern void CPUCFG_CERRHANDLER(void);
#endif

CPUCFG_CERRHANDLER is in fact not defined here, so in CFE a cache error leads straight to a restart.

static void exc_setup_hw_vector(uint32_t vecoffset,
                  void *target,
                  uint32_t k0code)
{
    uint32_t *vec;
    uint32_t new;
    uint32_t lower,upper;
    new = (uint32_t) (intptr_t) target;    /* warning: assumes compatibility addresses! */
    lower = new & 0xffff;
    upper = (new >> 16) & 0xffff;
    if ((lower & 0x8000) != 0) {
    upper++;
    }
    /*
     * Get a KSEG0 version of the vector offset.
     */
    vec = (uint32_t *) PHYS_TO_K0(vecoffset);
    /*
     * Patch in the vector.  Note that we have to flush
     * the L1 Dcache and invalidate the L1 Icache before
     * we can use this.  
     */
    vec[0] = 0x3c1b0000 | upper;   /* lui   k1, HIGH(new)     */
    vec[1] = 0x277b0000 | lower;   /* addiu k1, k1, LOW(new)  */
    vec[2] = 0x03600008;           /* jr    k1                */
    vec[3] = 0x241a0000 | k0code;  /*  li   k0, code          */
}
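The `upper++` adjustment in exc_setup_hw_vector exists because addiu sign-extends its 16-bit immediate: when bit 15 of the low half is set, the addition effectively subtracts 0x10000, so the lui half has to be one higher to compensate. The sketch below checks that arithmetic in isolation. The struct and function names are invented for the example, and unlike the original it masks `upper` to 16 bits after the increment, which is an extra precaution of this sketch.

```c
#include <stdint.h>

/* Hypothetical pair of 16-bit immediates for a lui/addiu sequence. */
struct lui_addiu {
    uint32_t upper;  /* immediate for: lui   k1, upper     */
    uint32_t lower;  /* immediate for: addiu k1, k1, lower */
};

/* Split a 32-bit address the same way exc_setup_hw_vector does. */
struct lui_addiu split_address(uint32_t addr)
{
    struct lui_addiu p;

    p.lower = addr & 0xffffu;
    p.upper = (addr >> 16) & 0xffffu;
    if (p.lower & 0x8000u)
        p.upper = (p.upper + 1) & 0xffffu;  /* compensate for sign extension */
    return p;
}

/* Model what the CPU computes: (upper << 16) + sign_extend16(lower). */
uint32_t rebuild_address(struct lui_addiu p)
{
    uint32_t hi = p.upper << 16;
    uint32_t lo = (p.lower ^ 0x8000u) - 0x8000u;  /* sign-extend 16 -> 32 bits */

    return hi + lo;
}
```

For a target of 0x80019000 the split gives upper = 0x8002 and lower = 0x9000, and 0x80020000 + 0xFFFF9000 wraps back to 0x80019000.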

Exception handling in the kernel

When are the handlers installed?

traps_init (arch/mips/kernel/traps.c, called from start_kernel after setup_arch)
/* Copy the generic exception handler code to it's final destination. */
memcpy((void *)(KSEG0 + 0x80), &except_vec1_generic, 0x80);
memcpy((void *)(KSEG0 + 0x100), &except_vec2_generic, 0x80);
memcpy((void *)(KSEG0 + 0x180), &except_vec3_generic, 0x80);
flush_icache_range(KSEG0 + 0x80, KSEG0 + 0x200);
/*
* Setup default vectors
*/
for (i = 0; i <= 31; i++)
set_except_vector(i, handle_reserved);

What gets installed?

except_vec3_generic (head.S) # (every exception except TLB refill uses this entry): /* General exception vector R4000 version. */
NESTED(except_vec3_r4000, 0, sp)
.set noat
mfc0 k1, CP_CAUSE
andi k1, k1, 0x7c /* extract the exception code from the Cause register */
li k0, 31<<2
beq k1, k0, handle_vced /* handle VCED */
li k0, 14<<2
beq k1, k0, handle_vcei /* handle VCEI */
/* these two exceptions are cache-related: the cache has a problem, so they cannot be handled from this cached location */
la k0, exception_handlers /* address of the exception handler table */
addu k0, k0, k1
lw k0, (k0) /* handler function */
nop
jr k0 /* run the handler */
nop
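The table lookup done by this vector can be modelled in C. The assembly uses `cause & 0x7c` directly as a byte offset into a table of 4-byte pointers; in C the same thing is an array index of `(cause & 0x7c) >> 2`. Everything below is a hypothetical sketch: the type, table and function names are invented and are not the kernel's own.

```c
#include <stdint.h>

/* Hypothetical handler type: receives the raw Cause value. */
typedef int (*handler_fn)(uint32_t cause);

static handler_fn handler_table[32];  /* one slot per 5-bit ExcCode */

/* Default handler for exception codes nobody has claimed. */
static int handle_reserved(uint32_t cause)
{
    (void)cause;
    return -1;
}

/* Example handler that simply reports the exception code it was given. */
int handle_sys(uint32_t cause)
{
    return (int)((cause >> 2) & 0x1f);
}

/* Fill every slot with the default, as traps_init does with handle_reserved. */
void init_handlers(void)
{
    for (int i = 0; i < 32; i++)
        handler_table[i] = handle_reserved;
}

void set_handler(unsigned code, handler_fn fn)
{
    handler_table[code & 0x1f] = fn;
}

/* Equivalent of the andi / addu / lw / jr sequence. */
int dispatch(uint32_t cause)
{
    return handler_table[(cause & 0x7c) >> 2](cause);
}
```

After `init_handlers()` and `set_handler(8, handle_sys)`, a Cause value with ExcCode 8 reaches `handle_sys`, and any other code falls through to the reserved handler.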

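How is that handler table initialized?

In traps_init you will see calls like set_exception_vector(i,handler); they fill in this table. But if you search for those handlers with a tool such as Source Insight you will often come up empty: where are handle_ri, handle_tlbl…? No need to worry, it is just a small trick. Remember the handler code for x86 interrupts? These are generated by a macro in the same way:

entry.S
#define BUILD_HANDLER(exception,handler,clear,verbose) \
.align 5; \
NESTED(handle_##exception, PT_SIZE, sp); \
.set noat; \
SAVE_ALL; /* save context, switch stacks if necessary */ \
__BUILD_clear_##clear(exception); /* disable interrupts? */ \
.set at; \
__BUILD_##verbose(exception); \
jal do_##handler; /* do the work */ \
move a0, sp; \
ret_from_exception; /* return */ \
nop; \
END(handle_##exception) /* generates the handler function */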
BUILD_HANDLER(adel,ade,ade,silent) /* #4 */
BUILD_HANDLER(ades,ade,ade,silent) /* #5 */
BUILD_HANDLER(ibe,ibe,cli,verbose) /* #6 */
BUILD_HANDLER(dbe,dbe,cli,silent) /* #7 */
BUILD_HANDLER(bp,bp,sti,silent) /* #9 */

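Looking closer, some of the macros here are very important, such as SAVE_ALL (include/asm/stackframe.h). Exception handling has to be efficient and correct, and great care is needed here because the hardware does so little.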
