# Switch to Restricted Mode

## Introduction

Morello supports two banks for certain registers, such as the Default Data Capability `DDC`, the Capability Stack Pointer `CSP`, and the Thread ID Register `TPIDR_EL0`. These banks are referred to as Restricted and Executive. Access to these banks is controlled by the `EXECUTIVE` permission bit in `PCC`. When this bit is set, we run in the Executive mode; otherwise we run in the Restricted mode.

| Name                     | Executive bank | Restricted bank |
| ------------------------ | -------------- | --------------- |
| Default Data Capability  | DDC_EL0        | RDDC_EL0        |
| Capability Stack Pointer | CSP_EL0        | RCSP_EL0        |
| Thread ID Register       | CTPIDR_EL0     | RCTPIDR_EL0     |

In the Restricted mode we can access only one of the banks of these registers (the Restricted one), while in the Executive mode both banks can be accessed.

| Accessing via... | Executive  | Restricted |
| ---------------- | ---------- | ---------- |
| DDC              | DDC_EL0    | RDDC_EL0   |
| RDDC_EL0         | RDDC_EL0   | (fault)    |
| CSP              | CSP_EL0    | RCSP_EL0   |
| RCSP_EL0         | RCSP_EL0   | (fault)    |
| CTPIDR_EL0       | TPIDR_EL0  | RTPIDR_EL0 |
| RCTPIDR_EL0      | RTPIDR_EL0 | (fault)    |

This can be used to implement compartmentalisation. The management code would run in the Executive mode and would be able to set up the stack pointer and thread ID register for each compartment, while the isolated code running in such a compartment would only "perceive" the environment set up for it.

The architecture also imposes restrictions on interworking between these two modes. To switch to the Restricted mode we need two things:

- A restricted function pointer (that is, a sentry without the `EXECUTIVE` permission).
- A branch via the special instruction `B(L)RR` (Branch (with Link) to capability Register with possible switch to Restricted). Branching via an ordinary `BLR` instruction would clear the tag in the target `PCC`, which would result in a capability fault on instruction fetch right after branching.

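
The two conditions above can be summarised as a small decision function. The following is only a portable C model for illustration, not Morello code: the names `model_branch`, `cpu_mode_t`, `branch_insn_t` and `branch_result_t` are hypothetical. It encodes the rule stated in this document that `BLRR` faults when executed in the Restricted mode. It also assumes that a plain `BLR` between two restricted functions stays in the Restricted mode, and that `BLRR` to a target with the `EXECUTIVE` permission simply stays in the Executive mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these names exist in the example sources. */
typedef enum { MODE_EXECUTIVE, MODE_RESTRICTED } cpu_mode_t;
typedef enum { INSN_BLR, INSN_BLRR } branch_insn_t;
typedef enum { RESULT_EXECUTIVE, RESULT_RESTRICTED, RESULT_FAULT } branch_result_t;

/*
 * Models the outcome of an indirect branch given the current mode, the
 * instruction used, and whether the target sentry has EXECUTIVE permission.
 */
branch_result_t model_branch(cpu_mode_t current, branch_insn_t insn,
                             bool target_has_executive)
{
    assert(current == MODE_EXECUTIVE || current == MODE_RESTRICTED);

    /* BLRR is only usable from Executive code. */
    if (insn == INSN_BLRR && current == MODE_RESTRICTED)
        return RESULT_FAULT;

    /* A target with EXECUTIVE permission runs in the Executive mode
     * (assumption for the BLRR case: no switch takes place). */
    if (target_has_executive)
        return RESULT_EXECUTIVE;

    /* Restricted target reached from Executive code via plain BLR:
     * the tag of the target PCC is cleared, so the next fetch faults. */
    if (current == MODE_EXECUTIVE && insn == INSN_BLR)
        return RESULT_FAULT;

    /* Executive + BLRR, or (assumed) Restricted + BLR. */
    return RESULT_RESTRICTED;
}
```

For instance, `model_branch(MODE_EXECUTIVE, INSN_BLRR, false)` yields `RESULT_RESTRICTED`, while the same call with `INSN_BLR` yields `RESULT_FAULT`.
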
The same rule applies to returning from a function running in the Executive mode via a restricted `CLR`: instead of `RET`, the special instruction `RETR` (Return with possible switch to Restricted) should be used. Both `BLRR` and `RETR` will fault if executed in the Restricted mode.

Returning from the Restricted mode (or calling an executive function) only requires a sentry with the `EXECUTIVE` permission. Ordinary `BLR` and `RET` instructions can be used.

Switching to the Restricted mode without proper setup may not work or may result in a security problem, which is why Morello requires the use of the new instructions for this. On the other hand, being able to request an operation that runs in the Executive mode from code that runs in the Restricted mode is useful, because this is how we can ask our runtime to switch to another compartment.

## Design Overview

In this example we use our own tiny runtime library to keep things very simple. Execution starts in the Executive mode when the `_start` function is called (see [src/start.S](src/start.S)). We must run the usual initialisation to set up all capabilities before we can proceed. Then we prepare the switch to the Restricted mode and call the `main` function. This is where we enter the application code:

    <---------------- E ----------------> <------ R ------> <---- E ---->
    _start --> _init_compartments --> _start --> main --> _start --> exit

We execute `main` in something called a "root compartment". It is like any other compartment, but it is set up automatically when the app starts. The application can remain in the root compartment for the duration of the process, or it can instantiate more compartments and execute some code in them.

When switching to another compartment the following things happen:

- Callee-saved registers are saved to the caller's stack.
- A compartment descriptor is loaded from the compartment's private data.
- Executive switch function is called (switching to Executive mode).
- Caller's `TPIDR_EL0` and `CSP` are stored on the executive stack (not accessible from other compartments).
- Callee's `TPIDR_EL0` and `CSP` capabilities are loaded and set up from the compartment descriptor.
- Target function's arguments are loaded into registers.
- All unused registers are sanitised.
- Target function is called in the Restricted mode (unless the sentry has the `EXECUTIVE` permission, see below).
- The executive link capability is RB-sealed and is saved to `CLR`.

After returning from the target function we do these operations in reverse.

All stack allocations are placed in private mappings, and only capabilities that are explicitly provided by the caller compartment to the callee compartment can be used to exchange data between compartments.

## Code Examples

### Restricting Global Functions

Let's consider the following example in the [restricted.c](restricted.c) file. The first part of it shows one of the consequences of the rules described above. When using an indirect call to a global function, we may accidentally switch to Executive mode. This depends on how the capability for this global function was set up by the runtime initialisation (see `init` function). Such a function will usually inherit executable capabilities from the `AT_CHERI_EXEC_RX_CAP` root capability provided by the kernel, and this capability will have both `EXECUTIVE` and `SYSTEM` permissions by default.

This is why we will need to change the initialisation procedure and remove the `EXECUTIVE` and `SYSTEM` permissions from all function pointers that are supposed to be used by the restricted code. However, we may also want to keep executive copies of some of the global functions to use them from executive code (e.g. for setting things up or managing compartments). This example is rather simple, and executive copies of the standard library functions are not needed as we can do everything via direct calls.

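
The effect of stripping these permissions during initialisation can be illustrated with a small model. This is a sketch in portable C rather than the actual `init` code: the `model_fnptr_t` type, the `MODEL_PERM_*` bit values and both functions are hypothetical stand-ins, and the bit positions are placeholders, not the architectural encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical permission bits; NOT the architectural encodings. */
#define MODEL_PERM_EXECUTE   (1u << 0)
#define MODEL_PERM_EXECUTIVE (1u << 1)
#define MODEL_PERM_SYSTEM    (1u << 2)

/* Hypothetical stand-in for a function capability. */
typedef struct {
    uintptr_t address;  /* where the function lives */
    uint32_t perms;     /* permission bits carried by the capability */
} model_fnptr_t;

/* True when a call through fn cannot switch to the Executive mode. */
bool model_is_restricted(model_fnptr_t fn)
{
    return (fn.perms & (MODEL_PERM_EXECUTIVE | MODEL_PERM_SYSTEM)) == 0;
}

/*
 * Returns a copy of fn without the EXECUTIVE and SYSTEM permissions.
 * Permissions are only ever removed, never added, and the input is left
 * untouched so that an executive copy can be kept for executive code.
 */
model_fnptr_t model_restrict(model_fnptr_t fn)
{
    model_fnptr_t restricted = fn;
    restricted.perms &= ~(MODEL_PERM_EXECUTIVE | MODEL_PERM_SYSTEM);
    assert(model_is_restricted(restricted));
    return restricted;
}
```

Because `model_restrict` returns a new value, the runtime in this model can hand the restricted copy to compartment code while keeping the original for its own use, which mirrors the "executive copies" idea mentioned above.
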
### Creating a Compartment

We create a new instance of a compartment by calling (see [include/rcmpt.h](include/rcmpt.h)):

    switch_t *cmpt_fun = create_compartment(target_fun, 2 /* pages */);

The returned capability is actually a sentry. It can be used in the same way as the wrapped target function `target_fun`. This sentry points to code generated on the fly from the `_thunk` function (see `_thunk` in [src/start.S](src/start.S)). This code uses a near-relative load to access the compartment descriptor, which holds the following information:

    typedef struct {
        void *exec;     // executive pointer to switch trampoline (sentry)
        void *target;   // target function (sentry)
        void *tp;       // thread pointer (not used currently)
        void *sp;       // stack pointer
        void *cid;      // CID capability for the compartment
    } thunk_data_t;

The thunk code branches to the capability in the `exec` field (it points to the `_switch` function and has the `EXECUTIVE` permission). The switch code runs in Executive mode and therefore can do the necessary setup before branching to the restricted target function.

Note that if the target function pointer is executive at this point, the switch to the compartment will not happen and the target function will run in Executive mode. In theory, the code that creates compartment instances via the `create_compartment` function will not have access to any executive function pointers. If it does, it can switch to the Executive mode just by doing an indirect call via such a pointer, in which case the compartment isolation will be breached. This is another reason why we should initialise any global function pointers with care. The check that the target function is restricted is not currently implemented.

Invocation of a compartment is simple: instead of

    res = target_fun(arg0, arg1, arg2);

just use

    res = cmpt_fun(arg0, arg1, arg2);

Note that variadic targets are not supported in the current implementation.

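
To make the thunk and descriptor mechanism described in this section more concrete, here is a simplified model in portable C. It is only a sketch under stated assumptions: real thunks are generated code that switches banked registers, whereas this model uses ordinary function pointers and a global variable standing in for the Restricted `CSP` and thread pointer. All `model_*` names, `sample_target` and `observed_sp` are hypothetical, and the `cid` field is left out.

```c
#include <assert.h>
#include <stddef.h>

struct model_thunk_data;
typedef long (*model_target_fn)(long, long, long);
typedef long (*model_switch_fn)(const struct model_thunk_data *, long, long, long);

/* Mirrors the fields of thunk_data_t with concrete types (cid omitted). */
typedef struct model_thunk_data {
    model_switch_fn exec;    /* switch trampoline */
    model_target_fn target;  /* target function */
    void *tp;                /* thread pointer */
    void *sp;                /* stack pointer */
} model_thunk_data_t;

/* Stand-in for the Restricted bank of CSP and the thread ID register. */
typedef struct { void *csp; void *tp; } model_state_t;
model_state_t current_state;

/* Records which stack pointer the sample target observed. */
void *observed_sp;

/* Models _switch: save caller state, install callee state, call, restore. */
long model_switch(const model_thunk_data_t *d, long a0, long a1, long a2)
{
    assert(d != NULL && d->target != NULL);
    model_state_t saved = current_state;  /* lives on the "executive stack" */
    current_state.csp = d->sp;
    current_state.tp = d->tp;
    long res = d->target(a0, a1, a2);
    current_state = saved;                /* same operations in reverse */
    return res;
}

/* Models the thunk: load the descriptor and branch to its exec field. */
long model_invoke(const model_thunk_data_t *d, long a0, long a1, long a2)
{
    assert(d != NULL && d->exec != NULL);
    return d->exec(d, a0, a1, a2);
}

/* Sample target: sums its arguments and notes the stack it runs on. */
long sample_target(long a0, long a1, long a2)
{
    observed_sp = current_state.csp;
    return a0 + a1 + a2;
}
```

In this model, calling `model_invoke` with a descriptor whose `sp` points to a separate buffer makes `sample_target` observe that buffer as its stack, while the caller's state is back in place once the call returns.
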
### Nested Compartment Calls

In the same way as we call a compartment from the root compartment, we can also do nested calls and invoke other compartments. Every switch to the next compartment will use Executive mode and a separate stack frame on the executive stack to retain the caller's data while running the callee.

## Private Data of the Compartment Manager

Note that the compartment manager holds a global object containing data that could be used to escape from a compartment. In this implementation it remains unprotected. To solve this problem, we can use the BRS-sealed capability pair (see the `privdata` example in the `compartments` folder). This suggests that in practice the switch-to-restricted compartments might co-exist with the branch-to-sealed-pair compartments. However, we could also use a private memory mapping to store this global object and keep the capability pointing to it on the executive stack, which would be inaccessible from any compartment (including the root compartment).