
		Xvisor Design Document

This is document gives an overview of the xvisor design. It gives very 
important insight required for writing architecture support for Xvisor.


	 	Chapter 1: Modeling Virtual Machines

A virtual machine (VM) is a software emulation/simulation of a physical machine
(i.e. a computer system) that can executes programs or operating systems.

Virtual machines are separated into two major categories (based on their use): 
  * System Virtual Machine: A system virtual machine provides a complete system
    platform which supports the execution of a complete operating system (OS). 
  * Process Virtual Machine: A process virtual machine is designed to run a 
    single program, which means that it supports a single process. 

An essential characteristic of a virtual machine is that the software running 
inside is limited to the resources and abstractions provided by the virtual 
machine—it cannot break out of its virtual world.

(Citation: Refer Wikipedia page on "Virtual Machine" and "Hypervisor")

Xvisor is a hardware assisted system virtualization software (i.e. implements
system virtual machine with hardware acceleration) running directly on a host
machine (i.e. physical machine/hardware). In short, we can say that Xvisor 
is a Native (or Type-1) Hypervisor (or Virtual Machine Monitor).

We refer system virtual machine instances as "Guest" instances and virtual CPUs
of system virtual machines as "VCPU" instances in Xvisor. Also, VCPU belonging
to a Guest is refered as "Normal VCPU" and VCPU not belonging to any guest is 
referred as "Orphan VCPU". Xvisor creates Orphan VCPUs for various background 
processing and running managment daemons. 

Any modern CPU architecure has atleast two priviledge modes: User Mode, and
Supervisor Mode. The User Mode has lowest priviledge and Supervisor Mode has
highest priviledge. Xvisor runs Normal VCPUs in User Mode and Orphan VCPUs in
Supervisor Mode (Note: Architecture specific code has to treat Normal and 
Orphan VCPUs differently). 

The figure 1 below gives a clear picture of the System Virtual Machine Model 
implemented by Xvisor.

+--------------------------+    +--------------------------+    +------------+
|          Guest_0         |    |          Guest_N         |    |            |
+--------------------------+    +--------------------------+    | +--------+ |
| +--------+    +--------+ |    | +--------+    +--------+ |    | |        | |
| |        |    |        | |    | |        |    |        | |    | | Orphan | |
| | VCPU_0 | .. | VCPU_M | |    | | VCPU_0 | .. | VCPU_K | |    | | VCPU_R | |
| |        |    |        | |    | |        |    |        | |    | |        | |
| +--------+    +--------+ | .. | +--------+    +--------+ |    | +--------+ |
+--------------------------+    +--------------------------+    |      .     |
|       Address Space      |    |       Address Space      |    |      .     |
|+--------+ +-----+ +-----+|    |+--------+ +-----+ +-----+|    |      .     |
|| Memory | | PIC | | PIT ||    || Memory | | PIC | | PIT ||    | +--------+ |
|+--------+ +-----+ +-----+|    |+--------+ +-----+ +-----+|    | |        | |
|+-----+ +------+ +-----+  |    |+-----+ +------+ +-----+  |    | | Orphan | |
|| ROM | | UART | | LCD |  |    || ROM | | UART | | LCD |  |    | | VCPU_0 | |
|+-----+ +------+ +-----+  |    |+-----+ +------+ +-----+  |    | |        | |
+--------------------------+    +--------------------------+    | +--------+ |
+---------------------------------------------------------------+            |
|                                                                            |
|                 eXtensible Versatile hypervISOR (Xvisor)                   |
|                                                                            |
+----------------------------------------------------------------------------+
+----------------------------------------------------------------------------+
|                                                                            |
|            Host Machine (Host CPU + Host Memory + Host Devices)            |
|                                                                            |
+----------------------------------------------------------------------------+


	 	Chapter 2: Hypervisor Configuration

In the “early days”, configuring OS for a system was simple because it had a 
single central-processing-unit (CPU) core executing an operating system (OS). 
Yet in today’s world powerful single-core and multicore processors can actually
be configured in many different configurations. A multicore processor can be 
managed by a single symmetrical- multiprocessing (SMP) operating system, which
manages all of the cores. Alternatively, each core can be given to a separate 
OS in an asymmetrical-multiprocessing (AMP) configuration. SMP and AMP both 
have their challenges and advantages. For example, SMP doesn’t always scale 
well, depending on workload. For its part, AMP can be difficult to configure 
with regard to which OS gets access to which device. Operating systems assume 
that they have full control over the hardware devices that they detect. Often, 
this creates conflicts in the AMP case. Xvisor provides technology to partition 
or virtualize processing cores, memory, and devices between the multiple OSs 
that are used to build a system.

Xvisor maintians its configuration in the form of tree data structure called 
"device tree" to ease the task of configuring Xvisor running on a single-core 
or multi-core system. It is highly inspired from Device Tree Script (DTS) used 
by of_platform of Linux kernel. As a result, system designers can utilize a 
wide variety of configurations—including mixes of AMP, SMP, and core 
virtualization—to build their next-generation systems. 

In Linux, if an architecture (e.g. PowerPC) is using of_platform then at 
booting time Linux kernel will expect a DTB file (Flattened device tree file) 
from the boot loader. The DTB file is binary file generated by compiling a DTS 
(Device tree script) using DTC (Device tree compiler). An of_platform enabled 
Linux kernel only probes those drivers which are compatible or matching to the 
devices mentioned in the device tree populated from DTB file. Unlike Linux
of_platform using DTB file is not mandatory for Xvisor. The Xvisor architecture
specific code (or board specific code more precisely) can populate the device
tree from various sources such as DTB or ACPI tables. In simpler words, device 
tree in Xvisor is a data structure used for managing hypervisor configuration.

(Note: For more information on device tree syntax used by PowerPC Linux Kernel 
refer https://www.power.org/resources/downloads/Power_ePAPR_APPROVED_v1.0.pdf)

Although Xvisor device tree just a data structure, following constraints must 
be ensured while updating/populating Xvisor device tree:
  * Node Name: It must have characters from one of the following only,
      -> digit: [0-9]
      -> lowercase letter: [a-z]
      -> upercase letter: [A-Z]
      -> underscore: _
      -> dash: -
  * Attribute Name: It must have characters from one of the following only,
      -> digit: [0-9]
      -> lowercase letter: [a-z]
      -> upercase letter: [A-Z]
      -> underscore: _
      -> dash: -
      -> hash: #
  * Attribute String Value: A string attribute value must end with NULL 
    character (i.e. '\0' or character value 0). For a string list, each string 
    must be seperated by excatly one NULL character.
  * Attribute 32-bit unsigned Value: A 32-bit integer value must be represented
    in big-endian format or little-endian format based on the endianess of host
    CPU architecture. 
  * Attribute 64-bit unsigned Value: A 64-bit integer value must be represented
    in big-endian format or little-endian format based on the endianess of host
    CPU architecture. 
(Note: Architecture specific code must ensure that the above constraints are 
satisfied while populating device tree)
(Note: For standard attributes used by Xvisor refer source code.)

The figure 2 below shows the device tree representation of the hypervisor setup
shown in figure 1 of chapter 1.

  (Root)
+--------+
|        |
+--------+
    |
    |          (Host CPUs)
    |          +--------+
    |----------|  cpus  |
    |          +--------+
    |              |
    |              |          +--------+
    |              +----------|  cpu0  |
    |              |          +--------+
    |              |              .
    |              |              .
    |              |              .
    |              |          +--------+
    |              +----------|  cpuL  |
    |                         +--------+
    |
    |        (Host Hardware)
    |          +--------+
    +----------|  ....  |
    |          +--------+
    |
    |     (General Configuration)
    |          +--------+
    +----------|  vmm   |
    |          +--------+
    |
    |     (Guest Templates)
    |          +-----------+
    +----------| templates |
    |          +-----------+
    |
    |     (Guests Instances)
    |          +--------+
    +----------| guests |
               +--------+
                   |
                   |           (Guest)
                   |          +--------+
                   +----------| guest0 |
                   |          +--------+
                   |              |        (Guest VCPUs)
                   |              |          +--------+
                   |              |----------| vcpus  |
                   |              |          +--------+
                   |              |              |
                   |              |              |            (VCPU)
                   |              |              |          +--------+
                   |              |              +----------| vcpu0  |
                   |              |              |          +--------+
                   |              |              |              .
                   |              |              |              .
                   |              |              |              .
                   |              |              |            (VCPU)
                   |              |              |          +--------+
                   |              |              +----------| vcpuM  |
                   |              |                         +--------+
                   |              |
                   |              |     (Guest Address Space)
                   |              |          +--------+
                   |              +----------| aspace |
                   |                         +--------+
                   |                             |
                   |                             |        (Guest Region)
                   |                             |          ----------
                   |                             +----------| Memory |
                   |                             |          ----------
                   |                             |               
                   |                             |        (Guest Region)
                   |                             |          +--------+
                   |                             +----------|  PIC   |
                   |                             |          +--------+
                   |                             |               
                   |                             |        (Guest Region)
                   |                             |          +--------+
                   |                             +----------|  PIT   |
                   |                             |          +--------+
                   |                             |               
                   |                             |        (Guest Region)
                   |                             |          +--------+
                   |                             +----------|  UART  |
                   |                             |          +--------+
                   |                             |               
                   |                             |        (Guest Region)
                   |                             |          +--------+
                   |                             +----------|  LCD   |
                   |                             |          +--------+
                   |                             |               
                   |                             |        (Guest Region)
                   |                             |          ----------
                   |                             +----------|  ROM   |
                   |                                        +--------+
                   |
                   |           (Guest)
                   |          ----------
                   +----------| guestN |
                              ----------
                                  |        (Guest VCPUs)
                                  |          +--------+
                                  +----------| vcpus  |
                                  |          +--------+
                                  |              |
                                  |              |            (VCPU)
                                  |              |          +--------+
                                  |              +----------| vcpu0  |
                                  |              |          +--------+
                                  |              |              .
                                  |              |              .
                                  |              |              .
                                  |              |            (VCPU)
                                  |              |          +--------+
                                  |              +----------| vcpuK  |
                                  |                         +--------+
                                  |
                                  |     (Guest Address Space)
                                  |          +--------+
                                  +----------| aspace |
                                             +--------+
                                                 |
                                                 |        (Guest Region)
                                                 |          +--------+
                                                 +----------| Memory |
                                                 |          +--------+
                                                 |               
                                                 |        (Guest Region)
                                                 |          +--------+
                                                 +----------|  PIC   |
                                                 |          +--------+
                                                 |               
                                                 |        (Guest Region)
                                                 |          +--------+
                                                 +----------|  PIT   |
                                                 |          +--------+
                                                 |               
                                                 |        (Guest Region)
                                                 |          ----------
                                                 +----------|  UART  |
                                                 |          +--------+
                                                 |               
                                                 |        (Guest Region)
                                                 |          +--------+
                                                 +----------|  LCD   |
                                                 |          +--------+
                                                 |               
                                                 |        (Guest Region)
                                                 |          +--------+
                                                 +----------|  ROM   |
                                                            +--------+

By default, Xvisor will always support configuring device tree using DTS.
It also includes a DTC compiler taken from Linux kernel source code and a 
light-weight DTB parsing library (libfdt) which can be used by architecture 
specific code (or board specific code) to populate device tree for Xvisor.


	 	Chapter 3: Hypervisor Timer

Like any OS, a hypervisor also needs to keep track of passing time using a 
timekeeping subsystem. We refer timekeeping subsystem of Xvisor as hypervisor 
timer. A timekeeping subsystem of an OS does two crucial tasks, which are:
  1. Keep track of passing time: The legacy way of achiving this is to count
     periodic interrupts, but this method is very imprecise and with low 
     resolution. The more precise and low overhead way of achiving this to 
     use a clocksource device (i.e. free-running cycle accurate hardware 
     counter) as reference for time elapsed.
  2. Schedule events in future: To achive this an OS will keep per-CPU list 
     of events sorted ascending based upon their expiry time. The OS will use
     per-CPU clockevent device (i.e. PIT) to schedule events one-by-one from 
     the sorted events with earliest expiring event first. All OSes will 
     differ only in how they maintain per-CPU sorted event list.

Unlike any OS, Timekeeping is especially problematic for hypervisors because 
a number of challenges. The most obvious problem is that time is now shared 
between the host and, potentially many guest instances. The guest OS never
gets 100% of CPU execution time, despite the fact that it may very well make 
that assumption. It may expect it to remain true to very exacting bounds when 
interrupt sources are disabled, but in reality only its virtual interrupt 
sources are disabled, and the machine may still be preempted at any time. This
causes problems as the passage of real time, the injection of guest interrupts 
and the associated clock sources are no longer completely synchronized with 
real time.

One of the most immediate problems that occurs with legacy guest OSes is that
the system timekeeping routines are often designed to keep track of time by 
counting periodic interrupts. These interrupts may come from the PIT or the 
RTC, but the problem is the same: the host virtualization engine may not be 
able to deliver interrupts at proper rate, and so guest time may fall behind. 
This is especially problematic if a high interrupt rate (such as 1000 HZ) is 
selected which is unfortunately the default for many Linux guests.

There are three approaches to solving this problem:
  1. It may be possible to simply ignore it for guests which have a separate 
     time source for tracking 'wall clock' or 'real time' may not need any 
     adjustment of their interrupts to maintain proper time.
  2. If this is not sufficient, it may be necessary to inject additional 
     interrupts into the guest in order to increase the effective interrupt 
     rate. This approach leads to complications in extreme conditions, where 
     host load or guest lag is too much to compensate for.
  3. The guest may need to become aware of lost ticks and compensate for them 
     internally. Although promising in theory, the implementation of this 
     policy in Linux has been extremely error prone, and a number of buggy 
     variants of lost tick compensation are distributed across commonly used 
     Linux systems.

From the above it is clear that a hypervisor will have to keep track of time 
elapsed with low overhead, high precision and high resolution (i.e. we cannot
count periodic interrupts for tracking elapsed time). In simpler words, the 
timekeeping in hypervisor has to be tickless and high resolution. Further the
PIT emulators in hypervisor may have to keep backlogs of pending periodic
interrupts.

The hypervisor timer subsystem of Xvisor is highly inspired from Linux hrtimer 
subsystem and is completely tickless. It provides the following features:
  1. 64-bit Timestamp: The timestamp represents nanoseconds elapsed since
     Xvisor was booted. (i.e. uptime of Xvisor in-terms of nanoseconds)
  2. Timer events: We can create or destroy timer event with associated 
     expirey time in nanoseconds and an expiry call back handler. The timer 
     events are one shot events (i.e. they are stopped automatically when
     they expire) and to have periodic timer events we will have to manually 
     re-start the timer event from its expiry call back handler.

The hypervisor timer requires architecture specific code to provide one global
clocksource device and one clockevent device for each host CPU to provide 
above mentioned features.


	 	Chapter 4: Hypervisor Manager

The VCPU & Guest instances in Xvisor are created & managed by the Hypervisor 
Manager. It also provides routines for triggering VCPU state changes and it 
notifies the hypervisor scheduler whenever a VCPU state changes.

Just like any OS, a VCPU instance in Xvisor has an architecture dependent part
and an architecture independent part. 

The architecture dependent part of VCPU context consist of:
  1. Arch Registers: Registers which are updated by processor in user mode 
     (or unprivileged mode) only. This registers are usually general purpose 
     registers and status flags which are automatically updated by processor
     (e.g. comparision flags, overflow flag, zero flag, etc). Both Normal and
     Orphan VCPUs require their own copy or arch registers. We refer arch 
     registers of a VCPU as "arch_regs_t *regs" member of our VCPU structure.
  2. Arch Private: Registers which are updated by processor in supervisor 
     mode (or privileged mode) only. Whenever Normal VCPU tries to read/write 
     such register, we get an exception and we can return/update its virtual 
     value. In most cases there are also some additional data structures
     (like MMU context, shadow TLB, shadow page table, ... etc) in arch 
     private. Orphan VCPUs usually dont require arch private, only Normal 
     VCPUs require them. We refer arch private of a VCPU as "void *arch_priv" 
     member of our VCPU structure.

The architecture independent part of VCPU context consist of:
  1.  ID: Globally unique identification number
  2.  SUBID: Identification number unique within parent Guest. (Only for 
      Normal VCPUs)
  3.  Name: The name given for this VCPU. (Only for Orphan VCPUs)
  4.  Device Tree Node: Pointer to VCPU device tree node. (Only for
      Normal VCPUs)
  5.  Is Normal: Flag showing whether this VCPU is Normal or Orphan.
  6.  Guest: Pointer to parent Guest. (Only for Normal VCPUs)
  7.  State: Current VCPU state. (Explained below.)
  8.  Reset Count: Number of times the VCPU has been resetted.
  9.  Start PC: Starting value of Program Counter.
  10. Start SP: Starting value of Stack Pointer. (Only for Orphan VCPUs)
  11. Virtual IRQ Info.: Management information for VCPU Virtual interrupts.
  12. Scheduling Info.: Managment information required by scheduler 
      (e.g. priority, prempt_count, time_slice, etc)
  13. Waitqueue Info.: Information required by waitqueues.
  14. Device Emulation Context: Pointer to private information required by
      device emulation framework per VCPU.

A VCPU can be in excatly one state at any give instance of time. Below is a 
brief description of all possible states:
  * UNKNOWN: VCPU does not belong to any Guest and is not Orphan VCPU. To 
    enforce lower memory foot print, we pre-allocate memory based on maximum 
    number of VCPUs and put them in this state.
  * RESET: VCPU is initialized and is waiting for someone to kick it to READY
    state. To create a new VCPU, the VCPU scheduler picks up a VCPU in UNKNOWN 
    state from pre-allocated VCPUs and intialize it. After initialization the
    newly created VCPU is put in RESET state.
  * READY: VCPU is ready to run on hardware.
  * RUNNING: VCPU is currently running on hardware.
  * PAUSED: VCPU has been stopped and can resume later. A VCPU is set in this 
    state (usually by architecture specific code) when it detects that the VCPU
    is idle and can be scheduled out.
  * HALTED: VCPU has been stopped and cannot resume. A VCPU is set in this 
    state (usually by architecture specific code) when some errorneous access
    is done by that VCPU.

A VCPU state change can occur from various locations such as architecture 
specific code, some hypervisor thread, scheduler, some emulated device, etc. 
Its not possible to have an exhaustive list of all possible scenarios that 
would require a VCPU state change, but the VCPU state changes have to strictly 
follow a finite-state machine which is ensured by the hypervisor manager. 

The figure 3 shows finite-state machine for VCPU state changes.

                                           +---------+
                               [Reset]     |         |      [Halt]     
                           +---------------|  HALTED |<-----------------+
                           |               |         |                  |
                           |               +---------+                  |
                           |                    A                       |
                           |                    | [Halt]                |
                           V                    |                       |
+---------+ [Create]  +---------+  [Kick]  +---------+ [Scheduler] +---------+
|         |---------->|         |--------->|         |------------>|         |
| UNKNOWN |           |  RESET  |          |  READY  |             | RUNNING |
|         |<----------|         |<---------|         |<------------|         |
+---------+ [Destroy] +---------+  [Reset] +---------+ [Scheduler] +---------+
                        A     A              A     |                 |     |
                        |     |     [Resume] |     | [Pause]         |     |
                        |     |              |     V                 |     |
                        |     |            +---------+               |     |
                        |     |            |         |               |     |
                        |     +------------|  PAUSED |<--------------+     |
                        |        [Reset]   |         |   [Pause]           |
                        |                  +---------+                     |
                        |                                                  |
                        +--------------------------------------------------+
                                             [Reset]

The number of virtual interrupts per VCPU and their priority are provided by
architecture specific code. The assertion/deassertion of any virtual interrupt 
is triggered from architecture independent code. Also, A VCPU can wait/pause 
till the next assertion of a virtual interrupt.

A Guest instance consist of the following:
  1.  ID: Globally unique identification number.
  2.  Device Tree Node: Pointer to Guest device tree node.
  3.  VCPU Count: Number of VCPU intances belonging to this Guest.
  4.  VCPU List: List of VCPU instances belonging to this Guest.
  5.  Guest Address Space Info: Information required for managing Guest
      physical address space.
  6.  Arch Private: Architecture dependent context of this Guest.

A Guest Address Space is also architecture independent abstraction. It consist 
of the following: 
  1.  Device Tree Node: Pointer to Guest Address Space device tree node
  2.  Guest: Pointer to Guest to which this Guest Address Space belongs.
  3.  Region List: A set of "Guest Regions" 
  4.  Device Emulation Context: Pointer to private information required by
      device emulation framework per Guest Address Space.

Each Guest Region has a unique Guest Physical Address (i.e. Physical address at
which region is accessible to Guest VCPUs) and Physical Size (i.e. Size of 
Guest Region). Further a Guest Region can be one of the three forms:
  * Real Guest Region: A Real Guest Region gives direct access to a Host 
    Machine Device/Memory (e.g. RAM, UART, etc). This type of regions directly
    map guest physical adress to Host Physical Address (i.e. Physical address 
    in Host Machine).
  * Virtual Guest Region: A Virtual Guest Region gives access to an emulated
    device (e.g. emulated PIC, emulated Timer, etc). This type of region is
    typically linked with an emulated device. The architecture specific code 
    is responsible for redirecting virtual guest region read/write access to
    the Xvisor device emulation framework. 
  * Aliased Guest Region: An Aliased Guest Region gives access to another Guest
    Region at an alternate Guest Physical Address. 


	 	Chapter 5: Hypervisor Scheduler

The hypervisor scheduler of Xvisor is generic and pluggable with respect to the 
scheduling strategy (or scheduling algorithm). It updates per-CPU ready queues
whenever it gets notifications from hypervisor manager about VCPU state change.
The hypervisor scheduler uses per-CPU hypervisor timer event to allocate time 
slice for a VCPU. When a scheduler timer event expires for a CPU, the scheduler 
will find next VCPU using some scheduling strategy (or algorithm) and configure 
the scheduler timer event for next VCPU.

For Xvisor a Normal VCPU is a black box (i.e. any thing could be running on the
VCPU) and exception or interrupt is the only way to get back control. Whenever
we are executing Xvisor code we could be in any one of following contexts:
  1. IRQ Context: When serving an interrupt generated from some external device
     of host machine.
  2. Normal Context: When emulating some functionality or instruction or IO on 
     behalf of Normal VCPU in Xvisor.
  3. Orphan Context: When running some part of Xvisor code as Orphan VCPU or
     Thread (Note: Threads are described later.)

The scheduler keeps track of the current execution context with the help from 
architecture specific exception or interrupt handlers. 

The expected high-level steps involved in architecture specific VCPU context
switching (i.e. arch_vcpu_switch()) are as follows:
  1. Save arch registers (or arch_regs_t) from stack (saved by architecture 
     specific exception or interrupt handler) to current VCPU arch registers
     (or arch_regs_t).
  2. Restore arch register (or arch_regs_t) of next VCPU on stack (will be 
     restored when returning from exception or interrupt handler C code).
  3. Switch context of architecture specific CPU resources such as MMU, 
     Floating point subsystem, etc.

The possible scenarios in which a VCPU context switch is invoked by scheduler
are as follows:
  1. When time slice alloted to current VCPU expires we invoke VCPU context 
     switch. We call this situation as VCPU premption.
  2. If a Normal VCPU misbehaves (i.e. does invalid register/memory access) 
     then architecture specific code can detect such situation and halt/pause 
     the responsible Normal VCPU using APIs from hypervisor manager.
  3. An Orphan VCPU (or Thread) chooses to voluntarily pause (i.e. sleep).
  4. An Orphan VCPU (or Thread) chooses to voluntarily yield its time slice.
  5. The VCPU state can also be changed from some other VCPU using hypervisor
     manager APIs.

We can choose between different scheduling strategies (or algorithms) from 
Xvisor menuconfig options. Any scheduling strategy (or algorithm) will get 
following scheduling information per-VCPU:
  1. Priority: Priority of the VCPU. Higher the value higher the priority.
  2. Time Slice: Minimum amount of time in nano-seconds that a VCPU must get
     once it is scheduled.
  3. Prempt Count: Number of locks held by this VCPU. If prempt count of 
     current VCPU is greater than zero then we cannot prempt.
  4. Private Pointer: In addition to above scheduling strategy (or alogrithm) 
     can maintain per-VCPU private information.

Currently, scheduling strategies (or algorithms) available are:
  1. Round-Robin (RR)
  2. Priority Based Round-Robin (PRR)
(Note: By default Xvisor uses Priority Based Round-Robin strategy)


	 	Chapter 6: Hypervisor Load Balancer

TBD.


	 	Chapter 7: Hypervisor Threads

The threading framework for managing background threads in Xvisor is called 
Hypervisor Threads. Threads in Xvisor are no different from Orphan VCPUs, 
in-fact each thread is a wrapper over an orphan VCPU. The best example of a
thread would be our management terminal (mterm).

To create a thread in Xvisor we will require five mandatory things:
  1. Name: The name assigned to this thread.
  2. Function: Thread entry function pointer.
  3. Data: Void pointer to arbitary data to be passed as argument to thread
     entry function.
  4. Priority: Priority of the thread (or underlying Orphan VCPU). Higher the 
     value higher the priority.
  5. Time Slice: Minimum amount of time in nano-seconds that a thread (or 
     underlying Orphan VCPU) must get once it is scheduled.

We don't need to explicitly create stack for each thread. The hyprevisor 
threads framework automatically creates fixed sized stack for each thread when
they are created. The stack size for all threads can be changed at compile time
using Xvisor menuconfig options.

The thread ID, Priority, and Time Slice will be same as ID, Priority, and
Time Slice of the underlying Orphan VCPU.

A thread can be in one of the following states at any point of time:
  * CREATED: The thread is freshly created one and it is not running yet.
  * RUNNING: The thread is running.
  * SLEEPING: The thread is sleeping in a waitqueue.
  * STOPPED: The thread has stopped. Either it was forcefully stopped or it is
    done with its task.
(Note: Orphan VCPU states can be directly mapped to one of the thread states, 
hence to get current thread state we look at state of underlying Orphan VCPU.)

For inter-thread synchronization we have following synchronization primitives:
  1. Spinlocks: Typically used for smaller critical sections, and for 
     syncronization between IRQ context, Normal context and Orphan context.
  2. Completion: Typcially used when a thread (or Orphan VCPU) wants to wait 
     for an event (for e.g. interrupt from host device) to occur.
  3. Semaphore: Traditional semaphore lock which allows thread (or Orphan VCPU) 
     to sleep when lock (or resource) not available. 
  4. Mutex: Traditional mutex lock which allows thread (or Orphan VCPU) to 
     sleep when lock not available.

From the above Completion, Semaphore, and Mutex use Xvisor waitqueues for 
sleeping. Only a thread (or Orphan VCPU) can sleep in a waitqueue because we
cannot sleep in IRQ and Normal context. Due to this sleepable operations on 
Completion, Semaphore, and Mutex should only be done in Orphan context.


		Chapter 8: Device Driver Framework

TBD. Also explain Character Device Framework as an example.


		Chapter 9: Device Emulation Framework

TBD.


		Chapter 10: Workqueue Framework

TBD.


		Chapter 11: Command Manager

TBD.


		Chapter 12: Wall-Clock Subsystem

TBD.


		Chapter 13: Virtual Serial Port Framework

TBD.


