
CSC 4730 | Operating Systems | Fall '21 | Notes

Notes from operating systems class. Full 14 week course with Dr. Perry Kivolowitz.

9.10.21

Accessible address space

The address space on x64 runs from 0 to 2^64 (HIMEM).

(Can't go to 0 - no null dereference. Page 0 is guarded to catch common 'pointer problems'.)

|Address Space|

Address: 0
DRAGON 🐉 (guard region; catches null dereferences)
Code
Globals
Heap ↓
(grows to higher addresses)
(Doesn't get smaller, only bigger)
(Grows using new or malloc)
Unallocated memory
DRAGON 🐉 DRAGON 🐉 DRAGON 🐉
Stack ↑↓
(Contains local vars)
(Contains return addresses)
(Stack shrinks on return)
(Starts at highest address, grows 'up' toward lower addresses, able to shrink)
Address: 2^64 (HIMEM)


Executing processes can't access the OS's address-space memory.
Programs don't overlap, due to address space (memory) virtualization.

Process States:

W,R,B,Z

W: Waiting
(aka Runnable, Sleeping)
(Ready to go from W to R, needs to be scheduled by OS.)

R: Running
(On CPU executing.)
(May be descheduled back to W, e.g. when its time quantum expires.)
(May go from R to Z 'zombie'.)

B: Blocked
(Running can go to blocked if eg. waiting for IO or Sleep, when finished - goes back to W.)
(Cannot go from B to R directly.)

Z: Zombie
(Entire address space has been deleted, some data tracked by OS.)
(Housed in process control table.)
(Exists until the parent of the process receives the information it requires.)

Processes:

The maximum number of running (on-CPU) processes equals the number of cores.

PID 1 is the 'first process' - init
(Built by the OS during boot.)
(init checks for dead processes that exit with no living parent and releases the Zs.)

Process Control Table:

(Keeps where B & W processes left off: a copy of the registers.)
(PC & all registers, e.g. SP.)
(Keeps process states: W, R, B, Z, E, U.)
(E is an embryo process: still being built/constructed, with its address space being allocated -> then goes to W.)
(U is an unused 'null' entry, available to become a process.)
(Open files: the openfiletable array.)
(inodes contain data block information.)
(Process name.)

XV6 Unmodified address space

(simplistic & limited)
(stack can't move/grow)

|XV6 Address Space|

Address: 0
Code
Globals
Stack (fixed)
Heap ↓ (grows toward HIMEM)
Address: HIMEM

Future Projects

  • XV6p: add guard page
  • XV6p: add stack

p1 Notes:

Create a 'toy' shell (limited functionality) that runs user programs connected via pipes.

Open File Table Layout

Open File Table | openfiletable array
--------------       Sys Calls:
| stdin cin  | 0       open
--------------         close
| stdout cout| 1       read
--------------         write
| stderr cerr| 2       seek
--------------         dup/dup2
|            | 3       pipe
--------------         fork
|            | 4       exec
--------------         execvp
|            | 5
______________

Fork returns 0 in child
Fork returns PID in parent

pid = fork();
if(pid > 0){
    //you are parent
}
else {
    //you are child
}

C++ vs. xv6 C methods:

//C++:
cout << "pid " << getpid() << endl;
//xv6 C (first argument is a file descriptor):
printf(1, "pid %d\n", getpid());

Variable number of arguments

void foo(...);
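
The '...' is C's varargs facility, accessed with va_list from <cstdarg>. A minimal sketch (the sum function and its count-first convention are just for illustration):

#include <cstdarg>
#include <cstdio>

// Sum a count-prefixed list of ints.
int sum(int count, ...) {
    va_list args;
    va_start(args, count);      // start after the named parameter
    int total = 0;
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}

int main() {
    printf("%d\n", sum(3, 1, 2, 3)); // prints 6
}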


9.15.21

Whiteboard Examples (p1)

Send data from parent to child

OFT
0 std in
1 std out
2 std err
3 rs (first avail)(will get closed by parent)
4 ws 

Parent code:

int pipefds[2];
pipe(pipefds);
pid = fork(); // child gets a copy of the OFT
if (pid > 0) {
    //parent
    close(pipefds[0]);
    write(pipefds[1],______);
}
else if (pid == 0) {
    //child
    close(pipefds[1]);
    close(0);            // free fd 0 (stdin)
    dup(pipefds[0]);     // read side becomes fd 0
    close(pipefds[0]);
    read(0, ...);        // or exec("wc")
}

OFT child
0 rs
1 std out
2 std err
3 
4 


pipefds
|read|write|

// 'Bookkeeping' is important: rs & ws need to be closed wherever applicable.
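
A runnable sketch of the whiteboard code above (the message contents are made up for illustration; error handling kept minimal):

#include <unistd.h>
#include <sys/wait.h>
#include <cstring>

int main() {
    int pipefds[2];
    if (pipe(pipefds) < 0) return 1;     // pipefds[0] = read, pipefds[1] = write

    pid_t pid = fork();                  // child inherits a copy of the OFT
    if (pid > 0) {
        // parent
        close(pipefds[0]);               // won't read
        const char *msg = "hello\n";
        write(pipefds[1], msg, strlen(msg));
        close(pipefds[1]);               // gives the child EOF
        wait(0);
    } else if (pid == 0) {
        // child
        close(pipefds[1]);               // won't write
        close(0);                        // free fd 0 (stdin)
        dup(pipefds[0]);                 // read side becomes fd 0
        close(pipefds[0]);               // bookkeeping: drop the extra fd
        char buf[64];
        ssize_t n = read(0, buf, sizeof(buf));
        if (n > 0) write(1, buf, n);     // or: execlp("wc", "wc", (char *)0);
    }
    return 0;
}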


Parent sets up c1 to send to c2

parent:
ls stdout -> pipe -> stdin wc

pipe(pipefds);
fork();
//if c1
else {
    fork();
    //if c2
}
close(rs);
close(ws);
...
waitpid(c1, NULL, 0); // c1 is the pid returned from fork
waitpid(c2, NULL, 0);

OFT parent
0
1
2 
3 rs
4 ws

//if child
close(rs);
close(1);
dup(ws);
close(ws);
exec("ls");

OFT c1
0 in
1 ws
2 err
3 
4 

//if c2
close(ws);
close(0);
dup(rs);
close(rs);
exec("wc");

OFT c2
0 rs
1 out
2 err
3 
4 
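
The whole two-child pipeline as a runnable sketch (assumes ls and wc are on the PATH; error handling kept minimal):

#include <unistd.h>
#include <sys/wait.h>

int main() {
    int pipefds[2];
    if (pipe(pipefds) < 0) return 1;
    int rs = pipefds[0], ws = pipefds[1];

    pid_t c1 = fork();
    if (c1 == 0) {                      // c1: ls -> pipe
        close(rs);
        close(1);                       // free fd 1 (stdout)
        dup(ws);                        // write side becomes fd 1
        close(ws);
        execlp("ls", "ls", (char *)0);  // no return on success
        _exit(1);
    }
    pid_t c2 = fork();
    if (c2 == 0) {                      // c2: pipe -> wc
        close(ws);
        close(0);                       // free fd 0 (stdin)
        dup(rs);                        // read side becomes fd 0
        close(rs);
        execlp("wc", "wc", (char *)0);
        _exit(1);
    }
    close(rs);                          // parent bookkeeping: both ends
    close(ws);
    waitpid(c1, 0, 0);
    waitpid(c2, 0, 0);
    return 0;
}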

-i & -o will be implemented in the children.

-i: Child will open the file and move the fd where it needs to go.

-o: Child will open the file and move the fd where it needs to go before exec (there is no return from exec).

-a and -o can't both be honored; use the last option given.

You can start P1 now; use getopt.

Limited Direct Execution

Programs execute directly on CPU

'Limited' because certain operations can't run directly and must go through the OS.

Some instructions can only execute in a privileged state (requiring elevation from user code to the kernel).

In other words, unless you need something special, your program executes directly on the CPU.

Fork system call flow/pipeline in XV6:

Your code

fork();

usys.S

puts the syscall # in a register
causes a T_SYSCALL trap // your process is locked; privileges are elevated
alltraps // does bookkeeping & calls trap(). Executes in the kernel.
return from trap (de-escalates privileges)

user.h

include in user program

traps.h

T_SYSCALL 64 //someone wants to make a sys call

trapasm.S

During boot, the kernel initializes a trap table.
XV6 is simplified.
The entire trap table is one entry after another into alltraps. (If you get here, call alltraps; a general catch-all.)


traps.c

//Optimizes the common case.

syscall.h

//Contains
// System call numbers
#define SYS_fork    1
#define SYS_exit    2
#define SYS_wait    3
#define SYS_pipe    4
#define SYS_read    5
#define SYS_kill    6
#define SYS_exec    7
#define SYS_fstat   8
#define SYS_chdir   9
#define SYS_dup    10
#define SYS_getpid 11
#define SYS_sbrk   12
#define SYS_sleep  13
#define SYS_uptime 14
#define SYS_open   15
#define SYS_write  16
#define SYS_mknod  17
#define SYS_unlink 18
#define SYS_link   19
#define SYS_mkdir  20
#define SYS_close  21

syscall.c

syscall(void)
{
  int num;
  struct proc *curproc = myproc();

  num = curproc->tf->eax;
  if(num > 0 && num < NELEM(syscalls) && syscalls[num]) {
    curproc->tf->eax = syscalls[num]();
  } else {
    cprintf("%d %s: unknown sys call %d\n",
            curproc->pid, curproc->name, num);
    curproc->tf->eax = -1;
  }
}

Operating system internals are like the internals of a watch ~ Dr. Perry Kivolowitz



9.17.21

Review Topics

fork
pipe
exec
dup/dup2

Mentioned:

lseek(fd, pos, whence)
// Essentially a 'tape deck': positions in files can be read or written to (if random access). whence denotes relativity to the beginning, current, or end position of the file (SEEK_SET, SEEK_CUR, SEEK_END). The current position in an open file is kept with the open-file entry.
Redirect vs pipe:
Opening a pipe gives two file descriptors (read & write).
Manipulate stdin/stdout using close() and dup().
Once something is connected to fd 0 (stdin), processes have no knowledge of this.
Redirection affects standard input/output.
The shell handles '>' as overwrite and '>>' as append if the destination exists.
Shell '<' gives an input file to a command.
stderr can be redirected using '2>'.

Read:

(OSTEP ch 7) https://pages.cs.wisc.edu/~remzi/OSTEP/cpu-sched.pdf

(OSTEP ch 8) https://pages.cs.wisc.edu/~remzi/OSTEP/cpu-sched-mlfq.pdf

9.20.21

Cpu-scheduling

Initial assumptions:

  1. Each job runs for the same amount of time.
  2. All jobs arrive at the same time.
  3. Once started, each job runs to completion.
  4. All jobs only use the CPU (i.e., they perform no I/O)
  5. The run-time of each job is known.

Great Idea (of CS):
Create an algorithm that cannot be realized in the actual world. Why? (Perfection is the benchmark.)

Scheduling algorithms:

Turnaround Time:

  • Tcompletion - Tarrival

Response Time:

  • TFirstRun - Tarrival
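
E.g., jobs A, B, and C each need 10s of CPU and arrive at t = 0. Under FCFS they complete at t = 10, 20, 30, so average turnaround = (10 + 20 + 30) / 3 = 20s, and average response = (0 + 10 + 20) / 3 = 10s.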

First Come First Serve (FCFS)

Shortest Job First (SJF)

Shortest Time-to-Completion First (STCF)

Round-Robin (RR)

Scheduling quantum or time slice for Round Robin (RR) scheduling algorithm.

Ideal quantum?...

When its quantum is used up, a process goes from running back to waiting.

Multi-Level Feedback Queue (MLFQ)

Basic rules for MLFQ:

  • Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).
  • Rule 2: If Priority(A) == Priority(B), A & B run in RR.
  • Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
  • Rule 4a: If a job uses up an entire time slice while running, its priority is reduced (i.e., it moves down one queue).
  • Rule 4b: If a job gives up the CPU before the time slice is up, it stays at the same priority level.
  • Rule 5: Epoch: Lift everyone back to highest queue.

Ideal epoch?...

Mentioned:

Every queue level can have different parameters, e.g. the quantum can differ per level.

Sophisticated OS may have 50 or more queues.


P2

P2 will be first project inside XV6

Goal: Add guard page for NULL dereference

(Heap is the only thing allowed to expand)

Modify address space so that there is,

  • 0
  • 4k of dragons (4kB region that kills process)
  • Code | globals
  • Stack
  • Heap

Mentioned:

Kernel printf (for limited debugging)

proc.c ln 210; copyuvm is located in vm.c

The linker is going to place "main()" at a fixed location.

( ͡° ͜ʖ ͡°) $(LD) $(LDFLAGS) -N -e main -Ttext 0x1000 -o $@ $^

  • src
  • preprocessor (ppc)
    • src*
  • compiler
    • assembly
  • assembler
    • object code
  • linker
    • executable


9.22.21

FCFS experiences the convoy effect; SJF can too once a long job has started, while STCF preempts to avoid it.

DBMSs and OSs have a lot in common.

Mentioned:

Third project will be a 'lottery scheduler'.


9.24.21

Lottery Scheduler

  • Relative Priority
  • Tickets 'lottery tickets' are issued to processes
e.g. Tickets issued:
A = 100
B = 50
C = 200
----
Cumulative counter: 100 | 150 | 350
                     A     B     C

Use a random number in [0, 350) to win the lottery.

Cons: Inaccurate in the short term. Sorting/searching might not scale well with a large # of processes.

Pros: Settles to accurate in the long term. Easy to implement.

Can implement temporary boosts of priority.

Can assign process groups (sub a sub c etc.).
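
A sketch of the drawing step using the cumulative counter above (the Job struct and draw function are illustrative names):

#include <cstdlib>

struct Job { const char *name; int tickets; };

// Draw a winning ticket in [0, total) and walk the list,
// accumulating tickets until the winner's range is reached.
const char *draw(Job *jobs, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) total += jobs[i].tickets;
    int winner = rand() % total;
    int counter = 0;
    for (int i = 0; i < n; i++) {
        counter += jobs[i].tickets;
        if (winner < counter) return jobs[i].name;
    }
    return 0; // unreachable
}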

Stride Scheduler

Bypasses the disadvantages of randomness and provides the same benefits as lottery.

Each program is assigned a stride.

Stride = 10,000 / tickets:

Tickets | Stride
A: 100      100
B: 50       200
C: 200      50
-------------------

Pass | A | B | C | (Run)
       0   0   0   A
      100  0   0   B
      100 200  0   C
      100 200  50  C
      100 200  100 A
      200 200  100 C
      200 200  150 C
      200 200  200 B
      200 400  200 A
      300 400  200 C
      300 400  250 C
      300 400  300 A
      400 400  300 C
      ...

      ---------------
Update pass value when quantum is exhausted.

P3 will be to implement Stride Scheduling.

Difficulty of P3 is parsing the input.

Special case for freshly awoken sleeping processes: they get reassigned the smallest existing pass value (process pass = minimum pass).

When a new process is added, its pass value gets assigned the same as the smallest existing pass value.
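
A sketch of one scheduling decision (names are illustrative; ties here break toward the lower index, so the exact run order can differ from the table above when passes tie):

struct Job { int pass; int stride; };   // stride = 10,000 / tickets

// Run the job with the minimum pass; bump its pass by its stride
// when its quantum is exhausted.
int pick_and_advance(Job *jobs, int n) {
    int best = 0;
    for (int i = 1; i < n; i++)
        if (jobs[i].pass < jobs[best].pass)
            best = i;
    jobs[best].pass += jobs[best].stride;
    return best;
}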

Mentioned:

In stride scheduling, when a process is scheduled and running it is temporarily removed from the stride table's view.

P3 CSV commands in file.

P3 can be implemented however necessary, as it is simulated (linear search is acceptable).

P3 is 'deterministic'.
Grade will be known ahead of time.

For P3, take advantage of CPP's abilities.

Completely Fair Scheduler (CFS) quantum is adjusted based on number of runnable tasks.


9.27.21

Review P2

Hints were given. See notes above from 9.20.21.



9.29.21

Multiprocessor scheduling.

 CPU       CPU  (registers)
cache     cache  (L1)(L2)...
  |_________|
       |
    Memory (GB)

Threads share an address space; forked processes each get their own (roommates vs. neighbors).

new & delete are built on malloc() and free(). new is intelligent based on context; malloc() just gives bytes.

void * malloc(size_t);

Note: in cpp, sizeof is an operator (like + or -); be careful not to take sizeof of a pointer when you mean the pointed-to object.

void free(void*)

p = malloc(0x1000);

free(p);

  • p is a pointer to:
    |sentinel|size|sentinel|4096 bytes|

  • Sentinel will detect mem over/underruns.

  • free() knows how to interpret proper size from pointer p.

  • Don't forget to free() for every malloc().

  • Freeing too soon is a common bug especially with multithreading.

  • Don't free properly allocated memory more than once.

Common bug:
int * p;

instead do:
int * p = malloc(sizeof(int));

or:
int * p = NULL;


Good practice:

//Create & init
struct foo * f = malloc(sizeof(struct foo));

memset(f, 0, sizeof(struct foo));

//Create & init
char * s = (char*)malloc(100);

*s = 0;


Mentioned:

Don't forget NULL terminators.
E.g. assigning space for a C string:

malloc(strlen(s) + 1);
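
A tiny sketch (s is some existing C string):

char *copy = (char *)malloc(strlen(s) + 1); // +1 for the null terminator
strcpy(copy, s);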

There is always a minimum of one page between the heap and the stack.
  • Read chapters 13 - 16


10.1.21

P3

Stride scheduler: refer to ch. 9 of OSTEP.

Some commands take one or two arguments.

  • Arguments are separated with commas, no spaces (CSV).

Commands:

  • New Job + priority
  • Finished, current job finished
    • New schedule decision
  • Interrupt, quantum of current job is exhausted
    • New schedule decision
  • Block, blocks currently running job
  • Unblock, unblocks any blocked job
  • Runnable, print jobs that could run
  • Running, print which job is running
  • Blocked, print which jobs are on blocked queue.

Assume only perfect input.


Notes:

Old OG address spaces:

  • Every address was a physical address.
    • No flexibility.

Static relocation:

  • At link time set the start address.
  • User can determine where program starts.
  • Can run multiple programs at once in regions of memory.
    • Could run same program linked different ways.
  • Every address is still a physical address.

Dynamic relocations:

  • At load time, user specified a base address.
    • The loader could rewrite branch addresses (it inspects the binary executable); which zone to run in was decided at program launch.
    • Uses offsets.
  • Still physical addresses.
    • Moderate flexibility.

Base/Bounds, Offset/Length:

  • Base register & bounds register (hw).
    • All programs are linked using same address space.
  • When program is executing, hardware will take address and add it to base to get 'real' address.
  • Addresses are now virtual.
  • Highly flexible. Programs can move around in memory.

Code, heap, stack + Base/Bounds

  • Every program's code/globals, heap, & stack have a base/bounds register.
    • Code segment
    • Heap segment
    • Stack segment
  • High flexibility, finely grained segments.
  • More programs can fit into RAM at once.

Explicit vs. Implicit Segmentation:

Explicit:

  • Address, take top two bits. Know which base/bounds to use.
    • If 00, this is an instruction or global (code)
    • If 01, heap.
    • If 10, stack.
    • If 11, error.

Implicit:

  • If address was generated by the program counter:
    • Code segment.
  • If address is relative or used by stack pointer:
    • Stack segment.
  • If address is anything else:
    • Heap segment.

Stack, heap, and code are each still contiguous.


Paging:

  • Divides all of the address space into fixed-size pages.

10.4.21

Heap is managed in user space by the runtime library.

The OS allocates usable RAM to your address space, but then the heap is managed by runtime library.

malloc(1k) a
malloc(2k) b
malloc(1k) c
malloc(2k) d
free b
free d

The runtime coalesces free regions whose edges touch into one larger region (2k + 1k = 3k free).

  • coalescing
  • splitting

Managed (except for the head pointer) by a list kept inside the free memory itself.


Algorithms to decide how to fill 'holes':

  • Best
    • Scan entire list, choose hole that is closest to size needed.
  • Worst
    • Find biggest hole and split it.
  • First
    • Find first hole (head onwards) big enough and use it.
  • Next
    • Find next hole (where malloc left off) big enough and use it.
  • Buddy
    • Always divides holes in half. Gives away one half. Powers of two.
  • Slab
    • Suppose you have a specific (fixed-size) data structure that you frequently need to alloc & free.
      • Chances are high that one day you may implement slab and experience a big performance boost (e.g. low-level networking).

Slab:

e.g.
1K each,
1000 of them:

malloc(1K * 1000),
split it into 1K chunks,
maintain a free list.
Need one? Consult the list and hand one out.
The list could be a stack.
Fast, efficient, no fragmentation.
The OS is only involved once per block.
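
A minimal slab sketch for the 1K example above (assumes the chunk size is at least sizeof(FreeNode) and a single thread; names are illustrative):

#include <cstdlib>

struct FreeNode { FreeNode *next; };

static FreeNode *free_list = 0;

void slab_init(size_t chunk, size_t count) {
    char *base = (char *)malloc(chunk * count); // OS involved once
    for (size_t i = 0; i < count; i++) {
        FreeNode *n = (FreeNode *)(base + i * chunk);
        n->next = free_list;                    // push onto the free list
        free_list = n;
    }
}

void *slab_alloc() {                            // pop: O(1), no fragmentation
    if (!free_list) return 0;
    FreeNode *n = free_list;
    free_list = n->next;
    return n;
}

void slab_free(void *p) {                       // push back
    FreeNode *n = (FreeNode *)p;
    n->next = free_list;
    free_list = n;
}

// Usage: slab_init(1024, 1000); then slab_alloc()/slab_free() as needed.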

Mentioned:

Implementation of one or more of the algorithms listed, for a project. (P5?)

P4 will be back to xv6, P5 will be back to userland.

P4 will likely be moving the stack. Traditional layout w/stack on bottom of address space.

Can start looking at p4 *now*.


10.18.21

Hierarchical Page Tables

(For scalability)

(Page tables referenced by page tables)

   24          24          16
1st order | 2nd order | offset

V bit valid -> points at many more pages (1000x or more); V bit invalid -> 1000x or more pages are invalid at once (and their lower-level tables need not exist).
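
A sketch of decomposing a 64-bit virtual address with the 24|24|16 split above (illustrative field widths from the notes, not a real x86-64 layout):

#include <cstdint>

struct VA { uint32_t first, second, offset; };

VA decompose(uint64_t va) {
    VA out;
    out.first  = (uint32_t)(va >> 40) & 0xFFFFFF; // top 24 bits: 1st-order index
    out.second = (uint32_t)(va >> 16) & 0xFFFFFF; // middle 24 bits: 2nd-order index
    out.offset = (uint32_t)(va & 0xFFFF);         // low 16 bits: offset in page
    return out;
}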



10.20.21

Translation Lookaside Buffer (TLB)

A caching system in hardware for address translations.

Virtual Address:

  • Is it in the TLB?
    • Yes -> done.
    • No -> is it valid in the PT?
      • No -> segmentation fault.
      • Yes -> is it present?
        • No -> page fault.
        • Yes -> fill in the TLB.

Virtual Memory

Has a backing device - e.g. disk

Has larger capacity than main memory.

Layout:

| -TLB- | P | V | D | R | [Swap]

TLB: Translation Lookaside Buffer
P: Present
V: Valid
D: Dirty
R: Referenced
[Swap]: the page's location on the swap device

If there is a pagefault, pass handling (trap) from hardware (low level event) to software (high level event handling).

Notice: you need some page off the swap device.

  • The swap-device location of the data is in the page table, so you know where to get it.
  • You also need to know where to put it. This introduces the page replacement algorithm: what do you swap out?


Page Replacement Algorithms

E.g.

A process has pages 1-4, but physical RAM holds only 3 frames.

FIFO

  • Access pattern 1,2,3,4,1,2,3,4,...
  • Ejects exactly the wrong pages (every reference misses).

Random

  • Would work better, but...

LRU - least recently used

  • May not work well for certain workloads.
  • Generally better, as it learns from history.
  • Uses the working set.
  • An approximation of LRU is called the clock algorithm.

Locality

  • 1: spatial
    • If you've used something, things nearby might be used soon.
  • 2: temporal
    • If you've used something recently, you'll probably use it 'soon' again.

Clock Algorithm (v1)

  • Uses P,V,D,R
  • Loop over all pages that are valid and present.
    • If reference bit is '0', pick this page to eject.
    • If ref bit is 1, set it to 0 and move on.
      • Restarts the clock

Clock Algorithm (v2)

  • Loop over all pages that are valid and present.
    • Continue to reset R bit.
    • Look for clean unreferenced pages, D == 0, eject.
      • Else, loop again, look for any unreferenced page (dirty or not).
        • Else, pick page at random.
  • Less complex than LRU, similar results. Run by a daemon, not a user or OS program; the daemon kicks out pages until a threshold level n is reached.
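
A sketch of the v1 sweep (assumes at least one valid, present page exists; hand is the persistent clock position):

struct Page { bool valid, present, referenced; };

// Return the index of the page to eject, resetting R bits as we sweep.
int clock_pick(Page *pages, int n, int &hand) {
    for (;;) {
        Page &p = pages[hand];
        if (p.valid && p.present) {
            if (!p.referenced) return hand; // unreferenced: eject this one
            p.referenced = false;           // give it a second chance
        }
        hand = (hand + 1) % n;              // advance the clock hand
    }
}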


10.27.21

Concurrency

Methods of concurrency:

  • Pipes
    • Named pipes
  • Shared Memory
  • Files
  • Sockets
  • Semaphores
  • Threads
    • Mutexes

Roll-ups allow for correct multithreaded operation at full speed.

  • Each thread's loop accumulates into its own version of a variable (not overwriting a shared one); partial results are combined at the end.


10.29.21

Project 6

C++ program using advanced cpp-isms. Needs to be a gorgeous program.

  • Use using
  • Use typedef

The project will create a vector of unsigned int of arbitrary size (command-line argument), fill it with the numbers 0 up to the size of the vector (consecutive integers), then shuffle the vector (random order; see the <algorithm> include). Based on the number of threads (default 4), subdivide the vector into equal portions (the last thread gets the strays). Then use threads to 'observe' the 'pile' (a shared resource) and place integers onto the pile (print). Each thread exits when it has gone through its segment.

Notes:

  • Enforce minimums and maximums. (Defaults)
  • All threads should exit.
  • Include thread and mutex.
  • Less than 2 lines of concurrency code.
  • Professor's code is 130 lines (no comments).


11.01.21

Mutexes & Threading

Mutual exclusion is a way to ensure that only one thread at a time can access a shared resource. Mutexes acquire a lock on a shared resource. Don't acquire a lock twice.

mutex m;

m.lock();
m.unlock();
m.try_lock(); // returns true if lock acquired.


Lock guard: Give it a mutex and it will lock it when it is created and unlock it when it is destroyed. This is a safer way to use a mutex than locking it manually. It is also a way to make sure that the mutex is unlocked if an exception is thrown.
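
E.g., a minimal lock_guard use:

#include <mutex>

std::mutex m;
int counter = 0;

void bump() {
    std::lock_guard<std::mutex> guard(m); // locks m here
    ++counter;                            // protected region
}                                         // unlocks when guard is destroyed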

Ways to implement mutual exclusion:

  • Prevent being interrupted...
    • Elevate / de-elevate priority.
    • Multiple levels of interrupts.
  • Use a mutex
    • Needs hardware support.
    lock(int * m) {
      while(*m == 1)
        ;      // spin until it looks free
      *m = 1;  // then take it (not atomic with the test!)
    }
    unlock(int * m) {
      *m = 0;
    }

    • Will not work: another thread can sneak in between the test and the set.

Locks typically require hardware support for efficient implementation. This support usually takes the form of one or more atomic instructions such as

  • test-and-set
  • compare-and-swap
  • load linked & store conditional
  • fetch and add

These instructions allow a single process to test if the lock is free, and if free, acquire the lock in a single atomic operation.

//usage:
while(testAndSet(m, 1) == 1)
  ; // spin

while(compareAndSwap(m, 1, 0) == 1)
  ; // spin

while(true) {
  while(loadLinked(m) == 1)
    ; // spin until it looks free
  if(storeConditional(m, 1) == 1) {
    return; // acquired
  }
}
//these are atomic instructions:
int testAndSet(int *old, int val){
  int temp = *old;
  *old = val;
  return temp;
}

int compareAndSwap(int *m, int val, int expected){
  int temp = *m;
  if(temp == expected){
    *m = val;
  }
  return temp;
}

int loadLinked(int *m){
  return *m;
}

int storeConditional(int *m, int v){
  if(/* no one has written to *m since the LL */){
    *m = v;
    return 1;
  }
  return 0;
}

int fetchAndAdd(int *ptr){
  int old = *ptr;
  *ptr = old + 1;
  return old;
}

//ticket lock built on fetch-and-add:
void lock_init(tl_t *t){
  t->ticket = 0;
  t->turn = 0;
}
void lock(tl_t *l){
  int myturn = fetchAndAdd(&l->ticket); // take a ticket
  while(l->turn != myturn)
    ; // spin until it's our turn
}
void unlock(tl_t *l){
  l->turn++;
}



11.3.21

How to implement Threads

  • Multiple stacks in shared address space.
  • No need to copy original thread's stack.
  • Each thread has own entry in process table.
  • Handle threads dying: a flag in the process table changes the entry from thread to zombie. When the reference count of threads reaches 0, kill the process.


11.8.21

Fleegman's Dilemma, aka Edsger W. Dijkstra's Dining Philosophers

Default algorithm results in deadlock.

Solve one of two ways:

  • Trylock
  • Lock

How the flugelhorn is put down is not specified.

Use C++11 style mutexes and not pthreads.

After you play, yield the cpu. sched_yield()

Protect cout with a mutex so no output will overlay any other output.

Immediately try to play again after yielding.



11.10.21

Condition Variables

Condition variables have a built in queue.

When someone signals, there is a choice to be made: do you wake up one of the sleeping threads, or all of them?

C++ Example:

#include <mutex>
#include <condition_variable>

mutex m;
condition_variable cv;
int done = 0;

void child0() {
  m.lock();
  cout << "child0\n";
  done = 1;
  cv.notify_one();
  m.unlock();
}

void child1() {
  m.lock();
  cout << "child1\n";
  cv.notify_one();
  m.unlock();
}

void thr_join0(){
  unique_lock<mutex> lk(m);
  while (done==0){
    cv.wait(lk);
  }
}

void thr_join1(){ // fragile: misses the wakeup if the signal fires first
  unique_lock<mutex> lk(m);
  cv.wait(lk);
}

Mutex is acting like a mutual exclusion controller for the queue inside the condition variable.

Semaphores

You can build a semaphore out of a condition variable, counter, and mutex. Semaphore is a synchronization primitive. Allows a certain number of concurrent threads but not more.

When you establish a semaphore you have to set the counter. Initialize the counter (semaphore) with the number of resources you are willing to give away IMMEDIATELY after initialization.

Semaphores are extremely flexible.
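
A sketch of that construction in C++11 (class and method names are illustrative; wait/post are the classic P/V):

#include <mutex>
#include <condition_variable>

class Semaphore {
    std::mutex m;
    std::condition_variable cv;
    int count;
public:
    explicit Semaphore(int initial) : count(initial) {}
    void wait() {                         // P: take a resource
        std::unique_lock<std::mutex> lk(m);
        while (count == 0) cv.wait(lk);
        --count;
    }
    void post() {                         // V: return a resource
        std::lock_guard<std::mutex> lk(m);
        ++count;
        cv.notify_one();
    }
};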

Producer Consumer Pattern

This pattern is very common.

Circular buffer / ring buffer is a producer consumer.

The producer has to synchronize with the consumer and vice versa so there is no lapping.

The semaphore is perfect for implementing producer consumer patterns.
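
A bounded-buffer sketch built on the Semaphore sketch above (sizes and names are illustrative):

const int N = 8;
int buffer[N];
int head = 0, tail = 0;
Semaphore empty_slots(N), full_slots(0); // free room vs. items available
std::mutex buf_m;                        // guards the buffer indices

void produce(int v) {
    empty_slots.wait();                  // block if no room (no lapping)
    { std::lock_guard<std::mutex> lk(buf_m);
      buffer[head] = v; head = (head + 1) % N; }
    full_slots.post();
}

int consume() {
    full_slots.wait();                   // block if nothing to take
    int v;
    { std::lock_guard<std::mutex> lk(buf_m);
      v = buffer[tail]; tail = (tail + 1) % N; }
    empty_slots.post();
    return v;
}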

Hint for p7: Make an array of pointers to functions, create an index - select from there.
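
A tiny sketch of the hint (the op names are placeholders):

#include <cstdio>

void op0() { puts("op0"); }
void op1() { puts("op1"); }
void op2() { puts("op2"); }

void (*ops[])() = { op0, op1, op2 };

int main() {
    int idx = 1;   // e.g., derived from parsed input
    ops[idx]();    // dispatches to op1 with no switch statement
}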



11.12.21

Hardware

Talking to Hardware

Access to hardware is like accessing memory.

graph LR
Read --> Fread --> Cin
Read --> User/OS
Read --> Buffering_system --> Device_Driver
Device_Driver --> Hardware --> Memory_Mapped_Device_Registers
Device_Driver --> Memory_Mapped_Device_Registers

Pointers to hardware must be marked volatile.

Bits in a type can be set using bit-fields or bit-bashing:

Bit-fields:

struct CNTRL_REG_T {
  unsigned short data    : 4;
  unsigned short unused0 : 5;
  unsigned short on_off  : 1;
  unsigned short unused1 : 2;
  unsigned short enable  : 1;
};

CNTRL_REG_T c;

c.data   // access the 4-bit field directly

Bit-bashing:

unsigned short * c = Address;
#define GETDATA(c)   (*(c) & 0xF)    // read the low 4 data bits
#define CLRDATA(c)   (*(c) &= ~0xF)  // clear the data bits
#define SETBITS(c,v) (*(c) |= (v))   // set the given bits

HDD:

Platter spins. Data blocks (512B) are concentric.

In order to read a block, the heads must position over the data-specific ring and then wait for the rotation.

The disc spins at a fixed RPM: rotational latency.

Blocks are written all at once or not at all.

The head must move across the surface of the disc to reach the lane/track of required block. Seek latency.

The head moving from any track to any track is around ~5ms.

Head movement is ballistic, therefore there is a settling latency (<1ms).

Data transfer must occur, therefore there is a Transfer Latency.

Blocks

Performance improvements can be made in software by changing block order.



11.15.21

RAID

Redundant Array of Inexpensive Disks

RAID0 (Striping)

  • Logically reorders blocks onto multiple hard drives.
  • Will read blocks from the drives in parallel.
  • Chunk size: adjacent blocks are grouped to 'make' a larger block.
  • No redundancy of data.

RAID1 (Mirroring)

  • Min drives is 2.
  • All data on drives is redundant.

RAID1+0 (Striping and mirroring)

  • Data is redundant from mirror.
  • Data io is fast from striping.

RAID4 (Striping with parity)

  • Min drives is 3.
  • Data will not be lost if one drive is lost, thanks to the XOR of bits into the parity block. Multiple drive failure is loss of data.
  • Efficiency suffers: (n-1)/n storage capacity, i.e. one drive's worth lost. Writes are bottlenecked by the parity drive.

RAID5 (Striping with rotating parity)

  • Avoids bottleneck of dedicated parity drive.
  • Same storage efficiency and read, but faster write speed than RAID4.
  • Data will not be lost if one drive is lost, thanks to the XOR of bits into the parity block. Multiple drive failure is loss of data.

RAID6 (Striping with rotating double parity)

  • Storage efficiency has decreased: (n-2)/n.
  • Avoids the bottleneck of a dedicated parity drive.
  • Data will not be lost if up to two drives fail. Losing a third is loss of data.

Mentioned:

P8 is building a pipe

Two parts:

  • producer consumer
  • shmops (Shared memory operations)

Mac OSX is a dog's breakfast. P9 is FSCK



11.19.21

File System

Purpose:

  • Organization
    • Hierarchical
  • Categorization
    • Files
    • Timestamps
    • Type
    • Size
  • Location
  • Permissions
  • File system geometry
  • Free Blocks

Blocks are either used or free.

'lingua franca' (drive must support common tongue for access to block 0 boot block)

Bootstrapping process (booting):

graph LR
BIOS --> boot_block --> boot_loader
boot_loader --> super_block

Boot loader

  • Knows a little about the file system.
  • Find a specific file.
  • Load, execute init.

Super Block

  • Data structure describing the drive and file system.

e.g. XV6 superblock structure (fs.h):

// mkfs computes the super block and builds an initial file system. The
// super block describes the disk layout:
struct superblock {
  uint size;         // Size of file system image (blocks)
  uint nblocks;      // Number of data blocks
  uint ninodes;      // Number of inodes.
  uint nlog;         // Number of log blocks
  uint logstart;     // Block number of first log block
  uint inodestart;   // Block number of first inode block
  uint bmapstart;    // Block number of first free map block
};

inodes - information nodes (there are ninodes of them)

  • Store everything there is to know about an individual file (with the exception of its name).

XV6 inode structure:

// On-disk inode structure
struct dinode {
  short type;           // File type
  short major;          // Major device number (T_DEV only)
  short minor;          // Minor device number (T_DEV only)
  short nlink;          // Number of links to inode in file system
  uint size;            // Size of file (bytes)
  uint addrs[NDIRECT+1];   // Data block addresses
};

Special files represent hardware:

  • Character special files
  • Block special files
  • Major and minor device numbers.
  • Accessed via device driver.

inode addrs: an array of NDIRECT+1 = 13 entries

  • The first 12 are the first 12 blocks of the file's data.
  • Each entry is a block # or free. A block # is an address on disk.
    • offset/BSIZE = which block index; offset%BSIZE = byte within that block.
  • The 13th (index 12) is the single indirect block. It contains block numbers of more data blocks: the root of a one-level tree.
  • 0-11 = direct blocks, 12 = the single indirect block.

XV6 Directory Structure:

// Directory is a file containing a sequence of dirent structures.
#define DIRSIZ 14

struct dirent {
  ushort inum;
  char name[DIRSIZ];
};

Every directory entry is 16 bytes wide. (inode num (2B)+ name(14B))

Note the longest file name supported in xv6 is 14 chars.

A file is independent of the name it is referenced by.



11.22.21

P8

Tips:

#include local headers using the quoted ("") form.

open(path, flags) // gives an fd

lseek(fd, offset, whence)

close(fd)

Build a model of data in ram (several data structures that duplicate data on disk).

Read in superblock into superblock struct,

struct superblock sb;
lseek(fd, BSIZE * 1, SEEK_SET); // the superblock is block 1. Process return codes and do something!
read(fd, &sb, sizeof(sb));      // Process return values.

malloc size for holding all inodes.

Record/create metadata about block n in data structure for block usage. (Helps with missing block)

In the root directory, both the parent ('..') and current ('.') entries point to the root itself.

At 33 items in a directory you've moved into another block (512/16 = 32 entries fit per block).

Link count and number of files n should be the same.

Be very specific about error messages.

Metablocks = everything up to start of data.

Error should be:

Block: 65 is found more than once in the inodes:
2
3

Each image has a specific problem. Report on the first error and stop processing.

./pckfs -f

Write a bit-bashing helper function that returns the value (bool) of a bit, so one can say 'get me the bitmap entry for block n': n says which block, n/8 which byte, n%8 which bit.
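
A sketch of that helper over an in-memory copy of the bitmap (assumes the bitmap blocks were already read in; the name is illustrative):

// Is block n marked used in the bitmap?
bool block_in_use(unsigned char *bitmap, int n) {
    int byte = n / 8;                 // which byte holds block n's bit
    int bit  = n % 8;                 // which bit within that byte
    return (bitmap[byte] >> bit) & 1;
}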

May be a lot of type casting. void* is your friend, but there is no address arithmetic with void*.

Remember: if a file name is exactly 14 bytes, there will be no null terminator.