ALevel-CS Chapter 20 System Software

20.01 The purposes of an operating system (OS)

操作系统架构 - 用户态和内核态

Linux进程的4GB空间，3G-4G部分大家是共享的，是内核态的地址空间，这里存放在整个内核代码和所有的内核模块，以及内核所维护的数据。Linux操作系统的体系架构分为用户态和内核态（或者用户空间和内核）。

内核：从本质上看是一种软件——控制计算机的硬件资源，并提供上层应用程序运行的环境。
用户态：即上层应用程序的活动空间，应用程序的执行必须依托于内核提供的资源，包括CPU资源、存储资源、I/O资源等。

备注：为了使上层应用能够访问到这些资源，内核必须为上层应用提供访问的接口：即系统调用。

系统调用：是操作系统的最小功能单位，我们可以把系统调用看成是一种不能再化简的操作（类似于原子操作，但是不同概念）,用户态的应用程序可以通过三种方式来访问内核态的资源：系统调用、库函数、shell脚本

例子：用户运行一程序，该程序所创建的进程开始是运行在用户态的，执行文件操作，网络数据发送等操作步骤如下：

系统调用进入内核态：通过write，send等系统调用，进行调用内核中的代码来完成操作进入内核态。
内核态执行操作：进入3GB-4GB中的内核地址空间去执行这些代码完成操作。
切回用户态：内核态执行完之后，切换用户态。

备注：这样操作的好处在于用户态的程序就不能随意操作1内核地址空间，具有一定的安全保护作用。

Operating system structure

The logical structure of the operating system provides two modes of operation.

User mode is the one available for the user or an application program. The alternative has a number of different names but the most often used is ‘kernel mode’.

The difference between the two is that kernel mode has sole access to part of the memory and to certain system functions that user mode cannot access.

To work properly each higher layer needs to be fully serviced by a lower layer (as in a network protocol stack).

A layered structure for an operating system is hard to achieve in practice. A more flexible approach uses a modular structure.

20.02 The input/output (I/O) system

The operating system can ensure that I/O passes via the CPU but for large quantities of data the operating system can ensure direct transfer between memory and an I/O device.

A CPU typically operates at GHz frequencies. One second sees more than one trillion clock cycles.

20.03 Process scheduling

Key Terms

High-level scheduler - makes decisions about which program stored on disk should be moved into memory
Low-level scheduler - makes decisions about which process stored in memory should have access to the CPU
Process - a program in memory that has an associated process control block
Process control block (PCB) - a complex data structure containing all data relevant to the running of a process
Thread - part of a program which is handled as an individual process when being executed

进程调度

无论是在批处理系统还是分时系统中，用户进程数一般都多于处理机数、这将导致它们互相争夺处理机。另外，系统进程也同样需要使用处理机。这就要求进程调度程序按一定的策略，动态地把处理机分配给处于就绪队列中的某一个进程，以使之执行。

进程的基本状态

等待态：等待某个事件的完成；
就绪态：等待系统分配处理器以便运行；
运行态：占有处理器正在运行。

运行态→等待态往往是由于等待外设，等待主存等资源分配或等待人工干预而引起的。
等待态→就绪态则是等待的条件已满足，只需分配到处理器后就能运行。
运行态→就绪态不是由于自身原因，而是由外界原因使运行状态的进程让出处理器，这时候就变成就绪态。例如时间片用完，或有更高优先级的进程来抢占处理器等。
就绪态→运行态系统按某种策略选中就绪队列中的一个进程占用处理器，此时就变成了运行态

调度处理

高级调度：(High-Level Scheduling)又称为作业调度，它决定把后备作业调入内存运行；
低级调度：(Low-Level Scheduling)又称为进程调度，它决定把就绪队列的某进程获得CPU；
中级调度：(Intermediate-Level Scheduling)又称为在虚拟存储器中引入，在内、外存对换区进行进程对换。

调度方式

非剥夺方式 - 分派程序一旦把处理机分配给某进程后便让它一直运行下去，直到进程完成或发生某事件而阻塞时，才把处理机分配给另一个进程。
剥夺方式 - 当一个进程正在运行时，系统可以基于某种原则，剥夺已分配给它的处理机，将之分配给其它进程。剥夺原则有：优先权原则、短进程优先原则、时间片原则。

算法

先进先出算法 - 把处理机分配给最先进入就绪队列的进程，一个进程一旦分得处理机，便一直执行下去，直到该进程完成或阻塞时，才释放处理机。
短进程有限算法 - 最短CPU运行期优先调度算法(SCBF--Shortest CPU Burst First)。该算法从就绪队列中选出下一个“CPU执行期最短”的进程，为之分配处理机。
轮转法 -

Process scheduling

Programs that are available to be run on a computer system are initially stored on disk.
A user could submit a program as a ‘job’ which would include the program and some instructions about how it should be run.
A long-term or high-level scheduler program controls the selection of a program stored on disk to be moved into main memory.
Occasionally a program has to be taken back to disk due to the memory getting overcrowded. This is controlled by a medium-term scheduler.
When the program is installed in memory, a short-term or low-level scheduler controls when it has access to the CPU.

Process states

Process can be defined as ‘a program being executed’.

A process control block (PCB) can be created in memory ready to receive data when the process is executed. Once in memory the state of the process can change.

A new process arrives in memory and a PCB is created; it changes to the ready state.
A process in the ready state is given access to the CPU by the dispatcher; it changes to the running state.
A process in the running state is halted by an interrupt; it returns to the ready state.
A process in the running state cannot progress until some event has occurred (I/O perhaps); it changes to the waiting state (sometimes called the ‘suspended’ or ‘blocked’ state).
A process in the waiting state is notified that an event is completed; it returns to the ready state.
A process in the running state completes execution; it changes to the terminated state.

It is possible for a process to be separated into different parts for execution. The separate parts are called threads. If this has happened, each thread is handled as an individual process.

Interrupts

Some interrupts are caused by errors that prematurely terminate a running process.

The interrupt mechanism is used when a process in the running state makes a system call requiring an I/O operation and has to change to the waiting state.

Whatever the reason for an interrupt, the OS kernel must invoke an interrupt-handling routine.

Scheduling algorithms

A scheduling algorithm can be preemptive(抢占式) or non-preemptive(非抢占式).

A preemptive algorithm can halt a process that would otherwise continue running undisturbed. If an algorithm is preemptive it may involve prioritising processes.

FCFS(first come first served) is a non-preemptive algorithm and can be implemented by placing the processes in a first-in first-out (FIFO) queue.
A round-robin algorithm allocates a time slice to each process and is therefore preemptive, because a process will be halted when its time slice has run out. It can be implemented as a FIFO queue.
A priority-based scheduling algorithm is more complicated.
1. estimated time of process execution
2. estimated remaining time for execution
3. length of time already spent in the ready queue
4. whether the process is I/O bound or CPU bound.

20.04 Memory management

Key Terms

Segmentation - where a large process is divided into segments for loading into memory but the segments are not constrained to be the same size
Paging - where a large process is divided into pages which have to be the same size
Virtual memory - a paging mechanism that allows a program to use more memory addresses than are available in main memory
Disk thrashing - when paging is being used and a repetitive state has been reached where loading one page causes a need for another page to be loaded almost immediately but the loading of this new page causes the same immediate need

内存抽象 - 地址空间

把进程对应的内存依旧留在物理内存中，需要的时候就切换到特定的区域。这就涉及到了内存的保护机制，毕竟进程之间可以随意读取、写入内容就乱套了，非常不安全。因此操作系统需要对物理内存做一层抽象，也就是「地址空间」(Address Space)，一个进程的地址空间包含了该进程所有相关内存，比如 code / stack / heap。

一个 16 KB 的地址空间

int x;
 
x = x + 3; 

// 汇编

128: movl 0x0(%ebx), %eax  ;load 0+ebx into eax
132: addl $0x03, %eax ;add 3 to eax register
135: movl %eax, 0x0(%ebx) ;store eax back to mem

最前面的是 PC (Program Counter)，用来表示当前 code 的索引，比如 CPU 执行到 128 时，进行了 Context Switch（上下文切换），那么在 Switch 回来后，还可以接着从 132 开始执行（当然需要先把 PC 存起来）。之后的就是汇编代码，告诉 CPU 该如何操作。

从进程的角度看

真实的物理内存看

基址寄存器与界限寄存器可以简单的动态重定位：每个内存地址送到内存之前，都会自动加上基址寄存器的内容。

从 32KB 处作为开始，48KB 作为结束。那 32 / 48 可不可以动态设置呢，只要在 CPU 上整两个寄存器，基址寄存器base 和界限寄存器bounds 就可以了，base 指明从哪里开始，bounds 指定哪里是边界。因此真实物理地址和虚拟地址之间的关系是：

physical address = virtual address + base

base and bounds 这种做法最大的问题在于空间浪费，Stack 和 Heap 中间有一块 free space，即使没有用，也被占着，那如何才能解放这块区域呢，进入虚拟内存。

虚拟内存

抽象满足了多进程的要求，但很多情况下，现有内存无法满足仅仅一个大进程的内存要求。物理内存不够用的情况下，如何解决

覆盖overlays：在早期的操作系统曾使用覆盖技术来解决这个问题，将一个程序分为多个块，基本思想是先将块0加入内存，块0执行完后，将块1加入内存。依次往复，这个解决方案最大的问题是需要程序员去程序进行分块，这是一个费时费力让人痛苦不堪的过程。后来这个解决方案的修正版就是虚拟内存。
交换swapping：可以将暂时不能执行的程序（进程）送到外存中，从而获得空闲内存空间来装入新程序（进程），或读人保存在外存中而处于就绪状态的程序。
虚拟内存：虚拟内存的基本思想是，每个进程有用独立的逻辑地址空间，内存被分为大小相等的多个块,称为页(Page).每个页都是一段连续的地址。对于进程来看,逻辑上貌似有很多内存空间，其中一部分对应物理内存上的一块(称为页框，通常页和页框大小相等)，还有一些没加载在内存中的对应在硬盘上。

页式和段式存储管理

如果允许将一个进程分散到许多不连续的空间，就可以避免内存紧缩，减少碎片。基于这一思想，通过引入进程的逻辑地址，把进程地址空间与实际存储空间分离，增加存储管理的灵活性。

地址空间：将源程序经过编译后得到的目标程序，存在于它所限定的地址范围内，这个范围称为地址空间。地址空间是逻辑地址的集合。

存储空间：指主存中一系列存储信息的物理单元的集合，这些单元的编号称为物理地址存储空间是物理地址的集合。

根据分配时所采用的基本单位不同，可将离散分配的管理方式分为以下三种：页式存储管理、段式存储管理和段页式存储管理。其中段页式存储管理是前两种结合的产物。

页式和段式存储管理的区别

需求：是信息的物理单位，分页是为了实现离散分配方式，以减少内存的碎片，提高内存的利用率。或者说，分页仅仅是由于系统管理的需要，而不是用户的需要。段是信息的逻辑单位，它含有一组其意义相对完整的信息。分段的目的是为了更好地满足用户的需要。一条指令或一个操作数可能会跨越两个页的分界处，而不会跨越两个段的分界处。
大小：页大小固定且由系统决定，把逻辑地址划分为页号和页内地址两部分，是由机器硬件实现的。段的长度不固定，且决定于用户所编写的程序，通常由编译系统在对源程序进行编译时根据信息的性质来划分。
逻辑地址表示：页式系统地址空间是一维的，即单一的线性地址空间，程序员只需利用一个标识符，即可表示一个地址。分段的作业地址空间是二维的，程序员在标识一个地址时，既需给出段名，又需给出段内地址。
比页大，因而段表比页表短，可以缩短查找时间，提高访问速度。

Partitions and segments

An early approach to memory management when different processes were loaded into memory simultaneously was to partition the memory.

An improvement was dynamic partitioning where the partition size was allowed to adjust to match the process size. The rule of one process per partition remained.

An extension of this idea which allowed for larger processes to be handled was segmentation. The large process was divided into segments. Each segment was loaded into a dynamic partition in memory.

factors of limited the efficiency:

the segments were not constrained to be the same size.
the size of process did not allow all of the segments for one process to be in memory at the same time.

These two factors combined to cause fragmentation both of the memory and of the disk storage. This resulted in degradation in the performance of the system.

Paging and virtual memory

The modern approach is to use paging. The process is divided into equal-sized pages and memory is divided into frames of the same size. The secondary storage can also be divided into frames.

It could be possible to load all of the pages into memory at the same time.

A special case for the use of paging is when a program is so large that the address space needed for it is larger than the size of the memory. Paging now supports what is known as virtual memory management.

Paging requires the CPU to transfer address values to a memory management unit that allocates a corresponding address on a page. An address has to comprise two parts: the page number plus the offset from the start of the page.

Paging
- a program with 48 instructions
- three pages occupy the first 48 logical memory addreses
Physical address
- first two program pages have been loaded into page frames
- The page frames used do not have to be adjacent

The system overhead in running virtual memory can be a disadvantage. The worst problem is disk thrashing, when part of a process on one page requires another page which is on disk. When that page is loaded it almost immediately requires the original page again.

20.05 Operating system facilities provided for the user

The user interface may be made available as a command line, a graphical display or a voice recognition system.

When a program involves use of a device, the operating system provides the device driver: the user just expects the device to work.

The operating system will provide a file system for a user to store data and programs.

20.06 Translation software

Key Terms

Symbol table - a data structure in which each record contains the name and attributes of an identifier

Translation software

A compiler can be described as having a ‘front end’ and a ‘back end’. The front-end program performs analysis of the source code and unless errors are found produces an intermediate code that expresses completely the semantics (the meaning) of the source code. The backend program then takes this intermediate code as input and performs synthesis of object code.

no error in the source code.

Front-end analysis stages

The four stages of front-end analysis:

lexical analysis (词法分析)
syntax analysis(语法分析)
semantic analysis(语义分析)
intermediate code generation.(中间代码生成)

The Sourse code that is the input data for a compiler or interpreter consists of a sequence of characters.

Each meaningful individual character or collection of characters is referred to as a lexeme.

# declaration statement
Var count : integer;

# five lexemes
Var Count : integer ;


# The assignment statement
PercentMark[Count] := Score * 10
# eight lexemes
PercentMark [ Count ] := Score * 10

For each identifier recognized there must be an entry made in the symbol table.

The symbol table contains identifier attributes such as the data type, where it is declared and where it is assigned a value.

Parse tree for an assignment statement

address code

y := a + (b * c – d) / e

The assignment statement could be converted into the following four statements, each requiring at most three addresses:

temp := b * c 
temp := temp – d 
temp := temp / e 
y := a + temp

Representation of the grammar of a language

For each programming language, there is a defined grammar. This grammar must be understood by a programmer and also by a compiler writer.

One method of presenting the grammar rules is a syntax diagram.

Figure represents the grammar rule that an identifier must start with a letter which can be followed by any combination of none or more letters or digits.

An alternative approach is to use Backus–Naur Form (BNF)

<Identifier> ::= <Letter>|<Identifier><Letter>|<Identifier><Digit>

<Digit> ::= 0|1|2|3|4|5|6|7|8|9

<Letter> ::= <UpperCaseLetter>|<LowerCaseLetter>

<UpperCaseLetter> ::= A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z

<LowerCaseLetter> ::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z

The use of | is to separate individual options.
The ::= characters can be read as ‘is defined as’.

By contrast, BNF is a general approach which can be used to describe any assembly of data. Furthermore, it can be used as the basis for an algorithm.

Back-end synthesis stages

Front-end analysis has established that there are syntax errors.

Backend process is the presentation of a list of these errors. For each error, there will be an explanation and the location within the program source code.

In the absence of errors, the main back-end stage is machine code generation from the intermediate code. This may involve optimization of the code.

x := (a + b) * (a – b) 
y := (a + 2 * b) * (a – b)

# The most efficient code would be:

temp := (a – b) 
x := (a + b) * temp 
y := x + temp * b

Evaluation of expressions

The expression can be evaluated by first converting the infix representation in the code to Reverse Polish Notation (RPN). RPN is a postfix representation which never requires brackets and has no rules of precedence.

Example : Converting an expression to RPN

# Simple Expression
a + b * c

# RPN - first step is to convert b * c to get the intermediate from

a + b c *

# final RPN

a b c * +

Example : Converting an expression from RPN

# RPN expression
x 2 * y 3 * + 6 /

# step1 converted to give an intermediate from

(x * 2) y 3 * + 6 /

# following success versions

(x * 2)(y * 3) + 6 /
(x * 2) + (y * 3) 6 /
((x * 2) + (y * 3)) / 6

# final 
(x * 2 + y * 3) / 6

Example - Using a syntax tree to convert an expression to RPN


# expression 

a + b * c

# post-order traversal: This starts at the lowest leaf to the left of the root and then uses left–right–root ordering which ensures

a b c * +

Example - Using a stack with an RPN expression

Shunting-yard algorithm

If it is an identifier, it passes straight through to the RPN expression line.
If it is an operator, there are two options.
- If the stack line is empty or contains a lower precedence operator, the operator is diverted into the stack line.
- If the stack line contains an equal or higher preference operator, then that operator is popped from the stack into the RPN expression line and the new operator takes its place on the stack line.
When all tokens have left the infix line, the operators remaining on the stack line are popped one by one from the stack line onto the RPN expression line.

a + b * c

Evaluating an RPN expression

# x has the value 3 and y has the value 4
x 2 * y 3 * + 6 /

辅助阅读

一文讲清楚操作系统的内核态与用户态

进程调度(百度百科)

大厂面试爱问的「调度算法」，20 张图一举拿下

操作系统内存管理(思维导图详解)