Chapter 8
Memory-Management Strategies
* All rights reserved, Tei-Wei Kuo, National Taiwan University, 2005.
Memory Management
Motivation
Keep several processes in memory to improve a system’s performance
Selection of different memory management methods
Application-dependent
Hardware-dependent
Memory – A large array of words or bytes, each with its own address.
Memory is always too small!
Memory Management
The Viewpoint of the Memory Unit
A stream of memory addresses!
What should be done?
Which areas are free or used (by whom)
Decide which processes to get memory
Perform allocation and de-allocation
Remark:
Interaction between CPU scheduling and memory allocation!
Background
Address Binding – binding of instructions and data to memory addresses
source program
object module
load module
in-memory binary memory image
compiling
linking
loading other object
modules
system library
dynamically loaded system
library
Binding Time
Known at compile time, where a program will be in memory - “absolute code”, e.g., MS-DOS *.COM
At load time:
- All memory references by a program will be translated
- Code is relocatable
- Fixed while a program runs
At execution time
- Binding may change as a program runs
symbolic address e.g., x
Relocatable address
Absolute address
Background
• Binding at the Compiling Time
• A process must execute at a specific memory space
• Binding at the Load Time
• Relocatable Code
• Process may move from a memory segment to another → binding is delayed till run-time
Logical Versus Physical Address
[Figure: the CPU issues logical address 346; the MMU adds the relocation register (14000) to produce physical address 14346, which goes to the memory address register of the memory]
Memory Management Unit (MMU) – “Hardware-Support”
The user program deals with logical addresses
- Virtual Addresses (binding at the run time)
Logical Versus Physical Address
A logical (physical) address space is the set of logical (physical) addresses generated by a process. Physical addresses of a program are transparent to any process!
MMU maps from virtual addresses to physical addresses. Different memory mapping schemes need different MMUs that are hardware devices. (slow down)
Compile-time & load-time binding schemes result in the collapsing of logical and physical address spaces.
Dynamic Loading
A routine will not be loaded until it is called. A relocatable linking loader must be called to load the desired routine and change the program’s address tables.
Advantage
Memory space is better utilized.
Users may use OS-provided libraries to achieve dynamic loading
Dynamic Linking
Static Linking
language library + program object module → binary program image
Advantage: Simple
Dynamic Linking
A small piece of code, called stub, is used to locate or load the appropriate routine
Advantage: Save memory space by sharing the library code among processes → Memory Protection & Library Update!
Overlays
Motivation
Keep in memory only those instructions and data needed at any given time.
Example: Two overlays of a two-pass assembler
[Figure: overlay driver (10KB), common routines (30KB), and symbol table (20KB) are kept in memory; Pass 1 (70KB) and Pass 2 (80KB) are the two overlays]
Certain relocation & linking algorithms are needed!
Memory space is saved at the cost of run-time I/O.
Overlays can be achieved w/o OS support:
⇒ “absolute-address” code
However, it’s not easy to program an overlay structure properly!
⇒ Need some sort of automatic techniques that run a large program in a limited physical memory!
Swapping
[Figure: main memory holds the OS and the user space; process p1 is swapped out to the backing store and process p2 is swapped in]
Should a process be put back into the same memory space that it occupied previously?
↔ Binding Scheme?!
Swapping
A Naive Way
Pick up a process from the ready queue
Dispatcher checks whether the process is in memory
Yes → Dispatch CPU to the process
No → Swap in the process, then dispatch CPU to the process
Potentially High Context-Switch Cost:
2 * (1000KB/5000KBps + 8ms) = 416ms (transfer time + latency delay, once for the swap-out and once for the swap-in)
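The cost formula above can be checked with a small sketch. The function names are hypothetical, and the model assumes the swapped-out and swapped-in processes have the same size, as the slide's arithmetic does.

```python
def swap_time_ms(size_kb, rate_kb_per_s, latency_ms):
    """Time to move one process image to or from the backing store."""
    transfer_ms = size_kb * 1000 / rate_kb_per_s
    return transfer_ms + latency_ms


def context_switch_cost_ms(size_kb, rate_kb_per_s, latency_ms):
    # One swap-out plus one swap-in of equally sized processes.
    return 2 * swap_time_ms(size_kb, rate_kb_per_s, latency_ms)


print(context_switch_cost_ms(1000, 5000, 8))  # 416.0
```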
Swapping
The execution time of each process should be long relative to the swapping time in this case (e.g., 416ms in the last example)!
Only swap in what is actually used. ⇒ Users must keep the system informed of memory usage.
Who should be swapped out?
“Lower Priority” Processes?
Any Constraint?
⇒ System Design
100k / (1000k per sec) = 100ms, plus disk latency
[Figure: a process Pi in memory with pending I/O (?I/O?); I/O buffering inside the OS]
Swapping
Separate swapping space from the file system for efficient usage
Disable swapping whenever possible, as in many versions of UNIX: swapping is triggered only if the memory usage passes a threshold and many processes are running!
In Windows 3.1, a swapped-out process is not swapped in until the user selects the process to run.
Contiguous Allocation – Single User
A single user is allocated as much memory as needed
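Problem: Size Restriction → Overlays (MS/DOS)
[Figure: memory map from address 0000 to 8888 holding the OS, the user program, and unused space; a relocation register and a limit register (marked a and b) bound the user area]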
Contiguous Allocation – Single User
Hardware Support for Memory Mapping and Protection
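[Figure: the CPU issues a logical address, which is compared with the limit register. Yes (logical address < limit): the relocation register is added to form the physical address sent to memory. No: trap.]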
Disadvantage: Wasting of CPU and Resources
∵ No Multiprogramming Possible
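A minimal sketch of the limit/relocation check in the hardware figure above. The exception class and the limit value of 1000 are assumptions for illustration; the 346 and 14000 values come from the earlier relocation-register slide.

```python
class AddressTrap(Exception):
    """Stands in for the hardware trap raised on an out-of-range address."""


def translate(logical, relocation, limit):
    # The limit comparison happens before the relocation register is added.
    if not 0 <= logical < limit:
        raise AddressTrap(f"logical address {logical} is outside limit {limit}")
    return logical + relocation


print(translate(346, relocation=14000, limit=1000))  # 14346
```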
Contiguous Allocation – Multiple Users
Fixed Partitions
Memory is divided into fixed partitions, e.g., OS/360 (or MFT)
A process is allocated on an entire partition
An OS Data Structure:
Partitions
#  size  location  status
1  25KB  20k       Used
2  15KB  45k       Used
3  30KB  60k       Used
4  10KB  90k       Free
[Figure: memory map with Partition 1 at 20k (proc 1), Partition 2 at 45k (proc 7), Partition 3 at 60k (proc 5), and Partition 4 at 90k-100k; the unused space inside a partition is “fragmentation”]
Contiguous Allocation – Multiple Users
Hardware Supports
Bound registers
Each partition may have a
protection key (corresponding to a key in the current PSW)
Disadvantage:
Fragmentation gives poor memory utilization !
Dynamic Partitions
Partitions are dynamically created.
OS tables record free and used partitions
Contiguous Allocation – Multiple Users
Used:
Base = 20k, size = 20KB, user = 1
Base = 70k, size = 20KB, user = 2
Free:
Base = 40k, size = 30KB
Base = 90k, size = 20KB
Input Queue: P3 with a 40KB memory request!
[Figure: memory map with the OS below 20k, Process 1 at 20k-40k, a free hole at 40k-70k, Process 2 at 70k-90k, and a free hole at 90k-110k]
Contiguous Allocation – Multiple Users
Solutions for dynamic storage allocation:
First Fit – Find a hole which is big enough
Advantage: Fast and likely to have large chunks of memory in high memory locations
Best Fit – Find the smallest hole which is big enough. → It might need a lot of search time and create lots of small fragments!
Advantage: Large chunks of memory available
Worst Fit – Find the largest hole and create a new partition out of it!
Advantage: Having the largest leftover holes, but with lots of search time!
Remark: First Fit and Best Fit are better in time and storage usage.
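The three hole-selection policies can be sketched as follows. This is an illustrative model only: the free list is assumed to be a Python list of (base, size) pairs in address order, and each function returns the index of the chosen hole or None when no hole is big enough.

```python
def first_fit(holes, request):
    # Take the first hole that is big enough.
    for i, (_, size) in enumerate(holes):
        if size >= request:
            return i
    return None


def best_fit(holes, request):
    # Take the smallest hole that is big enough (scans the whole list).
    fits = [i for i, (_, size) in enumerate(holes) if size >= request]
    return min(fits, key=lambda i: holes[i][1], default=None)


def worst_fit(holes, request):
    # Take the largest hole (scans the whole list).
    fits = [i for i, (_, size) in enumerate(holes) if size >= request]
    return max(fits, key=lambda i: holes[i][1], default=None)


# Hypothetical free list, sizes in KB.
holes = [(100, 500), (700, 200), (1000, 300), (1500, 600)]
print(first_fit(holes, 212), best_fit(holes, 212), worst_fit(holes, 212))  # 0 2 3
```

With the free list from the dynamic-partition slide, [(40, 30), (90, 20)], a 40KB request returns None under all three policies, even though 50KB is free in total.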
Contiguous Allocation Example – First Fit (RR Scheduler with Quantum = 1)
A job queue:
Process  Memory  Time
P1       600KB   10
P2       1000KB  5
P3       300KB   20
P4       700KB   8
P5       500KB   15
[Figure: six snapshots of a 2560k memory whose first 400k hold the OS]
Time = 0: only the OS is in memory; 400k-2560k is free.
Time = “0”: P1 at 400k-1000k, P2 at 1000k-2000k, P3 at 2000k-2300k; 2300k-2560k is free.
Time = 14: P2 terminates & frees its memory (1000k-2000k).
Time = “14”: P4 at 1000k-1700k; the holes are 300KB (1700k-2000k) + 260KB (2300k-2560k) = 560KB. P5? (It needs 500KB of contiguous memory.)
Time = 28: P1 is gone, so 400k-1000k is free.
Time = “28”: P5 at 400k-900k.
Fragmentation – Dynamic Partitions
External fragmentation occurs as small chunks of memory accumulate as a by-product of partitioning due to imperfect fits.
Statistical Analysis For the First-Fit Algorithm:
1/3 memory is unusable – 50-percent rule (given N allocated blocks, another 0.5N blocks are lost to fragmentation)
Solutions:
a. Merge adjacent free areas.
b. Compaction
- Compact all free areas into one contiguous region
- Requires user processes to be relocatable
Any optimal compaction strategy???
Fragmentation – Dynamic Partitions
Cost: Time Complexity O(n!)?!!
Combination of swapping and compaction
Dynamic/static relocation
[Figure: compaction alternatives. Original layout: OS (0-300K), P1 (300K-500K), P2 (500K-600K), a 400KB hole, P3 (1000K-1200K), a 300KB hole, P4 (1500K-1900K), a 200KB hole (1900K-2100K). Each alternative leaves one 900K hole:
(1) move P3 and P4 down to 600K-1200K: MOVE 600KB
(2) move P4 to 600K-1000K: MOVE 400KB
(3) move P3 to 1900K-2100K: MOVE 200KB]
Fragmentation – Dynamic Partitions
Internal fragmentation:
A small chunk of “unused” memory internal to a partition.
Reduce free-space maintenance cost
→ Give 20,002 bytes to P3 and have 2 bytes as an internal fragmentation!
[Figure: a 20,002-byte hole lies between P1 and P2, and P3 requests 20KB (20,000 bytes). Give P3 20KB & leave a 2-byte free area??]
Fragmentation – Dynamic Partitions
Dynamic Partitioning:
Advantage:
⇒ Eliminate fragmentation to some degree
⇒ Can have more partitions and a higher degree of multiprogramming
Disadvantage:
Compaction vs Fragmentation
The amount of free memory may not be enough for a process! (contiguous allocation)
Memory locations may be allocated but never referenced.
Relocation Hardware Cost & Slow Down
⇒ Solution: Paged Memory!
Paging
Objective
Users see a logically contiguous address space although its physical addresses are scattered throughout physical memory
Units of Memory and Backing Store
Physical memory is divided into fixed-sized blocks called frames.
The logical memory space of each process is divided into blocks of the same size called pages.
The backing store is also divided into blocks of the same size if used.
Paging – Basic Method
[Figure: the CPU issues a logical address (page number p, page offset d); entry p of the page table gives f, the base address of page p in physical memory; the physical address is (f, d)]
Paging – Basic Method
Address Translation
A page size tends to be a power of 2 for efficient address translation.
The actual page size depends on the computer architecture. Today, it ranges from 512B to 16KB.
Logical address (m bits): page # p (m-n bits), page offset d (n bits)
max number of pages: 2^(m-n)
Logical Address Space: 2^m
Physical Address Space: ???
Paging – Basic Method
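[Figure: paging example with 4-byte pages. Logical memory holds pages 0-3 (A, B, C, D) at addresses 0, 4, 8, 12. Page table: page 0 → frame 5, page 1 → frame 6, page 2 → frame 1, page 3 → frame 2. Physical memory has frames 0-7 at addresses 0, 4, 8, …, 28, with C and D in frames 1-2 and A and B in frames 5-6.]
Logical Address 01 01 (page 1, offset 1) = 1 * 4 + 1 = 5
Physical Address 110 01 (frame 6, offset 1) = 6 * 4 + 1 = 25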
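The example translation can be reproduced with shifts and masks, which is why a power-of-2 page size is convenient. This is a sketch using the figure's 4-byte pages and page table; the names are illustrative.

```python
PAGE_SIZE = 4                             # bytes per page, as in the figure
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # n = 2 offset bits
page_table = [5, 6, 1, 2]                 # page number -> frame number


def translate(logical):
    p = logical >> OFFSET_BITS            # page number: the high m-n bits
    d = logical & (PAGE_SIZE - 1)         # page offset: the low n bits
    f = page_table[p]
    return (f << OFFSET_BITS) | d


print(translate(5))  # 25
```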
Paging – Basic Method
No External Fragmentation
Paging is a form of dynamic relocation.
The average internal fragmentation is about one-half page per process
The page size generally grows over time as processes, data sets, and memory have become larger.
4-byte page table entry & 4KB per page → 2^32 * 2^12 B = 2^44 B = 16TB of physical memory
[Figure: page-size trade-offs among disk I/O efficiency, page table maintenance, and internal fragmentation]
* Example: 8KB or 4MB for Solaris.
Paging – Basic Method
Page Replacement:
An executing process has all of its pages in physical memory.
Maintenance of the Frame Table
One entry for each physical frame
The status of each frame (free or allocated) and its owner
The page table of each process must be saved when the process is preempted. → Paging increases context-switch time!
Paging – Hardware Support
Page Tables
Where: Registers or Memory
Efficiency is the main consideration!
The use of registers for page tables
The page table must be small!
The use of memory for page tables
Page-Table Base Register (PTBR)
Paging – Hardware Support
Page Tables on Memory
Advantages:
The size of a page table is unlimited!
The context switch cost may be low if the CPU dispatcher merely changes PTBR, instead of reloading another page table.
Disadvantages:
Memory access is slowed by a factor of 2
Translation Look-aside buffers (TLB)
Associative, high-speed memory
(key/tag, value) – 16 ~ 1024 entries
Less than 10% memory access time
Paging – Hardware Support
Translation Look-aside Buffers(TLB):
Disadvantages: Expensive Hardware and Flushing of Contents for Switching of Page Tables
Advantage: Fast – Constant-Search Time
[Figure: each TLB item is a (key, value) pair]
Paging – Hardware Support
[Figure: the CPU issues a logical address (p, d). On a TLB hit, the matching TLB entry (page#, frame#) supplies f directly; on a TLB miss, f is read from entry p of the page table. The physical address (f, d) then accesses physical memory.]
• Update TLB if a TLB miss occurs!
• Replacement of TLB entries might be needed.
* Address-Space Identifiers (ASID) in TLB for process matching? Protection? Flush?
Paging – Effective Memory Access Time
Hit Ratio = the percentage of times that a page number is found in the TLB
The hit ratio of a TLB largely depends on the size and the replacement
strategy of TLB entries!
Effective Memory Access Time
Hit-Ratio * (TLB lookup + a mapped memory access) + (1 – Hit-Ratio) * (TLB lookup + a page table lookup + a mapped memory access)
Paging – Effective Memory Access Time
An Example
20ns per TLB lookup, 100ns per memory access
Effective Access Time = 0.8*120ns + 0.2*220ns = 140ns, when hit ratio = 80%
Effective Access Time = 0.98*120ns + 0.02*220ns = 122ns, when hit ratio = 98%
Intel 486 has a 32-register TLB and claims a 98 percent hit ratio.
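The effective-access-time formula can be evaluated directly. This is a sketch: the function name is hypothetical, and the page_table_accesses parameter is an assumption added so the same formula also covers page tables that need more than one memory access per lookup.

```python
def effective_access_time(hit_ratio, tlb_ns, mem_ns, page_table_accesses=1):
    # TLB hit: TLB lookup + the mapped memory access.
    hit_ns = tlb_ns + mem_ns
    # TLB miss: TLB lookup + page table lookup(s) + the mapped memory access.
    miss_ns = tlb_ns + page_table_accesses * mem_ns + mem_ns
    return hit_ratio * hit_ns + (1 - hit_ratio) * miss_ns


print(round(effective_access_time(0.80, 20, 100), 1))  # 140.0
print(round(effective_access_time(0.98, 20, 100), 1))  # 122.0
```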
Paging – Protection & Sharing
Protection
Use a Page-Table Length Register (PTLR) to indicate the size of the page table.
Unused page table entries might be ignored during maintenance.
[Figure: a page table whose entries carry, besides the frame number, these bits:
- memory: Is the page in memory?
- r/w/e: protection, e.g., 100 r, 010 w, 110 rw, …
- dirty: Modified?
- Valid-Invalid Bit: Valid Page?]
Paging – Protection & Sharing
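Example: a 12287-byte Process (16384 = 2^14)
Logical address: p (3 bits), d (11 bits)
[Figure: 2KB pages. Logical pages P0-P5 cover addresses 0 through 12,287 (the program itself ends at 10,468). Page table: page 0 → frame 2 (V), 1 → 3 (V), 2 → 4 (V), 3 → 7 (V), 4 → 8 (V), 5 → 9 (V), 6 → 0 (i), 7 → 0 (i). Physical memory holds P0-P2 in frames 2-4 and P3-P5 in frames 7-9.]
(PTLR entries?)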
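A minimal sketch of how the valid-invalid bit guards translation, using the page table from the example. The exception class and helper name are assumptions for illustration.

```python
PAGE_SIZE = 2048  # 2KB pages, so the offset d takes 11 bits

# (frame number, valid bit) for pages 0..7, as in the example.
page_table = [(2, True), (3, True), (4, True), (7, True),
              (8, True), (9, True), (0, False), (0, False)]


class InvalidPageTrap(Exception):
    """Stands in for the trap raised on a reference to an invalid page."""


def translate(logical):
    p, d = divmod(logical, PAGE_SIZE)
    if p >= len(page_table) or not page_table[p][1]:
        raise InvalidPageTrap(f"page {p} is not valid")
    frame, _ = page_table[p]
    return frame * PAGE_SIZE + d


print(translate(10468))  # 18660
```

Addresses 10,469 through 12,287 still translate because page 5 is marked valid as a whole; only references from 12,288 upward trap.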
Paging – Protection & Sharing
Procedures which are executed often (e.g., editor) can be divided into procedure + data. Memory can be saved a lot.
Reentrant procedures can be saved! The non-modified nature of saved code must be enforced
Address referencing inside shared pages could be an issue.
[Figure: processes P1 and P2 share the editor pages ed1, ed2, ed3 (marked *), held once in frames 3, 4, and 6. Page Table 1 = (3, 4, 6, 1), with Data 1 in frame 1; Page Table 2 = (3, 4, 6, 7), with Data 2 in frame 7.]
Multilevel Paging
Motivation
The logical address space of a process in many modern computer systems is very large, e.g., 2^32 to 2^64 Bytes.
32-bit address, 4KB per page → 2^20 page entries; 4B per entry → 4MB page table
→ Even the page table must be divided into pieces to fit in the memory!
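The page table size arithmetic can be reproduced as follows; the function name is hypothetical.

```python
def page_table_size(address_bits, page_bytes, entry_bytes):
    """Return (number of entries, table size in bytes) for a flat page table."""
    entries = 2 ** address_bits // page_bytes
    return entries, entries * entry_bytes


entries, size = page_table_size(32, 4096, 4)
print(entries == 2 ** 20, size // 2 ** 20)  # True 4
```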
Multilevel Paging – Two-Level Paging
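[Figure: the logical address is split into (P1, P2, d). PTBR locates the outer-page table; P1 indexes it to find a page of the page table; P2 indexes that page to find the frame; d is the offset within that frame in physical memory.]
Forward-Mapped Page Table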
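A sketch of a two-level lookup. The 10/10/12 bit split is an assumption for a 32-bit address with 4KB pages, and nested dictionaries stand in for the outer-page table and the pages of the page table.

```python
P1_BITS, P2_BITS, D_BITS = 10, 10, 12  # assumed split of a 32-bit address


def split(addr):
    d = addr & ((1 << D_BITS) - 1)
    p2 = (addr >> D_BITS) & ((1 << P2_BITS) - 1)
    p1 = addr >> (D_BITS + P2_BITS)
    return p1, p2, d


def translate(addr, outer_table):
    p1, p2, d = split(addr)
    inner_table = outer_table[p1]  # first memory access: outer-page table
    frame = inner_table[p2]        # second memory access: a page of the page table
    return (frame << D_BITS) | d


# Hypothetical mapping: outer entry 3 -> inner table, whose entry 5 -> frame 77.
outer = {3: {5: 77}}
addr = (3 << 22) | (5 << 12) | 0x2A
print(split(addr))  # (3, 5, 42)
```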
Multilevel Paging – N-Level Paging
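Motivation: Two-level paging is not appropriate for a huge logical address space!
[Figure: the logical address is split into N pieces (P1, P2, …, Pn) plus the offset d. PTBR locates the first-level table, and each piece Pi indexes the table at level i until Pn yields the frame in physical memory.]
Memory accesses per reference: 1 + 1 + … + 1 + 1 = n+1 accesses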
Multilevel Paging – N-Level Paging
Example
98% hit ratio, 4-level paging, 20ns TLB access time, 100ns memory access time.
Effective access time = 0.98 X 120ns + 0.02 X 520ns = 128ns
SUN SPARC (32-bit addressing) → 3-level paging
Motorola 68030 (32-bit addressing) → 4-level paging
VAX (32-bit addressing) → 2-level paging
Hashed Page Tables
Objective:
To handle large address spaces
Virtual address → hash function → a linked list of elements (virtual page #, frame #, a pointer)
Clustered Page Tables
Each entry contains the mappings for several physical-page frames, e.g., 16.
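A minimal sketch of a hashed page table. The class and method names are hypothetical, and a Python list per bucket stands in for the linked list of (virtual page #, frame #) elements; the pointer field is implicit.

```python
class HashedPageTable:
    def __init__(self, num_buckets=16):
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, vpn):
        # The hash function selects the chain to search.
        return self.buckets[hash(vpn) % len(self.buckets)]

    def insert(self, vpn, frame):
        chain = self._chain(vpn)
        for i, (v, _) in enumerate(chain):
            if v == vpn:
                chain[i] = (vpn, frame)  # remap an existing page
                return
        chain.append((vpn, frame))

    def lookup(self, vpn):
        # Walk the chain, comparing virtual page numbers.
        for v, frame in self._chain(vpn):
            if v == vpn:
                return frame
        return None  # no mapping


table = HashedPageTable()
table.insert(3, 7)
table.insert(19, 9)  # 19 % 16 == 3, so it shares a chain with page 3
print(table.lookup(3), table.lookup(19), table.lookup(35))  # 7 9 None
```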
Inverted Page Table
Motivation
A page table tends to be big and does not correspond to the # of pages residing in the physical memory.
Each entry corresponds to a physical frame.
Virtual Address: <Process ID, Page Number, Offset>
[Figure: An Inverted Page Table. The CPU issues a logical address (pid, p, d); the table is searched for the entry (pid, p); the index f of the matching entry forms the physical address (f, d) into physical memory.]
Inverted Page Table
Each entry contains the virtual address of the page stored in the frame.
Entries are sorted by physical addresses.
One table per system.
When no match is found, the page table of the corresponding process must be referenced.
Example Systems: HP Spectrum, IBM RT, PowerPC, SUN UltraSPARC
Inverted Page Table
Advantage
Decrease the amount of memory needed to store each page table
Disadvantage
The inverted page table is sorted by physical addresses, whereas a page reference is in a logical address.
The use of a hash table to eliminate lengthy table lookup time: 1 HASH + 1 IPT
The use of an associative memory to hold recently located entries.
Difficult to implement with shared memory
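A sketch contrasting the linear search of an inverted page table with the hash-assisted lookup (one hash probe plus one table access). All names are hypothetical, and a Python dict stands in for the hash table.

```python
class InvertedPageTable:
    def __init__(self, num_frames):
        self.frames = [None] * num_frames  # frames[f] = (pid, page) or None
        self.index = {}                    # hash table: (pid, page) -> f

    def map(self, pid, page, frame):
        old = self.frames[frame]
        if old is not None:
            del self.index[old]            # the frame's previous page is evicted
        prev = self.index.pop((pid, page), None)
        if prev is not None:
            self.frames[prev] = None       # the page leaves its previous frame
        self.frames[frame] = (pid, page)
        self.index[(pid, page)] = frame

    def linear_lookup(self, pid, page):
        # The table is ordered by frame, so a virtual reference needs a scan.
        for f, entry in enumerate(self.frames):
            if entry == (pid, page):
                return f
        return None

    def hashed_lookup(self, pid, page):
        f = self.index.get((pid, page))                     # 1 HASH
        if f is not None and self.frames[f] == (pid, page):  # 1 IPT
            return f
        return None


ipt = InvertedPageTable(8)
ipt.map(1, 5, 3)
ipt.map(2, 5, 6)
print(ipt.hashed_lookup(1, 5), ipt.linear_lookup(2, 5))  # 3 6
```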
Segmentation
Segmentation is a memory management scheme that supports the user view of memory:
A logical address space is a collection of segments with variable lengths.
Subroutine
Sqrt
Stack
Symbol table
Main program
Segmentation
Why Segmentation?
Paging separates the user’s view of memory from the actual physical
memory but does not reflect the logical units of a process!
Pages & frames are fixed-sized, but segments have variable sizes.
For simplicity of representation, <segment name, offset> → <segment-number, offset>
Segmentation – Hardware Support
Address Mapping
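[Figure: the CPU issues a logical address (s, d); entry s of the segment table supplies (limit, base). If d < limit (yes), the physical address base + d accesses physical memory; otherwise (no), a trap is raised.]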
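A minimal sketch of the segment-table mapping in the figure above. The table contents are illustrative values, not taken from the slides, and the exception class is an assumption standing in for the hardware trap.

```python
class SegmentTrap(Exception):
    """Stands in for the trap raised on a bad segment number or offset."""


# Hypothetical segment table: segment number -> (limit, base).
segment_table = [(1000, 1400), (400, 6300), (400, 4300), (1100, 3200), (1000, 4700)]


def translate(s, d):
    if not 0 <= s < len(segment_table):
        raise SegmentTrap(f"no segment {s}")
    limit, base = segment_table[s]
    if not 0 <= d < limit:  # the offset is checked against the limit first
        raise SegmentTrap(f"offset {d} exceeds limit {limit} of segment {s}")
    return base + d


print(translate(2, 53))  # 4353
```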
Segmentation – Hardware Support
Implementation in Registers – limited size!
Implementation in Memory
Segment-table base register (STBR)
Segment-table length register (STLR)
Advantages & Disadvantages – as in Paging
Use an associative memory (TLB) to improve the effective memory access time!
TLB must be flushed whenever a new segment table is used!
[Figure: STBR points to the segment table in memory; STLR holds its length]
Segmentation – Protection & Sharing
Advantage:
Segments are semantically defined portions of the program and likely to have all entries being “homogeneous”.
Example: Array, code, stack, data, etc.
→ Logical units for protection!
Sharing of code & data improves memory usage.
Sharing occurs at the segment level.
Segmentation – Protection & Sharing
Potential Problems
External Fragmentation
Segments must occupy contiguous memory.
Address referencing inside shared segments can be a big issue:
How to find the right segment number if the number of users sharing the segments increases! → Avoid reference to segment #
[Figure: an address inside a shared segment given as (Seg#, offset)]
Indirect addressing?!!!
Should all shared-code segments have the same segment number?
Segmentation – Fragmentation
Motivation:
Segments are of variable lengths!
→ Memory allocation is a dynamic storage-allocation problem.
best-fit? first-fit? worst-fit?
External fragmentation will occur!!
Factors, e.g., average segment sizes
[Figure: as the segment size shrinks toward a byte, external fragmentation decreases, but overheads increase substantially! (base+limit “registers”)]
Segmentation – Fragmentation
Remark:
Its external fragmentation problem is better than that of the dynamic
partition method because segments are likely to be smaller than the
entire process.
Internal Fragmentation??
Segmentation with Paging
Motivation :
Segmentation has external fragmentation.
Paging has internal fragmentation.
Segments are semantically defined portions of a program.
→ “Page” Segments!
Pentium Segmentation
Logical Address = (Selector, Segment Offset)
Selector (16 bits): s (13 bits), g (1 bit), p (2 bits)
Segment Offset sd: 32 bits
8K Private Segments + 8K Public Segments
Page Size = 4KB or 4MB (page size flag in the page directory), Max Segment Size = 4GB
Tables:
Local Descriptor Table (LDT)
Global Descriptor Table (GDT)
6 microprogram segment registers for caching
Linear Address (32 bits): p1 (10 bits), p2 (10 bits), d (12 bits)
Pentium Segmentation
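[Figure: Pentium address translation. The 16-bit selector (s+g+p) picks a descriptor from the descriptor table, giving the segment base and the segment length. The 32-bit offset sd is checked against the segment length (trap if it is too large) and added to the segment base to form the linear address (p1, p2, d) of 10, 10, and 12 bits. The Page Directory Base Register locates the page directory; p1 indexes the page directory, p2 indexes the selected page table to obtain frame f, and the physical address is (f, d).]
*Page tables are limited by the segment lengths of their segments. Consider the page size flag and the invalid bit of each page directory entry.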
Linux on Pentium Systems
Limitation & Goals
Support a variety of machines
Use segmentation minimally – GDT
One individual segment each for the kernel code, kernel data, the user code, the user data, the task state segment, the default LDT
Protection: user and kernel modes
Linear Address on Pentium: p1 (10 bits), p2 (10 bits), d (12 bits)
3-Level Paging Address: global directory, middle directory, page table, offset
Paging and Segmentation
To overcome disadvantages of paging or segmentation alone:
Paged segments – divide segments further into pages.
Segment need not be in contiguous memory.
Segmented paging – segment the page table.
Variable size page tables.
Address translation overheads increase!
An entire process still needs to be in memory at once!
→ Virtual Memory!!
Paging and Segmentation
Considerations in Memory Management
Hardware Support, e.g., STBR, TLB, etc.
Performance
Fragmentation
Multiprogramming Levels
Relocation Constraints?
Swapping: +
Sharing?!
Protection?!