Memory Hierarchy in Computer Organisation
Registers, Cache, DRAM, ROM, Virtual Memory, TLB & Memory Organisation — Complete GATE CS Notes
Last updated: April 2026 | GATE CS syllabus aligned
Key Takeaways
- Memory hierarchy trades off speed, size, and cost — faster memory is smaller and more expensive per bit
- SRAM (flip-flop based) is used for cache — fast, no refresh, expensive. DRAM (capacitor based) is used for main memory — slower, needs refresh, cheap
- Memory organisation: to expand address space or data width, chips can be connected in a bank (word extension) or interleaved (bandwidth improvement)
- Virtual memory allows processes to use more address space than physical RAM — the OS manages page tables and handles page faults
- TLB (Translation Lookaside Buffer) caches recent address translations — a TLB hit avoids page table memory accesses
- Effective memory access time with TLB = TLB time + (TLB miss rate × page table access time) + memory access time
- Thrashing occurs when page fault rate is so high that the CPU barely executes program instructions
1. The Memory Hierarchy Pyramid
Every computer manages a fundamental tension: the memory closest to the CPU (registers) is blazing fast but can only hold a handful of values; the memory that can store your entire operating system and applications (hard disk) is enormous but agonisingly slow. The memory hierarchy bridges this gap by using multiple layers, each faster and more expensive than the layer below it.
| Level | Technology | Typical Size | Access Time | Managed By |
|---|---|---|---|---|
| Registers | Flip-flops (SRAM) | ~256 bytes (32 × 64-bit) | 0.3–0.5 ns | Compiler |
| L1 Cache | SRAM | 32–64 KB per core | 1–4 cycles (≈1 ns) | Hardware |
| L2 Cache | SRAM | 256 KB – 1 MB | 10–20 cycles (≈5 ns) | Hardware |
| L3 Cache | SRAM | 4–32 MB (shared) | 30–50 cycles (≈15 ns) | Hardware |
| Main Memory | DRAM | 4–128 GB | 100–200 cycles (≈60 ns) | OS + Hardware |
| SSD | NAND Flash | 128 GB – 8 TB | ~50–100 μs | OS |
| HDD | Magnetic disk | 500 GB – 20 TB | ~5–10 ms | OS |
| Optical / Tape | Various | Effectively unlimited | Seconds to minutes | Operator |
The principle that makes this hierarchy work: most programs access a small subset of their data most of the time (locality of reference). If the cache holds the 1% of data that accounts for 99% of accesses, average access time is close to cache speed despite main memory being 100× slower.
2. SRAM vs DRAM
| Property | SRAM (Static RAM) | DRAM (Dynamic RAM) |
|---|---|---|
| Storage element | 6-transistor flip-flop (bistable latch) | 1 capacitor + 1 transistor |
| Data retention | Holds data as long as power is on (no refresh) | Capacitor leaks — must refresh every ~64 ms |
| Speed | Very fast: 1–5 ns | Slower: 50–100 ns |
| Density | Low — 6 transistors per bit | High — 1 transistor + capacitor per bit |
| Power consumption | Higher (always active) | Lower when not accessed |
| Cost | Expensive per MB | Cheap per GB |
| Use | L1/L2/L3 cache, register files, TLB | Main memory (RAM) |
DRAM must be refreshed every T_refresh period (typically 64 ms).
During refresh, memory is unavailable for access.
Overhead % = (Refresh time per row × Number of rows) / T_refresh × 100%
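The overhead formula can be sanity-checked with a quick sketch. The row count and per-row refresh time below are illustrative assumptions, not figures for any specific DRAM part:

```python
# Hypothetical DRAM: 8192 rows, 50 ns to refresh one row,
# full refresh of every row required each 64 ms (T_refresh).
refresh_time_per_row_ns = 50
num_rows = 8192
t_refresh_ns = 64e6  # 64 ms expressed in ns

# Overhead % = (refresh time per row x number of rows) / T_refresh x 100
overhead_pct = (refresh_time_per_row_ns * num_rows) / t_refresh_ns * 100
print(f"Refresh overhead: {overhead_pct:.2f}%")  # 0.64%
```

Even with thousands of rows, refresh typically steals well under 1% of available memory bandwidth — which is why DRAM's refresh cost is acceptable for main memory.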
3. ROM Types — ROM, PROM, EPROM, EEPROM, Flash
Read-Only Memory is non-volatile — it retains data without power. Different ROM types offer different trade-offs between reprogrammability and complexity.
| Type | Full Name | Programmed By | Erasable? | Use |
|---|---|---|---|---|
| ROM | Read-Only Memory | Manufacturer (mask ROM) | No | Fixed firmware in consumer devices |
| PROM | Programmable ROM | User (one-time, by burning fuses) | No | Small-run production firmware |
| EPROM | Erasable PROM | User (electrically) | Yes — UV light (entire chip) | Development and prototyping |
| EEPROM | Electrically Erasable PROM | User (electrically) | Yes — byte by byte, electrically | BIOS chips, smart cards |
| Flash | Flash EEPROM | User (electrically) | Yes — in blocks, electrically | SSDs, USB drives, BIOS, phones |
4. Memory Organisation — Chips, Banks & Interleaving
Real systems build their memory from multiple chips. Two configurations matter for GATE:
Word Extension (Increasing Address Space)
Connect chips to cover a larger address range — each chip covers a portion of the address space.
Total memory needed = N bytes
Each chip holds = C bytes
Number of chips = N / C
Address lines needed from CPU = log₂(N / word_size_in_bytes)
Address lines handled by chip select = log₂(N / C)
Address lines going into each chip = log₂(C / word_size)
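These formulas can be worked through in code. The numbers here are an assumed example (64 KB of byte-addressable memory built from 16 KB chips), not from the text:

```python
import math

# Assumed example: 64 KB total, 16 KB per chip, 1-byte words
N = 64 * 1024        # total memory in bytes
C = 16 * 1024        # bytes per chip
word_size = 1        # bytes per word

num_chips = N // C                                 # chips in the bank
cpu_addr_lines = int(math.log2(N // word_size))    # total address lines from CPU
chip_select_lines = int(math.log2(N // C))         # decoded into chip-select signals
per_chip_lines = int(math.log2(C // word_size))    # lines wired to every chip

print(num_chips, cpu_addr_lines, chip_select_lines, per_chip_lines)  # 4 16 2 14
```

Note that chip-select lines plus per-chip lines always equal the CPU's total address lines — the high-order bits pick the chip, the low-order bits pick the location within it.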
Bit Extension (Wider Data Word)
Connect multiple chips in parallel to increase data width — each chip provides some bits of each word.
If CPU has 32-bit data bus and each chip is 8 bits wide:
Chips needed = 32 / 8 = 4 chips (connected in parallel)
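Word extension and bit extension combine multiplicatively: rows of chips provide depth, columns provide width. A minimal sketch (function name is ours, for illustration):

```python
def chips_needed(total_words, word_bits, chip_words, chip_bits):
    """Chip count = rows (word extension) x columns (bit extension)."""
    rows = total_words // chip_words   # word extension: cover the address range
    cols = word_bits // chip_bits      # bit extension: cover the data width
    return rows * cols

# 32-bit data bus from 8-bit-wide chips of the same depth: 4 chips in parallel
print(chips_needed(1024, 32, 1024, 8))       # 4
# GATE-style: 32K x 8-bit memory from 1K x 4-bit chips
print(chips_needed(32 * 1024, 8, 1024, 4))   # 64
```

The second call matches the kind of chip-organisation problem worked through in the examples section.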
Memory Interleaving
Interleaving divides memory into banks that can be accessed independently. While one bank is being read, the others are already preparing their next access — increasing effective memory bandwidth.
| Property | Low-Order Interleaving | High-Order Interleaving (Banking) |
|---|---|---|
| Bank selection bits | Low-order address bits → consecutive addresses in different banks | High-order bits → consecutive addresses in same bank |
| Best for | Sequential access (streaming) — exploits spatial locality | Independent bank access by multiple processors |
| Conflict | Stride-k access causes conflicts when k is a multiple of the bank count | Sequential access serialised within one bank |
5. Virtual Memory
Virtual memory is an abstraction that gives each process the illusion of having a large, private address space, even if physical RAM is limited. The OS and hardware collaborate to map virtual addresses to physical addresses, swapping pages between RAM and disk as needed.
Why virtual memory?
- Isolation: each process has its own virtual address space — process A cannot accidentally overwrite process B’s memory
- Size: a process’s virtual address space can be larger than physical RAM (64-bit processes can address 16 exabytes)
- Simplicity: programmers do not need to manage physical memory locations
Virtual page number (VPN) = Virtual address / Page size
Page offset = Virtual address mod Page size (same in physical address)
Physical Frame Number (PFN) = lookup VPN in page table
Physical Address = PFN × Page size + Page offset
Number of pages in virtual space:
= Virtual address space / Page size
= 2^(virtual address bits) / 2^(page offset bits)
= 2^(VPN bits)
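The translation arithmetic above can be sketched directly. The page size and page-table mappings below are assumptions chosen for illustration:

```python
PAGE_SIZE = 4096  # assumed 4 KB pages

# Toy page table: VPN -> PFN (mappings invented for this sketch)
page_table = {0: 7, 1: 3, 2: 9}

def translate(vaddr):
    vpn = vaddr // PAGE_SIZE
    offset = vaddr % PAGE_SIZE       # offset is unchanged in the physical address
    pfn = page_table[vpn]            # a missing key stands in for a page fault here
    return pfn * PAGE_SIZE + offset

# VA 0x1234 -> VPN 1, offset 0x234 -> PFN 3 -> PA = 3*4096 + 0x234
print(hex(translate(0x1234)))  # 0x3234
```

Note how only the high-order bits (the VPN) change during translation — the low-order offset bits pass straight through.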
6. Page Tables & Address Translation
The page table is an array (typically in main memory) where each entry maps one virtual page to one physical frame. Each entry (PTE — Page Table Entry) contains:
- Valid bit: 1 = page is in physical memory; 0 = page is on disk (triggers a page fault)
- Physical Frame Number (PFN): the location in RAM
- Protection bits: read/write/execute permissions
- Dirty bit: page has been written (must save to disk on eviction)
- Reference/Access bit: page has been accessed recently (used by replacement algorithms)
Page table size = Number of virtual pages × Size of each PTE
                = (Virtual address space / Page size) × PTE size
Example: 32-bit virtual address, 4 KB pages, 4-byte PTE:
Number of pages = 2³² / 2¹² = 2²⁰ = 1,048,576 pages
Page table size = 1,048,576 × 4 bytes = 4 MB (per process — huge!)
This 4 MB per-process overhead led to multi-level page tables: the page table itself is paged, so only the portions needed are kept in memory. A 2-level page table splits the VPN into two fields — a first-level index and a second-level index. Modern processors use 4-level page tables for 64-bit addressing.
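The 2-level split can be shown with bit masks. The 10 + 10 + 12 split below is the classic 32-bit/4 KB layout assumed for illustration:

```python
OFFSET_BITS = 12   # 4 KB pages
L2_BITS = 10       # second-level (page table) index
L1_BITS = 10       # first-level (page directory) index

def split_vaddr(vaddr):
    """Split a 32-bit virtual address into (L1 index, L2 index, offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    l2 = (vaddr >> OFFSET_BITS) & ((1 << L2_BITS) - 1)
    l1 = vaddr >> (OFFSET_BITS + L2_BITS)
    return l1, l2, offset

print(split_vaddr(0xDEADBEEF))  # (890, 731, 3823)
```

With this split, a process touching only a few MB of memory needs the first-level table plus a handful of second-level tables — far less than the flat 4 MB.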
7. TLB — Translation Lookaside Buffer
Every memory access in a virtual memory system requires at least two physical memory accesses: one to read the page table, and one for the actual data. This doubles memory access time — unacceptable for performance. The TLB solves this.
The TLB is a small (16–512 entries), fully-associative, on-chip cache that stores recent VPN→PFN mappings. On most accesses, the TLB provides the physical address in 1–2 cycles, avoiding the page table lookup entirely.
EMAT = TLB hit rate × (TLB time + Memory time)
+ TLB miss rate × (TLB time + Page table time + Memory time)
Simplified (if TLB access is part of every translation):
EMAT = TLB time + (TLB miss rate × Page table walk time) + Memory time
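Both forms of the formula reduce to the same weighted average. A minimal sketch, assuming a single-level page table and ignoring page faults:

```python
def emat(tlb_hit_rate, tlb_ns, mem_ns, page_table_ns):
    """Weighted EMAT: hit path vs miss path (single-level page table, no page faults)."""
    hit_time = tlb_ns + mem_ns                   # TLB lookup + data access
    miss_time = tlb_ns + page_table_ns + mem_ns  # + one page table access
    return tlb_hit_rate * hit_time + (1 - tlb_hit_rate) * miss_time

# 90% hit rate, 10 ns TLB, 100 ns memory, 100 ns page table access
print(emat(0.90, 10, 100, 100))
```

Plugging in these numbers gives 120 ns, matching the GATE-style worked example later in these notes.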
TLB hit: Physical address found in TLB → access data in memory
(Total: TLB lookup + 1 memory access)
TLB miss, page in memory: TLB lookup fails → walk page table (1+ memory accesses) → update TLB → access data
(Total: TLB lookup + k page table accesses + 1 memory access, where k = page table levels)
TLB miss, page fault: Page not in RAM → OS handles page fault → load page from disk → retry
(Disk access: millions of cycles — extremely expensive)
| Event | TLB | Page Table | Physical Memory | Disk |
|---|---|---|---|---|
| TLB Hit | Hit | Not needed | Accessed | Not needed |
| TLB Miss, Page In RAM | Miss | Accessed (to get PFN) | Accessed | Not needed |
| TLB Miss, Page Fault | Miss | Accessed (valid bit = 0) | Page loaded into RAM | Page read from disk |
8. Thrashing
Thrashing is a catastrophic performance failure in virtual memory systems. It happens when the total working set of all running processes exceeds available physical memory. Pages that are constantly needed keep getting evicted to make room for other pages that are also constantly needed — causing a continuous, high-rate stream of page faults.
The symptom: CPU utilisation drops sharply (often below 10%) even as I/O activity spikes, because the CPU is idle waiting for pages to be loaded from disk.
Causes of thrashing:
- Too many processes running simultaneously (high degree of multiprogramming)
- A process whose working set exceeds available frames
Solutions:
- Reduce the degree of multiprogramming (swap out a process entirely)
- Increase physical memory
- Use working set model — allocate enough frames for each process to hold its working set
- Use page fault frequency (PFF) algorithm — add frames to processes with high fault rates; reclaim from processes with low fault rates
9. GATE-Level Worked Examples
Example 1 — Memory Chip Organisation (GATE 2020)
Problem: A memory system requires 32 KB of total memory with 8-bit wide access. Available chips are 1K × 4-bit. How many chips are needed?
Required: 32 KB = 32 × 1024 = 32,768 locations × 8 bits
Each chip: 1K locations × 4 bits = 1024 × 4
Chips for bit extension (to get 8 bits wide): 8 / 4 = 2 chips per row
Chips for word extension (to get 32K locations): 32K / 1K = 32 rows
Total chips = 2 × 32 = 64 chips
Example 2 — Page Table Size (GATE 2021)
Problem: A 32-bit virtual address system uses pages of size 8 KB. Each page table entry is 4 bytes. What is the size of the page table for a single process?
Page size = 8 KB = 2¹³ bytes → Page offset = 13 bits
VPN bits = 32 − 13 = 19 bits
Number of virtual pages = 2¹⁹ = 524,288 pages
Page table size = 524,288 × 4 bytes = 2,097,152 bytes = 2 MB
Example 3 — EMAT with TLB (GATE 2022)
Problem: A system has a TLB with 90% hit rate. TLB access time = 10 ns. Main memory access time = 100 ns. On a TLB miss, one additional page table access is needed. What is the EMAT?
TLB hit: 10 ns (TLB) + 100 ns (memory) = 110 ns
TLB miss: 10 ns (TLB) + 100 ns (page table in memory) + 100 ns (data in memory) = 210 ns
EMAT = 0.90 × 110 + 0.10 × 210
= 99 + 21 = 120 ns
10. Common Mistakes
- Confusing SRAM and DRAM refresh requirements: SRAM does not need refresh — it is a bistable circuit that holds its state indefinitely while powered. DRAM capacitors leak and must be refreshed every ~64 ms. Saying “SRAM needs periodic refresh” is a common incorrect choice in GATE MCQs.
- Forgetting the page table access in the EMAT calculation: on a TLB miss, the CPU must access the page table in main memory before accessing the data. Students often count only the data memory access after a TLB miss, missing the page table lookup. For a single-level page table, a TLB miss adds 1 extra memory access.
- Mixing virtual address bits and physical address bits: virtual and physical address spaces can be different sizes. Page table size is determined by the virtual address space (number of virtual pages); physical frame addresses are determined by the physical address space. Always distinguish which address space the question refers to.
- Computing page table size using physical frames, not virtual pages: the page table has one entry per virtual page, not per physical frame. Even if physical RAM is only 512 MB (fewer frames), the page table still needs entries for all 2^VPN virtual pages.
- Assuming thrashing is caused by a single process: thrashing is a system-level phenomenon caused by the aggregate working set of all running processes exceeding physical memory. A single process with a small working set will not thrash — but it becomes a victim when other processes push total demand over the limit.
11. Frequently Asked Questions
What is the memory hierarchy in computer organisation?
The memory hierarchy is a layered arrangement of storage, ordered from fastest/smallest/most expensive (registers) to slowest/largest/cheapest (tape or optical). Each level acts as a cache for the level below it. The hierarchy works because most programs exhibit locality — they reuse a small, changing subset of their data most of the time.
What is the difference between SRAM and DRAM?
SRAM uses a 6-transistor bistable flip-flop to store each bit. It is fast (1–5 ns), does not require refreshing, but is large and expensive. Used for CPU caches. DRAM stores each bit in a tiny capacitor that leaks charge — it requires periodic refresh every ~64 ms to retain data, but offers very high density and low cost per gigabyte. Used for main memory.
What is a TLB in computer organisation?
The Translation Lookaside Buffer (TLB) is a small, fully-associative, on-chip cache that stores recent virtual-to-physical address translations. Without the TLB, every memory access would require at least two physical memory reads — one to look up the page table, one for the actual data. The TLB reduces most translations to a single on-chip lookup, dramatically reducing average memory access time.
What is thrashing in memory management?
Thrashing is a condition where the system’s page fault rate is so high that it spends more time handling page faults (loading pages from disk) than executing program instructions. CPU utilisation collapses, I/O spikes, and the system appears frozen. It is caused when total working set size exceeds physical memory. The solution is to reduce the number of concurrent processes or add physical memory.