Memory Hierarchy — RAM, ROM, Virtual Memory & TLB | GATE CS

Memory Hierarchy in Computer Organisation

Registers, Cache, DRAM, ROM, Virtual Memory, TLB & Memory Organisation — Complete GATE CS Notes

Last updated: April 2026  |  GATE CS syllabus aligned

Key Takeaways

  • Memory hierarchy trades off speed, size, and cost — faster memory is smaller and more expensive per bit
  • SRAM (flip-flop based) is used for cache — fast, no refresh, expensive. DRAM (capacitor based) is used for main memory — slower, needs refresh, cheap
  • Memory organisation: to expand address space or data width, chips can be connected in a bank (word extension) or interleaved (bandwidth improvement)
  • Virtual memory allows processes to use more address space than physical RAM — the OS manages page tables and handles page faults
  • TLB (Translation Lookaside Buffer) caches recent address translations — a TLB hit avoids page table memory accesses
  • Effective memory access time (EMAT) with a TLB ≈ TLB access time + (TLB miss rate × page table access time) + memory access time
  • Thrashing occurs when page fault rate is so high that the CPU barely executes program instructions

1. The Memory Hierarchy Pyramid

Every computer manages a fundamental tension: the memory closest to the CPU (registers) is blazing fast but can only hold a handful of values; the memory that can store your entire operating system and applications (hard disk) is enormous but agonisingly slow. The memory hierarchy bridges this gap by using multiple layers, each faster and more expensive than the layer below it.

| Level | Technology | Typical Size | Access Time | Managed By |
|---|---|---|---|---|
| Registers | Flip-flops (SRAM) | ~256 bytes (32 × 64-bit) | 0.3–0.5 ns | Compiler |
| L1 Cache | SRAM | 32–64 KB per core | 1–4 cycles (≈1 ns) | Hardware |
| L2 Cache | SRAM | 256 KB – 1 MB | 10–20 cycles (≈5 ns) | Hardware |
| L3 Cache | SRAM | 4–32 MB (shared) | 30–50 cycles (≈15 ns) | Hardware |
| Main Memory | DRAM | 4–128 GB | 100–200 cycles (≈60 ns) | OS + Hardware |
| SSD | NAND Flash | 128 GB – 8 TB | ~50–100 μs | OS |
| HDD | Magnetic disk | 500 GB – 20 TB | ~5–10 ms | OS |
| Optical / Tape | Various | Effectively unlimited | Seconds to minutes | Operator |

The principle that makes this hierarchy work: most programs access a small subset of their data most of the time (locality of reference). If the cache holds the 1% of data that accounts for 99% of accesses, average access time is close to cache speed despite main memory being 100× slower.
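The arithmetic behind that claim can be sketched in a few lines of Python. The 1 ns cache and 100 ns memory figures are illustrative assumptions, not measurements from any specific system:

```python
# Average access time under locality: hits are served at cache speed,
# misses pay the cache lookup plus the main-memory penalty.
# Assumed illustrative timings: 1 ns cache, 100 ns main memory, 99% hit rate.
cache_time_ns = 1.0
memory_time_ns = 100.0
hit_rate = 0.99

avg_ns = hit_rate * cache_time_ns + (1 - hit_rate) * (cache_time_ns + memory_time_ns)
print(f"Average access time: {avg_ns:.2f} ns")  # close to cache speed
```

Even though main memory is 100× slower, the average lands around 2 ns, which is why the hierarchy pays off.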

2. SRAM vs DRAM

| Property | SRAM (Static RAM) | DRAM (Dynamic RAM) |
|---|---|---|
| Storage element | 6-transistor flip-flop (bistable latch) | 1 capacitor + 1 transistor |
| Data retention | Holds data as long as power is on (no refresh) | Capacitor leaks — must refresh every few ms |
| Speed | Very fast: 1–5 ns | Slower: 50–100 ns |
| Density | Low — 6 transistors per bit | High — 1 transistor + capacitor per bit |
| Power consumption | Higher (always active) | Lower when not accessed |
| Cost | Expensive per MB | Cheap per GB |
| Use | L1/L2/L3 cache, register files, TLB | Main memory (RAM) |

DRAM Refresh Overhead:
DRAM must be refreshed every T_refresh period (typically 64 ms).
During refresh, memory is unavailable for access.
Overhead % = (Refresh time per row × Number of rows) / T_refresh × 100%
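Plugging illustrative numbers into the overhead formula (8192 rows, 50 ns per row refresh, 64 ms refresh period; assumed figures, not a specific DRAM part):

```python
# DRAM refresh overhead = (refresh time per row × number of rows) / T_refresh.
# All figures below are hypothetical, chosen only to exercise the formula.
rows = 8192
refresh_per_row_ns = 50
t_refresh_ms = 64

total_refresh_ns = rows * refresh_per_row_ns   # time spent refreshing per period
period_ns = t_refresh_ms * 1_000_000           # 64 ms expressed in ns
overhead_pct = total_refresh_ns / period_ns * 100
print(f"Refresh overhead: {overhead_pct:.3f}%")
```

With these numbers the memory is unavailable for well under 1% of the time, which is why refresh overhead is usually tolerable.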

3. ROM Types — ROM, PROM, EPROM, EEPROM, Flash

Read-Only Memory is non-volatile — it retains data without power. Different ROM types offer different trade-offs between reprogrammability and complexity.

| Type | Full Name | Programmed By | Erasable? | Use |
|---|---|---|---|---|
| ROM | Read-Only Memory | Manufacturer (mask ROM) | No | Fixed firmware in consumer devices |
| PROM | Programmable ROM | User (one-time, by burning fuses) | No | Small-run production firmware |
| EPROM | Erasable PROM | User (electrically) | Yes — UV light (entire chip) | Development and prototyping |
| EEPROM | Electrically Erasable PROM | User (electrically) | Yes — byte by byte, electrically | BIOS chips, smart cards |
| Flash | Flash EEPROM | User (electrically) | Yes — in blocks, electrically | SSDs, USB drives, BIOS, phones |

4. Memory Organisation — Chips, Banks & Interleaving

Real systems build their memory from multiple chips. Two configurations matter for GATE:

Word Extension (Increasing Address Space)

Connect chips to cover a larger address range — each chip covers a portion of the address space.

Number of chips needed for word extension:
Total memory needed = N bytes
Each chip holds = C bytes
Number of chips = N / C

Address lines needed from CPU = log₂(N / word_size_in_bytes)
Address lines handled by chip select = log₂(N / C)
Address lines going into each chip = log₂(C / word_size)
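As a sketch of these formulas, consider building 64 KB of byte-addressable memory from hypothetical 8 KB × 8-bit chips (both sizes are assumed for illustration):

```python
import math

# Word extension: 64 KB total from 8 KB chips, byte-wide words.
total_bytes = 64 * 1024
chip_bytes = 8 * 1024

num_chips = total_bytes // chip_bytes          # N / C
cpu_addr_lines = int(math.log2(total_bytes))   # lines from the CPU
select_lines = int(math.log2(num_chips))       # lines consumed by chip select
chip_addr_lines = int(math.log2(chip_bytes))   # lines wired into each chip

# The chip-select bits plus the per-chip bits must cover the full CPU address.
assert select_lines + chip_addr_lines == cpu_addr_lines
print(num_chips, cpu_addr_lines, select_lines, chip_addr_lines)  # 8 16 3 13
```

The decoder uses the top 3 address bits to pick one of the 8 chips; the remaining 13 bits address a byte inside that chip.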

Bit Extension (Wider Data Word)

Connect multiple chips in parallel to increase data width — each chip provides some bits of each word.

Number of chips for bit extension:
If CPU has 32-bit data bus and each chip is 8 bits wide:
Chips needed = 32 / 8 = 4 chips (connected in parallel)

Memory Interleaving

Interleaving divides memory into banks that can be accessed independently. While one bank is being read, the others are already preparing their next access — increasing effective memory bandwidth.

| Property | Low-Order Interleaving | High-Order Interleaving (Banking) |
|---|---|---|
| Bank selection bits | Low-order address bits → consecutive addresses in different banks | High-order bits → consecutive addresses in same bank |
| Best for | Sequential access (streaming) — exploits spatial locality | Independent bank access by multiple processors |
| Conflict | Stride-k access causes conflicts when k is a multiple of the bank count | Sequential access serialised within one bank |
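The two bank-selection schemes can be demonstrated with a toy model (4 banks, 64 one-byte words; both figures are assumed purely for illustration):

```python
# Toy model of bank selection: 4 banks over a 64-word address space.
BANKS = 4
TOTAL_WORDS = 64

def low_order_bank(addr):
    # Low-order bits pick the bank: consecutive addresses rotate across banks.
    return addr % BANKS

def high_order_bank(addr):
    # High-order bits pick the bank: each bank holds one contiguous quarter.
    return addr // (TOTAL_WORDS // BANKS)

addrs = range(8)
print([low_order_bank(a) for a in addrs])   # [0, 1, 2, 3, 0, 1, 2, 3]
print([high_order_bank(a) for a in addrs])  # [0, 0, 0, 0, 0, 0, 0, 0]
```

Under low-order interleaving a sequential stream keeps all four banks busy; under high-order banking the same stream is serialised inside bank 0.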

5. Virtual Memory

Virtual memory is an abstraction that gives each process the illusion of having a large, private address space, even if physical RAM is limited. The OS and hardware collaborate to map virtual addresses to physical addresses, swapping pages between RAM and disk as needed.

Why virtual memory?

  • Isolation: each process has its own virtual address space — process A cannot accidentally overwrite process B’s memory
  • Size: a process’s virtual address space can be larger than physical RAM (64-bit processes can address 16 exabytes)
  • Simplicity: programmers do not need to manage physical memory locations
Virtual Address → Physical Address:
Virtual page number (VPN) = Virtual address / Page size
Page offset = Virtual address mod Page size (same in physical address)
Physical Frame Number (PFN) = lookup VPN in page table
Physical Address = PFN × Page size + Page offset

Number of pages in virtual space:
= Virtual address space / Page size
= 2^(virtual address bits) / 2^(page offset bits)
= 2^(VPN bits)
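A minimal sketch of the translation formulas above, using an assumed 4 KB page size and a tiny hand-filled page table (the VPN→PFN mappings are arbitrary illustrative values):

```python
# Virtual-to-physical translation following the VPN / offset split above.
PAGE_SIZE = 4096                    # assumed 4 KB pages
page_table = {0: 7, 1: 3, 2: 12}    # toy VPN -> PFN mapping (illustrative)

def translate(vaddr):
    vpn = vaddr // PAGE_SIZE        # virtual page number
    offset = vaddr % PAGE_SIZE      # offset is unchanged by translation
    pfn = page_table[vpn]           # a real system would page-fault on a miss
    return pfn * PAGE_SIZE + offset

print(hex(translate(0x1234)))  # VPN 1 -> PFN 3, offset 0x234 -> 0x3234
```

Note that only the page number changes; the low 12 bits pass straight through, which is why the offset field is identical in both addresses.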

6. Page Tables & Address Translation

The page table is an array (typically in main memory) where each entry maps one virtual page to one physical frame. Each entry (PTE — Page Table Entry) contains:

  • Valid bit: 1 = page is in physical memory; 0 = page is on disk (triggers a page fault)
  • Physical Frame Number (PFN): the location in RAM
  • Protection bits: read/write/execute permissions
  • Dirty bit: page has been written (must save to disk on eviction)
  • Reference/Access bit: page has been accessed recently (used by replacement algorithms)
Page table size:
= Number of virtual pages × Size of each PTE
= (Virtual address space / Page size) × PTE size

Example: 32-bit virtual address, 4 KB pages, 4-byte PTE:
Number of pages = 2³²/ 2¹² = 2²⁰ = 1,048,576 pages
Page table size = 1,048,576 × 4 bytes = 4 MB (per process — huge!)
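The same calculation, expressed in Python for the 32-bit / 4 KB / 4-byte-PTE layout from the example:

```python
# Page-table size = number of virtual pages × PTE size.
va_bits = 32
page_size = 4096      # 2**12, so the offset uses 12 bits
pte_bytes = 4

num_pages = 2 ** va_bits // page_size            # 2**20 virtual pages
table_bytes = num_pages * pte_bytes
print(num_pages, table_bytes // (1024 * 1024))   # 1048576 pages, 4 MB
```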

This 4 MB per-process overhead led to multi-level page tables: the page table itself is paged, so only the portions needed are kept in memory. A 2-level page table splits the VPN into two fields — a first-level index and a second-level index. Modern processors use 4-level page tables for 64-bit addressing.
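The 2-level split is just bit manipulation on the VPN. The sketch below assumes the 32-bit / 4 KB layout above, with the 20-bit VPN divided into two 10-bit indices (the function name and field widths are illustrative, not a specific architecture's layout):

```python
# Split a 32-bit virtual address into (level-1 index, level-2 index, offset)
# for a hypothetical 2-level page table: 10 + 10 + 12 bits.
def split_vpn(vaddr, offset_bits=12, l2_bits=10):
    offset = vaddr & ((1 << offset_bits) - 1)
    vpn = vaddr >> offset_bits
    l2_index = vpn & ((1 << l2_bits) - 1)   # index into a second-level table
    l1_index = vpn >> l2_bits               # index into the top-level table
    return l1_index, l2_index, offset

print(split_vpn(0xDEADBEEF))
```

The top-level table then has only 2¹⁰ entries, and second-level tables are allocated only for regions of the address space that are actually used.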

7. TLB — Translation Lookaside Buffer

Every memory access in a virtual memory system requires at least two physical memory accesses: one to read the page table, and one for the actual data. This doubles memory access time — unacceptable for performance. The TLB solves this.

The TLB is a small (16–512 entries), fully-associative, on-chip cache that stores recent VPN→PFN mappings. On most accesses, the TLB provides the physical address in 1–2 cycles, avoiding the page table lookup entirely.

Effective Memory Access Time (EMAT) with TLB:
EMAT = TLB hit rate × (TLB time + Memory time)
     + TLB miss rate × (TLB time + Page table time + Memory time)

Simplified (if TLB access is part of every translation):
EMAT = TLB time + (TLB miss rate × Page table walk time) + Memory time

TLB hit: Physical address found in TLB → access data in memory
(Total: TLB lookup + 1 memory access)

TLB miss, page in memory: TLB lookup fails → walk page table (1+ memory accesses) → update TLB → access data
(Total: TLB lookup + k page table accesses + 1 memory access, where k = page table levels)

TLB miss, page fault: Page not in RAM → OS handles page fault → load page from disk → retry
(Disk access: millions of cycles — extremely expensive)
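Plugging assumed figures into the EMAT formula (10 ns TLB, 100 ns memory, 90% hit rate, single-level page table; all illustrative) shows how cheap a high TLB hit rate makes translation:

```python
# EMAT = hit_rate × (TLB + memory) + miss_rate × (TLB + page table + memory),
# assuming a single-level page table (one extra memory access on a miss).
tlb_ns, mem_ns = 10, 100
hit_rate, miss_rate = 0.90, 0.10

hit_time = tlb_ns + mem_ns             # TLB lookup + data access
miss_time = tlb_ns + mem_ns + mem_ns   # + one page-table access in memory
emat = hit_rate * hit_time + miss_rate * miss_time
print(emat)  # 120.0 ns
```

Despite the 100 ns page-table penalty on every miss, the average stays close to the hit-path time because misses are rare.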

| Event | TLB | Page Table | Physical Memory | Disk |
|---|---|---|---|---|
| TLB Hit | Hit | Not needed | Accessed | Not needed |
| TLB Miss, Page In RAM | Miss | Accessed (to get PFN) | Accessed | Not needed |
| TLB Miss, Page Fault | Miss | Accessed (valid bit = 0) | Page loaded into RAM | Page read from disk |

8. Thrashing

Thrashing is a catastrophic performance failure in virtual memory systems. It happens when the total working set of all running processes exceeds available physical memory. Pages that are constantly needed keep getting evicted to make room for other pages that are also constantly needed — causing a continuous, high-rate stream of page faults.

The symptom: CPU utilisation drops sharply (often below 10%) even as I/O activity spikes, because the CPU is idle waiting for pages to be loaded from disk.

Causes of thrashing:

  • Too many processes running simultaneously (high degree of multiprogramming)
  • A process whose working set exceeds available frames

Solutions:

  • Reduce the degree of multiprogramming (swap out a process entirely)
  • Increase physical memory
  • Use working set model — allocate enough frames for each process to hold its working set
  • Use page fault frequency (PFF) algorithm — add frames to processes with high fault rates; reclaim from processes with low fault rates

9. GATE-Level Worked Examples

Example 1 — Memory Chip Organisation (GATE 2020)

Problem: A memory system requires 32 KB of total memory with 8-bit wide access. Available chips are 1K × 4-bit. How many chips are needed?

Solution:
Required: 32 KB = 32 × 1024 = 32,768 locations × 8 bits
Each chip: 1K locations × 4 bits = 1024 × 4

Chips for bit extension (to get 8 bits wide): 8 / 4 = 2 chips per row
Chips for word extension (to get 32K locations): 32K / 1K = 32 rows

Total chips = 2 × 32 = 64 chips
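A quick Python check of this chip count (same figures as the problem statement):

```python
# 32K × 8-bit memory built from 1K × 4-bit chips:
# bit extension widens the word, word extension covers the address range.
need_words, need_bits = 32 * 1024, 8
chip_words, chip_bits = 1024, 4

per_row = need_bits // chip_bits   # 2 chips side by side for 8-bit width
rows = need_words // chip_words    # 32 rows to cover 32K locations
print(per_row * rows)              # 64 chips
```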

Example 2 — Page Table Size (GATE 2021)

Problem: A 32-bit virtual address system uses pages of size 8 KB. Each page table entry is 4 bytes. What is the size of the page table for a single process?

Solution:
Page size = 8 KB = 2¹³ bytes → Page offset = 13 bits
VPN bits = 32 − 13 = 19 bits
Number of virtual pages = 2¹⁹ = 524,288 pages
Page table size = 524,288 × 4 bytes = 2,097,152 bytes = 2 MB

Example 3 — EMAT with TLB (GATE 2022)

Problem: A system has a TLB with 90% hit rate. TLB access time = 10 ns. Main memory access time = 100 ns. On a TLB miss, one additional page table access is needed. What is the EMAT?

Solution:
TLB hit: 10 ns (TLB) + 100 ns (memory) = 110 ns
TLB miss: 10 ns (TLB) + 100 ns (page table in memory) + 100 ns (data in memory) = 210 ns

EMAT = 0.90 × 110 + 0.10 × 210
= 99 + 21 = 120 ns

10. Common Mistakes

  1. Confusing SRAM and DRAM refresh requirements
    SRAM does not need refresh — it is a bistable circuit that holds its state indefinitely while powered. DRAM capacitors leak and must be refreshed every ~64 ms. Saying “SRAM needs periodic refresh” is a common incorrect choice in GATE MCQs.
  2. Forgetting the page table access in EMAT calculation
    On a TLB miss, the CPU must access the page table in main memory before accessing the data. Students often count only the data memory access after a TLB miss, missing the page table lookup. For a single-level page table: TLB miss adds 1 extra memory access.
  3. Mixing virtual address bits and physical address bits
    Virtual and physical address spaces can be different sizes. Page table size is determined by virtual address space (number of virtual pages). Physical frame addresses are determined by physical address space. Always distinguish which address space the question refers to.
  4. Computing page table size using physical frames, not virtual pages
    Page table has one entry per virtual page, not per physical frame. Even if physical RAM is only 512 MB (fewer frames), the page table still needs entries for all 2^VPN virtual pages.
  5. Assuming thrashing is caused by a single process
    Thrashing is a system-level phenomenon caused by the aggregate working set of all running processes exceeding physical memory. A single process with a small working set will not thrash — but it becomes a victim when other processes push total demand over the limit.

11. Frequently Asked Questions

What is the memory hierarchy in computer organisation?

The memory hierarchy is a layered arrangement of storage, ordered from fastest/smallest/most expensive (registers) to slowest/largest/cheapest (tape or optical). Each level acts as a cache for the level below it. The hierarchy works because most programs exhibit locality — they reuse a small, changing subset of their data most of the time.

What is the difference between SRAM and DRAM?

SRAM uses a 6-transistor bistable flip-flop to store each bit. It is fast (1–5 ns), does not require refreshing, but is large and expensive. Used for CPU caches. DRAM stores each bit in a tiny capacitor that leaks charge — it requires periodic refresh every ~64 ms to retain data, but offers very high density and low cost per gigabyte. Used for main memory.

What is a TLB in computer organisation?

The Translation Lookaside Buffer (TLB) is a small, fully-associative, on-chip cache that stores recent virtual-to-physical address translations. Without the TLB, every memory access would require at least two physical memory reads — one to look up the page table, one for the actual data. The TLB reduces most translations to a single on-chip lookup, dramatically reducing average memory access time.

What is thrashing in memory management?

Thrashing is a condition where the system’s page fault rate is so high that it spends more time handling page faults (loading pages from disk) than executing program instructions. CPU utilisation collapses, I/O spikes, and the system appears frozen. It is caused when total working set size exceeds physical memory. The solution is to reduce the number of concurrent processes or add physical memory.
