
I/O Systems in Computer Organisation

Programmed I/O, Interrupts, DMA, Buses & Device Controllers — Complete GATE CS Notes

Last updated: April 2026  |  GATE CS syllabus aligned

Key Takeaways

  • I/O systems transfer data between the CPU/memory and external devices — the key design challenge is doing this without wasting CPU cycles
  • Programmed I/O (polling): CPU continuously checks device status — simple but wastes CPU cycles in busy-wait
  • Interrupt-driven I/O: device signals CPU when ready — CPU runs other tasks in between, but interrupt overhead exists for each transfer
  • DMA (Direct Memory Access): dedicated controller handles bulk transfers; CPU only involved at start and end — most efficient for large transfers
  • Cycle stealing: DMA takes one bus cycle at a time; burst mode: DMA holds bus for entire transfer
  • I/O bus structure (system bus, I/O bus, device bus) determines bandwidth and latency between components
  • Device controllers bridge the gap between the digital signals of the bus and the physical signals of the device

1. The I/O Performance Challenge

The CPU operates in nanoseconds. Keyboards respond in tens of milliseconds. Hard disks take 5–10 milliseconds per access. Even the fastest SSDs take 50–100 microseconds. If the CPU had to sit and wait every time it initiated an I/O operation, a program reading a 1 GB file from a hard disk would spend essentially all of its time waiting — a 3 GHz CPU executing zero useful instructions for milliseconds at a stretch.

Three techniques address this problem, each progressively more efficient:

  1. Programmed I/O (Polling): CPU waits, checking repeatedly
  2. Interrupt-Driven I/O: CPU is notified when device is ready
  3. Direct Memory Access (DMA): dedicated hardware handles the transfer

| Method | CPU Involvement | Best For | Overhead |
| --- | --- | --- | --- |
| Programmed I/O | 100% — busy-wait loop | Simple embedded systems; fast devices | Maximum — CPU completely tied up |
| Interrupt-driven | Interrupt handler only | Low-bandwidth, latency-sensitive I/O (keyboard, mouse) | Moderate — one interrupt per data unit |
| DMA | Setup + completion interrupt only | High-bandwidth transfers (disk, network, video) | Minimal — fixed cost per transfer, independent of data size |

2. Programmed I/O (Polling)

In programmed I/O, the CPU initiates an I/O operation and then enters a tight loop, reading the device’s status register on every iteration, waiting for the “ready” flag to be set.

Polling loop (pseudocode):
CPU writes command to device control register
WHILE (device_status_register != READY) DO
  /* busy-wait — CPU does nothing useful */
END WHILE
CPU reads/writes device data register

When polling makes sense: when the device is very fast and the wait is short (a few CPU cycles), or in simple embedded systems without an OS where an idle CPU is acceptable.

When polling fails: slow devices (disk, network) waste thousands or millions of CPU cycles per transfer. In a multitasking OS, this is unacceptable — the CPU cannot run other processes.
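The busy-wait loop can be illustrated with a small simulation (a sketch only — the `Device` class, the `READY` constant, and the poll counter are invented for illustration; real polling reads a hardware status register):

```python
READY, BUSY = 1, 0

class Device:
    """Simulated device that becomes READY after a fixed number of status reads."""
    def __init__(self, polls_until_ready):
        self.polls_until_ready = polls_until_ready
        self.data = 0xAB                     # value the device will deliver

    def status(self):
        # Each status read costs the CPU a poll; device signals READY when done.
        self.polls_until_ready -= 1
        return READY if self.polls_until_ready <= 0 else BUSY

def polled_read(device):
    """Busy-wait until the device is READY, then read its data register."""
    wasted_polls = 0
    while device.status() != READY:
        wasted_polls += 1                    # CPU does nothing useful here
    return device.data, wasted_polls

data, wasted = polled_read(Device(polls_until_ready=1000))
print(data, wasted)   # every one of those polls was a wasted status read
```

The wasted-poll counter makes the cost visible: a slower device means proportionally more useless iterations of the loop.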

CPU time wasted by polling (worst case):
Polling overhead = Polling frequency × Cycles per poll

Example: Poll a network card 1000 times/second, 400 cycles per poll, 500 MHz CPU:
Cycles per second wasted = 1000 × 400 = 400,000
CPU fraction wasted = 400,000 / 500,000,000 = 0.08% (acceptable for this frequency)

But for a 100 MB/s disk polled once per byte (10⁸ polls/second): 10⁸ × 400 = 40 billion cycles/second — 8× the total capacity of a 5 GHz CPU.
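Both overhead figures can be checked with a few lines of arithmetic (a minimal sketch; the function name is ours):

```python
def polling_overhead(polls_per_sec, cycles_per_poll, cpu_hz):
    """Fraction of CPU capacity consumed by busy-wait polling."""
    return polls_per_sec * cycles_per_poll / cpu_hz

# Network card: 1000 polls/s x 400 cycles/poll on a 500 MHz CPU
nic = polling_overhead(1_000, 400, 500_000_000)
print(f"{nic:.2%}")   # 0.08%

# 100 MB/s disk polled once per byte on a 5 GHz CPU
disk = polling_overhead(100_000_000, 400, 5_000_000_000)
print(disk)           # 8.0 -- the CPU would need 8x its capacity
```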

3. Interrupt-Driven I/O

With interrupt-driven I/O, the CPU starts an I/O operation, then resumes executing other instructions. When the device completes its operation, it raises an interrupt signal on the bus. The CPU finishes its current instruction, saves its state (registers, PC), and jumps to the Interrupt Service Routine (ISR) — the handler for that device. After the ISR completes, the CPU restores its state and resumes where it left off.

Interrupt Handling Steps

  1. Device raises interrupt line on the bus
  2. CPU checks for interrupts at end of each instruction cycle
  3. CPU acknowledges the interrupt (sends interrupt acknowledge signal)
  4. Device puts its interrupt vector number on the data bus
  5. CPU saves current state (pushes PC and registers onto stack)
  6. CPU uses interrupt vector to fetch ISR address from Interrupt Vector Table (IVT)
  7. CPU jumps to ISR and executes it
  8. ISR reads/writes data to/from device
  9. CPU restores saved state (pops from stack) and resumes
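Steps 4–7 — vectoring through the IVT to the right handler — can be sketched as a dispatch table (pure simulation; the vector numbers and handler names are invented):

```python
# Simulated Interrupt Vector Table: vector number -> ISR (handler function).
def keyboard_isr():
    return "read scancode from keyboard data register"

def disk_isr():
    return "copy sector from disk controller buffer"

IVT = {1: keyboard_isr, 14: disk_isr}

def handle_interrupt(vector, saved_pc):
    """CPU side: save state, vector through the IVT, run the ISR, restore."""
    saved_state = {"pc": saved_pc}      # step 5: push PC/registers
    isr = IVT[vector]                   # step 6: fetch ISR address via vector
    result = isr()                      # steps 7-8: execute the handler
    return result, saved_state["pc"]    # step 9: restore state and resume

result, resume_pc = handle_interrupt(vector=14, saved_pc=0x4000)
print(result, hex(resume_pc))
```

The key idea the simulation preserves: the device supplies only a small vector number; the CPU turns it into a handler address by table lookup, so adding a device means adding one table entry.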

Types of Interrupts

| Type | Source | Example | Maskable? |
| --- | --- | --- | --- |
| Hardware interrupt | External device via interrupt line | Keyboard press, disk I/O complete | Usually yes (via interrupt flag) |
| Software interrupt (trap) | Instruction in the program (INT n) | System call, debugging breakpoint | No |
| Exception (fault) | CPU detects an error | Division by zero, page fault, invalid opcode | No |
| Non-Maskable Interrupt (NMI) | Critical hardware | Memory parity error, power failure | No — always handled |

Interrupt Priority

Multiple devices may request interrupts simultaneously. The system assigns priorities — higher-priority interrupts preempt lower-priority ISRs. This is managed either by a dedicated Programmable Interrupt Controller (PIC, e.g., Intel 8259A) or daisy-chaining (devices connected in series, closest to CPU has highest priority).

4. Direct Memory Access (DMA)

Interrupt-driven I/O is efficient for slow, infrequent I/O. But for transferring megabytes of data from a disk or network card, it still generates one interrupt per byte or word — hundreds of thousands of interrupts per megabyte. DMA eliminates this by offloading the entire data transfer to a dedicated DMA controller.

DMA Operation Sequence

  1. CPU programs the DMA controller: source address, destination address, byte count, and direction (device→memory or memory→device)
  2. CPU resumes execution of other instructions
  3. DMA controller takes over the bus and transfers data directly between the I/O device and main memory, one block at a time
  4. DMA controller interrupts the CPU when the entire transfer is complete
  5. CPU handles the single interrupt and processes the transferred data
DMA transfer time:
Total transfer time ≈ Setup time + (Data size / Bus bandwidth) + Interrupt overhead

CPU is free during the transfer (except for cycle stealing — see below)

Without DMA (interrupt-driven, 1 interrupt per word):
CPU overhead = Number of words × (Interrupt save/restore cycles)

With DMA:
CPU overhead = 1 interrupt (setup) + 1 interrupt (completion)
CPU time saved ≈ (N − 2) × interrupt handling cycles, where N = number of words
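The overhead comparison above can be run directly (a minimal sketch; the function names and the 4 KB example are ours):

```python
def interrupt_driven_overhead(n_words, cycles_per_interrupt):
    """CPU cycles spent when every word costs one interrupt."""
    return n_words * cycles_per_interrupt

def dma_overhead(cycles_per_interrupt):
    """CPU cycles spent with DMA: one interrupt at setup, one on completion."""
    return 2 * cycles_per_interrupt

# Transfer 4 KB as 1024 four-byte words, 400 cycles per interrupt
n = 1024
saved = interrupt_driven_overhead(n, 400) - dma_overhead(400)
assert saved == (n - 2) * 400    # matches the (N - 2) formula above
print(saved)                     # CPU cycles reclaimed by using DMA
```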

5. DMA Modes — Cycle Stealing vs Burst Mode

| Mode | How It Works | CPU Impact | Transfer Rate |
| --- | --- | --- | --- |
| Cycle stealing | DMA requests the bus for one cycle, transfers one word, releases the bus; repeats | CPU loses individual cycles — execution slows slightly but continues | Slower — bus acquired/released per word |
| Burst mode | DMA acquires the bus and holds it for the entire transfer | CPU is completely blocked during the burst — cannot access memory | Fastest — maximum bus utilisation |
| Transparent (background) DMA | DMA uses the bus only when the CPU does not need it | Zero — CPU never stalled | Slowest — opportunistic access only |

Cycle stealing impact on CPU:
If DMA steals 1 cycle every T cycles, CPU slowdown = 1/T
Example: DMA steals 1 cycle every 5 CPU cycles → CPU runs at (5-1)/5 = 80% of normal speed
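The slowdown arithmetic as a runnable check (sketch; the function name is ours):

```python
def cpu_speed_fraction(steal_period_cycles):
    """CPU speed when DMA steals 1 of every `steal_period_cycles` bus cycles."""
    T = steal_period_cycles
    return (T - 1) / T

print(cpu_speed_fraction(5))   # 0.8 -- the 80% figure from the example
```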

6. I/O Buses and Bus Architecture

A bus is a shared communication pathway — a collection of wires that transfers data, addresses, and control signals between components. Bus design directly impacts system performance.

Bus Signal Lines

| Line Type | Purpose | Width (typical) |
| --- | --- | --- |
| Data bus | Carries the actual data being transferred | 8, 16, 32, or 64 bits |
| Address bus | Carries the memory or device address | 32 or 64 bits (determines addressable space) |
| Control bus | Read/write signals, bus request/grant, interrupt lines, clock | Various individual lines |

System Bus Hierarchy

Modern systems use a hierarchy of buses rather than one shared bus, to avoid bottlenecks:

| Bus | Connects | Speed | Example |
| --- | --- | --- | --- |
| Front-side bus (FSB) / system bus | CPU ↔ memory controller / chipset | Very fast | Intel QPI, AMD HyperTransport (FSB successors) |
| Memory bus | Memory controller ↔ DRAM | Fast | DDR4, DDR5 |
| I/O bus | Chipset ↔ high-speed peripherals | Moderate | PCIe |
| Peripheral bus | I/O controller ↔ slow devices | Slow | USB, SATA, I²C |

Bus bandwidth:
Bandwidth = Bus width (bytes) × Bus clock frequency
Example: 64-bit bus at 100 MHz = 8 bytes × 100 MHz = 800 MB/s

Bus transfer time for N bytes:
Transfer time = N / Bandwidth + Bus overhead (arbitration + addressing)
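The two formulas above in runnable form (a sketch; function names are ours, and arbitration overhead is passed in as a single parameter):

```python
def bus_bandwidth(width_bits, clock_hz):
    """Peak bandwidth in bytes/second: bus width (in bytes) x clock frequency."""
    return (width_bits // 8) * clock_hz

def transfer_time(n_bytes, width_bits, clock_hz, overhead_s=0.0):
    """Time to move n_bytes, plus fixed arbitration/addressing overhead."""
    return n_bytes / bus_bandwidth(width_bits, clock_hz) + overhead_s

bw = bus_bandwidth(64, 100_000_000)
print(bw)                                     # 800,000,000 B/s = 800 MB/s
print(transfer_time(4096, 64, 100_000_000))   # seconds for a 4 KB transfer
```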

7. Device Controllers

A device controller (I/O controller) is the hardware interface between the system bus and a physical I/O device. The CPU communicates with the controller, not directly with the device. The controller translates bus signals into device-specific commands and vice versa.

Controller registers:

  • Data register: holds the data being transferred
  • Status register: device state (busy, ready, error)
  • Control register: CPU writes commands here (start, stop, mode)

I/O port addressing: The CPU accesses controller registers either through:

  • Port-mapped I/O (Isolated I/O): separate address space for I/O; special IN/OUT instructions (x86)
  • Memory-mapped I/O: controller registers appear as regular memory addresses; normal LOAD/STORE instructions; easier to program (ARM, MIPS)

| Feature | Port-Mapped I/O | Memory-Mapped I/O |
| --- | --- | --- |
| Address space | Separate I/O address space | Shared with memory |
| Instructions | Special IN/OUT | Regular LOAD/STORE |
| Protection | Easy — user mode cannot execute IN/OUT | Harder — requires page-level protection |
| Cache interactions | Not cached (separate space) | I/O pages must be marked non-cacheable |
| Used by | x86 (legacy + modern) | ARM, MIPS, most embedded systems |
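Memory-mapped I/O can be illustrated with a toy address decoder (pure simulation — the address ranges and register map are invented; real decoding happens in hardware):

```python
# One shared address space: a decoder routes each access to RAM or a device.
RAM = bytearray(1024)                          # addresses 0x000-0x3FF
DEVICE_BASE = 0x400                            # device registers mapped here
device_regs = {0x400: 0, 0x401: 0, 0x402: 0}   # data, status, control

def load(addr):
    """A regular LOAD: the decoder picks RAM or a device register."""
    if addr >= DEVICE_BASE:
        return device_regs[addr]               # goes to the device controller
    return RAM[addr]                           # goes to ordinary memory

def store(addr, value):
    """A regular STORE works the same way -- no special IN/OUT instruction."""
    if addr >= DEVICE_BASE:
        device_regs[addr] = value
    else:
        RAM[addr] = value

store(0x10, 0x55)    # ordinary memory write
store(0x402, 1)      # the same instruction writes the device control register
print(load(0x10), load(0x402))
```

This is exactly why memory-mapped I/O is easier to program: the same load/store path reaches both memory and devices, with only the address deciding the destination.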

8. GATE-Level Worked Examples

Example 1 — CPU Overhead: Polling vs Interrupt vs DMA (GATE 2021 style)

Problem: A disk transfers data at 4 MB/s. The CPU runs at 500 MHz with 400 cycles per interrupt. Compare CPU overhead for interrupt-driven I/O (one interrupt per 32-bit word) vs DMA (one interrupt per 4 KB block).

Solution:
Data rate = 4 MB/s = 4 × 2²⁰ bytes/s

Interrupt-driven (1 interrupt per 4-byte word):
Interrupts per second = (4 × 2²⁰) / 4 = 2²⁰ = 1,048,576 interrupts/s
CPU cycles per second = 1,048,576 × 400 = 419,430,400 cycles/s
CPU fraction = 419,430,400 / 500,000,000 ≈ 83.9% — CPU nearly saturated!

DMA (1 interrupt per 4 KB block):
Blocks per second = (4 × 2²⁰) / (4 × 2¹⁰) = 2¹⁰ = 1,024 blocks/s
CPU cycles per second = 1,024 × 400 = 409,600 cycles/s
CPU fraction = 409,600 / 500,000,000 ≈ 0.08% — negligible!
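The arithmetic in Example 1 can be verified directly (the variable names are ours):

```python
CPU_HZ = 500_000_000
CYCLES_PER_INTERRUPT = 400
DATA_RATE = 4 * 2**20                 # 4 MB/s in bytes/s

# Interrupt-driven: one interrupt per 4-byte word
irq_per_sec = DATA_RATE // 4
irq_fraction = irq_per_sec * CYCLES_PER_INTERRUPT / CPU_HZ
print(f"{irq_fraction:.1%}")          # 83.9% -- CPU nearly saturated

# DMA: one interrupt per 4 KB block
blocks_per_sec = DATA_RATE // (4 * 2**10)
dma_fraction = blocks_per_sec * CYCLES_PER_INTERRUPT / CPU_HZ
print(f"{dma_fraction:.2%}")          # 0.08% -- negligible
```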

Example 2 — DMA Transfer Time (GATE 2022 style)

Problem: A DMA controller transfers 8 KB of data from a disk to memory. The disk transfer rate is 1 MB/s. Bus cycle time is 100 ns. In cycle stealing mode, DMA steals 1 bus cycle per 4 bytes. What fraction of CPU time is stolen?

Solution:
Transfer size = 8 KB = 8192 bytes
Bus cycles stolen = 8192 / 4 = 2048 cycles
Transfer time = 8 KB / (1 MB/s) = 8192 / 1,048,576 s ≈ 7.81 ms
Time stolen = 2048 × 100 ns = 204,800 ns ≈ 0.205 ms
CPU fraction stolen = 0.205 / 7.81 ≈ 2.6%
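Example 2 as a runnable check (variable names are ours):

```python
BUS_CYCLE_S = 100e-9                     # 100 ns bus cycle
size = 8 * 1024                          # 8 KB in bytes
cycles_stolen = size // 4                # one bus cycle stolen per 4 bytes
transfer_s = size / (1024 * 1024)        # at 1 MB/s -> 7.8125 ms
stolen_s = cycles_stolen * BUS_CYCLE_S   # time the CPU loses to DMA
fraction = stolen_s / transfer_s
print(cycles_stolen, f"{fraction:.1%}")  # 2048 cycles, about 2.6%
```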

Example 3 — Interrupt Vector Table (GATE 2019 style)

Problem: An interrupt vector table starts at address 0x0000. Each interrupt vector entry is 4 bytes (a 32-bit address). Interrupt type 5 occurs. What address is fetched from the IVT?

Solution:
IVT base = 0x0000
Each entry = 4 bytes
Address of type 5 entry = 0x0000 + 5 × 4 = 0x0000 + 20 = 0x0014
The 4 bytes at address 0x0014 contain the ISR address for interrupt type 5.
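The IVT lookup in Example 3 reduces to one line of address arithmetic (the function name is ours):

```python
def ivt_entry_address(base, vector, entry_size=4):
    """Address of the IVT entry holding the ISR address for `vector`."""
    return base + vector * entry_size

addr = ivt_entry_address(0x0000, 5)
print(hex(addr))   # 0x14 -- the ISR address for interrupt type 5 lives here
```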

9. Common Mistakes

  1. Saying DMA involves zero CPU participation
    DMA significantly reduces CPU involvement, but the CPU still programs the DMA controller at the start and handles the completion interrupt at the end. In cycle stealing mode, the CPU also loses individual bus cycles. DMA is not “CPU-free” — it is “CPU-light.”
  2. Confusing interrupt-driven I/O with DMA
    Both use interrupts, but for different things. Interrupt-driven I/O generates one interrupt per data word/byte — the CPU handles each transfer. DMA generates one interrupt per entire block transfer — the DMA controller handles each individual word transfer; the CPU only gets interrupted at the end.
  3. Forgetting bus arbitration overhead in bandwidth calculations
    Bus bandwidth = width × frequency, but this is the theoretical peak. Actual throughput is lower due to bus arbitration (deciding who gets the bus), addressing cycles, and turnaround time between reads and writes. GATE questions sometimes ask about effective vs theoretical bandwidth.
  4. Mixing up port-mapped and memory-mapped I/O addressing
    In port-mapped I/O, device registers have their own address space separate from memory — you cannot access them with LOAD/STORE. In memory-mapped I/O, device registers appear in the regular memory address space — you use LOAD/STORE, but those addresses must be marked non-cacheable.
  5. Treating all interrupt types as maskable
    Non-maskable interrupts (NMI) cannot be disabled by the software interrupt flag. They are used for critical hardware failures. Treating NMI as maskable is a factual error in GATE MCQ answers.

10. Frequently Asked Questions

What is the difference between polling and interrupt-driven I/O?

In polling, the CPU wastes cycles in a busy-wait loop checking whether an I/O device is ready. In interrupt-driven I/O, the CPU starts the I/O operation and resumes other work; the device signals the CPU via an interrupt when it is ready. Polling is simpler but wasteful; interrupt-driven I/O is more efficient for slow or infrequent I/O operations.

What is DMA in computer organisation?

DMA (Direct Memory Access) is a technique where a dedicated DMA controller transfers data between I/O devices and main memory without requiring the CPU to execute each individual transfer. The CPU programs the DMA controller with the transfer parameters (addresses, size, direction), then continues executing. When the entire transfer completes, the DMA controller sends a single interrupt to the CPU. This is far more efficient than interrupt-driven I/O for large data transfers like disk reads.

What is cycle stealing in DMA?

Cycle stealing is a DMA operating mode where the DMA controller takes control of the memory bus for exactly one bus cycle to transfer one word, then immediately releases the bus back to the CPU. The CPU is briefly suspended for that single cycle but otherwise continues normally. This contrasts with burst mode, where the DMA holds the bus for the entire transfer, completely blocking the CPU.

What is an interrupt vector?

An interrupt vector is an entry in the Interrupt Vector Table (IVT) — a table stored in a fixed location in memory. Each entry contains the starting address of the Interrupt Service Routine (ISR) for a particular interrupt type. When a device interrupts the CPU, the device puts its interrupt number on the bus; the CPU uses that number to index into the IVT, fetches the ISR address, and jumps to the handler.
