
I/O Systems in Computer Organisation

Programmed I/O, Interrupts, DMA, Buses & Device Controllers — Complete GATE CS Notes

Last updated: April 2026  |  GATE CS syllabus aligned

Key Takeaways

  • I/O systems transfer data between the CPU/memory and external devices — the key design challenge is doing this without wasting CPU cycles
  • Programmed I/O (polling): CPU continuously checks device status — simple but wastes CPU cycles in busy-wait
  • Interrupt-driven I/O: device signals CPU when ready — CPU runs other tasks in between, but interrupt overhead exists for each transfer
  • DMA (Direct Memory Access): dedicated controller handles bulk transfers; CPU only involved at start and end — most efficient for large transfers
  • Cycle stealing: DMA takes one bus cycle at a time; burst mode: DMA holds bus for entire transfer
  • I/O bus structure (system bus, I/O bus, device bus) determines bandwidth and latency between components
  • Device controllers bridge the gap between the digital signals of the bus and the physical signals of the device

1. The I/O Performance Challenge

The CPU operates in nanoseconds. Keyboards respond in tens of milliseconds. Hard disks take 5–10 milliseconds per access. Even the fastest SSDs take 50–100 microseconds. If the CPU had to sit and wait every time it initiated an I/O operation, a program reading a 1 GB file from a hard disk would spend essentially all of its time waiting — a 3 GHz CPU executing zero useful instructions for milliseconds at a stretch.

Three techniques address this problem, each progressively more efficient:

  1. Programmed I/O (Polling): CPU waits, checking repeatedly
  2. Interrupt-Driven I/O: CPU is notified when device is ready
  3. Direct Memory Access (DMA): dedicated hardware handles the transfer

| Method | CPU Involvement | Best For | Overhead |
| --- | --- | --- | --- |
| Programmed I/O | 100% — busy-wait loop | Simple embedded systems; fast devices | Maximum — CPU completely tied up |
| Interrupt-driven | Interrupt handler only | Low-bandwidth, latency-sensitive I/O (keyboard, mouse) | Moderate — one interrupt per data unit |
| DMA | Setup + completion interrupt only | High-bandwidth transfers (disk, network, video) | Minimal — fixed cost per transfer, independent of data size |

2. Programmed I/O (Polling)

In programmed I/O, the CPU initiates an I/O operation and then enters a tight loop, reading the device’s status register on every iteration, waiting for the “ready” flag to be set.

Polling loop (pseudocode):
CPU writes command to device control register
WHILE (device_status_register != READY) DO
  /* busy-wait — CPU does nothing useful */
END WHILE
CPU reads/writes device data register

When polling makes sense: when the device is very fast and the wait is short (a few CPU cycles), or in simple embedded systems without an OS where an idle CPU is acceptable.

When polling fails: slow devices (disk, network) waste thousands or millions of CPU cycles per transfer. In a multitasking OS, this is unacceptable — the CPU cannot run other processes.
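The busy-wait loop can be illustrated with a small simulation (a sketch only — the `Device` class, the `READY` constant, and the poll counter are invented for illustration; real polling reads a hardware status register):

```python
READY, BUSY = 1, 0

class Device:
    """Simulated device that becomes READY after a fixed number of status reads."""
    def __init__(self, polls_until_ready):
        self.polls_until_ready = polls_until_ready
        self.data = 0xAB                     # value the device will deliver

    def status(self):
        # Each status read costs the CPU a poll; device signals READY when done.
        self.polls_until_ready -= 1
        return READY if self.polls_until_ready <= 0 else BUSY

def polled_read(device):
    """Busy-wait until the device is READY, then read its data register."""
    wasted_polls = 0
    while device.status() != READY:
        wasted_polls += 1                    # CPU does nothing useful here
    return device.data, wasted_polls

data, wasted = polled_read(Device(polls_until_ready=1000))
print(data, wasted)   # every one of those polls was a wasted status read
```

The wasted-poll counter makes the cost visible: a slower device means proportionally more useless iterations of the loop.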

CPU time wasted by polling (worst case):
Polling overhead = Polling frequency × Cycles per poll

Example: Poll a network card 1000 times/second, 400 cycles per poll, 500 MHz CPU:
Cycles per second wasted = 1000 × 400 = 400,000
CPU fraction wasted = 400,000 / 500,000,000 = 0.08% (acceptable for this frequency)

But for a 100 MB/s disk polled once per byte (10⁸ polls/second): 10⁸ × 400 = 40 billion cycles/second — 8× the total capacity of a 5 GHz CPU.
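Both overhead figures can be checked with a few lines of arithmetic (a minimal sketch; the function name is ours):

```python
def polling_overhead(polls_per_sec, cycles_per_poll, cpu_hz):
    """Fraction of CPU capacity consumed by busy-wait polling."""
    return polls_per_sec * cycles_per_poll / cpu_hz

# Network card: 1000 polls/s x 400 cycles/poll on a 500 MHz CPU
nic = polling_overhead(1_000, 400, 500_000_000)
print(f"{nic:.2%}")   # 0.08%

# 100 MB/s disk polled once per byte on a 5 GHz CPU
disk = polling_overhead(100_000_000, 400, 5_000_000_000)
print(disk)           # 8.0 -- the CPU would need 8x its capacity
```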

3. Interrupt-Driven I/O

With interrupt-driven I/O, the CPU starts an I/O operation, then resumes executing other instructions. When the device completes its operation, it raises an interrupt signal on the bus. The CPU finishes its current instruction, saves its state (registers, PC), and jumps to the Interrupt Service Routine (ISR) — the handler for that device. After the ISR completes, the CPU restores its state and resumes where it left off.

Interrupt Handling Steps

  1. Device raises interrupt line on the bus
  2. CPU checks for interrupts at end of each instruction cycle
  3. CPU acknowledges the interrupt (sends interrupt acknowledge signal)
  4. Device puts its interrupt vector number on the data bus
  5. CPU saves current state (pushes PC and registers onto stack)
  6. CPU uses interrupt vector to fetch ISR address from Interrupt Vector Table (IVT)
  7. CPU jumps to ISR and executes it
  8. ISR reads/writes data to/from device
  9. CPU restores saved state (pops from stack) and resumes
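Steps 4–7 — vectoring through the IVT to the right handler — can be sketched as a dispatch table (pure simulation; the vector numbers and handler names are invented):

```python
# Simulated Interrupt Vector Table: vector number -> ISR (handler function).
def keyboard_isr():
    return "read scancode from keyboard data register"

def disk_isr():
    return "copy sector from disk controller buffer"

IVT = {1: keyboard_isr, 14: disk_isr}

def handle_interrupt(vector, saved_pc):
    """CPU side: save state, vector through the IVT, run the ISR, restore."""
    saved_state = {"pc": saved_pc}      # step 5: push PC/registers
    isr = IVT[vector]                   # step 6: fetch ISR address via vector
    result = isr()                      # steps 7-8: execute the handler
    return result, saved_state["pc"]    # step 9: restore state and resume

result, resume_pc = handle_interrupt(vector=14, saved_pc=0x4000)
print(result, hex(resume_pc))
```

The key idea the simulation preserves: the device supplies only a small vector number; the CPU turns it into a handler address by table lookup, so adding a device means adding one table entry.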

Types of Interrupts

| Type | Source | Example | Maskable? |
| --- | --- | --- | --- |
| Hardware interrupt | External device via interrupt line | Keyboard press, disk I/O complete | Usually yes (via interrupt flag) |
| Software interrupt (trap) | Instruction in the program (INT n) | System call, debugging breakpoint | No |
| Exception (fault) | CPU detects an error | Division by zero, page fault, invalid opcode | No |
| Non-Maskable Interrupt (NMI) | Critical hardware | Memory parity error, power failure | No — always handled |

Interrupt Priority

Multiple devices may request interrupts simultaneously. The system assigns priorities — higher-priority interrupts preempt lower-priority ISRs. This is managed either by a dedicated Programmable Interrupt Controller (PIC, e.g., Intel 8259A) or daisy-chaining (devices connected in series, closest to CPU has highest priority).

4. Direct Memory Access (DMA)

Interrupt-driven I/O is efficient for slow, infrequent I/O. But for transferring megabytes of data from a disk or network card, it still generates one interrupt per byte or word — hundreds of thousands of interrupts per megabyte. DMA eliminates this by offloading the entire data transfer to a dedicated DMA controller.

DMA Operation Sequence

  1. CPU programs the DMA controller: source address, destination address, byte count, and direction (device→memory or memory→device)
  2. CPU resumes execution of other instructions
  3. DMA controller takes over the bus and transfers data directly between the I/O device and main memory, one block at a time
  4. DMA controller interrupts the CPU when the entire transfer is complete
  5. CPU handles the single interrupt and processes the transferred data
DMA transfer time:
Total transfer time ≈ Setup time + (Data size / Bus bandwidth) + Interrupt overhead

CPU is free during the transfer (except for cycle stealing — see below)

Without DMA (interrupt-driven, 1 interrupt per word):
CPU overhead = Number of words × (Interrupt save/restore cycles)

With DMA:
CPU overhead = 1 interrupt (setup) + 1 interrupt (completion)
CPU time saved ≈ (N − 2) × interrupt handling cycles, where N = number of words
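The overhead comparison above can be run directly (a minimal sketch; the function names and the 4 KB example are ours):

```python
def interrupt_driven_overhead(n_words, cycles_per_interrupt):
    """CPU cycles spent when every word costs one interrupt."""
    return n_words * cycles_per_interrupt

def dma_overhead(cycles_per_interrupt):
    """CPU cycles spent with DMA: one interrupt at setup, one on completion."""
    return 2 * cycles_per_interrupt

# Transfer 4 KB as 1024 four-byte words, 400 cycles per interrupt
n = 1024
saved = interrupt_driven_overhead(n, 400) - dma_overhead(400)
assert saved == (n - 2) * 400    # matches the (N - 2) formula above
print(saved)                     # CPU cycles reclaimed by using DMA
```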

5. DMA Modes — Cycle Stealing vs Burst Mode

| Mode | How It Works | CPU Impact | Transfer Rate |
| --- | --- | --- | --- |
| Cycle stealing | DMA requests the bus for one cycle, transfers one word, releases the bus; repeats | CPU loses individual cycles — execution slows slightly but continues | Slower — bus acquired/released per word |
| Burst mode | DMA acquires the bus and holds it for the entire transfer | CPU is completely blocked during the burst — cannot access memory | Fastest — maximum bus utilisation |
| Transparent (background) DMA | DMA uses the bus only when the CPU does not need it | Zero — CPU never stalled | Slowest — opportunistic access only |

Cycle stealing impact on CPU:
If DMA steals 1 cycle every T cycles, CPU slowdown = 1/T
Example: DMA steals 1 cycle every 5 CPU cycles → CPU runs at (5-1)/5 = 80% of normal speed
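The slowdown arithmetic as a runnable check (sketch; the function name is ours):

```python
def cpu_speed_fraction(steal_period_cycles):
    """CPU speed when DMA steals 1 of every `steal_period_cycles` bus cycles."""
    T = steal_period_cycles
    return (T - 1) / T

print(cpu_speed_fraction(5))   # 0.8 -- the 80% figure from the example
```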

6. I/O Buses and Bus Architecture

A bus is a shared communication pathway — a collection of wires that transfers data, addresses, and control signals between components. Bus design directly impacts system performance.

Bus Signal Lines

| Line Type | Purpose | Width (typical) |
| --- | --- | --- |
| Data bus | Carries the actual data being transferred | 8, 16, 32, or 64 bits |
| Address bus | Carries the memory or device address | 32 or 64 bits (determines addressable space) |
| Control bus | Read/write signals, bus request/grant, interrupt lines, clock | Various individual lines |

System Bus Hierarchy

Modern systems use a hierarchy of buses rather than one shared bus, to avoid bottlenecks:

| Bus | Connects | Speed | Example |
| --- | --- | --- | --- |
| Front-side bus (FSB) / system bus | CPU ↔ memory controller / chipset | Very fast | Intel QPI, AMD HyperTransport (FSB successors) |
| Memory bus | Memory controller ↔ DRAM | Fast | DDR4, DDR5 |
| I/O bus | Chipset ↔ high-speed peripherals | Moderate | PCIe |
| Peripheral bus | I/O controller ↔ slow devices | Slow | USB, SATA, I²C |

Bus bandwidth:
Bandwidth = Bus width (bytes) × Bus clock frequency
Example: 64-bit bus at 100 MHz = 8 bytes × 100 MHz = 800 MB/s

Bus transfer time for N bytes:
Transfer time = N / Bandwidth + Bus overhead (arbitration + addressing)
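The two formulas above in runnable form (a sketch; function names are ours, and arbitration overhead is passed in as a single parameter):

```python
def bus_bandwidth(width_bits, clock_hz):
    """Peak bandwidth in bytes/second: bus width (in bytes) x clock frequency."""
    return (width_bits // 8) * clock_hz

def transfer_time(n_bytes, width_bits, clock_hz, overhead_s=0.0):
    """Time to move n_bytes, plus fixed arbitration/addressing overhead."""
    return n_bytes / bus_bandwidth(width_bits, clock_hz) + overhead_s

bw = bus_bandwidth(64, 100_000_000)
print(bw)                                     # 800,000,000 B/s = 800 MB/s
print(transfer_time(4096, 64, 100_000_000))   # seconds for a 4 KB transfer
```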

7. Device Controllers

A device controller (I/O controller) is the hardware interface between the system bus and a physical I/O device. The CPU communicates with the controller, not directly with the device. The controller translates bus signals into device-specific commands and vice versa.

Controller registers:

  • Data register: holds the data being transferred
  • Status register: device state (busy, ready, error)
  • Control register: CPU writes commands here (start, stop, mode)

I/O port addressing: The CPU accesses controller registers either through:

  • Port-mapped I/O (Isolated I/O): separate address space for I/O; special IN/OUT instructions (x86)
  • Memory-mapped I/O: controller registers appear as regular memory addresses; normal LOAD/STORE instructions; easier to program (ARM, MIPS)

| Feature | Port-Mapped I/O | Memory-Mapped I/O |
| --- | --- | --- |
| Address space | Separate I/O address space | Shared with memory |
| Instructions | Special IN/OUT | Regular LOAD/STORE |
| Protection | Easy — user mode cannot execute IN/OUT | Harder — requires page-level protection |
| Cache interactions | Not cached (separate space) | I/O pages must be marked non-cacheable |
| Used by | x86 (legacy + modern) | ARM, MIPS, most embedded systems |
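Memory-mapped I/O can be illustrated with a toy address decoder (pure simulation — the address ranges and register map are invented; real decoding happens in hardware):

```python
# One shared address space: a decoder routes each access to RAM or a device.
RAM = bytearray(1024)                          # addresses 0x000-0x3FF
DEVICE_BASE = 0x400                            # device registers mapped here
device_regs = {0x400: 0, 0x401: 0, 0x402: 0}   # data, status, control

def load(addr):
    """A regular LOAD: the decoder picks RAM or a device register."""
    if addr >= DEVICE_BASE:
        return device_regs[addr]               # goes to the device controller
    return RAM[addr]                           # goes to ordinary memory

def store(addr, value):
    """A regular STORE works the same way -- no special IN/OUT instruction."""
    if addr >= DEVICE_BASE:
        device_regs[addr] = value
    else:
        RAM[addr] = value

store(0x10, 0x55)    # ordinary memory write
store(0x402, 1)      # the same instruction writes the device control register
print(load(0x10), load(0x402))
```

This is exactly why memory-mapped I/O is easier to program: the same load/store path reaches both memory and devices, with only the address deciding the destination.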

8. GATE-Level Worked Examples

Example 1 — CPU Overhead: Polling vs Interrupt vs DMA (GATE 2021 style)

Problem: A disk transfers data at 4 MB/s. The CPU runs at 500 MHz with 400 cycles per interrupt. Compare CPU overhead for interrupt-driven I/O (one interrupt per 32-bit word) vs DMA (one interrupt per 4 KB block).

Solution:
Data rate = 4 MB/s = 4 × 2²⁰ bytes/s

Interrupt-driven (1 interrupt per 4-byte word):
Interrupts per second = (4 × 2²⁰) / 4 = 2²⁰ = 1,048,576 interrupts/s
CPU cycles per second = 1,048,576 × 400 = 419,430,400 cycles/s
CPU fraction = 419,430,400 / 500,000,000 ≈ 83.9% — CPU nearly saturated!

DMA (1 interrupt per 4 KB block):
Blocks per second = (4 × 2²⁰) / (4 × 2¹⁰) = 2¹⁰ = 1,024 blocks/s
CPU cycles per second = 1,024 × 400 = 409,600 cycles/s
CPU fraction = 409,600 / 500,000,000 ≈ 0.08% — negligible!
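The arithmetic in Example 1 can be verified directly (the variable names are ours):

```python
CPU_HZ = 500_000_000
CYCLES_PER_INTERRUPT = 400
DATA_RATE = 4 * 2**20                 # 4 MB/s in bytes/s

# Interrupt-driven: one interrupt per 4-byte word
irq_per_sec = DATA_RATE // 4
irq_fraction = irq_per_sec * CYCLES_PER_INTERRUPT / CPU_HZ
print(f"{irq_fraction:.1%}")          # 83.9% -- CPU nearly saturated

# DMA: one interrupt per 4 KB block
blocks_per_sec = DATA_RATE // (4 * 2**10)
dma_fraction = blocks_per_sec * CYCLES_PER_INTERRUPT / CPU_HZ
print(f"{dma_fraction:.2%}")          # 0.08% -- negligible
```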

Example 2 — DMA Transfer Time (GATE 2022 style)

Problem: A DMA controller transfers 8 KB of data from a disk to memory. The disk transfer rate is 1 MB/s. Bus cycle time is 100 ns. In cycle stealing mode, DMA steals 1 bus cycle per 4 bytes. What fraction of CPU time is stolen?

Solution:
Transfer size = 8 KB = 8192 bytes
Bus cycles stolen = 8192 / 4 = 2048 cycles
Transfer time = 8 KB / (1 MB/s) = 8192 / 1,048,576 s ≈ 7.81 ms
Time stolen = 2048 × 100 ns = 204,800 ns ≈ 0.205 ms
CPU fraction stolen = 0.205 / 7.81 ≈ 2.6%
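Example 2 as a runnable check (variable names are ours):

```python
BUS_CYCLE_S = 100e-9                     # 100 ns bus cycle
size = 8 * 1024                          # 8 KB in bytes
cycles_stolen = size // 4                # one bus cycle stolen per 4 bytes
transfer_s = size / (1024 * 1024)        # at 1 MB/s -> 7.8125 ms
stolen_s = cycles_stolen * BUS_CYCLE_S   # time the CPU loses to DMA
fraction = stolen_s / transfer_s
print(cycles_stolen, f"{fraction:.1%}")  # 2048 cycles, about 2.6%
```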

Example 3 — Interrupt Vector Table (GATE 2019 style)

Problem: An interrupt vector table starts at address 0x0000. Each interrupt vector entry is 4 bytes (a 32-bit address). Interrupt type 5 occurs. What address is fetched from the IVT?

Solution:
IVT base = 0x0000
Each entry = 4 bytes
Address of type 5 entry = 0x0000 + 5 × 4 = 0x0000 + 20 = 0x0014
The 4 bytes at address 0x0014 contain the ISR address for interrupt type 5.
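The IVT lookup in Example 3 reduces to one line of address arithmetic (the function name is ours):

```python
def ivt_entry_address(base, vector, entry_size=4):
    """Address of the IVT entry holding the ISR address for `vector`."""
    return base + vector * entry_size

addr = ivt_entry_address(0x0000, 5)
print(hex(addr))   # 0x14 -- the ISR address for interrupt type 5 lives here
```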

9. Common Mistakes

  1. Saying DMA involves zero CPU participation
    DMA significantly reduces CPU involvement, but the CPU still programs the DMA controller at the start and handles the completion interrupt at the end. In cycle stealing mode, the CPU also loses individual bus cycles. DMA is not “CPU-free” — it is “CPU-light.”
  2. Confusing interrupt-driven I/O with DMA
    Both use interrupts, but for different things. Interrupt-driven I/O generates one interrupt per data word/byte — the CPU handles each transfer. DMA generates one interrupt per entire block transfer — the DMA controller handles each individual word transfer; the CPU only gets interrupted at the end.
  3. Forgetting bus arbitration overhead in bandwidth calculations
    Bus bandwidth = width × frequency, but this is the theoretical peak. Actual throughput is lower due to bus arbitration (deciding who gets the bus), addressing cycles, and turnaround time between reads and writes. GATE questions sometimes ask about effective vs theoretical bandwidth.
  4. Mixing up port-mapped and memory-mapped I/O addressing
    In port-mapped I/O, device registers have their own address space separate from memory — you cannot access them with LOAD/STORE. In memory-mapped I/O, device registers appear in the regular memory address space — you use LOAD/STORE, but those addresses must be marked non-cacheable.
  5. Treating all interrupt types as maskable
    Non-maskable interrupts (NMI) cannot be disabled by the software interrupt flag. They are used for critical hardware failures. Treating NMI as maskable is a factual error in GATE MCQ answers.

10. Frequently Asked Questions

What is the difference between polling and interrupt-driven I/O?

In polling, the CPU wastes cycles in a busy-wait loop checking whether an I/O device is ready. In interrupt-driven I/O, the CPU starts the I/O operation and resumes other work; the device signals the CPU via an interrupt when it is ready. Polling is simpler but wasteful; interrupt-driven I/O is more efficient for slow or infrequent I/O operations.

What is DMA in computer organisation?

DMA (Direct Memory Access) is a technique where a dedicated DMA controller transfers data between I/O devices and main memory without requiring the CPU to execute each individual transfer. The CPU programs the DMA controller with the transfer parameters (addresses, size, direction), then continues executing. When the entire transfer completes, the DMA controller sends a single interrupt to the CPU. This is far more efficient than interrupt-driven I/O for large data transfers like disk reads.

What is cycle stealing in DMA?

Cycle stealing is a DMA operating mode where the DMA controller takes control of the memory bus for exactly one bus cycle to transfer one word, then immediately releases the bus back to the CPU. The CPU is briefly suspended for that single cycle but otherwise continues normally. This contrasts with burst mode, where the DMA holds the bus for the entire transfer, completely blocking the CPU.

What is an interrupt vector?

An interrupt vector is an entry in the Interrupt Vector Table (IVT) — a table stored in a fixed location in memory. Each entry contains the starting address of the Interrupt Service Routine (ISR) for a particular interrupt type. When a device interrupts the CPU, the device puts its interrupt number on the bus; the CPU uses that number to index into the IVT, fetches the ISR address, and jumps to the handler.
