I/O and Disk Scheduling in Operating Systems – SSTF, SCAN, C-SCAN & LOOK Algorithms
What You Will Learn
- Disk structure: platters, tracks, sectors, cylinders
- Disk scheduling algorithms: FCFS, SSTF, SCAN, C-SCAN, LOOK, C-LOOK
- Disk access time formula and GATE calculations
- I/O subsystem: buffering, caching, spooling
- RAID levels: 0, 1, 4, 5, 6, 10
- GATE PYQs with complete solutions
GATE Weightage: 1–3 marks regularly. Disk scheduling algorithm traces and disk access time calculations are the most frequent question types.
1. Disk Structure
Understanding physical disk structure is essential for disk scheduling in operating systems. A hard disk drive (HDD) consists of:
- Platters: Circular disks coated with magnetic material; a disk has 1–5 platters.
- Tracks: Concentric circles on each platter surface.
- Sectors: Fixed-size arcs of a track (traditionally 512 B, now 4 KB); the smallest unit of disk I/O.
- Cylinders: Set of all tracks at the same radius across all platter surfaces. Seeking moves between cylinders.
- Read/Write Head: Electromagnetic device that reads/writes data; one per platter surface.
- Arm: Moves all heads in unison to a given cylinder.
Disk Capacity Formula
Capacity = platters × surfaces per platter × tracks per surface × sectors per track × bytes per sector
Example: 2 platters × 2 surfaces × 10,000 tracks × 500 sectors × 512 B = 10,240,000,000 B ≈ 10.24 GB
2. Disk Access Time
The total time to access data from a disk consists of three components:
Seek Time: Time to move arm to the target cylinder
(given directly in problems, or as average seek time)
Rotational Latency: Time for target sector to rotate under head
Average Rotational Latency = (1/2) × (1/RPM) × 60,000 ms
Maximum Rotational Latency = (1/RPM) × 60,000 ms (full rotation)
Transfer Time: Time to read/write the data
Transfer Time = Data size / Transfer rate
Example: a 7200 RPM disk with 8 ms seek time, 50 MB/s transfer rate, reading one 4 KB sector:
Average Rotational Latency = (1/2) × (60/7200) × 1000 = (1/2) × 8.33 ms = 4.17 ms
Transfer Time = 4096 / (50 × 10⁶) × 1000 ms ≈ 0.082 ms
Total Access Time = 8 + 4.17 + 0.082 ≈ 12.25 ms
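As a quick check, the three components can be computed in a few lines of Python (the function name is illustrative):

```python
def disk_access_time_ms(seek_ms, rpm, data_bytes, rate_bytes_per_s):
    """Seek + average rotational latency + transfer time, all in ms."""
    rotation_ms = 60_000 / rpm            # one full rotation
    avg_latency_ms = rotation_ms / 2      # on average, half a rotation
    transfer_ms = data_bytes / rate_bytes_per_s * 1000
    return seek_ms + avg_latency_ms + transfer_ms

# 7200 RPM, 8 ms seek, 4 KB sector, 50 MB/s transfer rate
print(round(disk_access_time_ms(8, 7200, 4096, 50_000_000), 2))  # 12.25
```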
Effective Disk Access Time with Cache Hit
Effective access time = h × (cache access time) + (1 – h) × (disk access time)
where h = cache hit rate
3. Disk Scheduling Algorithms
Disk scheduling algorithms determine the order in which disk I/O requests are serviced to minimise total seek time. All examples below use the same request queue for direct comparison.
Disk cylinders: 0–199, Current head position: 50
Request queue: 98, 183, 37, 122, 14, 124, 65, 67
Previous head position (for SCAN direction): 45 (moving toward higher)
3.1 FCFS – First Come First Served
Requests are served in the order they arrive. No optimisation.
Total seek = |50–98| + |98–183| + |183–37| + |37–122| + |122–14| + |14–124| + |124–65| + |65–67|
= 48 + 85 + 146 + 85 + 108 + 110 + 59 + 2 = 643 cylinders
- Pro: Fair — no starvation
- Con: High seek time; wild arm movements
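The FCFS trace is a straight accumulation of distances in arrival order; a minimal Python sketch (function name is illustrative):

```python
def fcfs_seek(head, requests):
    """Total seek distance when requests are served strictly in arrival order."""
    total = 0
    for r in requests:
        total += abs(head - r)  # arm moves directly to the next arrival
        head = r
    return total

# Request queue from the running example, head at cylinder 50
print(fcfs_seek(50, [98, 183, 37, 122, 14, 124, 65, 67]))  # 643
```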
3.2 SSTF – Shortest Seek Time First
Always service the request closest to the current head position. Greedy algorithm.
Service order (greedy): 50 → 37 → 14 → 65 → 67 → 98 → 122 → 124 → 183
(From 50, the nearest request is 37 at distance 13, not 65 at distance 15, so the head moves down first.)
Seek = |50–37| + |37–14| + |14–65| + |65–67| + |67–98| + |98–122| + |122–124| + |124–183|
= 13 + 23 + 51 + 2 + 31 + 24 + 2 + 59 = 205 cylinders
- Pro: Much lower average seek time than FCFS
- Con: Can cause starvation for far-away requests; not optimal
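The greedy selection can be sketched directly in Python (function name is illustrative): at each step, pick the pending request nearest the current head position.

```python
def sstf_seek(head, requests):
    """Greedy: always service the pending request nearest the head."""
    pending = list(requests)
    total = 0
    while pending:
        nearest = min(pending, key=lambda r: abs(r - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total

print(sstf_seek(50, [98, 183, 37, 122, 14, 124, 65, 67]))  # 205
```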
3.3 SCAN (Elevator Algorithm)
Head moves in one direction, services all requests in that direction, then reverses. Like an elevator.
50 → 65 → 67 → 98 → 122 → 124 → 183 → 199 (end) → 37 → 14
Seek = |50–65| + |65–67| + |67–98| + |98–122| + |122–124| + |124–183| + |183–199| + |199–37| + |37–14|
= 15 + 2 + 31 + 24 + 2 + 59 + 16 + 162 + 23 = 334 cylinders
- Pro: No starvation; reasonable average seek time
- Con: Requests at extremes may wait longer
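A SCAN sketch in Python (assuming the head is moving toward higher cylinders, function name illustrative): sweep up through pending requests, touch the boundary, then sweep back down.

```python
def scan_seek(head, requests, max_cyl):
    """SCAN moving up: sweep to the disk boundary, then reverse."""
    higher = sorted(r for r in requests if r >= head)
    lower = sorted((r for r in requests if r < head), reverse=True)
    total, pos = 0, head
    for r in higher:
        total += r - pos
        pos = r
    if lower:                      # travel to the boundary before reversing
        total += max_cyl - pos
        pos = max_cyl
        for r in lower:
            total += pos - r
            pos = r
    return total

print(scan_seek(50, [98, 183, 37, 122, 14, 124, 65, 67], 199))  # 334
```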
3.4 C-SCAN (Circular SCAN)
Head moves in one direction only. After reaching the end, it jumps back to cylinder 0 (no servicing on the return) and continues in the same direction.
50 → 65 → 67 → 98 → 122 → 124 → 183 → 199 (end) → [jump to 0] → 14 → 37
Seek distance = 149 (50 to 199) + 199 (jump: 199 to 0) + 37 (0 to 37)
= 385 cylinders (but more uniform wait times)
- Pro: More uniform wait time than SCAN
- Con: The return jump adds seek distance without servicing any request
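A C-SCAN sketch in Python (head moving up, jump distance counted, as in the trace above; function name illustrative):

```python
def cscan_seek(head, requests, max_cyl):
    """C-SCAN moving up: sweep to the boundary, jump to 0, sweep up again."""
    higher = sorted(r for r in requests if r >= head)
    lower = sorted(r for r in requests if r < head)
    total, pos = 0, head
    for r in higher:
        total += r - pos
        pos = r
    if lower:
        total += (max_cyl - pos) + max_cyl  # to the end, then the jump back to 0
        pos = 0
        for r in lower:
            total += r - pos
            pos = r
    return total

print(cscan_seek(50, [98, 183, 37, 122, 14, 124, 65, 67], 199))  # 385
```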
3.5 LOOK and C-LOOK
LOOK is like SCAN but the arm reverses at the last pending request (not at the disk boundary). C-LOOK is like C-SCAN but jumps from the last request to the first pending request (not to cylinder 0).
50 → 65 → 67 → 98 → 122 → 124 → 183 (last high) → [reverse] → 37 → 14
Seek = 133 (50 to 183) + 169 (183 to 14) = 302 cylinders
C-LOOK (from 50, moving toward higher):
50 → 65 → 67 → 98 → 122 → 124 → 183 → [jump to 14] → 14 → 37
Seek = 133 + 169 + 23 = 325 cylinders
- Pro: Better than SCAN/C-SCAN as arm doesn’t travel to disk boundary uselessly
- Con: Slightly more complex implementation
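Both variants are small changes to the SCAN logic: the arm never passes the last pending request. A Python sketch of each (function names illustrative):

```python
def look_seek(head, requests):
    """LOOK: like SCAN, but reverse at the last pending request."""
    higher = sorted(r for r in requests if r >= head)
    lower = sorted((r for r in requests if r < head), reverse=True)
    total, pos = 0, head
    for r in higher + lower:       # sweep up, then back down
        total += abs(r - pos)
        pos = r
    return total

def clook_seek(head, requests):
    """C-LOOK: like C-SCAN, but jump to the lowest pending request."""
    higher = sorted(r for r in requests if r >= head)
    lower = sorted(r for r in requests if r < head)
    total, pos = 0, head
    for r in higher + lower:       # sweep up, jump down, sweep up again
        total += abs(r - pos)
        pos = r
    return total

print(look_seek(50, [98, 183, 37, 122, 14, 124, 65, 67]))   # 302
print(clook_seek(50, [98, 183, 37, 122, 14, 124, 65, 67]))  # 325
```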
4. Algorithm Comparison
| Algorithm | Seek Distance | Starvation? | Uniformity | Use Case |
|---|---|---|---|---|
| FCFS | 643 (highest) | No | Fair arrival order | Light loads |
| SSTF | 205 (lowest) | Yes | Poor (biased to center) | Moderate loads |
| SCAN | 334 | No | Moderate | General purpose |
| C-SCAN | 385 | No | Good | Heavy loads |
| LOOK | 302 | No | Moderate | General purpose |
| C-LOOK | 325 | No | Good | Heavy loads |
- SSTF gives minimum average seek time but may cause starvation
- SSTF is not always optimal: it’s a greedy heuristic
- LOOK and C-LOOK are better practical choices than SCAN/C-SCAN
- For uniform wait time: C-SCAN or C-LOOK
- FCFS is the only algorithm that never reorders requests
5. I/O Subsystem
The I/O subsystem in an operating system manages communication between the CPU and I/O devices through layered abstractions.
I/O Hardware Concepts
- Port: Connection point for a device (e.g., USB, serial port)
- Bus: Shared communication channel (PCI, SATA, USB)
- Device Controller: Electronics that operate a device; has registers (data, command, status)
I/O Methods
| Method | Description | CPU Involvement |
|---|---|---|
| Programmed I/O (Polling) | CPU continuously checks device status register (busy-waiting) | Very high — CPU stuck in loop |
| Interrupt-Driven I/O | Device interrupts CPU when ready; CPU does other work meanwhile | Moderate — interrupted per transfer |
| DMA (Direct Memory Access) | DMA controller transfers data between device and memory without CPU | Minimal — CPU only initiates/completes |
Buffering
A buffer is memory used to temporarily hold data during I/O transfer, smoothing speed mismatches.
- Single buffering: One buffer; CPU waits while buffer is being filled by device
- Double buffering: Two buffers; while CPU processes buffer A, device fills buffer B; then swap
- Circular buffering: Multiple buffers in a ring; used for continuous streams (audio, video)
Caching
A disk cache (buffer cache) holds recently-read disk blocks in memory. On a cache hit, no disk access is needed. The cache hit rate h determines performance:
Effective access time = h × (cache access time) + (1 – h) × (disk access time)
Spooling
Spooling (Simultaneous Peripheral Operations On-Line) stores output for a slow device (e.g., printer) in a spool directory on disk. Processes write to spool; the spooler daemon sends to the device when ready. Allows multiple processes to share a dedicated device (printer) without blocking.
I/O Kernel Subsystem Services
- Scheduling: Orders I/O requests using disk scheduling algorithms
- Buffering: Stores data in transit
- Caching: Holds frequently-used data in fast memory
- Spooling: Manages queue for devices serving one request at a time
- Error Handling: Retries, error codes, logging
- I/O Protection: Privileged instructions for device access; user mode uses system calls
6. RAID
RAID (Redundant Array of Independent Disks) uses multiple disks to improve performance, reliability, or both. Key RAID levels tested in GATE:
| RAID Level | Technique | Min Disks | Disk Failure Tolerance | Performance |
|---|---|---|---|---|
| RAID 0 | Striping only | 2 | 0 (any failure = data loss) | Best read/write speed |
| RAID 1 | Mirroring only | 2 | 1 (any single disk of the pair) | Good read, normal write |
| RAID 4 | Block striping + dedicated parity disk | 3 | 1 | Parity disk is bottleneck |
| RAID 5 | Block striping + distributed parity | 3 | 1 | Good balance |
| RAID 6 | Block striping + two distributed parities | 4 | 2 | Slower write, very reliable |
| RAID 10 (1+0) | Mirroring then striping | 4 | 1 per mirror pair | High perf + reliability |
RAID Parity and Usable Capacity (N disks)
RAID 0: Usable capacity = N × disk_size (no redundancy)
RAID 1: Usable capacity = disk_size (all disks store the same data)
RAID 5: Usable capacity = (N – 1) × disk_size (one disk’s worth of parity, distributed)
RAID 6: Usable capacity = (N – 2) × disk_size (two disks’ worth of parity)
Example: 5 disks of 1 TB each:
RAID 0: 5 TB usable, 0 fault tolerance
RAID 1: 1 TB usable, very high fault tolerance
RAID 5: 4 TB usable, 1 disk fault tolerance
RAID 6: 3 TB usable, 2 disk fault tolerance
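These capacity rules are simple enough to encode directly; a small Python helper (illustrative, covering only the levels discussed above):

```python
def raid_usable(level, n_disks, disk_size):
    """Usable capacity for common single-parity-scheme RAID levels."""
    if level == 0:
        return n_disks * disk_size        # striping only, no redundancy
    if level == 1:
        return disk_size                  # every disk holds the same data
    if level == 5:
        return (n_disks - 1) * disk_size  # one disk's worth of parity
    if level == 6:
        return (n_disks - 2) * disk_size  # two disks' worth of parity
    raise ValueError("unsupported RAID level")

for lvl in (0, 1, 5, 6):
    print(f"RAID {lvl}: {raid_usable(lvl, 5, 1)} TB usable")
```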
7. GATE Examples
GATE 2014: A disk has 200 cylinders (0–199). Current head is at cylinder 53, moving toward higher numbered cylinders. Queue: 98, 183, 37, 122, 14, 124, 65, 67. Calculate total head movement using SCAN.
View Solution
Moving toward higher: 53 → 65 → 67 → 98 → 122 → 124 → 183 → 199 → [reverse] → 37 → 14
Seek = (65-53) + (67-65) + (98-67) + (122-98) + (124-122) + (183-124) + (199-183) + (199-37) + (37-14)
= 12 + 2 + 31 + 24 + 2 + 59 + 16 + 162 + 23 = 331 cylinders
GATE 2008: A disk rotates at 3000 RPM. Data transfer rate is 1 MB/s. Seek time is 5 ms. What is the average time to read a 512-byte sector?
View Solution
Rotational speed = 3000 RPM → 50 rotations/second → 1 rotation = 20 ms
Average rotational latency = 20/2 = 10 ms
Transfer time = 512 B / (1 × 10⁶ B/s) × 1000 ms = 0.512 ms
Total = 5 + 10 + 0.512 = 15.512 ms
GATE 2012: A system uses 2-level disk scheduling: outer level uses Round Robin across 4 processes; inner level uses SSTF. Which statement is correct about starvation?
View Solution
At the inner level, SSTF orders each process’s own requests, so a far-away request can be repeatedly skipped if nearer requests from the same process keep arriving. The outer Round Robin guarantees every process gets a turn, so no process is starved as a whole. Result: no starvation at the process level (guaranteed by RR), but possible starvation of individual requests within a process’s queue under SSTF.
GATE 2019: A RAID 5 configuration uses 5 disks each of size 100 GB. What is the usable capacity?
View Solution
RAID 5 usable capacity = (N – 1) × disk size = (5 – 1) × 100 GB = 400 GB
8. Common Mistakes
- SCAN goes to disk boundary, LOOK does not: In SCAN, the arm travels all the way to cylinder 0 or 199 even if there’s no request there. In LOOK, the arm reverses at the last pending request. Many students confuse these.
- C-SCAN jumps to cylinder 0, C-LOOK jumps to first request: C-SCAN always returns to cylinder 0 on the jump; C-LOOK jumps to the lowest-numbered pending request. Both count the jump distance.
- Rotational latency formula: Average latency is half of one full rotation, not one full rotation. At 7200 RPM, full rotation = 8.33 ms, average = 4.17 ms.
- SSTF starvation: SSTF is not FCFS — it can skip over a request that arrived first if other requests are closer. This causes starvation for far-away requests.
- RAID 0 has no fault tolerance: RAID 0 (striping only) has ZERO fault tolerance. If any one disk fails, all data is lost.
- DMA vs Interrupt I/O: In interrupt-driven I/O, the CPU is interrupted for every byte or word transferred. In DMA, the CPU is involved only at the start and end of the transfer; DMA controller handles the data movement autonomously.
9. Frequently Asked Questions
Which disk scheduling algorithm gives minimum seek time?
SSTF (Shortest Seek Time First) gives the minimum average seek time among simple algorithms by always choosing the nearest request. However, it is not optimal and can cause starvation. LOOK is generally the best practical choice: near-SSTF performance without starvation, because the arm reverses at the last request rather than scanning the entire disk.
What is the difference between SCAN and C-SCAN disk scheduling?
SCAN moves the arm in one direction, services all requests, then reverses at the disk boundary and services requests in the other direction. This causes longer waits for requests just passed. C-SCAN moves in one direction only; after reaching the end, it jumps back to the start without servicing anything. This provides more uniform waiting times at the cost of the return-jump overhead.
How is disk access time calculated?
Disk access time = seek time + rotational latency + transfer time. Seek time is given or averaged. Average rotational latency = (1/2) × one full rotation time = (1/2) × (60,000/RPM) ms. Transfer time = data size / transfer rate. Example: 5400 RPM disk has average rotational latency = (1/2) × (60,000/5400) = 5.56 ms.
What is RAID and what are the different RAID levels?
RAID combines multiple disks for performance or reliability. RAID 0 (striping): max performance, no redundancy. RAID 1 (mirroring): full redundancy, 50% capacity. RAID 5 (distributed parity): (N-1)/N capacity, tolerates 1 failure. RAID 6: (N-2)/N capacity, tolerates 2 failures. RAID 10 (1+0): mirrors then stripes, high performance + reliability. RAID 5 is most widely used in enterprise storage.
What is the difference between buffering and caching in I/O?
Buffering holds data temporarily in transit between two components at different speeds (e.g., between a slow network card and fast CPU). Caching stores copies of frequently accessed data to avoid slow re-fetches. A disk buffer cache serves both purposes: it buffers writes (batching small writes into large transfers) and caches reads (keeping recently-used blocks in RAM). The key distinction is that buffers are one-time conduits; caches are persistent stores for re-use.