Data Link Layer in Computer Networks – Framing, Error Detection and Sliding Window Protocols

Q: What are the main functions of the data link layer?

The data link layer has six main functions: (1) Framing – divides the bitstream from the physical layer into manageable frames; (2) Physical Addressing – adds MAC addresses to identify sender and receiver on the same network; (3) Error Detection and Correction – uses techniques like CRC and checksums to detect corrupted frames; (4) Flow Control – prevents a fast sender from overwhelming a slow receiver; (5) Access Control – determines which device can use the shared medium at a given time (MAC sub-layer); (6) Error Control – acknowledges received frames and requests retransmission of lost/corrupted frames.

Q: How does CRC (Cyclic Redundancy Check) work?

CRC works by treating the message bits as a binary polynomial and dividing it by a pre-agreed generator polynomial using binary division (XOR operations). The sender appends the remainder of this division (the CRC checksum) to the message. The receiver performs the same division on the received data (message + CRC). If the remainder is zero, no errors occurred. If non-zero, an error is detected. CRC is very effective at detecting burst errors — it can detect all single-bit errors, all double-bit errors (with suitable generator), all odd-number-of-bit errors, and all burst errors of length ≤ degree of generator polynomial.

Q: What is the difference between Go-Back-N and Selective Repeat?

In Go-Back-N (GBN), if a frame error is detected, the receiver discards that frame AND all subsequent frames, and the sender retransmits the error frame plus all following frames. The receiver uses a window of size 1 (accepts only in-order frames). Sender window size can be up to 2^n – 1. In Selective Repeat (SR), the receiver accepts and buffers out-of-order frames; only the specific errored frame is retransmitted. Both sender and receiver have windows of size up to 2^(n-1). SR is more efficient but requires more memory at the receiver. GBN is simpler to implement but wastes bandwidth by retransmitting correctly-received frames.

Q: What is the difference between flow control and error control in data link layer?

Flow control manages the rate of data transmission between sender and receiver to prevent buffer overflow — it ensures a fast sender doesn't overwhelm a slow receiver. Techniques include Stop-and-Wait and sliding window. Error control detects and recovers from frames that are lost or corrupted during transmission — it uses acknowledgements (ACKs), negative acknowledgements (NAKs), and retransmission schemes (ARQ: Automatic Repeat reQuest). Error control deals with transmission errors; flow control deals with speed mismatches. Both are handled at the data link layer.

Q: What is the efficiency of Stop-and-Wait protocol?

Stop-and-Wait protocol efficiency = 1/(1 + 2a), where a = propagation delay / transmission time = Tp/Tt. If a is large (long distance or short frames), efficiency drops significantly — the sender spends most of its time waiting for ACKs. Example: if Tp = 5 ms and Tt = 1 ms, then a = 5, efficiency = 1/(1+10) = 1/11 ≈ 9%. Sliding window protocols improve efficiency: with window size W, efficiency = W/(1 + 2a) when W < 1 + 2a, or efficiency = 1 when W ≥ 1 + 2a.

What You Will Learn

What the data link layer does and its two sub-layers (LLC and MAC)
Framing techniques: character count, flag bytes, bit stuffing
Error detection: parity, checksum, CRC — how each works
Error correction: Hamming code concept
Flow control and ARQ protocols: Stop-and-Wait, Go-Back-N, Selective Repeat
Efficiency formulas and worked examples

1. Data Link Layer Overview

The data link layer (Layer 2 in the OSI model) is responsible for node-to-node delivery of data — reliable communication between two directly connected devices on the same network (e.g., your laptop to your Wi-Fi router, or one switch to another).

Think of it this way: the Network layer (Layer 3) worries about getting a packet from Mumbai to London. The Data Link layer worries about getting it from your laptop to the router sitting 10 metres away. It handles the immediate hop — one link at a time.

Main Functions

Framing: Wraps raw bits from the Physical layer into structured units called frames
Physical Addressing: Adds MAC addresses for sender and receiver identification on the local network
Error Detection: Detects if bits got corrupted during transmission
Error Control: Handles retransmission of lost or corrupted frames
Flow Control: Prevents the sender from overwhelming a slow receiver
Access Control: Determines which device can use the shared channel (managed by MAC sub-layer)

Two Sub-layers

Sub-layer	Full Name	Function
LLC	Logical Link Control	Interface between Data Link and Network layer; handles flow control and error notification; protocol multiplexing (identifies which Network layer protocol is being used)
MAC	Medium Access Control	Controls how devices on a shared medium take turns transmitting; handles physical addressing (MAC addresses); collision handling

2. Framing

The physical layer transmits a continuous stream of bits with no natural boundaries. Framing is the process of breaking this stream into discrete units (frames) so the receiver can identify where each frame starts and ends.

Character Count

The first field of the frame specifies how many characters (bytes) are in the frame. The receiver reads this count and knows exactly where the frame ends.

Problem: If the count field itself gets corrupted, the receiver loses synchronisation for all subsequent frames — a catastrophic error that's hard to recover from.

Flag Bytes with Byte Stuffing

A special flag byte (e.g., 01111110 in HDLC) marks the start and end of each frame. Since this flag pattern might appear in the data, byte stuffing is used: the sender inserts an escape byte before any occurrence of the flag or escape byte in the data. The receiver strips these escape bytes.

Used in: PPP (Point-to-Point Protocol)
Problem: Inefficient if data contains many flag bytes

Flag Bits with Bit Stuffing

The flag pattern 01111110 (six consecutive 1s) marks frame boundaries. Whenever the sender finds five consecutive 1s in the data, it automatically inserts a 0 bit (bit stuffing). The receiver, upon seeing five 1s, checks the next bit — if it's 0, it's a stuffed bit and is removed; if it's 1, it's a flag.

Used in: HDLC, USB
Advantage: Works for any data content; self-synchronising

Bit stuffing example:
Original data: 0110 11111 10011
After stuffing (insert 0 after every five 1s): 0110 111110 10011

Receiver: sees five 1s followed by 0 → removes the 0 → gets original data back.

Physical Layer Coding Violations

Some physical layer encodings (like Manchester encoding) deliberately use code sequences that never appear in normal data as frame delimiters. No stuffing is needed because the delimiter is physically impossible to occur in regular data.

3. Error Detection Techniques

When bits travel through a physical medium, electromagnetic interference, signal attenuation, or hardware issues can flip bits. Error detection techniques allow the receiver to detect when this happens.

Parity Check

A single extra bit (parity bit) is added to make the total number of 1-bits even (even parity) or odd (odd parity).

Data: 1011001 (four 1s — even number already)
Even parity bit = 0 → Transmitted: 10110010

Data: 1011101 (five 1s — odd number)
Even parity bit = 1 → Transmitted: 10111011

Detects: Any odd number of bit errors (1, 3, 5...)
Cannot detect: Even number of bit errors (2, 4... — parity appears correct)
2D Parity: Parity bits for each row AND each column; can detect AND correct single-bit errors

Checksum

The sender treats the data as a sequence of 16-bit integers, adds them together (wrapping around on overflow using one's complement addition), and sends the sum's complement as the checksum. The receiver adds all data segments plus checksum — if the result is all 1s (0xFFFF), no error.

Used in: TCP, UDP, IP headers
Detects: Most errors; stronger than simple parity
Weakness: Can miss errors if two errors cancel each other out in addition

CRC (Cyclic Redundancy Check)

CRC is the most powerful and widely used error detection technique in networking (used in Ethernet, Wi-Fi, USB, Bluetooth).

How CRC works:

Both sender and receiver agree on a generator polynomial G(x) of degree r (e.g., CRC-32 uses a 32-bit generator)
Sender appends r zeros to the message M, then divides (M × 2^r) by G using binary XOR division
The remainder R of this division is the CRC checksum (r bits)
Sender transmits (M + R) — the original message with checksum appended
Receiver divides the received bits by G. If remainder = 0 → no error. If non-zero → error detected

CRC Example:
Message M = 1101 (4 bits), Generator G = 1011 (degree 3, so r = 3)

Step 1: Append 3 zeros → 1101 000
Step 2: Divide 1101000 by 1011 using XOR:
  1101000 ÷ 1011:
  1101 XOR 1011 = 0110 → bring down 0 → 1100
  1100 XOR 1011 = 0111 → bring down 0 → 1110
  1110 XOR 1011 = 0101 → remainder = 101

Transmitted frame = 1101 101 (message + CRC)

Receiver: 1101101 ÷ 1011 → remainder = 000 → no error ✓

What CRC can detect:

All single-bit errors
All double-bit errors (with proper generator)
All odd numbers of errors (with generators that include x+1 factor)
All burst errors of length ≤ r (generator degree)
Most burst errors of length > r (with probability 1 – 2^(-r))

4. Error Correction

Error detection finds errors but requires retransmission. Error correction (Forward Error Correction — FEC) allows the receiver to fix errors without requesting retransmission. This is critical in systems where retransmission is expensive (deep-space communication, one-way broadcasts).

Hamming Code

Hamming code adds redundancy bits (parity bits) at specific positions (powers of 2: positions 1, 2, 4, 8...) within the data. Each parity bit covers a subset of data bits. By checking which parity bits fail, the receiver can pinpoint the exact position of the error and flip that bit.

Minimum parity bits needed:
For m data bits, need r parity bits where: 2^r ≥ m + r + 1

Examples:
m = 4 data bits → r = 3 parity bits (2³ = 8 ≥ 4+3+1 = 8 ✓)
m = 8 data bits → r = 4 parity bits (2⁴ = 16 ≥ 8+4+1 = 13 ✓)

Hamming distance:
Minimum Hamming distance d between codewords:
To detect e errors: d ≥ e + 1
To correct e errors: d ≥ 2e + 1

Hamming distance example:
To detect up to 3-bit errors: need minimum Hamming distance = 4
To correct up to 1-bit error: need minimum Hamming distance = 3
Hamming code (with d = 3) can correct single-bit errors OR detect double-bit errors, but not both simultaneously.

5. Flow Control

Flow control prevents a fast sender from sending data faster than a slow receiver can process it. Without flow control, the receiver's buffer overflows and frames are lost.

The receiver controls the rate by telling the sender how many frames it can accept. Two main mechanisms are used at the Data Link layer:

Stop-and-Wait: Simplest form — sender sends one frame, waits for ACK, then sends next
Sliding Window: Sender can have multiple unacknowledged frames in transit simultaneously

These mechanisms are also used for error control via ARQ (Automatic Repeat reQuest) — when an error is detected or a frame times out, the protocol requests retransmission.

6. Stop-and-Wait Protocol

The simplest ARQ protocol. The sender sends one frame and waits for an acknowledgement (ACK) before sending the next frame. If a frame is lost or corrupted, a timeout triggers retransmission.

Key terms:
Tt = Transmission time = Frame size / Bandwidth
Tp = Propagation delay = Distance / Speed of signal
a = Tp / Tt

Efficiency (no errors):
η = Tt / (Tt + 2Tp) = 1 / (1 + 2a)

Throughput:
Throughput = Efficiency × Bandwidth

Example:
Bandwidth = 1 Mbps, Frame size = 1000 bits, Propagation delay = 5 ms

Tt = 1000 bits / 1,000,000 bps = 1 ms
Tp = 5 ms, a = 5/1 = 5
Efficiency = 1 / (1 + 2×5) = 1/11 ≈ 9.09%
Throughput = 0.0909 × 1 Mbps ≈ 90.9 Kbps

Interpretation: The link is idle 90.9% of the time — the sender is just waiting for ACKs. This is extremely inefficient for high-bandwidth, long-distance links.

When Stop-and-Wait Works Well

Stop-and-Wait is efficient only when propagation delay is very small relative to transmission time (a << 1). For short-distance, low-bandwidth links — like a serial cable in the same room — it's perfectly adequate.

7. Go-Back-N Protocol

Go-Back-N (GBN) allows the sender to have multiple unacknowledged frames in transit simultaneously, up to a window size W. This dramatically improves efficiency on long-distance links.

How Go-Back-N Works

Sender can transmit up to W frames without receiving an ACK
Receiver expects frames in order; accepts only the next expected frame
If frame N is corrupted or lost: receiver discards frame N AND all subsequent frames, sends a NAK (or simply doesn't ACK)
Sender retransmits frame N and ALL frames sent after N — "goes back N frames"
Receiver uses cumulative ACKs: ACK N means "all frames up to N received correctly"

Window size limit:
With n-bit sequence numbers: Sender window size W ≤ 2^n – 1
(Receiver window size = 1 — accepts only in-order frames)

Efficiency:
If W ≥ 1 + 2a: η = 1 (sender never idle)
If W < 1 + 2a: η = W / (1 + 2a)

With error probability p:
η = W(1-p) / [(1 + 2a)(1-p + pW)] (simplified: η ≈ (1-p) / (1 + 2ap))

Example:
Bandwidth = 1 Mbps, Frame = 1000 bits, Tp = 5 ms, W = 7 frames

Tt = 1 ms, a = 5, (1 + 2a) = 11
W = 7 < 11, so η = 7/11 ≈ 63.6%

If W = 11 or more: η = 1 (100% efficiency — sender always busy)

Why the Sequence Number Limit is 2^n – 1

Suppose we use n = 3 bits → sequence numbers 0–7. If the sender sends frames 0–7 (window = 8 = 2³), and all ACKs are lost, the sender retransmits 0–7. But the receiver — which correctly got all 8 frames — is now waiting for the next batch starting at 0. It cannot distinguish "new frame 0" from "retransmitted frame 0." Limiting window to 7 (2³–1) prevents this ambiguity.

8. Selective Repeat Protocol

Selective Repeat (SR) improves on Go-Back-N by retransmitting only the specific frame that was lost or corrupted, rather than all subsequent frames. The receiver buffers out-of-order frames until the missing frame arrives.

How Selective Repeat Works

Both sender and receiver maintain sliding windows
Receiver buffers out-of-order frames and sends individual ACKs
Sender retransmits only the specific timed-out/NAKed frame
Receiver delivers frames to the upper layer in order once all gaps are filled

Window size limit:
With n-bit sequence numbers:
Sender window size = Receiver window size ≤ 2^(n-1)
(Both windows must be ≤ half the sequence number space)

Efficiency:
If W ≥ 1 + 2a: η = 1
If W < 1 + 2a: η = W / (1 + 2a)

Same formula as GBN — but SR achieves this with a smaller window size!

Why SR window ≤ 2^(n-1):
3-bit sequence numbers → 0–7. If both windows = 4 = 2^(3-1):
Sender sends 0,1,2,3. All received correctly, but ACKs lost.
Sender retransmits 0,1,2,3. Receiver's window has moved to 4,5,6,7.
New frame "0" falls OUTSIDE receiver's window → rejected as duplicate. Safe.

If window = 5: receiver cannot distinguish new frame 0 from retransmitted frame 0. Ambiguity!

Sender Buffer vs Receiver Buffer

Protocol	Sender Buffer	Receiver Buffer
Stop-and-Wait	1 frame	1 frame
Go-Back-N	W frames (up to 2^n – 1)	1 frame (in-order only)
Selective Repeat	W frames (up to 2^(n-1))	W frames (buffers out-of-order)

9. Protocol Comparison

Feature	Stop-and-Wait	Go-Back-N	Selective Repeat
Sender window size	1	Up to 2^n – 1	Up to 2^(n-1)
Receiver window size	1	1 (in-order only)	Up to 2^(n-1)
On error: retransmit	Just that frame	Error frame + all after it	Just the error frame
Receiver buffering	None	None (discards out-of-order)	Yes (buffers out-of-order)
Complexity	Very simple	Moderate	Complex
Bandwidth efficiency	Very low	Medium–High	High
Best for	Short-distance, low-bandwidth	High bandwidth, low error rate	High bandwidth, high error rate

Practical choice: Selective Repeat is used in modern protocols like TCP (which has selective acknowledgement — SACK option) because it minimises unnecessary retransmissions. Go-Back-N is simpler and works well when errors are rare and bandwidth is not the bottleneck.

10. Common Misconceptions

"CRC corrects errors": CRC only detects errors — it tells the receiver that something went wrong. The receiver then requests retransmission (ARQ). CRC does not have the information to correct errors. Hamming code is an example of error-correcting code.
"Bigger window always means better efficiency": In Go-Back-N, a very large window with a high error rate is counterproductive — each error causes retransmission of many frames. Selective Repeat handles high error rates better with smaller effective retransmission cost.
"Go-Back-N receiver discards only the bad frame": In Go-Back-N, the receiver discards the errored frame AND all subsequent frames received after it (even if those frames were correct). Only in-order frames are accepted. Selective Repeat accepts and buffers out-of-order frames.
"The sequence number space must equal the window size": The sequence number space must be LARGER than the window size to avoid ambiguity. For GBN: sequence space = 2^n, window ≤ 2^n – 1. For SR: window ≤ 2^(n-1).
"Stop-and-Wait is always inefficient": Stop-and-Wait is efficient when propagation delay is negligible (a ≈ 0). For very short-distance links or very large frames relative to link speed, it's perfectly adequate. It becomes inefficient only when a is large (long distance or high bandwidth).

11. Frequently Asked Questions

What are the main functions of the data link layer?

The data link layer handles: framing (packaging bits into frames with start/end delimiters), physical addressing (adding MAC addresses), error detection (CRC, checksum, parity to find corrupted frames), error control (retransmission via ARQ when frames are lost/corrupted), flow control (preventing sender from overwhelming receiver), and access control (MAC sub-layer decides who transmits on shared media). Together these functions ensure reliable hop-to-hop data delivery.

How does CRC (Cyclic Redundancy Check) work?

CRC treats the data as a binary polynomial, divides it by a pre-agreed generator polynomial using XOR arithmetic, and appends the remainder as a checksum. The receiver performs the same division — a zero remainder means no error. CRC detects all single-bit errors, all burst errors shorter than the generator length, and most longer burst errors. It's used in Ethernet, Wi-Fi, ZIP files, and USB because it's efficient in hardware and highly reliable.

What is the difference between Go-Back-N and Selective Repeat?

Go-Back-N: receiver accepts only in-order frames; on error, sender retransmits the bad frame plus all subsequent frames (even correctly received ones); receiver window = 1; sender window ≤ 2^n – 1. Selective Repeat: receiver buffers out-of-order frames; only the specific bad frame is retransmitted; both windows ≤ 2^(n-1). SR is more efficient at high error rates but requires more receiver memory. GBN is simpler to implement and fine for low-error-rate links.

What is the difference between flow control and error control in data link layer?

Flow control addresses speed mismatch: it prevents a fast sender from overrunning a slow receiver's buffer. The receiver signals how many frames it can handle (via window size). Error control addresses transmission corruption: it detects/corrects frames damaged during transmission using techniques like CRC for detection, and ARQ (Stop-and-Wait, GBN, SR) for recovery via retransmission. Both are responsibilities of the data link layer, and sliding window protocols implement both simultaneously.

What is the efficiency of Stop-and-Wait protocol?

Efficiency = 1/(1 + 2a), where a = propagation delay / transmission time. When a = 0 (no propagation delay), efficiency = 100%. When a = 10 (propagation delay is 10× transmission time), efficiency = 1/21 ≈ 4.8%. This shows why Stop-and-Wait is impractical for satellite links or transcontinental connections — the sender spends most of its time idle. Sliding window protocols solve this: efficiency = W/(1 + 2a), so choosing W ≥ 1 + 2a achieves 100% efficiency.