14.2.16. Why is my Ethernet operation not reliable?

Question:
My Ethernet connection is not working reliably. On one switch it works fine, but on another one it doesn't.
or:
Question:
I always see transmit errors or timeouts for the first packet of a download, but then it works well.
or:
Question:
I cannot mount the Linux root file system over NFS, especially not with recent Linux kernel versions (older kernel versions work better). Specifying "proto=tcp" as a mount option greatly improves the situation.
etc.
Answer:
There are many possible explanations for such problems. After eliminating the obvious sources (like broken cables etc.) you should check the configuration of your Ethernet PHY. One common cause of problems is a PHY that is hard-configured for full duplex mode (for example 100baseTX Full Duplex or 10baseT Full Duplex). If such a setup is combined with an autonegotiating switch, then trouble is ahead.

Jerry Van Baren explained this as follows:
Ignoring the configuration where both ends are (presumably correctly) 
manually configured, you end up with five cases, two of them 
misconfigured and WRONG:
1) Autonegotiation     <-> autonegotiation - reliable.
2) 10bT half duplex    <-> autonegotiation - reliable.
3) 100bT half duplex   <-> autonegotiation - reliable.
4) 10bT *FULL* duplex  <-> autonegotiation - *UNreliable*.
5) 100bT *FULL* duplex <-> autonegotiation - *UNreliable*.

The problem that I've observed is that the *humans* (the weak links) 
that do the manual configuration don't understand that "parallel 
detection" *must be* half duplex by definition in the spec (it is hard 
to define a reliable algorithm to detect full duplex capability so the 
spec writers punted).  As a result, the human invariably picks "full 
duplex" because everybody knows full duplex is better... and end up as 
case (4) or (5).  They inadvertently end up with a slower unreliable 
link (lots of "collisions" resulting in runt packets) rather than the 
faster better link they thought they were picking (d'oh!).  The really 
bad thing is that the network works fine in testing on an isolated LAN 
with no traffic and absolutely craps its pants when it hits the real
world.

That is my reasoning behind my statement that we can generally ignore 
the autonegotiation <-> fixed configuration case because the odds of it 
working properly are poor anyway.
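
The five cases above can be reproduced with a small model. The following Python sketch is illustrative only: PortConfig, resolved_duplex and link_is_reliable are hypothetical names, not part of U-Boot or of any PHY driver. It assumes, as described in the explanation above, that an autonegotiating port facing a fixed partner falls back to half duplex ("parallel detection"), and that two autonegotiating ports both advertise full duplex.

```python
from typing import NamedTuple


class PortConfig(NamedTuple):
    """Hypothetical description of one end of the link."""
    autoneg: bool
    speed: int = 100           # Mbit/s; only meaningful when autoneg is False
    full_duplex: bool = False  # only meaningful when autoneg is False


def resolved_duplex(local: PortConfig, partner: PortConfig) -> bool:
    """Return True if 'local' ends up operating in full duplex."""
    if not local.autoneg:
        # A fixed port uses whatever it was told to use.
        return local.full_duplex
    if partner.autoneg:
        # Both ends negotiate; assumed here to agree on full duplex.
        return True
    # Parallel detection: the speed is detected, the duplex mode is not,
    # so the autonegotiating end falls back to half duplex.
    return False


def link_is_reliable(a: PortConfig, b: PortConfig) -> bool:
    """A link is treated as reliable when both ends agree on the duplex mode."""
    if not a.autoneg and not b.autoneg and a.speed != b.speed:
        return False  # two fixed ports with different speeds
    return resolved_duplex(a, b) == resolved_duplex(b, a)


AUTO = PortConfig(autoneg=True)
CASES = [
    ("autonegotiation", AUTO),
    ("10bT half duplex", PortConfig(False, 10, False)),
    ("100bT half duplex", PortConfig(False, 100, False)),
    ("10bT full duplex", PortConfig(False, 10, True)),
    ("100bT full duplex", PortConfig(False, 100, True)),
]

if __name__ == "__main__":
    for name, port in CASES:
        verdict = "reliable" if link_is_reliable(port, AUTO) else "UNreliable"
        print(f"{name} <-> autonegotiation: {verdict}")
```

In this model the two fixed full duplex entries are the only ones where the ends disagree on the duplex mode, which matches cases (4) and (5).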
Rule:
Always try to set up your PHY for autonegotiation.
If you must use some fixed setting, then set it to half duplex mode.
If you really must use a fixed full-duplex setting, then you absolutely must make sure that the link partner is configured exactly the same.
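
As a rough illustration of the rule, the sketch below classifies a raw value of the PHY's basic mode control register (MII register 0). The bit masks are the standard MII definitions for 10/100 Mbit/s PHYs; the function name check_bmcr and the message texts are hypothetical, and how the register value is read from your hardware is left out on purpose. Gigabit PHYs use an additional speed bit that this sketch ignores.

```python
# Standard MII basic mode control register (register 0) bits.
BMCR_SPEED100 = 0x2000  # speed select: set = 100 Mbit/s, clear = 10 Mbit/s
BMCR_ANENABLE = 0x1000  # autonegotiation enable
BMCR_FULLDPLX = 0x0100  # duplex mode: set = full duplex


def check_bmcr(bmcr: int) -> str:
    """Classify a register 0 value according to the rule above (hypothetical helper)."""
    if bmcr & BMCR_ANENABLE:
        # With autonegotiation enabled the fixed speed/duplex bits are not used.
        return "ok: autonegotiation enabled"
    speed = 100 if bmcr & BMCR_SPEED100 else 10
    if bmcr & BMCR_FULLDPLX:
        return (f"warning: fixed {speed} Mbit/s full duplex; "
                "the link partner must be configured exactly the same")
    return f"ok: fixed {speed} Mbit/s half duplex"


if __name__ == "__main__":
    for value in (0x3100, 0x2000, 0x2100, 0x0100):
        print(f"BMCR=0x{value:04x}: {check_bmcr(value)}")
```

A "warning" result does not mean the link is broken; it only means the configuration is safe solely when the other end is forced to the same speed and duplex mode.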