2020/07/31 07:58:46
I'm trying to debug an issue I'm having with a LAN8720 chip.
The chip is connected to an iMX283 SoC and is working with the REFCLK provided by the SoC.
The SoC is running Linux v5.7 with the Freescale FEC and the SMSC LAN8710/20 drivers.
The issue is that when the PHY chip reports the link to be up, the Linux driver performs a hardware reset of its MAC module, which leads to the PHY losing the link. The LEDs on the Ethernet port turn off at this time and autonegotiation restarts.
I'm pretty sure the issue occurs exactly at the time of the hard reset, as (1) I tried adding a delay before the reset and (2) replacing the hard reset with a soft reset works fine (I'm not sure of the exact differences between these two reset modes though).
I'm trying to work out what is happening and whether this is a problem with the hardware or the software. I would be very grateful if someone could advise on the following questions:
- What could possibly cause the link to drop? I'm thinking that if the REFCLK gets disturbed or turned off temporarily this might produce this behavior. Any other likely possibilities?
- Looking at the RXP/RXN signals showed that the signaling between the magnetics module and the PHY chip abruptly and completely stops at the time the link goes down (see attached picture for an example). Is this expected or would this rather indicate an issue with, e.g., power? I doubt that the PHY chip resets itself as this would clear the interrupt mask, which is not the case.
I would be very grateful for any help with this issue.
Best regards,
Laurent Badel

Attached Image(s)

2020/09/28 06:02:21
I'm still puzzled regarding this issue.
I've confirmed with a scope that the reference clock does show a temporary decrease in frequency from 50 MHz to 25 MHz due to the reset of the MAC (the duration of the glitch is about 300 ms with Linux 5.8).
I've also confirmed that the outbound TX signals from the PHY show an increase in width starting at exactly the same time as the reference clock glitch, consistent with them being clocked at 25 MHz instead of 50 MHz.
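As a rough sanity check on the scope observation above, here is a minimal sketch of the expected timing. It assumes that the PHY derives its transmit clock from REFCLK, so that the symbol rate on the wire scales linearly with the reference frequency; the helper name is made up for illustration and none of this comes from the datasheet.

```python
# Assumption: the PHY derives its TX clock from REFCLK, so the symbol
# rate on the wire scales linearly with the reference frequency.

NOMINAL_REFCLK_HZ = 50_000_000      # RMII reference clock
NOMINAL_SYMBOL_RATE = 125_000_000   # 100BASE-TX: 100 Mbit/s after 4B/5B coding


def symbol_period_ns(refclk_hz):
    """Expected line symbol period in ns for a given REFCLK (hypothetical helper)."""
    symbol_rate = NOMINAL_SYMBOL_RATE * refclk_hz / NOMINAL_REFCLK_HZ
    return 1e9 / symbol_rate


normal = symbol_period_ns(50_000_000)
glitch = symbol_period_ns(25_000_000)
print(f"50 MHz REFCLK: {normal:.1f} ns per symbol")
print(f"25 MHz REFCLK: {glitch:.1f} ns per symbol")
```

Under that assumption the symbol period doubles while the glitch lasts, which would match the wider TX pulses seen on the scope.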
Oddly enough, I've also noticed that this link failure occurs only with some link partners (specifically HP computers in my case, which use the Intel I218/I219/X722 chips; the link does not drop when the link partner is a Raspberry Pi, or the same HP laptop using a USB/Ethernet adapter). This also seems to occur only with newer versions of Linux, as 2.6.35 does not show the same behavior.
Interestingly the time between the end of TX signals and the beginning of autonegotiation is very close to 1200ms, which suggests to me that the link_break_timer may be engaged (this timer lasts approx. 1200ms from the LAN8720 datasheet). The fact that this timer is engaged would probably mean that the PHY has decided to restart autonegotiation, either because bit 0.9 has been set by software (which I am trying to confirm but without luck so far), or for another unspecified reason.
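For checking whether software has set bit 0.9, a small decoder for the clause 22 Basic Control Register (register 0) may help when looking at raw register dumps. This is only a sketch: the function name is made up, and how the raw 16-bit value is obtained (debug print in the driver, MDIO access from user space, etc.) is left out. Keep in mind that the restart bit is self-clearing, so a read after the fact may show 0 even if it was written.

```python
# Bit positions of the IEEE 802.3 clause 22 Basic Control Register (register 0).
BMCR_BITS = {
    15: "reset",
    14: "loopback",
    13: "speed_select_lsb",
    12: "autoneg_enable",
    11: "power_down",
    10: "isolate",
    9: "restart_autoneg",   # "bit 0.9", self-clearing
    8: "duplex_mode",
}


def decode_bmcr(value):
    """Map a raw 16-bit register 0 value to named flags (hypothetical helper)."""
    return {name: bool((value >> bit) & 1) for bit, name in BMCR_BITS.items()}


# Example value: autonegotiation enabled with a restart requested.
flags = decode_bmcr(0x1200)
for name, is_set in flags.items():
    print(f"{name:18s} {int(is_set)}")
```

Logging the decoded value on every write to register 0 is more likely to catch a restart request than polling the register afterwards.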
Is there by any chance an (undocumented) register that disables or otherwise adjusts the "link fail inhibit" timer (apparently some PHYs offer this possibility)?
Any other suggestions on where to look to figure out the root cause?
Thanks in advance,
2020/10/30 06:46:25
Did it solve the problem? Maybe I should open a technical ticket.
2020/11/23 05:25:15
Eventually I convinced myself that the drop in REFCLK frequency is the reason for the link loss.
IEEE 802.3 clause 22 states that "the criteria for determining link validity is PHY-specific", which suggests to me that the link partner may choose to invalidate the link as soon as an error is registered, and we cannot assume that it will tolerate a glitch of even the shortest duration (tests show that even holding the REFCLK down for < 10 us was enough to invalidate the link with Intel chips, whereas a Raspberry Pi 4 will tolerate up to several hundred ms before dropping the link).
Thus I believe that the fault is with the Ethernet driver, which should not reset the MII mode (which causes the REFCLK to drop from 50 MHz to 25 MHz) when the link is detected to be up.
Thus no problem with the LAN8720 chip as far as I am concerned.
Thanks much for your assistance.
Best regards,