Author
moser
Super Member
  • Total Posts : 449
  • Reward points : 0
  • Joined: 2015/06/16 02:53:47
  • Location: Germany
  • Status: offline
2017/07/11 04:28:57 (permalink)
0

Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired lease)

I'd like to make a few small modifications to the DHCP client implementation. Most of it is understandable, but I have trouble understanding some of its mechanics. I was looking for all the places where the IP might get changed. As far as I can tell there are only these four:
  1. TCPIP_DHCP_Disable() (see call sequence 1)
  2. TCPIP_DHCP_Request() (see call sequence 2)
  3. state change from TCPIP_DHCP_WAIT_LEASE_CHECK to case TCPIP_DHCP_BOUND when the ARP check timeout tells the IP is free (see call sequence 3)
  4. _DHCPCheckRunFailEvent() (see call sequence 4)
The first three are clear to me. But I am not sure about the mechanism behind _DHCPCheckRunFailEvent(). I have seen that it is connected to _DHCPSetRunFail(), pClient->flags.bReportFail, and pClient->tOpStart, and I have looked at all the places where those are set or reset.
 
What is actually the purpose of _DHCPCheckRunFailEvent() and _DHCPSetRunFail()?
What is the meaning of pClient->flags.bReportFail, and when is it set or not set?
 
It looks to me as follows:
The timeout is started after TCPIP_DHCP_Initialize(), TCPIP_DHCP_Enable(), or TCPIP_DHCP_ConnectionHandler() with TCPIP_MAC_EV_CONN_ESTABLISHED. If the DHCP client doesn't reach the bound state within 10 seconds (TCPIP_DHCP_TIMEOUT), it may hand over to the next address service (in my case a static fallback IP, as I have link local disabled) and it also stops the mechanism by setting bReportFail to 0. The mechanism is also stopped when a rebind or renew is successful, by _DHCPSetBoundState(), which sets bReportFail to false. What is the idea behind this?
 
What is also confusing to me is the following:
When the lease expires, how does the DHCP module stop using the lease IP and hand over back to another address service?
The expiry check is in dhcp.c:1407 and it calls _DHCPSetRunFail(). But _DHCPSetRunFail() does not restart the _DHCPCheckRunFailEvent() mechanism if bReportFail was already false. And if there was ever a successful renew or rebind, then I expect bReportFail to be false. If the lease then expires (without a successful renew/rebind), the DHCP module seems to go back to discovery state, but it seems to continue to use the expired lease address forever. Is that true? (I haven't tested it yet.) Or what am I overlooking here?
 
 
Appendix:
Call sequences (including some selected "if"s and "case" statements) for IP changes in the DHCP module
 
Call sequence 1:
  • TCPIP_DHCP_Disable() dhcp.c:872
  • if(pClient->flags.bDHCPEnabled != 0) dhcp.c:878
  • TCPIP_STACK_AddressServiceEvent() dhcp.c:881
  • TCPIP_STACK_AddressServiceDefaultSet() tcpip_manager.c:3108
  • _TCPIPStackSetConfigAddress() tcpip_manager.c:3301
Call sequence 2:
  • TCPIP_DHCP_Request() dhcp.c:903
  • _DHCPStartOperation() dhcp.c:905
  • _TCPIPStackSetConfigAddress() dhcp.c:954
Call sequence 3:
  • TCPIP_DHCP_Task() dhcp.c:1105
  • TCPIP_DHCP_Process() dhcp.c:1121
  • case TCPIP_DHCP_WAIT_LEASE_CHECK dhcp.c:1272
  • else if((_DHCPSecondCountGet() - pClient->startWait) >= pClient->tLeaseCheck) dhcp.c:1291
  • _DHCPSetNewLease() dhcp.c:1302
  • _TCPIPStackSetConfigAddress() dhcp.c:1961
Call sequence 4:
  • TCPIP_DHCP_Task() dhcp.c:1105
  • TCPIP_DHCP_Process() dhcp.c:1121
  • _DHCPCheckRunFailEvent() dhcp.c:1165
  • if(pClient->smState > TCPIP_DHCP_WAIT_LINK && pClient->smState < TCPIP_DHCP_BOUND) dhcp.c:2788
  • if(pClient->flags.bReportFail && pClient->tOpStart != 0) dhcp.c:2790
  • if((_DHCPSecondCountGet() - pClient->tOpStart) >= pClient->tOpFailTmo) dhcp.c:2793
  • TCPIP_STACK_AddressServiceEvent() dhcp.c:2794
  • TCPIP_STACK_AddressServiceDefaultSet() tcpip_manager.c:3108
  • _TCPIPStackSetConfigAddress() tcpip_manager.c:3301
#1
rainad
Moderator
  • Total Posts : 1157
  • Reward points : 0
  • Joined: 2009/05/01 13:39:25
  • Location: 0
  • Status: offline
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2017/07/11 12:15:49 (permalink)
3.33 (3)
moser
When the lease expires, how does the DHCP module stop using the lease IP and hand over back to another address service?
... it seems to continue to use the expired lease address forever. Is that true?

There is a timeout and from the BOUND state the client will go to RENEW, then REBIND (if there was a timeout) and then finally to DISCOVERY if all failed. It will also generate an address event indicating failure, that will switch the IP address to a different service (static, ZCLL, etc.).
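To make that timing concrete, here is a minimal sketch of the phases, assuming the RFC 2131 default timers (T1 = 0.5 × lease for RENEW, T2 = 0.875 × lease for REBIND). The type and function names are hypothetical and this is not the Harmony state machine; a server may also override T1 and T2 through DHCP options.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from dhcp.c. */
typedef enum {
    PHASE_BOUND,      /* lease valid, nothing to do yet */
    PHASE_RENEWING,   /* T1 passed: unicast REQUEST to the leasing server */
    PHASE_REBINDING,  /* T2 passed: broadcast REQUEST to any server */
    PHASE_EXPIRED     /* lease over: address must be given up, back to discovery */
} lease_phase_t;

/* RFC 2131 default renewal time: half of the lease. */
static uint32_t lease_t1(uint32_t lease_s)
{
    return lease_s / 2u;
}

/* RFC 2131 default rebinding time: 7/8 of the lease (64-bit to avoid overflow). */
static uint32_t lease_t2(uint32_t lease_s)
{
    return (uint32_t)(((uint64_t)lease_s * 7u) / 8u);
}

/* Map the seconds elapsed since the lease was granted to a phase. */
static lease_phase_t lease_phase(uint32_t elapsed_s, uint32_t lease_s)
{
    if (elapsed_s >= lease_s) {
        return PHASE_EXPIRED;
    }
    if (elapsed_s >= lease_t2(lease_s)) {
        return PHASE_REBINDING;
    }
    if (elapsed_s >= lease_t1(lease_s)) {
        return PHASE_RENEWING;
    }
    return PHASE_BOUND;
}

/* Usage example for a 5 minute (300 s) lease. */
void lease_phase_example(void)
{
    assert(lease_phase(0u, 300u) == PHASE_BOUND);
    assert(lease_phase(150u, 300u) == PHASE_RENEWING);
    assert(lease_phase(262u, 300u) == PHASE_REBINDING);
    assert(lease_phase(300u, 300u) == PHASE_EXPIRED);
}
```

With integer seconds the 7/8 point of a 300 s lease truncates to 262 s; a real client would take T1 and T2 from the server's options when they are present.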
 
If you have noticed any error in the behavior of the DHCP client please let us know and we'll try to fix it immediately.
 
 
#2
moser
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2017/07/12 09:28:17 (permalink)
0
Thanks for the reply, rainad.
 
I was hoping you could point me to the code, where the address event indicating failure is actually generated.
 
I really believe now there might be a mistake. I think the following three places should set bReportFail to 1, but they don't do it:
  • The NAK in TCPIP_DHCP_GET_RENEW_ACK
  • The NAK in TCPIP_DHCP_GET_REBIND_ACK
  • The tExpSeconds timeout in TCPIP_DHCP_GET_REBIND_ACK
I'll try to test if my assumption is true.
 
 
This is what bothers me about the lease expiry. The following is the code where the DHCP client is already in REBIND state and has sent a broadcast request to extend the lease.
dhcp.c:1402 in function TCPIP_DHCP_Process()

             case TCPIP_DHCP_GET_REBIND_ACK:
                 // process received data
                 if(_DHCPProcessReceiveData(pClient, pNetIf) == TCPIP_DHCP_TIMEOUT_MESSAGE)
                 { // no data available
                     // check first for Texp timeout
                     if((_DHCPSecondCountGet() - pClient->tRequest) >= pClient->tExpSeconds)
                     { // REBIND state expired; restart
                         _DHCPSetRunFail(pClient, TCPIP_DHCP_SEND_DISCOVERY, false);
                         _DHCPNotifyClients(pNetIf, DHCP_EVENT_TIMEOUT);
                     }
                     else if((_DHCPSecondCountGet() - pClient->startWait) >= pClient->t3Seconds)
                     { // check if there's time for retransmission of another DHCP request
                         _DHCPClientStateSet(pClient, TCPIP_DHCP_SEND_REBIND);
                     }
                     // else no time for retry yet;
                 }
                 break;

_DHCPProcessReceiveData() looks for an ACK or NAK. If there is an ACK it returns to BOUND state. If there is a NAK it calls _DHCPSetRunFail() as follows:
 
dhcp.c:1928 in function _DHCPProcessReceiveData()

        case TCPIP_DHCP_NAK_MESSAGE:
            dhcpRecvFail = true;
            dhcpEv = DHCP_EVENT_NACK;
            break;
        default:
            // remain in the same state; do nothing
            break;
    }
 
    if(dhcpRecvFail)
    {
        _DHCPSetRunFail(pClient, TCPIP_DHCP_SEND_DISCOVERY, false);
    }

 
If _DHCPProcessReceiveData() didn't receive anything it returns a TCPIP_DHCP_TIMEOUT_MESSAGE.
 
Then, back in the code above the lease expiry (pClient->tExpSeconds) is checked. If it is expired, it calls _DHCPSetRunFail() and _DHCPNotifyClients().
 
_DHCPNotifyClients() sends a notification to the registered clients. As far as I can see, the only registered receiver of the event is the announce module, which registers the ANNOUNCE_Notify() function (in tcpip_announce.c:299).
But the ANNOUNCE_Notify() function gets event DHCP_EVENT_TIMEOUT and for this one it is doing nothing.
 
So, there is only _DHCPSetRunFail(pClient, TCPIP_DHCP_SEND_DISCOVERY, false). Let's look at it:
 
dhcp.c:646 function _DHCPSetRunFail()

static void _DHCPSetRunFail(DHCP_CLIENT_VARS* pClient, TCPIP_DHCP_STATUS newState, bool expBackoff)
{
    _DHCPClientStateSet(pClient, newState);

The first line goes to TCPIP_DHCP_SEND_DISCOVERY state.

    pClient->dhcpOp = TCPIP_DHCP_OPER_INIT; // failure forces a brand new lease acquisition

As far as I can tell this only matters for the next _DHCPSend().

    if(expBackoff)
    { // exponential backoff the DHCP timeout
        pClient->dhcpTmo <<= 1;
        if(pClient->dhcpTmo > TCPIP_DHCP_EXP_BACKOFF_LIMIT)
        {
            pClient->dhcpTmo = TCPIP_DHCP_EXP_BACKOFF_LIMIT;
        }
    }

At the place where we come from there is no exponential backoff.

    if(pClient->flags.bReportFail)
    {
        _DHCPSetFailTimeout(pClient, false, true);
    }
}

And this last part of the function depends on this "strange" bReportFail flag. If it is set then _DHCPSetFailTimeout() is called.
 
dhcp.c:375 function _DHCPSetFailTimeout()

static __inline__ void __attribute__((always_inline)) _DHCPSetFailTimeout(DHCP_CLIENT_VARS* pClient, bool resetTmo, bool isRunTime)
{
    pClient->flags.bReportFail = 1;
    if(resetTmo || pClient->tOpStart == 0)
    {
        pClient->tOpStart = _DHCPSecondCountGet();
#if (TCPIP_DHCP_DEBUG_MASK & TCPIP_DHCP_DEBUG_MASK_FAIL_TMO_EVENT) != 0
        SYS_CONSOLE_PRINT("DHCP cli: %d, set fail tmo: %d, xid: 0x%8x, at: %s, reset: %s\r\n", pClient - DHCPClients, pClient->tOpStart, TCPIP_Helper_htonl(pClient->transactionID.Val), isRunTime ? "run" : "init", resetTmo ? "y" : "n");
#endif // TCPIP_DHCP_DEBUG_MASK
    }
}

Remember, from where we came, we call _DHCPSetFailTimeout() only if the "strange" bReportFail was set, otherwise we skip the function. That function then sets the flag again. And if the tOpStart was never set, then it gets set now.
 
After that, TCPIP_DHCP_Process() from case TCPIP_DHCP_GET_REBIND_ACK is finished. The next time TCPIP_DHCP_Process() is called, the DHCP module is in TCPIP_DHCP_SEND_DISCOVERY state. But so far I haven't seen anything which generates the said address event.
 
The only place I can think of which could generate the address event is the _DHCPCheckRunFailEvent(pNetIf, pClient) function. It is called in the first part of TCPIP_DHCP_Process().
 
dhcp.c:1136

static void TCPIP_DHCP_Process(bool isTmo)
{
    unsigned int recvMsg;
    DHCP_CLIENT_VARS* pClient;
    int netIx, nNets;
    TCPIP_NET_IF* pNetIf;
 
    if(isTmo)
    { // update DHCP time keeping
        _DHCPSecondCountSet();
    }
 
    nNets = TCPIP_STACK_NumberOfNetworksGet();
    for(netIx = 0; netIx < nNets; netIx++) 
    {
        pNetIf = (TCPIP_NET_IF*)TCPIP_STACK_IndexToNet (netIx);
        if(!TCPIP_STACK_NetworkIsUp(pNetIf))
        { // inactive interface
            continue;
        }
 
        pClient = DHCPClients + TCPIP_STACK_NetIxGet(pNetIf);
 
        if(pClient->flags.bDHCPEnabled == false)
        { // not enabled on this interface
            continue;
        }
 
        // check loss of lease
        _DHCPCheckRunFailEvent(pNetIf, pClient);
 
        switch(pClient->smState)
        {

The comment here even is saying "// check loss of lease".
 
dhcp.c:2784

// checks and reports loss of lease
static void _DHCPCheckRunFailEvent(TCPIP_NET_IF* pNetIf, DHCP_CLIENT_VARS* pClient)
{
    if(pClient->smState > TCPIP_DHCP_WAIT_LINK && pClient->smState < TCPIP_DHCP_BOUND)
    { // don't have a current lease
        if(pClient->flags.bReportFail && pClient->tOpStart != 0)
        {
            if((_DHCPSecondCountGet() - pClient->tOpStart) >= pClient->tOpFailTmo)
            { // initialization time out
                TCPIP_STACK_AddressServiceEvent(pNetIf, TCPIP_STACK_ADDRESS_SERVICE_DHCPC, TCPIP_STACK_ADDRESS_SERVICE_EVENT_RUN_FAIL);
                _DHCPDbgAddServiceEvent(pClient, TCPIP_STACK_ADDRESS_SERVICE_EVENT_RUN_FAIL, "init tmo");
                pClient->flags.bReportFail = 0; // reported
            }
        }
    }
}

So there are three checks.
 
The first (outer) one checks the state, and we are in TCPIP_DHCP_SEND_DISCOVERY, so this is true.
 
The last (inner) one checks that we are no longer within the initialization time, which is the 10 seconds after TCPIP_DHCP_Initialize(), TCPIP_DHCP_Enable(), or TCPIP_DHCP_ConnectionHandler() with TCPIP_MAC_EV_CONN_ESTABLISHED. As we are coming from a REBIND, this one can also be expected to be true.
 
But the second (middle) one, depends on two things. The tOpStart should be set, and this should be true. But the other one is again the "strange" bReportFail flag. If this bReportFail is set, then we are fine and we get the TCPIP_STACK_AddressServiceEvent(). But as described in my first post, I have doubts that this is always the case.
 
I can find five places where bReportFail is modified :
 
377: pClient->flags.bReportFail = 1;
This one is called in _DHCPSetFailTimeout(), which is called by _DHCPEnable(), which is called by TCPIP_DHCP_Initialize(), TCPIP_DHCP_Enable(), or after TCPIP_DHCP_ConnectionHandler() with TCPIP_MAC_EV_CONN_ESTABLISHED. Furthermore _DHCPSetFailTimeout() is called by _DHCPSetRunFail(), but only if bReportFail is true.

542: pClient->flags.bReportFail = 1;
This one is in _DHCPClientClose().

2012: pClient->flags.bReportFail = false;
This one is in _DHCPSetBoundState(), which is called in TCPIP_DHCP_GET_RENEW_ACK and TCPIP_DHCP_GET_REBIND_ACK, and also in _DHCPSetNewLease() which is called in TCPIP_DHCP_WAIT_LEASE_CHECK.
 
2796: pClient->flags.bReportFail = 0; // reported
This one is in _DHCPCheckRunFailEvent() after the TCPIP_STACK_AddressServiceEvent() was reported (see code above).

756: pClient->flags.val = 0;
This one is in TCPIP_DHCP_Initialize().

So, I think, after turning bReportFail off, nobody will turn it on again.
#3
moser
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2017/07/13 04:39:36 (permalink)
3.5 (2)
It's proven. There must be an error. The device does not return to the fixed IP!
 
I have an image of a WireShark log attached.
 
I have set my computer to the fixed IP 192.168.1.158. I have set up a DHCP server (I used Tftpd64) on my computer. The address pool starts at 192.168.1.160. The lease time is 5 minutes. My Microchip board is using DHCP with a fixed IP as fallback (no link local).
 
  • You can see how my computer takes its IP.
  • Then the Microchip board does the DHCP and gets the 192.168.1.160 from my computer.
  • After a few minutes you can always see the REQUEST, when the board went into RENEW state. The lease is renewed and the board continues to use the IP. This repeats four times in total.
  • Then I just stopped the DHCP Server. Now the next REQUEST in the RENEW doesn't get an answer. The request is repeated after some time.
  • Then the board goes into REBIND state. Now the REQUEST is broadcasted.
  • Then the lease expires, and the board goes to DISCOVERY state. It tries to get a new IP by trying to DISCOVER a DHCP server, but nobody responds. It now uses the 0.0.0.0 address again (following the RFC), as it doesn't have an address anymore. The board does this because it is hardcoded for the REQUEST in line 1205, by calling _DHCPSend() with the TCPIP_DHCP_FLAG_SEND_ZERO_ADD parameter. So far this is correct. But in reality it still has its IP 192.168.1.160, which is wrong.
  • Then I just tried to ping the old IP. The board responds, which is wrong. Instead it should have gone back to the fixed IP address.
  • Then I even tried to get an HTTP page from the old IP. The board simply delivers the pages still using the old IP. 
post edited by moser - 2017/07/13 05:11:27

Attached Image(s)

#4
rainad
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2017/07/13 14:06:29 (permalink)
3.5 (2)
Thank you for reporting this.
I'll perform the tests and try to provide a fix ASAP.
 
 
#5
rainad
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2017/07/14 15:11:42 (permalink) ☼ Best Answerby moser 2017/07/24 08:09:59
5 (1)
Hi moser,
 
I've performed the tests and confirmed what you discovered.
I think that the offending instruction/bug is in the _DHCPSetBoundState(), where bReportFail is cleared:
pClient->flags.bReportFail = false;
It should be
pClient->flags.bReportFail = true; 
because from this moment on (bound) any failure should be reported.
I was wondering if you can give it a try on your side and let me know if it works for you.
I'll make sure that the fix is part of the next release.
Thank you.
 
PS. Alternatively the flag could be set in the states > BOUND, like you suggested: on the NAK in TCPIP_DHCP_GET_RENEW_ACK, on the NAK in TCPIP_DHCP_GET_REBIND_ACK, or on the tExpSeconds timeout in TCPIP_DHCP_GET_REBIND_ACK.
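The effect of that one-line change can be illustrated with a small stand-alone model of the flag logic traced earlier in the thread. This is only a sketch under simplifying assumptions: the struct and function names are hypothetical, the fields merely mirror bReportFail, tOpStart and tOpFailTmo, and the model assumes the start time is never cleared once set. It is not the Harmony code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the library's data structure. */
typedef struct {
    bool     report_fail;   /* mirrors pClient->flags.bReportFail */
    uint32_t op_start_s;    /* mirrors pClient->tOpStart (0 = never started) */
    uint32_t fail_tmo_s;    /* mirrors pClient->tOpFailTmo */
    bool     has_lease;     /* true while the state is BOUND or later */
    bool     run_fail_sent; /* true once the RUN_FAIL event would be raised */
} dhcp_model_t;

/* Enable path: arms the fail report and starts the timeout. */
static void model_enable(dhcp_model_t *m, uint32_t now_s)
{
    m->report_fail = true;
    m->op_start_s = now_s;
    m->has_lease = false;
    m->run_fail_sent = false;
}

/* Bound path: flag_written is the value stored into the flag (false = old, true = fix). */
static void model_set_bound(dhcp_model_t *m, bool flag_written)
{
    m->has_lease = true;
    m->report_fail = flag_written;
}

/* Run-fail path: only touches the timeout if the flag is still set. */
static void model_set_run_fail(dhcp_model_t *m, uint32_t now_s)
{
    m->has_lease = false;
    if (m->report_fail && m->op_start_s == 0u) {
        m->op_start_s = now_s;
    }
}

/* Periodic check: the three conditions that gate the address event. */
static void model_check_run_fail(dhcp_model_t *m, uint32_t now_s)
{
    if (!m->has_lease && m->report_fail && m->op_start_s != 0u &&
        (uint32_t)(now_s - m->op_start_s) >= m->fail_tmo_s) {
        m->run_fail_sent = true;
        m->report_fail = false; /* reported */
    }
}

/* Scenario: enable, get a lease, lose it 5 minutes later, run one check. */
bool lease_loss_reports_failure(bool flag_written_when_bound)
{
    dhcp_model_t m = { .fail_tmo_s = 10u };
    model_enable(&m, 1u);
    model_set_bound(&m, flag_written_when_bound);
    model_set_run_fail(&m, 301u);
    model_check_run_fail(&m, 302u);
    return m.run_fail_sent;
}

/* Usage example: the old value never reports, the proposed value does. */
void model_example(void)
{
    assert(!lease_loss_reports_failure(false));
    assert(lease_loss_reports_failure(true));
}
```

In this model, writing false when the lease is bound leaves nothing that can re-arm the report, which matches the observed behavior of the board keeping its expired address.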
 
#6
moser
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2017/07/24 00:47:58 (permalink)
0
I'll check and test it.
 
 
Update: Changing the instruction in _DHCPSetBoundState() looks plausible to me and I expect it to work. I don't see any loophole which could still cause a misbehavior. I'm going to test it.
post edited by moser - 2017/07/24 04:20:28
#7
moser
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2017/07/24 08:09:14 (permalink)
0
Tested and it works.
 
Now the bReportFail flag makes much more sense to me.
 
Thanks!
post edited by moser - 2017/07/24 08:11:40
#8
rainad
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2017/07/24 09:33:08 (permalink)
3 (1)
Thank you for your help in finding and testing this.
The fix will be part of the v2.04 release.
 
post edited by rainad - 2017/07/24 10:10:59
#9
krbvroc1
New Member
  • Total Posts : 29
  • Reward points : 0
  • Joined: 2016/05/10 16:18:59
  • Location: 0
  • Status: offline
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2019/04/19 08:35:28 (permalink)
0
moser
Then I just tried to ping the old IP. The board responds, which is wrong. Instead it should have gone back to the fixed IP address.



Is this in a DHCP RFC or something? I believe this thread and the subsequent change in Harmony 2.04 may have broken our implementation.
 
Under many Linux systems using their default DHCP clients, I believe if there is a loss of the DHCP server (or due to network traffic) the DHCP client will continue to use its same IP until the DHCP server comes back on line.
 
In the scenario I am testing, the PIC32 is placed on a network with a DHCP lease time set to an incredibly small 10 seconds (out of my control) and there are lots of Linux servers (>50) on this same subnet. So ARP traffic is bursty/huge at renew time. At T/2 (5 secs) I believe the PIC32 is dropping/missing its DHCPOFFER from the server and its existing TCP connections get disconnected. My current assumption is that it is due to this change.
 
How would I change the current harmony (or use hooks) to put this behavior back to using the existing IP in case of a DHCP server timeout? I am not sure going to a fallback is smart in this scenario.
 
rainad
Thank you for your help in finding and testing this.
The fix will be part of the v2.04 release.

#10
moser
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2019/04/29 02:00:20 (permalink)
0
krbvroc1
moser
Then I just tried to ping the old IP. The board responds, which is wrong. Instead it should have gone back to the fixed IP address.



Is this in a DHCP RFC or something? I believe this thread and the subsequent change in Harmony 2.04 may have broken our implementation.
 
Under many Linux systems using their default DHCP clients, I believe if there is a loss of the DHCP server (or due to network traffic) the DHCP client will continue to use its same IP until the DHCP server comes back on line.

 
RFC 2131, section 4.4.5, last paragraph ( https://tools.ietf.org/html/rfc2131#section-4.4.5 ) 
If the lease expires before the client receives a DHCPACK, the client moves to INIT state, MUST immediately stop any other network processing and requests network initialization parameters as if the client were uninitialized. If the client then receives a DHCPACK allocating that client its previous network address, the client SHOULD continue network processing. If the client is given a new network address, it MUST NOT continue using the previous network address and SHOULD notify the local users of the problem.
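The rule in that paragraph can be written down as a tiny decision function. This is only a sketch of my reading of the quoted text; the enum and function names are hypothetical, and addresses are compared as plain 32-bit values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical action names for what a client does once its lease has expired. */
typedef enum {
    ACT_STOP_AND_REINIT,         /* no DHCPACK yet: stop network processing, stay in INIT */
    ACT_CONTINUE_WITH_PREVIOUS,  /* DHCPACK for the same address: continue processing */
    ACT_SWITCH_AND_NOTIFY        /* DHCPACK for a new address: drop the old one, notify users */
} post_expiry_action_t;

/* Decide what to do after lease expiry, following the quoted RFC 2131 paragraph. */
static post_expiry_action_t after_expiry(bool ack_received,
                                         uint32_t previous_addr,
                                         uint32_t acked_addr)
{
    if (!ack_received) {
        return ACT_STOP_AND_REINIT;
    }
    if (acked_addr == previous_addr) {
        return ACT_CONTINUE_WITH_PREVIOUS;
    }
    return ACT_SWITCH_AND_NOTIFY;
}

/* Usage example with 192.168.1.160 (0xC0A801A0) as the previous address. */
void after_expiry_example(void)
{
    assert(after_expiry(false, 0xC0A801A0u, 0u) == ACT_STOP_AND_REINIT);
    assert(after_expiry(true, 0xC0A801A0u, 0xC0A801A0u) == ACT_CONTINUE_WITH_PREVIOUS);
    assert(after_expiry(true, 0xC0A801A0u, 0xC0A801A1u) == ACT_SWITCH_AND_NOTIFY);
}
```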

 
Of course I understand that it is very often helpful to continue using the DHCP address in the absence of any DHCP server. But it could break the DHCP mechanism.
 
However, typically you do a fallback to another address-providing mechanism, for example zeroconf or a fixed IP. And without any knowledge of the network, how can anybody forbid using the last DHCP IP as a fixed IP? However, I am not aware of whether there are RFCs about these fallback approaches or rules on how to handle them.
 
You can easily get the IP from a lease by using
bool TCPIP_DHCP_InfoGet(TCPIP_NET_HANDLE hNet, TCPIP_DHCP_INFO* pDhcpInfo);

and store the IP from the TCPIP_DHCP_INFO structure, depending on the status.
 
There is a notification mechanism for DHCP events (TCPIP_DHCP_EVENT_TYPE), but I believe there were some important events missing.
#11
krbvroc1
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2019/04/29 07:14:40 (permalink)
0
Thanks moser,
 
 After I posted, I did read that RFC and read that same section. I also did some tests with various Linux DHCP clients and found they do NOT continue to use an expired lease. I am still trying to track down what is causing some reported field issues, but it is tough because 'it works' fine on my test network.
 
 I did find through testing that a DHCP lease of 10 seconds does not work with the Harmony DHCP client. I am not sure if a timer rolls over or something, but such a short lease caused problems. I think the default DHCP 'TIMEOUT' is 10 seconds, which might interfere. Maybe, to operate with such a short lease time, that timeout value needs to be set shorter than the lease time. Technically this means the DHCP client has a bug, because RFC 2131 removed the minimum lease time from the spec (it used to be 120 seconds).
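On the rollover question: the timeout checks quoted earlier in this thread all have the form (now - start) >= timeout. Assuming the second counter is an unsigned 32-bit value (an assumption, the thread does not show its type), that form stays correct across a wrap of the counter, as this small sketch with a hypothetical helper shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true once at least timeout_s seconds have passed since start_s.
 * Unsigned subtraction is modulo 2^32, so the result is right even if now_s wrapped. */
static bool timeout_elapsed(uint32_t now_s, uint32_t start_s, uint32_t timeout_s)
{
    return (uint32_t)(now_s - start_s) >= timeout_s;
}

/* Usage example: a 10 s timeout, without and with a counter wrap. */
void timeout_elapsed_example(void)
{
    assert(!timeout_elapsed(14u, 5u, 10u));          /* 9 s elapsed */
    assert(timeout_elapsed(15u, 5u, 10u));           /* 10 s elapsed */
    assert(!timeout_elapsed(4u, 0xFFFFFFFBu, 10u));  /* 9 s elapsed across the wrap */
    assert(timeout_elapsed(5u, 0xFFFFFFFBu, 10u));   /* 10 s elapsed across the wrap */
}
```

So a plain wrap of such a counter would not by itself explain the short-lease problem; the interaction between the 10 second lease and the 10 second timeout looks like the more likely candidate.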
 
 As a temporary workaround, I modified the Harmony code to add a 'TCPIP_Stack_NetFallbackSet()' function so I can dynamically pass in an IP, mask, and gateway. So at least once I get an initial DHCP lease, I can set that as the fallback. That routine should really be part of Harmony itself.
 
But there are still some underlying network issues in the field that I am trying to track down.
#12
krbvroc1
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2019/05/01 17:15:29 (permalink)
5 (1)
After more testing it would appear the DHCP client cannot properly handle multiple DHCP servers per the RFC. For example, it is legal for 2 DHCP servers on the same subnet to each make a DHCPOFFER. The DHCP client chooses one. If that server goes offline, the DHCP client moves to the REBINDING state and will get a DHCPACK from the online server. Instead of switching to that server, the DHCP client ignores the response (rxErrCode 8), times out and moves to the DISCOVER state. The new cycle will pick the new server, but the damage has been done since existing connections will be closed when the stack is reconfigured to the fallback address.
#13
moser
Re: Understanding the DHCP client implementation (_DHCP_CheckFailEvent(), and expired leas 2019/05/02 09:27:36 (permalink)
0
8 means the server id doesn't match.

// check what we got
if ( dhcpOptData.msgType == TCPIP_DHCP_OFFER_MESSAGE )
{ // store the current server ID
    pClient->dwServerID = dhcpOptData.serverID.Val;
    pClient->flags.bOfferReceived = true;
}
else if(pClient->dwServerID != dhcpOptData.serverID.Val)
{ // Fail if the server id doesn't match
    rxErrCode = 8;
    break;
}

That's strange. This check doesn't make sense in REBINDING state ...
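For comparison, here is a sketch of how such a check could take the client state into account. It assumes the RFC 2131 behavior that a REBINDING client broadcasts its DHCPREQUEST so that any server may extend the lease. The state enum and helper names are hypothetical; this is not the Harmony code and not a tested fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client states; only the distinction REBINDING vs. others matters here. */
typedef enum { CLI_REQUESTING, CLI_RENEWING, CLI_REBINDING } cli_state_t;

/* Accept a reply if it comes from the chosen server, or from any server while rebinding. */
static bool server_id_acceptable(cli_state_t state, uint32_t chosen_id, uint32_t reply_id)
{
    if (state == CLI_REBINDING) {
        return true; /* the broadcast request was addressed to every server */
    }
    return reply_id == chosen_id;
}

/* Server ID to remember after an ACK: adopt the replying server when it was accepted. */
static uint32_t server_id_after_ack(cli_state_t state, uint32_t chosen_id, uint32_t reply_id)
{
    return server_id_acceptable(state, chosen_id, reply_id) ? reply_id : chosen_id;
}

/* Usage example: server 1 was chosen, server 2 answers. */
void server_id_example(void)
{
    assert(!server_id_acceptable(CLI_RENEWING, 1u, 2u));
    assert(server_id_acceptable(CLI_REBINDING, 1u, 2u));
    assert(server_id_after_ack(CLI_REBINDING, 1u, 2u) == 2u);
    assert(server_id_after_ack(CLI_RENEWING, 1u, 2u) == 1u);
}
```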
post edited by moser - 2019/05/02 09:36:51
#14