• AVR Freaks

Hot!H2.06 IGMPv3 Losing Subscription

Author
campbellCustom
Junior Member
  • Total Posts : 100
  • Reward points : 0
  • Joined: 2014/08/30 14:35:35
  • Location: 0
  • Status: offline
2020/02/25 14:00:14 (permalink)
0

H2.06 IGMPv3 Losing Subscription

I have several released products (MX, MZ) that have been occasionally losing their ability to receive multicast data in the field. The newest ones are on H2.06. The issue seems to relate to systems configured to do IGMP Snooping. Sometime near boot the PIC32 does an IGMP join. The managed switch seems ok with this join and adds the group address into it's known IP Multicast Group table. The multicast packets are sent to the required port at first. After a few minutes (averaging around ~13 minutes) the switch is no longer showing the port as being included for the group address. The product is no longer receiving the multicast data, which is necessary for it to work. I'm having trouble finding for sure if this is a report issue upon query in the IGMPv3 module. (sadly I cannot debug the switch... )
 
I have tried a little bit of debugging on the products, and gone so far as to reproduce this issue with the MZ EF starter kit, and H2.06's tcpip_udp_server. To build this, the MHC file needs to add the IGMPv3 module, and allow multicast on the interface (then regenerate). The below app.c source adds a state to do a multicast join and is generally familiar to the example. No actual transmitter is required. (the join is lost regardless of data transmission)
 

#include "app.h"
#include "tcpip/tcpip.h"

#define SERVER_PORT 9760

APP_DATA appData;

    void APP_Initialize(void) {
    appData.state = APP_TCPIP_WAIT_INIT;
}

void APP_Tasks(void) {
    SYS_STATUS tcpipStat;
    const char *netName, *netBiosName;
    static IPV4_ADDR dwLastIP[2] = {
        {-1},
        {-1}
    };
    IPV4_ADDR ipAddr;
    int i, nNets;
    TCPIP_NET_HANDLE netH;

    SYS_CMD_READY_TO_READ();
    switch (appData.state) {
        case APP_TCPIP_WAIT_INIT:
            tcpipStat = TCPIP_STACK_Status(sysObj.tcpip);
            if (tcpipStat < 0) { // some error occurred
                SYS_CONSOLE_MESSAGE(" APP: TCP/IP stack initialization failed!\r\n");
                appData.state = APP_TCPIP_ERROR;
            } else if (tcpipStat == SYS_STATUS_READY) {
                // now that the stack is ready we can check the
                // available interfaces
                nNets = TCPIP_STACK_NumberOfNetworksGet();
                for (i = 0; i < nNets; i++) {

                    netH = TCPIP_STACK_IndexToNet(i);
                    netName = TCPIP_STACK_NetNameGet(netH);
                    netBiosName = TCPIP_STACK_NetBIOSName(netH);

#if defined(TCPIP_STACK_USE_NBNS)
                    SYS_CONSOLE_PRINT("    Interface %s on host %s - NBNS enabled\r\n", netName, netBiosName);
#else
                    SYS_CONSOLE_PRINT("    Interface %s on host %s - NBNS disabled\r\n", netName, netBiosName);
#endif  // defined(TCPIP_STACK_USE_NBNS)

                }
                appData.state = APP_TCPIP_WAIT_FOR_IP;

            }
            break;


        case APP_TCPIP_WAIT_FOR_IP:

            // if the IP address of an interface has changed
            // display the new value on the system console
            nNets = TCPIP_STACK_NumberOfNetworksGet();

            for (i = 0; i < nNets; i++) {
                netH = TCPIP_STACK_IndexToNet(i);

                if (!TCPIP_STACK_NetIsReady(netH)) {
                    return; // interface not ready yet!
                }

                ipAddr.Val = TCPIP_STACK_NetAddress(netH);
                if (dwLastIP[i].Val != ipAddr.Val) {
                    dwLastIP[i].Val = ipAddr.Val;

                    SYS_CONSOLE_MESSAGE(TCPIP_STACK_NetNameGet(netH));
                    SYS_CONSOLE_MESSAGE(" IP Address: ");
                    SYS_CONSOLE_PRINT("%d.%d.%d.%d \r\n", ipAddr.v[0], ipAddr.v[1], ipAddr.v[2], ipAddr.v[3]);
                }
            }
            // all interfaces ready. Could start transactions!!!
            appData.state = APP_TCPIP_OPENING_SERVER;
            break;

        case APP_TCPIP_OPENING_SERVER:
        {
            SYS_CONSOLE_PRINT("Waiting for Client Connection on port: %d\r\n", SERVER_PORT);
            appData.socket = TCPIP_UDP_ServerOpen(IP_ADDRESS_TYPE_IPV4, SERVER_PORT, 0);
            if (appData.socket == INVALID_SOCKET) {
                SYS_CONSOLE_MESSAGE("Couldn't open server socket\r\n");
                break;
            }
            appData.state = APP_TCPIP_JOIN_MULTICAST;
        }
            break;

        case APP_TCPIP_WAIT_FOR_CONNECTION:
        {
            if (!TCPIP_UDP_IsConnected(appData.socket)) {
                return;
            } else {
                // We got a connection
                appData.state = APP_TCPIP_JOIN_MULTICAST;
                SYS_CONSOLE_MESSAGE("Received a connection\r\n");
            }
        }
            break;
        case APP_TCPIP_JOIN_MULTICAST:
        {
            IPV4_ADDR IGMP_subscribe_address;
            TCPIP_Helper_StringToIPAddress("239.255.0.1", &IGMP_subscribe_address);
            netH = TCPIP_STACK_IndexToNet(0);

            TCPIP_IGMP_RESULT result;
            result = TCPIP_IGMP_Join(appData.socket, netH, IGMP_subscribe_address);
            if(result != TCPIP_IGMP_OK) {
                Nop(); // break point for debug
            }
            appData.state = APP_TCPIP_SERVING_CONNECTION;
        }
            break;
        case APP_TCPIP_SERVING_CONNECTION:
        {
            if (!TCPIP_UDP_IsConnected(appData.socket)) {
                appData.state = APP_TCPIP_WAIT_FOR_CONNECTION;
                SYS_CONSOLE_MESSAGE("Connection was closed\r\n");
                break;
            }
            int16_t wMaxGet, wMaxPut, wCurrentChunk;
            uint16_t w, w2;
            uint8_t AppBuffer[32];
            memset(AppBuffer, 0, 32);
            // Figure out how many bytes have been received and how many we can transmit.
            wMaxGet = TCPIP_UDP_GetIsReady(appData.socket); // Get UDP RX FIFO byte count
            wMaxPut = TCPIP_UDP_PutIsReady(appData.socket);

            //SYS_CONSOLE_PRINT("\t%d bytes are available.\r\n", wMaxGet);
            if (wMaxGet == 0) {
                break;
            }

            if (wMaxPut < wMaxGet) {
                wMaxGet = wMaxPut;
            }

            SYS_CONSOLE_PRINT("RX Buffer has %d bytes in it\n", wMaxGet);

            // Process all bytes that we can
            // This is implemented as a loop, processing up to sizeof(AppBuffer) bytes at a time.
            // This limits memory usage while maximizing performance.  Single byte Gets and Puts are a lot slower than multibyte GetArrays and PutArrays.
            wCurrentChunk = sizeof (AppBuffer);
            for (w = 0; w < wMaxGet; w += sizeof (AppBuffer)) {
                // Make sure the last chunk, which will likely be smaller than sizeof(AppBuffer), is treated correctly.
                if (w + sizeof (AppBuffer) > wMaxGet)
                    wCurrentChunk = wMaxGet - w;

                // Transfer the data out of the TCP RX FIFO and into our local processing buffer.
                int rxed = TCPIP_UDP_ArrayGet(appData.socket, AppBuffer, sizeof (AppBuffer));

                SYS_CONSOLE_PRINT("\tReceived a message of '%s' and length %d\r\n", AppBuffer, rxed);

                // Perform the "ToUpper" operation on each data byte
                for (w2 = 0; w2 < wCurrentChunk; w2++) {
                    i = AppBuffer[w2];
                    if (i >= 'a' && i <= 'z') {
                        i -= ('a' - 'A');
                        AppBuffer[w2] = i;
                    } else if (i == '\e') //escape
                    {
                        SYS_CONSOLE_MESSAGE("Connection was closed\r\n");
                    }
                }

                SYS_CONSOLE_PRINT("\tSending a messages '%s'\r\n", AppBuffer);

                // Transfer the data out of our local processing buffer and into the TCP TX FIFO.
                TCPIP_UDP_ArrayPut(appData.socket, AppBuffer, wCurrentChunk);

                TCPIP_UDP_Flush(appData.socket);

                appData.state = APP_TCPIP_WAIT_FOR_CONNECTION;
            }
            TCPIP_UDP_Discard(appData.socket);
        }
            break;

        default:
            break;
    }
}


/*******************************************************************************
 End of File
 */



 
 
My system is using a DHCP server (not in the switch) an older Cisco SF-302 which has had a FW update in the last year or so.
 
Some config details in Cisco terms:
'Bridge Multicast Filtering Status" == Enabled and
"Forward All" and "Unregistered Multicast" == off
IPv4 Snooping (IGMPv3) enabled on VLAN 1 (the only VLAN in this case)
 
I intend to try another brand of switch to reproduce this, however I expect it will as this issue appears on customer sites on a few brands of switch.
 
Additionally I have been using 'production' programming rather than debug session programming, to avoid causing unexpected time delays from break point halts. (also the starter kit debugger seems to like crashing a lot...)
 
As of now I'm reading the IGMP standard documents, and looking for a strategy to get consistent multicast RX in the PIC where the switch is filtering multicast. Please let me know if I'm doing something wrong. Any tips/debug suggestions appreciated.
 
-IC
 
 
 
#1

18 Replies Related Threads

    rainad
    Moderator
    • Total Posts : 1371
    • Reward points : 0
    • Joined: 2009/05/01 13:39:25
    • Location: 0
    • Status: online
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/25 15:18:14 (permalink)
    0
    Can you please check that the PIC32 responds to the IGMP query messages?
    Or if there is anything suspicious with those replies.
    Normally you should be able to see some statistics in that switch. Does the IGMP snooping give you any info?
     
    I understand that if you turn off the snooping, everything seems to work OK?
     
     
    #2
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/25 17:07:07 (permalink)
    0

    Can you please check that the PIC32 responds to the IGMP query messages?

    Currently working on it. Spent some time with the port mirror settings in hope to see a wireshark trace with IGMP messages to/from the embedded system. Thus far I have not gotten this to work.
     
    In terms of checking response, I can try to debug in MPLAB X, but I will have to use a different system than the EF starter kit. EF kit debug sessions randomly disconnecting.
     

    Normally you should be able to see some statistics in that switch. Does the IGMP snooping give you any info?

    At the moment, I can see that a listening device has joined a group on a port by looking in the web server, I can also see that it is no longer in the group via the web server. The lights on the jack stop blinking when data stops transporting, the web server reflects this. There is a list of MAC groups, and IPs of interest to multicast. These seem to populate OK with a successful join.
     

    I understand that if you turn off the snooping, everything seems to work OK?

    If the filtering is left on, and snooping is turned off, the listening device receives no data, as all data is filtered out, this is bad. As a work around, this switch allows all multicast traffic direction to a port (which is very bad on some networks) or the tables can be setup by IP and/or MAC. My understanding is that the IGMP protocol allows for filtering, without having to hand enter every multicast device and address.
    #3
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/26 11:54:50 (permalink)
    0
    I have attached some traffic illustrating a missing report to a querier. The zip should contain two files.
     
    I am now using the Cisco switch to transport all multicast traffic on my two NICs, an embedded system, and 'the rest of the network'. The rest of the network has some 25 devices attached across a few switches, and the traces are filtered for IGMP. The devices/addresses of interest are as follows:
     
    The IGMP querier (looks like a TRENDnet) is at address 192.168.1.95.
    The MZ_EF starter kit is at 192.168.1.136
    My NIC1 192.168.1.135 has joined a bunch of multicast groups
    My NIC2 192.168.1.137 has not joined a bunch of multicast groups
     
    In the capture file: MZ_EF_SingleJoin
      The MZ_EF_SK is powered on and issues a join in packets 3 and 4
      Some time later the querier issues a general query. (5,6)
      A bunch of devices on the network respond to the query including both of my NICs
      I'm not seeing a response to query from 192.168.1.136
     
    Using the same physical port on the Cisco switch, the MZ_EF was replaced with a different device at 192.168.1.146 that uses IGMPv2, but generally should behave the same. This device does not have issues with timing out under prior snooping tests.
     
    On the second trace file: TIVA_SingleJoin
      The TIVA joins at 1 and 3.
      Some time later the querier issues a general query. (5,6)
      A bunch of devices respond
      This includes the device at 192.168.1.146
     
    The findings in this traffic may indicate that a switch configured for snooping will timeout a multicast listener who joins at boot, but doesn't respond to query.
     
     
    #4
    rainad
    Moderator
    • Total Posts : 1371
    • Reward points : 0
    • Joined: 2009/05/01 13:39:25
    • Location: 0
    • Status: online
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/26 14:20:04 (permalink)
    0
    For debugging, from my experience using RealICE or ICD4 is usually much more stable.
     
    The PIC32 IGMP module will respond only to IGMPv3 queries. So, if a v2 query it's received, probably it won't reply to it.
    To handle the IGMPv2 queries you'll need to modify/adjust the _IGMP_ProcessV3Query() function.
    You could add some console messages in the _IGMP_ProcessV3Query or in the TCPIP_IGMP_Process() which processes the IGMP packets and find out more what's going on.
     
     
    #5
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/26 16:33:33 (permalink)
    0
    I'm using the default Wireshark (3.2.1) dissector, and seeing the Query as Protocol = IGMPv3
    Also, the 192.168.1.135 is responding with v3 list reports.
     
    Looking at this link (might get removed):
    https://www.ccexpert.us/routing-switching-2/comparison-of-igmpv1-igmpv2-and-igmpv3.html
     
    The only difference between 2 and 3 for the query is the Source- and Group- Specific Queries.
     
    I will continue working on this, as is serious priority for me, but may have to abandon the MZ EF SK in further troubleshooting to use a different debugger. My intent with using the SK has been to provide a common hardware platform.
     
    #6
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/26 18:08:17 (permalink)
    0
    Found an adapter to run the MZ EF SK with an ICD4 ... debugging is working much better.
     
    It's starting to look like a timing/cleanup issue!? When debugging with a breakpoint in _IGMP_ProcessV3Query() (near the top of the function) the query comes in, session pauses on the breakpoint, then I hit run, the MZ EF SK will produce a response.
    (see attached trace: in ReportOK, packet 25 occurs after some 8.5 seconds as it was sitting at breakpoint until i hit run)
     
    Running that same system with a production style programming does not produce a response.
    (see attached trace: ReportNotOK)
     
    This is on the same system, same ports, some software, only difference is debug/w breakpoint vs production.
     
     
     
     
    ...
    EDIT: editing to add the attachment because this forum...
    post edited by campbellCustom - 2020/02/26 18:09:44
    #7
    rainad
    Moderator
    • Total Posts : 1371
    • Reward points : 0
    • Joined: 2009/05/01 13:39:25
    • Location: 0
    • Status: online
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/27 10:02:01 (permalink)
    0
    As mentioned before, first try to add a console print in the TCPIP_IGMP_Process() so that we know that the query request is received and what happens to it.
     
    Another thing that I notice is that you're joining 239.255.0.1 (administratively scoped addresses) for which probably you don't need IGMP at all unless your organization requires this internally.
     
    Can you please give it a try with a Source-Specific multicast address in the range 232.0.0.1 to 232.255.255.255 and see what happens? These should be the used addresses anyway.
     
     
     
    #8
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/27 13:42:59 (permalink)
    0

    As mentioned before, first try to add a console print in the TCPIP_IGMP_Process() so that we know that the query request is received and what happens to it.

    With a console up seeing one _IGMP message regularly when queries show on the network:

    IGMP Assert: _IGMP_GetNewQueryReport, in line: 3246, <CR><LF>

    So the query is getting in. Will spend some more time adding in print statements.
     

    Another thing that I notice is that you're joining 239.255.0.1 (administratively scoped addresses) for which probably you don't need IGMP at all unless your organization requires this internally.

    This address and the need for IGMP are mandated by ANSI E1.31. My needs, will not change in this regard.
     

    Can you please give it a try with a Source-Specific multicast address in the range 232.0.0.1 to 232.255.255.255 and see what happens? These should be the used addresses anyway.

    I tried setting the example to join 232.0.0.1, and I'm not seeing the initial join attempt at all. I am not sure why.
     
    #9
    rainad
    Moderator
    • Total Posts : 1371
    • Reward points : 0
    • Joined: 2009/05/01 13:39:25
    • Location: 0
    • Status: online
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/28 08:00:57 (permalink)
    0
    The assert should NOT happen. So definitely something is wrong.
    Especially since you cannot even use apparently a SSM address.
    Can you please verify the configuration values for your app:
    TCPIP_IGMP_MCAST_GROUPS
    TCPIP_IGMP_INTERFACES
     
    One thing that we need to do:
    - in the igmp_private.h change the debug mask to:
    #define TCPIP_IGMP_DEBUG_LEVEL 9
    i.e. TCPIP_IGMP_DEBUG_MASK_BASIC | TCPIP_IGMP_DEBUG_MASK_REPORT_LISTS
    - when you run into that assertion call the function TCPIP_IGMP_ReportListPrint(0xc007)
    and let's see the result of that listing.
    You can add this as a standard tcpip console command. Otherwise you can make your own mechanism to call this function. For example, when that assert fails, simply call this function.
     
     
    #10
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/02/28 14:03:13 (permalink)
    0

    Can you please verify the configuration values for your app:
    TCPIP_IGMP_MCAST_GROUPS
    TCPIP_IGMP_INTERFACES

     
    I started with example UDP server, and modified very little (added IGMP, added multicast on the interface, changed default static IP).
     
    To my understanding these are the defaults... :

     
     
     
    /*** IGMP Configuration ***/
     
     
     
    #define TCPIP_STACK_USE_IGMP
     
     
     
    #define TCPIP_IGMP_INTERFACES                                              1
     
     
     
    #define TCPIP_IGMP_MCAST_GROUPS                                                                   7
     
     
     
    #define TCPIP_IGMPV2_SUPPORT_ONLY                                            false
     
     
     
    #define TCPIP_IGMP_SOURCES_PER_GROUP                                       11
     
     
     
    #define TCPIP_IGMP_SOCKET_RECORDS_PER_SOURCE                               4
     
     
     
    #define TCPIP_IGMP_ROBUSTNESS_VARIABLE                         2
     
     
     
    #define TCPIP_IGMP_UNSOLICITED_REPORT_INTERVAL                              1000
     
     
     
    #define TCPIP_IGMP_USER_NOTIFICATION                       false
     
     
     
    #define TCPIP_IGMP_TASK_TICK_RATE                                                               33
     
     
     

     

    One thing that we need to do:
    - in the igmp_private.h change the debug mask to:
    #define TCPIP_IGMP_DEBUG_LEVEL 9
    i.e. TCPIP_IGMP_DEBUG_MASK_BASIC | TCPIP_IGMP_DEBUG_MASK_REPORT_LISTS
    - when you run into that assertion call the function TCPIP_IGMP_ReportListPrint(0xc007)
    and let's see the result of that listing.
    You can add this as a standard tcpip console command. Otherwise you can make your own mechanism to call this function. For example, when that assert fails, simply call this function.

     
    OK, Added the list prints (spoiler... the stuck state gives IGMP GQ Free List: 0 nodes) and a few others:
     
    In _IGMP_ProcessV3Query() adding this:
     

     
     
     
        // check the current status report rules
     
     
     
        // #1 rule: check for existing GQ scheduled earlier
     
     
     
        if (pGq != 0 && pGq->tTransmit < tReport) { // fail
     
     
     
            SYS_CONSOLE_MESSAGE("FAIL!! general query already scheduled... \r\n"); //
     
     
     
            return;
     
     
     
        }
     
     
     
     

    In _IGMP_ScheduleGenQueryReport() adding this:

     
     
     
        SYS_CONSOLE_MESSAGE("Scheduling General Query Report \r\n");
     
     
     
     
     
    in_IGMP_GenerateGenQueryReport() adding this:

     
     
     
        SYS_CONSOLE_MESSAGE("Generating General Query Report .... THIS NEEDS TO WORK \r\n"); //
     
     
     

     
    Console Output
     
    TCP/IP Stack: Initialization Started <LF><CR>
    TCP/IP Stack: Initialization Ended - success <LF><CR>
        Interface PIC32INT on host MCHPBOARD_E     - NBNS enabled<CR><LF>
    PIC32INT IP Address: 192.168.1.136 <CR><LF>
    Waiting for Client Connection on port: 9760<CR><LF>
    Attempted IGMP Join ... Join Ok<CR><LF>
    <CR><LF>
    <CR><LF>
    >IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 1 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    FAIL!! general query already scheduled... <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    Scheduling General Query Report <CR><LF>
    Generating General Query Report .... THIS NEEDS TO WORK <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    FAIL!! general query already scheduled... <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2786, <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2786, <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2786, <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
     
    Oddly, this is able to respond one time, and the wireshark confirms this. (better than before, however only adds 2 minutes before the product goes deaf in a snooping local scope)
     
     
     
     
    EDIT: attach zip
    post edited by campbellCustom - 2020/02/28 14:07:58
    #11
    rainad
    Moderator
    • Total Posts : 1371
    • Reward points : 0
    • Joined: 2009/05/01 13:39:25
    • Location: 0
    • Status: online
    Re: H2.06 IGMPv3 Losing Subscription 2020/03/02 08:08:19 (permalink)
    0
    For some reason, your prints do not show the status of the scheduled query reports. It's like when the TCPIP_IGMP_ReportListPrint() is called, the IGMP_QueryListPrint() does not get called for the TCPIP_IGMP_QUERY_GENERAL.
    Please check what's going on, because we need to check how many requests are scheduled for transmission.
     
    Also to understand why the general query reports do not get through, and not returned to the free pool, there are few settings that need to be adjusted, so please make these updates:
    - #define TCPIP_IGMP_USER_NOTIFICATION true
    so we can be notified of the IGMP events
    - use TCPIP_IGMP_HandlerRegister() to register your handler that receives the IGMP events. In this handler, simply print the event type.
     
     
    #12
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/03/02 14:09:40 (permalink)
    0

    For some reason, your prints do not show the status of the scheduled query reports. It's like when the TCPIP_IGMP_ReportListPrint() is called, the IGMP_QueryListPrint() does not get called for the TCPIP_IGMP_QUERY_GENERAL.

     
    igmpGenQueryReportList seems to always have zero nodes when attempting to print.
    ReportListPrint() gets a copy of the empty global list, passes it to QueryListPrint()
    Then the for loop that does the printing exits before doing anything because the list is empty.
     
    Looking through usages of igmpGenQueryReportList:
    TCPIP_IGMP_GenQueryTimeout() is being called repeatedly and is removing heads. Looks like other lists are populated with what is removed from igmpGenQueryReportList?
     
    Adding print statements in TCPIP_IGMP_GenQueryTimeout are flooding the console as conditions to execute TCP_IGMP_Tick() are met (presumably every 33ms as the config file suggests).
     
    All three functions in  TCPIP_IGMP_Tick() are giving unresolvable breakpoints (optimized maybe?)
     

    Please check what's going on, because we need to check how many requests are scheduled for transmission.

     
    No new print statements yet, but from previous post:
     
    Console Output
     
    TCP/IP Stack: Initialization Started <LF><CR>
    TCP/IP Stack: Initialization Ended - success <LF><CR>
        Interface PIC32INT on host MCHPBOARD_E     - NBNS enabled<CR><LF>
    PIC32INT IP Address: 192.168.1.136 <CR><LF>
    Waiting for Client Connection on port: 9760<CR><LF>
    Attempted IGMP Join ... Join Ok<CR><LF>
    <CR><LF>
    <CR><LF>
    >IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 1 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    FAIL!! general query already scheduled... <CR><LF>
    // ^^^^ fails on first try? Or, is this from getting two redundant queries...
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    Scheduling General Query Report <CR><LF>
    Generating General Query Report .... THIS NEEDS TO WORK <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    FAIL!! general query already scheduled... <CR><LF> // after second failure, no new schedules
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
     
    The schedule function ran three times prior to the generation...
     

    Also to understand why the general query reports do not get through, and not returned to the free pool, there are few settings that need to be adjusted, so please make these updates:
    - #define TCPIP_IGMP_USER_NOTIFICATION true

    done.
     

    so we can be notified of the IGMP events
    - use TCPIP_IGMP_HandlerRegister() to register your handler that receives the IGMP events. In this handler, simply print the event type.

     
    Making an attempt at this, I put this in, which received no response:

     
    void APP_IGMPPrintEventType(IPV4_ADDR mcastAddress, TCPIP_IGMP_EVENT_TYPE evType, const void* param) {
        Nop();
        SYS_CONSOLE_MESSAGE("Event: ");
        SYS_PRINT("%d \r\n",evType);
    }
     
    // and added this near end of join state in the application... or should i put this in the init?:
                TCPIP_IGMP_HandlerRegister(IGMP_subscribe_address, APP_IGMPPrintEventType, NULL);

     
    The callback doesn't execute. Guessing I've got this wrong somehow.
     
     
     
     
    #13
    rainad
    Moderator
    • Total Posts : 1371
    • Reward points : 0
    • Joined: 2009/05/01 13:39:25
    • Location: 0
    • Status: online
    Re: H2.06 IGMPv3 Losing Subscription 2020/03/02 14:57:30 (permalink)
    0
    The call to TCPIP_IGMP_HandlerRegister() should work any time after the IGMP is initialized;
    Please try with TCPIP_IGMP_HandlerRegister(0, APP_IGMPPrintEventType, NULL);
    It should work. Using 0 for the mcast address will match any address so we'll get all the events.
     
    Regarding  the GQ lists...if the report list is empty (list of scheduled reports) then how come the message
    IGMP GQ Free List: 1 nodes
    IGMP GQ Free List: 0 nodes
    show up? Doesn't make much sense, since the nodes from the GQ free list go to GQ report list, nowhere else.
    Or maybe just the display is wrong.
    The GQ Free list starts with 2 nodes and the GQ Report list is empty.
    When a GQ (general query) is scheduled, the node is retrieved from the free list and added to the report list.
    The only situation in which a scheduled GQ node is not sent back to the free list is when the transmission over the interface failed and retries happen.
    Hopefully the events will show some light.
     
    #14
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/03/02 16:11:37 (permalink)
    0

    The call to TCPIP_IGMP_HandlerRegister() should work any time after the IGMP is initialized;
    Please try with TCPIP_IGMP_HandlerRegister(0, APP_IGMPPrintEventType, NULL);
    It should work. Using 0 for the mcast address will match any address so we'll get all the events.

     
    Ok, event is decimal 16:

    TCPIP_IGMP_EVENT_GET_GEN_QUERY_REPORT_ERROR     = 0x0010,   // a free general query report could not be obtained
                                                                    // (all are already scheduled)

     
    I added another print for checking the list lock:
     

    // sets up a Gen Query node and inserts it in a Gen Query report list
    static void _IGMP_ScheduleGenQueryReport(TCPIP_IGMP_GEN_QUERY_REPORT_NODE* pGQNode, int ifIx, uint32_t tTransmit) {
        SYS_CONSOLE_MESSAGE("Scheduling General Query Report \r\n");
        pGQNode->ifIx = ifIx;
        pGQNode->queryType = TCPIP_IGMP_QUERY_GENERAL;
        pGQNode->tTransmit = tTransmit;

        igmpListsLock();
        SYS_CONSOLE_MESSAGE("general query adding tail ... \r\n");
        TCPIP_Helper_SingleListTailAdd(&igmpGenQueryReportList, (SGL_LIST_NODE*) pGQNode);
        igmpListsUnlock();
    }

     
    TCP/IP Stack: Initialization Started <LF><CR>
    Initializing IGMP General Query List <CR><LF>
    TCP/IP Stack: Initialization Ended - success <LF><CR>
        Interface PIC32INT on host MCHPBOARD_E     - NBNS enabled<CR><LF>
    PIC32INT IP Address: 192.168.1.136 <CR><LF>
    Waiting for Client Connection on port: 9760<CR><LF>
    Attempted IGMP Join ... Join Ok<CR><LF>
    <CR><LF>
    <CR><LF>
    >IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 1 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    general query adding tail ... <CR><LF>
    FAIL!! general query already scheduled... <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    general query adding tail ... <CR><LF>
    Scheduling General Query Report <CR><LF>
    general query adding tail ... <CR><LF>
    Generating General Query Report .... THIS NEEDS TO WORK <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    Scheduling General Query Report <CR><LF>
    general query adding tail ... <CR><LF>
    FAIL!! general query already scheduled... <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF> // ***************** first event
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
    IGMP SC Free List: 21 nodes<CR><LF>
    IGMP GQ Free List: 0 nodes<CR><LF>
    IGMP GRPQ Free List: 7 nodes<CR><LF>
    IGMP Assert: _IGMP_GetNewQueryReport, in line: 2791, <CR><LF>
    Event: 16 <CR><LF>
     
    #15
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/03/03 11:39:24 (permalink)
    0
    Seems like reports (in response to query) are generated consistently when I comment out the block under #1 rule in _IGMP_ProcessV3Query()
     

        // check the current status report rules
        // #1 rule: check for existing GQ scheduled earlier
    //    if (pGq != 0 && pGq->tTransmit < tReport) { // fail
    //        SYS_CONSOLE_MESSAGE("FAIL!! general query already scheduled... \r\n"); // added this failure message
    //        return; // this return is causing something to get stuck...
    //    }

     
    not sure if there are side effects of rescheduling instead in the next block (maybe errors with a misbehaving querier?)
     
    needing to test this in an actual product at this point as I will gladly trade for a lower impact bug.
     
     
     
     
    #16
    rainad
    Moderator
    • Total Posts : 1371
    • Reward points : 0
    • Joined: 2009/05/01 13:39:25
    • Location: 0
    • Status: online
    Re: H2.06 IGMPv3 Losing Subscription 2020/03/03 14:50:01 (permalink)
    0
    You latest post made it clear what the problem was:
    pGq = (TCPIP_IGMP_GEN_QUERY_REPORT_NODE*)_IGMP_FindScheduledQueryReport(TCPIP_IGMP_QUERY_GENERAL, 0, ifIx, true);
     
    if(pGq != 0 && pGq->tTransmit < tReport)
    {
    return;
    }
    The problem is the last parameter of the call, which means "remove".
    Obviously after the node has been removed from the scheduled queue, simply returning won't do much good.
    It should be re-inserted or at least not removed in the first place.
     
    Anyway, this issue has been fixed in the H3. Please make a copy of the igmp.c from 
    https://github.com/Microchip-MPLAB-Harmony/net
    I'm also attaching the same file here, your choice.
    Sorry it took me so long to recall this old bug. It's been fixed in the repo but after the H2 2.06 release.
    Please let me know if everything works as expected after the update.
     
    P.S. I'm having difficulties with attaching the file.
    I'll try again later. But you can download it from github, as indicated.
     
     
     
    #17
    campbellCustom
    Junior Member
    • Total Posts : 100
    • Reward points : 0
    • Joined: 2014/08/30 14:35:35
    • Location: 0
    • Status: offline
    Re: H2.06 IGMPv3 Losing Subscription 2020/03/04 11:06:23 (permalink)
    0
    Ok, thanks, rainad. Will have to do some thinking on best way to implement this fix.
     
    Perhaps I will be hybridizing the H3 source with my H2 install. (future developers will have to modify framework as I have done) This bug is new to me, and was only discovered after having technical support issues in the 'field'. As a result I'm touching released products mostly on MZ, but at least one on MX with non trivial inventory. I've been working in Harmony (on and off) since beta, and have been willing to upgrade projects to new releases up until 2.06. Having H3 release target the more expensive DA chips, or force Cortex move... but anyway this may be for a different thread, just trying to provide some perspective on why I'm here, and why I may continue to be here asking about old stuff.
     
    Thanks again for your help,
    -IC
     
     
    #18
    rainad
    Moderator
    • Total Posts : 1371
    • Reward points : 0
    • Joined: 2009/05/01 13:39:25
    • Location: 0
    • Status: online
    Re: H2.06 IGMPv3 Losing Subscription 2020/03/04 13:17:29 (permalink)
    0
    Sure, glad I could be of help.
    I should have caught the issue earlier...form now on I'll remember that the latest H2 release doesn't have the IGMP fix :)
     
    I think that the H3 file that i attached can be used directly instead of the old one. No need for any "hybridizing". 
     
     
    #19
    Jump to:
    © 2020 APG vNext Commercial Version 4.5