Hello,

I'm working on a project where we have an embedded unit using lwIP 1.2. It's old, but it has been working OK for us for quite a while. Recently, however, we have run into a problem with lost connections. From the log, it seems that we start to get TCP retransmissions. Those retransmissions happen again and again without being resolved, and we eventually get a buffer allocation problem on the embedded unit, which then hangs the entire stack on the unit.

I'm no TCP expert, but looking at the first few lines in the log file found here, it seems that lwIP on the embedded unit (192.168.0.100) never gets over a lost sequence number (1876719045). From my understanding, the PC (192.168.0.1) resends that sequence number, which turns out to be a pure ACK, but lwIP won't let that go and keeps insisting on getting an ACK for 1876719045.

Can someone please confirm my analysis, or correct me if I'm wrong? Does 192.168.0.1 do something wrong that I don't realize? Or do the retransmissions later on stem from something else?

Thanks a million :)

asked 01 Apr '16, 07:02 FredrikT
One Answer:
It looks like your capture was taken somewhere between the embedded unit and the PC. Based on your capture, I see issues on both sides, but more on the embedded unit. Early on, the PC is not responding to simple SYN requests. But later in the capture, the embedded unit is not responding to simple ACKs.

Is there any non-switch device between the two machines? Based on IP, they seem to be on the same local network. However, if that's the case, the response times should be fairly quick. There should not be multi-second response times from the embedded unit if it is local to the PC.

I think the buffer error and the hanging of the unit is evident at the end of the capture, when the PC is sending ARP requests and not getting any replies. I think that by that time, the embedded unit is completely hosed and cannot respond. I also think that this is due to the fact that it's embedded, and does not have the appropriate level of buffer storage to process all those retransmissions.

So I would do a couple things to troubleshoot this further:
Let us know what you find.

answered 04 Apr '16, 12:09 jeantunis
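To make the "same segment sent over and over" pattern easier to spot, here is a minimal Python sketch. It assumes the capture has already been exported to a list of `(source, seq, payload_len)` tuples; that export format, the function name `find_stuck_segments`, and the sample addresses and sequence numbers are all made up for illustration and are not taken from the capture in question. Pure ACKs (zero payload) are skipped, since they legitimately repeat the same sequence number without being retransmissions.

```python
from collections import Counter


def find_stuck_segments(packets, min_repeats=3):
    """Return {(source, seq): count} for data segments seen at least min_repeats times.

    packets: iterable of (source, seq, payload_len) tuples (hypothetical export format).
    Segments with payload_len == 0 are pure ACKs and are ignored.
    """
    counts = Counter(
        (source, seq)
        for source, seq, payload_len in packets
        if payload_len > 0
    )
    return {key: n for key, n in counts.items() if n >= min_repeats}


# Invented sample data, not from the real capture.
packets = [
    ("10.0.0.2", 1000, 100),  # original data segment
    ("10.0.0.1", 5000, 0),    # pure ACK from the peer
    ("10.0.0.2", 1000, 100),  # first retransmission
    ("10.0.0.2", 1000, 100),  # second retransmission
    ("10.0.0.1", 5000, 0),    # another pure ACK, same seq, not a retransmission
]

stuck = find_stuck_segments(packets)
print(stuck)  # {('10.0.0.2', 1000): 3}
```

Any key that shows up here with a growing count is a candidate for the sequence number the stack never gets past.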
You should take a trace as close as possible to the device 192.168.0.1. From a quick look, I would say something goes wrong with the device 192.168.0.1, maybe the app or the OS? Buffer shortage is a possible cause, too. If I were you, I would investigate that device.
At the end it seems not to answer the ARP request, which would suggest that something goes wrong with this device, or that we just didn't capture the replies.
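A rough way to check the ARP point is to list the requests that never get a reply within some window. The Python sketch below assumes the ARP packets were exported as time-sorted `(time, kind, sender_ip, target_ip)` tuples; the tuple layout, the function name `unanswered_arp_requests`, the 1-second timeout, and the sample addresses are all assumptions for illustration. Note that an empty reply list can also mean the replies were simply not captured.

```python
def unanswered_arp_requests(events, timeout=1.0):
    """Return [(time, target_ip)] for ARP requests with no reply within timeout seconds.

    events: time-sorted list of (time, kind, sender_ip, target_ip) tuples,
            where kind is "request" or "reply" (hypothetical export format).
    """
    unanswered = []
    for i, (t, kind, _sender, target) in enumerate(events):
        if kind != "request":
            continue
        # A request counts as answered if the target IP sends a reply soon after.
        answered = any(
            k == "reply" and s == target and t2 <= t + timeout
            for t2, k, s, _ in events[i + 1:]
        )
        if not answered:
            unanswered.append((t, target))
    return unanswered


# Invented sample data, not from the real capture.
events = [
    (0.00, "request", "10.0.0.1", "10.0.0.2"),
    (0.01, "reply", "10.0.0.2", "10.0.0.1"),
    (10.0, "request", "10.0.0.1", "10.0.0.2"),
    (11.0, "request", "10.0.0.1", "10.0.0.2"),
]

missing = unanswered_arp_requests(events)
print(missing)  # [(10.0, '10.0.0.2'), (11.0, '10.0.0.2')]
```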