Kemp Load Master and Palo Alto Firewall - Random Packet Drops and Disconnections

posted 18 Jul 2020, 04:21 by Tristan Self
We had a rather irritating issue whereby we were seeing intermittent packet drops and connection failures on our Kemp Load Master.

The Kemp Load Master sat in a DMZ on a Palo Alto firewall. Client connections from the Internet were directed to the Kemp Load Master in the DMZ, which then made the onward connection to the internal Microsoft Exchange Server cluster, ADFS servers, Shibboleth servers, and the other services on offer.

We started to see log entries like this on the Kemp Load Master:

Jul 18 11:42:58 kemp-lb-01.domain.co.uk vsslproxy: reencrypt(116) - connect failed to xxx.xxx.xxx.xxx:443 (errno 110) 

The number in brackets (116) identifies the Virtual Service on the Kemp, and the address that follows is one of the back-end real servers providing that service. We saw connections failing to any of the real servers, not just one.
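
To see which Virtual Services and real servers are affected, the log line above can be broken into its parts. This is only a rough sketch: the function name is made up for illustration, the pattern is written against the single log line shown here, and the errno text assumes the Load Master reports Linux errno values (where 110 is ETIMEDOUT, "Connection timed out").

```python
import re

# Pattern written against the one vsslproxy line shown in this post;
# other Kemp log formats are not covered.
CONNECT_FAILED = re.compile(
    r"vsslproxy: reencrypt\((?P<vs>\d+)\) - connect failed to "
    r"(?P<host>\S+):(?P<port>\d+) \(errno (?P<errno>\d+)\)"
)

# Assumption: Linux errno numbering. Only the value seen in the post is listed.
LINUX_ERRNO = {110: "ETIMEDOUT (Connection timed out)"}


def parse_connect_failure(line):
    """Return the Virtual Service id, real server and errno, or None if no match."""
    match = CONNECT_FAILED.search(line)
    if match is None:
        return None
    code = int(match.group("errno"))
    return {
        "virtual_service": int(match.group("vs")),
        "real_server": match.group("host"),
        "port": int(match.group("port")),
        "errno": code,
        "meaning": LINUX_ERRNO.get(code, "unknown"),
    }


sample = (
    "Jul 18 11:42:58 kemp-lb-01.domain.co.uk vsslproxy: reencrypt(116) - "
    "connect failed to xxx.xxx.xxx.xxx:443 (errno 110)"
)
print(parse_connect_failure(sample))
```

A timeout rather than a refusal fits with packets being silently dropped somewhere between the Kemp and the real server.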

https://kemptechnologies.com/faq/troubleshoot/

According to Kemp, this means the connection from the Kemp was being interrupted on its way to the internal real server.

We also had sporadic issues from some clients: many worked fine, but some reported slowness or found their connections would drop from time to time. Specifically, it was Outlook Web Access, Outlook, Outlook for Mac, other Exchange EWS clients and ActiveSync clients that saw the issues. Other services published via the Kemp Load Master seemed unaffected.

Solution


The issue and fix are described in the above articles.

Essentially, the "Challenge-Ack" mechanism was not being handled by the Palo Alto firewall, leading to the connection being reset.
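
To make the mechanism a little more concrete, here is a toy model of a challenge ACK exchange as described in RFC 5961. It is a sketch under assumptions, not a picture of what the firewall does internally: the class and function names are hypothetical, the sequence numbers are made up, and the firewall is reduced to a single flag that either passes or drops the challenge ACK.

```python
from dataclasses import dataclass


@dataclass
class Server:
    established: bool = True  # server still holds a stale connection
    rcv_nxt: int = 5000       # next sequence number the server expects
    snd_nxt: int = 9000       # next sequence number the server will send

    def on_syn(self, seq):
        """Reply to a SYN arriving on this connection's 4-tuple."""
        if self.established:
            # RFC 5961: a SYN on a live connection is answered with a challenge ACK
            return ("ACK", self.snd_nxt, self.rcv_nxt)
        self.established = True
        self.rcv_nxt = seq + 1
        return ("SYN-ACK", self.snd_nxt, self.rcv_nxt)

    def on_rst(self, seq):
        # RFC 5961: a RST is only accepted if it matches rcv_nxt exactly
        if self.established and seq == self.rcv_nxt:
            self.established = False


def simulate(allow_challenge_ack, max_syn_attempts=3):
    server = Server()
    client_isn = 100  # made-up initial sequence number for the new connection
    for _ in range(max_syn_attempts):
        kind, _seq, ack = server.on_syn(client_isn)
        if kind == "SYN-ACK":
            return "established"
        if not allow_challenge_ack:
            continue  # challenge ACK dropped in transit; client resends its SYN
        # Client in SYN-SENT gets an unexpected ACK and answers with a RST
        # whose sequence number is the ACK number it received.
        server.on_rst(ack)
    return "hung"


print(simulate(allow_challenge_ack=False))
print(simulate(allow_challenge_ack=True))
```

In this model the client never gets through while the challenge ACK is dropped, and recovers on the next SYN once it is allowed to pass.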

  • The following counters were observed:
      tcp_drop_packet       1  0  warn  tcp  pktproc   packets dropped because of failure in tcp reassembly
      tcp_drop_out_of_wnd   2  0  warn  tcp  resource  out-of-window packets dropped
  • The TCP connection between client and server enters a hung state. In other words, the client keeps trying to establish a new connection while the server continues to respond with a challenge ACK.
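
If you want to watch those counters over time, the pasted output can be turned into something easier to compare. A small sketch follows; the function name is made up, and the field names (value, rate, severity, category, aspect, description) are my assumption about what the columns mean, based on the two lines above.

```python
def parse_counters(text):
    """Parse whitespace-separated counter lines into a dict keyed by counter name."""
    counters = {}
    for line in text.splitlines():
        # Split into at most 7 fields; the last one keeps the free-text description
        parts = line.strip().split(None, 6)
        if len(parts) < 7 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue  # skip headers, blank lines and anything unexpected
        name, value, rate, severity, category, aspect, description = parts
        counters[name] = {
            "value": int(value),
            "rate": int(rate),
            "severity": severity,
            "category": category,
            "aspect": aspect,
            "description": description,
        }
    return counters


observed = """
tcp_drop_packet      1  0  warn  tcp  pktproc   packets dropped because of failure in tcp reassembly
tcp_drop_out_of_wnd  2  0  warn  tcp  resource  out-of-window packets dropped
"""

for name, info in parse_counters(observed).items():
    print(name, info["value"], info["description"])
```

Running it against output captured before and after the change gives a quick way to see whether the out-of-window drops have stopped climbing.
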

To fix the issue, all that was required was to turn on "allow-challenge-ack" on the Palo Alto firewall:
>configure
#set deviceconfig setting tcp allow-challenge-ack yes
#commit
#exit
>

Note that this only affects PAN-OS 8.0.7 onwards. From that version the "allow-challenge-ack" setting exists but is turned off, so you must manually set it to yes. On earlier versions of PAN-OS this was not an issue.
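
If you look after several firewalls, a quick version check helps work out which ones need the setting. This is a minimal sketch that assumes version strings of the form "8.0.7" or "9.1.3-h1" and ignores any hotfix suffix; the function names are hypothetical, and the 8.0.7 threshold is simply the one given in the note above.

```python
import re


def panos_version_tuple(version):
    """Return (major, minor, patch) from a version string, ignoring any suffix."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def needs_allow_challenge_ack(version):
    # Threshold taken from the note above: 8.0.7 onwards
    return panos_version_tuple(version) >= (8, 0, 7)


for v in ["8.0.6", "8.0.7", "9.1.3-h1"]:
    print(v, needs_allow_challenge_ack(v))
```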

