Packets to hostNetwork pods like envoy always contain RST flag #29406
This issue also happens for me on 1.14.4 on another NixOS cluster. I guess there is a configuration issue either with Cilium or NixOS, but I was not able to find out where the packets actually get dropped. I also tried several combinations of Cilium Helm configs, such as eBPF masquerading; the only thing that changed is that Envoy was no longer reachable from anywhere except localhost. Netstat shows that 0 packets got dropped. It actually shows 0 for everything except RX-OK, which looks fine but is weird.
Pretty sure this has to have something to do with these logs, even though I was told the issue is apparently a different one in my previous GH issue:
Some
The IP is from the ingress/gateway:
I encountered a similar problem, also with NixOS (not sure if that's the reason), but the error message I got is a bit different.
It seems like somehow
The problem is that the BPF map lookup-element operation returns an error; see https://man7.org/linux/man-pages/man2/bpf.2.html for the possible error codes (a minimal sketch of such a lookup follows after this comment).
So the
The odd thing is that the source IP is the same as the destination IP.
oh, nope, somehow it's trying to look up in the conntrack map for |
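For context on how these lookups surface errors, here is a minimal C++ sketch of a userspace BPF map lookup via the raw bpf(2) syscall, assuming a placeholder pinned-map path and key layout (this is not Cilium's actual code). The syscall returns -1 on failure and reports the reason through `errno`, which is what the man page linked above documents.

```cpp
// Minimal sketch: open a pinned BPF map and look up a key via bpf(2).
// The map path and key/value layout are placeholders, not Cilium's.
#include <linux/bpf.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cerrno>
#include <cstdio>
#include <cstring>

static long bpf_syscall(int cmd, union bpf_attr* attr) {
  return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

int main() {
  // Open a pinned map by path (illustrative placeholder path).
  const char path[] = "/sys/fs/bpf/tc/globals/example_map";
  union bpf_attr open_attr;
  std::memset(&open_attr, 0, sizeof(open_attr));
  open_attr.pathname = reinterpret_cast<__u64>(path);
  int map_fd = static_cast<int>(bpf_syscall(BPF_OBJ_GET, &open_attr));
  if (map_fd < 0) {
    std::fprintf(stderr, "map open failed: %s\n", std::strerror(errno));
    return 1;
  }

  // Look up a placeholder 4-byte key; a real conntrack key is a 5-tuple.
  __u32 key = 0;
  __u64 value = 0;
  union bpf_attr lookup_attr;
  std::memset(&lookup_attr, 0, sizeof(lookup_attr));
  lookup_attr.map_fd = static_cast<__u32>(map_fd);
  lookup_attr.key = reinterpret_cast<__u64>(&key);
  lookup_attr.value = reinterpret_cast<__u64>(&value);
  if (bpf_syscall(BPF_MAP_LOOKUP_ELEM, &lookup_attr) != 0) {
    // errno is only meaningful in the process that issued the syscall,
    // e.g. ENOENT when the key is not present in the map.
    std::fprintf(stderr, "lookup failed: %s\n", std::strerror(errno));
    return 1;
  }
  std::printf("value = %llu\n", static_cast<unsigned long long>(value));
  return 0;
}
```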
I still cannot wrap my head around why it does not work with hostNetwork pods. I installed the nginx-ingress controller, Envoy Gateway, etc., and they all work fine since they do not use the host network. I also tried all kinds of configuration combinations on multiple different servers (changed Helm values, iptables/firewall rules, kernel modules, etc.), but most of the time that just breaks things further.
This issue has been automatically marked as stale because it has not had recent activity.
Currently, interaction with BPF maps via syscalls (open, lookup) might result in log messages of the following form, where the error detail is `Success`:

```
[info][filter] [cilium/conntrack.cc:229] cilium.bpf_metadata: IPv4 conntrack map global lookup failed: Success
```

This is due to the fact that the BPF maps are accessed in the starter process. Hence, the syscalls are also executed in this separate process, and the variable `errno` is never set in the Envoy process where the log is written. Therefore, this commit fixes the error propagation by setting `errno` after retrieving the response from the privileged client doing the call to the starter process.

Fixes: cilium#315
Fixes: cilium#470

Signed-off-by: Marco Hofstetter <[email protected]>
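To make the described fix concrete, here is a hedged sketch of the errno-propagation pattern; the names used (`SyscallResponse`, `callStarterProcess`, `privilegedMapLookup`) are hypothetical and this is not the actual cilium-envoy code:

```cpp
// Sketch: the privileged starter process performs the bpf(2) syscall and
// returns both the result and the errno it observed; the caller restores
// errno locally so a later strerror(errno) reports the real failure
// instead of "Success".
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical response sent back from the privileged starter process.
struct SyscallResponse {
  int return_value;  // return value of the syscall in the starter process
  int saved_errno;   // errno captured right after the syscall there
};

// Stand-in for the IPC round trip to the starter process; here it simply
// simulates a lookup that failed with ENOENT.
static SyscallResponse callStarterProcess() {
  return SyscallResponse{-1, ENOENT};
}

static int privilegedMapLookup() {
  SyscallResponse resp = callStarterProcess();
  if (resp.return_value < 0) {
    // The crucial step: propagate the remote errno into this process
    // before anything reads errno (e.g. a log line using strerror).
    errno = resp.saved_errno;
    std::fprintf(stderr, "conntrack map lookup failed: %s\n",
                 std::strerror(errno));
  }
  return resp.return_value;
}

int main() {
  privilegedMapLookup();
  return 0;
}
```

The design point is simply that `errno` is per-process, so any error observed by the privileged starter process has to be carried back explicitly over the IPC channel and re-installed before logging.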
This issue has not seen any activity since it was marked stale.
Is there an existing issue for this?
What happened?
I tried using the Gateway API but Envoy always returns
After sniffing on the "cilium_host" interface I found a lot of packets from the backend service to Envoy (10.42.0.246) which never arrive and get retransmitted repeatedly. I then tried to ping and curl that IP from a busybox pod, once with `hostNetwork: true` and once with it disabled, but those packets also show up in Wireshark with "no response found". The Envoy pod might not respond to ICMP, but it should definitely accept/receive the TCP responses from the backend services. I also completely disabled all iptables rules to rule that out; the packets still did not arrive successfully.
Cilium Version
cilium-cli: 0.15.11 compiled with go1.21.3 on linux/amd64
cilium image (default): v1.14.2
cilium image (stable): v1.14.4
cilium image (running): unknown. Unable to obtain cilium version, no cilium pods found in namespace "kube-system"
Image versions
Kernel Version
Linux _ 6.5.10 #1-NixOS SMP PREEMPT_DYNAMIC Thu Nov 2 08:37:00 UTC 2023 x86_64 GNU/Linux
Kubernetes Version
Sysdump
cilium-sysdump-20231127-154254.zip
Relevant log output
Anything else?
Helm Chart values:
Code of Conduct