Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support PacketCapture for arbitrary source / destination #6976

Open
antoninbas opened this issue Feb 6, 2025 · 6 comments
Open

Support PacketCapture for arbitrary source / destination #6976

antoninbas opened this issue Feb 6, 2025 · 6 comments
Labels
area/ops/packetcapture Issues or PRs related to the PacketCapture feature kind/feature Categorizes issue or PR as related to a new feature.

Comments

@antoninbas
Copy link
Contributor

Describe the problem/challenge you have
At the moment the PacketCapture CR requires specifying both a source and a destination:

properties:
spec:
type: object
required:
- source
- captureConfig
- destination

The limitation doesn't make sense to me. We should be able to capture all traffic egressing from a given Pod, or all traffic ingressing into a given Pod. Note that the real requirement here is that at least one of source.pod and destination.pod must be provided, which I believe is already enforced today:

anyOf:
- properties:
source:
required: [pod]
- properties:
destination:
required: [pod]

Describe the solution you'd like
It should be possible to omit the source field when destination.pod is present:

apiVersion: crd.antrea.io/v1alpha1
kind: PacketCapture
metadata:
  name: pc-test
spec:
  timeout: 30
  captureConfig:
    firstN:
      number: 5
  destination:
    pod:
      namespace: default
      name: toolbox
  packet:
    ipFamily: IPv4
    protocol: ICMP

Likewise, it should be possible to omit the destination field when source.pod is present:

apiVersion: crd.antrea.io/v1alpha1
kind: PacketCapture
metadata:
  name: pc-test
spec:
  timeout: 30
  captureConfig:
    firstN:
      number: 5
  source:
    pod:
      namespace: default
      name: toolbox
  packet:
    ipFamily: IPv4
    protocol: ICMP

Anything else you would like to add?
Note that there will be some "interaction" with #6863.
When source is omitted, the only possible capture point is the destination Pod. When destination is omitted, the only possible capture point is the source Pod.

@antoninbas antoninbas added area/ops/packetcapture Issues or PRs related to the PacketCapture feature kind/feature Categorizes issue or PR as related to a new feature. labels Feb 6, 2025
@mviswanathsai
Copy link

I can see the following things that will need to be done in the PR that would fix this:

  1. Update the CRD to where we can define either both of the "destination/source" pair or any one of them. Currently, you have to define both: We can do this at the controller level, or using CEL (I would personally go for CEL, less overload).
  2. Write some tests. (I do see a test framework in place)
  3. See the tests fail.
  4. Then (stating the obvious here), we need to change the underlying logic. But the way I see it, it should not be a complete rewrite or anything. Should be doable with a bit of refactoring.
  5. (Hopefully) See the tests pass.
  6. Update docs to reflect the changes.

I realize this isn’t an ideal "first issue," but I believe I can probably take a good crack at with my present skill set. The packet capture logic itself is relatively contained compared to the rest of the codebase, so it feels manageable. I can’t promise perfection 😛, but I’ll definitely give it a good shot.

@mviswanathsai
Copy link

mviswanathsai commented Feb 8, 2025

Just to know that we are on the same page:

  1. When the destination is configured (and not the source), we would be capturing all the packets that ingress into the destination.
  2. When the source is configured (and not the destination), we would be capturing all the packets that egress out from the source.

Currently though, (paraphrasing form the discussion in #6863 )
We likely only track the source and filter out packets destined for the specified destination at the source. This means we’re capturing packets either at the source or the destination, but not both—which is probably fine, as knowing both the source and destination should be sufficient for most use cases. (However, there might some use cases (as pointed out in #6863) where you need to capture packets at both ends and do some operations on that.)

Edit: I realized we expect atleast one of the source/destination to be a pod. But we always expect both the destination and source to be present.
So there can be a bunch of cases:

  1. Both destination and source are pods. (In this case, we will likely do the capture at the source. See here.)
  2. Destination is an IP while the source is a pod. (The antrea-agent present in the source pod's node will do the capture)
  3. Source is an IP while the destination is a pod. (The antrea-agent present in the destination pod's node will do the capture)

@mviswanathsai
Copy link

One area that isn't entirely clear to me is handling cases where the source or destination is an external IP.

Currently, PacketCapture assumes that at least one of the source or destination is a pod. However, with the proposed changes:

  • If only the destination is specified and it's an external IP, or
  • If only the source is specified and it's an external IP,
    then every antrea-agent on every node would need to perform packet capture. This is because we can no longer localize the capture to a specific node or pod, adding complexity and diverging from the current behavior.

A particular concern is how FirstN should be handled in this scenario. I had a couple of ideas:

  1. (Probably not a good fit for the pilot) Enforcing it globally—for example, capturing at most 100 packets across all nodes—doesn't seem feasible, as each node would independently capture packets, making it difficult to enforce a strict global limit. - This is of little value and I don't see a valid use case for a global FirstN limit.

  2. (Seems a good fit) We should maintain the existing behavior and enforce FirstN at each antrea-agent independently. The key difference would be that instead of generating a single .pcap file per PacketCapture, we would now have one .pcap file per pod that sends/receives the specified packets (e.g., ICMP traffic, based on the PacketCapture configuration) to/from the given external source/destination.

Of these, the 2nd idea seems a good fit IMO. What are your thoughts on this @antoninbas? Apologies upfront, in case I am missing something.

@mviswanathsai
Copy link

As a side note, I may not be super active this coming week. But I will try to keep up with your comments and make a PR whenever I get the time.

@antoninbas
Copy link
Contributor Author

@mviswanathsai thanks for your enthusiasm. Please do feel free to take a stab at this issue.

A few key points based on your questions / comments above:

Update the CRD to where we can define either both of the "destination/source" pair or any one of them. Currently, you have to define both: We can do this at the controller level, or using CEL (I would personally go for CEL, less overload).

There is no need for CEL here. The current restriction just needs to be relaxed. I even shared a link to the relevant section of the OpenAPI schema in the CRD definition.

When the destination is configured (and not the source), we would be capturing all the packets that ingress into the destination.

Correct, and the destination has to be a Pod.

When the source is configured (and not the destination), we would be capturing all the packets that egress out from the source.

Same. This is correct, and the source has to be a Pod.

Currently, PacketCapture assumes that at least one of the source or destination is a pod

Yes, and that will remain the case even after this change.

then every antrea-agent on every node would need to perform packet capture

While #6863 is probably needed to clarify this, there is no plan to have more than one capture point for a PacketCapture. If source.pod and destination.pod are provided, we should default to source.pod (but let user request destination.pod as the capture point - this is what #6863 is for). If source.pod is provided but not destination.pod, then the capture has to be at source.pod. Conversely, if destination.pod is provided but not source.pod, then the capture has to be at destination.pod. Note that with the exception of the feature request in #6863, this is the current behavior today.

A particular concern is how FirstN should be handled in this scenario

I think this concern is no longer relevant once you consider that there is only ever one capture point (source or destination).

The key point of this issue is to relax the current restriction, which implies:

  • updating the openAPI schema
  • possibly update BPF code generation for the capture
  • e2e testing

@ifrah-ashraf
Copy link

Hi @antoninbas,

Could you please review the latest changes for this issue. I’ve addressed the validation updates and made the necessary modifications to ensure correct packet capture behavior.

Thanks !

ifrah-ashraf added a commit to ifrah-ashraf/antrea that referenced this issue Feb 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ops/packetcapture Issues or PRs related to the PacketCapture feature kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

No branches or pull requests

3 participants