[backend] IRSA for Kubeflow 2.4 Fails with “Access Denied” When Checking S3 Bucket #11729

Open
adamdelman opened this issue Mar 6, 2025 · 0 comments

adamdelman commented Mar 6, 2025

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    I deployed using the kustomize manifests.
  • KFP version:
    2.4.0

Steps to reproduce

  1. Deploy KFP 2.4 on an EKS cluster using the 2.4 kustomize manifests.
  2. Create an S3 bucket in the same region (my-kfp-bucket).
  3. Create an IAM role (KubeflowS3Role) with a trust policy matching system:serviceaccount:my-namespace:ml-pipeline.
  4. Attach the appropriate S3 permissions via KubeflowS3Policy to that role.
  5. Annotate the ml-pipeline ServiceAccount with the role ARN.
  6. Restart the deployment and observe that the ml-pipeline pod fails with “Access Denied” when checking the S3 bucket, while no related IRSA events appear in CloudTrail.

The API server logs show:

F0306 08:19:04.525174 7 client_manager.go:502] Failed to check if object store bucket exists. Error: Access Denied.
No relevant events appear in CloudTrail for the failing S3 requests (e.g., no ListBucket or GetBucketLocation calls under the IRSA role).
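One quick check from inside the ml-pipeline container is whether the IRSA environment variables were injected at all. The following is a minimal sketch, assuming the standard EKS pod identity webhook, which sets `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` on pods whose ServiceAccount carries the role annotation. The helper name `missing_irsa_vars` is hypothetical and not part of KFP.

```python
import os

# Variables the EKS pod identity webhook injects for IRSA-enabled pods.
IRSA_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_vars(env=None):
    """Return the IRSA variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in IRSA_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_irsa_vars()
    if missing:
        print("IRSA not injected, missing:", ", ".join(missing))
    else:
        print("IRSA variables present for role:", os.environ["AWS_ROLE_ARN"])
```

If either variable is missing, the pod may have been created before the annotation was applied, or the webhook may not have mutated it. In that case the process would be using some other credential source, which would be consistent with the absence of events under the IRSA role in CloudTrail. This is a possible explanation, not a confirmed diagnosis.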

Expected result

  1. The ml-pipeline pod should successfully connect to the S3 bucket via IRSA.
  2. CloudTrail should log S3 calls (e.g., ListBucket, GetBucketLocation) under the IRSA role ARN.
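To confirm the expected result, the identity the pod actually uses can be compared against the IRSA role. The sketch below assumes the caller ARN has been obtained separately, for example by running `aws sts get-caller-identity` in a debug pod that uses the same ServiceAccount. It relies on the STS assumed-role ARN format `arn:aws:sts::<ACCOUNT_ID>:assumed-role/<ROLE_NAME>/<SESSION_NAME>`. The function name `assumed_role_name` and the account ID in the example are illustrative only.

```python
def assumed_role_name(caller_arn):
    """Return the role name from an STS assumed-role ARN, or None otherwise."""
    parts = caller_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "sts":
        return None
    resource = parts[5].split("/")
    # Expected shape: assumed-role/<role name>/<session name>
    if len(resource) != 3 or resource[0] != "assumed-role":
        return None
    return resource[1]


if __name__ == "__main__":
    # Placeholder ARN for illustration; substitute the real caller ARN.
    arn = "arn:aws:sts::123456789012:assumed-role/KubeflowS3Role/example-session"
    role = assumed_role_name(arn)
    print("matches IRSA role" if role == "KubeflowS3Role" else f"unexpected identity: {arn}")
```

If the role name is not KubeflowS3Role, or the ARN is not an assumed-role ARN at all, the S3 calls are being made under a different identity than the one the policy is attached to.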

Materials and Reference

  1. Namespace & Service Account:
    • Namespace: my-namespace
    • ServiceAccount: ml-pipeline, annotated with:

eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/KubeflowS3Role

  2. IAM Role:

  • Name: KubeflowS3Role
  • ARN (placeholder): arn:aws:iam::<ACCOUNT_ID>:role/KubeflowS3Role
  • Trust Policy: Allows the following condition:
"StringEquals": {
  "oidc.eks.us-east-2.amazonaws.com/id/<OIDC_ID>:sub": "system:serviceaccount:my-namespace:ml-pipeline"
}
  3. IAM Policy: KubeflowS3Policy (attached to KubeflowS3Role), allowing:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:HeadBucket"
      ],
      "Resource": "arn:aws:s3:::my-kfp-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::my-kfp-bucket/*"
    }
  ]
}
  4. Deployment:
    The ml-pipeline deployment uses serviceAccountName: ml-pipeline in my-namespace.

  5. ConfigMap:
    The configuration references s3://my-kfp-bucket for the bucket name, and the region is set to us-east-2.
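
The trust policy condition only matches if the subject string is exactly `system:serviceaccount:<namespace>:<name>` for the ServiceAccount the pod runs under, because `StringEquals` is a case-sensitive exact comparison. Below is a minimal sketch of that check, assuming the trust policy has been loaded as a Python dict. The function names and the `EXAMPLE` OIDC ID are hypothetical, and the check covers only the `sub` condition, not the rest of the policy.

```python
def service_account_subject(namespace, name):
    """Build the OIDC subject claim Kubernetes issues for a ServiceAccount."""
    return f"system:serviceaccount:{namespace}:{name}"


def trust_policy_allows(policy, oidc_provider, subject):
    """True if an Allow statement has StringEquals on <oidc_provider>:sub matching subject."""
    key = f"{oidc_provider}:sub"
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        expected = stmt.get("Condition", {}).get("StringEquals", {}).get(key)
        # IAM accepts either a single string or a list of strings here.
        allowed = [expected] if isinstance(expected, str) else (expected or [])
        if subject in allowed:
            return True
    return False


if __name__ == "__main__":
    provider = "oidc.eks.us-east-2.amazonaws.com/id/EXAMPLE"  # placeholder OIDC ID
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                f"{provider}:sub": "system:serviceaccount:my-namespace:ml-pipeline",
            }},
        }],
    }
    subject = service_account_subject("my-namespace", "ml-pipeline")
    print(trust_policy_allows(policy, provider, subject))
```

If the deployment actually runs in a different namespace than the one named in the trust policy (the upstream manifests commonly use `kubeflow`), the subject would differ and the role could not be assumed.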