bug: File Access Error with vllm using runai_streamer on OCP #193
Comments
In 0.6.6 the only path pattern that works is s3://bucket/dir/, so please use 0.7.3. The RunAI streamer uses the AWS C++ SDK, which does not automatically fetch credentials from AWS Vault. What you can do is use AWS Vault to obtain temporary credentials and pass them to the process. We are currently working to make the authentication behave like boto3, so in the future this will be handled by the streamer.
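As a rough sketch of the suggested workaround, aws-vault can inject short-lived credentials into the serving process as environment variables. The profile name `ecs` and the bucket/model path below are placeholders, not values from this issue:

```shell
# Run the server with temporary credentials injected by aws-vault
# (profile name "ecs" is a placeholder for your configured profile):
aws-vault exec ecs -- vllm serve s3://bucket-name/model/ --load-format runai_streamer

# Alternatively, export the temporary AWS_* variables into the current shell:
eval "$(aws-vault exec ecs -- env | grep '^AWS_' | sed 's/^/export /')"
```

Because the streamer's C++ SDK reads the standard `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` / `AWS_SESSION_TOKEN` variables, either form makes the Vault-issued credentials visible to it.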
I also tried version 0.7.3, with similar results.
Checking the AWS logs should help in understanding the authentication problem. In addition, there are the streamer's internal logs.
After adding those env vars I get the following log:
This error probably occurs because the configuration files were not downloaded. Config files are downloaded in vLLM using boto3, so please verify that you can download a file from your bucket with boto3.
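A minimal probe for the boto3 check suggested above might look like the following. The bucket, key, and endpoint values are placeholders; for a Dell ECS S3-compatible store, `endpoint_url` must point at the ECS gateway rather than AWS:

```python
def verify_s3_download(bucket, key, endpoint_url, dest="/tmp/s3-probe"):
    """Try to download one object with boto3; raises on auth/access failure."""
    import boto3  # imported lazily so the helper can be defined without boto3 present

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    # download_file raises a botocore ClientError (e.g. AccessDenied, 404)
    # if the credentials or the endpoint are wrong.
    s3.download_file(bucket, key, dest)
    return dest
```

If this call fails, the streamer's `file access error` is likely the same underlying credentials or endpoint problem, since both paths read the same `AWS_*` environment variables.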
Describe the bug
I am attempting to run the vLLM production stack on OpenShift (OCP) while fetching the model from a Dell ECS S3-compatible storage using runai_streamer. However, I consistently encounter the following error:
Could not send runai_request to libstreamer due to: b'file access error'
All necessary environment variables related to AWS credentials and configurations are set, along with all recommended RUNAI_STREAMER environment variables. Despite debugging the run, this is the only error message I receive.
To Reproduce
Deploy vLLM using image version 0.7.3 or 0.6.6 with the production stack on OCP, serving a single model defined in the modelSpec.
Configure the model URL to point at the bucket path in the Dell ECS S3-compatible storage, e.g. s3://bucket-name (the error occurred with and without a trailing /).
Using an added external secret, define all the env vars in a Vault service and mount them to the pod.
Start the process and observe the error.
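For reference, the underlying vLLM invocation the stack performs can be reproduced directly; the bucket path and endpoint below are placeholders for the reporter's actual values:

```shell
# Point the AWS SDK at the S3-compatible endpoint (ECS gateway, placeholder URL):
export AWS_ENDPOINT_URL="https://ecs.example.internal"

# Load the model straight from object storage via the RunAI streamer:
vllm serve s3://bucket-name/ --load-format runai_streamer
```

Reproducing the failure outside the Helm deployment this way helps separate a streamer/credentials problem from a Kubernetes secret-mounting problem.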
Expected behavior
The model should load successfully from storage without file access errors, since the credentials are correct and all the env vars are defined.
Additional context
I also get the following logs: