Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pytorch 2.3.1 #172

Merged
merged 4 commits into from
Aug 2, 2024
Merged

Pytorch 2.3.1 #172

merged 4 commits into from
Aug 2, 2024

Conversation

sdake
Copy link
Member

@sdake sdake commented Jul 27, 2024

Builds a wheel in Dockerfiles/target containing pytorch 2.3.1. This is needed for vllm because Neural Magic's dynamic quantization uses compute features only available in compute>8.0. Pytorch has a hardcode rejection unless compute>8.9.

@sdake sdake requested review from a team as code owners July 27, 2024 13:08
@cla-bot cla-bot bot added the CLA CLA signed? label Jul 27, 2024
@sdake sdake force-pushed the dockerfile/pytorch branch 4 times, most recently from 1cc2bbb to df087ca Compare July 27, 2024 13:18
This (properly) builds pytorch 2.3.1, including mkl support.
@sdake sdake force-pushed the dockerfile/pytorch branch from df087ca to 091699e Compare July 27, 2024 13:19
@sdake
Copy link
Member Author

sdake commented Jul 27, 2024

vllm-project/vllm#6864

rstarmer
rstarmer previously approved these changes Jul 27, 2024
Copy link
Member

@rstarmer rstarmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to remove the commented out run statements? If they're not needed, they probably shouldn't be included. Looks good otherwise!

Copy link
Contributor

@MostAwesomeDude MostAwesomeDude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to make the given edit in order to build the container.

For what it's worth: the container is still building. By my estimates, that's some 36 CPU-hours. In the future, please consider enrolling containers with CI so that we can use GitHub's CPU hours instead. Here's where it's at, in the final build step:

[8379/8672] Building CXX object test_api/CMakeFiles/test_api.dir/grad_mode.cpp.o

#
#

COPY /workspace/patches/pytorch-compute-86-override.patch /workspace/patches
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
COPY /workspace/patches/pytorch-compute-86-override.patch /workspace/patches
COPY pytorch-compute-86-override.patch /workspace/patches

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that. I had a diverged branch and an update that had fixed this wasn't pushed.

sdake added 2 commits July 29, 2024 03:30
This (properly) builds pytorch 2.3.1, including mkl support.
Copy link
Member

@rstarmer rstarmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rstarmer rstarmer enabled auto-merge (rebase) July 29, 2024 16:26
@rstarmer rstarmer disabled auto-merge July 29, 2024 16:26
@sdake
Copy link
Member Author

sdake commented Jul 29, 2024

@MostAwesomeDude understood related to build time. I am sort of at a loss as to how to handle building properly. I do not want this repository to be responsible for CI-ing the builds as the builds of our various components take an hour or more. even something simple like cloud-hypervisor takes about 15 minutes. builds. I handled this (In Istio, for example) by creating a separate repo that built and pushed new go binaries. (See dockerfile).

I like @howardjohn 's Istio Release Builder as inspiration, although something small and tactical would be a better start IMO.

@sdake sdake enabled auto-merge (rebase) July 30, 2024 09:23
@sdake sdake requested a review from MostAwesomeDude August 2, 2024 15:57
@sdake sdake dismissed MostAwesomeDude’s stale review August 2, 2024 15:58

have resolved the issues here.

@sdake sdake merged commit fb942dd into artificialwisdomai:main Aug 2, 2024
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA CLA signed?
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants