Failed to build transformer-engine #1506

Open
sfwu2003 opened this issue Feb 25, 2025 · 4 comments

Python: 3.12.7
PyTorch: 2.6.0+cu126
CUDA: 12.6
cuDNN: 9.3.0.75
GCC: 13.3.0
GPU: RTX 4090
OS: Ubuntu

I have already exported the path.

pip install transformer_engine[pytorch]
Defaulting to user installation because normal site-packages is not writeable
Collecting transformer_engine[pytorch]
Using cached transformer_engine-1.13.0-py3-none-any.whl.metadata (16 kB)
Collecting transformer_engine_cu12==1.13.0 (from transformer_engine[pytorch])
Using cached transformer_engine_cu12-1.13.0-py3-none-manylinux_2_28_x86_64.whl.metadata (16 kB)
Collecting transformer_engine_torch==1.13.0 (from transformer_engine[pytorch])
Downloading transformer_engine_torch-1.13.0.tar.gz (121 kB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: pydantic in /workspace/shared/anaconda3/lib/python3.12/site-packages (from transformer_engine_cu12==1.13.0->transformer_engine[pytorch]) (2.8.2)
Requirement already satisfied: importlib-metadata>=1.0 in /workspace/shared/anaconda3/lib/python3.12/site-packages (from transformer_engine_cu12==1.13.0->transformer_engine[pytorch]) (7.0.1)
Requirement already satisfied: packaging in /workspace/shared/anaconda3/lib/python3.12/site-packages (from transformer_engine_cu12==1.13.0->transformer_engine[pytorch]) (24.1)
Requirement already satisfied: torch in ./.local/lib/python3.12/site-packages (from transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (2.6.0+cu126)
Requirement already satisfied: zipp>=0.5 in /workspace/shared/anaconda3/lib/python3.12/site-packages (from importlib-metadata>=1.0->transformer_engine_cu12==1.13.0->transformer_engine[pytorch]) (3.17.0)
Requirement already satisfied: annotated-types>=0.4.0 in /workspace/shared/anaconda3/lib/python3.12/site-packages (from pydantic->transformer_engine_cu12==1.13.0->transformer_engine[pytorch]) (0.6.0)
Requirement already satisfied: pydantic-core==2.20.1 in /workspace/shared/anaconda3/lib/python3.12/site-packages (from pydantic->transformer_engine_cu12==1.13.0->transformer_engine[pytorch]) (2.20.1)
Requirement already satisfied: typing-extensions>=4.6.1 in /workspace/shared/anaconda3/lib/python3.12/site-packages (from pydantic->transformer_engine_cu12==1.13.0->transformer_engine[pytorch]) (4.11.0)
Requirement already satisfied: filelock in /workspace/shared/anaconda3/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (3.13.1)
Requirement already satisfied: setuptools in /workspace/shared/anaconda3/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (75.1.0)
Requirement already satisfied: sympy==1.13.1 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (1.13.1)
Requirement already satisfied: networkx in /workspace/shared/anaconda3/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (3.3)
Requirement already satisfied: jinja2 in /workspace/shared/anaconda3/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (3.1.4)
Requirement already satisfied: fsspec in /workspace/shared/anaconda3/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (2024.6.1)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.6.77 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (12.6.77)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.6.77 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (12.6.77)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.6.80 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (12.6.80)
Requirement already satisfied: nvidia-cudnn-cu12==9.5.1.17 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (9.5.1.17)
Requirement already satisfied: nvidia-cublas-cu12==12.6.4.1 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (12.6.4.1)
Requirement already satisfied: nvidia-cufft-cu12==11.3.0.4 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (11.3.0.4)
Requirement already satisfied: nvidia-curand-cu12==10.3.7.77 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (10.3.7.77)
Requirement already satisfied: nvidia-cusolver-cu12==11.7.1.2 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (11.7.1.2)
Requirement already satisfied: nvidia-cusparse-cu12==12.5.4.2 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (12.5.4.2)
Requirement already satisfied: nvidia-cusparselt-cu12==0.6.3 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (0.6.3)
Requirement already satisfied: nvidia-nccl-cu12==2.21.5 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (2.21.5)
Requirement already satisfied: nvidia-nvtx-cu12==12.6.77 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (12.6.77)
Requirement already satisfied: nvidia-nvjitlink-cu12==12.6.85 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (12.6.85)
Requirement already satisfied: triton==3.2.0 in ./.local/lib/python3.12/site-packages (from torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (3.2.0)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /workspace/shared/anaconda3/lib/python3.12/site-packages (from sympy==1.13.1->torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in /workspace/shared/anaconda3/lib/python3.12/site-packages (from jinja2->torch->transformer_engine_torch==1.13.0->transformer_engine[pytorch]) (2.1.3)
Using cached transformer_engine_cu12-1.13.0-py3-none-manylinux_2_28_x86_64.whl (125.2 MB)
Using cached transformer_engine-1.13.0-py3-none-any.whl (459 kB)
Building wheels for collected packages: transformer_engine_torch
Building wheel for transformer_engine_torch (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [22 lines of output]
/workspace/shared/anaconda3/lib/python3.12/site-packages/setuptools/_distutils/dist.py:261: UserWarning: Unknown distribution option: 'tests_require'
warnings.warn(msg)
running bdist_wheel
/workspace/jmwang/.local/lib/python3.12/site-packages/torch/utils/cpp_extension.py:529: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_ext
/workspace/jmwang/.local/lib/python3.12/site-packages/torch/utils/cpp_extension.py:458: UserWarning: There are no g++ version bounds defined for CUDA version 12.6
warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'transformer_engine_torch' extension
creating build/temp.linux-x86_64-cpython-312/csrc
creating build/temp.linux-x86_64-cpython-312/csrc/extensions
creating build/temp.linux-x86_64-cpython-312/csrc/extensions/multi_tensor
g++ -pthread -B /workspace/shared/anaconda3/compiler_compat -fno-strict-overflow -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /workspace/shared/anaconda3/include -fPIC -O2 -isystem /workspace/shared/anaconda3/include -fPIC -I/tmp/pip-install-d8mpwx1x/transformer-engine-torch_84f4d864065842a4a131c88cea3e6872/common_headers -I/tmp/pip-install-d8mpwx1x/transformer-engine-torch_84f4d864065842a4a131c88cea3e6872/common_headers/common -I/tmp/pip-install-d8mpwx1x/transformer-engine-torch_84f4d864065842a4a131c88cea3e6872/common_headers/common/include -I/tmp/pip-install-d8mpwx1x/transformer-engine-torch_84f4d864065842a4a131c88cea3e6872/csrc -I/workspace/jmwang/.local/lib/python3.12/site-packages/torch/include -I/workspace/jmwang/.local/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/workspace/jmwang/.local/lib/python3.12/site-packages/torch/include/TH -I/workspace/jmwang/.local/lib/python3.12/site-packages/torch/include/THC -I/usr/local/cuda/include -I/workspace/shared/anaconda3/include/python3.12 -c csrc/common.cpp -o build/temp.linux-x86_64-cpython-312/csrc/common.o -O3 -fvisibility=hidden -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1016" -DTORCH_EXTENSION_NAME=transformer_engine_torch -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
In file included from /workspace/jmwang/.local/lib/python3.12/site-packages/torch/include/ATen/cudnn/Handle.h:4,
from csrc/common.h:14,
from csrc/common.cpp:7:
/workspace/jmwang/.local/lib/python3.12/site-packages/torch/include/ATen/cudnn/cudnn-wrapper.h:3:10: fatal error: cudnn.h: No such file or directory
3 | #include <cudnn.h>
| ^~~~~~~~~
compilation terminated.
error: command '/usr/bin/g++' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for transformer_engine_torch
Running setup.py clean for transformer_engine_torch
Failed to build transformer_engine_torch
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (transformer_engine_torch)

ptrendx (Member) commented Feb 25, 2025

Hi @sfwu2003, how did you install cuDNN? @ksivaman for visibility: this is an installation from wheels.

skr3178 commented Mar 1, 2025

Same issue when installing.

OS: Ubuntu 22.04
RTX 3060
Python 3.12.9
conda list | grep cudnn
nvidia-cudnn-cu12 9.1.0.70 pypi_0 pypi
torch 2.6.0 pypi_0 pypi

pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
pip install transformer_engine[pytorch]

      self.run_command(cmd_name)
    File "/home/skr/miniconda3/envs/cosmos/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 339, in run_command
      self.distribution.run_command(command)
    File "/home/skr/miniconda3/envs/cosmos/lib/python3.12/site-packages/setuptools/dist.py", line 999, in run_command
      super().run_command(command)
    File "/home/skr/miniconda3/envs/cosmos/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 1002, in run_command
      cmd_obj.run()
    File "/tmp/pip-req-build-nehutrdg/build_tools/build_ext.py", line 119, in run
      ext._build_cmake(
    File "/tmp/pip-req-build-nehutrdg/build_tools/build_ext.py", line 91, in _build_cmake
      raise RuntimeError(f"Error when running CMake: {e}")
  RuntimeError: Error when running CMake: Command '['/home/skr/miniconda3/envs/cosmos/lib/python3.12/site-packages/cmake/data/bin/cmake', '-S', '/tmp/pip-req-build-nehutrdg/transformer_engine/common', '-B', '/tmp/pip-req-build-nehutrdg/build/cmake', '-DPython_EXECUTABLE=/home/skr/miniconda3/envs/cosmos/bin/python', '-DPython_INCLUDE_DIR=/home/skr/miniconda3/envs/cosmos/include/python3.12', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/tmp/pip-req-build-nehutrdg/build/lib.linux-x86_64-cpython-312', '-DCMAKE_CUDA_ARCHITECTURES=70;80;89;90;100;120', '-Dpybind11_DIR=/tmp/pip-req-build-nehutrdg/.eggs/pybind11-2.13.6-py3.12.egg/pybind11/share/cmake/pybind11', '-GNinja']' returned non-zero exit status 1.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for transformer_engine
Running setup.py clean for transformer_engine
Failed to build transformer_engine
ERROR: Failed to build installable wheels for some pyproject.toml based projects (transformer_engine)

def _load_library():
    """Load shared library with Transformer Engine C extensions"""
    module_name = "transformer_engine_torch"

    if is_package_installed(module_name):
        assert is_package_installed("transformer_engine"), "Could not find transformer-engine."
        assert is_package_installed(
            "transformer_engine_cu12"
        ), "Could not find transformer-engine-cu12."
        assert (
            version(module_name)
            == version("transformer-engine")
            == version("transformer-engine-cu12")
        ), (
            "TransformerEngine package version mismatch. Found"
            f" {module_name} v{version(module_name)}, transformer-engine"
            f" v{version('transformer-engine')}, and transformer-engine-cu12"
            f" v{version('transformer-engine-cu12')}. Install transformer-engine using 'pip install"
            " transformer-engine[pytorch]==VERSION'"
        )

    if is_package_installed("transformer-engine-cu12"):
        if not is_package_installed(module_name):
            logging.info(
                "Could not find package %s. Install transformer-engine using 'pip"
                " install transformer-engine[pytorch]==VERSION'",
                module_name,
            )

    extension = _get_sys_extension()
    try:
        so_dir = get_te_path() / "transformer_engine"
        so_path = next(so_dir.glob(f"{module_name}.*.{extension}"))
    except StopIteration:
        so_dir = get_te_path()
        so_path = next(so_dir.glob(f"{module_name}.*.{extension}"))

    spec = importlib.util.spec_from_file_location(module_name, so_path)
    solib = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = solib
    spec.loader.exec_module(solib)
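
If I'm reading this right, the loader only imports an already-installed, compiled transformer_engine_torch library; getting that library requires a local compile (the transformer_engine_torch sdist build in the original log, the git build here), and that compile step is what fails on the missing cudnn.h.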

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Jan_15_19:20:09_PST_2025
Cuda compilation tools, release 12.8, V12.8.61
Build cuda_12.8.r12.8/compiler.35404655_0

nvidia-smi
Sun Mar 2 00:32:30 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.10 Driver Version: 570.86.10 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3060 Off | 00000000:01:00.0 On | N/A |
| 0% 48C P8 24W / 170W | 592MiB / 12288MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1626 G /usr/lib/xorg/Xorg 269MiB |
| 0 N/A N/A 1900 G /usr/bin/gnome-shell 118MiB |
| 0 N/A N/A 5148 G ...ess --variations-seed-version 144MiB |
+-----------------------------------------------------------------------------------------+

ptrendx (Member) commented Mar 4, 2025

The error in the original issue indicates a problem with finding the cuDNN headers:

/workspace/jmwang/.local/lib/python3.12/site-packages/torch/include/ATen/cudnn/cudnn-wrapper.h:3:10: fatal error: cudnn.h: No such file or directory

@skr3178 could you confirm that you are seeing the same error (it should be above the lines you posted)?
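
For anyone debugging this locally, a quick generic check (not TE-specific; the python one-liner assumes cuDNN was installed via the nvidia-cudnn-cu12 wheel, which exposes an importable nvidia.cudnn module, and the paths are only examples):

# Where the pip cuDNN wheel keeps its headers, if installed that way
python -c "import nvidia.cudnn, os; print(os.path.join(os.path.dirname(nvidia.cudnn.__file__), 'include'))"

# Can the compiler resolve <cudnn.h> with the current environment?
echo '#include <cudnn.h>' | g++ -x c++ -E - -o /dev/null && echo "cudnn.h found" || echo "cudnn.h NOT found"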

collinmccarthy commented Mar 9, 2025

@ptrendx I'm having the same issue as the OP with the cuDNN headers. Here's a reproducible example:

conda create --name tr_engine \
 python=3.10 \
 nvidia/label/cuda-12.6.3::cuda \
 nvidia::cudnn

conda activate tr_engine

pip install torch --index-url https://download.pytorch.org/whl/cu126

export CUDA_HOME=$CONDA_PREFIX
export NVTE_FRAMEWORK=pytorch
pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable --verbose

This gives:

  ~/miniconda3/envs/tr_engine/bin/x86_64-conda-linux-gnu-c++ -DNV_CUDNN_FRONTEND_USE_DYNAMIC_LOADING -Dtransformer_engine_EXPORTS -I/tmp/pip-req-build-_h4f8bra/transformer_engine/common/.. -I/tmp/pip-req-build-_h4f8bra/transformer_engine/common/include -I/tmp/pip-req-build-_h4f8bra/transformer_engine/common/../../3rdparty/cudnn-frontend/include -I/tmp/pip-req-build-_h4f8bra/build/cmake/string_headers -isystem /lustre/fsw/portfolios/llmservice/users/cmccarthy/miniconda3/envs/tr_engine/targets/x86_64-linux/include -Wl,--version-script=/tmp/pip-req-build-_h4f8bra/transformer_engine/common/libtransformer_engine.version -O3 -DNDEBUG -std=gnu++17 -fPIC -MD -MT CMakeFiles/transformer_engine.dir/comm_gemm_overlap/comm_gemm_overlap.cpp.o -MF CMakeFiles/transformer_engine.dir/comm_gemm_overlap/comm_gemm_overlap.cpp.o.d -o CMakeFiles/transformer_engine.dir/comm_gemm_overlap/comm_gemm_overlap.cpp.o -c /tmp/pip-req-build-_h4f8bra/transformer_engine/common/comm_gemm_overlap/comm_gemm_overlap.cpp
  In file included from /tmp/pip-req-build-_h4f8bra/transformer_engine/common/normalization/common.cpp:9:
  /tmp/pip-req-build-_h4f8bra/transformer_engine/common/normalization/common.h:10:10: fatal error: cudnn.h: No such file or directory
     10 | #include <cudnn.h>
        |          ^~~~~~~~~
  In file included from /tmp/pip-req-build-_h4f8bra/transformer_engine/common/cudnn_utils.cpp:7:
  /tmp/pip-req-build-_h4f8bra/transformer_engine/common/cudnn_utils.h:10:10: fatal error: cudnn.h: No such file or directory
     10 | #include <cudnn.h>
        |          ^~~~~~~~~
  compilation terminated.
  compilation terminated.

But the file exists at the standard path (I'm assuming $CUDA_HOME/include is standard):

(tr_engine) ~/$ cd $CUDA_HOME && find . -name cudnn.h
./lib/python3.10/site-packages/nvidia/cudnn/include/cudnn.h
./include/cudnn.h

Explicitly adding export NVTE_CUDA_INCLUDE_PATH=$CUDA_HOME/include doesn't work either.
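
A generic fallback I haven't verified (it relies on GCC's standard CPATH variable rather than anything TransformerEngine-specific) would be to put that include directory on the compiler's default header search path before rebuilding:

export CPATH=$CUDA_HOME/include:$CPATH
pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable --verbose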

I've attached the full pip install / build output.

Thanks for taking a look.

transformer_engine_build_out.txt
