[Question]: ModuleNotFoundError in integration test code. Installation failed; how can it be resolved? #9375

Closed
WhuanY opened this issue Nov 6, 2024 · 2 comments
Labels: question (Further information is requested)

Comments

@WhuanY
Contributor

WhuanY commented Nov 6, 2024

Please state your question

  • Background:
    I am in the mid-to-late stage of developing LoKrModel.
    While running the tests, the following problem occurred:
tests/llm/test_lokr.py:80: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/llm/testing_utils.py:73: in run_predictor
    predict()
llm/predict/predictor.py:1625: in predict
    predictor = create_predictor(predictor_args, model_args)
llm/predict/predictor.py:1354: in create_predictor
    from paddlenlp.experimental.transformers import (
paddlenlp/experimental/transformers/__init__.py:15: in <module>
    from .bloom import *
paddlenlp/experimental/transformers/bloom/__init__.py:15: in <module>
    from .modeling import *
paddlenlp/experimental/transformers/bloom/modeling.py:23: in <module>
    from paddlenlp.experimental.transformers.fused_transformer_layers import (
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    from __future__ import annotations
    
    import os
    from dataclasses import dataclass
    from typing import List, Optional
    
    import numpy as np
    import paddle
    import paddle.distributed as dist
    from paddle.framework import LayerHelper, core, in_dynamic_mode, in_dynamic_or_pir_mode
    from paddle.incubate.nn.functional import (
        fused_layer_norm,
        fused_moe,
        fused_rms_norm,
        masked_multihead_attention,
        variable_length_memory_efficient_attention,
    )
    from paddle.nn import Layer
    from paddle.nn.initializer import Constant
    from paddle.nn.quant import weight_only_linear
    
    from paddlenlp.utils.import_utils import is_paddlenlp_ops_available
    from paddlenlp.utils.log import logger
    
    if not is_paddlenlp_ops_available():
        logger.warning(
            "The paddlenlp_ops package is not installed. you can read the docs and install it by hand, "
            "you can refer to: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/csrc/README.md"
        )
>   from paddlenlp_ops import rebuild_padding_v2
E   ModuleNotFoundError: No module named 'paddlenlp_ops'

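Following the error log, I tried to install the paddlenlp_ops package manually.
However, the following problem occurred:
when the build reached
python setup_cuda.py install, it reported that a required package could not be found: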

(PaddleLoKr) (base) wuyuhuan@zuchongzhi:~/PaddleNLP/csrc$ python setup_cuda.py install
WARNING: OMP_NUM_THREADS set to 20, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
Cloning into 'third_party/nlohmann_json'...
remote: Enumerating objects: 35301, done.
remote: Counting objects: 100% (6620/6620), done.
remote: Compressing objects: 100% (778/778), done.
error: RPC failed; curl 56 Recv failure: Connection reset by peer
error: 665 bytes of body are still expected
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
Git clone https://github.com/nlohmann/json.git operation failed with the following error: Command '['git', 'clone', '-b', 'v3.11.3', '--single-branch', 'https://github.com/nlohmann/json.git', 'third_party/nlohmann_json']' returned non-zero exit status 128.
Please check your network connection or access rights to the repository.
If the problem persists, please refer to the README file for instructions on how to manually download and install the necessary components.
running install
/home/wuyuhuan/anaconda3/envs/PaddleLoKr/lib/python3.8/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
/home/wuyuhuan/anaconda3/envs/PaddleLoKr/lib/python3.8/site-packages/setuptools/_distutils/cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` and ``easy_install``.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://github.com/pypa/setuptools/issues/917 for details.
        ********************************************************************************

!!
  self.initialize_options()
running bdist_egg
running egg_info
writing paddlenlp_ops.egg-info/PKG-INFO
writing dependency_links to paddlenlp_ops.egg-info/dependency_links.txt
writing top-level names to paddlenlp_ops.egg-info/top_level.txt
reading manifest file 'paddlenlp_ops.egg-info/SOURCES.txt'
writing manifest file 'paddlenlp_ops.egg-info/SOURCES.txt'
installing library code to build/paddlenlp_ops/bdist.linux-x86_64/egg
running install_lib
running build_ext
Compiling user custom op, it will cost a few seconds.....
building 'paddlenlp_ops' extension
/usr/local/cuda/bin/nvcc -I/home/wuyuhuan/anaconda3/envs/PaddleLoKr/lib/python3.8/site-packages/paddle/include -I/home/wuyuhuan/anaconda3/envs/PaddleLoKr/lib/python3.8/site-packages/paddle/include/third_party -I/usr/local/cuda/include -I/home/wuyuhuan/anaconda3/envs/PaddleLoKr/include/python3.8 -I/home/wuyuhuan/anaconda3/envs/PaddleLoKr/include/python3.8 -c /home/wuyuhuan/PaddleNLP/csrc/gpu/dequant_int8.cu -o /home/wuyuhuan/PaddleNLP/csrc/build/paddlenlp_ops/lib.linux-x86_64-cpython-38/dequant_int8.cu.o -DPADDLE_WITH_CUDA -DEIGEN_USE_GPU -ccbin cc -Xcompiler -fPIC --expt-relaxed-constexpr -DNVCC -gencode arch=compute_80,code=sm_80 -O3 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ -Igpu -Igpu/cutlass_kernels -Igpu/fp8_gemm_with_cutlass -Igpu/cutlass_kernels/fp8_gemm_fused/autogen -Ithird_party/cutlass/include -Ithird_party/nlohmann_json/single_include -Igpu/sample_kernels -w -DPADDLE_WITH_CUSTOM_KERNEL -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
In file included from /home/wuyuhuan/PaddleNLP/csrc/gpu/dequant_int8.cu:15:
/home/wuyuhuan/PaddleNLP/csrc/gpu/helper.h:40:10: fatal error: nlohmann/json.hpp: No such file or directory
   40 | #include "nlohmann/json.hpp"

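That is where the build stopped.
Since the fetch did not succeed, I guessed it might be a version problem, so I fetched the corresponding nlohmann repository myself and placed it in the expected working directory:
[image]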
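For readers hitting the same `fetch-pack` disconnect, here is a minimal sketch of one way to retry the download. The URL, tag, and destination come from the failing log above; the `--depth 1` shallow clone, the retry count, and the helper names `build_clone_command` and `clone_with_retries` are assumptions for illustration, not something `setup_cuda.py` is known to do.

```python
import subprocess
import time

# Values copied from the failing git command in the log above.
REPO_URL = "https://github.com/nlohmann/json.git"
TAG = "v3.11.3"
DEST = "third_party/nlohmann_json"


def build_clone_command(url=REPO_URL, tag=TAG, dest=DEST):
    """Return a shallow-clone command; --depth 1 fetches only the tagged commit."""
    return ["git", "clone", "--depth", "1", "-b", tag, "--single-branch", url, dest]


def clone_with_retries(command, attempts=3, delay_seconds=5.0, runner=subprocess.run):
    """Run `command` up to `attempts` times and return True on the first success.

    `runner` is injectable so the retry logic can be exercised without network access.
    """
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return True
        if attempt < attempts:
            time.sleep(delay_seconds)  # brief pause before the next attempt
    return False
```

Calling `clone_with_retries(build_clone_command())` from the `csrc` directory would attempt the download. `git clone` refuses to write into a non-empty directory, so any leftover `third_party/nlohmann_json` directory has to be removed first.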
Then I ran it again:

python setup_cuda.py install

This produced an error that I do not know how to resolve:

/usr/lib/gcc/x86_64-linux-gnu/13/include/amxtileintrin.h(42): error: identifier "__builtin_ia32_ldtilecfg" is undefined
    __builtin_ia32_ldtilecfg (__config);
    ^

/usr/lib/gcc/x86_64-linux-gnu/13/include/amxtileintrin.h(49): error: identifier "__builtin_ia32_sttilecfg" is undefined
    __builtin_ia32_sttilecfg (__config);
    ^

2 errors detected in the compilation of "/home/wuyuhuan/PaddleNLP/csrc/gpu/dequant_int8.cu".
error: command '/usr/local/cuda/bin/nvcc' failed with exit code 2

Is there another way to solve this operator problem? Or does the original .md documentation need updating? Thanks a lot 😊

WhuanY added the question (Further information is requested) label Nov 6, 2024
@ZHUI
Collaborator

ZHUI commented Nov 6, 2024

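Hello

  1. If your test does not need to run prediction, you can simply comment out the code that runs the related inference tests.
  2. The compilation problem may be a version issue. Which CUDA and GCC versions do you have locally?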
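To answer the version question, a small script can collect both numbers. The sketch below is illustrative only: it assumes `gcc` and `nvcc` are on `PATH` (the log above invokes `/usr/local/cuda/bin/nvcc` with `-ccbin cc`, so the actual host compiler may differ), and the helper names are hypothetical. The error paths under `/usr/lib/gcc/x86_64-linux-gnu/13/include/` suggest GCC 13 headers are in use. One possible reading, not confirmed in this thread, is that the installed CUDA toolkit does not support that host compiler version.

```python
import re
import shutil
import subprocess


def parse_gcc_version(text):
    """Extract (major, minor, patch) from `gcc --version` output, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def parse_nvcc_release(text):
    """Extract (major, minor) from the 'release X.Y' field of `nvcc --version`, or None."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def tool_output(executable):
    """Return `<executable> --version` output, or None if the tool is not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    return subprocess.run([path, "--version"], capture_output=True, text=True).stdout


def report_versions():
    """Collect parsed GCC and CUDA versions; a value is None when a tool is missing."""
    gcc_text = tool_output("gcc")
    nvcc_text = tool_output("nvcc")
    return {
        "gcc": parse_gcc_version(gcc_text) if gcc_text else None,
        "cuda": parse_nvcc_release(nvcc_text) if nvcc_text else None,
    }
```

Running `print(report_versions())` gives a pair of version tuples that can be pasted into a reply and compared against the host compiler table in the CUDA installation guide for the installed toolkit.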

WhuanY closed this as completed Nov 11, 2024
@WhuanY
Contributor Author

WhuanY commented Nov 11, 2024

I do not need to run prediction.
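When prediction is not needed, an alternative to commenting out the inference test is to skip it automatically whenever `paddlenlp_ops` is missing. The following is a minimal sketch using only the standard library; the class name `LoKrPredictTest` and the test body are hypothetical placeholders, not the actual contents of `tests/llm/test_lokr.py`.

```python
import importlib.util
import unittest


def module_available(name):
    """Return True if the top-level module `name` can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# Evaluated once at import time; False on machines without the compiled custom ops.
HAS_PADDLENLP_OPS = module_available("paddlenlp_ops")


class LoKrPredictTest(unittest.TestCase):  # hypothetical test class name
    @unittest.skipUnless(HAS_PADDLENLP_OPS, "paddlenlp_ops is not installed")
    def test_predict(self):
        # Placeholder for the inference step that needs the custom ops.
        import paddlenlp_ops  # noqa: F401
```

With this guard the test is reported as skipped rather than failing with `ModuleNotFoundError`, so the rest of the suite can still run on a machine where the custom ops could not be built.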
