[Feature] Fused Mixtral support #8901
Conversation
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is …
Additional details and impacted files:

@@            Coverage Diff             @@
##           develop    #8901      +/-   ##
===========================================
- Coverage    54.80%   54.05%   -0.75%
===========================================
  Files          647      650       +3
  Lines       102474   104427    +1953
===========================================
+ Hits         56157    56445     +288
- Misses       46317    47982    +1665
Force-pushed from 5b4384c to 83a2000.
Force-pushed from 6070538 to 2b8afcf.
Force-pushed from 89ff18d to 3e459a5.
        return self.num_experts > 1

    def use_moe(self, i: int) -> bool:
        return self.has_moe() and (self.moe_every2 is False or (self.moe_every2 and i % 2 == 1))
This check is a bit odd: what if someone wants to switch to an MoE layer only every four layers?
That said, since this only targets Mixtral for now, it can stay as is.
Sounds good. If every-4 or every-8 layouts are needed, I think the moe_every parameter could be turned into an enum and the check driven by that enum. I'm working on other support at the moment, so I can submit a follow-up PR to change this later.
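For illustration, a minimal sketch of the enum idea suggested above (MoeInterval and the standalone use_moe helper are hypothetical names, not part of this PR):

from enum import Enum


class MoeInterval(Enum):
    EVERY_LAYER = 1
    EVERY_2_LAYERS = 2
    EVERY_4_LAYERS = 4


def use_moe(num_experts: int, interval: MoeInterval, i: int) -> bool:
    """Return True if transformer layer i should be an MoE layer."""
    has_moe = num_experts > 1
    # EVERY_2_LAYERS reproduces the current behaviour (odd layers are MoE);
    # EVERY_4_LAYERS would make layers 3, 7, 11, ... MoE, and so on.
    return has_moe and (i % interval.value == interval.value - 1)


# EVERY_2_LAYERS: layers 1, 3, 5, ... are MoE layers.
assert use_moe(8, MoeInterval.EVERY_2_LAYERS, 1)
assert not use_moe(8, MoeInterval.EVERY_2_LAYERS, 2)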
Suggest adding related unit tests later to make sure the functionality is correct. @penPenf28 @yuanlehome
@@ -1128,6 +1154,29 @@ def compute_out_linear(self, fmha_out, i):
            weight_dtype=self.weight_dtype,
        )

    def compute_fused_moe(self, tmp_out, i):
        # todo[xinhw]: make bias optional
This bug needs to be fixed as soon as possible.
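As a rough illustration of the optional-bias handling asked for here, a standalone sketch using plain paddle ops (expert_ffn is a hypothetical helper, not the fused MoE kernel in this PR):

import paddle


def expert_ffn(x, w1, w2, b1=None, b2=None):
    """Single-expert FFN where each bias add is skipped when that bias is None."""
    h = paddle.matmul(x, w1)
    if b1 is not None:
        h = h + b1
    h = paddle.nn.functional.silu(h)
    out = paddle.matmul(h, w2)
    if b2 is not None:
        out = out + b2
    return out


# Works with or without biases.
x = paddle.randn([4, 64])
w1, w2 = paddle.randn([64, 256]), paddle.randn([256, 64])
print(expert_ffn(x, w1, w2).shape)  # [4, 64], no bias supplied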
@@ -713,6 +794,29 @@ def compute_ffn_layernorm(self, out_linear_out, residual_input, i):

        return tmp_out, residual_input

    def compute_fused_moe(self, tmp_out, i):
        # todo[xinhw]: make bias optional
This bug needs to be fixed as soon as possible.
LGTM
* [Feature] Fused Mixtral support
* [Refactor] add MoeConfig and fix static graph export problem
* [Bugfix] fix small bug
* [Bugfix] fix moe_config bug
* [Bugfix] fix moe_config bug
* [Refactor] refine code
* [Refactor] refine code
* [Refactor] refine code
* [Refactor] match fused moe api change
* [Feature] wint8 support
PR types
New features
PR changes
Models
Description
Adds support for a high-performance implementation of the Mixtral-8x7B-Instruct-v0.1 model. bfloat16 + wint8 is currently supported, covering both the non-block and block attention variants.
The code still contains some redundant quantization parts; these will be revised in follow-up changes that add the related quantization support.
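For readers unfamiliar with wint8, a conceptual numpy sketch of weight-only int8 quantization (per-output-channel absmax scales, dequantized at matmul time); this is only an illustration of the idea, not the fused PaddleNLP kernel:

import numpy as np


def quantize_wint8(w):
    """Quantize a [in, out] weight to int8 with one absmax scale per output channel."""
    scale = np.abs(w).max(axis=0) / 127.0
    w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return w_int8, scale


def wint8_matmul(x, w_int8, scale):
    # Dequantize then matmul; a fused weight-only kernel does this inside the GEMM.
    return x @ (w_int8.astype(np.float32) * scale)


rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32)).astype(np.float32)
x = rng.standard_normal((4, 64)).astype(np.float32)
w_q, s = quantize_wint8(w)
print(np.abs(x @ w - wint8_matmul(x, w_q, s)).max())  # small quantization error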