[Fix] Refactor ZeRO Directory Structure (#211)
## Title

- [Fix] Refactor ZeRO Directory Structure

## Description

- This PR restructures the zero directory under `oslo/torch/nn/parallel/data_parallel/zero` to enhance code organization and readability. The changes align the implementation with the architecture of our project, providing a more logical separation between different components and functionalities.
- Organized heterogeneous components (Inspired by PatrickStar) into the `hetero` subdirectory, centralizing related code and improving maintainability.
- Update to Zero Optimizer Wrapper Interface:
  > In the existing Zero optimizer, we were not sharding the optimizer state, so the wrapper interface has been updated accordingly. My sincere apologies for any confusion or inconvenience this change may cause, and I urge reviewers to assess this modification to ensure alignment with our project's requirements.
- Renaming FULL_SHARD to PatrickStar Algorithm:
  > Please note that the previously termed FULL_SHARD strategy was, in fact, implementing the PatrickStar algorithm. PatrickStar is a novel approach to parallel training of pre-trained models via chunk-based memory management, leveraging CPU-GPU heterogeneous memory space. It has demonstrated significant advantages in model scaling and execution speed.
  >
  > However, I felt that the name "PatrickStar" did not adequately convey the specific characteristics of this approach. Therefore, I have taken the liberty to rename it as "hetero," reflecting the heterogeneous memory utilization. I genuinely value the reviewers' opinions on this naming choice and kindly ask for your feedback. If a more suitable name can be agreed upon, I will happily update it accordingly.

## Linked Issues

- N/A
Showing 41 changed files with 200 additions and 162 deletions.
@@ -1,15 +1,15 @@
-from oslo.torch.nn.parallel.data_parallel.zero.sharded_optim.sharded_optim import (
+from oslo.torch.nn.parallel.data_parallel.zero.optim.optim import (
     ZeroRedundancyOptimizer,
 )
-from oslo.torch.nn.parallel.data_parallel.zero.fully_sharded_data_parallel import (
-    _FullyShardedDataParallel,
+from oslo.torch.nn.parallel.data_parallel.zero.hetero.data_parallel import (
+    _HeteroDataParallel,
 )
-from oslo.torch.nn.parallel.data_parallel.zero.sharded_optim.heterogeneous_optim import (
-    _HeterogeneousZeroOptimizer,
+from oslo.torch.nn.parallel.data_parallel.zero.hetero.optim import (
+    _HeteroOptimizer,
 )
 
 __ALL__ = [
     "ZeroRedundancyOptimizer",
-    "_FullyShardedDataParallel",
-    "_HeterogeneousZeroOptimizer",
+    "_HeteroDataParallel",
+    "_HeteroOptimizer",
 ]
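For downstream code that still imports the old paths, the renames in this hunk can be applied mechanically. The following is a minimal sketch, not part of the commit: the helper name `migrate_imports` is hypothetical, and the mappings cover only the three module paths and two class names visible in the hunk above. Other files touched by the commit may need further entries.

```python
import re

# Old -> new module paths, copied from the hunk above.
MODULE_RENAMES = {
    "oslo.torch.nn.parallel.data_parallel.zero.sharded_optim.sharded_optim":
        "oslo.torch.nn.parallel.data_parallel.zero.optim.optim",
    "oslo.torch.nn.parallel.data_parallel.zero.fully_sharded_data_parallel":
        "oslo.torch.nn.parallel.data_parallel.zero.hetero.data_parallel",
    "oslo.torch.nn.parallel.data_parallel.zero.sharded_optim.heterogeneous_optim":
        "oslo.torch.nn.parallel.data_parallel.zero.hetero.optim",
}

# Old -> new class names, copied from the hunk above.
NAME_RENAMES = {
    "_FullyShardedDataParallel": "_HeteroDataParallel",
    "_HeterogeneousZeroOptimizer": "_HeteroOptimizer",
}


def migrate_imports(source: str) -> str:
    """Rewrite old ZeRO import paths and class names to the new layout.

    Hypothetical helper for illustration; it does plain text substitution
    and does not parse Python.
    """
    for old, new in MODULE_RENAMES.items():
        source = source.replace(old, new)
    for old, new in NAME_RENAMES.items():
        # Word boundaries avoid touching longer identifiers that merely
        # contain the old name.
        source = re.sub(rf"\b{re.escape(old)}\b", new, source)
    return source
```

Running it over a file's text leaves unrelated lines untouched, so it can be applied to a whole source tree and the result reviewed as an ordinary diff.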
File renamed without changes.
This file was deleted.
@@ -0,0 +1,6 @@
+from oslo.torch.nn.parallel.data_parallel.zero.hetero.data_parallel import (
+    _HeteroDataParallel,
+)
+from oslo.torch.nn.parallel.data_parallel.zero.hetero.optim import _HeteroOptimizer
+
+__ALL__ = ["_HeteroDataParallel", "_HeteroOptimizer"]
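One detail worth checking in review: the `__init__.py` files in this commit spell the export list `__ALL__`. Python's `from module import *` consults only the lowercase name `__all__`. An upper-case `__ALL__` is an ordinary variable, so star-imports fall back to the default rule of skipping underscore-prefixed names. The sketch below demonstrates that behavior with a throwaway in-memory module; `star_import_names` and the module name `demo_mod` are hypothetical and unrelated to OSLO.

```python
import sys
import types


def star_import_names(module_source: str, module_name: str = "demo_mod"):
    """Return the sorted names that `from <module> import *` would bind."""
    module = types.ModuleType(module_name)
    exec(module_source, module.__dict__)
    sys.modules[module_name] = module  # make the module importable by name
    namespace = {}
    try:
        exec(f"from {module_name} import *", namespace)
    finally:
        del sys.modules[module_name]
    namespace.pop("__builtins__", None)  # injected by exec, not by the import
    return sorted(namespace)


UPPER = "__ALL__ = ['_Hetero']\nclass _Hetero: pass\nclass Public: pass\n"
LOWER = "__all__ = ['_Hetero']\nclass _Hetero: pass\nclass Public: pass\n"

print(star_import_names(UPPER))  # only the non-underscore name is exported
print(star_import_names(LOWER))  # the list in __all__ is honored
```

Explicit imports such as `from ...zero.hetero import _HeteroDataParallel` work either way, so this only matters if the list is meant to control star-imports or to be read by documentation tooling.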
17 changes: 17 additions & 0 deletions
oslo/torch/nn/parallel/data_parallel/zero/hetero/chunk/__init__.py
@@ -0,0 +1,17 @@
+from oslo.torch.nn.parallel.data_parallel.zero.hetero.chunk.chunk import (
+    Chunk,
+    TensorState,
+    ChunkFullError,
+)
+from oslo.torch.nn.parallel.data_parallel.zero.hetero.chunk.manager import ChunkManager
+from oslo.torch.nn.parallel.data_parallel.zero.hetero.chunk.utils import (
+    init_chunk_manager,
+)
+
+__ALL__ = [
+    "Chunk",
+    "TensorState",
+    "ChunkFullError",
+    "ChunkManager",
+    "init_chunk_manager",
+]
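For readers unfamiliar with the chunk-based memory management mentioned in the description, the idea can be illustrated with a toy model: tensors are packed back to back into fixed-size chunks, and a new chunk is opened when the current one is full. This is a simplified sketch under stated assumptions, not OSLO's or PatrickStar's implementation. `ToyChunk`, `ToyChunkFullError`, and `pack_into_chunks` are hypothetical names, and tensors are represented only by their element counts.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class ToyChunkFullError(Exception):
    """Raised when a tensor does not fit in the chunk's remaining space."""


@dataclass
class ToyChunk:
    capacity: int  # total number of elements the chunk can hold
    used: int = 0  # number of elements already assigned
    # tensor name -> (offset within the chunk, number of elements)
    slots: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def append(self, name: str, numel: int) -> None:
        if self.used + numel > self.capacity:
            raise ToyChunkFullError(name)
        self.slots[name] = (self.used, numel)
        self.used += numel


def pack_into_chunks(tensors: Dict[str, int], chunk_size: int) -> List[ToyChunk]:
    """Greedily pack tensors, in order, into chunks of `chunk_size` elements."""
    chunks = [ToyChunk(chunk_size)]
    for name, numel in tensors.items():
        if numel > chunk_size:
            raise ValueError(f"{name} ({numel}) exceeds chunk size {chunk_size}")
        try:
            chunks[-1].append(name, numel)
        except ToyChunkFullError:
            # Current chunk is full: open a new one and place the tensor there.
            chunks.append(ToyChunk(chunk_size))
            chunks[-1].append(name, numel)
    return chunks


chunks = pack_into_chunks({"a": 4, "b": 3, "c": 5, "d": 2}, chunk_size=8)
for index, chunk in enumerate(chunks):
    print(index, chunk.used, chunk.slots)
```

In a heterogeneous setup the chunk, rather than the individual tensor, would be the unit that is placed on or moved between CPU and GPU memory. That part is omitted here.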