Executor interface design and implementation #4537
paddle/framework/executor.cc (new file)
@@ -0,0 +1,176 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/framework/executor.h" | ||
|
||
#include <algorithm> | ||
#include <iostream> | ||
#include <memory> | ||
#include <set> | ||
#include <vector> | ||
|
||
#include "paddle/framework/lod_tensor.h" | ||
#include "paddle/framework/op_registry.h" | ||
#include "paddle/framework/scope.h" | ||
|
||
#include <boost/range/adaptor/reversed.hpp> | ||
|
||
namespace paddle {
namespace framework {

const std::string kFeedOpType = "feed";
const std::string kFetchOpType = "fetch";

Executor::Executor(const std::vector<platform::Place>& places) {
  PADDLE_ENFORCE_GT(places.size(), 0);
  device_contexts_.resize(places.size());

Review comment: enforce places.size() > 0?
Reply: Done.
  for (size_t i = 0; i < places.size(); i++) {
    if (platform::is_cpu_place(places[i])) {
      device_contexts_[i] = new platform::CPUDeviceContext(
          boost::get<platform::CPUPlace>(places[i]));
    } else if (platform::is_gpu_place(places[i])) {
#ifdef PADDLE_WITH_CUDA
      device_contexts_[i] = new platform::CUDADeviceContext(
          boost::get<platform::GPUPlace>(places[i]));
#else
      PADDLE_THROW("'GPUPlace' is not supported in CPU only device.");

Review comment: "CPU only device." does not make much sense; maybe change it to: 'GPUPlace' is not supported, please recompile with PADDLE_WITH_CUDA=ON.
Reply: Done.

#endif
    }
  }
}
Executor::~Executor() {
  for (auto& device_context : device_contexts_) {
    delete device_context;
  }
}
void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id) {
  // TODO(tonyyang-svail):
  //   - only runs on the first device (i.e. no interdevice communication)
  //   - will change to use multiple blocks for RNN op and Cond Op
  PADDLE_ENFORCE_GT(pdesc.blocks_size(), block_id);
  auto& block = pdesc.blocks(block_id);
  auto& device = device_contexts_[0];

  // Instantiate all the vars in the global scope
  for (auto& var : block.vars()) {
    scope->NewVar(var.name());
  }

Review comment: What if the Variable has already been created?
Reply: Then it returns a pointer to the existing Variable without re-creating a new one.

  Scope& local_scope = scope->NewScope();

Review comment: It seems that we should drop local_scope after invoking Run.
Reply: Looks like there is no easy way to do this. I will add a TODO on this.
  std::vector<bool> should_run = Prune(pdesc, block_id);
  PADDLE_ENFORCE_EQ(should_run.size(), block.ops_size());
  for (size_t i = 0; i < should_run.size(); ++i) {
    // if (should_run[i]) {

Review comment: No commented-out code, please.

    if (true) {

Review comment: It's better to add a named constant and assign it true; otherwise this is a magic value.
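A minimal sketch of that named-constant suggestion (kRunAllOps is an invented name, illustration only):

      // Name the temporary "run every op" switch instead of a bare `true`.
      constexpr bool kRunAllOps = true;  // invented name, not part of this PR
      if (kRunAllOps || should_run[i]) {
        // ... instantiate output vars, create the op, and run it as below ...
      }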
      for (auto& var : block.ops(i).outputs()) {
        for (auto& argu : var.arguments()) {
          if (local_scope.FindVar(argu) == nullptr) {
            local_scope.NewVar(argu);
          }
        }
      }
      LOG(INFO) << block.ops(i).type();
      if (block.ops(i).type() == "sum") {
        LOG(INFO) << "Here";
        for (auto& var : block.ops(i).inputs()) {
          for (auto& argu : var.arguments()) {
            LOG(INFO) << var.parameter() << " " << argu;
          }
        }
      }
      auto op = paddle::framework::OpRegistry::CreateOp(block.ops(i));

Review comment: I think it will be extremely slow if we create the operator every time, because the protobuf message will be parsed and copied.
Reply: Let's keep the most straightforward implementation (i.e. avoid any possible premature optimization). Once we get it running, we can go ahead and do profiling.
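A rough sketch of the caching idea raised above. It assumes a hypothetical cached_ops_ member (a vector of owning operator pointers) that is not part of this PR, and that CreateOp returns an owning pointer; it only illustrates the optimization being deferred:

      // Hypothetical sketch (cached_ops_ is invented, not in this PR):
      // parse each OpDesc once, then reuse the operators across runs.
      if (cached_ops_.empty()) {
        for (auto& op_desc : block.ops()) {
          cached_ops_.push_back(OpRegistry::CreateOp(op_desc));
        }
      }
      cached_ops_[i]->Run(local_scope, *device);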
      LOG(INFO) << op->DebugString();
      op->Run(local_scope, *device);
    }
  }

  // TODO(tonyyang-svail):
  //   - Destroy local_scope
}

std::vector<bool> Executor::Prune(const ProgramDesc& pdesc, int block_id) {
  // TODO(tonyyang-svail):
  //   - will change to use multiple blocks for RNN op and Cond Op

Review comment: Only runs the first block for now; will change to use multiple blocks for RNN op and Cond Op.
Reply: Done.

  auto& block = pdesc.blocks(block_id);
  auto& ops = block.ops();
  bool expect_feed = true;
  for (auto& op_desc : ops) {
    PADDLE_ENFORCE(op_desc.type() != kFeedOpType || expect_feed,
                   "All FeedOps are at the beginning of the ProgramDesc");
    expect_feed = (op_desc.type() == kFeedOpType);
  }

  bool expect_fetch = true;
  for (auto op_iter = ops.rbegin(); op_iter != ops.rend(); ++op_iter) {
    auto& op_desc = *op_iter;
    PADDLE_ENFORCE(op_desc.type() != kFetchOpType || expect_fetch,
                   "All FetchOps must be at the end of the ProgramDesc");
    expect_fetch = (op_desc.type() == kFetchOpType);
  }

  std::set<std::string> dependent_vars;
  std::vector<bool> should_run;
  for (auto op_iter = ops.rbegin(); op_iter != ops.rend(); ++op_iter) {
    auto& op_desc = *op_iter;

    bool found_dependent_vars = false;

Review comment: This seems a complicated and long method; write some comments for each code block, or split it into multiple smaller functions.
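A sketch of one possible split along the lines the reviewer asks for (the helper name is invented):

  // Hypothetical helper extracted from the loop below: true iff any output
  // of op_desc is already required by a later op that will be kept.
  static bool AnyOutputIsDependent(const OpDesc& op_desc,
                                   const std::set<std::string>& dependent_vars) {
    for (auto& var : op_desc.outputs()) {
      for (auto& argu : var.arguments()) {
        if (dependent_vars.count(argu) != 0) return true;
      }
    }
    return false;
  }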
    for (auto& var : op_desc.outputs()) {
      for (auto& argu : var.arguments()) {
        if (dependent_vars.count(argu) != 0) {
          found_dependent_vars = true;
        }
      }
    }

    if (op_desc.type() == kFetchOpType || found_dependent_vars) {
      // erase its outputs from the dependency graph
      for (auto& var : op_desc.outputs()) {
        for (auto& argu : var.arguments()) {
          dependent_vars.erase(argu);
        }
      }

      // insert its inputs into the dependency graph
      for (auto& var : op_desc.inputs()) {
        for (auto& argu : var.arguments()) {
          dependent_vars.insert(argu);
        }
      }

      LOG(INFO) << "1 " << op_desc.type();
      should_run.push_back(true);
    } else {
      LOG(INFO) << "0 " << op_desc.type();
      should_run.push_back(false);
    }
  }

Review comment: Should ENFORCE dependent_vars to be empty here.
Reply: Done.

  // TODO(tonyyang-svail):
  //   - check this after integration of Init
  // PADDLE_ENFORCE(dependent_vars.empty());

  // since we are traversing the ProgramDesc in reverse order
  // we reverse the should_run vector
  std::reverse(should_run.begin(), should_run.end());

  return should_run;
}

}  // namespace framework
}  // namespace paddle
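Since Prune is the least obvious part of this file, here is a self-contained toy version of the same reverse-traversal idea (plain structs instead of protobuf messages; all names invented for illustration):

#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct ToyOp {
  std::string type;
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

// Walk ops back-to-front; keep every fetch op and every op whose outputs
// feed an op already marked as needed, mirroring Executor::Prune.
std::vector<bool> ToyPrune(const std::vector<ToyOp>& ops) {
  std::set<std::string> dependent_vars;
  std::vector<bool> should_run;
  for (auto it = ops.rbegin(); it != ops.rend(); ++it) {
    bool needed = (it->type == "fetch");
    for (const auto& out : it->outputs) {
      if (dependent_vars.count(out) != 0) needed = true;
    }
    if (needed) {
      // This op will run: its outputs are now produced, its inputs required.
      for (const auto& out : it->outputs) dependent_vars.erase(out);
      for (const auto& in : it->inputs) dependent_vars.insert(in);
    }
    should_run.push_back(needed);
  }
  // Built back-to-front, so flip into op order.
  std::reverse(should_run.begin(), should_run.end());
  return should_run;
}

int main() {
  std::vector<ToyOp> ops = {
      {"mul", {"x", "w"}, {"a"}},     // kept: "a" feeds the fetch op
      {"mul", {"x", "w"}, {"dead"}},  // pruned: output never consumed
      {"fetch", {"a"}, {}},           // always kept
  };
  std::vector<bool> keep = ToyPrune(ops);
  assert(keep[0] && !keep[1] && keep[2]);
  return 0;
}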
paddle/framework/executor.h (new file)
@@ -0,0 +1,56 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once

#include "paddle/framework/framework.pb.h"
#include "paddle/framework/op_info.h"
#include "paddle/framework/scope.h"
#include "paddle/framework/tensor.h"

namespace paddle {
namespace framework {

class Executor {
 public:
  explicit Executor(const std::vector<platform::Place>& places);
  ~Executor();

  /* @Brief

Review comment: I think C++ code is the document, and we don't really need to use Doxygen. Therefore, we can write much shorter comments. For this specific case …
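The shorter, non-Doxygen style the reviewer has in mind might look like this (wording invented here):

  // Runs all ops of block `block_id` of `pdesc` inside `scope`.
  void Run(const ProgramDesc& pdesc, Scope* scope, int block_id);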
   * Runtime evaluation of the given ProgramDesc under certain Scope
   *
   * @param
   *  ProgramDesc
   *  Scope
   */
  void Run(const ProgramDesc&, Scope*, int);

 protected:
  /* @Brief
   * Pruning the graph
   *
   * @param
   *  ProgramDesc
   *
   * @return
   *  vector<bool> Same size as ops. Indicates whether an op should be run.
   */
  std::vector<bool> Prune(const ProgramDesc& pdesc, int block_id);
 private:
  std::vector<platform::DeviceContext*> device_contexts_;
};

}  // namespace framework
}  // namespace paddle
Review comment: Should we run
cc_test(executor_test SRCS executor_test.cc DEPS executor)
as well when WITH_GPU is ON?
Reply: When WITH_GPU is ON, both CPU and GPU code will be tested.
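For context, a minimal usage sketch of the interface this PR adds (CPU-only build assumed; not taken from executor_test.cc):

  // Build an executor on a single CPU place and run block 0 of a program.
  std::vector<paddle::platform::Place> places;
  places.emplace_back(paddle::platform::CPUPlace());
  paddle::framework::Executor executor(places);

  paddle::framework::Scope scope;
  paddle::framework::ProgramDesc pdesc;  // assume this is built elsewhere
  executor.Run(pdesc, &scope, 0);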