Update SSD documentation #987
Changes from 12 commits
52a3c29
a4fa39b
ec40950
c207b3b
2a20c0e
be9dc53
0406e6f
ba7516e
e21827a
d912527
4d7c7d3
3036795
7f3d16d
ac23702
ee37989
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,6 +4,14 @@ | |
|
||
## SSD Object Detection | ||
|
||
## Table of Contents | ||
- [Introduction](#简介) | ||
- [Data Preparation](#数据准备) | ||
- [Model Training](#模型训练) | ||
- [Model Evaluation](#模型评估) | ||
- [Model Prediction and Visualization](#模型预测以及可视化) | ||
- [Released Models](#模型发布) | ||
|
||
### Introduction | ||
|
||
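[Single Shot MultiBox Detector (SSD)](https://arxiv.org/abs/1512.02325) is a single-stage object detector. Unlike two-stage detection methods, a single-stage detector performs no region proposal step; it predicts bounding boxes and class probabilities directly from feature maps. SSD adopts this single-stage approach and improves on it by detecting objects at each scale on a feature map of the corresponding scale. As shown in the figure below, SSD makes predictions at six levels, on feature maps of six different scales. At each level, two 3x3 convolutions predict the object classes and the bounding-box offsets, respectively. For each class, the six levels therefore produce 38x38x4 + 19x19x6 + 10x10x6 + 5x5x6 + 3x3x4 + 1x1x4 = 8732 detections in total. | ||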
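The detection count quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the `LEVELS` table and the helper names are illustrative and are not taken from the repository's code.

```python
# (feature-map side length, default boxes per location) for the six levels,
# copied from the sum quoted in the paragraph above.
LEVELS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def boxes_per_level(levels):
    """Number of detections produced by each level: side * side * boxes."""
    return [side * side * boxes for side, boxes in levels]


def total_boxes(levels):
    """Total number of detections per class over all levels."""
    return sum(boxes_per_level(levels))


print(boxes_per_level(LEVELS))  # [5776, 2166, 600, 150, 36, 4]
print(total_boxes(LEVELS))      # 8732
```

The 38x38 level alone contributes 5776 of the 8732 detections, so most of them come from the highest-resolution feature map.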
|
@@ -19,8 +27,6 @@ SSD can be easily plugged into any standard convolutional network, such as VGG, Res | |
|
||
You can use the [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) or the [MS-COCO dataset](http://cocodataset.org/#download). | ||
|
||
#### PASCAL VOC dataset | ||
|
||
To train on the PASCAL VOC dataset, first download it with the commands below. | ||
|
||
```bash | ||
|
@@ -30,8 +36,6 @@ cd data/pascalvoc | |
|
||
The `download.sh` script automatically creates the list files used for training and testing. | ||
|
||
#### MS-COCO dataset | ||
|
||
To train on the MS-COCO dataset, first download it with the commands below. | ||
|
||
``` | ||
|
@@ -70,7 +74,13 @@ cd data/coco | |
python train.py --help | ||
``` | ||
|
||
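We trained MobileNet-SSD with the RMSProp optimizer, using a batch size of 64, a weight decay of 0.00005, and an initial learning rate of 0.001; the learning rate is decayed with multipliers of 0.5, 0.25, 0.1, and 0.01 at epochs 40, 60, 80, and 100. After 120 epochs of training, the mAP under the 11-point metric is XXX%. | ||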
Training data loading is defined in `reader.py`. All images are resized to 300x300. During training, images are also randomly distorted, expanded, cropped, and flipped: | ||
- Distort: randomly perturb the brightness, contrast, saturation, and hue of the image. | ||
Review comment: Should "hue" be replaced with the Chinese term? Reply: revised |
||
- Expand: place the original image on a larger canvas filled with the pixel mean; resizing that canvas is then equivalent to shrinking the original image. | ||
- Crop: randomly crop patches of varying size and aspect ratio, each required to meet an IoU threshold with the ground-truth boxes. | ||
Review comment: This description is hard to follow and needs revising. Reply: revised |
||
- Flip: flip the image horizontally. | ||
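The geometric steps in the list above (expand, crop, flip) can be sketched on normalized `[xmin, ymin, xmax, ymax]` boxes. This is a hedged illustration, not the code in `reader.py`: every function name, the sampling ranges for crop scale and aspect ratio, and the acceptance rule (keep a crop if it reaches `min_iou` with at least one ground-truth box) are assumptions made for the example. Color distortion is omitted because it does not change the boxes.

```python
import random


def flip_boxes(boxes):
    """Horizontally flip normalized [xmin, ymin, xmax, ymax] boxes."""
    return [[1.0 - xmax, ymin, 1.0 - xmin, ymax]
            for xmin, ymin, xmax, ymax in boxes]


def expand_boxes(boxes, ratio, off_x, off_y):
    """Map boxes into a canvas `ratio` times larger than the image.

    (off_x, off_y) is the normalized top-left corner of the original image
    inside the canvas. Resizing the canvas to a fixed size afterwards makes
    every object appear `ratio` times smaller.
    """
    s = 1.0 / ratio
    return [[off_x + xmin * s, off_y + ymin * s,
             off_x + xmax * s, off_y + ymax * s]
            for xmin, ymin, xmax, ymax in boxes]


def iou(a, b):
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def sample_crop(boxes, min_iou, rng, max_trials=50):
    """Sample a crop window that overlaps some box with IoU >= min_iou.

    The scale and aspect-ratio ranges below are illustrative assumptions.
    Returns None if no acceptable window is found.
    """
    for _ in range(max_trials):
        scale = rng.uniform(0.3, 1.0)
        aspect = rng.uniform(0.5, 2.0)
        w = min(1.0, scale * aspect ** 0.5)
        h = min(1.0, scale / aspect ** 0.5)
        x0 = rng.uniform(0.0, 1.0 - w)
        y0 = rng.uniform(0.0, 1.0 - h)
        crop = [x0, y0, x0 + w, y0 + h]
        if any(iou(crop, box) >= min_iou for box in boxes):
            return crop
    return None


gt = [[0.1, 0.2, 0.4, 0.6]]
print(flip_boxes(gt))
print(expand_boxes(gt, ratio=2.0, off_x=0.25, off_y=0.25))
print(sample_crop(gt, min_iou=0.1, rng=random.Random(0)))
```

Working in normalized coordinates keeps the box arithmetic independent of the final 300x300 resize.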
|
||
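We trained MobileNet-SSD with the RMSProp optimizer, using a batch size of 64, a weight decay of 0.00005, and an initial learning rate of 0.001; the learning rate is decayed with multipliers of 0.5, 0.25, 0.1, and 0.01 at epochs 40, 60, 80, and 100. After 120 epochs of training, the mAP under the 11-point metric is 73.32%. | ||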
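The learning-rate schedule described above can be written as a piecewise-constant function of the epoch. This is a minimal sketch under two assumptions that the text does not state: each multiplier is applied to the initial rate (not cumulatively), and epochs are counted from 0 with each boundary epoch already using the new rate. The names are illustrative, not taken from `train.py`.

```python
BASE_LR = 0.001
BOUNDARIES = [40, 60, 80, 100]        # epochs at which the rate changes
MULTIPLIERS = [0.5, 0.25, 0.1, 0.01]  # assumed relative to BASE_LR


def learning_rate(epoch, base_lr=BASE_LR):
    """Return the learning rate used during `epoch` (0-indexed)."""
    factor = 1.0
    for boundary, multiplier in zip(BOUNDARIES, MULTIPLIERS):
        if epoch >= boundary:
            factor = multiplier
    return base_lr * factor


for epoch in (0, 39, 40, 60, 80, 100, 119):
    print(epoch, learning_rate(epoch))
```

Under these assumptions the final 20 epochs run at 1% of the initial rate.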
|
||
### Model Evaluation | ||
|
||
|
@@ -114,4 +124,4 @@ MobileNet-v1-SSD 300x300 prediction visualization | |
|
||
| Model | Pretrained model | Training data | Test data | mAP | | ||
|:------------------------:|:------------------:|:----------------:|:------------:|:----:| | ||
|MobileNet-v1-SSD 300x300 | COCO MobileNet SSD | VOC07+12 trainval| VOC07 test | XXX% | | ||
|MobileNet-v1-SSD 300x300 | COCO MobileNet SSD | VOC07+12 trainval| VOC07 test | 73.32% | |
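For context on the mAP column, the 11-point metric mentioned in the training section averages interpolated precision at the recall levels 0.0, 0.1, ..., 1.0, and mAP is the mean of that value over classes. The sketch below assumes the precision and recall points for a class have already been computed from ranked detections; the function names are illustrative and this is not the repository's evaluation code.

```python
def eleven_point_ap(recalls, precisions):
    """11-point interpolated average precision for one class.

    `recalls` and `precisions` are parallel lists with one entry per ranked
    detection. At each recall level t, the interpolated precision is the
    maximum precision among points with recall >= t, or 0 if there are none.
    """
    total = 0.0
    for i in range(11):
        t = i / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11.0


def mean_ap(per_class_ap):
    """mAP: the mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


ap = eleven_point_ap([0.5, 1.0], [1.0, 0.5])
print(ap)
```

In the example, the six recall levels up to 0.5 see precision 1.0 and the remaining five see 0.5, so the result is 8.5 / 11.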
Review comment: resize -> 缩放 ("scale")
Reply: revised