Commit aaa08dc4 authored by Miao Zheng, committed by GitHub

[Doc] Refine MMSegmentation documentation (#2668)

parent 310ec4af
Showing with 175 additions and 495 deletions
# Add New Datasets
# \[WIP\] Add New Datasets
## Customize datasets by reorganizing data
......
# Add New Metrics
# Adding New Data Transforms
1. Write a new pipeline in any file, e.g., `my_pipeline.py`. It takes a dict as input and returns a dict.
```python
from mmcv.transforms import BaseTransform

from mmseg.datasets import TRANSFORMS


@TRANSFORMS.register_module()
class MyTransform(BaseTransform):

    def transform(self, results: dict) -> dict:
        results['dummy'] = True
        return results
```
2. Import the new class.
```python
from .my_pipeline import MyTransform
```
3. Use it in config files.
```python
crop_size = (512, 1024)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(
        type='RandomResize',
        scale=(2048, 1024),
        ratio_range=(0.5, 2.0),
        keep_ratio=True),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='MyTransform'),
    dict(type='PackSegInputs'),
]
```
# Adding New Data Transforms
## Customize data transformations
A customized data transformation must inherit from `BaseTransform` and implement the `transform` function.
Here we use a simple flipping transformation as an example:
```python
import mmcv
from mmcv.transforms import BaseTransform, TRANSFORMS


@TRANSFORMS.register_module()
class MyFlip(BaseTransform):

    def __init__(self, direction: str):
        super().__init__()
        self.direction = direction

    def transform(self, results: dict) -> dict:
        img = results['img']
        results['img'] = mmcv.imflip(img, direction=self.direction)
        return results
```
Then import the new class.
```python
from .my_pipeline import MyFlip
```
Thus, we can instantiate a `MyFlip` object and use it to process the data dict.
```python
import numpy as np
transform = MyFlip(direction='horizontal')
data_dict = {'img': np.random.rand(224, 224, 3)}
data_dict = transform(data_dict)
processed_img = data_dict['img']
```
Alternatively, we can use the `MyFlip` transformation in the data pipeline of a config file.
```python
pipeline = [
    ...
    dict(type='MyFlip', direction='horizontal'),
    ...
]
```
Note that if you want to use `MyFlip` in a config file, you must ensure that the file containing `MyFlip` is imported at runtime.
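One way to guarantee this is the `custom_imports` field supported by MMEngine-style configs; the module path below is only a hypothetical example of where the transform might live.

```python
# Import the module that defines MyFlip when the config is loaded.
custom_imports = dict(
    imports=['mmseg.datasets.transforms.my_pipeline'],  # hypothetical module path
    allow_failed_imports=False)
```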
......@@ -81,7 +81,7 @@ The arguments of the constructor:
- `process` method processes one batch of data and data_samples.
- `compute_metrics` method computes the metrics from processed results.
#### IoUMetric.process
### IoUMetric.process
Parameters:
......@@ -92,7 +92,7 @@ Returns:
This method does not return anything, since the processed results are stored in `self.results` and used to compute the metrics once all batches have been processed.
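The same accumulate-then-compute pattern applies when writing a custom metric. A minimal sketch, assuming MMEngine's `BaseMetric` interface and a hypothetical `MyAccuracy` metric that is not part of MMSegmentation:

```python
import numpy as np
from mmengine.evaluator import BaseMetric

from mmseg.registry import METRICS


@METRICS.register_module()
class MyAccuracy(BaseMetric):
    """Hypothetical pixel-accuracy metric, for illustration only."""

    def process(self, data_batch: dict, data_samples: list) -> None:
        # Accumulate per-sample statistics; they are gathered across processes later.
        for sample in data_samples:
            pred = sample['pred_sem_seg']['data'].squeeze().cpu().numpy()
            label = sample['gt_sem_seg']['data'].squeeze().cpu().numpy()
            self.results.append((int((pred == label).sum()), int(label.size)))

    def compute_metrics(self, results: list) -> dict:
        # `results` is the concatenation of `self.results` from all processes.
        correct = sum(c for c, _ in results)
        total = sum(t for _, t in results)
        return dict(accuracy=correct / total)
```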
#### IoUMetric.compute_metrics
### IoUMetric.compute_metrics
Parameters:
......
......@@ -19,7 +19,7 @@ Component Customization
.. toctree::
:maxdepth: 1
add_modules.md
add_models.md
add_datasets.md
add_transforms.md
add_metrics.md
......
# Training Tricks
# \[WIP\] Training Tricks
MMSegmentation supports the following training tricks out of the box.
......
......@@ -6,7 +6,9 @@ The structure of this guide is as follows:
- [Data Transforms](#data-transforms)
- [Design of Data pipelines](#design-of-data-pipelines)
- [Customize data transformations](#customize-data-transformations)
- [Data loading](#data-loading)
- [Pre-processing](#pre-processing)
- [Formatting](#formatting)
## Design of Data pipelines
......@@ -125,48 +127,3 @@ The position of random contrast is in second or second to last(mode 0 or 1 below
- add: `inputs`, `data_sample`
- remove: keys specified by `meta_keys` (merged into the metainfo of data_sample), all other keys
## Customize data transformations
A customized data transformation must inherit from `BaseTransform` and implement the `transform` function.
Here we use a simple flipping transformation as an example:
```python
import mmcv
from mmcv.transforms import BaseTransform, TRANSFORMS


@TRANSFORMS.register_module()
class MyFlip(BaseTransform):

    def __init__(self, direction: str):
        super().__init__()
        self.direction = direction

    def transform(self, results: dict) -> dict:
        img = results['img']
        results['img'] = mmcv.imflip(img, direction=self.direction)
        return results
```
Thus, we can instantiate a `MyFlip` object and use it to process the data dict.
```python
import numpy as np
transform = MyFlip(direction='horizontal')
data_dict = {'img': np.random.rand(224, 224, 3)}
data_dict = transform(data_dict)
processed_img = data_dict['img']
```
Alternatively, we can use the `MyFlip` transformation in the data pipeline of a config file.
```python
pipeline = [
    ...
    dict(type='MyFlip', direction='horizontal'),
    ...
]
```
Note that if you want to use `MyFlip` in a config file, you must ensure that the file containing `MyFlip` is imported at runtime.
......@@ -11,11 +11,6 @@ datasets
.. automodule:: mmseg.datasets
:members:
samplers
^^^^^^^^^^
.. automodule:: mmseg.datasets.samplers
:members:
transforms
^^^^^^^^^^^^
.. automodule:: mmseg.datasets.transforms
......@@ -35,7 +30,7 @@ optimizers
:members:
mmseg.evaluation
--------------
-----------------
metrics
^^^^^^^^^^
......
# Model Zoo Statistics
- Number of papers: 47
- ALGORITHM: 36
- BACKBONE: 11
- Number of checkpoints: 612
- \[ALGORITHM\] [ANN](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/ann) (16 ckpts)
- \[ALGORITHM\] [APCNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/apcnet) (12 ckpts)
- \[BACKBONE\] [BEiT](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/beit) (2 ckpts)
- \[ALGORITHM\] [BiSeNetV1](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/bisenetv1) (11 ckpts)
- \[ALGORITHM\] [BiSeNetV2](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/bisenetv2) (4 ckpts)
- \[ALGORITHM\] [CCNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/ccnet) (16 ckpts)
- \[ALGORITHM\] [CGNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/cgnet) (2 ckpts)
- \[BACKBONE\] [ConvNeXt](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/convnext) (6 ckpts)
- \[ALGORITHM\] [DANet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/danet) (16 ckpts)
- \[ALGORITHM\] [DeepLabV3](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/deeplabv3) (41 ckpts)
- \[ALGORITHM\] [DeepLabV3+](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/deeplabv3plus) (42 ckpts)
- \[ALGORITHM\] [DMNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/dmnet) (12 ckpts)
- \[ALGORITHM\] [DNLNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/dnlnet) (12 ckpts)
- \[ALGORITHM\] [DPT](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/dpt) (1 ckpts)
- \[ALGORITHM\] [EMANet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/emanet) (4 ckpts)
- \[ALGORITHM\] [EncNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/encnet) (12 ckpts)
- \[ALGORITHM\] [ERFNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/erfnet) (1 ckpts)
- \[ALGORITHM\] [FastFCN](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/fastfcn) (12 ckpts)
- \[ALGORITHM\] [Fast-SCNN](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/fastscnn) (1 ckpts)
- \[ALGORITHM\] [FCN](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/fcn) (41 ckpts)
- \[ALGORITHM\] [GCNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/gcnet) (16 ckpts)
- \[BACKBONE\] [HRNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/hrnet) (37 ckpts)
- \[ALGORITHM\] [ICNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/icnet) (12 ckpts)
- \[ALGORITHM\] [ISANet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/isanet) (16 ckpts)
- \[ALGORITHM\] [K-Net](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/knet) (7 ckpts)
- \[BACKBONE\] [MAE](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/mae) (1 ckpts)
- \[ALGORITHM\] [Mask2Former](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/mask2former) (13 ckpts)
- \[ALGORITHM\] [MaskFormer](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/maskformer) (4 ckpts)
- \[BACKBONE\] [MobileNetV2](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/mobilenet_v2) (8 ckpts)
- \[BACKBONE\] [MobileNetV3](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/mobilenet_v3) (4 ckpts)
- \[ALGORITHM\] [NonLocal Net](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/nonlocal_net) (16 ckpts)
- \[ALGORITHM\] [OCRNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/ocrnet) (24 ckpts)
- \[ALGORITHM\] [PointRend](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/point_rend) (4 ckpts)
- \[BACKBONE\] [PoolFormer](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/poolformer) (5 ckpts)
- \[ALGORITHM\] [PSANet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/psanet) (16 ckpts)
- \[ALGORITHM\] [PSPNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/pspnet) (54 ckpts)
- \[BACKBONE\] [ResNeSt](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/resnest) (8 ckpts)
- \[ALGORITHM\] [SegFormer](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/segformer) (13 ckpts)
- \[ALGORITHM\] [Segmenter](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/segmenter) (5 ckpts)
- \[ALGORITHM\] [Semantic FPN](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/sem_fpn) (4 ckpts)
- \[ALGORITHM\] [SETR](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/setr) (7 ckpts)
- \[ALGORITHM\] [STDC](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/stdc) (4 ckpts)
- \[BACKBONE\] [Swin Transformer](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/swin) (6 ckpts)
- \[BACKBONE\] [Twins](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/twins) (12 ckpts)
- \[ALGORITHM\] [UNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/unet) (25 ckpts)
- \[ALGORITHM\] [UPerNet](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/upernet) (16 ckpts)
- \[BACKBONE\] [Vision Transformer](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/vit) (11 ckpts)
# Frequently Asked Questions (FAQ)
# \[WIP\] Frequently Asked Questions (FAQ)
We list some common issues faced by many users and their corresponding solutions here. Feel free to enrich the list if you find any frequent issues and have ways to help others solve them. If the contents here do not cover your issue, please create an issue using the [provided templates](https://github.com/open-mmlab/mmsegmentation/blob/master/.github/ISSUE_TEMPLATE/error-report.md/) and make sure you fill in all required information in the template.
......
# Deployment
# \[WIP\] Deployment
> ## [Try the new MMDeploy to deploy your model](https://mmdeploy.readthedocs.io/)
......
# Useful Tools
# \[WIP\] Useful Tools
Apart from training/testing scripts, we provide lots of useful tools under the `tools/` directory.
......
# Customize Datasets (to be updated)
# Add New Custom Datasets (to be updated)
## Customize datasets by reorganizing data
......
# Add Evaluation Metrics
# Add New Evaluation Metrics (to be updated)
# Add New Modules (to be updated)
The Chinese documentation is still in progress; please read the [English version](../../en/advanced_guides/add_models.md) first.
# Customize Models (to be updated)
## Customize optimizer
Assume you want to add an optimizer named `MyOptimizer`, which has arguments `a`, `b`, and `c`.
You first need to implement the new optimizer in a file, e.g., in `mmseg/core/optimizer/my_optimizer.py`:
```python
from mmcv.runner import OPTIMIZERS
from torch.optim import Optimizer


@OPTIMIZERS.register_module()
class MyOptimizer(Optimizer):

    def __init__(self, a, b, c):
        pass
```
Then add this module to `mmseg/core/optimizer/__init__.py` so that the registry can find the new module and add it:
```python
from .my_optimizer import MyOptimizer
```
Afterwards, you can use `MyOptimizer` in the `optimizer` field of config files.
In the configs, the optimizers are defined by the field `optimizer` like the following:
```python
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
```
To use your own optimizer, the field can be changed to:
```python
optimizer = dict(type='MyOptimizer', a=a_value, b=b_value, c=c_value)
```
We already support using all the optimizers implemented by PyTorch, and the only modification is to change the `optimizer` field of the config file. For example, if you want to use `Adam`, though the performance will drop a lot, the modification can be as follows:
```python
optimizer = dict(type='Adam', lr=0.0003, weight_decay=0.0001)
```
Users can directly set the arguments following the PyTorch [documentation](https://pytorch.org/docs/stable/optim.html?highlight=optim#module-torch.optim).
## Customize optimizer constructor
Some models may have parameter-specific settings for optimization, e.g., weight decay for BatchNorm layers.
Users can tune those fine-grained optimizer arguments by customizing the optimizer constructor.
```python
from mmcv.utils import build_from_cfg
from mmcv.runner import OPTIMIZER_BUILDERS

from .cocktail_optimizer import CocktailOptimizer


@OPTIMIZER_BUILDERS.register_module()
class CocktailOptimizerConstructor(object):

    def __init__(self, optim_wrapper_cfg, paramwise_cfg=None):
        pass

    def __call__(self, model):

        return my_optimizer
```
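The custom constructor can then be referenced from the config. Below is a rough sketch under the `mmcv.runner` convention used above; the optimizer type and its arguments are hypothetical.

```python
optimizer = dict(
    constructor='CocktailOptimizerConstructor',  # custom constructor registered above
    type='CocktailOptimizer',                    # hypothetical optimizer registered in OPTIMIZERS
    lr=0.01,                                     # hypothetical optimizer argument
    paramwise_cfg=dict(norm_decay_mult=0.))      # fine-grained settings interpreted by the constructor
```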
## Develop new components
There are mainly 2 types of components in MMSegmentation:
- backbone: usually a stack of convolutional layers for feature extraction, e.g., ResNet, HRNet.
- decoder head: the component that decodes the feature maps into the semantic segmentation result.
### Add a new backbone
Here we show how to develop a new backbone with an example of MobileNet.
1. Create a new file `mmseg/models/backbones/mobilenet.py`.
```python
import torch.nn as nn

from ..registry import BACKBONES


@BACKBONES.register_module()
class MobileNet(nn.Module):

    def __init__(self, arg1, arg2):
        pass

    def forward(self, x):  # should return a tuple
        pass

    def init_weights(self, pretrained=None):
        pass
```
2. Import the module in `mmseg/models/backbones/__init__.py`.
```python
from .mobilenet import MobileNet
```
3. Use it in your config file.
```python
model = dict(
    ...
    backbone=dict(
        type='MobileNet',
        arg1=xxx,
        arg2=xxx),
    ...)
```
### Add a new decoder head
In MMSegmentation, we provide the base class [BaseDecodeHead](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py) for all segmentation heads,
and all newly implemented decode heads should derive from it. Here we show how to develop a new head with the example of [PSPNet](https://arxiv.org/abs/1612.01105):
First, add a new decode head in `mmseg/models/decode_heads/psp_head.py`.
PSPNet implements a decode head for semantic segmentation. To implement a decode head, we only need to implement the following 3 functions in the new module:
```python
@HEADS.register_module()
class PSPHead(BaseDecodeHead):

    def __init__(self, pool_scales=(1, 2, 3, 6), **kwargs):
        super(PSPHead, self).__init__(**kwargs)

    def init_weights(self):
        pass

    def forward(self, inputs):
        pass
```
Next, users need to add the module in `mmseg/models/decode_heads/__init__.py` so that the corresponding registry can find and load it.
The config file of PSPNet is as follows:
```python
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained='pretrain_model/resnet50_v1c_trick-2cccc1ad.pth',
    backbone=dict(
        type='ResNetV1c',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        dilations=(1, 1, 2, 4),
        strides=(1, 2, 1, 1),
        norm_cfg=norm_cfg,
        norm_eval=False,
        style='pytorch',
        contract_dilation=True),
    decode_head=dict(
        type='PSPHead',
        in_channels=2048,
        in_index=3,
        channels=512,
        pool_scales=(1, 2, 3, 6),
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)))
```
### Add a new loss
Assume you want to add a new loss `MyLoss` for segmentation decoding.
To add a new loss function, users need to implement it in `mmseg/models/losses/my_loss.py`.
The decorator `weighted_loss` enables the loss to be weighted for each element.
```python
import torch
import torch.nn as nn

from ..builder import LOSSES
from .utils import weighted_loss


@weighted_loss
def my_loss(pred, target):
    assert pred.size() == target.size() and target.numel() > 0
    loss = torch.abs(pred - target)
    return loss


@LOSSES.register_module()
class MyLoss(nn.Module):

    def __init__(self, reduction='mean', loss_weight=1.0):
        super(MyLoss, self).__init__()
        self.reduction = reduction
        self.loss_weight = loss_weight

    def forward(self,
                pred,
                target,
                weight=None,
                avg_factor=None,
                reduction_override=None):
        assert reduction_override in (None, 'none', 'mean', 'sum')
        reduction = (
            reduction_override if reduction_override else self.reduction)
        loss = self.loss_weight * my_loss(
            pred, target, weight, reduction=reduction, avg_factor=avg_factor)
        return loss
```
Then users need to add it in `mmseg/models/losses/__init__.py`:
```python
from .my_loss import MyLoss, my_loss
```
To use it, modify the `loss_xxx` field. Then you need to modify the `loss_decode` field in the decode head.
`loss_weight` can be used to balance different losses.
```python
loss_decode=dict(type='MyLoss', loss_weight=1.0))
```
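For reference, `loss_decode` also accepts a list of loss configs, with `loss_weight` balancing the terms. A hedged example combining the built-in cross-entropy loss with the hypothetical `MyLoss`:

```python
loss_decode=[
    dict(type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
    dict(type='MyLoss', loss_weight=0.5)]
```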
# Customize Data Pipelines (to be updated)
# Add New Data Augmentation (to be updated)
## Design of Data pipelines
Following typical conventions, we use `Dataset` and `DataLoader` for data loading with multiple workers. `Dataset` returns a dict of data items corresponding to the arguments of the model's forward method.
Since the images in semantic segmentation may be of different sizes, we introduce a new `DataContainer` type in MMCV to help collect and distribute data of different sizes.
See [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/parallel/data_container.py) for more details.
The data preparation pipeline and the dataset are decoupled. Usually a dataset defines how to process the annotations, while a data pipeline defines all the steps to prepare a data dict. A pipeline consists of a sequence of operations; each operation takes a dict as input and outputs a new dict for the next transform.
The operations are categorized into data loading, pre-processing, formatting and test-time augmentation.
Here is a pipeline example for PSPNet:
```python
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 1024)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 1024),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
```
For each operation, we list the related dict fields that are added/updated/removed.
### Data loading
`LoadImageFromFile`
- add: img, img_shape, ori_shape
`LoadAnnotations`
- add: gt_semantic_seg, seg_fields
### Pre-processing
`Resize`
- add: scale, scale_idx, pad_shape, scale_factor, keep_ratio
- update: img, img_shape, \*seg_fields
`RandomFlip`
- add: flip
- update: img, \*seg_fields
`Pad`
- add: pad_fixed_size, pad_size_divisor
- update: img, pad_shape, \*seg_fields
`RandomCrop`
- update: img, pad_shape, \*seg_fields
`Normalize`
- add: img_norm_cfg
- update: img
`SegRescale`
- update: gt_semantic_seg
`PhotoMetricDistortion`
- update: img
### Formatting
`ToTensor`
- update: specified by `keys`
`ImageToTensor`
- update: specified by `keys`
`Transpose`
- update: specified by `keys`
`ToDataContainer`
- update: specified by `keys`
`DefaultFormatBundle`
- update: img, gt_semantic_seg
`Collect`
- add: img_meta (the keys of img_meta are specified by `meta_keys`)
- remove: all other keys except for those specified by `keys`
### Test-time augmentation
`MultiScaleFlipAug`
## Extend and use custom pipelines
1. Write a new pipeline in any file, e.g., `my_pipeline.py`. It takes a dict as input and returns a dict.
```python
from mmseg.datasets import PIPELINES


@PIPELINES.register_module()
class MyTransform:

    def __call__(self, results):
        results['dummy'] = True
        return results
```
2. Import the new class.
```python
from .my_pipeline import MyTransform
```
3. Use it in config files.
```python
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 1024)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='MyTransform'),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
```
The Chinese documentation is still in progress; please read the [English version](../../en/advanced_guides/add_transform.md) first.
......@@ -43,7 +43,7 @@ dict(
### Data preprocessor to model
Although the data preprocessor and the model are drawn separately in the [figure above](#数据流概述), the data preprocessor is part of the model, so the data preprocessor section can be found in the [model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html). ***([Chinese link to be updated](https://mmsegmentation.readthedocs.io/zh_CN/dev-1.x/advanced_guides/models.html))***
Although the data preprocessor and the model are drawn separately in the [figure above](##数据流概述), the data preprocessor is part of the model, so the data preprocessor section can be found in the [model tutorial](https://mmsegmentation.readthedocs.io/en/dev-1.x/advanced_guides/models.html). ***([Chinese link to be updated](https://mmsegmentation.readthedocs.io/zh_CN/dev-1.x/advanced_guides/models.html))***
The return value of the data preprocessor is a dictionary containing `inputs` and `data_samples`, where `inputs` is the 4D tensor of the batched images and `data_samples` carries extra meta information added during data preprocessing. When passed to the network, the dictionary is unpacked into these two values. The following pseudocode shows the return value of the data preprocessor and the input values of the model.
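A minimal sketch of this hand-off, with illustrative names only and following MMEngine's typical `train_step` flow:

```python
# Illustrative sketch: the preprocessor returns dict(inputs=..., data_samples=...),
# which is then unpacked into the two arguments of the model call.
data = model.data_preprocessor(data, training=True)
losses = model(**data, mode='loss')
```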
......