RT-DETRv2 train fails to start due to empty Dataloaders #557

Open
OfirDataloopAI opened this issue Mar 5, 2025 · 0 comments

Describe the bug
When I try to train "RT-DETRv2" with "rtdetrv2_pytorch/tools/train.py", training fails with the following output:

Not init distributed mode.
Start training
Load PResNet101 state_dict
Initial lr: [1e-06, 0.0001, 0.0001]
building train_dataloader with batch_size=16...
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
building val_dataloader with batch_size=32...
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Resume checkpoint from C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_r101vd_6x_coco_from_paddle.pth
Not load model.state_dict
Not load criterion.state_dict
Not load postprocessor.state_dict
Load ema.state_dict
Not load optimizer.state_dict
Not load lr_scheduler.state_dict
Not load lr_warmup_scheduler.state_dict
number of trainable parameters: 76527756
Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_pytorch\tools\train.py", line 69, in <module>
    main(args)
  File "C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_pytorch\tools\train.py", line 35, in main
    solver.fit()
  File "C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_pytorch\src\solver\det_solver.py", line 38, in fit
    train_stats = train_one_epoch(
  File "C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_pytorch\src\solver\det_engine.py", line 38, in train_one_epoch
    for i, (samples, targets) in enumerate(metric_logger.log_every(data_loader, print_freq, header)):
  File "C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_pytorch\src\misc\logger.py", line 238, in log_every
    header, total_time_str, total_time / len(iterable)))
ZeroDivisionError: float division by zero
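
The last frame divides by `len(iterable)`, so this error is what one would expect whenever the loader reports zero batches. Below is a minimal sketch of that failure and of one possible guard. `average_iter_time` is a hypothetical helper written for illustration, not a function from the RT-DETR repository:

```python
def average_iter_time(total_time: float, num_batches: int) -> float:
    """Return seconds per iteration, or 0.0 when there were no batches.

    Hypothetical helper: it mirrors the `total_time / len(iterable)`
    expression from the traceback but refuses to divide by zero.
    """
    if num_batches == 0:
        return 0.0
    return total_time / num_batches


# Stand-in for a loader that yields nothing; len() is 0.
empty_loader = []

try:
    # Same shape as the failing expression in the traceback.
    0.0 / len(empty_loader)
except ZeroDivisionError as exc:
    print(f"unguarded division failed: {exc}")

# The guarded version returns a value instead of raising.
print(average_iter_time(0.0, len(empty_loader)))
```

A guard like this would only hide the symptom; the loader would still yield no batches, so nothing would be trained.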

I also see that the loaders are empty even though the datasets load successfully:

Image

What might be the issue?
It happens for both my train_loader and val_loader.
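
One possibility worth ruling out is the interaction between dataset size and batch size. For a map-style dataset, PyTorch's `DataLoader` reports `len(dataset) // batch_size` batches when `drop_last=True`, and the ceiling of that ratio otherwise. The sketch below reproduces that arithmetic in plain Python for the sizes in this report. Whether `drop_last` is enabled for either loader in this config is an assumption to check, and `num_batches` is a hypothetical helper:

```python
def num_batches(dataset_len: int, batch_size: int, drop_last: bool) -> int:
    """Number of batches a batched loader yields for a map-style dataset."""
    if drop_last:
        # The incomplete final batch is discarded.
        return dataset_len // batch_size
    # The incomplete final batch is kept (ceiling division).
    return (dataset_len + batch_size - 1) // batch_size


# Sizes from this report: 1 item per split, batch sizes 16 and 32.
# The drop_last values here are assumptions, not read from the config.
train_batches = num_batches(1, 16, drop_last=True)
val_batches = num_batches(1, 32, drop_last=False)
print(train_batches, val_batches)
```

Under these assumed settings the train loader would have zero batches while the val loader would have one, so `drop_last` alone would not explain an empty val loader.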

To Reproduce
I train "RT-DETRv2" with "rtdetrv2_pytorch/tools/train.py", using the following JSON annotation files, which contain 1 item for train and 1 item for validation:

instances_train.json
instances_val.json
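
Because each file holds a single item, a quick check of the annotation files can confirm that the `images` list is not empty and that every annotation points at a listed image. This is a minimal sketch, assuming the standard COCO top-level keys (`images`, `annotations`, `categories`). `summarize_coco` is a hypothetical helper, and the file names are taken to be in the current directory:

```python
import json
from pathlib import Path


def summarize_coco(ann_path: str) -> dict:
    """Count the top-level COCO entries in an annotation file."""
    data = json.loads(Path(ann_path).read_text(encoding="utf-8"))
    image_ids = {img["id"] for img in data.get("images", [])}
    annotated_ids = {ann["image_id"] for ann in data.get("annotations", [])}
    return {
        "images": len(image_ids),
        "annotations": len(data.get("annotations", [])),
        "categories": len(data.get("categories", [])),
        # Distinct image ids referenced by annotations but absent from "images".
        "orphan_image_ids": len(annotated_ids - image_ids),
    }


if __name__ == "__main__":
    # Assumed locations; adjust to wherever the files actually live.
    for name in ("instances_train.json", "instances_val.json"):
        if Path(name).exists():
            print(name, summarize_coco(name))
```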

I then update the "ann_file" and "img_folder" fields in "C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_pytorch\configs\dataset\coco_detection.yml" as required.

Then I run "rtdetrv2_pytorch/tools/train.py" with the following args:

args.config = r"C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_pytorch\configs\rtdetrv2\rtdetrv2_r101vd_6x_coco.yml"
args.resume = r"C:\Users\User\PycharmProjects\RT-DETR\rtdetrv2_r101vd_6x_coco_from_paddle.pth"

Python env:

absl-py==2.1.0
contourpy==1.3.1
cycler==0.12.1
faster-coco-eval==1.6.5
filelock==3.17.0
fonttools==4.56.0
fsspec==2025.2.0
grpcio==1.70.0
Jinja2==3.1.5
kiwisolver==1.4.8
Markdown==3.7
MarkupSafe==3.0.2
matplotlib==3.10.1
mpmath==1.3.0
narwhals==1.29.0
networkx==3.4.2
numpy==2.2.3
packaging==24.2
pandas==2.2.3
pillow==11.1.0
plotly==6.0.0
protobuf==6.30.0
pycocotools==2.0.8
pyparsing==3.2.1
python-dateutil==2.9.0.post0
pytz==2025.1
PyYAML==6.0.2
scipy==1.15.2
six==1.17.0
sympy==1.13.1
tensorboard==2.19.0
tensorboard-data-server==0.7.2
torch==2.6.0
torchvision==0.21.0
typing_extensions==4.12.2
tzdata==2025.1
Werkzeug==3.1.3

OfirDataloopAI changed the title from "RT-DETRv2 empty Dataloaders" to "RT-DETRv2 train fails to start due to empty Dataloaders" on Mar 5, 2025