streaming export #9084

Open: Eldies wants to merge 28 commits into develop from dl/stream-export

Conversation

Eldies (Contributor) commented Feb 10, 2025

Motivation and context

When a dataset is exported, all of its annotations are held in RAM, which can be a problem when they are large.
This PR enables streaming export of task or job data for the YOLO formats.

Depends on cvat-ai/datumaro#81, cvat-ai/datumaro#90, and cvat-ai/datumaro#94.
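
For readers unfamiliar with the idea, here is a minimal sketch of what "streaming" means for the memory concern above. It is not the code from this PR or from Datumaro: `Box`, `iter_boxes`, and `write_yolo_labels` are hypothetical names, and the one-line-per-box layout (`class x_center y_center width height`) is the common YOLO label convention. The point is only that a generator feeds the writer one annotation at a time, so the full annotation list never has to exist in RAM.

```python
import io
from typing import Iterable, Iterator, NamedTuple, TextIO


class Box(NamedTuple):
    # Hypothetical annotation record; not a CVAT or Datumaro type.
    label_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def iter_boxes(count: int) -> Iterator[Box]:
    """Yield annotations one at a time instead of building a list."""
    for i in range(count):
        yield Box(i % 3, 0.5, 0.5, 0.25, 0.25)


def write_yolo_labels(boxes: Iterable[Box], out: TextIO) -> int:
    """Write one YOLO-style line per box as it arrives; return the line count."""
    written = 0
    for box in boxes:
        out.write(
            f"{box.label_id} {box.x_center:.6f} {box.y_center:.6f} "
            f"{box.width:.6f} {box.height:.6f}\n"
        )
        written += 1
    return written


if __name__ == "__main__":
    buffer = io.StringIO()
    print(write_yolo_labels(iter_boxes(4), buffer), "lines written")
```

Peak memory in this sketch depends on a single `Box`, not on the number of annotations, which is the property the PR is after.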

How has this been tested?

Checklist

  • I submit my changes into the develop branch
  • I have created a changelog fragment
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • I have linked related issues (see GitHub docs)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@Eldies Eldies force-pushed the dl/stream-export branch 6 times, most recently from bdda455 to fa321ce Compare February 12, 2025 06:37
@Eldies Eldies changed the base branch from develop to dl/update-datumaro February 12, 2025 06:38
codecov-commenter commented Feb 12, 2025

Codecov Report

Attention: Patch coverage is 97.36842% with 1 line in your changes missing coverage. Please review.

Project coverage is 73.34%. Comparing base (fa4934d) to head (2e56785).
Report is 2 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9084      +/-   ##
===========================================
+ Coverage    73.32%   73.34%   +0.01%     
===========================================
  Files          449      449              
  Lines        45875    45882       +7     
  Branches      3915     3915              
===========================================
+ Hits         33640    33652      +12     
+ Misses       12235    12230       -5     
Components     Coverage Δ
cvat-ui        77.09% <ø> (-0.01%) ⬇️
cvat-server    70.34% <97.36%> (+0.03%) ⬆️

# Conflicts:
#	cvat/requirements/base.in
#	cvat/requirements/base.txt
# Conflicts:
#	cvat/requirements/base.in
#	cvat/requirements/base.txt
Base automatically changed from dl/update-datumaro to develop March 4, 2025 14:54
Eldies added 4 commits March 4, 2025 16:09
# Conflicts:
#	cvat/apps/dataset_manager/bindings.py
#	cvat/apps/dataset_manager/formats/coco.py
#	cvat/apps/dataset_manager/formats/imagenet.py
#	cvat/apps/dataset_manager/formats/yolo.py
#	cvat/requirements/base.in
#	cvat/requirements/base.txt
@Eldies Eldies requested a review from nmanovic as a code owner March 4, 2025 20:56
zhiltsov-max (Contributor) commented Mar 6, 2025

I've noticed that export in streaming mode is slower for the COCO formats, by about 2-3x in my cases. As I understand it, the problem is in coco/exporter.py:142, where the regular json library is used instead of the optimized orjson. I suppose orjson isn't directly compatible with the JSON stream used in the current implementation, but orjson does have other relevant functionality, orjson.Fragment: https://github.com/ijl/orjson?tab=readme-ov-file#fragment. Please look into this.
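
One possible direction, as a rough sketch only: serialize each element with `orjson.dumps` (which returns `bytes`) and write the array delimiters by hand. This is not what `coco/exporter.py` does, and it does not use `orjson.Fragment` (which embeds already-serialized JSON in a document passed to `orjson.dumps`). `dumps_bytes` and `write_json_array` are hypothetical helpers, and the fallback to the standard `json` module is there only so the snippet runs without orjson installed.

```python
import io
import json
from typing import Any, BinaryIO, Iterable

try:
    import orjson  # optional third-party dependency

    def dumps_bytes(obj: Any) -> bytes:
        # orjson.dumps returns compact UTF-8 bytes.
        return orjson.dumps(obj)

except ImportError:

    def dumps_bytes(obj: Any) -> bytes:
        # Fallback so the sketch runs without orjson; slower.
        return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def write_json_array(items: Iterable[Any], out: BinaryIO) -> int:
    """Stream items into a JSON array, one element at a time; return the count."""
    out.write(b"[")
    count = 0
    for item in items:
        if count:
            out.write(b",")
        out.write(dumps_bytes(item))
        count += 1
    out.write(b"]")
    return count


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_json_array(({"id": i, "bbox": [0, 0, 10, 10]} for i in range(3)), buffer)
    print(buffer.getvalue().decode("utf-8"))
```

Whether this recovers the 2-3x difference would need to be measured against the real exporter.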

sonarqubecloud bot commented Mar 6, 2025

3 participants