
[neural search] fix the bug of reading files when calculating the recall scores #7836

Merged
merged 1 commit into from
Jan 12, 2024



@shenghwa shenghwa commented Jan 12, 2024

PR types

[ Bug fixes ]

PR changes

[ Scripts ]

Description

In the file 'applications/neural_search/recall/in_batch_negative/evaluate.py':
If (the number of rows in 'similar_text_pair') mod ('recall_num') is not 0, the current code still appends the leftover rows to the 'rs' list, so the lists stored in 'rs' have inconsistent lengths. This does not raise an error, but the result is logically incorrect.

In the file 'applications/neural_search/recall/simcse/evaluate.py':
The current loop never appends the last 'relevance_labels' list to the 'rs' list.
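
A minimal sketch of the grouping behaviour described above. It is not the code from this PR. The function name `group_relevance_labels` is hypothetical, and dropping an incomplete trailing group is an assumption about the intended behaviour. It shows one way to make sure the last full group is saved while every list in the result has the same length:

```python
def group_relevance_labels(labels, recall_num):
    """Split a flat list of 0/1 relevance labels into groups of `recall_num`.

    Hypothetical helper for illustration only. Assumption: an incomplete
    trailing group is dropped so that all groups have equal length.
    """
    if recall_num <= 0:
        raise ValueError("recall_num must be positive")

    groups = []
    current = []
    for label in labels:
        current.append(label)
        # Flush as soon as a group is full, so the last full group is
        # saved inside the loop and never left behind.
        if len(current) == recall_num:
            groups.append(current)
            current = []
    # Any labels still in `current` form a partial group and are not
    # appended (assumption, see docstring).
    return groups
```

With `recall_num=3`, six labels produce two groups of three. A seventh label would be left out rather than stored as a one-element list.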

The same corrected code applies to both files above. I think the following files have a similar issue:

examples/semantic_indexing/evaluate.py
applications/question_answering/supervised_qa/faq_system/evaluate.py
applications/question_answering/supervised_qa/faq_finance/evaluate.py
applications/text_classification/multi_class/retrieval_based/evaluate.py
applications/text_classification/hierarchical/retrieval_based/evaluate.py

I have not modified these files because I have not checked how they are used in their modules. Please check them. Thanks.


paddle-bot bot commented Jan 12, 2024

Thanks for your contribution!


CLAassistant commented Jan 12, 2024

CLA assistant check
All committers have signed the CLA.


codecov bot commented Jan 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (a1d1aee) 56.95% compared to head (f856fed) 56.95%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7836      +/-   ##
===========================================
- Coverage    56.95%   56.95%   -0.01%     
===========================================
  Files          587      587              
  Lines        88628    88628              
===========================================
- Hits         50482    50480       -2     
- Misses       38146    38148       +2     

☔ View full report in Codecov by Sentry.

@w5688414 w5688414 self-requested a review January 12, 2024 08:08

@w5688414 w5688414 left a comment


LGTM

@w5688414 w5688414 changed the title fix the bug of reading files when calculating the recall scores [neural search] fix the bug of reading files when calculating the recall scores Jan 12, 2024
@w5688414 w5688414 merged commit 8bc06b0 into PaddlePaddle:develop Jan 12, 2024

3 participants