Simplify monoT5 and monoBERT boilerplate #80
Comments
Yeah, I think we should do that!
I agree!
EDIT: Outdated comment
I'm a bit confused by the naming of this. Is the goal to have a set of predefined rerankers with defaults:
from pygaggle.rerank.pretrained import monoBERT, monoT5
reranker = monoBERT() or monoT5() -> Reranker # defaults to castorini/monobert-large-msmarco and castorini/monot5-base-msmarco
# similar to how huggingface transformers has:
AutoModel.from_pretrained('monoBERT')
or just to create general constructors for each reranker, similar to what's in:
def construct_seq_class_transformer(options: DocumentRankingEvaluationOptions
                                    ) -> Reranker:
    model = AutoModelForSequenceClassification.from_pretrained(options.model, from_tf=options.from_tf)
    device = torch.device(options.device)
    model = model.to(device).eval()
    tokenizer = AutoTokenizer.from_pretrained(options.tokenizer_name)
    return SequenceClassificationTransformerReranker(model, tokenizer)
TLDR why name
There's a lot of boilerplate here: https://github.com/castorini/pygaggle#a-simple-reranking-example
Can we fold all of that into the constructor of the class? E.g., so we're left with:
or
Make model_name, tokenizer_name, etc. configurable with sensible defaults.
So simple reranking gets boiled down to
@rodrigonogueira4 @rodrigonogueira4 thoughts?
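To make the proposal concrete, here is a minimal sketch of what "fold the boilerplate into the constructor" could look like. It is not pygaggle's actual API: the class name `MonoT5Sketch`, the `loader` parameter, and `placeholder_loader` are hypothetical, and the `t5-base` tokenizer default is an assumption. Only the `castorini/monot5-base-msmarco` default comes from the discussion in this thread. The placeholder loader stands in for the real model and tokenizer loading so the sketch runs without downloading anything.

```python
from typing import Any, Callable, Tuple

# Hypothetical loader signature: (model_name, tokenizer_name, device) -> (model, tokenizer).
Loader = Callable[[str, str, str], Tuple[Any, Any]]


def placeholder_loader(model_name: str, tokenizer_name: str, device: str) -> Tuple[Any, Any]:
    """Stand-in for the loading boilerplate so the sketch runs offline.

    A real loader would call the from_pretrained methods, move the model to
    the requested device, and switch it to eval mode.
    """
    model = {"name": model_name, "device": device, "eval": True}
    tokenizer = {"name": tokenizer_name}
    return model, tokenizer


class MonoT5Sketch:
    """Hypothetical reranker wrapper; not pygaggle's actual class."""

    def __init__(
        self,
        model_name: str = "castorini/monot5-base-msmarco",  # default named in this thread
        tokenizer_name: str = "t5-base",  # assumed default, for illustration only
        device: str = "cpu",
        loader: Loader = placeholder_loader,
    ) -> None:
        self.model_name = model_name
        self.tokenizer_name = tokenizer_name
        self.device = device
        # All of the loading boilerplate lives behind this single call.
        self.model, self.tokenizer = loader(model_name, tokenizer_name, device)


# Defaults cover the common case; any argument can still be overridden.
reranker = MonoT5Sketch()
custom = MonoT5Sketch(model_name="my-org/my-monot5", device="cuda:0")
```

Passing the loader in as an argument is just one design option. It keeps the defaults in one place and makes the constructor easy to test without touching the network.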