NotImplementedError: [E894] The 'noun_chunks' syntax iterator is not implemented for language 'ru'. #204
Thank you @gremur
Thank you @ceteri. If you'd like to repeat my test, I used the following code:
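The original snippet was not preserved in this thread; below is a minimal reconstruction of the kind of reproduction described, assuming a short hypothetical Russian sample text and the "ru_core_news_sm" model mentioned later in the thread.

```python
import spacy
import pytextrank  # noqa: F401 -- importing registers the "textrank" pipeline factory

# hypothetical sample text; the original Russian text was not preserved
text = "Москва является столицей Российской Федерации."

nlp = spacy.load("ru_core_news_sm")
nlp.add_pipe("textrank")

# NotImplementedError [E894] is raised here: the Russian models
# do not implement the noun_chunks syntax iterator
doc = nlp(text)

for phrase in doc._.phrases:
    print(phrase.text, phrase.rank)
```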
The error occurs at this line of code: 'doc = nlp(text)'
Thank you @gremur, that code snippet really helps us debug. This now extracts two entities and runs without exceptions, although I have a hunch that more structural work will be needed to get good results in cases where spaCy does not provide noun chunks.
I'm evaluating across the different algorithms we've implemented, using this script:
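The evaluation script itself is not shown in the thread; the following is a rough sketch of how such a comparison might look, assuming the four pipeline names that pytextrank registers ("textrank", "positionrank", "biasedtextrank", "topicrank") and a placeholder sample text.

```python
import spacy
import pytextrank  # noqa: F401 -- importing registers the pytextrank pipeline factories

# placeholder; the original test text from the issue was not preserved
text = "Москва является столицей Российской Федерации."

for name in ("textrank", "positionrank", "biasedtextrank", "topicrank"):
    nlp = spacy.load("ru_core_news_sm")
    nlp.add_pipe(name)
    try:
        doc = nlp(text)
        print(name, [phrase.text for phrase in doc._.phrases])
    except Exception as exc:
        print(name, "raised:", type(exc).__name__, exc)
```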
With the following results:
TopicRank raises an exception, though the other algorithms produce results, albeit limited ones. @tomaarsen, would it be possible for you to take a look? @gremur, overall would you expect many more entities to be extracted?
Perhaps I did not provide the best example text, but some 'noun chunks' can be found (marked below in bold italic).
The same problem happens in Chinese. Is there any progress?
hi @k0286, based on @gremur's feedback we made the need for noun chunks optional. so, yes, this specific request has been completed. even so, the results depend on the quality of the other pipeline components prior to the textgraph analysis. in the case of mandarin, as far as i'm aware we've never had any feedback yet about its use. if you've got a PR, we're ready to work on integration!
Thank you!
It seems to me that nlp.add_pipe("textrank") requires "noun chunks", which raises NotImplementedError for language models where noun chunks have not been implemented. I got NotImplementedError with the "ru_core_news_lg" and "ru_core_news_sm" spaCy models.
The proposal is to make the use of "noun chunks" optional to prevent such errors.
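As a sketch of what making noun chunks optional could look like (an illustration only, not the actual patch that landed), the iterator can be guarded so that languages without a noun_chunks implementation simply contribute no chunks:

```python
from spacy.tokens import Doc

def safe_noun_chunks(doc: Doc):
    """Yield noun chunks when the language implements them, otherwise nothing."""
    try:
        yield from doc.noun_chunks
    except NotImplementedError:
        # e.g. spaCy error [E894] for the Russian models
        return
```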