locale negotiation is incorrect for i18n routing #18676
Comments
this might be something formatjs can provide actually |
@Timer I believe u can use the ResolveLocale(
["fr", "en"],
["fr-XX", "en"],
{ localeMatcher: "best fit" },
[],
{},
() => "en"
) should yield {
"locale":"fr",
"dataLocale":"fr"
} |
I have the same issue here... very very annoying. |
I've just encountered this in a real world scenario, I have Italian set up as an Here's a link: https://i18n.wscandy.vercel.app/ |
@ijjk You have assigned yourself to this issue, but to save you time, I tried to understand how NextJS took into account the Accept-Language header, and realized that this part was managed by the hapi/accept module: So, I've written a small script that could replace the const regex = /((([a-zA-Z]+(-[a-zA-Z0-9]+){0,2})|\*)(;q=[0-1](\.[0-9]+)?)?)*/g;
// Parse an Accept-Language header into { code, script, region, quality } entries,
// sorted by descending quality.
function parse(al){
  // The global match also yields empty strings between tokens; they are skipped below.
  const strings = (al || "").match(regex);
  return strings.map(m => {
    if(!m){
      return;
    }
    const bits = m.split(';');
    const ietf = bits[0].split('-');
    const hasScript = ietf.length === 3;
    return {
      code: ietf[0],
      script: hasScript ? ietf[1] : null,
      region: hasScript ? ietf[2] : ietf[1],
      quality: bits[1] ? parseFloat(bits[1].split('=')[1]) : 1.0
    };
  }).filter(r => r).sort((a, b) => b.quality - a.quality);
}
// Return the first supported language that matches the header, or null if none does.
function pick(supportedLanguages, acceptLanguage) {
  const parsedAcceptedLanguage = parse(acceptLanguage);
  const supported = supportedLanguages.map(support => {
    const bits = support.split('-');
    const hasScript = bits.length === 3;
    return {
      code: bits[0],
      script: hasScript ? bits[1] : null,
      region: hasScript ? bits[2] : bits[1]
    };
  });
  for (let i = 0; i < parsedAcceptedLanguage.length; i++) {
    const lang = parsedAcceptedLanguage[i];
    const langCode = lang.code.toLowerCase();
    const langRegion = lang.region ? lang.region.toLowerCase() : lang.region;
    const langScript = lang.script ? lang.script.toLowerCase() : lang.script;
    // Every supported language sharing the requested language code.
    let possible = []
    for (let j = 0; j < supported.length; j++) {
      const supportedCode = supported[j].code.toLowerCase();
      const supportedScript = supported[j].script ? supported[j].script.toLowerCase() : supported[j].script;
      const supportedRegion = supported[j].region ? supported[j].region.toLowerCase() : supported[j].region;
      if (langCode === supportedCode) {
        possible.push({ lang: supportedLanguages[j], supportedScript, supportedRegion });
      }
    }
    if (possible.length > 0) {
      // Prefer a region match, then a script match; otherwise fall back to the first candidate.
      const regionsFilter = possible.filter(({ supportedRegion }) => supportedRegion == langRegion)
      if (regionsFilter.length > 0) {
        const scriptFilter = regionsFilter.filter(({ supportedScript }) => supportedScript == langScript)
        if (scriptFilter.length > 0) {
          return scriptFilter[0].lang
        }
        return regionsFilter[0].lang
      } else {
        return possible[0].lang
      }
    }
  }
  return null;
} Which, once implemented/imported in the server (line 328), would resolve into: pick(i18n.locales, req.headers['accept-language']) I've tested my script on multiple cases, and I found it always picked the right option (but I encourage you to try it yourself). The only caveat is that it works better when general language codes are listed before region/script-specific languages: for example, to make sure
|
@arguiot unfortunately your locale negotiation algorithm is not correct. There are currently 2 algorithms: RFC4647 is pretty rudimentary while UTS35 is more sophisticated and handles legacy aliases (e.g |
@longlho Which part is wrong exactly? The links you gave aren't algorithms at all, they are specifications (which is quite different). Currently, the problem is not about parsing I'm not saying my algorithm is perfect, because it isn't, and that's not the goal. The problem with specifications is that not everyone follows them. The language picker has to be very flexible to make sure that it works for everyone. There's room for improvement (so if you want to improve my work or someone else's work, please do). But I think it's better to quickly fix the issue and improve the language detection over time. I believe it's always better to have something pretty reliable right now that will get better than having something absolutely perfect but in 3 months. |
I'm not sure you understand locale negotiation. You need language matching data from CLDR to resolve them correctly. Just parsing locales and doing matching based on lang/region/script is not enough. The 2 algorithms are described in the 2 specs if you read them. "Not everyone following them" because they don't really understand its complexity and choose to just hack together solutions that work in the 20 tests they have. I don't recommend just hacking a solution. Either use a library that follows the algorithm or implement the algorithm itself. Formatjs already follows the spec so either use that or libicu |
I probably don't understand locale negotiation as well as you. I would like to understand why parsing locales and doing matching based on lang/region/script is not enough in our case. The problem with what you're proposing is that it's not related to Moreover, we would have to parse the I agree on the fact that it's better to use FormatJS, but we first need to solve the Finally, the ResolveLocale(
["en-CA", "fr-CA"],
["fr-FR"],
{ localeMatcher: "best fit" },
[],
{},
() => "en"
) In that case, it should return |
Q weightings determine the order of the requested locale array. And in your example, yes, it should return "en". You're assuming that if you understand a language in a certain region then you understand it in all regions, which is not true for traditional vs simplified Chinese, or Latin American Spanish vs Spain Spanish. Using lang/region/script is not enough, because zh-TW matches with zh-Hant, not zh-Hans or zh, and that data lives in CLDR. |
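To illustrate the point about CLDR data, here is a minimal sketch using the built-in Intl.Locale API, which expands a tag with CLDR "likely subtags". The helper name likelyScript is made up for this example, and it assumes a JavaScript runtime that ships ICU locale data (recent Node.js and browsers do). It only shows why comparing raw lang/region/script subtags is not enough; it is not the matching algorithm from either spec.

```javascript
// Hypothetical helper: return the script that CLDR likely-subtags data
// associates with a language tag, using the built-in Intl.Locale API.
function likelyScript(tag) {
  return new Intl.Locale(tag).maximize().script;
}

// zh-TW carries no script subtag, yet it expands to Traditional (Hant),
// while bare zh expands to Simplified (Hans).
for (const tag of ['zh', 'zh-TW', 'zh-CN']) {
  console.log(tag, '->', new Intl.Locale(tag).maximize().toString());
}
```

A matcher that only compares the subtags literally present in zh-TW and zh-Hant would see no overlap beyond the language code, which is the gap described above.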
@longlho I know what Q weightings are, but does the order matter in the ResolveLocale function? This is a very tricky point, and I think we should let the developer decide. Maybe by letting the developer use their own resolver in The problem is that not every browser will send the header with the region and the code: Chrome will do it but Safari will only send For global websites, having a perfect language resolver matters a lot. But for country-specific websites, like in Canada for example, only the language code matters. I would like to have your opinion on this proposal. For me I think it would solve the problem (and close our little debate 😄). |
The order absolutely matters in
That's not entirely correct. Locale preferences are determined by both browser user settings and OSes. It contains more than just lang/script/region. If u read the UTS35 LDML spec it contains preferences like calendar, numbering systems and such (e.g
Canada has 2 official languages: English & French. CA also uses Celsius instead of Fahrenheit. There are a lot of differences in Having worked with large engineering orgs, I have not seen the need to have "custom" i18n functionalities that are not encompassed by Unicode or ICU due to high learning curve and complexity, so providing something standard out of the box is the norm. |
…cale This is an attempt to fix vercel#18676 Signed-off-by: Arthur Guiot <[email protected]>
I must admit that you convinced me. I created a Pull Request (#19460) based on your code. It would be great if you could review the code and make sure everything is working properly. |
Experiencing the same issue on a Next 10 site using built-in localization. The client using iOS Safari (I can reproduce it on my iPad) is not redirected to /fr but keeps seeing the content of the site in the default language (en). Works on any other browser though. |
Given that it seems this issue isn't gonna be resolved anytime soon: did somebody at least find a workaround? Some way to hijack the locale negotiation routine maybe? |
I wish, I just ended up adding a language button to the bottom of my site which lets you flip from English to Spanish. It's amazing there is no work being done to fix this, if this was more than a personal site I'd consider it a major issue. |
@hohl technically it is possible if you don't use built in next.js locale routing and just read directly from headers to negotiate |
@arguiot How did you solve it? |
Issue also presents itself in Firefox. I surely thought this was me not using Next.js properly. It's super odd that this bug has been allowed to persist since i18n was implemented in Next.js. |
the PR is pretty old but that's the general idea. If you're using vercel the issue is upstream routing at the CDN level so that patch won't help. |
I can't tell if this has been reported yet, but it seems this is happening specifically with dynamic routes. If you create a new next.js project with a single Since this seems to be related to the routing, it would be nice to have an option to use |
Any updates? |
Safari does NOT work for us: when pushing the locale into router.push, it always goes back to English. Chrome works just fine. Using Next v11.1.2 |
Here is a simple expression of the problem:
|
One possible workaround is to include all locales, like so:
It creates a massive overhead for static page generation though. I have tested multiple production Next websites and unfortunately all of them have this flaw. It is quite sad that this is a year-old issue, I guess it is underreported because it is hard to spot and real users most likely would not report such a UX issue. |
@timuric Our bloody workaround was to use _app and add a small hook so there is a client-side redirection from / to /de when accept-language resolves to de* and the route is /. But of course it has major downsides |
Haha, we also tried that, but unfortunately it is not bullet proof :( So far adding all possible locales does the job, but as I mentioned there is a massive overhead that can be a deal breaker for some static paths with many permutations. |
@timuric We use SSG for a webshop and I'd wait 2h for a build if I did that haha ^^ Well, it's all crap, we need a next-native solution. I'd prefer being able to define a custom locale resolver as a function:
|
you guys can use https://formatjs.io/docs/polyfills/intl-localematcher to manually match locales |
I had this same problem. My workaround is to write my _middleware file like this:
My ACCEPTED_LOCALES variable = ['en', 'fr', 'pt'] |
My test results with this workaround (en default language, browser languages in this order: fr-CH, pt-BR, en-US):
Firefox: http://localhost:3000/ => http://localhost:3000/fr/ - language found in getBrowserLanguage function
Chrome: http://localhost:3000/ => http://localhost:3000/fr/ - language fr sent by Chrome, no call to getBrowserLanguage |
|
going to be almost 5 years and Next.js still can't show the language correctly on iOS Safari devices without some hacks through patch-package? :( |
Bug report
Describe the bug
If I send Accept-Language: "fr-XX,en" and I have available locales fr, en, it should give me fr instead of en (current behavior)
To Reproduce
Steps to reproduce the behavior, please provide code snippets or a repository:
Expected behavior
Should be routed to fr
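The expected behavior can be sketched with the simpler of the two approaches mentioned in the comments, the RFC 4647 "lookup" scheme: order the requested ranges by q-value, then progressively truncate each range until it equals an available locale. This is only an illustration, not Next.js's implementation and not the UTS 35 "best fit" matcher recommended in the thread; the function names parseAcceptLanguage and lookup are made up for this example.

```javascript
// Parse an Accept-Language header into language ranges ordered by q-value.
function parseAcceptLanguage(header) {
  return (header || '')
    .split(',')
    .map((part, index) => {
      const [range, ...params] = part.trim().split(';');
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const q = qParam ? parseFloat(qParam.slice(2)) : 1;
      return { range: range.trim(), q: Number.isNaN(q) ? 0 : q, index };
    })
    // Drop empty entries and ranges explicitly refused with q=0.
    .filter((entry) => entry.range && entry.q > 0)
    // Higher q first; ties keep the order used in the header.
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map((entry) => entry.range);
}

// RFC 4647 lookup: truncate each requested range from the right until it
// equals an available locale (case-insensitively), else use the default.
function lookup(available, header, defaultLocale) {
  const byLowerCase = new Map(available.map((tag) => [tag.toLowerCase(), tag]));
  for (const range of parseAcceptLanguage(header)) {
    if (range === '*') continue; // the wildcard expresses no preference here
    let candidate = range.toLowerCase();
    while (candidate) {
      if (byLowerCase.has(candidate)) return byLowerCase.get(candidate);
      const cut = candidate.lastIndexOf('-');
      if (cut === -1) break;
      candidate = candidate.slice(0, cut);
      // Also drop a dangling single-character subtag such as "-x".
      if (/-[a-z0-9]$/.test(candidate)) {
        candidate = candidate.slice(0, candidate.lastIndexOf('-'));
      }
    }
  }
  return defaultLocale;
}

console.log(lookup(['fr', 'en'], 'fr-XX,en', 'en')); // fr
```

With the header from this report, fr-XX is truncated to fr before en is ever considered, so the request resolves to fr. The same sketch falls back to the default for available locales en-CA and fr-CA with a request for fr-FR, because lookup never widens a request to a sibling region.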