scan_string() return token_type::parse_error; when parse ansi file #812
Comments
Could you please try the |
@nlohmann I'm already using the
{
if (JSON_UNLIKELY(not next_byte_in_range({0x80, 0xBF})))
{
return token_type::parse_error;//line 2186
}
break;
}
When I comment out lines 2153 to 2266 and change the default branch to the code below, it works fine. By the way, do you have a UTF-8 conversion function?
default:
{
add(current);
break;
/* error_message = "invalid string: ill-formed UTF-8 byte";
return token_type::parse_error;*/
} |
Looks like you need to convert your non-ascii strings |
@gregmarr Yes! But this project should support this. |
The project supports UTF-8. Could you try adding a |
@nlohmann Adding the u8 prefix does not work correctly.
#include "json.hpp"
#include <iomanip> // std::setw
#include <fstream>
using namespace std;
using json = nlohmann::json;
int main()
{
ofstream out_json("C:\\test.json");
json jsDefault = json();
jsDefault["name"] = u8"默认";
jsDefault["param"] = json();
json jsArray = json::array({ jsDefault });
json jsObj = json();
jsObj["select"] = u8"默认";
jsObj["items"] = jsArray;
out_json << std::setw(4) << jsObj;
out_json.close();
ifstream in_json("C:\\test.json");
json jsNewObj = json();
in_json >> jsNewObj;
string strJson = jsNewObj.dump(4);
return 0;
}
Output file:
{
    "items": [
        {
            "name": "默认",
            "param": null
        }
    ],
    "select": "默认"
}
And
|
Can you please attach your code and the JSON file (best as a ZIP archive) so I can check this myself? |
Here is a VS2015 test project you can try. |
I don't have MSVC. Can you please execute the code on your side and attach the generated JSON file please? |
Check the attached file. |
I can parse the file without problems:
#include <iostream>
#include <fstream>
#include "json.hpp"
using json = nlohmann::json;
int main(int argc, char* argv[]) {
std::ifstream f("test.json");
json j;
f >> j;
std::cout << j << std::endl;
} |
There is no error, but the decoded string is not the same as the source string. |
In #812 (comment) you mentioned a parse error. I cannot reproduce this with the file. I think your input is not UTF-8 encoded. The library only supports UTF-8. |
Without the u8 prefix, a parse error is reported. With the u8 prefix there is no error, but the decoded string is not the same as the source string. File saved without the u8 prefix: File saved with the u8 prefix: |
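One way to take the source-file encoding out of the picture is to spell the non-ASCII characters as universal character names, which the compiler encodes as UTF-8 inside a u8 literal no matter how the file was saved. The following is a minimal sketch, not something suggested in the thread: the helper name `utf8_default_label` is hypothetical, and the `reinterpret_cast` is only there so the same line compiles both before C++20 (u8 literals are `char` arrays) and from C++20 on (they are `char8_t` arrays).

```cpp
#include <string>

// Hypothetical helper: returns the text "默认" (U+9ED8 U+8BA4) as UTF-8
// bytes without depending on the encoding of the source file. The
// universal character names keep the source itself pure ASCII.
inline std::string utf8_default_label()
{
    // The cast is a no-op before C++20 and converts from char8_t
    // storage to plain char from C++20 on.
    return std::string(reinterpret_cast<const char*>(u8"\u9ED8\u8BA4"));
}
```

The result could then be stored as usual, for example `jsObj["select"] = utf8_default_label();`, and should contain the six bytes E9 BB 98 E8 AE A4.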
Can you check in the debugger if the string you store in the library is UTF-8 encoded? |
I don't know how to check it, but I think use |
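If a debugger view is not convenient, the bytes of the string can be printed as hex right before they are handed to the library. This is a small sketch with a hypothetical helper name, `to_hex_bytes`; it only relies on the standard library. For reference, 默认 encoded as UTF-8 is the six bytes E9 BB 98 E8 AE A4, so a different byte sequence for the same text would suggest a non-UTF-8 encoding such as the system code page.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: renders every byte of s as two uppercase hex
// digits, separated by single spaces (for example "E9 BB 98").
inline std::string to_hex_bytes(const std::string& s)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned value = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%02X", value);
        if (i != 0)
        {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

Printing `to_hex_bytes(std::string("默认"))` from the affected source file would show which bytes the compiler actually stored in the literal.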
The thing is that the library does not check whether stored strings are UTF-8 encoded. That is why serialization may produce non-compliant JSON text. When such text is parsed, a parse error occurs, reporting the invalid UTF-8. |
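Since the library does not perform this check on stored strings, a caller can validate strings before storing them. Below is a sketch of a well-formedness check based on the byte ranges in the Unicode standard's table of well-formed UTF-8 sequences; `is_valid_utf8` is a hypothetical helper and not part of the library. It applies the same kind of test as the `next_byte_in_range({0x80, 0xBF})` call quoted earlier in the thread: every continuation byte must fall in an allowed range.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the library: returns true if s is
// well-formed UTF-8. Overlong forms, surrogates, truncated sequences,
// and code points above U+10FFFF are rejected.
inline bool is_valid_utf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n)
    {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 <= 0x7F) { ++i; continue; }   // ASCII

        std::size_t len = 0;                 // continuation bytes expected
        unsigned char lo = 0x80, hi = 0xBF;  // range of the first continuation byte
        if      (b0 >= 0xC2 && b0 <= 0xDF) { len = 1; }
        else if (b0 == 0xE0)               { len = 2; lo = 0xA0; } // no overlongs
        else if (b0 >= 0xE1 && b0 <= 0xEC) { len = 2; }
        else if (b0 == 0xED)               { len = 2; hi = 0x9F; } // no surrogates
        else if (b0 >= 0xEE && b0 <= 0xEF) { len = 2; }
        else if (b0 == 0xF0)               { len = 3; lo = 0x90; } // no overlongs
        else if (b0 >= 0xF1 && b0 <= 0xF3) { len = 3; }
        else if (b0 == 0xF4)               { len = 3; hi = 0x8F; } // <= U+10FFFF
        else { return false; }               // 0x80..0xC1 and 0xF5..0xFF never lead

        if (n - i <= len) { return false; }  // sequence is cut off

        const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
        if (b1 < lo || b1 > hi) { return false; }
        for (std::size_t k = 2; k <= len; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if (b < 0x80 || b > 0xBF) { return false; }
        }
        i += len + 1;
    }
    return true;
}
```

A caller could refuse to store a string when `is_valid_utf8` returns false, so that the problem surfaces at the point of assignment and not when the dumped file is parsed again.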
Can you add this feature to reduce discussions about files that the library saved itself but then fails to parse because of invalid UTF-8 characters? Such questions will take up a lot of your time. taocpp json has this feature. |
I won't do re-encoding. Whether to throw an exception when non-UTF-8 encoded text is dumped is discussed in #838. |
It would save you time if the library converted strings to UTF-8 when dumping to a file. I expect that as more people use this library, the same question will come up more often. This is test code on Windows.
// TestJson.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "json.hpp"
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <Windows.h>
using namespace std;
using json = nlohmann::json;
// Convert a multi-byte (ANSI code page) string to UTF-8
bool MBToUTF8(vector<char>& pu8, const char* pmb, int mLen)
{
// convert an MBCS string to widechar
int nLen = MultiByteToWideChar(CP_ACP, 0, pmb, mLen, NULL, 0);
WCHAR* lpszW = NULL;
try
{
lpszW = new WCHAR[nLen];
}
catch (bad_alloc &memExp)
{
return false;
}
int nRtn = MultiByteToWideChar(CP_ACP, 0, pmb, mLen, lpszW, nLen);
if (nRtn != nLen)
{
delete[] lpszW;
return false;
}
// convert an widechar string to utf8
int utf8Len = WideCharToMultiByte(CP_UTF8, 0, lpszW, nLen, NULL, 0, NULL, NULL);
if (utf8Len <= 0)
{
delete[] lpszW; // free the wide-char buffer before bailing out
return false;
}
pu8.resize(utf8Len+1);
nRtn = WideCharToMultiByte(CP_UTF8, 0, lpszW, nLen, &*pu8.begin(), utf8Len, NULL, NULL);
pu8[utf8Len] = '\0';
delete[] lpszW;
if (nRtn != utf8Len)
{
pu8.clear();
return false;
}
return true;
}
int main()
{
ofstream out_json("C:\\test.json");
json jsDefault = json();
jsDefault["name"] = "默认";
jsDefault["param"] = json();
json jsArray = json::array({ jsDefault });
json jsObj = json();
jsObj["select"] = "默认";
jsObj["items"] = jsArray;
std::string strDump = jsObj.dump(4);
vector<char> utf8Char;
MBToUTF8(utf8Char, strDump.c_str(), strDump.size());
out_json << (byte)0xEF << (byte)0xBB << (byte)0xBF;
out_json << utf8Char.data();
out_json.close();
ifstream in_json("C:\\test.json");
json jsNewObj = json();
in_json >> jsNewObj;
string strJson = jsNewObj.dump(4);
return 0;
} |
This seems to be MSVC-specific code. |
With the VS2015 IDE, when I run the demo code below to read a JSON file, the scan_string function returns token_type::parse_error at line 2186.
{
ofstream out_json("C:\test.json");