Hi Isomorphic,
Please see the online showcase BatchUploader sample (now on v11.0p_2016-10-13, but this also happens locally with the latest 5.1p).
I downloaded the supplyItemTest.csv file and opened it in LibreOffice (v 5.1.5) and saved it as XLSX. This works and is portable (file can be opened in Excel as well).
I then closed LibreOffice and reopened the file in it and exported it with different encodings.
I did the same in Windows Excel (latest Office 365, but older versions and MacOS versions show the same behavior).
I uploaded these exported CSVs to the sample. This is the result (files included in this thread):
1) supplyItemTest_LibreOffice_UTF8.csv: OK
2) supplyItemTest_LibreOffice_CP1252.csv: umlauts shown as question marks
3) supplyItemTest_Excel-CSV.csv: umlauts shown as question marks (default delimiter is semicolon, so I had to search-and-replace it; after that the file is identical to supplyItemTest_LibreOffice_CP1252.csv according to WinMerge)
4) supplyItemTest_Excel-CSVMAC.csv: umlauts shown as question marks (default delimiter is semicolon, so I had to search-and-replace it)
5) supplyItemTest_Excel-CSVMSDOS.csv: umlauts shown as question marks (default delimiter is semicolon, so I had to search-and-replace it)
1) is OK obviously. Let's ignore 4) / 5) for now, as I don't know what the encoding here is, either.
3) is the same as 2) after taking care of the different default delimiters (or one could add a SelectItem to configure the delimiter for the BatchUploader), and the encoding is ISO 8859-1 / CP1252 / WinLatin1.
This is how uploads of 2) / 3) look for me.
My problem is with 2) / 3). Our customers all use Excel; no one uses LibreOffice. Every end user can do an Excel save-as-CSV, but when they import that CSV into our tool, we get umlaut problems.
Could you add an encoding setting for the BatchUploader like you do with setDefaultDelimiter() and setDefaultQuoteString() (also to 5.1p, switch to 6.0p delayed because of this one)?
That encoding setting should define the encoding of the input file. The result should of course still be UTF-8.
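To illustrate what I mean, here is a minimal sketch in plain Java. It does not use any SmartGWT API; the class name `CsvTranscodeSketch` and the method `toUtf8` are hypothetical, and I am assuming the server side knows (or is told) the source encoding. It decodes the uploaded bytes with the declared input encoding and re-encodes them as UTF-8, and it also shows why reading CP1252 bytes as UTF-8 loses the umlauts.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CsvTranscodeSketch {

    // Hypothetical helper: decode the uploaded bytes using the declared
    // input encoding, then re-encode the text as UTF-8.
    static byte[] toUtf8(byte[] input, Charset sourceEncoding) {
        String text = new String(input, sourceEncoding);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumption: Excel's plain "CSV" export on a Western European
        // Windows system uses windows-1252 (CP1252).
        Charset cp1252 = Charset.forName("windows-1252");

        // "Müller;Größe" written with Unicode escapes so the example does
        // not depend on the encoding of this source file.
        byte[] excelBytes = "M\u00FCller;Gr\u00F6\u00DFe".getBytes(cp1252);

        // Wrong: treating the CP1252 bytes as UTF-8. The umlaut bytes are
        // malformed UTF-8 and get replaced, so the original text is lost.
        String misread = new String(excelBytes, StandardCharsets.UTF_8);
        System.out.println("Read as UTF-8:  " + misread);

        // Right: decode with the declared encoding, then emit UTF-8.
        byte[] utf8Bytes = toUtf8(excelBytes, cp1252);
        String roundTrip = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println("Transcoded:     " + roundTrip);
    }
}
```

With something like this in place, an encoding setting would only need to select the `Charset` passed to the decoding step; everything after that could keep working with UTF-8 as it does today.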
This is a pretty serious one for us, as end users are now using the Batch Upload on a regular basis and are starting to complain about the umlaut errors. Before, we used it mainly for setup, where we did the export in LibreOffice and could control the export encoding.
IMHO the BatchUploader should be able to handle different encodings regardless of our problem here, because different tools will create different exports (until one day, maybe, all are using the by now more than 20-year-old UTF-8...).
I don't know if support for all possible encodings is needed, but UTF-8 and ISO 8859-1 / CP1252 / WinLatin1 should definitely work and will cover 99% of all uploads, I'd guess.
Thank you & Best regards
Blama
PS: This is not related to this report.