
    exportclientdata and character encoding

    Hey,

    trying to get CSV-export to work, doing just like the example, all works.

    However, some characters become corrupt in the CSV file when I look at it.

    Example:
    "Sˆderstadion","2011-6-1 0:0:0","2011-6-21 7:35:0","487.58","487","35","5363.38"

    In this example, it looks fine on the server, i send UTF-8 everywhere and it looks right in the browser too.

    Any ideas? Can you set the encoding in the file export?

    #2
    Hey, would be great if someone could comment on this.

    I have looked around and tried lots of stuff, but to no avail. I have also found zero documentation besides the showcase, please correct me if i'm wrong.

    Let me give some more feedback:

    On the server, i have extended the datasourceloader in order to always return UTF-8:
    Code:
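    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import com.isomorphic.servlet.DataSourceLoader;

    public class NubaDataSourceLoader extends DataSourceLoader {

        @Override
        public void processRequest(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            // Force UTF-8 on the response before the default handling runs.
            response.setContentType("text/json; charset=UTF-8");
            super.processRequest(request, response);
        }
    }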
    So again, in the browser all Swedish characters display fine. I use exportclientdata since i have all the data in the listgrid and don't want another server roundtrip:
    Code:
    DSRequest dsRequestProperties = new DSRequest();
    dsRequestProperties.setExportAs(ExportFormat.CSV);
    dsRequestProperties.setExportDisplay(ExportDisplay.DOWNLOAD);
    dsRequestProperties.setAttribute("exportFilename", "nubaexport.csv");
    reportGrid.exportClientData(dsRequestProperties);
    Not only Swedish characters but also é and other "strange" characters are wrong in the export.

    Please see attached screenshot of how it looks in the listgrid, and below for how it looks in the export:

    Code:
    "Il CaffÈ","2011-11-3 10:3:32","2011-11-4 1:39:12","15.59","15","35","155884.41"
    "Learning Tree","2011-11-4 8:16:20","2011-11-4 17:24:16","9.13","9","7","9120.87"
    "Violv‰gen 29","2011-11-4 16:31:49","2011-11-4 17:28:34","0.95","0","56","9499.05"
    "Violv‰gen 29","2011-11-4 17:28:41","2011-11-6 20:20:55","50.87","50","52","508649.13"
    "Kungsholmstorg 11","2011-11-6 23:51:59","2011-11-7 0:20:10","0.47","0","28","4699.53"
    "Kungsholmstorg 11","2011-11-7 0:36:36","2011-11-7 0:37:12","0.01","0","0","8"
    "23:ans smˆrgÂscafe","2011-11-7 0:24:43","2011-11-7 7:41:0","7.27","7","16","72692.73"


      #3
      We've tried this and gotten a "works for us" result so far. Special characters, including multi-byte characters, are delivered to a grid normally and exportClientData successfully exports them to Excel.

      Check to see if you are somehow overriding the default content type and charset that the SmartGWT Server sets. You should see text/csv; charset=UTF-8. If you don't, then you have somehow broken the default behavior, possibly with a filter servlet.
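
For anyone who wants to check the header programmatically rather than by eye, here is a minimal sketch that pulls the charset parameter out of a Content-Type value such as the one quoted above. The class and method names are hypothetical and not part of SmartGWT, and the parsing is deliberately simple; it assumes a plain `type; name=value` header.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration only; not a SmartGWT API.
final class ContentTypeSketch {

    // Returns the charset parameter of a Content-Type header value, if present.
    static Optional<String> charsetOf(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/csv; charset=UTF-8").orElse("(none)"));
        System.out.println(charsetOf("text/csv").orElse("(none)"));
    }
}
```

If the second case is what your export response looks like, the charset is not being declared at all.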

      Note exportClientData() is in JavaDoc like all other methods.



        #4
        Hey much thanks for comments.

        I forgot to mention that i'm using a characterencodingfilter to set UTF-8 on all servlets (most importantly IDACall ofc). I have also looked, and the character-encoding header is UTF-8 in all responses from the server.

        So i find this kinda strange. HOWEVER, I have just noticed after installing Excel on my Mac, that while the characters look as described when i view the export in textedit, textview and Ultraedit, they DO look ok in Excel...

        Regarding docs, naturally i've looked in Javadoc, but in my opinion it's kind of sparse. It would really be nice to have a cohesive rundown on how exportdata *works*, with things like server.properties configuration when doing excel, which properties in the DSRequest object can be relevant to change, if and when server trips are executed etc.

        Regarding the last bit, i set the "exportformat" to xls at one point, did exportclientdata and a server-request was executed. It might not be clear to everyone why that happened.

        Again, many thanks for replying, have a good one.



          #5
          There are no server.properties settings for export. There is a comprehensive rundown of what properties exist and what they do on operationBinding.exportResults, which is linked pervasively from all the methods involved in export.

          As for your encoding issue, it looks like either you're just looking at the output with tools that don't support Unicode, or your added filter servlet is somehow interfering with the default encoding. Please let us know if you find compelling evidence otherwise.



            #6
            We had the same problem, we used SmartGWT 2.4 and GlassFish 3.1.

            The solution:
            You need a character encoding filter, something like this:
            Code:
            	...
            
            	public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
            		request.setCharacterEncoding("UTF-8");
            		chain.doFilter(request, response);
            	}
            The order of the filters is important, make sure this is the first filter in your web.xml.

            In our case we have two filters, and we got the following warning:
            Code:
            WARNING: PWC4011: Unable to set request character encoding to UTF-8 from context , because request parameters have already been read, or ServletRequest.getReader() has already been called
            The order was wrong.

            I hope it helps you.



              #7
              Just a note that this is not needed for 2.5 and 3.0, and should only be needed in 2.4 if your servlet engine is configured to use an encoding other than UTF-8.



                #8
                Hi,

                i'm having to revisit this since i'm having this issue still.

                To recap: My csv export opens up fine in most editors, but not excel, i get garbled characters for all non-standard chars.

                I have investigated, and have now understood that it's because the exported file does NOT contain a UTF-8 BOM in the beginning.

                If i open the file in Ultraedit and re-save it with a BOM, it opens up fine in Excel.


                Is this something you can shed any light on? Is there a way for me to configure this, or something similar?


                Pointers would be greatly appreciated.

                EDIT: info about BOM at the bottom of this page: http://www.unicode.org/faq/utf_bom.html
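
A minimal sketch of what "re-save it with a BOM" amounts to at the byte level: check whether the data starts with the three bytes EF BB BF, and prepend them if not. The class and method names are hypothetical; this is not a SmartGWT setting or API, just plain Java operating on the bytes of an exported file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of SmartGWT.
final class Utf8BomSketch {

    // The UTF-8 byte order mark: EF BB BF.
    static final byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // True if the data already starts with a UTF-8 BOM.
    static boolean hasBom(byte[] data) {
        return data.length >= 3
                && data[0] == BOM[0] && data[1] == BOM[1] && data[2] == BOM[2];
    }

    // Returns the data with a BOM in front; data that already has one is returned as-is.
    static byte[] withBom(byte[] data) {
        if (hasBom(data)) {
            return data;
        }
        byte[] out = new byte[BOM.length + data.length];
        System.arraycopy(BOM, 0, out, 0, BOM.length);
        System.arraycopy(data, 0, out, BOM.length, data.length);
        return out;
    }

    public static void main(String[] args) {
        // \u00f6 is the o-umlaut from the example row in the first post.
        byte[] csv = "\"S\u00f6derstadion\",\"487.58\"\n".getBytes(StandardCharsets.UTF_8);
        byte[] fixed = withBom(csv);
        System.out.println(hasBom(csv) + " -> " + hasBom(fixed)); // prints: false -> true
    }
}
```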



                  #9
                  You shouldn't need to manually add a BOM or do anything low-level like this. You just need the correct character encoding settings in your servlet engine.

                  3.0+ does this automatically and the 3.1d internationalization docs cover this in more detail, but for older versions, if you cannot change the default character encoding in your servlet engine, MeditcomAcc's solution of adding a servlet filter is the best approach.



                    #10
                    Hello there Iso,

                    please allow me to clarify. The file returned by the server IS utf-8 (albeit without any BOM), this is why it opens up in most editors like Ultraedit without problems.


                    However, Excel is *notorious* for not handling UTF-8 CSVs very well, from what i have understood from many threads over at Stack Overflow.

                    From there, and Ultraedit's support page (which i linked), i have been told that unless you add a BOM to the beginning of a UTF-8 encoded CSV, it won't open up properly in Excel.

                    A very good thread on Stackoverflow is here: http://stackoverflow.com/questions/6...-automatically

                    EDIT and another really good here: http://stackoverflow.com/questions/1...s-in-csv-files


                    If something i've written here isn't correct, please enlighten me.

                    I'm not sure how to solve this for my clients, besides figuring out a way to return the csv ISO-8859-encoded instead...
                    Last edited by mathias; 20 Jul 2012, 07:30.
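
As a rough sketch of the ISO-8859 fallback mentioned above: re-encoding the CSV text as ISO-8859-1 is a one-liner in Java, but any character outside that charset is silently replaced with `?`, so it is worth checking first. The class and method names are hypothetical, and where such a step would hook into the export is left open here.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of SmartGWT.
final class Latin1ExportSketch {

    // True if every character of the text can be represented in ISO-8859-1.
    static boolean fitsLatin1(String csv) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(csv);
    }

    // Encodes the text as ISO-8859-1; unmappable characters become '?'.
    static byte[] toLatin1(String csv) {
        return csv.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u00e4 is the a-umlaut from the example rows in post #2.
        String row = "\"Violv\u00e4gen 29\",\"0.95\"\n";
        if (fitsLatin1(row)) {
            System.out.println(toLatin1(row).length + " bytes");
        } else {
            System.out.println("row contains characters outside ISO-8859-1");
        }
    }
}
```

Swedish text and é fit in ISO-8859-1, but anything beyond Western European characters would not, which is the main drawback of this route.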



                      #11
                      If you are using 3.0+ or the Servlet filter approach, your HTTP Content-Type header should be declaring the charset as UTF-8. If it's not, fix that first.



                        #12
                        I am using 3.0+, and the HTTP content type has nothing to do with this issue. This issue is about opening, in Excel, the file that was returned from the server and saved by the browser.

                        When exporting data, the file is saved by the browser somewhere, with UTF8-encoded data. When i open this saved file in Excel, the characters are not displayed properly since it's UTF8 without a BOM (as per the examples i linked). This won't be affected by HTTP content-types.
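
To make the failure mode concrete, here is a minimal sketch of what happens when UTF-8 bytes are decoded with a single-byte charset. It assumes the reader falls back to ISO-8859-1, which is only an illustration; the charset Excel actually picks depends on platform and locale. The class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
final class MojibakeSketch {

    // Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1,
    // which is what a reader does when it guesses the wrong charset.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "S\u00f6derstadion"; // \u00f6 is o-umlaut
        String garbled = misread(original);
        // The o-umlaut is two bytes in UTF-8, so it turns into two wrong characters.
        System.out.println(original.length() + " chars become " + garbled.length());
        // prints: 12 chars become 13
    }
}
```

The bytes on disk are the same in both cases; only the reader's guess about the encoding differs, which is why a BOM (a hint inside the file itself) can change the result when no HTTP header is available any more.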



                          #13
                          Not really following this. Are you doing an export, such that the Excel file is downloaded over HTTP and Excel is launched by the web browser? In this case the charset in the Content-Type header is very important. If you're doing something else, please be very specific about the series of operations.



                            #14
                            Hmm, are you saying that i potentially could have different results depending on whether i choose "open with..." directly in the browser compared to just saving the file and THEN opening the file up with Excel from Finder?

                            I have not had that experience, TBH, and i get the same end result regardless of which method i use for the export.

                            to be CRYSTAL clear:

                            Clicking my "export" button then "open with.." Excel yields the same result as if i do "save file" then open it up with Excel. In both cases the file is UTF-8 encoded without a BOM.

                            In any case, i suppose our clients could do either...



                              #15
                              You're very focused on the BOM, but you still haven't indicated whether you've got a correct Content-Type header with charset=UTF-8 explicitly specified. If you don't have that, fix it first; there's no point in any other investigation until this is resolved.
