    BatchUploader unsuccessful uploading 4000 records

    I'm trying to use BatchUploader to accept a CSV file and upload that data into a PostgreSQL 9.2.2 database.

    I started with the Power 2013-03-06 nightly build and then moved to v8.3p_2013-05-12/PowerEdition Deployment (built 2013-05-12). I didn't get it to work with either build.

    I'm testing with Firefox 20.0.1 talking to Tomcat 7.0.33.

    If I upload a CSV file with around 1000 records/rows, everything seems to work fine. When I upload a CSV file with around 4000 records/rows, the 'Upload' BatchUploader function will read the file and populate the ListGrid. The 'Commit' function will take a while, give me several 'script not responsive' prompts, and eventually come back with the 'Records added' popup, but the records were never added.

    When it is unsuccessful I see these messages on the Developer Console after pressing the 'Commit' button:
    Code:
    11:19:49.048:INFO:RPCManager:rpcResponse(upload)[update]: result: object[status=0]
    11:28:28.910:MUP4:INFO:RPCManager:sendQueue[14]: 4063 RPCRequest(s); transport: xmlHttpRequest; target: http://172.30.1.158:8080/krv/streetsmart1/sc/IDACall?isc_rpc=1&isc_v=v8.3p_2013-05-12&isc_xhr=1
    11:30:39.907:MOU7:DEBUG:RPCManager:XMLHttpRequest POST to http://172.30.1.158:8080/krv/streetsmart1/sc/IDACall?isc_rpc=1&isc_v=v8.3p_2013-05-12&isc_xhr=1 contentType: application/x-www-form-urlencoded; charset=UTF-8 with body -->isc_tnum=14&_transaction=<transaction xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:type="xsd:Object"><transactionNum xsi:type="xsd:long">14</transactionNum><operations xsi:type="xsd:List"><elem xsi:type="xsd:Object"><values xsi:type="xsd:Object"><cycle>01</cycle><sequencenumber>20</sequencenumber><meterid>**********</meterid><route>000001</route><name>*************</name><street>*****************</street><siteid>0000200</siteid><longitude>-**.*******</longitude><latitude>**.*******</latitude><identifier>*******</identifier></values><operationConfig xsi:type="xsd:Object"><dataSource>amrSiteUpload</dataSource><operationType>add</operationType></operationConfig><appID>builtinApplication</appID><operation>amrSiteUpload_add</operation><oldValues xsi:type="xsd:Object"><cycle>01</cycle><sequencenumber>20</sequencenumber><meterid>**********</meterid><route>000001</route><name>*************</name><street>*****************</street><siteid>0000200</siteid><longitude>-**.*******</longitude><latitude>**.*******</latitude><identifier>*******</identifier></oldValues></elem>...
    ...(3535729 bytes truncated).  Enable RPCManagerResponse log at same threshold to see full message.
    11:30:40.847:XRP0:INFO:RPCManager:transaction 14 arrived after 131936ms
    11:30:40.850:XRP0:DEBUG:RPCManager:Result string for transaction 14: "<HTML><BODY><SCRIPT>parent.isc.RPCManager.retryOperation(window.name);</SCRIPT></BODY></HTML>"
    11:30:41.094:XRP0:INFO:RPCManager:rpcResponse(unstructured) results -->"<HTML><BODY><SCRIPT>parent.isc.RPCManager.retryOperation(window.name);</SCRIPT></BODY></HTML>"<--
    11:30:41.095:XRP0:INFO:RPCManager:rpcResponse(unstructured) results -->"<HTML><BODY><SCRIPT>parent.isc.RPCManager.retryOperation(window.name);</SCRIPT></BODY></HTML>"<--
    ...
    11:30:41.777:XRP0:INFO:RPCManager:rpcResponse(unstructured) results -->"<HTML><BODY><SCRIPT>parent.isc.RPCManager.retryOperation(window.name);</SCRIPT></BODY></HTML>"<--
    11:30:41.778:XRP0:INFO:RPCManager:rpcResponse(unstructured) results -->"<HTML><BODY><SCRIPT>parent.isc.RPCManager.retryOperation(window.name);</SCRIPT></BODY></HTML>"<--
    Note that I replaced personal information in the output with '*'. I also truncated the repetitive parts of the output.

    The data source is fairly simple. It includes fields from two different PostgreSQL tables. A server-side method performs the insert into each table, and if the entry already exists, it updates the data instead. There are log messages at the beginning of the methods. When uploading 1000 entries, I see those log messages in the log; with 4000 entries, I don't. I don't believe my server-side methods are being called.
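
    A minimal sketch of that insert-or-update pattern in plain JDBC, for illustration only — the table and column names are hypothetical, not my actual code. PostgreSQL 9.2 predates INSERT ... ON CONFLICT (added in 9.5), so the usual approach is UPDATE first and INSERT only if no row matched:
    Code:
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class SiteUpsert {
        // Update the row if it exists; insert it otherwise. Run this inside
        // a transaction if concurrent uploads are possible, since there is
        // a race between the UPDATE and the INSERT.
        public static void upsert(Connection conn, String siteId, String name)
                throws SQLException {
            PreparedStatement update = conn.prepareStatement(
                    "UPDATE site SET name = ? WHERE siteid = ?");
            try {
                update.setString(1, name);
                update.setString(2, siteId);
                if (update.executeUpdate() > 0) {
                    return; // row existed and was updated
                }
            } finally {
                update.close();
            }
            PreparedStatement insert = conn.prepareStatement(
                    "INSERT INTO site (siteid, name) VALUES (?, ?)");
            try {
                insert.setString(1, siteId);
                insert.setString(2, name);
                insert.executeUpdate();
            } finally {
                insert.close();
            }
        }
    }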

    The RPC log shows the DSRequests and has them marked as successful.

    Note that when the upload is unsuccessful, I don't see any exceptions or error messages.

    Thanks,
    Kevin

    #2
    131 seconds (over two minutes) is long enough that your server may be configured to simply drop the HTTP connection.
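
    As one example of where such a limit can be configured — this is an illustrative sketch, not your actual config — a stock Tomcat install has a connectionTimeout (in milliseconds) on the HTTP Connector in conf/server.xml, and a fronting proxy or load balancer can enforce its own, separate timeout on top of that:
    Code:
    <!-- conf/server.xml: illustrative value only (5 minutes). One of
         several places a timeout can apply; check any fronting proxy
         or load balancer as well. -->
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="300000"
               redirectPort="8443" />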

    To bring this number down, start by taking a look at where the time is being spent - the RPC tab will show you the total time taken on the server, for example.

    Also, you've enabled logging on the client - that slows things down, so disable that when actually testing maximum data throughput.

    Finally, consider whether this UI makes any sense at this data volume. Will the end user be reviewing 4000 records line by line and queuing up thousands of corrections? If not, then the best approach is probably to store the data in a temporary table on the server, and allow the user to make various corrections one at a time.
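
    As a rough sketch of that staging-table idea (the table and column names here are hypothetical, and a real version would carry all of the uploaded fields), the uploaded rows could land in a PostgreSQL table keyed by upload batch and row number, with a column for validation errors that the review UI can filter on:
    Code:
    -- Hypothetical staging table for uploaded rows awaiting review.
    CREATE TABLE amr_site_upload_staging (
        upload_id  integer NOT NULL,  -- identifies the upload batch
        row_num    integer NOT NULL,  -- original CSV row, for error reporting
        siteid     text,
        meterid    text,
        error_msg  text,              -- set by server-side validation
        PRIMARY KEY (upload_id, row_num)
    );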



      #3
      Thank you very much for your reply.

      I'm investigating the server connection configuration to extend the timeout. I'm not sure how to do that yet.

      I ran the test again, watching the numbers on the RPC tab. Initially, 'Time Sent' for each entry was 'NaN:NaN:NaN.NaN' until all the add DSRequests were listed; then all the 'Time Sent' entries were set to the time the last DSRequest was listed in the Developer Console. 'Elapsed Time' was blank until some time later, when the 'Records Added' message popped up; then 'Elapsed Time' was set to 119074ms, even though much more time than that had passed by the clock.

      I'm concerned about two things:
      1. If this is a timeout problem, why didn't the server side try to save any of the records? My server-side methods were never called. When the upload succeeds with a thousand records, I see my server-side methods called.
      2. There is no indication that the upload failed. Even with use cases that expect smaller data sets, if the user provides more data than expected, the upload could silently fail and the user would never know.

      I appreciate the point that this UI may not be a good choice for this data volume. I think BatchUploader would be better suited to larger data sets if the ListGrid could be filtered to show only rows with errors. I like the BatchUploader functionality. If I don't use the BatchUploader UI, is it possible to use the CSV import/processing functionality on its own? I didn't see any other CSV importing tool.

      Thanks,
      Kevin
      Last edited by valerien; 17 May 2013, 11:22.



        #4
        Elapsed Time in the RPC tab means server elapsed time. So you've got 119 seconds being spent on the server; that's almost all of the time. You can look at the timestamps in the server log to see where this time is being spent; your results (where your methods aren't even being called) suggest that there might be a FilterServlet or something involved before processing even reaches the SmartGWT server framework or your code.
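
        Since the delay happens before your code runs, one thing worth checking is whether your web.xml maps a servlet filter over the SmartGWT IDACall URL. The names below are purely hypothetical — this is just the shape to look for:
        Code:
        <!-- web.xml: a filter mapped like this runs on every request,
             including IDACall, before the SmartGWT framework sees it. -->
        <filter>
          <filter-name>SomeFilter</filter-name>
          <filter-class>com.example.SomeFilter</filter-class>
        </filter>
        <filter-mapping>
          <filter-name>SomeFilter</filter-name>
          <url-pattern>/*</url-pattern>
        </filter-mapping>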

        About a better UI approach - filtering could be added to the interface, but the underlying issue is really that 4000 records are being kept in browser memory while the user is presumably making a lot of corrections. That's too big a risk of data loss; server-side storage should be used instead.

        The code that we use for CSV parsing doesn't happen to be exposed for direct use outside of BatchUploader; however, CSV parsing is really easy, and any off-the-shelf CSV parser should work fine.
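
        For instance, a bare-bones reader using only the JDK might look like the following — note this assumes simple, unquoted fields, so an off-the-shelf parser is still the better choice if values can contain quoted commas or embedded newlines:
        Code:
        import java.io.BufferedReader;
        import java.io.FileReader;
        import java.io.IOException;
        import java.util.ArrayList;
        import java.util.List;

        public class SimpleCsvReader {
            // Read each line into a String[] of fields. Handles only
            // simple CSV: no quoted fields, no embedded commas/newlines.
            public static List<String[]> read(String path) throws IOException {
                List<String[]> rows = new ArrayList<String[]>();
                BufferedReader in = new BufferedReader(new FileReader(path));
                try {
                    String line;
                    while ((line = in.readLine()) != null) {
                        rows.add(line.split(",", -1)); // -1 keeps trailing empty fields
                    }
                } finally {
                    in.close();
                }
                return rows;
            }
        }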
