Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
Clear All
new posts

    Help debugging server timeout

    I'm trying to track down a bug that results in a server timeout. The server log shows a queue of DSRequests coming in from the client to populate a bunch of different SelectItems based on optionDataSource tags in various ds.xml files. In one case the queue has 11 entries and in another there are 7. Oddly, there are two instances of each queue.

    I'm using a subclass of IDACall and have overridden handleDSRequest and inserted a System.out.println() before and after each call to super.handleDSRequest(). Every request in the queue processes just fine, but then I see this in the server log, followed by a long wait.
    Code:
    === 2010-10-18 18:02:14,868 [l0-8] DEBUG RPCManager - Content type for RPC transaction: text/plain; charset=UTF-8
    === 2010-10-18 18:02:14,873 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,877 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,883 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,888 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,891 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,894 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,899 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,915 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,919 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,925 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    === 2010-10-18 18:02:14,931 [l0-8] DEBUG RPCManager - non-DMI response, dropExtraFields: false
    Then, after a few minutes I see these messages appear in the server log. But by then the client has stopped waiting for a response.
    Code:
    === 2010-10-18 18:08:31,182 [l0-0] INFO  Compression - /ipgui/sc/IDACall: 1698435 -> 72214 bytes
    200 - POST /ipgui/sc/IDACall?isc_rpc=1&isc_v=SC_SNAPSHOT-2010-10-03&isc_xhr=1&isc_tnum=14 (127.0.0.1) 72214 bytes
    === 2010-10-18 18:08:57,096 [l0-5] INFO  Compression - /ipgui/sc/IDACall: 1672316 -> 76846 bytes
    200 - POST /ipgui/sc/IDACall?isc_rpc=1&isc_v=SC_SNAPSHOT-2010-10-03&isc_xhr=1&isc_tnum=11 (127.0.0.1) 76846 bytes
    === 2010-10-18 18:09:04,877 [l0-8] INFO  Compression - /ipgui/sc/IDACall: 1672316 -> 76846 bytes
    200 - POST /ipgui/sc/IDACall?isc_rpc=1&isc_v=SC_SNAPSHOT-2010-10-03&isc_xhr=1&isc_tnum=12 (127.0.0.1) 76846 bytes
    === 2010-10-18 18:09:06,420 [l0-4] INFO  Compression - /ipgui/sc/IDACall: 1698435 -> 72214 bytes
    200 - POST /ipgui/sc/IDACall?isc_rpc=1&isc_v=SC_SNAPSHOT-2010-10-03&isc_xhr=1&isc_tnum=13 (127.0.0.1) 72214 bytes
    Looking at the client side log doesn't tell me much more than that the client waited approximately 3 minutes and then threw a server timeout error. Eventually the response comes back to the client, but by then the client can't match it back up with a transaction it is waiting for so I see these messages in the client side log.
    Code:
    18:08:31.193:XRP0:WARN:RPCManager:performTransactionReply: No such transaction 14
    18:08:57.111:XRP1:WARN:RPCManager:performTransactionReply: No such transaction 11
    18:09:04.932:XRP2:WARN:RPCManager:performTransactionReply: No such transaction 12
    18:09:06.502:XRP3:WARN:RPCManager:performTransactionReply: No such transaction 13
    How can I find out where the server is getting hung up? I can see that the problem is not with the actual queries running against the database server since those return in a reasonable amount of time.

    #2
    Those responses look to be over 1MB - were you expecting that volume of data? If not, what may be happening is runaway serialization of a large Java structure you are providing as a response.

    Comment


      #3
      I was surprised at the size of the responses as well. The SELECTs are the auto-generated ones to fill SelectItems based on their optionDataSource which in all cases is a fairly standard SQL data source. It seems to be loading the entire file instead of using paging. What might cause that? As I said, there isn't anything very special about the data sources. At least one of them is being used in another context to fill a ListGrid which exhibits the normal paging behavior.

      Comment


        #4
        Does the fact that the queues of requests that are timing out also look to be duplicated ring any bells? In other words, if I look at the SC dev console I first see transaction #12 which is a queue of 11 requests. The "Component" column shows <undefined> followed by a field name (the field has an optionDataSource specified in the ds.xml). In some cases there are multiple field names listed when multiple fields use the same optionDataSource. All 11 requests show Status "Success".

        Next I see transaction #13 which is 11 identical requests, same as #12. The time stamp on this next queue is just a few milliseconds after transaction #12. It is also Status=Success.

        Next I see a new queue with transaction #14 that has 7 requests, followed by an identical queue #15. Both of those are also successful.

        I then see a single request for another picklist which is #16.

        Next is #17 and #18 which are repeats of #12 & #13. Those are actually successful as well.

        Finally there are queues #19 & #20 which are repeats of #14 and #15 and these final two show status SERVER_TIMEOUT.

        All of those requests were sent by the client in approximately 3 seconds. So which ones time out is based on how fast the server is running.

        What would cause the duplicate queued requests?

        Comment


          #5
          I was just having a similar problem, but maybe not the same. I was getting the same message in the client log.
          "... WARN:RPCManager:performTransactionReply: No such transaction .. "
          I am doing JSONP against a remote http server, but running the application off the PC in Eclipse Development mode.

          I was able to intermittently reproduce my problem with just one transaction, repeated even after refreshing the browser. When I restarted the application server in Eclipse my problem was not reproducible. This led me to believe that the client might be holding on to some memory structure.

          I no longer have this problem. What fixed it for me was a reboot.
          I'll look out for it again when I deploy the code, but for now I'll put it down to a Development mode memory managment issue and just reboot my PC as needed.

          Comment


            #6
            I'm still not seeing how the duplicate queues of requests are getting created. It is "automatic" behavior and I'm not sure what triggers it.

            The canvas being drawn consists of two ListGrids which use two different data sources. I can see that each queue of requests includes all fields with an optionDataSource from one of the two data sources, presumably to populate dropdowns in the ListGrid records. Neither grid has yet fetched any data when the queued requests are generated.

            One datasource results in 11 requests in a queue, the other results in 7 requests in a queue. The client is generating two identical queues of requests for each datasource.

            I've run through debug and confirmed that the ListGrids are only being drawn once. What is it that triggers these auto-generated requests?

            Comment


              #7
              (this issue resolved offline)

              Comment

              Working...
              X