    Need help with ListGrid using FetchMode.LOCAL and custom DataSource

    SmartClient Version: v10.0p_2015-12-10/LGPL Development Only (built 2015-12-10)
    Browser: Any

    I have a custom DS to access my data. I want the grid to load all of the data and sort/filter client side. To do this I am setting the grid fetch mode to LOCAL. When I do this, the grid never finishes loading (the "loading" message and the spinner stay up forever). If I comment out the line that calls setDataFetchMode (so it uses PAGED), but still return all of the records (the sample loads 100 records even though the request is for 75), then the grid finishes loading fine (but then the grid will not locally sort or filter, because it assumes the DS is doing that). When in LOCAL mode, is there something special my DS needs to do in order to signal that the FETCH is complete?
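
    For context, the grid setup is roughly the following (simplified; myCustomDS stands for my custom DataSource):

    ListGrid grid = new ListGrid();
    grid.setDataSource(myCustomDS);
    grid.setDataFetchMode(FetchMode.LOCAL);  // load everything once, then sort/filter client-side
    grid.setAutoFetchData(true);
    grid.draw();
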
    Attached Files
    Last edited by pgrever; 14 Jan 2016, 09:25. Reason: Fix typo

    #2
    No, no special settings are required. There are a couple of strange things in your code which may be the culprit:

    1. you are setting autoCacheAllData:true; this setting doesn't really make sense if you are building a DataSource that uses dataProtocol:clientCustom to fetch from some kind of in-browser storage. Have you read the docs for this setting? What were you hoping that this setting would do for you?

    2. you're using dataProtocol:clientCustom but synchronously returning data, whereas this API is actually designed for you to do your own asynchronous communication to fetch data. Returning data synchronously is somewhat odd, since if you already have the data without needing to do an async fetch to get it, you could have just populated the DataSource with it. Is there some scenario you've got in which this pattern actually makes sense?



      #3
      I wasn't certain if I needed to set autoCacheAllData or not. I tried removing it and that seems to have fixed this issue. The reason the sample doesn't do anything async is so I don't have to send you my Tomcat server; I just took that part out to simplify the example. It behaves the same way whether the code is async or not.



        #4
        autoCacheAllData (which again is a feature that doesn't seem to apply) probably has a piece of its implementation that expects asynchronous responses (because it would be nonsense to use this feature with synchronous responses).

        If you are building a test case for a situation where you would go to your Tomcat server, then you should respond asynchronously. You can just use Scheduler to wait for a trivial time period (say, 1ms) to do this.
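
        Roughly, as a sketch (assuming a SmartGWT dataProtocol:"clientCustom" DataSource; a GWT Timer is used here in place of Scheduler, and all names are illustrative):

        // imports: com.smartgwt.client.data.*, com.smartgwt.client.types.DSProtocol,
        //          com.smartgwt.client.rpc.RPCResponse, com.smartgwt.client.widgets.grid.ListGridRecord,
        //          com.google.gwt.user.client.Timer
        public class AsyncTestDS extends DataSource {
            private final ListGridRecord[] allRecords; // pre-built test data

            public AsyncTestDS(ListGridRecord[] allRecords) {
                this.allRecords = allRecords;
                setDataProtocol(DSProtocol.CLIENTCUSTOM);
                // DataSourceField definitions omitted for brevity
            }

            @Override
            protected Object transformRequest(DSRequest dsRequest) {
                final String requestId = dsRequest.getRequestId();
                // Deliver the response on a later event-loop turn, as a real server fetch would.
                new Timer() {
                    @Override
                    public void run() {
                        DSResponse response = new DSResponse();
                        response.setStatus(RPCResponse.STATUS_SUCCESS);
                        response.setData(allRecords);
                        processResponse(requestId, response);
                    }
                }.schedule(1);
                return dsRequest.getData();
            }
        }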

        Or you can just populate the clientOnly DataSource in the usual way (setCacheData()), since your test case is basically a convoluted way of achieving the same thing.
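
        For example (sketch only; field definitions and the data itself are omitted):

        DataSource clientDS = new DataSource();
        clientDS.setClientOnly(true);
        clientDS.setCacheData(allRecords);  // allRecords: the complete dataset as Record[]/ListGridRecord[]
        grid.setDataSource(clientDS);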



          #5
          I have a version of the sample that is asynchronous and it has the same problem if I set autoCacheAllData to true. My real app is doing asynchronous I/O; I was just trying to create a simpler sample to send you. Even if I make it async, I still see the problem. Since I don't seem to need to set autoCacheAllData to true in this case, and since not setting it fixes the issue, I'll just remove it. Thanks for the help.



            #6
            Included an updated set of sample files where the fetch operation is async. Although I'm returning all of the data, if I call hasAllData() on my data source after the fetch it still returns false. Should I be overriding this method, or is there something else I should be doing so the DS knows that it has all of the data, or am I misunderstanding some concept here?
            Attached Files



              #7
              In this new test case you have not set cacheAllData or autoCacheAllData. Therefore there is no caching at the DataSource level.

              The ResultSet, which is specific to the grid, has obtained a cache of all records. You can see this by calling ResultSet.allRowsCached().
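
              For example (sketch; assumes the grid's data is a ResultSet, i.e. a fetch has already been issued):

              ResultSet rs = grid.getResultSet();
              if (rs != null) {
                  SC.logWarn("allRowsCached: " + rs.allRowsCached());
              }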



                #8
                OK, tried these things. ResultSet.allRowsCached() does show that it has all of the data. But I do want the DS to be caching. I tried setting all combinations of cacheAllData and autoCacheAllData and they all do the same thing - the grid data is never displayed, and the loading spinner just keeps spinning forever. Also, setting these seems to contradict your earlier response in this same thread, which says they don't make sense in my case.

                Maybe there is an easier way to accomplish what I am trying to do. I want the DS to get all of the data, and the grid using it to be able to switch filters without re-querying the server for the data. I also want the data source to retrieve the data in chunks, because returning it in one request is causing memory issues since the JSON response is so large. I want the grid in LOCAL mode, but I don't see any way in this mode to have the DS get the data using multiple URL requests. Are there settings that would configure it to retrieve all of the data on the first call, but to do it using multiple requests? If not, I've looked into just using a custom protocol for the fetch, but I haven't seen any easy way to convert the JSON response I get into a set of DS records. Obviously the underlying DS class has a way to do this when I configure the DS for JSON - can I call some of these methods directly?
                Attached Files



                  #9
                  The DS has recordsFromText and recordsFromXML, any chance there is a recordsFromJSON?



                    #10
                    What we said was that cacheAllData doesn't make sense when you've already got the data loaded and can synchronously return it. cacheAllData *does* potentially make sense for asynchronous loading via dataProtocol:"clientCustom" and we've already got someone looking into why this combination of settings isn't working.

                    But let's first talk about what settings you actually want, because none of the settings you're trying make any sense with your stated use case. You say:

                    "Are there settings that would configure it to retrieve all of the data on the first call, but to do it using multiple requests?"
                    First, what is the point of using multiple requests? You talk about memory issues because the JSON response is so large, but do you realize that if you parse that JSON and turn it into objects, that's going to use much, much more memory than the JSON string itself?

                    Regardless, if there were some valid reason to issue multiple requests, a simple way to do this would be to use a second DataSource that is configured to expect JSON responses (dataFormat:"json"), make a series of requests against that DataSource, then combine all the resulting Records and just provide them to a clientOnly DataSource.
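
                    A rough sketch of that approach (the endpoint, names, and chunking scheme are illustrative; this also assumes the response is a plain JSON array of records - otherwise set recordXPath on the loader DataSource):

                    final DataSource loaderDS = new DataSource();
                    loaderDS.setDataFormat(DSDataFormat.JSON);
                    loaderDS.setDataURL("/myApp/records");   // hypothetical REST endpoint

                    final DataSource clientDS = new DataSource();
                    clientDS.setClientOnly(true);

                    final List<Record> combined = new ArrayList<Record>();

                    // One request in the series; repeat with different URL parameters / row ranges,
                    // then hand the combined Records to the clientOnly DataSource after the last chunk.
                    loaderDS.fetchData(null, new DSCallback() {
                        public void execute(DSResponse dsResponse, Object rawData, DSRequest dsRequest) {
                            combined.addAll(Arrays.asList(dsResponse.getData()));
                            // ...after the final chunk arrives:
                            clientDS.setCacheData(combined.toArray(new Record[combined.size()]));
                        }
                    });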

                    But again, before you start down that road - the stated reason for separate requests doesn't make sense. At least not with what you've told us.



                      #11
                      From what we are seeing, the separate requests actually make a big difference. As far as we can tell, instead of getting one 50MB JSON string that needs to be parsed, our separate gets only need about 1.4MB each. Once the parsing of each JSON string is done, the string can be garbage collected, and the resulting record data, being binary, is far more compact in memory than the JSON string representation. With one request the system needs enough memory for the 50MB JSON string as well as all of the Record objects it is being converted into. Also, it's not just an issue of needing 50MB in the heap for the string; it needs 50MB of contiguous memory in the heap for this string. In any event, watching the browser memory utilization when we get the data in chunks instead of one get shows a dramatic decrease in process memory, and we have been able to unload and reload this data over 50 times (we quit at that point) instead of only four times before the browser crashes with a "Not enough storage" exception. Since our application is structured to load and unload this data as the user navigates to different parts of our application, the ability to keep reloading the data over time without crashing is extremely important to us.

                      We have already re-written the code as you suggested, wrapping one DS with another. We did this before I started this thread; it's the issues we've run into in attempting to do this that generated these questions. The most notable issue at this time is that when we change the filter criteria on the grid, it requests that the DS fetch all of the data again. We originally thought this might be because the DS is reporting that it doesn't have all of the data.

                      The latest questions arose because we thought that maybe wrapping the one DS with another might have been the hard way and maybe there was something more direct that we were just missing.

                      Currently our wrapper is NOT defined as a clientOnly DS, so maybe that is the mistake. However, based upon events we need to be able to update this DS on the fly using updateCaches. As long as that will continue to work, we could convert the wrapper to a clientOnly DS. Is that the correct thing to do?
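
                      For context, the kind of updateCaches call we are making looks roughly like this (simplified; changedRecord is a record carrying the matching primary key, and wrapperDS is our wrapper DataSource):

                      DSRequest req = new DSRequest();
                      req.setOperationType(DSOperationType.UPDATE);

                      DSResponse resp = new DSResponse();
                      resp.setData(new Record[] { changedRecord });

                      wrapperDS.updateCaches(resp, req);
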
                      Last edited by pgrever; 19 Jan 2016, 16:43.



                        #12
                        Several points to address here...

                        First, it looks like you are warping your data architecture because of a memory leak you haven't solved, but should be able to solve. The fact that you run out of memory in two different browsers strongly implies a logic bug creating the leak which, if it exists, should be correctable. And your workaround, if we understand it correctly, is to issue ~35 requests (50/1.4) back to back in order to avoid ever having a single, larger request. That has several drawbacks, but one is that it's going to pound on your server unnecessarily.

                        Second, just a sanity check, your JSON string has all whitespace removed, right? Because given that "{}" in JSON - 2 chars - implies an entire hashtable-like structure needs to be created in JS, it's difficult for the parsed form of JSON to be smaller than the String form. Certainly, long numeric values get smaller, and modern VMs have some tricks regarding slots (although not IE9).

                        Third, it looks like completely aside from the issue of a single fetch vs multiple, you still have some confusion about criteria and caching (and the result has been applying random settings that don't make sense together). Perhaps if you tried to clearly articulate the caching behavior you want, we could point to the right setting. One basic question is: should a criteria change on the grid *never* lead to new data being retrieved from the server?



                          #13
                          All spacing is removed from the JSON. We have a large number of numeric values and nulls. We have followed your documentation on leak detection and at this point we do not see any evidence of a leak. It appears that the browsers have a JVM memory limit and what we are dealing with is heap fragmentation. If we run our app with the large gets in Chrome, keep the Chrome developer tools window open, and use the feature that lets us force garbage collection 3 or 4 times after each get (with each GC the heap size decreases), then we do not run out of memory. If we don't do this, we do. We see the same results if we get the data in chunks. The structure of our server is such that chunking the gets does not cause any significant overhead, so you should not be making assumptions about how our server operates, as this will NOT pound our server.

                          A criteria change should NOT retrieve new data. However, we did find a path in our code that was causing the grid to throw away its data, and that is what was causing the re-fetch - so that issue is resolved.

                          With that fixed, it seems to be functioning as expected without any caching at the DS level (no calls to setAutoCacheAllData or setCacheAllData). At this point it appears that the flow of our code looks like:

                          Our grid's DS is set to our wrapper DS. The grid asks for all of the data (unsorted and unfiltered). Our wrapper DS (which is currently not marked as clientOnly - not certain what difference this might make) makes multiple calls to our real DS, getting all of the data in smaller chunks (the current chunk size was arbitrary for testing - no idea if it should be larger or smaller) and building up a ListGridRecord list. When it is done it calls processResponse with this data (which it then throws away). This appears to populate a ResultSet that is then used by the grid, where we can change the filter criteria to see different subsets of the data without any more fetch requests to the DS wrapper.

                          The only thing that has changed in our code is the wrapper DS being placed around our previous DS to have it transparently get the data in chunks. In one case we see the memory issue and the other we don't. If our memory issues were the result of leaks, it seems like any leaks we have in the application would still be there whether we get the data in chunks or not, and the results would be similar. However, this is not the case in actual testing.

                          Analysis has not uncovered any leaks causing the memory issue, and breaking the fetches up into chunks (just like paging would do) seems to eliminate the memory issue. As your page on memory leaks indicates, with all of the layers of indirection, the asynchronous nature of garbage collection in the VM, and the limited tools available for detailed heap analysis (not to mention the translation from Java to JS going on), it is really impossible to "prove" that there are no leaks (since memory may go up even without leaks because the garbage collector has not run). However, with the Chrome debugger allowing us to force garbage collection and the heap size staying within a few bytes on each iteration of loading/unloading our data (sometimes more, sometimes less, but all within a small range), there does not appear to be any significant memory leakage (we cannot prove there is none). Given that there appears to be nothing more we can do at this point with regard to identifying memory leaks, and chunking either eliminates the problem completely or at least makes it much more difficult to reproduce, we really have no viable alternative to the memory issues at this point other than to get the data in chunks. We have not yet done a full proof-of-concept of this technique within our application, so it is possible that this direction will also end up failing us, but so far, in tests in a more limited environment, the results have been very promising.

                          A brief description of what we are trying to do:

                          We want to be able to query our server for a set of records via a REST API to an HTTP server returning JSON. We want the query to include custom URL parameters specifying context that will cause a subset of the data to be returned (essentially criteria). We want the grid to then use this data as a "complete" set of data that it will/can then locally sort and filter repeatedly as the user changes settings on the grid, without any more requests to the server. Furthermore, we need to be able to efficiently update the data displayed in the grid based upon a potentially high volume of events indicating records in the data being added/removed/modified, while maintaining any local selection, grouping, sorting, highlighting, filtering, and scroll position. In summary, we want to get a set of data up front and then be able to make updates to the data with minimal end-user disruption.

                          If there was a good way to do this without getting all of the data up front (i.e., paging), then we would take it. However, we have not been able to find a reasonable way to do this, since there is no rational way to apply updates to random records that may or may not be in the set already retrieved. If it is not obvious why updates on paged data are problematic and you really want to get into that we can, but we'd prefer to skip the gory details of that can of worms.
                          Last edited by pgrever; 19 Jan 2016, 18:58.



                            #14
                            "It appears that the browsers have a JVM memory limit and what we are dealing with is heap fragmentation."
                            Just a sanity check: "JVM" usually means Java VM - possibly you meant this to mean JavaScript VM, but if you mean Java VM, obviously, there is no Java VM involved in compiled mode. Do you see this weird "out of memory" issue in compiled mode? Because this kind of inexplicable error is very suggestive of GWT Development Mode issues, where a lot of objects have a kind of dual representation in Java and JavaScript.

                            "The structure of our server is such that chunking the gets does not cause any significant overhead, so you should not be making assumptions about how our server operates as this will NOT pound our server."
                            Hmm, even with an all-memory cache of the data, there would be 35 unnecessary runs through the request parsing and authentication logic, 35 separate compression runs instead of one, etc, but if you say that's not "significant" in your case, OK, we'll just assume that's a very unusual server :)

                            Glad to hear you found and fixed the bug that was causing unnecessary fetches.

                            As far as your overall approach, listGrid.fetchMode:"local" does make more sense than dataSource.cacheAllData for you, since without fetchMode:"local", a grid will still use data paging even with a DataSource that has a complete cache. And using both at once would just result in two complete caches of the same data.

                            And yes, we fully understand the issues around mass updates to a partial cache. For singular updates, we implement resultSet.updatePartialCache as a neat trick that prevents most cases of unnecessary cache dropping. For mass updates, you end up needing a lot of complicated communication about which PKs have moved to which indices, and how best to do it depends a lot on details of which operations happen to be faster or slower on the server side. We've implemented it before, so if you ultimately decide that dumping 50MB of data into the browser still has issues, we can help - but it would need to be a Consulting project.



                              #15
                              Yes, we were referring to the browser's JavaScript VM; sorry, we thought that was obvious.

                              Yes, we do see the out of memory in compiled mode. We run out a little more easily in DevMode, but we still run out quickly either way.

                              Our server doesn't have any authentication around the REST API, it is not using a secure protocol so there is no encryption, and we are not aware of any compression being done, so we are not certain what you are referring to in this regard. Also, remember that although we are doing more iterations, each iteration is processing a much smaller amount of data. It's worth noting that we are transferring data over a wire, and the data transmission time far outweighs almost everything else. We also will be playing with the chunk size, so we may be able to fix the issue with a much smaller number of requests. In the end we're adding about 100ms, which, considering the alternative (i.e., the browser crashes), is worth it (it's more important that the product actually functions than to save a few hundred ms). If we can fix the "Not enough storage" failure without the chunking we would much rather do that.

                              Thank you for all of your help and insight. We'll look into pursuing a more in-depth consulting project.

