Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
Clear All
new posts

  • Enhancement: BatchUploader lookup result cache to increase performance

    Hi Isomorphic,

    please see this thread and the performance issue there.
    While writing it I had another idea how to improve BatchUploader.
    For my main table I can currently import about 1000 rows only, as otherwise the response takes to long, making the browser stop waiting for the response.
    The problem here is the (1st) upload part of BatchUploader, not the 2nd commit part.

    The queries issued serverside are mostly the same:
    • lookup country "Germany" to 455
    • lookup country "Austria" to 446
    • lookup country "Switzerland" to 447
    • lookup temperature "cold" to 1
    • lookup temperature "warm" to 2
    • lookup temperature"hot" to 3
    • lookup product "ABC" 1001
    • lookup product "DEF" 1002
    • ....
    During a run of BatchUpload the result here will never change (especially if you run the queries in a transaction, then is is even guaranteed).

    If you maintained a cache of
    "DataSource" - "key value" - "resulting id"
    during a run of BatchUpload, you'd need to run every different country lookup only once, every different temperature lookup only once, ...

    I assume that this is an easy and valuable enhancement, that would severely speed up BatchUploader, increasing the maximum number of possible records in a CSV.

    Best regards
    Blama

  • #2
    Hi Isomorphic,


    let me modify the example in order to improve two other use cases:
    • lookup country "Germany" to 455 for field country_id
    • lookup country "Austria" to 446 for field country_id
    • lookup country "Switzerland" to 447 for field country_id
    • lookup temperature "cold" to 1 for field temperature_id
    • lookup temperature "warm" to 2 for field temperature_id
    • lookup temperature"hot" to 3 for field temperature_id
    • lookup product "ABC" 1001 for field product1_id
    • lookup product "DEF" 1002 for field product1_id
    • lookup product "ZYX" 1055 for field product2_id (different field also linking to product)
    • lookup product "WVU" 1054 for field product2_id (different field also linking to product)
    • ....
    If you ever implement the feature from this thread, it would be better if the cache is maintained as:
    "uploadFieldName" - "key value" - "resulting id" (because of the potentially different results per uploadField that link to the same DataSource)

    Also, if a lookup value is not found, this should be tracked as well, so that the next time the BatchUploader wants to request a value, because it does not have a stored value for it already, it does directly know that the query won't return any results. (e.g. if in all CSV entries the ISO code of a country is used instead of the name, that is stored in the DB). In this case the BatchUpload could directly add an error for the row/field in question, without issuing a query.


    Best regards
    Blama

    Comment

    Working...
    X