•  
      request #21364 Optimize retrieval of the list of artifact submitters/updaters
    Infos
    #21364
    Thomas Gerbet (tgerbet)
    2021-05-18 14:38
    2021-05-17 17:03
    22960
    Details
    Optimize retrieval of the list of artifact submitters/updaters

    The retrieval of this list can be quite slow in a tracker with a consequent number of artifacts. When you have a filter on your tracker report on e.g. "Submitted By" it starts to be an issue.

    The query to retrieve the list of submitter (the one to retrieve the list artifact updaters is similar) looks like this:

    SELECT DISTINCT user.user_id, CONCAT(user.realname, '(', user.user_name, ')') AS full_name, user.realname
    FROM tracker_artifact AS a
    INNER JOIN user ON (user.user_id = a.submitted_by AND a.tracker_id = ?)
    ORDER BY user.realname
    

    which leads to this execution plan:

    +----+-------------+-------+------------+--------+-----------------------+----------------+---------+-----------------------+------+----------+---------------------------------+
    | id | select_type | table | partitions | type   | possible_keys         | key            | key_len | ref                   | rows | filtered | Extra                           |
    +----+-------------+-------+------------+--------+-----------------------+----------------+---------+-----------------------+------+----------+---------------------------------+
    |  1 | SIMPLE      | a     | NULL       | ref    | idx_tracker_id,idx_my | idx_tracker_id | 4       | const                 | 7041 |   100.00 | Using temporary; Using filesort |
    |  1 | SIMPLE      | user  | NULL       | eq_ref | PRIMARY               | PRIMARY        | 4       | tuleap.a.submitted_by |    1 |   100.00 | NULL                            |
    +----+-------------+-------+------------+--------+-----------------------+----------------+---------+-----------------------+------+----------+---------------------------------+
    

    Notice the large number of rows that needs to filtered to deal with the duplication/ordering/concatenation.

    By moving the de-duplication earlier in the query we can help the DB engine:

    SELECT user.user_id, CONCAT(user.realname, '(', user.user_name, ')') AS full_name, user.realname
    FROM user
    JOIN (
        SELECT DISTINCT tracker_artifact.submitted_by FROM tracker_artifact WHERE tracker_id = XXX
    ) AS tracker_submitted_by ON (user.user_id = tracker_submitted_by.submitted_by)
    ORDER BY user.realname
    

    Execution plan of the modified query:

    +----+-------------+------------------+------------+--------+-----------------------+---------+---------+-----------------------------------+------+----------+---------------------------------------+
    | id | select_type | table            | partitions | type   | possible_keys         | key     | key_len | ref                               | rows | filtered | Extra                                 |
    +----+-------------+------------------+------------+--------+-----------------------+---------+---------+-----------------------------------+------+----------+---------------------------------------+
    |  1 | PRIMARY     | <derived2>       | NULL       | ALL    | NULL                  | NULL    | NULL    | NULL                              |  441 |   100.00 | Using temporary; Using filesort       |
    |  1 | PRIMARY     | user             | NULL       | eq_ref | PRIMARY               | PRIMARY | 4       | tracker_submitted_by.submitted_by |    1 |   100.00 | NULL                                  |
    |  2 | DERIVED     | tracker_artifact | NULL       | range  | idx_tracker_id,idx_my | idx_my  | 8       | NULL                              |  441 |   100.00 | Using where; Using index for group-by |
    +----+-------------+------------------+------------+--------+-----------------------+---------+---------+-----------------------------------+------+----------+---------------------------------------+
    

    On my test instance this gives an interesting performance improvement: the initial query takes ~0.27sec, the modified one ~0.02sec.

    Trackers
    All
    Empty
    • [ ] enhancement
    • [ ] internal improvement
    Empty
    Stage
    Thomas Gerbet (tgerbet)
    Closed
    2021-05-18
    Attachments
    Empty
    References

    Follow-ups

    User avatar
    Thomas Gerbet (tgerbet)2021-05-17 17:19

    Patch under review: gerrit #22741.


    • Summary
      -Optimize retrieval of list of artifact submitters/modifiers 
      +Optimize retrieval of the list of artifact submitters/updaters 
    • Original Submission
      Something went wrong, the follow up content couldn't be loaded
      Only formatting have been changed, you should switch to markup to see the changes
    • Status changed from Under implementation to Under review
    • Reported in version set to All