•  
      epic #26790 Integration with Qlik (BI)
    Summary
    Integration with Qlik (BI)
    Trackers

    Overview

    The goal of this epic is to have REST API format that will better suit with consumption by a BI tool (eg. Qlik, Power BI, etc).

    The following routes will be covered:

    • GET /tracker_reports/{id}/artifacts
    • GET /trackers/{id}/artifacts
    • GET /artifacts

    Format & approaches

    Initially CSV was identified as the best tradeoff as an output format, despite its limitations, because it was easy to injest by most BI tools without any end users manipulations.

    After deeper thoughts, it will not be possible to generate a CSV output that doesn't require end users manipulations. At the very least, end users will have to deal with pagination as the output cannot be sent without any (risk of server overload, request truncation, timeout, etc). Moreover CSV as a data format is already by itself problematic:

    • It's exposed to the issues related to encoding (charset, BOM & co).
    • It as issues with date time values, separator and quotations.
    • The topic is well covered in the RFC of the format itself

    With this in mind:

    • Given that most BI tools are able to injest JSON (at least Qlik and PowerBI)
    • But the JSON exposed by Tuleap APIs is too complex for average BI users to parse
    • The current proposal is to expose a simplified json output, easier to parse.

    How to parse JSON with Qlik: https://www.youtube.com/watch?v=8xkwFjDjO84

    Json

    The identified routes, get a new optional parameter (eg. ouput_format) that can be either nested (default) or flat (new, BI easy). The output representation is then:

    [
    {"field_name_int": 0, "field_name_array": ["A", "B"], "field_name_c": ...},
    ...
    ]
    

    The logic:

    1. One line per artifact
    2. Each field value is made of
      1. one key with field name (not id)
      2. one value with field value
    3. When field value is :
      1. a simple type (string, int, float), the value is output directly
      2. has multiple values (list fields), the value is itself an array with the simple values within
      3. cannot be represented with a single value (eg. a Text field that comes with a format to be properly managed) have a special treatement

    Special fields

    This section describe the fields that requires a special treatement for further discussions

    • Text
      • Text field values have 3 possible formats (text, markdown & HTML). For any sensible treatment, it's required for consumer to know the format.
      • As format is reduced to a single value, it's not possible to specify the format so there are two options
        1. The approach for CSV or XLSX output: transform the output to text only (remove HTML markup).
        2. Output the raw value and let the consumer figure out the format.

    Right now option 1 is in place for experimentation

    CSV

    Note: From a pure HTTP point of view we should request an Accept: text/csv header to the request (as we do for XML) but we have no guaranty that tools will provide such customization to their end users. Plus, people who will do the BI integration are likely to be less tech-savvy and the management of HTTP headers would be too complex.

    When format=csv is used, the query result will be sent with "Content-type: text/csv" with UTF8 encoding.

    The separator to be used will be the one chosen in user preferences.

    Format

    The output will be paginated with the same rules than with JSON output.

    The header of the CSV document will be the name of the columns made of field's name (not label). Eg: artifact_id, start_date, submitted_by, ...

    The content will be adapted to CSV with following rules:

    • String, Text, Float, Int, Computed, Artifact ID, Per tracker ID: literal value (eg. 5, 2.0, Error when doing...)
    • Date, Date + Time, Submitted on, Last update date: date according to CSV preferences
    • Select box, multi select box, radio button, check box, open list, shared fields:
      • static: literal value (eg. UX & UI)
      • users: login name (eg. jdoe)
      • groups: group name (eg. developers)

    Field not supported will not be part of CSV (without warning):

    • Permissions on artifacts
    • Artifact links
    • Cross references
    • File upload
    • Burnup
    • Burndown
    • Encrypted fields

    Open point

    How do we represent lists to make them useful ?

    Given a multi select box Category with "Database", "Web UI", "REST API" as possible values and an artifact whose categories are "Web UI" & "Database":

    • Output one line with sub separator :
    artifact_id, Category
    1234, Web UI|Database
    
    • Output one line per type
    artifact_id, category
    1234, Web UI
    1234, Database
    

    The first representation is more inline with Tuleap output but we don't know how tools will be able to consume this kind of output.

    The second representation is more "natural" in term of CSV but would be more complex. Example with the previous artifact with Language "FR, EN, DE":

    artifact_id, category, language
    1234, Web UI, FR
    1234, Web UI, EN
    1234, Web UI, DE
    1234, Database, FR
    1234, Database, EN
    1234, Database, DE
    
    Empty
    Progress
    Empty
    Empty
    On going
    Details
    #26790
    Manuel Vacelet (vaceletm)
    2024-03-14 08:52
    2022-05-16 14:21
    Attachments
    Empty
    References

    Follow-ups

    User avatar
    • Description
      Something went wrong, the follow up content couldn't be loaded
      Only formatting have been changed, you should switch to markup to see the changes
    User avatar
    • Description
      Something went wrong, the follow up content couldn't be loaded
      Only formatting have been changed, you should switch to markup to see the changes
    User avatar
    • Description
      Something went wrong, the follow up content couldn't be loaded
      Only formatting have been changed, you should switch to markup to see the changes
    • Permissions set to