epic #26790 Integration with Qlik (BI)

Summary

Summary *

Integration with Qlik (BI)

CategoryTrackers

Description

Overview

The goal of this epic is to have REST API format that will better suit with consumption by a BI tool (eg. Qlik, Power BI, etc).

The following routes will be covered:

GET /tracker_reports/{id}/artifacts
GET /trackers/{id}/artifacts
GET /artifacts

Format & approaches

Initially CSV was identified as the best tradeoff as an output format, despite its limitations, because it was easy to injest by most BI tools without any end users manipulations.

After deeper thoughts, it will not be possible to generate a CSV output that doesn't require end users manipulations. At the very least, end users will have to deal with pagination as the output cannot be sent without any (risk of server overload, request truncation, timeout, etc). Moreover CSV as a data format is already by itself problematic:

It's exposed to the issues related to encoding (charset, BOM & co).
It as issues with date time values, separator and quotations.
The topic is well covered in the RFC of the format itself

With this in mind:

Given that most BI tools are able to injest JSON (at least Qlik and PowerBI)
But the JSON exposed by Tuleap APIs is too complex for average BI users to parse
The current proposal is to expose a simplified json output, easier to parse.

How to parse JSON with Qlik: https://www.youtube.com/watch?v=8xkwFjDjO84

Json

The identified routes, get a new optional parameter (eg. ouput_format) that can be either nested (default) or flat (new, BI easy). The output representation is then:

[
{"field_name_int": 0, "field_name_array": ["A", "B"], "field_name_c": ...},
...
]

The logic:

One line per artifact
Each field value is made of
1. one key with field name (not id)
2. one value with field value
When field value is :
1. a simple type (string, int, float), the value is output directly
2. has multiple values (list fields), the value is itself an array with the simple values within
3. cannot be represented with a single value (eg. a Text field that comes with a format to be properly managed) have a special treatement

Special fields

This section describe the fields that requires a special treatement for further discussions

Text
- Text field values have 3 possible formats (text, markdown & HTML). For any sensible treatment, it's required for consumer to know the format.
- As format is reduced to a single value, it's not possible to specify the format so there are two options
  1. The approach for CSV or XLSX output: transform the output to text only (remove HTML markup).
  2. Output the raw value and let the consumer figure out the format.

Right now option 1 is in place for experimentation

CSV

Note: From a pure HTTP point of view we should request an Accept: text/csv header to the request (as we do for XML) but we have no guaranty that tools will provide such customization to their end users. Plus, people who will do the BI integration are likely to be less tech-savvy and the management of HTTP headers would be too complex.

When format=csv is used, the query result will be sent with "Content-type: text/csv" with UTF8 encoding.

The separator to be used will be the one chosen in user preferences.

Format

The output will be paginated with the same rules than with JSON output.

The header of the CSV document will be the name of the columns made of field's name (not label). Eg: artifact_id, start_date, submitted_by, ...

The content will be adapted to CSV with following rules:

String, Text, Float, Int, Computed, Artifact ID, Per tracker ID: literal value (eg. 5, 2.0, Error when doing...)
Date, Date + Time, Submitted on, Last update date: date according to CSV preferences
Select box, multi select box, radio button, check box, open list, shared fields:
- static: literal value (eg. UX & UI)
- users: login name (eg. jdoe)
- groups: group name (eg. developers)

Field not supported will not be part of CSV (without warning):

Permissions on artifacts
Artifact links
Cross references
File upload
Burnup
Burndown
Encrypted fields

Open point

How do we represent lists to make them useful ?

Given a multi select box Category with "Database", "Web UI", "REST API" as possible values and an artifact whose categories are "Web UI" & "Database":

Output one line with sub separator :

artifact_id, Category
1234, Web UI|Database

Output one line per type

artifact_id, category
1234, Web UI
1234, Database

The first representation is more inline with Tuleap output but we don't know how tools will be able to consume this kind of output.

The second representation is more "natural" in term of CSV but would be more complex. Example with the previous artifact with Language "FR, EN, DE":

artifact_id, category, language
1234, Web UI, FR
1234, Web UI, EN
1234, Web UI, DE
1234, Database, FR
1234, Database, EN
1234, Database, DE

Progress

Start dateEmpty

End dateEmpty

StatusClosed

Details

Artifact ID#26790

Submitted ByManuel Vacelet (vaceletm)

Last Modified On2024-09-26 10:42

Submitted On2022-05-16 14:21

Attachments

AttachmentsEmpty

References

Cross References

Referencing epic #26790

Git commit

tuleap/tuleap/stable

Add a `flat` output format to the REST endpoint retrieving artifacts from a report 8b876681b4

Thomas Gerbet (tgerbet) , 2023-02-20 17:01

Show items referenced by epic #26790

Referenced by epic #26790

Artifact Tracker v5

story #31087 have flat json representation on `/tracker_reports/:id/artifacts`

story #37213

story #37214

story #37216

story #37218 output flat json with semicolon separated lists

Permissions

Restrict access to this artifact for the following user groups:

Follow-ups

Manuel Vacelet (vaceletm)

2024-09-26 10:42

Description

Something went wrong, the follow up content couldn't be loaded

Only formatting have been changed, you should switch to markup to see the changes
Status changed from On going to Closed

Manuel Vacelet (vaceletm)

2024-03-14 08:52

Summary
-Integration with BI ~~tools~~
+Integration with Qlik (BI)
Status changed from Scheduled to On going
links
- Added Child: story #37218

Manuel Vacelet (vaceletm)

2023-12-22 15:34

Status changed from Planned to Scheduled

Manuel Vacelet (vaceletm)

2023-10-04 14:15

Status changed from On going to Planned

Manuel Vacelet (vaceletm)

2023-02-21 16:08

links
- Added Child: story #31087

Manuel Vacelet (vaceletm)

2023-02-20 12:04

Description

Something went wrong, the follow up content couldn't be loaded

Only formatting have been changed, you should switch to markup to see the changes

Manuel Vacelet (vaceletm)

2022-12-22 10:48

Status changed from Planned to On going

Manuel Vacelet (vaceletm)

2022-05-30 16:40

Description

Something went wrong, the follow up content couldn't be loaded

Only formatting have been changed, you should switch to markup to see the changes

Manuel Vacelet (vaceletm)

2022-05-30 16:39

Description

Something went wrong, the follow up content couldn't be loaded

Only formatting have been changed, you should switch to markup to see the changes
Permissions set to

Public

Overview

Format & approaches

Json

Special fields

CSV

Format

Open point

Referencing epic #26790

Git commit

tuleap/tuleap/stable

Referenced by epic #26790

Artifact Tracker v5

Follow-ups