Overview
The goal of this feature is to let project administrators run some computation automatically when artifacts are updated. For instance, this could be used to compute a risk as a combination of two factors.
This story is designed as an alternative to the model where teams have to rely on webhooks and an external server to do their own computation.
In this initial implementation, the idea is to leverage what was done on untrusted code execution in git pre-commit hooks (see epic #31078):
- User code will be a WebAssembly module (in order to guarantee the security of the system).
- The execution will be done after the artifact update, asynchronously.
- The result of the computation will lead to an artifact update (new changeset) with a dedicated user (like workflow triggers).
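To make the risk example concrete, here is a minimal sketch of the logic such a module could carry. It is written in Python for readability only; a real module would be compiled to WebAssembly from whatever language the team chooses. It assumes the module receives the artifact webhook payload as JSON on standard input and writes an update payload on standard output. The payload shape (`current.values` entries with `field_id`, `label` and `value`) and the field labels `probability`, `impact` and `risk` are assumptions made for illustration, not a confirmed format.

```python
import json
import sys

# Hypothetical field labels: a real tracker would use its own.
PROBABILITY_LABEL = "probability"
IMPACT_LABEL = "impact"
RISK_LABEL = "risk"


def compute_update(webhook_payload: dict) -> dict:
    """Build an update payload that sets risk = probability * impact."""
    # Assumption: current field values are listed under current.values,
    # each entry carrying a field_id, a label and a value.
    values = {v["label"]: v for v in webhook_payload["current"]["values"]}
    probability = values[PROBABILITY_LABEL]["value"]
    impact = values[IMPACT_LABEL]["value"]
    risk_field_id = values[RISK_LABEL]["field_id"]
    # Assumption: the update payload is a list of (field_id, value) pairs.
    return {"values": [{"field_id": risk_field_id, "value": probability * impact}]}


def run(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Entry point sketch: JSON payload in on stdin, update payload out on stdout."""
    json.dump(compute_update(json.load(stdin)), stdout)
```

Keeping the computation in a pure function (`compute_update`) separate from the I/O wrapper (`run`) makes the logic easy to test outside of any WebAssembly runtime.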
For a bit more context:
- It is mandatory to ensure that executed code cannot be used by threat actors to steal from or damage the infrastructure.
- While the execution of WebAssembly modules was already addressed with git pre-commit hooks, introducing the concept in Trackers has its own challenges. For instance, right now there is no intermediate representation of a changeset that could easily be passed to a WebAssembly module. If the module is executed after the artifact update, we can reuse the existing infrastructure that is used to send webhook payloads. In addition, as the code is executed asynchronously, we don't have to deal with code optimization as much as would be required if the code ran synchronously, as part of the online transaction (OLTP).
- As an extension of the previous point, it's likely that at some point we would like to see what was part of the user-submitted data and what was computed (at least for debugging or traceability reasons). By recording the changes made by the WebAssembly module as a dedicated changeset, the diff comes for free.
Functional requirements
Prerequisite: asynchronous processing in trackers is activated.
- As a tracker administrator, I should be able to select a WebAssembly (WASM) module and upload it for my tracker. There is only one WASM module per tracker (on purpose, to avoid having to deal with the order of execution of modules right now). If multiple computations must be performed, they have to be in the same WASM module.
- Execution of the WASM module can be suspended (module is there but not executed)
- For each tracker, the last 50 execution details (status, errors) are kept with: associated artifact id, changeset id, source payload and generated payload.
- When an artifact is updated, after the new changeset is saved in the database, there is a new action in the post-processing pipeline: execution of the WASM module (if any)
- The module is executed with the artifact webhook payload as an input (json)
- The module returns the payload as expected by PUT /artifacts/:id on standard output
- Tuleap saves the modified payload as a new changeset (if there is any change) as the workflow user
- The update will not trigger a new WASM module execution, but the rest of the pipeline does not change (it means that webhooks are triggered for the WASM update). The webhook payload is modified to indicate whether the source action is a WASM module or a regular update.
- The execution already gathers metrics (Prometheus); a label should be added to distinguish the source.
- When there is an error in the execution of the module, an email is sent to the tracker administrators to inform them.
- In order to ease the development of modules, the tracker description is added to the webhook payload. This is useful when a module wants to produce a value in a List field.
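The post-processing rules listed above (suspension, no re-trigger on module-generated updates, saving only when something changed, notifying administrators on error, keeping the last 50 executions) could be sketched as follows. This is an illustration in Python, not Tuleap's actual implementation: `TrackerModule`, `Execution`, `post_process`, `save_changeset` and `notify_admins` are hypothetical names, the WASM execution is replaced by a plain callable, and "no change" is simplified to "the module returned no values".

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_EXECUTIONS_KEPT = 50  # from the requirement: last 50 executions per tracker


@dataclass
class Execution:
    artifact_id: int
    changeset_id: int
    source_payload: dict
    generated_payload: Optional[dict]
    status: str  # "ok", "no_change" or "error"
    error: Optional[str] = None


@dataclass
class TrackerModule:
    run: Callable[[dict], dict]  # stands in for the WASM module execution
    suspended: bool = False
    # A bounded deque silently drops the oldest record once 50 are stored.
    executions: deque = field(
        default_factory=lambda: deque(maxlen=MAX_EXECUTIONS_KEPT)
    )


def post_process(module, artifact_id, changeset_id, payload,
                 triggered_by_module, save_changeset, notify_admins):
    """Run the module for one changeset; return the record, or None if skipped."""
    # Skip when there is no module, when it is suspended, or when the changeset
    # was itself produced by the module (no re-trigger).
    if module is None or module.suspended or triggered_by_module:
        return None
    try:
        generated = module.run(payload)
    except Exception as exc:
        record = Execution(artifact_id, changeset_id, payload, None,
                           "error", str(exc))
        notify_admins(record)  # e.g. an email to the tracker administrators
    else:
        # Simplification: an empty "values" list is treated as "no change".
        changed = bool(generated.get("values"))
        if changed:
            save_changeset(artifact_id, generated)  # saved as the workflow user
        record = Execution(artifact_id, changeset_id, payload, generated,
                           "ok" if changed else "no_change")
    module.executions.append(record)
    return record
```

Recording both the source and the generated payload in each `Execution` mirrors the requirement that execution details carry enough information to debug a failing module.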
The feature is available as a plugin to limit the amount of clutter on Tracker.
Known limitations
As this feature is currently at a stage where we want to see whether there is interest in it, the following things are not covered:
- XML import/export
- Tracker duplication
- Ability to read attachments