Metadata-Version: 2.4
Name: mass-prebuild
Version: 1.8.0
Summary: A set of tools to massively pre-build reverse dependencies for a RPM package
Maintainer-email: Frédéric Bérat <fberat@redhat.com>
License: GPL-2.0-or-later
Project-URL: Homepage, https://gitlab.com/fedora/packager-tools/mass-prebuild
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: COPYING
Requires-Dist: argcomplete
Requires-Dist: copr
Requires-Dist: fedpkg
Requires-Dist: filelock
Requires-Dist: koji
Requires-Dist: pycurl
Requires-Dist: python-daemon
Requires-Dist: PyYAML
Dynamic: license-file

[![Latest Source Release](https://gitlab.com/fedora/packager-tools/mass-prebuild/-/badges/release.svg)](https://gitlab.com/fedora/packager-tools/mass-prebuild/-/releases) [![pipeline status](https://gitlab.com/fedora/packager-tools/mass-prebuild/badges/main/pipeline.svg)](https://gitlab.com/fedora/packager-tools/mass-prebuild/-/commits/main)

[[_TOC_]]

# Mass pre-builder - Why ?

When developing an application, or a library, there are generally a bunch of tests coming along.
These tests aim to verify, as much as possible, whether or not the functionality works as intended, and doesn't bring any regression.
That comes with a drawback: you're unlikely to catch real-life problems, with all their complexity.
This gets even more complicated when a new feature needs to be implemented, as it may have unexpected side effects on users' use cases.

Then comes an idea: why not run the end users' test cases?
That will not cover everything either, but it would bring another level of confidence in the stability of the changes that come with an update of your application or library.
Yet, while you may have a rough idea of who uses your project, this view can be limited or, on the contrary, slightly overwhelming.

Let's look at a simple example: there are roughly 1200 packages that depend on GNU autoconf in Red Hat-based distributions.
Knowing all of them by heart is unlikely, building them manually would take ages, and figuring out whether a failure is due to a change in GNU autoconf is very difficult.

Behold the Mass Pre-Builder!

# What is the mass pre-builder ?

The mass pre-builder (mpb) is a set of tools that helps the user create mass rebuilds around a limited set of packages, in order to assess the stability of a given update.

The idea is rather simple. Given a package or a set of packages, namely "main packages", the mass pre-builder will calculate the list of their direct reverse dependencies: packages that explicitly list one of the main packages in their "BuildRequires" field.
The tooling first builds the main packages using the distribution's facilities, which should include a set of test cases that validate general functionalities.
Assuming these packages are built successfully, they are then used as base packages in order to build the reverse dependencies and execute their own test cases.
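
As a rough illustration of what "direct reverse dependencies" means here, the following Python sketch filters a hand-written mapping of packages to their BuildRequires. The `direct_reverse_deps` helper and the sample data are hypothetical; the tool itself queries the distribution's repositories, which this sketch does not attempt to reproduce.

```python
def direct_reverse_deps(main_packages, build_requires):
    """Return the packages that list a main package in their BuildRequires."""
    mains = set(main_packages)
    return sorted(
        name
        for name, requires in build_requires.items()
        # A main package is never its own reverse dependency.
        if name not in mains and mains & set(requires)
    )


# Hypothetical sample data, for illustration only.
BUILD_REQUIRES = {
    "autoconf": ["m4", "perl"],
    "automake": ["autoconf", "perl"],
    "libtool": ["autoconf", "automake"],
    "zlib": ["gcc"],
}

print(direct_reverse_deps(["autoconf"], BUILD_REQUIRES))
```

Only packages that name a main package directly are selected; transitive dependencies are deliberately left out of this sketch.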

That gives a first set of results: there may be successful builds (hopefully the majority), but also failures that may or may not be due to the changes made to the main packages.
To reduce the uncertainty and limit the list of packages to analyze, as soon as a failure is detected, the mass pre-builder creates another mass build, in parallel to the original one, but without the changes that were introduced into the main packages: a pristine build.
This pristine build therefore only includes a subset of the reverse dependencies: the ones that failed in the original run.

Once all the package builds are done, the results fall into 3 major categories:

1. The successful ones
1. The ones that failed only with the modified packages
1. The ones that failed with both the modifications and the pristine version

Out of these, the first category can likely be ignored, since the packages don't seem to have been affected by the changes.

The second category needs much more attention.
These are the ones that cry: "Hey, there seems to be a big issue with your changes!".
These failures need to be analyzed in order to figure out whether the problem is due to the changes that have been introduced, or to a mistake on the end user's side (e.g. use of a deprecated feature that got removed).

The last category is a bit trickier. Since these builds also failed with the pristine packages, failures caused by the new changes may be hidden among them.
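
The decision logic behind these three categories can be sketched in a few lines of Python. The `categorize` function and its return strings are hypothetical names chosen for this illustration; they are not taken from the tool's code.

```python
from typing import Optional


def categorize(modified_ok: bool, pristine_ok: Optional[bool] = None) -> str:
    """Classify one reverse dependency from its two build results.

    pristine_ok stays None while the pristine build has not finished,
    or when it was never needed because the modified build succeeded.
    """
    if modified_ok:
        return "success"
    if pristine_ok is None:
        return "waiting for the pristine build"
    if pristine_ok:
        return "failed only with the modified packages"
    return "failed with both"


print(categorize(True))
print(categorize(False, pristine_ok=True))
print(categorize(False, pristine_ok=False))
```

Note that a pristine result is only consulted after a failure, which mirrors the fact that the pristine build covers only the packages that failed in the original run.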

Now let's look in a bit more detail at what we have under the hood.

# How does it work ?

MPB is a set of Python scripts made to abstract the use of commonly available infrastructures.
Although the primary infrastructure to be used is [COPR](https://pagure.io/copr/copr), there are plans to implement support for other infrastructures like [Koji](https://pagure.io/koji/) and potentially [Beaker](https://beaker-project.org/).

## Installation

### From official releases

The tool is available as a COPR package at the following address: <https://copr.fedorainfracloud.org/coprs/fberat/mass-prebuild/>.

There are packages available for EPEL8, EPEL9 and all active Fedora releases.

In order to install the package, execute the following commands:

```bash
$> sudo dnf copr enable fberat/mass-prebuild
$> sudo dnf install mass-prebuild
```

Once the package is installed, the `man mass-prebuild` command provides more information on the tools available and how to use them.

Latest official release is: [![Latest Release](https://gitlab.com/fberat/mass-prebuild-dist/-/badges/release.svg)](https://gitlab.com/fberat/mass-prebuild-dist/-/releases)

### Directly from source

The sources can be found here: <https://gitlab.com/fedora/packager-tools/mass-prebuild>

In order to install the latest development version, you can execute the following commands:

```bash
$> sudo dnf install git copr-cli koji python3-posix_ipc
$> git clone https://gitlab.com/fedora/packager-tools/mass-prebuild.git
$> cd mass-prebuild
$> pip install .
$> mkdir -p ~/.mpb
$> cp -r examples/*.conf.d ~/.mpb
```

## Basic usage

This section gives some basic examples on how to use the tool.
The tools cover more workflows than the ones depicted in this section; these may be detailed in the [man/mass-prebuild.adoc](man/mass-prebuild.adoc) document.

### Setting up COPR

Before using the Mass pre-build tool, make sure that you are able to use copr-cli.

You will need a valid COPR token; follow the instructions available here to get one: [COPR API](https://copr.fedorainfracloud.org/api/)

Then make sure copr-cli recognizes you:

```bash
$> copr-cli whoami
```

NOTE: The `whoami` command is not gated by authentication, so it doesn't actually check that your token is valid; it only confirms that you have one.

Known issue: When renewing your COPR token, copr-cli may fail to clear its own credential cache.
In such a case, the `whoami` command will work fine, but you won't be able to create or modify projects.
In theory, mass pre-builder versions above v0.5.0 should clear the cache on connection failure, but it may be worth clearing it manually to be sure:

```bash
$> rm -rf ~/.cache/copr/*
```

### Checking Fedora notifications

Fedora notifications are centralized.
If some notifications are enabled, the thousands of builds created by the Mass pre-builder in your COPR instance can flood your mailbox.
If that happens, carefully check [Fedora notifications' settings](https://apps.fedoraproject.org/notifications/).

### Creating a simple config file

In order to use the tool, you need to prepare a small configuration file, containing information on what you want to build and how.
The following is a simple example:

```bash
$> mkdir -p ~/work/mpb/autoconf
$> cd ~/work/mpb/autoconf
$> cat > mpb.config << EOF
> archs: x86_64
chroot: fedora-rawhide
name: autoconf-2.72c
packages:
  autoconf:
    src_type: file
    src: /home/jdoe/work/fedora/autoconf/autoconf-2.72c-1.fc37.src.rpm
data: /home/jdoe/work/mpb/
verbose: 1
> EOF
```

In this simple example, we are going to request the mass pre-builder to:

1. Create a new COPR project, named "autoconf-2.72c".
1. Set this project to have a fedora-rawhide chroot, for x86_64.
1. Upload and build "autoconf-2.72c-1.fc37.src.rpm" in this chroot.
1. Automatically calculate the reverse dependencies for the package "autoconf".
1. Build these reverse dependencies against the package that we have uploaded.
1. Store the data for all the "failed" builds into "/home/jdoe/work/mpb/".

In parallel to the main project, you'll notice that the Mass pre-builder will create an "autoconf-2.72c.checker" project.
This project is used to assess whether failures that occur in the main project are due to the modifications you have made to the main packages, and not to general failures of the reverse dependencies.

This configuration file may be provided in 3 different ways:

1. A local "mpb.config" file as shown in this example
1. Through the command line: mpb --config=~/work/mpb/autoconf/my_config_file
1. In the default path: ~/.mpb/config

The config files are looked up in this priority order: as soon as one of them is found, the remaining ones are skipped.
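
Taking the list above literally, the lookup could be sketched as follows in Python. The `find_config` helper and its parameters are hypothetical and not the tool's actual code; only the three locations and their order come from the text.

```python
from pathlib import Path
from typing import Optional


def find_config(cli_path: Optional[str] = None,
                cwd: Optional[Path] = None,
                home: Optional[Path] = None) -> Optional[Path]:
    """Return the first existing configuration file, or None."""
    cwd = cwd or Path.cwd()
    home = home or Path.home()

    # Candidates in the order given by the list above.
    candidates = [cwd / "mpb.config"]
    if cli_path:
        candidates.append(Path(cli_path).expanduser())
    candidates.append(home / ".mpb" / "config")

    for candidate in candidates:
        if candidate.is_file():
            return candidate  # The remaining candidates are skipped.
    return None


if __name__ == "__main__":
    print(find_config())
```

The `cwd` and `home` parameters only exist to make the sketch easy to try against a temporary directory.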

### Executing the tool

When you execute the tool, you'll get an output similar to the following:

```bash
$> mpb
Loading mpb.config
Using copr back-end
Populating package list with autoconf
Executing stage 0 (prepare)
Prepared build autoconf-2.72c (ID: 18)
Executing stage 1 (check_prepare)
Checking build for autoconf-2.72c (ID: 18)
You can now safely interrupt this stage
Restart it later using one of the following commands:
"mpb --buildid 18"
"mpb --buildid 18 --stage 1"
Build status: /
        0 out of 1 builds are done.
        Pending: 0
        Running: 1
        Success: 0
        Under check: 0
        Manual confirmation needed: 0
        Failed: 0
```

Note the last 3 elements:

- Under check: the package has failed, and a build is started in the ".checker" project
- Manual confirmation needed: the package has failed both in the main project and the ".checker" one
- Failed: the package has failed in the main project, but not in the ".checker" one

At this point, it may be wise to save the build ID in the config file.
Stop the execution using "ctrl-C", then save the build ID in the configuration:

```bash
$> echo "build_id: 18" >> mpb.config
```

That way, when you execute the command `mpb` from within this folder again, the tool will automatically know that you want to continue the build, and not start a new one.

Another way would be to execute `mpb --buildid 18`, as stated in the output.

The tool accepts a subset of commands through the command-line, and an extensive set of commands through the configuration file.
Any argument passed through the command-line overrides commands passed through the configuration file.
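
The override rule can be pictured as a simple dictionary overlay. This is only a sketch of the principle: the `merge_options` helper is hypothetical, and the option names are borrowed from the configuration examples in this document.

```python
def merge_options(config: dict, cli_args: dict) -> dict:
    """Overlay command-line values on top of configuration-file values."""
    merged = dict(config)
    # Only options explicitly given on the command line (not None) override.
    merged.update({key: value for key, value in cli_args.items() if value is not None})
    return merged


config = {"build_id": 18, "verbose": 1}
cli_args = {"build_id": 42, "verbose": None}

print(merge_options(config, cli_args))
```

Here `build_id` comes from the command line, while `verbose` keeps the value from the configuration file because it was not given on the command line.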

For more details about the options available through the configuration file, please have a look at [mpb.config.example](mpb.config.example).

### Reverse dependencies

The "reversedeps" option can hold a list of reverse dependencies you want the tool to use instead of calculating them.
That may be useful if you want to rebuild only a subset of packages instead of, let's say, the 6K+ ones for gcc.

```bash
$> cat > mpb-1.config << EOF
> arch: x86_64
chroot: fedora-rawhide
packages:
  autoconf:
    src_type: file
    src: /home/jdoe/work/fedora/autoconf/autoconf-2.72c-1.fc37.src.rpm
reversedeps:
  list:
    libtool:
      priority: 0
    automake:
      priority: 1
name: autoconf-2.72c-1
data: /home/jdoe/work/mpb/autoconf
verbose: 1
> EOF
```

The "priority" field in the reverse dependencies can be given in order to specify a build order.
This configuration can be used independently for "packages" and "reversedeps".
In the case shown above, libtool will be built before automake.
If you are not interested in giving them a build order, this can be simplified to:

```yaml
reversedeps:
  list: libtool automake
```
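
To show how the two notations relate, here is a small Python sketch that brings both forms to a common "name to priority" mapping and derives a build order from it. The helper names are hypothetical, and the default priority of 0 for the short form is an assumption made for this illustration, not documented behavior.

```python
def normalize_package_list(value, default_priority=0):
    """Accept 'a b c' or {'a': {'priority': n}} and return {name: priority}."""
    if isinstance(value, str):
        # Short form: space-separated names, no explicit priority (assumed 0).
        return {name: default_priority for name in value.split()}
    return {
        name: (options or {}).get("priority", default_priority)
        for name, options in value.items()
    }


def build_order(priorities):
    """Lower priority values are built first; ties are sorted by name."""
    return sorted(priorities, key=lambda name: (priorities[name], name))


detailed = {"libtool": {"priority": 0}, "automake": {"priority": 1}}
print(build_order(normalize_package_list(detailed)))
print(normalize_package_list("libtool automake"))
```

With the detailed form, libtool comes before automake, as described above; with the short form, both end up at the same priority and no particular order is implied.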

Let's come back to our autoconf build. After a while (about 30 minutes in this case), the tool moves to the next stage and calculates the reverse dependencies that are valid for x86_64:

```bash
Executing stage 2 (build)
Calculating reverse dependencies.
Level 0 depth for x86_64
Checking 1158 packages.
Retrieved 1151 packages.
Prepare discriminator for priorities calculation
100% done
Setting priorities
7 pass done.
Populating package list with [package names here]
```

This is done automatically if there is no "reversedeps: list:" configuration option set.
The priorities are calculated using the dependency graph.
The tool will try to group packages so that they are built in dependency order, as far as possible.
For example, if you are trying to calculate reverse dependencies for a package named A, and you get the following graph:

- B -> A
- C -> A and C -> B

The tool will try to first build B, then once the build is done, it will build C.
The goal is to catch transitive failures, where modifications made to A affect B in a way that breaks C without breaking the build of B itself.
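
The ordering in this example can be reproduced with a plain topological sort. The sketch below uses `graphlib` from the Python standard library (available since Python 3.9) as a stand-in; the tool's actual priority calculation is not described here and may differ. The `build_batches` helper is a hypothetical name.

```python
from graphlib import TopologicalSorter

# Each key depends on the packages in its set: B -> A, C -> A and C -> B.
GRAPH = {"B": {"A"}, "C": {"A", "B"}}


def build_batches(graph):
    """Group packages into batches that can be built one after another."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # Raises graphlib.CycleError on circular dependencies.
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


print(build_batches(GRAPH))
```

Each batch only contains packages whose dependencies were all built in earlier batches, so B lands in a batch before C.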

### Collecting data

At the very end of the process, you will get 3 kinds of outcomes for the builds:

1. Success
1. Manual Confirmation needed
1. Failed

By default, the tool will automatically download all the available artifacts for (2) and (3), in the following location:

```
    [data field]/[name of the project]/[chroot]/[name of the failing package]
```

More specifically, considering our configuration file:

```
    ~/work/mpb/autoconf/autoconf-2.72c/fedora-rawhide-x86_64/
```
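
The layout can be expressed with `pathlib`. This is a minimal sketch: the `artifact_dir` helper is hypothetical, the `<chroot>-<arch>` directory name is inferred from the example path above, and the input values (including the `php` package) are picked only to illustrate the pattern.

```python
from pathlib import Path


def artifact_dir(data, project, chroot, arch, package):
    """Build [data field]/[project name]/[chroot]/[failing package]."""
    # Assumption: the chroot directory combines the chroot name and the arch.
    return Path(data) / project / f"{chroot}-{arch}" / package


path = artifact_dir(
    "/home/jdoe/work/mpb/autoconf",
    "autoconf-2.72c",
    "fedora-rawhide",
    "x86_64",
    "php",
)
print(path)
```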

The kind of data that is gathered, and the kind of outcome for which it is gathered, can be controlled from the command line through the `--config-list` option, or from the configuration file through `config-list:`.

### The command line options

There are several command line arguments that can be provided to the tool.
They generally have an equivalent in the configuration file, with a few notable exceptions:

- help: Prints the help and exits
- version: Prints the version and exits
- config: Uses a specific file as configuration input

_**When an argument is provided through the command line, it overrides values previously set in the configuration file or saved in the database.**_

The command-line options are detailed in the [man/mpb.adoc](man/mpb.adoc) document.

### Generate a new config with failed builds

If needed, one of the tools can generate a new configuration containing the list of failed packages from a given build.
More details on the use case can be found in the [man/mpb-failedconf.adoc](man/mpb-failedconf.adoc) document.

# The benefits in a real life example: autoconf2.72c

As of June 2022, GNU autoconf 2.72 is still under development, and is therefore unstable.
Yet, the upstream maintainers are tagging the mainline with intermediate versions that may be useful to test early, in order to limit the problems when the final release comes out.

That's where the MPB comes in handy.

For Fedora 36 x86_64, there are about 1151 packages that depend directly on autoconf.
When built with the autoconf 2.72c pre-release, we get the following results:

- Success: 1033
- Manual confirmation needed: 47
- Failed: 71

Let's skip the "Manual confirmation needed" part in this example: these are packages that failed to build with both version 2.72c and the original 2.71.
There may be failures to look at among them, but the ones we have are already interesting enough.
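
As a quick sanity check on these figures, the following Python snippet sums the three counts and prints each share of the total. It only uses the numbers quoted in this section.

```python
# Build results quoted above for autoconf 2.72c on Fedora 36 x86_64.
results = {"Success": 1033, "Manual confirmation needed": 47, "Failed": 71}

total = sum(results.values())
for status, count in results.items():
    print(f"{status}: {count} ({count / total:.1%})")
print(f"Total: {total}")
```

The total matches the 1151 direct reverse dependencies mentioned earlier.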

Out of the 71 failures, a common pattern emerges: 65 of them are due to a malformed configure script, e.g. for the php package:

```
    ./configure: line 104153: syntax error: unexpected end of file
```

Although this seems to be quite a common failure, it went through autoconf's internal tests without being noticed.

Then there are 6 more failures that need deeper analysis:

- am-utils: straightforward failure, due to a hard requirement on the autoconf version (requires 2.69 or 2.71 exclusively)
- cyrus-imapd: A missing krb.h header. Even though configure did report that the file isn't available, the application still tries to use it.
  Yet, not finding this header is strange, and may hide a bigger problem.
- libmng: "zlib library not found" while zlib-devel got installed as part of the dependencies
- libverto: Failure seems unrelated to autoconf, one of the internal libraries seems to have changed its name.
  Yet, it is strange that this failure only appeared with autoconf 2.72c.
- mingw-libmng: Unresolved symbols during linking. May be unrelated to autoconf, unless a change in the configuration modified the build process.
- nfdump: configure: error: conditional "FT2NFDUMP" was never defined.
  This one may also be due to a malformed configure script.

Overall, this gives a good view of the kind of failures that would have been missed by simply building the autoconf package and relying on its internal tests as a gatekeeper.
Benefiting from the broad view provided by a distribution is valuable, and may improve the overall quality of the packages being provided.

# Contributing

If you have any suggestion, if you find an issue, or if the tool doesn't behave the way you'd expect, don't hesitate to contact me and give me feedback.

If there are features you'd like to see, or bugs you've spotted (even though I did my best to limit them), don't hesitate to file an issue through [GitLab issues](https://gitlab.com/fedora/packager-tools/mass-prebuild/-/issues).

If you want to become an active developer of the project, you can request access through the [Project Members list](https://gitlab.com/fedora/packager-tools/mass-prebuild/-/project_members).

When submitting a patch, use the test suite to make sure that there are no regressions.
There are 2 ways to execute it:

- Directly on your host, by executing `./tests/ci/all.sh`
- In a dedicated container, by executing `./scripts/CI/ci.sh`

The local execution will trigger back-end specific tests that imply you have a running setup (i.e. copr-cli can be used).

# License

The tool is made available under the [GPL-2.0-or-later license](LICENSE).

# Project status

This project is under heavy development.
An [architecture document](docs/mass_prebuild.adoc) is available under docs, describing what is planned to be implemented, and how.

While testing the tool, it appeared that builds issued in COPR are sometimes not fully reliable.
While successes are trustworthy, failures may not be.
There were some cases where the failure was due to the infrastructure not being capable of installing build dependencies, even on a stable release (like Fedora 35), which doesn't make sense.
The same build with no changes may end with a different status if started a few seconds later.

The data that can be collected out of a failed COPR build is relatively limited.
In the example described above, there is no way to retrieve the failing configure script.
A local build needs to be made for that.
