Thank you for showing interest in reviewing contributions made to Open Terms Archive. This document is intended to help you get started and provide some guidelines for reviewing contributions.
We want to make sure that the contributions made to the project are of high quality and that they are in line with the vision of the project. We also want to make sure that the contributions are reviewed in a timely manner so that the contributors can get feedback and continue to contribute to the project.
This is where volunteer reviewers come in to help. Reviewers are people who have volunteered to review contributions made to the project. They are not paid for their work, but they are given credit for their work.
Anyone can review contributions. As hinted before, most of the contributions made to Open Terms Archive are contractual documents that are to be tracked. Therefore, you don’t need to be a programmer or have any technical knowledge. You just need to be able to read and understand the contribution and provide feedback where necessary.
It depends on the contribution. Some contributions may be spot on and can be reviewed in a few minutes. Other contributions may require a more detailed review and changes to be made and can take longer.
We estimate it to take 3 to 15 mins for one to review a document. The first reviews might be a bit longer as reviewers get familiar with the process, and will speed up with time.
To get started, we will need to understand a few things. The nature of the contributions you will be reviewing, where to get the contributions to review and the tools that will help you in reviewing the contributions.
The contributions you will be reviewing are contractual documents of digital services. These are documents that govern the relationship between two or more parties.
They are not the original documents, but rather the terms extracted from these documents. If the terms are represented accurately, it will be easier to follow on with any subsequent changes in the document. Contributors track these documents (sometimes, anonymously) by submitting a pull request either using a tool that helps contributors add documents to the project, the GUI contributing tool, or by creating a JSON file and adding it via a pull request. You can find more information about pull requests here.
There are three types of contributions you’ll come across:
The method used to review each of these types of contribution varies, and you’ll find a detailed explanation below.
The contributions can be found in the form of pull requests in the repository of the collection you want to work on. For example, for the Contrib collection, they are visible under the pull requests tab of the contrib-declarations
repository..
Your focus should be on two aspects: accuracy and quality.
When some terms can no longer be tracked by the Open Terms Archive engine, an issue is created in the collection repository. This issue contains the details on why the document cannot be tracked and the date the challenge was encountered.
Contributors are then required to update the declaration in order to bring back the tracking of the terms.
The updates can be made using the contribution tool and it’s effects will be similar to the one seen when adding declaration but with a slight change.
The pull request created will consist of fewer checks than those that add declarations, as some aspects have already been previously checked, such as the name of the service and its ID. The checks will guide the reviewer on the key things to look out for during the review process, and a link to inspect the declaration.
For pull requests that update declarations, you should focus should be on two things: history file and declaration.
validUntil
property that specifies the date a specific version of a service declaration was last effective. You have to confirm that this date is the same as the date in the issue opened for the declaration when the bot couldn’t track it for the first time. This issue is usually included in the pull request message. The history file is updated with every update
pull request. You can find more information about the history file here.update
pull requests, you only look at the selectors to make sure they are simple and also verify the generated version is ok.verify version
button.validUntil
property in the history file under the Files changes
section of the pull request. If the dates are the same, proceed to approve the pull request.You can read more about maintaining declarations from the official documentation.
When you spot an error in a contribution, instead of asking the author to implement these changes, we usually recommend you fix it, especially for first time contributors, in order to speed up the process. You can make your corrections directly through the graphical user interface and send the document. This will create an update in the already existing pull request and a new comment will be generated by the OTA-bot.
In some special cases, the correction may have to do with the service name. Such changes modify the branch name, hence, creating a new pull request instead of updating the initial pull request as their names are now different. In any case, it’s always important to let the contributor know about any changes or corrections you make to their contribution.
When contributing to the project, reviewers may need to modify the Service ID of a service that is being added for tracking. This is often necessary when the service being tracked is in a language other than English, such as Chinese or French. In these cases, the Service ID usually reflects the transliteration of the service name (written in the native language). The documentation provides more information on this. It is important to note that the contributor can be very crucial in providing an accurate transliteration of the service name especially if the reviewer does not know the service’s native language.
Since the Service ID is used as the file name of the JSON file associated with the service, it is necessary to change the file name to reflect the transliteration. Here are the steps to follow:
Status checks are required to pass before merging can take place. This ensures that automated tests (through “Continuous Integration”, or CI) confirm the contribution will be readable by the Open Terms Archive engine.
These tests should run automatically. However, under some circumstances, the tests might need to be triggered manually. To that end, navigate to the “Actions” tab of the collections repository, and look for the name of the contribution where tests are not run. Once located in the list, click on the entry name, and click on “Re-run all jobs” at the top right.
If the contribution comes from a fork rather than from the bot, the checks will not appear in the Actions tab, as they cannot be run from a different repository. In order to run checks from a fork, you need to create a new branch in the collection repository, with the contents of the branch in the fork. This operation can only be done through the command-line. Assuming you already have cloned the collection repository:
git checkout main && git pull
.git checkout -b add_<service_name>_<terms_type>
.git pull https://github.com/<contributor_name>/<collection_name> <branch_name_in_the_fork>:add_<service_name>_<terms_type>
. The branch name used in the fork can be found in the pull request page, right under the title.git push origin add_<service_name>_<terms_type>
.As you did not push to the source branch but instead created a copy of that branch in the collection repository, the pull request itself will not mention a new push, and GitHub will suggest that you create a new pull request. In order to keep pull request authorship clear for contributors, please keep the original pull request instead: since the checked-in code is now pushed by a trusted author, the CI should run. And since the commit IDs are the same in the fork and in the original branch, the status checks will update in the original pull request when you refresh the page, without the need to create a new pull request.
Once the pull request has been merged, delete the copy you made of the branch with git push origin -d add_<service_name>_<terms_type>
.
When tests fail, you can follow these steps to diagnose and address the issue:
Begin by analyzing any error messages or warnings provided by the test output. These messages can often provide meaningful information to identify the source of the problem. Pay attention to specific issues such as schema validation errors, inaccessible web locations or inaccessible content selection.
For a deeper investigation, you can access the snapshots and versions generated during the test run. Navigate to the summary page of the failing workflow. Scroll down to the “Artifacts” section located at the bottom of the page. Click on snapshots_and_versions
to download them.
Inside the downloaded archive you will be able to inspect the snapshot file related to the terms that failed. Ensure the document downloaded by the engine is the correct one and that the terms content is present. Sometimes a login wall or a cookies wall can block access to the content.
If the snapshot is the proper one, you can examine the generated version to check the accuracy of content selection.
If the tests fail systematically in CI but there is at least one environment in which all tests pass, it is allowed to use admin powers to force the merge.
For example, tests may fail in CI because of a 403 Access Denied error, but succeed when run on a development machine.
Bypassing protection is allowed because:
- if the engine is updated, the problem could correct itself;
- if it’s an anti-bot protection, it can stop from one day to the next;
- adding the terms prevents duplicate suggestions for additions;
- a service can encounter this type of error after being merged, so there is no reason to prevent it from being added with this error.
Beyond status checks, additional restriction requires branches to be up to date before merging. This ensures that the contribution has been tested with the latest version of the collection. This appears as a “This branch is out-of-date with the base branch” warning on a pull request.
You can fix this using the Github interface, by clicking on the arrow button next to the “Update Branch” button, and select “Update with Rebase”.