Configuration options
This reference details all the options that can be set in a collection’s configuration file to configure the Open Terms Archive engine.
As an example, see the production configuration file of the Demo collection.
Defines how often the engine should check for changes in terms. Uses standard cron syntax to set the schedule. By default, it runs every 12 hours at minute 30.
Path to the collection’s directory, which contains the declarations directory and the metadata file, relative to the engine execution location.
Default:
./
Example
../collections/demo-declarations
The recorder section manages how versions and snapshots of terms are stored, supporting multiple storage backends.
recorder.versions.storage
object
Configuration for storing versions. Supports Git and MongoDB. See Storage Repositories for more information.
recorder.snapshots.storage
object
Configuration for storing snapshots. Supports Git and MongoDB. See Storage Repositories for more information.
The fetcher section configures how the engine retrieves documents from the web.
fetcher.waitForElementsTimeout
number
Maximum wait time for elements to appear in a page (milliseconds).
fetcher.navigationTimeout
number
Maximum wait time for a page to load (milliseconds).
Language code (ISO 639-1) for request headers.
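As a rough illustration, the two named fetcher timeouts might be set as in the fragment below. It assumes the configuration file is JSON and that dotted option names map to nested keys. The values are illustrative, not documented defaults, and the enclosing part of the file is omitted.

```json
{
  "fetcher": {
    "waitForElementsTimeout": 10000,
    "navigationTimeout": 30000
  }
}
```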
The notifier section sets up how notifications are sent when new versions of terms are recorded.
notifier.sendInBlue.updatesListId
string
ID of the SendInBlue contact list of people to notify when terms are updated.
notifier.sendInBlue.updateTemplateId
string
SendInBlue email template ID used for update notifications.
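A minimal sketch of a notifier block, assuming a JSON configuration file with nested keys. Both IDs are made-up placeholders, to be replaced with the identifiers from your own SendInBlue account.

```json
{
  "notifier": {
    "sendInBlue": {
      "updatesListId": "123",
      "updateTemplateId": "456"
    }
  }
}
```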
The logger section configures logging and error notification settings.
SMTP server hostname.
Default:
smtp-relay.brevo.com
logger.smtp.username
string
Username for SMTP server authentication.
Default:
admin@opentermsarchive.org
logger.sendMailOnError.to
string
Email address for error notifications.
Example
admin@example.com
logger.sendMailOnError.from
string
Sender email address for error notifications.
Example
noreply@example.com
logger.sendMailOnError.sendWarnings
boolean
Set to true to also send an email when a warning occurs.
logger.timestampPrefix
boolean
Set to false to avoid duplicate timestamps if logs are managed by a process manager.
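The named logger options above could be combined as in the following sketch, which assumes a JSON configuration file with nested keys. The addresses reuse the examples given above. Setting `sendWarnings` to true and `timestampPrefix` to false is an illustrative choice, not a recommendation, and options whose names are not shown in this reference are left out.

```json
{
  "logger": {
    "smtp": {
      "username": "admin@example.com"
    },
    "sendMailOnError": {
      "to": "admin@example.com",
      "from": "noreply@example.com",
      "sendWarnings": true
    },
    "timestampPrefix": false
  }
}
```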
The reporter section manages how issues are reported when terms content is inaccessible, supporting GitHub and GitLab.
Type of reporter
Allowed values
github
gitlab
Example
github
reporter.repositories.declarations
string
Repository for creating issues.
Example
OpenTermsArchive/demo-declarations
reporter.repositories.versions
string
Repository for versions.
Example
OpenTermsArchive/demo-versions
reporter.repositories.snapshots
string
Repository for snapshots.
Example
OpenTermsArchive/demo-snapshots
Base URL for GitLab (if applicable).
Example
https://gitlab.example.com
reporter.apiBaseURL
string
API base URL for GitLab (if applicable).
Example
https://api.gitlab.example.com
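For illustration, a reporter block built only from the named options above might look like this, assuming a JSON configuration file with nested keys. The repository names and the API URL are the example values from this page. Options whose names are not shown in this reference are omitted.

```json
{
  "reporter": {
    "repositories": {
      "declarations": "OpenTermsArchive/demo-declarations",
      "versions": "OpenTermsArchive/demo-versions",
      "snapshots": "OpenTermsArchive/demo-snapshots"
    },
    "apiBaseURL": "https://api.gitlab.example.com"
  }
}
```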
The dataset section configures how datasets are published. Datasets can be published to GitHub releases, GitLab releases, and/or data.gouv.fr. If both GitHub and GitLab tokens are configured, GitHub takes precedence.
dataset.title
string
required
Title of the dataset.
Example
Contrib collection dataset
dataset.versionsRepositoryURL
string
required
Repository URL for dataset releases. Also used to generate the dataset README.
Example
https://github.com/OpenTermsArchive/contrib-versions
dataset.publishingSchedule
string
Cron expression for dataset publishing. By default, it runs every Monday at 8:30 AM. If publishing to data.gouv.fr, remember to update dataset.datagouv.frequency to match the actual publishing frequency.
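As a sketch, the three dataset options above could be written as follows, assuming a JSON configuration file with nested keys. The cron expression `30 8 * * 1` is the standard five-field form of "every Monday at 8:30 AM" (minute 30, hour 8, any day of month, any month, weekday 1). The title and URL are the example values from this page.

```json
{
  "dataset": {
    "title": "Contrib collection dataset",
    "versionsRepositoryURL": "https://github.com/OpenTermsArchive/contrib-versions",
    "publishingSchedule": "30 8 * * 1"
  }
}
```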
The data.gouv.fr section configures publishing to the French government’s open data platform. Either datasetId or organizationIdOrSlug must be configured.
dataset.datagouv.datasetId
string
required
ID of an existing dataset on data.gouv.fr. Use this to publish to an existing dataset. Either this or organizationIdOrSlug is required.
Example
6914a64b17a0a91bb0a61222
dataset.datagouv.organizationIdOrSlug
string
required
ID or slug of the organization on data.gouv.fr. Use this to automatically create and publish a dataset. The dataset will be created with the title from dataset.title if it doesn’t exist. Either this or datasetId is required.
Example
open-terms-archive
dataset.datagouv.frequency
string
required
Update frequency of the dataset. Used when creating or updating a dataset on data.gouv.fr. See data.gouv.fr API for all allowed values.
dataset.datagouv.useDemo
boolean
Set to true to use the demo.data.gouv.fr environment for testing.
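A possible data.gouv.fr block for testing against the demo environment is sketched below, assuming a JSON configuration file with nested keys. It uses `organizationIdOrSlug` rather than `datasetId`, since only one of the two is needed. The `weekly` value is assumed to be among the frequencies accepted by the data.gouv.fr API and should be checked against that API's list.

```json
{
  "dataset": {
    "datagouv": {
      "organizationIdOrSlug": "open-terms-archive",
      "frequency": "weekly",
      "useDemo": true
    }
  }
}
```

To publish to an existing dataset instead, replace `organizationIdOrSlug` with `datasetId` and the ID of that dataset.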
The collection API section sets the parameters for the API server.
collection-api.api.port
number
required
Port number for the API server.
collection-api.api.basePath
string
required
Base path for API endpoints.
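A minimal sketch of the collection API settings, assuming a JSON configuration file in which the dotted names above map directly to nested keys. The port and base path are hypothetical values chosen for illustration.

```json
{
  "collection-api": {
    "api": {
      "port": 3000,
      "basePath": "/collection-api"
    }
  }
}
```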
The storage repositories section sets the parameters for the supported backends for storing versions and snapshots: Git and MongoDB.
Type of storage backend.
Default:
git
Allowed values
git
mongo
The Git storage configuration allows storing versions in a Git repository.
Path to the versions database directory.
storage.git.publish
boolean
Whether to push changes to the origin.
storage.git.snapshotIdentiferTemplate
string
Template for snapshot ID reference. %SNAPSHOT_ID will be replaced with the actual snapshot ID.
Default:
./data/snapshots/%SNAPSHOT_ID
storage.git.author.name
string
Author name for changes.
Default:
Open Terms Archive Bot
storage.git.author.email
string
Author email for changes.
Default:
bot@opentermsarchive.org
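The named Git options might be placed under `recorder.versions.storage` as sketched here. This assumes a JSON configuration file and that the `storage.` prefix used above stands for that storage object. All values except `publish` are the documented defaults, and `publish` is set to true only for illustration. Options whose names are not shown in this reference are left out.

```json
{
  "recorder": {
    "versions": {
      "storage": {
        "git": {
          "publish": true,
          "snapshotIdentiferTemplate": "./data/snapshots/%SNAPSHOT_ID",
          "author": {
            "name": "Open Terms Archive Bot",
            "email": "bot@opentermsarchive.org"
          }
        }
      }
    }
  }
}
```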
The MongoDB storage configuration allows storing versions in a MongoDB database.
storage.mongo.connectionURI
string
MongoDB connection URI.
Default:
mongodb://127.0.0.1:27017
storage.mongo.database
string
Database name.
Default:
open-terms-archive
storage.mongo.collection
string
Collection name.
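Similarly, a MongoDB backend for snapshots could be sketched as follows. It assumes a JSON configuration file and that the `storage.` prefix stands for the `recorder.snapshots.storage` object. The connection URI and database name are the documented defaults, and the collection name `snapshots` is a hypothetical value.

```json
{
  "recorder": {
    "snapshots": {
      "storage": {
        "mongo": {
          "connectionURI": "mongodb://127.0.0.1:27017",
          "database": "open-terms-archive",
          "collection": "snapshots"
        }
      }
    }
  }
}
```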