This guide explains how to configure your collection to automatically publish datasets to data.gouv.fr, the French government’s open data platform.
There are two ways to publish datasets to data.gouv.fr:
Option 1 (automatic creation): This approach is suitable when you want the system to automatically create and manage the dataset within your organization.
1. Find your organization’s ID or slug on data.gouv.fr (e.g., open-terms-archive).
2. Set it as dataset.datagouv.organizationIdOrSlug in your configuration.
3. Set dataset.title in your configuration (this will be used as the dataset title).
The dataset will be automatically created if it doesn’t already exist in the organization.
Option 2 (existing dataset): This approach is suitable when you already have a dataset created on data.gouv.fr and want to update it automatically.
1. Find the ID of your existing dataset on data.gouv.fr (e.g., 6914a64b17a0a91bb0a61222).
2. Set it as dataset.datagouv.datasetId in your configuration.
In your collection’s configuration file (e.g., config/production.json), add the datagouv settings under the dataset section:
For Option 1 (automatic creation):
{
  "dataset": {
    "title": "<collection_name> collection dataset",
    "datagouv": {
      "organizationIdOrSlug": "open-terms-archive"
    }
  }
}
For Option 2 (existing dataset):
{
  "dataset": {
    "title": "<collection_name> collection dataset",
    "datagouv": {
      "datasetId": "6914a64b17a0a91bb0a61222"
    }
  }
}
If you want to test with the demo environment first, set useDemo to true:
{
  "dataset": {
    "title": "<collection_name> collection dataset",
    "datagouv": {
      "organizationIdOrSlug": "open-terms-archive",
      "useDemo": true
    }
  }
}
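The two options differ only in which key appears under dataset.datagouv, so a configuration can be sanity-checked before publishing. The following is a minimal sketch in Python and is not part of the Open Terms Archive engine: the function name datagouv_mode is hypothetical, and the rules it enforces (exactly one of the two keys, and a title for automatic creation) are assumptions drawn from the examples above, not documented engine behaviour.

```python
import json


def datagouv_mode(config: dict) -> str:
    """Return which publishing option a collection configuration selects.

    Hypothetical helper for illustration only; not part of the engine.
    """
    dataset = config.get("dataset", {})
    datagouv = dataset.get("datagouv")
    if not isinstance(datagouv, dict):
        raise ValueError("missing 'dataset.datagouv' section")

    has_organization = bool(datagouv.get("organizationIdOrSlug"))
    has_dataset_id = bool(datagouv.get("datasetId"))

    # Assumption: exactly one of the two keys should be present.
    if has_organization and has_dataset_id:
        raise ValueError("set 'organizationIdOrSlug' or 'datasetId', not both")
    if not has_organization and not has_dataset_id:
        raise ValueError("set 'organizationIdOrSlug' or 'datasetId'")

    # Assumption: automatic creation needs a title to name the new dataset.
    if has_organization and not dataset.get("title"):
        raise ValueError("automatic creation needs 'dataset.title'")

    return "automatic creation" if has_organization else "existing dataset"


example = json.loads("""
{
  "dataset": {
    "title": "<collection_name> collection dataset",
    "datagouv": { "organizationIdOrSlug": "open-terms-archive" }
  }
}
""")
print(datagouv_mode(example))
```

To check a real file, load config/production.json with json.load and pass the result to the same function.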
Create a .env file at the root of your collection repository (if it doesn’t already exist) and add your data.gouv.fr API key:
OTA_ENGINE_DATAGOUV_API_KEY=your_api_key_here
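A missing or placeholder API key is an easy mistake to make, so it can help to check the .env file before the first publish. The sketch below is an illustration under simple assumptions: read_env_value is a hypothetical helper that understands only plain KEY=value lines (no export prefix, no multi-line values), and it does not reproduce how the engine itself loads the file.

```python
from pathlib import Path
from typing import Optional

KEY_NAME = "OTA_ENGINE_DATAGOUV_API_KEY"


def read_env_value(text: str, name: str) -> Optional[str]:
    """Return the value assigned to `name` in .env-style text, or None."""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("'\"")
    return None


if __name__ == "__main__":
    env_file = Path(".env")
    value = read_env_value(env_file.read_text(), KEY_NAME) if env_file.is_file() else None
    if not value or value == "your_api_key_here":
        print(f"{KEY_NAME} is missing or still a placeholder")
    else:
        print(f"{KEY_NAME} is set")
```

The script only reports whether the key is present; it never prints the key itself.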
You can test your configuration by manually publishing a dataset:
npx ota dataset --publish
This will create and publish a dataset to data.gouv.fr. Check the output to verify the dataset was published successfully.
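Besides reading the command output, the result can be cross-checked against the platform's public HTTP API. The sketch below rests on assumptions not stated in this guide: that datasets are exposed at /api/1/datasets/<id>/ on www.data.gouv.fr, that the demo environment lives at demo.data.gouv.fr, and that the JSON response carries a title field. The helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URLs for the production and demo APIs.
PRODUCTION_API = "https://www.data.gouv.fr/api/1"
DEMO_API = "https://demo.data.gouv.fr/api/1"


def dataset_url(dataset_id: str, use_demo: bool = False) -> str:
    """Build the API URL for a dataset ID or slug."""
    base = DEMO_API if use_demo else PRODUCTION_API
    return f"{base}/datasets/{quote(dataset_id, safe='')}/"


def fetch_dataset_title(dataset_id: str, use_demo: bool = False) -> str:
    """Fetch the dataset over HTTP and return its title.

    Raises urllib.error.HTTPError if the dataset is not found.
    """
    with urlopen(dataset_url(dataset_id, use_demo), timeout=10) as response:
        return json.load(response)["title"]
```

Calling fetch_dataset_title("6914a64b17a0a91bb0a61222") should return the title set in dataset.title if the publication succeeded; pass use_demo=True when the configuration sets useDemo.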
To automatically publish datasets on a schedule, use the --schedule flag:
npx ota dataset --schedule --publish --remove-local-copy
This will publish datasets according to the schedule defined in your configuration (by default, every Monday at 8:30 AM).
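To see what the default means in practice, the next publication time can be computed by hand. In standard five-field cron notation, "every Monday at 8:30 AM" is written 30 8 * * 1; whether the engine's configuration uses that notation is something the configuration reference should confirm. The sketch below is illustrative only: next_monday_0830 is a hypothetical helper, and it works on naive local datetimes without any time zone handling.

```python
from datetime import datetime, timedelta


def next_monday_0830(now: datetime) -> datetime:
    """Return the first Monday 08:30 strictly after `now`."""
    candidate = now.replace(hour=8, minute=30, second=0, microsecond=0)
    # weekday() returns 0 for Monday, so this is the number of days to add
    # to reach the coming Monday (0 if today is already Monday).
    candidate += timedelta(days=(0 - now.weekday()) % 7)
    # If that moment is now or already past, move to the following week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


print(next_monday_0830(datetime.now()))
```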
data.gouv.fr publishing can be used alongside GitHub or GitLab releases. Configure each platform you need, and datasets will be published to all configured platforms simultaneously.
See the configuration reference for all available options.