Create your first collection

This tutorial will guide you through creating your first Open Terms Archive collection.

By the end, you’ll have a working collection that tracks changes to a service’s privacy policy. You will also have a basic understanding of how to create a collection.

Prerequisites

  • Node.js is installed on your system.
  • You have basic familiarity with the command line.
  • You know how to use a text editor.

Create a collection

Step 1: Set up the directory structure

  1. Create a new directory for the collection and move into it:

    mkdir ota-tutorial-declarations
    cd ota-tutorial-declarations
    
  2. Create a declarations directory inside the project. This is where you will declare the service and terms you want to track:

    mkdir declarations
    
  3. Create a config directory, which will hold the configuration file for the collection:

    mkdir config
    
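The three commands above can also be condensed into a single call. The following is a minimal sketch, assuming a POSIX-compatible shell and that you run it from the parent directory instead of the individual steps; the verification loop is only an illustration and is not required by the engine.

```shell
# Run from the parent directory, i.e. before ota-tutorial-declarations exists.
# -p creates missing parent directories and does not fail if one already exists.
mkdir -p ota-tutorial-declarations/declarations ota-tutorial-declarations/config
cd ota-tutorial-declarations

# Confirm that both directories are in place.
for dir in declarations config; do
  if [ -d "$dir" ]; then
    echo "ok: $dir"
  else
    echo "missing: $dir" >&2
  fi
done
```

Either way, you should end up inside ota-tutorial-declarations with two empty directories, declarations and config.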

Step 2: Create the service declaration

  1. Create a file declarations/Open Terms Archive.json with the following content (for detailed instructions on how to structure it, follow the Tracking terms tutorial):
    {
      "name": "Open Terms Archive",
      "terms": {
        "Privacy Policy": {
          "fetch": "https://opentermsarchive.org/en/privacy-policy",
          "select": ".textcontent"
        }
      }
    }
    
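A stray comma or a missing quote in the declaration is easy to overlook, so it can help to confirm that the file parses before moving on. The sketch below assumes a POSIX-compatible shell and relies on Node.js, which is already a prerequisite. The check_json helper is hypothetical and written for this tutorial; it only checks that the file is well-formed JSON, not that it is a valid declaration.

```shell
# check_json FILE: succeed only if FILE exists and parses as JSON.
# Hypothetical helper for this tutorial, not part of the engine.
check_json() {
  if [ ! -f "$1" ]; then
    echo "missing: $1" >&2
    return 1
  fi
  if ! command -v node >/dev/null 2>&1; then
    echo "node not found on PATH" >&2
    return 1
  fi
  # With `node -e`, the first extra argument is available as process.argv[1].
  if node -e 'JSON.parse(require("fs").readFileSync(process.argv[1], "utf8"))' "$1" 2>/dev/null; then
    echo "ok: $1"
  else
    echo "invalid JSON: $1" >&2
    return 1
  fi
}

# The file name contains spaces, so it must be quoted.
check_json "declarations/Open Terms Archive.json" || echo "Fix the declaration before continuing."
```

Note the quotes around the path: without them, the shell would split the file name into three separate arguments.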

Step 3: Create the metadata file

  1. Create a file metadata.yaml at the root of the collection directory:
    id: ota-tutorial
    name: Tutorial collection
    tagline: Learn how to create a collection
    description: |
      A step-by-step tutorial collection that guides you through creating an Open Terms Archive collection.
      Track terms and conditions from websites while learning the basics of declarations, configuration, and metadata.
    languages: [en]
    jurisdictions: [EU]
    

Step 4: Create the configuration file

  1. Create a file config/development.json and set the tracking schedule to every minute:
    {
      "@opentermsarchive/engine": {
        "trackingSchedule": "* * * * *"
      }
    }
    
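The value of trackingSchedule is a cron expression with five whitespace-separated fields. As an illustration of how "* * * * *" reads, here is a small sketch, assuming a POSIX-compatible shell. The describe_schedule helper is hypothetical and only labels the fields; it does not validate their values, and the second expression is just an example of a less frequent schedule.

```shell
# describe_schedule EXPR: label the five fields of a cron expression.
# Hypothetical helper for this tutorial, not part of the engine.
describe_schedule() {
  set -f      # keep "*" literal instead of expanding it to file names
  set -- $1   # split the expression on whitespace into $1..$5
  set +f
  if [ "$#" -ne 5 ]; then
    echo "expected 5 fields, got $#" >&2
    return 1
  fi
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

describe_schedule "* * * * *"      # every field is a wildcard: run every minute
describe_schedule "30 */12 * * *"  # example only: minute 30 of every 12th hour
```

A schedule of every minute is convenient for a tutorial because results appear quickly; a real collection would normally track far less often.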

Step 5: Install and run the engine

  1. Install the Open Terms Archive engine:

    npm install --save @opentermsarchive/engine
    
  2. Start the scheduled tracking of the declared terms:

    npx ota track --schedule
    
  3. After one minute, check the results:

  • Check the extracted version, which should contain the Privacy Policy of Open Terms Archive in Markdown format without any other content (no header, footer…): ./data/versions/Open Terms Archive/Privacy Policy.md.
  • Check the snapshot, which is the original HTML document of the Open Terms Archive Privacy Policy: ./data/snapshots/Open Terms Archive/Privacy Policy.html.
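
Both checks can be done from a second terminal while the engine keeps running. The following is a minimal sketch, assuming a POSIX-compatible shell opened in the collection directory; the report_file helper is hypothetical and only reports whether each expected file exists and is non-empty.

```shell
# report_file FILE: say whether FILE exists and is non-empty.
# Hypothetical helper for this tutorial, not part of the engine.
report_file() {
  if [ -s "$1" ]; then
    echo "found: $1 ($(wc -c < "$1") bytes)"
  else
    echo "not there yet: $1"
  fi
}

# Paths contain spaces, so they must be quoted.
report_file "./data/versions/Open Terms Archive/Privacy Policy.md"
report_file "./data/snapshots/Open Terms Archive/Privacy Policy.html"
```

If a file is reported as not there yet, wait for the next scheduled run and try again.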

Congratulations! You have created your first collection.