How to get the Diff between two Docupedia Page Versions

Introduction

The Docupedia Fetcher can be configured to produce the diff between two content versions of a Docupedia page. The result is stored in the evidence folder as a simple HTML file and a JSON file, both of which contain the insertions and deletions.

The expected output files are OUTPUT_NAME_diff.html and OUTPUT_NAME_diff.json, where OUTPUT_NAME stands for the value of the OUTPUT_NAME environment variable.

Adjust the qg-config.yml file

  1. Start with the example configuration file from Getting Started with Docupedia Autopilot.

  2. Add the DOCUPEDIA_PAGE_DIFF_VERSIONS environment variable, shown at line 15. Setting it to 0,-1 makes the Fetcher produce the diff between the latest and the previous version of the Docupedia page.

  3. For demonstration purposes, line 10 only checks that the downloaded HTML file exists. Usually, you would use an evaluator here that checks the downloaded Docupedia data for the properties you expect.

 6  autopilots:
 7    docupedia-autopilot:
 8      run: |
 9        docupedia-fetcher
10        filecheck exists "${{ env.OUTPUT_NAME }}.html"
11      env:
12        DOCUPEDIA_PAGE_ID: ${{ env.DOCUPEDIA_PAGE_ID }}
13        DOCUPEDIA_PAT: ${{ secrets.DOCUPEDIA_PAT }}
14        DOCUPEDIA_URL: ${{ env.DOCUPEDIA_URL }}
15        DOCUPEDIA_PAGE_DIFF_VERSIONS: 0,-1
16        OUTPUT_NAME: docupedia_content
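
As a possible variation, the check could target the diff output itself rather than the regular page download. The fragment below is only a sketch, not part of the original example. It assumes that filecheck exists accepts the diff file name in the same way as the regular HTML file name, and that the diff files are written to the same location as the regular output. The lines that precede autopilots: in the example configuration file are omitted here.

```yaml
autopilots:
  docupedia-autopilot:
    run: |
      docupedia-fetcher
      # Assumption: the diff JSON is written next to the regular output,
      # following the OUTPUT_NAME_diff.json naming described above.
      filecheck exists "${{ env.OUTPUT_NAME }}_diff.json"
    env:
      DOCUPEDIA_PAGE_ID: ${{ env.DOCUPEDIA_PAGE_ID }}
      DOCUPEDIA_PAT: ${{ secrets.DOCUPEDIA_PAT }}
      DOCUPEDIA_URL: ${{ env.DOCUPEDIA_URL }}
      # 0 is the latest version, -1 the previous one.
      DOCUPEDIA_PAGE_DIFF_VERSIONS: 0,-1
      OUTPUT_NAME: docupedia_content
```

With OUTPUT_NAME set to docupedia_content, this check would look for a file named docupedia_content_diff.json.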

Upload and run the config

You can now upload the config to the Yaku service and run it. You should then find the downloaded diff information in the evidence zip file.