DCAT harvesting by data.gouv.fr

Following the closure of geo.data.gouv.fr, a DCAT server compatible with the data.gouv.fr DCAT harvester model was developed thanks to tripartite funding from the Loiret department, the Calvados department and the city of Bayonne.

Prerequisites

For your data to be published on data.gouv.fr, it must meet several criteria:

  • have an open license and indicate that there are no limitations in the INSPIRE sense (see manage CGUs);
  • be in a catalog shared with both the DCAT server and OpenCatalog;
  • contain at least one working download link. Recognized download links are:
    • a link to a WFS service capable of delivering the data in GeoJSON format in WGS84 (EPSG:4326): see assign a WFS service;
    • a link to vector (GeoJSON, zipped Shapefile and GeoPackage), raster (ECW, JPEG2000 and GeoTIFF) or tabular (CSV, XLS and XLSX) data files: see assign download link.
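
As a rough illustration of the file-format criterion above, the sketch below classifies a download URL by its file extension. The extension sets and the function name `classify_download_link` are assumptions made for this example; the DCAT server may recognize links differently (for instance from the format declared in Isogeo rather than from the URL).

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extensions assumed for the formats listed above (hypothetical mapping).
RECOGNIZED_EXTENSIONS = {
    "vector": {".geojson", ".zip", ".gpkg"},
    "raster": {".ecw", ".jp2", ".tif", ".tiff"},
    "tabular": {".csv", ".xls", ".xlsx"},
}


def classify_download_link(url: str) -> Optional[str]:
    """Return 'vector', 'raster' or 'tabular' for a recognized file link, else None."""
    # Only the path is inspected, so query strings do not affect the result.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    for family, extensions in RECOGNIZED_EXTENSIONS.items():
        if suffix in extensions:
            return family
    return None


if __name__ == "__main__":
    for link in (
        "https://example.org/data/roads.gpkg",
        "https://example.org/data/budget.CSV?version=2",
        "https://example.org/about.html",
    ):
        print(link, "->", classify_download_link(link))
```

A `.zip` link is counted as vector here only because zipped Shapefiles are the listed archive format; the extension alone cannot confirm what the archive contains.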

Note

In on-premises mode, the DCAT URL must be made publicly accessible.

List of completed fields

data.gouv.fr field    Isogeo field
Title                 Title
Acronym               Not filled
Description           Combination of several fields
Keywords              Keywords and Themes
License               Conditions
Spatial coverage      Not recovered by the harvester
Time coverage         Period of validity
Update frequency      Update frequency
Remote identifier     Unique Isogeo identifier
URI                   Not filled (entered by data.gouv.fr)

Description

The description is formatted as follows:

Description: Summary

Collection context: Data collection context (if any)

Collection method: Data collection method (if any)

Attributes: table listing each field's name, alias (or short comment) and type.

For more information, see the metadata on the Isogeo catalog (OpenCatalog link).
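
The layout above can be sketched as a small function that assembles the text from its parts. This is only an illustration: the function name, its parameters, and the " | " separator used for attribute rows are assumptions, not the DCAT server's actual implementation.

```python
from typing import Optional, Sequence, Tuple


def build_description(
    summary: str,
    collection_context: Optional[str] = None,
    collection_method: Optional[str] = None,
    attributes: Sequence[Tuple[str, str, str]] = (),
    opencatalog_url: Optional[str] = None,
) -> str:
    """Assemble a description in the order shown above, skipping empty parts."""
    parts = [f"Description: {summary}"]
    if collection_context:
        parts.append(f"Collection context: {collection_context}")
    if collection_method:
        parts.append(f"Collection method: {collection_method}")
    if attributes:
        # One row per attribute: name, alias (or comment), type.
        rows = [f"{name} | {alias} | {type_}" for name, alias, type_ in attributes]
        parts.append("Attributes:\n" + "\n".join(rows))
    if opencatalog_url:
        parts.append(
            "For more information, see the metadata on the Isogeo catalog "
            f"({opencatalog_url})."
        )
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(
        build_description(
            "Road network of the department",
            collection_method="GPS survey",
            attributes=[("id", "Identifier", "integer")],
        )
    )
```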

Update frequency

The update frequency is entered according to this correspondence:

Isogeo             data.gouv.fr
Every hour         Every hour
Every 6 hours      Four times a day
Every 12 hours     Twice a day
Every day          Daily
Every 3 days       Twice a week
Every week         Weekly
Every 2 weeks      Every two weeks
Monthly            Monthly
Every 2 months     Bimonthly
Every 3 months     Quarterly
Every 4 months     Three times a year
Every 6 months     Half-yearly
Every year         Annual
Every 2 years      Biennial
Every 3 years      Triennial
Every 5 years      Five-yearly
Other frequency    Unknown
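
The correspondence above amounts to a lookup table with a fallback. The sketch below expresses it as a dictionary keyed by the display labels from the table; these labels are used for illustration only and are not necessarily the identifiers exchanged between Isogeo and data.gouv.fr.

```python
# Display labels copied from the correspondence table (not API identifiers).
ISOGEO_TO_DATAGOUV_FREQUENCY = {
    "Every hour": "Every hour",
    "Every 6 hours": "Four times a day",
    "Every 12 hours": "Twice a day",
    "Every day": "Daily",
    "Every 3 days": "Twice a week",
    "Every week": "Weekly",
    "Every 2 weeks": "Every two weeks",
    "Monthly": "Monthly",
    "Every 2 months": "Bimonthly",
    "Every 3 months": "Quarterly",
    "Every 4 months": "Three times a year",
    "Every 6 months": "Half-yearly",
    "Every year": "Annual",
    "Every 2 years": "Biennial",
    "Every 3 years": "Triennial",
    "Every 5 years": "Five-yearly",
}


def to_datagouv_frequency(isogeo_label: str) -> str:
    """Map an Isogeo frequency label; any other frequency becomes 'Unknown'."""
    return ISOGEO_TO_DATAGOUV_FREQUENCY.get(isogeo_label, "Unknown")


if __name__ == "__main__":
    for label in ("Every 6 hours", "Every 4 months", "Every 10 years"):
        print(label, "->", to_datagouv_frequency(label))
```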

Data download

Several types of links can be published on the data.gouv.fr dataset page as downloadable resources:

  • a link to a file uploaded to Isogeo (hosted);
  • a link to vector (GeoJSON, zipped Shapefile and GeoPackage), raster (ECW, JPEG2000 and GeoTIFF) or tabular (CSV, XLS and XLSX) data files, entered as a data link with the download action;
  • a link to a WFS service, provided that:
    • the number of features is below the map server threshold (MaxRecordCount parameter, 1000 by default, for ArcGIS Server; maximum number of features parameter for GeoServer, etc.);
    • the EPSG:4326 coordinate system is available;
    • the GeoJSON export format is available.
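
To check the WFS conditions above by hand, one option is to build standard WFS 2.0.0 GetFeature requests and open them in a browser. The sketch below only constructs the URLs and makes no network call. The base URL and layer name are placeholders, and the `outputFormat` value is an assumption: servers name their GeoJSON output differently (GeoServer commonly accepts `application/json`, while ArcGIS Server typically expects `GEOJSON`).

```python
from urllib.parse import urlencode


def wfs_getfeature_url(
    base_url: str,
    type_name: str,
    output_format: str = "application/json",
    hits_only: bool = False,
) -> str:
    """Build a WFS 2.0.0 GetFeature URL requesting EPSG:4326 data.

    With hits_only=True the request asks only for the feature count
    (resultType=hits), which can be compared with the server threshold.
    """
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "srsName": "EPSG:4326",
    }
    if hits_only:
        params["resultType"] = "hits"
    else:
        params["outputFormat"] = output_format
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{urlencode(params)}"


if __name__ == "__main__":
    base = "https://example.org/geoserver/wfs"  # placeholder service URL
    print(wfs_getfeature_url(base, "ns:roads", hits_only=True))
    print(wfs_getfeature_url(base, "ns:roads"))
```

If the count returned by the first request exceeds the server threshold, or the second request does not return GeoJSON in WGS84, the WFS link is unlikely to satisfy the conditions listed above.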

Generation and monitoring of valid data sets

  1. To generate a new harvesting link, go to Administration, Shares then New;
  2. Then choose the DCAT application, a catalog and a Name for the share;
  3. Click on Create;
  4. A link is automatically generated, to be referenced in data.gouv.fr.
DCAT URL generation

By appending /debug-page to the URL, you can see which datasets are valid (in green) and will be returned to the harvester, and which are invalid (in red) and will not.
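
As a small illustration, the debug address can be derived from the generated link as follows. The share URL shown is a placeholder, and the sketch assumes the generated link carries no query string.

```python
def debug_page_url(share_url: str) -> str:
    """Append /debug-page to a DCAT share URL, avoiding a doubled slash."""
    return share_url.rstrip("/") + "/debug-page"


if __name__ == "__main__":
    # Placeholder URL; use the link generated in Administration > Shares.
    print(debug_page_url("https://dcat.example.org/share/1234/"))
```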

Monitoring valid data sets

Note that service sheets and data sheets lacking a download link that meets the above requirements are considered invalid.

Account and organization on data.gouv.fr

  1. Create an account on data.gouv.fr

    To create an account or log in, go to https://www.data.gouv.fr/login. We recommend creating an account directly rather than signing in through a social network.

    data.gouv.fr - Registration/Login

  2. Create / join an organization on data.gouv.fr

    To do this, go through the administration section of your profile: https://www.data.gouv.fr/fr/admin/organization/new/. If your organization already exists, request to join it.

    data.gouv.fr - Organization


Referencing and harvesting DCAT streams

  1. Add a new harvester

    Once your DCAT stream has been created in Isogeo, add a new harvester from the data.gouv.fr administration interface.

    Create a new harvester

  2. Select your organization

  3. Complete the required fields and click Next

    Configuring the new harvester

  4. A message tells you that your harvester must be validated by the administration team. Click on "See in administration".

    DCAT validation

  5. Click on "Edit", then "Edit"

    DCAT edition

  6. Test harvesting by clicking on "Preview" and check the number of validated datasets.

    Preview harvesting results

    If a dataset appears to be missing, double-check the prerequisites and then contact the data.gouv.fr team.

  7. Check harvesting

Once the harvester has been validated, you can review the harvesting runs, which are launched daily.

Harvest results