DCAT harvesting by data.gouv.fr
Following the closure of geo.data.gouv.fr, a DCAT server compatible with the data.gouv.fr DCAT harvester model was developed thanks to tripartite funding from the Loiret and Calvados departments and the city of Bayonne.
Prerequisites
For your data to be published on data.gouv.fr, it must meet several criteria:
- have an open license and indicate that there are no limitations in the INSPIRE sense (see manage CGUs);
- be in a catalog shared with both the DCAT server and OpenCatalog;
- contain at least one operational download link. Recognized download links are:
  - a link to a WFS service capable of delivering the data in GeoJSON format in WGS84 (EPSG:4326): see assign a WFS service;
  - a link to vector (GeoJSON, zipped Shapefile and GeoPackage), raster (ECW, JPEG2000 and GeoTIFF) or tabular (csv, xls and xlsx) data files: see assign download link.
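The file formats listed above can be pre-checked before sharing a catalog. The following is a minimal sketch, not the DCAT server's actual validation logic: the function name `classify_download_link` and the extension spellings are assumptions derived from the format names in the list.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extensions inferred from the formats listed in the prerequisites.
# The exact spellings accepted by the DCAT server are an assumption.
RECOGNIZED_EXTENSIONS = {
    "vector": {".geojson", ".zip", ".gpkg"},
    "raster": {".ecw", ".jp2", ".tif", ".tiff"},
    "tabular": {".csv", ".xls", ".xlsx"},
}


def classify_download_link(url: str) -> Optional[str]:
    """Return 'vector', 'raster' or 'tabular', or None if the extension is not recognized."""
    # Only the URL path is inspected, so query strings do not interfere.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    for family, extensions in RECOGNIZED_EXTENSIONS.items():
        if suffix in extensions:
            return family
    return None


print(classify_download_link("https://example.org/data/roads.gpkg"))
print(classify_download_link("https://example.org/page.html"))
```

Note that this sketch treats any `.zip` file as a zipped Shapefile; it does not open the archive, and it does not test whether the link is actually reachable.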
Note
In on-premises mode, the DCAT URL must be made publicly accessible.
List of completed fields
data.gouv.fr field | Isogeo field |
---|---|
Title | Title |
Acronym | Not filled |
Description | Combination of several fields |
Keywords | Keywords and Themes |
License | Conditions |
Spatial coverage | Not retrieved by the harvester |
Time coverage | Period of validity |
Update frequency | Update frequency |
Remote identifier | Unique Isogeo identifier |
URI | Not filled (entered by data.gouv) |
Description
The description is formatted as follows:
Description: Summary
Collection context: Data collection context (if any)
Collection method: Data collection method (if any)
Attributes: array containing field name, alias (or basic comment) and type.
For more information, see the metadata on the Isogeo catalog (OpenCatalog link).
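To make the layout above concrete, here is a minimal sketch of how such a description could be assembled. It is an illustration only: the function `build_description`, its parameters and the pipe-separated rendering of the attribute array are assumptions, not the DCAT server's actual code.

```python
from typing import Optional, Sequence, Tuple


def build_description(
    summary: str,
    context: Optional[str] = None,
    method: Optional[str] = None,
    attributes: Sequence[Tuple[str, str, str]] = (),
    opencatalog_url: Optional[str] = None,
) -> str:
    """Assemble a description following the layout described above."""
    parts = [f"Description: {summary}"]
    # Optional blocks are only emitted when the metadata provides them.
    if context:
        parts.append(f"Collection context: {context}")
    if method:
        parts.append(f"Collection method: {method}")
    if attributes:
        # Assumed rendering: one row per attribute (name, alias, type).
        rows = ["Attributes:", "Name | Alias | Type", "---|---|---"]
        rows += [f"{name} | {alias} | {type_}" for name, alias, type_ in attributes]
        parts.append("\n".join(rows))
    if opencatalog_url:
        parts.append(
            "For more information, see the metadata on the Isogeo catalog "
            f"({opencatalog_url})."
        )
    return "\n\n".join(parts)


print(build_description(
    "Road network",
    context="Annual survey",
    attributes=[("id", "Identifier", "integer")],
))
```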
Update frequency
The Isogeo update frequency is mapped to the data.gouv.fr frequency as follows:
Isogeo | data.gouv |
---|---|
Every hour | Every hour |
Every 6 hours | Four times a day |
Every 12 hours | Twice a day |
Every day | Daily |
Every 3 days | Twice a week |
Every week | Weekly |
Every 2 weeks | Every two weeks |
Monthly | Monthly |
Every 2 months | Bimonthly |
Every 3 months | Quarterly |
Every 4 months | Three times a year |
Every 6 months | Half-yearly |
Every year | Annual |
Every 2 years | Biennial |
Every 3 years | Triennial |
Every 5 years | Five-yearly |
Other frequency | Unknown |
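The correspondence table can be expressed as a simple lookup. This sketch uses the display labels from the table as keys and values; the identifiers actually exchanged between Isogeo and data.gouv.fr are not given in this page, so the dictionary and the helper `to_datagouv_frequency` are illustrative assumptions.

```python
# Display labels copied from the correspondence table (Isogeo -> data.gouv).
ISOGEO_TO_DATAGOUV_FREQUENCY = {
    "Every hour": "Every hour",
    "Every 6 hours": "Four times a day",
    "Every 12 hours": "Twice a day",
    "Every day": "Daily",
    "Every 3 days": "Twice a week",
    "Every week": "Weekly",
    "Every 2 weeks": "Every two weeks",
    "Monthly": "Monthly",
    "Every 2 months": "Bimonthly",
    "Every 3 months": "Quarterly",
    "Every 4 months": "Three times a year",
    "Every 6 months": "Half-yearly",
    "Every year": "Annual",
    "Every 2 years": "Biennial",
    "Every 3 years": "Triennial",
    "Every 5 years": "Five-yearly",
    "Other frequency": "Unknown",
}


def to_datagouv_frequency(isogeo_label: str) -> str:
    """Map an Isogeo frequency label to its data.gouv label.

    Falling back to "Unknown" for unlisted labels is an assumption of this sketch.
    """
    return ISOGEO_TO_DATAGOUV_FREQUENCY.get(isogeo_label, "Unknown")


print(to_datagouv_frequency("Every 6 hours"))
```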
Data download
Several types of links can be published on the data.gouv.fr dataset page as downloadable resources:
- a link to a file uploaded to Isogeo (hosted);
- a link to vector (GeoJSON, zipped Shapefile and GeoPackage), raster (ECW, JPEG2000 and GeoTIFF) or tabular (csv, xls and xlsx) data files, entered as a data link with the download action;
- a link to a WFS service, provided that:
  - the number of features is below the map server threshold (MaxRecordCount=1000 parameter by default for ArcGIS Server, maximum number of features parameter for GeoServer, etc.);
  - the EPSG:4326 coordinate system is available;
  - the GeoJSON export format is available.
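The WFS conditions above can be checked by hand with a GetFeature request asking for GeoJSON in EPSG:4326. The sketch below builds such a URL using standard WFS 2.0.0 parameters; the endpoint and layer name are placeholders, and the `outputFormat` value is an assumption that depends on the map server (GeoServer accepts `application/json`, other servers may expect a different value). It does not reproduce the request actually sent by the DCAT server.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_geojson_request(
    wfs_url: str,
    type_name: str,
    output_format: str = "application/json",  # server-dependent, see lead-in
    count: Optional[int] = None,
) -> str:
    """Build a WFS 2.0.0 GetFeature URL requesting GeoJSON in EPSG:4326."""
    scheme, netloc, path, query, _fragment = urlsplit(wfs_url)
    # Keep any parameters already present on the endpoint (e.g. a map file).
    params = dict(parse_qsl(query))
    params.update({
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "srsName": "EPSG:4326",
        "outputFormat": output_format,
    })
    if count is not None:
        # Limits the number of returned features, handy for a quick test.
        params["count"] = str(count)
    return urlunsplit((scheme, netloc, path, urlencode(params), ""))


# Hypothetical endpoint and layer name, for illustration only.
print(build_geojson_request("https://example.org/geoserver/wfs", "ns:roads", count=10))
```

Opening the resulting URL in a browser should return a GeoJSON FeatureCollection if the service meets the three conditions; an error or an XML response suggests that the output format or coordinate system is not enabled.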
Generation and monitoring of valid data sets
- To generate a new harvesting link, go to Administration, Shares, then New;
- Then choose the DCAT application, a catalog and a Name for the share;
- Click on Create;
- A link is automatically generated, to be referenced in data.gouv.fr.
By adding /debug-page to the URL, you can see which datasets are valid (in green) and will actually be returned to the harvester, and which are invalid (in red) and will not.
Note that service records, and dataset records that do not have a download link meeting the above requirements, are considered invalid.
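Deriving the debug page address from a share link can be sketched as below. The share URL shown is a placeholder, and appending `/debug-page` to the end of the path (before any query string) is an assumption of this sketch; the page only states that the suffix is added to the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def debug_page_url(share_url: str) -> str:
    """Return the debug page address for a DCAT share link."""
    parts = urlsplit(share_url)
    # Avoid a double slash when the share link ends with "/".
    path = parts.path.rstrip("/") + "/debug-page"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Hypothetical share link, for illustration only.
print(debug_page_url("https://dcat.example.org/share/abc"))
```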
Account and organization on data.gouv.fr
- Create an account on data.gouv.fr
  To create an account or log in: https://www.data.gouv.fr/login. We recommend creating the account directly rather than through a social network login.
- Create / join an organization on data.gouv.fr
  To do this, go through your profile administration: https://www.data.gouv.fr/fr/admin/organization/new/. If the organization already exists, request to join it.
Referencing and harvesting DCAT feeds
- Add a new harvester
  Once your DCAT feed has been created in Isogeo, add a new harvester from the data.gouv.fr administration interface.
- Select your organization
- Complete the required fields and click Next
- A message tells you that your harvester needs to be validated by the administration team. Click on "See in administration"
- Click on "Edit", then "Edit"
- Test harvesting by clicking on "Preview" and check the number of validated datasets.
  If a dataset seems to be missing, double-check the prerequisites and then contact the data.gouv.fr team.
- Check harvesting
  Once the harvester has been validated, you can review the harvesting runs that have been carried out; they are launched daily.