The metadata extraction subroutine
Info
This documentation is intended for teams responsible for technical infrastructures and their security. Please do not hesitate to contact Isogeo staff directly for further information.
Presentation
Info
All the processes described below are read-only: no editing operations are carried out on our customers' data.
The Isogeo Windows service, whose installation is described here, includes a sub-program that performs several processing operations on the data targeted by Scan Isogeo:
- verification of the ArcGIS Pro software version and of ESRI license activation
- listing: inventories the data available in:
  - an Oracle, PostgreSQL or Microsoft SQL Server database
  - an ESRI enterprise geodatabase
  - an ESRI file geodatabase
- signature: calculates a hash from the content of a dataset
- lookup: extracts and calculates the metadata describing a dataset:
  - name
  - location
  - coordinate system
  - convex hull (envelope) coordinates in GeoJSON
  - field name, alias and type
  - number of features
  - geometry type
  - number of bands
  - number of columns
  - number of rows
  - band name
  - bounding box coordinates
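To make the signature step concrete, here is a minimal sketch of a content hash computed in read-only mode. The hash algorithm and the exact content fed to it are not specified in this documentation, so SHA-256 over the raw file bytes is an assumption, and `file_signature` is a hypothetical name.

```python
import hashlib
from pathlib import Path


def file_signature(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a hex digest of a file's content, read in chunks (read-only)."""
    # Assumption: SHA-256 is used for illustration only; the algorithm
    # actually used by the sub-program is not documented here.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large datasets are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical content produce the same digest, which is what lets a signature detect whether a dataset has changed between two scans.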
Structure of the JSON produced by a lookup
Vector dataset:
[
  {
    "dataset": {
      "name": "...",
      "formatShort": "...",
      "formatLong": "...",
      "path": "...",
      "numberOfFeatures": 0,
      "coordsys": { // null if the SRS is unknown
        "EPSG": "...",
        "name": "..."
      },
      "type": "...", // POINT | LINESTRING | POLYGON | MULTIPOINT | FEATURECOLLECTION | ...
      "envelope": "...", // stringified GeoJSON of the convex hull, or null
      "attributes": [
        {
          "name": "...",
          "type": "...",
          "alias": "...",
          "length": 0, // optional
          "precision": 0, // optional
          "scale": 0, // optional
          "isNullable": true, // optional
          "editable": true, // optional
          "required": true, // optional
          "domain": "...", // optional
          "defaultValue": "..." // optional
        }
      ],
      "esriVersion": "...", // ESRI only
      "geometryStorage": "...", // ESRI only
      "warnings": { // optional, if the SRS could not be resolved
        "coordsys": {
          "EPSG": "...",
          "name": "...",
          "wkt": "..."
        }
      }
    }
  }
]
Raster dataset:
[
  {
    "dataset": {
      "name": "...",
      "formatShort": "...",
      "formatLong": "...", // absent with FME if not provided
      "path": "...",
      "coordsys": { // null if the SRS is unknown
        "EPSG": "...",
        "name": "..."
      },
      "envelope": "...", // stringified GeoJSON, null, or "" if not georeferenced
      "bands_count": 0,
      "cols_count": 0,
      "rows_count": 0,
      "bands": [
        {
          "name": "...",
          "interpretation": "...",
          "profondeur": "..."
        }
      ],
      "bbox": [xmin, ymin, xmax, ymax],
      "esriVersion": "...", // ESRI only
      "warnings": { // optional, if the SRS could not be resolved
        "coordsys": {
          "wkt": "..."
        }
      }
    }
  }
]
Non-geographic table:
[
  {
    "dataset": {
      "name": "...",
      "formatShort": "...",
      "formatLong": "...",
      "path": "...",
      "numberOfFeatures": 0,
      "type": "...", // no_geom
      "attributes": [
        {
          "name": "...",
          "type": "...",
          "alias": "...",
          "length": 0, // optional
          "precision": 0, // optional
          "scale": 0, // optional
          "isNullable": true, // optional
          "editable": true, // optional
          "required": true, // optional
          "domain": "...", // optional
          "defaultValue": "..." // optional
        }
      ],
      "esriVersion": "...", // ESRI only
      "warnings": { // optional, if the SRS could not be resolved
        "coordsys": {
          "EPSG": "...",
          "name": "...",
          "wkt": "..."
        }
      }
    }
  }
]
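The following sketch shows how a consumer might read such a lookup result. It assumes that the real files are plain JSON (the `//` annotations above being documentation only), that a raster is recognizable by its `bands_count` key and a non-geographic table by a `type` equal to `no_geom`. The function name `summarize_lookup` and the summary keys are hypothetical.

```python
import json
from typing import Any


def summarize_lookup(raw: str) -> list[dict[str, Any]]:
    """Summarize each entry of a lookup result (a JSON array of {"dataset": {...}})."""
    summaries = []
    for item in json.loads(raw):
        dataset = item["dataset"]
        # Assumption: these two markers are enough to tell the three shapes apart.
        if "bands_count" in dataset:
            kind = "raster"
        elif dataset.get("type") == "no_geom":
            kind = "table"
        else:
            kind = "vector"
        # The envelope is GeoJSON serialized as a string, so it needs a second
        # decoding pass; it can also be null or an empty string.
        envelope = dataset.get("envelope")
        geometry = json.loads(envelope) if envelope else None
        # "coordsys" is null when the SRS is unknown.
        coordsys = dataset.get("coordsys") or {}
        summaries.append(
            {
                "name": dataset["name"],
                "kind": kind,
                "epsg": coordsys.get("EPSG"),
                "envelope": geometry,
                "fields": [attr["name"] for attr in dataset.get("attributes", [])],
            }
        )
    return summaries
```

The double decoding of `envelope` is the detail most easily missed: the value is a string containing GeoJSON, not a nested JSON object.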
This sub-program, located in the Isogeo service installation directory, consists of:
- an .exe application file
- an _internal folder containing the DLLs needed to run the application
As indicated here, execution of this application must not be blocked on the machine hosting the Isogeo service; otherwise Scan Isogeo cannot function.
Description
The application and its _internal folder are produced by packaging a program written in Python 3 with PyInstaller 6, in a Windows environment.
The main program dependencies are listed below and are updated regularly:
Writing
.json and .log files are written by the sub-program to the %programdata%\Isogeo\tmp folder and are regularly deleted by the Isogeo service.
.log files
These log files are used for support actions in the event of a processing failure.
.json files
The sub-program writes the processing results to these files, which are read by the Isogeo service.
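For support purposes, the working files can be inventoried without modifying them. The sketch below assumes the folder is resolved from the `PROGRAMDATA` environment variable (with `C:\ProgramData` as a fallback); the helper names `isogeo_tmp_dir` and `list_work_files` are hypothetical.

```python
import os
from pathlib import Path


def isogeo_tmp_dir() -> Path:
    """Resolve %programdata%\\Isogeo\\tmp (assumed fallback: C:\\ProgramData)."""
    program_data = os.environ.get("PROGRAMDATA", r"C:\ProgramData")
    return Path(program_data) / "Isogeo" / "tmp"


def list_work_files(folder: Path) -> dict[str, list[str]]:
    """Group the .json and .log files of a folder by extension, read-only."""
    found: dict[str, list[str]] = {".json": [], ".log": []}
    if folder.is_dir():
        for entry in sorted(folder.iterdir()):
            suffix = entry.suffix.lower()
            if entry.is_file() and suffix in found:
                found[suffix].append(entry.name)
    return found
```

Calling `list_work_files(isogeo_tmp_dir())` on the machine hosting the service returns two lists of file names. Because the Isogeo service deletes these files regularly, either list may be empty.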
Network flows and data access
The sub-program does not initiate any outgoing connections to the Internet, nor does it contact any external service operated by the vendor.
It opens no listening ports and exposes no network services.
The only network communications likely to be carried out are those required to access data explicitly targeted by the Scan Isogeo user.
Depending on the customer's configuration, this may include:
- connection to a relational database (Oracle, PostgreSQL, SQL Server);
- access to an ESRI enterprise geodatabase;
- access to network shares (SMB) containing data files.
These connections are made:
- to administrator-defined hosts;
- on the standard ports of the services concerned (e.g. 1521 for Oracle, 5432 for PostgreSQL, 1433 for SQL Server, 445 for SMB), unless specifically configured by the customer;
- in the security context of the account running the Isogeo Windows service.
The subroutine performs no network scans, no automatic host discovery and no communication to unconfigured destinations.
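As an aid when preparing firewall rules, the destinations implied by the rules above can be derived from the list of configured data sources. This is a sketch only: the service keys, the `expected_destinations` function and the host names in the usage example are hypothetical, and only the default port numbers come from this page.

```python
from typing import Optional

# Default ports quoted in this documentation; a customer-specific port overrides them.
DEFAULT_PORTS = {"oracle": 1521, "postgresql": 5432, "sqlserver": 1433, "smb": 445}


def expected_destinations(
    targets: list[tuple[str, str, Optional[int]]],
) -> list[tuple[str, int]]:
    """Return the sorted, de-duplicated (host, port) pairs for the configured targets.

    Each target is (service, host, port); port is None when the default applies.
    """
    destinations = set()
    for service, host, port in targets:
        destinations.add((host, port if port is not None else DEFAULT_PORTS[service]))
    return sorted(destinations)
```

For example, `expected_destinations([("postgresql", "db01", None), ("oracle", "ora01", 1522)])` yields `[("db01", 5432), ("ora01", 1522)]`. Any flow outside such a list would not be expected from the sub-program.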
Behaviors that may require security validation
Creating child processes
To process data from enterprise geodatabases or ESRI file geodatabases, the subroutine relies on the capabilities of ArcGIS Pro, which must be installed and licensed on the same machine as the Windows service. To do this, it runs scripts with the Python interpreter installed with ArcGIS Pro.
Info
Retrieving the location of the Python interpreter installed with ArcGIS Pro software is documented here.
Executed Python scripts are stored in the _internal folder. The pywin library is used to prevent these child processes from being orphaned if the parent process is stopped.
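One common way to keep child processes from outliving their parent on Windows is a job object with the kill-on-close flag. The sketch below illustrates that general technique with the `win32job` module from pywin32. Whether the sub-program uses exactly this mechanism is an assumption, and `run_in_kill_on_close_job` is a hypothetical name.

```python
import subprocess
import sys


def run_in_kill_on_close_job(command: list[str]) -> int:
    """Run a command and return its exit code.

    On Windows, the child is attached to a job object so that it is terminated
    if this parent process exits first. Elsewhere, it is a plain subprocess run.
    """
    process = subprocess.Popen(command)
    job = None  # kept referenced until the child exits, so the job stays open
    if sys.platform == "win32":
        # Assumption: pywin32 is installed; these modules exist only on Windows.
        import win32api
        import win32con
        import win32job

        job = win32job.CreateJobObject(None, "")
        info = win32job.QueryInformationJobObject(
            job, win32job.JobObjectExtendedLimitInformation
        )
        # Closing the last handle to the job kills every process assigned to it.
        info["BasicLimitInformation"]["LimitFlags"] |= (
            win32job.JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE
        )
        win32job.SetInformationJobObject(
            job, win32job.JobObjectExtendedLimitInformation, info
        )
        handle = win32api.OpenProcess(win32con.PROCESS_ALL_ACCESS, False, process.pid)
        win32job.AssignProcessToJobObject(job, handle)
    return process.wait()
```

The command would typically be the path of the ArcGIS Pro Python interpreter followed by a script from the _internal folder. Note that this simple version assigns the child to the job just after it starts, which leaves a short window during which it is not yet covered.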
Resetting the DLL search directory
To avoid DLL conflicts between the subroutine and the ArcGIS Pro Python interpreter it uses, the following lines have been added to the source code:
import sys
if sys.platform == "win32":
    import ctypes
    # Passing None (NULL) restores the default Windows DLL search order.
    ctypes.windll.kernel32.SetDllDirectoryW(None)
This code restores the default Windows behavior when searching for DLLs. This is the safest way to ensure that ArcGIS Pro's Python interpreter accesses its own DLLs.