Geospatial lineage can be defined as the part of metadata that describes the origin of the data (in essence, the sources and processes used). Its usefulness has been recognized for data discovery, quality assessment, and reproducibility of geographic information. Despite the existence of scientific literature and data models to represent it, lineage information is generally still scarce in geospatial metadata and, when present, is not comprehensive enough. The main hypothesis of this PhD is based on the evidence that the absence of lineage information in geospatial metadata acts as a barrier to the interoperability and reproducibility of data, processes and geospatial models, in both scientific and administrative environments. In this scenario, further research is needed to propose new mechanisms that promote a greater incorporation of lineage information into geospatial metadata. First, this PhD investigates the deficiencies in the representation, capture, storage and visualization phases of lineage information. Second, it proposes alternatives, at both the theoretical and the practical level, that promote a better description of lineage and so increase its presence in metadata. Finally, it proposes methodologies to increase its usefulness, both in the context of Geographic Information Systems (GIS) and in distributed web environments.
Chapters 2, 3, 4, and 6 make proposals to improve the capabilities of the lineage models. Specifically, chapter 2 proposes an adaptation of the W3C PROV model (a generic model to describe the lineage of all types of information on the web) to the particularities of geographic information, in order to describe lineage at layer, feature and attribute level. Chapters 3 and 4 propose combining the lineage models included in ISO 19115-1 and ISO 19115-2 with the Web Processing Service (WPS) standard from the Open Geospatial Consortium (OGC) to improve the completeness of the lineage data model. Finally, chapter 6 emphasizes the need to represent and relate the lineage of different datasets to maximize its benefits.
Chapters 3 and 4 also present a tool called Provenance Engine (PE). The tool, developed in the framework of the MiraMon GIS and Remote Sensing software, automatically captures the lineage of executions performed by MiraMon. Tools that enhance the interpretation of lineage are necessary, as this has a direct impact on its understanding and usefulness. In this sense, MiraMon can visualize lineage as an indented sequence of processes, including all parameters used and the outputs generated. In addition, chapter 6 presents a system that provides and renders lineage information as a graph in a distributed web environment.
Finally, some proposals have been formulated to increase the usefulness of lineage and give it added value. Chapter 5 sets a theoretical basis for querying lineage information on remote sensing data so that only the fragments of data or processes of interest are returned. Chapter 6 expands and complements the work presented in chapter 5. Specifically, it provides the design of a query system embedded within a map browser that presents the lineage information of some of the layers included in the browser in a single view and allows the workflows executed to generate the different datasets to be compared.
| Date of Award | 29 Jan 2021 |
|---|---|
| Original language | Catalan |
| Supervisor | Joan Masó Pau (Director) |
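Aportacions en el camp del Llinatge Geospacial en entorns distribuïts: de la captura a l’explotació (Contributions in the field of geospatial lineage in distributed environments: from capture to exploitation)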
Closa Santos, G. (Author). 29 Jan 2021
Student thesis: Doctoral thesis