
What about metadata?

Abstract

Statistics Canada has been a producer of spatial data for many years. Up to the 1971 Census of Population, the spatial data were mainly analogue maps. For the 1971 Census of Population, Statistics Canada released its first digital road network. Through the following years, more and more digital spatial data became available. The geographic products released for the 1996 Census included some elements of a metadata structure. These elements were included in the user guide accompanying each product. The spatial data quality elements provided information on the fitness-for-use of a spatial database by describing why, when and how the data were created, and how accurate the data are. With the 2001 Census of Population, the Geography Division created an environment or warehouse to integrate the data and the metadata. As a result of having a metadata framework or model, excellent metadata were created for tabular data. Unfortunately, the metadata accompanying the spatial data were not well articulated. In preparation for the 2006 Census, more efforts have been dedicated to improving the metadata for spatial data. This paper describes the challenges related to the production of metadata and the current efforts to improve metadata within Statistics Canada's Geography Division.

Introduction

An increase in the awareness of the importance of geography and how things relate spatially, combined with the advancement of electronic technology, has caused an expansion in the use of digital geographic information and geographic information systems worldwide. Increasingly, individuals from a wide range of disciplines outside of the geographic sciences and information technologies are capable of accessing, using, producing, enhancing, and modifying digital geographic information.

Metadata are the key in this new world of geographic awareness. Firstly, metadata are the root of information findability, whether they are administrative, structural or descriptive. They allow a producer to describe a dataset fully so that users can find and access the data. This information can then be provided to data catalogues and clearinghouses. By making metadata available through data catalogues and clearinghouses, organizations can find data to use, partners with whom to share data collection and maintenance efforts, and customers for their data.

Secondly, metadata provide users with the information they need to understand the data and their limitations, and to evaluate the dataset's applicability for their intended use. This is important because digital geographic data are only an abstraction of reality, always partial, and always just one of many possible "views". This view or model of the real world is not an exact duplication; some things are approximated, others are simplified, and some things are ignored. Data are seldom perfect, complete, and correct. Unlike the data producer, who is aware of a dataset's limitations, a user must rely on proper documentation to develop a better understanding of a product, thereby enabling them to use it properly.

Although many different metadata standards have been developed, they all have essentially the same elements and objectives. These are to:

* Provide data producers with appropriate information to characterize their geographic data properly;

* Organize and maintain an organization's investment in data. Metadata help protect that investment: as personnel change or time passes, information about an organization's data will be lost and the data may lose their value. This is particularly important at a time when many organizations are losing staff who are reaching retirement age;

* Facilitate the organization and management of metadata for geographic data;

* Enable users to apply geographic data in the most efficient way by knowing its basic characteristics;

* Facilitate data discovery, retrieval and reuse. Users will be better able to locate, access, evaluate, purchase and utilize geographic data.

Barriers to metadata

Producing metadata is one of the most important activities for data producers, but it is often treated as a secondary deliverable and usually started only once the primary data deliverable has been completed. To better understand the reasons for this, the Federal Geographic Data Committee (FGDC) commissioned, in 2005, a study of metadata development in the biological and ecosystem science domain (GeoWorld, January 2006). The study aimed to identify common obstacles to metadata development and provide recommendations to resource-management agencies for best policies and practices to facilitate metadata development.

Before the study began, several obstacles to metadata development were assumed. These included lack of time and money, issues of data ownership and privacy, difficult-to-use authoring tools, complex standards, and a lack of directives or mandates to create metadata records. In all, 26 obstacles to metadata development were identified. These obstacles revolved around issues of organization, standards, education, process and tools.

The most commonly reported hurdle to metadata development within an organization is that managers don't believe in the value of metadata and view metadata development as ancillary to the enterprise's mission and objectives. Because metadata are then not tied to the business goals of the agency, the metadata management program is at risk of being abandoned during financially difficult or busy times.

How the organization treats data management through its organizational structure was another key factor. It was noted that personnel assigned specifically to metadata activities are frequently buried within the organization. Other obstacles reported were:

* development of metadata is the responsibility of the data owner and it is left until the work on the data is completed;

* most metadata developers can be considered occasional developers and current metadata authoring tools are difficult to use;

* complexity of standards.

It has long been recognized by the Geography Division (GEO) of Statistics Canada that the best time to collect metadata is while the data are being developed, when the information needed for metadata is known. Waiting until after the data are developed increases the risk that less accurate information will be recorded and raises the cost of searching for that information. As a result, a number of initiatives have been undertaken by GEO to facilitate the production of metadata.

Development of metadata in the Geography Division of Statistics Canada

Statistics Canada has been a producer of spatial data for many years. Up to the 1971 Census of Population, the spatial data were mainly analogue maps. For the 1971 Census of Population, Statistics Canada released its first digital road network. Through the following years, more and more digital spatial data became available. The geographic products released for the 1996 Census were the first to include some elements of a geospatial metadata structure. These elements were included in the user guide accompanying each product. The spatial data quality elements provided information on the fitness-for-use of a spatial database by describing why, when and how the data were created, and how accurate the data are. The elements include an overview describing the purpose and usage of a dataset, as well as specific quality elements describing the lineage, positional accuracy, attribute accuracy, logical consistency and completeness. The inclusion of this information was a direct result of an official policy that requires Statistics Canada programmes to inform users of the concepts and methodology used in collecting, processing and analysing their data, of the accuracy of these data, and of any other features that affect their quality or "fitness for use".
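As an illustration only, the quality elements listed above can be held in a small record that reports which elements have not yet been documented. This is a minimal sketch in Python; the class name `QualityStatement`, the function `missing_elements` and the sample text are hypothetical and do not describe any Statistics Canada system.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class QualityStatement:
    """Hypothetical container for the quality elements named in the text."""
    overview: Optional[str] = None             # purpose and usage of the dataset
    lineage: Optional[str] = None
    positional_accuracy: Optional[str] = None
    attribute_accuracy: Optional[str] = None
    logical_consistency: Optional[str] = None
    completeness: Optional[str] = None


def missing_elements(statement: QualityStatement) -> List[str]:
    """Return the names of elements that are absent or blank."""
    return [
        f.name
        for f in fields(statement)
        if not (getattr(statement, f.name) or "").strip()
    ]


# Illustrative values only; not taken from any real product.
example = QualityStatement(
    overview="Illustrative road network file for thematic mapping.",
    lineage="Illustrative text: digitized from source maps.",
)
print(missing_elements(example))
```

A check of this kind could be run while the data are still being developed, so that gaps in the documentation are visible before a product is released.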

For the 2001 Census, more effort was put into creating geospatial metadata, especially for products being made available to the public. Figure 1 shows the link to information about census geography products. Figure 2 shows that FGDC-compliant metadata were even produced for the 2001 Census Cartographic Boundary Files, which were provided to the GeoConnections website.

[FIGURE 1 OMITTED]

While this was a significant improvement, even more efforts have been dedicated to improving the metadata for Statistics Canada's spatial data in preparation for the 2006 Census.

[FIGURE 2 OMITTED]

Working group on spatial metadata

The current divisional metadata situation was discussed by Geography Division management and it was generally agreed that metadata standards required improvement, co-ordination between projects and standardisation as well as the establishment of clear roles and responsibilities for metadata collection. In order to improve the metadata component, the Geography Division management put in place a working group consisting of representatives of key projects, systems, and metadata authorities as a vehicle for the discussion and development of best practices for metadata and related issues. The primary objectives of the group were:

* to increase the use and value of metadata through the development of best practices;

* to define and resolve issues with respect to the development, warehousing and dissemination of metadata;

* to make recommendations on the implementation and continual refinement of metadata while considering user objectives and requirements; and

* to serve as an authoritative counsel for those involved in metadata activities.

There was agreement that establishing a standard will have to be a fluid process. There was also general consensus that data quality is one area that will likely require additional time to implement, since various data quality measures and tests will require time to investigate. The point was raised that an effort should be made to include data quality sooner rather than later in order to boost the utility of the data.

The major task for the working group was to establish a consensus on what the GEO version of divisional metadata should contain. To assist in prioritizing metadata content, a ranking exercise was devised that started with the examination of the FGDC standard. Each member of the working group ranked the seven sections in order of importance, and the combined score determined each section's overall priority. The results were used to establish a priority list for the completion of metadata by the various data producers within the division.
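The ranking exercise can be sketched as a simple rank aggregation. The paper does not say how the individual rankings were combined, so the sketch below assumes that each section's rank positions are summed and that the lowest total gets the highest priority. The section names are the seven main sections of the FGDC Content Standard for Digital Geospatial Metadata; the function name `combine_rankings` and the sample ballots are hypothetical.

```python
from typing import Dict, List

# The seven main sections of the FGDC Content Standard for Digital
# Geospatial Metadata.
FGDC_SECTIONS = [
    "Identification Information",
    "Data Quality Information",
    "Spatial Data Organization Information",
    "Spatial Reference Information",
    "Entity and Attribute Information",
    "Distribution Information",
    "Metadata Reference Information",
]


def combine_rankings(ballots: List[List[str]]) -> List[str]:
    """Sum each section's rank position (1 = most important) over all
    ballots and return the sections from lowest total to highest.

    Summing positions is an assumption; the source only says that a
    combined score was used.
    """
    totals: Dict[str, int] = {section: 0 for section in FGDC_SECTIONS}
    for ballot in ballots:
        if sorted(ballot) != sorted(FGDC_SECTIONS):
            raise ValueError("each ballot must rank all seven sections exactly once")
        for position, section in enumerate(ballot, start=1):
            totals[section] += position
    # sorted() is stable, so tied sections keep the standard's own order.
    return sorted(FGDC_SECTIONS, key=lambda section: totals[section])


# Hypothetical ballots from three members.
swapped = [FGDC_SECTIONS[1], FGDC_SECTIONS[0]] + FGDC_SECTIONS[2:]
ballots = [FGDC_SECTIONS, swapped, swapped]
priority = combine_rankings(ballots)
print(priority[0])
```

Other combination rules, such as averaging or weighting members differently, would fit the same structure by changing how `totals` is accumulated.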

Internal Website

Internal websites were created to help Statistics Canada personnel find data (Figure 3). These websites can be accessed from the Statcan Internal Communication Network (ICN) by any employee of the organization. One website is dedicated to the description and archiving of historical data. The other contains data for the current Census and has a number of subcategories to help users. These include guidelines for the naming of new attributes, entity relationship diagrams which provide a concise overview of the data holdings (Figure 4), both attribute and concept definitions, and FGDC-compliant metadata for each table/layer. The website also has a metadata generator which can be used as a starting point for data producers to create basic metadata for a layer/table (Figure 5).

[FIGURE 3 OMITTED]

[FIGURE 4 OMITTED]

[FIGURE 5 OMITTED]
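The paper does not describe how the metadata generator works internally. As a rough illustration of what a "starting point" record might look like, the sketch below builds a skeleton XML record using short tag names from the FGDC standard (`idinfo`, `citeinfo`, `descript`, `metainfo` and so on). The function name `basic_metadata` and the sample values are hypothetical, and a compliant record requires many more elements than are shown here.

```python
import xml.etree.ElementTree as ET
from datetime import date


def basic_metadata(title: str, originator: str, abstract: str, purpose: str) -> str:
    """Build a skeleton record for a layer or table.

    Only a handful of elements are filled in; the data producer would
    complete the remaining sections afterwards.
    """
    metadata = ET.Element("metadata")

    # Section 1: identification information
    idinfo = ET.SubElement(metadata, "idinfo")
    citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
    ET.SubElement(citeinfo, "origin").text = originator
    ET.SubElement(citeinfo, "title").text = title
    descript = ET.SubElement(idinfo, "descript")
    ET.SubElement(descript, "abstract").text = abstract
    ET.SubElement(descript, "purpose").text = purpose

    # Section 7: metadata reference information
    metainfo = ET.SubElement(metadata, "metainfo")
    ET.SubElement(metainfo, "metd").text = date.today().strftime("%Y%m%d")
    ET.SubElement(metainfo, "metstdn").text = (
        "FGDC Content Standard for Digital Geospatial Metadata"
    )
    ET.SubElement(metainfo, "metstdv").text = "FGDC-STD-001-1998"

    return ET.tostring(metadata, encoding="unicode")


# Hypothetical layer description, for illustration only.
record = basic_metadata(
    title="Example boundary layer",
    originator="Example data producer",
    abstract="Illustrative abstract for a boundary layer.",
    purpose="Illustrative purpose statement.",
)
print(record)
```

Generating the fixed parts of a record automatically leaves the producer with only the descriptive fields to write, which addresses the obstacle that authoring tools are difficult for occasional metadata developers.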

Conclusion

Statistics Canada's Geography Division has taken a major step forward in the development of metadata for the 2006 Census. A key lesson learned is that although the creation of metadata is not always appealing, it can be done with the strong support of management. The challenge for the coming years will be to increase the number of metadata elements, to keep abreast of new standards and to encourage producers of spatial data to include metadata before loading data into our data warehouse.

Robert Parenteau

Spatial Data Infrastructure / Infrastructure des donnees spatiales

SDI Section Chief / Chef de la section IDS

Geography Division / Division de la geographie

Statistics Canada / Statistique Canada

Phone: (613) 951-2958

e-mail: robert.parenteau@statcan.ca

Glen Hohlmann

Spatial Data Infrastructure / Infrastructure des donnees spatiales

Manager - SDI Improvement Task / Gestionnaire--Tache d'amelioration de IDS

Geography Division / Division de la geographie

Statistics Canada / Statistique Canada

Phone: (613) 951-3897

e-mail: glen.hohlmann@statcan.ca
COPYRIGHT 2006 Urban and Regional Information Systems Association (URISA)
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2006 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Parenteau, Robert; Hohlmann, Glen
Publication:Urban and Regional Information Systems Association Annual Conference Proceedings
Article Type:Technical report
Geographic Code:1CANA
Date:Jan 1, 2006
Words:1875