MCP: The Metadata Collection Parser

The Metadata Collection Parser (MCP) is a Perl script that takes a collection of GIS metadata, which may span several directories and subdirectories, and creates a nifty hypertext catalog with several cross-referenced indexes. It was written by Paul Cote, GIS Specialist at the Harvard Graduate School of Design.
For an example of the metadata catalog produced, click here
MCP was created to provide an easy way to catalog and recatalog geographic data, even when the data is rearranged into a different directory structure or put onto CD. One big advantage of the catalogs it produces is that, although they have many cross-referenced indexes, they are nothing but hypertext, which can be burned onto CD and perused with any hypertext browser.
Most of what appears in the catalog comes from two sources: the mp-compliant metadata files associated with each dataset in the collection, and the index.htm files, which are HTML-format readme files explaining the content of each subdirectory. MP does a lot of work for MCP, checking the formatting of the metadata files and generating HTML reports.
The cataloger parses each mp-compliant file, and then creates the following:
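To give a feel for how a collection spread across nested directories can be gathered, here is a minimal Perl sketch using the core File::Find module. It is not taken from catalog.pl; the `collect_files` helper is hypothetical, and only the index.htm file name comes from the description above.

```perl
use strict;
use warnings;
use File::Find;

# Return the sorted full paths of all plain files under $root whose
# base name matches $pattern. (Hypothetical helper, not part of MCP.)
sub collect_files {
    my ($root, $pattern) = @_;
    my @found;
    find(sub {
        # Inside the callback, $_ is the base name of the current entry
        # and $File::Find::name is its path including $root.
        return unless -f $_ && $_ =~ $pattern;
        push @found, $File::Find::name;
    }, $root);
    my @sorted = sort @found;
    return @sorted;
}

# Usage: perl find_index.pl /path/to/collection
if (@ARGV) {
    print "$_\n" for collect_files($ARGV[0], qr/^index\.htm$/);
}
```

The same helper could be pointed at the metadata files themselves by passing a pattern that matches whatever naming convention your collection uses.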

A nice metadata summary for each dataset, including an easy-to-read attribute dictionary with sub-tables for enumerated domain values. (Click here for an example.)
Multiple indexes to the summaries, by directory, by theme keyword, or geography keyword. (Example)
In the midst of doing this, the cataloger calls on MP to create its comprehensive HTML format metadata, and a listing of any errors that it found in the metadata.
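The keyword indexes can be pictured as an inverted index from keyword to datasets. The sketch below is only an illustration of that idea, not MCP's actual code: it assumes the metadata is in the indented text form that mp accepts, uses the FGDC element names `Theme_Keyword` and `Place_Keyword` (the latter assumed to correspond to the geography keywords), and the dataset names and the `build_keyword_index` helper are hypothetical.

```perl
use strict;
use warnings;

# Build an inverted index: lowercased keyword => list of dataset names.
# $element is the metadata element to index; %metadata maps a dataset
# name to the text of its metadata file.
sub build_keyword_index {
    my ($element, %metadata) = @_;
    my %index;
    for my $dataset (sort keys %metadata) {
        # Match lines such as "      Theme_Keyword: Hydrography".
        # The ":" right after the element name keeps
        # "Theme_Keyword_Thesaurus:" from matching.
        while ($metadata{$dataset} =~ /^\s*\Q$element\E:\s*(.+?)\s*$/mg) {
            my $keyword = lc $1;
            push @{ $index{$keyword} }, $dataset;
        }
    }
    return %index;
}

# Two tiny, made-up metadata fragments for illustration.
my %metadata = (
    'rivers.met' => <<'END',
Identification_Information:
  Keywords:
    Theme:
      Theme_Keyword_Thesaurus: None
      Theme_Keyword: Hydrography
    Place:
      Place_Keyword_Thesaurus: None
      Place_Keyword: Massachusetts
END
    'wetlands.met' => <<'END',
Identification_Information:
  Keywords:
    Theme:
      Theme_Keyword_Thesaurus: None
      Theme_Keyword: hydrography
      Theme_Keyword: Wetlands
    Place:
      Place_Keyword_Thesaurus: None
      Place_Keyword: Massachusetts
END
);

my %by_theme = build_keyword_index('Theme_Keyword', %metadata);
for my $keyword (sort keys %by_theme) {
    print "$keyword: @{ $by_theme{$keyword} }\n";
}
```

Each key of the resulting hash would become one entry on an index page, with the listed datasets linking to their summaries.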

Architecture and Dependencies
There are a few things you need to know, and a few resources you need to have before you can exploit MCP:

FGDC Metadata Standards and the MP metadata parser
MCP is built on the FGDC metadata standard, and the file formats supported by the MP metadata parser.
Perl
MCP does a lot of parsing and indexing. Perl (a programming language by Larry Wall) provides the guts of the program. If you don't know Perl, you should learn it. Please do not write to the author of MCP with Perl-specific questions.

Downloading and learning MCP

You can download a tar-gzipped distribution of MCP by clicking here. This distribution has been prototyped on Unix, and may need some alterations to run on other platforms. It will unpack as a directory named mcp. This directory contains a file list that you will probably want to read.

It is easy to start learning MCP by running it against the sample metadata collection contained in the sample_md_tree directory. There are three things you will undoubtedly need to change to make catalog.pl work on your system before running it on the sample data:

change the perl path at the first line of catalog.pl
change the $refdir for your collection in sample.conf to be the full system path of the sample_md_tree directory.
change the entry for $mp_path in the sample.conf file to point to the system path for your version of mp.
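As a rough sketch, the three edits might look something like the following. The paths are hypothetical placeholders, and the plain Perl assignment syntax is an assumption based on the `$refdir` and `$mp_path` variable names; check the sample.conf shipped with the distribution for the exact form.

```perl
#!/usr/bin/perl
# ^ First line of catalog.pl: point this at the perl on your system
#   (/usr/bin/perl is only an example location).

# Entries in sample.conf (sketch; both paths are hypothetical placeholders,
# and the assignment syntax is assumed, not copied from the distribution).
$refdir  = "/home/you/mcp/sample_md_tree";  # full path to the sample collection
$mp_path = "/usr/local/bin/mp";             # path to your mp (assumed here to be the executable)
```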

Before you run catalog.pl the first time, you should look through the sample_md_tree directory and take a look at what is in there. catalog.pl will create a bunch of other files, and you will understand the process better if you see the 'before picture.' If you are reading this too late, you can always unpack a new sample_md_tree from your tgz archive.

Now you should be able to run catalog.pl with the single argument being the name of the configuration file:

catalog.pl sample.conf

Now that you have seen what it does, I will leave it as an exercise for you to read sample.conf, and the various readme.htm and index.htm files in the sample directory to figure out their roles. You should then be able to run catalog.pl on your own directories.

A couple of related, maybe useful, not well documented ArcView extensions:

GSD Metadata Helper, an extension to aid in the construction of mp-compliant metadata files. (For more information, choose 'About Metadata Helper' from the 'Metadata' menu.)
The metadata browser extension, which lets you get the nifty metadata summaries for themes from within ArcView. If a theme is associated with a file called theme_name.htm, click on the "M" button to pull up the HTML summary.