System Requirements
HARVESTING REQUIREMENTS
- OAI-PMH Harvesting - The service must harvest metadata from repositories which are OAI-PMH version 2.0 compliant.
- Metadata Schemes - NSDL - The service must be able to harvest NSDL/DC metadata records.
- Metadata Schemes - Unqualified Dublin Core - The service must be able to harvest unqualified Dublin Core metadata records.
- Multiple Collections - The service must allow for access to multiple, separate collections.
- Additional Metadata Schemes - The service may allow for additional metadata schemes to be harvested. Ideally, these additional schemes would be extensible by the user through an administrative interface or template.
- XML Validation - The service should be able to check XML validity at the time of harvesting.
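The harvesting requirements above can be illustrated with a minimal sketch, assuming Python and its standard library only. The function names, the sample response, and the example.org identifiers are hypothetical and are not part of the specification. The check shown covers XML well-formedness only; validating against a schema would need an additional library.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and by unqualified Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical ListRecords response, trimmed to the parts used below.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH 2.0 ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def is_well_formed(xml_text):
    """Return True if the text parses as XML (well-formedness only)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(OAI_NS + "header/" + OAI_NS + "identifier")
        titles = [t.text for t in record.iter(DC_NS + "title")]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(OAI_NS + "ListRecords/" + OAI_NS + "resumptionToken")
    # An empty or missing token means the list is complete.
    return records, (token or None)
```

A harvester would repeat the request with the returned resumption token until none is returned. Harvesting a different scheme is a matter of passing another metadataPrefix, provided the repository advertises it.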
QUERY REQUIREMENTS
- Z39.50 Query Protocol - The service must provide Z39.50-2003 protocol access for querying and retrieving metadata records from stored collections.
- Result set format - USMarc - Query result sets must be provided in USMarc format.
- Result set format - Marc/XML - Query result sets must be provided in Marc/XML format.
- Result set format - SUTRS - Query result sets must be provided in SUTRS-compatible format.
- Standard Mapping - Standard Z39.50 mappings for NSDL Dublin Core and Unqualified Dublin Core must be provided out-of-the-box.
- Z39.50 Mappings - The user must have the ability to extend or change mappings from metadata schemes to their Z39.50 equivalents.
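The mapping requirements can be sketched as a default table plus user overrides. This is only an illustration in Python: the field and subfield choices loosely follow the Library of Congress Dublin Core to MARC crosswalk and should be treated as assumptions, and the names DEFAULT_DC_TO_MARC, merge_mappings, and map_record are hypothetical.

```python
# Default mapping: Dublin Core element -> (MARC tag, subfield code).
# The values are illustrative assumptions, not mandated by this document.
DEFAULT_DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
    "language": ("546", "a"),
    "rights": ("540", "a"),
}


def merge_mappings(defaults, overrides):
    """Return a new mapping in which user overrides replace or extend defaults."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


def map_record(dc_record, mapping):
    """Translate {element: [values]} into a list of (tag, subfield, value)."""
    fields = []
    for element, values in dc_record.items():
        if element not in mapping:
            continue  # elements without a mapping are skipped
        tag, code = mapping[element]
        for value in values:
            fields.append((tag, code, value))
    return fields
```

Keeping the defaults and the overrides in separate tables means the out-of-the-box mapping stays intact and a user's changes can be reverted by discarding the override table.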
COLLECTION STORAGE REQUIREMENTS
- Multiple OAI-PMH targets kept as individual query collections - The system must provide the ability to store harvested collections separately and allow querying of the separate collections individually.
- Multiple OAI-PMH targets combined into single query collection - The system may provide the ability to harvest from multiple repositories and store multiple repositories in a single, local collection for combined querying purposes.
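The two storage modes above can be sketched with a small in-memory stand-in for the real index. This is an assumption-laden illustration in Python: the CollectionStore class and its methods are hypothetical, and a substring match on the title replaces real Z39.50 query evaluation.

```python
from collections import defaultdict


class CollectionStore:
    """Hypothetical in-memory stand-in for separately stored collections."""

    def __init__(self):
        self._records = defaultdict(list)  # collection name -> list of records
        self._groups = {}                  # combined name -> member collections

    def add_records(self, collection, records):
        """Store harvested records under their own collection name."""
        self._records[collection].extend(records)

    def define_combined(self, name, members):
        """Declare a combined collection that spans several stored ones."""
        self._groups[name] = list(members)

    def search(self, collection, term):
        """Search one collection, or every member of a combined collection."""
        names = self._groups.get(collection, [collection])
        term = term.lower()
        return [
            record
            for name in names
            for record in self._records.get(name, [])
            if term in record["title"].lower()
        ]
```

In this sketch a combined collection is only a named list of members, so each harvested repository stays individually queryable even when it also belongs to a combined one.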
USER ADMINISTRATION REQUIREMENTS
Description of the tool:
The goal of the administration tool is to give a user/librarian easy control over the system. It is assumed that the user has very little familiarity with the specific details of the text-matching algorithm. It is also assumed that the user has general knowledge of Z39.50 servers and clients, USMARC, XML tags, and OAI records.
The tool should be able to provide the following features:
Add/Edit/Delete information about OAI sources
- Add a new OAI source – Enter the URL and harvesting parameters
- Edit existing source
- Delete OAI source
Index (Add/Update/Delete) the data into the Zebra server using the Zebra indexing tool
- Create a new table with a provided name
- Delete a table with a provided name
Create/Edit/View filter(s)
- Specify filter name
Miscellaneous features
- Mechanism to save the settings into a file such as oai.cfg
- Harvest through a selectable list of OAI sources (create the HTTP connection, store the OAI records in folders, print a harvesting report)
- Error reporting mechanism. Error responses from OAI sources must be caught.
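The settings file and the harvest report described above can be sketched as follows. This is a minimal illustration in Python: the INI layout is an assumed format for a file such as oai.cfg, and save_sources, load_sources, harvest_all, and the fetch callable are hypothetical names.

```python
import configparser
import io


def save_sources(sources, fileobj):
    """Write OAI sources as INI sections, one section per source."""
    # interpolation=None keeps '%' characters in URLs from being interpreted.
    config = configparser.ConfigParser(interpolation=None)
    for name, settings in sources.items():
        config[name] = settings
    config.write(fileobj)


def load_sources(fileobj):
    """Read the sources back into a {name: settings} dictionary."""
    config = configparser.ConfigParser(interpolation=None)
    config.read_file(fileobj)
    return {name: dict(config[name]) for name in config.sections()}


def harvest_all(sources, fetch):
    """Harvest each source with fetch(url) and return a report of outcomes."""
    report = []
    for name, settings in sources.items():
        try:
            payload = fetch(settings["url"])
            report.append((name, "ok", f"{len(payload)} bytes"))
        except Exception as exc:
            # Catch broadly so one failing source does not stop the run.
            report.append((name, "error", str(exc)))
    return report


# Round trip through an in-memory buffer standing in for oai.cfg.
sources = {"example": {"url": "http://example.org/oai", "metadata_prefix": "oai_dc"}}
buffer = io.StringIO()
save_sources(sources, buffer)
buffer.seek(0)
restored = load_sources(buffer)
```

Passing the fetch function in as a parameter keeps the HTTP layer replaceable, so the reporting logic can be exercised without network access.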
Undecided or low-priority features
- A way to start/shutdown zebra
- Specify listening port
- Web interface for searching Z39.50 online
- Customizable paths for storing records and settings
- Compatibility with Win32 architecture
- Specify 'match' string
- Specify characteristics of a string (integer, string)
- Specify how to match each tag in the following format: "from <> to <> and to ### usmarc field"
- Display summary information on tables, records, etc.