Tuesday, July 14, 2009

HARVEST FROM NTRS

NTRS promotes the dissemination of NASA STI to the widest audience possible by allowing NTRS information to be harvested by sites using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH defines a mechanism for information technology systems to exchange citation information using the open standards HTTP (Hypertext Transfer Protocol) and XML (Extensible Markup Language). NTRS is designed to accept and respond to automated requests using OAI-PMH. Automated requests harvest citation information only, not the full-text document images.

Sites interested in harvesting from NTRS should review the following guidance before harvesting:

  • Use of Government Information
    The NTRS serves out unlimited, unclassified, publicly available NASA citations and full-text documents (PDFs). Persons, organizations, and sites interested in obtaining NASA information should review "Disclaimers, Copyright, Terms and Conditions of Use" for guidance.
  • Harvesting Images
    NTRS actively blocks spidering, robots, and intelligent agents from automatically retrieving the full-text images. Links to full-text documents (PDFs) are included in the citations. The URL image link in the harvested NTRS metadata is a way for your users to access the full-text document image residing on NTRS.
  • Harvesting Metadata Citations
    • The NTRS is an OAI-compliant data provider. It implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), a standard for retrieving metadata from digital document repositories.
    • NTRS supports OAI-PMH version 2.0. It does not support earlier versions of the protocol.
    • The base URL for the NTRS is http://ntrs.nasa.gov/
    • A single OAI request to NTRS returns a maximum of 300 records. If you plan on harvesting more than 300 records, please run the harvest outside of the NTRS system's peak hours. Peak hours are Monday to Friday, 5:00 AM to 9:00 PM, U.S. Eastern time. Do not make more than one request every 3 seconds, even at off-peak times.
    • Users can harvest the NTRS data by sending an OAI-compliant request to the NTRS archive. The URL is http://ntrs.nasa.gov/?verb followed by an equals sign and a verb value, for example http://ntrs.nasa.gov/?verb=Identify. The valid verb values are:

      Identify = Provides a description of the NTRS repository
      ListMetadataFormats = Gives the metadata format(s) available for request from NTRS
      ListSets = Provides a list of the NTRS defined sets. These results can help refine your request by asking for one specific set of data versus the entire NTRS collection
      ListIdentifiers = Gives a list of the OAI unique identifiers available within NTRS
      ListRecords = Gives a listing of N records at a time. NTRS is currently set to give 300 records at a time with a Resumption Token at the end if more records are available for the request received
      GetRecord = Will provide the user the XML file for a specific record
    • Supported Data Format: Records may be retrieved from NTRS in the oai_dc format. The supported format(s) can be viewed with this request: http://ntrs.nasa.gov/?verb=ListMetadataFormats
  • Updated, Modified, and Deleted Citations and Full-Text Documents
    Over time, metadata citations and full-text document images may be updated, modified, and/or deleted as a result of regular data management. The best method to detect changes in NTRS information is regular harvesting of NTRS using OAI-PMH. Newly updated and/or modified records will automatically replace previously harvested records. Records marked as 'deleted' require additional processing on your site to identify the NASA citations that should be removed from your repository.
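
The guidance above can be sketched as a small Python harvester. This is an illustration under stated assumptions, not NTRS's own client: the function names (build_request_url, parse_list_records, harvest, fetch_over_http) are hypothetical, the base URL and the 3-second spacing come from the guidance, and the resumptionToken handling and the status="deleted" header attribute follow the OAI-PMH 2.0 specification rather than anything NTRS documents here.

```python
import time
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://ntrs.nasa.gov/"      # base URL stated in the guidance
MIN_INTERVAL_SECONDS = 3                # no more than one request every 3 seconds
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}  # OAI-PMH 2.0 namespace


def build_request_url(verb, **params):
    """Build an OAI-PMH request URL such as ...?verb=ListRecords&metadataPrefix=oai_dc."""
    query = {"verb": verb}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


def parse_list_records(xml_data):
    """Split one ListRecords response into live ids, deleted ids, and a token."""
    root = ET.fromstring(xml_data)
    live, deleted = [], []
    for record in root.iterfind(".//oai:record", OAI_NS):
        header = record.find("oai:header", OAI_NS)
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        # Per OAI-PMH 2.0, a deleted record carries status="deleted" on its header.
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    # A missing or empty token means there are no further pages.
    return live, deleted, (token.strip() or None)


def fetch_over_http(url):
    """Fetch one response body as bytes (assumes the server is reachable)."""
    with urlopen(url) as response:
        return response.read()


def harvest(fetch, metadata_prefix="oai_dc", pause=time.sleep):
    """Page through ListRecords, waiting between requests, until no token remains."""
    params = {"metadataPrefix": metadata_prefix}
    live, deleted = [], []
    while True:
        page = fetch(build_request_url("ListRecords", **params))
        page_live, page_deleted, token = parse_list_records(page)
        live.extend(page_live)
        deleted.extend(page_deleted)
        if token is None:
            return live, deleted
        # Follow-up requests carry only the verb and the resumption token.
        params = {"resumptionToken": token}
        pause(MIN_INTERVAL_SECONDS)
```

Calling harvest(fetch_over_http) would return two lists of identifiers. The second list holds the identifiers a harvesting site would then remove from its own repository. Scheduling the run outside the stated peak hours is left to the caller.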
