In the simplest terms, the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a set of guidelines that, when implemented on a computer system, enables that system to exchange metadata with other compatible systems.
Harvesting is the process through which metadata is copied from one system to another. The system from which metadata is harvested is called a data provider, while the system that harvests metadata from a data provider is called a service provider. Both data providers and service providers are referred to here as repositories. Service providers usually offer value-added services, such as federating data from various repositories; examples include library discovery services, union catalogs, and national repositories.
Another benefit of having other people harvest your metadata is that they create back-links to your repository, which matters not only for Google ranking but also for the Webometrics Ranking Web of Universities.
DSpace has a built-in mechanism that allows repository managers either to harvest metadata from other repositories or to expose their own metadata to service providers using OAI-PMH. In other words, DSpace can act as a data provider and a service provider at the same time.
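To make the exchange concrete: a harvester talks to a data provider by sending plain HTTP requests whose verb parameter names one of the six OAI-PMH operations (Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord). The sketch below only builds such request URLs and sends nothing. The host name and the /oai/request path are assumptions for illustration, and oai_url is a hypothetical helper, not part of DSpace.

```shell
#!/bin/sh
# Build an OAI-PMH request URL from a base URL, a verb, and optional
# key=value arguments. Nothing is sent over the network.
oai_url() {
    base="$1"
    verb="$2"
    shift 2
    url="${base}?verb=${verb}"
    for arg in "$@"; do
        url="${url}&${arg}"
    done
    printf '%s\n' "$url"
}

# Hypothetical base URL, used only for illustration.
BASE_URL="http://repository.example.org/oai/request"

# Ask the provider to describe itself.
oai_url "$BASE_URL" Identify
# Ask for all records in simple Dublin Core (oai_dc), the format every
# OAI-PMH data provider is required to support.
oai_url "$BASE_URL" ListRecords metadataPrefix=oai_dc
```

A harvester would fetch these URLs (for example with curl) and parse the XML that comes back.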
This guide assumes that you have installed DSpace on a Linux system and that the DSpace files are installed in [dspace-folder].
- Stop the Tomcat server
sudo service tomcat7 stop
- cd (go to) Tomcat’s webapps folder – After installation, the DSpace webapps are compiled into [dspace-folder]/webapps. They must be linked into Tomcat so that Tomcat can serve them in response to HTTP requests. To enable DSpace’s OAI webapp, first go to Tomcat’s webapps folder, /var/lib/tomcat7/webapps.
- Create a symbolic link in Tomcat’s webapps folder that points to DSpace’s oai webapp
sudo ln -s [dspace-folder]/webapps/oai
- Confirm the symbolic link
ls -lh oai
- Restart the Tomcat server
sudo service tomcat7 start
- Test OAI webapp via HTTP – Open up the web browser and access the following link
Replace [repository.my-institution.com] with the actual URL of your repository.
If this works, proceed to the next step; if not, review the steps above.
- Index records into OAI – To make your records accessible (harvestable) via HTTP, you need to index them.
[dspace-folder]/bin/dspace oai import -o
NB: None of the previous steps has any effect unless you carry out this indexing step (step 7).
- Open the crontab file
Perform this step as the dspace user, just as you do for Solr indexing.
- Set up automatic indexing with a cron job – Add this line to the file and save
0 0 * * * [dspace-folder]/bin/dspace oai import -o > /dev/null
- Confirm everything is OK
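One possible way to confirm the cron job, sketched here as an assumption rather than the guide's own check, is to look for an active (uncommented) "dspace oai import" line in the dspace user's crontab. has_oai_job is a hypothetical helper that reads crontab text on standard input. The demo runs it on sample text in which /dspace stands in for [dspace-folder].

```shell
#!/bin/sh
# Succeed if the crontab text on stdin contains an uncommented
# "dspace oai import" entry.
has_oai_job() {
    grep -v '^[[:space:]]*#' | grep -q 'dspace oai import'
}

# Demo on sample crontab text (the /dspace path is a made-up example).
sample='# nightly jobs
0 0 * * * /dspace/bin/dspace oai import -o > /dev/null'

if printf '%s\n' "$sample" | has_oai_job; then
    echo "OAI import job found"
else
    echo "OAI import job missing"
fi
```

On the server itself, the same function can be fed the real crontab while logged in as the dspace user: crontab -l | has_oai_job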
Your DSpace server should now be set up to exchange metadata with other compatible systems.
The link below is an example that shows a properly working OAI data provider.
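The browser test from the steps above can also be scripted. A working data provider answers the Identify verb with an XML document that contains an Identify element, while a failed request contains an error element instead. The sketch below checks a response body for exactly that. looks_like_identify is a hypothetical helper, and the sample response is trimmed and made up.

```shell
#!/bin/sh
# Succeed if the XML on stdin looks like a successful OAI-PMH Identify
# response: it has an <Identify> element and no <error> element.
looks_like_identify() {
    body=$(cat)
    printf '%s' "$body" | grep -q '<Identify>' || return 1
    if printf '%s' "$body" | grep -q '<error'; then
        return 1
    fi
    return 0
}

# Demo on a trimmed, made-up response body.
sample='<OAI-PMH><Identify><repositoryName>Example Repository</repositoryName><protocolVersion>2.0</protocolVersion></Identify></OAI-PMH>'

if printf '%s' "$sample" | looks_like_identify; then
    echo "Identify response OK"
else
    echo "unexpected response"
fi
```

Against a live server, the body would come from an HTTP client instead, for example: curl -s "http://[repository.my-institution.com]/oai/request?verb=Identify" | looks_like_identify. The /oai/request path is an assumption here; use whatever URL worked in your browser test.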