Guidelines for bulk import

Points to consider

Where is the data coming from?

Who owns the database? Are they happy for you to extract data wholesale? Will your actions slow down their server? Is there a relevant API to help you? What are the future plans for the service: are you potentially replacing it, feeding data from it, or feeding data to it?
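
If the owner agrees to extraction but server load is a concern, one common approach is to space requests out. The following is a minimal sketch only: `harvest_politely`, the `fetch` callable and the one-second default delay are hypothetical names and values chosen for illustration, not part of any particular repository or API. A suitable delay should be agreed with the service owner.

```python
import time


def harvest_politely(identifiers, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch records one at a time, pausing between consecutive requests.

    identifiers   -- iterable of record identifiers to retrieve
    fetch         -- hypothetical callable that retrieves one record by identifier
    delay_seconds -- pause between requests (an assumed value; agree it with the owner)
    sleep         -- injectable so the pause can be replaced when testing
    """
    results = {}
    for index, identifier in enumerate(identifiers):
        if index > 0:
            # Pause before every request except the first.
            sleep(delay_seconds)
        results[identifier] = fetch(identifier)
    return results
```

In practice `fetch` would wrap whatever access route the owner offers, ideally an API rather than scraping pages.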

What was the original purpose of the data?

Metadata quality and quantity

The quality and richness of the metadata depend on the purpose for which the data was originally collected and on who collated it. Completeness may not have mattered for the original use, so key pieces of information (such as authors, DOI, or date of publication) may be missing. Is it worth adding these manually?
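
Before deciding whether manual enrichment is worthwhile, it can help to measure how much is actually missing. The sketch below assumes the records have already been parsed into Python dictionaries. The field names in `KEY_FIELDS`, the function `missing_field_report` and the sample records (including the DOI) are made up for illustration.

```python
from collections import Counter

# Hypothetical set of fields treated as essential; adjust for your repository.
KEY_FIELDS = ("authors", "doi", "date")


def missing_field_report(records, key_fields=KEY_FIELDS):
    """Count how many records lack each key field.

    A field counts as missing when it is absent, None, an empty list,
    or a string containing only whitespace.
    """
    missing = Counter()
    for record in records:
        for field in key_fields:
            value = record.get(field)
            is_blank_string = isinstance(value, str) and not value.strip()
            if value is None or value == [] or is_blank_string:
                missing[field] += 1
    return {field: missing[field] for field in key_fields}


# Invented sample records, for illustration only.
sample = [
    {"title": "A", "authors": ["Smith, J."], "doi": "10.1000/xyz", "date": "2009"},
    {"title": "B", "authors": [], "doi": "", "date": "2010"},
    {"title": "C", "doi": None},
]
report = missing_field_report(sample)
print(report)  # {'authors': 2, 'doi': 2, 'date': 1}
```

Running a count like this over the full data set gives a rough estimate of the manual effort involved, which can then be weighed against the value of the missing fields.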

Metadata format

Metadata import

Is an import plug-in already available for your repository that can import the data in the format provided? If so, using it can save considerable time.

Resource allocation

Keeping track of imported data