Web site survey
A survey of the websites of the Universities of Leeds, Sheffield and York was carried out between October 2007 and February 2008. The aim of the survey was to discover full text research publications that could provide content for White Rose Research Online (WRRO). Unfortunately there was very little full text available, so the survey became a mapping exercise of where and how research outputs were presented online.
A detailed account of our findings is available in the Database Prevalence Report (v1.0, April 2008; PDF).
Key findings
- A web survey is a useful starting point to identify patterns of archiving behaviour across different schools and faculties.
- We hoped to identify some "pots of gold": collections of metadata and full text which could be imported wholesale into WRRO; this proved over-optimistic.
- Database collections of metadata can be found but the structure and quality of the metadata vary widely.
- Import of such databases may be a driver for further WRRO deposit - though this hypothesis is still to be tested during the project.
- Import of departmental databases is likely to involve considerable work to standardise and improve metadata.
- Most full text is distributed across the web pages of individual researchers; researchers may be using the local content management system to organise their files - but often the files are located on local drives.
- Citation format on individual web pages can be inconsistent; this will limit how effectively pages can be crawled to extract metadata.
- On the one hand, researchers who archive on personal pages may be interested in depositing into WRRO; on the other hand, sometimes those who archive on their own websites do so without regard for copyright laws. They may feel the repository offers them less freedom to archive than their current site.
- Maths and Computing were the disciplines with the strongest self-archiving behaviours.
- Most full text collections could be described as grey literature.
- Academics are most interested in presenting information about their publications when it is directly linked to their professional identity, i.e. on personal web sites rather than at the departmental level.
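
The point about inconsistent citation formats can be illustrated with a minimal Python sketch. It is not part of the survey or of any WRRO tooling: the pattern, the function name `parse_citation` and both sample citations are invented for illustration. It assumes a crawler that expects a single "Author (Year). Title. Venue." style, and shows that a citation written in a different style yields no metadata at all.

```python
import re

# Hypothetical pattern for one citation style: "Authors (Year). Title. Venue."
CITATION_RE = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+"
    r"(?P<title>.+?)\.\s+(?P<venue>.+?)\.?$"
)


def parse_citation(text):
    """Return a metadata dict if the citation matches the expected style, else None."""
    match = CITATION_RE.match(text.strip())
    return match.groupdict() if match else None


# Invented sample citations: the same paper written in two different styles.
samples = [
    "Smith, J. and Jones, A. (2006). A study of widgets. Journal of Examples.",
    "J. Smith, A. Jones, 'A study of widgets', Journal of Examples, 2006",
]

results = [parse_citation(s) for s in samples]

for sample, result in zip(samples, results):
    status = "parsed" if result else "no metadata extracted"
    print(f"{status}: {sample}")
```

Only the first sample is parsed. Handling the second would need another pattern, which suggests why crawling pages with mixed citation styles is likely to involve manual clean-up.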
