Developing a Policy
To run an EPrint archive effectively, back it with an explicit archiving policy that clarifies the purpose of the archive and the context in which it operates. This policy should address the various roles in archiving, set out the responsibilities of the different classes of users, and outline the uses and benefits that the institution expects to obtain from the archive.
An example of such a policy can be seen at http://www.ecs.soton.ac.uk/~lac/archpol.html. It begins by emphasising the benefits of self-archiving, then states that all members of staff are expected to archive all their research output. The remainder of the document addresses worries about copyright.
The construction of an institutional policy should reflect the practices and concerns of the members of that institution. The following issues should be agreed before the archive is operational:
- The Scope of the Archive
Different people in your institution will have different ideas about the kinds of material that will be stored in the archive. You should make clear decisions from the outset about whether the archive is to be used only for research output, or for capturing many forms of institutional documentation (e.g. minutes of meetings, teaching materials or multimedia data).
- The Purpose of the Archive
For what purpose do you intend to use the archive? The answer to this question will affect the amount of resourcing that the archive will demand.
Purpose | Full text | High Quality metadata | Complete Coverage | Acceptable Formats | Longevity
Dissemination | X | Web-accessible open formats (DOC/HTML/PDF)
Archiving | X X | Preservation formats (XML) | X
Administration | X X
Library Science Research | X
Although access to full text and quality bibliographic metadata is often assumed, it is not a prerequisite for every application. The imperative for dissemination of the latest research results does not require accurate bibliographic metadata, and in fact such metadata will be unavailable before official publication. For long-term archiving, it is important to capture that information as soon as possible and to provide guaranteed, future-proof storage and document access. For research administration purposes, what matters most is that the eprint exists and that every single article has an eprint record. Many research applications, by contrast, require a selection of records with good metadata.
- Print Fidelity of Bibliographic Metadata
In technical subjects, it may be common to have unusual typographic symbols (e.g. ℘) or formulae (e.g. H₂SO₄, E=mc²) in the title or abstract of an article. In European languages it is common to have author names containing diacritical marks (ü é î etc.). EPrints supports the use of Unicode characters in its metadata, so a wide range of special characters is supported, but there is no support for formulae.
It is important to manage expectations on this front, as an EPrint archive listing will not be as flexible as a bibliography produced in a word processor. It is difficult to search on special formulae in titles, so one possible stance is to provide a simple ASCII translation of a formula in the metadata (or to use a commonly recognised coding scheme such as TeX maths mode).
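As an illustration of the ASCII translation idea, the following is a minimal sketch in Python, not part of EPrints itself. The function name ascii_search_key is hypothetical. It assumes that a plain ASCII copy of the title is stored or indexed alongside the original Unicode title, and it relies only on Python's standard unicodedata module.

```python
import unicodedata


def ascii_search_key(text: str) -> str:
    """Return an ASCII-only approximation of a title, for searching.

    Hypothetical helper for illustration; not an EPrints function.
    """
    # NFKD splits accented letters into base letter + combining mark and
    # maps compatibility characters (e.g. subscript digits) to plain ones.
    decomposed = unicodedata.normalize("NFKD", text)
    parts = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the diacritical mark, keep the base letter
        if ord(ch) < 128:
            parts.append(ch)
        else:
            # No ASCII equivalent: fall back to the character's Unicode name.
            parts.append("[" + unicodedata.name(ch, "UNKNOWN") + "]")
    return "".join(parts)


print(ascii_search_key("Müller"))
print(ascii_search_key("H₂SO₄"))
```

With this sketch, "Müller" becomes "Muller" and "H₂SO₄" becomes "H2SO4". Genuinely two-dimensional formulae cannot be flattened this way and would still need a hand-written translation or a TeX rendering.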
- Institutional Subgroup Reorganisations
It is likely that people in the deploying institution are divided into administrative groupings (Research Groups, Departments, Schools, Faculties etc). It is common for each user to be a member of one (or more) groups, and for their eprints to be affiliated to that group. The problem is that such groupings may change every few years, whereas the life of an eprint should potentially be measured in decades (longer in an archiving application). It is necessary to decide at the outset how eprints will be treated in this case: will they have to be individually reassigned to new groups, or should the old group be maintained as a legacy construct to keep the archive's database coherent? Changes such as group renaming can take place relatively easily, but group splits are more difficult to manage.
On a related theme, when an individual changes groups, should their eprints be reassigned? In fact, are eprints assigned a group ownership independently of the membership of the authors or depositor? Or is their group affiliation derived from the affiliation of the person who wrote or deposited the eprint?
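The difference between a rename and a split can be made concrete with a small data model. The sketch below is purely illustrative and is not the EPrints schema: the class names Group and EPrint and all function names are assumptions. It takes the "legacy construct" option, in which each eprint stores a permanent group identifier and a retired group stays in the database with pointers to its successors.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    group_id: str        # permanent identifier, never reused
    name: str            # display name, free to change
    active: bool = True  # False once the group is kept only as a legacy entry
    successors: list = field(default_factory=list)


@dataclass
class EPrint:
    eprint_id: int
    title: str
    group_id: str        # affiliation stored on the eprint itself


def rename_group(groups, group_id, new_name):
    """A rename touches only the group; eprints keep the same group_id."""
    groups[group_id].name = new_name


def split_group(groups, old_id, new_groups):
    """Retire old_id as a legacy group and register its successors.

    Eprints are deliberately left on the legacy group: choosing the right
    successor for each one needs a person or an agreed policy rule.
    """
    old = groups[old_id]
    old.active = False
    for g in new_groups:
        groups[g.group_id] = g
        old.successors.append(g.group_id)


def needing_reassignment(eprints, groups):
    """List eprints still attached to a retired group."""
    return [e for e in eprints if not groups[e.group_id].active]


groups = {"grp-a": Group("grp-a", "Group A")}
eprints = [EPrint(1, "First paper", "grp-a"), EPrint(2, "Second paper", "grp-a")]
rename_group(groups, "grp-a", "Group A (renamed)")
split_group(groups, "grp-a", [Group("grp-b", "Group B"), Group("grp-c", "Group C")])
print([e.eprint_id for e in needing_reassignment(eprints, groups)])
```

In this model a rename costs one update, while a split leaves a worklist of eprints whose affiliation has to be decided individually, which is why the policy should settle in advance who makes that decision.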
- Identifying Authors and Depositors Reliably
The lifespan of an eprint may be many times longer than the period of employment of the user who created or deposited any particular paper. It is important that once login names and emails are allocated, they stay the same for a long period of time. This has consequences, especially if you decide to use an existing username/password authentication service: in particular, that service should not reallocate a login name just because its owner left several years ago!
Experience shows that one of the easiest ways to uniquely identify an author is to use their professional email address.
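The rule that a login name is never reallocated can be expressed as a simple check. The following Python sketch is hypothetical and does not describe EPrints or any particular authentication service: the class LoginRegistry and its methods are invented for illustration, and it assumes the professional email address is the stable key that identifies a person.

```python
class LoginRegistry:
    """Remembers every login name ever issued, even after its owner leaves."""

    def __init__(self):
        self._owner_by_login = {}  # login name -> email, kept permanently
        self._active = set()       # login names that may currently log in

    def allocate(self, login, email):
        """Issue a login name, refusing any that belonged to someone else."""
        email = email.strip().lower()  # assumption: compare case-insensitively
        owner = self._owner_by_login.get(login)
        if owner is not None and owner != email:
            raise ValueError(f"login {login!r} was already issued and cannot be reused")
        self._owner_by_login[login] = email
        self._active.add(login)

    def deactivate(self, login):
        """The owner has left: block logins but keep the ownership record."""
        self._active.discard(login)

    def is_active(self, login):
        return login in self._active

    def owner(self, login):
        """Email of the person an old deposit's login name refers to."""
        return self._owner_by_login[login]


registry = LoginRegistry()
registry.allocate("abc1", "A.Author@example.org")
registry.deactivate("abc1")
try:
    registry.allocate("abc1", "newcomer@example.org")
except ValueError as err:
    print(err)
```

Because the ownership record outlives the account, an eprint deposited under "abc1" still resolves to the original depositor long after that person has left.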




