EPrints Handbook

Importing Data into an EPrints Archive

EPrints has a bulk data import facility based on XML. To import an XML file in this format, use the import_eprints script.

[eprints@hostname eprints2]$ bin/import_eprints siteid dataset filename

siteid is the archive identifier, given when you create the EPrints archive. filename is the name of the XML file to import. dataset is the name of the dataset to import into, and must be one of the four datasets that hold eprints.

There are some further datasets to import other types of data into EPrints; see the Dataset entry in section 4.1 of the EPrints technical documentation. To import the subject list, the import_subjects tool should be used instead — see section 17.13 of the EPrints documentation for more details.

XML file format documentation

<eprintsdata>

<record>
    <field id="cjg" name="authors">
        <part name="family">Gutteridge</part>
        <part name="given">Christopher</part>
    </field>

    <field id="mv" name="authors">
        <part name="honourific">Dr.</part>
        <part name="given">Marvin</part>
        <part name="family">Fenderson</part>
    </field>

    <field name="year">1993</field>

    <field name="subjects">foo</field>
    <field name="subjects">bar</field>
    <field name="subjects">baz</field>

    <field name="title">
        <lang id="en">The Thing</lang>
        <lang id="de">da Thung</lang>
        <lang id="fr">l'Thingu</lang>
    </field>
</record>

...(more records can go here)...

</eprintsdata>

Sample XML interchange format

The top-level element is eprintsdata, which contains zero or more record elements.
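Before running an import, it can be useful to confirm that a file has this top-level shape. The following is a minimal sketch in Python, not part of EPrints itself; the function name check_top_level is hypothetical, and it assumes the file uses no XML namespaces and no externally defined entities.

```python
import xml.etree.ElementTree as ET


def check_top_level(xml_text):
    """Return the number of record elements, or raise ValueError.

    Hypothetical helper: it checks only the outermost structure,
    not the fields inside each record.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "eprintsdata":
        raise ValueError(
            "top-level element is %r, expected 'eprintsdata'" % root.tag
        )
    # Every direct child of eprintsdata should be a record element.
    for child in root:
        if child.tag != "record":
            raise ValueError(
                "unexpected element %r inside eprintsdata" % child.tag
            )
    return len(root)


sample = "<eprintsdata><record/><record/></eprintsdata>"
print(check_top_level(sample))
```

An empty eprintsdata element passes the check, since zero records are allowed.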

A record element represents a single eprints object and contains zero or more field elements.

A field element has a name attribute, which gives the name of a field in the dataset. The content of the field element is the value of that field for this record.
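The structure described above can be generated from a script rather than written by hand. The sketch below uses Python's standard xml.etree.ElementTree module; the helper build_record and its input format (a list of name and value pairs) are assumptions made for illustration. The part elements follow the sample above, and the optional id attribute and lang elements shown there are left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_record(fields):
    """Build a record element from (name, value) pairs.

    Hypothetical helper. A value is either a plain string, or a
    dict mapping part names (such as "family") to strings.
    """
    record = ET.Element("record")
    for name, value in fields:
        field = ET.SubElement(record, "field", {"name": name})
        if isinstance(value, dict):
            # Compound value: one part element per component.
            for part_name, text in value.items():
                part = ET.SubElement(field, "part", {"name": part_name})
                part.text = text
        else:
            field.text = value
    return record


root = ET.Element("eprintsdata")
root.append(build_record([
    ("authors", {"family": "Gutteridge", "given": "Christopher"}),
    ("year", "1993"),
    ("subjects", "foo"),
    ("subjects", "bar"),
]))
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

A list of pairs is used instead of a dict so that the same field name, such as subjects, can appear more than once in a record, as it does in the sample.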
