The Sickle Client

class sickle.app.Sickle(endpoint, http_method='GET', protocol_version='2.0', iterator=<class 'sickle.iterator.OAIItemIterator'>, max_retries=0, retry_status_codes=None, default_retry_after=60, class_mapping=None, encoding=None, **request_args)

Client for harvesting OAI interfaces.

Use it like this:

>>> sickle = Sickle('http://elis.da.ulcc.ac.uk/cgi/oai2')
>>> records = sickle.ListRecords(metadataPrefix='oai_dc')
>>> next(records)
<Record oai:eprints.rclis.org:3780>

Parameters:

  • endpoint (str) – The endpoint of the OAI interface.
  • http_method (str) – Method used for requests (GET or POST, default: GET).
  • protocol_version (str) – The OAI protocol version.
  • iterator – The type of the returned iterator (default: sickle.iterator.OAIItemIterator)
  • max_retries (int) – Number of retry attempts if an HTTP request fails (default: 0 = request only once). Sickle will use the value from the retry-after header (if present) and will wait the specified number of seconds between retries.
  • retry_status_codes (iterable) – HTTP status codes that trigger a retry (by default, only 503 is retried).
  • default_retry_after (int) – default number of seconds to wait between retries in case no retry-after header is found on the response (defaults to 60 seconds)
  • class_mapping (dict) – A dictionary that maps OAI verbs to classes representing OAI items. If not provided, sickle.app.DEFAULT_CLASS_MAPPING will be used.
  • encoding (str) – Can be used to override the encoding used when decoding the server response. If not specified, requests will use the encoding returned by the server in the content-type header. However, if the charset information is missing, requests will fall back to 'ISO-8859-1'.
  • request_args – Arguments to be passed to requests when issuing HTTP requests. Useful examples are auth=('username', 'password') for basic auth-protected endpoints or timeout=<int>. See the documentation of requests for all available parameters.
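
The interplay of max_retries, retry_status_codes and default_retry_after described above can be sketched in plain Python. This is an illustrative sketch only, not Sickle's actual implementation: the helper name retry_wait_seconds is hypothetical, the headers are modelled as a plain dict, and a retry-after value that is not an integer number of seconds is assumed to fall back to the default.

```python
def retry_wait_seconds(status_code, headers,
                       retry_status_codes=(503,), default_retry_after=60):
    """Return the seconds to wait before retrying, or None if no retry applies.

    Hypothetical helper: mirrors the documented parameters, not Sickle's code.
    """
    if status_code not in retry_status_codes:
        return None  # this status code is not configured for retries
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    try:
        return int(normalised.get("retry-after"))
    except (TypeError, ValueError):
        # Header missing or not an integer: use the configured default.
        return default_retry_after


wait_with_header = retry_wait_seconds(503, {"Retry-After": "5"})
wait_without_header = retry_wait_seconds(503, {})
wait_not_retried = retry_wait_seconds(404, {})
```

A client would call such a helper after each failed request and stop once max_retries attempts have been used up.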

Contains the last response that has been received.


Issue a GetRecord request.


Issue an Identify request.

Return type: sickle.models.Identify
ListIdentifiers(ignore_deleted=False, **kwargs)

Issue a ListIdentifiers request.

Parameters: ignore_deleted – If set to True, the resulting iterator will skip records flagged as deleted.
Return type: sickle.iterator.BaseOAIIterator

Issue a ListMetadataFormats request.

Return type: sickle.iterator.BaseOAIIterator
ListRecords(ignore_deleted=False, **kwargs)

Issue a ListRecords request.

Parameters: ignore_deleted – If set to True, the resulting iterator will skip records flagged as deleted.
Return type: sickle.iterator.BaseOAIIterator

Issue a ListSets request.

Return type: sickle.iterator.BaseOAIIterator

Make HTTP requests to the OAI server.

Parameters: kwargs – OAI HTTP parameters.
Return type: sickle.response.OAIResponse
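
As a rough illustration of how keyword arguments end up as OAI HTTP parameters, the following sketch builds the query string of a GET request with the standard library. The helper build_oai_url and the example.org endpoint are hypothetical and stand in for the request that Sickle issues through requests. Because from is a reserved word in Python, that OAI argument has to be passed by unpacking a dictionary.

```python
from urllib.parse import urlencode


def build_oai_url(endpoint, verb, **params):
    """Return the GET URL for an OAI request (hypothetical helper)."""
    query = {"verb": verb}   # every OAI request carries a verb
    query.update(params)     # remaining OAI arguments, e.g. metadataPrefix
    return endpoint + "?" + urlencode(query)


# 'from' cannot be written as a keyword argument, so it is unpacked from a dict.
url = build_oai_url(
    "http://example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
    **{"from": "2012-12-12"}
)
```

The same dictionary-unpacking pattern applies when passing a from argument to the request methods documented above.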

Working with OAI Responses

class sickle.response.OAIResponse(http_response, params)

A response from an OAI server.

Provides access to the returned data on different abstraction levels.

Parameters:

  • http_response – The original HTTP response.
  • params (dict) – The OAI parameters for the request.

The server’s response as unicode.


The server’s response as parsed XML.

Iterating over OAI Items

class sickle.iterator.OAIItemIterator(sickle, params, ignore_deleted=False)

Iterator over OAI records/identifiers/sets transparently aggregated via OAI-PMH.

Can be used to conveniently iterate through the records of a repository.

Parameters:

  • sickle (sickle.app.Sickle) – The Sickle object that issued the first request.
  • params (dict) – The OAI arguments.
  • ignore_deleted (bool) – Flag for whether to ignore deleted records.

The sickle.app.Sickle instance used for making requests to the server.


The OAI verb used for making requests to the server.


The name of the OAI item to iterate on (record, header, set or metadataFormat).


The content of the XML element resumptionToken from the last request.


Flag for whether to skip records marked as deleted.


Return the next record/header/set.
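
The transparent aggregation described above relies on OAI-PMH resumption tokens. The following sketch shows the general idea using only the standard library and two canned response pages. It is not Sickle's implementation: iter_items, fake_fetch and the oai:example identifiers are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Two canned ListIdentifiers pages standing in for server responses.
PAGES = {
    None: '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
          '<header><identifier>oai:example:1</identifier></header>'
          '<header status="deleted"><identifier>oai:example:2</identifier></header>'
          '<resumptionToken>page2</resumptionToken></ListIdentifiers></OAI-PMH>',
    "page2": '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
             '<header><identifier>oai:example:3</identifier></header>'
             '<resumptionToken/></ListIdentifiers></OAI-PMH>',
}


def fake_fetch(params):
    """Return the canned page matching the resumption token (if any)."""
    return PAGES[params.get("resumptionToken")]


def iter_items(fetch, params, element, ignore_deleted=False):
    """Yield every <element> across all pages, following resumption tokens."""
    while True:
        root = ET.fromstring(fetch(params))
        for item in root.iter(OAI_NS + element):
            # A record carries its header as a child; a header is its own header.
            header = item if element == "header" else item.find(OAI_NS + "header")
            deleted = header is not None and header.get("status") == "deleted"
            if ignore_deleted and deleted:
                continue
            yield item
        token = root.find(".//" + OAI_NS + "resumptionToken")
        if token is None or not (token.text or "").strip():
            return  # no token or an empty token marks the last page
        # Follow-up requests carry only the verb and the resumption token.
        params = {"verb": params["verb"], "resumptionToken": token.text.strip()}


def identifiers(ignore_deleted):
    """Collect the identifier text of every header in the canned pages."""
    first_request = {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc"}
    return [header.findtext(OAI_NS + "identifier")
            for header in iter_items(fake_fetch, first_request, "header", ignore_deleted)]


all_ids = identifiers(ignore_deleted=False)
live_ids = identifiers(ignore_deleted=True)
```

With ignore_deleted set, the header flagged status="deleted" is skipped while iteration still continues onto the second page.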

Iterating over OAI Responses

class sickle.iterator.OAIResponseIterator(sickle, params, ignore_deleted=False)

Iterator over OAI responses.


Return the next response.

Classes for OAI Items

The following classes represent OAI-specific items like records, headers, and sets. All items feature the attributes raw and xml, which contain their original XML representation as unicode and as parsed XML objects, respectively.


Sickle’s automatic mapping of XML to OAI objects only works for Dublin Core encoded record data.

Identify Object

The Identify object is generated from Identify responses and is returned by sickle.app.Sickle.Identify(). It contains general information about the repository.

class sickle.models.Identify(identify_response)

Represents an Identify container.

This object differs from the other entities in that it has to be created from a sickle.response.OAIResponse instead of an XML element.

Parameters: identify_response (sickle.response.OAIResponse) – The response for an Identify request.


As the attributes of this class are auto-generated from the Identify XML elements, some of them may be missing for specific OAI interfaces.


The content of the element adminEmail. Normally the repository’s administrative contact.


The content of the element baseURL, which is the URL of the repository’s OAI endpoint.


The content of the element repositoryName, which contains the name of the repository.


The content of the element deletedRecord, which indicates whether and how the repository keeps track of deleted records.


The content of the element delimiter.


The content of the element description, which contains a description of the repository.


The content of the element earliestDatestamp, which indicates the datestamp of the oldest record in the repository.


The content of the element granularity, which indicates the granularity of the used dates.


The content of the element oai-identifier.


oai-identifier is not a valid name in Python.


The content of the element protocolVersion, which indicates the version of the OAI protocol implemented by the repository.


The content of the element repositoryIdentifier.


The content of the element sampleIdentifier, which usually contains an example of an identifier used by this repository.


The content of the element scheme.


The original XML as unicode.
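
Because the attributes are generated from whatever elements the repository returns, code that reads them should tolerate missing ones. The sketch below imitates that behaviour with the standard library; identify_attributes and the sample response are hypothetical, and the way element names map to attribute names here is an assumption rather than Sickle's exact logic.

```python
import re
import xml.etree.ElementTree as ET
from types import SimpleNamespace

# Hypothetical Identify response that omits adminEmail.
SAMPLE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><Identify>'
    '<repositoryName>Example Repository</repositoryName>'
    '<baseURL>http://example.org/oai</baseURL>'
    '<protocolVersion>2.0</protocolVersion>'
    '</Identify></OAI-PMH>'
)


def identify_attributes(xml_text):
    """Expose each text-bearing element as an attribute (illustrative only)."""
    info = SimpleNamespace()
    for element in ET.fromstring(xml_text).iter():
        text = (element.text or "").strip()
        if not text:
            continue  # container elements carry no text of their own
        name = re.sub(r"^\{.*?\}", "", element.tag)  # drop the XML namespace
        if not hasattr(info, name):
            setattr(info, name, text)  # keep the first occurrence
    return info


info = identify_attributes(SAMPLE)
repository_name = info.repositoryName
# getattr with a default avoids an AttributeError for absent elements.
admin_email = getattr(info, "adminEmail", None)
```

The same getattr pattern can be applied to a sickle.models.Identify instance when an interface is not known to provide a given element.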

Record Object

Record objects represent single OAI records.

class sickle.models.Record(record_element, strip_ns=True)

Represents an OAI record.

Parameters:

  • record_element (lxml.etree._Element) – The XML element ‘record’.
  • strip_ns – Flag for whether to remove the namespaces from the element names.

Contains the record header represented as a sickle.models.Header object.


A boolean flag that indicates whether this record is deleted.


The original XML as unicode.
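
To illustrate what removing namespaces from element names means for Dublin Core data, the following sketch collects element values into a dictionary with and without namespace stripping. It uses xml.etree from the standard library instead of lxml, and element_values is a hypothetical helper, not part of Sickle.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# A small Dublin Core fragment with made-up values.
DC = (
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<dc:title>Example title</dc:title>'
    '<dc:creator>Doe, Jane</dc:creator>'
    '<dc:creator>Roe, Richard</dc:creator>'
    '</oai_dc:dc>'
)


def element_values(root, strip_ns=True):
    """Map element names to the list of their text values."""
    values = defaultdict(list)
    for element in root.iter():
        text = (element.text or "").strip()
        if not text:
            continue  # skip containers without text
        # ElementTree writes qualified names as '{namespace}local'.
        name = re.sub(r"^\{.*?\}", "", element.tag) if strip_ns else element.tag
        values[name].append(text)  # repeated elements accumulate in order
    return dict(values)


stripped = element_values(ET.fromstring(DC))
qualified = element_values(ET.fromstring(DC), strip_ns=False)
```

With stripping enabled the keys are short names such as title and creator; without it, each key keeps its full namespace prefix in braces.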

Header Object

Header objects represent OAI headers.

class sickle.models.Header(header_element)

Represents an OAI Header.

Parameters: header_element (lxml.etree._Element) – The XML element ‘header’.

The original XML as unicode.

Set Object

class sickle.models.Set(set_element)

Represents an OAI set.

Parameters: set_element (lxml.etree._Element) – The XML element ‘set’.

The name of the set.


The identifier of this set used for querying.


The original XML as unicode.

MetadataFormat Object

class sickle.models.MetadataFormat(mdf_element)

Represents an OAI MetadataFormat.

Parameters: mdf_element (lxml.etree._Element) – The XML element ‘metadataFormat’.

The prefix used to identify this format.


The namespace URL for this format.


The URL to the schema file of this format.


The original XML as unicode.