Counting on IRUS

As we all know, repositories are an established component of the rapidly evolving scholarly web infrastructure in the UK and globally. Whatever the impact of Finch and the potential shift from Green to Gold, they are likely to remain a primary source of authoritative full-text versions of research outputs and, increasingly, of associated datasets and a variety of other scholarly outputs, including electronic theses and Open Educational Resources (OER). Institutional repositories are increasingly well integrated with University websites and research management infrastructure, and the emphasis on Search Engine Optimisation (SEO) means that we can only expect in-bound traffic to increase.

The repository landscape, however, is fragmented: 1813 repositories are currently registered globally with OpenDOAR, utilising dozens of different software platforms, and 206 of these are in the UK, including 154 institutional repositories and 47 disciplinary repositories. As explored in my Pecha Kucha at OR2012, it is far from easy to provide accurate, dynamic, article-level usage data consistently across the various software platforms. In addition to the functionality of the underlying software, this can depend on how that software has been implemented and on the technical ability of supporting staff. EPrints, for example, by far the most popular software, has the excellent IRStats plug-in, but it is not implemented consistently across EPrints installations. Many repository managers also utilise Google Analytics, which can be a powerful tool but requires a degree of technical intervention and active management.

There is therefore an urgent need for a standardised method of aggregating usage data across repositories, which is where the IRUS-UK project comes in. IRUS (Institutional Repository Usage Statistics) follows on from the PIRUS2 project, which demonstrated how COUNTER-compliant article-level usage statistics could be collected and consolidated from Publishers and Institutional Repositories.

To participate in IRUS, repositories will need to install ‘tracker code’ which pings the IRUS server with a defined OpenURL string every time an item is downloaded from the repository. Personally I have been interested to learn a little about how IRUS will eliminate search engine spiders and robots by screening “user-agents” defined in the COUNTER official list (available as an XML file and a TXT file).
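To make the two ideas above a little more concrete, here is a minimal sketch in Python. It is not the actual IRUS tracker code: the endpoint, the parameter names (other than the standard OpenURL version string `Z39.88-2004`) and the short robot pattern list are all assumptions for illustration. A real installation would use the plug-in for its platform and load the patterns from the official COUNTER list.

```python
import re
from urllib.parse import urlencode

# Hypothetical tracker endpoint, for illustration only.
TRACKER_BASE = "https://tracker.example.org/counter"

# Stand-in patterns; a real tracker would load the official COUNTER robot list.
ROBOT_PATTERNS = ["bot", "crawler", "spider", "slurp"]


def is_robot(user_agent, patterns=ROBOT_PATTERNS):
    """Return True if the user-agent string matches any robot pattern."""
    return any(re.search(p, user_agent, re.IGNORECASE) for p in patterns)


def build_tracker_url(item_id, file_url, user_agent, requester_ip, timestamp):
    """Build an OpenURL-style query string describing one download event.

    The parameter names are assumptions, not the published IRUS specification.
    """
    params = {
        "url_ver": "Z39.88-2004",  # OpenURL 1.0 version identifier
        "url_tim": timestamp,      # time of the download
        "req_id": requester_ip,    # who requested the file
        "req_dat": user_agent,     # the user-agent string to be screened
        "rft.artnum": item_id,     # identifier of the repository item
        "svc_dat": file_url,       # URL of the file that was downloaded
    }
    return TRACKER_BASE + "?" + urlencode(params)


def count_download(user_agent):
    """Decide whether a download should be counted (robots are discarded)."""
    return not is_robot(user_agent)
```

In this sketch the repository would build the URL and send it on every download, and the screening against the robot list would be applied when the events are processed.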

There are currently plug-ins available for EPrints 3.2/3.3 and DSpace 1.8.x. For other software, ‘tracker’ installation will need additional work, which will vary according to the software; the specification and requirements will be defined in the PIRUS and IRUS-UK Codes of Practice, which have not yet been released. Once the tracker is implemented in your repository, COUNTER-compliant usage statistics will be available from IRUS via standard COUNTER reports (SUSHI and/or Excel spreadsheets/CSV/TSV files) as well as via an API to enable retrieval and display of data in repository records.

There is also the potential to implement IRUS in third-party aggregation services like CORE and CiteSeer which both cache copies of full-text, thereby enabling item-level data to be consolidated from different sources.
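As a rough illustration of what consolidating item-level data from different sources could look like, the sketch below sums download counts per item across several sources. The identifiers and figures are invented, and it assumes that every source reports the same identifier for the same item, which in practice is likely to be the hard part.

```python
from collections import Counter


def consolidate(*sources):
    """Sum per-item download counts across any number of sources.

    Each source is a mapping of item identifier -> download count.
    """
    total = Counter()
    for source in sources:
        total.update(source)  # Counter.update adds counts rather than replacing
    return total


# Invented example figures for one repository and one aggregator.
repository_counts = {"oai:repo.example.ac.uk:1": 10, "oai:repo.example.ac.uk:2": 3}
aggregator_counts = {"oai:repo.example.ac.uk:1": 4}

combined = consolidate(repository_counts, aggregator_counts)
```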

Thanks to Paul Needham for this information. IRUS have also agreed to come and speak to the UKCoRR membership at the next members' meeting at Teesside in November (full programme soon). For more information, or to register your interest in the meantime, email irus@mimas.ac.uk

N.B. Neil Stewart has recently blogged on early participation with IRUS-UK at http://cityopenaccess.wordpress.com/2012/08/29/city-research-online-irus-uk/

5 thoughts on “Counting on IRUS”

  1. There needs to be a piece of Javascript on every page of the repository you want to track. When the page is loaded (somebody with a browser loads that page) the Javascript is executed. The Javascript function should then call the server which in turn counts the visit and can do whatever, e.g. send it to another server …

  2. Perhaps the best way to discard clicks from search engine spiders is using Javascript. Spiders download pages just for indexing purposes and do not execute Javascript. This method is in my opinion more effective than looking into the user-agents tag and also much easier to maintain.

    1. Probably showing my ignorance but would that work in all circumstances…presumably the executable javascript would need to be on the repository record / pdf download link but what if links end up elsewhere…either by OAI harvest or just manually added to a blog etc?

      Nick
