Jimmy Wales to advise government on open access to research

Interesting press coverage on the 2nd May 2012 that Jimmy Wales is to help the government ensure that all publically funded research is available freely online within two years. David Willetts made this announcement in a speech delivered to the Publishers Association on the evening of 2nd May.

Reading the OA lists, there’s a range of opinion from scepticism to warm welcome for his involvement. He’s certainly high profile and long been a proponent of open access. Remember, the 24 hour closure of Wikipedia in protest at the proposed SOPA and PIPA legislation in the US. Celebrity involvement does guarantee that the press will take note.

On balance, UKCoRR believes that we should welcome the plans put forward today but there is a need to ensure that the government also listens to those who have been working in UK academia to promote and extend open access. It would seem a sensible approach to work with that resource and experience that already exists than simply starting from scratch and that existing projects and infrastructure are built upon.

According to the Guardian,

“This initiative is most likely to result in a central repository that will host all research articles that result from public funding. The aim is that, even if an academic publishes their work in a traditional subscription journal, a version of their article would simultaneously appear on the freely available repository. The repository would also have built-in tools to share, comment and discuss articles.”

There is a dearth of detail about implementation at the moment – understandably as the group convened by Dame Janet Finch won’t be reporting until June 2012. But it seems likely that the “central repository” won’t be a physical thing but will build on current infrastructure and projects. There has to be a pivotal role for Repository Junction which is “a standalone middleware tool for handling the deposit of research articles from a provider to multiple repositories” thus avoiding the thorny problem of duplicate deposit, which is understandably disliked by academics. See http://edina.ac.uk/cgi-bin/news.cgi?filename=2012-04-24-rjbroker-ori.txt

However, the bigger stumbling block is that old chestnut copyright. Simultaneous traditional publication and availability in an open access repository of publically funded research is restricted depending on the publisher’s policy’s, which can change in an instant. We repository workers all know the minefield that is journal copyright policies and the care which our host organisations take to avoid breaching them. The solution is to replace a practice where the author signs away their copyright with one where they give the publisher a non-exclusive licence to publish the article. Let’s get that in the two year plan and we’d really be making progress!

The text of David Willetts speech was published this morning and it makes interesting reading. He makes much reference to the gold road to open access but, given the context, this is not surprising. We might take issue with his definition of green: “Green means publishers are required to make research openly accessible within an agreed embargo period” but this is a minister telling the publishing industry that open access is here to stay. “Our starting point is very simple. The Coalition is committed to the principle of public access to publicly-funded research results. That is where both technology and contemporary culture are taking us. It is how we can maximise the value and impact generated by our excellent research base. As taxpayers put their money towards intellectual enquiry, they cannot be barred from then accessing it”.

Right at the end of the speech, there was a reference to the REF which indicates that open access is being considered for inclusion in the criteria for assessment: “HEFCE is also considering the issue. Peer review and assessment of impact are crucial to their allocation of research funding. The debate on open access will inform HEFCE’s planning for the research excellence process that succeeds the current one which concludes in 2014. Open access could be among the excellence criteria for qualifying articles in the future”. This is really exciting stuff – it would really change academic’s practice and behaviour. Let’s keep this on the
agenda.

 All in all, a good day for open access.

Using Google Analytics Statistics within DSpace

Thank you to Claire Knowles of Edinburgh University who provides this overview of how they have been able to display statistics from Google Analytics in DSpace.
—-
In 2009 Edinburgh University Digital Library adopted Google Analytics (GA) to track usage statistics within the DSpace Repositories it supports on behalf of the Scottish Digital Library Consortium (SDLC).  The GA statistics have proven much more reliable than the existing plugins available for DSpace previously with which we experienced lost statistics and inflated pageviews resulting from robots.

Unfortunately the GA statistics for sites being tracked are only viewable via the GA dashboard for which users require a Google account and managed permissions.  This limits the visibility of statistics to a few people at each institution.   Prompted by the presentation given by Graham Triggs (then working for BioMed Central) at the Open Repositories Conference 2010, we decided to write some code to make the Google Analytics statistics visible to all users of the DSpace installations.

The work has been broken into phases:

1. Capture of downloads in DSpace by Google Analytics. 

The basic GA tracking code within DSpace is unable to capture the number of file downloads as these are not links within pages.  To address this we added code to the two downloads on the item page to enable these download actions to be measured.  This captured all downloads within Dspace but not those users coming directly from search engines to the download file.  To capture these statistics we decided to reroute all users back through the item page. This means that they now have two clicks instead of one to reach the download but it enables us to capture these statistics and also raises the visibility of the Repository to users.  To reduce the inconvenience to the users we moved the file downloads links on the item page from the bottom to the top so that they do not have to scroll down to find the download. 

2. Adding page views to each item page within DSpace

Secondly, we added the number of page views within the last year to the item page.  This was a proof of concept which showed that we could connect to the Google Analytics API and pull back statistics into DSpace.  We decided to only include the number of views for the past year to reduce any disparities between the the number of pageviews between older and new items.

3. Making statistics viewable within the DSpace web pages. 

We decided to make the GA statistics available at three levels: item, collection and repository as this provides most of the statistics which are requested by users.  Using the Query Explorer provided by Google we were able to test and refine our queries before starting development.  The pages were developed using the Google Analytics java API, jQuery and the Google Chart tools to draw graphs and maps.  

As we complete the rollout of Google Analytics to all the SDLC partners we are starting to look to what other statistics we would like to make available both from Google Analytics and also possible exposing statistical information about DSpace using Google’s chart tools. One statistic that would be of interest to researchers is collating and presenting download figures for authors (rather than by item/collection/community).

We have encountered problems separating the item, collection and community statistics within DSpace as all of their urls are formatted in the same way, we therefore have to query DSpace data to do this and cannot distinguish them using the statistics data alone.  If the requested item, file, collection or community is not available in DSpace an error page is returned, these were being recorded in the same way as successful page which has led to invalid items being listed in the statistics top ten tables.  To prevent this error pages are now recorded as an error event within Google Analytics.

These changes have given us much greater understanding of how our repository is being used with the majority of users coming directly from Google.  The URLrewrite change led to a double of our download statistics as we now capture users who previously went straight to the download.

Thanks to: Scottish Digital Library Consortium, Stuart Wood and Gareth Johnson of University of Leicester for information on the URLrewrite, Graham Triggs formerly of BioMed Central and now Sympletic.

The code to enable GA stats within DSpace is freely available from github: https://github.com/seesmith/Dspace-googleanalytics

You can view our collection and item statistic changes at http://www.era.lib.ed.ac.uk

Using Google Analytics Statistics within DSpace

Thank you to Claire Knowles of Edinburgh University who provides this overview of how they have been able to display statistics from Google Analytics in DSpace.
—-
In 2009 Edinburgh University Digital Library adopted Google Analytics (GA) to track usage statistics within the DSpace Repositories it supports on behalf of the Scottish Digital Library Consortium (SDLC).  The GA statistics have proven much more reliable than the existing plugins available for DSpace previously with which we experienced lost statistics and inflated pageviews resulting from robots.

Unfortunately the GA statistics for sites being tracked are only viewable via the GA dashboard for which users require a Google account and managed permissions.  This limits the visibility of statistics to a few people at each institution.   Prompted by the presentation given by Graham Triggs (then working for BioMed Central) at the Open Repositories Conference 2010, we decided to write some code to make the Google Analytics statistics visible to all users of the DSpace installations.

The work has been broken into phases:

1. Capture of downloads in DSpace by Google Analytics. 

The basic GA tracking code within DSpace is unable to capture the number of file downloads as these are not links within pages.  To address this we added code to the two downloads on the item page to enable these download actions to be measured.  This captured all downloads within Dspace but not those users coming directly from search engines to the download file.  To capture these statistics we decided to reroute all users back through the item page. This means that they now have two clicks instead of one to reach the download but it enables us to capture these statistics and also raises the visibility of the Repository to users.  To reduce the inconvenience to the users we moved the file downloads links on the item page from the bottom to the top so that they do not have to scroll down to find the download. 

2. Adding page views to each item page within DSpace

Secondly, we added the number of page views within the last year to the item page.  This was a proof of concept which showed that we could connect to the Google Analytics API and pull back statistics into DSpace.  We decided to only include the number of views for the past year to reduce any disparities between the the number of pageviews between older and new items.

3. Making statistics viewable within the DSpace web pages. 

We decided to make the GA statistics available at three levels: item, collection and repository as this provides most of the statistics which are requested by users.  Using the Query Explorer provided by Google we were able to test and refine our queries before starting development.  The pages were developed using the Google Analytics java API, jQuery and the Google Chart tools to draw graphs and maps.  

As we complete the rollout of Google Analytics to all the SDLC partners we are starting to look to what other statistics we would like to make available both from Google Analytics and also possible exposing statistical information about DSpace using Google’s chart tools. One statistic that would be of interest to researchers is collating and presenting download figures for authors (rather than by item/collection/community).

We have encountered problems separating the item, collection and community statistics within DSpace as all of their urls are formatted in the same way, we therefore have to query DSpace data to do this and cannot distinguish them using the statistics data alone.  If the requested item, file, collection or community is not available in DSpace an error page is returned, these were being recorded in the same way as successful page which has led to invalid items being listed in the statistics top ten tables.  To prevent this error pages are now recorded as an error event within Google Analytics.

These changes have given us much greater understanding of how our repository is being used with the majority of users coming directly from Google.  The URLrewrite change led to a double of our download statistics as we now capture users who previously went straight to the download.

Thanks to: Scottish Digital Library Consortium, Stuart Wood and Gareth Johnson of University of Leicester for information on the URLrewrite, Graham Triggs formerly of BioMed Central and now Sympletic.

The code to enable GA stats within DSpace is freely available from github: https://github.com/seesmith/Dspace-googleanalytics

You can view our collection and item statistic changes at http://www.era.lib.ed.ac.uk

Using Google Analytics Statistics within DSpace

Thank you to Claire Knowles of Edinburgh University who provides this overview of how they have been able to display statistics from Google Analytics in DSpace.
—-
In 2009 Edinburgh University Digital Library adopted Google Analytics (GA) to track usage statistics within the DSpace Repositories it supports on behalf of the Scottish Digital Library Consortium (SDLC).  The GA statistics have proven much more reliable than the existing plugins available for DSpace previously with which we experienced lost statistics and inflated pageviews resulting from robots.

Unfortunately the GA statistics for sites being tracked are only viewable via the GA dashboard for which users require a Google account and managed permissions.  This limits the visibility of statistics to a few people at each institution.   Prompted by the presentation given by Graham Triggs (then working for BioMed Central) at the Open Repositories Conference 2010, we decided to write some code to make the Google Analytics statistics visible to all users of the DSpace installations.

The work has been broken into phases:

1. Capture of downloads in DSpace by Google Analytics. 

The basic GA tracking code within DSpace is unable to capture the number of file downloads as these are not links within pages.  To address this we added code to the two downloads on the item page to enable these download actions to be measured.  This captured all downloads within Dspace but not those users coming directly from search engines to the download file.  To capture these statistics we decided to reroute all users back through the item page. This means that they now have two clicks instead of one to reach the download but it enables us to capture these statistics and also raises the visibility of the Repository to users.  To reduce the inconvenience to the users we moved the file downloads links on the item page from the bottom to the top so that they do not have to scroll down to find the download. 

2. Adding page views to each item page within DSpace

Secondly, we added the number of page views within the last year to the item page.  This was a proof of concept which showed that we could connect to the Google Analytics API and pull back statistics into DSpace.  We decided to only include the number of views for the past year to reduce any disparities between the the number of pageviews between older and new items.

3. Making statistics viewable within the DSpace web pages. 

We decided to make the GA statistics available at three levels: item, collection and repository as this provides most of the statistics which are requested by users.  Using the Query Explorer provided by Google we were able to test and refine our queries before starting development.  The pages were developed using the Google Analytics java API, jQuery and the Google Chart tools to draw graphs and maps.  

As we complete the rollout of Google Analytics to all the SDLC partners we are starting to look to what other statistics we would like to make available both from Google Analytics and also possible exposing statistical information about DSpace using Google’s chart tools. One statistic that would be of interest to researchers is collating and presenting download figures for authors (rather than by item/collection/community).

We have encountered problems separating the item, collection and community statistics within DSpace as all of their urls are formatted in the same way, we therefore have to query DSpace data to do this and cannot distinguish them using the statistics data alone.  If the requested item, file, collection or community is not available in DSpace an error page is returned, these were being recorded in the same way as successful page which has led to invalid items being listed in the statistics top ten tables.  To prevent this error pages are now recorded as an error event within Google Analytics.

These changes have given us much greater understanding of how our repository is being used with the majority of users coming directly from Google.  The URLrewrite change led to a double of our download statistics as we now capture users who previously went straight to the download.

Thanks to: Scottish Digital Library Consortium, Stuart Wood and Gareth Johnson of University of Leicester for information on the URLrewrite, Graham Triggs formerly of BioMed Central and now Sympletic.

The code to enable GA stats within DSpace is freely available from github: https://github.com/seesmith/Dspace-googleanalytics

You can view our collection and item statistic changes at http://www.era.lib.ed.ac.uk

Back To Top