Can we further subdivide nuances of open access?
July 15, 2009
The Public Knowledge Project 2009 conference ultimately made me re-think the way that open access (OA) is defined and subdivided.
The current subdivision is dichotomous. Open access is subdivided into the gratis and the libre models as described by Peter Suber in his Open Access Newsletter, where gratis OA refers to the removal of price barriers alone, while libre OA involves the removal of price barriers and at least some permission barriers. I perceive this to be a hierarchy of use, where gratis OA is less usable because the permissions for further use of these items are not clear.
The concept of the hierarchy was echoed in a workshop on Lemon8-XML (L8X) that I attended at the PKP Scholarly Publishing Conference. One of the speakers, MJ Suhonos, underscored that not all document dissemination formats are created equal. If one compares an XML-encoded article to the same article available in PDF, one sees that the XML-encoded article enables enhanced access to the content. The strength is in the modularity of the XML, which enables the content to be labeled and described explicitly in a standardized way. The usefulness of XML can be illustrated using the example of citations. In a PDF, the citations sit lumped with the rest of the document and cannot be reliably harvested or parsed as discrete citations because to a machine they appear identical to the text of the article. In XML, the citations are denoted as citations and hence can be parsed and analyzed as such.
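To make the citation example concrete, here is a minimal sketch in Python of what "parsing citations as citations" could look like. The element names (ref-list, ref, author, year, title) and the sample references are invented for illustration; they are not the actual schema that Lemon8-XML produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical article markup: tag names and references are made up
# for this illustration, not taken from any real schema or paper.
ARTICLE_XML = """
<article>
  <body>
    <p>Earlier work (Smith, 2004) examined access barriers.</p>
  </body>
  <ref-list>
    <ref id="r1">
      <author>Smith, J.</author>
      <year>2004</year>
      <title>Access barriers in scholarly publishing</title>
    </ref>
    <ref id="r2">
      <author>Lee, K.</author>
      <year>2007</year>
      <title>Repositories and reuse</title>
    </ref>
  </ref-list>
</article>
""".strip()


def extract_citations(xml_text):
    """Return one dict per <ref> element found in the document."""
    root = ET.fromstring(xml_text)
    citations = []
    for ref in root.iter("ref"):
        citations.append({
            "id": ref.get("id"),
            "author": ref.findtext("author"),
            "year": ref.findtext("year"),
            "title": ref.findtext("title"),
        })
    return citations


citations = extract_citations(ARTICLE_XML)
for c in citations:
    print(c["author"], c["year"], c["title"])
```

Because each reference is labeled, the in-text mention "(Smith, 2004)" in the body is never confused with an entry in the reference list. That distinction is exactly what is lost in a PDF.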
One can’t help but imagine a world where every document has semantically encoded citations! We would not need to rely on ISI and Scopus anymore (or pay the Crossref fees). Everyone would have equal access to citation harvesting and analysis. (Two years ago, a Scopus vendor told me their indexing rejection rate was approximately 80%…talk about an elite society!) XML markup could enable global barrier-free citation analysis, where elite membership would no longer be necessary.
In this same L8X session, Juan Pablo Alperin discussed other benefits of XML markup besides the infinite possibilities of enhanced bibliometric analysis. He asked us to imagine the benefits of discovering collaboration networks, where enhanced author markup, for example, would enable us to see which institution collaborates with whom. Enhanced document discovery would also be a benefit, where the availability of complete metadata means that we can find related works in many ways, such as by the same author, on the same subject, in the same journal, or from the same publisher.
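The collaboration-network idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the author and institution elements and the placeholder names "University A", "University B" and "University C" are hypothetical, not data or markup from the session.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical articles with author affiliation markup (invented tag names).
ARTICLES = [
    "<article><author><institution>University A</institution></author>"
    "<author><institution>University B</institution></author></article>",
    "<article><author><institution>University A</institution></author>"
    "<author><institution>University B</institution></author>"
    "<author><institution>University C</institution></author></article>",
    # Two authors from the same institution: no cross-institution link.
    "<article><author><institution>University C</institution></author>"
    "<author><institution>University C</institution></author></article>",
]


def collaboration_counts(articles):
    """Count how many articles each pair of institutions co-authored."""
    counts = Counter()
    for xml_text in articles:
        root = ET.fromstring(xml_text)
        # Collect the distinct institutions named on this article.
        institutions = sorted({
            author.findtext("institution")
            for author in root.iter("author")
            if author.findtext("institution")
        })
        for pair in combinations(institutions, 2):
            counts[pair] += 1
    return counts


counts = collaboration_counts(ARTICLES)
for (first, second), n in counts.most_common():
    print(first, "+", second, ":", n)
```

Each pair of institutions becomes an edge in the network, weighted by the number of co-authored articles. None of this is possible when affiliations are just unlabeled text on a page.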
While we are already seeing some of these benefits in Google Scholar, not all articles are marked up in a way that lets them fully benefit from what Google Scholar has to offer.
We see, then, that there is a divide between articles that are static in nature, like PDFs, and articles that are marked up in such a way that all their components have meaning associated with them. I argue that articles not marked up in XML are less usable than those that are, just as research available as gratis OA is somewhat less usable than libre OA.
The PKP team recognized the benefits of XML early on and responded by creating the Lemon8-XML software. They see the need for equal semantic exposure for all scholarship and have created a tool that puts this ability within everyone’s reach.
The Lemon8 software enables an editor to upload an article and takes them step by step through marking up that document in XML while shielding them from the gory details. Lemon8 identifies document metadata such as title and author and, among other features, searches multiple databases to help verify citations by automatically suggesting additional data in a user-friendly way. Article markup is still not a quick venture, but if editors were to incorporate Lemon8 into their workflow, it could actually save them time, as it would greatly reduce the time it takes to verify citations while at the same time enabling their semantic markup.
I am excited to learn that integration of Lemon8 into the Open Journal Systems software is on the development roadmap for the Public Knowledge Project, and am looking forward to working with this added functionality.
PRONOM
March 25, 2009
Paving the way towards sustainability of electronic records is PRONOM, an online registry of technical information.
An initiative of the National Archives (the UK government’s official archive in Surrey), the PRONOM registry was “originally developed to support the accession and long-term preservation of electronic records”. The National Archives have graciously made this valuable resource available to all.
As described on the site, “PRONOM holds information about file formats, and the software products which can process (read, write, identify etc) each format. Information related to the file formats, such as documentation about them, their compression types, character encoding schemes and intellectual property rights is also held.”
When browsing the site, I was pleased to find that in addition to a simple search, one can search by file format, vendor, software, lifecycle, migration pathway and PRONOM unique identifier. The search also allows you to find file formats by extension, and to search for software that can process files with a particular extension (or file format name). An online submission form is available to encourage user contributions and to help keep the registry current.
What an important step towards tackling the challenges of digital preservation!
Green OA vs. Gold OA
March 11, 2009
A post by Stevan Harnad on the JISC repositories listserv directed me to Richard Poynder’s post and article. His article “Open Access: Whom would you back” provides an excellent history of the Open Access movement. It chronicles the devious and fascinating approaches employed by scholarly publishers in adapting to the evolving publishing landscape.
As a librarian in the trenches promoting Green OA in support of our institutional repository, I see on a daily basis the tactics that some publishers are using to appear supportive of open access while at the same time doing their very best to undermine the authority and validity of self-archived works. As Poynder points out, publishers are doing what they can to shift focus from Green OA to Gold OA, as this is a more profitable avenue for them. He points out that we may be disappointed if all that Gold OA accomplishes is a shift from the paying of subscriptions to the paying of article processing charges (APCs).
While I do agree with many of Richard’s arguments, there’s one point I’d like to make. To me, it is not as important who profits most in the transition to OA, because we are all winners in the end. While I agree that Green OA would solve both affordability and access challenges, and I back it wholeheartedly as an ideal solution, no matter which OA models win out, as a global community we will all benefit from barrier-free access to peer-reviewed scholarship.
Acer Aspire One
July 21, 2008
I finally had a chance to customize my new Acer Aspire One. It came bundled with only 512 MB of RAM, so I bought another 1 GB (max for this mainboard) to install. I was cautioned by a member of the sales staff at Canada Computers that the installation was not an easy one, and that I should scour YouTube to find a video demonstration.
The staff member was not kidding! This RAM installation was quite a learning experience. I’ve been assembling computers for years, but had never cracked open a laptop before, never mind a sub-notebook. It was very counterintuitive. Screws were hidden under the rubber feet of the laptop. I had to rip the feet out with a screwdriver, revealing double-sided tape. The keyboard then had to be pried out, followed by removing multiple layers to finally reveal the expansion slot, which was UNDER the mainboard itself. Tiny delicate clips were everywhere. I must say, after putting the Aspire One back together, I was just glad that it booted up! Luckily everything worked and the extra RAM showed up in the BIOS. I was unable to find a YouTube video with instructions, but found this post to be a good overview, and this step-by-step guide was an amazing help.
I also found the following post quite useful for some tips on how to get into the Linpus back end.
Scopus’ friendly searching interface
July 10, 2008
If you have not yet had a chance to play with the Scopus interface, it’s well worth a look. Through their extensive research, Scopus has made the use of facets approachable and intuitive. Just like in Erik Hatcher’s Collex platform, users can choose to either limit or exclude multiple facets within specific categories in their searches.
I’m glad that the Scopus folks made it down to demo the service, as I had a chance to ask them some questions about their content. I’m always thinking about how to promote and include the journals that we’re hosting at York through Open Journal Systems software, and so I asked one of the three Scopus representatives about the process involved. It turns out that there is an on-line form through which one can suggest a title. Applications are reviewed once per year. This year’s deadline is September 1st, 2008. It was disappointing to hear that the rejection rate is 65%, because being accessible through this interface would enable Scopus users to more deliberately discover a journal’s content.
Another huge benefit to journals indexed by Scopus is the Journal Analyzer function. This tool allows a journal to track its citations back to 1996. The Journal Analyzer also allows users to select up to 10 journals and compare their performance next to each other on the same graph. This could be a useful visual accompaniment to a grant application.
Web 2.0 concepts
July 6, 2008
I recently finished reading Key differences between Web 1.0 and Web 2.0. I admit that Web 2.0 was old news over two years ago, but this article provides a good summary of key concepts. It would be useful to read for those wanting to integrate Web 2.0 concepts into a site design.
A few excerpts:
The article discusses the “balkanization” phenomenon. Users are more likely to join a particular on-line social network if a critical mass of their friends are already members.
Where Web 1.0 started the portalization trend (trying to build features of interest such as weather and news into a site so that users do not need to leave the site), Web 2.0 continues the portalization trend (by providing hosting of users’ photos and videos) but relies on users to bring the content.
In Web 2.0, the “trend is towards an increasingly customized front page so that no two users have the same experience.”
Brand names and Open Access
July 6, 2008
I was reading Peter Suber’s July 08 Open Access Newsletter, and it’s enough to make my head spin. There are so many developments posted on Open Access News that I just can’t keep up anymore. It’s fantastic that “hot” stories are now tagged and that the feed to these stories can be subscribed to here.
I was struck by a particular point Peter raised: that the availability of funds to pay for access [to research] does not scale to keep pace with the growth of published knowledge.
It made me think about the format problem. I’ve been hearing it mentioned over and over again, this question: why are we so attached to packaging our research into a journal format?
Is it the brand name that we’re so attached to? If we’re looking for quality, do we simply just seek out the Prada of journals? Does not the research stand up for itself, just like a consumer good has to? If your Vuitton luggage falls apart after one trip down the baggage conveyor belt, does the fancy brand matter anymore?
Maybe it’s about lack of time. Who has the time to compare the quality of consumer goods…we’ve all purchased a generic brand at one point or another that greatly underperformed. To protect against that disappointment, it’s just easier to pay a little more for the name brand version. Perhaps we adopt a similar mentality with research?
This worries me a bit. The fact that research output volumes are multiplying so quickly…is it not in a way working against the cause?…is it not further fueling the demand for these high impact journals to exist? Is it not so much easier to simply save time by trusting the name brand research?
In my mind, the solution lies in the metrics…where citations and downloads can be measured and compared to the opinions of the elite groups of peer reviewers that decide which articles are Vuittons and which are simply generic. I can’t help but predict that once a more unified and unprejudiced method of tracking impact appears, brand names just won’t matter. The quality of an item of research will simply stand up for itself, visible for all to see, no longer in need of being sold under a designer label.