Archive for the 'Praxis' category

Link, don't pass around files

Jan 25 2011 Published by under Open Access, Praxis

So I heard an interesting question the other day, one that's worth thinking out loud about. Someone asked whether it was legal, copyrightly-speaking, to post a legally open-access article to a public server or service (such as Facebook or FriendFeed), or if one should link instead.

The answer, as with most copyright questions, is "it depends." The other answer is "I am not a lawyer; if you have a copyright question, go ask a lawyer." But even when reposting is probably safe, I think it's better to link, and I'll try to explain why.

First, there's a pragmatic argument: it's usually just plain easier to drop in a link than to download and reupload (and if it isn't easier, the hosting archive is broken). I'm all in favor of easy.

Second, in many cases, reposting articles publicly may well infringe copyright. If there's a CC-BY license on the article, I would guess public reposting with credit to be an acceptable reuse. If there's a CC non-commercial or share-alike license, I'd personally think twice. If there's no CC license at all, which is the usual case? By reposting, you're making a copy, and yes, an author or copyright-owning publisher could bring a lawsuit over that. Would they have much of a case? Who knows? I don't. But who needs the hassle?

Can I, as a digital-archive manager, give you permission to repost items from the archive I run? Actually, no, I usually can't (the few CC-BY items in the archive aside). The license that archive depositors give the archive lets the archive disseminate materials via its own website. That license emphatically does not let the archive give other people permission to disseminate (except perhaps under the specific circumstance of the archive shutting down and transferring the entirety of its assets elsewhere). It's a subtle point, but important.

Third, there's an impact question to consider. As alternative impact metrics take hold in journal publishing, view and download numbers take on new importance for authors. If you repost an article instead of linking to it, are you going to count views and downloads? Probably not. Publishers and archives, though, they're counting and reporting. So anybody who downloads your copy robs the author of a countable download. Maybe that doesn't matter much today… but it might matter a lot tomorrow.

Fourth, authors aren't the only folks counting views and downloads. Digital archives aren't magically free to run, and we digital archivists don't work entirely out of the goodness of our hearts. One of the ways we justify our work and our archives' existence is through view-and-download counts. When you repost, you dilute the impact that we can report to our funders. I speak as one whose service has been threatened with closure: any impact dilution can be a true threat.

Link, don't repost, even when reposting is legal. The author you benefit may be your colleague, or even yourself. The open-access archive or publisher you benefit is fighting against the paywall-bounded darkness.

Library Day in the Life 6

Jan 24 2011 Published by under Praxis

Once a year, librarians get together to tell people what we do all day, because we know that many people have stale, stereotyped, or just plain wrong ideas about what that is. I don't usually talk about my job here, because I've landed myself in hot water over that before, but for Library Day in the Life I'll make an exception.

For the record, what I do: My job split three ways as of last August. One-quarter of me belongs to the institutional repository, at least until such time as that enterprise is absorbed into the grand new digital-library infrastructure currently being built. Another quarter of me belongs to the library school, allowing me to teach two courses per year. The remaining half co-manages Research Data Services, pitches in on various projects relating to scholarly communication, (usually digital) preservation, and research-data management, and has some other irons in the fire that aren't yet ready for prime time.

Today's doings:

  • 7:30 am: Arrive at my office. Start up the iMac; tidy up a couple of (physical) things while it boots.
  • 7:32 am: Start up email, RSS feedreader, IM client (among other things, it's how colleagues know where I am), calendar app, Evernote (with my to-do list).
  • 7:33 am: Chug through email-related to-dos, while chugging the morning's Diet Coke:
    • Send an old RDS announcement to a committee member for revision.
    • Send a researcher at another institution a copy of a closed-because-of-copyright thesis that they found in the institutional repository. (We treat such requests just like interlibrary loan, by policy.)
    • I have a green light to talk briefly about RDS at Thursday's all-staff meeting, yay! Quickly write up what I want to say in Evernote (so that I can have it in my iPod Touch on the day).
    • Do a requirements writeup for a projected new RDS website feature for the folks who manage the RDS website.
    • Check the scholarly-communication questions email address. Nothing there but journal spam. Delete journal spam.
    • Hack through stuff in inbox that doesn't need to be there. Make it to Inbox 6. Note a few things that have been waiting to get done; give them a lick and a promise, because it's a busy day.
  • Intermittently: skim feeds, FriendFeed, and Twitter. Not much I actually have to stop and read through today, which is good.
  • 8:45 am: Set IM client away notification to "At a meeting, sorry!" Chug across the quad to catch part of a SLIS health-informatics class. They're demoing a new campus research facility whose PI is interested in help figuring out how to manage the digital and analog data that the facility will produce; this is the best chance we'll likely have to get a read on the project, given how hard the PI is to reach. Demo is useful; I ask questions related to what data they will be recording during experiments, and whether/how installations like this one are sharing their work. From the sound of things, we'll be in on the ground floor as they think through these questions—which is wonderful.
  • 9:30 am: Get back to office, slightly lightheaded from the glue smell in the library entryway (they're replacing the absorbent padding on the entryway floor). Set IM status back to "available." Revise ("put a screenshot on the back!" sounds so easy, yet isn't) and start printing flyers and handouts for afternoon meet-some-faculty event for RDS.
  • 9:34 am: Notice from @joan_starr on Twitter that DataCite Metadata Scheme is published. Grimace, bookmark it in Pinboard (tags "datacitation" and digital-curation class number), leave it open in a tab to look at later.
  • 10:04 am: Flyers merrily printing (they're both two-sided, and it's a communal printer, which means a lot of running-back-and-forth), take the opportunity to check the course-management system for any student SOSes. First-week homework is coming in; good. Had some drops, which is unsurprising and probably positive (if they can't handle the first week's homework, they don't belong in my class, and I deliberately set up the first week's homework so that students would find that decision easy).
  • 10:13 am: Try to sort out issues with RDS-related email list. Send SOS to library helpdesk. Get prompt, helpful response. Email internal RDS list to start discussion about future of related list.
  • 10:18 am: Skim the published DataCite scheme, paying special attention to the XML instance listed. Realize that DSpace can neither create nor do anything useful with such an XML instance. Sigh. Wish once again that we were off DSpace, or that DSpace would finally recognize that metadata stoppeth not with key-value pairs.
  • 10:20 am: Squeeze in some work on OLA Superconference slides ("Turning Collection Development Inside-Out"). For me, this means hunting for CC-BY licensed photos and art and sorting out the typography and general aesthetic I want. Save new Keynote theme in case I want this particular combo again.
  • 11:25 am: Everything's printed, yay! And I have some finished slides and a lot of presentation outline done. Heading down to meet health-informatics professor for discussion over lunch.
  • 12:30 pm: Back in office, with game plan for health-informatics professor's project. Send requested email to professor. Churn through email that's piled up over the morning. Figure out where RDS-related meeting is; realize that I'll have to leave in 25 minutes to make it there on time. Sigh. Try to move slides a wee bit further along.
  • 12:50 pm: Answer an emailed question for an institutional-repository contact on a different UW System campus about copyright clearances for graduate theses.
  • 12:55 pm: Shut down iMac; there won't be much point in returning to my office after meeting, so I'll just go home and do the rest of my workday from there. Depart for meeting, grabbing up folder with flyers on the way out; chug up and over Bascom Hill to the center of campus.
  • 2:35 pm: Having listened to many cheerful facts about the square of the hypotenuse (er, conflict-of-interest reporting) and watched a librarian colleague knock her RDS presentation out of the park, head out to the bus stop to catch a bus home. While waiting, check email and RSS feeds via iPod Touch. Star one RSS item related to OLA Superconference talk for more in-depth perusal later.
  • 3:10 pm: Arrive home, boot up Buffle the MacBook. Log onto the course management system, deal with group-project assignments, post some administrivia, grade first-week homework (yes, I am mean and cruel). Note with pleasure that students new to XML (which isn't all of them by any means, and I can tell the difference!) are figuring out for themselves that you can "make up your own tags" in XML that make sense in context.
  • 4:00 pm: Done grading. A successful assignment; I feel good about it (which is an important datum in a first-time course). Four assignments missing, but they have another hour's grace; I'll get the stragglers tomorrow morning. One last email check, one small, non-serious fire to put out.
  • 4:10 pm: Realize I forgot to water the philodendron in my office. Sigh. It's a forgiving plant; it'll live until tomorrow. Start expanding the little one-or-two-word notes-to-self in Evernote into this post. Not that this is part of my workday, of course; I thought folks might wonder, that's all.

Over the course of the evening, I'll probably look in on email and the course-management system a couple more times, but I won't answer anything unless it's an immediate problem (which it hardly ever is).

What if we threw a data-curation party and nobody came?

Dec 21 2010 Published by under Praxis, Research Data

So a lot of libraries and campus IT shops in the States are gearing up to deal with this whole NSF data-management plan thing. Websites are going up, would-be consultants are warming up their phones, plans are being planned (and sometimes even executed).

What if we build it and they don't come? Have we thought about this possibility?

I'm afraid my intrinsically Cassandraic nature only partly inspires these questions. We know pretty well from surveys and qualitative investigations (bug me for a bibliography if you like) that the average researcher hasn't a clue librarians can help her look after her research data. The said average researcher despises librarians, for that matter; she thinks that pukka information management can be taught to graduate students soup to nuts in a weeklong seminar, and she thinks that the real limiting skill for data management is deep disciplinary knowledge (which raises the question of why she typically leaves it to wet-behind-the-ears grad students, but…). The average researcher is dead wrong, of course (including about disciplinary knowledge being the sole limiter), but does she know that?

So let's imagine our old friend Dr. Helen Troia of the University of Achaea's Basketology department for a moment, faced with this new NSF requirement. Where will she go for help?

Well, she's probably going to call her NSF program officer first, an eminently reasonable thing to do. I hope the NSF has told its program officers to tell all the Dr. Troias of this world to look for help in their libraries—at least on their own campuses—but I'm not sanguine. What is clear, though, is that the NSF isn't going to manage Dr. Troia's data for her; at most, it'll give her a better idea of what she has to do to prove she's managing it wisely. So where does she go then?

She may also talk to her research-support office. Libraries: does your institution's research-support office know about your NSF-related activities? If it doesn't, better tell it. And she'll have a word with her local grant admin (she's lucky enough to have one) as well. Libraries: what do local grant administrators know about you?

If Dr. Troia's data are digital (not all data covered under the policy are, a point that bears re-emphasis), her next stop is likely to be her departmental IT talent. Libraries: if you are only partnering with campus IT, you may (depending on the way your campus is organized) be missing the boat. Find out where the people in small IT shops hang out, and reach out to them, too.

Now, departmental IT may well take on the job, but they are liable to do it ludicrously wrong. "Here, have some server storage space," they will say, ignoring questions of metadata, versioning, formats, organization, security, citability and other sharing issues, sustainability past grant expiration, and possibly even backup. I'm not sneering; with my own eyes I have seen a campuswide IT shop at a major research university, a shop that should assuredly know better, advertising unbacked-up storage as suitable for data-archiving needs. (No, I won't link. Yes, I am tempted to.) Again, it's a case of people not realizing what they don't know. NSF helper-elves need to be prepared to cope with that.

If departmental IT punts (as it likely should), then and only then will Dr. Troia approach campus IT. She will do so with fear and trepidation, as campus IT tends to be a Cthulhoid monstrosity, as fathomable as sunken R'lyeh and approximately as helpful. Libraries: how are front-line tech-support staff finding out about your NSF-related services?

If none of the above people with whom Dr. Troia interacts points her toward the library, she won't come to the library. I wish that weren't so too. It's so. The inevitable corollary is that outreach efforts should not start with researchers. They should start with the layer of support and administrative staff with whom researchers regularly interact.

Even more cheerfully: none of this may work. We just don't know yet. We'll know much better in a year or so! Best have a plan for if it doesn't. Can you get a list of campus NSF awardees, to contact them individually? Do you have a few campus researchers who are willing to do projects with you? Can you get at the graduate students who are doing the real work?

Good luck. I think we'll all need it.

In which copyright is annoying

Dec 20 2010 Published by under Praxis

With all the ferment over copyright law currently, I don't understand why someone hasn't pointed out that from a recordkeeping perspective, tying copyright law to author lifespan is an incredibly bad idea, amounting to an immense research tax on would-be preservers and reusers of culture.

I was recently asked about reuse of a published photograph by Paul Regnard, a French psychologist. Don't bother with Wikipedia; he doesn't have an article there, nor was he important enough to make the pages of the scientific and medical biographical dictionaries I could lay hands on. It is possible to triangulate via Google and some fossicking about that he died in 1927.

So French copyright terms, as best I can tell, currently mirror ours (life plus seventy years), with one wrinkle: if you died in active service, your copyright term lasts an extra thirty years. (What was French copyright law like in 1927? When the term was extended to its current length, did the extension apply retroactively? Darned if I know. If anyone would like to enlighten me, feel free.) So if Regnard died in active service, his photographs are still copyrighted. If not, not.
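
To make the arithmetic concrete, here is a minimal sketch of the rule exactly as I've described it above (life plus seventy years, plus thirty more for death in active service). It is a simplification and not legal advice: the function names are made up for illustration, and it ignores retroactivity, transitional provisions, and end-of-calendar-year conventions.

```python
# Simplified sketch of the term rule as described in this post.
# Assumption: term = death year + 70, plus 30 more if the author
# died in active service. Real copyright law has more wrinkles.

BASE_TERM_YEARS = 70
ACTIVE_SERVICE_BONUS_YEARS = 30


def copyright_expiry_year(death_year: int, died_in_active_service: bool) -> int:
    """Return the last year of protection under the simplified rule."""
    term = BASE_TERM_YEARS
    if died_in_active_service:
        term += ACTIVE_SERVICE_BONUS_YEARS
    return death_year + term


def still_in_copyright(death_year: int, died_in_active_service: bool, as_of_year: int) -> bool:
    """True if the work is still protected in as_of_year under the simplified rule."""
    return as_of_year <= copyright_expiry_year(death_year, died_in_active_service)


if __name__ == "__main__":
    # Regnard's death year as triangulated above; 2010 is when this was written.
    for active_service in (False, True):
        expiry = copyright_expiry_year(1927, active_service)
        protected = still_in_copyright(1927, active_service, 2010)
        print(f"active service={active_service}: expires {expiry}, protected in 2010: {protected}")
```

The whole answer flips on a single boolean that can only be settled by digging through service records, which is exactly the research tax I'm complaining about.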

I'm not planning to investigate 1920s service records for France. I'm just not. So there the matter rests.

Frankly, as a pragmatic tradeoff I'd accept a longer copyright term (odious though that would be) in exchange for a more precise one, such that I wouldn't have to fuss about French service records. I mean, merde.

In-tech and Lazinica at it again

Dec 01 2010 Published by under Open Access, Praxis

This is by way of a public-service warning.


Lazinica has the dubious distinction of being the only (as far as I know, anyway) publisher to be told by OASPA to take their logo off his site. Looking through the current In-tech offerings, one is bombarded with nonexistent copyediting and appalling typesetting. I can only guess acquisitions and review standards are equally low or lower, especially the way the outfit goes around trawling for authors.

This is not an outfit that will do your academic career any good. Stay away. Can I interest you in a nice PLoS or BMC instead?

Last I checked, In-tech's journals were still listed in the DOAJ. If I were DOAJ, I'd rectify that problem, but I'm not. And other than OASPA telling Lazinica he can't use their logo, they've been silent on the subject.

So I do what I can to spread the word. Somebody should.

Little Nuggets

Nov 18 2010 Published by under Miscellanea, Open Access, Praxis, Uncategorized

Little nuggets of information are swirling around in my head. I'm just back from two meetings, in two different cities, and each one had some interesting ideas about the future of library services, collections, and technology.

Meeting #1 was the 2010 SPARC Digital Repositories Meeting in Baltimore. The last time this meeting was held, 2008, the landscape for institutional repositories (and digital repositories) was focussed on how libraries could create and/or host them and convince others of their value. I would say that with a few exceptions, not much has changed.

Just as everyone wants to get married in Jane Austen's Pride and Prejudice, everyone in libraryland seems convinced that if the right marketing approach/language is used, the perfect match will be made with respect to people contributing and using IR/DR content. Unfortunately the current IR/DR infrastructure isn't conducive to this. You need to establish relationships before (or while) you build the network, and there are few easy tie-ins to the existing infrastructure. The keynote speaker, Michael Nielsen, made this point with respect to use and adoption of online science networks, and the same is true for libraries. The current reward system isn't set up so scientists can show the value of contributing to social networks outside of the peer review process. I would agree this is true for IRs/SRs/DRs also, although of the three, subject repositories have been the most successful.

As you can tell from the program, there was emphasis on collecting and curating open data, which I think showed there is a desire for libraries to find a better match. While this may create a niche for libraries, it's going to take some work between the "data nerds" and the collectors, as this FriendFeed discussion shows.

While several presenters mentioned the need for preservation, there was surprisingly little talk about the importance of having policies, infrastructure, and technology in place to do this. In fact these two communities are almost completely disconnected. There's also been very little attention to assessment issues, such as identifying whether the money and staff time devoted to projects is worthwhile given the continuing recession and shrinking library collections budgets. I see both of these ideas impacting work on IRs/DRs/SRs, although since neither topic is "sexy" it may take some time before we see much attention devoted to these issues.

The plan is to have this conference again in two years, and if this happens I predict we will see further shifts in focus, or perhaps see this program co-sponsored by or linked with another organization.

Meeting #2 was a joint ARL/SSP workshop, Partnering to Publish: Innovative Roles for Societies, Institutions, Presses, and Libraries. This should have been a session or part of the schedule for Meeting #1, because it became clear as the meeting progressed that working in the publishing infrastructure is a natural way for libraries to make their repositories and/or preservation efforts tie into the existing promotion and tenure environment. In most cases the speakers at the event were able to show this in easily quantifiable ways, like sales figures, enhanced content and features in books and journals, as well as stronger relationships with administrative units and campus faculty.

I also attended yet another conference in the last month: the 2010 Library Assessment Conference. Not much of this conference addresses issues in BOT but I will say this - there were twice as many attendees at this meeting as at the SPARC meeting, with many more presentations and ideas generated. This is currently a hot topic in librarianship and I predict we will see more programming devoted to all areas of this topic in the future.

I am the program

Oct 19 2010 Published by under Open Access, Praxis, Tactics

I'm a little chagrined that I have zero activities planned for our university campus this week for Open Access. I know - this is the time! Act now! The enthusiasm seems larger this year than I remember from previous OA week/day events. I'm missing the boat!

So after I got over it I realized: I am the Program.

This week I'll be giving two talks - one was on Monday afternoon to our University Graduate Council on a new software product, SciVal Spotlight, that tracks research performance and helps predict research strengths and emphases. Is it Open Access? No, it's a subscription database, and one that is pretty expensive. But it's unique, it builds on librarian expertise supporting the research mission, and should strengthen our relationship with other campus units. There's a short demo of it here if you want to see more. Note that I didn't include any detailed local data in the talk but Elsevier has several whitepapers available that discuss aspects of the software, which are Open Access.

I'll also be giving a guest lecture this Thursday to Jean-Claude Bradley's Chemical Information Retrieval class at Drexel University over the internet. I'll be giving an overview of web 0.0/1.0/2.0/3.0 applications and scholarly communications issues in chemistry. Since it's Open Access week I'll cover this along with publishing, copyright, identity and library issues. This lecture was fun to prepare, and thanks to Jean-Claude for asking me to offer my perspective on this.

I was also planning to talk to my reference colleagues this week about reference management tools, specifically comparing established licensed products like RefWorks and EndNote with newer, freely available alternatives like Zotero and Mendeley. This talk is being pushed back to early next month. While this may seem like an odd topic for Open Access week, I've believed for some time these tools are going to become increasingly important channels for scholarly communications and information sharing. I also believe that researchers will need to use multiple reference management tools over the course of their career, and become facile in converting files and moving between products as their research and personal networks expand and change.

While it's great to promote Open Access, there are a bunch of other issues tied up with it: copyright, author rights, archives, and funding mandates for researchers to deposit results. There are many ways to reach faculty, staff, and administrators and make them aware of the deeper issues lurking underneath the concept of Open Access and how the library can help them.

As a program planner the most difficult part of my job in scholarly communications is creating programming that faculty in all areas (humanities, social science and science) can relate to. Invariably a program emphasis that excites scientists will be inappropriate or of little interest to humanities and some social science faculty. So programming can be effective but also limiting in some ways.

So talk to faculty about how they are publishing their research, especially if there is a new journal being formed or moved from another campus, a new research initiative that needs to know more about library collections, or a student or faculty member who needs more information about publishers and editors for their work. Tell the administration about new tools to better identify campus research strengths, about mandates from funding agencies that affect research activity and support, and about how the libraries can support these efforts. I can't guarantee that all of these conversations will be successful, or that everyone who hears what you have to say will be excited about it. You may have to build supporters one faculty member, one program, one person at a time. What may work on a large research campus may not be effective for a small college or specialized technical school. The important thing is to start a conversation.

ACS: The Perfect Storm

Oct 05 2010 Published by under Praxis

Chemistry and scholarly communications issues have a difficult, stormy relationship. Why?

Part of the problem is the disconnect between industry and academia. This exists in all areas of science but can be particularly bad where big pharma and profitable trade secrets are involved.

Another part of the problem is that professional societies, with the American Chemical Society (ACS) as a notable example, use income generated from journal subscriptions and literature index licensing costs to fund other society activities. Has the society quantified this? I'm not sure - I can say that when I was a local section officer, our small section was able to obtain several programming grants and other supplemental funds to host Science Cafes, seminars, outreach activities and the like. As an incoming local section officer I was able to attend a weekend leadership institute with hotel, meals, and transportation costs covered. This was not a trivial amount of money - I estimate this totalled approximately $3,000 to $4,000 in my year as President. And I'm not counting the money our section received from the ACS as our allotment of member dues - these "grants" all came directly from ACS HQ programs and presumably from journal profits.

While our section hosted worthwhile activities that promoted science to the general and local public, I question handing out funds this easily when libraries are struggling to pay subscription costs and maintain access to the literature. Isn't having a usable local library collection part of my outreach to my users? How can I buy new ACS journals when I can't afford the ones that currently exist?

A third example of the gathering storm clouds is the New Publishing Agreement for ACS Journals released this week. It allows authors to retain copyright for copyrightable material in the article's supplemental information - generated tables, graphs and illustrations are some examples. That's about it though, for author rights. The author still has to transfer exclusive copyright to the ACS for the manuscript, as well as "all versions in any format now known or hereafter developed." It seems the ACS has tried to retain as much control as possible to protect future revenue streams.

In addition to copyright transfer there is a lengthy section on appropriate use of materials in repositories, personal websites, and classroom use. Does the ACS not realize in-classroom use is already covered by the existing rules for reserves and fair use? Apparently not, as it goes into great detail about how students can access articles for their classes using passwords and when access will end. Their stance on prior publication appears little changed, with basically any activity considered prior publication. Better be careful with those preprint submissions and Precedings posts on the Nature Publishing Group website!

Want to put your ACS papers and manuscripts in the local repository? Better get out the letterhead - authors must receive written confirmation from the appropriate ACS journal editor that posting a submitted manuscript doesn't conflict with that journal's prior publication policies. They will let authors post materials mandated by funding agencies, but you better get out the checkbook, as the only route is still the Author Choice program. Last time I checked this was $3,000 an article. But hey, you'll get to fund some of our local section activities for a year. It's a bargain! 

Is this progress? Yes, in that this is better than the previous ACS Copyright Status Form. Is it still protecting the revenue and the profits? You bet it is. I'm curious to see if this will be further amended with the implementation of the NSF Data Management Plans. It seems the ACS is not sure how to support them, although I think even they realize they can't own the data. Let's hope so anyway.

Friday foolery: Data management plans

Sep 24 2010 Published by under Praxis

I've linked to this before, but as the NSF data-management-plan go-live date looms overhead, it's worth linking to again: My Data Management Plan -- a satire.

Guaranteed to have your local data-management expert howling with laughter—or curling up under a desk in tears.

On preservation versus replication of research data

Sep 20 2010 Published by under Praxis

I often see a cost argument against research-data preservation: if it's cheaper to replicate or regenerate the data than to preserve it, why preserve?

Here's my question: Cheaper for whom?

If we remain within the context of an individual lab, this question is a no-brainer: if it's cheaper to regenerate, regenerate. As we dip our toes into an opener-data world, however, I should think the equation changes rather.

Is it still cheaper for two labs to have to regenerate these same data? Five labs? Twenty labs? How many of those labs will have to buy specialized equipment to create those data, equipment they wouldn't need if the data were shared by the first lab? How much staff time—worst-case, specialized staff time—will be eaten up in regenerating data?

There are certainly offsetting costs to consider: the cost of data discovery, the cost of cleaning up and describing data for sharing, the cost of whatever munging it takes to move data from one lab's context to another's, the magnified cost of any error on the part of the data-generating lab.

Still, my sense is that the discussion around cost has been just a bit simplistic… and is likely to become more complicated as data-sharing norms emerge.
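
For the numerically inclined, the "cheaper for whom?" question can be sketched as a toy comparison. Every figure and function name below is hypothetical, invented purely for illustration; real costs would have to come from the labs involved, and the offsetting costs mentioned above (discovery, cleanup, munging) are lumped into a single per-lab reuse cost.

```python
# Toy cost model: everyone regenerates vs. one lab generates and preserves.
# All dollar figures are made-up assumptions, not measurements.

def cost_if_everyone_regenerates(n_labs: int, regen_cost: float) -> float:
    """Each lab that needs the data pays to generate it from scratch."""
    return n_labs * regen_cost


def cost_if_first_lab_preserves(n_labs: int, regen_cost: float,
                                preserve_cost: float, reuse_cost: float) -> float:
    """The first lab generates and preserves; every other lab pays only
    the cost of finding, cleaning, and adapting the shared data."""
    return regen_cost + preserve_cost + (n_labs - 1) * reuse_cost


def cheaper_option(n_labs: int, regen_cost: float,
                   preserve_cost: float, reuse_cost: float) -> str:
    regenerate = cost_if_everyone_regenerates(n_labs, regen_cost)
    preserve = cost_if_first_lab_preserves(n_labs, regen_cost, preserve_cost, reuse_cost)
    return "regenerate" if regenerate <= preserve else "preserve"


if __name__ == "__main__":
    # Hypothetical figures: regeneration is cheaper than preservation
    # for a single lab, which is the usual form of the cost argument.
    REGEN, PRESERVE, REUSE = 10_000, 15_000, 2_000
    for labs in (1, 2, 3, 5, 20):
        print(labs, cheaper_option(labs, REGEN, PRESERVE, REUSE))
```

With these invented numbers, regeneration wins for one or two labs and preservation wins from the third lab onward. The point is not the specific crossover but that the answer depends on how many labs you count.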
