Archive for: January, 2011

The One Schema

Jan 31 2011 Published by under How Libraries Work, Research Data

I grumbled on FriendFeed today that I wish folks (IT folks in particular) would understand that there is no single metadata schema that works for every kind of data in every form in every situation. If you're building a data repository intending to store many kinds of data from many disciplines, it had better have a metadata model that accommodates many different vocabularies.

Bill Hooker promptly stepped up to the plate with the following dictum (slightly edited by yours truly):

Three schemas for the astronomers under the sky;
   Seven for the urban planners in their halls of stone;
      Nine with which biologists comply;
and ONE for the Librarian on hir Dark Throne:
In the Land of Library, where the metadata lies.
   One schema to rule them all,
   One schema to find them;
   One schema to bring them all;
      And in the repository bind them.
In the Land of Library, where the metadata lies.

I just named my Aeron chair the Dark Throne, y'all.

7 responses so far

Friday foolery: Hawking the Library of Congress

Jan 28 2011 Published by under Miscellanea

I've had a week, folks; as a reward, next week I'm off to a conference with a brand-new talk. (Slides up soon, I hope; they're getting close to done, but they're not there yet, and I still have plenty of patter to write. I'm trying a new presentation technique, which adds a lot to my prep time.)

But my week is as nothing to the week the Library of Congress has had, wrangling a stray Cooper's hawk that wandered in and didn't particularly want to leave. They did safely capture her, and now she's off to rehab despite saying "no, no, no" and will eventually be rereleased into the wild.

It's hard to get useful good press for libraries. What's more typical is this kind of nonsense with the subtext "that library stuff, it's sooooooo obscure, and aren't librarians just weirdoes?" Worse are the lazy buns-and-shushing stereotypes (George Lucas, my lazergaze is on you, man); sometimes even worse than those are the "look, aren't librarians so hip? and isn't that cute?" stories.

So I'm super-impressed with how well the Library of Congress handled public relations on this one. Their blog shines with good humor and good information. They made very clear that they were handling the situation well and responsibly. They took note of useful input from blog commenters, and responded publicly to it. They deserve every iota of the attention this odd little episode has garnered.


Link, don't pass around files

Jan 25 2011 Published by under Open Access, Praxis

So I heard an interesting question the other day, one that's worth thinking out loud about. Someone asked whether it was legal, copyrightly-speaking, to post a legally open-access article to a public server or service (such as Facebook or FriendFeed), or if one should link instead.

The answer, as with most copyright questions, is "it depends." The other answer is "I am not a lawyer; if you have a copyright question, go ask a lawyer." But in my estimation, even when reposting is probably safe, it's better to link, and I'll try to explain why.

First, there's a pragmatic argument: it's usually just plain easier to drop in a link than to download and reupload (and if it isn't easier, the hosting archive is broken). I'm all in favor of easy.

Second, in many cases, reposting articles publicly may well infringe copyright. If there's a CC-BY license on the article, I would guess public reposting with credit to be an acceptable reuse. If there's a CC non-commercial or share-alike license, I'd personally think twice. If there's no CC license at all, which is the usual case? By reposting, you're making a copy, and yes, an author or copyright-owning publisher could bring a lawsuit over that. Would they have much of a case? Who knows? I don't. But who needs the hassle?

Can I, as a digital-archive manager, give you permission to repost items from the archive I run? Actually, no, I usually can't (the few CC-BY items in the archive aside). The license that archive depositors give the archive lets the archive disseminate materials via its own website. That license emphatically does not let the archive give other people permission to disseminate (except perhaps under the specific circumstance of the archive shutting down and transferring the entirety of its assets elsewhere). It's a subtle point, but important.

Third, there's an impact question to consider. As alternative impact metrics take hold in journal publishing, view and download numbers take on new importance for authors. If you repost an article instead of linking to it, are you going to count views and downloads? Probably not. Publishers and archives, though, they're counting and reporting. So anybody who downloads your copy robs the author of a countable download. Maybe that doesn't matter much today… but it might matter a lot tomorrow.

Fourth, authors aren't the only folks counting views and downloads. Digital archives aren't magically free to run, and we digital archivists don't work entirely out of the goodness of our hearts. One of the ways we justify our work and our archives' existence is through view-and-download counts. When you repost, you dilute the impact that we can report to our funders. Speaking as one whose service has been threatened with closure, I can tell you that any impact dilution can be a true threat.

Link, don't repost, even when reposting is legal. The author you benefit may be your colleague, or even yourself. The open-access archive or publisher you benefit is fighting against the paywall-bounded darkness.

19 responses so far

Library Day in the Life 6

Jan 24 2011 Published by under Praxis

Once a year, librarians get together to tell people what we do all day, because we know that many people have stale, stereotyped, or just plain wrong ideas about what that is. I don't usually talk about my job here, because I've landed myself in hot water over that before, but for Library Day in the Life I'll make an exception.

For the record, what I do: My job split three ways as of last August. One-quarter of me belongs to the institutional repository, at least until such time as that enterprise is absorbed into the grand new digital-library infrastructure currently being built. Another quarter of me belongs to the library school, allowing me to teach two courses per year. The remaining half co-manages Research Data Services, pitches in on various projects relating to scholarly communication, (usually digital) preservation, and research-data management, and has some other irons in the fire that aren't yet ready for prime time.

Today's doings:

  • 7:30 am: Arrive at my office. Start up the iMac; tidy up a couple of (physical) things while it boots.
  • 7:32 am: Start up email, RSS feedreader, IM client (among other things, it's how colleagues know where I am), calendar app, Evernote (with my to-do list).
  • 7:33 am: Chug through email-related to-dos, while chugging the morning's Diet Coke:
    • Send an old RDS announcement to a committee member for revision.
    • Send a researcher at another institution a copy of a closed-because-of-copyright thesis that they found in the institutional repository. (We treat such requests just like interlibrary loan, by policy.)
    • I have a green light to talk briefly about RDS at Thursday's all-staff meeting, yay! Quickly write up what I want to say in Evernote (so that I can have it in my iPod Touch on the day).
    • Do a requirements writeup for a projected new RDS website feature for the folks who manage the RDS website.
    • Check the scholarly-communication questions email address. Nothing there but journal spam. Delete journal spam.
    • Hack through stuff in inbox that doesn't need to be there. Make it to Inbox 6. Note a few things that have been waiting to get done; give them a lick and a promise, because it's a busy day.
  • Intermittently: skim feeds, FriendFeed, and Twitter. Not much I actually have to stop and read through today, which is good.
  • 8:45 am: Set IM client away notification to "At a meeting, sorry!" Chug across the quad to catch part of a SLIS health-informatics class. They're demoing a new campus research facility whose PI is interested in help figuring out how to manage the digital and analog data that the facility will produce; this is the best chance we'll likely have to get a read on the project, given how hard the PI is to reach. Demo is useful; I ask questions related to what data they will be recording during experiments, and whether/how installations like this one are sharing their work. From the sound of things, we'll be in on the ground floor as they think through these questions—which is wonderful.
  • 9:30 am: Get back to office, slightly lightheaded from the glue smell in the library entryway (they're replacing the absorbent padding on the entryway floor). Set IM status back to "available." Revise ("put a screenshot on the back!" sounds so easy, yet isn't) and start printing flyers and handouts for afternoon meet-some-faculty event for RDS.
  • 9:34 am: Notice from @joan_starr on Twitter that DataCite Metadata Scheme is published. Grimace, bookmark it in Pinboard (tags "datacitation" and digital-curation class number), leave it open in a tab to look at later.
  • 10:04 am: Flyers merrily printing (they're both two-sided, and it's a communal printer, which means a lot of running-back-and-forth), take the opportunity to check the course-management system for any student SOSes. First-week homework is coming in; good. Had some drops, which is unsurprising and probably positive (if they can't handle the first week's homework, they don't belong in my class, and I deliberately set up the first week's homework so that students would find that decision easy).
  • 10:13 am: Try to sort out issues with RDS-related email list. Send SOS to library helpdesk. Get prompt, helpful response. Email internal RDS list to start discussion about future of related list.
  • 10:18 am: Skim the published DataCite scheme, paying special attention to the XML instance listed. Realize that DSpace can neither create nor do anything useful with such an XML instance. Sigh. Wish once again that we were off DSpace, or that DSpace would finally recognize that metadata stoppeth not with key-value pairs.
  • 10:20 am: Squeeze in some work on OLA Superconference slides ("Turning Collection Development Inside-Out"). For me, this means hunting for CC-BY licensed photos and art and sorting out the typography and general aesthetic I want. Save new Keynote theme in case I want this particular combo again.
  • 11:25 am: Everything's printed, yay! And I have some finished slides and a lot of presentation outline done. Heading down to meet health-informatics professor for discussion over lunch.
  • 12:30 pm: Back in office, with game plan for health-informatics professor's project. Send requested email to professor. Churn through email that's piled up over the morning. Figure out where RDS-related meeting is; realize that I'll have to leave in 25 minutes to make it there on time. Sigh. Try to move slides a wee bit further along.
  • 12:50 pm: Answer an emailed question for an institutional-repository contact on a different UW System campus about copyright clearances for graduate theses.
  • 12:55 pm: Shut down iMac; there won't be much point in returning to my office after meeting, so I'll just go home and do the rest of my workday from there. Depart for meeting, grabbing up folder with flyers on the way out; chug up and over Bascom Hill to the center of campus.
  • 2:35 pm: Having listened to many cheerful facts about the square of the hypotenuse, er, conflict-of-interest reporting and watched a librarian colleague knock her RDS presentation out of the park, head out to the bus stop to catch a bus home. While waiting, check email and RSS feeds via iPod Touch. Star one RSS item related to OLA Superconference talk for more in-depth perusal later.
  • 3:10 pm: Arrive home, boot up Buffle the MacBook. Log onto the course management system, deal with group-project assignments, post some administrivia, grade first-week homework (yes, I am mean and cruel). Note with pleasure that students new to XML (which isn't all of them by any means, and I can tell the difference!) are figuring out for themselves that you can "make up your own tags" in XML that make sense in context.
  • 4:00 pm: Done grading. A successful assignment; I feel good about it (which is an important datum in a first-time course). Four assignments missing, but they have another hour's grace; I'll get the stragglers tomorrow morning. One last email check, one small, non-serious fire to put out.
  • 4:10 pm: Realize I forgot to water the philodendron in my office. Sigh. It's a forgiving plant; it'll live until tomorrow. Start expanding the little one-or-two-word notes-to-self in Evernote into this post. Not that this is part of my workday, of course; I thought folks might wonder, that's all.

Over the course of the evening, I'll probably look in on email and the course-management system a couple more times, but I won't answer anything unless it's an immediate problem (which it hardly ever is).


Friday foolery: have some more Dui!

Jan 21 2011 Published by under Miscellanea

It's very cold where I am, so what could be better than some dude in shades and aloha shirts rapping about Dui, er, Dewey Decimal?

Okay, maybe some things could be better than that. But this isn't all bad!


Can it be? A metadata standard that makes sense?

Jan 19 2011 Published by under Research Data

I am notorious for hating library metadata standards and standard-like objects. Hate MARC. Hate Dublin Core with a great and wonderful hate. Hate OpenURL. Hate EAD. Hate OAI-PMH and OAI-ORE. Bring me a metadata standard, I'll usually find something to hate.

What does it mean that I like the DataCite Metadata Scheme? Am I losing my edge? Going over the edge? What?

Or it could just be that the DCMS is a sensible minimum that solves the problem at hand (identifying and citing digital datasets) without gobs of cruft or gobs of oversimplification. They've also acknowledged the need to revisit and change the scheme over time, and are working on how that will happen (Open Archives Initiative, I am training laser-eyes on you).

DCMS is not perfect; in my opinion, they'll need to go beyond DOIs to handles and ARKs and PURLs. (Yes, I know all DOIs are handles; not all handles are DOIs.) But for a first cut, it's pretty darn good, and it'll stay that way if they can resist the temptation to cruft it up. Good job, standardistas!


Syllabi (and how rapidly they become obsolete)

Jan 18 2011 Published by under Research Data

So I promised I'd throw my syllabus up for folks to look at, and voilà, I have done so.

A few foot-shuffling words about it. This is a library-school syllabus. I am teaching future librarians, archivists, and records managers. I therefore make no apology for the library focus in this syllabus. If approached to work on an informatics course for a science department, I would come up with a very different syllabus indeed. (I'm up for doing that, by the way; just not alone, unless it's a linguistics or digital-humanities course where I have sufficient disciplinary background not to make a total idiot of myself. Don't ask me to teach cheminformatics all on my lonesome, though; no can do. Find me a cheminformaticist or even a chemist to work with, and I'll see what I can accomplish.)

I haven't cribbed (much) from other curricular materials out there. Possibly I should have; I ran short on minutes. Part of it, though, is that I'm an ornery cuss with a full set of my own ornery notions about what newbie librarian data-managers need to know. That set will change over time! I'm already feeling sorry that I didn't stick in a day on personal digital archiving, and I may yet do so, since I cautiously left a free day in the syllabus.

Part of it is also that curricular materials tend to assume a whole program's worth of courses, rather than just one course. If I paid too much attention to DigCCurr, feelings of utter inadequacy would have prevented me from writing a syllabus at all! There's only so much I can do in a single semester.

The fun bit (for certain values of "fun") of writing syllabi is how rapidly they obsolesce. Teaching and working in a rapidly-growing, rapidly-developing area, as I remarked on Twitter this morning, is an exercise in constant "whoa, hey, look at THAT!" moments. Today is the first official day of class for me (although since this class is all-online and I opened it up late last week, several enterprising students have already dug in, and I even have a couple of first-week homework assignments turned in already!), and what should show up in my feedreader but an entire issue of D-Lib Magazine devoted to research data. Total facepalm moment. If this issue had been out when I was syllabus-writing, half of it would have gone in, I'm sure!

So, you know. I do what I can do. I posted a "whoa, hey, look at THAT!" note to the course-management system. I expect I'll post quite a few more of those, as the semester progresses!

6 responses so far

Derk Haank interview, translated into English

Jan 17 2011 Published by under Open Access

"The Big Deal, a problem? C'maaaaaaaan! There's no problem! I'm getting big fat checks; where's the problem?"

Interview here, if you'd like to check my interpretation.

I can't even manage to get angry at this. Events will overtake it. I stand pat on my current set of predictions: in 2011–2013, even the few remaining wealthy libraries will go over the cliff. (Why will it take three years? Because most Big Deal contracts are multi-year. It's hard to know when precisely the renewal demand will come in that shatters the camel's back.)

Haank is even right about one thing: if libraries could just kick the Big Deal can down the road a little further, they would. It's an utterly dysfunctional short-termist way to behave, but it's worked this far.

Think of Haank as a realtor in 2005. Real-estate market looked great from a realtor's perspective! That it was structurally unsound, and the cracks couldn't be spackled over any more, wasn't something he was prepared to admit, or even acknowledge.

I could be wrong. I don't think I am, though. Then again, neither does he.


Syllabus machine

Jan 12 2011 Published by under Metablogging, Research Data

Sorry for the radio silence this week; I thought it might be a good idea to finish my syllabus for this spring's digital-curation course, seeing as how class starts next week and all.

It's pretty much done, finally; I'm working on stuff in the course-management system now. I do intend to post the syllabus online when I'm committed to it, er, sure I'm finished. Since this is an all-online course, I'll be doing a fair few audio lectures and screencasts, and I may post a few of those as well over the course of the semester. (Not all of them by any means; the classroom is a sacred space where I can tell horror stories and not get in trouble, but Book of Trogool is not a sacred space.)

This is the first time I've taught this course; it should be a pretty wild ride!

Also, how in the world did anyone do syllabi before there were DOIs? I love DOIs. Find the article, copy-paste the DOI into the syllabus with the DOI resolver's address in front of it, done. All the messy access bits get dealt with by library proxy servers and CrossRef infrastructure.
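
For anyone who'd rather script the copy-paste step, here is a minimal sketch in Python. The post doesn't spell out the resolver prefix, so `https://doi.org/` is an assumption on my part, and the function name `doi_to_link` and the sample DOI are purely illustrative.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver prefix; the post itself does not name one.
DOI_RESOLVER = "https://doi.org/"


def doi_to_link(doi: str) -> str:
    """Turn a bare DOI into a link by putting the resolver prefix in front of it."""
    doi = doi.strip()
    # Tolerate a leading "doi:" label, which often comes along with a copy-paste.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode characters that are unsafe in a URL path, keeping the
    # slash that separates the DOI prefix from its suffix.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    # Illustrative DOI only; swap in the one from the article you are citing.
    print(doi_to_link("doi:10.1000/182"))
```

The proxy-server and CrossRef side of things stays out of the sketch on purpose: the whole appeal is that the syllabus only has to carry the link.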

4 responses so far

Huh? What'd she mean by that?

Jan 08 2011 Published by under Jargon, Metablogging

So in response to a plaint in a BoT comment, I've made a glossary of often-used jargon and acronyms on Book of Trogool.

It's assuredly not done yet! Please feel free to suggest things I've missed in the comments, on this post or any post. Librarianship, open access, and data curation are no less prone to jargon than any other field of endeavor. As a librarian and a teacher, though, it's my business to make the obscure and abstruse less so.

3 responses so far
