Starting off the week with some juicy tidbits:
- An extremely nerdy but (for nerds) fascinating examination of XML and its implications for data modeling. Do we have to reduce everything to a relational model? Really? Perhaps not… Notably, it seems to me, this article describes fairly nicely how Fedora works. (For more beating on the humble RDBMS, see this blog post.)
- White Dielectric Substance in Library Metadata. "Understanding the noise turned out to be more important than understanding the signal." What does that mean for efforts to decide which data to preserve? "I've observed that most people trying to collect metadata go through an early period of thinking it's easy, and then gradually gain understanding of the real challenges." So have I observed this, Mr. Hellman, so have I.
- An empirical study of data sharing by authors publishing in PLoS journals. My reaction split neatly in half: half "data are doomed because no one who makes them will lift a finger to save them" and half "surely this could be easier?"
- Steve Hitchcock on how data will change library-managed repositories. My best wishes to Steve as he makes his vision real.
- The Fourth Paradigm (PDF). Microsoft's toe-dip into the data waters. Useful case studies for those thrashing about in planning processes.
- Writing math on the web. Because it still makes my head explode that the Web was nominally designed to exchange physics papers, but the math-display problem didn't seem to occur to any of its architects until years later.
- And finally, an amazing data project: the Ocean Observatories Initiative.
That should keep everyone out of trouble for a while…