Planet Cataloging

September 16, 2014

Catalogue & Index Blog


We are pleased to announce our recently launched annual bursary. Full details can be found below. Best of luck to all applicants!

Purpose: The CILIP Cataloguing & Indexing Group intends to support research, best practice, and professional development in the field of metadata, cataloguing, classification, indexing, and the technology for these areas with an annual bursary of up to £500. CIG also wants to help disseminate the outcomes of any sponsored projects or activities.

Conditions: This bursary is intended for future or ongoing projects (i.e. not awarding past achievements) where no other funds are available. It is available to CIG members. Candidates are expected to report on their results/findings/output etc. to the CIG committee; these reports are to be published in the group’s journal, members’ newsletter, and/or blog. If the report or results are otherwise published, the support from CIG Annual Bursary should be acknowledged. The bursary is not intended for primary professional training (e.g. library school fees). Depending on the suitability of applications, the bursary may be split or not awarded.

Application: The bursary will be announced at least four weeks in advance of its deadline, which shall be 31st October; candidates will be informed of the outcome within a further four weeks. Applications should be submitted to the CIG Chair/Secretary and should include:

  • A covering letter of application
  • Details of how the bursary will be spent, i.e.
    • a description of the aims & objectives of the project or activity
    • how it will contribute to the professional development of individuals or generally to CIG’s field of interest (not more than 500 words)

A supporting statement from anyone in the wider library profession is optional.

Decision: The panel of judges for the CIG Annual Bursary will be composed of three persons:

  • CIG Chair or CIG committee member nominated by the Chair
  • Professional academic, invited by the committee
  • Professional practitioner or other expert in the field, invited by the committee

Payment: Payment will be made to the successful applicant(s) by cheque or electronic bank transfer, at a time determined by the judging panel. The panel may impose conditions, such as proof of expenses, that have to be fulfilled before the full sum is paid out.


by cilipcig at September 16, 2014 09:38 AM

September 15, 2014

Metadata Matters (Diane Hillmann)

Who ya gonna call?

Some of you have probably noted that we’ve been somewhat quiet recently, but as usual, that doesn’t mean nothing is going on; rather, we’ve been too busy to come up for air and talk about it.

A few of you might have noticed a tweet from the PBCore folks on a conversation we had with them recently. There’s a fuller note on their blog, with links to other posts describing what they’ve been thinking about as they move forward on upgrading the vocabularies they already have in the OMR.

Shortly after that, a post from Bernard Vatant of the Linked Open Vocabularies project (LOV) came over the W3C discussion list for Linked Open Data. Bernard is a hero to those of us toiling in this vineyard, and LOV is one of the go-to places for those interested in what’s available in the vocabulary world and the relationships between those vocabularies. Bernard was criticizing the recent release of the DBpedia Ontology, having seen the announcement and, as is his habit, gone in to try to add the new ontology to LOV. His gripes fell into a couple of important categories:

  • the ontology namespace was dereferenceable, but what he found there was basically useless (his word)
  • finding the ontology content itself required making a path via the documentation at another site to get to the goods
  • the content was available as an archive that needed to be opened to get to the RDF
  • there was no versioning available, thus no way to determine when and where changes were made

I was pretty stunned to see that a big, important ontology was released in that way–so was Bernard, apparently, although since that release there has been a meeting of the minds, and the DBpedia Ontology is now resident in LOV. But as I read the post and its critique, my mind harkened back to the conversation with PBCore. The issues Bernard brought up were exactly the ones we were discussing with them: how to manage a vocabulary, what tools were available to distribute the vocabulary to ensure easy re-use and understanding, the importance of versioning, providing documentation, etc.

These were all issues we’d been working hard on for RDA, and are still working on behind the RDA Registry. Clearly, there are a lot of folks out there looking for help figuring out how to provide useful access to their vocabularies and to maintain them properly. We’re exploring how we might do similar work for others (so ask us!).

Oh, and if you’re interested in our take on vocabulary versioning, take a look at our recent paper on the subject, presented at the IFLA satellite meeting on LOD in Paris last month.

I plan on posting more about that paper and its ideas later this week.

by Diane Hillmann at September 15, 2014 07:31 PM

025.431: The Dewey blog

Anthropological Linguistics, Ethnolinguistics, Sociolinguistics of Specific Languages

Sociolinguistics and its closely related neighbors, ethnolinguistics and anthropological linguistics, are often studied in the context of specific languages.  (“Sociolinguistics” will be used throughout to represent any and/or all of sociolinguistics, ethnolinguistics, and anthropological linguistics.)  Until recently, however, no provision was given in the DDC for expressing a specific language in the context of sociolinguistics et al.  An expansion at 306.442 Anthropological linguistics, ethnolinguistics, sociolinguistics of a specific language now provides for notation from Table 6 Languages to be added to 306.442 to collocate language-specific sociolinguistics.  

Consider, for example, Language ideologies and the globalization of standard Spanish (to which the LCSH Sociolinguistics has been assigned).  With the expansion at 306.442, the number for this work becomes 306.44261 Sociolinguistics of Spanish (built with 306.442, plus notation T6—61 Spanish, as instructed at 306.442).  (Because it wasn't possible previously to add Table 6 notation to 306.44 Language ["Class here sociolinguistics"], classifiers sometimes desperately and cleverly tried to fill the need by adding standard subdivision notation T1—09 Geographic treatment, plus notation T2—175 Regions where specific languages predominate, as instructed under T1—091 Areas, regions, places in general, plus the appropriate notation from Table 6, as instructed under T2—175.  Such classification was an end run around a missing add instruction:  geographic treatment in a region where a language predominates is not the same thing as the language itself.)

Sometimes the classification of a sociolinguistics work needs to express both language and place.  Because this happens fairly often, the add instruction under 306.442 provides for the classifier to use 0 as a facet indicator between the notation for language and Table 2 notation, instead of using standard subdivision T1—09:  "Add to base number 306.442 notation T6—2-9 from Table 6, e.g., sociolinguistics of French 306.44241; then add 0 and to the result add notation T2—1-9 from Table 2, e.g., sociolinguistics of French in Quebec 306.442410714."   This is the approach that we take with a work like Sociolinguistics and Nigerian English, which should now be classed in 306.442210669 Sociolinguistics of English in Nigeria (built with 306.442, plus notation T6—21 English, plus facet indicator 0, plus notation T2—669 Nigeria, all as instructed under 306.442).

A work on the sociolinguistics of a place, where the work is not language-specific, uses 306.44 Language plus standard subdivision T1—09 Geographic treatment plus Table 2 notation, as in the past.  For example, The languages of urban Africa, to which the LCSH Sociolinguistics—Africa has been assigned, should continue to be classed in 306.44096091732 Sociolinguistics of African urban regions (built with 306.44, plus notation T1—09 Geographic treatment, plus notation T2—6 Africa, as instructed under T1—093-099 Specific continents, countries, localities, plus notation T1—093-099:091 Areas, regions, places in general from the add table under T1—093-099, plus notation T2—1732 Urban regions, as instructed under T1—093-099:091).
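
The two add patterns for language-specific works amount to simple notation concatenation. As a toy sketch only (the method names are invented for illustration; the notation values come from the examples in this post, and this is in no way a DDC implementation):

```ruby
# Toy sketch of the 306.442 number-building patterns described above.
BASE = "306.442"  # Anthropological linguistics, ethnolinguistics, sociolinguistics of a specific language

# Language only: base number plus Table 6 notation.
def sociolinguistics_of(language_t6)
  BASE + language_t6
end

# Language plus place: base number, Table 6 notation,
# facet indicator 0, then Table 2 notation.
def sociolinguistics_of_in(language_t6, place_t2)
  BASE + language_t6 + "0" + place_t2
end

puts sociolinguistics_of("61")            # Spanish (T6—61)                 => 306.44261
puts sociolinguistics_of_in("41", "714")  # French (T6—41) in Quebec (T2—714) => 306.442410714
puts sociolinguistics_of_in("21", "669")  # English (T6—21) in Nigeria (T2—669) => 306.442210669
```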

by Rebecca at September 15, 2014 01:44 PM

September 12, 2014

Resource Description & Access (RDA)


Case                                  Date recorded
approximate date                      [2014?]
supplied date                         [2014]
date of publication not identified    [date of publication not identified]
two years                             [2013 or 2014]
between years                         [between 2005 and 2014?]
not before                            [not before 2000]
not after                             [not after 2000]
between years with date               [between March 13, 2000 and July 10, 2014]

Note: Cells are left blank for catalogers to fill with data. Please supply cases and solutions, quoting the proper RDA rules, for situations where the date of publication is not identified; they will be included in this table along with the name of the cataloger. Write your suggestions in the "comments" section of this blog post.




by Salman Haider at September 12, 2014 11:13 PM

Temporary / Permanent Date in an Incomplete Multipart Monograph : Questions and Answers in the Google+ Community "RDA Cataloging"

RDA Cataloging is an online community/group/forum for library and information science students, professionals, and cataloging & metadata librarians. It is a place where people can get together to share ideas, trade tips and tricks, exchange resources, get the latest news, and learn about Resource Description and Access (RDA), the new cataloging standard replacing AACR2, as well as other issues related to cataloging and metadata.

 Questions and Answers in the Google+ Community "RDA Cataloging"


Publication etc., dates (MARC21 264). These conventions do not apply to serials or integrating resources (temporary data not recorded in this field).

Temporary date. If a portion of a date is temporary, enclose the portion in angle brackets.


, 1980-〈1981〉 
v. 1-2 held; v. 2 published in 1981

, 〈1981-〉 
v. 2 held; v. 1-2 published in 1981

, 〈1979〉-1981. 
v. 2-3 held of a 3-volume set

, 〈1978-1980〉 
v. 2-3 held of a 5-volume set

Permanent date. If an entire date is judged to be permanent, record it without angle brackets.


, 1980- (not 〈1980-〉 or , 1980-〈 〉)
v. 1 held; v. 1 published in 1980

[Source: LC-PCC PS for RDA Rule 1.7.1]

by Salman Haider at September 12, 2014 11:13 PM

OCLC Cataloging and Metadata News

Five Canadian research libraries collaborate to share cataloguing of large collections

Joseph Hafner is Associate Dean, Collection Services at McGill University. During the 2014 WorldShare Metadata Users Group Meeting at ALA Annual in Las Vegas, he shared how OCLC’s WorldShare Metadata and the WorldCat knowledge base enabled five Canadian research libraries—University of Alberta, University of British Columbia, McGill University, Université de Montréal and University of Toronto—to catalogue nearly 7,000 records over just a few months. Many of these records are for titles from the National Film Board and Québec government publications.

September 12, 2014 04:00 PM

September 11, 2014

Mod Librarian

5 Things Thursday: DAM, Special Special Collections, RDA

Here are 5 more things of interest:

  1. A lovely British take on why organisations need a digital asset management department. Including that “the role of this department being about education rather than just a place to dump all the cataloguing jobs that DAM users don’t fancy taking on themselves…”
  2. Incredibly special Special Collections libraries complete with an archive of punk.
  3. How to design…


September 11, 2014 12:07 PM

Coyle's InFormation

Philosophical Musings: The Work

We can't deny the idea of work - opera, oeuvre - as a cultural product, a meaningful bit of human-created stuff. The concept exists, the word exists. I question, however, that we will ever have, or that we should ever have, precision in how works are bounded; that we'll ever be able to say clearly that the film version of Pride and Prejudice is or is not the same work as the book. I'm not even sure that we can say that the text of Pride and Prejudice is a single work. Is it the same work when read today that it was when first published? Is it the same work each time that one re-reads it? The reading experience varies based on so many different factors - the cultural context of the reader; the person's understanding of the author's language; the age and life experience of the reader.

The notion of work encompasses all of the complications of human communication and its consequent meaning. The work is a mystery, a range of possibilities and of possible disappointments. It has emotional and, at its best, transformational value. It exists in time and in space. Time is the more canny element here because it means that works intersect our lives and live on in our memories, yet as such they are but mere ghosts of themselves.

Take a book, say, Moby Dick; hundreds of pages, hundreds of thousands of words. We read each word, but we do not remember the words -- we remember the book as inner thoughts that we had while reading. Those could be sights and smells, feelings of fear, love, excitement, disgust. The words, external, and the thoughts, internal, are transformations of each other; from the author's ideas to words, and from the words to the reader's thoughts. How much is lost or gained during this process is unknown. All that we do know is that, for some people at least, the experience is a vivid one. The story takes on some meaning in the mind of the reader, if one can even invoke the vague concept of mind without torpedoing the argument altogether.

Brain scientists work to find the place in the maze of neuronic connections that can register the idea of "red" or "cold" while outside of the laboratory we subject that same organ to the White Whale, or the Prince of Denmark, or the ever elusive Molly Bloom. We task that organ to taste Proust's madeleine; to feel the rage of Ahab's loss; to become a neighbor in one of Borges' villages. If what scientists know about thought is likened to a simple plastic ping-pong ball, plain, round, regular, white, then a work is akin to a rainforest of diversity and discovery, never fully mastered, almost unrecognizable from one moment to the next.

As we move from textual works to musical ones, or on to the visual arts, the transformation from the work to the experience of the work becomes even more mysterious. Who hasn't passed quickly by an unappealing painting hanging on the wall of a museum, before which stands another person rapt with attention? If the painting doesn't speak to us, then we have no possible way of understanding what it is saying to someone else.

Libraries are struggling to define the work as an abstract but well-bounded, nameable thing within the mass of the resources of the library. But a definition of work would have to be as rich and complex as the work itself. It would have to include the unknown and unknowable effect that the work will have on those who encounter it; who transform it into their own thoughts and experiences. This is obviously impractical. It would also be unbelievably arrogant (as well as impossible) for libraries to claim to have some concrete measure of "workness" for now and for all time. One has to be reductionist to the point of absurdity to claim to define the boundaries between one work and another, unless they are so far apart in their meaning that there could be no shared messages or ideas or cultural markers between them. You would have to have a way to quantify all of the thoughts and impressions and meanings therein and show that they are not the same, when "same" is a target that moves with every second that passes, every synapse that is fired.

Does this mean that we should not try to surface workness for our users? Hardly. It means that it is too complex and too rich to be given a one-dimensional existence within the current library system. This is, indeed, one of the great challenges that libraries present to their users: a universe of knowledge organized by a single principle as if that is the beginning and end of the story. If the library universe and the library user's universe find few or no points of connection, then communication between them fails. At best, like the user of a badly designed computer interface, if any communication is to take place it is the user who must adapt. This in itself should be taken as evidence of superior intelligence on the part of the user as compared to the inflexibility of the mechanistic library system.

Those of us in knowledge organization are obsessed with neatness, although few as much as the man who nearly single-handedly defined our profession in the late 19th century; the man who kept diaries in which he entered the menu of every meal he ate; whose wedding vows included a mutual promise never to waste a minute; the man enthralled with the idea that every library be ordered by the simple mathematical concept of the decimal.

To give Dewey due credit, he did realize that his Decimal Classification had to bend reality to practicality. As the editions grew, choices had to be made on where to locate particular concepts in relation to others, and in early editions, as the Decimal Classification was used in more libraries and as subject experts weighed in, topics were relocated after sometimes heated debate. He was not seeking a platonic ideal or even a bibliographic ideal; his goal was closer to the late 19th century concept of efficiency. It was a place for everything, and everything in its place, for the least time and money.

Dewey's constraints of an analog catalog, physical books on physical shelves, and a classification and index printed in book form forced the limited solution of just one place in the universe of knowledge for each book. Such a solution can hardly be expected to do justice to the complexity of the Works on those shelves. Today we have available to us technology that can analyze complex patterns, can find connections in datasets that are of a size way beyond human scale for analysis, and can provide visualizations of the findings.

Now that we have the technological means, we should give up the idea that there is an immutable thing that is the work for every creative expression. The solution then is to see work as a piece of information about a resource, a quality, and to allow a resource to be described with as many qualities of work as might be useful. Any resource can have the quality of the work as basic content, a story, a theme. It can be a work of fiction, a triumphal work, a romantic work. It can be always or sometimes part of a larger work, it can complement a work, or refute it. It can represent the philosophical thoughts of someone, or a scientific discovery. In FRBR, the work has authorship and intellectual content. That is precisely what I have described here. But what I have described is not based on a single set of rules, but is an open-ended description that can grow and change as time changes the emotional and informational context as the work is experienced.

I write this because we risk the petrification of the library if we embrace what I have heard called the "FRBR fundamentalist" view. In that view, there is only one definition of work (and of each other FRBR entity). Such a choice might have been necessary 50 or even 30 years ago. It definitely would have been necessary in Dewey's time. Today we can allow ourselves greater flexibility because the technology exists that can give us different views of the same data. Using the same data elements we can present as many interpretations of Work as we find useful. As we have seen recently with analyses of audio-visual materials, we cannot define work for non-book materials identically to that of books or other texts. [1] [2] Some types of materials, such as works of art, defy any separation between the abstraction and the item. Just where the line will fall between Work and everything else, as well as between Works themselves, is not something that we can pre-determine. Actually, we can, I suppose, and some would like to "make that so", but I defy such thinkers to explain just how such an uncreative approach will further new knowledge.

[1] Kara Van Malssen. BIBFRAME A-V modeling study
[2] Kelley McGrath. FRBR and Moving Images

by Karen Coyle at September 11, 2014 05:30 AM

September 10, 2014

Bibliographic Wilderness

Cleaning up the Rails backtrace cleaner; Or, The Engine Stays in the Picture!

Rails has for a while included a BacktraceCleaner that removes some lines from backtraces, and reformats others to be more readable.

(There’s an ActiveSupport::BacktraceCleaner, although the one in your app by default is actually Rails::BacktraceCleaner, a subclass that sets some defaults. That’s a somewhat odd way to implement Rails defaults on an AS::BacktraceCleaner, but oh well.)

This is pretty crucial, especially since recent versions of Rails can have pretty HUGE call stacks, due to reliance on Rack middleware and other architectural choices.

I rely on clean stack traces in the standard Rails dev-mode error page, in my log files of fatal uncaught exceptions — but also in some log files I write myself, where I catch and recover from an exception, but want to log where it came from anyway, ideally with a clean stacktrace. `Rails.backtrace_cleaner.clean( exception.backtrace )`

A few problems I had with it though:

  • Several of my apps are based on kind of ‘one big Rails engine’. (Blacklight, Umlaut).  The default cleaner will strip out any lines that aren’t part of the local app, but I really want to leave the ‘main engine’ lines in. That was my main motivation to look into this, but as long as I was at it, a couple other inconveniences…
  • The default cleaner nicely reformats lines from gems to remove the filepath to the gem dir, and replace with just the name of the gem. But this didn’t seem to work for gems listed in Bundler as :path (or, I think, :github ?), that don’t live in the standard gem repo. And that ‘main engine gem’ would often be checked out thus, especially in development.
  • Stack trace lines that come from ERB templates include a dynamically generated internal method name, which is really long and makes the stack trace confusing — the line number in the ERB file is really all we need. (At first I thought the Rails ‘render template pattern filter’ was meant to deal with that, but I think it’s meant for something else)

Fortunately, you can remove, or add your own, silencers (which remove lines from the stack trace) and filters (which reformat stack trace lines) on the ActiveSupport/Rails::BacktraceCleaner.

Here’s what I’ve done to make it the way I want. I wanted it built directly into Umlaut (a Rails engine), so this is written to go in Umlaut’s `< Rails::Engine` class. But you could do something similar in a local app, probably in the `initializers/backtrace_silencers.rb` file that Rails has left as a stub for you already.

Note that all filters are executed before silencers, so your silencer has to be prepared to recognize already-filtered input.
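
That ordering can be shown with a toy model in plain Ruby (no Rails involved; the `ToyCleaner` class and the sample paths below are invented for this sketch, just mimicking BacktraceCleaner's behavior):

```ruby
# Minimal toy model of BacktraceCleaner's order of operations:
# every filter rewrites each line, then every silencer sees the
# already-filtered line and decides whether to drop it.
class ToyCleaner
  def initialize
    @filters = []
    @silencers = []
  end

  def add_filter(&block)
    @filters << block
  end

  def add_silencer(&block)
    @silencers << block
  end

  def clean(backtrace)
    backtrace
      .map { |line| @filters.reduce(line) { |l, f| f.call(l) } }
      .reject { |line| @silencers.any? { |s| s.call(line) } }
  end
end

cleaner = ToyCleaner.new
# The filter runs first, replacing a gem path with the gem name...
cleaner.add_filter { |line| line.sub(%r{^/gems/umlaut/}, "umlaut ") }
# ...so the silencer must match the filtered form, not the raw path.
cleaner.add_silencer { |line| !line.start_with?("app/", "umlaut ") }

trace = ["app/models/thing.rb:10", "/gems/umlaut/lib/x.rb:5", "/gems/rack/lib/y.rb:9"]
p cleaner.clean(trace)  # => ["app/models/thing.rb:10", "umlaut lib/x.rb:5"]
```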

module Umlaut
  class Engine < Rails::Engine
    engine_name "umlaut"

    initializer "#{engine_name}.backtrace_cleaner" do |app|
      engine_root_regex = Regexp.escape(self.root.to_s + File::SEPARATOR)

      # Clean those ERB lines; we don't need the internal autogenerated
      # ERB method name -- what we do need (the line number in the ERB
      # file) is already there.
      Rails.backtrace_cleaner.add_filter do |line|
        line.sub(/(\.erb:\d+):in `__.*$/, "\\1")
      end

      # Remove our own engine's path prefix, even if it's
      # being used from a local path rather than the gem directory.
      Rails.backtrace_cleaner.add_filter do |line|
        line.sub(/^#{engine_root_regex}/, "#{engine_name} ")
      end

      # Keep Umlaut's own stacktrace in the backtrace -- we have to remove
      # Rails' silencers and re-add them how we want.
      Rails.backtrace_cleaner.remove_silencers!

      # Silence what Rails silenced, UNLESS it looks like
      # it's from the Umlaut engine.
      Rails.backtrace_cleaner.add_silencer do |line|
        (line !~ Rails::BacktraceCleaner::APP_DIRS_PATTERN) &&
          (line !~ /^#{engine_root_regex}/) &&
          (line !~ /^#{engine_name} /)
      end
    end
  end
end



Filed under: General

by jrochkind at September 10, 2014 04:40 PM

September 09, 2014

025.431: The Dewey blog

Dr. Lois Mai Chan

We were shocked and saddened to learn of the passing of Dr. Lois Mai Chan on August 20.

Lois was a member of the Decimal Classification Editorial Policy Committee (EPC) 1975-1993, and she served as chair of EPC 1986-1991.  Before her last EPC meeting, Peter Paulson (then executive director of Forest Press) wrote: "As most of you already know, this will be the last EPC meeting for Lois Chan, who has served the Committee wisely and well for eighteen years.  We shall miss her careful and thoughtful reading of exhibits, her ability to clarify ambiguities, and her soothing composure when opinions conflict.  We will be honoring Lois at the Committee dinner on Wednesday evening."

Lois was co-author of Dewey Decimal Classification: A Practical Guide (1994, 1996) and Dewey Decimal Classification: Principles and Application (2003).  

Lois was part of the teams that introduced Dewey at workshops held in various locations around the world:  in 1995 at Crimea '95 and in Moscow at the Russian National Public Library for Science and Technology, at IFLA in Beijing in 1996, and in several venues in Hanoi in 1997. The proceedings of the Beijing workshop were published as Dewey Decimal Classification: Edition 21 and International Perspectives: Papers from a Workshop presented at the General Conference of the International Federation of Library Associations and Institutions (IFLA), Beijing, China, August 29, 1996. Lois gave the opening remarks, the summary and closing remarks, and was co-editor of the publication—doing what was typical for her.  

Her deep understanding of the principles of the DDC, and her consistently careful, hard work to communicate that understanding to students around the world made a lasting contribution to the development and teaching of the DDC.

by Juli at September 09, 2014 08:37 PM

First Thus

ACAT Improving catalogues

On 09/09/2014 14.11, Scott R Piepenburg wrote:
> In one situation I was at, it was highly desirable for people to be able to say “I want Tom Hanks movies where he was an actor as opposed to a director” or I want the works of Jim Steinman as a composer versus a performer.” Maybe it is an isolated situation, but in this particular location, it was highly useful to be able to find people who serve multiple roles (actor, producer, director, etc.) to the exclusion of others. The challenge is first to code the information, then to make a user-friendly and useful interface to be able to retrieve it.
Yes, I agree with this. People have asked these kinds of questions for years and they have gone to the tools that provide them. For movies, there was the serial “Film directors : a complete guide” and other directories. For music, there have been “Who’s who in music” and “Musician’s directory” among others. Plus there was the occasional almanac or two.

Today, however, for movies you can go to the IMDB, among other places; here is Tom Hanks’s page. I have found Google itself to be pretty good too. For music, I am not quite so sure there is one “best” site, but there are several; here is the page for Jim Steinman, and of course, we should not forget about Wikipedia for any of this.

This is so simple, and *free* (amazingly!), that the next question seems obvious enough: why should we spend our precious resources re-creating things that already exist? It would take a long, long, long time (if ever!) before anything we do could be nearly as good as these tools are right now–and by that time (after a few decades) what will exist for the public then?! The mind boggles at what there could be in only five years, much less 20 or 30. The very first iPhone came out in 2007, and look at everything that has happened since then.

So, what should we do? Do we spend our time adding these relator codes to our records, when we know that anything we make will remain demonstrably inferior to those other tools? Why? Or do we try to work with these other tools in all kinds of cool ways? Again, that is a task primarily for the computer technicians, and lots of them are making these sorts of tools right now. This would leave catalogers to work on what they do best.

The information that is in our records right now could be used far better than it is. That is where our focus should be.


by James Weinheimer at September 09, 2014 02:28 PM

Improving catalogues

Posting to Autocat

On 08/09/2014 23.59, Charles Pennell wrote:

The introduction of Bibframe doesn’t change anything about legacy data, just as AACR didn’t change anything about pre-existing ALA rules records or AACR2 to pre-existing AACR1 records. …
In your example, $e can be added retrospectively when role information has been provided through the 245$c, 508 or 511, but again, this takes effort and needs to be subjected to a business model. We have been doing a lot of clean-up of our legacy e-book records, standardizing 856 notes, eliminating duplicate records, separating print from e-, etc. using global edits plus a lot of grunt work for which we have community support in the interest of providing better access. It can be done when you have a plan and support.

Sure, we can do all kinds of things, but in a world of declining numbers of catalogers who are spending less and less time actually cataloging (from everything I have heard and read), we should be asking ourselves: what is the best use of our diminishing resources? Is it best spent by adding information to older records that have been around for decades or more, or by doing something else? Where is the evidence that the public wants and needs this relator information so much more than other things we could do? While I have heard many–too many–complaints about our catalogs, I have certainly never seen or heard anything from a user saying that it is critical for them to search for people as “editors” or “thesis advisors” or something like that. And this at the same time as the authority records do not work.

Which is more important:
1) to find specific authors reliably by their function as editors, “creators” (whatever that catch-all word means) and so on, or

2) to find out that, if you want to do a good search for Mark Twain, then you need to know the following:

“For works of this author written under other names, search also under Clemens, Samuel Langhorne, 1835-1910, Snodgrass, Quintus Curtius, 1835-1910, Conte, Louis de, 1835-1910, Alden, Jean Francois, 1835-1910”

along with the associated links? How is anybody supposed to just “know” that?

To do 1) demands a huge amount of resources and time from the entire cataloging community. A product that could give even halfway-decent results (such as finding a specific person in the role of an editor, say 50% of the time?) will take what? 5 years? 10 years? Shouldn’t we find out how long it would take and what resources would be needed to get it done in–say 20 years vs. 10 years vs. 5 years? More important, shouldn’t we at least find out if this is so important to the public–because librarians themselves do not need that information to manage the collection.

To do 2) is entirely different. Everything exists now, catalogers need do nothing, and it remains only for some programmers/computer technicians to build it and share it. I emphasize “only” because that is a loaded word here, but in any case it demands exponentially fewer resources from far fewer people than option 1). Anyway, I would argue that it needs to be done in any case if our authority records are ever to become useful to the public again.

So, what is the wiser choice? Of course, I may be wrong: research could show that the public would vastly prefer the relator information to the information in the authority files, or prefer it to having catalogers actually catalog more resources–which could help everybody deal with the problem of “information overload” that is causing everyone to pull their hair out. Shouldn’t we at least find out? To find out, the public needs to be researched in at least semi-scientific ways. That, of course, is one part of “making the business case”.

Nobody can avoid making a business case for what they do. Coming up with a valid business case is a complicated task and you may hear and discover things you don’t like one bit, but it is nonetheless inevitable. You can either make the case before implementation of a project, so that you can avoid as many problems and errors as possible, or you have to do it afterwards when the problems and errors are clear to everyone, and you find yourself trying to explain them away.


by James Weinheimer at September 09, 2014 11:04 AM

Bibliographic Wilderness

Cardo is a really nice free webfont

Some of the fonts on google web fonts aren’t that great. And I’m not that good at picking the good ones from the not-so-good ones on first glance either.

Cardo is a really nice old-style serif font that I originally found recommended on some list of “the best of google fonts”.

It’s got a pretty good character repertoire for Latin text (and, I think, Greek). The Google Fonts version doesn’t seem to include Hebrew, though some other versions might. For library applications, the more characters the better, and it should have enough to deal stylishly with whatever letters and diacritics you throw at it in Latin/Germanic languages, along with all the usual symbols (currency, punctuation, etc.).

I’ve used it in a project that my eyeballs have spent a lot of time looking at (not quite done yet), and I’ve been increasingly pleased by it: it’s nice to look at and to read, especially on a ‘retina’ display. (I wouldn’t use it for headlines, though.)

Filed under: Uncategorized

by jrochkind at September 09, 2014 04:39 AM

September 08, 2014

First Thus

ACAT Improving catalogues

Posting to Autocat

On 9/8/2014 8:37 PM, Charles Pennell wrote:

If anything, I think Bibframe is on target to create even greater granularity for our data than MARC (including MARCXML) ever could have. Plus, it has the attention of those who are trying to provide better access to that data through the semantic Web and are looking for a more sympathetic data structure and vocabulary than what is currently offered through non-library providers.

This is one of those points I have never understood. Right now, our tools are powerful enough (whether with MySQL or with XML) that we can manipulate any (and I mean ANY) information in our records in any way we could want. Changing our formats will not change this at all. For instance, if we want to enable people to do the FRBR user tasks–to find/identify/select/obtain works/expressions/manifestations/items by their authors/titles/subjects–that can be done RIGHT NOW. By anybody. This is not because of any changes in our cataloging rules or formats, but because the “innards” of the catalog have been changed with Lucene indexing, which now allows anybody to use the facets (created automatically through the new indexing) to do so. To prove it to yourself, all you have to do is search WorldCat for any uniform title, e.g. Dante’s “Divine Comedy”. With this search, anybody can then click on different formats, different dates, languages and so on. These facets and the user interface can be changed in any way we want. Any uniform title can be used.

Yes, all of this can be improved a lot–that user interface especially–but catalogers need do nothing. These changes can be made by the computer technicians alone. Catalogers just need to keep adding the uniform titles, as they have always done.
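As a rough illustration of the point, here is how facets fall out automatically from fields that are already indexed. The records below are toy data invented for the sketch, not real WorldCat records:

```python
from collections import Counter

# Toy records standing in for indexed catalog data (illustrative values only).
records = [
    {"uniform_title": "Divina commedia", "language": "English", "format": "Book"},
    {"uniform_title": "Divina commedia", "language": "Italian", "format": "Book"},
    {"uniform_title": "Divina commedia", "language": "English", "format": "eBook"},
    {"uniform_title": "Hamlet", "language": "English", "format": "Book"},
]

def facets(results, field):
    """Count the facet values for one field across a result set."""
    return Counter(r[field] for r in results)

# "Search" by uniform title, as in the WorldCat example above...
hits = [r for r in records if r["uniform_title"] == "Divina commedia"]

# ...and the language and format facets emerge from data the catalogers
# already supplied; no new cataloging was needed.
print(facets(hits, "language"))  # Counter({'English': 2, 'Italian': 1})
print(facets(hits, "format"))
```

The point of the sketch: the faceting logic lives entirely in the "innards"; the cataloger's only contribution is the uniform title itself.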

To make records more granular means adding information that is currently not in our records. What does this mean? For instance, we could record that a specific person was a translator instead of just an added entry. In a new record we make today, we can add the correct $e relator code. That is pretty easy. But what about in this record, where Henry Francis Cary was the translator? Who is going to add it to that record? It won’t add itself! Or to the other expressions/manifestations of his translation? Or to all of the other translators of all of the other versions? Or to all of the translations of all works? How many records and how much work would that be? And what about all of the other relator codes, across the whole range of WEMI relationships?


This is what I was getting at in my podcast Cataloging Matters No. 16: Catalogs, Consistency and the Future

If we don’t add the coding to these older records, then they will effectively be hidden when someone clicks on someone as translator, i.e. only the new headings will get:

Cary, Henry Francis, |d 1772-1844, |e translator

and the old ones will not. That is, until someone recodes them. To make this search useful means that every single record will have to be recataloged, otherwise when people search for Cary as a translator those records cannot, by definition, be found.
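The retrieval consequence can be sketched with toy data (the name is from the example above; the structure is purely illustrative, not a real MARC parser):

```python
# Toy added-entry data as (heading, relator) pairs. Legacy records carry
# no relator (None), exactly the situation described above.
entries = [
    ("Cary, Henry Francis, 1772-1844", "translator"),  # newly coded record
    ("Cary, Henry Francis, 1772-1844", None),          # legacy record, no $e
    ("Alighieri, Dante, 1265-1321", "author"),
]

def find_by_role(entries, name, role):
    """A role-limited search: only records explicitly coded with the role match."""
    return [e for e in entries if e[0] == name and e[1] == role]

# The legacy record is invisible to a 'translator' search, by definition:
hits = find_by_role(entries, "Cary, Henry Francis, 1772-1844", "translator")
print(len(hits))  # 1, not 2 -- the uncoded record is silently excluded
```

No amount of interface polish fixes this: until the old record is recoded, a role-limited search simply cannot see it.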

How is this different from other databases that IT people work with? From my experience, business people (where this happens regularly) deal with it by saying, “Well, we are dealing with old, obsolete information, so we can just archive it. We can put it in a zip file and let people download it if they want it.” I have heard precisely those words.

Maybe this is correct when talking about invoice information that is 5+ years old–or maybe not–or personnel information, or even for medical information that is 15 years old or older. But it is 100% totally incorrect for library catalog information. Why?

Because the materials received, and the records made, for materials received 50 or 100 years ago may be among the most important and valuable materials in your library. Remember, we are talking about everything made before just 2 or 3 years ago. That is quite a bit. If you make those records a lot harder to find, you automatically make the materials they describe harder to find. And as a result, the collection itself is less useful. Therefore, the information in a library catalog is fundamentally different from the information in most other databases.

Library catalogs have always been based on the rule of consistency, and I still have seen nothing at all that replaces that. For instance, linked data is still based on putting in links consistently. If the information in the records is inconsistent (and adding relator information etc. only to the new records is a perfect example of that), that makes it at least a whole lot harder to find the earlier records–and therefore, the materials they describe.

Perhaps it works in some databases better than others, but absolutely not in a library catalog. If we change our new records, we must change our old ones or people won’t find them (or at least it will be a lot harder and more confusing). We either care about that or we don’t. If we care, this means massive retrospective conversions, and in our dwindling cataloging departments, we must confess that that means it will never be done. That is a simple fact.

As Mac has said: our records as they are right now could be much more useful to people than they are, but that is a task for the IT people to change the catalog “innards”. One part would be to make the authority records actually useful again.

Now, to return to Bibframe etc. Sure, we can and should change our format (should have been done at least 15 years ago), but that has much more to do with not being stuck in traditional ILMSs, and being able to use cheaper, more powerful tools (e.g. Lucene uses MARCXML) and making them more available for non-library uses.

But that is another topic.


by James Weinheimer at September 08, 2014 09:15 PM

ACAT Authorities and references

Posting to Autocat

On 08/09/2014 15.40, Brian Briscoe wrote:

How do we catalogers get invited to meet with ILS developers so that we can work together to create the kind of tools that will really serve the needs of our users?


On 08/09/2014 15.48, Scott R Piepenburg wrote:

As Brian writes ” How do we catalogers get invited to meet with ILS developers so that we can work together to create the kind of tools that will really serve the needs of our users?” Easy! Write the checks.

Yes, and therein lies the real difficulty. I like to think that I can speak both languages (catalog-speak and tech-speak), and I have succeeded in interesting IT developers. But that is not nearly enough, because it is their supervisors who have to be interested! Plus, it is not the catalogers who have control of the money to pay them; catalogers (or somebody) must first convince the library administrators to cut loose the funds. THEN, in the case of a normal ILS, someone else (neither the catalogers nor the library administrators) must convince the ILS company administrators to work on it. Achieving all of that is anything but easy.

That is why I have placed at least some hopes on the open-source software movement. If someone wants an open-source catalog to work in a certain way, you can pay the money and it will be done. Or, if you can get a developer interested and it is not too much work, someone might actually do it on their own time. But as I said, it’s tough–even with the best of intentions–when everybody has too much to do already and is working flat out.

I have found that demonstrating the power of authorities, which seems so obvious to me, is something very abstract for others to grasp. I suspect that most people alive today have never experienced it, while others have forgotten. Plus, the new ways are popular. When I have tried to demonstrate how authority control could work for us today, I have been reduced to showing how it worked in card and book catalogs, which makes me look like the biggest Luddite ever to walk the face of the earth. It turns people off immediately, and I know it. To get some possible movement, I think there needs to be a small prototype so that people can see how it might help them.

At least building a prototype is possible, now that the LC authorities are available for download and manipulation. So it could be done–it is conceivable. But the first step is to get people (i.e. catalogers) to see that the catalog really is broken; that it has been broken for a long time; and that this is too bad–but it really and truly can be fixed. It really can.

Unfortunately, the cataloging community is currently focused on rule changes, format changes, and striving for the universe of FRBR and linked data, as if that will make the real difference. That takes tremendous resources, time, money and so on from shrinking departments. I don’t know if anything will come from the cataloging community. At least not anytime soon.


by James Weinheimer at September 08, 2014 02:31 PM

September 07, 2014



The survey I distributed to get more information on metadata services asked respondents to explain why they didn’t provide metadata services, if that was the case at their institution. Interestingly, responses were split between those still in the planning phase and those who said their units were still traditional cataloging units. Do you know what a traditional cataloging unit is? I asked a couple of my colleagues. Answers varied but tended to focus on people who have been in the unit “a long time”, people averse to change, or people always just quoting AACR2 rules. Well, I found these answers not exactly helpful. So I asked myself: what is a traditional cataloging unit, and does such a thing even exist? When I really thought about it, I don’t think there are “traditional” units out there. Let me explain.

If not all, then a large majority of cataloging units have undergone wide-ranging changes that have affected how we do business, who we do business with, job expectations, and resources. These changes have often been implemented under business models of “do more with less”, “do less with less”, or even “we’ll do what we can”. Cataloging units are used to change. They have had to adapt to changing standards, changing technology, new formats, fewer staff and so much more. On top of this, cataloging is often not taught, or is sidelined, in many library school programs. Typical open positions often demand much from new employees in terms of experience and skill level. One reason is that many departments just don’t have the resources to provide adequate training even for current staff. That is, if you can renew a position or have a new position approved at all, which seems to be something of a miracle occurrence in some places. All of this is to say that cataloging units have had to weather many storms. Staff have had to adapt, think out of the box, be flexible. This seems far from “traditional” in the sense of long-standing activities done repeatedly over a long period of time.

But my survey question was aimed specifically at why metadata services were not being offered. Not in all cases, but in many, metadata services relate to digital projects or projects oriented to helping content creators, knowledge experts or others involved in research projects and making content (not necessarily library resources) accessible and discoverable to the world. Such projects tend to involve the digital humanities, research data, digital repositories and digital collections. The users go beyond staff to incorporate members of the community. Materials to be described are more varied and include more formats such as data sets. The technologies in play tend to be varied from vended solutions such as CONTENTdm to open source solutions such as Fedora/Islandora or Omeka.

Is it because a cataloging unit is traditional that staff in the unit will not provide metadata services? I would venture to say no, because the picture is much more complex. There are cataloging units so bare-bones that even learning RDA is a problem. There are some institutions that don’t have a cataloging unit at all because everything is outsourced. In many cases, there simply are no resources to dedicate to training and skill building. On the positive side, there are other units that can devote time to training and skill building. But this takes place on top of everything else. Staff are being asked to handle a variety of complex tasks and take on a new set of equally complex tasks. It makes for a busy and often overworked unit. There might be some units that are competing for resources and need to make very good justifications for training, skill building, or keeping or getting a position. Often justification is a give and take. Many job descriptions I’ve read fall victim to listing everything but the kitchen sink. It seems many want a super catalog/metadata librarian who can program, catalog serials, work in discovery systems, and design repository systems. Why? Most likely, their institution needs programmers, web designers, and more catalog and metadata librarians. So why not ask for a little bit of everything and see what you get? The problem is that this is a lot to ask, especially when many units are already understaffed and overextended. There is nothing easy in developing and creating new services.

Are metadata services not incorporated because the unit is traditional? Not really. Is it because the unit might be understaffed and overextended? For many this is a reality. In those cases it becomes key to promote to management the value of cataloging and the need for new positions. Is it because catalogers are averse to change? Well, this has more to do with people than with a profession. Being averse to change is not a trademarked weakness of cataloging librarianship–thankfully. Is it because of a lack of resources? Yes, definitely. It then becomes a question not of how to move beyond traditions but of how to get the resources you need to continue to grow. These resources might include tapping into a staff member who can think out of the box or take change in different directions. But how do you get those resources? And how do you juggle a full plate with another plate called metadata services? Next up, I’ll share some of the respondents’ answers. So stay tuned.

Filed under: cataloging, Metadata Tagged: metadata services, traditional catalog units

by Jen at September 07, 2014 08:26 PM

Resource Description & Access (RDA)

Question on RDA Relationships asked at Google+ Community "RDA Cataloging"

Question on Relationships in Resource Description & Access on Google+ Community RDA Cataloging - Answers from Experts at RDA-L

I face a difficulty when recording a work relationship for a work which is a review of a critical edition of another work.

Ideally, I would like to be able to connect between a work (the review) and the expression of another work (the critical edition).

Here is an example:

WORK A: Baalbaki, Ramzi, born 1951. [Review of] Harun's edition of Sibawayhi's Kitab (2010)

WORK B: Sībawayh, died 796?. Al-Kitāb (ca. 780)

EXPRESSION B1: Sībawayh, died 796?. Al-Kitāb (ca. 780). (Critical edition by H. Derenbourg, 1881‒1885). Text. Arabic

EXPRESSION B2: Sībawayh, died 796?. Al-Kitāb (ca. 780). (Critical edition by A. S. Hārūn, 1966‒1977 ). Text. Arabic

WORK A is a review of EXPRESSION B2.

What do you suggest I should do? It seems that the relationship designators (here, "review of") apply either between two works or between two expressions but not between a work and the expression of another work.

But here, WORK A itself, not one of its expressions, is a review of EXPRESSION B2. How should I record this relationship?


Response by Heidrun Wiesenmüller, Professor of library science at the Stuttgart Media University (Germany)


I have no easy solution, but I agree there is an oddity here.

RDA indeed seems to restrict relationships to work-work and expression-expression, with no "cross-overs". Whereas this may be o.k. for other types of relationships, it doesn't fit descriptive relationships of the type you mentioned.

You're quite right to think that the review of a certain edition is a work in its own right. Of course, the review work has at least one expression of its own. But it would be weird to record the relationship only between one expression of the review work and the reviewed edition. This would miss the point that *all* expressions of the review work describe the reviewed edition. So the relationship should indeed be recorded between a work (the review) and an expression of another work.

It's interesting to compare the situation on expression level to the descriptive relationships on the level of manifestation and item (J.4.4 and J.5.2):
- description of (manifestation) A manifestation described by a describing work.
- description of (item) An item described by a describing work.

So in these cases, RDA allows recording a relationship between a describing work and a manifestation, and between a describing work and an item. It seems to me that the same should be possible for a relationship between a describing work and an expression. So perhaps J.3.3 needs to be changed on the pattern of J.4.4 and J.5.2 (I assume a proposal would be needed for this).

By the way, one might argue that these "descriptive relationships" between group 1 entities are really subject relationships, and shouldn't be here at all. They do stand out rather.
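To picture the cross-over being discussed: in a plain graph data structure nothing prevents a link from a work to an expression of another work; the restriction lies in the RDA designator definitions, not in the data model. A minimal sketch, using invented local identifiers rather than real authority IDs:

```python
# Entities from the example, keyed by illustrative local identifiers.
entities = {
    "workA": "Baalbaki, Ramzi, born 1951. [Review of] Harun's edition of Sibawayhi's Kitab (2010)",
    "exprB2": "Sibawayh, died 796?. Al-Kitab (ca. 780). (Critical edition by A. S. Harun, 1966-1977). Text. Arabic",
}

# A work-to-expression link is just another triple; the data structure
# itself imposes no work-work / expression-expression symmetry.
triples = [("workA", "review of (expression)", "exprB2")]

def related(subject, predicate):
    """All objects linked from a subject via a given relationship designator."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(related("workA", "review of (expression)"))  # ['exprB2']
```

In other words, the fix Heidrun suggests is a rule change, not a technical one: systems already store such links trivially.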



Response by Gordon Dunsire, Chair, Joint Steering Committee for Development of RDA

Salman, Heidrun and others
These issues are addressed in a submission from the JSC Technical Working Group to the JSC for its November 2014 meeting:

6JSC/TechnicalWG/3  High-level subject relationship in RDA



Gordon Dunsire
Chair, Joint Steering Committee for Development of RDA


Response by Heidrun Wiesenmüller, Professor of library science at the Stuttgart Media University (Germany)

Thanks, Gordon, that's great.
I haven't worked my way through all the documents yet, so had missed the fact that the proposal is already there :-)

Do I understand correctly that the descriptive relationship designators (with the proposed revisions and additions) would stay in Appendix J, i.e. still belong to section 8 (Relationships between works, expressions, manifestations, and items)? So there is no plan to move them to section 7 (subject relationships)?



Response by Gordon Dunsire, Chair, Joint Steering Committee for Development of RDA

The JSC will discuss any improvements to RDA Toolkit associated with the Technical Working Group's recommendations in due course. There are no current plans to move the designators - but I think there are other submissions affecting relationship designators (I haven't had time myself to read all of the submissions).



Gordon Dunsire

Chair, Joint Steering Committee for Development of RDA


Response by Robert L. Maxwell, Ancient Languages and Special Collections Cataloger, Harold B. Lee Library, Brigham Young University

I would do:

Review of (expression): Sībawayh, ʻAmr ibn ʻUthmān, active 8th century. Kitāb. Arabic (Hārūn)

It should also be possible to use this access point in a subject field.

RDA doesn’t explicitly recognize a relationship between a work and an expression, but they happen all the time. In my opinion we should be able to record relationships between any FRBR entity and any other FRBR entity, as appropriate.



RDA Blog thanks Heidrun Wiesenmüller, Gordon Dunsire, and Robert L. Maxwell for their valuable remarks.

by Salman Haider at September 07, 2014 09:10 PM

September 06, 2014

First Thus

ACAT Authorities and references

Posting to Autocat

On 9/6/2014 12:17 AM, Brian Briscoe wrote:

… speed has somehow become as, if not more, important than the quality of the results of a search. The quality of results has become secondary. Users are accustomed to a different way of searching, even though it is less precise and often returns poorer results. We must counter with a product that comes close to the same speed; is similar enough to the “Google box” that it is not repulsive; and will show the user the precision and quality that they are unaware they are not getting with keyword.
IMO, the way to do so is not to bury our authority records in the way that I see so many ILSs doing. I suspect we do so to look more like a keyword search, but I think that strategy is the quickest way to make authorities unimportant to our users, directors and funders.

We have to find a way to steer the direction that our vendors are going.

I completely agree, but I will say that the traditional catalog–when people knew how to work with it as they were supposed to–really could save the searcher’s time, though of course that didn’t mean it was perfect. OPACs have never really allowed these powers to be utilized. And that is why I have maintained that our current catalogs are broken and have been for a long, long time.

As a very simple example, let us suppose that I am interested in battles of WWII. I can search keyword “battles wwii” (what else would a normal person–scholar or not–search today?) and get lots of records to work with and may be “happy” enough.

But, our “information literacy” programs say that people are supposed to know how the catalog works and should therefore realize that this is a very poor search. They should know that the correct search is:

World War, 1939-1945–Aerial operations. (and/or)
World War, 1939-1945–Campaigns. (and/or)
World War, 1939-1945–Naval operations.

How are people to know this? Because of the directions found in the authority records, where the invaluable cross-reference “World War, 1939-1945–Battles, sieges, etc.” exists as a 450 cross-reference under each of these headings, e.g. under “World War, 1939-1945–Aerial operations”; and once there, you also find some handy references to other headings (Aeronautics, Military; Air warfare; Bombing, Aerial; Naval aviation) that would probably never enter into someone’s head. The same goes for “Campaigns” and “Naval operations”. As a result, this saves time compared to a mere keyword search, where you wind up looking at one record after another after another….

This was how “information search and retrieval” took place before computerization of full text and algorithms. It worked this way in other places than library catalogs: indexes found in the back of books worked this way, the yellow pages in the phone books and lots of other places, although the structures there were much less extensive. Of course, it wasn’t perfect, but that is how the tools were (and still are) designed to work. You take away the cross-references and handy notes, of course people will flail around.
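A toy sketch of the mechanism (an invented see-reference table standing in for real 450 fields) shows how a catalog could surface these cross-references from an ordinary keyword-style query, instead of demanding an exact left-anchored heading:

```python
# Invented see-reference data modelled on 450 fields: an unauthorized form
# points at the authorized headings the searcher should actually use.
see_references = {
    "World War, 1939-1945--Battles, sieges, etc.": [
        "World War, 1939-1945--Aerial operations",
        "World War, 1939-1945--Campaigns",
        "World War, 1939-1945--Naval operations",
    ],
}

def expand(query):
    """Match a loose keyword query against cross-reference forms and
    return the authorized headings they point to."""
    suggestions = []
    for variant, authorized in see_references.items():
        if all(word in variant.lower() for word in query.lower().split()):
            suggestions.extend(authorized)
    return suggestions

# A plain keyword-ish search surfaces the three authorized headings
# without the user ever knowing the authority syntax.
print(expand("world war battles"))
```

The matching here is deliberately naive; a real implementation would use the same full-text indexing the rest of the catalog already has. The point is only that the cross-reference data exists and could be put in the user's path.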

The very few times I have managed to get a non-cataloger to understand how the system is supposed to work, they have suddenly become much, much less enamored of those full-text, algorithmic search results; in fact, they have asked why these wonderful cross-references are so hard to find in library catalogs, and why general search tools such as Google don’t implement these methods that actually could–and they see would–save their time! Good questions, and I have no answers!

How are people today supposed to find the cross-reference for wwii battles? By doing something very strange: by searching for subject (!), and entering exactly: “world war 1939-1945–battles”. The chances of that happening probably come close to the proverbial roomful of monkeys randomly typing out the complete works of Shakespeare!

I still maintain that people would actually like and use our tools IF they were reconfigured to work for the 21st century and not for the 19th. Until we admit that our catalogs are broken in fundamental ways and must be fixed (because they CAN be fixed), any changes to our rules (adding relator codes and spelling out abbreviations), adding FRBR relationships, changing formats, and adding linked data until it’s coming out of our ears will end up making zero difference to the public until the catalogs themselves are fixed. We can do all of those things and our catalogs will still be broken. If the only way people can get into our authority structures is by doing left-anchored text string searches, then our authority records are fated to be forever irrelevant.

But catalogers CANNOT fix this, because the problem is not with the rules, the formats or anything else that catalogers work with. The problem is in the “innards” of the catalog, and it is the systems people who must fix it, so that our authority records can actually be found in the ways people search in the 21st century–instead of pretending that there are no real problems because people can just search by left-anchored text strings. Nobody does that. And even then it doesn’t work–not even for me–as I demonstrated in (I am ever shameless!) Cataloging Matters no. 18: Problems with Library Catalogs. It just DOESN’T WORK! That means it is broken–not broken beyond repair, because it can be fixed–but until this is accepted there will be no attempts toward fixing it.

I don’t know what would work, although I have some ideas that could be shown right or wrong in the end. To figure out what would work would demand study, thought, consideration and reconsideration as to what would be the most useful for the public today. And not only should catalogers be involved, but the entire community. The catalog is for them and not just for us.

I can only hope that these reconsiderations will happen eventually.


by James Weinheimer at September 06, 2014 10:17 PM

September 05, 2014


Looking for data tricks in Libraryland

IFLA 2014 Annual World Library and Information Congress Lyon – Libraries, Citizens, Societies: Confluence for Knowledge


After attending the IFLA 2014 Library Linked Data Satellite Meeting in Paris I travelled to Lyon for the first three days (August 17-19) of the IFLA 2014 Annual World Library and Information Congress. This year’s theme “Libraries, Citizens, Societies: Confluence for Knowledge” was named after the confluence or convergence of the rivers Rhône and Saône where the city of Lyon was built.

This was the first time I attended an IFLA annual meeting and it was very much unlike all conferences I have ever attended. Most of them are small and focused. The IFLA annual meeting is very big (but not as big as ALA) and covers a lot of domains and interests. The main conference lasts a week, including all kinds of committee meetings, and has more than 4000 participants and a lot of parallel tracks and very specialized Special Interest Group sessions. Separate Satellite Meetings are organized before the actual conference in different locations. This year there were more than 20 of them. These Satellite Meetings actually resemble the smaller and more focused conferences that I am used to.

A conference like this requires a lot of preparation and organization. Many people are involved, but I especially want to mention the hundreds of volunteers who were present not only in the conference centre but also at the airport, the railway stations, on the road to the location of the cultural evening, etc. They were all very friendly and helpful.

Another feature of such a large global conference is that presentations are held in a number of official languages, not only English. A team of translators is available for simultaneous translations. I attended a couple of talks in French, without translation headset, but I managed to understand most of what was presented, mainly because the presenters provided their slides in English.

It is clear that you have to prepare for the IFLA annual meeting and select in advance a number of sessions and tracks that you want to attend. With a large multi-track conference like this it is not always possible to attend all interesting sessions. In the light of a new data infrastructure project I recently started at the Library of the University of Amsterdam, I decided to focus on tracks and sessions related to aspects of data in libraries in the broadest sense: “Cloud services for libraries – safety, security and flexibility” on Sunday afternoon, the all-day track “Universal Bibliographic Control in the Digital Age: Golden Opportunity or Paradise Lost?” on Monday, and “Research in the big data era: legal, social and technical approaches to large text and data sets” on Tuesday morning.

Cloud Services for Libraries

It is clear that the term “cloud” is a very ambiguous term and consequently a rather unclear concept. Which is good, because clouds are elusive objects anyway.

In the Cloud Services for Libraries session there were five talks in total. Kee Siang Lee of the National Library Board of Singapore (NLB) described the cloud-based NLB IT infrastructure, consisting of three parts: a private, a public and a hybrid cloud. The private (restricted-access) cloud is used for virtualization; an extensive service layer for discovery, content and personalization; and “Analytics as a service”, which is used for pushing and recommending related content from different sources and in various formats to end users. This “contextual discovery” is based on text-analytics technologies across multiple sources, using a Hadoop cluster on virtual servers. The public cloud is used for the Web Archive Singapore project, which is aimed at archiving a large number of Singapore websites. The hybrid cloud is used for what is called the Enquiry Management System (EMS), where “sensitive data is processed in-house while the non-sensitive data resides in the cloud”. It seems that in Singapore, “cloud” is just another word for a group of real or virtual servers.

In the talk given by Beate Rusch of the German Library Network Service Centre for Berlin and Brandenburg KOBV the term “cloud” meant: the shared management of data on servers located somewhere in Germany. KOBV is one of the German regional Library Networks involved in the CIB project targeted at developing a unified national library data infrastructure. This infrastructure may consist of a number of individual clouds. Beate Rusch described three possible outcomes: one cloud serving as a master for the others, a data roundabout linking the other clouds, and a cross cloud dataspace where there is an overlapping shared environment between the individual clouds. An interesting aspect of the CIB project is that cooperation with two large commercial library system vendors, OCLC and Ex Libris, is part of the official agreement. This is of interest for other countries that have vested interests in these two companies, like The Netherlands.

Universal Bibliographic Control in the Digital Age

The Universal Bibliographic Control (UBC) session was an all-day track with twelve very diverse presentations. Ted Fons of OCLC gave a good talk explaining the importance of the transition from the description of records to the modeling of entities. My personal impression lately is that OCLC, all in all, has been doing a good job with linked-data PR, explaining the importance and the inevitability of the semantic web for libraries to a librarian audience without using technical jargon like URI, ontology, dereferencing and the like. Richard Wallis of OCLC, who was both at the IFLA 2014 Linked Data Satellite Meeting and in Lyon, is spreading the word all over the globe.

Of the remaining talks, the most interesting were given in the afternoon. Anila Angjeli of the National Library of France (BnF) and Andrew MacEwan of the British Library explained the importance, similarities and differences of ISNI and VIAF, both authority files with identifiers used for people (both real and virtual). Gildas Illien (also one of the organizers of the Linked Data Satellite Meeting in Paris) and Françoise Bourdon, both BnF, described the future of Universal Bibliographic Control in the web of data, a development closely related to the topic of the talks by Ted Fons, Anila Angjeli and Andrew MacEwan.

The ONKI project, presented by the National Library of Finland, is a very good example of how bibliographic control can be moved into the digital age. The project entails the transfer of the general national library thesaurus YSA to the new YSO ontology: from libraries to the whole public sector, and from closed to open data. The new ontology is based on concepts (identified by URIs) instead of monolingual text strings, with multilingual labels and machine-readable relationships. Moreover, the management and development of the ontology is now a distributed process. On top of the ontology, the new public online Finto service has been made available.
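To make the shift concrete, here is a minimal sketch in Python of what a concept-based thesaurus entry looks like compared with a monolingual text string: the identity lives in the URI, while labels in any number of languages and machine-readable relationships hang off it. The URI and labels below are invented examples, not actual YSO data.

```python
# Hypothetical concept entry in the spirit of YSO: the identity is a URI,
# not a text string; labels are multilingual; relationships are explicit.
concept = {
    "uri": "http://www.yso.fi/onto/yso/p0000",  # hypothetical identifier
    "prefLabel": {"fi": "kirjastot", "sv": "bibliotek", "en": "libraries"},
    "broader": ["http://www.yso.fi/onto/yso/p0001"],  # hypothetical parent concept
}

def label(concept, lang, fallback="en"):
    """Return the preferred label in the requested language, falling back to English."""
    labels = concept["prefLabel"]
    return labels.get(lang, labels[fallback])

print(label(concept, "fi"))  # kirjastot
print(label(concept, "de"))  # no German label, so falls back to: libraries
```

Because the concept is identified by the URI rather than by any one label, adding a new language, or changing a label, never breaks references to the concept itself.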

The final talk of the day, “The local in the global: universal bibliographic control from the bottom up” by Gordon Dunsire, applied the “Think globally, act locally” aphorism to Universal Bibliographic Control in the semantic web era. Universal top-down control should make way for local bottom-up control. There are so many old and new formats for describing information that we are facing a new biblical confusion of tongues: RDA, FRBR, MARC, BIBO, BIBFRAME, DC, ISBD, etc. What is needed is a number of translators between local and global data structures. On a logical level: Schema Translator, Term Translator, Statement Maker, Statement Breaker, Record Maker, Record Breaker. These black boxes are a challenge to developers. Indeed, the mapping and matching of data of various types, formats and origins are vital in the new web of information age.
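As a rough illustration of what two of these black boxes might do, here is a toy Record Breaker and Record Maker in Python. The field names and record identifier are invented for illustration; real implementations would work on MARC, RDA or BIBFRAME data.

```python
def break_record(record_id, record):
    """Record Breaker: decompose one record into atomic
    (subject, predicate, object) statements."""
    return [(record_id, field, value) for field, value in record.items()]

def make_record(record_id, statements):
    """Record Maker: reassemble all statements about one subject into a record."""
    return {pred: obj for subj, pred, obj in statements if subj == record_id}

# Invented example record, not a real format:
record = {"title": "Mothers of the Novel", "creator": "Spender, Dale"}
stmts = break_record("rec:1", record)
assert make_record("rec:1", stmts) == record  # round trip preserves the record
```

The point of the exercise: once records are broken into statements, the Statement Maker/Breaker and Term Translator boxes can operate on individual statements, mapping local vocabularies to global ones one triple at a time.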


Research in the big data era

The Research in the big data era session had five presentations on essentially two different topics: data and text mining (four talks) and research data management (one talk). Peter Leonard of Yale University Library started the day with a very interesting presentation of how advanced text-mining techniques can be used for digital humanities research. Using the digitized archive of Vogue magazine, he demonstrated how the long-term analysis of the statistical distribution of related terms, like “pants”, “skirts”, “frocks”, or “women”, “girls”, can help visualise social trends and identify research questions. There are a number of free tools available for this, like Google Books N-Gram Search and Bookworm. To make this type of analysis possible, researchers need full access to all data and text. However, rights issues come into play here, as Christoph Bruch of the Helmholtz Association, Germany, explained. What is needed is “intelligent openness” as defined by the Royal Society: data must be accessible, assessable, intelligible and usable. Unfortunately, European copyright law stands in the way of the idea of fair use. Many European researchers are forced to perform their data analysis projects outside Europe, in the USA. The plea for openness was also supported by LIBER’s Susan Reilly. Data and text mining should be regarded as just another form of reading, one that doesn’t need additional licenses.
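The core of the term-distribution technique Leonard described can be sketched in a few lines of Python: for each year, count occurrences of tracked terms relative to the total number of words. The two-"year" corpus below is invented; a real study would run over millions of digitized pages and would normalise and tokenise far more carefully.

```python
from collections import Counter

# Invented toy corpus: year -> text of everything published that year.
corpus = {
    1950: "skirts and frocks and frocks for women",
    1970: "pants and pants and skirts for women",
}

def term_trend(corpus, terms):
    """Relative frequency of each tracked term, per year."""
    trend = {}
    for year, text in corpus.items():
        words = text.lower().split()
        counts = Counter(words)
        total = len(words)
        trend[year] = {t: counts[t] / total for t in terms}
    return trend

print(term_trend(corpus, ["pants", "frocks"]))
```

Plotting these per-year frequencies over a long run of a single publication is what makes trends (here, the hypothetical shift from "frocks" to "pants") visible at a glance.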



IdeasBox packed

A very impressive and sympathetic library project that deserves everybody’s support was not an official programme item, but a bunch of crates, seats, tables and cushions spread across the central conference venue square. The whole set of furniture and equipment, which comes on two industrial pallets, constitutes a self-supporting mobile library/information centre to be deployed in emergency areas, refugee camps etc. It is called IdeasBox, provided by Libraries without Borders. It contains mobile internet, servers, power supplies, ereaders, laptops, board games, books, etc., depending on the circumstances, culture and needs of the target users and regions. The first IdeasBoxes are now used in Burundi in camps for refugees from Congo. Others will soon go to Lebanon for Syrian refugees. If librarians can make a difference, it’s here. You can support Libraries without Borders and IdeasBox in all kinds of ways.


IdeasBox unpacked


The questions about data management in libraries that I brought with me to the conference were only partly addressed, and actual practical answers and solutions were very rare. The flexible, efficient and system-independent management and mapping of heterogeneous and redundant types of data, from all types of sources and across all domains that libraries cover, apparently is not a mainstream topic yet. For things like that you have to attend Satellite Meetings. Legal issues, privacy, copyright, text and data mining, and cloud-based data sharing and management, on the other hand, are topics that were discussed. It turns out that attending an IFLA meeting is a good way to find out what is discussed, and more importantly what is NOT discussed, among librarians, library managers and vendors.

The quality and content of the talks vary a lot. As always, the value of informal contacts and meetings cannot be overrated. All in all, looking back I can say that my first IFLA has been a positive experience, not least because of the positive spirit and enthusiasm of all the organizers, volunteers and delegates.

(Special thanks to Beate Rusch for sharing IFLA experiences)



by Lukas Koster at September 05, 2014 12:12 PM

Books and Library stuff

Deborah Bowness bookshelf wallpaper and real Penguins

The pleasure of books, without the … is that possible?

Deborah Bowness books wallpaper


Trompe L'Oeil book wallpaper

If you can’t get enough of books, or you can’t fill your walls with bookshelves or bookcases, or maybe the children/dogs would eat them anyway, here is your answer.

Deborah Bowness bookshelf wallpaper and real Penguins

It looks beautiful and can give you the same sense of equanimity that a book collection, being surrounded by books, or walking through a Reference Library can give you.

Bibliophiles everywhere, do not scoff at this: it is surely the next best thing.

“thoughts from scarlettlibrarian!”

by venessa harris at September 05, 2014 11:35 AM

September 04, 2014

Coyle's InFormation


I've been spending quite a bit of time lately following the Wikipedia pages of "Articles for Deletion" or WP:AfD in Wikipedia parlance. This is a fascinating way to learn about the Wikipedia world. The articles for deletion fall mostly into a few categories:
  1. Brief mentions of something that someone once thought interesting (a favorite game character, a dearly loved soap opera star, a heartfelt local organization) but that has not been considered important by anyone else. In Wikipedian, it lacks WP:NOTABILITY.
  2. Highly polished P.R. intended to make someone or something look more important than it is, knowing that Wikipedia shows up high on search engine results, and that any site linked to from Wikipedia also gets its ranking boosted.
Some of #2 is actually created by companies that are paid to get their clients into Wikipedia along with promoting them in other places online. Another good example is that of authors of self-published books, some of whom appear to be more skilled in P.R. than they are in the literary arts.

In working through a few of the fifty or more articles proposed for deletion each day, you get to do some interesting sleuthing. You can see who has edited the article, and what else they have edited; any account that has only edited one article could be seen as a suspected bogus account created just for that purpose. Or you could assume that only one person in the English-speaking world has any interest in this topic at all.

Most of the work, though, is in seeing if you can establish notability. Notability is not a precise measure, and there are many pages of policy and discussion on the topic. The short form is that for something or someone to be notable, it has to be written about in respected, neutral, third-party publications. Thus a New York Times book review is good evidence of notability for a book, while a listing in the Amazon book department is not. The grey area is wide, however. Publisher's Weekly may or may not indicate notability, since they publish only short paragraphs, and cover about 7,000 books a year. That's not very discriminating.

Notability can be tricky. I recently came across an article for deletion pointing to Elsie Finnimore Buckley, a person I had never heard of before. I discovered that her dates were 1882-1959, and she was primarily a translator of works from French into English. She did, though, write what appears to have been a popular book of Greek tales for young people.

As a translator, her works were listed under "E. F. Buckley." I can well imagine that if she had used her full name it would not have been welcome on the title page of the books she translated. Some of the works she translated appear to have a certain stature, such as works by Franz Funck-Brentano. She has an LC name authority file under "Buckley, E. F." although her full name is added in parentheses: "(Elsie Finnimore)".

To understand what it was like for women writers, one can turn to Linda Peterson's book "Becoming a Woman of Letters: Myth of the Woman Writer and Facts of the Victorian Market." In it, she quotes a male reviewer of Buckley's Greek tales, which Buckley did publish under her full name. His comments are enough to chill the aspirations of any woman writer. He said that writing on such serious topics is "not women's work" and that "a woman has neither the knowledge nor the literary tact necessary for it." (Peterson, p. 58) Obviously, her work as a translator is proof otherwise, but he probably did not know of that work.

Given this attitude toward women as writers (of anything other than embroidery patterns and luncheon menus) it isn't all that surprising that it's not easy to establish WP:NOTABILITY for women writers of that era. As Dale Spender says in "Mothers of the Novel; 100 good women writers before Jane Austen":
"If the laws of literary criticism were to be made explicit they would require as their first entry that the sex of the author is the single most important factor in any test of greatness and in any preservation for posterity." (p. 137)
That may be a bit harsh, but it illustrates the problem that one faces when trying to rectify the prejudices against women, especially from centuries past, while still wishing to provide valid proof that this woman's accomplishments are worthy of an encyclopedia entry.

We know well that many women writers had to use male names in order to be able to publish at all. Others, like E.F. Buckley, hid behind initials. Had her real identity been revealed to the reading public, she might have lost her work as a translator. Of late, J.K. Rowling has used both techniques, so this is not a problem that we left behind with the Victorian era. As I said in the discussion on Wikipedia:
"It's hard to achieve notability when you have to keep your head down."

by Karen Coyle at September 04, 2014 07:04 PM

025.431: The Dewey blog

Dewey by the Numbers

Here’s a brief snapshot of the DDC 23 EN database (the database associated with the English-language version of DDC 23) as of 1 September 2014:

Dewey by the Numbers (2014-09-01)

by Rebecca at September 04, 2014 04:20 PM

September 03, 2014

TSLL TechScans

Cataloger's Desktop interface changes and training resources.

The Library of Congress is rolling out a new interface for the Cataloger's Desktop product on September 10, 2014. The new interface will be simpler and cleaner; the focus will be on search and retrieval instead of table-of-contents browsing. The text of the announcement is posted at

The Cataloger's Desktop Training and Tutorials page provides links to presentation slides and recordings from recent webinars, plus links to "Quick Tips" documentation.

by (Jackie Magagnosc) at September 03, 2014 03:20 PM

First Thus

ACAT Quality of MLS education cataloging

Posting to Autocat

On 02/09/2014 19.43, Kevin M Randall wrote:

It seems like the typical route now to a professional cataloger position is to learn a lot of cataloging basics as a paraprofessional, then go to library school to get the degree to be qualified for a librarian position. (But of course, there’s the question of how many professional positions will be available…)

This is the point. One real concern of mine is that, as I mentioned earlier, library school is there to train the librarians of the future. Those are the people who will be the leaders of the library field. If they lack an appreciation, an understanding, of the library’s catalog, i.e. what is–by far–the most complex tool created by the library (putting to shame all spreadsheets and library guides etc.); if they don’t understand why it is made the way it is and the reasons the people who make the catalog do the things that they do; if the decision makers do not understand, or do not appreciate it, then the future looks very difficult.

I have read quite often that the catalog will disappear, but as long as there is a physical collection to manage (and it looks as if physical collections will be around for quite a while yet), I haven’t seen even any suggestions for anything to replace the library catalog. You just can’t do it with a spreadsheet! Of course, the problem with this is that, as the digital materials available to the public continue to grow exponentially faster than the printed materials any library could provide, we face a worrisome trajectory.

The big question is: what will be the purpose of the catalog? Will it be a tool for “information discovery” that is, to help people discover intellectually what may be of interest to them in the collection (whatever “the collection” comes to mean). Or will “information discovery” take place using other tools (full-text, driven by algorithms on correlations based not only on texts but your “likes” your “connections” your browsing history, and what some would call “invasions of privacy”) while the catalog becomes a tool just for finding and retrieving the resources found in those other places? In other words, borrowing from Cutter’s Objectives:

1. To enable a person to find a book of which either

A. the author
B. the title
C. the subject

is known

2. To show what the library has

D. by a given author
E. on a given subject
F. in a given kind of literature

3. To assist in the choice of a book

G. as to its edition (bibliographically)
H. as to its character (literary or topical)

The library catalog could allow only for objectives 1.A and 1.B, while leaving the rest to other methods. I have seen early catalogs that did just that (author/title with no authority control). Supplement it with basic descriptions (publication information and extent) and you would have a run-of-the-mill inventory tool. It would be a lot simpler to make. (Cheaper too!) Training staff could take a week or so, much as in a stockroom.

I think that it is inevitable that the decision concerning what should be the catalog’s purpose will be made eventually. Of course, we must realize that catalogers will not be the ones to make that decision–it will be made by the “movers and shakers” (library big-wigs, directors, and lots of non-librarians). It would be nice if they had some kind of understanding what the catalog is and what it does when they make those decisions, but as I have pointed out several times, it is so hard for people to use our catalogs correctly that a positive answer is far from certain.

Interesting times.


by James Weinheimer at September 03, 2014 12:20 PM

Books and Library stuff

battered books



Rare books cataloguing experience

I have recently had the joy of learning and performing some rare books cataloguing, and this is what I would like to share:

As with a lot of things, it is important to be equipped with context. I hold my hands up to being a “context person” anyway, and I have learnt the hard way that I don’t cope well with being given tasks that are tantamount to a factory line with no meaning. With rare books cataloguing, this means having knowledge and understanding of the printing process during the hand-press period (approx. 1450–1850), and I would, without doubt, always advise anyone beginning their training to start here. So here’s my little list for this task (which I may add to as I learn!):

  • knowledge and understanding of printing process during hand-press period
  • knowledge of Roman numerals e.g. M.CD.XLII
  • knowledge of the way names were shortened e.g. James – Jas
  • knowledge of spelling history
  • knowledge and understanding of paper – the history of how it is made
  • knowledge of parts of the book, without doubt via John Carter's ABC for Book Collectors
  • knowledge of book conservation and preservation
  • some understanding of provenance (just because the subject is too wide)
  • knowledge of and appreciation of marginalia
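As a small aside on the Roman numerals item in the list above: pointed dates such as M.CD.XLII (the points were common on early title pages) can be decoded mechanically once the points are stripped. A quick Python sketch, assuming well-formed numerals only:

```python
def roman_to_int(s):
    """Convert a Roman numeral, possibly pointed (M.CD.XLII), to an integer."""
    values = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
    s = s.replace(".", "").upper()  # drop the points used in early imprints
    total = 0
    for i, ch in enumerate(s):
        v = values[ch]
        # Subtractive notation: a smaller numeral before a larger one is subtracted.
        if i + 1 < len(s) and values[s[i + 1]] > v:
            total -= v
        else:
            total += v
    return total

print(roman_to_int("M.CD.XLII"))  # 1442
```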

There are a few significant differences between regular cataloguing and rare books cataloguing. The thing that stands out the most for me initially was the absolutely essential need to check every single page in a rare book. This is done for pagination, signatures, catchwords and any marginalia or damage. The second thing for me was the need to check the format, using *Gaskell to do so: looking at the paper, checking the direction of the chain lines, measuring the book, and checking both of these against the signatures to ascertain the format, i.e. folio, quarto, octavo or duodecimo. Finally, it is essential to note any provenance marks. Obviously, because it is a bit of a passion of mine, I would personally also make notes about the binding.
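The last step of the format check, deciding folio versus quarto versus octavo, ultimately comes down to leaves per gathering once the signatures have been collated: folding a sheet once gives 2 leaves (folio), twice gives 4 (quarto), and so on. A trivial lookup sketch in Python, which deliberately ignores the paper and chain-line evidence that Gaskell's full method requires:

```python
# Leaves per gathering -> bibliographical format (standard foldings only).
FORMATS = {2: "folio", 4: "quarto", 8: "octavo", 12: "duodecimo"}

def format_from_leaves(leaves_per_gathering):
    """Infer the format from collated signatures; 'unknown' for irregular gatherings."""
    return FORMATS.get(leaves_per_gathering, "unknown")

print(format_from_leaves(4))  # quarto
```

In practice the chain lines and watermark position are what confirm the answer; the leaf count alone can mislead when gatherings were made up from half-sheets.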

Books as material culture

Material culture is the physical evidence of a culture in the objects and architecture it makes, or has made, and I strongly believe that books have a role to play here. Every piece of the composition of a book during the hand-press period can tell us something about the social culture of the day. Why is that, you ask? When we think of evidence of a social culture, what springs to mind is consumerism, education, and class. But it actually encompasses the relationships between people and their things: the making, history, preservation, and interpretation of objects. Now, all you bibliophiles, think about that for a moment: the relationship between you and your books. This is a subject that I am passionate about, so I will talk more about it in another post, but in the meantime, if anyone has any thoughts on it, I’d be happy to hear them.


battered books

“thoughts from scarlettlibrarian”


*Gaskell, P. (1985) A New Introduction to Bibliography. New York: OUP

by venessa harris at September 03, 2014 08:32 AM

Terry's Worklog

MarcEdit 6 Update

I’ve just posted a new update to MarcEdit. It fixes the following three issues:

  • Check URL crashes when running… this has been fixed.
  • Delimited Text Translator doesn’t show a finishing message… fixed.
  • Debugging message box shows when processing mnemonic files not using MarcEdit’s documented format… fixed.

In addition to these three bug fixes, MarcEdit now includes a new tool called MARCNext for testing BibFrame principles. Please note, the BibFrame Testbed currently *does not* work on the Mac platform under Mono. This is due to an incompatibility between the current version of Saxon and the runtime. It appears that downgrading the version will correct the problem, but I need to make sure there are no unforeseen issues. I’ll be working to correct this during the week.

I’ve recorded a couple of videos documenting the new functionality. You can find them here:

You can download the update via MarcEdit’s automated update tool or view the MarcEdit downloads page at:


by reeset at September 03, 2014 02:58 AM

September 02, 2014

OCLC Cataloging and Metadata News

Shanghai Library adds 2 million records to WorldCat to share its collection with the world

Shanghai Library, the largest public library in China and one of the largest libraries in the world, has contributed 2 million holdings to WorldCat, including some 770,000 unique bibliographic records, to share its collection worldwide.  

September 02, 2014 05:30 PM