The 23rd SIG/CR workshop on classification research featured papers, lightning talks, brief presentations of doctoral projects and two keynote talks, all exploring what’s new in the field. Under the theme of new approaches with a historical focus, presenters explored novel theories, models and applications, approaches to building classificatory structures, methods and criteria for evaluation and much more. Classification theory, concepts and terminology were considered from a historical perspective, and new theories and changes in conceptualization and classification structures were raised. Modern perspectives on classification include folksonomies, personal classification practices, power structures captured through classification and the limitations of standardization. Researchers discussed cognitive processes involved in classifying, the evolution of concepts associated with terms and sources for new terms in a domain. Through the variety of presentations, it was clear that classification encompasses a broad array of topics, ultimately serving information retrieval and access.

The 23rd Annual SIG/CR Classification Research Workshop: A Report

by Jonathan Furner

A large and appreciative audience gathered in Baltimore, Maryland, on Friday, October 26, 2012, for the 23rd Annual SIG/CR Classification Research Workshop. The day’s program had been coordinated by the workshop co-chairs, Kathryn La Barre, University of Illinois at Urbana-Champaign, and Joseph Tennis, University of Washington. Ten full papers were accepted for presentation, with 15 minutes allocated to each, along with nine “lightning talks” of seven minutes each and a doctoral mini-symposium in which five Ph.D. students introduced their research projects. Fran Miksa, University of Texas at Austin, and Shawne Miksa, University of North Texas, collaborated on a keynote panel, and Marjorie Hlava, Access Innovations, Inc., provided a second keynote presentation. Given the workshop’s topic, I thought it might be interesting to categorize the innovations – literally, the new things – that we heard about during the presentations and follow-up discussions. I believe that the results are indicative of the vibrancy, the richness and the complexity of the classification research field right now.

If I had to pick the one big theme of the day – and this choice should not be surprising given the thrust of the original call for papers – it would be the utility of new approaches with a historical focus. It is very appropriate in the year of ASIS&T’s 75th anniversary that we should consider it important to remember and learn the many lessons taught by the past. Simply to lump all the historically oriented approaches in a single category, however, would serve only to hide the variation in the kinds of histories that are being constructed and in the innovations being made among them.

Keynote Address
In his keynote address, “Observations on Historical Aspects of Classification Theory,” for instance, Fran Miksa’s emphasis was on the history of ideas about classification, whereas both David Dubin, University of Illinois at Urbana-Champaign, and Kathryn La Barre focused on different aspects of the history of the field of classification research. Presentations by Melissa Adler, University of Wisconsin–Madison, and K. R. Roberto, University of Illinois at Urbana-Champaign, were concerned with the histories of the concepts and terms that are the elements of classification schemes and crucially with the primary role played in shaping those histories by the decisions and actions of particular individuals. Grant Campbell, University of Western Ontario, talked about changes in classificatory structures over time, while Tennis’s interest was in the history of conceptualizations and definitions of classification. All these speakers prioritized the temporal, but in rather different ways.

More generally, I think we are continuing to see growth in the use of humanistic approaches (all imported from other disciplines, of course, but I’m not sure it could be any other way). When there is talk at the SIG/CR workshop of “the classificationist’s gaze,” you know things are going rather well in that direction. In this context the paper by Daniel Martínez-Ávila, Universidad Carlos III de Madrid, and Richard Smiraglia, University of Wisconsin-Milwaukee, on the integration of a phenomenological approach with discourse-analytic methods describes a very interesting development, and it was a disappointment that neither speaker could attend on the day. 

New Theories and Understandings
Many new theories and understandings, or relatively recently developed ones, were introduced in the papers accepted for the workshop: theories of the ways in which power structures are unavoidably reflected in classification structures (Patrick Keilty, University of Toronto; Melodie Fox, University of Wisconsin-Milwaukee); of how folksonomies have emancipatory potential (Keilty); and of the cultural, social and historical specificity of definitions of classification (Tennis), of classification schemes (lots of people) and of classification practices (Eva Hourihan Jansen, University of Toronto). We learned that standardization is not always beneficial, since its effects include undesirable decontextualization (Jansen); that personal classificatory practice is heavily influenced by social factors (Kyong Eun Oh, Rutgers University); that user-generated folksonomies are more similar to top-down schemes than previously thought (Andrea Scharnhorst, Royal Netherlands Academy of Arts and Sciences, and Richard Smiraglia); that classification (and thus information retrieval) based on analysis of the probability – or likeliness – with which a searcher will judge two objects to be related at any given moment is effective (Charles van den Heuvel, Royal Netherlands Academy of Arts and Sciences, and Richard Smiraglia); and that in any given domain many different classification structures are possible, and many of those are potentially equally useful in different ways – in other words, there is no single correct classification, even in science (Rebecca Green, OCLC Online Computer Library Center, Inc., and Giles Martin, independent consultant).

New Models and Methods
Similarly, we were treated to presentations of several new models: of the cognitive processes involved in classifying – in lumping and splitting (Oh), for example – and in learning how to classify (Shawne Miksa); of the relationship between classifying, as the act of assigning labels to classes, and other cognitive activities such as counting and writing (Tennis); of knowledge organization systems as artificial languages rather than as hierarchical trees (Scharnhorst and Smiraglia); of classes as situated in rhetorical space (Fox); and of the relationship between bibliographic/documentary classification and the scientific classification of naturally occurring phenomena (to which both Dubin and Green & Martin alluded).

Hlava reminds us how the explanatory role of theory is significant not only for the design of new kinds of classificatory structures, but also for new kinds of uses of those structures – new applications of theory to practice, in other words – in systems designed for the improvement of search, retrieval and related tasks. Meanwhile, Nicholas Weber and Andrea Thomer, both University of Illinois at Urbana-Champaign, and Gary Strand, National Center for Atmospheric Research, demonstrate the use of what Birger Hjørland calls pragmatic classification – in which classes are identified on the basis of usage – for identifying the elements in a metadata schema for climatology; while Jansen reviews the use of the concept of “boundary object” to understand classification practices. Such work may be characterized as the application of otherwise well-understood methods in new domains.

Did we hear about any new methods of building classificatory structures and knowledge organization schemes? Methods that prioritize the needs of specific groups (for example, Martínez-Ávila and Smiraglia) and that involve crowdsourcing (for example, Jane Greenberg, Angela Murillo, both University of North Carolina at Chapel Hill, and John Kunze, California Digital Library) were discussed, and part of Adler’s contribution was her account of the history of methods of dealing with new topics and subjects. But the methods discussed probably should not be counted as innovations.

It is similarly difficult to identify, among the contributions to the workshop, any new principles for the construction of classification schemes. Smiraglia’s domain analysis (not presented) shows that we’re comfortable with a plurality of methods and approaches, and what is more, comfortable with the extent to which those approaches are compatible if not complementary. We may well still seek principled answers to questions like the following: Should classification schemes be theory-driven or based on empirical observation? Should classification researchers in the information sciences (broadly defined) be concerned with classification of natural kinds or just of artifactual kinds? Whatever new principles are proposed, which of them (if any) reach the status of ethical principles? The possibilities might include principles that emphasize flexibility and pluralism (Campbell); that recognize that classification practices should be participatory and classification schemes, thus, user-centered (Adler); and that require that all voices be heard in the construction of classification schemes (Fox, Roberto). These latter proposals essentially would be a re-affirmation of the principle of user warrant – to wit, the terms and structures that are used by the members of groups who share a social identity are the ones that should be included in classification schemes (Roberto). Greenberg, Murillo and Kunze further suggest that there should be full participation in the evaluation of candidate terms/concepts for inclusion in classification schemes, thereby giving participants a sense of ownership.

In alignment with the already-noted emphasis on the temporal and historical, the new method of analysis of the moment seems to be ontogenesis. This method derives from Tennis’s suggestion that we can and should explore “the life of the subject over time” – an empirical method that enables better understanding of the factors that influence the course of a classification scheme’s development (for example, Fox; Scharnhorst and Smiraglia). Campbell wants to combine ontogenetic analysis with principles of flexibility and pluralism, partly in order to predict future changes in classification schemes. Martínez-Ávila and Smiraglia talk about new extensions of discourse analysis as methods of ontogenetic analysis that “[reveal] knowledge as artificially constructed by social factors.” Workshop participants were also introduced to methods of analyzing the discourse of domains to identify terms for inclusion in specialized vocabularies (Christine Marchese, Nassau Community College, and Richard Smiraglia), and the use of diaries and interviews rather than direct observation as a way of collecting data about personal classificatory practices (Oh).

New empirical data that were reported in workshop papers include Smiraglia’s data (not presented) on citedness, rates of self-citation and disciplinary association of classification researchers; Elizabeth Milonas’s (Long Island University) data that will inform the designers of the faceted search features of web search engines; Oh’s data on the ways in which people organize personal files, which lead her to propose a new model of such practice as a five-stage process intended to be useful for designers of new tools and interfaces supporting such practice; the results of Scharnhorst and Smiraglia’s ontogenetic analysis, showing the gradual evolution of intension over time; and Fran Miksa’s findings from his investigations of ancient, medieval and Islamic classifications. New methods of presenting data included Scharnhorst and Smiraglia’s visualizations of classifications over time, and Lori Ann Rung Hoeffner, Adelphi University, and Smiraglia’s visualizations of the results of domain analyses.

Workshop participants heard much about new methods of evaluation. With respect to methods of evaluating the field of classification research, Dubin points to the ever-increasing specialization of scholarly communities and asks: Is that a sign of success or an indication that we’re missing the opportunities for progress that come with interdisciplinarity? Jansen discusses critical-analytic methods of evaluating the theoretical frameworks that underlie classification research; Hoeffner and Smiraglia’s method of evaluating domains uses coherence as a criterion; Green and Martin demonstrate the need for effective methods of evaluating and choosing among different scientific classification structures; and Milonas’s study of the relationship between expert inspection and users’ perceptions of usability could even be considered as a contribution to the literature on the evaluation of methods of evaluation! Several of these methods apply new criteria for evaluating classification schemes (and other products of classification theory and research) and/or classificatory practices. From Habermas (via Martínez-Ávila & Smiraglia), we’re reminded that theory is used to predict, to explain or understand, to emancipate and to deconstruct. I think we’re clearly seeing a shift towards the latter half of this list in our prioritization of criteria for evaluating the products of classification research. Greenberg, Murillo and Kunze additionally propose the use of sustainability as a criterion for evaluating classification schemes. 

The potential for new definitions of the nature of classification, of classification theory and of classification history was mentioned by both Fran and Shawne Miksa. Dubin considers the relationship between aboutness and topicality and asks: Is a topic a thing, or a concept? Possibly new methods of creating definitions include domain analysis and bibliometrics (Smiraglia), and methods based on the kind of phenomenon being classified (Tennis). Echoing the parallels drawn by Fran Miksa between the history of writing and the history of classification, Tennis asks: Are counting and writing themselves kinds of classification? Fran Miksa reminds us that people became interested in the classification of the elements of knowledge (that is, subjects) only relatively recently, and that the classification of both physical objects and informational entities has a longer history. Steven MacCall’s (University of Alabama) model of the relationships among works, texts and artifacts (rather than the more familiar combination of works, expressions, manifestations and items) could be viewed as a specification of new kinds of objects to be classified; Lei Zhang and Hur-Li Lee, University of Wisconsin-Milwaukee, advocate for genre as a new dimension or facet by which objects are to be classified. 

Accompanying new principles and practices are new problems to address. These challenges include not only the problems that are due to “life happening” and to the fact that the information explosion has only just begun (Hlava), but also those that emerge as unintended consequences of the application of innovative methods. The crowdsourcing of tags for use as class labels, for instance, may lead to undesirable results such as a “tyranny of the majority” or a “Matthew effect” in which less-popular classes are less likely than more-popular ones to grow in popularity in the future. Questions about the extent to which such methods are “fair” or “just” may be conceived as having an ethical dimension. I’m not sure that the ethical ramifications of the decisions made by classification-scheme designers (as distinct from those made by classifiers) were adequately covered at this particular workshop. 

New Goals and Paradigms 
Did workshop participants characterize classification research as having any new goals? Fran Miksa proposes a framework for understanding how the goals of designers of classification schemes have changed over time, distinguishing among the pragmatic, scientific and aesthetic functions of such schemes that have been prioritized in different ways in different periods and cultures. Shawne Miksa provides a list of questions that serves well as a comprehensive statement of the various goals of classification research – one stand-out being the goal of creating “living” classification schemes. Tennis’s emphasis on the temporal reminds us of the importance of the goal of understanding the dynamic intensions and extensions of concepts, and Hoeffner and Smiraglia demonstrate the value of classificatory practices in defining the boundaries of a domain or discipline. Otherwise, the assumptions that appear to underpin much of the work presented are that classification is a means to the end of resource discovery, and thus that the appropriate goal of classification research is to contribute to improvements in the design of retrieval and access systems.

Neither was it clear that any radically new paradigms or theoretical frameworks were identified on the day. Certainly, several speakers made reference to a (seemingly ongoing) battle between positivists and constructivists, and this tension was explored in different ways in different papers (for example, by Jansen, and by Martínez-Ávila and Smiraglia). The closest we came to something that amounts to a whole new way of thinking about classification research, I think, was Campbell’s suggestion that classification schemes should be understood as living, breathing organisms that change rhythmically over time. To my mind, however, the most valuable innovation of the day was Fran Miksa’s conception of the history of classification theory. He distinguishes among three historical periods – we might call them the periods of unificationism, specialism and pluralism, although Miksa doesn’t use these labels himself – in a framework that provides an illuminating backdrop to the current turn in classification studies towards the generation of theories, principles and methods that emphasize both the cultural and historical specificity of classification practices and their emancipatory function. 

Rebecca Green and Giles Martin explained, “A rosid is a rosid is a rosid,” and in doing so received special commendation in the 1st Annual SIG/CR Award for Best Workshop Paper Title. But the winner of this prestigious new award was K. R. Roberto, for his paper, “Description Is a Drag, and Vice Versa.” Kathryn La Barre drew praise for her reference to Mr. MITS (the “Man In The Street”), as did Jane Greenberg for her analogy between toothbrushes and metadata standards (we all think they’re great, but ideally we’d like one of our own), and Andrea Thomer for her realization that hours in climate centers are like cigarettes in prison. (At least, I think that’s what Andrea said ...) 

Since we heard about so many different aspects of classification research during the workshop, we might have been forgiven at the end for reeling and asking afresh: What is classification research? or What should it be? or even What can it be? My personal response is that there is little need for concern. Classification research is “all of the above” and more. So long as it stays that way, the field is in excellent shape. On behalf of all the participants, I’d like to extend many thanks to the presenters and to the organizers, Kathryn and Joe, for making our 23rd workshop an exceptionally productive one.

Jonathan Furner is an associate professor in the Department of Information Studies, which is part of the Graduate School of Education and Information Studies at the University of California, Los Angeles.