ASIS&T 2013 Annual Meeting 
Montréal, Québec, Canada | November 1-5, 2013

Exact versus Estimated Pruning of Subject Hierarchies

Charles-Antoine Julien, McGill University
Pierre Tirilly, Université de Lille

Monday, 10:30am


Many large digital collections are organized by subject; although useful, these subject structures are large and complex, and therefore difficult to browse. Current online tools and visualization prototypes show only small, localized subsets and offer no way to explore the predominant patterns of the overall subject structure. This study builds on existing work on automatic subject hierarchy modification techniques that aim to facilitate browsing for documents by capitalizing on the highly uneven distribution of documents in real-world collections. Specifically, previous work used an estimate of the number of accessible documents offered by each subject term, whereas the current study uses the exact number. The impact is demonstrated on a large collection organized using Medical Subject Headings (MeSH). Results show that pruning the MeSH hierarchy based on exact access, although computationally more demanding, produces a different subject hierarchy under some conditions. The visual impact is illustrated with examples rendered in traditional outline views. This study has implications for the development of information organization theory and human-information interaction techniques for subject hierarchies.
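
The distinction between exact and estimated access can be illustrated with a minimal sketch. The abstract does not specify how the earlier estimate was computed or what pruning rule was applied, so everything here is an assumption made for illustration: the estimate is taken to be the sum of per-term posting counts over a term and its descendants (which counts a document once for every term it is indexed under), the exact count is the size of the union of those document sets, and pruning keeps only terms whose access count reaches a threshold. The toy hierarchy, the document ids, and the names `accessible_docs`, `exact_access`, `estimated_access`, and `prune` are hypothetical and are not MeSH data or the authors' implementation.

```python
from typing import Callable, Dict, List, Set

# Hypothetical toy hierarchy (not MeSH): each term lists its narrower terms.
CHILDREN: Dict[str, List[str]] = {
    "Diseases": ["Infections", "Neoplasms"],
    "Infections": [],
    "Neoplasms": [],
}

# Hypothetical postings: ids of documents indexed directly under each term.
POSTINGS: Dict[str, Set[int]] = {
    "Diseases": {1},
    "Infections": {1, 2, 3},
    "Neoplasms": {3, 4},
}


def accessible_docs(term: str) -> Set[int]:
    """Union of the documents indexed under a term or any of its descendants."""
    docs = set(POSTINGS.get(term, set()))
    for child in CHILDREN.get(term, []):
        docs |= accessible_docs(child)
    return docs


def exact_access(term: str) -> int:
    """Exact access: each reachable document is counted once."""
    return len(accessible_docs(term))


def estimated_access(term: str) -> int:
    """Assumed estimate: posting counts are summed without removing duplicates."""
    own = len(POSTINGS.get(term, set()))
    return own + sum(estimated_access(c) for c in CHILDREN.get(term, []))


def prune(access: Callable[[str], int], min_docs: int) -> Set[str]:
    """Assumed pruning rule: keep terms giving access to at least min_docs documents."""
    return {term for term in CHILDREN if access(term) >= min_docs}


kept_exact = prune(exact_access, 5)
kept_estimated = prune(estimated_access, 5)
```

In this toy collection, documents 1 and 3 are each indexed under two terms, so the assumed estimate for "Diseases" is 6 while the exact count is 4. With a threshold of 5, the estimate-based pruning keeps "Diseases" and the exact pruning keeps nothing, which is one way the two approaches could yield different hierarchies. The union requires materializing document sets for every term, which is consistent with the exact approach being more computationally demanding.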