Annual Meeting Reports

Taxonomy

The session began with a quick overview by Hlava of taxonomies and their place within the controlled-vocabulary complexity continuum, along with the characteristics of each level: list (no control), authority file (people, places, and things, listing the preferred form with other options provided as synonyms), taxonomy (a list in hierarchical form), thesaurus (adding related terms while keeping synonyms and hierarchy), and ontology (adding directionality to the relationships). She explored what can be done with a taxonomy implementation besides enhancing search, such as the recommendation of additional articles in the same topical cluster, linked data enrichment of content, finding peer reviewers based on their semantic profile, taxonomy terms applied to a person based on the indexing of their publications, and image indexing. Finally, she gave some examples of how to integrate a taxonomy into the production workflow.
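The levels of the continuum can be pictured as one record type that gains fields at each step. The following Python sketch is an illustration only and was not presented in the session: the class names, the `relations` triples, and the sample civil engineering terms are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Concept:
    preferred: str                                    # list level: just a term
    synonyms: Set[str] = field(default_factory=set)   # authority file: non-preferred forms
    broader: Optional[str] = None                     # taxonomy: one parent in the hierarchy
    related: Set[str] = field(default_factory=set)    # thesaurus: associative links


class Vocabulary:
    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        # Ontology level: directed, typed relationships (subject, predicate, object).
        self.relations: List[Tuple[str, str, str]] = []

    def add(self, concept: Concept) -> None:
        self.concepts[concept.preferred] = concept

    def resolve(self, label: str) -> Optional[str]:
        """Map any known label (preferred or synonym) to its preferred form."""
        wanted = label.lower()
        for concept in self.concepts.values():
            labels = {concept.preferred.lower()} | {s.lower() for s in concept.synonyms}
            if wanted in labels:
                return concept.preferred
        return None

    def ancestors(self, term: str) -> List[str]:
        """Walk the broader-term chain from a concept up to the top of the hierarchy."""
        chain: List[str] = []
        current = self.concepts[term].broader
        while current is not None and current not in chain:  # guard against cycles
            chain.append(current)
            current = self.concepts[current].broader
        return chain


# Hypothetical sample terms, chosen only to exercise each level.
vocab = Vocabulary()
vocab.add(Concept("Structures"))
vocab.add(Concept("Bridges", synonyms={"Bridge", "Overpasses"},
                  broader="Structures", related={"Tunnels"}))
vocab.add(Concept("Suspension bridges", broader="Bridges"))
vocab.add(Concept("Tunnels", broader="Structures", related={"Bridges"}))
vocab.relations.append(("Cables", "supports", "Suspension bridges"))

print(vocab.resolve("overpasses"))
print(vocab.ancestors("Suspension bridges"))
```

Here `resolve` covers the authority-file behavior (any synonym leads to the preferred form), `ancestors` covers the taxonomy hierarchy, `related` holds thesaurus links, and the `relations` triples stand in for the directed relationships of an ontology.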

Bradford took up the conversation next with “Following Our Yellow Brick Road”. She described the first failed initiative to create a taxonomy (which lasted two years), and a subsequent hiatus while they worked on taxonomy use cases including editorial, business office, and reader/client usage. Next, they came up with a set of lessons learned, which included the following: 1) Support for the project cannot be focused on a short-term revenue opportunity. 2) The thesaurus must be considered an investment in the long-term value of your content. 3) Technology is only one piece of the project. Skilled taxonomists (consultants or staff) are essential. Strong IT support is needed as well. 4) Subject-matter experts must see the project as a priority and accept the skills brought to the project by the taxonomist and technology team. In 2013, they “awoke in the poppy field” and began the project anew. The American Association for the Advancement of Science (AAAS) was developing a new manuscript-tracking system and launching new journals, which highlighted the need to restart taxonomy development in support of the peer-review process. AAAS contracted with Access Innovations in 2013/2014 to develop the taxonomy/thesaurus and rules, index content back to 1996, and review content moving forward to help AAAS maintain the thesaurus. The taxonomy is now integrated into the submission and tracking systems. IT staff worked on integration of automatic indexing using Access Innovations tools, and strong internal project management has been established. AAAS is now able to look ahead to the next steps: integrating the taxonomy across the production and user-facing platforms and indexing all AAAS content. Bradford’s slides, which can be found on the CSE 2015 meeting website, include screenshots of the implementation.

McNaughton gave the second case study in the session. She outlined the process of taxonomy creation starting in 2009 and the subsequent rule building for automatic indexing of the American Society of Civil Engineers (ASCE) content. There are 2,400 concepts in the ASCE thesaurus and 1,100 synonyms. The geographic thesaurus grew to 30,000 terms before being reduced to a more manageable size. They have 6,541 rules, 806 of which are complex in nature. The rules help further disambiguate the terms and allow automatic application of the conceptual terms to text during the production flow. Having the thesaurus and rule base in place, ASCE turned its attention to implementation, starting with indexing of author profiles in ColWiz. This application applies taxonomy terms to describe an author’s or editor’s area of expertise in a reliable, consistent way. They also needed to decide what content to index: articles and book chapters are indexed, whereas front matter and editorials are not. The second project is to index the locations of all civil engineering disasters in the ASCE library and the Civil Engineering Database. The next implementations will be search, article recommendations, topic pages, and visualization using the taxonomy.
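To give a feel for how simple and complex rules might differ, the sketch below is a minimal Python illustration of rule-based indexing. It is not the ASCE rule base or the Access Innovations rule syntax: the `Rule` fields, the `index_text` function, and the sample rules and sentences are all assumptions made for the example. A simple rule assigns a concept whenever its trigger phrase appears; a complex rule also checks the surrounding text before assigning the concept.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    trigger: str                          # phrase that activates the rule
    concept: str                          # thesaurus term to assign
    require_any: Tuple[str, ...] = ()     # complex rule: at least one must also appear
    forbid_any: Tuple[str, ...] = ()      # complex rule: none of these may appear


def _has(phrase: str, text: str) -> bool:
    # Whole-word, case-insensitive match; plural forms would need their own triggers.
    return re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE) is not None


def index_text(text: str, rules: List[Rule]) -> List[str]:
    """Return the concepts whose rules fire on the text, in rule order, without repeats."""
    assigned: List[str] = []
    for rule in rules:
        if not _has(rule.trigger, text):
            continue
        if rule.require_any and not any(_has(w, text) for w in rule.require_any):
            continue
        if any(_has(w, text) for w in rule.forbid_any):
            continue
        if rule.concept not in assigned:
            assigned.append(rule.concept)
    return assigned


# Hypothetical rules: the ambiguous word "beam" is disambiguated by its context.
RULES = [
    Rule("beam", "Beams", require_any=("steel", "concrete", "girder"), forbid_any=("laser",)),
    Rule("beam", "Lasers", require_any=("laser",)),
    Rule("fatigue cracking", "Fatigue (material)"),  # simple rule: the trigger alone is enough
]

print(index_text("The steel beam showed fatigue cracking near the support.", RULES))
print(index_text("A laser beam was used to measure deflection.", RULES))
```

In this sketch the first sentence is indexed with the structural concepts and the second with the laser concept, which is the kind of disambiguation the complex rules are described as providing.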