We’ll be delivering a short presentation called “Controlled Vocabulary Strategies for Annotating tranSMART Datasets” at the tranSMART Community Meeting in April. (The tranSMART Community Meeting is timed to coincide with this year’s Bio-IT World conference in Boston.) The tranSMART community is a group of companies, institutions, and individuals working on translational research in medicine, biotherapies, and similar life sciences endeavors.
In our talk we’ll look at how different naming systems, from simple lookup lists to fully realized ontologies, differ in utility, and at the role each plays when packaging datasets for analysis. Our Curator software handles controlled vocabulary (official terminology) using what we call Concept Schemes, which for the purposes of dataset construction give annotators a range of tools to find and apply terminology without sacrificing ease of use or performance. We think it’s the right approach for the job, but there’s still some art in deciding what (and especially how much) goes into a Concept Scheme when defining data structures and concepts. We’re looking forward to the opportunity to discuss that art with the group.
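For readers who want a concrete picture of the gap between a lookup list and something richer, here is a minimal Python sketch. It is not Curator's implementation or data model: the `Concept` and `ConceptScheme` classes, their field names (preferred label, alternate labels, broader concept), and the sample tissue terms are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set

# A simple lookup list: it can only say whether an exact term is allowed.
TISSUE_LOOKUP = {"lung", "liver", "kidney"}


@dataclass
class Concept:
    """Hypothetical concept record (illustrative, not Curator's model)."""
    pref_label: str                                      # the official term
    alt_labels: Set[str] = field(default_factory=set)    # accepted synonyms
    broader: Optional[str] = None                        # pref_label of the parent


class ConceptScheme:
    """Hypothetical scheme: resolves synonyms and walks up the hierarchy."""

    def __init__(self, concepts: Iterable[Concept]) -> None:
        self._concepts: Dict[str, Concept] = {}
        self._by_label: Dict[str, Concept] = {}
        for concept in concepts:
            self._concepts[concept.pref_label] = concept
            for label in {concept.pref_label, *concept.alt_labels}:
                self._by_label[label.lower()] = concept

    def resolve(self, raw: str) -> Optional[str]:
        """Map a raw annotation to its official term, or None if unknown."""
        concept = self._by_label.get(raw.strip().lower())
        return concept.pref_label if concept else None

    def ancestors(self, pref_label: str) -> List[str]:
        """Return broader terms, nearest first (assumes no cycles)."""
        path: List[str] = []
        current = self._concepts[pref_label].broader
        while current is not None:
            path.append(current)
            current = self._concepts[current].broader
        return path


# Sample data, invented for illustration.
scheme = ConceptScheme([
    Concept("organ"),
    Concept("lung", {"lungs", "pulmonary tissue"}, broader="organ"),
    Concept("liver", {"hepatic tissue"}, broader="organ"),
])

print("Lungs" in TISSUE_LOOKUP)   # the flat list rejects the variant spelling
print(scheme.resolve("Lungs"))    # the scheme maps it to the official term
print(scheme.ancestors("lung"))   # and knows where the term sits
```

The trade-off the sketch hints at is the one we want to discuss: every synonym and parent link makes annotation easier, but it also has to be curated and maintained, so the question of how much goes into the scheme never quite goes away.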