  • Beyond metadata - enriching life science publications in LIVIVO with semantic entities from the linked data cloud
  • Bernd Müller, Alexandra Hagelstein
  1. Article
  • SEMPDS-2016 Posters&Demos@SEMANTiCS 2016 and SuCCESS'16 Workshop; 2016; 4 unnumbered pages
  • Queries in literature search engines are usually run against metadata derived from scientific publications. The search engine LIVIVO holds a corpus of 63 million life science publications. About 25 million of these are taken from PubMed and carry Medical Subject Headings (MeSH) annotations; the remaining publications have heterogeneous keyword annotations. A workflow based on the Unstructured Information Management Architecture (UIMA) is therefore developed to enrich LIVIVO publications with semantic annotations. The UIMA analysis engine ConceptMapper performs entity recognition using dictionaries built from MeSH, the pharmaceutical database DrugBank, and the multilingual agricultural vocabulary AGROVOC. In addition, ontological relationships among the semantic entities are preserved in the graph database Neo4j. The ontological information is derived from the MeSH tree, the Anatomical Therapeutic Chemical (ATC) classification system for pharmaceuticals, and the AGROVOC tree. This ontological structure enables functionality such as query expansion, aggregation of search results, and concept-based ranking algorithms.
linked data, graph database, document database, named entity recognition, semantic search
DDC 020 Library and information sciences
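
The abstract describes two ideas that a small example may make more concrete: dictionary-based entity recognition and query expansion over a concept hierarchy. The following is a minimal Python sketch of both, not the authors' implementation. It uses neither UIMA ConceptMapper nor Neo4j; plain dictionaries stand in for the vocabulary and the graph. All concept identifiers (`C1`, `C1.1`, `D1`), the tiny vocabulary, and the function names `annotate` and `expand` are hypothetical and chosen only for illustration.

```python
import re

# Toy dictionary: lowercase surface form -> concept ID.
# Hypothetical stand-in for dictionaries built from MeSH, DrugBank, or AGROVOC.
DICTIONARY = {
    "neoplasms": "C1",
    "breast neoplasms": "C1.1",
    "aspirin": "D1",
}

# Toy hierarchy: concept ID -> directly narrower concept IDs.
# Hypothetical stand-in for the tree relationships kept in a graph database.
NARROWER = {
    "C1": ["C1.1"],
    "C1.1": [],
    "D1": [],
}

# Longest dictionary entry, measured in tokens, bounds the matching window.
MAX_TERM_TOKENS = max(len(term.split()) for term in DICTIONARY)


def annotate(text):
    """Return (surface form, concept ID) pairs using greedy longest-match lookup."""
    tokens = re.findall(r"\w+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink it.
        for n in range(min(MAX_TERM_TOKENS, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in DICTIONARY:
                found.append((candidate, DICTIONARY[candidate]))
                i += n
                break
        else:
            # No dictionary entry starts at this token.
            i += 1
    return found


def expand(concept_id):
    """Return the concept followed by all of its narrower concepts."""
    result = [concept_id]
    stack = list(NARROWER.get(concept_id, []))
    while stack:
        current = stack.pop()
        if current not in result:
            result.append(current)
            stack.extend(NARROWER.get(current, []))
    return result


annotations = annotate("Aspirin use and breast neoplasms risk")
expanded = expand("C1")
```

In this sketch, a query for the broader concept `C1` would also retrieve documents annotated with the narrower concept `C1.1`, which is the kind of query expansion the abstract attributes to the preserved ontological structure.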
