IN23E-06
Illuminate Knowledge Elements in Geoscience Literature

Tuesday, 15 December 2015: 14:55
2020 (Moscone West)
Xiaogang Ma1, Jin Guang Zheng2, Han Wang1 and Peter Arthur Fox1, (1)Rensselaer Polytechnic Institute, Troy, NY, United States, (2)Rensselaer Polytechnic Institute, Elmhurst, NY, United States
Abstract:
A large amount of dark data is hidden in the geoscience literature. Efficient retrieval and reuse of those data would greatly benefit current geoscience research. Within data rescue efforts, one topic of interest is illuminating the knowledge framework, i.e. the entities and relationships, embedded in documents. Entity recognition and linking have received extensive attention in news and social media analysis, as well as in bioinformatics. In the geoscience domain, however, such work is limited. We will present our work on using knowledge bases on the Web, such as ontologies and vocabularies, to facilitate entity recognition and linking in geoscience literature. The work deploys an unsupervised collective inference approach [1] to link entity mentions in unstructured text to a knowledge base; the approach leverages the information and structures in ontologies and vocabularies for similarity computation and entity ranking. Our work is still at an initial stage toward detecting knowledge frameworks in literature, and we have been collecting geoscience ontologies and vocabularies in order to build a comprehensive geoscience knowledge base [2]. We hope the work will spark new ideas and collaborations on dark data rescue, as well as on the synthesis of data and knowledge from geoscience literature.
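
The idea of linking mentions collectively, rather than one at a time, can be illustrated with a toy sketch. This is not the method of [1]. It is a simplified, single-pass stand-in that assumes a tiny hand-made knowledge base (the `KB` entries, the `related` links, the 0.8 threshold and the 0.5 weight are all hypothetical). Each candidate entity is scored by string similarity to the mention plus a coherence term that rewards candidates related, in the knowledge base, to candidates of the other mentions in the same text.

```python
from difflib import SequenceMatcher

# Hypothetical toy knowledge base: entity id -> labels and related entity ids.
# A real system would load these from ontologies and vocabularies.
KB = {
    "chem:Mercury": {"labels": ["mercury", "quicksilver"], "related": {"min:Cinnabar"}},
    "planet:Mercury": {"labels": ["mercury"], "related": {"planet:Venus"}},
    "min:Cinnabar": {"labels": ["cinnabar"], "related": {"chem:Mercury"}},
    "planet:Venus": {"labels": ["venus"], "related": {"planet:Mercury"}},
}


def local_similarity(mention, entity_id):
    """Best string similarity between the mention and any label of the entity."""
    return max(
        SequenceMatcher(None, mention.lower(), label).ratio()
        for label in KB[entity_id]["labels"]
    )


def candidates(mention, threshold=0.8):
    """Entities whose labels are similar enough to the mention (assumed cutoff)."""
    scored = {eid: local_similarity(mention, eid) for eid in KB}
    return {eid: s for eid, s in scored.items() if s >= threshold}


def link(mentions, weight=0.5):
    """Pick, for each mention, the candidate with the best combined score.

    score = (1 - weight) * string similarity + weight * coherence, where
    coherence is the share of other mentions that have at least one
    candidate related to this candidate in the knowledge base.
    """
    cands = {m: candidates(m) for m in mentions}
    result = {}
    for m in mentions:
        others = [o for o in mentions if o != m]
        best, best_score = None, -1.0
        for eid, sim in sorted(cands[m].items()):
            supported = sum(
                1 for o in others if KB[eid]["related"].intersection(cands[o])
            )
            coherence = supported / len(others) if others else 0.0
            score = (1 - weight) * sim + weight * coherence
            if score > best_score:
                best, best_score = eid, score
        result[m] = best  # None when no candidate passes the threshold
    return result


print(link(["mercury", "cinnabar"]))
print(link(["mercury", "venus"]))
```

In this sketch the ambiguous mention "mercury" resolves to `chem:Mercury` when it co-occurs with "cinnabar" and to `planet:Mercury` when it co-occurs with "venus", because only the coherence term separates two candidates with identical labels.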

References:

1. Zheng, J., Howsmon, D., Zhang, B., Hahn, J., McGuinness, D.L., Hendler, J., and Ji, H. 2014. Entity linking for biomedical literature. In Proceedings of ACM 8th International Workshop on Data and Text Mining in Bioinformatics, Shanghai, China.

2. Ma, X. and Zheng, J. 2015. Linking geoscience entity mentions to the Web of Data. ESIP 2015 Summer Meeting, Pacific Grove, CA.