Park, Gilchan and Pouchard, Line (2021) Advances in scientific literature mining for interpreting materials characterization. Machine Learning: Science and Technology, 2 (4). 045007. ISSN 2632-2153
Park_2021_Mach._Learn.__Sci._Technol._2_045007.pdf - Published Version
Abstract
Using synchrotron light sources, such as the National Synchrotron Light Source II at Brookhaven National Laboratory, scientists in fields as diverse as physics, biology, and materials science identify the atomic structure, chemical composition, or other important properties of varied specimens. X-ray spectroscopy from light sources is particularly valuable for materials research with vast information available about reference spectra in the scientific literature. However, as the technique is applicable to many science domains, searching for information about select x-ray spectroscopy spectra is impeded by the sheer number of publications. Moreover, useful information about the context of an experiment or figures presented in papers can be buried among the details, which takes time to assess. This work presents a scientific literature mining system that supports data acquisition, information extraction, and user interaction for referencing x-ray spectra identification and spectral interpretation. The goal is to provide efficient access to useful spectral data to researchers who may spend only a few days at a synchrotron light source. With this system, users browse a classification tree for papers arranged according to x-ray spectroscopic methods, chemical elements, and x-ray absorption spectroscopy edges. Relevant figures are extracted with sentences from the paper that explain them, known as 'figure explanatory text.' Notably, this system focuses on semantic aspects (logical analysis) to find figure explanatory text using deep contextualized word embedding techniques and contains an interface to obtain labeled data from domain experts that is used to evaluate and improve the model.
Item Type: | Article |
---|---|
Subjects: | Universal Eprints > Multidisciplinary |
Depositing User: | Managing Editor |
Date Deposited: | 05 Jul 2023 03:55 |
Last Modified: | 14 Oct 2023 03:48 |
URI: | http://journal.article2publish.com/id/eprint/2282 |