Workshop on extracting structured knowledge from scientific publications (ESSP)

June 6th, 2019, Minneapolis, USA

Collocated with NAACL 2019

Scientific knowledge is one of the greatest assets of humankind. This knowledge is recorded and disseminated in scientific publications, and the body of scientific literature is growing at an enormous rate. Automatic methods for processing and cataloguing that information are necessary to help scientists navigate this vast body of literature, and to facilitate automated reasoning, discovery and decision making over that data.

Structured information can be extracted at different levels of granularity. Previous and ongoing work has focused on bibliographic information (segmentation and linking of referenced literature; Wick et al., 2013), keyword extraction and categorization (e.g., which tasks, materials and processes are central to a publication; Augenstein et al., 2017), and cataloguing research findings. Scientific discoveries can often be represented as pairwise relationships, e.g., protein-protein (Mallory et al., 2016), drug-drug (Segura-Bedmar et al., 2013), and chemical-disease (Li et al., 2016) interactions, or as more complicated networks such as action graphs describing scientific procedures (e.g., synthesis recipes in materials science; Mysore et al., 2017). Information extracted with such methods can be enriched with timestamps and other meta-information, such as indicators of uncertainty or limitations of the discovered facts (Zhou et al., 2015).

Structured representations, such as knowledge graphs, summarize information from a variety of sources in a convenient and machine-readable format. Graph representations that link the information from a large body of publications can reveal patterns and lead to the discovery of new information that would not be apparent from the analysis of a single publication (Luan et al., 2018). This kind of aggregation can lead to new scientific insights (Kim et al., 2017), and it can also help to detect trends (Prabhakaran et al., 2016) or find experts in a particular scientific area (Neshati et al., 2014).

While various workshops have focused separately on individual aspects -- extraction of information from scientific articles, building and using knowledge graphs, the analysis of bibliographical information, graph algorithms for text analysis -- the proposed workshop focuses on processing scientific articles and creating structured repositories such as knowledge graphs for finding new information and making scientific discoveries. The aim of this workshop is to identify the necessary representations for facilitating automated reasoning over scientific information, and to bring together experts in natural language processing and information extraction with scientists from other domains (e.g., materials science, biomedical research) who want to leverage the vast amount of information stored in scientific publications.

Invited speakers

Workshop program

9:00 -- 10:30
9:00 -- 9:15 Welcome
9:15 -- 10:10 Invited talk: Machine Reading for Precision Medicine
Hoifung Poon
10:10 -- 10:30 Distantly Supervised Biomedical Knowledge Acquisition via Knowledge Graph Based Attention
Qin Dai, Naoya Inoue, Paul Reisert, Ryo Takahashi and Kentaro Inui

10:30 -- 11:00 Coffee break

11:00 -- 12:30
11:00 -- 11:50 Invited talk: Extraction-Intensive Systems for the Social Sciences
Michael Cafarella
11:50 -- 12:10 Scalable, Semi-Supervised Extraction of Structured Information from Scientific Literature
Kritika Agrawal, Aakash Mittal and Vikram Pudi
12:10 -- 12:30 Understanding the Polarity of Events in the Biomedical Literature: Deep Learning vs. Linguistically-informed Methods
Enrique Noriega-Atala, Zhengzhong Liang, John Bachman, Clayton Morrison and Mihai Surdeanu

12:30 -- 14:00 Lunch break

14:00 -- 15:30
14:00 -- 14:50 Invited talk: Extracting structured knowledge from biomedical publications
Dina Demner-Fushman
14:50 -- 15:15 Five-minute presentations for the posters and the demo

Dataset Mentions Extraction and Classification
Animesh Prasad, Chenglei Si and Min-Yen Kan

Annotating with Pros and Cons of Technologies in Computer Science Papers
Hono Shirai, Naoya Inoue, Jun Suzuki and Kentaro Inui

Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence
Soham Parikh, Elizabeth Conrad, Oshin Agarwal, Iain Marshall, Byron Wallace and Ani Nenkova

An Analysis of Deep Contextual Word Embeddings and Neural Architectures for Toponym Mention Detection in Scientific Publications
Matthew Magnusson and Laura Dietz

STAC: Science Toolkit Based on Chinese Idiom Knowledge Graph (demo)
Changliang Li, Meiling Wang, Yu Guo, Zhixin Zhao and Xiaonan Liu

15:15 -- 16:00 Coffee break and poster session

16:00 -- 17:30
16:00 -- 16:50 Invited talk: Just when I thought I was out, they pull me back in: The role of knowledge representation in automatic knowledge base construction
Chris Welty
16:50 -- 17:10 Playing by the Book: An Interactive Game Approach for Action Graph Extraction from Text
Ronen Tamari, Hiroyuki Shindo, Dafna Shahaf and Yuji Matsumoto
17:10 -- 17:30 Textual and Visual Characteristics of Mathematical Expressions in Scholar Documents
Vidas Daudaravicius

Call for Papers

Accepted Papers

We invite submissions on (but not limited to) the following topics:
Further submission information and submission link

Important Dates

Workshop papers due: Wednesday March 6, 2019
Notification of acceptance: Friday March 29, 2019
Camera-ready papers due (firm deadline): Friday April 5, 2019
Workshop date: Thursday June 6, 2019


Organizing Committee

Program Committee