14.4 - Novel Methods for Hidden Relation and Error Detection (In Particular, Author Name Disambiguation in Citation Records)

Project - Summary

In this project, we conducted research on author name disambiguation in citation records under the mentorship of Johnson and Johnson (J&J). Author name disambiguation has important applications in author-based information retrieval systems such as Google Scholar, PubMed, DBLP, BDBComp, and CiteSeer. It is a challenging research problem for two reasons: synonymy, where the same author appears under distinct names, and polysemy, where distinct authors share the same or similar names. In this project, we aim to solve the author name disambiguation problem for PubMed datasets and to develop algorithms and techniques that improve performance over existing methods.
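The two sources of ambiguity can be illustrated with a small sketch. The records, names, and the `name_key` helper below are invented for illustration and are not taken from the project's datasets or code. Exact string matching splits one author across several spellings (synonymy), while a coarse "first initial plus last name" key merges them but also pulls in a different author (polysemy).

```python
from collections import defaultdict


def name_key(name):
    """Reduce a name to 'first-initial last-name', lowercased (hypothetical helper)."""
    parts = name.replace(".", " ").replace(",", " ").split()
    return f"{parts[0][0]} {parts[-1]}".lower()


# Toy citation records (invented): (record id, author string, true author id)
records = [
    (1, "Jane Smith", "A"),
    (2, "J. Smith", "A"),
    (3, "Jane A. Smith", "A"),
    (4, "John Smith", "B"),
]

# Exact matching: every distinct spelling looks like a different author.
distinct_spellings = {name for _, name, _ in records}

# Coarse key: all spellings of author A collapse, but author B is merged in too.
blocks = defaultdict(list)
for rec_id, name, _ in records:
    blocks[name_key(name)].append(rec_id)

print(len(distinct_spellings), "distinct spellings")
print(dict(blocks))
```

Neither extreme recovers the true grouping (records 1 to 3 for one author, record 4 for another), which is why additional evidence beyond the name string is needed.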

Project - Team

Team Member | Role | Email | Phone Number | Academic Site/IAB
Yuan An | PI | Not available | Not available | Drexel University
Yuan Ling | Graduate Student | Not available | Not available | Drexel University

Project - Impact and Uses/Benefits

Working with the IAB member Johnson and Johnson, we developed an ensemble clustering method for author name disambiguation. The method is fully automatic, and it can be generalized to other clustering-based tasks and datasets. For author name disambiguation, we proposed using additional features, such as abstracts and keywords, to improve performance. We collected abstract information for two experimental datasets, DBLP and BDBComp. The enriched datasets can be used for general evaluation purposes.
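This page does not describe how the ensemble is built, so the following is only a minimal sketch of one common way to combine clusterings, evidence accumulation over a co-association matrix, and should not be read as the project's actual method. It assumes several base clusterings of the same citation records already exist (here, hypothetically, one each from coauthors, title or abstract terms, and venue); the variable names, toy labels, and the 0.5 threshold are all illustrative assumptions.

```python
from itertools import combinations


def co_association(base_clusterings):
    """Fraction of base clusterings in which each pair of records shares a cluster."""
    n = len(base_clusterings[0])
    m = len(base_clusterings)
    counts = [[0] * n for _ in range(n)]
    for labels in base_clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_clusters(base_clusterings, threshold=0.5):
    """Link records whose co-association exceeds the threshold, then take
    connected components (union-find) as the consensus clusters."""
    co = co_association(base_clusterings)
    n = len(co)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(j)] = find(i)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy base clusterings over five records (invented labels, one list per feature).
by_coauthors = [0, 0, 0, 1, 1]
by_text = [0, 0, 1, 1, 1]
by_venue = [0, 1, 0, 2, 2]

co = co_association([by_coauthors, by_text, by_venue])
consensus = consensus_clusters([by_coauthors, by_text, by_venue])
print(consensus)
```

In this toy input no single feature agrees with the others on every record, but a majority vote over pairs groups the first three records together and the last two together. The threshold controls how much agreement is required before two records are treated as the same author.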

Project - Deep Dive

Project - Documents

Filename | Filesize | Last modified
14.4_year_3_cvdi_ip_letter_combined.pdf | 769.4 KiB | 2019/08/22 11:26
14.4_year_3_executive_summaries_combined.pdf | 441.9 KiB | 2019/08/22 11:26
14.4_year_3_final_report.pdf | 995.6 KiB | 2019/08/22 10:21
projects/year3/14.4.txt · Last modified: 2019/08/22 10:22 by sally.johnson