16.06 - Developing an Incremental and Active Learning Framework for Evolving High-Volume Data Streams

Project - Summary


The project aims to develop and evaluate an incremental learning framework that can process high volumes of data without the need for retraining. The objectives are to:

  • develop multiple incremental learning algorithms (including semi-supervised and unsupervised)
  • evaluate the algorithms using simulated and real-world datasets
  • develop a customized incremental learner for the author name disambiguation problem, in collaboration with IAB member Clarivate Analytics

Project - Team

Team Member | Role | Email | Phone Number | Academic Site/IAB
Gail Rosen | PI | Not available | Not available | Drexel University
Zhengqiao Zhao | Graduate Student | Not available | Not available | Drexel University

Project - Impact and Uses/Benefits

The contributions of our research are three-fold: (1) We propose four incremental learning algorithms that update the classifier's decision boundary for newly arriving samples without retraining the classifier on the entire training set each time. Moreover, one of the methods, IPAP, can also update the decisions for the original data. (2) We propose a customized incremental learning algorithm (Incremental High precision rule base Naive Bayesian Classifier) that updates bibliographic datasets without the need for labeled information. Disambiguation results from an existing disambiguation system are also optional in our framework. The high-precision rule NBC algorithm demonstrates high precision on a large-scale name block comprising nearly 20,000 citation records. To our knowledge, this test dataset is one of the largest name blocks ever used to evaluate incremental author disambiguation. (3) We benchmark the performance of our proposed methods against other incremental approaches and demonstrate notable improvements.
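As a rough illustration of what "updating without retraining" can look like, the sketch below keeps running counts for a generic multinomial naive Bayes classifier and folds each new sample into those counts. It is not IPAP or the project's high-precision rule NBC, whose details are in the final report; the class name `IncrementalNaiveBayes`, the `partial_fit` and `predict` methods, and the toy tokens and author labels are all assumptions made up for this example.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Hypothetical multinomial naive Bayes that learns one sample at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.class_counts = defaultdict(int)                    # samples per label
        self.token_counts = defaultdict(lambda: defaultdict(int))  # token hits per label
        self.total_tokens = defaultdict(int)                    # tokens seen per label
        self.vocab = set()

    def partial_fit(self, tokens, label):
        """Fold one new sample into the running counts; old samples are not revisited."""
        self.class_counts[label] += 1
        for token in tokens:
            self.token_counts[label][token] += 1
            self.total_tokens[label] += 1
            self.vocab.add(token)

    def predict(self, tokens):
        """Return the label with the highest smoothed log-posterior."""
        n_samples = sum(self.class_counts.values())
        vocab_size = max(len(self.vocab), 1)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / n_samples)
            denom = self.total_tokens[label] + self.alpha * vocab_size
            for token in tokens:
                hits = self.token_counts[label].get(token, 0)
                score += math.log((hits + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy stream of citation-like records (invented data, for illustration only).
model = IncrementalNaiveBayes()
model.partial_fit(["j_smith", "genomics", "philadelphia"], "author_A")
model.partial_fit(["j_smith", "finance", "london"], "author_B")

query = ["metagenomics", "london"]
before = model.predict(query)

# A new record arrives; only the counts change, nothing is retrained.
model.partial_fit(["j_smith", "metagenomics", "philadelphia"], "author_A")
after = model.predict(query)

print(before, after)
```

In this toy stream the prediction for the same query changes once the new record is absorbed, which is the behaviour the paragraph describes: the decision boundary moves as data arrives, at a cost proportional to the new sample rather than to the full training set.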

Project - Deep Dive

Project - Documents

Filename | Filesize | Last modified
16.06_year_5_presentation.pdf | 655.8 KiB | 2019/08/22 13:21
16.06_year_5_final_report.pdf | 1.5 MiB | 2019/08/22 10:47
projects/year5/16.06.txt · Last modified: 2019/08/22 10:49 by sally.johnson