8a.021.UH - Reinforcement Learning and Alternative Truth Discovery in Augmented Media Analysis Use Case

Project - Summary

Queries are sometimes hard to specify: analysts may be looking at unfamiliar material, and clients may struggle to define their interests. Reinforcement learning combined with relevance feedback addresses both cases through an interactive search process, in which the system refines its results as the user judges them. Results from the model can then be extracted for integration into automated pipelines.
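As a rough illustration of the interactive loop described above, the sketch below treats groups of documents as arms of a UCB1 bandit and uses the user's relevance judgement as the reward. This is an assumption made for illustration only: the project summary does not say which reinforcement learning algorithm is used, and the class name, the topic labels and the simulated user are all hypothetical.

```python
import math


class UCB1Search:
    """UCB1 bandit over document groups (arms); relevance feedback is the reward."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}     # times each arm was shown
        self.rewards = {a: 0.0 for a in self.arms}  # summed relevance feedback

    def select(self):
        # Show every arm once before trusting the estimates.
        for arm in self.arms:
            if self.counts[arm] == 0:
                return arm
        total = sum(self.counts.values())

        def score(arm):
            mean = self.rewards[arm] / self.counts[arm]
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[arm])
            return mean + bonus  # exploit the estimate, explore the uncertain

        return max(self.arms, key=score)

    def update(self, arm, relevant):
        # Record one relevance judgement (True/False) for the arm just shown.
        self.counts[arm] += 1
        self.rewards[arm] += 1.0 if relevant else 0.0

    def ranking(self):
        # Arms ordered by estimated relevance, e.g. for export to a pipeline.
        return sorted(
            self.arms,
            key=lambda a: self.rewards[a] / max(self.counts[a], 1),
            reverse=True,
        )


def simulate(rounds=200):
    # Hypothetical user who only marks "energy" documents as relevant.
    searcher = UCB1Search(["sports", "energy", "politics"])
    for _ in range(rounds):
        arm = searcher.select()
        searcher.update(arm, relevant=(arm == "energy"))
    return searcher


searcher = simulate()
print(searcher.ranking())
print(searcher.counts)
```

The exploration bonus keeps the system occasionally showing material the user has not yet judged favourably, which matters when the user cannot state a query up front. The `ranking()` output stands in for the results that could be handed to an automated pipeline.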

Users also need additional support when making relevance judgements. Documents written for a particular audience, such as newspapers or social media from a specific political or cultural context, can contain implicit details that are hard for casual readers to detect. These often take the form of subtle differences in word usage that alter the meaning of certain terms. By augmenting documents with techniques based on word embeddings and language models, we can highlight such semantic shifts to make analysts aware of these differences.
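The idea of highlighting semantic shifts can be sketched by comparing how the same term is used in two corpora. The example below is a simplified stand-in, not the project's method: instead of trained word embeddings or a language model, it builds co-occurrence count vectors for a term in each corpus and reports their cosine distance. The function names and the two toy corpora are hypothetical.

```python
import math
from collections import Counter


def context_vector(sentences, target, window=2):
    """Count the words within `window` tokens of each occurrence of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse count vectors (1.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def semantic_shift(corpus_a, corpus_b, terms, window=2):
    """Per-term distance between usage in corpus_a and usage in corpus_b."""
    return {
        t: cosine_distance(
            context_vector(corpus_a, t, window),
            context_vector(corpus_b, t, window),
        )
        for t in terms
    }


# Toy corpora, invented for illustration only.
corpus_a = [
    "the green party won the election",
    "the election was held on monday",
]
corpus_b = [
    "cheap green energy cuts emissions",
    "the election was held on sunday",
]

shift = semantic_shift(corpus_a, corpus_b, ["green", "election"])
for term, distance in sorted(shift.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {distance:.3f}")
```

Count vectors over a shared vocabulary are directly comparable across corpora, which is why they are used here. Embeddings trained separately on each corpus would first need to be aligned, and a real system would also need a significance test before flagging a term to an analyst.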

Project - Team

Team Member | Role | Email | Phone Number | Academic Site/IAB
Dorota Glowacka | PI | glowacka@cs.helsinki.fi | 358 50 465 2956 | University of Helsinki
Alan Medlar | Researcher | alan.j.medlar@helsinki.fi | +358504484626 | University of Helsinki
Kimmo Valtonen | Project Mentor | kimmo.valtonen@m-brain.com | +358 45 1218075 | M-Brain
Company Support: M-Brain

Project - Novelty of Approach

  • Adding a new document representation based on word embeddings to the system
  • Designing and building a more generic system that handles different types of data, not only scientific documents
  • Adding visualization of term differences
  • Testing the system on a much larger dataset than before

Project - Deliverables

  1. Prototype reinforcement learning search system operating over heterogeneous media data.
  2. Word embedding models derived from restricted corpora with known bias.
  3. Visualization of term differences derived from embedding models.

Project - Benefits to IAB

We have submitted a paper entitled “Statistically Significant Detection of Semantic Shifts using Contextual Word Embeddings” to the conference on Empirical Methods in Natural Language Processing (EMNLP 2020).

Project - Documents

Filename | Filesize | Last modified
8a.021.uh_final_report.docx | 594.0 KiB | 2020/07/30 15:59
8a.021.uh_ip_disclosure.docx | 21.8 KiB | 2020/07/30 15:59
8a.021.uh_mid-year_report.docx | 239.2 KiB | 2019/12/11 10:30
8a.021.uh_project_poster.pdf | 1.9 MiB | 2019/11/01 10:41
8a.021.uh_year_8_pitch.pptx | 58.2 KiB | 2019/09/18 17:28
8a.021.uh_year_8_project_proposal_dorota_glowacka.pdf | 609.6 KiB | 2019/09/18 17:26

Life Form Feedback

Year 8 Project Poster Session Feedback (Fall 2019)
Real name | Great Progress | On Course | Needs Change | Off Course | Abstain
Kimmo Valtonen (kimmo.valtonen) | | x | | |
Sumit Shah (sumit.shah) | | x | | |
Count: | 0 | 2 | 0 | 0 | 0
projects/year8/8a.021.uh.txt · Last modified: 2021/06/02 15:21 by sally.johnson