7a.013.UVA - Privacy Preserving Multi-Party Analytics

Project - Summary

Many domains and industries would benefit from sharing data for predictive modeling, but for privacy, legal, competitive, or other reasons, parties may be unwilling or unable to share information with each other. This project investigates techniques for transferring information between cooperating parties in a machine learning and model development context, crucially without physically transferring the underlying proprietary data to outside parties or giving them access to it. Specifically, the project used Generative Adversarial Networks (GANs), a technique that uses deep neural network models to learn a data distribution. These models were trained on a dataset of credit card transactions to present a use case in which two banks share synthetic data in order to develop fraud detection models. We show that models trained on purely synthetic data can achieve acceptable levels of performance in the fraud detection task, and that by sharing even synthetic data, cooperating parties can improve model performance on their own local data as well as the ability of their models to generalize to new data distributions.
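The evaluation idea in the summary (train a fraud model on synthetic data only, then score it on real data) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the two-feature toy data, the 20% fraud rate, the function names, and above all the per-class Gaussian sampler, which is only a placeholder for a trained GAN generator and is not the project's method.

```python
import numpy as np

def make_transactions(n, rng, fraud_rate=0.2):
    """Toy stand-in for a bank's real data: 2 features, label 1 = fraud.
    The fraud rate is far higher than in real card data, to keep the demo stable."""
    y = (rng.random(n) < fraud_rate).astype(int)
    x = rng.normal(0.0, 1.0, size=(n, 2))
    x[y == 1] += 2.5  # shift fraud rows so the two classes are separable
    return x, y

def sample_synthetic(x, y, n, rng):
    """Placeholder generator: fit one Gaussian per class and sample from it.
    In the project's setting a trained GAN generator would play this role."""
    xs, ys = [], []
    for label in (0, 1):
        rows = x[y == label]
        k = int(round(n * len(rows) / len(x)))  # keep the class proportions
        xs.append(rng.multivariate_normal(rows.mean(axis=0), np.cov(rows.T), size=k))
        ys.append(np.full(k, label))
    return np.vstack(xs), np.concatenate(ys)

def fit_logistic(x, y, steps=2000, lr=0.1):
    """Plain gradient-descent logistic regression (the downstream fraud model)."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        w -= lr * x.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def accuracy(w, b, x, y):
    return float(np.mean(((x @ w + b) > 0).astype(int) == y))

rng = np.random.default_rng(0)
x_train, y_train = make_transactions(4000, rng)  # real data, stays with its owner
x_test, y_test = make_transactions(2000, rng)    # real held-out data
x_syn, y_syn = sample_synthetic(x_train, y_train, 4000, rng)  # what gets shared

w_real, b_real = fit_logistic(x_train, y_train)
w_syn, b_syn = fit_logistic(x_syn, y_syn)
acc_real = accuracy(w_real, b_real, x_test, y_test)
acc_synth = accuracy(w_syn, b_syn, x_test, y_test)
print(f"trained on real: {acc_real:.3f}, trained on synthetic: {acc_synth:.3f}")
```

Comparing the two accuracies on the same real test set is the point of the exercise: the gap between them indicates how much predictive signal survives the move to synthetic data. On imbalanced fraud data, recall and precision would be more informative than accuracy.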

Project - Team

Team Member | Role | Email | Phone Number | Academic Site/IAB
Peter Beling | PI | | (434) 982-2066 | University of Virginia
Stephen Adams | Co-PI | | (757) 870-4954 | University of Virginia
Alex Langevin | Student/Graduate Researcher | | Not Available | University of Virginia
Steven Greenspan | Project Mentor | Not Available | Not Available | CA Technologies

Project - Novelty of Approach


Project - Deliverables

1 Literature review
2 Develop and test GAN for continuous data
3 Develop and test alternative methods for discrete/categorical data GANs
4 Develop architecture for joint continuous/categorical GANs
5 Test GAN performance in public (i.e. no differential privacy) setting
6 Build differential privacy into the discriminator model and re-evaluate
7 Introduce multi-party setting with heterogeneous data
8 Performance benchmarking
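Deliverable 6 does not say which mechanism was used to build differential privacy into the discriminator. One common approach, assumed here purely for illustration, is a DP-SGD-style update: clip each example's gradient to a fixed L2 norm, sum the clipped gradients, add Gaussian noise scaled to that norm, and average. The function name and parameters below are hypothetical, and the sketch does not compute the resulting privacy budget.

```python
import numpy as np

def dp_average_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """DP-SGD-style aggregation of per-example gradients (illustrative only).

    per_example_grads: array of shape (batch, n_params), one gradient per example.
    clip_norm:         maximum L2 norm allowed for any single example's gradient.
    noise_multiplier:  noise standard deviation as a multiple of clip_norm.
    """
    grads = np.asarray(per_example_grads, dtype=float)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Scale down only the gradients whose norm exceeds clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Gaussian noise calibrated to the clipping bound, added once to the sum.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / len(grads)

rng = np.random.default_rng(0)
example_grads = np.array([[3.0, 4.0], [0.3, 0.4]])  # L2 norms 5.0 and 0.5
noisy_update = dp_average_gradient(example_grads, clip_norm=1.0,
                                   noise_multiplier=1.1, rng=rng)
print(noisy_update)
```

In a GAN, only the discriminator sees real records, so applying this kind of update to the discriminator's gradients is one way to limit what the generator can learn about any individual record. A larger `noise_multiplier` gives stronger privacy at the cost of noisier training.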

Project - Benefits to IAB

As noted in the previous two sections, we developed an architecture for generating and transferring mixed-type (continuous and categorical) synthetic data, and demonstrated that it can be used for predictive modeling without outside parties needing to view the underlying data. With more experimentation and refinement, we expect this approach to yield performance closer to that of models built on real data. This framework can be applied by any institution where data confidentiality is a priority.
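To make "mixed-type" concrete, the sketch below shows one way a generator's raw output could be split into continuous columns and categorical columns. It assumes a Gumbel-softmax relaxation for the categorical blocks, which is a commonly used technique but is not confirmed by this page as the project's choice. The function name, the tanh scaling, and the column layout are all hypothetical.

```python
import numpy as np

def split_generator_output(raw, n_continuous, category_sizes, rng, temperature=1.0):
    """Turn raw generator outputs into continuous values and categorical probabilities.

    raw:            array of shape (batch, n_continuous + sum(category_sizes)).
    n_continuous:   number of leading columns treated as continuous features.
    category_sizes: number of levels for each categorical feature, in column order.
    """
    continuous = np.tanh(raw[:, :n_continuous])  # squashed into [-1, 1]
    categorical = []
    start = n_continuous
    for size in category_sizes:
        logits = raw[:, start:start + size]
        # Gumbel noise makes the softmax output a relaxed sample, not just a mean.
        u = rng.uniform(1e-12, 1.0, size=logits.shape)
        z = (logits - np.log(-np.log(u))) / temperature
        z = z - z.max(axis=1, keepdims=True)  # for numerical stability
        probs = np.exp(z)
        probs /= probs.sum(axis=1, keepdims=True)
        categorical.append(probs)
        start += size
    return continuous, categorical

rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 2 + 3 + 5))  # 2 continuous, one 3-level, one 5-level feature
cont, cats = split_generator_output(raw, n_continuous=2, category_sizes=[3, 5], rng=rng)
labels = [p.argmax(axis=1) for p in cats]  # hard category index per row
print(cont.shape, [p.shape for p in cats], labels)
```

Lower temperatures push each probability vector toward a one-hot vector, which is closer to real categorical data but gives the generator less useful gradients during training.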

Project - Presentation Video

Project - Documents

projects/year7/7a.013.uva.txt · Last modified: 2021/06/02 17:13 by sally.johnson