How to Build Machine Learning Models with Private Data Sources on AWS

April 28, 2021

Authors:

Marcella Arthur

This article was originally published on AWS blogs by Conor Moran, Sr. Director, Business Development, Inpher.

AWS users occasionally need to analyze data sources that contain private or sensitive inputs. Inpher’s XOR Secret Computing Platform, available in AWS Marketplace, enables data scientists to train and run machine learning models while preserving data privacy and without sacrificing utility. Data analysis and machine learning performed with XOR can improve model performance with mathematically guaranteed data privacy, and the data never leaves its source.

In this post, I show you how to use XOR Trial Beta to predict the risk of coronary heart disease by performing Secret Computing. I show how to use secure multi-party computation on three distributed datasets and how to add features to the training data.
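To give a feel for the idea behind secure multi-party computation, here is a minimal sketch of additive secret sharing, one common building block for such protocols. It is not XOR's API or its actual protocol; the function names, the modulus, and the three example inputs are assumptions made up for illustration. Three parties each hold a private number and learn only the total.

```python
import secrets

# Illustrative modulus for share arithmetic (an assumption, not a product setting).
PRIME = 2**61 - 1


def share(value, n_parties=3):
    """Split value into n additive shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine shares into the value they encode."""
    return sum(shares) % PRIME


def secure_sum(private_inputs):
    """Compute the sum of the inputs while each party only ever sees shares."""
    n = len(private_inputs)
    # Each party splits its input and hands one share to every party.
    distributed = [share(v, n) for v in private_inputs]
    # Party j adds up the shares it received; a partial sum reveals no single input.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the combined partial sums reveal the total.
    return reconstruct(partial_sums)


# Hypothetical private values held by three separate data owners.
total = secure_sum([120, 135, 150])
```

Any single share is a uniformly random number, so it carries no information about the input on its own. Only the final combination of all partial sums reveals the result.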

This demonstration joins the three datasets using a private set intersect function, which combines features from different data sources by matching records on a common identifier. I also show how to use the output of the private set intersect function in a logistic regression to identify the influence of those features on the target variable. Logistic regression is a statistical model that, in its basic form, uses a logistic function to model a binary dependent variable, although many more complex extensions exist.
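As a point of reference, the sketch below shows the cleartext equivalent of that workflow: join three sources on a shared identifier, then fit a basic logistic regression on the joined rows. It is not private and it does not use XOR's API. The identifiers, feature meanings, and values are synthetic and purely hypothetical, and the helper names are my own. A private set intersect would produce the same kind of joined input without any party seeing the others' records.

```python
import math


def intersect_join(*sources):
    """Join dict-based sources on the identifiers present in all of them."""
    common_ids = set(sources[0])
    for src in sources[1:]:
        common_ids &= set(src)
    # Concatenate each source's columns for every shared identifier.
    return {
        pid: [value for src in sources for value in src[pid]]
        for pid in sorted(common_ids)
    }


def sigmoid(z):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            pred = sigmoid(bias + sum(w * x for w, x in zip(weights, xi)))
            err = pred - yi
            for j, x in enumerate(xi):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


# Synthetic, hypothetical data: two feature sources and one outcome source.
source_a = {"p1": [0.2], "p2": [0.9], "p3": [0.4], "p4": [0.8], "p5": [0.1]}
source_b = {"p1": [0.3], "p2": [0.7], "p3": [0.2], "p4": [0.9], "p6": [0.5]}
source_c = {"p1": [0], "p2": [1], "p3": [0], "p4": [1], "p7": [1]}

joined = intersect_join(source_a, source_b, source_c)
X = [row[:-1] for row in joined.values()]  # features from sources a and b
y = [row[-1] for row in joined.values()]   # binary target from source c
weights, bias = fit_logistic(X, y)
```

The sign and size of each fitted weight indicate how that feature moves the predicted probability of the target, which is the "influence" the regression is used to identify.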

This is all performed without viewing the data inputs or requiring the data to be transferred.
