Research & Innovation

Stevens Takes the Lead in Keeping Your Private Data Private

As federated learning systems gain popularity, Shusen Wang is working to protect everything from those cute family photos on your smartphone to your highly confidential medical data


In the digital jungle, data is king—and like crafty hyenas eager to steal a lion’s dinner, cybercriminals are always lurking to obtain data from unsuspecting—and even the most cautious—users. Joining the hunt to put a stop to this cyber threat, Stevens Institute of Technology assistant computer science professor Shusen Wang is developing defenses to keep private data private in federated learning systems.

A Wake-Up Call for Data Privacy

“Data is the key to machine learning,” Wang explains. “The more data we have, the better we can train a machine learning model to make highly accurate predictions. A major challenge, though, is the so-called ‘data silos’ distributed across isolated sources that cannot be shared due to privacy issues. Federated learning has been proposed as a solution to allow cross-device, cross-silo learning without breaking privacy constraints. The raw data does not leave one’s device, yet multiple parties’ data can be used on a central server for training the model.”
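
For readers who want to see that idea concretely, here is a minimal sketch of federated averaging, the canonical federated learning recipe: each device trains the model on its own data and sends back only updated weights, which the server averages into a shared model. The linear model, function names, and toy data below are illustrative assumptions for this sketch, not code from Wang's research.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's training step: gradient descent on a linear model,
    using only that client's local data (which never leaves the device)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # least-squares gradient
        w -= lr * grad
    return w

def federated_round(global_weights, client_datasets):
    """Server-side aggregation (FedAvg-style): average the clients'
    updated weights, weighted by how much data each client holds."""
    updates, sizes = [], []
    for X, y in client_datasets:
        updates.append(local_update(global_weights, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

# Toy example: three "devices", each holding private data the server never sees.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(20):
    w = federated_round(w, clients)
print(w)  # approaches true_w although the server only ever sees weight updates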

However, thanks to the groundbreaking efforts of Stevens computer science chair Giuseppe Ateniese and his team, there’s been a rude awakening that, in its present state, federated learning may be a privacy nightmare.

“Back in 2017, Giuseppe and his co-authors were the first to realize that federated learning still leaks users’ privacy,” Wang says. “Attacks are easy to launch; the attacker needs only to be a participant who can communicate with the server in the federated learning model. Defending data against these privacy leakages is imperative. Stevens is strong in research into security and privacy, and when Giuseppe told me about this work during my interview in 2018, I became interested in helping make federated learning safer.”

Double-Blind Machine Learning Defense May Offer a Clear Path to the ‘Holy Grail’ of Data Protection

Ateniese and Wang are each developing defenses intended to stop privacy leakage.

“Since my team and I designed the first attack on these protocols, we’re credible in the community, and we’re excited to be taking the lead in finding a solution that makes sense from a security perspective,” Ateniese says. “I’m using cryptography techniques, which can be expensive. Shusen, one of the industry’s top machine learning experts, is approaching it from a computational, machine learning point of view.”

Specifically, Wang is developing a double-blind, collaborative learning defense that shows strong potential to substantially improve data privacy, more quickly and at a much lower cost than purely cryptographic approaches.

“Random matrices transform the information sent between the server and users, so an attacker cannot use the communicated information for privacy inference,” Wang explains. “Theories guarantee that the defense defeats gradient-based attacks, which are currently the most effective privacy attacks. Experiments demonstrate that the defense works and does not hurt accuracy or efficiency. So far, it’s not absolutely safe, but we’re getting there. It’s like a sword and a shield—we’re still developing the shield to protect data from even the strongest sword.”
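
The article does not spell out the protocol behind Wang's defense, but the general idea of hiding communicated updates behind random matrices can be sketched in a deliberately simplified form: clients share a secret random invertible matrix, transmit only transformed gradients, and undo the transform after the server aggregates. The matrix setup, key sharing, and linear model here are assumptions for illustration only, and this toy version carries no formal privacy guarantee.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 2

# Illustrative secret: a random invertible matrix shared among the clients
# (how such a secret would be agreed on is outside this sketch).
A = rng.normal(size=(dim, dim))
A_inv = np.linalg.inv(A)

def masked_gradient(w, X, y):
    """Client computes its gradient, then sends only the transformed
    version; an eavesdropper on the channel sees A @ grad, not grad."""
    grad = X.T @ (X @ w - y) / len(y)
    return A @ grad

def server_aggregate(masked_grads):
    """The server averages what it receives. Because the transform is
    linear, averaging masked gradients equals masking the averaged one."""
    return np.mean(masked_grads, axis=0)

def client_unmask(masked_avg):
    """Clients undo the transform locally to recover the usable update."""
    return A_inv @ masked_avg

# Toy run with two clients on synthetic data.
true_w = np.array([1.0, 3.0])
clients = []
for _ in range(2):
    X = rng.normal(size=(40, dim))
    y = X @ true_w + rng.normal(scale=0.1, size=40)
    clients.append((X, y))

w = np.zeros(dim)
for _ in range(200):
    masked = [masked_gradient(w, X, y) for X, y in clients]
    w -= 0.1 * client_unmask(server_aggregate(masked))
print(w)  # training converges as usual, while raw gradients never cross the wire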

Although they are working independently, Ateniese and Wang are eager to see whether they can combine their approaches for the best of both worlds.

“Federated learning technology can be a game-changer, but we can’t patch fixes after an attack, because once private data has been leaked, it’s gone forever—you can’t reverse the damage,” Ateniese says. “Although we’ve seen hundreds of proposals for privacy-preserving federated learning since my team published our work in 2017, we’ve actually seen no effective solutions—they’ve been too easy to defeat or didn’t fully address privacy leakage. This has to be done properly from the start. Shusen’s method provides several privacy guarantees while being more efficient than a simple cryptographic approach. Ideally, we’ll combine his efforts from the machine learning field with strong security guarantees from cryptography to achieve the ‘holy grail’ of data privacy.”
