Cross-Domain User Authentication on Mobile Devices

Project Objectives

Due to the open nature of voice input, voice assistant (VA) systems (e.g., Google Home and Amazon Alexa) are vulnerable to various security and privacy attacks that can leak sensitive information (e.g., credit card numbers and passwords), especially when users issue critical commands such as large purchases or critical calls. Although existing VA systems may employ voice features to identify users, they remain vulnerable to various acoustic-based attacks (e.g., impersonation, replay, and hidden command attacks). To ensure the successful large-scale deployment of VA systems, it is critical to address these inherent security vulnerabilities and bring trustworthiness to users. In this project, we aim to design a low-overhead system with enhanced security that protects highly critical commands in VA systems.

Technology Rationale

To protect users from these attacks, we propose a voice authentication system that leverages the speech similarity between the vibration domain and the audio domain to verify the user. The insight is that when a user issues a voice command, her voice produces similar voice characteristics in both the aerial speech vibrations and the audio speech. By leveraging the user's wearable device as a personal identity token, our solution captures the user's voice characteristics in the aerial speech vibrations using the wearable's accelerometer and compares them with the voice characteristics in the audio speech captured by the VA device's microphone. For a legitimate user, the voice characteristics obtained from the vibration domain and the audio domain should exhibit high similarity; otherwise, the voice command is likely given by an adversary.
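To make the decision rule concrete, the following is a minimal sketch of the cross-domain check described above. The feature extraction, the similarity metric, and the threshold value are simplified placeholders of our own choosing (names such as similarity, verify_command, and SIM_THRESHOLD are illustrative and not taken from the papers).

```python
# Hypothetical sketch of the cross-domain decision rule.
# Assumes vib_features and mic_features are time-frequency feature maps
# that have already been aligned to the same shape.
import numpy as np

SIM_THRESHOLD = 0.7  # assumed operating point; in practice tuned empirically


def similarity(vib_features: np.ndarray, mic_features: np.ndarray) -> float:
    """Normalized cross-correlation between two flattened feature maps."""
    a = vib_features.ravel() - vib_features.mean()
    b = mic_features.ravel() - mic_features.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def verify_command(vib_features: np.ndarray, mic_features: np.ndarray) -> bool:
    """Accept the voice command only if both domains agree on the speaker."""
    return similarity(vib_features, mic_features) >= SIM_THRESHOLD
```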

Technical Approach

Upon detecting a wake word at the VA device, our system uses the wearable's accelerometer and the VA's microphone to simultaneously capture the voice command in the vibration domain and the audio domain, respectively. The recorded voice command data are sent to a cloud server for user authentication. To enable the cross-domain similarity comparison, we develop a training-free algorithm that converts the high-fidelity microphone data into a low-fidelity, aliased form comparable to the accelerometer data. Figure 1 shows example time-frequency features extracted from the accelerometer readings and the converted microphone data. Our system then correlates the time-frequency features of the speech signals in the vibration domain and the audio domain to verify the voice command. The algorithm can be easily integrated with existing VA systems and wearables without any hardware modification.

Figure 1: Comparison of the time-frequency features extracted from the accelerometer and the microphone readings.
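The sketch below illustrates one way the conversion and feature extraction described above could be realized: the high-rate microphone signal is decimated to the accelerometer's rate without an anti-aliasing filter, so both signals exhibit comparable aliasing before their time-frequency features are compared. The sampling rates, STFT parameters, and function names are assumptions for illustration, not the exact parameters used in the project.

```python
# Minimal sketch, assuming a 16 kHz microphone and a 500 Hz accelerometer.
import numpy as np
from scipy.signal import stft

MIC_FS = 16000   # typical microphone sampling rate (assumed)
ACC_FS = 500     # typical wearable accelerometer rate (assumed)


def alias_downsample(mic: np.ndarray, mic_fs: int = MIC_FS, acc_fs: int = ACC_FS) -> np.ndarray:
    """Decimate by simple sample picking (no low-pass filter) so the
    microphone signal aliases similarly to the accelerometer signal."""
    step = mic_fs // acc_fs
    return mic[::step]


def tf_features(signal: np.ndarray, fs: int) -> np.ndarray:
    """Magnitude spectrogram used for the cross-domain comparison."""
    _, _, Z = stft(signal, fs=fs, nperseg=128, noverlap=96)
    return np.abs(Z)


# Usage (acc: accelerometer recording at ACC_FS, mic: microphone recording at MIC_FS):
# vib_feat = tf_features(acc, ACC_FS)
# mic_feat = tf_features(alias_downsample(mic), ACC_FS)
```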

Project Status

This project has led to three conference papers (ACSAC 2020, ACM AsiaCCS 2021, and ACM WiSec 2021) and one magazine article in IEEE Security & Privacy. We have reported a low-effort, training-free user authentication system leveraging the wearable device. Figure 2 shows the user authentication performance with receiver operating characteristic (ROC) curves. The high true-positive rates and low false-positive rates demonstrate the effectiveness of the proposed authentication system.
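For readers unfamiliar with ROC evaluation, the snippet below shows how curves like those in Figure 2 can be computed from per-command similarity scores. The scores here are synthetic placeholders, not the project's actual results.

```python
# Illustrative ROC computation over synthetic genuine/impostor scores.
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
genuine = rng.normal(0.85, 0.05, 200)   # legitimate-user similarity scores (synthetic)
impostor = rng.normal(0.40, 0.10, 200)  # attacker similarity scores (synthetic)

scores = np.concatenate([genuine, impostor])
labels = np.concatenate([np.ones_like(genuine), np.zeros_like(impostor)])

fpr, tpr, thresholds = roc_curve(labels, scores)
print(f"AUC = {auc(fpr, tpr):.3f}")
```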

References

Cong Shi, Yan Wang, Yingying Chen, Nitesh Saxena, and Chen Wang. WearID: Low-Effort Wearable-Assisted Authentication of Voice Commands via Cross-Domain Comparison without Training. In Proceedings of the 36th Annual Computer Security Applications Conference (ACSAC), pp. 829–842, 2020.

Cong Shi, Yan Wang, Yingying Chen, and Nitesh Saxena. Authentication of Voice Commands on Voice Assistance Systems Leveraging Vibrations on Wearables. IEEE Security & Privacy Magazine (IEEE S&P), 2021.

S Abhishek Anand, Jian Liu, Chen Wang, Maliheh Shirvanian, Nitesh Saxena, and Yingying Chen. EchoVib: Exploring Voice Authentication via Unique Non-Linear Vibrations of Short Replayed Speech. In Proceedings of the ACM ASIA Conference on Computer and Communications Security (ACM AsiaCCS), pp. 67–81, 2021.

S Abhishek Anand, Chen Wang, Jian Liu, Nitesh Saxena, and Yingying Chen. Spearphone: A Lightweight Speech Privacy Exploit via Accelerometer-Sensed Reverberations from Smartphone Loudspeakers. In Proceedings of the ACM Conference on Security and Privacy in Wireless and Mobile Networks (ACM WiSec), 2021.