Guo Xinhua wants to teach computers to echolocate. He and his colleagues have built a device, about the size of a thin laptop, that emits sound at frequencies 10 times higher than the shrillest note a piccolo can sustain. The pitches it produces are inaudible to the human ear. When Guo’s team aims the device at a person and fires an ultrasonic pitch, the gadget listens for the echo using its hundreds of embedded microphones. Then, employing artificial intelligence techniques, his team tries to decipher what the person is doing from the reflected sound alone.
The technology is still in its infancy, but the team has achieved some promising initial results. Based at the Wuhan University of Technology, in China, Guo’s team has tested its microphone array on four different college students and found that it can identify whether the person is sitting, standing, walking, or falling with complete accuracy, the researchers report in a paper published today in Applied Physics Letters. While they still need to test that the technique works on more people, and that it can identify a broader range of behaviors, this demonstration hints at a new technology for surveilling human behavior.
Guo’s device belongs to a category of technology known as human activity recognition, in which a computer analyzes signals to figure out what people are doing. Smartwatch pedometers, for example, convert acceleration and rotation data into the number of steps the wearer has taken. Now, researchers like Guo are designing systems that can identify more complicated human behavior, with the help of more sophisticated AI techniques. Some work with sound data, like Guo; others are developing better image recognition algorithms. Some researchers have even shown that they can identify simple human poses by analyzing ambient Wi-Fi signals, says computer scientist Albrecht Schmidt of the Ludwig Maximilian University of Munich. “When humans move through the signals, they change them,” he says. Fluctuations in Wi-Fi signals can reveal, for example, a person clapping, making a phone call, or squatting.
Guo’s team has designed an algorithm that relates a sound signal to a specific human posture. After the device captures the echo, the algorithm first removes some ambient noise, and then analyzes the data for patterns. The mixture of frequencies in the reflected sound, for example, can offer hints about what’s happening in a room. Whatever pose a person is holding will end up reflecting back more of one pitch than another. The algorithm exploits these differences to determine the person’s posture.
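To make that pipeline concrete, here is a minimal sketch in Python of the general idea: subtract the ambient noise, measure how much of each pitch comes back, and match the mixture to the closest known posture. It is not the team's actual algorithm. The sample rate, the four probe frequencies, the reflection strengths in `POSTURES`, and every function name are assumptions invented for illustration, and the nearest-template matching stands in for whatever AI technique the researchers use.

```python
import math
import random

SAMPLE_RATE = 200_000                        # samples per second (assumed)
BANDS_HZ = [30_000, 40_000, 50_000, 60_000]  # ultrasonic probe pitches (assumed)
N_SAMPLES = 400                              # each pitch fits a whole number of cycles

def band_energy(signal, freq_hz):
    """Energy of one pitch in the signal, from a single-frequency DFT."""
    step = 2 * math.pi * freq_hz / SAMPLE_RATE
    re = sum(s * math.cos(step * n) for n, s in enumerate(signal))
    im = sum(s * math.sin(step * n) for n, s in enumerate(signal))
    return (re * re + im * im) / len(signal) ** 2

def features(echo, ambient):
    """Subtract ambient energy per pitch, then normalise to a pitch mixture."""
    raw = [max(band_energy(echo, f) - band_energy(ambient, f), 0.0)
           for f in BANDS_HZ]
    total = sum(raw) or 1.0
    return [r / total for r in raw]

def synth_echo(weights, rng, noise=0.05):
    """Fake echo: a weighted sum of the probe pitches plus random noise."""
    return [
        sum(w * math.sin(2 * math.pi * f * n / SAMPLE_RATE)
            for w, f in zip(weights, BANDS_HZ)) + rng.gauss(0, noise)
        for n in range(N_SAMPLES)
    ]

def classify(feat, templates):
    """Return the posture whose stored pitch mixture is closest to feat."""
    return min(templates, key=lambda label: math.dist(feat, templates[label]))

# Made-up reflection strengths per pitch for each posture (not measured data).
POSTURES = {
    "standing": [1.0, 0.3, 0.2, 0.1],
    "sitting":  [0.3, 1.0, 0.2, 0.1],
    "walking":  [0.2, 0.3, 1.0, 0.2],
    "falling":  [0.1, 0.2, 0.3, 1.0],
}

rng = random.Random(0)
ambient = synth_echo([0, 0, 0, 0], rng)      # room recorded with no one in it
templates = {label: features(synth_echo(w, rng), ambient)
             for label, w in POSTURES.items()}

new_echo = synth_echo(POSTURES["falling"], rng)
print(classify(features(new_echo, ambient), templates))
```

In this toy version each posture reflects one pitch far more strongly than the others, so the matching is easy. Real echoes would be much messier, which is presumably where the hundreds of microphones and the more sophisticated AI come in.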
The algorithm works more accurately if you use more microphones to pick up the echo, says Guo, because the tone differences between the various poses appear starker. Guo’s current array, which uses 256 microphones, is bulky and likely too expensive to mass produce, so his team is now trying to reduce the number of microphones without compromising accuracy.
But companies have yet to develop these new behavior detection techniques into commercial products. It’s still unclear what, if anything, they will be used for, says Schmidt.
Guo has a few ideas for his sound array. One possibility is to incorporate it into future Amazon Echo-like devices, so they can listen for elderly people falling in their own homes. He thinks it might also be used as an alternative to image or video recognition software. Sound can identify objects in environments that a camera can’t, he says, such as in dark rooms or smoky areas.
Guo also thinks that sound-based monitoring can preserve individual privacy better than video surveillance, which could make people more willing to accept this technology in their homes, he says.
But technology ethicist Jake Metcalf of Data & Society, a New York-based research institute, argues that sound monitoring could just as easily be combined with video to create a heightened form of surveillance.
That’s because even though researchers may develop the technology with one application in mind, they don’t control how people end up using it. Sound monitoring could end up saving the life of someone’s elderly parent, but it could also be exploited by the state to look for and persecute a Muslim kneeling in prayer, says Metcalf, referring to China’s ongoing surveillance of its minority Uyghur population. A hacker could also rejigger it to listen for people having sex. “These technologies just scream to be repurposed,” he says. “That’s the whole point of them. It’s why they’re valuable.”
But Guo’s technology isn’t there yet. His team has to develop the device further before it can be deployed in any sort of product. They need to shrink down the hardware, and they plan to run more tests to make sure the algorithm works on more people and in more scenarios. For example, they will collect behavior data on more people from more diverse demographics and have them fall at different angles, says Guo. As the sound array gets smarter, maybe someone will figure out how to put it to use.