There has been a huge controversy over smart home devices like the Amazon Echo and Google Home. Many claim that these devices are constantly listening and spying on us. I was slightly sceptical about this, mainly because these devices are a convenience to me. But if I knew that they act as a data feeder to advertisers, the same way websites track your browsing habits, it would make me change my tune on using them.
Therefore, for this exercise, I wanted to focus on whether data is being sent all the time and where it is being sent. It will be interesting to delve into the traffic fingerprint of a smart home device and look at the paths its traffic takes to and fro.
I do, however, understand that Amazon uses TLS 1.2 encryption with certificate validation and pinning. TLS (Transport Layer Security) is a protocol designed to provide communications security over a computer network. Pinning means the device only trusts specific certificates, so it rejects the substitute certificate a proxy would present. What this really means is that it prevents me from using a man-in-the-middle proxy to read along with the encrypted traffic.
Is there a way around this?
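To make the pinning idea concrete, here is a minimal sketch in Python. I don't know how Amazon actually implements pinning on the Echo, so this is only an illustration of the general technique: the client stores a SHA-256 fingerprint of the certificate it expects and refuses anything else. The function names and the placeholder certificate bytes are my own assumptions, not anything taken from Amazon's software.

```python
import hashlib
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert: bytes, pinned: set) -> bool:
    """True only if the certificate's fingerprint is one we pinned in advance."""
    return fingerprint(der_cert) in pinned


def fetch_der_cert(host: str, port: int = 443) -> bytes:
    """Fetch the certificate a server presents (needs network; not called below)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


# Offline demonstration with placeholder bytes (hypothetical, not real certificates).
real_cert = b"placeholder for the certificate the device expects"
proxy_cert = b"placeholder for the substitute certificate a proxy presents"
pins = {fingerprint(real_cert)}

print(is_pinned(real_cert, pins))   # the expected certificate is accepted
print(is_pinned(proxy_cert, pins))  # the proxy's certificate is rejected
```

A man-in-the-middle proxy has to present its own certificate, so its fingerprint never matches the pinned one and the device drops the connection.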
I have read some blogs that recommend Burp, a tool for testing web application security, as a way to get past the encryption. It is interesting that Burp could lighten the burden for enthusiasts who would like to look at unencrypted traffic. But in order to do this, you would need to download the Amazon Alexa developer software and run Alexa on a Raspberry Pi instead of on the Echo. From there, according to those blogs, you would be able to packet sniff Alexa, because you control the client and the proxy can read its traffic without TLS getting in the way.
I didn't manage to do this, but I do have a rough grasp of how it is done. I did, however, manage to run Alexa and do some packet sniffing using Herbivore and Wireshark to make my own observations. For the first test, I left Alexa idle and did not speak to it for a few minutes. What I find interesting is that Alexa kept contacting its server even though I had not given it any voice commands.
My results in Wireshark were similar to Herbivore's: the device kept trying to communicate with the server.
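One way to put numbers on this idle chatter is to export the capture from Wireshark as CSV and count what the device sends to each destination. The sketch below assumes Wireshark's default column layout (No., Time, Source, Destination, Protocol, Length, Info). The IP addresses and rows are made up for illustration and are not from my capture.

```python
import csv
import io
from collections import defaultdict


def summarize_outbound(csv_text: str, device_ip: str) -> dict:
    """Map each destination to (packet count, total bytes) sent by device_ip."""
    totals = defaultdict(lambda: [0, 0])
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Source"] == device_ip:
            entry = totals[row["Destination"]]
            entry[0] += 1                   # one more packet to this destination
            entry[1] += int(row["Length"])  # frame length in bytes
    return {dst: tuple(counts) for dst, counts in totals.items()}


# Hypothetical export: 192.168.1.50 stands in for the smart speaker.
SAMPLE = '''"No.","Time","Source","Destination","Protocol","Length","Info"
"1","0.000","192.168.1.50","203.0.113.10","TLSv1.2","120","Application Data"
"2","0.040","203.0.113.10","192.168.1.50","TCP","66","ACK"
"3","30.100","192.168.1.50","203.0.113.10","TLSv1.2","120","Application Data"
"4","31.500","192.168.1.50","198.51.100.7","TCP","74","SYN"
'''

for destination, (packets, size) in summarize_outbound(SAMPLE, "192.168.1.50").items():
    print(destination, packets, "packets,", size, "bytes")
```

Because the payload is encrypted, this only shows how often and how much the device talks, not what it says.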
I also noticed in Wireshark that Alexa uses ARP, the Address Resolution Protocol, which maps an IP address to a physical (MAC) address at the data link layer.
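To show what an ARP message actually carries, here is a small sketch that unpacks the 28-byte ARP payload used for IPv4 over Ethernet. The field layout follows the ARP specification. The MAC and IP addresses in the example are hypothetical, and the helper names are my own.

```python
import socket
import struct

# hardware type, protocol type, hardware length, protocol length, operation,
# sender MAC, sender IP, target MAC, target IP (28 bytes in total)
ARP_FORMAT = "!HHBBH6s4s6s4s"


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{byte:02x}" for byte in raw)


def parse_arp(payload: bytes) -> dict:
    """Decode the ARP payload that follows the Ethernet header."""
    _htype, _ptype, _hlen, _plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, payload[:28]
    )
    operations = {1: "request", 2: "reply"}
    return {
        "operation": operations.get(oper, str(oper)),
        "sender_mac": format_mac(sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": format_mac(tha),
        "target_ip": socket.inet_ntoa(tpa),
    }


# Hypothetical request: the speaker asks which MAC address owns 192.168.1.1.
example = struct.pack(
    ARP_FORMAT, 1, 0x0800, 6, 4, 1,
    bytes.fromhex("aabbccddeeff"), socket.inet_aton("192.168.1.50"),
    bytes(6), socket.inet_aton("192.168.1.1"),
)
print(parse_arp(example))
```

In a request the target MAC is all zeros because that is exactly the value the sender is trying to find out.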
I then moved on to asking Alexa for the news. Since I have a Google Home, I was curious to compare the two, so I asked it the same question and observed both devices at different times. Interestingly, Alexa seemed to send more data than Google Home.
Herbivore for Alexa
Herbivore for Google Home
I don't really know what is going on or why there is a difference. I am just assuming that Amazon possibly takes in more data than Google, although it doesn't make sense to me why Google would take in less data than Amazon. Perhaps Amazon is trying out a voice sniffing algorithm that they plan to work on.
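Since I observed the two devices at different times, a fairer comparison would count only the bytes each device uploads in a window of the same length after the question. This is a rough sketch of that idea. The packet tuples, IP addresses, and the 10-second window are all assumptions for illustration, not my measured results.

```python
def bytes_sent(packets, device_ip, start, end):
    """Sum the lengths of packets sent by device_ip with start <= time < end."""
    return sum(
        length
        for time, source, length in packets
        if source == device_ip and start <= time < end
    )


# Hypothetical captures as (seconds since the question, source IP, frame length).
alexa_capture = [
    (0.2, "192.168.1.50", 900),
    (1.1, "192.168.1.50", 1400),
    (1.3, "203.0.113.10", 1500),  # reply from the server, not counted
    (12.0, "192.168.1.50", 120),  # outside the window, not counted
]
google_capture = [
    (0.3, "192.168.1.60", 700),
    (0.9, "192.168.1.60", 800),
]

WINDOW = 10.0  # seconds after the question; an arbitrary choice
alexa_total = bytes_sent(alexa_capture, "192.168.1.50", 0.0, WINDOW)
google_total = bytes_sent(google_capture, "192.168.1.60", 0.0, WINDOW)
print("Alexa:", alexa_total, "bytes, Google Home:", google_total, "bytes")
```

Even with equal windows, a difference in size could come from audio encoding or protocol overhead, so it does not prove that one company collects more than the other.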
Interestingly enough, when I asked Alexa to play Spotify, it kept pinging its server every few seconds afterwards.
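To check whether "every few seconds" is really a regular interval, the gaps between outgoing packets can be measured from their timestamps. Below is a minimal sketch. The timestamps are invented for illustration and would normally come from the Time column of a capture.

```python
from statistics import median


def gaps(timestamps):
    """Return the time between each pair of consecutive packets, in seconds."""
    ordered = sorted(timestamps)
    return [round(later - earlier, 3) for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical send times (in seconds) of packets from the speaker to its server.
send_times = [0.0, 5.0, 10.5, 15.0]

intervals = gaps(send_times)
print(intervals)
print("median gap:", median(intervals), "seconds")
```

A steady median gap would suggest a keep-alive or streaming heartbeat, while irregular gaps would point to something triggered by events.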
I am still curious whether we could see our voice-to-text in a terminal as we communicate with these smart devices. I know that the communication log is kept in the app, where only we can retrieve it. If it is possible for them to log our communication with the device and keep an archive of it, I am sure there is a way to see this in real time. Maybe that could be my next exploration.