Workshop Notes #5
Eavesdropper is a web application that actively listens for the mention of a user’s name spoken in conversation. The application uses a voice to text API that transcribes conversations that are in listening range. The transcriptions are analyzed, and if the name of someone is said the application will sound an alert noting that that person is going to be notified through text message. Simultaneously, the clip of what was being said around the user is saved. The user then receives a text message and can go see what was being said about them.
Building upon my previous voice to text “DIY Siri” project, I wanted to play around with the concept “what if my computer could notify me if it heard something specific?”. I initially thought that it would be interesting to build directly off of the Wolf Ram Alpha API from the DIY Siri project to notify me if something specific was searched. From here I decided that I wanted to isolate the working parts and start with the concept of “the application hears a specific word, the user gets a notification”. I chose to use names as a trigger because they are rare enough that the trigger would not be sent frequently. This is important because both IFTTT and Adafruit IO have data sending and receiving limits. IFTTT has a limit of sending up to 100 text messages a month, and Adafruit IO has a limit of updating channels 30 times a minute.
I started off by using my existing code from DIY Siri and removing any of the PubNub server integration. I then changed the code to analyze the transcripts of what was being said. If my name was mentioned, then log this information.
My next step was to connect my Adafruit IO channel to the page. I created a new feed titled “overheard” with two channels: listening, and transcripts. Listening would indicate whether or not my name was overheard, and transcripts would save whatever was being said about me.
After creating those two channels, I connected my voice to text API to Adafruit to see if I would be able to save the value “true” and the transcript of the conversation. I tested with “if my name is included in this transcript, send the data to Adafruit”. This was successful.
Upon the guidance from Adafruit, I started to create an applet of my own to connect this feed to my phone. I chose the if “this” (trigger) to be Adafruit IO, and the “then that” (action) to be an SMS text message. On the Adafruit side, I selected to monitor the feed “overheard” and the channel “listening”. If “listening” was equal to the data “true” then send a text message. The UX of IFTTT made it simple to connect the two platforms together.
I started testing my application with all of the parts now connected. At first, I was not receiving text messages. This was because I was sending Adafruit a boolean value and not a string. The “equal to” on the IFTTT side of the platform was comparing the channel value to the string “true”. I changed the value of what I was sending to Adafruit to a string and was able to receive a text message.
Once I received a text message, I resulted in receiving six in a row. I realized that my voice-to-text alert that played upon hearing my name was vocalizing my name out of the speakers, which in result my application was picking up. This created an infinite loop of alerts. “Alert alert Olivia has been notified that you mentioned her and received a text message”. I attempted to stop the recursive loop by turning off the voice recognition and restarting it. The issue was with each time a new voice recognition object is instantiated explicit permission from the user to have their microphone activated was required. A quick fix for this was so that I could continue development was to not use my name in the text to voice alert from my speakers. I chose to use “O Prior has been notified” rather than using my name, Olivia.
For the UX/UI of the application, I chose to use a simple button. When the application was not actively listening a button would appear that said “Shhhhhhh!”. If the button was clicked, a microphone permissions prompt would display requesting access. Once the application was listening to the room the entire screen would turn black to be “inconspicuous”. The stop button was designed to be black and appears if the cursor hovers overtop of the element. If the name Olivia is heard in conversation, then a .gif file plays showing someone pressing an alert button. The video and message loop twice before returning to a black screen.
Video demo of Eavesdropper
One challenge I faced was attempting to connect two channel to the IFTTT applet. I wanted to additionally send the transcript as data through the SMS notification. The applet that was connected to Adafruit only allowed for the data of one channel to be used in the SMS. Due to the set up of the applet, I could only compare on direct values (such as greater than, is not equal too, etc.) This inhibited me from using the transcript channel as a trigger to send the message. Alternatively, I could have set up the applet so that it sent a message anytime the transcript channel was updated. With this method, I would have to be concerned with character length and substring the message to ensure that the data would not exceed the character limit for the SMS. I did not want to cut the transcript short, so I chose to use the boolean method. If the user wanted to see what was being said about them, they could investigate the transcript channel and use the time the text message was sent as an indicator for what was being said about them at that moment.
The other challenge I noted was the text to speech API. I had to use a function that checked many different iterations of “Olivia”. This included all different capitalizations of Olivia and with different punctuation. This was only an issue once so all of the iterations may not be necessary. The function that I used is incredibly useful if this program were adapted to listen for multiple keywords. The program loops through the transcript and checks for strings for a list of words that are kept in an array. The array can be modified to store whatever the user would want to be notified of in a conversation.
Next steps & conclusion
The next step for this project would be to find a way to use the data from both of the channels. Having different customized messages from the triggered conversation I think would provide a better experience for the user and would stop the application from being redundant.
Overall IFTTT is an intuitive platform that uses simple and accessible logic that allows many people to create bespoke trigger and responses. Some limitations may be an issue if one were to have their applet on a grander scale, such as the SMS message limit of 100 per month. This web application is a simple demo of what can be achieved with lots of other potentials to adapt to more bespoke applications.
References & Research