Alexa-fying Wolfram Alpha

When it comes to asking questions, the most natural thing for people, I think, is to say them out loud.

Speech recognition is really easy to set up in p5. There’s a neat library called p5.speech which has a speech synthesis component and a speech recognition component, letting you pull strings out of speech as well as produce speech from strings.

The voice recognition part is trivial. Simply start listening and wait for it to identify a sentence. Once it has, save the sentence to a variable to preserve it, and then send it off to Wolfram’s API.
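
A minimal sketch of that flow, assuming an initialized PubNub client and a made-up ‘wolfram_in’ channel (the keys and channel name are placeholders, not my actual config):

// Listen continuously, grab the recognized sentence, publish it.
const pubnub = new PubNub({ publishKey: 'pub-key-here', subscribeKey: 'sub-key-here' });
let question = '';
const speechRec = new p5.SpeechRec('en-US', gotSpeech);

function setup() {
  noCanvas();
  // continuous = true keeps listening; interimResults = false waits
  // for whole sentences instead of partial guesses
  speechRec.start(true, false);
}

function gotSpeech() {
  if (speechRec.resultValue) {
    question = speechRec.resultString; // preserve the sentence
    pubnub.publish({
      channel: 'wolfram_in',
      message: { text: question }
    });
  }
}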

The speech synthesis is proving more of a challenge. As of December 2018, Chrome has disabled speech synthesis that isn’t triggered by direct user input.

boo

Of course, this complicates things, since I want the synth to automatically speak the response it receives from the API.

I have yet to solve this without making the user press a button to confirm playback.
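
The stopgap looks something like this (a sketch; the button and variable names are mine):

// A click counts as the direct user input Chrome wants, so gate
// playback behind a button. Names here are placeholders.
const synth = new p5.Speech();
let answerText = ''; // filled in when the API response arrives

function setup() {
  const speakButton = createButton('hear answer');
  speakButton.mousePressed(() => {
    synth.speak(answerText); // allowed: triggered by a user gesture
  });
}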

UPDATE: changing the API type from “spoken” to “short answer” actually somehow solves this???? I have no idea why yet, but IRONICALLY, SPOKEN DOESN’T LIKE TO BE SPOKEN.
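
So the auto-speak flow works after all. Roughly, reusing the synth object from the workaround sketch above (the channel name and message shape are my assumptions, mirroring the PubNub function further down):

// The function below runs on 'wolfram_in' and attaches an answer
// field to the message, so subscribers can just speak what arrives.
pubnub.addListener({
  message: (event) => {
    if (event.message.answer) {
      synth.speak(event.message.answer);
    }
  }
});
pubnub.subscribe({ channels: ['wolfram_in'] });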

That being said, the speech synthesizer is… mmm. Bad with time. “Seven colon thirty-nine” is not a human-friendly way to give the hour!!
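
One fix I might try: strip the colon out before speaking, since the synth reads “7 39” roughly the way a person would say the time. A sketch, not something I’ve wired in yet:

// Turn "7:39" into "7 39" so the synth says "seven thirty-nine"
// instead of "seven colon thirty-nine".
function humanizeTime(answer) {
  return answer.replace(/(\d{1,2}):(\d{2})/g, '$1 $2');
}

// humanizeTime('It is 7:39 PM') -> 'It is 7 39 PM'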


As for customizing an engine to be more specific about a field or topic, that one is an interesting case. One of the things I find especially frustrating about verbal communication with computers is that they give out the simplest answer possible, but sometimes you want context. What assumptions did the machine take for granted that allowed it to pick that answer? In some of my test cases, I would ask for the temperature, but the location didn’t match my own.

Going through the API, I found that location is an input parameter you can be explicit about. For the sake of time, I simply hard-coded the location to be Toronto, but in the future, it would be more useful for the user’s location to be identified by their IP address and then passed into the server-side code, in order to locate the user wherever they might be. It would be worth looking into PubNub’s geolocation API.
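
(As a browser-side alternative to IP lookup, something like the Geolocation API could feed coordinates straight into the message; a sketch, with the message shape assumed, since the function below would then have to read it instead of using the hard-coded pair:)

// Ask the browser for coordinates instead of hard-coding them.
navigator.geolocation.getCurrentPosition((pos) => {
  const coords = pos.coords.latitude + ',' + pos.coords.longitude;
  pubnub.publish({
    channel: 'wolfram_in',
    message: { text: question, geolocation: coords }
  });
});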

However, this proved to be a bit of a frustrating roadblock. Though Wolfram Alpha’s API documentation suggests that the location query parameter should accept a string such as ‘Toronto’, the location never seemed to change. I know it wasn’t a matter of me failing to save the function properly, because I managed to change the units from Fahrenheit to Celsius no problem.

poop

IT TURNS OUT. The conversational API differs from the base API: it doesn’t use the “location” parameter, but the “geolocation” parameter, which takes a latitude,longitude pair instead of a place name.
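
In other words (the coordinates are Toronto’s, as in the function below):

// Not this (base API style):
//   queryParams.location = 'Toronto';
// But this (conversational API style):
queryParams.geolocation = '43.649404,-79.388785'; // Toronto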

I hate APIs so much 🙃 (that’s a lie, I think APIs are really neat and they do wonderful things, but they rely so much on good documentation, and when there are conflicting sources, it causes so many headaches.)

Ok so, it’s cool, it’s chill, it works. If you ask it questions now, it will assume Toronto’s where you are and try to answer accordingly.

ALSO. This speech-to-text recognizer has been a source of ummm… INTERESTING HUMOUR all night, as it has been picking up on, and censoring, my err… colourful exclamations of frustration.

[screenshots: the recognizer censoring said colourful exclamations]

PS: another small thing I find kind of annoying is that if you’re not careful with the language you use, the results might rest on the wrong assumptions. For example, “what is the weather” returns a definition of the word “weather” instead of the weather, but “what is the temperature” returns the expected results. It doesn’t appear that the spoken API accepts the “assumption” query parameter that the base API does, so this would require a lot of query interpretation code-side, and that can get really tricky really fast.
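
Just to give a flavour of what that interpretation layer might look like (the rewrite list is entirely made up):

// Naive client-side query interpretation: rewrite known trouble
// phrasings before they hit the API. The table is a toy example.
const rewrites = [
  [/^what('s| is) the weather/i, 'what is the temperature'],
];

function interpretQuery(text) {
  for (const [pattern, replacement] of rewrites) {
    if (pattern.test(text)) {
      return text.replace(pattern, replacement);
    }
  }
  return text;
}

// interpretQuery('what is the weather') -> 'what is the temperature'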


Video documentation of it working.

P5 CODE HERE.

PubNub code:

const xhr = require('xhr');
const query = require('codec/query_string');

export default (request) => {
  const appId = '4PJJL4-RQK8E84YR7';
  const spoken = 'http://www.wolframalpha.com/api/v1/result-json.jsp';

  // geolocation (not location!) is how the conversational API wants
  // its place information: a latitude,longitude pair. Hard-coded to
  // Toronto for now.
  const queryParams = {
    appid: appId,
    input: request.message.text,
    geolocation: '43.649404,-79.388785',
    units: 'metric'
  };

  const apiUrl = spoken + '?' + query.stringify(queryParams);

  // Fetch the answer and attach it to the message in flight, so
  // subscribers receive the question plus its answer.
  return xhr.fetch(apiUrl)
    .then((r) => {
      const body = JSON.parse(r.body || r);
      request.message.answer = body.result || body.error;
      return request.ok();
    })
    .catch((e) => {
      // Let the message through even if the API call fails.
      console.error(e);
      return request.ok();
    });
};
