Speech recognition in ARI


#1

We’re trying to implement speech recognition with barge-in in an ARI stasis application.
The speech dialplan applications can’t be accessed from ARI, but there must be a way to implement this.
Is anyone working on this?


#2

It’s been brought up a few times, but I know of no one working on such a thing.


#3

So, then the question is, what would it take to:

  1. add additional functionality to ARI?
  2. through that functionality access the speech_background dialplan app, or replicate that code in a similar fashion to be called from an ARI method?

The code for channel actions is in res_ari_channels.c, and the code for speech_background is in app_speech_utils.c.

Something simple like Mute is easy to trace and understand: it goes from res_ari_channels.c to resource_channels.c to control.c, where it finally calls ast_channel_suppress.
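
For reference, the client-side entry point of that chain is the ARI mute operation, POST /channels/{channelId}/mute with a direction parameter. Below is a minimal Python sketch of how such a request could be built. The helper name build_mute_request, the base URL, and the channel id are made up for illustration, and authentication is left out, so treat it as a sketch rather than a working client.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_mute_request(base_url, channel_id, direction="both"):
    """Build (but do not send) the HTTP request for ARI's channel mute.

    build_mute_request is a hypothetical helper, not part of any ARI
    client library. Authentication is deliberately omitted.
    """
    if direction not in ("both", "in", "out"):
        raise ValueError("direction must be 'both', 'in' or 'out'")
    # Percent-encode the channel id so it is safe inside the URL path.
    path = "/channels/{}/mute".format(quote(channel_id, safe=""))
    query = urlencode({"direction": direction})
    url = "{}{}?{}".format(base_url.rstrip("/"), path, query)
    return Request(url, method="POST")


# Example values only: the host, port and channel id are assumptions.
req = build_mute_request("http://localhost:8088/ari", "1526000000.1", "in")
print(req.get_method(), req.full_url)
```

On the Asterisk side, a request like this is what eventually lands in res_ari_channels.c and works its way down to ast_channel_suppress.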

By the same approach, tracing playback down to the point where it calls the lower-level ast_… functions is only slightly more difficult.

Based on that, what then would it take to make speech_background work within the same context? And then to add an additional ARI method to access it?


#4

ARI doesn’t execute applications or anything like that. Doing so is rather undefined because of the underlying ownership and control interaction.

For ARI you update some JSON[1] and generate code. (The linked page isn’t specifically about this, since this would be in regards to the channel, but you would still update the channel JSON with new methods.) The generated code then calls functions which you have to implement. Those functions would need to do the same kind of things that app_speech_utils does.
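
To give a rough idea of the shape of such a JSON entry, here is a sketch written as a Python dict and printed as JSON. The field names follow the Swagger 1.x style used by the existing ARI resource descriptions. The path, nickname, and grammar parameter are purely hypothetical placeholders, not a proposed or existing API.

```python
import json

# Hypothetical operation entry for a speech method on the channel resource.
# Nothing here exists in ARI today: the path, nickname and the "grammar"
# parameter are placeholders for illustration only.
speech_api = {
    "path": "/channels/{channelId}/speech",  # hypothetical path
    "description": "Speech recognition on a channel (illustrative only)",
    "operations": [
        {
            "httpMethod": "POST",
            "summary": "Start speech recognition on a channel.",
            "nickname": "startSpeech",  # hypothetical name
            "responseClass": "void",
            "parameters": [
                {
                    "name": "channelId",
                    "description": "Channel's id",
                    "paramType": "path",
                    "required": True,
                    "allowMultiple": False,
                    "dataType": "string",
                },
                {
                    "name": "grammar",  # hypothetical parameter
                    "description": "Grammar to activate",
                    "paramType": "query",
                    "required": False,
                    "allowMultiple": False,
                    "dataType": "string",
                },
            ],
            "errorResponses": [
                {"code": 404, "reason": "Channel not found"},
                {"code": 409, "reason": "Channel not in a Stasis application"},
            ],
        }
    ],
}

print(json.dumps(speech_api, indent=2))
```

The real work is in the functions behind an entry like this, and in agreeing on what the parameters and events should actually be.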

Before doing any of this, though, the actual API as presented in ARI would need to be defined and explored to make sure it fits and that people are happy with it.

[1] https://wiki.asterisk.org/wiki/display/AST/Create+a+new+resource+with+ARI


#5

It doesn’t seem insurmountable to replicate the code that already exists in app_speech_utils in such a way that it could be accessed from a new ARI resource.
And version one of an API doesn’t have to make everyone happy; it just needs to work.

Many people are looking for this functionality, but no one is working on it. It’s open source and this is a forum, so this is the perfect space to work this out.


#6

The developer community and the people who have been interested in this don’t hang out here, so discussing it here wouldn’t yield much feedback. There is a mailing list that people in that area are on[1]. As for a version one of the API, it should still be fairly well defined and agreed upon. We generally don’t include things in the tree which are in flux, because changing them later in any significant way is problematic: you have to remain backwards compatible, and we can’t impact the existing users. It would also need to be fairly well defined to be eligible for release branches (13 or 16, for example).

[1] http://lists.digium.com/cgi-bin/mailman/listinfo/asterisk-app-dev