Streaming audio latency

I’m building an automated chat/IVR system that uses Asterisk together with speech-to-text and text-to-speech engines. The dialplan essentially consists of these commands:

MP3Player(/path/to/greeting.mp3)
Monitor(/path/to/out-stream.wav)
WaitForSilence(1)
StopMonitor()

I’m finding that there are a couple of seconds of latency between when the MP3Player call finishes playing greeting.mp3 and when the Monitor call actually starts streaming audio to disk. Does anyone have a recommendation for how to reduce this delay?

This is currently running on a Google Compute Engine instance with 3GB of memory and a single CPU, and it seems to be using very few resources during testing.

Any suggestions would be greatly appreciated.