The configuration of the session accepts a parameter (modalities) that could restrict the response only to text. See it in https://platform.openai.com/docs/api-reference/realtime-clie....
$ units You have: 1 watermelon You want: grapes * 907.19 / 0.0011023
$ units You have: 1 VIM user You want: Emacs users St4ck 0verflow...
Apple M1 Max with 10-core CPU, 32-core GPU, 16-core Neural Engine - Takes 38 seconds as well up to 46 when it gets hotter.
Can anyone give comparison with Nvida gpu in terms of performance?