Hi,
I have two suggestions.
1. Check that the camera actually provides audio before you add the audio renderer and connect the audio part of the pipeline. You can use the RTSPSourceSettings information API for this.
2. Set the latency to a higher but still typical value, such as 100-200 ms, or use the default value for testing. For 25 fps video, one frame lasts 40 ms, so a latency of 20 ms is too low: it is shorter than a single frame. In theory 20 ms could work at 60 fps and higher frame rates, where a frame lasts about 16.7 ms or less, but I suggest using at least 100 ms.
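
To make the frame-duration arithmetic in point 2 concrete, here is a small Python sketch. It is not part of any SDK: the helper names `frame_duration_ms` and `check_latency` are made up for illustration, and the 100 ms floor is just the minimum suggested above.

```python
def frame_duration_ms(fps: float) -> float:
    """Return the duration of one video frame in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def check_latency(latency_ms: float, fps: float, suggested_min_ms: float = 100.0) -> str:
    """Classify a latency setting against the frame duration.

    Hypothetical helper: suggested_min_ms defaults to the 100 ms
    minimum suggested in this reply, not to any SDK constant.
    """
    if latency_ms < frame_duration_ms(fps):
        # The buffer cannot hold even one full frame.
        return "too low: shorter than one frame"
    if latency_ms < suggested_min_ms:
        # Theoretically workable, but below the suggested floor.
        return "below suggested minimum"
    return "ok"


if __name__ == "__main__":
    for latency, fps in [(20, 25), (20, 60), (150, 25)]:
        print(f"{latency} ms at {fps} fps: {check_latency(latency, fps)}")
```

With these assumptions, 20 ms at 25 fps is flagged as shorter than one 40 ms frame, 20 ms at 60 fps passes the frame check but falls below the suggested floor, and 150 ms at 25 fps is fine.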