tl;dr: One peer generates a self-signed certificate and sends the fingerprint of that over the signalling channel; the other connects to it as a "client".
The resulting DTLS keying material is subsequently used for SRTP encryption (for media) and SCTP over DTLS (for the data channel, which is presumably what's being used here).
The resulting DTLS keying material is subsequently used for SRTP encryption (for media) and SCTP over DTLS (for the data channel, which is presumably what's being used here).