
Advancing Zero-shot Speech Generation for Human-like Multi-talker Conversation

We introduce CoVoMix: Conversational Voice Mixture Generation, a novel model for zero-shot, human-like, multi-speaker, multi-round dialogue speech generation. In addition, we devise a comprehensive set of metrics for measuring the effectiveness of dialogue modeling and generation. Our experimental results show that CoVoMix can generate dialogues that are not only human-like in their naturalness and coherence but also involve multiple speakers engaging in multiple rounds of conversation. These dialogues, generated within a single channel, are characterized by seamless speech transitions, including overlapping speech, and appropriate paralinguistic behaviors such as laughter and coughing.



Ground truth


he was in jail for fourteen times and they finally deported him | fourteen times | yeah his family spent over two hundred thousand dollars keeping him here | wow | and then finally they said no that’s it he’s out can’t even come back to visit


which is uh very strange it’s not something i ever thought would happen | yeah that’s not good | no


so i’m not i don’t know what it is i don’t know what the minimum wage is or | uh it’s five fifteen now it was like four seventy five something like that | my gosh


something around there | yeah that’s that’s good | i don’t think they’d ever get a divorce | no i know


my life was like consumed with the television and it was it was just sad i it that was awful [laughter] | and i remember the i think the day after and like gas prices went up to like three bucks a gallon | oh i know | everybody was like filling up with gas in the town i live in panicking and mhm | really wow


hard choice to make especially when you get peer pressure and then once you start doing it hey look now i’m cool | mhm | but in reality if you actually had to do that to be cool you’re hanging with the wrong parents anyways [laughter] | right right i think my brother in law and his



Ground truth


totally offended i and like they would be like oh that’s a yankee for you but like they like they would they would be like um


um well a way to get more money at your job and that’s pretty much it you know


right i totally agree it gets in your clothes gets in your hair and for a nonsmoker they don’t realize how sensitive i think th um


yeah it’s really not a pleasant odor and it it’s it’s horrible my fiance smokes um on a r daily basis he smokes like a pack and a half a week or more i don’t know he doesn’t tell me


and i i was i was saddened too because so many people do have a problem with tobacco and it is very addictive and i’m i have family members that


right yeah and i i think that’s what you have to do personally to get to that point i have my my dad kinda got the same way you know it was too much of a hassle to go outside and if it was raining and you know just too much of an inconvenience


and the american culture is so big there that you know because most of the people i saw smoking
