Perfectly imperfect: AI voice companions evolve beyond ChatGPT with unsettling realism

A new conversational AI called Sesame is raising eyebrows with its uncannily human-like speech patterns, complete with hesitations, self-corrections, and natural interruptions. Unlike traditional AI assistants that simply convert text to speech, Sesame’s breakthrough Conversational Speech Model (CSM) generates speech in a way that mirrors authentic human conversation, potentially marking a significant shift in how we interact with AI systems.

The big picture: Sesame represents a departure from conventional AI voice assistants by deliberately incorporating human imperfections rather than striving for polished perfection.

How it works: Sesame’s Conversational Speech Model combines text and audio processing into a single unified system, enabling more natural speech generation.

  • Unlike voice pipelines that first generate text and then convert it to speech (the approach the source article attributes to ChatGPT and Gemini), Sesame creates speech directly, with human-like pauses, tonal shifts, and filler words.
  • The system can interrupt conversations, apologize for interruptions, and even change its “mind” mid-sentence, mirroring natural human speech patterns.

Key features: The AI demonstrates sophisticated conversational abilities that go beyond traditional voice assistants.

  • It produces natural chuckles when saying something mildly amusing.
  • The system incorporates thoughtful pauses before responding to questions.
  • It seamlessly handles interruptions in both directions, creating more authentic dialogue.

Why this matters: Sesame’s ability to replicate human speech imperfections so accurately raises important questions about the future of AI-human interactions and the increasing difficulty of distinguishing between human and AI voices.

Looking ahead: Sesame remains a niche technology for now, but its development points to a future in which callers may need to verify whether the voice on the line is human or AI.

I tried the most realistic AI voice companion ever created - if ChatGPT or Gemini ever gets this good, reality is in trouble
