In this episode of the Huberman Lab podcast, Dr. Eddie Chang joins Andrew Huberman to explore the neurobiology of speech and language. Chang explains how speech and language rely on distinct neural systems, detailing the complex motor coordination required for speech production and distinguishing it from non-linguistic vocalizations. The discussion includes insights into stuttering as a speech motor disorder and the role of auditory feedback in speech control.
The conversation then shifts to brain-machine interfaces and their potential to restore communication in paralyzed patients. Chang describes the Bravo trial, where electrode arrays and machine learning algorithms successfully decoded brain activity into words for a locked-in stroke patient. The episode also covers avatar technology that translates neural activity into animated facial expressions and speech, as well as the ethical considerations surrounding neural augmentation technologies that could enhance human capabilities beyond medical restoration.

Edward Chang explains that speech and language, while closely related, rely on different cognitive and neural systems. Language encompasses semantics, syntax, and pragmatics: the broader aspects of communication that allow listeners to extract meaning. Speech, in contrast, refers specifically to the physical production of audible signals through vocal tract movements. Chang emphasizes that speech is only one form of language, alongside reading and sign language, and that producing it is arguably the most complex motor task humans perform.
Speech production begins with exhalation from the lungs. As air moves through the larynx, the vocal folds vibrate at about 100 hertz in men and 200 hertz in women, creating "voicing." This sound then travels upward through the pharynx and oral cavity, where the tongue, lips, and mouth shape it into distinct consonants and vowels that listeners interpret as words.
Chang distinguishes speech from non-linguistic vocalizations like crying or laughter. These sounds use similar physical mechanisms but arise from different, evolutionarily older brain areas shared with non-human primates. People with injuries in speech centers can often still produce these vocalizations.
Stuttering exemplifies impaired coordination in the brain's speech machinery. Chang notes that people who stutter have completely intact language abilities but struggle with the motor control needed to articulate words smoothly. He likens the brain's coordination of speech structures to an orchestra that must be precisely synchronized. Auditory feedback (listening to one's own speech) plays a key role, and disruptions in this feedback loop can improve or worsen stuttering. Behavioral speech therapy is most effective when it starts early, in childhood, while neuroplasticity is robust.
The first participant in the Bravo trial suffered a severe brainstem stroke 15 years ago that left him completely paralyzed and unable to speak, though his cognition remained intact. He communicated by using a stick attached to his baseball cap to peck at a keyboard letter by letter. Surgeons implanted an electrode array over the areas of his brain that control speech structures, allowing researchers to record brainwaves generated by his attempts to speak.
Machine learning algorithms analyzed the participant's brainwaves during attempted speech. Over weeks of training, the AI learned to detect distinctive neural patterns associated with specific words. While the decoding isn't perfect at the single-word level, the system uses linguistic context and autocorrect-like algorithms—similar to smartphone technology—to predict and correct words based on probabilities from previous words, significantly improving accuracy.
This marked the first time a completely paralyzed, locked-in patient could communicate in full words and sentences decoded directly from brain activity. When words appeared on the display, the participant shook with laughter out of joy, demonstrating emotional engagement. The system began with 50 words to build grammar and prediction models, with vocabulary expected to expand over time. Practical challenges remain, such as laughter interfering with decoding accuracy, but the breakthrough establishes that brain-to-text communication is feasible for paralyzed patients.
Brainstem strokes sever the connection between the brain's thinking center and the nerves needed for speaking or writing, leaving patients mentally alert but unable to express themselves. Other conditions, such as ALS, lead to motor neuron death and loss of voluntary movement. Locked-in syndrome is one of the most profound communication disabilities: complete awareness with no means of voluntary expression. The advances made in the Bravo trial represent a major leap in restoring agency for people otherwise trapped in silence.
Chang and Andrew Huberman discuss how avatar technology is reshaping communication by translating neural activity into animated facial expressions and speech.
Chang emphasizes that facial expressions and mouth movements play a crucial role beyond speech acoustics alone. Watching another person's mouth and jaw movements as they speak actually helps the listener's brain process spoken sounds more effectively, leveraging natural audiovisual integration for better comprehension.
Chang argues that animated avatars provide more authentic communication than simple text-based outputs, especially for individuals using speech neuroprosthetics. Avatars allow paralyzed users to control animated representations that mimic their speech and facial movements in real time, creating a sense of embodiment and making communication feel more natural and personal. This integration is moving into broader social and virtual platforms, allowing paralyzed individuals to participate more fully in digital society.
Looking forward, Chang and Huberman foresee avatar-driven communication rapidly rising on social media. Chang confirms that users will soon be able to have computer-animated avatars vocalize their messages, complete with facial expressions and emotional nuance. Significant progress is being made in building holistic avatars that decode and express the full breadth of human speech and emotion based on neural activity, with profound implications for both general consumers and those with disabilities.
Chang and Huberman discuss the implications of developing brain-machine interfaces for cognitive and communicative enhancements that go beyond restoring lost function to expanding human capabilities.
Chang describes a rapidly approaching scenario where devices could augment memory, communication speed, and athletic precision beyond natural limits. The line between medical restoration and enhancement is blurring as restorative technologies become capable of pushing people "beyond super normal." Smartphones already provide cognitive augmentation by granting access to global information, and the question now is how BMIs might accelerate these abilities through direct brain interfaces.
Chang emphasizes that the drive to augment human ability isn't new. Throughout history, humans have used substances like caffeine and pharmaceuticals to sharpen cognitive function, and these substances routinely move from medical to widespread consumer use. Cosmetic procedures similarly demonstrate willingness for elective self-modification. As technology matures, neural augmentation will likely follow a similar pattern from therapeutic to mainstream consumer use.
Chang voices concern that society hasn't sufficiently considered the ethical or societal impact of cognitive enhancement. There's no consensus on whether increased abilities improve wellbeing or present unforeseen risks. A major concern is access: sophisticated neural augmentation will likely be expensive, raising equity issues and potentially widening societal inequalities. More broadly, he worries that questions of social desirability, equity, access, and downstream consequences remain largely undiscussed even as the technology approaches readiness for consumer use.
1-Page Summary
Edward Chang explains that speech and language, while closely related, rely on different cognitive and neural systems. Language encompasses the broader aspects of communication, such as semantics (the meaning of words), syntax (how words are assembled into grammatical sentences), and pragmatics (the gist or context of communication). These facets allow listeners to extract understanding and meaning from what is said. Speech, in contrast, refers specifically to the physical production of audible signals—words generated by movements of the vocal tract. These vibrations in the air are picked up by the ears or recording devices and translated into electrical activity for the brain to process. Chang emphasizes that speech itself is only one form of language; other modalities include reading and sign language. Producing speech is an extremely complex motor task, arguably the most complex performed by humans, since it requires precisely coordinated actions of many anatomical structures to produce intelligible words and sentences.
Speech production begins with the lungs—a person inhales, then exhales, pushing air outward. As this air moves through the vocal folds in the larynx, the folds come together and vibrate at high frequencies, creating "voicing." Typically, the vocal folds vibrate at about 100 hertz in men and 200 hertz in women, a difference arising from the larger larynx in men, which results in a lower resonance frequency and explains the difference in voice quality between the sexes.
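To make the voicing frequencies above concrete, the following minimal sketch converts a vocal fold frequency into the time between vibration cycles and builds an idealized pulse train at that rate. The function names and the 16 kHz sample rate are assumptions chosen for illustration; real voicing is far less regular than this.

```python
def glottal_period_ms(f0_hz: float) -> float:
    """Time between successive vocal fold cycles, in milliseconds."""
    return 1000.0 / f0_hz


def pulse_train(f0_hz: float, duration_s: float, sample_rate: int = 16000) -> list[float]:
    """Idealized voicing source: one unit impulse per vocal fold cycle.

    The sample rate is an assumed value, not something from the episode.
    """
    samples_per_cycle = round(sample_rate / f0_hz)
    n_samples = int(duration_s * sample_rate)
    return [1.0 if i % samples_per_cycle == 0 else 0.0 for i in range(n_samples)]


# Typical values quoted in the summary: about 100 Hz for men, 200 Hz for women.
for label, f0 in [("about 100 Hz", 100.0), ("about 200 Hz", 200.0)]:
    pulses = int(sum(pulse_train(f0, duration_s=1.0)))
    print(f"{label}: one cycle every {glottal_period_ms(f0):.1f} ms, {pulses} cycles per second")
```

At 100 hertz the folds complete a cycle every 10 milliseconds; at 200 hertz, every 5 milliseconds. The pharynx, tongue, and lips then shape this raw periodic source into distinct speech sounds.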
Once the initial sound is created in the larynx, this energy travels upwards into the pharynx and then into the oral cavity, which includes the mouth, tongue, and lips. These structures further shape and modify the airflow, turning the raw voicing into distinct speech sounds—consonants and vowels—which are perceived as words. Thus, speech is the result of a highly complex and precisely coordinated system that begins with lung exhalation, is modulated by the larynx, and is meticulously shaped by the structures above the larynx to create the acoustic patterns that listeners interpret as language.
Chang distinguishes speech from non-linguistic vocalizations such as crying, moaning, or laughter. These sounds are produced by similar physical mechanisms—air exhaled and vocal folds vibrating in the larynx—but they arise from different areas of the brain than those responsible for speech and language. People with injuries in the speech and language centers of the brain can often still produce vocalizations such as moans or cries. These non-linguistic vocalizations are controlled by brain structures that are shared with non-human primates and are evolutionarily older and separate from the specialized human speech system.
Stuttering exemplifies how impaired coordination in the brain’s speech machinery can disrupt fluent speech. People who stutter typically have completely intact language abilities, including vocabulary, syntax, and semantics—they know exactly what they want to say—but struggle with the motor control required to articulate words smoo ...
Neurobiology of Speech and Language: Distinguishing Processing and Motor Control, Understanding Brain Coordination of Vocal Tract Movements
The first participant in the Bravo trial is a man who was in a car accident 15 years ago. Although he initially walked out of the hospital, the next day he developed a severe complication: a large stroke in the brainstem. The stroke left him in a coma for about a week, and when he awoke he could not move his arms or legs or speak intelligibly. His cognition was intact, but he retained only some neck control and limited mouth movement, leaving him locked in and unable to communicate except by blinking.
With residual neck movement, the participant adapted by having friends attach a stick to his baseball cap, which he used to peck at a keyboard, letter by letter, to construct words. This process provided his main form of communication for 15 years, as he was otherwise unable to speak.
To advance beyond these communication barriers, surgeons implanted an electrode array onto the areas of his brain that control the vocal tract (the larynx, lips, tongue, and jaw), regions normally involved in speech. This electrode array, connected via a port fixed to his skull and passing through his scalp, allowed the recording of analog brainwaves generated by his attempts to speak. These signals were then converted into digital data for further processing.
The brainwaves collected during the participant's attempted speech were analyzed by machine learning algorithms. Over weeks of training, the participant was prompted to attempt to say specific words while the AI system learned to detect the distinctive neural patterns associated with attempts to say those words.
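The training step described above can be sketched with a toy classifier. The summary says only that machine learning algorithms were used, so everything here is an assumption for illustration: the three words, the eight-electrode "templates," the noise level, and the nearest-centroid method are stand-ins, not the trial's actual data or model.

```python
import random

# Hypothetical mean activity on 8 electrodes for each attempted word (made up).
TEMPLATES = {
    "water": [1, 0, 0, 1, 0, 0, 1, 0],
    "hello": [0, 1, 0, 0, 1, 0, 0, 1],
    "good":  [0, 0, 1, 0, 0, 1, 0, 0],
}


def simulate_trial(word, rng, noise=0.2):
    """One noisy recording of an attempt to say `word` (synthetic data)."""
    return [x + rng.gauss(0.0, noise) for x in TEMPLATES[word]]


def train_centroids(trials):
    """Average the feature vectors of each word's prompted training trials."""
    sums, counts = {}, {}
    for word, feats in trials:
        acc = sums.setdefault(word, [0.0] * len(feats))
        for i, x in enumerate(feats):
            acc[i] += x
        counts[word] = counts.get(word, 0) + 1
    return {w: [s / counts[w] for s in acc] for w, acc in sums.items()}


def classify(centroids, feats):
    """Pick the word whose learned pattern is closest to the new recording."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, feats))
    return min(centroids, key=lambda w: sq_dist(centroids[w]))


rng = random.Random(0)
train = [(w, simulate_trial(w, rng)) for w in TEMPLATES for _ in range(20)]
centroids = train_centroids(train)

held_out = [(w, simulate_trial(w, rng)) for w in TEMPLATES for _ in range(30)]
accuracy = sum(classify(centroids, f) == w for w, f in held_out) / len(held_out)
print(f"held-out accuracy on synthetic trials: {accuracy:.2f}")
```

The synthetic patterns here are deliberately easy to separate. Real neural recordings are much noisier, which is why single-word decoding in the trial was imperfect.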
This decoding process, while powerful, does not identify individual words with perfect accuracy, and it requires weeks of intensive training and iterative model refinement.
To compensate, the system uses additional linguistic context—similar to the autocorrect feature on smartphones—by constructing a computational model of all possible word combinations within a limited vocabulary. Probabilities from previous words help predict and correct subsequent outputs, improving overall communication accuracy even when single-word decoding is imperfect.
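The correction step can be illustrated with a small example that combines per-word decoder scores with probabilities conditioned on the previous word. This is a sketch under stated assumptions: the four-word vocabulary, the bigram probabilities, and the decoder scores are all invented, and Viterbi search over a bigram model is one standard way to do this kind of correction, not necessarily the method used in the trial.

```python
import math

# Hypothetical toy vocabulary (the trial used 50 words; these four are made up).
VOCAB = ["i", "am", "thirsty", "family"]

# Hypothetical bigram model: P(next word | previous word); "<s>" is sentence start.
BIGRAM = {
    "<s>":     {"i": 0.70, "am": 0.10, "thirsty": 0.10, "family": 0.10},
    "i":       {"i": 0.05, "am": 0.80, "thirsty": 0.10, "family": 0.05},
    "am":      {"i": 0.05, "am": 0.05, "thirsty": 0.70, "family": 0.20},
    "thirsty": {"i": 0.40, "am": 0.20, "thirsty": 0.10, "family": 0.30},
    "family":  {"i": 0.40, "am": 0.30, "thirsty": 0.20, "family": 0.10},
}

# Hypothetical decoder scores per attempted word. At the second position the
# decoder slightly prefers the wrong word ("family" over "am").
SCORES = [
    {"i": 0.60, "am": 0.20, "thirsty": 0.10, "family": 0.10},
    {"i": 0.10, "am": 0.35, "thirsty": 0.15, "family": 0.40},
    {"i": 0.10, "am": 0.10, "thirsty": 0.50, "family": 0.30},
]


def greedy_decode(scores):
    """Take the top-scoring word at each position, ignoring context."""
    return [max(step, key=step.get) for step in scores]


def viterbi_decode(scores):
    """Find the word sequence that maximizes decoder score times bigram probability."""
    best = {
        w: (math.log(BIGRAM["<s>"][w]) + math.log(scores[0][w]), [w])
        for w in VOCAB
    }
    for step in scores[1:]:
        new_best = {}
        for w in VOCAB:
            # Best previous word to transition from, given the paths so far.
            prev = max(VOCAB, key=lambda p: best[p][0] + math.log(BIGRAM[p][w]))
            log_prob = best[prev][0] + math.log(BIGRAM[prev][w]) + math.log(step[w])
            new_best[w] = (log_prob, best[prev][1] + [w])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


print(greedy_decode(SCORES))   # ['i', 'family', 'thirsty']
print(viterbi_decode(SCORES))  # ['i', 'am', 'thirsty']
```

Decoding each word in isolation yields "i family thirsty," while weighting each candidate by how likely it is to follow the previous word recovers "i am thirsty." Restricting the vocabulary keeps the table of word-to-word probabilities small enough to specify and search exhaustively.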
The trial marked the first time that a completely paralyzed, locked-in patient could communicate in full words and sentences decoded directly from brain activity. When prompted with words, he would attempt to say them, and, as the words appeared on a display, his reaction was visible—he shook with laughter out of joy when the system worked, demonstrating emotional engagement and success.
The system began with a constrained set of 50 words to generate all possible sentences and provide a manageable basis for building grammar and prediction models. Over time, the vocabulary is expected to expand, increasing the flexibility and richness of communication.
Brain-Machine Interfaces & Neuroprosthetics: Restoring Communication in Paralyzed Patients With the Bravo Trial & AI For Translating Neural Activity
Edward Chang and Andrew Huberman discuss how advances in avatar technology are reshaping communication, particularly by leveraging brain-computer interfaces (BCIs) to translate neural activity into animated facial expressions and speech, thus enabling more natural interactions.
Facial expressions and mouth movements play a crucial role in human communication beyond what speech acoustics alone convey. Chang emphasizes the importance of visual cues: for instance, a quizzical look during conversation signals the need for clarification or adjustment, guiding speakers to rephrase, slow down, or alter their message. Such dynamic feedback allows conversation to flow and adapt naturally.
It is not only expressive content that matters. According to Chang, watching another person’s mouth and jaw movements as they speak actually helps the listener’s brain process and understand spoken sounds more effectively. Visual and acoustic information combine in the brain to produce better speech comprehension than sound alone, leveraging natural audiovisual integration for more intelligible and natural communication.
Chang argues that animated avatars provide richer, more authentic communication than simple text-based speech outputs, especially for individuals using speech neuroprosthetics. For people who are paralyzed and communicate through BCIs, neuroprosthetic technology can now allow them to control an animated avatar that mimics their speech and facial movements in real time. This creates a sense of embodiment and control, making the communication process feel more natural and personal.
Chang notes that avatars provide superior feedback during neuroprosthetic speech training compared to text on a screen. Users can directly see how their intended expressions and mouth movements are rendered by the avatar, accelerating learning and enhancing their sense of self-expression.
The integration of avatar technology is also moving into broader social and virtual platforms—an important development for paralyzed individuals. Holistic avatars capable of replicating mouth, jaw, and facial movements allow these users ...
Avatar Technology: Translating Brain Activity Into Expressions and Speech for Natural Communication
Edward Chang and Andrew Huberman discuss the implications of developing brain-machine interfaces (BMIs) for cognitive and communicative enhancements that go beyond restoring lost function to expanding human capabilities. The conversation reveals both technological precedents and urgent ethical concerns in the field.
Chang describes a rapidly approaching scenario in which devices could provide augmentation for memory, communication speed, and athletic precision, taking abilities beyond current natural limits. These enhancements, such as super memory or communication speeds exceeding speech, raise novel questions, especially since most brain-machine interface pathways have so far focused on strictly medical applications—helping restore abilities after injury or disease. However, the line between medical restoration and enhancement is blurring, especially as restorative technologies become capable of pushing people “beyond super normal.”
Smartphones serve as an immediate precursor to this kind of neural augmentation. As Chang notes, current devices provide vast cognitive augmentation, granting access to global information through simple handheld technology. In this sense, cognitive enhancement has already begun, and the question now is how BMIs might accelerate these abilities, perhaps allowing far faster access and interaction via direct brain interfaces.
Chang emphasizes that the drive to augment human ability is not new. Throughout history, humans have used a variety of substances—such as caffeine, nicotine, and pharmaceuticals—to sharpen cognitive function and physical performance. Substances routinely move from strictly medical uses to widespread consumer applications. This fluid movement between medical and non-medical self-enhancement applies to cosmetic procedures as well, where technology enables people to alter physical appearance for non-medical reasons, signaling a willingness for elective self-modification that likely extends to neural technology. As technology matures, it is probable that neural augmentation will follow a similar pattern, transitioning from therapeutic to mainstream consumer use.
Despite this trajectory, Chang points out that current neural interfaces do not match the natural capacities evolved for cognition and communication: existing technology is limited by bandwidth and integration. Because augmentation is progressing rapidly, however, these limitations might soon be challenged.
Chang voices his concern that society has not sufficiently considered the ethical or societal impact of cognitive and communicative enhancement. He ...
Ethical Considerations of Neural Augmentation For Enhancement Beyond Medical Restoration: Access, Societal Impact, and Appropriate Use
