Investing.com -- Google (NASDAQ: GOOGL)’s large language model, DolphinGemma, is playing a significant role in helping scientists understand dolphin communication. The model was announced today, on National Dolphin Day, as part of a collaborative effort with researchers from Georgia Tech and the Wild Dolphin Project (WDP). DolphinGemma is a foundational AI model designed to learn the structure of dolphin vocalizations and generate new dolphin-like sound sequences, pushing the boundaries of AI and our potential connection with the marine world.
The WDP has been conducting extensive research on dolphins since 1985, making it the longest-running underwater dolphin research project in the world. The project focuses on a specific community of wild Atlantic spotted dolphins in the Bahamas, studying them across generations. The WDP’s non-invasive approach has yielded decades of underwater video and audio paired with individual dolphin identities, life histories, and observed behaviors.
The WDP’s primary focus is observing and analyzing dolphins’ natural communication and social interactions. They have spent decades correlating different types of sounds with specific behaviors. For instance, signature whistles are used by mothers and calves to reunite, burst-pulse "squawks" are often observed during fights, and click "buzzes" are often used during courtship or chasing sharks.
Google’s DolphinGemma was developed to analyze this complex communication. The AI model uses Google’s audio technologies, including the SoundStream tokenizer, to efficiently represent dolphin sounds, then processes those representations with an architecture designed for complex sequences. The model builds on insights from Gemma, Google’s collection of lightweight, state-of-the-art open models. DolphinGemma functions as an audio-in, audio-out model: it takes in sequences of natural dolphin sounds, identifies patterns and structure, and predicts the likely subsequent sounds in a sequence.
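The "audio-in, audio-out" approach can be illustrated with a minimal sketch: continuous audio is quantized into discrete tokens, and a sequence model learns which token tends to follow which. Everything below (the toy energy-based quantizer, the bigram predictor) is an invented stand-in for illustration only, not DolphinGemma's actual tokenizer or architecture.

```python
# Toy sketch of a tokenize-then-predict audio pipeline (illustrative only).
from collections import Counter, defaultdict

def quantize(frame_energies, n_bins=4, max_energy=1.0):
    """Map continuous per-frame energies to discrete tokens (toy tokenizer)."""
    return [min(int(e / max_energy * n_bins), n_bins - 1) for e in frame_energies]

class BigramPredictor:
    """Predicts the most frequently observed next token for each token."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, tokens):
        for a, b in zip(tokens, tokens[1:]):
            self.counts[a][b] += 1

    def predict_next(self, token):
        if not self.counts[token]:
            return None
        return self.counts[token].most_common(1)[0][0]

# A toy "recording": a rising-then-falling energy contour, repeated.
energies = [0.1, 0.4, 0.7, 0.9, 0.7, 0.4] * 10
tokens = quantize(energies)

model = BigramPredictor()
model.train(tokens)
print(model.predict_next(tokens[0]))  # the token most often following token 0
```

A real system would replace the quantizer with a learned neural codec (such as SoundStream) and the bigram table with a transformer, but the contract is the same: discrete sound tokens in, predicted sound tokens out.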
The WDP is beginning to deploy DolphinGemma this field season with immediate potential benefits. By identifying recurring sound patterns and reliable sequences, the model can help researchers uncover hidden structures and potential meanings within the dolphins’ natural communication. These patterns, augmented with synthetic sounds created by the researchers, may establish a shared vocabulary with the dolphins for interactive communication.
In addition to analyzing natural communication, WDP is also exploring potential two-way interaction using the CHAT (Cetacean Hearing Augmentation Telemetry) system, developed in partnership with the Georgia Institute of Technology. This underwater computer system is designed to establish a simpler, shared vocabulary with the dolphins, rather than deciphering their complex natural language.
The CHAT system associates novel, synthetic whistles with specific objects that dolphins enjoy, like sargassum, seagrass, or scarves. Researchers hope the naturally curious dolphins will learn to mimic the whistles to request these items. As more of the dolphins’ natural sounds are understood, they can also be added to the system.
To enable two-way interaction, the CHAT system must accurately hear a mimic amid ocean noise, identify in real time which whistle was mimicked, inform the researcher which object the dolphin "requested," and let the researcher respond quickly by offering the correct object.
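The identification step described above amounts to matching a detected sound against a small set of known synthetic whistles and reporting the associated object. The sketch below is a hypothetical illustration of that matching logic: the feature vectors, templates, and threshold are all invented for the example, while the real CHAT system operates on live hydrophone audio.

```python
# Hypothetical whistle-matching step: nearest known template wins,
# unless nothing is close enough (e.g. ordinary ocean noise).
import math

# Invented templates: whistle "contours" as small feature vectors.
TEMPLATES = {
    "sargassum": [0.2, 0.8, 0.5],
    "seagrass":  [0.9, 0.1, 0.4],
    "scarf":     [0.5, 0.5, 0.9],
}

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify_request(detected, threshold=0.3):
    """Return the object whose whistle template best matches the detection,
    or None if no template is within the threshold."""
    best_obj, best_dist = None, float("inf")
    for obj, template in TEMPLATES.items():
        d = distance(detected, template)
        if d < best_dist:
            best_obj, best_dist = obj, d
    return best_obj if best_dist <= threshold else None

print(identify_request([0.25, 0.75, 0.5]))  # close to the sargassum template
print(identify_request([0.0, 0.0, 0.0]))    # far from every template -> None
```

The threshold guards the first requirement, rejecting sounds that match nothing, while the nearest-template search handles the second, telling the researcher which object to offer.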
The journey to understanding dolphin communication is long, but the combination of dedicated field research by WDP, engineering expertise from Georgia Tech, and the power of Google’s technology is opening new possibilities. The development of tools like DolphinGemma aims to deepen our understanding of these intelligent marine mammals and pave the way for a future where the gap between human and dolphin communication might just get a little smaller.
This article was generated with the support of AI and reviewed by an editor. For more information see our T&C.