DolphinGemma Could Enable AI Communication with Dolphins

Rachel Feltman: For Scientific American’s Science Quickly, I’m Rachel Feltman.

There are a few animals that pretty much everyone likes: fluffy pandas, cute kittens and regal tigers. Dolphins would probably make the list for most folks; they’re intelligent, playful and have that permanent smile on their faces. Watching them darting around in the water kind of makes you wonder: “What are those guys thinking?”

It’s a question many scientists have asked. But could we actually find out? And what if we could talk back?


Freelance ocean writer Melissa Hobson has been looking into a new project that’s making a splash—sorry!—in the media: what’s being billed as the first large language model, or LLM, for dolphin vocalizations.

Could this new tech make direct communication with dolphins a reality? Here’s Melissa to share what she’s learned.

[CLIP: Splash and underwater sounds.]

Melissa Hobson: When you dip your head under the waves at the beach, the water muffles the noise around you and everything goes quiet for a moment. People often assume that means the ocean is silent, but that’s really not true. Underwater habitats are actually full of noise. In fact, some marine animals rely heavily on sound for communication—like dolphins.

[CLIP: Dolphin vocalizations.]

If you’ve ever been in the water with dolphins or watched them on TV, you’ll notice that they’re always chattering, chirping, clicking and squeaking. While these intelligent mammals also use visual, tactile and chemical cues, they often communicate with each other using vocalizations.

Thea Taylor: They have a really, really broad variety of acoustic communication.

Hobson: That’s Thea Taylor, a marine biologist and managing director of the Sussex Dolphin Project, a dolphin research organization based on England’s south coast. She’s not involved in the dolphin LLM project, but she’s really interested in how AI models such as this one could boost our understanding of dolphin communication. When it comes to vocalizations, dolphins generally make three different types of sounds.

Whistles for communication and identification.

[CLIP: A dolphin whistles.]

Hobson: Clicks to help them navigate.

[CLIP: A dolphin makes a clicking noise.]

Hobson: And burst pulses, which are rapid sequences of clicks. These tend to be heard during fights and other close-up social behaviors.

[CLIP: Dolphins make a series of burst noises.]

Hobson: Scientists around the world have spent decades trying to find out how dolphins use sound to communicate and whether the different sounds the mammals make have particular meanings. For example, we know each dolphin has a signature whistle that is essentially its name. But what else can they say?

Arik Kershenbaum is a zoologist at Girton College at the University of Cambridge in England. He’s an expert in animal communication, particularly among predatory species like dolphins and wolves. Arik’s not involved in the dolphin LLM work.

Arik Kershenbaum: Well, we don’t really know everything about how dolphins communicate, and the most important thing that we don’t know is: we don’t know how much they have to say. They’re not all that clear, really, in terms of the cooperation between individuals, just how much of that is mediated through communication.

Hobson: Over the years researchers from around the world have collected vast amounts of data on dolphin vocalizations. Combing through these recordings manually in search of patterns takes time.

Taylor: AI can, A, process data a lot faster than we can. It also has the benefit of not having a human perspective. We almost have an opportunity with AI to kind of let it have a little bit of free rein and look at patterns and indicators that we may not be seeing and we may not be picking up, so I think that’s what I’m particularly excited about.

Hobson: That’s what a team of researchers is hoping to do with an AI project called DolphinGemma, a large language model for dolphin vocalizations created by Google in collaboration with the Georgia Institute of Technology and the nonprofit Wild Dolphin Project.

I caught up with Thad Starner, a professor at Georgia Tech and research scientist at Google DeepMind, and Denise Herzing, founder of the Wild Dolphin Project, to find out how the LLM works.

The Wild Dolphin Project has spent 40 years studying Atlantic spotted dolphins. This includes recording acoustic data that was used to train DolphinGemma. Then teams at Georgia Tech and Google asked the LLM to generate dolphinlike sound sequences.

What it created surprised them all.

The AI model generated a type of sound that Thad and his team had been unable to reproduce synthetically using conventional computer programs. Could the ability to create this unique dolphin sound get us a step closer to communicating with these animals?

Thad Starner: We’ve been having a very hard time reproducing particular types of vocalizations we call VCM3s, and it’s the way the dolphins prefer to respond to us when we are trying to do our two-way communication work.

Hobson: VCM Type 3 sounds, or VCM3s, are a variation on the burst pulses we mentioned earlier.

Denise Herzing: Traditionally, in experimental studies in captivity, dolphins, for whatever reason, mimicked whistles they were given using a tonal whistle, like [imitates dolphin whistle], right, you would hear it. What we’re seeing and what Thad was describing is the way the spotted dolphins that we work with seem to want to mimic, and it’s using a click, or two clicks, and it’s basically taking out energy from certain frequency bands.

[CLIP: A dolphin vocalizes.]

Starner: And so when I first saw the results from the first version of DolphinGemma, half of it was, you know, the—mimicking ocean noise. But then the second half of it was actually doing the types of whistles we expect to see from the dolphins, and to my surprise the VCM3s showed up. And I said, “Oh, my word, the stuff that’s the hardest stuff for us to do—we finally have a way to actually create those VCM3s.”

Hobson: Another way they will be using the AI is to see how the LLM completes sequences of dolphin sounds. It’s a bit like when you’re typing into the Google search bar and autocomplete starts finishing your sentence, predicting what you were going to ask.

Starner: Once we have DolphinGemma trained up on everything, we can fine-tune on a particular type of vocalization and say, “Okay, when you hear this what do you predict next?” We can ask it to do it many, many different times and see if it predicts a particular vocalization back, and then we can go back and look at Denise’s 40 years of data and say, “Hey, is this consistent?” Right? It helps us get a magnifying glass to see what we should be paying attention to.

Hobson: If the AI keeps spitting back the same answers consistently, it might reveal a pattern. And if the researchers found a pattern, they could then check the Wild Dolphin Project’s underwater video footage to see how the dolphins were acting when they made a specific sound. This could add important context to the vocalization.
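To picture how that repeated-prediction probe could work, here is a minimal Python sketch. A toy bigram model stands in for DolphinGemma, and the token names (whistle_A, click_train, burst_2) are invented labels for categorized sounds, not real project data.

```python
from collections import Counter, defaultdict
import random

# Hypothetical tokenized vocalization sequences: each symbol stands in for a
# categorized dolphin sound. The labels are invented for illustration.
corpus = [
    ["whistle_A", "click_train", "burst_2", "whistle_A"],
    ["whistle_A", "click_train", "burst_2", "whistle_B"],
    ["whistle_A", "click_train", "burst_2", "whistle_A"],
]

# Train a toy bigram model: count which token tends to follow which.
transitions = defaultdict(Counter)
for seq in corpus:
    for cur, nxt in zip(seq, seq[1:]):
        transitions[cur][nxt] += 1

def sample_next(token):
    """Sample a continuation in proportion to its observed frequency."""
    options = transitions[token]
    return random.choices(list(options), weights=list(options.values()))[0]

# Ask "what comes next?" many times, as Starner describes, and check whether
# the answers pile up on one vocalization: a lopsided count hints at a
# pattern worth checking against the archive of audio and video.
votes = Counter(sample_next("burst_2") for _ in range(1000))
print(votes.most_common())
```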

Herzing: “Okay, what were they doing when we saw Sequence A in these 20 sequences? Were they always fighting? Were they always disciplining their calf?”

I mean, we know they have certain types of sounds that are correlated with certain types of behaviors, but what we don’t have is the repeated structure that would suggest some languagelike structures in their acoustics.

Hobson: The team also wants to see what the animals do when researchers play dolphinlike sounds that have been created by computer programs to refer to items such as seagrass or a toy. To do this the team plans to use a technology called CHAT that was developed by Thad’s team. It stands for cetacean hearing augmented telemetry.

The equipment, worn while free diving with the dolphins, can recognize audio and play sounds. Luckily for Denise, who has to wear it, the technology has become much smaller and less cumbersome over the years and is now all incorporated into one unit. It used to be made up of two parts: a chest plate and an arm panel.

Starner: And when Denise would actually slide into the water there’s a good chance that she could knock herself out.

Herzing: [Laughs] I never knocked myself out. Getting in and out was the challenge. You needed a little crane lift, right? “Drop her in!”

Starner: ’Cause the thing was so big and heavy until you got into the water, and it was hard to make something that you could put on quickly. And so we’ve iterated over the years with a system that was on the chest and on the arm, and now we have this small thing that’s just on the chest, and the big change here is that we discovered that the Pixel phones are good enough on the AI now that they can do all the processing in real time much better than the specialty machines we were making five years ago.

And so we’ve gone down from something that was, I don’t know, four or five different computers in one box to basically a smartphone, and it’s really, really changed what we can do, and, and I’m no longer afraid every time that Denise slides into the water [laughs].

Hobson: The researchers use the CHAT system to essentially label different items. Two free divers get into the water with dolphins nearby. If the researchers can see they won’t be disturbing the dolphins’ natural behaviors, they use their CHAT device to play a made-up dolphinlike sound while holding or passing a specific object.

The hope is that the dolphins might learn which sounds refer to different items and mimic those specific noises to ask for the corresponding objects.
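Here is a minimal sketch of that labeling-and-matching idea, not the actual CHAT software: the object names, whistle contours (coarse frequency tracks in hertz) and matching threshold are all invented, and a real system would compare full spectrograms rather than a handful of points.

```python
# Each toy is assigned a made-up synthetic whistle, stored as a coarse
# frequency contour in Hz. All values here are invented for illustration.
WHISTLE_TEMPLATES = {
    "scarf": [8000, 9500, 11000, 9500, 8000],
    "seagrass": [6000, 6500, 7000, 7500, 8000],
}

def match_whistle(contour, threshold=500.0):
    """Return the object whose template is closest to the heard contour,
    or None if nothing matches well enough."""
    best_label, best_dist = None, float("inf")
    for label, template in WHISTLE_TEMPLATES.items():
        # Mean absolute frequency difference; a real system would use a
        # time-warping or spectrogram-based comparison instead.
        dist = sum(abs(a - b) for a, b in zip(contour, template)) / len(template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None

# An (imagined) imperfect dolphin mimic of the scarf whistle:
heard = [7900, 9600, 10800, 9400, 8100]
print(match_whistle(heard))  # -> "scarf": cue the divers to hand over the toy
```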

Herzing: You wanna show the dolphins how the system works, not just expect them to just figure it out quickly and absorb it, right? So another human and I, another researcher, we are asking each other for toys using our little synthetic whistles. We exchange toys, we play with them while the dolphins are around watching, and if the dolphins wanna get in the game, they can mimic the whistle for that toy, and we’ll give it to ’em.

Hobson: For example, this is the sound researchers use for a scarf. The dolphins like to play with scarves.

[CLIP: Scarf vocalization sound.]

Hobson: And Denise has a specific whistle she uses to identify herself.

[CLIP: Denise’s signature whistle sound.]

Hobson: But could the team be unintentionally training the dolphins, like when you teach a dog to sit? Here’s what Thea had to say.

Taylor: I think my hesitation is whether that’s the animal actually understanding language or whether it’s more like: “I make this sound in relation to this thing, I get a reward.”

This is where we have to be careful that we don’t kind of bring in the human bias and the “oh, it understands this” kind of excitement—which I get, I totally get. People want to feel like we can communicate with dolphins because, I mean, who wouldn’t want to be able to talk to a dolphin? But I think we do have to be careful and look at it from a very kind of unbiased and scientific point of view when we’re looking at the concept of language and what animals understand.

Hobson: This is where we need to pause and get our dictionary out. Because if we’re trying to discover whether dolphins have language, we need to be clear on exactly what language is.

Kershenbaum: Well, there’s no one really good definition of language, but I think that one of the things that really has to be present if we’re going to give it that very distinguished name of “language” is that these different communicative symbols, or sounds or words or whatever you want to call them, need to be able to be combined in different ways so that there’s really—you could almost say almost anything, you know; if you can combine different sounds or different words into different sentences, then you have at your disposal an infinite range of concepts that you can convey. And it’s that ability to—really to be unlimited in what you can say that seems to be what’s the important part of what language is.

Hobson: So if we understand language as the ability to convey an infinite number of things, rather than just assigning different noises to different objects, can we say that dolphins have language?

At the moment Arik thinks the answer is probably no.

Kershenbaum: So they clearly have the cognitive ability to identify objects and distinguish between different objects by different sounds. That’s not quite the same, or it’s not even close to being the same, as having language. And we know that it’s possible to teach dolphins to understand human language.

If I had to guess, I would say that I think dolphins probably don’t have a language in the sense that we have a language, and the reason for that is quite simple: language is a very complicated and expensive thing to have—it’s something that uses up an awful lot of our brain—and it only evolves if it provides some evolutionary benefit. And it’s not at all clear what evolutionary benefit dolphins would have from language.

Hobson: To Arik this research project is not about translating the sounds the animals make but seeing if they appear to recognize complex AI sequences as having meaning.

Kershenbaum: So there’s that wonderful example in the movie Star Trek [IV]: The Voyage Home where the crew of the Enterprise are trying to communicate with humpback whales. And Kirk asks Spock, you know, “Can we reply to these animals?” And he says, “We could simulate the sounds but not the language. We would be responding in gibberish.”

Now there’s a couple of reasons why they would be responding in gibberish. One is that when you listen to a few humpback whales you cannot possibly have enough information to build a really detailed map of what that communication looks like.

When you train large language models on human language you are using the entirety of the Internet—billions upon billions of utterances are being analyzed. None of us investigating animal communication have a dataset anywhere near the size of a human dataset, and so it’s extremely difficult to have enough information to reverse engineer and understand meaning just from looking at sequences.

Hobson: There’s another problem. When we translate one human language to another we know the meanings of both languages. But that’s not true for dolphin communication.

Kershenbaum: When we’re working with animals we actually don’t know what a particular sequence means. We can identify, perhaps, that sequences have meaning, but it’s very, very difficult to understand what that meaning is without being able to ask the animal themselves, which, of course, requires language in the first place. So it’s a very circular problem that we face in decoding animal communication.

Hobson: Denise says this project isn’t exactly about trying to talk to dolphins—at least not yet. The possibility of having a true conversation with these animals is a long way off. But researchers are optimistic that AI could open new doors in their quest to decode dolphins’ whistles. Ultimately, they hope to find potential meanings within the sequences.

So could DolphinGemma help us figure out if dolphins and other animals have language? Thad hopes so.

Starner: With language comes culture, and I’m hoping that if we start doing this two-way work, the dolphins will reveal to us new things we’d never expected before. I mean, we know that they dive deep in some of these areas and see stuff that humans have never seen. We know they have lots of interactions with other marine life that we have no idea about.

Hobson: But even if it’s unlikely we’ll be having a chat with Flipper anytime soon, scientists are interested to see where this might lead. Humans often see language as the thing that sets us apart from animals. Might people have more empathy for cetaceans—that’s whales, dolphins and porpoises—if we discovered they use language?

Taylor: As someone who’s particularly interested, obviously, in cetacean communication, I think this could be [a] really vital step forward for being able to understand it, even in kind of the more basic senses. If we can start to get more of a picture into the world of cetaceans, the more we understand about them, the more we can protect them, the more we can understand what’s important. So yeah, I’m excited to see what this can do for the future of cetacean conservation.

Feltman: That’s all for this week’s Friday Fascination. We’re taking Monday off for Memorial Day, but we’ll be back on Wednesday.

In the meantime, we’d be so grateful if you could take a minute to fill out our ongoing listener survey. We’re looking to find out more about our listeners so we can continue to make Science Quickly the best podcast it can be. If you submit your answers this month, you’ll be eligible to win some sweet SciAm swag. Go to ScienceQuickly.com/survey to fill it out now.

Science Quickly is produced by me, Rachel Feltman, along with Fonda Mwangi, Kelso Harper, Naeem Amarsy and Jeff DelViscio. This episode was reported and co-hosted by Melissa Hobson and edited by Alex Sugiura. Shayna Posses and Aaron Shattuck fact-check our show. Our theme music was composed by Dominic Smith. Subscribe to Scientific American for more up-to-date and in-depth science news.

For Scientific American, this is Rachel Feltman. Have a great weekend!
