Voice Chat for Games

Okay, let us begin, as always, with a disclaimer… I hate Ventrilo and all the other voice chat software people use for MMOs and whatnot. There is just something clunky about using a tool that is outside the game, and if there is one thing I am a big proponent of, it is putting tools in the game for the players (any game without an in-game notepad annoys me; I don’t want my desk covered in notes, so let me put them in the game).

To that end, what I would really like to see is a move toward “realistic” voice chat in game. I wouldn’t do away with text chat entirely, because text works much better than voice for managing multiple rooms or private chats. And, to a degree, I don’t mind if interaction with NPCs for quests and stuff has to stay text based. Voice for that can come later, if a game can manage what I want.

The first step is to build a sound engine, and structures within the game engine, that support sound over distance. For a simplified system, let’s just say there are 4 levels of sound: Whisper, Normal, Loud, Yell. Roughly equating these to distances: 5 feet, 15 feet, 30 feet, 100 feet (this might need some adjusting, as this is just off the top of my head). Every sound effect in the game has a sound level attached to it. When a sound plays, the appropriate distance from the sound emitter is calculated and the sound is played for every listening object (mostly players) in that range, at the appropriate level. What that last clause means is that something said at “Normal” level doesn’t just travel 15 feet and stop. It travels 15 feet at Normal, and then another 7.5 at Whisper. A Yell would travel 100 feet at Yell, then another 50 feet at Loud, 25 feet at Normal, and 12.5 feet at Whisper. Then, you build “echo” objects that will repeat any sound they “hear”, modified by the properties of the echo object. If you have been in caves you’ll know that sometimes an echo can actually come back at you louder than the original sound, or distorted, not always just softer.
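To make the falloff rule concrete, here is a rough Python sketch of how a listener's heard level might be worked out from the emitted level and the distance. The names (`Level`, `BASE_RANGE_FEET`, `heard_level`) are made up for illustration, and the rule that each lower tier adds half the previous band is just my reading of the Normal and Yell examples above, not a tuned design.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    WHISPER = 1
    NORMAL = 2
    LOUD = 3
    YELL = 4


# Off-the-top-of-my-head ranges from the post; these would need tuning.
BASE_RANGE_FEET = {
    Level.WHISPER: 5.0,
    Level.NORMAL: 15.0,
    Level.LOUD: 30.0,
    Level.YELL: 100.0,
}


def heard_level(emitted: Level, distance_feet: float) -> Optional[Level]:
    """Return the level a listener hears, or None if out of earshot.

    Assumption: the sound keeps its own level for its base range, then
    drops one tier at a time, each tier covering half the previous band.
    """
    band = BASE_RANGE_FEET[emitted]
    outer_edge = 0.0
    level = int(emitted)
    while level >= Level.WHISPER:
        outer_edge += band
        if distance_feet <= outer_edge:
            return Level(level)
        band /= 2  # next tier down travels half as far
        level -= 1
    return None


if __name__ == "__main__":
    for feet in (50, 120, 160, 180, 200):
        print(feet, heard_level(Level.YELL, feet))
```

With these numbers a Yell is audible in some form out to 187.5 feet (100 + 50 + 25 + 12.5), while a Whisper is gone past 5 feet.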

Okay, now that you have it so sound plays at distance and have echoes, the next step is to make NPCs react to sound. Imagine what games like EverQuest or World of Warcraft would be like if your footsteps made sound and the monsters could hear you. Pretty cool, eh? You can bet suddenly people would stop running and jumping to get everywhere.

Now, the final step of my plan… Voice Chat. The player logs in and sets levels in the options for Whisper, Normal, Loud and Yell by speaking into their microphone at the different levels. This way, when the player Yells into his mic, the game will play his sound back in the game as a Yell… 100 feet, then 50, then 25, then 12.5. Everyone in those ranges just heard him, good or bad.
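The calibration step could look something like the sketch below. It assumes the game has short recordings of the player speaking at each level as lists of samples between -1.0 and 1.0, measures loudness as a plain RMS value, and matches incoming audio to the nearest calibrated level. The function names and the `noise_floor` parameter are hypothetical, and a real system would likely need something smarter than nearest-RMS matching.

```python
import math
from typing import Dict, Optional, Sequence

LEVELS = ["Whisper", "Normal", "Loud", "Yell"]


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square loudness of a chunk of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate(recordings: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Store one reference loudness per level from the player's test recordings."""
    return {name: rms(recordings[name]) for name in LEVELS}


def classify(
    samples: Sequence[float],
    calibration: Dict[str, float],
    noise_floor: float = 0.0,
) -> Optional[str]:
    """Pick the calibrated level closest to this chunk's loudness.

    Returns None when the chunk is at or below the (assumed) noise floor,
    so silence is not transmitted as a Whisper.
    """
    loudness = rms(samples)
    if loudness <= noise_floor:
        return None
    return min(LEVELS, key=lambda name: abs(calibration[name] - loudness))


if __name__ == "__main__":
    # Fake calibration recordings at constant amplitude, for illustration only.
    reference = calibrate({
        "Whisper": [0.1, -0.1] * 4,
        "Normal": [0.3, -0.3] * 4,
        "Loud": [0.6, -0.6] * 4,
        "Yell": [0.9, -0.9] * 4,
    })
    print(classify([0.85, -0.85], reference))
```

Because the levels are set per player, someone with a quiet mic and someone with a hot mic both end up with a usable Yell.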

After that, you can get real tricky by utilizing modified echo objects linked together to work like a walkie-talkie or cell phone. I whisper at my end, and even though you are 500 yards away my whisper comes out your end as a whisper (perhaps even with static or other sound modifications added to it).
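Here is a rough sketch of what an echo object and a linked pair might look like in Python. Everything in it (the `Sound` and `EchoObject` classes, `level_shift`, `added_effect`, `linked_to`) is invented for illustration. The idea is only that an echo object re-emits what it hears with its own modifications, and a linked one re-emits at its partner's position instead of its own.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

LEVELS = ["Whisper", "Normal", "Loud", "Yell"]


@dataclass
class Sound:
    level: str
    position: Tuple[float, float]  # feet, on a flat map for simplicity
    effects: Tuple[str, ...] = ()


@dataclass
class EchoObject:
    position: Tuple[float, float]
    level_shift: int = 0                 # +1 = louder echo, -1 = softer
    added_effect: Optional[str] = None   # e.g. "static" or "distortion"
    linked_to: Optional["EchoObject"] = None

    def repeat(self, heard: Sound) -> Sound:
        """Re-emit a heard sound, modified by this object's properties."""
        index = LEVELS.index(heard.level) + self.level_shift
        index = max(0, min(index, len(LEVELS) - 1))  # clamp to valid levels
        effects = heard.effects
        if self.added_effect:
            effects = effects + (self.added_effect,)
        # A linked echo object speaks from its partner's location.
        source = self.linked_to if self.linked_to is not None else self
        return Sound(LEVELS[index], source.position, effects)


if __name__ == "__main__":
    # Walkie-talkie: my end at the origin, your end 500 yards (1500 feet) away.
    mine = EchoObject(position=(0.0, 0.0), added_effect="static")
    yours = EchoObject(position=(1500.0, 0.0))
    mine.linked_to = yours
    print(mine.repeat(Sound("Whisper", (1.0, 0.0))))

    # Cave wall that throws a sound back louder than it arrived.
    cave = EchoObject(position=(10.0, 10.0), level_shift=1)
    print(cave.repeat(Sound("Normal", (0.0, 0.0))))
```

The re-emitted sound would then go through the same distance falloff as any other sound, starting from its new position.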

I know this won’t be easy, as it’s not a simple sound stream, but I’d love to see it done. Anything that moves MMOs away from the feel of a graphical chat room and adds more spatial awareness is good to me.

2 comments

  1. Such a system has merit.

    It would probably be better implemented in a Sci-fi game than a fantasy game, as the quality of audio that’s available better serves the “static-filled commlink” than real personal talk.

    I’d have to admit- I’d be royally irritated every time a cough or a RL problem interfered with gameplay. Even the fumbling of a microphone could easily exceed the volume of a “shout” (bringing an entire mob down on the team). Flu season would put an entirely different level of requirement on my pickup groups.

    Then there are the people that can’t use that interface- the ones that can’t talk in their house at 2am without waking the kids, let alone shout. Sure, you can tone the tolerances down until every breath is transmitted into the game. Sure, he could text-chat… maybe even with a real tactical advantage over others’ accidental audio bloopers.

    As for the sound engine: That’s not necessarily technically difficult- you can see… err… hear… such things already in some single-player games. The limiting factor for the MMO would thus be more one of capacity. It’d be much easier if these ranges coincided with pre-existing detection ranges rather than having their own, as each tier could add to the server’s tracking limits. Also, area increases faster than distance, and tracking processes get very cumbersome as they have more area to manage.

    Either each “tier” of noise has to be tracked as a separate “detection box”, or only the longest-range detection box is tracked, the data is sent to each client, and the client then determines the proper “volume” there. The latter would violate the “trust no client” rule and open us up to hacks by players who might want the advantage of the whisper with the clear communication of a “shout.”

    Finally, we have to deal with the bandwidth challenge. Devs can’t be paying for all the bandwidth caused by all the audio, meaning they can’t have the audio transmitted to their servers for re-broadcast. Audio’s considerably more bandwidth-laden than text, and it would greatly increase the hosting costs should that be necessary.

    Instead, the mapserver would have to send regularly updated lists to all client machines identifying who should get the audio data and where that computer is. The client would then send to these addresses.

    The downside to such a system: The game host has no record of what’s being said (a potentially critical issue for CSRs handling interplayer complaints) and players will have lists of listeners that their characters may not necessarily have detected (stealthed foes). That’s an inherent risk in breaking the “trust no client” rule though.

  2. A solution to many of these problems would be to simply make the voice chat “press to talk”, or at least a “click-on/click-off”.

    … and yeah, I’m not thinking of using this for a fantasy setting, although as time goes on and the acceptance of things like Ventrilo goes up, people will be less likely to think voice chat breaks the immersion.

    I’m also thinking that this is something that becomes more practical as bandwidth gets cheaper. You say devs can’t be paying for the bandwidth, but why not? If $15 a month covers the current costs and new development, and makes a profit, would $30 a month cover the additional bandwidth for a voice chat system? I don’t know myself, but I think it’s worth looking into, perhaps even looking at aligning with one of the Voice over IP providers.
