The Metaverse Will Engage All Five Senses
Dr. Jeroen Van Ham | Apr 07, 2023
Even as the electronics industry keeps improving virtual reality (VR) and augmented reality (AR), along comes an even more expansive concept: the metaverse. Where AR and VR are essentially visual in nature, we expect the metaverse to engage all our senses to blend real and digital worlds into an even richer experience.
VR and AR may be visual, but providing that visual experience requires a suite of sensors and actuators that have nothing to do with detecting or emitting light. Sensors for positioning, stability, and motion are fundamental. Data from these different sensors must be processed, and software that can integrate that data to produce progressively richer and more enjoyable experiences must be continuously refined. Developing VR/AR has been no small challenge.
The metaverse experience will require more sensors, a greater variety of sensor types, far greater processing power, and ever more sophisticated software. Orchestrating all of this will be that much more formidable a task: eminently achievable, but the process will certainly keep us all busy.
To get a better grasp of what the metaverse might be, a quick look at where we are today will be useful.
VR could include digital representations of real places in our physical world, virtual depictions of environments that otherwise don't exist, or any conceivable mix of the seemingly real with the fantastical, as long as all of it is computer-generated.
AR and mixed reality (MR) both start with the real, physical world and then add something virtual. The virtual element could be anything from simple alphanumeric data to animated objects, characters, or scenes. What distinguishes AR from MR is still a matter of opinion. When distinctions are made they are typically about matters of degree—about the balance of real and virtual.
To make things murkier, there's also XR. For some the X in XR is just a stand-in for the V, A, and M in VR, AR, and MR, respectively. Others say the X in XR means "extended," and of course there's no agreement about what that means either. Extended reality might be synonymous with the metaverse—or it might not. Or it might end up being merely a catch-all term that relieves people from listing "VR, AR, MR, and the metaverse" every time the subject comes up.
While the terms remain fuzzy, the technologies involved can be considered somewhat definitive, at least for now. For starters, VR is defined by the use of goggles and probably always will be.
As of today, AR and MR both tend to rely on either a transparent medium (eyeglasses or goggles) or a camera-and-screen combination that can behave as if it were transparent (a smartphone or, again, goggles), accompanied by some mechanism that lets users perceive digital data and imagery. Currently the technologies for AR and MR are very similar, and XR, if it is anything, is not adequately defined, so for the remainder of this essay we're going to keep things simple and refer only to VR, AR, and the metaverse.
There is no agreement on the definition of the metaverse either—but we stated our opinion earlier. We think there's no point in building toward a metaverse unless it reaches beyond VR and AR to encompass sight, sound, touch, smell, and taste.
So we believe the metaverse will be defined by technology that supplements goggles and glasses, and perhaps even replaces them. This will almost certainly include wearables. Wearables could take almost any form, from jewelry (e.g., smart watches) to smart clothing to prosthetic devices—and someday in the far future perhaps even medical implants.
Humans are visual creatures, and that's not going to change, so VR and AR will naturally be foundational elements of the metaverse. Displays are fundamental to both VR and AR.
VR systems rely almost exclusively on digital screens inserted into goggles. Different AR systems take different approaches, but they generally boil down to either mounting a tiny digital display somewhere in the user's field of view or projecting digital content onto the eyeglass lens(es).
TDK recently introduced a new projection option: a small, lightweight laser module that projects full-color digital imagery. A reflector built into the lenses of goggles or glasses directs the projection straight onto the user's retinas. An exciting fringe benefit of this technique is that it produces crisp, clear images even for people with imperfect vision.
This approach makes it possible to build headsets that are smaller and weigh significantly less than those using other display techniques. Weight might not be a huge issue with VR goggles, but lighter headsets will no doubt be appreciated. For AR headsets, however, size and weight are genuine obstacles. Mass-market indifference to AR thus far is attributable at least in part to the hardware being obtrusive, heavy, and unattractive.
We believe that building AR headsets with tiny laser modules could help accelerate AR headset sales and AR application development.
Virtually any human movement could conceivably be incorporated into a VR or AR application, so it is necessary to detect heads swiveling, fingers pointing, and bodies crouching, all with six degrees of freedom (6DoF) or even nine (9DoF). The motion sensors involved must be extremely accurate; imprecise motion detection can lead to varying degrees of disorientation in some users.
Motion is one thing; position is another. A VR user might move forward, backward, or laterally in space, though most VR applications don't compel users to do so. Even so, user safety requires that the VR rig detect and track where users are in real space, for example to keep them from colliding with walls. It is also useful to detect anything else in that environment: furniture, other people, pets, and so on.
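To make that safety requirement concrete, here is a minimal Python sketch of the kind of play-area boundary check a VR rig could run against tracked headset positions. The rectangular play area, the 0.5 m warning distance, and all of the names are illustrative assumptions rather than any particular headset's API.

```python
# Minimal sketch of a play-area boundary check.
# All names and thresholds here are illustrative assumptions, not any vendor's API.

from dataclasses import dataclass

@dataclass
class PlayArea:
    # Axis-aligned rectangular play space, in meters, measured at setup time.
    x_min: float
    x_max: float
    z_min: float
    z_max: float

def distance_to_nearest_wall(area: PlayArea, x: float, z: float) -> float:
    """Return how far the tracked headset is from the closest wall of the play area."""
    return min(x - area.x_min, area.x_max - x,
               z - area.z_min, area.z_max - z)

def check_boundary(area: PlayArea, x: float, z: float, warn_at: float = 0.5) -> str:
    """Classify the user's position: 'ok', 'warn' (fade in a virtual grid), or 'outside'."""
    d = distance_to_nearest_wall(area, x, z)
    if d < 0:
        return "outside"
    return "warn" if d < warn_at else "ok"

# Example: a 3 m x 2.5 m room, headset tracked 0.3 m from the right-hand wall.
room = PlayArea(x_min=0.0, x_max=3.0, z_min=0.0, z_max=2.5)
print(check_boundary(room, x=2.7, z=1.2))  # -> "warn"
```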
Most AR systems, meanwhile, must account for full mobility in real, physical space, and therefore require accurate position sensing. Inherent in the concept of AR is the peril of distraction; at any point while wearing an AR rig, a user could be reading, viewing a video, or reacting to some virtual object. It is imperative that the AR system compensate for user distraction by being aware of the environment on the user's behalf.
The sensors available for detecting the environment and fixing position start with 3-axis accelerometers and 3-axis gyroscopes, which are combined to detect motion with 6DoF; integrating a 3-axis magnetometer yields 9DoF sensing. Time-of-flight (ToF) sensors round out the suite by measuring the distance to nearby objects and people. These sensors can be used in almost any combination, depending on the requirements of any given VR or AR application, or set of applications.
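As a rough illustration of how accelerometer and gyroscope data get combined, here is a minimal Python sketch of a complementary filter, one common lightweight fusion technique, estimating a single pitch angle. The sensor values, the 0.98 blend factor, and the function names are all illustrative assumptions; production headsets use considerably more sophisticated fusion.

```python
# Minimal sketch of sensor fusion: a complementary filter that blends gyroscope
# integration (smooth, but drifts over time) with an accelerometer tilt estimate
# (noisy, but drift-free). Names and the 0.98 blend factor are illustrative.

import math

def accel_pitch(ax: float, ay: float, az: float) -> float:
    """Pitch angle (radians) implied by the gravity vector seen by the accelerometer."""
    return math.atan2(-ax, math.sqrt(ay * ay + az * az))

def fuse_pitch(prev_pitch: float, gyro_rate_y: float, ax: float, ay: float, az: float,
               dt: float, alpha: float = 0.98) -> float:
    """One complementary-filter step: mostly trust the integrated gyro, gently
    pull toward the accelerometer estimate to cancel long-term drift."""
    gyro_estimate = prev_pitch + gyro_rate_y * dt
    return alpha * gyro_estimate + (1.0 - alpha) * accel_pitch(ax, ay, az)

# Example: 100 updates at 100 Hz while the headset slowly tilts forward.
pitch = 0.0
for _ in range(100):
    pitch = fuse_pitch(pitch, gyro_rate_y=0.05, ax=-0.17, ay=0.0, az=0.98, dt=0.01)
print(f"estimated pitch: {math.degrees(pitch):.1f} degrees")
```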
Do single-user VR apps need ToF sensing, for example? Maybe not. But then again, maybe. One reason to add ToF sensing would be to get an accurate determination of where VR users’ hands are in real space, so that accurate digital depictions of their hands can be incorporated in the virtual world. With a certain minimum level of accuracy, it would be possible to ditch those extra handheld devices that some VR rigs use for hand tracking.
Or consider a multi-user VR game. You might want to use ToF sensors to determine the players’ relative position to each other in the real world and use that information in two modes—in the real world for collision avoidance and in the virtual world to keep the players’ avatars from interfering with each other.
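A minimal Python sketch of that two-mode idea follows, assuming the rig already exposes a ToF range reading between players and tracked avatar positions; the thresholds and function names are illustrative, not a real SDK.

```python
# Minimal sketch of using a ToF-style range measurement between two players for
# (a) real-world collision avoidance and (b) keeping their avatars apart in the
# virtual scene. All thresholds and names are illustrative assumptions.

REAL_WORLD_MIN_GAP_M = 1.0      # warn both players below this real-world distance
AVATAR_MIN_GAP_M = 0.6          # push avatars apart below this virtual distance

def real_world_alert(tof_distance_m: float) -> bool:
    """True if the headsets should flash a proximity warning."""
    return tof_distance_m < REAL_WORLD_MIN_GAP_M

def separate_avatars(pos_a: tuple[float, float],
                     pos_b: tuple[float, float]) -> tuple[float, float]:
    """Nudge avatar B away from avatar A if they are closer than the minimum gap."""
    dx, dz = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    dist = (dx * dx + dz * dz) ** 0.5
    if dist >= AVATAR_MIN_GAP_M or dist == 0.0:
        return pos_b
    scale = AVATAR_MIN_GAP_M / dist
    return (pos_a[0] + dx * scale, pos_a[1] + dz * scale)

print(real_world_alert(0.8))                      # -> True: players are too close
print(separate_avatars((0.0, 0.0), (0.3, 0.0)))   # -> (0.6, 0.0)
```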
There is a school of thought that says AR rigs might not need the most sophisticated suite of sensors for motion detection, positioning, and object detection. The proposition is that the environment will be studded with sensors, and the pertinent data will be collected and shared among all AR users (and eventually metaverse users) through some wireless network.
That may very well end up happening, but it might be optimistic to think the world will be that sensor-rich any time soon, or that there will be adequate wireless infrastructure with widespread coverage. In our estimation, the world can get useful AR (and the metaverse) much sooner if AR rigs do most or all of their own sensing. It might take more local processing, but the risk of lag or delay will be lower, and it should be easier to protect user data that way too.
There might be some reasons for VR rigs to use a microphone to incorporate sound from the real-world environment. For AR rigs, on the other hand, there are any number of reasons to monitor environmental audio.
Voice recognition alone could be useful in uncountable ways. Any information relayed by voice could be used directly: a street address or directions to a location could be entered automatically into a map app, and language translation apps could be triggered on the fly. There are also plenty of reasons to record ambient audio on its own or in conjunction with video.
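As a simple illustration of the routing this implies, here is a minimal Python sketch that assumes a separate speech-to-text stage has already produced a transcript and merely decides which hypothetical action to trigger; the keyword rules and action names are illustrative, not a real voice-assistant API.

```python
# Minimal sketch of routing recognized speech to an AR application, assuming a
# separate speech-to-text stage has already produced a transcript. The keyword
# rules and action names are illustrative, not a real voice-assistant API.

import re

def route_transcript(transcript: str) -> str:
    """Pick an action for a transcript: open a map, offer translation, or just log it."""
    text = transcript.lower()
    # A street-address-like pattern ("123 Main Street") suggests a navigation intent.
    if re.search(r"\b\d+\s+\w+\s+(street|st|avenue|ave|road|rd)\b", text):
        return "open_map"
    # Non-ASCII characters in the transcript hint that translation may be useful.
    if any(ord(ch) > 127 for ch in transcript):
        return "offer_translation"
    return "log_ambient_audio"

print(route_transcript("Meet me at 221 Baker Street"))  # -> open_map
print(route_transcript("¿Dónde está la estación?"))     # -> offer_translation
```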
We would want to embed cameras in AR rigs for all the same purposes we want them in our smartphones. In fact, some expect AR rigs might replace some smartphones—perhaps all of them eventually. And who knows what applications AR developers might be able to devise that use audio and video?
If the metaverse ends up engaging all of our senses, then metaverse rigs most certainly will also incorporate motion sensors, positional sensors, cameras, and microphones.
But what about touch? The data from many of the same sensors used for motion detection, positioning, and object detection can be fed into haptic devices. These might be anything from bracelets to gloves to partial- and full-body suits.
There are haptic technologies in development today that can not only provide physical feedback that something is being touched but also convey the texture of that object.
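One way to picture how such feedback might be driven is the following minimal Python sketch, which maps a virtual surface material and contact speed to a vibration cue. The material table, the scaling rule, and the HapticCue structure are illustrative assumptions, not a description of any shipping haptic system.

```python
# Minimal sketch of driving a haptic actuator from a virtual contact event:
# each material maps to a vibration frequency/amplitude pair that approximates
# its texture. The material table and actuator interface are illustrative.

from dataclasses import dataclass

@dataclass
class HapticCue:
    frequency_hz: float   # higher frequency reads as a finer texture
    amplitude: float      # 0.0 (off) to 1.0 (full strength)

TEXTURES = {
    "glass": HapticCue(frequency_hz=250.0, amplitude=0.2),
    "wood":  HapticCue(frequency_hz=120.0, amplitude=0.5),
    "brick": HapticCue(frequency_hz=60.0,  amplitude=0.8),
}

def cue_for_contact(material: str, contact_speed_m_s: float) -> HapticCue:
    """Scale the base cue by how fast the fingertip is sliding over the surface."""
    base = TEXTURES.get(material, HapticCue(frequency_hz=100.0, amplitude=0.3))
    scale = min(1.0, 0.5 + contact_speed_m_s)   # faster strokes feel stronger
    return HapticCue(base.frequency_hz, min(1.0, base.amplitude * scale))

print(cue_for_contact("brick", contact_speed_m_s=0.4))
# -> HapticCue(frequency_hz=60.0, amplitude=0.72)
```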
With VR and AR, "rigs" are essentially goggles, eyeglasses—headsets of one type or another. The metaverse is apt to be modular, however, and involve many different wearables that could be used in any combination depending on whatever metaverse apps each person chooses to use and how immersive they want their metaverse experiences to be.
Rounding out the senses are taste and smell. Gas sensors such as CO2 detectors are already commonly available and could provide warnings of potentially hazardous environmental conditions that most people could not detect on their own.
Detecting smells and flavors is already being done today using a variety of different detectors and sensors able to identify the presence—and sometimes even the proportions—of various chemical compounds.
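As a toy illustration of how such readings might be interpreted, here is a minimal Python sketch that flags compounds whose concentration exceeds a fixed threshold. The channel names and thresholds are purely illustrative; real electronic-nose systems rely on calibrated models rather than simple cut-offs.

```python
# Minimal sketch of interpreting a multi-channel gas/chemical sensor reading.
# Channel names, thresholds, and the "spoilage" rule are illustrative only;
# real systems use calibrated models rather than fixed cut-offs.

SPOILAGE_THRESHOLDS_PPM = {
    "ammonia": 25.0,
    "hydrogen_sulfide": 1.0,
    "ethanol": 500.0,
}

def flags_from_reading(reading_ppm: dict) -> list:
    """Return the compounds whose concentration exceeds its illustrative threshold."""
    return [compound for compound, limit in SPOILAGE_THRESHOLDS_PPM.items()
            if reading_ppm.get(compound, 0.0) > limit]

sample = {"ammonia": 40.0, "hydrogen_sulfide": 0.2, "ethanol": 120.0}
print(flags_from_reading(sample))   # -> ['ammonia']
```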
Recreating taste and smell for the user is still in the realm of science fiction. Indeed, metaverse rigs will probably never be able to convey to us what a recipe might taste like or how a theoretical selection of different fragrances might smell if combined into a perfume. That said, a selection of gas and chemical detectors might be able to tell you whether a food item is unsafe to eat or which specific brand of liquor was used in the cocktail you just ordered.