I have received some interesting user comments on my November Alertbox on the problems with 3D interfaces for abstract information spaces.
2D Had Its Own Problems
I agree with you that people sometimes try to force their clients to use 3D stuff because it's cool, not because their clients find it useful. Much of the enabling technology for good 3D interfaces is not there yet. And sometimes people try to force inappropriate metaphors on their users --- although that didn't begin with 3D stuff, certainly!
I have beefs with a few of the things you said, though:
- eyes on the sides of our heads: our eyes are in front of our heads so our stereoscopic vision covers a large proportion of our visual field, making it possible to determine depth. Monkeys that swing through the trees have the same anatomy in this department.
- a lot of your arguments incline me to believe that the problem is not that 3D is inherently worse than 2D, but that we don't have good ways of using 3D interfaces yet.
Most of the same arguments would have applied to graphical user interfaces in 1970 or so:
(In fact, a remarkable number of these still apply. Yet people often prefer GUIs.)
- just compare the number of people who write memos for a living to the number of people who draw pictures for a living;
- evolution optimized humans for abstraction, not concrete interfaces;
- it's hard to control a GUI with a keyboard, and mice are expensive;
- users need to pay attention to the GUI, and it distracts them from their task (actually, my wife prefers full-screen text-mode in text-mode apps for just this reason);
- poor terminal resolution means you can't see much if you have dialog boxes (although I don't think anyone tried to put a GUI on a glass TTY until the 1980s);
- most abstract information doesn't render well graphically, and is better represented as text;
- it requires big, bulky, unstable software;
- (well, the confusion argument doesn't apply)
- On Doom: you stumbled into a historical cow-patty here, I'm afraid. Doom's predecessor, Wolfenstein 3D, was a relatively straightforward three-dimensionalization of a popular 2D game, Castle Wolfenstein. The interface was essentially the same. I've spent a few hours playing each of these games, and I didn't find Wolf3D any harder to use --- although there is the behind-you factor, as you mentioned.
- On what kind of data wants 3D presentation: linegraph data (i.e. nodes connected by edges) is often hard to grasp in 2D, and sometimes easier to grasp in 3D. There's quite a bit of data whose most natural representation is a linegraph.
I have high hopes for 3D interfaces for abstract information navigation in the future.
Jakob's reply: I like your list of similar arguments against 2D GUIs: in fact, you have probably listed the reasons IBM didn't use Doug Engelbart's inventions on their mainframe terminals in the 1970s.
For anybody doing a Web design right now it does not matter much whether the arguments against 3D are temporary (as were most of the arguments against GUIs) or fundamental. In either case, they surely apply today and serve as a warning against using 3D except in the cases where the third dimension adds substantial value to the user's task.
I tend to believe that many of the arguments against 3D interfaces are fundamental in nature and will remain true in the future. But I would be more than happy to be proven wrong since I find the current generation of Macintosh-clone designs supremely inadequate to support the Internet Desktop vision. We need something better, but we need this improvement so soon that we cannot wait for 3D to potentially become usable. Thus, I recommend more focus on 2D designs that integrate better semantics and richer attributes.
Introspective vs. Extraspective 3D
Benjamin P. Samuel from Drexel University writes:
I may be nitpicking, but only the most primitive GUIs (usually terminal based GUIs) are strictly 2D. The moment you introduce a stacking order you introduce a third dimension. I bring this up because I want to distinguish between two types of 3D manipulation.
There's the introspective (I hope I'm using a real word) type where you're "swinging through trees." In other words, you're controlling your body and point of view. This requires some kind of immersive experience because the mind wants the whole body to take over. If you've ever watched people playing 1st person games like DOOM or Marathon, you'll notice they sometimes flinch when a missile flies close by their head.
The other is the extraspective type where you're removed from the objects. It's pseudo-3D because objects are treated as flat and largely static. This is the basis for the traditional GUI: essentially you sit at a desk and objects are laid out. Certain objects lie on top of each other, some are contained within. This requires an entirely different type of response. Someone playing a first-person game doesn't get bothered by blinking things, whereas if I'm in my office and the message-waiting indicator is blinking on my telephone, I notice it instantly. The same is true for the accursed <BLINK> tag.
That said, I think you're right that from an interface standpoint, the introspective form of 3D is not useful for navigating. However, your examples of when to use 3D do show that an introspective view is useful for illustrating a task such as a surgeon or engineer's.
My point, then, is that one shouldn't pretend one is working exclusively in 2D. Rather, the designer needs to make the transition between the two types as seamless (natural) as possible. Any object can be found lying around on the ground; we need to be able to pick it up and take it with us.
3D Is Improving
Bob Jacobson of Bluefire! Consulting writes:
I'm as stern a critic of poorly-done 3D interfaces as you are, but your last column flushes the baby down the drain with the bathwater. Condemning today's 3D hackjobs is a lot like criticizing computer graphics, when desktop publishing was in vogue, for generating prolific banality. Every period of innovation begins with widespread experimentation; a few individuals learn from invention and advance the state-of-the-art. We're getting there.
But here are my nitpicks with your conclusions.
- "3D" environment is commonly equated with a virtual world, though it is not a virtual world. (You perpetuate this usage.) 3D is only one element of a virtual world. Every virtual world has one aspect, an individual's internalized mental-map; and may have a second, a 3D computer-map created by workstations and projectors. When they converge, these create a sense of "presence." An immersive sound field, for example, can effectively synergize with a 3D visual presentation to generate a profound 360-degree experience.
- Experiencing computer-generated virtual worlds does not require "weird headgear." It's been a long time since anyone in the profession argued for it. More popular are immersion environments. Immersion desktops aren't far off and even in the current situation, more designers are learning to write for them. See Fakespace for some hints as to what lies ahead.
(Mark Bolas, CEO of Fakespace, is one of the most talented designers working in 3D. His early wireframe evocation of an elevator remains the most effective graphic I've ever experienced...and my most acute memory of movement through space prior to flying over the "Giza Plateau, 2000 BC" in a virtual world created by Worldesign Inc., my former employer.)
- Almost all useful data have spatial dimensions. Even very abstract information, to have meaning for human beings, must be spatially referenced (where is this?) and often, geospatially referenced (where on earth is this?). Multidimensional displays (employing GIS/GPS) are the best way to display these relationships and far more effective than 2D maps. Most adults cannot read 2D maps. Perhaps they at least can cognitively traverse 3D maps.
Finally, the computer-generated virtual world absolutely registers better with the internalized virtual world (a la Peter Senge). Experiments by architect Daniel Henry conducted at the HIT Lab in Seattle have demonstrated this fact. So, even if we can't achieve perfection today, it's a goal worth striving toward -- not surrendering to the inadequacies of 2D.
Thanks, as always, Jakob, for a provocative and well-thought-through column.
Jakob's reply: I agree completely with the utility of augmented reality. One of my favorite future interfaces is to say "Computer, where are my car keys?" and have the system reply by shining a spotlight on the keys. Immersing interaction in the physical environment is a sure way to connect closely with people's basic existence and everyday needs. But we need even more special equipment to do this.
It's a good point that today's bad 3D designs may not be indicative of the higher-quality designs to come once we learn when and how to best use 3D in user interfaces. On the other hand, I have been to countless SIGGRAPH conferences and rarely seen any 3D designs that would offer measurable improvements in usability beyond the "coolness" factor (except for the examples mentioned in my column). There is certainly hope for good 3D interfaces to come and we need to get more examples of 3D designs that have been through usability engineering and iterative design to refine the interaction to the point where it becomes truly useful.
Artists Simplify the World Into 2D
George Olsen, Design Director/Web Architect at 2-Lane Media, writes:
I too am skeptical of 3D interfaces, having sat through five years of virtual worlds demos, but I agree with Kragen Sitaker it's possible there are better interfaces down the road (and obviously not applicable in the here and now). For example, to take your example of Doom, a real-time 3D POV did provide a different experience to this sort of gaming -- and arguably one that's well-suited for the intended goals of Doom. At the time, I was part of a group discussing the potential ideas for interactive entertainment (sort of like when Eisenstein and friends discussed what they'd do if they ever had enough filmstock to make movies) and I remember that this sort of POV added dimensions to the experience that hadn't been anticipated.
However, I do think there are fundamental aspects to 3D that limit its usage. For centuries humans have rendered 3D natural environments into simplified 2D versions. Why? Not because they lacked 3D modelers (aka sculptors in the analog world). Instead it was to simplify the world around them to make it more understandable. Unfortunately, too many examples of 3D interfaces today seem to want to make the map into the territory -- on a 1:1 scale....
There are definitely cases where this is appropriate as you've pointed out, but the overriding reason for 3D is that it's better than reality -- not that it resembles reality.
Incidentally, coming from a graphic design background, it's disappointing to see so few HCI people talking to designers, who sometimes do more than make things pretty. After all, we do have at least five centuries of beta testing experience.
Jakob's reply: Your analogy with painting versus sculpture is right on target: 3D may be a closer match to reality, but 2D usually offers more useful simplifications. And we need simplicity in user interfaces: a simpler representation can often scale up better to handle larger amounts of complexity.
I also agree that interaction experts and design experts need to talk more. One of my personal favorites is Scott McCloud's book Understanding Comics, which is particularly helpful because comics have so much in common with user interfaces. But we need an entire series of Understanding X for all values of X (speaking like a true engineer here :-)
Does Evolution Predetermine Interface Styles?
Bill Shackleton writes:
I must respectfully disagree with you when you say: "Evolution optimized homo sapiens for wandering the savannah - moving around a plane - and not swinging through the trees." Yes we have been optimized somewhat for wandering the plains and therefore stand up straight and use our hind legs for transportation, but we also have opposing thumbs (for gripping tree branches), binocular vision to help us judge the distance to the branch we are leaping for, and colour vision to help us distinguish the fruit in the trees from the rest of the foliage.
Although I do not wish to live back in the trees or on the savannah, I must say that I get a little morose at the thought of human beings cubicled - Dilbert like - 30 stories above some congested core of steel, plastic, and concrete - using that wonderfully evolved opposing thumb for smacking down on a space bar every 5 or 8 letters or so. Due to another gift from evolution - extreme adaptability - I happen to be a fairly quick typist. I think that I could eventually adapt to a more 3D technological space, and would probably prefer to.
To take an example, imagine a person who was blind navigating a 3D aural/information space!
Jakob's reply: I tend to think that stereo vision is somewhat over-rated: try to close one eye and note how little changes. I believe much more in your closing remark about a 3D space for blind users: spatial location and gestural interaction become important once we abandon the screen (either because the user can't see or because we want a UI for cases where it's better not to be tied to a monitor). Reach for your data!
Hugh Fisher from The Australian National University writes:
First, we're fairly well optimised for swinging around in trees, which is why we can rotate our arms through 360 degrees vertically and hang from things. Baboons and other primates which have spent much more time on the savannah do not have nearly as much reach and flexibility. Swinging around in the trees is also where we get our superb binocular color vision, which only evolves for specialist carnivores like lions and arboreal primates like our ancestors.
More to the point, so what? I'm not writing this email from the savannah or the treetops, but from a desk. We didn't evolve in an environment with computers, so are unlikely to be preadapted for any particular user interface style. (Does Windows 95 outsell the Mac because the designers tapped into some genetic racial memory?)
And second, unless the prices in the US are considerably different than they are here in Australia, I believe the number of people driving cars rather than helicopters is due to the two orders of magnitude price difference rather than any innate evolutionary tendency. Since the accident rate per kilometre/passenger for commercial aviation is much lower than it is for cars, one could even argue that we are in fact better at navigating in 3D than 2D.
I suspect you're going to get a flood of similar emails, arguing about that particular paragraph rather than the article as a whole. I suggest you delete that paragraph, so people will be more inclined to think about the rest of it.
Jakob's reply: You are probably right that humans are decently capable of swinging through trees. A better analogy may have been birds or fish: true 3D creatures that navigate a space with full up-down navigational freedom.
I do maintain that we are more capable of moving around a flat surface and that we spend most of our time doing just that. It is true that people are adaptable and can learn to fly helicopters and airplanes, but the reason for the lower accident rate is more likely to be the fact that pilots are extremely highly trained, have been selected from a small group of elite "right stuff" candidates, and are continually supported by air traffic control. If the same selectivity and support environment applied to automobile drivers, you can be sure that we would have very few road casualties. But imagine not being able to turn into your driveway without clearing it with the tower first.
Returning to the main point: Evolution does favor certain interface styles over others. For example, you can't use a mouse for fine art because the arm muscles do not allow as detailed movements as the fingers. That's why all artists use a drawing tablet. Similarly, fingers only move so fast, so there are limits to the amount of text people can type, meaning that interface styles become infeasible if they require too much typing. (Obviously, some people are really fast typists, and such users have been observed to use different interaction techniques than normal users: for example, they often prefer to delete an entire screen and retype everything rather than having to spend time trying to find out which element on the screen contains an error.)