One of the things that has struck me over the last few weeks in discussions about where virtual worlds are going is that we are in danger of making the same mistake made by application development and the web, by concentrating on the wrong thing. With virtual worlds, and the moves towards interoperability and standards, there is a real opportunity to get things right first time.
The "mistake" is that we tend to concentrate on the visual. It's only natural: sight is probably our most powerful sense and the one that most of us would least wish to be without - and hardly a surprising focus for virtual worlds!
Since the first computer read-out and green-screen VDU we have developed computer applications and their user interface as a single entity. Even with the move to client-server this did not change - one application, one user interface. However, with the arrival of the web, and of mobile phones, things began to change a bit. Users wanted access to the same application (be it a CRM system or Facebook) regardless of the device they were using. In fact, almost 10 years ago I had a slide in my Aseriti slide pack (selling utility billing systems) that called for "Different Users, Different Needs", showing a user-interface-less application engine surrounded by optimised user interfaces for mobile, consumer, web and call-centre users.
The development of the mash-up culture has pushed this even further. Once we replace applications with user-interface-less application engines, and then create user interfaces (and even other application engines) which talk to the engine through an agreed API (typically a web service), we can unleash a whole new wave of creativity in how we create applications.
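As a minimal sketch of that split, the Python below keeps the data and rules in one engine and lets two front ends format the same answer differently. The names (BillingEngine, get_balance, mobile_view, call_centre_view) and the sample account are invented for illustration; they are not taken from any real billing system, and a real deployment would put a web service between the engine and its clients.

```python
from dataclasses import dataclass


@dataclass
class Account:
    customer: str
    balance_pence: int


class BillingEngine:
    """Hypothetical application engine: holds data and rules, draws nothing."""

    def __init__(self):
        self._accounts = {}  # account id -> Account

    def add_account(self, account_id, customer, balance_pence):
        self._accounts[account_id] = Account(customer, balance_pence)

    def get_balance(self, account_id):
        # The agreed "API": plain data out, no formatting decisions.
        acct = self._accounts[account_id]
        return {"customer": acct.customer, "balance_pence": acct.balance_pence}


def mobile_view(engine, account_id):
    """Terse rendering for a small screen."""
    data = engine.get_balance(account_id)
    return f"Bal: GBP {data['balance_pence'] / 100:.2f}"


def call_centre_view(engine, account_id):
    """Fuller rendering for an agent's desktop."""
    data = engine.get_balance(account_id)
    pounds = data["balance_pence"] / 100
    return f"Customer: {data['customer']} | Outstanding balance: GBP {pounds:.2f}"


engine = BillingEngine()
engine.add_account("A1", "J. Smith", 4250)
print(mobile_view(engine, "A1"))
print(call_centre_view(engine, "A1"))
```

Adding a third interface means writing one more view function; the engine does not change.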
The web unfortunately made a similar mistake, hardly surprising since it was based around HTML, but disappointing given Sir Tim Berners-Lee's own original vision, and that of Alan Kay and the Dynabook. HTML is mostly about marking up the display, not the content. To HTML, a bold "David Burden" means nothing more than the characters D-a-v-i-d-%20-B-u-r-d-e-n displayed in bold type. If you search for "David Burden" on Google you'll find lots of the same characters in the same order, but you'll have to work out for yourself that they actually refer to different people.
The "solution" of course is the Semantic Web - championed by Sir Tim Berners-Lee. But trying to retrofit it to the petabytes of text strings that make up most of the web is an enormous challenge. Formats like RSS, and even Twitter hashtags, begin to give us some sort of access to a semantic web, but the true semantic web languages of RDF and OWL (which at Daden we are using to give our chatbots semantic understanding) are woefully under-used. Even less used are things like Info URIs - agreed semantic codes, like an ISBN, that say that info:people/davidburden/515121 is me, and not the CIO of the Post Office. If every mention of me on the web were marked up semantically then finding all references to me would become trivial. It's good to see that Google is beginning to introduce aspects of semantics into its search results, but without the original content being semantically marked up it's only a small step - the mistake has already been made.
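To make the difference concrete, here is a small Python sketch that stores statements as (subject, predicate, object) triples, the basic shape of RDF, using plain tuples rather than a real RDF library. Only the first identifier comes from the text above; the second identifier, the predicate names and the example.org pages are all made up for illustration.

```python
# Identifier quoted in the text above.
ME = "info:people/davidburden/515121"
# Hypothetical identifier for a namesake; not a real registered URI.
OTHER = "info:people/davidburden/000000"

# Each statement is a (subject, predicate, object) triple.
# The predicate names are illustrative, not a real vocabulary.
triples = [
    (ME, "name", "David Burden"),
    (OTHER, "name", "David Burden"),
    ("http://example.org/page1", "mentions", ME),
    ("http://example.org/page2", "mentions", OTHER),
    ("http://example.org/page3", "mentions", ME),
]


def subjects_named(name):
    """A string search: every entity whose name matches the characters."""
    return [s for s, p, o in triples if p == "name" and o == name]


def pages_mentioning(person_id):
    """A semantic search: only pages marked up as mentioning this entity."""
    return [s for s, p, o in triples if p == "mentions" and o == person_id]


print(subjects_named("David Burden"))  # two different people share the string
print(pages_mentioning(ME))            # but the identifier is unambiguous
```

The name lookup returns both people because the strings are identical; the identifier lookup returns only the pages about one of them, with no guessing involved.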
So what's all this got to do with virtual worlds? Almost any initial assessment of a virtual world starts with how good it looks - how good are the textures, the avatars, the sky, water and shadows. After that it's about functionality - how naturally does the avatar move, how can you interact with objects, can you view the web or MS Office - and about deployment issues (does it run on a low-spec PC, can it run behind the firewall, can we protect children). There is active debate at the moment about standards in virtual worlds - Collada, X3D etc - and whether virtual worlds should be downloads or browser based (and this itself offers a spectrum of solutions as pointed out by Babbage Linden at the recent Apply Serious Games conference).
But to me all this is missing the point. Virtual worlds are NOT about what they look like, but about what's in them.
Let's not repeat the mistake of application development and the web. Let's start thinking about virtual worlds in terms of how we semantically mark up their content, and then treat the display issue as a second order problem. The virtual world is not HOW you see it, it's WHAT you see (or more precisely what you sense and interact with).
Some examples. These are all based around Second Life, since with libsecondlife/libomv we can actually get access to the underlying object models (which is as close to a semantic virtual world model as you can get).
- With OpenSim you not only have a different application sharing the same object model as SL, but also different clients using different graphics engines to render "SL" in subtly different ways.
- We have been working with the University of Birmingham to use their expertise in robotics to help create autonomous avatars in SL. The University uses a standard robot simulation application to visualise and model physical world spaces and test robot software, before downloading the code to the physical-world robots. To work in SL they've taken the SL object/scene description and dynamically fed it to the bot modelling tool - so SL "appears" as a wireframe model in the simulation application just as their physical world spaces do.
- On my iPhone I have Sparkle, a great little app which lets me log my avatar into SL. No graphics yet, just text (and not even a list of nearby avatars) but adding a radar scan of nearby people - and objects - would be almost trivial, and adding a 2D bird's-eye view of the locale only a little harder. Even a 2.5D "Habbo" rendering of SL would not be impossible.
- We've already played around with using LSL sensor data and libomv to generate live radar maps in web browsers - why not push this a bit further and use Unity, X3D or similar to "re-create" Second Life in the browser - it won't look "identical", but in reality it's all just bits anyway.
Four situations, four different ways of rendering the Second Life "semantics".
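The idea behind these examples can be sketched in a few lines of Python: one list of scene objects, two unrelated renderings of it. This is a toy model under stated assumptions, not the libomv object model; the SceneObject fields, the sample scene and both renderer functions are invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    kind: str   # "avatar" or "object" (a deliberately tiny vocabulary)
    x: float    # position in metres
    y: float


def radar_text(scene, me_x, me_y, max_range):
    """Text-only radar: everything within range, nearest first."""
    hits = []
    for obj in scene:
        distance = math.hypot(obj.x - me_x, obj.y - me_y)
        if distance <= max_range:
            hits.append((distance, obj.name))
    hits.sort()
    return [f"{name} ({distance:.0f}m)" for distance, name in hits]


def birds_eye(scene, size, cell):
    """2D bird's-eye view on a character grid: 'A' avatars, '#' objects."""
    grid = [["." for _ in range(size)] for _ in range(size)]
    for obj in scene:
        col, row = int(obj.x // cell), int(obj.y // cell)
        if 0 <= col < size and 0 <= row < size:
            grid[row][col] = "A" if obj.kind == "avatar" else "#"
    return ["".join(row) for row in grid]


scene = [
    SceneObject("Ada", "avatar", 10, 10),
    SceneObject("Chair", "object", 30, 10),
    SceneObject("Bob", "avatar", 50, 50),
]

print(radar_text(scene, 10, 10, 25))
print("\n".join(birds_eye(scene, 4, 16)))
```

Neither renderer knows about the other, and a 3D client would simply be a third function over the same list.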
Our own work on PIVOTE shows another approach to this problem. By creating the structure and content of a training exercise away from the visualisation tool we are free to then deploy the exercise onto the web, iPhone or virtual world of our choice without having to change the semantic information or the learning pedagogy. If that semantic model could be extended to include the virtual world itself, then we would have a true write-once, play-anywhere training system.
One final issue that our bots particularly suffer from is that having access to objects is no real guarantee of having access to true semantics. I might create a plywood cube in Second Life and call it a chair, a snake, or anything. The bot cannot guarantee that reading an object's name will tell it what the object is. To be truly semantic any future editing system should ideally force us to put accurate semantics on the objects we create - and in particular their place in the ontology of the world. Then even if we can't recreate a "chair" as the specified collection of prims or polygons we can substitute our own.
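A rough Python sketch of that substitution follows, assuming each object carries an explicit semantic type separate from its free-text name. The ontology, the type names and the model names are all hypothetical; no existing virtual world exposes exactly this.

```python
# Hypothetical ontology: each type points at its parent type.
ONTOLOGY = {
    "chair": "furniture",
    "table": "furniture",
    "furniture": "object",
    "snake": "animal",
    "animal": "object",
}

# Models this particular client happens to be able to draw (also hypothetical).
LOCAL_MODELS = {
    "furniture": "generic-furniture-box",
    "object": "plain-cube",
}


def ancestors(semantic_type):
    """The type itself followed by its parents, most specific first."""
    chain = [semantic_type]
    while chain[-1] in ONTOLOGY:
        chain.append(ONTOLOGY[chain[-1]])
    return chain


def pick_model(semantic_type):
    """Use the most specific local model available, or None if there is none."""
    for kind in ancestors(semantic_type):
        if kind in LOCAL_MODELS:
            return LOCAL_MODELS[kind]
    return None


# The name is free text and may mislead; the semantic type is what we trust.
cube = {"name": "Fluffy the snake", "semantic_type": "chair"}
print(pick_model(cube["semantic_type"]))
```

Here the client has no chair model, so it walks up the ontology and draws its generic furniture stand-in, regardless of what the object's creator chose to call it.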
So this is my challenge to the virtual world community. Stop thinking of virtual worlds (and especially the future of virtual worlds) in terms of how they are rendered, and concentrate instead on their object models and the underlying semantics. I have every confidence that the progressive increase in PC power and bandwidth - and the existing capabilities of computer games - will mean that the look and feel of virtual worlds will come on just fine. And those of us deploying virtual worlds into enterprises will find that with wider adoption and real business need/demand will come the solution to all our problems of firewalls and user account controls (just as it did when the web first arrived in enterprises). These are (almost) trivial problems. If we want to create truly usable and powerful virtual spaces (and I hesitate even to use the word virtual) then we should be focussing on the semantics of the spaces and the objects within them. That way we will avoid the problems of applications and the web. We will know what the objects in our world are - we only have to decide how to render them.