Sony Eyes Motion Control, Augmented Reality

2009 will be remembered as the E3 game event that embraced computer vision. Far from me-too answers to the Wii’s gestural controllers, we saw remarkably different visions of how computer tracking might work.

As expected, Sony had their own motion tracking system to unveil at their press conference. But unlike Microsoft’s 3D camera, Sony opted to build on their already-lovable PlayStation 3 Eye camera, adding handheld wands tipped with spheres. The controllers look ridiculous, and lack the magic of the Microsoft demos. But don’t dismiss them out of hand. (Sorry, there’s no way to write this story without lots of abstract puns.)

Much of what Microsoft showed was “conceptual” video – and some of the hands-on demonstrations had noticeable latency problems. Sony’s approach, meanwhile, was really quite literal in its demonstration. The tracking looks extremely accurate in 3D space, and latency appears to be minimal.

Above: Video of the press conference – check out how quick and accurate the tracking looks

Via Joystiq; see also Offworld’s excellent 5 Things You Need to Know about the Sony shindig

The other good news for people working as artists and not necessarily mass-market game developers is that you can start to play with these ideas right now. Whereas Microsoft seems to have “lost” the once publicly-available 3D camera SDK for their solution, Sony is using an off-the-shelf camera you can buy right now and doing the rest of the work in software. I really like the use of tangible interfaces with cameras, because you can get more predictable tracking results, and you get the tactile feedback of having something in your hands. (I’m not sure I’d be as excited as they are about having a glowing ball on the end, but maybe I need to channel my inner raver.)

Anyway, here’s my humble prediction: it doesn’t matter how cool the demo looks or what sweeping statements anyone makes. Gameplay alone matters, and that means that what has to happen next is dependent entirely on the tracking working reliably and quickly, and developers building smart stuff around it that works as games. The same, naturally, is true for anyone doing broader interaction design and live visuals.

Sony is also getting further into the augmented reality arena. They have a Tamagotchi/Nintendogs-style augmented reality pet simulator, EyePet, for the console (see Joystiq’s hands-on), plus Invizimals, an augmented reality title for the PSP. Of the two, Invizimals is the more interesting. It’s funny that they immediately design it for kids (too bad, as I can see some office antics with this sort of thing). It’s also evident just how hard designing an effective augmented reality game can be. I don’t think skepticism would be wildly out of place – it’s clear that there’s something powerful about the concept, but not clear just what it will be.

And I don’t need to remind you, if you haven’t joined our tangible interface virtual party Saturday, head to http://hackday.noisepages.com/. ARToolkit augmented reality is very much on the plate of stuff we’d like to see people play with. (The other schemes we’re using, Trackmate and reacTIVision, are better suited to 2D tracking on a surface, though they’re very, very reliable for that task.)

Full Body, No-Controller, No-Tag 3D Motion Tracking: Microsoft’s Project Natal for Xbox 360

Anyone for a game of Harmonix Mime Hero, with the Marcel Marceau expansion pack?

We’ve seen simple computer vision applications, “augmented reality” systems and object tracking schemes that use specially-printed tags, 3D tracking using IR emitters, and specialized motion detection sensors (most notably Nintendo’s Wii). But the holy grail, of course, is getting tracking without any of that stuff. That’s the idea behind the widely-anticipated unveiling today of Microsoft’s Project Natal for Xbox 360.

What’s different about the new tracking systems that makes them work better? In short, a z axis. By detecting depth from the camera, you can track motion in three dimensions, which in turn makes detecting specific gestures far easier.
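To make the z-axis point concrete, here is a rough sketch in Python of why depth helps. A hand pushing toward the camera and a hand hovering in place can look nearly identical in a flat image, but the depth readings separate them at once. Everything here is hypothetical: the `detect_push` function, the millimetre units, and the thresholds are illustrative assumptions, and none of it describes how Natal itself works.

```python
def detect_push(hand_depths_mm, min_travel_mm=150, max_frames=10):
    """Return True if the hand moved toward the camera by at least
    min_travel_mm within a window of max_frames consecutive samples.

    hand_depths_mm is a list of per-frame distances from camera to hand.
    All names and thresholds are illustrative assumptions.
    """
    for start, start_depth in enumerate(hand_depths_mm):
        # Look only at the samples that follow within the time window.
        window = hand_depths_mm[start + 1:start + max_frames]
        if any(start_depth - depth >= min_travel_mm for depth in window):
            return True
    return False


# A hand hovering in place: depth barely changes.
hovering = [900, 905, 898, 902, 899, 903]
# A hand pushing forward: depth drops steadily.
pushing = [900, 880, 840, 790, 740, 720]

print(detect_push(hovering), detect_push(pushing))
```

With only x and y to work with, you would have to guess at a push from changes in apparent hand size, which is exactly the kind of fragile heuristic a depth reading lets you skip.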

Microsoft had acquired 3D motion detection system maker 3DV Systems, as confirmed earlier this year on VentureBeat. Today’s news: that technology will see commercial distribution. Project Natal for Xbox 360 uses a three-camera device that interprets z-axis depth. Already, this leads to some impressive game demos. Of course, a big challenge of the Nintendo Wii has been that its sensors work poorly, but another challenge has been that developers often don’t use the sensors well, either. So it remains to be seen if developers figure out just what to do with this stuff.

There’s more, too:

  • 3D motion detection and tracking
  • Facial recognition (which could in turn lead to multi-person control experiences with this sort of technology, because you can tell the difference between different people)
  • “Object scanning” – no mention of object detection, but this could mean tangible interfaces that don’t require special tags


In-Browser, All-JavaScript Motion Tracking? Believe It, Says Firefox 3.1

I may have to eat my words — here’s something I didn’t imagine being possible any time soon. It’s extremely processor-intensive computer vision, happening in a video stream, all with JavaScript worker threads. That is, this is possible because the next version of Firefox, version 3.1, allows for multiple threads processing the video instead of trying to do everything in succession. HTML5 + Firefox 3.1 + some not-terribly-backwards-compatible code = basic vision. It looks like it’s pretty simple frame differencing with a threshold, then a bounding area drawn around the spot that changes.
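For the curious, the technique I’m guessing at is easy to sketch. The demo itself is JavaScript, and I haven’t seen its source, so treat this Python version as an illustration of frame differencing with a threshold and a bounding box, not a port of Mozilla’s code. The function name, the threshold value, and the frames-as-lists representation are all my own assumptions.

```python
def motion_bounding_box(prev_frame, curr_frame, threshold=30):
    """Return (left, top, right, bottom) around the changed pixels,
    or None if nothing changed by more than the threshold.

    Frames are equal-sized 2D lists of grayscale values (0-255).
    """
    changed = [
        (x, y)
        for y, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame))
        for x, (p, c) in enumerate(zip(prev_row, curr_row))
        if abs(c - p) > threshold  # frame difference, then threshold
    ]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs), max(ys))


# Two tiny frames, 4 rows by 5 columns: a bright blob appears in the second.
previous = [[10] * 5 for _ in range(4)]
current = [row[:] for row in previous]
current[1][2] = 200
current[2][3] = 180

box = motion_bounding_box(previous, current)
print(box)
```

The per-pixel loop is the expensive part on real video, which is why handing it to worker threads makes such a difference.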

Video: Christopher Blizzard SoCal Linux Expo Javascript Motion Tracking, by AndroidAppFactory
Mozilla demos impressive Firefox 3.1 features at SCALE [Ars Technica]

And yep, that’s Linux running on a Mac, but you probably didn’t notice that — which is the whole point.

So, that’s it. No more desktop development. JavaScript is the future, and you’ll never need another language. Everything will happen in the browser. Nothing will happen in the browser, and everything will happen in servers. Not real servers - the cloud. In fact, nothing will happen in the cloud. That cloud will just virtualize another cloud. That cloud will be owned by Google. You won’t even have a computer, you’ll just have Firefox. Nothing will happen anywhere: you’ll just sit and think about Google and Firefox. Or a cloud will think about it for you.


OpenCV Motion Tracking, Face Recognition with Processing: I’m Forever Popping Bubbles


Processing OpenCV Tutorial Video #2- Bubbles! from Andy Best on Vimeo.

Interested in performing high-performance, high-quality video processing, computer vision, motion tracking, and analysis? And want to do it in the friendly Processing coding environment - an ideal place to start, even for non-programmers? First, you’ll want to read Andy Best’s introduction to OpenCV posted a few days ago, to get started with the topic:

Processing Tutorials: Getting Started with Video Processing via OpenCV

But we’ve got next steps for you, as well. Andy has added a second tutorial which begins to cover actual motion analysis. It’s a simple technique, one possible even in Flash - but with OpenCV and Java/Processing, it can run very efficiently, and it’s a good stepping stone to more sophisticated techniques. Andy writes:

In this tutorial, I will show you how to use a thresholded frame difference (motion) image in order to perform collision detection with objects onscreen. Essentially we will be creating something similar to one of the old webcam games where you can ‘pop bubbles’ with your hands (or indeed anything that moves).

Processing OpenCV Tutorial #2- bubbles [andybest.net]
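If you want the gist before diving into the tutorial, the idea can be sketched without any libraries at all. Andy’s version uses Processing and OpenCV; the Python below is only my own minimal illustration of a thresholded frame difference used for collision detection, with made-up function names (`motion_mask`, `bubble_is_popped`) and made-up values.

```python
def motion_mask(prev_frame, curr_frame, threshold=30):
    """Binary mask: True where a pixel changed by more than the threshold."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bubble_is_popped(mask, cx, cy, radius, min_pixels=1):
    """True if at least min_pixels moving pixels fall inside the bubble."""
    hits = 0
    # Scan the bubble's bounding square, clipped to the frame edges.
    for y in range(max(0, cy - radius), min(len(mask), cy + radius + 1)):
        for x in range(max(0, cx - radius), min(len(mask[0]), cx + radius + 1)):
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            if mask[y][x] and inside:
                hits += 1
    return hits >= min_pixels


# An 8x8 scene where something moves near the top-left corner.
previous = [[0] * 8 for _ in range(8)]
current = [row[:] for row in previous]
current[1][1] = 255
current[1][2] = 255

mask = motion_mask(previous, current)
bubbles = [(1, 1, 1), (6, 6, 1)]  # (center x, center y, radius)
survivors = [b for b in bubbles if not bubble_is_popped(mask, *b)]
print(survivors)
```

Raising `min_pixels` is a cheap way to stop camera noise from popping bubbles on its own.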

For another trick, here’s reader naus3a playing with OpenCV’s face recognition algorithm. I’ll let you figure out what to do with this one (but it could make an interesting performance tool … hmmm).



Psychedelic Fluids at Glastonbury: Memo’s Report on the Motion-Activated AV Installation

By Jaymis

Last month we had a little teaser of Psychedelic Fluids, as CDM reader Memo was preparing to install the project (as part of a crew put together by Seeper) at the massive Glastonbury festival in the UK.

Well, the festival is over now, and Memo has followed up with a video documenting the project, and some more technical details on his site:


Glastonbury 2008 - Pi Interactive Installation from evan on Vimeo.

The biggest challenge in creating an application of this scale was to structure and optimize it in a way so it could analyze up to 6 camera feeds, and run at a large enough resolution to cover the entire tent. A multiple computer approach was out of the question due to the complications of synchronising a fluid simulation across multiple PCs, so the decision was made to go with a multi-threaded app running on an 8-core Mac Pro. The motion estimation was split into 6 threads (one for each camera), the fluid solver ran in its own thread, and the particles (glitter & orb) ran in another thread - all of these threads ran in parallel. Once all threads were finished processing their data for one frame, they exchanged their results ready for processing for the next frame (camera motion fed into fluid solver ready for next frame, fluid currents fed into particles ready for next frame etc.). This approach allowed everything to run in parallel with smooth framerates of 30fps.
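The structure Memo describes (independent threads that each finish a frame, then swap results before the next one) can be sketched roughly as follows. This is not his code: it is a small Python illustration using a barrier and double-buffered results, and the arithmetic inside each worker is a stand-in for the real motion estimation, fluid solving, and particle work. All names are hypothetical.

```python
import threading

NUM_CAMERAS = 6
NUM_FRAMES = 3

# Double-buffered results: workers read `current` and write `pending`.
current = {"motion": [0.0] * NUM_CAMERAS, "fluid": 0.0, "particles": 0.0}
pending = {"motion": [0.0] * NUM_CAMERAS, "fluid": 0.0, "particles": 0.0}
history = []


def exchange_results():
    """Runs once per frame, in a single thread, after all workers finish."""
    current["motion"] = list(pending["motion"])
    current["fluid"] = pending["fluid"]
    current["particles"] = pending["particles"]
    history.append((current["fluid"], current["particles"]))


# Six camera threads plus the fluid solver and the particle system.
barrier = threading.Barrier(NUM_CAMERAS + 2, action=exchange_results)


def camera_worker(index):
    for frame in range(NUM_FRAMES):
        pending["motion"][index] = float(frame + 1)  # stand-in for motion estimation
        barrier.wait()


def fluid_worker():
    for _ in range(NUM_FRAMES):
        pending["fluid"] = sum(current["motion"])  # uses the previous frame's motion
        barrier.wait()


def particle_worker():
    for _ in range(NUM_FRAMES):
        pending["particles"] = current["fluid"] * 0.5  # uses the previous frame's fluid
        barrier.wait()


threads = [threading.Thread(target=camera_worker, args=(i,)) for i in range(NUM_CAMERAS)]
threads += [threading.Thread(target=fluid_worker), threading.Thread(target=particle_worker)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(history)
```

Because every stage reads only the previous frame’s results, no thread waits on another in the middle of a frame. The cost is a frame of latency per stage: in this toy run, camera motion reaches the fluid value one frame later and the particles one frame after that.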

Tech aside, the crowd definitely seemed to like it.

More information @ memo.tv. Photos on Flickr.