
Motion Gaming Technology For Everyone

Omek Interactive wants to put you in the game…and in the TV…and in the computer. The Israel-based company has developed Shadow SDK, a middleware package that enables 3D gesture technology for all types of home media. With Shadow, third-party developers can create realistic video games where your body becomes the controller, or build gesture-controlled TV/media centers and computer interfaces. Omek Interactive demoed some great applications fueled by Shadow at Techonomy 2010. Check them out, along with CEO Janine Kutliroff's presentation, in the video below.

It looks like the human-computer interface of the future could be the open air. I've seen some pretty cool gesture systems that require only a camera and a person's body to control various media devices. The incredible interface from Minority Report could arrive in the next few years: gesture TVs are coming to market soon ("the end of 2010"), and Microsoft's Project Natal should be available at about the same time. Because Shadow-enabled applications can work with video games, Shadow is often compared to Natal. Both can give you real-time control of an avatar, as you'll see in the following video:

Kutliroff's speech ends around 5:40, followed by a media-room gesture control application, a demonstration of an avatar (7:43), and a pretty neat-looking boxing game (8:43).

Of course, one of the big differences between Project Natal and Shadow is that you'll only ever see Natal on the Xbox or other Microsoft platforms. Shadow might be popping up everywhere. At least, that's what Kutliroff and Omek seem to be aiming for. Other companies in the gesture control business are focusing on a single application (Toshiba/Hitachi for TVs and home media, g-speak for computers, and Project Natal for video games). Omek Interactive isn't married to one particular kind of hardware, and it's clearly trying to court a wide range of application developers. While the company has created some interesting demo games and applications, Kutliroff's presentation stays focused on Shadow's middleware status. Shadow is, after all, an SDK. Omek is poised to enable third-party developers to build the next generation of gesture-controlled technologies - probably in video games, but possibly for TVs and computers as well.

The only question I have is whether the products that would sandwich Shadow (3D cameras on one side, gesture-enabled applications on the other) are actually ready. We've seen some depth-perceptive cameras on the market (such as the 3D stereoscopic webcam from Minoru) but they are far from ubiquitous. Likewise, there's been some good buzz surrounding gesture TVs and Project Natal's video games, but neither is actually on sale yet. This is an emerging market, and while the possibilities for gesture controls are very promising, there's no guarantee they'll be popular. Omek could be caught as the middleman between two types of products that never get off the ground.

I must admit that part of my skepticism stems from the fact that gesture controls are not my favorite of the technologies contending to be the next major human-computer interface. As fun as it may be to play a movie with the flip of a wrist, or use your entire body to play a virtual boxing match, these applications lack tactile feedback. There’s nothing to hold. Nothing physical to let you know that you’re actually interacting with something. To me, for gesture controls to really succeed they’ll need some sort of haptics. I’d be totally cool with flailing my limbs through the open air if I could actually feel when my virtual self was hitting something.

Still, my personal preferences aside, the entire body monitoring control scheme seems to be grabbing a lot of attention. Omek Interactive is making a great move by racing to become the definitive middleware solution in the field. If the public does become interested in gesture technology, the Shadow SDK could get some major use. It would let companies that are good at making hardware, and companies that are good at making applications (i.e. games) focus on their strengths while Omek knits them together. That’s a smart strategy and a sure way to enable innovation. It will likely take several years before we know whether gesture controls are here to stay, but Omek is certainly a name to watch while we figure it all out.

Pen + touch Interface

Touchscreen interfaces are the gadget design trend du jour, but that doesn't mean they do everything elegantly. The finger is simply too blunt for many tasks. Microsoft Research's "Manual Deskterity" attempts to combine the strengths of touch interaction with the precision of a pen.

"Everything, including touch, is best for something and worse for something else," says Ken Hinckley, a research scientist at Microsoft who is involved with the project, which will be presented this week at the ACM Conference on Human Factors in Computing Systems (CHI). The Manual Deskterity prototype shown in the video is a drafting application built for the Microsoft Surface, a tabletop touchscreen. Users can perform typical touch actions, such as zooming in and out and manipulating images, but they can also use a pen to draw or annotate those images.
The interface's most interesting features come out when the two types of interaction are combined. For example, a user can copy an object by holding it with one hand and then dragging the pen across the image, "peeling" off a new image that can be placed elsewhere on the screen. By combining pen and hand, users get access to features such as an X-Acto knife, a rubber stamp, and brush painting.
What Was The Inspiration?
Hinckley says the researchers videotaped users working on visual projects with sketchbooks, scissors, glue, and other typical physical art supplies. They noticed that people tended to hold an image with one hand while making notes about it or doing other work related to it with the other. The researchers decided to incorporate this in their interface--touching an object onscreen with a free hand indicates that the actions performed with the pen relate to that object.
Hinckley acknowledges that the interface includes a lot of tricks that users need to learn. But he thinks this is true of most interfaces. "This idea that people just walk up with an expectation of how a [natural user interface] should work is a myth," he says.
Hinckley believes that natural user interfaces can ease the learning process by engaging muscle memory, rather than forcing users to memorize sequences of commands or the layout of menus. If the work is successful, Hinckley says it will show how different sorts of input can be used in combination.
Hinckley also thinks it's a mistake to focus on devices that work with touch input alone. He says, "The question is not, 'How do I design for touch?' or 'How do I design for pen?' We should be asking, 'What is the correct division of labor in the interface for pen and touch interactions such that they complement one another?'"
What's Next?
The researchers plan to follow up by adapting their interface to work on mobile devices. 

Mind-controlled prosthetics without brain surgery

Mind-reading is powerful stuff, but what about hand-reading? Intricate, three-dimensional hand motions have been "read" from the brain using nothing but scalp electrodes. The achievement brings closer the prospect of thought-controlled prosthetics that do not require brain surgery.

Electroencephalography (EEG), which measures electrical activity through the scalp, was previously considered too insensitive to relay the neural activity involved in complex movements of the hands. Nevertheless, Trent Bradberry and colleagues at the University of Maryland, College Park, thought the idea worth investigating.

The team used EEG to measure the brain activity of five volunteers as they moved their hands in three dimensions, and also recorded the movement detected by motion sensors attached to the volunteers' hands. They then correlated the two sets of readings to create a mathematical model that converts one into the other.

In additional trials, this model allowed Bradberry's team to use the EEG readings to accurately monitor the speed and position of each participant's hand in three dimensions.
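The decoding step described above can be sketched as ordinary linear regression: EEG channel readings on one side, measured 3-D hand velocity on the other. The data below are simulated, and the least-squares linear model is an illustrative stand-in for the team's actual method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 1000 time samples, 32 EEG channels, 3-D hand velocity
n_samples, n_channels = 1000, 32
eeg = rng.standard_normal((n_samples, n_channels))
true_weights = rng.standard_normal((n_channels, 3))
hand_velocity = eeg @ true_weights + 0.1 * rng.standard_normal((n_samples, 3))

# Fit a linear decoding model by least squares: velocity ~ eeg @ W
W, *_ = np.linalg.lstsq(eeg, hand_velocity, rcond=None)

# Decode EEG into predicted 3-D hand motion and correlate it with the
# motion-sensor readings, axis by axis, as the team did
predicted = eeg @ W
for axis in range(3):
    r = np.corrcoef(predicted[:, axis], hand_velocity[:, axis])[0, 1]
    print(f"axis {axis}: r = {r:.2f}")
```

In this toy setup the relationship really is linear, so the correlations come out high; the point of the actual study was that real EEG carries enough signal for a model like this to work at all.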

If EEG can, contrary to past expectation, be used to monitor complex hand movements, it might also be used to control a prosthetic arm, Bradberry suggests. EEG is less invasive and less expensive than the implanted electrodes that have previously been used to control robotic arms and computer cursors by thought alone, he says.

Brain Scanners That Read Your Mind

What are you thinking about? Which memory are you reliving right now as you read this? You may believe that only you can answer, but by combining brain scans with pattern-detection software, neuroscientists are prying open a window into the human mind.

In the last few years, patterns in brain activity have been used to successfully predict what pictures people are looking at, their location in a virtual environment or a decision they are poised to make. The most recent results show that researchers can now recreate moving images that volunteers are viewing - and even make educated guesses at which event they are remembering.

Last week at the Society for Neuroscience meeting in Chicago, Jack Gallant, a leading "neural decoder" at the University of California, Berkeley, presented one of the field's most impressive results yet. He and colleague Shinji Nishimoto showed that they could create a crude reproduction of a movie clip that someone was watching just by viewing their brain activity. Others at the same meeting claimed that such neural decoding could be used to read memories and future plans - and even to diagnose eating disorders.

Understandably, such developments are raising concerns about "mind reading" technologies, which might be exploited by advertisers or oppressive governments. Yet despite - or perhaps because of - the recent progress in the field, most researchers are wary of calling their work mind-reading. Emphasising its limitations, they call it neural decoding.

They are quick to add that it may lead to powerful benefits, however. These include gaining a better understanding of the brain and improved communication with people who can't speak or write, such as stroke victims or people with neurodegenerative diseases. There is also excitement over the possibility of being able to visualise something highly graphical that someone healthy, perhaps an artist, is thinking.

So how does neural decoding work? Gallant's team drew international attention last year by showing that brain imaging could predict which of a group of pictures someone was looking at, based on activity in their visual cortex. But simply decoding still images alone won't do, says Nishimoto. "Our natural visual experience is more like movies."

Nishimoto and Gallant started their most recent experiment by showing two lab members 2 hours of video clips culled from DVD trailers, while scanning their brains. A computer program then mapped different patterns of activity in the visual cortex to different visual aspects of the movies such as shape, colour and movement. The program was then fed over 200 days' worth of YouTube clips, and used the mappings it had gathered from the DVD trailers to predict the brain activity that each YouTube clip would produce in the viewers.

Finally, the same two lab members watched a third, fresh set of clips which were never seen by the computer program, while their brains were scanned. The computer program compared these newly captured brain scans with the patterns of predicted brain activity it had produced from the YouTube clips. For each second of brain scan, it chose the 100 YouTube clips it considered would produce the most similar brain activity - and then merged them. The result was continuous, very blurry footage, corresponding to a crude "brain read-out" of the clip that the person was watching.
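In outline, the procedure is an encoding model run in reverse: predict the activity each clip in a large library would evoke, then find and average the clips whose predicted activity best matches a new scan. Here is a toy sketch in which random vectors stand in for both clips and voxel patterns; every name and number is illustrative, not from the study:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical library: each clip is a feature vector; a linear map stands
# in for the fitted encoding model that predicts evoked voxel activity
n_library, n_features, n_voxels = 2000, 20, 100
library_clips = rng.standard_normal((n_library, n_features))
encoding_model = rng.standard_normal((n_features, n_voxels))
predicted_activity = library_clips @ encoding_model  # one row per clip

# A fresh scan, simulated as the (noisy) response to an unseen target clip
target_clip = rng.standard_normal(n_features)
measured_scan = target_clip @ encoding_model + 0.1 * rng.standard_normal(n_voxels)

# Pick the 100 library clips whose predicted activity best matches the scan...
k = 100
distances = np.linalg.norm(predicted_activity - measured_scan, axis=1)
top_k = np.argsort(distances)[:k]

# ...and merge them into a blurry reconstruction, as in the article
reconstruction = library_clips[top_k].mean(axis=0)
print(f"correlation with target clip: "
      f"{np.corrcoef(reconstruction, target_clip)[0, 1]:.2f}")
```

Averaging the top matches is what produces the "very blurry footage" the article describes: shared structure survives the merge, while everything the clips disagree on washes out.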

In some cases, this was more successful than others. When one lab member was watching a clip of the actor Steve Martin in a white shirt, the computer program produced a clip that looked like a moving, human-shaped smudge, with a white "torso", but the blob bears little resemblance to Martin, with nothing corresponding to the moustache he was sporting.

Another clip revealed a quirk of Gallant and Nishimoto's approach: a reconstruction of an aircraft flying directly towards the camera - and so barely seeming to move - with a city skyline in the background omitted the plane but produced something akin to a skyline. That's because the algorithm is more adept at reading off brain patterns evoked by watching movement than those produced by watching apparently stationary objects.

"It's going to get a lot better," says Gallant. The pair plan to improve the reconstruction of movies by providing the program with additional information about the content of the videos.

Team member Thomas Naselaris demonstrated the power of this approach on still images at the conference. For every pixel in a set of images shown to a viewer and used to train the program, researchers indicated whether it was part of a human, an animal, an artificial object or a natural one. The software could then predict where in a new set of images these classes of objects were located, based on brain scans of the picture viewers.

Movies and pictures aren't the only things that can be discerned from brain activity, however. A team led by Eleanor Maguire and Martin Chadwick at University College London presented results at the Chicago meeting showing that our memory isn't beyond the reach of brain scanners.

A brain structure called the hippocampus is critical for forming memories, so Maguire's team focused its scanner on this area while 10 volunteers recalled videos they had watched of different women performing three banal tasks, such as throwing away a cup of coffee or posting a letter. When Maguire's team got the volunteers to recall one of these three memories, the researchers could tell which the volunteer was recalling with an accuracy of about 50 per cent.
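A decoder of this kind can be illustrated with a toy nearest-centroid classifier: learn an average hippocampal pattern for each of the three memories from some trials, then label held-out trials by whichever average pattern they sit closest to. All the data and numbers below are simulated, not from the UCL study:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stand-in: hippocampal voxel patterns for three remembered
# events, with heavy trial-to-trial noise
n_voxels, n_trials = 50, 60
prototypes = rng.standard_normal((3, n_voxels))   # one pattern per memory
labels = np.tile([0, 1, 2], n_trials // 3)        # balanced trial labels
scans = prototypes[labels] + 2.5 * rng.standard_normal((n_trials, n_voxels))

# Train on alternate trials, test on the rest (nearest-centroid decoder)
train, test = np.arange(0, n_trials, 2), np.arange(1, n_trials, 2)
centroids = np.stack(
    [scans[train][labels[train] == c].mean(axis=0) for c in range(3)]
)
dists = np.linalg.norm(scans[test][:, None, :] - centroids[None, :, :], axis=2)
predicted = dists.argmin(axis=1)

accuracy = (predicted == labels[test]).mean()
print(f"decoding accuracy: {accuracy:.0%} (chance = 33%)")
```

As with the real experiment, the benchmark is chance (one in three), and anything reliably above it means the voxel patterns carry information about which memory is being recalled.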

That's well above chance, says Maguire, but it is not mind reading because the program can't decode memories that it hasn't already been trained on. "You can't stick somebody in a scanner and know what they're thinking." Rather, she sees neural decoding as a way to understand how the hippocampus and other brain regions form and recall a memory.

Maguire could tackle this by varying key aspects of the clips - the location or the identity of the protagonist, for instance - and seeing how those changes affect the team's ability to decode the memory. She is also keen to determine how memory encoding changes over the weeks, months or years after memories are first formed.

Meanwhile, decoding how people plan for the future is the hot topic for John-Dylan Haynes at the Bernstein Center for Computational Neuroscience in Berlin, Germany. In work presented at the conference, he and colleague Ida Momennejad found they could use brain scans to predict intentions in subjects planning and performing simple tasks. What's more, by showing people, including some with eating disorders, images of food, Haynes's team could determine who suffered from anorexia or bulimia via brain activity in one of the brain's "reward centres".

Another focus of neural decoding is language. Marcel Just at Carnegie Mellon University in Pittsburgh, Pennsylvania, and his colleague Tom Mitchell reported last year that they could predict which of two nouns - such as "celery" and "airplane" - a subject is thinking of, at rates well above chance. They are now working on two-word phrases.

Their ultimate goal of turning brain scans into short sentences is distant, perhaps impossible. But as with the other decoding work, it's an idea that's as tantalising as it is creepy.

What do you think?

The Future of GUI

Some of us remember a time when a command line interface was the only interface. In many ways, the graphical user interfaces (GUIs) of today are far better than staring at a blinking cursor, though the command line is not without its value. Still, the human-computer interface (HCI) developed in the 1970s at Xerox PARC, combining a desktop-metaphor GUI and a mouse, has remained largely unchanged ever since.

Now R. Clayton Miller proposes the next step in the evolution of HCIs with his 10/GUI concept, which harnesses the power of multi-touch by removing the touch surface from the screen.

Currently almost all GUIs rely on a mouse, which confines a user's hand to a single pair of coordinates, even though the human hand is capable of multiple intricate manipulations. Recent multi-touch interfaces popularized on mobile gadgets, such as Apple's iPhone, have recognized this and proved their worth on smaller handheld devices - so much so that computer makers are now extending multi-touch capabilities to desktop computers, but without the same level of success.

Repositioning the touch surface

Using a traditionally-placed desktop screen as a touch interface, even for short periods, places too much strain on a user’s arm. Touchscreens used in a drafting table configuration lessen the strain on the arms, but increase the chances of neck strain, as users are forced to look down at the display. And in either setup there is also the problem of the user’s hands obstructing the display. Miller’s 10/GUI overcomes these problems by splitting the touch surface from the screen, and using a touch surface similar to a drawing tablet that is large enough to accommodate all ten fingers.

A highly sensitive capacitive array, able to sense the position of each finger and detect individual finger presses, would allow ten circular cross-hairs to be onscreen at once instead of a single mouse pointer. Since this surface would be placed on the desk, users could combine the ergonomic advantages of using a mouse with the benefits of multi-touch.


And the 10/GUI wouldn't just offer multi-touch benefits, such as simpler zooming or rotating of images. Miller also proposes a new way to deal with the problem of multiple windows cluttering up a desktop. To overcome the problem of multiple, arbitrarily placed windows that can be difficult to sort through, Miller has come up with a system he calls CON10UUM, which organizes windows linearly.

Newly-opened windows would appear on the right side of the screen and take up the entire height of the display. Each successive window would slide in from the right, pushing the existing open windows to the left. When managing windows using multi-touch, the greater the number of fingers used, the higher the level where they have effect. For example, using one finger manipulates objects inside applications, while two fingers can be used to scroll or pinch-zoom inside applications.
Now, this is where the 10/GUI interface starts to show the advantages of using more than two fingers. Using three fingers allows the user to move applications around the desktop and pinching will resize the application. Four fingers are used to scroll left or right through the open applications and pinching will zoom the open applications to make it easy to find the desired application.
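The finger-count hierarchy described above maps naturally onto a small dispatch table. This sketch is purely illustrative - the function name and action strings are mine, not part of Miller's specification:

```python
# Illustrative mapping of finger count to CON10UUM-style action level
ACTION_BY_FINGERS = {
    1: "manipulate objects inside the active application",
    2: "scroll or pinch-zoom inside the active application",
    3: "move or resize the application window itself",
    4: "scroll or zoom across all open applications",
}

def interpret_gesture(finger_count: int) -> str:
    """Map a touch event's finger count to its action level: the more
    fingers, the higher in the interface hierarchy the gesture acts."""
    if finger_count not in ACTION_BY_FINGERS:
        raise ValueError(f"unsupported finger count: {finger_count}")
    return ACTION_BY_FINGERS[finger_count]

print(interpret_gesture(3))  # window-level manipulation
```

The appeal of the scheme is exactly this regularity: one rule (more fingers, broader scope) covers everything from in-app editing to application switching.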

Two hands can even be used at once to zoom out with one hand and move applications around with the other. But even the CON10UUM system can become a chore to scroll through with enough open windows, so continuing to zoom out will provide an annotated thumbnail view of the open windows separated by application.

CON10UUM would also see the left and right edges of the touch surface acting as specialized areas for the 10/GUI interface. A subtle ridge would delineate the strips and allow them to be located by the sense of touch. Touching the right edge area would activate global menu options, such as opening applications and shutting the computer down, while touching the left edge area would trigger local menus, such as the current application menu.

In the video Miller has created detailing his concept, which can be seen below, no mention is made of text entry using the system. But at the very end of the video is a rendering of a keyboard with an integrated touch surface along the bottom, similar to a laptop keyboard/touchpad setup. Miller has obviously spent some time devising his 10/GUI concept and, given the increasing popularity of touch-based interfaces, it looks like a viable direction for HCIs to head in.

Miller admits that "relentless prototyping, user testing, and iteration, combined with exacting control over the software and hardware in concert, would be key to transforming these principles into something usable, versatile, and marketable." And while that's definitely true, Miller has already succeeded in his aim to "inform, inspire, and start discussions."

Check out the video below and let me know what you think of Miller’s 10/GUI concept. Can it work? Do you have some ideas to make it better? I'd be very interested to learn your thoughts.