Tuesday, June 23, 2009

VR Game Gun

A couple months ago, I participated in a VR experiment with some CS grad students, who had a head-mounted display as well as a motion tracking device for the hand. All in all, the experience was pretty exciting, but for some reason, the hand tracking was laggy (but the head tracking wasn't; who knows?).

More recently, some guy has created a hackish VR controller, called the PC VR Game Gun:

I've always been a big fan of VR, and this thing is no exception. It uses a gyroscopic mouse to track pitch and yaw, and I think it uses a keyboard/gamepad hooked up to the gun's innards for movement. All told, these features don't seem all that new, and considering the accelerometers and buttons in a Wii controller, using a Wiimote + Zapper might have actually been a better idea.

What DID catch my eye was the fact that it places a screen right along the scope of the barrel. I'm actually a bit puzzled as to exactly how it's anchored to the gun, though I suppose that's a problem a bit of welding and some screws could fix. What I like about this is that the screen follows the hands rather than staying in a fixed place like a regular monitor (e.g. the Wii).

Granted, a head-mounted display would do much of the same, but the problem there is that it requires more expensive location tracking, which requires fixed cameras (i.e. not exactly the most portable setup). A gyroscope detects rotation, so it's fine for turning and aiming, but it can't exactly tell where your hands are in relation to your face. It just seems awkward to be holding a trigger that essentially does nothing for where you're aiming.

In this iteration, if you imagine the screen as simply an enlarged scope on top of a gun (a la CornerShot), then you get a similar amount of immersion at a much lower cost. You won't have as wide a field of view as with an HMD, but the PC VR Game Gun seems to be undergoing a second revision: adding micro projectors on top of the gun to project against the walls.

I'm not quite sure if micro projectors are even powerful enough to do the job; last I checked, most micro projectors output very few lumens and just aren't very bright at range (which you'll probably want when you're waving around a toy gun, shooting virtual mobs). However, the idea itself is rather brilliant; in theory, it'd be no different than running around with a gun/flashlight combo, great for playing a dark game like FEAR. I can't wait for this next iteration to come around.

Sunday, June 21, 2009

LikeHate: Multiple Monitors

I am a big fan of big workspaces, where you can spread out to an extent where everything is everywhere. My office (read: my room) is a very big representation of this fact. Everything I need is spread out within a 270 degree arc of myself; I have desks in front, to the side, and to the back. Just a few small sections are open to get out of this interactive jail cell.

So it's no wonder that I use a plethora of monitors to accentuate this workspace. At the time of this writing, I have four monitors hooked up to my main computer, which is a boon for debugging; there are about a million different things that can give me vital feedback, and I need them visible at all times. In addition to these main four, I still have my laptop, and although I haven't yet hooked up another monitor to it (not enough space right now), that brings me to a whopping potential of six screens to give feedback and/or to work off of.

It's a feedback overload made for an informational junkie like me.

So what's not to love?

Well, unlike the hand, the mouse is not a full extension of the human body. The only feedback of the location of the mouse is a tiny arrow floating around the screen. If you lose visual track of your mouse, you're really left in the dust for a couple seconds, shaking and waving the mouse around in hopes of finding that little arrow again.

I'm not sure if it's even possible to lose track of your own hand. You can directly point at any object from any position with little forethought, and pointing at something with your hand is usually faster than pointing with your mouse. Unfortunately, most consumer computers don't exactly have point-to-screen technology yet, so this is something people have to live with.

A bigger follow-up problem is one of intention; the mouse is your eyes and hands for the computer. It determines what the computer thinks you're currently interacting with, and when there's a disconnect between what you intend to do and what the computer thinks you intend to do, disaster ensues. Try typing random gibberish into IDA Pro, and see if you can recover from the havoc you just blasted onto your assembly code.

The problem's source lies in the fact that the visual cues for these context switches are not especially jarring, making it easy to mistake whether you've actually registered your intention with the computer. In a single-monitor situation, where the user usually has only one window open at a time, switching to another window is represented by the entire screen changing to the target window; the user is given undeniable feedback that the computer has registered the action. With multiple monitors, however, it's much more likely that windows aren't hidden under others. If I switch to one of the other open windows, the only major feedback is the title bar becoming highlighted. That feedback is minimal and easily overlooked or forgotten.

A solution for this is not necessarily simple; darkening the entire contents of non-active windows defeats the purpose of having multiple monitors, and adding extra audio/visual cues on every context switch could be annoying. I also don't know how long it takes for a cue to become ingrained in the user to the point where its absence is a cue in itself.

Wednesday, June 17, 2009

Cogi Mobile, Part 2

In terms of the things that I learned from this project, I suppose there's the obvious:

The API was written in Python using the Django framework.
The Android client was written in Java using the Android API.
The Cogi Mobile website was written using AJAX, with Google App Engine used to tunnel requests.

But honestly, learning each of these languages was not a hard thing to do. Programming languages are only a means to express certain concepts, and being able to familiarize yourself with all these different languages is not in itself particularly meaningful.

It is useful, I suppose, to know how to use AJAX to create a site that serves everything on a single page.

See: Slashdot. They have a continuous-flow script running on their pages such that, as users near the bottom of the page, more stories are loaded onto it.

But AJAX isn't the thing to be learned here. It's the concepts. Concepts are the all important part of it all; once you grasp the concept, you can bring it anywhere.
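The concept can be sketched in a few lines of Python on the server side. Everything here is made up for illustration (the function name, the story list); the shape is what matters: the page asks for the next slice of content, and the server returns just that slice so the page can grow without reloading.

```python
# A minimal sketch of the server side of "infinite scroll":
# the AJAX client asks for stories past a given offset, and the
# server returns just that slice. (Names here are hypothetical,
# not Slashdot's actual script.)

STORIES = [f"Story #{i}" for i in range(1, 101)]  # stand-in data

def fetch_page(offset, limit=10):
    """Return one page of stories; the client calls this
    repeatedly as the user nears the bottom of the page."""
    return STORIES[offset:offset + limit]

# First load gets stories 1-10; scrolling near the bottom
# triggers another request for the next slice.
first_page = fetch_page(0)
next_page = fetch_page(10)
```

The client-side half is just an XMLHttpRequest fired from a scroll handler that appends the returned slice to the page.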

So what did I learn over the course of this project?

Well, from the programming aspect, the concept of DRY (Don't Repeat Yourself) and refactoring code.

In Agile development, there's a big fuss about unit testing; every individual thing must be tested, rather than the whole. Granted, I'm hardly the type to do unit testing on anything; I prefer to think long and hard about something and then get it all down in one shot, rather than get bogged down with writing drivers and tests for some tiny if statement.

But there was something to be gained from the amount of unit testing that went on while I was working with Cogi. Every individual method I wrote had to be tested for correctness, but if I had about a million branch paths in my code, I'd probably have to write 2^million tests to cover them. What happened instead was a revelation: Chris, one of the programmers at Cogi, took my code and refactored it to funnel execution through as few paths as possible. It was really interesting to see the number of tests required for a module go from some 50+ down to maybe 5-10. It's definitely something impossible to learn in a classroom environment, where the objective is simply to get a product working.
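A toy sketch of what that kind of refactoring looks like (a hypothetical example, not Cogi's actual code): the branchy version needs a test per path combination, while the refactored version funnels everything through one normalization step, so a handful of tests covers it.

```python
# Branchy version: every kind/visibility combination is its own
# path, so covering it takes a test per combination.
def describe_branchy(kind, private):
    if kind == "call":
        if private:
            return "private call"
        return "public call"
    elif kind == "note":
        if private:
            return "private note"
        return "public note"
    else:
        if private:
            return "private item"
        return "public item"

# Refactored version: normalize the inputs once, then take a
# single path through the formatting. Same behavior, far fewer
# paths to test.
def describe(kind, private):
    noun = kind if kind in ("call", "note") else "item"
    visibility = "private" if private else "public"
    return f"{visibility} {noun}"
```

Both functions return the same strings; the difference is that the second one's test suite only has to exercise the normalization, not every branch combination.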

From the design aspect, this entire project was about accessibility. For a company providing telephony services, things should be made as accessible as possible. It's not enough that you can retrieve your recordings and things only from computers. Unless we live in a world where desktop computers are built into every vehicle (here's looking at you, WingMan project) and are everywhere we go, services need to be accessible beyond just the primary means.

Tuesday, June 16, 2009

Cogi Mobile

So, for the past half year, I've been working with Cogi to create an API for their underlying system. Cogi provides phone recording and speech-to-text services, but the only forms of access are either through:

1) Their flash applet on their website.
2) Using their landline.

Technically though, the landline only lets you make calls and won't let you access the recordings at all. So in actuality, the only real option users have is the flash applet.

Granted, this isn't a terrible thing at all. Flash penetration is ridiculously high; Adobe claims 99% penetration (reason? YouTube). With such a high penetration rate, it isn't all that bad to make users go through a Flash applet rather than plain HTML/AJAX, plus you can make it look all snazzy.

There are contradicting claims to Adobe's penetration rate, though. If you count any internet-capable device as a possible access point, then thinner devices like smartphones, which don't have Flash capabilities yet, are cut out. There is Flash Lite (haha, Flash "lite," get it? /groan), but as far as I know from speaking with the people at Cogi, it wasn't compatible or something. Given that a large part of the market base is business-oriented users (think managers, lawyers, agents, etc.), a lot of these people aren't going to have access to a Flash-capable device all the time. Maybe they're in meetings all day, or driving from place to place. You get the idea.

So our solution was to create an API that was essentially one giant set of accessor methods. All the information was already available on their servers, so all we had to do was package it in an easily accessible format. The information was all sent as XML over HTTPS; given that basically any language/platform combo is capable of parsing XML and using HTTPS, it was a wise choice.

Of course, that was the easy part. Writing a bunch of accessor methods on top of a database is mind-numbingly dull: get info, get info, get info, serialize into XML, ship over HTTPS. A problem remains, though; XML is not exactly a friendly format for end users. At best, it's a good format for a client application to understand. We still needed to create that client application.
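What one of those accessor methods boiled down to, sketched in Python with hypothetical field names (not Cogi's actual schema): pull a record, serialize it into XML, and hand it off to the HTTPS layer.

```python
import xml.etree.ElementTree as ET

def recording_to_xml(rec):
    """Serialize a recording dict into the kind of XML payload
    the API would ship over HTTPS. Field names are invented
    for illustration."""
    root = ET.Element("recording", id=str(rec["id"]))
    for field in ("caller", "duration", "transcript"):
        child = ET.SubElement(root, field)
        child.text = str(rec[field])
    return ET.tostring(root, encoding="unicode")

xml_payload = recording_to_xml(
    {"id": 42, "caller": "555-0100", "duration": 73,
     "transcript": "Meeting notes..."})
```

The real API had many such methods, but they all followed this same get-serialize-ship pattern.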

In retrospect, spending several months creating a client application was probably not the greatest idea. As it is, a lot of smartphones are disjoint from each other, so creating a client application for a specific platform would only open the market to that single platform. However, that's what we did: we created a client application for the Google Android OS (currently, the HTC Dream/G1 is the only phone that can run Android). Oh well. It wasn't something we had a choice about; we only had an iPhone and an Android phone available, and since iPhone development requires a Mac plus a $99 developer fee, we went with the only choice. We did, however, create a very snazzy-looking interface. It worked relatively well, and thanks to some Photoshop prowess, everything managed to look shiny and nice.

A better choice for a client application only came about much later, about one month from the end of the project: instead of creating a platform-specific application, why not create something even more accessible? Something common to all smartphones is a web browser. So, I whipped up a small web app that accessed the previously created API and parsed the data into a readable format. A working copy can still be found at: http://cogimobile.deviange.net/ The web app isn't exactly as polished or graphically pleasing (read: text-only) as our client application (icons and colors galore), but in terms of accessibility, it's worlds beyond what a platform-specific app can give. And I think that's extremely key.
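The web app's core job can be sketched the same way (field names are hypothetical): take the XML the API returns and flatten it into the plain text a mobile browser can display.

```python
import xml.etree.ElementTree as ET

def render_recording(xml_text):
    """Flatten an API XML payload into readable plain text,
    one line per field. Tag names are invented for illustration."""
    root = ET.fromstring(xml_text)
    lines = [f"Recording {root.get('id')}"]
    for child in root:
        lines.append(f"  {child.tag}: {child.text}")
    return "\n".join(lines)

page = render_recording(
    '<recording id="7"><caller>555-0123</caller></recording>')
```

No icons, no colors, but it runs in any browser, which was the whole point.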

Tuesday, June 9, 2009


I suppose everything needs a purpose, and as such, so does this blog.

What would this blog serve, you ask? More than likely just notes on the day-to-day things that pique my interest. The brain and how it wanders is an interesting thing, you see. But as it wanders and explores the world, it forgets what it finds in an instant. Hopefully, this blog will capture what crosses my mind as I explore the world in all of its glory.

Note to self: The language in that previous paragraph is hilarious. It's uppity, pretentious, and probably what I thought visitors would expect in an objective statement.

Anyways, don't expect much out of this blog. It'll probably end up being nothing more than word vomit, spewing endlessly forth from my brain. Edit nothing! Leave all for history to see! This blog is more likely just a conversation with myself: past, present, and future.

Note again: Considering that this will probably be seen by all future employers (Google is the all-seeing eye, with an all-access pass to everyone), I'll probably keep the usual day-to-day razzmatazz out of the way. I'm not that kind of person anyways.

REM sleep helps solve problems.