Thursday, March 31, 2011

Adding Game Objects

This week I've been spending my time implementing some simple objects in the game in preparation for my beta review next Tuesday. For the beta I will only be using simple primitive shapes (cube, sphere, and cylinder), but I'll eventually add .obj support so that users can load in custom objects.
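For reference, creating those placeholder primitives looks roughly like the sketch below. This is just a sketch assuming Direct3D 9 and the D3DX utility library; the dimensions and tessellation counts are arbitrary example values, not the ones I'm actually using.

```cpp
#include <d3dx9.h>

// Minimal sketch: D3DX's stock shape helpers make quick placeholder meshes.
// Sizes and slice/stack counts below are arbitrary example values.
void CreatePrimitives(IDirect3DDevice9* device,
                      ID3DXMesh** cube, ID3DXMesh** sphere, ID3DXMesh** cylinder)
{
    D3DXCreateBox(device, 1.0f, 1.0f, 1.0f, cube, NULL);                 // unit cube
    D3DXCreateSphere(device, 0.5f, 24, 24, sphere, NULL);                // radius, slices, stacks
    D3DXCreateCylinder(device, 0.5f, 0.5f, 1.5f, 24, 1, cylinder, NULL); // radii, length
}
```

Each helper returns an ID3DXMesh that can be drawn with DrawSubset(0), which is plenty for beta placeholder geometry.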

Here are some simple screen caps of the objects. I've set up some basic generic lighting to give the objects a little dimension, but shaders/textures will be a post-beta feature.

A couple of cubes
Cubes are joined by their friend the sphere

I've also been working with the OpenNI/PrimeSense frameworks to get input ready for the beta. I'm working on getting the interactions to the point where they are exactly what I want for the final version (i.e. intuitive grabbing and moving/rotating). If I can't get that up and running in time, I'm also considering a gesture-based interface, where the user makes gestures with their hands to change the mode of the program and allow different types of manipulation.
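To give a sense of what I mean by grabbing and moving, here's a rough sketch of the kind of interaction logic I have in mind. All of the names here are hypothetical rather than classes from my actual code, and how "grabbing" gets detected is still an open question.

```cpp
// Hypothetical types, purely for illustration.
struct Object    { float pos[3]; };
struct HandState {
    float pos[3];      // palm position reported by the hand tracker
    float prevPos[3];  // palm position on the previous frame
    bool  grabbing;    // however "grab" ends up being detected
};

// While the hand is "grabbing" an object, drag the object by the hand's
// frame-to-frame motion; a similar rule could map hand motion to rotation
// when the program is in a rotate mode instead.
void ApplyGrab(Object& obj, const HandState& hand)
{
    if (!hand.grabbing)
        return;
    for (int i = 0; i < 3; ++i)
        obj.pos[i] += hand.pos[i] - hand.prevPos[i];
}
```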

Thursday, March 24, 2011

Simplifying Input with OpenNI and PrimeSense

I got back from break very refreshed and ready to dive back into this project. I started work on integrating OpenNI to do the hand tracking for my application (as per my design), and discovered that I had grossly underestimated the power of the OpenNI libraries. I had to reorganize my software design significantly, but I think my new design is much simpler and is now based on only two third-party libraries: OpenNI and DirectX (technically three, since I'm still using pthreads).

I've been able to cut out fdlib completely, which I've wanted to do since the beginning as it is closed source and proprietary (but was good for getting quick results). OpenNI has a skeletal tracking system very similar to the Microsoft SDK (I'm assuming), which lets me get not only head position but also rotation, which will make the head-tracking display much more accurate and intuitive for the user. I've also replaced libfreenect with alternative drivers from PrimeSense that work with OpenNI's data node architecture, so that OpenNI can pull the information from the device instead of my having to feed the information to it. Overall this significantly simplifies my program flow, and will make it much easier to troubleshoot issues with input.
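For the curious, pulling the head joint out of OpenNI looks roughly like the sketch below. This is simplified and from memory, so some names may be slightly off, and I'm leaving out the new-user and calibration callbacks that OpenNI needs before skeleton tracking actually starts.

```cpp
#include <XnCppWrapper.h>

int main()
{
    xn::Context context;
    context.Init();

    xn::UserGenerator user;
    user.Create(context);
    user.GetSkeletonCap().SetSkeletonProfile(XN_SKEL_PROFILE_ALL);
    // (Real code also registers new-user and calibration callbacks here.)

    context.StartGeneratingAll();

    while (true)
    {
        context.WaitAndUpdateAll();

        XnUserID users[4];
        XnUInt16 count = 4;
        user.GetUsers(users, count);

        if (count > 0 && user.GetSkeletonCap().IsTracking(users[0]))
        {
            XnSkeletonJointPosition    headPos;
            XnSkeletonJointOrientation headRot;
            user.GetSkeletonCap().GetSkeletonJointPosition(users[0], XN_SKEL_HEAD, headPos);
            user.GetSkeletonCap().GetSkeletonJointOrientation(users[0], XN_SKEL_HEAD, headRot);
            // headPos.position is in millimetres; headRot.orientation is a
            // 3x3 rotation matrix -- both feed straight into my camera setup.
        }
    }
    return 0;
}
```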

Here are a couple of screenshots showing OpenNI in action:

Skeletal tracking being shown. Note the placement of the head joint is exactly where I will be positioning my camera to create the virtual window effect.


OpenNI's hand tracking. The white dot shows where OpenNI perceives my palm to be.

I've also been working on the system for adding and manipulating objects in the world, but that sadly does not have shiny screenshots at the moment.

Self-Review

As part of this blog post I've been asked to perform a self-review of my progress thus far.

I've been making steady progress in setting up the Kinect as an input device. I've had several setbacks and redesigns, but on the whole I've succeeded in getting the information I need out of the Kinect. I now have everything working the way I want, and I'm confident that I shouldn't have any more major problems (famous last words, I know) in terms of the Kinect.

My progress with the actual application mechanics, however, has been lagging because of all the focus I've given to getting the Kinect to work. I really need to focus on getting some of the basic mechanics in during the next two weeks, because I would like to have a simple playable demo by the beta review period, and currently all I have is unused input.

I'm pretty confident that I'll be able to deliver a final product that I'm proud of, but there is definitely plenty of work yet to be done. In retrospect, I think it would have been smarter to build the application without the Kinect input first and then work on integrating the Kinect, rather than the other way around; that way, every time I got more functionality out of the Kinect I'd have had video to show and a demo to play.

tl;dr I think that I've made some decent progress, but things are going a little more slowly than I would like, and I'm going to have to really ramp it up for the Beta and Final reviews.

Thursday, March 3, 2011

Alpha Review

This past Friday was our Alpha Review. For the review I compiled a four-minute video which outlined my project, examined my approach, and displayed the results I have so far. The video is embedded below, but it does not have any audio, as I narrated it in person.



I didn't say too much that hasn't already been said on my blog, but the last segment in which I show some of my initial results with the Kinect is notable. I've got simple head-tracking working, and I'm using it to render a rotating cube in 3D. I recorded that segment of the video using my cellphone held in front of my face (hence the rather poor video quality).

My application framework is fully set up. The next step will be adding the OpenNI libraries and using them to process the data from the Kinect, instead of the simple face detection library I used to demo head tracking for my alpha review.

This coming week is spring break (yay!) so my next update won't be for two weeks. By that point I'll hopefully have OpenNI fully integrated and tracking the user's head and hands.

Thursday, February 17, 2011

Slow Week This Week

Project progress has been pretty slow this week because of midterms and papers. Also, DXUT (the DirectX version of GLUT) decided that it didn't want to work nicely with my laptop so I've been spending some precious time troubleshooting it.

I went to an excellent GRASP lecture by Zhengyou Zhang from Microsoft Research. The title of the talk was "Human Activity Understanding with Depth Sensors", and it covered how a lot of the functionality provided with the Kinect SDK works. The researchers at Microsoft are able to derive a skeleton for up to 4 humans in front of the Kinect in about 2 ms per frame. This gives me hope that the image processing segment of my application will not be the performance drain that I fear.

The talk was also interesting because it confirmed that even if I had access to the Kinect SDK, I would still have to do the majority of the work I'm currently doing, as the Kinect SDK does not provide the level of detail regarding hands that I require. It would fully take care of head tracking and would give me a rough idea of where the hands are, but I'd still need more precision to create a good user experience.

The Alpha Review is next Friday, and I'm aiming to have head tracking fully working by then.

Thursday, February 10, 2011

Ultimate Design Post

This week I’ve been getting my hands dirty with the different software components of my project, and I’ve spent a lot of time designing how the final thing is going to run.

The Big Picture

This picture depicts a side view of what the player will see. An actual observer from this angle would just see the player waving their hands in the air in front of their screen, but because of the head tracking, it appears to the player that they are grabbing objects that are floating in the air in front of them. This effect is best illustrated in Johnny Chung Lee’s video about head tracking using the Wiimote:



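In terms of the math behind that virtual window effect: the idea is to treat the physical screen as a fixed window in space and rebuild an asymmetric projection frustum from the tracked head position every frame. Here's a rough sketch, assuming Direct3D 9's D3DX math helpers and my own made-up parameter names (head position and screen half-size measured in the same units, relative to the screen centre):

```cpp
#include <d3dx9.h>

// Rough sketch of a head-coupled, off-centre projection. headX/Y/Z is the
// tracked head position relative to the screen centre; halfW/halfH is half
// the physical screen size, in the same units.
D3DXMATRIX HeadTrackedProjection(float headX, float headY, float headZ,
                                 float halfW, float halfH,
                                 float zNear, float zFar)
{
    // Project the screen edges onto the near plane as seen from the head.
    float l = zNear * (-halfW - headX) / headZ;
    float r = zNear * ( halfW - headX) / headZ;
    float b = zNear * (-halfH - headY) / headZ;
    float t = zNear * ( halfH - headY) / headZ;

    D3DXMATRIX proj;
    D3DXMatrixPerspectiveOffCenterLH(&proj, l, r, b, t, zNear, zFar);
    return proj;
}
```

The view matrix then gets translated by the head offset so the scene stays pinned to the physical screen, which is essentially what Johnny Lee's demo is doing.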
Software Details
I’m working with a lot of different software libraries, and I need to tie them together in a way that works, but is also fast and efficient so that the user experience isn’t degraded by lag. The biggest issue is that the Kinect’s camera is only 30 fps, so input is not going to be instantaneous.

I’ve decided to make my application threaded using the pthreads library and I’m assigning rendering and input processing to different threads. My main goal with this approach is to allow one thread to be calling into DirectX as much as possible so that even though input is only coming in and being processed at a 30fps maximum, the rendering rate will be much higher.

The program will consist of the following class structure:


Both the input manager and the game manager will be given a minimum of 1 thread each, with more depending on the platform (my laptop supports 2 hardware threads, but the 360 has 6).

Input Manager
The input manager busy-waits on the Kinect until the next frame is ready. It then processes the data and places the results into a thread-safe structure, so that the game manager can access them and update the game state accordingly. When the input manager has more than one thread available to it, it will spawn helper threads to process the data from the Kinect, since there is obvious parallelism there (face detection and hand detection can easily be done concurrently).
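The thread-safe structure itself can be as simple as a mutex-guarded snapshot that the input manager overwrites and the game manager copies out. A sketch (the fields are hypothetical):

```cpp
#include <pthread.h>

// Hypothetical snapshot of one processed Kinect frame.
struct InputState
{
    float    headPos[3];
    float    handPos[3];
    bool     handGrabbing;
    unsigned frameId;      // lets the game manager ignore stale frames
};

static InputState      g_latest = {};
static pthread_mutex_t g_latestLock = PTHREAD_MUTEX_INITIALIZER;

// Input manager: publish a freshly processed frame.
void PublishInput(const InputState& s)
{
    pthread_mutex_lock(&g_latestLock);
    g_latest = s;
    pthread_mutex_unlock(&g_latestLock);
}

// Game manager: grab a consistent copy at the top of each update tick.
InputState ReadInput()
{
    pthread_mutex_lock(&g_latestLock);
    InputState copy = g_latest;
    pthread_mutex_unlock(&g_latestLock);
    return copy;
}
```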

Game Manager
The game manager checks the thread-safe structure for changes, and then updates the game state based on elapsed time and sends the new scene to DirectX for rendering. The update is broken up into

Thursday, February 3, 2011

Prototype Technology

In this post I'm going to outline all of the technology I will be using in my initial prototype, which I will be developing over the next few weeks.

Input:

LibFreenect: I will be using the LibFreenect library to extract the color and depth information from the Kinect. The library will also be used during the calibration phase of the application to change the Kinect's viewing angle. I will be wrapping the library in a KinectHandler class so that in the future I can easily replace it with an official SDK without changing the entire structure of the program.
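The KinectHandler wrapper will look something like the sketch below. It's simplified and from memory, so treat the exact LibFreenect calls as approximate; the point is that the rest of the program only ever talks to the wrapper.

```cpp
#include <libfreenect.h>
#include <stdint.h>

// Simplified sketch of the wrapper; error checking mostly omitted.
class KinectHandler
{
public:
    bool Init()
    {
        if (freenect_init(&m_ctx, NULL) < 0) return false;
        if (freenect_open_device(m_ctx, &m_dev, 0) < 0) return false;
        freenect_set_depth_callback(m_dev, DepthCallback);
        freenect_set_video_callback(m_dev, VideoCallback);
        freenect_start_depth(m_dev);
        freenect_start_video(m_dev);
        return true;
    }

    // Used during calibration to aim the Kinect's motorized tilt.
    void SetTilt(double degrees) { freenect_set_tilt_degs(m_dev, degrees); }

    // Pump LibFreenect so the callbacks below fire with new frames.
    void Update() { freenect_process_events(m_ctx); }

private:
    static void DepthCallback(freenect_device*, void* depth, uint32_t timestamp)
    {
        // hand the raw depth buffer off to the rest of the app
    }
    static void VideoCallback(freenect_device*, void* rgb, uint32_t timestamp)
    {
        // hand the RGB buffer off to the rest of the app
    }

    freenect_context* m_ctx;
    freenect_device*  m_dev;
};
```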

OpenNI: I will be using the OpenNI library to determine the user's hand positioning and what they are doing with their hands (i.e. grabbing an object). I chose this library because it is open source under LGPL, has excellent community support, and is well suited to processing the images extracted from the Kinect.

Fdlib: I will be using the fdlib library for face detection. I chose this library because it is both fast and simple to integrate, which makes it a very strong candidate for incorporation in the prototype. The fact that it is both proprietary and closed-source means that I will most likely be replacing it later in development, but for the purposes of rapid prototyping and getting to a play-testable state I think it will do nicely for now. I will be wrapping the library in a class so that it is easy to replace without altering the program structure.
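The wrapper I have in mind is just a small interface that hides fdlib behind it, something like the sketch below. All names here are mine, and the fdlib call itself is left as a comment since only the wrapper should ever see it; swapping in something else later then just means adding another FaceDetector implementation.

```cpp
// Illustrative interface only; the names are hypothetical.
struct FaceResult
{
    int  x, y;      // centre of the detected face in image coordinates
    int  size;      // rough face size in pixels
    bool found;
};

class FaceDetector
{
public:
    virtual ~FaceDetector() {}
    // Greyscale frame in, rough head position out.
    virtual FaceResult Detect(const unsigned char* gray, int width, int height) = 0;
};

class FdlibFaceDetector : public FaceDetector
{
public:
    FaceResult Detect(const unsigned char* gray, int width, int height)
    {
        FaceResult result = { 0, 0, 0, false };
        // ... fdlib detection call goes here ...
        return result;
    }
};
```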

Output:

DirectX: I will be using DirectX to handle my output. The reason I'm choosing DirectX instead of OpenGL is that I think my game would ultimately be released on the Xbox 360 platform (and perhaps Windows), which means I must use DirectX for rendering; and since I don't foresee releases on platforms that don't support DirectX, implementing OpenGL seems unnecessary. I'd also like to get more experience working with DirectX, as it is the current game industry standard. I will be writing a wrapper around my output so that I can swap out DirectX for something else if I change my mind.

That's an overview of the software I'll be using in my project. I'm sure there will be future posts that further explore the good and bad of each library.

Thursday, January 27, 2011

Getting Started with the Kinect

We got a Kinect in the lab this week and so I restructured my schedule a little bit in order to get my hands on it as quickly as possible.

I've got my laptop set up with the LibFreenect open source library for interfacing with the Kinect device, and have begun to extract some data:



These images are screen grabs from one of the LibFreenect sample projects and show the extracted RGB image from the Kinect and a depth image that has been colorized (red is close to the camera, dark blue is far away, and black indicates a lack of depth information).

From this output, I've determined a couple of important details about the Kinect. The user has to be about two feet away from the Kinect, or else the device will not be able to accurately determine their depth. This occurs because the Kinect uses an infrared projector to cast rays into the environment which the camera picks up, and objects that are too close to the Kinect get hit with so many rays that they get washed out. There is also a depth "shadow" that gets cast: when one object is in front of another, the device cannot determine the depth of the background object around the left border of the foreground object.
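Those washed-out and shadowed regions are easy to spot programmatically. As far as I can tell, LibFreenect's 11-bit depth mode reports pixels it couldn't measure as the maximum raw value (2047), so a quick sanity check looks something like this (worth double-checking against the headers):

```cpp
#include <stdint.h>
#include <stddef.h>

// Count pixels with no usable depth reading (washed-out near objects,
// depth shadows, IR-absorbing surfaces). Assumes 11-bit raw depth where
// 2047 appears to mean "no data".
size_t CountInvalidDepth(const uint16_t* depth, size_t pixelCount)
{
    size_t invalid = 0;
    for (size_t i = 0; i < pixelCount; ++i)
        if (depth[i] >= 2047)
            ++invalid;
    return invalid;
}
```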

I don't think these artifacts should cause any significant problems down the road, because I only need to track the hands with precision (and they should be in front of everything) and get a general position for the head (I don't need facial features or any specific data, just a generalized head position). So even if the player's hands are obscuring their face, I shouldn't have trouble gathering the information I need.

It seems like working with the LibFreenect library should be fairly straightforward moving forward. The library provides a series of C-style function calls to the Kinect device, and I think that wrapping a class around these should allow me to integrate the library into my application in a suitably abstract fashion.

Next week I'll be researching the hand/head recognition algorithms/software that I want to use, and getting something resembling the above demo working using DirectX for rendering.