Thursday, February 17, 2011

Slow Week This Week

Project progress has been pretty slow this week because of midterms and papers. Also, DXUT (the DirectX utility framework, roughly the DirectX equivalent of GLUT) decided that it didn't want to work nicely with my laptop, so I've spent some precious time troubleshooting it.

I went to an excellent GRASP lecture by Zhengyou Zhang from Microsoft Research. The talk, titled "Human Activity Understanding with Depth Sensors", covered how much of the functionality provided with the Kinect SDK works. The researchers at Microsoft are able to derive skeletons for up to 4 people in front of the Kinect in roughly 2 ms per frame. This gives me hope that the image processing segment of my application will not be the performance drain I'm fearful of.

The talk was also interesting because it confirmed that, even if I had access to the Kinect SDK, I would still have to do the majority of the work I'm currently planning, since the SDK does not provide the level of detail about hands that I require. It would fully take care of head tracking and give me a rough idea of where the hands are, but I'd still need more precision to create a good user experience.

The Alpha Review is next Friday, and I'm aiming to have head tracking fully working by then.

Thursday, February 10, 2011

Ultimate Design Post

This week I’ve been getting my hands dirty with the different software components of my project, and I’ve spent a lot of time designing how the final thing is going to run.

The Big Picture

This picture depicts a side view of what the player will see. An actual observer at this angle would just see the player waving their hands in the air in front of their screen, but because of the head tracking, the player perceives the objects as floating in the air in front of them, there to be grabbed. This effect is best illustrated in Johnny Chung Lee’s video about head tracking using the Wiimote:



Software Details
I’m working with a lot of different software libraries, and I need to tie them together in a way that works, but is also fast and efficient so that the user experience isn’t degraded by lag. The biggest issue is that the Kinect’s camera only runs at 30 fps (roughly 33 ms per frame), so input is never going to be instantaneous.

I’ve decided to make my application multithreaded using the pthreads library, assigning rendering and input processing to separate threads. My main goal with this approach is to keep one thread calling into DirectX as much as possible, so that even though input arrives and is processed at 30 fps at most, the rendering rate will be much higher.
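A minimal sketch of that split might look like the following. All the names here are mine, and the real loop bodies (Kinect polling, DirectX calls) are stubbed out with comments:

```cpp
#include <pthread.h>
#include <atomic>

// Shared shutdown flag, visible to both threads (hypothetical name).
static std::atomic<bool> g_running(true);

// Input thread: polls the Kinect at its ~30 fps rate.
void* inputLoop(void*) {
    while (g_running) {
        // pollKinectFrame(); processFrame();  // placeholder work
    }
    return nullptr;
}

// Render thread: calls into DirectX as often as possible, decoupled
// from the 30 fps input rate.
void* renderLoop(void*) {
    while (g_running) {
        // renderScene();  // placeholder DirectX call
    }
    return nullptr;
}

// Spawn both threads, then shut down immediately (a real app would
// clear g_running when the player quits). Returns 0 on clean join.
int runThreads() {
    pthread_t input, render;
    pthread_create(&input, nullptr, inputLoop, nullptr);
    pthread_create(&render, nullptr, renderLoop, nullptr);
    g_running = false;
    pthread_join(input, nullptr);
    pthread_join(render, nullptr);
    return 0;
}
```

The key property is that neither loop waits on the other: the renderer never blocks on the camera.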

The program will consist of the following class structure:


Both the input manager and the game manager will be given a minimum of 1 thread each, with more depending on the platform (my laptop supports 2 hardware threads, but the 360 supports 6).

Input Manager
The input manager busy-waits on the Kinect until the next frame is ready. It then processes the data and places the results into a thread-safe structure so that the game manager can access them and update the game state accordingly. When the input manager has more than one thread available to it, it will spawn helper threads to process the data from the Kinect, since there is obvious parallelism there (face detection and hand detection can easily run concurrently).
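One simple form that thread-safe structure could take is a mutex-guarded "latest result" slot: the input manager overwrites it each frame, and the game manager reads the newest copy whenever it ticks. This is only a sketch under my own assumptions about what the processed data looks like:

```cpp
#include <pthread.h>

// Hypothetical processed-input record: head and hand positions.
struct InputState {
    float headX, headY, headZ;
    float handX, handY, handZ;
};

// Mutex-guarded single-slot mailbox between the two managers.
class InputSlot {
public:
    InputSlot() : hasData_(false) { pthread_mutex_init(&lock_, nullptr); }
    ~InputSlot() { pthread_mutex_destroy(&lock_); }

    // Called by the input manager after processing a Kinect frame;
    // older unread data is simply overwritten.
    void publish(const InputState& s) {
        pthread_mutex_lock(&lock_);
        state_ = s;
        hasData_ = true;
        pthread_mutex_unlock(&lock_);
    }

    // Called by the game manager each tick; returns false if no
    // frame has arrived yet.
    bool latest(InputState* out) {
        pthread_mutex_lock(&lock_);
        bool ok = hasData_;
        if (ok) *out = state_;
        pthread_mutex_unlock(&lock_);
        return ok;
    }

private:
    pthread_mutex_t lock_;
    InputState state_;
    bool hasData_;
};
```

Overwriting rather than queueing means the game only ever acts on the freshest input, which matters more here than never dropping a frame.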

Game Manager
The game manager checks the thread-safe structure for changes, and then updates the game state based on elapsed time and sends the new scene to DirectX for rendering. The update is broken up into
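Updating "based on elapsed time" just means scaling each change by the seconds since the last tick, so the simulation runs at the same speed regardless of frame rate. A toy version, with names and units of my own invention:

```cpp
// Hypothetical floating object the player can grab.
struct FloatingObject {
    float y;         // height in world units
    float velocity;  // world units per second
};

// Advance the object by dt seconds; frame-rate independent because
// a fast renderer calls this often with small dt, a slow one rarely
// with large dt, and the object covers the same distance either way.
void updateObject(FloatingObject* obj, float dtSeconds) {
    obj->y += obj->velocity * dtSeconds;
}
```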

Thursday, February 3, 2011

Prototype Technology

In this post I'm going to outline all of the technology I will be using in my initial prototype, which I will be developing over the next few weeks.

Input:

LibFreenect: I will be using the libfreenect library to extract the color and depth information from the Kinect. The library will also be used during the calibration phase of the application to change the Kinect's viewing angle. I will be wrapping the library in a KinectHandler class so that in the future I can easily swap in an official SDK without changing the entire structure of the program.
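The wrapper idea amounts to an abstract interface that the rest of the program codes against. The method names below are my own guesses at what the class might expose, not the actual design; the real subclass would forward to libfreenect calls such as freenect_set_tilt_degs():

```cpp
// Hypothetical interface: the rest of the program only sees this,
// so libfreenect can later be replaced by an official SDK.
class KinectHandler {
public:
    virtual ~KinectHandler() {}
    // Fill caller-provided buffers with the latest frames; return
    // false if no new frame is available yet.
    virtual bool getDepthFrame(unsigned short* depth, int w, int h) = 0;
    virtual bool getColorFrame(unsigned char* rgb, int w, int h) = 0;
    // Tilt the Kinect's motor during the calibration phase.
    virtual void setTiltAngle(double degrees) = 0;
};

// Do-nothing stand-in, here only so the interface can be exercised;
// the real subclass would call into libfreenect.
class NullKinect : public KinectHandler {
public:
    double lastTilt = 0.0;
    bool getDepthFrame(unsigned short*, int, int) { return false; }
    bool getColorFrame(unsigned char*, int, int) { return false; }
    void setTiltAngle(double degrees) { lastTilt = degrees; }
};
```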

OpenNI: I will be using the OpenNI library to determine the user's hand positioning and what they are doing with their hands (ie. grabbing an object). I chose this library because it is open source under LGPL, has excellent community support, and is well suited for processing the images extracted from the Kinect.

Fdlib: I will be using the fdlib library for face detection. I chose this library because it is both fast and simple to integrate, which makes it a very strong candidate for incorporation in the prototype. The fact that it is both proprietary and closed-source means that I will most likely replace it later in development, but for the purpose of rapid prototyping and getting to a play-testable state I think it will do nicely for now. I will be wrapping the library in a class so that it is easy to replace without altering the program structure.

Output:

DirectX: I will be using DirectX to handle my output. I'm choosing DirectX over OpenGL because I think my game would ultimately be released on the Xbox 360 platform (and perhaps Windows), which means I must use DirectX for rendering, and since I don't foresee releases on platforms that don't support DirectX, implementing an OpenGL path seems unnecessary. I'd also like to get more experience working with DirectX, as it is the current game industry standard. I will be writing a wrapper around my output so that I can swap out DirectX for something else if I change my mind.
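The output wrapper can follow the same pattern as the input one: an abstract interface the game code draws through, with a DirectX-backed subclass behind it. Again, these names are mine for illustration only:

```cpp
// Hypothetical output abstraction; a DirectXRenderer subclass would
// own the actual D3D device and issue the real draw calls.
class Renderer {
public:
    virtual ~Renderer() {}
    virtual void beginFrame() = 0;
    virtual void drawScene() = 0;   // scene parameters omitted
    virtual void endFrame() = 0;
};

// Counting stub, here only to show the interface in use.
class CountingRenderer : public Renderer {
public:
    int frames = 0;
    void beginFrame() {}
    void drawScene() {}
    void endFrame() { ++frames; }
};
```

Because the game loop only ever holds a Renderer*, swapping DirectX for something else later touches one subclass rather than the whole program.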

That's an overview of the software I'll be using in my project. I'm sure there will be future posts that further explore the good and bad of each library.