Friday, April 22, 2011

Last Minute Features and a Poster

Our final presentation is next Thursday, so I've been working to get the last few features I want implemented before then. Specifically, I've been improving the physics system and adding a few more user controls, such as allowing users to rotate objects gesturally.

I'll also be presenting a poster about my project next Monday from 10am-12pm for the CS Day Fair, and again on Thursday for the CG@Penn Poster Day.
My Poster - Just a warning, it's a very large image...it has to look good at 40x30 inches, after all.
Edit - I noticed that for some reason the background appears black in the thumbnail...it appears the correct color (white) when the full image is viewed (click on it).

Thursday, April 14, 2011

Refining the Input

I spent my time this week polishing the way the player interacts with the environment. Much of that time went into mapping the real world into the game world so that the user does not need to swing their arms around in order to reach the outer edges of the game space.
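The core of that mapping is conceptually just a scale, offset, and clamp from a comfortable "reach box" in front of the player into the game volume. Here's a minimal sketch of the idea; all of the ranges and names below are placeholders for illustration, not the values in my actual code:

```cpp
// Illustrative sketch: map a real-world hand position (Kinect coordinates,
// in millimeters) into the game volume, so a comfortable range of arm
// motion covers the whole play space. All of the ranges here are made up.
struct Vec3 { float x, y, z; };

static float Remap(float v, float inLo, float inHi, float outLo, float outHi)
{
    float t = (v - inLo) / (inHi - inLo);
    if (t < 0.0f) t = 0.0f;   // clamp so the hand can never leave
    if (t > 1.0f) t = 1.0f;   // the game volume
    return outLo + t * (outHi - outLo);
}

Vec3 RealToGameSpace(const Vec3& hand)
{
    // Assumed "reach box" in front of the player, in mm.
    const Vec3 realMin = { -400.0f, -300.0f,  600.0f };
    const Vec3 realMax = {  400.0f,  300.0f, 1400.0f };
    // Target game volume, in world units.
    const Vec3 gameMin = { -10.0f, -7.5f,  0.0f };
    const Vec3 gameMax = {  10.0f,  7.5f, 20.0f };

    Vec3 out;
    out.x = Remap(hand.x, realMin.x, realMax.x, gameMin.x, gameMax.x);
    out.y = Remap(hand.y, realMin.y, realMax.y, gameMin.y, gameMax.y);
    out.z = Remap(hand.z, realMin.z, realMax.z, gameMin.z, gameMax.z);
    return out;
}
```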

I've also been working on reducing some of the noise in the data, which causes the objects to vibrate even when the player is holding their head stationary (this can be seen in last week's video). So far I've been able to reduce the noise slightly, but the objects still move a little when the player is stationary.
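One common way to damp this kind of jitter is a simple exponential moving average on each tracked position. A minimal sketch of that approach (the smoothing factor is an illustrative guess, not a tuned value):

```cpp
// Exponential smoothing of a tracked position to reduce frame-to-frame
// jitter. An alpha near 1.0 follows the raw data closely (more noise);
// an alpha near 0.0 is smoother but laggier. 0.3f is only illustrative.
struct Vec3 { float x, y, z; };

Vec3 SmoothPosition(const Vec3& previous, const Vec3& raw, float alpha = 0.3f)
{
    Vec3 s;
    s.x = previous.x + alpha * (raw.x - previous.x);
    s.y = previous.y + alpha * (raw.y - previous.y);
    s.z = previous.z + alpha * (raw.z - previous.z);
    return s;
}
```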

Finally, I added some more simple objects to the game this week. Specifically I added a platform object, which will form the basis for building structures once the simple physics system is implemented.

By next week I'm hoping to have further improved the processing of the data from the Kinect and to have implemented a very simple gravity system, such that objects will fall through the space unless they hit another object. Platforms will be immune to gravity, which will allow the user to establish a ground plane (or planes) as they see fit.
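To make the plan concrete, here is a rough sketch of what such a gravity pass could look like each frame. The structure, names, and the crude support test are all hypothetical; the point is just the platform exemption and the "fall until supported" behavior:

```cpp
// Hypothetical per-frame gravity pass: everything except platforms
// accelerates downward until it comes to rest on top of another object.
#include <cmath>
#include <vector>

struct Block {
    float x, y, z;       // position of the object's center
    float vy;            // vertical velocity
    float halfSize;      // treated as a cube for this crude test
    bool  isPlatform;    // platforms are immune to gravity
};

void ApplyGravity(std::vector<Block>& objects, float dt)
{
    const float g = -9.8f;

    for (size_t i = 0; i < objects.size(); ++i) {
        Block& obj = objects[i];
        if (obj.isPlatform) continue;

        obj.vy += g * dt;
        float newY = obj.y + obj.vy * dt;

        // Very rough support test: stop when we land on top of something.
        for (size_t j = 0; j < objects.size(); ++j) {
            if (i == j) continue;
            const Block& other = objects[j];
            bool overlapsXZ =
                std::fabs(obj.x - other.x) < obj.halfSize + other.halfSize &&
                std::fabs(obj.z - other.z) < obj.halfSize + other.halfSize;
            float top = other.y + other.halfSize;
            if (overlapsXZ && newY - obj.halfSize < top && obj.y - obj.halfSize >= top) {
                newY = top + obj.halfSize;   // rest on the surface
                obj.vy = 0.0f;
            }
        }
        obj.y = newY;
    }
}
```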

Thursday, April 7, 2011

Beta Review

This Tuesday was my Beta Review. I did a lot of work to meet my beta milestone and composed a video of my progress so far:


Since there is no audio I'll give a quick breakdown:
  1. The video starts with the calibration phase of the application.
  2. Once calibration is complete, the application changes to full screen.
  3. The green spheres represent where my hands are in the game space.
I fully integrated head and hand tracking into the simulation, and began work on making input behave the way that I want it to. As shown in the video, the user can pick up the cubes, but currently they cannot put them down. I also don't like the current mapping of real-world space to game space, as I think the user has to reach too far to grab objects.
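For reference, one option I'm weighing for getting both pick-up and put-down working is to attach the nearest object while some "grab" signal is active and drop it when the signal ends. The sketch below is just that idea; the signal and all of the names are placeholders, not anything the tracking libraries give me directly:

```cpp
// Hypothetical grab/release logic. 'grabActive' stands in for whatever
// grab signal ends up being used (a push gesture, a dwell, a mode toggle);
// it is a placeholder, not something the tracking library provides.
#include <cmath>
#include <cstddef>
#include <vector>

struct Cube { float x, y, z; };

struct Hand {
    float x, y, z;
    Cube* held;          // NULL when nothing is being carried
};

void UpdateGrab(Hand& hand, std::vector<Cube>& cubes, bool grabActive)
{
    const float grabRadius = 1.5f;   // illustrative reach, in world units

    if (grabActive && hand.held == NULL) {
        // Pick up the nearest cube within reach, if any.
        Cube* best = NULL;
        float bestDist = grabRadius;
        for (size_t i = 0; i < cubes.size(); ++i) {
            float dx = cubes[i].x - hand.x;
            float dy = cubes[i].y - hand.y;
            float dz = cubes[i].z - hand.z;
            float d = std::sqrt(dx * dx + dy * dy + dz * dz);
            if (d < bestDist) { bestDist = d; best = &cubes[i]; }
        }
        hand.held = best;
    } else if (!grabActive && hand.held != NULL) {
        hand.held = NULL;            // put the cube down where it is
    }

    // While carried, the cube follows the hand.
    if (hand.held != NULL) {
        hand.held->x = hand.x;
        hand.held->y = hand.y;
        hand.held->z = hand.z;
    }
}
```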

Improving user interaction with the objects and game space is my primary goal for next week. After that I'll be working to add more objects into the game and possibly a simple physics system (gravity and stacking) if there is time.

Thursday, March 31, 2011

Adding Game Objects

This week I've been spending my time implementing some simple objects in the game in preparation for my beta review next Tuesday. For the beta I will only be using simple primitive shapes (cube, sphere, and cylinder), but I'll eventually add .obj support so that users can load in custom objects.

Here are some simple screen caps of the objects. I've set up some generic lighting to give them a little dimension, but shaders/textures will be a post-beta feature.

A couple of cubes
Cubes are joined by their friend the sphere

I've also been working with the OpenNI/PrimeSense frameworks to get input ready for beta. I'm trying to get to the point where the interactions are exactly what I want for the final version (i.e., intuitive grabbing and moving/rotating). If I can't get that up and running in time, I'm also considering a gesture-based interface, where the user makes gestures with their hands to change the mode of the program and allow different types of manipulation.

Thursday, March 24, 2011

Simplifying Input with OpenNI and PrimeSense

I got back from break very refreshed and ready to dive back into this project. I started work on integrating OpenNI to do the hand tracking for my application (as per my design), and discovered that I had been grossly underestimating the power of the OpenNI libraries. I had to reorganize my software design significantly, but I think my new design is much simpler and is now based on only two third-party libraries: OpenNI and DirectX (technically three, since I'm still using pthreads).

I've been able to cut out fdlib completely, which I've wanted to do since the beginning, as it is closed source and proprietary (though it was good for getting quick results). OpenNI has a skeletal tracking system very similar (I'm assuming) to the Microsoft SDK's, which lets me get not only head position but also rotation, and that will make the head-tracking display much more accurate and intuitive for the user. I've also replaced libfreenect with alternative drivers from PrimeSense that work with OpenNI's data node architecture, so that OpenNI can pull the information from the device instead of my having to feed the information to it. Overall this significantly simplifies my program flow, and it will make it much easier to troubleshoot input issues.
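For anyone curious what that looks like in code, here is a rough sketch of the OpenNI 1.x setup, written from memory. The real program also registers the new-user and calibration callbacks, checks errors properly, and so on, all of which is omitted here:

```cpp
// Rough sketch of the OpenNI 1.x setup, from memory. The real code also
// registers user-detection / calibration callbacks, which are omitted.
#include <XnCppWrapper.h>

xn::Context        g_context;
xn::DepthGenerator g_depth;
xn::UserGenerator  g_user;

bool InitOpenNI()
{
    if (g_context.Init() != XN_STATUS_OK) return false;
    if (g_depth.Create(g_context) != XN_STATUS_OK) return false;
    if (g_user.Create(g_context) != XN_STATUS_OK) return false;

    // Track the full skeleton; I really only need the head (and hands).
    g_user.GetSkeletonCap().SetSkeletonProfile(XN_SKEL_PROFILE_ALL);
    return g_context.StartGeneratingAll() == XN_STATUS_OK;
}

// Called once per frame by the input side of the program.
void PollHeadPosition()
{
    g_context.WaitAndUpdateAll();          // OpenNI pulls from the sensor

    XnUserID users[4];
    XnUInt16 count = 4;
    g_user.GetUsers(users, count);

    for (XnUInt16 i = 0; i < count; ++i) {
        if (!g_user.GetSkeletonCap().IsTracking(users[i])) continue;

        XnSkeletonJointPosition head;
        g_user.GetSkeletonCap().GetSkeletonJointPosition(users[i], XN_SKEL_HEAD, head);
        // head.position.X/Y/Z (in mm) drives the virtual camera;
        // GetSkeletonJointOrientation gives the rotation mentioned above.
    }
}
```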

Here are a couple of screenshots showing OpenNI in action:

Skeletal tracking being shown. Note the placement of the head joint is exactly where I will be positioning my camera to create the virtual window effect.


OpenNI's hand tracking. The white dot shows where OpenNI perceives my palm to be.

I've also been working on the system for adding and manipulating objects in the world, but that sadly doesn't have shiny screenshots at the moment.

Self-Review

As part of this blog post I've been asked to perform a self-review of my progress thus far.

I've been making steady progress in setting up the Kinect as an input device. I've had several setbacks and redesigns, but on the whole I've succeeded in getting the information I need out of the Kinect. I now have everything working the way I want, and I'm confident that I shouldn't have any more major problems (famous last words, I know) on the Kinect front.

My progress with the actual application mechanics, however, has been lagging because of all the focus I've given to getting the Kinect to work. I really need to focus on getting some of the basic mechanics in during the next two weeks, because I would like to have a simple playable demo by the beta review period, and currently all I have is unused input.

I'm pretty confident that I'll be able to deliver a final product that I'm proud of, but there is definitely plenty of work yet to be done. In retrospect, I think it would have been smarter to build the application without the Kinect input first and then work on integrating the Kinect, rather than the other way around; that way, every time I got more functionality out of the Kinect I would have had a video to show and a demo to play.

tl;dr I think that I've made some decent progress, but things are going a little more slowly than I would like, and I'm going to have to really ramp it up for the Beta and Final reviews.

Thursday, March 3, 2011

Alpha Review

This past Friday was our Alpha Review. For the review I compiled a 4 minute video which outlined my project, examined my approach, and displayed the results I have so far. The video is embedded below, but it does not have any audio as I narrated it in person.



I didn't say too much that hasn't already been said on my blog, but the last segment in which I show some of my initial results with the Kinect is notable. I've got simple head-tracking working, and I'm using it to render a rotating cube in 3D. I recorded that segment of the video using my cellphone held in front of my face (hence the rather poor video quality).
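As a side note on how the head position drives the rendering: the virtual window effect generally boils down to rebuilding an off-center (asymmetric) projection frustum from the tracked eye position every frame. Here is a simplified sketch of the idea using D3DX; the units, sign conventions, and parameters are placeholders and not necessarily what my code does:

```cpp
// Sketch of a head-tracked, off-center projection. 'eye' is the tracked
// head position relative to the center of the physical screen; screenW
// and screenH are the physical screen dimensions, all in the same units.
#include <d3dx9.h>

D3DXMATRIX HeadTrackedProjection(const D3DXVECTOR3& eye,
                                 float screenW, float screenH,
                                 float zNear, float zFar)
{
    float d = eye.z;   // distance from the eye to the screen plane

    // Project the screen edges onto the near plane, shifted by the eye offset.
    float l = (-0.5f * screenW - eye.x) * zNear / d;
    float r = ( 0.5f * screenW - eye.x) * zNear / d;
    float b = (-0.5f * screenH - eye.y) * zNear / d;
    float t = ( 0.5f * screenH - eye.y) * zNear / d;

    D3DXMATRIX proj;
    D3DXMatrixPerspectiveOffCenterLH(&proj, l, r, b, t, zNear, zFar);

    // The view matrix is then just a translation by -eye, so the scene
    // stays anchored to the physical screen as the head moves.
    return proj;
}
```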

My application framework is now fully set up. The next step will be adding the OpenNI libraries and using them to process the data from the Kinect instead of the simple face recognition library that I used to demo head tracking for my alpha review.

This coming week is spring break (yay!) so my next update won't be for two weeks. By that point I'll hopefully have OpenNI fully integrated and tracking the user's head and hands.

Thursday, February 17, 2011

Slow Week This Week

Project progress has been pretty slow this week because of midterms and papers. Also, DXUT (the DirectX equivalent of GLUT) decided that it didn't want to work nicely with my laptop, so I've been spending some precious time troubleshooting it.

I went to an excellent GRASP lecture by Zhengyou Zhang from Microsoft Research. The talk, titled "Human Activity Understanding with Depth Sensors," covered how a lot of the functionality provided by the Kinect SDK works. The researchers at Microsoft are able to derive skeletons for up to four humans in front of the Kinect in about 2 ms per frame. This gives me hope that the image processing segment of my application will not be the performance drain I'm fearful of.

The talk was also interesting because it confirmed that even if I had access to the Kinect SDK, I would still have to do the majority of the work I'm currently doing, as the Kinect SDK does not provide the level of detail regarding hands that I require. It would fully take care of head tracking and would give me a rough idea of where the hands are, but I'd still need more precision to create a good user experience.

The Alpha Review is next Friday, and I'm aiming to have head tracking fully working by then.

Thursday, February 10, 2011

Ultimate Design Post

This week I’ve been getting my hands dirty with the different software components of my project, and I’ve spent a lot of time designing how the final thing is going to run.

The Big Picture

This picture depicts a side view of the setup. An actual observer from this angle would just see the player waving their hands in the air in front of their screen, but because of the head tracking, it appears to the player that they are grabbing objects floating in the air in front of them. This effect is best illustrated in Johnny Chung Lee's video about head tracking using the Wiimote:



Software Details
I’m working with a lot of different software libraries, and I need to tie them together in a way that works, but is also fast and efficient so that the user experience isn’t degraded by lag. The biggest issue is that the Kinect’s camera is only 30 fps, so input is not going to be instantaneous.

I’ve decided to make my application threaded using the pthreads library and I’m assigning rendering and input processing to different threads. My main goal with this approach is to allow one thread to be calling into DirectX as much as possible so that even though input is only coming in and being processed at a 30fps maximum, the rendering rate will be much higher.

The program will consist of the following class structure:


Both the input manager and the game manager will be given a minimum of one thread each, depending on the platform (my laptop supports 2 hardware threads, while the 360 supports 6).

Input Manager
The input manager busy-waits on the Kinect until the next frame is ready. It then processes the data and places the results into a thread-safe structure so that the game manager can access them and update the game state accordingly. When the input manager has more than one thread available to it, it will spawn helper threads to process the data from the Kinect, since there is obvious parallelism there (face detection and hand detection can easily be done concurrently).
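The thread-safe structure can be as simple as a mutex around a "latest snapshot" of the processed input, which the game manager copies out on each update. A minimal sketch (the field names are illustrative):

```cpp
// Minimal sketch of the shared structure between the input manager and
// the game manager: the input thread overwrites the latest snapshot,
// and the game thread copies it out. Field names are illustrative.
#include <pthread.h>

struct InputSnapshot {
    float headX, headY, headZ;
    float handX[2], handY[2], handZ[2];
    unsigned long frameNumber;
};

class SharedInput {
public:
    SharedInput()  { pthread_mutex_init(&m_lock, NULL); }
    ~SharedInput() { pthread_mutex_destroy(&m_lock); }

    // Called by the input thread once per processed Kinect frame.
    void Write(const InputSnapshot& in) {
        pthread_mutex_lock(&m_lock);
        m_latest = in;
        pthread_mutex_unlock(&m_lock);
    }

    // Called by the game thread every update; copying keeps the lock short.
    InputSnapshot Read() {
        pthread_mutex_lock(&m_lock);
        InputSnapshot copy = m_latest;
        pthread_mutex_unlock(&m_lock);
        return copy;
    }

private:
    pthread_mutex_t m_lock;
    InputSnapshot   m_latest;
};
```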

Game Manager
The game manager checks the thread-safe structure for changes, and then updates the game state based on elapsed time and sends the new scene to DirectX for rendering. The update is broken up into

Thursday, February 3, 2011

Prototype Technology

In this post I'm going to outline all of the technology I will be using in my initial prototype, which I will be developing over the next few weeks.

Input:

LibFreenect: I will be using the LibFreenect library to extract the color and depth information from the Kinect. The library will also be used during the calibration phase of the application to change the Kinect's viewing angle. I will be wrapping the library in a KinectHandler class so that in the future I can easily replace it with an official SDK without changing the entire structure of the program.
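A rough sketch of what that wrapper might look like, written from memory of the libfreenect C API (the exact calls and error handling in the prototype may differ):

```cpp
// Rough sketch of the KinectHandler wrapper around libfreenect, from
// memory of its C API; the exact calls in the prototype may differ.
#include <stdint.h>
#include <libfreenect.h>

class KinectHandler {
public:
    bool Init() {
        if (freenect_init(&m_ctx, NULL) < 0) return false;
        if (freenect_open_device(m_ctx, &m_dev, 0) < 0) return false;
        freenect_set_depth_callback(m_dev, DepthCallback);
        freenect_start_depth(m_dev);
        return true;
    }

    // Used during calibration to aim the sensor at the player.
    void SetTilt(double degrees) { freenect_set_tilt_degs(m_dev, degrees); }

    // Pump libfreenect so the callbacks fire.
    void Update() { freenect_process_events(m_ctx); }

private:
    static void DepthCallback(freenect_device* dev, void* depth, uint32_t timestamp) {
        // Hand the raw depth buffer off to the rest of the program here.
    }

    freenect_context* m_ctx;
    freenect_device*  m_dev;
};
```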

OpenNI: I will be using the OpenNI library to determine the user's hand positioning and what they are doing with their hands (i.e., grabbing an object). I chose this library because it is open source under the LGPL, has excellent community support, and is well suited to processing the images extracted from the Kinect.

Fdlib: I will be using the fdlib library for face detection. I chose this library because it is both fast and simple to integrate, which makes it a very strong candidate for the prototype. The fact that it is both proprietary and closed-source means that I will most likely replace it later in development, but for the purpose of rapid prototyping and getting to a play-testable state I think it will do nicely for now. I will be wrapping the library in a class so that it is easy to replace without altering the program structure.

Output:

DirectX: I will be using DirectX to handle my output. The reason I'm choosing DirectX instead of OpenGL is that I think my game would ultimately be released on the Xbox 360 platform (and perhaps Windows), which means that I must use DirectX for rendering, and since I don't foresee releases on platforms that don't support DirectX, implementing OpenGL seems unnecessary. I'd also like to get more experience working with DirectX, as it is the current game industry standard. I will be writing a wrapper around my output so that I can swap out DirectX for something else if I change my mind.

That's an overview of the software I'll be using in my project. I'm sure there will be future posts that further explore the good and bad of each library.

Thursday, January 27, 2011

Getting Started with the Kinect

We got a Kinect in the lab this week and so I restructured my schedule a little bit in order to get my hands on it as quickly as possible.

I've got my laptop set up with the LibFreenect open source library for interfacing with the Kinect device, and begun to extract some data:



These images are screen grabs from one of the LibFreenect sample projects. They show the RGB image extracted from the Kinect and a depth image that has been colorized (red is close to the camera, dark blue is far away, and black indicates a lack of depth information).

From this output, I've determined a couple of important details about the Kinect. The user has to be at least about two feet away from the Kinect or else the device will not be able to accurately determine their depth. This occurs because the Kinect uses an infrared projector to cast a pattern of rays into the environment which the camera picks up, and objects that are too close to the Kinect get hit with so many rays that they wash out. There is also a depth "shadow" that gets cast: when one object is in front of another, the device cannot determine the depth of the background object around the left border of the foreground object.

I don't think these artifacts will cause any significant problems down the road, because I only need to track the hands with precision (and they should be in front of everything) and to get a general position for the head (I don't need facial features or any other specific data, just a generalized head position), so even if the player's hands are obscuring their face I shouldn't have trouble gathering the information I need.

Working with the LibFreenect library going forward seems like it should be fairly straightforward. The library provides a series of C-style function calls to the Kinect device, and I think that wrapping a class around these should let me integrate the library into my application in a suitably abstract fashion.

Next week I'll be researching the hand/head recognition algorithms/software that I want to use, and getting something resembling the above demo working using DirectX for rendering.

Friday, January 21, 2011

Senior Project Kickoff Post

This blog will be detailing the progress of my senior design capstone project. I will be completing the project over the course of the spring 2011 semester.

My project is the creation of a virtual building block experience utilizing the Microsoft Kinect hardware system; here is the abstract:
I plan to simulate a building block experience using the Microsoft Kinect hardware system. I will use the Kinect to track the player’s hands and head in order to present them with objects that appear to project out of their screens that they can pick up and move around as if they were real. The objects will be given real world physical properties to create a realistic building block experience, where unstable structures crumble, and improperly placed pieces can bring the entire structure crashing down. The objects will also be given interesting shapes to encourage user creativity and experimentation. The project will use the Libfreenect open source library to extract information from and send orders to the Kinect device and graphics output will be handled by the Microsoft DirectX libraries.

My anticipated work for next week:
  1. Complete background research
    1. Research face/head recognition algorithms
    2. Research hand recognition algorithms
    3. Determine whether to use an existing physics system or roll my own
  2. Begin framework setup integrating DirectX at least
  3. Figure out how I'm going to obtain a Kinect :)