At the 2011 Interactive and Generative Art Exhibition, I debuted a system that Jeff Sylvester and I have been working on for the past couple of months: it lets users play a game of pong with their own hands, no control interface required. The project was initially conceived for a large public arena like a basketball court, letting people use their entire bodies as paddles to bounce a virtual, projected ball around the floor. In fact, we’d like to someday scale this system up to that size and let an arbitrary number of players enter and leave the playing arena as they wish, making it more like a giant game of virtual dodgeball mixed with pong! But first comes the proof of concept, where we show what kinds of technologies we would use to achieve such a goal.
We began with the software, trying to keep our ideas about the physical installation separate from the game mechanics and everything else. This let us start tackling the project early on, dividing it into smaller and smaller pieces and executing them as we had time. The first two things we knew we wanted done were a) build a standard game of pong and b) get the player paddles to move based on input from a processed webcam feed. We went after the first point, building a fundamental subsystem that lets us simply play pong.
We knew from the very beginning that we wanted to use computer vision techniques to allow the game of pong to be controlled by something in the real world, and in my personal experience, my favorite go-to sketching language, Processing, just wasn’t going to cut it. Its various computer vision libraries were, in my experience, buggy, unstable, somewhat slow, and just annoying to mess with (though we might see some good changes in the next release). We needed a speedier, more reliable solution, and I knew I had gotten computer vision examples working with openFrameworks and its built-in ofxOpenCv add-on. The problem was, I had never built a serious oF application before and had no idea what I was getting into.
Eventually we were able to get a basic oF application up and running, then decided to use the popular Box2D (ofxBox2d) library to help us out with the collision detection and general physics management that can haunt even seasoned programmers. To create a decent clone of pong, we simply configured ofxBox2d with zero gravity, zero friction, and perfectly elastic collisions (a restitution of 1, so no energy is gained or lost when the ball bounces).
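The net effect of those settings is that the ball travels at a constant speed forever and simply reflects off whatever it hits. Here’s a tiny standalone sketch of that behavior (this is not our ofxBox2d code, just the physics those settings amount to):

```cpp
#include <cassert>
#include <cmath>

// A ball under our pong physics: no gravity, no friction, and
// perfectly elastic wall bounces, so speed never changes.
struct Ball {
    float x, y;    // position
    float vx, vy;  // velocity

    void step(float dt, float width, float height) {
        x += vx * dt;
        y += vy * dt;
        // reflect off the field edges; only the sign of the
        // velocity component flips, so speed is conserved
        if (x < 0)      { x = -x;             vx = -vx; }
        if (x > width)  { x = 2 * width - x;  vx = -vx; }
        if (y < 0)      { y = -y;             vy = -vy; }
        if (y > height) { y = 2 * height - y; vy = -vy; }
    }

    float speed() const { return std::sqrt(vx * vx + vy * vy); }
};
```

Box2D handles the same idea for arbitrary shapes and paddle collisions, which is exactly why we didn’t want to write it ourselves.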
Next, I drew up some sweet futuristic graphics to be used as sprites in our game (i.e., nicer looking paddles and game ball). Since Box2D wasn’t exactly the easiest thing in the world to figure out, we opted to draw each sprite on the screen separately, then assign their positions based on the positions of invisible Box2D shapes and objects. This way, all of the physical interactions actually occur with ugly, invisible objects, while the sprites make the same interactions more interesting to watch.
Computer vision (OpenCV)
Once we got a ball to bounce around the screen and hit a couple of paddles and walls, we wanted to figure out a way to track users using a webcam, so that we could control the paddles based on their positions.
Around this time, we learned that we’d be able to install this project at the 2011 Interactive and Generative Art Exhibition at UNK, so I quickly visited the installation space to see how much space we had to work with. It turned out the ceilings were not high enough for users to play pong with their entire bodies, so we scaled it down to using a player’s hand to control the motion of a paddle in the game.
Rather than fuss with all the complexities of calibrating our camera image to account for ambient lighting conditions and extracting a user’s actual hand from a full-color image, we chose to simplify the problem by building wriststraps with infrared LEDs attached to them, then using a specially modified PS3Eye webcam with an infrared band-pass filter (i.e., it only sees infrared, and nothing else). This way, we have a webcam feed that is reliably pitch black unless an infrared LED is in its viewing area, in which case a small, bright white circle appears.
From here, we utilized OpenCV to track the blobs (white circles, infrared LEDs) for us live, and report to us the current X and Y coordinates of all blobs seen. We planned on having to do a background subtract when the program starts, and possibly adjusting the image threshold and min/max blob sizes, but to our surprise, no calibration was necessary!
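OpenCV did all of this work for us, but conceptually, blob tracking on our pitch-black infrared feed boils down to something simple: group touching bright pixels together and report the center of each group. Here is a naive standalone sketch of that idea on a thresholded 0/1 image (again, not the ofxOpenCv API, just an illustration of what it computes):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// A naive stand-in for what a blob tracker reports: scan a
// thresholded (0/1) image, group touching bright pixels via a
// flood fill, and return each group's centroid and area.
struct Blob { float cx, cy; int area; };

std::vector<Blob> findBlobs(std::vector<std::vector<int>> img) {
    std::vector<Blob> blobs;
    const int h = (int)img.size();
    const int w = h ? (int)img[0].size() : 0;
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            if (img[y][x] != 1) continue;
            // flood-fill this blob, accumulating pixel coordinates
            long sumX = 0, sumY = 0;
            int area = 0;
            std::vector<std::pair<int, int>> stack{{x, y}};
            img[y][x] = 0;  // mark visited
            while (!stack.empty()) {
                auto [px, py] = stack.back();
                stack.pop_back();
                sumX += px; sumY += py; ++area;
                const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
                for (int d = 0; d < 4; ++d) {
                    int nx = px + dx[d], ny = py + dy[d];
                    if (nx >= 0 && nx < w && ny >= 0 && ny < h
                        && img[ny][nx] == 1) {
                        img[ny][nx] = 0;
                        stack.push_back({nx, ny});
                    }
                }
            }
            blobs.push_back({(float)sumX / area, (float)sumY / area, area});
        }
    }
    return blobs;
}
```

The min/max blob size settings mentioned above correspond to filtering this list by `area`, which lets you throw away single-pixel noise or giant reflections.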
One important thing to note at this point is that the resolution of our webcam (320×240 pixels) was not the same as the resolution of our projector (1024×768 pixels). Therefore, we needed to map whatever X/Y coordinates we found in the webcam stream to the bigger coordinate space of the screen. This was much easier than anticipated, and can be achieved using the following pseudo-code:
x = ((float)blobX / webcamWidth) * screenWidth;   // divide in floating point
y = ((float)blobY / webcamHeight) * screenHeight; // to avoid integer truncation
Simply use a little bit of logic to determine which blob should control which paddle in the game, then set the paddle coordinates to the coordinates calculated!
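One simple way to do that assignment (the names here are hypothetical, not our actual code): a blob seen in the left half of the camera image drives the left paddle, and one in the right half drives the right paddle, with the paddle’s vertical position scaled as above.

```cpp
#include <cassert>

// Hypothetical helper: decide which paddle a blob drives and scale
// its webcam Y coordinate into screen space. A blob in the left half
// of the camera image controls paddle 0, the right half paddle 1.
struct PaddleUpdate { int paddle; float screenY; };

PaddleUpdate assignBlob(float blobX, float blobY,
                        float camW, float camH, float screenH) {
    PaddleUpdate u;
    u.paddle  = (blobX < camW / 2.0f) ? 0 : 1;
    u.screenY = (blobY / camH) * screenH;  // float math avoids truncation
    return u;
}
```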
Infrared wristband markers
As mentioned, we chose to use infrared LEDs as “markers” to identify a player’s hand in our webcam feed, which turned out to be one of the easiest aspects of this entire project. All we needed to do was attach a couple of batteries to a couple of sweatbands, then wire up an infrared LED and current-limiting resistor. As soon as the batteries are popped in, the light turns on; simple as that! It hardly seems worth doing, but I’ve included a circuit schematic, just to clear up any ambiguous text I’ve written about these wristbands so far.
For power, I was able to scavenge a couple of battery holders (one for holding two AAs and another for holding two AAAs), which were then sewn onto two white wristbands. Next, I soldered the current-limiting resistors to the positive terminals of the battery packs, then the anodes of the infrared LEDs to the current-limiting resistors. Finally, solder the cathode of each LED to the negative terminal of its battery pack, and you’ve got yourself an infrared wristband marker!
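Sizing the current-limiting resistor is just Ohm’s law: R = (Vsupply − Vf) / I. The forward voltage and current below are typical values for a small IR LED, not measurements from our exact parts; check your LED’s datasheet.

```cpp
#include <cassert>
#include <cmath>

// Ohm's law for the LED's current-limiting resistor:
//   R = (Vsupply - Vf) / I
// 1.3 V forward voltage and 20 mA are typical for a small IR LED.
float ledResistorOhms(float supplyVolts, float forwardVolts,
                      float currentAmps) {
    return (supplyVolts - forwardVolts) / currentAmps;
}
```

With two AA cells (about 3.0 V), that works out to roughly 85 Ω, so any common value in that neighborhood (e.g., 100 Ω) will do.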
Once the software and computer vision aspects were stable, we set about designing and building a structure to hold a projector and webcam above a playing surface. Using a projector throw distance calculator, I determined that our projector would need to be approximately 8 ft above the surface of our playing field in order to achieve an image roughly the size of two 2×6′ tables put side by side. The projector would therefore need to be suspended about 8 feet above these two tables.
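The math the throw distance calculator applies is simply distance = throw ratio × image width. The 1.33 ratio below is an illustrative value consistent with our ~8 ft figure for a 6 ft wide image, not the BenQ MP522ST’s published spec.

```cpp
#include <cassert>
#include <cmath>

// What a projector throw calculator computes:
//   distance = throw_ratio * image_width
// The 1.33 ratio used in the test is illustrative, not a spec value.
float throwDistanceFt(float throwRatio, float imageWidthFt) {
    return throwRatio * imageWidthFt;
}
```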
We chose to use 4×4 posts as the building material for this project, and began by constructing our “cross beam” to hold the projector and webcam assemblies. Because projectors are generally not designed to be mounted vertically (heat doesn’t get dissipated as well), we attached a mirror to a modified camera tripod using epoxy, then drilled a large enough hole in the center of our cross beam to drop the tripod through. This way we had a huge amount of flexibility in mirror placement, which became immensely helpful during calibration.
A BenQ MP522ST projector was mounted to the cross beam using a universal projector mount, as close as possible to our mirror assembly. From there, it was a simple matter of powering on the projector and messing with the mirror angle and position until a good image, with low distortion and keystoning, was being projected onto our surface. Initially we installed everything on the cross beam suspended between two chairs, and used the ceiling of our lab to calibrate our image.
Next, we needed to attach our webcam to the underside of the projector to get a good look at the playing surface. To do this in a non-permanent way, we chose to use a piece of scrap wood with some holes drilled into one end, then attached the camera to it using zip-ties.