Create a system that reliably detects activities. Add more activities to detect. Improve the output, which could be graphical, musical, or voice output. Consider incorporating a tracker. Make your system truly interactive, i.e., adapt your image analysis modules (e.g., thresholds on intensity changes) based on the user's interaction.
In our research, we recorded hundreds of images of living cells in time-lapse phase-contrast microscopy video. You could help automate the interpretation of these vast amounts of data, which is too time-consuming, costly, and prone to human error when done by hand. You could try to develop a segmentation method based on adaptive thresholding and active contours. You could also try variations of the basic active contour algorithm that we discussed in class.
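As a concrete starting point, the adaptive-thresholding half of such a segmentation method can be sketched in a few lines of NumPy. This is only a minimal illustration, not our method: the window size and offset are placeholder values you would tune on the actual phase-contrast images, and it assumes bright cells on a darker background.

```python
import numpy as np

def adaptive_threshold(image, block=15, offset=5):
    """Threshold each pixel against the mean of its local neighborhood.

    image:  2D uint8 array.
    block:  side length of the local window (placeholder value).
    offset: how far above the local mean a pixel must be to count
            as foreground, so flat background regions stay off.
    Returns a boolean mask (True = foreground).
    """
    h, w = image.shape
    pad = block // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    mask = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            local_mean = padded[y:y + block, x:x + block].mean()
            mask[y, x] = image[y, x] > local_mean + offset
    return mask
```

A faster version would compute the local means with a box filter or an integral image instead of the explicit double loop; the active-contour stage would then refine the boundaries of the thresholded blobs.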
The lungs deform during inspiration and expiration. Modeling the deformations is of clinical interest as it facilitates the diagnosis and treatment of lung disease. For example, lung cancer is often treated with radiotherapy. Research in medical image analysis has focused on methods to determine the position of a tumor that moves with respiration during radiation treatment and thus reduce the amount of undesirable radiation to surrounding healthy tissue. Here is a paper that describes rigid-body registration of lung surfaces: M. Betke, H. Hong, D. Thomas, C. Prince, J. P. Ko, "Landmark Detection in the Chest and Registration of Lung Surfaces with an Application to Nodule Registration." Medical Image Analysis, 7:3, pp. 265-281, September 2003. pdf. The goal of the project is to extend this work to deformable registration.
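Deformable registration would start from a good rigid alignment, so as a sketch of that first stage, here is the standard least-squares rigid alignment (Kabsch/Procrustes) of two corresponding 3D landmark sets. It assumes the point correspondences are already known, which is itself a hard part of the problem (the landmark detection addressed in the paper above).

```python
import numpy as np

def rigid_register(source, target):
    """Least-squares rigid alignment of two Nx3 corresponding landmark
    sets (Kabsch algorithm).  Returns rotation R and translation t
    such that R @ p + t maps source points p onto the target."""
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # cross-covariance of the centered point sets
    H = (source - src_c).T @ (target - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])          # guard against reflections
    R = Vt.T @ D @ U.T
    t = tgt_c - R @ src_c
    return R, t
```

Extending this to deformable registration means replacing the single global (R, t) with a spatially varying transformation, e.g. a spline or a per-landmark displacement field.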
A simple 2D face tracker can be implemented by detecting skin color and motion in the video. See, for example, S. Birchfield, "Elliptical Head Tracking Using Intensity Gradients." In Proceedings of the IEEE Computer Vision and Pattern Recognition Conference, pp. 232-237, June 1998. pdf. More interesting would be a generalization of the task to 3D: implement a method that estimates the position and orientation of a head in three dimensions, for example, Q. Ji and R. Hu, "3D Face pose estimation and tracking from a monocular camera," Image and Vision Computing, Volume 20, Issue 7, 1 May 2002, Pages 499-511. pdf.
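A minimal sketch of the 2D skin-plus-motion idea, assuming RGB input: the skin rule below is a crude illustrative heuristic (a real tracker would learn a color model from training data), and all thresholds are placeholders.

```python
import numpy as np

def skin_and_motion_mask(frame, prev_frame, motion_thresh=20):
    """Candidate face pixels: skin-colored AND moving.

    frame, prev_frame: HxWx3 uint8 RGB images.
    The skin rule (R sufficiently large, R > G > B) is a crude
    heuristic for illustration only.
    """
    f = frame.astype(np.int32)
    r, g, b = f[..., 0], f[..., 1], f[..., 2]
    skin = (r > 95) & (r > g) & (g > b) & (r - b > 15)
    # frame differencing: total absolute change over the channels
    diff = np.abs(f - prev_frame.astype(np.int32)).sum(axis=-1)
    return skin & (diff > motion_thresh)

def centroid(mask):
    """Center of mass (x, y) of a boolean mask, or None if empty."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return (float(xs.mean()), float(ys.mean()))
```

Feeding the mask centroid from frame to frame already gives a rudimentary 2D tracker; the 3D extension in the Ji and Hu paper goes well beyond this.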
See previous project. Here you will use the view of the face from cameras. I could give you a paper to implement (or extend). Please ask me for references.
Implement a method that recognizes different hand shapes or gestures for human-computer interaction, or one that simply tracks hands for behavior understanding.
Please see our webpage on Thermal Video Analysis of Bats. Censusing natural populations of bats is important for understanding the ecological and economic impact of these animals on terrestrial ecosystems. It is challenging to census bats accurately, since they emerge in large numbers at night from their day-time roosting sites. We have used infrared thermal cameras to record Brazilian free-tailed bats in California, Massachusetts, New Mexico, and Texas and have developed an automated image analysis system that detects, tracks, and counts the emerging bats.
We need help in improving our detection and tracking methods. In particular, we would like to automatically evaluate the shape of flying bats. Using the material we covered in CS 585 on thresholding, segmentation, and active contour methods, this project would build a system that analyzes the shapes of flying bats.
Training shapes:
Test shapes:
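As one illustrative way to quantify the segmented bat shapes, second-order image moments give each blob an area, orientation, and elongation. The elongation measure below (ratio of the principal-axis standard deviations) is one reasonable choice among many, not a prescribed feature set.

```python
import numpy as np

def shape_features(mask):
    """Area, centroid, orientation, and elongation of a binary blob,
    computed from image moments.

    mask: 2D boolean array containing a single segmented blob.
    """
    ys, xs = np.nonzero(mask)
    area = len(xs)
    cx, cy = xs.mean(), ys.mean()
    # central second-order moments (per-pixel averages)
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    # principal-axis orientation and eigenvalues of the covariance
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
    common = np.sqrt(4 * mu11 ** 2 + (mu20 - mu02) ** 2)
    lam1 = (mu20 + mu02 + common) / 2
    lam2 = (mu20 + mu02 - common) / 2
    elongation = np.sqrt(lam1 / lam2) if lam2 > 0 else np.inf
    return {"area": area, "centroid": (cx, cy),
            "orientation": theta, "elongation": elongation}
```

Tracking these features over time could, for example, distinguish wing-extended from wing-folded poses of a flying bat.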
Please see our webpage on Thermal Video Analysis of Bats. This project uses thermal videos of wind turbines. The goal is to detect the bats and birds that might be flying by. Bats and birds have been killed by wind turbines, and maybe there is a way to help wildlife avoid the deadly blades. Here is an example of a sequence of thermal videos that you could analyze for this project (the blade is moving upwards, the bat is not hit):
We have used Boston University's MRI and CT scanners to image a bat in three dimensions. We need somebody to help design an algorithm to visualize the bat in these data sets and to analyze its anatomical features. (To my knowledge, no one has done this with bats! This would be an exciting research project!)
Develop a system that can detect facial features and/or their motion, for example, eyebrow raises, in video sequences of American Sign Language (ASL) communications. An eyebrow raise is an important grammatical tool in American Sign Language to indicate a question. You could build upon a system that was developed by one of my students in CS 585. The ASL data is available from three viewpoints:
You could use SignStream's ASL video annotations that include events such as "eyebrow raises," which were manually determined by linguists and can serve as your "ground truth." Check out a previous CS 585 course project: T. Castelli, M. Betke, C. Neidle, "Facial feature tracking and occlusion recovery in American Sign Language." In A. Fred and A. Lourenço, editors, Pattern Recognition in Information Systems: Proceedings of the 6th International Workshop on Pattern Recognition in Information Systems - PRIS 2006, pages 81-90, Paphos, Cyprus, May 2006. INSTICC Press. pdf. See also Technical Report BU-CS-2005-024.
Develop a people tracking program. Videotape people walking on campus (you can use our cameras). Can you automatically detect which moving "blobs" are people and not cars, bikes, dogs, etc.? Could you apply your method to improve airport security? You may reimplement: P. KaewTraKulPong and R. Bowden. "A real time adaptive visual surveillance system for tracking low-resolution colour targets in dynamically changing scenes," Image and Vision Computing, Volume 21, Issue 10, September 2003, Pages 913-929. pdf. There are some very interesting newer papers. Please ask me for references.
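To get a feel for the approach, here is a heavily simplified single-Gaussian-per-pixel background model. The paper above maintains an adaptive *mixture* of Gaussians per pixel, which handles multi-modal backgrounds (e.g. waving trees) that this sketch cannot; all parameter values below are illustrative.

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel running-Gaussian background model, a simplified
    single-mode stand-in for the adaptive mixture model of
    KaewTraKulPong and Bowden."""

    def __init__(self, first_frame, alpha=0.05, k=2.5):
        self.mean = first_frame.astype(np.float64)
        self.var = np.full(first_frame.shape, 15.0 ** 2)  # initial variance
        self.alpha = alpha   # learning rate
        self.k = k           # foreground threshold in std deviations

    def apply(self, frame):
        """Return a boolean foreground mask and update the model."""
        f = frame.astype(np.float64)
        d = f - self.mean
        foreground = d ** 2 > (self.k ** 2) * self.var
        # update only where the pixel matched the background
        a = np.where(foreground, 0.0, self.alpha)
        self.mean += a * d
        self.var += a * (d ** 2 - self.var)
        return foreground
```

The "blobs" you would classify as people, cars, or dogs are the connected components of the returned foreground mask.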
Images often must be resized to fit PDA or cell phone displays. If the aspect ratio changes, image content may be deformed or important objects in the image may be cropped. For example, if image A must be resized to a height of 100 pixels, a simple downsampling algorithm would result in image B; a content-aware resizing algorithm would produce image C instead. If the original image must be resized into a 200x150-pixel image, a simple downsampling algorithm would deform the image content and yield image D; a content-aware resizing algorithm would produce image E instead.
For ideas on how to approach the problem of designing a content-aware resizing algorithm, you may read the 2007 SIGGRAPH paper by Avidan and Shamir or the 2008 SIGGRAPH paper by Rubinstein et al. about seam carving.
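The heart of seam carving is a dynamic program that finds the 8-connected path of least energy from the top row to the bottom row. A minimal sketch, assuming the per-pixel energy (e.g. gradient magnitude) has already been computed:

```python
import numpy as np

def find_vertical_seam(energy):
    """Find the minimum-energy vertical seam by dynamic programming.

    energy: HxW array of per-pixel energies.
    Returns a list of column indices, one per row.
    """
    h, w = energy.shape
    cost = energy.astype(np.float64).copy()
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 2, w)
            cost[y, x] += cost[y - 1, lo:hi].min()
    # backtrack from the cheapest bottom-row pixel
    seam = [int(np.argmin(cost[-1]))]
    for y in range(h - 2, -1, -1):
        x = seam[-1]
        lo, hi = max(x - 1, 0), min(x + 2, w)
        seam.append(lo + int(np.argmin(cost[y, lo:hi])))
    seam.reverse()
    return seam

def remove_seam(image, seam):
    """Delete one pixel per row, narrowing the image by one column."""
    h = image.shape[0]
    return np.array([np.delete(image[y], seam[y], axis=0) for y in range(h)])
```

Repeatedly removing the cheapest seam shrinks the width while avoiding high-energy (content-rich) regions; seams of rows shrink the height analogously.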
This project researches efficient ways to store and retrieve images and video in large databases. Various applications are possible: web image retrieval, CT databases for lung cancer screening, video surveillance databases. See, for example, the paper by Petrakis and Faloutsos on medical image databases that we discussed in class.
The goal of this project is to develop a vision-based system that detects if a driver is falling asleep behind the wheel. You may take one of our video cameras and film a friend behind the wheel. Try to detect the blinking eyes of a driver using image differencing techniques. Can you detect the difference between "normal" blinking and closing the eyes for a longer period of time? Test this only when the car is parked! The challenge of this project is to detect eye closures under various lighting conditions. So park your car in various locations and capture images at different times of the day and in various weather situations. Please ask me for references.
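Once image differencing gives you a per-frame eyes-closed flag, separating a normal blink from a drowsy closure reduces to measuring run lengths. A sketch (the 0.4-second blink limit is an illustrative number, not a validated threshold):

```python
def detect_long_closures(closed, fps, max_blink_s=0.4):
    """Flag eye closures that last longer than a normal blink.

    closed: list of booleans, one per frame (True = eyes closed),
            e.g. derived from image differencing in an eye region.
    fps:    frame rate of the video.
    Returns the start frames of closures longer than max_blink_s.
    """
    limit = int(max_blink_s * fps)
    events, run_start, run_len = [], None, 0
    for i, c in enumerate(closed + [False]):   # sentinel flushes last run
        if c:
            if run_len == 0:
                run_start = i
            run_len += 1
        else:
            if run_len > limit:
                events.append(run_start)
            run_len = 0
    return events
```

The hard computer-vision work is producing a reliable `closed` signal under the varying lighting conditions described above.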
Driver Face Analysis for Intelligent Vehicles
The car industry is curious to find out as much as possible about driver behavior. What does a driver focus on while driving? How long does a driver look at the dashboard or the rearview mirror? The goal is to design smarter, safer cars. Drivers are videotaped in simulators and in the real world. The videos are then annotated by hand in a painstaking process. Automation of this process is needed, because it would allow a larger test population. In addition, automatic recognition of driver intentions or mistakes may trigger safety features in our future "intelligent cars." The goal of this project is to implement a system that automatically analyzes the driver's face. The system should detect and track facial features. This would be a good group project, since it can be combined with the "Head Tracker" and "Warning System for Tired Drivers" projects. Please ask me for references.
The goal of this project is to implement a system that recognizes a person by matching his or her image to a database of faces. A powerful approach for recognizing faces is principal component analysis (PCA) of an image. You can implement a PCA "Eigenface" algorithm for simple mug-shot pictures or an extension that can handle more complicated scenarios. The original paper on PCA representation of faces was written by Kirby and Sirovich. Also, please ask me for the paper by Turk and Pentland.
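A compact sketch of the eigenface pipeline: the PCA basis is obtained via SVD of the centered training matrix (equivalent to eigenanalysis of the covariance), and recognition is nearest-neighbor in the reduced space. The synthetic vectors in the test stand in for flattened face images.

```python
import numpy as np

def train_eigenfaces(faces, k):
    """faces: N x D matrix, one flattened face image per row.
    Returns the mean face and the top-k eigenfaces (principal axes)."""
    mean = faces.mean(axis=0)
    X = faces - mean
    # SVD of the centered data gives the PCA basis directly
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return mean, Vt[:k]

def project(face, mean, eigenfaces):
    """Coordinates of a face in the k-dimensional eigenface space."""
    return eigenfaces @ (face - mean)

def recognize(face, gallery, mean, eigenfaces):
    """Nearest neighbor in eigenface space; returns a gallery index."""
    q = project(face, mean, eigenfaces)
    coeffs = [project(g, mean, eigenfaces) for g in gallery]
    dists = [np.linalg.norm(q - c) for c in coeffs]
    return int(np.argmin(dists))
```

For realistic databases you would add a rejection threshold for faces too far from any gallery entry, and more eigenfaces than the two used in the toy test.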
You would reimplement an optical flow algorithm, for example, Horn and Schunck's algorithm (see textbook or journal paper). To test the algorithm, you should set up some experiments: move the camera or the objects and try to recover the direction of motion.
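A minimal version of the iterative Horn-Schunck update is sketched below. The wrap-around boundary treatment and simple derivative estimates are crude choices that suffice for a demonstration but should be replaced for real experiments.

```python
import numpy as np

def horn_schunck(im1, im2, alpha=1.0, iters=200):
    """Horn-Schunck optical flow via the standard iterative update.

    alpha weights the smoothness term; returns per-pixel flow (u, v).
    """
    im1 = im1.astype(np.float64)
    im2 = im2.astype(np.float64)
    Ix = np.gradient(im1, axis=1)   # spatial derivatives
    Iy = np.gradient(im1, axis=0)
    It = im2 - im1                  # temporal derivative
    u = np.zeros_like(im1)
    v = np.zeros_like(im1)

    def neighbor_avg(f):
        # 4-neighbor average; np.roll wraps at the image borders
        return (np.roll(f, 1, 0) + np.roll(f, -1, 0)
                + np.roll(f, 1, 1) + np.roll(f, -1, 1)) / 4

    for _ in range(iters):
        ubar, vbar = neighbor_avg(u), neighbor_avg(v)
        t = (Ix * ubar + Iy * vbar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = ubar - Ix * t
        v = vbar - Iy * t
    return u, v
```

A good sanity check is a textured image translated by a known amount: the recovered flow should match the known motion, as in the intensity-ramp example used in the test.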
This would be a good programming project for anyone who is interested in computer vision and computer graphics and enjoys geometry. The goal is to take images from a hand-held video camera and stitch them seamlessly into a panoramic mosaic. Image mosaics are needed in virtual reality designs and for video conferencing. You could try to come up with your own algorithm or reimplement an existing algorithm, for example, H.-Y. Shum's and R. Szeliski's algorithms (ICCV'98, ICCV'99).
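At the core of any mosaicing algorithm is the planar homography that maps points between overlapping views. A tiny helper showing the projective mapping itself; estimating H from point correspondences (e.g. with least squares over matched features) is the real work of the project.

```python
import numpy as np

def apply_homography(H, points):
    """Map 2D points through a 3x3 homography.

    points: Nx2 array.  Each point is lifted to homogeneous
    coordinates, transformed, and de-homogenized.
    """
    pts = np.hstack([points, np.ones((len(points), 1))])
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Once the homographies relating consecutive frames are estimated, each frame is warped into a common reference plane and blended into the mosaic.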
Here's a simple way to embed a secret message in a digital image: Convert the message into a string of zeros and ones, and assume the resulting string is 50 bits long. Use the first 50 pixels in your image and, for each pixel, substitute the lowest bit of the pixel's gray level with one bit of the message. Our eyes are not sensitive enough to notice that the image changed. Implement and test this algorithm. Then find another, more sophisticated method. See, for example, a paper by Farid. This project requires knowledge of cryptography.
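The basic least-significant-bit scheme described above fits in a few lines; the bit string and image in the test are toy examples.

```python
import numpy as np

def embed_bits(image, bits):
    """Hide a bit string in the lowest bit of the first len(bits)
    pixels (flattened order).  image: uint8 array; returns a copy."""
    flat = image.flatten()                       # flatten() copies
    for i, b in enumerate(bits):
        flat[i] = (flat[i] & 0xFE) | int(b)      # clear, then set LSB
    return flat.reshape(image.shape)

def extract_bits(image, n):
    """Read the message back from the lowest bits of the first n pixels."""
    return "".join(str(p & 1) for p in image.flatten()[:n])
```

Note that each gray level changes by at most 1, which is why the embedding is visually imperceptible; more sophisticated methods spread the message over transform coefficients to survive compression.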
Develop a character recognition program to recognize the letters and numbers on license plates. You can treat the characters as binary images and use correlation, Euler number, and/or thinning techniques for recognition. You may want to simplify the problem by only trying to recognize certain fonts or just digits.
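Of the suggested features, the Euler number (connected components minus holes) is especially easy to compute with Gray's bit-quad counting; for example, 'B' has Euler number -1, 'O' has 0, and 'X' has 1. A sketch for 4-connected foreground:

```python
import numpy as np

def euler_number(mask):
    """Euler number (components minus holes) of a binary image,
    computed by counting 2x2 bit-quad patterns (Gray's method),
    for 4-connected foreground."""
    m = np.pad(mask.astype(np.int32), 1)
    # number of foreground pixels in each 2x2 window
    q = m[:-1, :-1] + m[:-1, 1:] + m[1:, :-1] + m[1:, 1:]
    # diagonal quads: {{1,0},{0,1}} or {{0,1},{1,0}}
    diag = ((m[:-1, :-1] == m[1:, 1:]) & (m[:-1, 1:] == m[1:, :-1])
            & (m[:-1, :-1] != m[:-1, 1:]))
    q1 = int((q == 1).sum())
    q3 = int((q == 3).sum())
    qd = int(diag.sum())
    return (q1 - q3 + 2 * qd) // 4
```

Combined with correlation or thinning, this single number already prunes the candidate set considerably (it separates, e.g., '8' from '0' from '1').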
Use two images of an object to compute the 3D "structure" of the object. If you set up your own camera system, try to keep the camera geometry simple, so that the epipolar lines are parallel to the image rows. You can then search along the epipolar lines to find corresponding points in the two images. You may also use the stereo images provided by Scharstein. Illustrate the results of your algorithm on a few examples. Warning: We may use this project as the last homework programming assignment. If you choose this project, you will have to demonstrate that your work significantly improves upon your homework solution.
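With parallel epipolar lines, the correspondence search reduces to block matching along each image row. A brute-force SSD sketch, where the window size and disparity range are placeholder values you would tune for your camera setup:

```python
import numpy as np

def disparity_map(left, right, max_disp=8, win=2):
    """Block matching along rows of a rectified stereo pair.

    Assumes epipolar lines coincide with image rows, so that
    left[y, x] corresponds to right[y, x - d] for disparity d.
    Returns an integer disparity per pixel.
    """
    h, w = left.shape
    L = np.pad(left.astype(np.float64), win, mode="edge")
    R = np.pad(right.astype(np.float64), win, mode="edge")
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            patch = L[y:y + 2 * win + 1, x:x + 2 * win + 1]
            best, best_d = np.inf, 0
            for d in range(min(max_disp, x) + 1):
                cand = R[y:y + 2 * win + 1, x - d:x - d + 2 * win + 1]
                ssd = ((patch - cand) ** 2).sum()   # sum of squared differences
                if ssd < best:
                    best, best_d = ssd, d
            disp[y, x] = best_d
    return disp
```

Depth is then recovered from disparity via the calibrated baseline and focal length; note that a homework-level solution would need subpixel refinement, occlusion handling, or smoothness constraints to qualify as a significant improvement.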