Computation and Structures Group

Accelerating Training Across Applications

Project and Position Brief

In recent years, the use of computer vision has exploded, in applications ranging from autonomy to security. Of growing importance are computer vision algorithms that are taught, not built. These are largely built around deep convolutional neural networks (CNNs) and, when properly trained, can achieve near-human performance in many computer vision tasks. But a problem remains: training these networks, and specifically obtaining the training data they require, is laborious. You can't always get the dataset you need from Google Image Search.

We believe that for application-specific neural networks, much of the training can be done in the field, on real video streams and in real time. Take, for instance, the example of training a drone to recognize and follow a particular person. Rather than taking a large number of pictures of similar-looking people and training a model on them, it is much faster to train the model on the fly, literally, while the person is in view of the camera. This also allows poor model behavior to be corrected during operation rather than afterward. We call this type of neural network a RaFT-NN (Rapidly Field Trained Neural Network).

We have developed a system that greatly aids an operator in training a CNN for object detection, requiring up to 8x fewer clicks in certain scenarios. We are searching for a scientifically minded machine learning/computer vision engineer to prove this concept across a large number of scenarios (some of which we will use as benchmarks and distribute to the greater research community) and to help us expand the capabilities of this system to accelerate training further.

Semester Deliverables

  • Investigation and creation of several training scenarios (videos), with ground truth tags
  • Report on the performance of current RaFT-NN techniques across various training scenarios
  • Implementation of several different image segmentation algorithms that can outline objects in the scene and separate them from the background
  • Assessment of their performance in distinguishing objects from background on the aforementioned training scenarios


Required Skills

  • Extensive experience with Python
  • Experience with OpenCV
  • Some machine learning background


Preferred Skills

  • Experience with TensorFlow or Caffe neural network frameworks
  • Familiarity with object tracking and/or image segmentation techniques


Send an email with your resume to:

  • Ervin Teng: ervin.teng@sv.cmu.edu
  • Bob Iannucci: bob@sv.cmu.edu