NVIDIA Vision Processing

Vision Server Example

This is an in-progress example of a program that processes real-time video from a V4L-compatible webcam and streams both the image and the extracted data. It uses the same algorithm that Team Paradox used during the Stronghold season.

The program has been tested on an NVIDIA Jetson TK1 running Linux for Tegra R21.4 (with updates) and OpenCV 3.1.0, with a Microsoft LifeCam HD-3000 USB webcam. However, it should work without modification on any other Linux system with OpenCV and Video4Linux2 drivers.

The source code for this project can be found at https://github.com/Paradox2102/JetsonVision.


  1. Log in to your device
    • Remotely: Use SSH (through PuTTY if you are using Windows).
    • Locally: Plug in a monitor (HDMI port) and keyboard (USB port). After login, open a Terminal window.
  2. Install build tools (GCC, C++ libraries, Make, pkg-config)
    • As root: apt-get install build-essential pkg-config
  3. Install OpenCV from source
  4. Build the example
    • Clone the repository: git clone https://github.com/Paradox2102/JetsonVision.git
    • Enter the new directory: cd JetsonVision
    • Build and run the program: make run

If you installed OpenCV somewhere other than /opt, you will have to edit the following line in the Makefile, replacing /opt with the value you used for CMAKE_INSTALL_PREFIX during the OpenCV installation:

pc = pkg-config /opt/lib/pkgconfig/opencv.pc


If all goes well, when you run the generated executable vision-server.o, it will grab images from the attached USB camera in a loop and detect blobs within the specified HSV color range (which defaults to the all-inclusive 0-255). It assumes the largest of these blobs represents the tape around the castle goal, and extracts the following useful data points from the contour:

  • Bounding rectangle: X, Y, width, and height
    • If the horizontal center of this rectangle is close to the ideal target X position, the robot is facing the castle.
  • Y1 and Y2: the Y-values of the top left and the top right corners, respectively
    • If the greater of these two values is close to the ideal target Y position, the robot is at the correct distance from the castle.

The program also acts as two different servers, the DataServer and the ImageServer.

Robot side

The DataServer listens on port 5800 by default and repeatedly broadcasts this real-time information about the vision target in the plain text format shown below, terminated by a newline. The robot program is intended to connect to this port and make decisions based on incoming lines of data.

R [X] [Y] [width] [height] [Y1] [Y2] [ideal target Y position] [frame number] [ideal target X position]

Items shown in brackets are unsigned integers.
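On the robot side, one line of this format can be parsed straightforwardly. The sketch below is illustrative only; the struct and field names are assumptions, not identifiers from the actual robot program.

```cpp
// Illustrative parser for one DataServer line in the "R ..." format above.
#include <cstdio>

struct VisionFrame {
    unsigned x, y, width, height;   // bounding rectangle
    unsigned y1, y2;                // top-left / top-right corner Y-values
    unsigned idealY, frameNumber, idealX;
};

// Returns true if the line matched the expected format.
bool parseVisionLine(const char* line, VisionFrame& f)
{
    return std::sscanf(line, "R %u %u %u %u %u %u %u %u %u",
                       &f.x, &f.y, &f.width, &f.height,
                       &f.y1, &f.y2, &f.idealY, &f.frameNumber, &f.idealX) == 9;
}
```

The robot program would then compare the rectangle's center X against the ideal target X position, and the greater of Y1/Y2 against the ideal target Y position, as described above.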

The robot can also send commands back to the server; however, this has not been implemented in this example.

Driver side

The ImageServer listens on port 5801 by default and repeatedly broadcasts each image along with the current image-processing parameters. The Driver Station laptop should have an Image Viewer program installed on the Desktop, which connects to the ImageServer and displays the video, as well as providing a user interface for viewing and changing parameters. Each message from the ImageServer consists of:

  • A key, which is the bytes AA 55 AA 55 (hexadecimal)
  • A header, which is an instance of struct ImageHeader (see ImageServer.hpp) directly transmitted as a byte array
    • It contains the size in bytes for the ensuing image, as well as the current image processing parameters
  • The image itself, encoded as JPEG
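A client reading this stream needs to locate the 4-byte key before each header. A minimal sketch of that framing step is shown below; the helper name findKey is an assumption, and the real header layout is struct ImageHeader in ImageServer.hpp, not reproduced here.

```cpp
// Illustrative scan for the message key AA 55 AA 55 in a receive buffer.
// After the key, the client would read sizeof(ImageHeader) bytes, then the
// JPEG whose size the header gives.
#include <cstdint>
#include <cstring>

static const uint8_t kKey[4] = { 0xAA, 0x55, 0xAA, 0x55 };

// Returns the offset just past the key, or -1 if the key was not found.
int findKey(const uint8_t* buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; ++i)
        if (std::memcmp(buf + i, kKey, 4) == 0)
            return static_cast<int>(i + 4);
    return -1;
}
```

Scanning for the key rather than trusting stream position lets the client resynchronize after a dropped or partial message.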

The client (through a GUI used by drivers and programmers) can also send commands back to the server (in a plain text format) to change parameters. The following commands have been implemented so far:

HSV commands

  • h [delta]: Change the minimum value for the Hue filter by [delta] (a signed integer)
  • H [delta]: Change the maximum value for the Hue filter by [delta] (a signed integer)
  • s [delta]: Change the minimum value for the Saturation filter by [delta] (a signed integer)
  • S [delta]: Change the maximum value for the Saturation filter by [delta] (a signed integer)
  • v [delta]: Change the minimum value for the Value filter by [delta] (a signed integer)
  • V [delta]: Change the maximum value for the Value filter by [delta] (a signed integer)

All of these values are clamped to the 0-255 range on the server side.
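The command handling above amounts to dispatching on the command letter, applying the signed delta, and clamping. A sketch, assuming illustrative names (HsvRange, applyHsvCommand) rather than the project's actual identifiers:

```cpp
// Illustrative server-side handler: a one-letter command adjusts one HSV
// bound by a signed delta, clamped to the 0-255 range.
#include <algorithm>
#include <cstdio>

struct HsvRange {
    int hMin = 0, hMax = 255;
    int sMin = 0, sMax = 255;
    int vMin = 0, vMax = 255;
};

bool applyHsvCommand(const char* cmd, HsvRange& r)
{
    char c;
    int delta;
    if (std::sscanf(cmd, "%c %d", &c, &delta) != 2)
        return false;
    int* target = nullptr;
    switch (c) {
        case 'h': target = &r.hMin; break;
        case 'H': target = &r.hMax; break;
        case 's': target = &r.sMin; break;
        case 'S': target = &r.sMax; break;
        case 'v': target = &r.vMin; break;
        case 'V': target = &r.vMax; break;
        default:  return false;        // unrecognized command
    }
    *target = std::min(255, std::max(0, *target + delta));  // clamp to 0-255
    return true;
}
```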

The program used to view the images and control the server is a Java program that can be launched using the ImageViewer.cmd file in the PiUtils folder of the RobotTools.zip download. The Eclipse project used to build this tool can be found in the workspaceJava folder of the same download.

ImageViewerMulti.cmd is a Windows command file that starts the Image Viewer; its last parameter specifies the IP address of the Raspberry Pi:

java -cp .;ImageViewer.jar imageViewer.ImageViewer

Before you run the Image Viewer, you must first start the Image Server on the Raspberry Pi. To do this you can use the SSH utility PuTTY (putty.exe), which is also located in the RobotTools folder. Log into the Raspberry Pi with the user name ‘pi’ and the password ‘raspberry’. Once logged in, run the program ‘NetworkVision’ located in the folder ‘networkVision’, and leave it running.


Now that you have the server running, you should be able to start the Image Viewer on your computer and connect to the Pi, which displays the live video along with the parameter controls.


You can adjust the parameters that control the processing of the images. Targets are recognized by color, and you specify a range for each of the Hue, Saturation, and Value components. This version of the image processor will only return the first blob that exceeds the minimum area, and can only handle one color at a time.

You can also control camera settings such as shutter speed, brightness, and saturation. Finally, there are horizontal and vertical target parameters that can be set. These numbers are sent to the roboRIO as part of the region data.