QVision overview

The QVision can be seen in two ways: first, as a computer vision and scientific computing library; second, as an application development and design tool.

Regarding the first perspective, the QVision is built upon the Qt library, which offers extensive functionality for file management, networking, graphical widgets, and high-level containers (such as lists and hash tables), as well as programming tools (such as qmake, the Qt Assistant, and the Qt Creator) that help the developer create new applications.

The QVision extends the Qt library with a set of data structures commonly used in the fields of computer vision and scientific computing. Some of them are:

Images (class QVImage).
Matrices (class QVMatrix).
Vectors (class QVVector).
Quaternions (class QVQuaternion).
Tensors (class QVTensor).
3D points (class QV3DPointF).
Functions (class QVFunction).
Polylines (classes QVPolyline and QVPolylineF).

The QVision provides comprehensive functionality to operate on these and other types. It offers matrix decomposition functions (see the Matrix Algebra module), image feature detection functions (see the Image processing module), and projective geometry and 3D reconstruction functions (see the Projective Geometry module), among others. This functionality is expected to keep growing with contributions from several sources.

To provide comprehensive image processing, scientific computing, and video input/output functionality, the QVision can inter-operate with several third-party libraries and applications. A set of wrapper functions is provided to use the QVision classes with functionality from those libraries. Conversion operators are also generally included, to convert QVision data types from and to the data types of those libraries.
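The wrapper and conversion-operator idea can be illustrated with a minimal, self-contained sketch. The types `ThirdPartyImage` and `SketchImage`, and the functions `thirdPartyCountNonZero` and `countNonZero`, are hypothetical stand-ins invented for this example; they are not QVision classes nor the API of any of the libraries mentioned here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a plain image type owned by a third-party library.
struct ThirdPartyImage
{
    int width;
    int height;
    std::vector<unsigned char> data;    // row-major, one byte per pixel
};

// Hypothetical stand-in for a high-level image class (not the real QVImage).
class SketchImage
{
public:
    SketchImage(int width, int height)
        : cols(width), rows(height),
          pixels(static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 0) {}

    // Converting constructor: third-party type -> high-level type.
    SketchImage(const ThirdPartyImage &other)
        : cols(other.width), rows(other.height), pixels(other.data) {}

    // Conversion operator: high-level type -> third-party type.
    operator ThirdPartyImage() const
    {
        return ThirdPartyImage{cols, rows, pixels};
    }

    // Pixel access with a bounds check that is active in debug builds.
    unsigned char &operator()(int col, int row)
    {
        assert(col >= 0 && col < cols && row >= 0 && row < rows);
        return pixels[static_cast<std::size_t>(row) * cols + col];
    }

    int getCols() const { return cols; }
    int getRows() const { return rows; }

private:
    int cols, rows;
    std::vector<unsigned char> pixels;
};

// Hypothetical third-party routine: it only understands the third-party type.
inline int thirdPartyCountNonZero(const ThirdPartyImage &image)
{
    int count = 0;
    for (unsigned char value : image.data)
        if (value != 0)
            ++count;
    return count;
}

// Wrapper function: accepts the high-level type and forwards the call.
inline int countNonZero(const SketchImage &image)
{
    return thirdPartyCountNonZero(image);   // implicit conversion happens here
}
```

With this arrangement, code written against the high-level type can call `countNonZero(image)` directly, and an object can be moved across the library boundary with a plain assignment such as `ThirdPartyImage raw = image;`. This sketch copies the pixel buffer on every conversion; whether a real wrapper copies or shares the data is a design decision not covered here.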

The QVision also offers the developer several tools for application prototyping. It includes tools for reusable, block-oriented application development, which the developer can use to rapidly create complex applications structured as data processing pipelines.

Interoperability with other libraries

The QVision can be used in conjunction with several third-party libraries, and contains functionality to interact with them. The QVision must be configured to use each of them before it is compiled. These third-party libraries are:

Intel(R) Integrated Performance Primitives (IPP) contains a set of highly optimized functions, mostly for image processing. If the QVision is configured to use this library, the IPP wrapper functions module provides a large set of wrapper functions that take image and matrix objects as inputs. The developer can therefore use most of the IPP functions directly with the high-level QVision data types for images and matrices, and with Qt containers.

QWT - Qt Widgets for Technical Applications is a library that contains a large set of graphical widgets: plots, scales, sliders, dials, compasses, thermometers, wheels and knobs, to control or display values, arrays, or ranges of type double. Most of the QVision GUI widgets are based on widgets from the QWT. The module GUI blocks based on the QWT library groups these QWT-based widgets. The QVision can work without this library, but most of the graphical interface widgets will not be available.

OpenCV is a widely known and used library for computer vision. If the QVision is configured to use this library, the image data type (QVImage) offers conversion operators from and to OpenCV images (of type IplImage). In that case, including OpenCV code in QVision code and inter-operating with it is straightforward.

CGAL is an efficient C++ computational geometry library.

GSL - GNU Scientific Library is a C numerical library with over 1000 extensively tested functions. It provides a wide range of mathematical routines, such as random number generators, matrix and vector processing, function minimization, and least-squares fitting. This library is required for most of the QVision math functionality (related to matrices, vectors, and so on).

Intel MKL is a library of highly optimized, extensively threaded math routines for science, engineering, and financial applications that require maximum performance. As with the GSL, the QVision matrix and vector operators and the matrix algebra functionality run faster when they use functions from this library.

Octave C++ API is an object-oriented library that offers most of the basic matrix and vector processing available in the Octave application.

CUDA (Compute Unified Device Architecture) is NVIDIA's platform for programming its GPUs. The QVision functionality related to CUDA is still under development, and is not fully available in this release.

Also, the QVision includes several classes and functions to use the widely known MPlayer as a back-end application. This means that any QVision application can read from a wide set of video and image source types, like web-cams, remote streams, and many video and image formats and types using the MPlayer. The QVision functionality provided to inter-operate with the MPlayer will launch the necessary instances of the MPlayer application, and will communicate with them to obtain the images or input video frames in the format required by the QVision application.

Check the documentation of the module MPlayer based image and video input/output for a list of these functions and classes.

QVision application development tools

Signal processing tasks are common in computer vision and scientific computing. Characteristic processing structures appear in these algorithms, where the data flows through a pipeline of several stages, from the input of the application to its output (a graphical display, the user interface, the disk, and so on).

An example is the well-known Canny edge detector. The following graph depicts the processing stages it performs, from the input images to the resulting edges detected in them:

Square elements in the graph represent data processing blocks. Round elements represent data or parameter input/output in the data path. The arrows are directed data links, which connect stages that produce certain data with the stages that process it.

The QVision provides a design tool to help in the creation of these structures, and to exploit their computational and algorithmic advantages. Data processing blocks are modeled as objects that share data through links between them.
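The idea of blocks that pass data along links can be sketched in plain C++. The example below is a toy, one-dimensional edge detector built from three stages (smoothing, gradient, thresholding). The names `SketchBlock`, `SketchPipeline`, and `makeEdgePipeline` are hypothetical and do not belong to the QVision API; the stages are only loosely inspired by the graph above and do not implement the Canny detector.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Signal = std::vector<double>;

// Hypothetical processing block: a name, a function, and its last output.
struct SketchBlock
{
    std::string name;
    std::function<Signal(const Signal &)> process;
    Signal output;      // kept so that intermediate results can be inspected
};

// Hypothetical data path: each block is linked to the one appended after it.
class SketchPipeline
{
public:
    void append(std::string name, std::function<Signal(const Signal &)> process)
    {
        blocks.push_back(SketchBlock{std::move(name), std::move(process), Signal()});
    }

    // Runs one iteration: the data flows from the input through every block.
    Signal iterate(const Signal &input)
    {
        assert(!blocks.empty());
        Signal current = input;
        for (SketchBlock &block : blocks)
        {
            block.output = block.process(current);
            current = block.output;
        }
        return current;
    }

    const Signal &outputOf(std::size_t index) const { return blocks.at(index).output; }

private:
    std::vector<SketchBlock> blocks;
};

// Three-tap moving average; the two border samples are copied unchanged.
inline Signal smooth(const Signal &in)
{
    Signal out = in;
    for (std::size_t i = 1; i + 1 < in.size(); ++i)
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
    return out;
}

// Absolute forward difference; the last sample is left at zero.
inline Signal gradient(const Signal &in)
{
    Signal out(in.size(), 0.0);
    for (std::size_t i = 0; i + 1 < in.size(); ++i)
        out[i] = std::abs(in[i + 1] - in[i]);
    return out;
}

// Marks samples above a fixed threshold (0.25) with 1, and the rest with 0.
inline Signal threshold(const Signal &in)
{
    Signal out(in.size(), 0.0);
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = in[i] > 0.25 ? 1.0 : 0.0;
    return out;
}

inline SketchPipeline makeEdgePipeline()
{
    SketchPipeline pipeline;
    pipeline.append("Smooth", smooth);
    pipeline.append("Gradient", gradient);
    pipeline.append("Threshold", threshold);
    return pipeline;
}
```

Calling `makeEdgePipeline().iterate(input)` on a step signal marks the samples around the step. Because every block stores its last output, intermediate results remain available for inspection through `outputOf`.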

Some of the advantages of using this block design for application creation are the following:

The QVision offers several ready-to-use blocks for image/video input and image processing. The developer can use these blocks along with new ones to rapidly create complex applications.

The QVision exploits the inherent parallelism of these data-paths on multi-core architectures. The programmer can assign the processing of each block in the data-path to a different thread, and can optionally establish synchronization policies between linked blocks.

It is generally easier to develop complex, robust, and correct applications by re-using well-tested components (in this case, blocks) from other applications or from the QVision itself, and by programming the remaining functionality in logically independent data processing blocks.
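The second advantage, running linked blocks on separate threads with a synchronization policy between them, can be sketched with the standard C++ threading facilities. `SketchLink` and `runTwoBlocks` are hypothetical names introduced for this example, and the blocking queue is just one possible policy; the sketch does not describe how the QVision implements its links.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical synchronized link between two blocks: a blocking FIFO queue.
template <typename T>
class SketchLink
{
public:
    void send(T value)
    {
        {
            std::lock_guard<std::mutex> lock(guard);
            assert(!closed);            // sending after close() is a usage error
            items.push(std::move(value));
        }
        ready.notify_one();
    }

    // Marks the end of the stream so that the receiver can stop waiting.
    void close()
    {
        {
            std::lock_guard<std::mutex> lock(guard);
            closed = true;
        }
        ready.notify_all();
    }

    // Blocks until a value arrives; returns false once closed and drained.
    bool receive(T &value)
    {
        std::unique_lock<std::mutex> lock(guard);
        ready.wait(lock, [this] { return closed || !items.empty(); });
        if (items.empty())
            return false;
        value = std::move(items.front());
        items.pop();
        return true;
    }

private:
    std::mutex guard;
    std::condition_variable ready;
    std::queue<T> items;
    bool closed = false;
};

// Runs two blocks, each on its own thread: the first squares every input,
// the second accumulates a running sum of the values it receives.
inline std::vector<int> runTwoBlocks(const std::vector<int> &inputs)
{
    SketchLink<int> link;
    std::vector<int> outputs;

    std::thread producer([&] {
        for (int value : inputs)
            link.send(value * value);
        link.close();
    });

    std::thread consumer([&] {
        int value = 0;
        int sum = 0;
        while (link.receive(value))
        {
            sum += value;
            outputs.push_back(sum);
        }
    });

    producer.join();
    consumer.join();
    return outputs;
}
```

The queue preserves the order of the data, so the result does not depend on how the two threads are scheduled. Other policies (for example, dropping old values so that the consumer always sees the latest one) would trade that guarantee for lower latency.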

Creating advanced applications can be as simple as creating some block processing objects and establishing the data links between them. For example, the following is the full source code of an application that (1) applies the Harris corner detector to the image frames of a video sequence, (2) can read the frames from any kind of video source, using the MPlayer as a back-end application, and (3) displays the output image and the detected features in an image window full of useful functionality, such as zooming, region selection, and so on:

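        #include <QVApplication>
        #include <QVMPlayerReaderBlock>
        #include <QVDefaultGUI>
        #include <QVImageCanvas>
        #include <QVHarrisPointDetector>

        int main(int argc, char *argv[])
                {
                // Application object; the string describes the program.
                QVApplication app(argc, argv,
                        "Example program for QVision library. Obtains several features from input video frames."
                        );

                // Block that reads the video frames, using the MPlayer as back-end.
                QVMPlayerReaderBlock videoReader("Video reader");

                // Block that applies the Harris corner detector to each frame.
                QVHarrisPointDetector harrisBlock("Harris Block");

                // Default graphical user interface of the application.
                QVDefaultGUI interface;

                // Image window that displays the detected features.
                QVImageCanvas cornersCanvas("Harris corners displayer");

                // Data links: video reader -> Harris block -> image canvas.
                videoReader.linkProperty(&harrisBlock,"Input image");
                harrisBlock.linkProperty("Feature locations", cornersCanvas);

                // Start the application.
                return app.exec();
                }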

Besides these advantages, the library contains graphical widgets to inspect and modify the behavior and structure of these processing data-paths at execution time. The following is a snapshot of a typical QVision application, built using some of the block inspector widgets provided by the library:

With these widgets, the user of the application can stop, resume, and execute step by step the processing of each input image frame. The user can also modify the parameters and behavior of the algorithms implemented in the application at execution time, which is especially useful when tuning threshold variables or designing new algorithms. Some of these widgets can also inspect the performance and outputs of the intermediate and final blocks in the data-path, such as the resulting images, the features detected, and so on.
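The stop, resume, and step-by-step behavior described above can be modeled as a small state machine that the processing loop consults before each frame. The class `SketchRunControl` and the function `countProcessed` are hypothetical names for this sketch; they are not part of the QVision, and the actual widgets may work differently.

```cpp
#include <cassert>

// Hypothetical run-time controller, consulted before each frame is processed.
class SketchRunControl
{
public:
    void pause()  { paused = true;  pendingSteps = 0; }
    void resume() { paused = false; pendingSteps = 0; }

    // Requests exactly one more iteration while the processing is paused.
    void step()
    {
        assert(paused);     // stepping only makes sense when paused
        ++pendingSteps;
    }

    // Returns true if the next frame should be processed.
    bool shouldProcess()
    {
        if (!paused)
            return true;
        if (pendingSteps == 0)
            return false;
        --pendingSteps;
        return true;
    }

private:
    bool paused = false;
    int pendingSteps = 0;
};

// Offers `frames` frames to a processing loop and counts how many were processed.
inline int countProcessed(SketchRunControl &control, int frames)
{
    int processed = 0;
    for (int i = 0; i < frames; ++i)
        if (control.shouldProcess())
            ++processed;
    return processed;
}
```

In a graphical application, the `pause`, `resume`, and `step` calls would typically be triggered by buttons, while the processing loop keeps calling `shouldProcess` for every incoming frame.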

The QVision can be used to create augmented reality applications:

Check the class QVImageCanvas for further information about this.

The QVision also provides a special tool, named the Designer. With this tool, the developer can inspect and modify the structure of the data-path at execution time, which makes rapid application development easy. The Designer displays a slate window, where the user can view the data-path structure, and add or delete nodes and the data links between them, while the application is still running. An example of this slate window is the following:

For further information about the Designer, check the section The Designer GUI. The section Creating the first block-oriented application of the manual starts with the basics of block programming. Further sections extend the details and functionality of this approach to QVision application development.