Low-cost image annotation for supervised machine learning. Application to the detection of weeds in dense culture

Title: Low-cost image annotation for supervised machine learning. Application to the detection of weeds in dense culture
Publication type: Communication
Type: Poster presentation at a conference
Conference date: 6/9/2018
Conference title: Computer Vision Problems in Plant Phenotyping (CVPPP 2018)
Authors: Samiei, Salma; Ahmad, Ali; Rasti, Pejman; Belin, Etienne; Rousseau, David
Abstract (English)

An open problem in robotized agriculture is the detection of weeds in dense culture. This problem can be addressed with computer vision and machine learning. The bottleneck of supervised approaches lies in the manual annotation of training images. We propose two approaches that speed up the annotation of weed positions. The first uses synthetic images and eye-tracking to annotate images [4], which is at least 30 times faster than manual annotation by an expert. The second is based on real RGB and depth images collected with a Kinect v2 sensor.
We generated a data set of 150 synthetic images in which weeds were randomly positioned. The images were viewed by two observers while an eye tracker sampled their eye positions during the task [5, 6]. Areas of interest were recorded as rectangular patches. A patch is considered to contain weeds if the average fixation time in this patch exceeds 1.04 seconds. The quality of the visual annotation by eye-tracking is assessed in two ways. First, a direct comparison of the visual annotation with the ground truth shows that, on average, 94.7% of all fixations on an image fell within ground-truth bounding boxes. Second, as shown in Fig. 1, the eye-tracked annotated data are used as a training data set for four machine learning approaches, and the resulting recognition rate is compared with the one obtained with the ground truth.
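The two annotation steps described above (thresholding the fixation time per patch, then checking fixations against ground-truth boxes) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the `Box` type, the function names, and the input layouts are hypothetical, and the average is assumed to be taken over the fixation durations recorded for a patch. Only the 1.04 s threshold comes from the abstract.

```python
from dataclasses import dataclass

# Threshold quoted in the abstract; everything else here is an assumption.
FIXATION_THRESHOLD_S = 1.04


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical layout)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x, y):
        # Points on the border are counted as inside.
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def label_patches(patch_times, threshold=FIXATION_THRESHOLD_S):
    """Label a patch as weed (True) when its mean fixation time exceeds the threshold.

    patch_times maps a patch id to a list of fixation durations in seconds.
    A patch with no recorded fixation is labelled False.
    """
    labels = {}
    for patch_id, times in patch_times.items():
        mean_time = sum(times) / len(times) if times else 0.0
        labels[patch_id] = mean_time > threshold
    return labels


def fixation_hit_rate(fixations, truth_boxes):
    """Fraction of fixation points (x, y) that fall inside any ground-truth box."""
    if not fixations:
        return 0.0
    hits = sum(
        1 for (x, y) in fixations
        if any(box.contains(x, y) for box in truth_boxes)
    )
    return hits / len(fixations)
```

Averaging `fixation_hit_rate` over all images would give a figure comparable in spirit to the 94.7% reported above, assuming fixations and boxes are expressed in the same image coordinates.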
The four machine learning methods are tested in order to assess the quality of the visual annotation. Each method uses handcrafted features adapted to texture characterization, followed by a linear support vector machine binary classifier. Table 1 gives the average accuracy and standard deviation. Experimental results show that the eye-tracked annotations are almost the same as the in-silico ground truth, and that the performance of supervised machine learning on eye-tracked annotated data is very close to the one obtained with the ground truth.
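A comparison of this kind can be summarized as in the sketch below. It assumes that each classifier was evaluated over several runs (for example cross-validation folds), which the abstract does not state, and that the sample standard deviation is the one reported. The function names are hypothetical, and no accuracy values from the paper are reproduced here.

```python
from statistics import mean, stdev


def summarize_accuracy(run_accuracies):
    """Return (mean, sample standard deviation) of per-run accuracies."""
    if len(run_accuracies) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    return mean(run_accuracies), stdev(run_accuracies)


def accuracy_gap(eye_tracked_runs, ground_truth_runs):
    """Mean accuracy with ground-truth labels minus mean accuracy with eye-tracked labels.

    A gap close to zero means that training on eye-tracked annotations
    performs about as well as training on the ground truth.
    """
    mean_eye, _ = summarize_accuracy(eye_tracked_runs)
    mean_truth, _ = summarize_accuracy(ground_truth_runs)
    return mean_truth - mean_eye
```

Reporting the standard deviation next to the mean makes it possible to judge whether the gap between the two training sets is larger than the run-to-run variation.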

Record URL: http://okina.univ-angers.fr/publications/ua17391