Apples are among the most widely consumed fruits, and they require efficient harvesting procedures to remain in optimal condition for longer, especially during transportation. Many orchard operators have therefore adopted automation to assist the harvesting process, which includes localizing apples on the trees. The de facto sensor for this task is the standard camera, which can capture a wide view of several apple trees from a reasonable distance. This paper therefore aims to automatically produce an output mask of the apple locations on the tree by using a deep semantic segmentation network. The network must be robust to shadows, varying ambient illumination, size variations, and occlusion in order to produce accurate pixel-wise localization of the apples. To this end, a high-resolution deep architecture is embedded with an optimized design of group and shuffle operators (GSO). GSO reduces the network's dependency on a small set of dominant convolutional filters by forcing each smaller group to contribute effectively to extracting apple features. The experimental results show that the proposed network, GSHR-Net, with two sets of group convolution applied to all layers, achieved the best mean intersection over union of 0.8045 when benchmarked against 11 other state-of-the-art deep semantic segmentation networks. For future work, network performance could be improved by integrating synthetic augmented data to further optimize the training phase. Spatial and channel-based attention mechanisms could also be explored to emphasize strategic apple locations, which may make recognition more accurate.
- apple recognition
- convolutional neural networks
- deep learning
- semantic segmentation
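
The two technical ideas named in the abstract, the shuffle step of a group and shuffle operator and the mean intersection over union metric, can be illustrated with a minimal sketch. It is not the GSHR-Net implementation: it uses plain Python lists in place of tensors, the function names `channel_shuffle` and `mean_iou` are hypothetical, and the shuffle shown is the common reshape-and-transpose formulation, which is assumed here and may differ from the optimized GSO design in the paper. The metric likewise assumes a per-image average over the classes that appear in either mask.

```python
def channel_shuffle(channels, groups):
    """Interleave channels so that each group receives channels from every other group.

    Equivalent to viewing the channel list as a (groups, per_group) grid
    and reading it out column by column.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def mean_iou(pred, target, num_classes=2):
    """Mean intersection over union, averaged over classes present in either mask.

    `pred` and `target` are 2D lists of integer class labels
    (assumed here: 0 = background, 1 = apple).
    """
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                inter += int(p == c and t == c)
                union += int(p == c or t == c)
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Six channels split into two groups: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle(list(range(6)), groups=2))

    # Toy 2x2 masks: the prediction marks one background pixel as apple.
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(mean_iou(pred, target))
```

After the shuffle, the next group convolution sees channels drawn from every previous group, which is one way to keep any single group of filters from dominating the extracted features.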