Comparing workflow application designs for high resolution satellite image analysis

Aymen Al-Saadi, Ioannis Paraskevakos, Bento Collares Gonçalves, Heather J. Lynch, Shantenu Jha, and Matteo Turilli

Abstract. Very High Resolution satellite and aerial imagery is used to monitor ecological systems and to conduct large-scale surveys of them. Convolutional Neural Networks have been successfully employed to analyze such imagery and detect large animals and salient features. As datasets grow in volume and number of images, using High Performance Computing resources becomes necessary. In this paper, we investigate three task-parallel, data-driven workflow designs to support imagery analysis pipelines with heterogeneous tasks on high performance computing platforms. We analyze the capabilities of each design when processing 3,097 and 1,575 images for two distinct use cases, for a total of 4,672 satellite and aerial images and 8.35 TB of data. We experimentally model the execution time of the tasks of the image processing pipelines. We perform experiments to characterize the resource utilization, total time to completion, and overheads of each design. Our analysis shows which design is best suited to scientific pipelines with similar characteristics.