Abstract— Ultrasound imaging is pivotal in various medical diagnoses due to its non-invasive nature and safety. In clinical practice, the accuracy and precision of ultrasound image analysis are critical. Recent advancements in deep learning have shown great capacity for processing medical images. However, the data-hungry nature of deep learning and the shortage of high-quality ultrasound image training data hinder the development of deep learning-based ultrasound analysis methods. To address these challenges, we introduce an advanced deep learning model, dubbed S-CycleGAN, which generates high-quality synthetic ultrasound images from computed tomography (CT) data. This model incorporates semantic discriminators within a CycleGAN framework to ensure that critical anatomical details are preserved during the style transfer process. The synthetic images produced are used to augment training datasets for semantic segmentation models and robot-assisted ultrasound scanning system development, enhancing their ability to accurately parse real ultrasound imagery. The data and code will be available at https://github.com/yhsong98/ct-us-i2i-translation.
I. INTRODUCTION
Ultrasound imaging is one of the most widely implemented medical imaging modalities, offering a versatile, non-invasive, and cost-effective method for visualizing the internal structures of the body in real time. Although ultrasound imaging is safe and convenient, analyzing these images presents considerable challenges due to factors such as low contrast, acoustic shadows, and speckle noise [1]. Deep learning-based medical image processing methods have made great breakthroughs in recent years and have become the state-of-the-art tools for medical image processing applications in various fields, including detection, segmentation, classification, and synthesis [2].
Nonetheless, due to the data-hungry nature of deep learning, the performance of those methods relies heavily on a large amount of image data and manual annotations. While progress in unsupervised learning techniques and the emergence of large-scale open-source image datasets have mitigated these issues somewhat, these solutions are less applicable in the field of medical image processing [3]. This discrepancy is mainly due to several factors: First, medical images require precise and reliable annotations, which must often be provided by expert clinicians, making the process time-consuming and expensive. Second, patient privacy concerns limit the availability and sharing of medical datasets. Third, the variability in medical imaging equipment and protocols across different healthcare facilities can lead to inconsistencies in the data, complicating the development of generalized models. Lastly, the high dimensionality and complexity of medical images demand larger and more diverse datasets to train effective models, which are not always feasible to compile in the medical field.
Along these lines, we are building a fully automated robot-assisted ultrasound scan system (RUSS). This platform is designed to perform abdominal ultrasound scans without any human intervention (Fig. 1). To this end, we have proposed several versions of ultrasound image segmentation algorithms that serve as evaluation metrics for the robot arm movements [4], [5], [6]. However, our prior efforts have been restricted by limited data sources. While our segmentation algorithms have demonstrated effectiveness within our experimental datasets, we anticipate that training our model with a more diverse array of data would enhance its robustness and applicability. Furthermore, we aim to create a simulation environment to facilitate the development of our RUSS, allowing for refined testing and optimization under controlled conditions. A pre-operative 3D model reconstructed from CT scans is planned to be utilized as the scan target. Based on the current contact point and angle of the virtual ultrasound probe, the system will generate and provide a corresponding ultrasound image as feedback. This integration will enable the RUSS to simulate realistic scanning scenarios, allowing for precise alignment and positioning adjustments that reflect actual clinical procedures.
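The planned feedback loop can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`slice_ct_volume`, `ct_to_us`) and the toy reslicing and normalization logic are assumptions standing in for a real oblique-plane resampler and the trained CT-to-ultrasound generator.

```python
import numpy as np

def slice_ct_volume(volume, position, angle_deg):
    # Toy "reslice": pick the axial slice under the contact point.
    # A real system would resample an oblique plane along the probe axis,
    # using both the contact point and the probe angle.
    z = int(np.clip(position[2], 0, volume.shape[0] - 1))
    return volume[z]

def ct_to_us(ct_slice):
    # Stand-in for the trained CT-to-ultrasound generator: here, just a
    # min-max normalization so the sketch stays self-contained.
    lo, hi = ct_slice.min(), ct_slice.max()
    return (ct_slice - lo) / (hi - lo + 1e-8)

def simulate_probe_feedback(volume, position, angle_deg):
    # Given the virtual probe's contact point and angle on the pre-operative
    # model, return a synthetic ultrasound frame as feedback for the RUSS.
    ct_slice = slice_ct_volume(volume, position, angle_deg)
    return ct_to_us(ct_slice)

volume = np.random.rand(16, 64, 64)                    # pre-operative CT stand-in
frame = simulate_probe_feedback(volume, (32, 32, 5), angle_deg=15.0)
```

In the envisioned system, each probe repositioning by the robot arm would trigger one pass through this loop, closing the control cycle between probe pose and image feedback.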
In this research, we propose a semantically enhanced CycleGAN, dubbed S-CycleGAN. By adding segmentation models as semantic discriminators alongside the original style discriminator, the proposed model is capable of transferring the style of a CT slice to the ultrasound domain while keeping the transformed image semantically consistent with the source image.
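The generator objective implied by this design can be sketched as the standard CycleGAN terms (adversarial style loss plus cycle consistency) extended with a semantic term that penalizes segmentation disagreement on the translated image. This is a hedged sketch, not the paper's exact formulation: the loss weights `lambda_cyc` and `lambda_sem`, the least-squares adversarial form, and the cross-entropy semantic term are assumptions.

```python
import numpy as np

def lsgan_loss(d_fake):
    # Least-squares adversarial loss for the generator:
    # the style discriminator's score on fake images is pushed toward 1.
    return float(np.mean((d_fake - 1.0) ** 2))

def cycle_loss(x, x_rec):
    # L1 cycle consistency: translating CT -> US -> CT should reconstruct x.
    return float(np.mean(np.abs(x - x_rec)))

def semantic_loss(seg_logits, labels):
    # Cross-entropy between the semantic discriminator's segmentation of the
    # translated image and the source-domain labels, enforcing that anatomy
    # survives the style transfer.
    e = np.exp(seg_logits - seg_logits.max(axis=-1, keepdims=True))
    p = e / e.sum(axis=-1, keepdims=True)
    return float(-np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12)))

def s_cyclegan_generator_loss(d_fake, x, x_rec, seg_logits, labels,
                              lambda_cyc=10.0, lambda_sem=1.0):
    # Combined objective: style realism + content preservation + semantics.
    return (lsgan_loss(d_fake)
            + lambda_cyc * cycle_loss(x, x_rec)
            + lambda_sem * semantic_loss(seg_logits, labels))
```

Under this formulation, a perfectly fooling, perfectly cycle-consistent, perfectly segmentable translation drives all three terms toward zero; the semantic term is what distinguishes S-CycleGAN from a vanilla CycleGAN.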
This paper is available on arXiv (arXiv:2406.01191v1 [eess.IV]).
Authors:
(1) Yuhan Song, School of Information Science, Japan Advanced Institute of Science and Technology, Nomi, Ishikawa 923-1292, Japan ([email protected]);
(2) Nak Young Chong, School of Information Science, Japan Advanced Institute of Science and Technology, Nomi, Ishikawa 923-1292, Japan ([email protected]).