Multimodal
Diffusion-Refined Segmentation and Vision-Language Interpretation for Pediatric Brain Tumor MRI
This work introduces a two-stage deep learning framework for pediatric brain tumor segmentation and interpretation, targeting two challenges: limited annotated data and heterogeneous imaging phenotypes. The framework evaluates 3D Res U-Net and Swin-UNETR on BraTS-PEDs MRI scans and applies diffusion-based refinement models, including a 3D DDPM refiner and MedSegDiff, to improve segmentation accuracy, particularly at tumor boundaries. It also integrates with a multimodal language model that generates structured radiology reports, improving interpretability in neuro-oncology workflows.
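To make the diffusion-based refinement idea more concrete, the following is a minimal sketch of the standard DDPM forward (noising) step applied to a coarse segmentation mask. It is not the authors' implementation: the function names `linear_beta_schedule` and `q_sample`, the linear noise schedule and its values, the choice to noise the mask itself, and the toy 8×8×8 mask are all assumptions made for illustration. A refiner of this kind would typically be trained to reverse such noising, conditioned on the MRI volume.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    # Hypothetical linear variance schedule; the source does not specify one.
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alpha_bar, rng):
    # Standard DDPM forward process:
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

# Cumulative product of (1 - beta) gives alpha_bar for every timestep.
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)

# Toy stand-in for a coarse 3D tumor mask from a first-stage segmenter.
rng = np.random.default_rng(0)
coarse_mask = np.zeros((8, 8, 8), dtype=np.float64)
coarse_mask[2:6, 2:6, 2:6] = 1.0

# Rescale the binary mask to [-1, 1] before noising (an assumed convention).
x0 = 2.0 * coarse_mask - 1.0
x_t, eps = q_sample(x0, t=500, alpha_bar=alpha_bar, rng=rng)

# Given the true noise, x0 is recoverable in closed form; a trained
# denoiser would predict eps instead of being handed it.
x0_rec = (x_t - np.sqrt(1.0 - alpha_bar[500]) * eps) / np.sqrt(alpha_bar[500])
```

In this sketch the last line only checks the algebra of the forward process; in a real refiner, a network would estimate `eps` from `x_t`, the timestep, and the image, and the denoised mask would be thresholded back to a binary segmentation.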
brain tumor, segmentation, deep learning