Before there were smartphones or smart cities, before autonomous cars or 360° virtual reality videos, there was our technology. Headquartered in San Diego, for over 30 years Qualcomm inventions have inspired others to make the impossible, possible. From 5G to artificial intelligence, from IoT to automotive and extended reality applications, Qualcomm is inventing the technologies of an intelligently connected future, spearheading research efforts for the next global wireless standard, and collaborating with industry leaders in the wireless value chain to make this future a commercial reality.
The field of 2D generative modeling is advancing rapidly, continually expanding the boundaries of image synthesis and manipulation. This internship aims to harness recent breakthroughs, particularly in 2D diffusion models [1, 2], to investigate efficient generative techniques such as few-step distillation [3, 4, 5] and high-resolution image generation [6, 7, 8]. The research will also extend into related areas including image editing, super-resolution, and restoration [9, 10, 11].
Responsibilities:
· Research and develop innovative approaches in 2D diffusion models for generative modeling.
· Extend research to adjacent tasks, such as super-resolution and image restoration.
· Explore and implement techniques that improve the computational efficiency of generative models.
· Implement baselines for comparative evaluation on benchmarks, and run thorough ablation studies.
The research conducted in this internship aims to advance 2D generative modeling and is expected to contribute to paper submissions at top-tier conferences in the field.
[1] High-Resolution Image Synthesis with Latent Diffusion Models, https://arxiv.org/abs/2112.10752
[2] SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis, https://arxiv.org/abs/2307.01952
[3] SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation, https://arxiv.org/abs/2503.09641
[4] Consistency Models, https://arxiv.org/abs/2303.01469
[5] Adversarial Diffusion Distillation, https://arxiv.org/abs/2311.17042
[6] SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers, https://arxiv.org/abs/2410.10629
[7] PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation, https://arxiv.org/abs/2403.04692
[8] DemoFusion: Democratising High-Resolution Image Generation With No $$$, https://arxiv.org/abs/2311.16973
[9] InstructIR: High-Quality Image Restoration Following Human Instructions, https://arxiv.org/abs/2401.16468
[10] Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild, https://arxiv.org/abs/2401.13627
[11] The Power of Context: How Multimodality Improves Image Super-Resolution, https://arxiv.org/abs/2503.14503
Programming Languages:
· Python
Minimum Qualifications:
· PyTorch
· Neural network architecture development and evaluation
· Computer Vision