As part of CS 543: Computer Vision, our team — Abdulrahman AlRabah, Ajay Rao, Imaad Zaffar Khan, and Manav Singhai — re-implemented the CVPR 2024 paper “Shadows Don’t Lie and Lines Can’t Bend!” and extended it with our own contributions. The project focused on detecting synthetic images produced by generative and diffusion models through geometric and frequency-based cues.
We built a full detection pipeline that analyzes line geometry, perspective fields, and object-shadow relationships, integrating models such as DeepLSD for line-segment detection, PerspectiveFields for vanishing-point consistency, and Detectron2 for shadow segmentation. In addition, we incorporated frequency-domain analysis using Fourier and DCT transforms to expose subtle spectral artifacts introduced during image synthesis. Our extension to the original paper introduced an OpenAI vision fine-tuning component that classifies real vs. generated images, as well as an autoencoder-based anomaly detector that discovers fake images through reconstruction.
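To make the vanishing-point cue concrete, here is a minimal sketch of how line segments that should converge can be checked for consistency. It is an illustration under stated assumptions, not the project's actual pipeline: the function names `line_through`, `vanishing_point`, and `mean_line_distance` are hypothetical, the segments are assumed to come from some line detector as pairs of pixel coordinates, and the vanishing point is assumed to be finite.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def _normalized_lines(segments):
    """Stack lines and scale each so |l . (x, y, 1)| is a pixel distance."""
    lines = np.array([line_through(p, q) for p, q in segments], dtype=float)
    return lines / np.linalg.norm(lines[:, :2], axis=1, keepdims=True)

def vanishing_point(segments):
    """Least-squares intersection of the segments' supporting lines.

    The homogeneous point v minimizing sum_i (l_i . v)^2 is the right
    singular vector with the smallest singular value.
    Assumes the result is finite (v[2] != 0).
    """
    lines = _normalized_lines(segments)
    _, _, vt = np.linalg.svd(lines)
    v = vt[-1]
    return v[:2] / v[2]

def mean_line_distance(segments, point):
    """Mean distance in pixels from `point` to each supporting line."""
    lines = _normalized_lines(segments)
    return float(np.mean(np.abs(lines @ np.array([point[0], point[1], 1.0]))))

# Three hypothetical segments whose supporting lines meet at (100, 50).
segments = [((0, 0), (50, 25)), ((0, 100), (50, 75)), ((0, 50), (10, 50))]
vp = vanishing_point(segments)
residual = mean_line_distance(segments, vp)
```

A small residual means the segments agree on a single vanishing point; a large residual for lines that should be parallel in the scene is the kind of geometric inconsistency the detection pipeline looks for.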
Through extensive experiments on datasets including Kandinsky, PixArt, SDXL, and DeepFloyd, we found that fake images consistently show misaligned shadows, vanishing-point errors, and low-frequency dominance in the spectral domain. Combining geometric and frequency cues yielded robust performance, achieving over 97% accuracy on “easy” datasets and demonstrating strong generalization to unseen generative models.
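One simple way to quantify low-frequency dominance is the fraction of spectral power that falls inside a small radius around the zero frequency. The sketch below assumes a grayscale image as a 2D NumPy array; the function name `low_freq_energy_ratio` and the cutoff `radius_frac` (in cycles per pixel) are hypothetical choices for illustration, not values or code from the project.

```python
import numpy as np

def low_freq_energy_ratio(gray, radius_frac=0.1):
    """Share of non-DC spectral power at or below `radius_frac` cycles per pixel."""
    gray = np.asarray(gray, dtype=float)
    # Remove the mean so the DC term does not dominate the ratio.
    spectrum = np.fft.fftshift(np.fft.fft2(gray - gray.mean()))
    power = np.abs(spectrum) ** 2

    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # After fftshift the zero frequency sits at index (h // 2, w // 2).
    radius = np.hypot((yy - h // 2) / h, (xx - w // 2) / w)

    total = power.sum()
    if total == 0:
        return 0.0  # constant image: no spectral content to compare
    return float(power[radius <= radius_frac].sum() / total)

# Synthetic check: a smooth pattern versus white noise (assumed test inputs).
x = np.arange(64)
smooth = np.sin(2 * np.pi * 2 * x / 64)[None, :] * np.ones((64, 1))
noise = np.random.default_rng(0).normal(size=(64, 64))
smooth_ratio = low_freq_energy_ratio(smooth)
noise_ratio = low_freq_energy_ratio(noise)
```

The smooth pattern concentrates nearly all of its power inside the cutoff, while white noise spreads power evenly across the spectrum. A score like this could serve as one input feature alongside the geometric cues; any decision threshold would have to be tuned on real and generated images.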
This work reinforced our understanding of visual geometry, lighting cues, and frequency-domain analysis — directly connecting theoretical concepts from CS 543 to real-world detection of synthetic imagery.