SevanB Publications

LITT: Diffusion Models Encode Spatial Composition as Linear Information in Text Tokens

Rogerio Guimares, Sevan Brodjian, Pietro Perona

Under Review

This work shows that the text tokens of a modern flow-based image model (SD 3.5) carry a linearly decodable signal indicating whether a prompt specifying a spatial composition will succeed. We use this signal for efficient inference-time scaling to improve success rates, including gradient steering in the latent space, which yields significant prompt-following improvements.

Submission Date: 2026-05-03

Training-Free Temporal Abstraction for General Video Understanding

Etienne Cassanova, Sevan Brodjian, Pietro Perona

Under Review

In this paper, we demonstrate that pretrained foundation models, used without additional training, provide a general video-understanding backbone for moment retrieval, generic event boundary detection, and long-video frame selection for VLMs.

Submission Date: 2026-05-02

Single-View Seafloor Recovery from Imaging Sonar via Differentiable Rendering

CVPR PBVS Workshop

*First Author

Sevan Brodjian, Michael Hobley, Pietro Perona

Accepted (Poster)

Forward-looking sonar collapses the entire vertical structure of a 3D scene into a single flat, ambiguous image, which makes inverting it an interesting problem. We built a fully differentiable renderer of the acoustic acquisition physics and let gradient descent do the rest, recovering seafloor and riverbed geometry from a single frame with no training data.

DOI: 10.48550/arXiv.2605.24195

Citation:

@inproceedings{brodjian2026sonar,
  title = {Single-View Seafloor Recovery from Imaging Sonar via Differentiable Rendering},
  author = {Brodjian, Sevan and Hobley, Michael and Perona, Pietro},
  booktitle = {CVPR Workshops (PBVS)},
  year = {2026}
}

Submission Date: 2026-03-08

Kuramoto Orientation Diffusion Models

NeurIPS 2025

Yue Song, T. Anderson Keller, Sevan Brodjian et al.

Accepted (Poster)

Synchronization in biological neural systems is often modeled with Kuramoto oscillator dynamics. We build on that mechanism to construct a diffusion model, using phase synchronization as a structured prior for generating orientation-rich images such as fingerprints and textures.

DOI: 10.48550/arXiv.2509.15328