LoFA: Learning to Predict Personalized Prior for Fast Adaptation of Visual Generative Models
Abstract
Personalizing visual generative models to meet specific user needs has gained increasing attention, yet current methods like Low-Rank Adaptation (LoRA) remain impractical due to their demand for task-specific data and lengthy optimization. While a few hypernetwork-based approaches attempt to predict adaptation weights directly, they struggle to map fine-grained user prompts to complex LoRA distributions, limiting their practical applicability. To bridge this gap, we propose LoFA, a general framework that efficiently predicts personalized priors for fast model adaptation. We first identify a key property of LoRA: structured distribution patterns emerge in the relative changes between LoRA and base model parameters. Building on this, we design a two-stage hypernetwork: first predicting sparse response maps that capture key adaptation regions, then using these to guide final LoRA weight prediction. Extensive experiments demonstrate that our method consistently predicts high-quality personalized priors within seconds, across multiple tasks and user prompts, even outperforming conventional LoRA that requires hours of processing.
Method Overview
An overview of our LoFA. Conditioned on different user prompts, our network takes the base model weight W as input and predicts a LoRA response map in Stage-I. Stage-II then inherits Stage-I's architecture and uses the learned response-map information to guide the final prediction of the full LoRA weights.
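The two-stage prediction described above can be illustrated with a minimal numerical sketch. Everything here is an assumption for illustration only (module shapes, the linear hypernetwork heads `H1`, `H_A`, `H_B`, the thresholding used for sparsity, and the prompt embedding); it is not the authors' architecture, only the data flow: Stage-I produces a sparse response map over the base weight, and Stage-II's predicted low-rank update is gated by that map.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the paper's configuration)
d_out, d_in, rank, d_prompt = 8, 8, 2, 4

W_base = rng.standard_normal((d_out, d_in))   # frozen base model weight W
prompt = rng.standard_normal(d_prompt)        # user-prompt embedding (assumed)

# Stage-I: a hypernetwork head predicts a sparse response map over W_base,
# marking the key adaptation regions. Here it is a single linear map
# followed by a hard threshold, purely for illustration.
H1 = 0.1 * rng.standard_normal((d_out * d_in, d_prompt))
logits = (H1 @ prompt).reshape(d_out, d_in)
response_map = (logits > 0.05).astype(float)  # sparse 0/1 mask

# Stage-II: predicts the low-rank LoRA factors A and B from the same
# conditioning, then uses the Stage-I response map to gate the update.
H_A = 0.1 * rng.standard_normal((rank * d_in, d_prompt))
H_B = 0.1 * rng.standard_normal((d_out * rank, d_prompt))
A = (H_A @ prompt).reshape(rank, d_in)
B = (H_B @ prompt).reshape(d_out, rank)

delta_W = (B @ A) * response_map              # response map guides the prediction
W_adapted = W_base + delta_W                  # adapted weight, no fine-tuning loop
```

In this toy version the gating guarantees that the predicted update is exactly zero outside the regions Stage-I marked, which mirrors the paper's observation that adaptation concentrates in structured sub-regions of the weights.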
Comparison on Text Conditioned Human Action Video Generation
Comparison on Pose Conditioned Human Action Video Generation
Comparison on Text-to-Video Stylization
Comparison on Identity-Personalized Image Generation
BibTeX
@article{YourPaperKey2024,
  title={Your Paper Title Here},
  author={First Author and Second Author and Third Author},
  journal={Conference/Journal Name},
  year={2024},
  url={https://your-domain.com/your-project-page}
}