Diffusion Model
A new framework that fuses the generation approaches of diffusion and autoregressive models:
Arriola, Marianne, Aaron Gokaslan, Justin T. Chiu, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Subham Sekhar Sahoo, and Volodymyr Kuleshov. “Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models,” 2024. https://openreview.net/forum?id=tyEyYT267x.
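A minimal sketch of the block-diffusion idea, under assumed interfaces (`denoise` and `prior_noise` are hypothetical callables, not the paper's API): blocks are emitted left-to-right as in an autoregressive model, while each block's contents are produced by a short denoising loop conditioned on all previously finished blocks.

```python
def block_diffusion_generate(denoise, prior_noise, num_blocks, steps):
    """Generate a sequence block by block: autoregressive across blocks,
    diffusion within each block. `denoise(block, t, context)` returns a
    slightly less noisy version of `block` given the finished context."""
    context = []
    for b in range(num_blocks):
        block = prior_noise(b)              # start each block from noise
        for t in reversed(range(steps)):    # inner reverse-diffusion loop
            block = denoise(block, t, context)
        context.append(block)               # later blocks condition on it
    return context
```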
Potentially a novel alignment method that removes the need for guided sampling in conditional autoregressive visual generation:
Chen, Huayu, Hang Su, Peize Sun, and Jun Zhu. “Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment,” 2024. https://openreview.net/forum?id=kGvXIlIVLM.
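For context, classifier-free guidance combines a conditional and an unconditional pass at sampling time; the paper's Condition Contrastive Alignment is a fine-tuning scheme that aims to make this extra guided pass unnecessary. A minimal sketch of the guidance rule being eliminated:

```python
import numpy as np

def cfg_combine(cond_logits: np.ndarray, uncond_logits: np.ndarray,
                scale: float = 3.0) -> np.ndarray:
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the conditional one. CCA fine-tunes the model so
    that a single conditional pass matches this guided distribution."""
    return uncond_logits + scale * (cond_logits - uncond_logits)
```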
Another framework for conditional generation, which adapts diverse pretrained controls to any image or video diffusion model:
Lin, Han, Jaemin Cho, Abhay Zala, and Mohit Bansal. “Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model,” 2024. https://openreview.net/forum?id=ny8T8OuNHe.
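A rough sketch of the adapter pattern this line of work builds on (all interfaces here are illustrative, not the paper's API): features from a frozen, pretrained control network are mapped by small trainable adapters and added to the frozen target diffusion model's intermediate features.

```python
def adapted_features(backbone_feats, control_feats, adapters):
    """Add adapter-mapped control features to each matching backbone
    feature map; only the small adapters are trained."""
    return [f + adapter(c)
            for f, c, adapter in zip(backbone_feats, control_feats, adapters)]
```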
Yet another framework, this one for running the reverse diffusion process in one or a few steps:
Frans, Kevin, Danijar Hafner, Sergey Levine, and Pieter Abbeel. “One Step Diffusion via Shortcut Models,” 2024. https://openreview.net/forum?id=OlzB6LnXcS.
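A schematic of the sampling loop under an assumed `model(x, t, d)` signature. The step size d being an explicit input is the paper's key idea; everything else here is illustrative.

```python
def shortcut_sample(model, x, steps):
    """Few-step sampling with a step-size-conditioned network: the model
    sees the desired step size d and predicts a jump that accounts for
    the curvature a naive ODE step of that size would miss."""
    d = 1.0 / steps
    t = 0.0
    for _ in range(steps):
        x = x + d * model(x, t, d)   # one large Euler-style jump
        t += d
    return x
```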
A work on simplifying and scaling consistency models (CMs), a family of fast few-step generative models built on the diffusion framework:
Lu, Cheng, and Yang Song. “Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models,” 2024. https://openreview.net/forum?id=LyJi5ugyJx.
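A simplified sketch of the self-consistency property CMs train for, shown on an illustrative linear noising path x_t = x_0 + t·ε (real CMs add details such as EMA target networks, stop-gradients, and boundary parameterizations, all omitted here):

```python
import numpy as np

def consistency_loss(f, x0: np.ndarray, t1: float, t2: float,
                     eps: np.ndarray) -> float:
    """The consistency function f(x_t, t) should map every point on one
    noising trajectory to the same clean sample, so outputs at two
    different noise levels are pulled toward each other."""
    x_t1 = x0 + t1 * eps
    x_t2 = x0 + t2 * eps
    return float(np.mean((f(x_t1, t1) - f(x_t2, t2)) ** 2))
```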
A work that improves the reverse denoising process by learning an optimal discretization of the denoising ODE's timesteps:
Tong, Vinh, Dung Trung Hoang, Anji Liu, Guy Van den Broeck, and Mathias Niepert. “Learning to Discretize Denoising Diffusion ODEs,” 2024. https://openreview.net/forum?id=xDrFWUmCne.
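For reference, a few-step Euler sampler over a given timestep grid; works like this one treat the grid `ts` itself as learnable rather than fixed (the velocity field `v` is an assumed callable):

```python
def euler_sample(v, x, ts):
    """Integrate the probability-flow ODE dx/dt = v(x, t) with explicit
    Euler steps over the discretization ts; the quality of a few-step
    sample depends heavily on how ts is chosen."""
    for t0, t1 in zip(ts[:-1], ts[1:]):
        x = x + (t1 - t0) * v(x, t0)
    return x
```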
Computer Vision
Video Modeling
A few works representing the latest developments in video modeling:
Gu, Xin, Yaojie Shen, Chenxi Luo, Tiejian Luo, Yan Huang, Yuewei Lin, Heng Fan, and Libo Zhang. “Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding,” 2024. https://openreview.net/forum?id=WOzffPgVjF.
Wang, Hanyu, Saksham Suri, Yixuan Ren, Hao Chen, and Abhinav Shrivastava. “LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior,” 2024. https://openreview.net/forum?id=Wr3UuEx72f.
Image Modeling
Xie, Enze, Junsong Chen, Junyu Chen, Han Cai, Haotian Tang, Yujun Lin, Zhekai Zhang, et al. “SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers,” 2024. https://openreview.net/forum?id=N8Oj1XhtYZ.
Yu, Sihyun, Sangkyung Kwak, Huiwon Jang, Jongheon Jeong, Jonathan Huang, Jinwoo Shin, and Saining Xie. “Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think,” 2024. https://openreview.net/forum?id=DJSZGGZYVi.
Vision-language
Chow, Wei, Jiageng Mao, Boyi Li, Daniel Seita, Vitor Campagnolo Guizilini, and Yue Wang. “PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding,” 2024. https://openreview.net/forum?id=Q6a9W6kzv5.
Pal, Avik, Max van Spengler, Guido Maria D’Amely di Melendugno, Alessandro Flaborea, Fabio Galasso, and Pascal Mettes. “Compositional Entailment Learning for Hyperbolic Vision-Language Models,” 2024. https://openreview.net/forum?id=3i13Gev2hV.
Schrodi, Simon, David T. Hoffmann, Max Argus, Volker Fischer, and Thomas Brox. “Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models,” 2024. https://openreview.net/forum?id=uAFHCZRmXk.
Zhang, Zheyuan, Fengyuan Hu, Jayjun Lee, Freda Shi, Parisa Kordjamshidi, Joyce Chai, and Ziqiao Ma. “Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities,” 2024. https://openreview.net/forum?id=84pDoCD4lH.
Large Language Models
Inference Framework
This framework has the potential to enable probabilistic prediction with LLMs:
Feng, Yu, Ben Zhou, Weidong Lin, and Dan Roth. “BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models.” arXiv, October 16, 2024. https://doi.org/10.48550/arXiv.2404.12494.
Frameworks for controlling the generation process of LLMs:
Loula, João, Benjamin LeBrun, Li Du, Ben Lipkin, Clemente Pasti, Gabriel Grand, Tianyu Liu, et al. “Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo,” 2024. https://openreview.net/forum?id=xoXn62FzD0.
Minh, Nguyen Nhat, Andrew Baker, Clement Neo, Allen G. Roush, Andreas Kirsch, and Ravid Shwartz-Ziv. “Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs,” 2024. https://openreview.net/forum?id=FBkpCyujtS.
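A minimal sketch of the min-p rule from Minh et al. (parameter names here are illustrative): the truncation threshold scales with the model's confidence, so sampling stays diverse when the next-token distribution is flat and conservative when it is peaked.

```python
import numpy as np

def min_p_sample(logits: np.ndarray, min_p: float = 0.1,
                 temperature: float = 1.0) -> int:
    """Sample one token id: keep only tokens whose probability is at
    least min_p times the top token's probability, then renormalize."""
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    threshold = min_p * probs.max()      # confidence-scaled cutoff
    trimmed = np.where(probs >= threshold, probs, 0.0)
    trimmed /= trimmed.sum()
    return int(np.random.choice(len(probs), p=trimmed))
```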
Representation Learning
There are a few works that shed light on the use of LLMs as representation-learning or embedding models:
Gao, Leo, Tom Dupre la Tour, Henk Tillman, Gabriel Goh, Rajan Troll, Alec Radford, Ilya Sutskever, Jan Leike, and Jeffrey Wu. “Scaling and Evaluating Sparse Autoencoders,” 2024. https://openreview.net/forum?id=tcsZt9ZNKD.
Li, Ziyue, and Tianyi Zhou. “Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free,” 2024. https://openreview.net/forum?id=eFGQ97z5Cd.
Zhang, Jie, Dongrui Liu, Chen Qian, Linfeng Zhang, Yong Liu, Yu Qiao, and Jing Shao. “REEF: Representation Encoding Fingerprints for Large Language Models,” 2024. https://openreview.net/forum?id=SnDmPkOJ0T.
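A rough sketch of the observation in Li and Zhou: expert-routing weights already encode semantics, so concatenating the token-averaged routing distributions across layers yields an embedding with no extra training (shapes here are assumptions for illustration):

```python
import numpy as np

def moe_routing_embedding(routing_weights):
    """Build a sentence embedding from MoE routing weights.
    `routing_weights` is a list of (num_tokens, num_experts) arrays,
    one per MoE layer; average over tokens, then concatenate layers."""
    return np.concatenate([layer.mean(axis=0) for layer in routing_weights])
```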
Training Strategy
A few works offering different perspectives on how to train, fine-tune, and adapt LLMs:
Gu, Yuxian, Li Dong, Hongning Wang, Yaru Hao, Qingxiu Dong, Furu Wei, and Minlie Huang. “Data Selection via Optimal Control for Language Models,” 2024. https://openreview.net/forum?id=dhAL5fy8wS.
Huang, Qiushi, Tom Ko, Zhan Zhuang, Lilian Tang, and Yu Zhang. “HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models,” 2024. https://openreview.net/forum?id=TwJrTz9cRS.
Jiang, Gangwei, Caigao Jiang, Zhaoyi Li, Siqiao Xue, Jun Zhou, Linqi Song, Defu Lian, and Ying Wei. “Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning,” 2024. https://openreview.net/forum?id=gc8QAQfXv6.
Scholten, Yan, Stephan Günnemann, and Leo Schwinn. “A Probabilistic Perspective on Unlearning and Alignment for Large Language Models.” arXiv, March 1, 2025. https://doi.org/10.48550/arXiv.2410.03523.
Tang, Song, Wenxin Su, Yan Gan, Mao Ye, Jianwei Zhang, and Xiatian Zhu. “Proxy Denoising for Source-Free Domain Adaptation,” 2024. https://openreview.net/forum?id=FIj9IEPCKr.
Wu, Yichen, Hongming Piao, Long-Kai Huang, Renzhen Wang, Wanhua Li, Hanspeter Pfister, Deyu Meng, Kede Ma, and Ying Wei. “SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning,” 2024. https://openreview.net/forum?id=5U1rlpX68A.
Zhang, Yuheng, Dian Yu, Baolin Peng, Linfeng Song, Ye Tian, Mingyue Huo, Nan Jiang, Haitao Mi, and Dong Yu. “Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning,” 2024. https://openreview.net/forum?id=Pujt3ADZgI.
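Several of these works (HiRA, SD-LoRA) build on the LoRA parameterization, sketched below with assumed shapes. HiRA replaces the additive update with a Hadamard modulation of the frozen weight, roughly W0 ⊙ (AB), to obtain a high-rank effective update.

```python
import numpy as np

def lora_forward(x: np.ndarray, W0: np.ndarray, A: np.ndarray,
                 B: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Plain LoRA: y = x W0 + scale * (x A) B, where W0 (d_in, d_out)
    is frozen and A (d_in, r), B (r, d_out) are the trained low-rank
    factors with r << min(d_in, d_out)."""
    return x @ W0 + scale * (x @ A) @ B
```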
It is worth noting that there are quite a few works on Reinforcement Learning from Human Feedback (RLHF):
Li, Aaron Jiaxun, Satyapriya Krishna, and Himabindu Lakkaraju. “More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness,” 2024. https://openreview.net/forum?id=FpiCLJrSW8.
Liu, Yantao, Zijun Yao, Rui Min, Yixin Cao, Lei Hou, and Juanzi Li. “RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style,” 2024. https://openreview.net/forum?id=QEHrmQPBdd.
Sun, Hao, Yunyi Shen, and Jean-Francois Ton. “Rethinking Reward Modeling in Preference-Based Large Language Model Alignment,” 2024. https://openreview.net/forum?id=rfdblE10qm.
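As background for these three papers, the standard Bradley-Terry objective used to fit reward models from pairwise preferences, in a minimal form:

```python
import numpy as np

def bradley_terry_loss(r_chosen: np.ndarray, r_rejected: np.ndarray) -> float:
    """Negative log-likelihood that each chosen response beats its
    rejected counterpart: -log sigmoid(r_chosen - r_rejected), computed
    via logaddexp for numerical stability."""
    margin = r_chosen - r_rejected
    return float(np.mean(np.logaddexp(0.0, -margin)))
```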
Self-improvement
Self-improvement has become a popular topic in LLM research recently. Below are a few papers on this topic:
Huang, Audrey, Adam Block, Dylan J. Foster, Dhruv Rohatgi, Cyril Zhang, Max Simchowitz, Jordan T. Ash, and Akshay Krishnamurthy. “Self-Improvement in Language Models: The Sharpening Mechanism,” 2024. https://openreview.net/forum?id=WJaUkwci9o.
Kumar, Aviral, Vincent Zhuang, Rishabh Agarwal, Yi Su, John D. Co-Reyes, Avi Singh, Kate Baumli, et al. “Training Language Models to Self-Correct via Reinforcement Learning.” arXiv, October 4, 2024. https://doi.org/10.48550/arXiv.2409.12917.
Peng, Xiangyu, Congying Xia, Xinyi Yang, Caiming Xiong, Chien-Sheng Wu, and Chen Xing. “ReGenesis: LLMs Can Grow into Reasoning Generalists via Self-Improvement,” 2024. https://openreview.net/forum?id=YUYJsHOf3c.
Song, Yuda, Hanlin Zhang, Carson Eisenach, Sham M. Kakade, Dean Foster, and Udaya Ghai. “Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models,” 2024. https://openreview.net/forum?id=mtJSMcF3ek.
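The common loop behind much of this line of work, sketched with a hypothetical text-in/text-out `llm` callable (the prompts are illustrative; papers such as Kumar et al. focus on training the model so that this loop actually improves answers):

```python
def self_correct(llm, prompt: str, rounds: int = 2) -> str:
    """Generate an answer, then repeatedly ask the model to critique
    and revise its own output."""
    answer = llm(prompt)
    for _ in range(rounds):
        critique = llm(f"Question: {prompt}\nAnswer: {answer}\n"
                       "List any errors in the answer.")
        answer = llm(f"Question: {prompt}\nAnswer: {answer}\n"
                     f"Critique: {critique}\nWrite an improved answer.")
    return answer
```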
Prompting & RAG
Retrieval-augmented generation (RAG) is an important technique for injecting domain-specific knowledge into LLMs. Below are a few works on this topic:
Song, Maojia, Shang Hong Sim, Rishabh Bhardwaj, Hai Leong Chieu, Navonil Majumder, and Soujanya Poria. “Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse,” 2024. https://openreview.net/forum?id=Iyrtb9EJBp.
Yue, Zhenrui, Honglei Zhuang, Aijun Bai, Kai Hui, Rolf Jagerman, Hansi Zeng, Zhen Qin, Dong Wang, Xuanhui Wang, and Michael Bendersky. “Inference Scaling for Long-Context Retrieval Augmented Generation.” arXiv, March 2, 2025. https://doi.org/10.48550/arXiv.2410.04343.
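A minimal retrieve-then-generate loop for context (`llm` and `embed` are hypothetical callables; real systems add chunking, reranking, and refusal mechanisms such as those studied above):

```python
import numpy as np

def rag_answer(llm, embed, corpus, question, k=3):
    """Embed the corpus and the question, pick the k most similar
    passages by cosine similarity, and condition generation on them."""
    docs = np.stack([embed(d) for d in corpus])
    q = embed(question)
    sims = docs @ q / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q))
    top = np.argsort(sims)[-k:][::-1]
    context = "\n".join(corpus[i] for i in top)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```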
Agents
LLM-based agent systems that can perform complex real-world tasks:
Zhang, Jiayi, Jinyu Xiang, Zhaoyang Yu, Fengwei Teng, Xiong-Hui Chen, Jiaqi Chen, Mingchen Zhuge, et al. “AFlow: Automating Agentic Workflow Generation,” 2024. https://openreview.net/forum?id=z5uVAKwmjf.
Shojaee, Parshin, Kazem Meidani, Shashank Gupta, Amir Barati Farimani, and Chandan K. Reddy. “LLM-SR: Scientific Equation Discovery via Programming with Large Language Models,” 2024. https://openreview.net/forum?id=m2nmp8P5in.
Analysis
A few works analyzing the mechanisms and behaviors of LLMs:
Kran, Esben, Hieu Minh Nguyen, Akash Kundu, Sami Jawhar, Jinsuk Park, and Mateusz Maria Jurewicz. “DarkBench: Benchmarking Dark Patterns in Large Language Models,” 2024. https://openreview.net/forum?id=odjMSBSWRt.
Ren, Yi, and Danica J. Sutherland. “Learning Dynamics of LLM Finetuning,” 2024. https://openreview.net/forum?id=tPNHOoZFl9.
Snell, Charlie Victor, Jaehoon Lee, Kelvin Xu, and Aviral Kumar. “Scaling LLM Test-Time Compute Optimally Can Be More Effective than Scaling Parameters for Reasoning,” 2024. https://openreview.net/forum?id=4FWAwZtd2n.
Weng, Zhiyuan, Guikun Chen, and Wenguan Wang. “Do as We Do, Not as You Think: The Conformity of Large Language Models,” 2024. https://openreview.net/forum?id=st77ShxP1K.
Zhao, Siyan, Mingyi Hong, Yang Liu, Devamanyu Hazarika, and Kaixiang Lin. “Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs,” 2024. https://openreview.net/forum?id=QWunLKbBGF.
Deep Learning Frameworks
Meta Learning
A framework that claims to achieve adaptive model complexity for different tasks:
Mathur, Mrinal, Barak A. Pearlmutter, and Sergey M. Plis. “MIND over Body: Adaptive Thinking Using Dynamic Computation,” 2024. https://openreview.net/forum?id=EjJGND0m1x.
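A generic dynamic-computation sketch, for intuition only: this shows the broad idea of adapting compute to input difficulty, not the paper's specific MIND mechanism.

```python
def adaptive_depth_forward(layers, x, halt_score, threshold=0.9):
    """Apply layers until a learned halting score judges the current
    representation sufficient, so easy inputs use fewer layers."""
    for layer in layers:
        x = layer(x)
        if halt_score(x) > threshold:
            break
    return x
```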
Embedding
A novel distance-preserving embedding method:
Xu, Dehong, Ruiqi Gao, Wenhao Zhang, Xue-Xin Wei, and Ying Nian Wu. “On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding,” 2024. https://openreview.net/forum?id=Xo0Q1N7CGk.
Sequential Model
A few new Transformer or attention frameworks:
Lai, Xunhao, Jianqiao Lu, Yao Luo, Yiyuan Ma, and Xun Zhou. “FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference,” 2024. https://openreview.net/forum?id=OfjIlbelrT.
Ye, Tianzhu, Li Dong, Yuqing Xia, Yutao Sun, Yi Zhu, Gao Huang, and Furu Wei. “Differential Transformer,” 2024. https://openreview.net/forum?id=OvoCm1gGhN.
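The core operation of the Differential Transformer, sketched for a single head (λ is a learned scalar in the paper; shapes here are assumed):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def differential_attention(q1, k1, q2, k2, v, lam=0.5):
    """Subtract two softmax attention maps so that common-mode noise
    (attention mass on irrelevant context) cancels out."""
    d = q1.shape[-1]
    a1 = softmax(q1 @ k1.T / np.sqrt(d))
    a2 = softmax(q2 @ k2.T / np.sqrt(d))
    return (a1 - lam * a2) @ v
```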
Some works on state space models:
Park, Byoungwoo, Hyungi Lee, and Juho Lee. “Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series,” 2024. https://openreview.net/forum?id=8zJRon6k5v.
Rusch, T. Konstantin, and Daniela Rus. “Oscillatory State-Space Models,” 2024. https://openreview.net/forum?id=GRMfXcAAFh.
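Both papers build on the discrete linear state-space recurrence below, shown as a plain sequential scan; their actual contributions (oscillatory parameterizations, amortized control) are not reproduced here.

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Run x_t = A x_{t-1} + B u_t, y_t = C x_t over a scalar input
    sequence u; A is (n, n), B and C are length-n vectors."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B * u_t
        ys.append(C @ x)
    return np.array(ys)
```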
An interesting sequential model that fuses a diffusion denoiser and a language model into one unified model:
Zhou, Chunting, Lili Yu, Arun Babu, Kushal Tirumala, Michihiro Yasunaga, Leonid Shamis, Jacob Kahn, Xuezhe Ma, Luke Zettlemoyer, and Omer Levy. “Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model,” 2024. https://openreview.net/forum?id=SI2hI0frk6.
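The unifying trick, schematically: one sequence mixes text tokens and image latents, and each position contributes either a next-token cross-entropy term or a diffusion (noise-prediction) term to a single weighted loss. A sketch with assumed per-position losses:

```python
import numpy as np

def mixed_modal_loss(text_ce: np.ndarray, image_mse: np.ndarray,
                     is_text: np.ndarray, lam: float = 5.0) -> float:
    """Combine per-position language-modeling loss on text positions
    with per-position diffusion loss on image-latent positions; lam
    balances the two objectives."""
    return float(text_ce[is_text].mean() + lam * image_mse[~is_text].mean())
```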
Probabilistic Model
Wyrwal, Kacper, Andreas Krause, and Viacheslav Borovitskiy. “Residual Deep Gaussian Processes on Manifolds,” 2024. https://openreview.net/forum?id=JWtrk7mprJ.
Model Training
Not only for LLMs: a general study that could be useful for training parameter-efficient models of any kind:
Kumar, Tanishq, Zachary Ankner, Benjamin Frederick Spector, Blake Bordelon, Niklas Muennighoff, Mansheej Paul, Cengiz Pehlevan, Christopher Re, and Aditi Raghunathan. “Scaling Laws for Precision,” 2024. https://openreview.net/forum?id=wg1PCg3CUP.
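For reference, the Chinchilla-style functional form that precision-aware scaling laws extend. The constants below are the published Chinchilla fits, shown for illustration only; Kumar et al. effectively replace the raw parameter count N with a precision-dependent effective count, whose exact parameterization is in the paper.

```python
def chinchilla_loss(N: float, D: float, A: float = 406.4, B: float = 410.7,
                    E: float = 1.69, alpha: float = 0.34,
                    beta: float = 0.28) -> float:
    """Predicted pretraining loss for N parameters and D training
    tokens under the form L(N, D) = E + A/N**alpha + B/D**beta."""
    return E + A / N**alpha + B / D**beta
```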
AI4Science
A few interdisciplinary works on drug design, material generation, protein modeling, and so on:
Adams, Keir, Kento Abeywardane, Jenna Fromer, and Connor W. Coley. “ShEPhERD: Diffusing Shape, Electrostatics, and Pharmacophores for Bioisosteric Drug Design,” 2024. https://openreview.net/forum?id=KSLkFYHlYg.
Chen, Pin, Zexin Xu, Qing Mo, Hongjin Zhong, Fengyang Xu, and Yutong Lu. “ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials,” 2024. https://openreview.net/forum?id=SBCMNc3Mq3.
Corso, Gabriele, Vignesh Ram Somnath, Noah Getz, Regina Barzilay, Tommi Jaakkola, and Andreas Krause. “Composing Unbalanced Flows for Flexible Docking and Relaxation,” 2024. https://openreview.net/forum?id=gHLWTzKiZV.
Geffner, Tomas, Kieran Didi, Zuobai Zhang, Danny Reidenbach, Zhonglin Cao, Jason Yim, Mario Geiger, et al. “Proteina: Scaling Flow-Based Protein Structure Generative Models,” 2024. https://openreview.net/forum?id=TVQLu34bdw&noteId=ypeoraSUA0.
Gong, Jingjing, Yu Pei, Siyu Long, Yuxuan Song, Zhe Zhang, Wenhao Huang, Ziyao Cao, Shuyi Zhang, Hao Zhou, and Wei-Ying Ma. “Steering Protein Family Design through Profile Bayesian Flow,” 2024. https://openreview.net/forum?id=PSiijdQjNU.
Stark, Hannes, Bowen Jing, Tomas Geffner, Jason Yim, Tommi Jaakkola, Arash Vahdat, and Karsten Kreis. “ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids,” 2024. https://openreview.net/forum?id=0ctvBgKFgc.
Zhang, Chenbin, Zhiqiang Hu, Jiang Chuchu, Wen Chen, Jie Xu, and Shaoting Zhang. “Rethinking the Generalization of Drug Target Affinity Prediction Algorithms via Similarity Aware Evaluation,” 2024. https://openreview.net/forum?id=j7cyANIAxV.