[Roundup] CVPR25 Papers on Semantic Segmentation [Semantic Segmentation]

是王同学呀

是王同学呀 · Published 2025-06-17 17:48:47

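Roughly 144 papers at CVPR25 touch on semantic segmentation. They are loosely grouped into the categories below. Medical-image papers make up the largest share, and notably there are also quite a few papers on open-vocabulary semantic segmentation this year.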

Exploratory work

Your ViT is Secretly an Image Segmentation Model
Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation
Advancing Manga Analysis: Comprehensive Segmentation Annotations for the Manga109 Dataset
FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation
Scaling up Image Segmentation across Data and Tasks
NightAdapter: Learning a Frequency Adapter for Generalizable Night-time Scene Segmentation
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Universal Domain Adaptation for Semantic Segmentation
The Impact Label Noise and Choice of Threshold has on Cross-Entropy and Soft-Dice in Image Segmentation
UNICL-SAM: Uncertainty-Driven In-Context Segmentation with Part Prototype Discovery

Real-time semantic segmentation

Golden Cudgel Network for Real-Time Semantic Segmentation

High-resolution semantic segmentation

Boosting the Dual-Stream Architecture in Ultra-High Resolution Segmentation with Resolution-Biased Uncertainty Estimation

Few-shot semantic segmentation

Text Augmented Correlation Transformer For Few-shot Classification & Segmentation
DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation
The Devil is in Low-Level Features for Cross-Domain Few-Shot Segmentation
Dual-Agent Optimization framework for Cross-Domain Few-Shot Segmentation

Weakly supervised semantic segmentation

Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation
Prototype-Based Image Prompting for Weakly Supervised Histopathological Image Segmentation
Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion
WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels
FFR: Frequency Feature Rectification for Weakly Supervised Semantic Segmentation
POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation
Multi-Label Prototype Visual Spatial Search for Weakly Supervised Semantic Segmentation
Soft Self-labeling and Potts Relaxations for Weakly-supervised Segmentation

Semi-supervised semantic segmentation

Improving Semi-Supervised Semantic Segmentation with Sliced-Wasserstein Feature Alignment and Uniformity
SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation

Open-vocabulary semantic segmentation

Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation
Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space
Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation
Dual Semantic Guidance for Open Vocabulary Semantic Segmentation
Effective SAM Combination for Open-Vocabulary Semantic Segmentation
Exploring Simple Open-Vocabulary Semantic Segmentation
Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation

Panoptic segmentation

Zero-Shot 4D Lidar Panoptic Segmentation
Scene-Centric Unsupervised Panoptic Segmentation

Continual-learning semantic segmentation

Beyond Background Shift: Rethinking Instance Replay in Continual Semantic Segmentation
Rethinking Query-based Transformer for Continual Image Segmentation
Towards Continual Universal Segmentation

Class-incremental semantic segmentation

CoMBO: Conflict Mitigation via Branched Optimization for Class Incremental Segmentation

Medical image semantic segmentation

Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation
Unified Medical Lesion Segmentation via Self-referring Indicator
Revisiting MAE Pre-training for 3D Medical Image Segmentation
Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation
Rethinking Decoder Design: Improving Biomarker Segmentation Using Depth-to-Space Restoration and Residual Linear Attention
Show and Segment: Universal Medical Image Segmentation via In-Context Learning
A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation
vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation
Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline
Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images
SuperLightNet: Lightweight Parameter Aggregation Network for Multimodal Brain Tumor Segmentation
CSC-PA: Cross-image Semantic Correlation via Prototype Attentions for Single-network Semi-supervised Breast Tumor Segmentation
LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging
Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation
beta-FFT: Nonlinear Interpolation and Differentiated Training Strategies for Semi-Supervised Medical Image Segmentation
EffiDec3D: An Optimized Decoder for High-Performance and Efficient 3D Medical Image Segmentation
KMD: Koopman Multi-modality Decomposition for Generalized Brain Tumor Segmentation under Incomplete Modalities
Annotation Ambiguity Aware Semi-Supervised Medical Image Segmentation
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation
Minding Fuzzy Regions: A Data-driven Alternating Learning Paradigm for Stable Lesion Segmentation
Learning Dynamic Collaborative Network for Semi-supervised 3D Vessel Segmentation
nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark
Boost the Inference with Co-training: A Depth-guided Mutual Learning Framework for Semi-supervised Medical Polyp Segmentation
Incomplete Multi-modal Brain Tumor Segmentation via Learnable Sorting State Space Model

Amodal semantic segmentation

Towards Efficient Foundation Model for Zero-shot Amodal Segmentation
EntityErasure: Erasing Entity Cleanly via Amodal Entity Segmentation and Completion

Part semantic segmentation

Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation
CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models

Remote sensing semantic segmentation

Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation
SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images
ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object

Crack semantic segmentation

SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures

Video semantic segmentation

Semantic and Sequential Alignment for Referring Video Object Segmentation
VidSeg: Training-free Video Semantic Segmentation based on Diffusion Models
ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation
M^3-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight
Exploiting Temporal State Space Sharing for Video Semantic Segmentation
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
DeNVeR: Deformable Neural Vessel Representations for Unsupervised Video Vessel Segmentation
Using Diffusion Priors for Video Amodal Segmentation
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation
LiVOS: Light Video Object Segmentation with Gated Linear Matching
Decoupled Motion Expression Video Segmentation

RGB-X semantic segmentation

DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation
Keep the Balance: A Parameter-Efficient Symmetrical Framework for RGB+X Semantic Segmentation

BEV semantic segmentation

Generative Map Priors for Collaborative BEV Semantic Segmentation

New semantic segmentation tasks

WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation
MaSS13K: A Matting-level Semantic Segmentation Benchmark
SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes
A Dataset for Semantic Segmentation in the Presence of Unknowns

Interactive semantic segmentation

NTClick: Achieving Precise Interactive Segmentation With Noise-tolerant Clicks
Repurposing Stable Diffusion Attention for Training-Free Unsupervised Interactive Segmentation

Referring semantic segmentation

Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation

3D scene semantic segmentation

BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis
3D Dental Model Segmentation with Geometrical Boundary Preserving
Layered Motion Fusion: Lifting Motion Segmentation to 3D in Egocentric Videos
LogoSP: Local-global Grouping of Superpoints for Unsupervised Semantic Segmentation of 3D Point Clouds
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
3D-AVS: LiDAR-based 3D Auto-Vocabulary Segmentation
Generative Hard Example Augmentation for Semantic Point Cloud Segmentation
Hyperbolic Uncertainty-Aware Few-Shot Incremental Point Cloud Segmentation
CamPoint: Boosting Point Cloud Segmentation with Virtual Camera
OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging
D^3CTTA: Domain-Dependent Decorrelation for Continual Test-Time Adaption of 3D LiDAR Segmentation
Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather
PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding
Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting
Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation
Functionality Understanding and Segmentation in 3D Scenes
An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models
COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting
Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving
Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation
Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation
No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation in Adverse Weather
HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving

VLM-based semantic segmentation

HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver
POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation
Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation

Document semantic segmentation

DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning

Audio-visual semantic segmentation

TSAM: Temporal SAM Augmented with Multimodal Prompts for Referring Audio-Visual Segmentation
Revisiting Audio-Visual Segmentation with Vision-Centric Transformer
Dynamic Derivation and Elimination: Audio Visual Segmentation with Enhanced Audio Semantics
Robust Audio-Visual Segmentation via Audio-Guided Visual Convergent Alignment

Instance segmentation

PolarNeXt: Rethink Instance Segmentation with Polar Representation
Sketchy Bounding-box Supervision for 3D Instance Segmentation
Insightful Instance Features for 3D Instance Segmentation
SAM2Object: Consolidating View Consistency via SAM2 for Zero-Shot 3D Instance Segmentation
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety
Minimizing Labeled, Maximizing Unlabeled: An Image-Driven Approach for Video Instance Segmentation
Audio-Visual Instance Segmentation
Foveated Instance Segmentation
