<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>ECCV on Yida Wang</title>
    <link>https://wangyida.github.io/categories/eccv/</link>
    <description>Recent content in ECCV on Yida Wang</description>
    <image>
      <title>Yida Wang</title>
      <url>https://wangyida.github.io/logos/android-chrome-512x512.png</url>
      <link>https://wangyida.github.io/logos/android-chrome-512x512.png</link>
    </image>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Wed, 24 Jun 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://wangyida.github.io/categories/eccv/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>MVGS: Multi-View Regulated Gaussian Splatting for Novel View Synthesis</title>
      <link>https://wangyida.github.io/posts/mvgs/</link>
      <pubDate>Wed, 24 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://wangyida.github.io/posts/mvgs/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Re-direct to the full &lt;a href=&#34;https://arxiv.org/pdf/2410.02103&#34;&gt;&lt;strong&gt;PAPER&lt;/strong&gt;&lt;/a&gt;, &lt;a href=&#34;https://xiaobiaodu.github.io/mvgs/&#34;&gt;&lt;strong&gt;PROJECT PAGE&lt;/strong&gt;&lt;/a&gt;, and &lt;a href=&#34;https://github.com/xiaobiaodu/mvgs&#34;&gt;&lt;strong&gt;CODE&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img alt=&#34;teaser&#34; loading=&#34;lazy&#34; src=&#34;https://wangyida.github.io/posts/mvgs/images/gaussian_correlation_analysis.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;cover teaser&#34; loading=&#34;lazy&#34; src=&#34;https://wangyida.github.io/posts/mvgs/images/teaser-mvgs.png&#34;&gt;&lt;/p&gt;
&lt;h1 id=&#34;abstract&#34;&gt;Abstract&lt;/h1&gt;
&lt;p&gt;Recent works in novel view synthesis, e.g., Neural Radiance Field (NeRF) and 3D Gaussian Splatting (3DGS), have significantly advanced rendering quality and efficiency. However, existing Gaussian-based novel view synthesis methods typically follow a single-view optimization paradigm. We observed that this optimization paradigm suffers from unstable gradients, leading to suboptimal rendering quality. To tackle this issue, we present a novel multi-view regulated Gaussian Splatting (MVGS) that fully leverages a multi-view coherent (MVC) constraint throughout the optimization process. Specifically, our proposed MVC enhances 3D Gaussian multi-view consistency and thus ensures smoother gradient updates. Furthermore, since single-scale training usually leads to suboptimal solutions, we propose a cross-intrinsic guidance scheme in a coarse-to-fine manner to further improve the convergence of multi-view optimization in 3DGS. In particular, by incorporating more multi-view images at the low resolution, we can optimize 3D Gaussians with a more comprehensive perspective. Then, finer-scale Gaussians are initialized by coarsely estimated ones instead of optimizing full-scale 3D Gaussians from scratch. Moreover, we found that 3D Gaussians usually struggle to fit 2D training views with minimal overlap. Thus, we propose a novel multi-view cross-ray densification strategy, where 3D Gaussians are dynamically split to accommodate drastic viewpoint variations in the multi-view optimization process. In this way, the multi-view consistency can be further improved. Notably, our proposed MVGS method is a plug-and-play optimizer. Extensive experiments across various tasks demonstrate that our proposed MVGS improves existing Gaussian-based methods and achieves state-of-the-art performance.&lt;/p&gt;</description>
    </item>
    <item>
      <title>StreetForward: Perceiving Dynamic Street with Feedforward Causal Attention</title>
      <link>https://wangyida.github.io/posts/streetforward/</link>
      <pubDate>Wed, 22 Apr 2026 10:07:05 +0000</pubDate>
      <guid>https://wangyida.github.io/posts/streetforward/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Re-direct to the full &lt;a href=&#34;https://arxiv.org/abs/2603.19552&#34;&gt;&lt;strong&gt;PAPER&lt;/strong&gt;&lt;/a&gt; and &lt;a href=&#34;https://streetforward.github.io/&#34;&gt;&lt;strong&gt;PROJECT PAGE&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We present StreetForward, a pose-free and tracker-free feedforward framework for dynamic street reconstruction. Building upon alternating attention, it introduces a temporal mask attention module that captures dynamic motion from image sequences and produces motion-aware latent representations. Static content and dynamic instances are represented uniformly with 3D Gaussian Splatting and optimized jointly through cross-frame rendering with spatio-temporal consistency, enabling high-fidelity novel-view synthesis at new poses and times while also estimating per-pixel velocities.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Shape Descriptor for Point Cloud Completion and Classification (SoftPoolNet)</title>
      <link>https://wangyida.github.io/posts/softpoolnet/</link>
      <pubDate>Tue, 25 Aug 2020 10:15:01 +0200</pubDate>
      <guid>https://wangyida.github.io/posts/softpoolnet/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Re-direct to the full &lt;a href=&#34;https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123480069.pdf&#34;&gt;&lt;strong&gt;PAPER&lt;/strong&gt;&lt;/a&gt; and &lt;a href=&#34;https://github.com/wangyida/softpool&#34;&gt;&lt;strong&gt;CODE&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
      &lt;iframe allow=&#34;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen&#34; loading=&#34;eager&#34; referrerpolicy=&#34;strict-origin-when-cross-origin&#34; src=&#34;https://www.youtube.com/embed/zw4NlyxWlBg?autoplay=0&amp;amp;controls=1&amp;amp;end=0&amp;amp;loop=0&amp;amp;mute=0&amp;amp;start=0&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; title=&#34;YouTube video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;

&lt;h1 id=&#34;abstract&#34;&gt;Abstract&lt;/h1&gt;
&lt;p&gt;Point clouds are often the default choice for many applications as they exhibit more flexibility and efficiency than volumetric data. Nevertheless, their unorganized nature &amp;ndash; points are stored in an unordered way &amp;ndash; makes them less suited to be processed by deep learning pipelines. In this paper, we propose a method for 3D object completion and classification based on point clouds. We introduce a new way of organizing the extracted features based on their activations, which we name soft pooling. For the decoder stage, we propose regional convolutions, a novel operator aimed at maximizing the global activation entropy. Furthermore, inspired by the local refining procedure in Point Completion Network (PCN), we also propose a patch-deforming operation to simulate deconvolutional operations for point clouds. This paper proves that our regional activation can be incorporated in many point cloud architectures like AtlasNet and PCN, leading to better performance for geometric completion. We evaluate our approach on different 3D tasks such as object completion and classification, achieving state-of-the-art accuracy.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
