arXiv:2403.19924 [pdf, other]

SceneTracker: Long-term Scene Flow Estimation Network

Authors: Bo Wang, Jian Li, Yang Yu, Li Liu, Zhen** Sun, Dewen Hu

Abstract: Considering the complementarity of scene flow estimation in the spatial domain's focusing capability and 3D object tracking in the temporal domain's coherence, this study aims to address a comprehensive new task that can simultaneously capture fine-grained and long-term 3D motion in an online manner: long-term scene flow estimation (LSFE). We introduce SceneTracker, a novel learning-based LSFE network that adopts an iterative approach to approximate the optimal trajectory. Besides, it dynamically indexes and constructs appearance and depth correlation features simultaneously and employs the Transformer to explore and utilize long-range connections within and between trajectories. With detailed experiments, SceneTracker shows superior capabilities in handling 3D spatial occlusion and depth noise interference, highly tailored to the LSFE task's needs. Finally, we build the first real-world evaluation dataset, LSFDriving, further substantiating SceneTracker's commendable generalization capacity. The code and data for SceneTracker is available at https://github.com/wwsource/SceneTracker.

Submitted 6 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.19830 [pdf, other]

Emerging Jordan blocks in the two-dimensional Potts and loop models at generic $Q$

Authors: Lawrence Liu, Jesper Lykke Jacobsen, Hubert Saleur

Abstract: It was recently suggested -- based on general self-consistency arguments as well as results from the bootstrap (arXiv:2005.07708, arXiv:2007.11539, arXiv:2007.04190) -- that the CFT describing the $Q$-state Potts model is logarithmic for generic values of $Q$, with rank-two Jordan blocks for $L_0$ and ${\mkern 1.5mu\overline{\mkern-1.5mu L\mkern-1.5mu}\mkern 1.5mu}_0$ in many sectors of the theory. This is despite the well-known fact that the lattice transfer matrix (or Hamiltonian) is diagonalizable in (arbitrary) finite size. While the emergence of Jordan blocks only in the limit $L\to\infty$ is perfectly possible conceptually, diagonalizability in finite size makes the measurement of logarithmic couplings (whose values are analytically predicted in arXiv:2007.11539, arXiv:2007.04190) very challenging. This problem is solved in the present paper (which can be considered a companion to arXiv:2007.11539), and the conjectured logarithmic structure of the CFT confirmed in detail by the study of the lattice model and associated "emerging Jordan blocks."

Submitted 28 March, 2024; originally announced March 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2403.09881

arXiv:2403.19521 [pdf, other]

Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models

Authors: Ang Lv, Yuhan Chen, Kaiyi Zhang, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan

Abstract: In this paper, we delve into several mechanisms employed by Transformer-based language models (LLMs) for factual recall tasks. We outline a pipeline consisting of three major steps: (1) Given a prompt ``The capital of France is,'' task-specific attention heads extract the topic token, such as ``France,'' from the context and pass it to subsequent MLPs. (2) As attention heads' outputs are aggregated with equal weight and added to the residual stream, the subsequent MLP acts as an ``activation,'' which either erases or amplifies the information originating from individual heads. As a result, the topic token ``France'' stands out in the residual stream. (3) A deep MLP takes ``France'' and generates a component that redirects the residual stream towards the direction of the correct answer, i.e., ``Paris.'' This procedure is akin to applying an implicit function such as ``get\_capital($X$),'' and the argument $X$ is the topic token information passed by attention heads. To achieve the above quantitative and qualitative analysis for MLPs, we proposed a novel analytic method aimed at decomposing the outputs of the MLP into components understandable by humans. Additionally, we observed a universal anti-overconfidence mechanism in the final layer of models, which suppresses correct predictions. We mitigate this suppression by leveraging our interpretation to improve factual recall confidence. The above interpretations are evaluated across diverse tasks spanning various domains of factual knowledge, using various language models from the GPT-2 families, 1.3B OPT, up to 7B Llama-2, and in both zero- and few-shot setups.

Submitted 24 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.19256 [pdf, other]

Measurement of absolute branching fractions of $D_s^+$ hadronic decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (632 additional authors not shown)

Abstract: Using $e^+ e^-$ collision data collected at the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of $7.33~{\rm fb}^{-1}$, we determine the absolute branching fractions of fifteen hadronic $D_s^{+}$ decays with a double-tag technique. In particular, we make precise measurements of the branching fractions $\mathcal{B}(D_s^+ \to K^+ K^- π^+)=(5.49 \pm 0.04 \pm 0.07)\%$, $\mathcal{B}(D_s^+ \to K_S^0 K^+)=(1.50 \pm 0.01 \pm 0.01)\%$ and $\mathcal{B}(D_s^+ \to K^+ K^- π^+ π^0)=(5.50 \pm 0.05 \pm 0.11)\%$, where the first uncertainties are statistical and the second ones are systematic. The \emph{CP} asymmetries in these decays are also measured and all are found to be compatible with zero.

Submitted 30 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.19091 [pdf, other]

Observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (600 additional authors not shown)

Abstract: By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fractions are measured to be $\mathcal{B}(D^0\rightarrow {K}_1(1270)^-(\to K^0_Sπ^-π^0)e^+ν_e)=(1.69^{+0.53}_{-0.46}\pm0.15)\times10^{-4}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0(\to K^0_Sπ^+π^-)e^+ν_e)=(1.47^{+0.45}_{-0.40}\pm0.20)\times10^{-4}$ with statistical significance of 5.4$σ$ and 5.6$σ$, respectively. When combined with measurements of the $K_1(1270)\to K^+π^-π$ decays, the absolute branching fractions are determined to be $\mathcal{B}(D^0\to K_1(1270)^-e^+ν_e)=(1.05^{+0.33}_{-0.28}\pm0.12\pm0.12)\times10^{-3}$ and $\mathcal{B}(D^+\to \bar{K}_1(1270)^0e^+ν_e)=(1.29^{+0.40}_{-0.35}\pm0.18\pm0.15)\times10^{-3}$. The first and second uncertainties are statistical and systematic, respectively, and the third uncertainties originate from the assumed branching fractions of the $K_1(1270)\to Kππ$ decays.

Submitted 27 March, 2024; originally announced March 2024.

Comments: 19 pages

arXiv:2403.19010 [pdf, other]

Gaussian Process-based Traversability Analysis for Terrain Mapless Navigation

Authors: Abe Leininger, Mahmoud Ali, Hassan Jardali, Lantao Liu

Abstract: Efficient navigation through uneven terrain remains a challenging endeavor for autonomous robots. We propose a new geometric-based uneven terrain mapless navigation framework combining a Sparse Gaussian Process (SGP) local map with a Rapidly-Exploring Random Tree* (RRT*) planner. Our approach begins with the generation of a high-resolution SGP local map, providing an interpolated representation of the robot's immediate environment. This map captures crucial environmental variations, including height, uncertainties, and slope characteristics. Subsequently, we construct a traversability map based on the SGP representation to guide our planning process. The RRT* planner efficiently generates real-time navigation paths, avoiding untraversable terrain in pursuit of the goal. This combination of SGP-based terrain interpretation and RRT* planning enables ground robots to safely navigate environments with varying elevations and steep obstacles. We evaluate the performance of our proposed approach through robust simulation testing, highlighting its effectiveness in achieving safe and efficient navigation compared to existing methods.

Submitted 27 March, 2024; originally announced March 2024.

Comments: This paper has been accepted for publication at 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

arXiv:2403.18335 [pdf, ps, other]

Arc-transitive maps with coprime Euler characteristic and edge number

Authors: C. H. Li, Lu Yi Liu

Abstract: This is one of a series of papers which aim towards a classification of edge-transitive maps of which the Euler characteristic and the edge number are coprime. This one carries out the classification work for arc-transitive maps with nonsolvable automorphism groups, which illustrates how the edge number impacts on the Euler characteristic for maps. The classification is involved with the construction of some new and interesting arc-regular maps.

Submitted 27 March, 2024; originally announced March 2024.

MSC Class: 05C10; 20B25

arXiv:2403.17931 [pdf, other]

Track Everything Everywhere Fast and Robustly

Authors: Yunzhou Song, Jiahui Lei, Ziyun Wang, Lingjie Liu, Kostas Daniilidis

Abstract: We propose a novel test-time optimization approach for efficiently and robustly tracking any pixel at any time in a video. The latest state-of-the-art optimization-based tracking technique, OmniMotion, requires a prohibitively long optimization time, rendering it impractical for downstream applications. OmniMotion is sensitive to the choice of random seeds, leading to unstable convergence. To improve efficiency and robustness, we introduce a novel invertible deformation network, CaDeX++, which factorizes the function representation into a local spatial-temporal feature grid and enhances the expressivity of the coupling blocks with non-linear functions. While CaDeX++ incorporates a stronger geometric bias within its architectural design, it also takes advantage of the inductive bias provided by the vision foundation models. Our system utilizes monocular depth estimation to represent scene geometry and enhances the objective by incorporating DINOv2 long-term semantics to regulate the optimization process. Our experiments demonstrate a substantial improvement in training speed (more than \textbf{10 times} faster), robustness, and accuracy in tracking over the SoTA optimization-based method OmniMotion.