-
MaNGA DynPop -- VI. Matter density slopes from dynamical models of 6000 galaxies versus cosmological simulations: the interplay between baryonic and dark matter
Authors:
Shubo Li,
Ran Li,
Kai Zhu,
Shengdong Lu,
Michele Cappellari,
Shude Mao,
Chunxiang Wang,
Liang Gao
Abstract:
We investigate the trends in the mass density slopes as a function of galaxy properties. We use the results from the best Jeans Anisotropic Modelling (JAM) of the integral-field stellar kinematics for nearly 6000 galaxies from the MaNGA DynPop project, with stellar masses of $10^{9-12}\ {\rm M_{\odot}}$, including both early-type and late-type galaxies. We use the mass-weighted density slopes for the stellar $\overline{γ}_*$, dark $\overline{γ}_{\rm DM}$, and total $\overline{γ}_{\rm T}$ mass from the MaNGA DynPop project. The $\overline{γ}_{\rm T}$ approaches a constant value of 2.2 for high $σ_{\rm e}$ galaxies, and flattens for lg$(σ_{\rm e}/{\rm km\ s^{-1}})\lesssim2.3$, reaching 1.5 for lg$(σ_{\rm e}/{\rm km\ s^{-1}})\approx1.8$. The total and stellar slopes track each other tightly, with $\overline{γ}_{\rm T}\approx\overline{γ}_*-0.174$ over the full $σ_{\rm e}$ range. This confirms the dominance of stellar matter within $R_{\rm e}$. We also show that there is no perfect conspiracy between baryonic and dark matter, as $\overline{γ}_*$ and $\overline{γ}_{\rm DM}$ do not vary inversely within the $σ_{\rm e}$ range. We find that the central galaxies from the TNG50 and TNG100 simulations do not reproduce the observed galaxy mass distribution, which we attribute to the overestimated dark matter fraction, possibly due to a constant IMF and excessive adiabatic contraction effects in the simulations. Finally, we present the stacked dark matter density profiles and show that they are slightly steeper than the pure dark matter simulation prediction of $\overline{γ}_{\rm DM}\approx1$, suggesting moderate adiabatic contraction in the central region of galaxies. Our work demonstrates the power of stellar dynamics modelling for probing the interaction between stellar and dark matter and testing galaxy formation theories.
Submitted 1 April, 2024; v1 submitted 20 October, 2023;
originally announced October 2023.
-
HumanTOMATO: Text-aligned Whole-body Motion Generation
Authors:
Shunlin Lu,
Ling-Hao Chen,
Ailing Zeng,
Jing Lin,
Ruimao Zhang,
Lei Zhang,
Heung-Yeung Shum
Abstract:
This work targets a novel text-driven whole-body motion generation task, which takes a given textual description as input and aims at generating high-quality, diverse, and coherent facial expressions, hand gestures, and body motions simultaneously. Previous works on text-driven motion generation tasks mainly have two limitations: they ignore the key role of fine-grained hand and face controlling in vivid whole-body motion generation, and lack a good alignment between text and motion. To address such limitations, we propose a Text-aligned whOle-body Motion generATiOn framework, named HumanTOMATO, which is the first attempt to our knowledge towards applicable holistic motion generation in this research area. To tackle this challenging task, our solution includes two key designs: (1) a Holistic Hierarchical VQ-VAE (aka H$^2$VQ) and a Hierarchical-GPT for fine-grained body and hand motion reconstruction and generation with two structured codebooks; and (2) a pre-trained text-motion-alignment model to help generated motion align with the input textual description explicitly. Comprehensive experiments verify that our model has significant advantages in both the quality of generated motions and their alignment with text.
Submitted 19 October, 2023;
originally announced October 2023.
-
Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly
Authors:
Sheng Lu,
Shan Chen,
Yingya Li,
Danielle Bitterman,
Guergana Savova,
Iryna Gurevych
Abstract:
In-context learning (ICL) is a new learning paradigm that has gained popularity along with the development of large language models. In this work, we adapt a recently proposed hardness metric, pointwise $\mathcal{V}$-usable information (PVI), to an in-context version (in-context PVI). Compared to the original PVI, in-context PVI is more efficient in that it requires only a few exemplars and does not require fine-tuning. We conducted a comprehensive empirical analysis to evaluate the reliability of in-context PVI. Our findings indicate that in-context PVI estimates exhibit similar characteristics to the original PVI. Specific to the in-context setting, we show that in-context PVI estimates remain consistent across different exemplar selections and numbers of shots. The variance of in-context PVI estimates across different exemplar selections is insignificant, which suggests that in-context PVI estimates are stable. Furthermore, we demonstrate how in-context PVI can be employed to identify challenging instances. Our work highlights the potential of in-context PVI and provides new insights into the capabilities of ICL.
Submitted 8 December, 2023; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Live Graph Lab: Towards Open, Dynamic and Real Transaction Graphs with NFT
Authors:
Zhen Zhang,
Bingqiao Luo,
Shengliang Lu,
Bingsheng He
Abstract:
Numerous studies have been conducted to investigate the properties of large-scale temporal graphs. Despite the ubiquity of these graphs in real-world scenarios, it is usually impractical to obtain whole real-time graphs due to privacy concerns and technical limitations. In this paper, we introduce the concept of Live Graph Lab for temporal graphs, which enables open, dynamic and real transaction graphs from blockchains. Among blockchain applications, non-fungible tokens (NFTs) have become one of the most prominent over the past several years. With more than $40 billion market capitalization, this decentralized ecosystem produces massive, anonymous and real transaction activities, which naturally form a complicated transaction network. However, there is limited understanding of the characteristics of this emerging NFT ecosystem from a temporal graph analysis perspective. To bridge this gap, we instantiate a live graph with the NFT transaction network and investigate its dynamics to provide new observations and insights. Specifically, by downloading and parsing the NFT transaction activities, we obtain a temporal graph with more than 4.5 million nodes and 124 million edges. Then, a series of measurements are presented to understand the properties of the NFT ecosystem. Through comparisons with social, citation, and web networks, our analyses give intriguing findings and point out potential directions for future exploration. Finally, we also study machine learning models in this live graph to enrich the current datasets and provide new opportunities for the graph community. The source code and dataset are available at https://livegraphlab.github.io.
Submitted 18 October, 2023; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Kinematical coherence between satellite galaxies and host stellar discs for MaNGA & SAMI galaxies
Authors:
Sen Wang,
Dandan Xu,
Shengdong Lu,
Cheng Li
Abstract:
The effect of angular momentum on galaxy formation and evolution has been studied for several decades. Our two recent papers using the IllustrisTNG-100 simulation have revealed the acquisition path of the angular momentum from the large-scale environment (satellites within hundreds of kpc) through the circum-galactic medium (CGM) to the stellar discs, putting forward the co-rotation scenario across the three distance scales. In real observations, although the rotation signature of the CGM and the environmental three-dimensional (3d) angular momentum are difficult to obtain, line-of-sight kinematics of group member galaxies and stellar disc kinematics of central galaxies are available from existing group catalogue data and integral field unit (IFU) data. In this paper, we use (1) the group catalogue of SDSS DR7 and MaNGA IFU stellar kinematic maps and (2) the group catalogue of GAMA DR4 data and SAMI IFU stellar kinematic maps, to test whether the prediction above can be seen in real data. We find that the co-rotation pattern between stellar discs and satellites is detected at the 99.7 percent confidence level ($\sim 3σ$) when combining the two datasets. Random tests show that the signal is unlikely to be drawn from a random distribution.
Submitted 29 November, 2023; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Type-aware Decoding via Explicitly Aggregating Event Information for Document-level Event Extraction
Authors:
Gang Zhao,
Yidong Shi,
Shudong Lu,
Xinjie Yang,
Guanting Dong,
Jian Xu,
Xiaocheng Gong,
Si Li
Abstract:
Document-level event extraction (DEE) faces two main challenges: arguments-scattering and multi-event. Although previous methods attempt to address these challenges, they overlook the interference of event-unrelated sentences during event detection and neglect the mutual interference of different event roles during argument extraction. Therefore, this paper proposes a novel Schema-based Explicitly Aggregating (SEA) model to address these limitations. SEA aggregates event information into event type and role representations, enabling the decoding of event records based on specific type-aware representations. By detecting each event based on its event type representation, SEA mitigates the interference caused by event-unrelated information. Furthermore, SEA extracts arguments for each role based on its role-aware representations, reducing mutual interference between different roles. Experimental results on the ChFinAnn and DuEE-fin datasets show that SEA outperforms the SOTA methods.
Submitted 16 October, 2023;
originally announced October 2023.
-
DemoSG: Demonstration-enhanced Schema-guided Generation for Low-resource Event Extraction
Authors:
Gang Zhao,
Xiaocheng Gong,
Xinjie Yang,
Guanting Dong,
Shudong Lu,
Si Li
Abstract:
Most current Event Extraction (EE) methods focus on the high-resource scenario, which requires a large amount of annotated data and can hardly be applied to low-resource domains. To address EE more effectively with limited resources, we propose the Demonstration-enhanced Schema-guided Generation (DemoSG) model, which benefits low-resource EE from two aspects: Firstly, we propose the demonstration-based learning paradigm for EE to fully use the annotated data, which transforms them into demonstrations to illustrate the extraction process and help the model learn effectively. Secondly, we formulate EE as a natural language generation task guided by schema-based prompts, thereby leveraging label semantics and promoting knowledge transfer in low-resource scenarios. We conduct extensive experiments under in-domain and domain adaptation low-resource settings on three datasets, and study the robustness of DemoSG. The results show that DemoSG significantly outperforms current methods in low-resource scenarios.
Submitted 16 October, 2023;
originally announced October 2023.
-
Privacy-Preserved Aggregate Thermal Dynamic Model of Buildings
Authors:
Zeyin Hou,
Shuai Lu,
Yijun Xu,
Haifeng Qiu,
Wei Gu,
Zhaoyang Dong,
Shixing Ding
Abstract:
The thermal inertia of buildings brings considerable flexibility to the heating and cooling load, which is known to be a promising demand response resource. The aggregate model that describes the thermal dynamics of the building cluster is an important interface for energy systems to exploit its intrinsic thermal inertia. However, the private information of users, such as the indoor temperature and heating/cooling power, needs to be collected in the parameter estimation procedure to obtain the aggregate model, causing severe privacy concerns. In light of this, we propose, for the first time, a novel privacy-preserved parameter estimation approach to infer the aggregate model for the thermal dynamics of the building cluster. Using it, the parameters of the aggregate thermal dynamic model (ATDM) can be obtained by the load aggregator without accessing individuals' private information. More specifically, this method not only exploits the block coordinate descent (BCD) method to resolve the non-convexity in the estimation but also investigates transformation-based encryption (TE) together with secure aggregation protocol (SAP) techniques to realize privacy-preserved computation. Its capability of preserving privacy is also theoretically proven. Finally, simulation results using real-world data demonstrate the accuracy and privacy-preserved performance of our proposed method.
Submitted 12 October, 2023;
originally announced October 2023.
-
CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving
Authors:
Yuhan Liu,
Hanchen Li,
Yihua Cheng,
Siddhant Ray,
Yuyang Huang,
Qizheng Zhang,
Kuntai Du,
Jiayi Yao,
Shan Lu,
Ganesh Ananthanarayanan,
Michael Maire,
Henry Hoffmann,
Ari Holtzman,
Junchen Jiang
Abstract:
As large language models (LLMs) take on complex tasks, their inputs are supplemented with longer contexts that incorporate domain knowledge or user-specific information. Yet using long contexts poses a challenge for responsive LLM systems, as nothing can be generated until the whole context is processed by the LLM.
CacheGen is a fast context-loading module for LLM systems. First, CacheGen uses a custom tensor encoder, which embraces KV cache's distributional properties, to encode a KV cache into more compact bitstream representations with negligible encoding/decoding overhead. This reduces the bandwidth demand to fetch the KV cache. Second, to maintain low context-loading delay and high generation quality, CacheGen adapts the streaming strategies to cope with changes in available bandwidth. When available bandwidth drops, CacheGen may raise the compression level for a part of the context or choose to recompute its KV cache on the fly. We test CacheGen on four popular LLMs of various sizes and four datasets (662 contexts in total). Compared to the recent systems that reuse the KV cache, CacheGen reduces the KV cache size by 3.5-4.3x and the total delay in fetching and processing contexts by 3.2-3.7x while having negligible impact on the LLM response quality in accuracy or perplexity.
Submitted 30 April, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Finiteness of pointed maps to moduli spaces of polarized varieties
Authors:
Ariyan Javanpeykar,
Steven Lu,
Ruiran Sun,
Kang Zuo
Abstract:
We prove a finiteness result for pointed maps to the base space of a family of polarized varieties with maximal variation in moduli. A key ingredient is a new criterion for the rigidity of pointed maps.
Submitted 10 October, 2023;
originally announced October 2023.
-
Tunable non-Lifshitz-Kosevich temperature dependence of Shubnikov-de Haas oscillation amplitudes in SmSb
Authors:
Wei Zhang,
C. N. Kuo,
S. T. Kuo,
Chun Wa So,
Jianyu Xie,
Kwing To Lai,
Wing Chi Yu,
C. S. Lue,
Hoi Chun Po,
Swee K. Goh
Abstract:
The Lifshitz-Kosevich (LK) theory is the pillar of magnetic quantum oscillations, which have been extensively applied to characterize a wide range of metallic states. In this study, we focus on the Shubnikov-de Haas (SdH) effect observed in SmSb, a rare-earth monopnictide. We observed a significant departure from the expected LK theory near $T_N=2.4$ K: both a peak-like anomaly and an enhancement in the temperature dependence of quantum oscillation amplitude are seen in SmSb. Moreover, we discovered a remarkable sensitivity of the SdH amplitudes to sample purity. By adjusting the sample purity, we were able to tune the temperature dependence of the $α$ band's SdH amplitudes from a peak-like anomalous behavior to an enhancement. Therefore, SdH oscillations from the $α$ band connect the two well-known non-LK behaviours, controllable through varying the sample purity, paving the way for developing further understanding of the mechanism leading to the anomalous quantum oscillations.
Submitted 10 October, 2023;
originally announced October 2023.
-
CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model
Authors:
Peng Di,
Jianguo Li,
Hang Yu,
Wei Jiang,
Wenting Cai,
Yang Cao,
Chaoyu Chen,
Dajun Chen,
Hongwei Chen,
Liang Chen,
Gang Fan,
Jie Gong,
Zi Gong,
Wen Hu,
Tingting Guo,
Zhichao Lei,
Ting Li,
Zheng Li,
Ming Liang,
Cong Liao,
Bingchang Liu,
Jiachen Liu,
Zhiwei Liu,
Shaojun Lu,
Min Shen,
et al. (13 additional authors not shown)
Abstract:
Code Large Language Models (Code LLMs) have gained significant attention in the industry due to their wide applications in the full lifecycle of software engineering. However, the effectiveness of existing models in understanding non-English inputs for multi-lingual code-related tasks is still far from well studied. This paper introduces CodeFuse-13B, an open-sourced pre-trained code LLM. It is specifically designed for code-related tasks with both English and Chinese prompts and supports over 40 programming languages. CodeFuse achieves its effectiveness by utilizing a high quality pre-training dataset that is carefully filtered by program analyzers and optimized during the training process. Extensive experiments are conducted using real-world usage scenarios, the industry-standard benchmark HumanEval-x, and the specially designed CodeFuseEval for Chinese prompts. To assess the effectiveness of CodeFuse, we actively collected valuable human feedback from the AntGroup's software development process where CodeFuse has been successfully deployed. The results demonstrate that CodeFuse-13B achieves a HumanEval pass@1 score of 37.10%, positioning it as one of the top multi-lingual code LLMs with similar parameter sizes. In practical scenarios, such as code generation, code translation, code comments, and testcase generation, CodeFuse performs better than other models when confronted with Chinese prompts.
Submitted 10 January, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Sequential Tag Recommendation
Authors:
Bing Liu,
Pengyu Xu,
Siping Lu,
Shijing Wang,
Hongjian Sun,
Liping Jing
Abstract:
With the development of Internet technology and the expansion of social networks, online platforms have become an important way for people to obtain information. The introduction of tags facilitates information categorization and retrieval. Meanwhile, the development of tag recommendation systems not only enables users to input tags more efficiently, but also improves the quality of tags. However, current tag recommendation methods only consider the content of the current post and do not take into account the influence of user preferences. Since the user is the main subject of tag recommendation, it is necessary to capture the user's tagging habits. Therefore, this paper proposes a tag recommendation algorithm (MLP4STR) based on the dynamic preferences in the user's behavioral sequence, which models the user's historical post information and historical tag information to capture the user's dynamic interest changes. A pure MLP structure across feature dimensions is used in sequence modeling to model the interaction between tag content and post content and to fully extract the user's interests. Finally, tag recommendation is performed.
Submitted 9 October, 2023;
originally announced October 2023.
-
Automatic and Efficient Customization of Neural Networks for ML Applications
Authors:
Yuhan Liu,
Chengcheng Wan,
Kuntai Du,
Henry Hoffmann,
Junchen Jiang,
Shan Lu,
Michael Maire
Abstract:
ML APIs have greatly relieved application developers of the burden to design and train their own neural network models -- classifying objects in an image can now be as simple as one line of Python code to call an API. However, these APIs offer the same pre-trained models regardless of how their output is used by different applications. This can be suboptimal as not all ML inference errors can cause application failures, and the distinction between inference errors that can or cannot cause failures varies greatly across applications.
To tackle this problem, we first study 77 real-world applications, which collectively use six ML APIs from two providers, to reveal common patterns of how ML API output affects applications' decision processes. Inspired by the findings, we propose ChameleonAPI, an optimization framework for ML APIs, which takes effect without changing the application source code. ChameleonAPI provides application developers with a parser that automatically analyzes the application to produce an abstract of its decision process, which is then used to devise an application-specific loss function that only penalizes API output errors critical to the application. ChameleonAPI uses the loss function to efficiently train a neural network model customized for each application and deploys it to serve API invocations from the respective application via existing interface. Compared to a baseline that selects the best-of-all commercial ML API, we show that ChameleonAPI reduces incorrect application decisions by 43%.
Submitted 7 October, 2023;
originally announced October 2023.
-
Current direction dependent magnetotransport in CuTe
Authors:
Ying Kit Tsui,
C. N. Kuo,
C. E. Hsu,
Wei Zhang,
Wenyan Wang,
Shanmin Wang,
Wing Chi Yu,
H. C. Hsueh,
C. S. Lue,
Swee K. Goh
Abstract:
Despite being a layered, easily-exfoliated compound, copper monotelluride (CuTe) features an unusual quasi-one-dimensional charge density wave below $T_{\rm CDW}\approx335$ K. Within a CuTe layer, the electrical resistivity depends sensitively on the direction of the electrical current. Here, we use magnetotransport to probe the metallic state of CuTe with two distinct in-plane current directions. When the current flows along the $a$-axis ($I//a$), the magnetoresistance exhibits a downward curvature as the magnetic field increases. On the other hand, when the current is along the $b$-axis ($I//b$), the magnetoresistance shows the opposite curvature. Our analysis uncovers a violation of Kohler scaling, but only for $I//a$. Shubnikov-de Haas oscillations are detected at low temperatures. Our results shed light on the nature of the metallic state in CuTe with the development of the charge density wave.
Submitted 4 October, 2023;
originally announced October 2023.
-
AI-Generated Images as Data Source: The Dawn of Synthetic Era
Authors:
Zuhao Yang,
Fangneng Zhan,
Kunhao Liu,
Muyu Xu,
Shijian Lu
Abstract:
The advancement of visual intelligence is intrinsically tethered to the availability of large-scale data. In parallel, generative Artificial Intelligence (AI) has unlocked the potential to create synthetic images that closely resemble real-world photographs. This prompts a compelling inquiry: how much visual intelligence could benefit from the advance of generative AI? This paper explores the innovative concept of harnessing these AI-generated images as new data sources, reshaping traditional modeling paradigms in visual intelligence. In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability, the rapid generation of vast datasets, and the effortless simulation of edge cases. Built on the success of generative AI models, we examine the potential of their generated data in a range of applications, from training machine learning models to simulating scenarios for computational modeling, testing, and validation. We probe the technological foundations that support this groundbreaking use of generative AI, engaging in an in-depth discussion on the ethical, legal, and practical considerations that accompany this transformative paradigm shift. Through an exhaustive survey of current technologies and applications, this paper presents a comprehensive view of the synthetic era in visual intelligence. A project associated with this paper can be found at https://github.com/mwxely/AIGS.
Submitted 23 October, 2023; v1 submitted 3 October, 2023;
originally announced October 2023.
-
EX-Graph: A Pioneering Dataset Bridging Ethereum and X
Authors:
Qian Wang,
Zhen Zhang,
Zemin Liu,
Shengliang Lu,
Bingqiao Luo,
Bingsheng He
Abstract:
While numerous public blockchain datasets are available, their utility is constrained by an exclusive focus on blockchain data. This constraint limits the incorporation of relevant social network data into blockchain analysis, thereby diminishing the breadth and depth of insight that can be derived. To address the above limitation, we introduce EX-Graph, a novel dataset that authentically links Ethereum and X, marking the first and largest dataset of its kind. EX-Graph combines Ethereum transaction records (2 million nodes and 30 million edges) and X following data (1 million nodes and 3 million edges), bonding 30,667 Ethereum addresses with verified X accounts sourced from OpenSea. Detailed statistical analysis on EX-Graph highlights the structural differences between X-matched and non-X-matched Ethereum addresses. Extensive experiments, including Ethereum link prediction, wash-trading Ethereum addresses detection, and X-Ethereum matching link prediction, emphasize the significant role of X data in enhancing Ethereum analysis. EX-Graph is available at https://exgraph.deno.dev/.
Submitted 17 March, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency
Authors:
Baizhou Huang,
Shuai Lu,
Weizhu Chen,
Xiaojun Wan,
Nan Duan
Abstract:
Large language models (LLMs) have exhibited remarkable ability in code generation. However, generating the correct solution in a single attempt still remains a challenge. Prior works utilize verification properties in software engineering to verify and re-rank solutions in a majority voting manner. But the assumption behind them that generated verification properties have better qualities than solutions may not always hold. In this paper, we treat them equally as different perspectives of LLMs' reasoning processes. We propose the Multi-Perspective Self-Consistency (MPSC) framework incorporating both inter- and intra-consistency across outputs from multiple perspectives. Specifically, we prompt LLMs to generate diverse outputs from three perspectives, Solution, Specification and Test case, constructing a 3-partite graph. With two measure functions of consistency, we embed both inter- and intra-consistency information into the graph. The optimal choice of solutions is then determined based on analysis in the graph. MPSC significantly boosts performance of foundation models (ChatGPT in this paper) on various benchmarks, including HumanEval (+15.91%), MBPP (+6.43%) and CodeContests (+9.37%), even surpassing GPT-4.
Submitted 2 July, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs
Authors:
Haonan Chang,
Kowndinya Boyalakuntla,
Shiyang Lu,
Siwei Cai,
Eric Jing,
Shreesh Keskar,
Shijie Geng,
Adeeb Abbas,
Lifeng Zhou,
Kostas Bekris,
Abdeslam Boularias
Abstract:
We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as "pick up a cup on a kitchen table" or "navigate to a sofa on which someone is sitting". In contrast to existing research on 3D scene graphs, OVSG supports free-form text input and open-vocabulary querying. Through a series of comparative experiments using the ScanNet dataset and a self-collected dataset, we demonstrate that our proposed approach significantly surpasses the performance of previous semantic-based localization techniques. Moreover, we highlight the practical application of OVSG in real-world robot navigation and manipulation experiments.
Submitted 27 September, 2023;
originally announced September 2023.
-
Noise-Tolerant Unsupervised Adapter for Vision-Language Models
Authors:
Eman Ali,
Dayan Guan,
Shijian Lu,
Abdulmotaleb Elsaddik
Abstract:
Recent advances in large-scale vision-language models have achieved very impressive performance in various zero-shot image classification tasks. While prior studies have demonstrated significant improvements by introducing few-shot labelled target samples, they still require labelling of target samples, which greatly degrades their scalability while handling various visual recognition tasks. We design NtUA, a Noise-tolerant Unsupervised Adapter that allows learning superior target models with few-shot unlabelled target samples. NtUA works as a key-value cache that formulates visual features and predicted pseudo-labels of the few-shot unlabelled target samples as key-value pairs. It consists of two complementary designs. The first is adaptive cache formation that combats pseudo-label noises by weighting the key-value pairs according to their prediction confidence. The second is pseudo-label rectification, which corrects both pair values (i.e., pseudo-labels) and cache weights by leveraging knowledge distillation from large-scale vision language models. Extensive experiments show that NtUA achieves superior performance consistently across multiple widely adopted benchmarks.
Submitted 26 September, 2023;
originally announced September 2023.
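The confidence-weighted key-value cache described in the NtUA abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function name `cache_logits`, the exponential affinity with sharpness `beta`, and the toy features are all assumptions made for illustration. It only shows the general idea of storing (feature, pseudo-label) pairs and down-weighting entries whose pseudo-labels were predicted with low confidence.

```python
import math

def cache_logits(query, keys, pseudo_labels, confidences, num_classes, beta=5.0):
    """Score a query against a confidence-weighted key-value cache.

    Hypothetical sketch: keys are unit-norm feature vectors, values are
    integer pseudo-labels, and each pair is weighted by its confidence.
    """
    logits = [0.0] * num_classes
    for key, label, conf in zip(keys, pseudo_labels, confidences):
        # Dot product equals cosine similarity when vectors are unit-norm.
        similarity = sum(q * k for q, k in zip(query, key))
        # Assumed affinity function: close keys contribute exponentially more.
        affinity = math.exp(-beta * (1.0 - similarity))
        # Low-confidence pseudo-labels contribute less to the class score.
        logits[label] += conf * affinity
    return logits

# Toy cache: three unlabelled samples with pseudo-labels and confidences.
keys = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
pseudo_labels = [0, 1, 0]
confidences = [0.9, 0.8, 0.2]

logits = cache_logits([1.0, 0.0], keys, pseudo_labels, confidences, num_classes=2)
predicted = max(range(2), key=lambda c: logits[c])
```

In this toy setup the third cache entry has a noisy, low-confidence pseudo-label, so it adds little to the class score compared with the first entry.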
-
Rewrite Caption Semantics: Bridging Semantic Gaps for Language-Supervised Semantic Segmentation
Authors:
Yun Xing,
Jian Kang,
Aoran Xiao,
Jiahao Nie,
Ling Shao,
Shijian Lu
Abstract:
Vision-Language Pre-training has demonstrated its remarkable zero-shot recognition ability and potential to learn generalizable visual representations from language supervision. Taking a step ahead, language-supervised semantic segmentation enables spatial localization of textual inputs by learning pixel grouping solely from image-text pairs. Nevertheless, the state-of-the-art suffers from clear semantic gaps between the visual and textual modalities: many visual concepts that appear in images are missing from their paired captions. Such semantic misalignment circulates in pre-training, leading to inferior zero-shot performance in dense predictions due to insufficient visual concepts captured in textual representations. To close this semantic gap, we propose Concept Curation (CoCu), a pipeline that leverages CLIP to compensate for the missing semantics. For each image-text pair, we establish a concept archive that maintains potential visually-matched concepts with our proposed vision-driven expansion and text-to-vision-guided ranking. Relevant concepts can thus be identified via cluster-guided sampling and fed into pre-training, thereby bridging the gap between visual and textual semantics. Extensive experiments over a broad suite of 8 segmentation benchmarks show that CoCu achieves superb zero-shot transfer performance and boosts the language-supervised segmentation baseline by a large margin, suggesting the value of bridging the semantic gap in pre-training data.
Submitted 4 January, 2024; v1 submitted 23 September, 2023;
originally announced September 2023.
-
Principal Stratification with Continuous Post-Treatment Variables: Nonparametric Identification and Semiparametric Estimation
Authors:
Sizhu Lu,
Zhichao Jiang,
Peng Ding
Abstract:
Post-treatment variables often complicate causal inference. They appear in many scientific problems, including noncompliance, truncation by death, mediation, and surrogate endpoint evaluation. Principal stratification is a strategy to address these challenges by adjusting for the potential values of the post-treatment variables, defined as the principal strata. It allows for characterizing treatment effect heterogeneity across principal strata and unveiling the mechanism of the treatment's impact on the outcome related to post-treatment variables. However, the existing literature has primarily focused on binary post-treatment variables, leaving the case with continuous post-treatment variables largely unexplored. This gap persists due to the complexity of infinitely many principal strata, which present challenges to both the identification and estimation of causal effects. We fill this gap by providing nonparametric identification and semiparametric estimation theory for principal stratification with continuous post-treatment variables. We propose to use working models to approximate the underlying causal effect surfaces and derive the efficient influence functions of the corresponding model parameters. Based on the theory, we construct doubly robust estimators and implement them in an R package.
Submitted 3 April, 2024; v1 submitted 21 September, 2023;
originally announced September 2023.
-
MaNGA DynPop -- V. The dark-matter fraction versus stellar velocity dispersion relation and stellar initial mass function variations in galaxies: dynamical models and full spectrum fitting of integral-field spectroscopy
Authors:
Shengdong Lu,
Kai Zhu,
Michele Cappellari,
Ran Li,
Shude Mao,
Dandan Xu
Abstract:
Using the final MaNGA sample of 10K galaxies, we investigate the dark matter fraction $f_{\rm DM}$ within one half-light radius $R_{\rm e}$ for about 6K galaxies with good kinematics spanning a wide range of morphologies and stellar velocity dispersion. We employ two techniques to estimate $f_{\rm DM}$: (i) Jeans Anisotropic Modelling (JAM), which performs dark matter decomposition based on stellar kinematics and (ii) comparing the total dynamical mass-to-light ratios $(M/L)_{\rm JAM}$ and $(M_{\ast}/L)_{\rm SPS}$ from Stellar Population Synthesis (SPS). We find that both methods consistently show a significant trend of increasing $f_{\rm DM}$ with decreasing $σ_{\rm e}$ and low $f_{\rm DM}$ at larger $σ_{\rm e}$. For 235 early-type galaxies with the best models, we explore the variation of stellar initial mass function (IMF) by comparing the stellar mass-to-light ratios from JAM and SPS. We confirm that the stellar mass excess factor $α_{\rm IMF}$ increases with $σ_{\rm e}$, consistent with previous studies that reported a transition from Chabrier-like to Salpeter IMF among galaxies. We show that the $α_{\rm IMF}$ trend cannot be driven by $M_{\ast}/L$ or IMF gradients as it persists when allowing for radial gradients in our model. We find no evidence for the total $M/L$ increasing toward the centre. We detect weak positive correlations between $α_{\rm IMF}$ and age, but no correlations with metallicity. We stack galaxy spectra according to their $α_{\rm IMF}$ to search for differences in IMF-sensitive spectral features (e.g. the $\rm Na_{\rm I}$ doublet). We only find marginal evidence for such differences, which casts doubt on the validity of one or both methods to measure the IMF.
Submitted 23 April, 2024; v1 submitted 21 September, 2023;
originally announced September 2023.
-
SOT-MRAM-Enabled Probabilistic Binary Neural Networks for Noise-Tolerant and Fast Training
Authors:
Puyang Huang,
Yu Gu,
Chenyi Fu,
Jiaqi Lu,
Yiyao Zhu,
Renhe Chen,
Yongqi Hu,
Yi Ding,
Hongchao Zhang,
Shiyang Lu,
Shouzhong Peng,
Weisheng Zhao,
Xufeng Kou
Abstract:
We report the use of spin-orbit torque (SOT) magnetoresistive random-access memory (MRAM) to implement a probabilistic binary neural network (PBNN) for resource-saving applications. The in-plane magnetized SOT (i-SOT) MRAM not only enables field-free magnetization switching with high endurance (> 10^11), but also hosts multiple stable probabilistic states with a low device-to-device variation (< 6.35%). Accordingly, the proposed PBNN outperforms other neural networks by achieving an 18× increase in training speed, while maintaining an accuracy above 97% under the write and read noise perturbations. Furthermore, by applying the binarization process with an additional SOT-MRAM dummy module, we demonstrate an on-chip MNIST inference performance close to the ideal baseline using our SOT-PBNN hardware.
Submitted 20 September, 2023; v1 submitted 14 September, 2023;
originally announced September 2023.
-
Real-time Monitoring for the Next Core-Collapse Supernova in JUNO
Authors:
Angel Abusleme,
Thomas Adam,
Shakeel Ahmad,
Rizwan Ahmed,
Sebastiano Aiello,
Muhammad Akram,
Abid Aleem,
Fengpeng An,
Qi An,
Giuseppe Andronico,
Nikolay Anfimov,
Vito Antonelli,
Tatiana Antoshkina,
Burin Asavapibhop,
João Pedro Athayde Marcondes de André,
Didier Auguste,
Weidong Bai,
Nikita Balashov,
Wander Baldini,
Andrea Barresi,
Davide Basilico,
Eric Baussan,
Marco Bellato,
Marco Beretta,
Antonio Bergnoli
, et al. (606 additional authors not shown)
Abstract:
The core-collapse supernova (CCSN) is considered one of the most energetic astrophysical events in the universe. The early and prompt detection of neutrinos before (pre-SN) and during the supernova (SN) burst presents a unique opportunity for multi-messenger observations of CCSN events. In this study, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton liquid scintillator detector currently under construction in South China. The real-time monitoring system is designed to ensure both prompt alert speed and comprehensive coverage of progenitor stars. It incorporates prompt monitors on the electronic board as well as online monitors at the data acquisition stage. Assuming a false alert rate of 1 per year, this monitoring system exhibits sensitivity to pre-SN neutrinos up to a distance of approximately 1.6 (0.9) kiloparsecs and SN neutrinos up to about 370 (360) kiloparsecs for a progenitor mass of 30 solar masses, considering both normal and inverted mass ordering scenarios. The pointing ability of the CCSN is evaluated by analyzing the accumulated event anisotropy of inverse beta decay interactions from pre-SN or SN neutrinos. This, along with the early alert, can play a crucial role in facilitating follow-up multi-messenger observations of the next galactic or nearby extragalactic CCSN.
Submitted 4 December, 2023; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Style Generation: Image Synthesis based on Coarsely Matched Texts
Authors:
Mengyao Cui,
Zhe Zhu,
Shao-Ping Lu,
Yulu Yang
Abstract:
Previous text-to-image synthesis algorithms typically use explicit textual instructions to generate/manipulate images accurately, but they have difficulty adapting to guidance in the form of coarsely matched texts. In this work, we attempt to stylize an input image using such coarsely matched text as guidance. To tackle this new problem, we introduce a novel task called text-based style generation and propose a two-stage generative adversarial network: the first stage generates the overall image style with a sentence feature, and the second stage refines the generated style with a synthetic feature, which is produced by a multi-modality style synthesis module. We re-filter one existing dataset and collect a new dataset for the task. Extensive experiments and ablation studies are conducted to validate our framework. The practical potential of our work is demonstrated by various applications such as text-image alignment and story visualization. Our datasets are published at https://www.kaggle.com/datasets/mengyaocui/style-generation.
Submitted 8 September, 2023;
originally announced September 2023.
-
Test Primitive: A Straightforward Method To Decouple March
Authors:
Yindong Xiao,
Shanshan Lu,
Ensheng Wang,
Ruiqi Zhu,
Zhijian Dai
Abstract:
The academic community has made outstanding achievements in researching the March algorithm. However, the current fault modeling method, which centers on fault primitives, cannot be directly applied to analyzing the March algorithm. This paper proposes a new test primitive. The test primitives, which decouple the cell states from sensitization and detection operations, describe the common features that must be possessed for the March algorithm to detect corresponding faults, forming a highly flexible and scalable March algorithm analysis unit. The theoretical analysis proves that the test primitives demonstrate completeness, uniqueness, and conciseness. On this foundation, the utilization of test primitives within the March analysis procedure is elucidated.
Submitted 29 August, 2023;
originally announced September 2023.
-
Screening of Pneumonia and Urinary Tract Infection at Triage using TriNet
Authors:
Stephen Z. Lu
Abstract:
Due to the steady rise in population demographics and longevity, emergency department visits are increasing across North America. As more patients visit the emergency department, traditional clinical workflows become overloaded and inefficient, leading to prolonged wait times and reduced healthcare quality. One such workflow is the triage medical directive, which is impeded by limited human workload, inaccurate diagnoses and invasive over-testing. To address this issue, we propose TriNet: a machine learning model for medical directives that automates first-line screening at triage for conditions requiring downstream testing for diagnosis confirmation. To verify screening potential, TriNet was trained on hospital triage data and achieved high positive predictive values in detecting pneumonia (0.86) and urinary tract infection (0.93). These models outperform current clinical benchmarks, indicating that machine-learning medical directives can offer cost-free, non-invasive screening with high specificity for common conditions, reducing the risk of over-testing while increasing emergency department efficiency.
Submitted 5 September, 2023;
originally announced September 2023.
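The positive predictive values quoted in the TriNet abstract above follow the standard definition PPV = TP / (TP + FP). A minimal sketch of that calculation is given below; the function name and the counts are hypothetical and chosen only so the result matches the order of magnitude quoted, not taken from the paper.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of positive screens that are confirmed by downstream testing."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when no case is flagged positive")
    return true_positives / flagged

# Hypothetical counts: 100 patients flagged, 86 later confirmed.
ppv = positive_predictive_value(true_positives=86, false_positives=14)
```

A high PPV means few flagged patients turn out not to have the condition, which is the property that limits unnecessary downstream testing.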
-
Sensing With Random Signals
Authors:
Shihang Lu,
Fan Liu,
Fuwang Dong,
Yifeng Xiong,
Jie Xu,
Ya-Feng Liu
Abstract:
Radar systems typically employ well-designed deterministic signals for target sensing. In contrast to that, integrated sensing and communications (ISAC) systems have to use random signals to convey useful information, potentially causing sensing performance degradation. In this paper, we define a new sensing performance metric, namely, ergodic linear minimum mean square error (ELMMSE), accounting for the randomness of ISAC signals. Then, we investigate a data-dependent precoding scheme to minimize the ELMMSE, which attains the optimized sensing performance at the price of high computational complexity. To reduce the complexity, we present an alternative data-independent precoding scheme and propose a stochastic gradient projection (SGP) algorithm for ELMMSE minimization, which can be trained offline by locally generated signal samples. Finally, we demonstrate the superiority of the proposed methods by simulations.
Submitted 14 January, 2024; v1 submitted 5 September, 2023;
originally announced September 2023.
-
Are Emergent Abilities in Large Language Models just In-Context Learning?
Authors:
Sheng Lu,
Irina Bigoulaeva,
Rachneet Sachdeva,
Harish Tayyar Madabushi,
Iryna Gurevych
Abstract:
Large language models have exhibited emergent abilities, demonstrating exceptional performance across diverse tasks for which they were not explicitly trained, including those that require complex reasoning abilities. The emergence of such abilities carries profound implications for the future direction of research in NLP, especially as the deployment of such models becomes more prevalent. However, one key challenge is that the evaluation of these abilities is often confounded by competencies that arise in models through alternative prompting techniques, such as in-context learning and instruction following, which also emerge as the models are scaled up. In this study, we provide the first comprehensive examination of these emergent abilities while accounting for various potentially biasing factors that can influence the evaluation of models. We conduct rigorous tests on a set of 18 models, encompassing a parameter range from 60 million to 175 billion parameters, across a comprehensive set of 22 tasks. Through an extensive series of over 1,000 experiments, we provide compelling evidence that emergent abilities can primarily be ascribed to in-context learning. We find no evidence for the emergence of reasoning abilities, thus providing valuable insights into the underlying mechanisms driving the observed abilities and thus alleviating safety concerns regarding their use.
Submitted 4 September, 2023;
originally announced September 2023.
-
Identifiability and estimation of the competing risks model under exclusion restrictions
Authors:
Munir Hiabu,
Simon M. S. LU,
Ralf A. Wilke
Abstract:
The non-identifiability of the competing risks model requires researchers to work with restrictions on the model to obtain informative results. We present a new identifiability solution based on an exclusion restriction. Many areas of applied research use methods that rely on exclusion restrictions. It appears natural to also use them for the identifiability of competing risks models. By imposing the exclusion restriction coupled with an Archimedean copula, we are able to avoid any parametric restriction on the marginal distributions. We introduce a semiparametric estimation approach for the nonparametric marginals and the parametric copula. Our simulation results demonstrate the usefulness of the suggested model, as the degree of risk dependence can be estimated without parametric restrictions on the marginal distributions.
Submitted 4 September, 2023;
originally announced September 2023.
-
Correlated and Multi-frequency Diffusion Modeling for Highly Under-sampled MRI Reconstruction
Authors:
Yu Guan,
Chuanming Yu,
Shiyu Lu,
Zhuoxu Cui,
Dong Liang,
Qiegen Liu
Abstract:
Most existing MRI reconstruction methods perform targeted reconstruction of the entire MR image without taking specific tissue regions into consideration. This may fail to emphasize the reconstruction accuracy on important tissues for diagnosis. In this study, leveraging a combination of the properties of k-space data and the diffusion process, our novel scheme focuses on mining the multi-frequency prior with different strategies to preserve fine texture details in the reconstructed image. In addition, a diffusion process can converge more quickly if its target distribution closely resembles the noise distribution in the process. This can be accomplished through various high-frequency prior extractors. The finding further solidifies the effectiveness of the score-based generative model. On top of all the advantages, our method improves the accuracy of MRI reconstruction and accelerates sampling process. Experimental results verify that the proposed method successfully obtains more accurate reconstruction and outperforms state-of-the-art methods.
Submitted 2 September, 2023;
originally announced September 2023.
-
Domain Generalization via Balancing Training Difficulty and Model Capability
Authors:
Xueying Jiang,
Jiaxing Huang,
Sheng Jin,
Shijian Lu
Abstract:
Domain generalization (DG) aims to learn domain-generalizable models from one or multiple source domains that can perform well in unseen target domains. Despite its recent progress, most existing work suffers from the misalignment between the difficulty level of training samples and the capability of contemporarily trained models, leading to over-fitting or under-fitting in the trained generalization model. We design MoDify, a Momentum Difficulty framework that tackles the misalignment by balancing the seesaw between the model's capability and the samples' difficulties along the training process. MoDify consists of two novel designs that collaborate to fight against the misalignment while learning domain-generalizable models. The first is MoDify-based Data Augmentation which exploits an RGB Shuffle technique to generate difficulty-aware training samples on the fly. The second is MoDify-based Network Optimization which dynamically schedules the training samples for balanced and smooth learning with appropriate difficulty. Without bells and whistles, a simple implementation of MoDify achieves superior performance across multiple benchmarks. In addition, MoDify can complement existing methods as a plug-in, and it is generic and can work for different visual recognition tasks.
Submitted 2 September, 2023;
originally announced September 2023.
-
On the Uncertainty Estimates of Equivariant-Neural-Network-Ensembles Interatomic Potentials
Authors:
Shuaihua Lu,
Luca M. Ghiringhelli,
Christian Carbogno,
Jinlan Wang,
Matthias Scheffler
Abstract:
Machine-learning (ML) interatomic potentials (IPs) trained on first-principles datasets are becoming increasingly popular since they promise to treat larger system sizes and longer time scales, compared to the ab initio techniques producing the training data. Estimating the accuracy of MLIPs and reliably detecting when predictions become inaccurate is key for enabling their unfailing usage. In this paper, we explore this aspect for a specific class of MLIPs, the equivariant-neural-network (ENN) IPs using the ensemble technique for quantifying their prediction uncertainties. We critically examine the robustness of uncertainties when the ENN ensemble IP (ENNE-IP) is applied to the realistic and physically relevant scenario of predicting local-minima structures in the configurational space. The ENNE-IP is trained on data for liquid silicon, created by density-functional theory (DFT) with the generalized gradient approximation (GGA) for the exchange-correlation functional. Then, the ensemble-derived uncertainties are compared with the actual errors (comparing the results of the ENNE-IP with those of the underlying DFT-GGA theory) for various test sets, including liquid silicon at different temperatures and out-of-training-domain data such as solid phases with and without point defects as well as surfaces. Our study reveals that the predicted uncertainties are generally overconfident and hold little quantitative predictive power for the actual errors.
Submitted 31 August, 2023;
originally announced September 2023.
-
AI-powered Fraud Detection in Decentralized Finance: A Project Life Cycle Perspective
Authors:
Bingqiao Luo,
Zhen Zhang,
Qian Wang,
Anli Ke,
Shengliang Lu,
Bingsheng He
Abstract:
In recent years, blockchain technology has introduced decentralized finance (DeFi) as an alternative to traditional financial systems. DeFi aims to create a transparent and efficient financial ecosystem using smart contracts and emerging decentralized applications. However, the growing popularity of DeFi has made it a target for fraudulent activities, resulting in losses of billions of dollars due to various types of fraud. To address these issues, researchers have explored the potential of artificial intelligence (AI) approaches to detect such fraudulent activities. Yet, there is no systematic survey that organizes and summarizes existing works and identifies future research opportunities. In this survey, we provide a systematic taxonomy of various frauds in the DeFi ecosystem, categorized by the different stages of a DeFi project's life cycle: project development, introduction, growth, maturity, and decline. This taxonomy is based on our finding that many frauds correlate strongly with the stage of the DeFi project. According to the taxonomy, we review existing AI-powered detection methods, including statistical modeling, natural language processing, and other machine learning techniques. We find that fraud detection in different stages employs distinct types of methods and observe the commendable performance of tree-based and graph-related models in tackling fraud detection tasks. By analyzing the challenges and trends, we present findings that offer proactive suggestions and guide future research in DeFi fraud detection. We believe that this survey can support researchers, practitioners, and regulators in establishing a secure and trustworthy DeFi ecosystem.
Submitted 13 March, 2024; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Stage-by-stage Wavelet Optimization Refinement Diffusion Model for Sparse-View CT Reconstruction
Authors:
Kai Xu,
Shiyu Lu,
Bin Huang,
Weiwen Wu,
Qiegen Liu
Abstract:
Diffusion models have emerged as potential tools to tackle the challenge of sparse-view CT reconstruction, displaying superior performance compared to conventional methods. Nevertheless, these prevailing diffusion models predominantly focus on the sinogram or image domains, which can lead to instability during model training, potentially culminating in convergence towards local minimal solutions. The wavelet transform serves to disentangle image contents and features into distinct frequency-component bands at varying scales, adeptly capturing diverse directional structures. Employing the wavelet transform as a guiding sparsity prior significantly enhances the robustness of diffusion models. In this study, we present an innovative approach named the Stage-by-stage Wavelet Optimization Refinement Diffusion (SWORD) model for sparse-view CT reconstruction. Specifically, we establish a unified mathematical model integrating low-frequency and high-frequency generative models, achieving the solution with an optimization procedure. Furthermore, we perform the low-frequency and high-frequency generative models on the wavelet-decomposed components rather than the sinogram or image domains, ensuring the stability of model training. Our method is rooted in established optimization theory and comprises three distinct stages: low-frequency generation, high-frequency refinement, and domain transform. Our experimental results demonstrate that the proposed method outperforms existing state-of-the-art methods both quantitatively and qualitatively.
Submitted 3 September, 2023; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Federated Neuro-Symbolic Learning
Authors:
Pengwei Xing,
Songtao Lu,
Han Yu
Abstract:
Neuro-symbolic learning (NSL) models complex symbolic rule patterns into latent variable distributions by neural networks, which reduces rule search space and generates unseen rules to improve downstream task performance. Centralized NSL learning involves directly acquiring data from downstream tasks, which is not feasible for federated learning (FL). To address this limitation, we shift the focus from such a one-to-one interactive neuro-symbolic paradigm to one-to-many Federated Neuro-Symbolic Learning framework (FedNSL) with latent variables as the FL communication medium. Built on the basis of our novel reformulation of the NSL theory, FedNSL is capable of identifying and addressing rule distribution heterogeneity through a simple and effective Kullback-Leibler (KL) divergence constraint on rule distribution applicable under the FL setting. It further theoretically adjusts variational expectation maximization (V-EM) to reduce the rule search space across domains. This is the first incorporation of distribution-coupled bilevel optimization into FL. Extensive experiments based on both synthetic and real-world data demonstrate significant advantages of FedNSL compared to five state-of-the-art methods. It outperforms the best baseline by 17% and 29% in terms of unbalanced average training accuracy and unseen average testing accuracy, respectively.
Submitted 27 May, 2024; v1 submitted 29 August, 2023;
originally announced August 2023.
-
Pose-Free Neural Radiance Fields via Implicit Pose Regularization
Authors:
Jiahui Zhang,
Fangneng Zhan,
Yingchen Yu,
Kunhao Liu,
Rongliang Wu,
Xiaoqin Zhang,
Ling Shao,
Shijian Lu
Abstract:
Pose-free neural radiance fields (NeRF) aim to train NeRF with unposed multi-view images and have achieved very impressive success in recent years. Most existing works share the pipeline of first training a coarse pose estimator with rendered images, followed by a joint optimization of estimated poses and the neural radiance field. However, as the pose estimator is trained with only rendered images, the pose estimation is usually biased or inaccurate for real images due to the domain gap between real images and rendered images, leading to poor robustness for the pose estimation of real images and further local minima in joint optimization. We design IR-NeRF, an innovative pose-free NeRF that introduces implicit pose regularization to refine the pose estimator with unposed real images and improve the robustness of the pose estimation for real images. With a collection of 2D images of a specific scene, IR-NeRF constructs a scene codebook that stores scene features and captures the scene-specific pose distribution implicitly as priors. Thus, the robustness of pose estimation can be promoted with the scene priors according to the rationale that a 2D real image can be well reconstructed from the scene codebook only when its estimated pose lies within the pose distribution. Extensive experiments show that IR-NeRF achieves superior novel view synthesis and outperforms the state-of-the-art consistently across multiple synthetic and real datasets.
Submitted 29 August, 2023;
originally announced August 2023.
-
How Can Context Help? Exploring Joint Retrieval of Passage and Personalized Context
Authors:
Hui Wan,
Hongkang Li,
Songtao Lu,
Xiaodong Cui,
Marina Danilevsky
Abstract:
The integration of external personalized context information into document-grounded conversational systems has significant potential business value, but has not been well-studied. Motivated by the concept of personalized context-aware document-grounded conversational systems, we introduce the task of context-aware passage retrieval. We also construct a dataset specifically curated for this purpose. We describe multiple baseline systems to address this task, and propose a novel approach, Personalized Context-Aware Search (PCAS), that effectively harnesses contextual information during passage retrieval. Experimental evaluations conducted on multiple popular dense retrieval systems demonstrate that our proposed approach not only outperforms the baselines in retrieving the most relevant passage but also excels at identifying the pertinent context among all the available contexts. We envision that our contributions will serve as a catalyst for inspiring future research endeavors in this promising direction.
Submitted 26 August, 2023;
originally announced August 2023.
-
Black-box Unsupervised Domain Adaptation with Bi-directional Atkinson-Shiffrin Memory
Authors:
Jingyi Zhang,
Jiaxing Huang,
Xueying Jiang,
Shijian Lu
Abstract:
Black-box unsupervised domain adaptation (UDA) learns with source predictions of target data without accessing either source data or source models during training, and it has clear superiority in data privacy and flexibility in target network selection. However, the source predictions of target data are often noisy and training with them is prone to learning collapses. We propose BiMem, a bi-directional memorization mechanism that learns to remember useful and representative information to correct noisy pseudo labels on the fly, leading to robust black-box UDA that can generalize across different visual recognition tasks. BiMem constructs three types of memory, including sensory memory, short-term memory, and long-term memory, which interact in a bi-directional manner for comprehensive and robust memorization of learnt features. It includes a forward memorization flow that identifies and stores useful features and a backward calibration flow that rectifies features' pseudo labels progressively. Extensive experiments show that BiMem achieves superior domain adaptation performance consistently across various visual recognition tasks such as image classification, semantic segmentation and object detection.
Submitted 25 August, 2023;
originally announced August 2023.
-
OFVL-MS: Once for Visual Localization across Multiple Indoor Scenes
Authors:
Tao Xie,
Kun Dai,
Siyi Lu,
Ke Wang,
Zhiqiang Jiang,
Jinghan Gao,
Dedong Liu,
Jie Xu,
Lijun Zhao,
Ruifeng Li
Abstract:
In this work, we seek to predict camera poses across scenes in a multi-task learning manner, where we view the localization of each scene as a new task. We propose OFVL-MS, a unified framework that dispenses with the traditional practice of training a model for each individual scene and relieves gradient conflict induced by optimizing multiple scenes collectively, enabling efficient storage yet precise visual localization for all scenes. Technically, in the forward pass of OFVL-MS, we design a layer-adaptive sharing policy with a learnable score for each layer to automatically determine whether the layer is shared or not. Such a sharing policy empowers us to acquire task-shared parameters for a reduction of storage cost and task-specific parameters for learning scene-related features to alleviate gradient conflict. In the backward pass of OFVL-MS, we introduce a gradient normalization algorithm that homogenizes the gradient magnitude of the task-shared parameters so that all tasks converge at the same pace. Furthermore, a sparse penalty loss is applied on the learnable scores to facilitate parameter sharing for all tasks without performance degradation. We conduct comprehensive experiments on multiple benchmarks and our newly released indoor dataset LIVL, showing that OFVL-MS families significantly outperform the state of the art with fewer parameters. We also verify that OFVL-MS can generalize to a new scene with far fewer parameters while achieving superior localization performance.
Submitted 23 August, 2023;
originally announced August 2023.
-
Orthogonal Constant-Amplitude Sequence Families for System Parameter Identification in Spectrally Compact OFDM
Authors:
Shih-Hao Lu,
Char-Dir Chung,
Wei-Chang Chen,
**-Feng Tsou
Abstract:
In rectangularly-pulsed orthogonal frequency division multiplexing (OFDM) systems, constant-amplitude (CA) sequences are desirable to construct preamble/pilot waveforms to facilitate system parameter identification (SPI). Orthogonal CA sequences are generally preferred in various SPI applications like random-access channel identification. However, the number of conventional orthogonal CA sequences (e.g., Zadoff-Chu sequences) that can be adopted in cellular communication without causing sequence identification ambiguity is insufficient. Such insufficiency causes heavy performance degradation for SPI requiring a large number of identification sequences. Moreover, rectangularly-pulsed OFDM preamble/pilot waveforms carrying conventional CA sequences suffer from large power spectral sidelobes and thus exhibit low spectral compactness. This paper is thus motivated to develop several order-I CA sequence families which contain more orthogonal CA sequences while endowing the corresponding OFDM preamble/pilot waveforms with fast-decaying spectral sidelobes. Since more orthogonal sequences are provided, the developed order-I CA sequence families can enhance the performance characteristics in SPI requiring a large number of identification sequences over multipath channels exhibiting short-delay channel profiles, while composing spectrally compact OFDM preamble/pilot waveforms.
Submitted 22 August, 2023;
originally announced August 2023.
-
Aggregate Model of District Heating Network for Integrated Energy Dispatch: A Physically Informed Data-Driven Approach
Authors:
Shuai Lu,
Zihang Gao,
Yong Sun,
Suhan Zhang,
Baoju Li,
Chengliang Hao,
Yijun Xu,
Wei Gu
Abstract:
The district heating network (DHN) is essential in enhancing the operational flexibility of integrated energy systems (IES). Yet, it is hard to obtain an accurate and concise DHN model for the operation owing to complicated network features and imperfect measurements. Considering this, this paper proposes a physically informed data-driven aggregate model (AGM) for the DHN, providing a concise description of the source-load relationship of the DHN without exposing network details. First, we derive the analytical relationship between the state variables of the source and load nodes of the DHN, offering a physical foundation for the AGM. Second, we propose a physics-informed estimator for the AGM that is robust to low-quality measurements, in which the physical constraints associated with the parameter normalization and sparsity are embedded to improve the accuracy and robustness. Finally, we propose a physics-enhanced algorithm to solve the nonlinear estimator with non-closed constraints efficiently. Simulation results verify the effectiveness of the proposed method.
Submitted 27 March, 2024; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Surface Second Harmonic Generation from Topological Dirac Semimetal PdTe$_2$
Authors:
Syed Mohammed Faizanuddin,
Ching-Hang Chien,
Yao-Jui Chan,
Si-Tong Liu,
Chia-Nung Kuo,
Chin Shuan Lue,
Yu-Chieh Wen
Abstract:
Recent experiments and calculations in topological semimetals have observed anomalously strong second-order optical nonlinearity, yet whether the enhancement also occurs at surfaces of topological semimetals in general remains an open question. In this work, we tackle this problem by measuring polarization-dependent and rotational-anisotropy optical second harmonic generation (SHG) from centrosymmetric type-II Dirac semimetal PdTe$_2$. We found the SHG to follow C$_{3v}$ surface symmetry with a time-varying intensity dictated by the oxidation kinetics of the material after its surface cleavage, indicating the surface origin of SHG. Quantitative characterization of the surface nonlinear susceptibility indicates a large out-of-plane response of PdTe$_2$ with $|χ_{ccc}^{(2)}|$ up to 25 $\times$ 10$^{-18}$ m$^2$/V. Our results support the topological surfaces/interfaces as a new route toward applications of nonlinear optical effects with released symmetry constraints, and demonstrate SHG as a viable means for in situ study of the kinetics of topological surfaces.
Submitted 17 August, 2023;
originally announced August 2023.
-
Sensing as a Service in 6G Perceptive Mobile Networks: Architecture, Advances, and the Road Ahead
Authors:
Fuwang Dong,
Fan Liu,
Yuanhao Cui,
Shihang Lu,
Yunxin Li
Abstract:
Sensing-as-a-service is anticipated to be the core feature of 6G perceptive mobile networks (PMN), where high-precision real-time sensing will become an inherent capability rather than being an auxiliary function as before. With the proliferation of wireless connected devices, resource allocation (RA) in terms of the users' specific quality-of-service (QoS) requirements plays a pivotal role in enhancing interference management ability and resource utilization efficiency. In this article, we comprehensively introduce the concept of sensing service in PMN, including the types of tasks, the distinctions/advantages compared to conventional networks, and the definitions of sensing QoS. Subsequently, we provide a unified RA framework in sensing-centric PMN and elaborate on the unique challenges. Furthermore, we present a typical case study named "communication-assisted sensing" and evaluate the performance trade-off between sensing and communication procedures. Finally, we shed light on several open problems and opportunities deserving further investigation in the future.
Submitted 8 November, 2023; v1 submitted 16 August, 2023;
originally announced August 2023.
-
Accelerating Generic Graph Neural Networks via Architecture, Compiler, Partition Method Co-Design
Authors:
Shuwen Lu,
Zhihui Zhang,
Cong Guo,
Jingwen Leng,
Yangjie Zhou,
Minyi Guo
Abstract:
Graph neural networks (GNNs) have shown significant accuracy improvements in a variety of graph learning domains, sparking considerable research interest. To translate these accuracy improvements into practical applications, it is essential to develop high-performance and efficient hardware acceleration for GNN models. However, designing GNN accelerators faces two fundamental challenges: the high bandwidth requirement of GNN models and the diversity of GNN models. Previous works have addressed the first challenge by using more expensive memory interfaces to achieve higher bandwidth. For the second challenge, existing works either support specific GNN models or have generic designs with poor hardware utilization.
In this work, we tackle both challenges simultaneously. First, we identify a new type of partition-level operator fusion, which we utilize to internally reduce the high bandwidth requirement of GNNs. Next, we introduce partition-level multi-threading to schedule the concurrent processing of graph partitions, utilizing different hardware resources. To further reduce the extra on-chip memory required by multi-threading, we propose fine-grained graph partitioning to generate denser graph partitions. Importantly, these three methods make no assumptions about the targeted GNN models, addressing the challenge of model variety. We implement these methods in a framework called SwitchBlade, consisting of a compiler, a graph partitioner, and a hardware accelerator. Our evaluation demonstrates that SwitchBlade achieves an average speedup of $1.85\times$ and energy savings of $19.03\times$ compared to the NVIDIA V100 GPU. Additionally, SwitchBlade delivers performance comparable to state-of-the-art specialized accelerators.
Submitted 16 August, 2023;
originally announced August 2023.
-
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification
Authors:
Aojun Zhou,
Ke Wang,
Zimu Lu,
Weikang Shi,
Sichun Luo,
Zipeng Qin,
Shaoqing Lu,
Anya Jia,
Linqi Song,
Mingjie Zhan,
Hongsheng Li
Abstract:
Recent progress in large language models (LLMs) like GPT-4 and PaLM-2 has brought significant advancements in addressing math reasoning problems. In particular, OpenAI's latest version of GPT-4, known as GPT-4 Code Interpreter, shows remarkable performance on challenging math datasets. In this paper, we explore the effect of code on enhancing LLMs' reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. We found that its success can be largely attributed to its powerful skills in generating and executing code, evaluating the output of code execution, and rectifying its solution when receiving unreasonable outputs. Based on this insight, we propose a novel and effective prompting method, explicit code-based self-verification (CSV), to further boost the mathematical reasoning potential of GPT-4 Code Interpreter. This method employs a zero-shot prompt on GPT-4 Code Interpreter to encourage it to use code to self-verify its answers. In instances where the verification state registers as "False", the model shall automatically amend its solution, analogous to our approach of rectifying errors during a mathematics examination. Furthermore, we recognize that the states of the verification result indicate the confidence of a solution, which can improve the effectiveness of majority voting. With GPT-4 Code Interpreter and CSV, we achieve an impressive zero-shot accuracy on the MATH dataset (53.9% $\to$ 84.3%).
Submitted 15 August, 2023;
originally announced August 2023.
-
4DRVO-Net: Deep 4D Radar-Visual Odometry Using Multi-Modal and Multi-Scale Adaptive Fusion
Authors:
Guirong Zhuo,
Shouyi Lu,
Huanyu Zhou,
Lianqing Zheng,
Lu Xiong
Abstract:
Four-dimensional (4D) radar--visual odometry (4DRVO) integrates complementary information from 4D radar and cameras, making it an attractive solution for achieving accurate and robust pose estimation. However, 4DRVO may exhibit significant tracking errors owing to three main factors: 1) sparsity of 4D radar point clouds; 2) inaccurate data association and insufficient feature interaction between the 4D radar and camera; and 3) disturbances caused by dynamic objects in the environment, affecting odometry estimation. In this paper, we present 4DRVO-Net, which is a method for 4D radar--visual odometry. This method leverages the feature pyramid, pose warping, and cost volume (PWC) network architecture to progressively estimate and refine poses. Specifically, we propose a multi-scale feature extraction network called Radar-PointNet++ that fully considers rich 4D radar point information, enabling fine-grained learning for sparse 4D radar point clouds. To effectively integrate the two modalities, we design an adaptive 4D radar--camera fusion module (A-RCFM) that automatically selects image features based on 4D radar point features, facilitating multi-scale cross-modal feature interaction and adaptive multi-modal feature fusion. In addition, we introduce a velocity-guided point-confidence estimation module to measure local motion patterns, reduce the influence of dynamic objects and outliers, and provide continuous updates during pose refinement. We demonstrate the excellent performance of our method and the effectiveness of each module design on both the VoD and in-house datasets. Our method outperforms all learning-based and geometry-based methods for most sequences in the VoD dataset. Furthermore, it has exhibited promising performance that closely approaches that of the 64-line LiDAR odometry results of A-LOAM without mapping optimization.
Submitted 12 August, 2023;
originally announced August 2023.
-
Controlling Photon Transverse Orbital Angular Momentum in High Harmonic Generation
Authors:
Yiqi Fang,
Shengyue Lu,
Yunquan Liu
Abstract:
High harmonic generation (HHG) with longitudinal optical orbital angular momentum has attracted much attention over the past decade. Here, we present the first study on the HHG with transverse orbital angular momentum driven by the spatiotemporal optical vortex (STOV) pulses. We show that the produced spatial resolved harmonic spectra reveal unique structures, such as the spatially spectral tilt and the fine interference patterns. We show these spatio-spectral structures originate from both the macroscopic and microscopic effect of spatiotemporal optical singularity in HHG. Employing two-color counter-spin and counter-vorticity STOV pulses, we further discuss a robust method to control the spatiotemporal topological charge and spectral structure of high-order harmonics. The conservation rule of photon transverse orbital angular momentum in HHG process is also discussed when mixing with photon spin angular momenta.
Submitted 10 August, 2023;
originally announced August 2023.
-
Quantum-inspired Hash Function Based on Parity-dependent Quantum Walks with Memory
Authors:
Qing Zhou,
Xueming Tang,
Songfeng Lu,
Hao Yang
Abstract:
In this paper, we develop a generic controlled alternate quantum walk model (called CQWM-P) by combining parity-dependent quantum walks with distinct arbitrary memory lengths and then construct a quantum-inspired hash function (called QHFM-P) based on this model. Numerical simulation shows that QHFM-P has near-ideal statistical performance and is on a par with the state-of-the-art hash functions based on discrete quantum walks in terms of sensitivity of hash value to message, diffusion and confusion properties, uniform distribution property, and collision resistance property. Stability test illustrates that the statistical properties of the proposed hash function are robust with respect to the coin parameters, and theoretical analysis indicates that QHFM-P has the same computational complexity as that of its peers.
Submitted 10 August, 2023;
originally announced August 2023.