Search | arXiv e-print repository

arXiv:2406.19156 [pdf, other]

Heterogeneous Causal Metapath Graph Neural Network for Gene-Microbe-Disease Association Prediction

Authors: Kexin Zhang, Feng Huang, Luotao Liu, Zhankun Xiong, Hongyu Zhang, Yuan Quan, Wen Zhang

Abstract: The recent focus on microbes in human medicine highlights their potential role in the genetic framework of diseases. To decode the complex interactions among genes, microbes, and diseases, computational predictions of gene-microbe-disease (GMD) associations are crucial. Existing methods primarily address gene-disease and microbe-disease associations, but the more intricate triple-wise GMD associat… ▽ More The recent focus on microbes in human medicine highlights their potential role in the genetic framework of diseases. To decode the complex interactions among genes, microbes, and diseases, computational predictions of gene-microbe-disease (GMD) associations are crucial. Existing methods primarily address gene-disease and microbe-disease associations, but the more intricate triple-wise GMD associations remain less explored. In this paper, we propose a Heterogeneous Causal Metapath Graph Neural Network (HCMGNN) to predict GMD associations. HCMGNN constructs a heterogeneous graph linking genes, microbes, and diseases through their pairwise associations, and utilizes six predefined causal metapaths to extract directed causal subgraphs, which facilitate the multi-view analysis of causal relations among three entity types. Within each subgraph, we employ a causal semantic sharing message passing network for node representation learning, coupled with an attentive fusion method to integrate these representations for predicting GMD associations. Our extensive experiments show that HCMGNN effectively predicts GMD associations and addresses association sparsity issue by enhancing the graph's semantics and structure. △ Less

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.08863 [pdf, other]

Self-supervised Graph Neural Network for Mechanical CAD Retrieval

Authors: Yuhan Quan, Huan Zhao, **feng Yi, Yuqiang Chen

Abstract: CAD (Computer-Aided Design) plays a crucial role in mechanical industry, where large numbers of similar-shaped CAD parts are often created. Efficiently reusing these parts is key to reducing design and production costs for enterprises. Retrieval systems are vital for achieving CAD reuse, but the complex shapes of CAD models are difficult to accurately describe using text or keywords, making tradit… ▽ More CAD (Computer-Aided Design) plays a crucial role in mechanical industry, where large numbers of similar-shaped CAD parts are often created. Efficiently reusing these parts is key to reducing design and production costs for enterprises. Retrieval systems are vital for achieving CAD reuse, but the complex shapes of CAD models are difficult to accurately describe using text or keywords, making traditional retrieval methods ineffective. While existing representation learning approaches have been developed for CAD, manually labeling similar samples in these methods is expensive. Additionally, CAD models' unique parameterized data structure presents challenges for applying existing 3D shape representation learning techniques directly. In this work, we propose GC-CAD, a self-supervised contrastive graph neural network-based method for mechanical CAD retrieval that directly models parameterized CAD raw files. GC-CAD consists of two key modules: structure-aware representation learning and contrastive graph learning framework. The method leverages graph neural networks to extract both geometric and topological information from CAD models, generating feature representations. We then introduce a simple yet effective contrastive graph learning framework approach, enabling the model to train without manual labels and generate retrieval-ready representations. Experimental results on four datasets including human evaluation demonstrate that the proposed method achieves significant accuracy improvements and up to 100 times efficiency improvement over the baseline methods. △ Less

Submitted 17 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

arXiv:2405.19991 [pdf, other]

OpenTM: An Open-source, Single-GPU, Large-scale Thermal Microstructure Design Framework

Authors: Yuchen Quan, Xiaoya Zhai, Xiao-Ming Fu

Abstract: Thermal microstructures are artificially engineered materials designed to manipulate and control heat flow in unconventional ways. This paper presents an educational framework, called \emph{OpenTM}, to use a single GPU for designing periodic 3D high-resolution thermal microstructures to match the predefined thermal conductivity matrices with volume fraction constraints. Specifically, we use adapti… ▽ More Thermal microstructures are artificially engineered materials designed to manipulate and control heat flow in unconventional ways. This paper presents an educational framework, called \emph{OpenTM}, to use a single GPU for designing periodic 3D high-resolution thermal microstructures to match the predefined thermal conductivity matrices with volume fraction constraints. Specifically, we use adaptive volume fraction to make the Optimality Criteria (OC) method run stably to obtain the thermal microstructures without a large memory overhead.Practical examples with a high resolution $128 \times 128 \times 128$ run under 90 seconds per structure on an NVIDIA GeForce GTX 4070Ti GPU with a peak GPU memory of 355 MB. Our open-source, high-performance implementation is publicly accessible at \url{https://github.com/quanyuchen2000/OPENTM}, and it is easy to install using Anaconda. Moreover, we provide a Python interface to make OpenTM well-suited for novices in C/C++. △ Less

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.07938 [pdf, other]

EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning

Authors: Yinzhu Quan, Zefang Liu

Abstract: In this paper, we introduce EconLogicQA, a rigorous benchmark designed to assess the sequential reasoning capabilities of large language models (LLMs) within the intricate realms of economics, business, and supply chain management. Diverging from traditional benchmarks that predict subsequent events individually, EconLogicQA poses a more challenging task: it requires models to discern and sequence… ▽ More In this paper, we introduce EconLogicQA, a rigorous benchmark designed to assess the sequential reasoning capabilities of large language models (LLMs) within the intricate realms of economics, business, and supply chain management. Diverging from traditional benchmarks that predict subsequent events individually, EconLogicQA poses a more challenging task: it requires models to discern and sequence multiple interconnected events, capturing the complexity of economic logics. EconLogicQA comprises an array of multi-event scenarios derived from economic articles, which necessitate an insightful understanding of both temporal and logical event relationships. Through comprehensive evaluations, we exhibit that EconLogicQA effectively gauges a LLM's proficiency in navigating the sequential complexities inherent in economic contexts. We provide a detailed description of EconLogicQA dataset and shows the outcomes from evaluating the benchmark across various leading-edge LLMs, thereby offering a thorough perspective on their sequential reasoning potential in economic contexts. Our benchmark dataset is available at https://huggingface.co/datasets/yinzhu-quan/econ_logic_qa. △ Less

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2404.17227 [pdf, other]

Trust Dynamics and Market Behavior in Cryptocurrency: A Comparative Study of Centralized and Decentralized Exchanges

Authors: Xintong Wu, Wanling Deng, Yuotng Quan, Luyao Zhang

Abstract: In the evolving landscape of digital finance, the transition from centralized to decentralized trust mechanisms, primarily driven by blockchain technology, plays a critical role in sha** the cryptocurrency ecosystem. This paradigm shift raises questions about the traditional reliance on centralized trust and introduces a novel, decentralized trust framework built upon distributed networks. Our r… ▽ More In the evolving landscape of digital finance, the transition from centralized to decentralized trust mechanisms, primarily driven by blockchain technology, plays a critical role in sha** the cryptocurrency ecosystem. This paradigm shift raises questions about the traditional reliance on centralized trust and introduces a novel, decentralized trust framework built upon distributed networks. Our research delves into the consequences of this shift, particularly focusing on how incidents influence trust within cryptocurrency markets, thereby affecting trade behaviors in centralized (CEXs) and decentralized exchanges (DEXs). We conduct a comprehensive analysis of various events, assessing their effects on market dynamics, including token valuation and trading volumes in both CEXs and DEXs. Our findings highlight the pivotal role of trust in directing user preferences and the fluidity of trust transfer between centralized and decentralized platforms. Despite certain anomalies, the results largely align with our initial hypotheses, revealing the intricate nature of user trust in cryptocurrency markets. This study contributes significantly to interdisciplinary research, bridging distributed systems, behavioral finance, and Decentralized Finance (DeFi). It offers valuable insights for the distributed computing community, particularly in understanding and applying distributed trust mechanisms in digital economies, paving the way for future research that could further explore the socio-economic dimensions and leverage blockchain data in this dynamic domain. △ Less

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.12492 [pdf, other]

Planning and Operation of Millimeter-wave Downlink Systems with Hybrid Beamforming

Authors: Yuan Quan, Shahram Shahsavari, Catherine Rosenberg

Abstract: This paper investigates downlink radio resource management (RRM) in millimeter-wave systems with codebook-based hybrid beamforming in a single cell. We consider a practical but often overlooked multi-channel scenario where the base station is equipped with fewer radio frequency chains than there are user equipment (UEs) in the cell. In this case, analog beam selection is important because not all… ▽ More This paper investigates downlink radio resource management (RRM) in millimeter-wave systems with codebook-based hybrid beamforming in a single cell. We consider a practical but often overlooked multi-channel scenario where the base station is equipped with fewer radio frequency chains than there are user equipment (UEs) in the cell. In this case, analog beam selection is important because not all beams preferred by UEs can be selected simultaneously, and since the beam selection cannot vary across subchannels in a time slot, this creates a coupling between subchannels within a time slot. None of the solutions proposed in the literature deal with this important constraint. The paper begins with an offline study that analyzes the impact of different RRM procedures and system parameters on performance. An offline joint RRM optimization problem is formulated and solved that includes beam set selection, UE set selection, power distribution, modulation and coding scheme selection, and digital beamforming as a part of hybrid beamforming. The evaluation results of the offline study provide valuable insights that shows the importance of not neglecting the constraint and guide the design of low-complexity and high-performance online downlink RRM schemes in the second part of the paper. The proposed online RRM algorithms perform close to the performance targets obtained from the offline study while offering acceptable runtime. △ Less

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2401.13887 [pdf]

A comparative study of zero-shot inference with large language models and supervised modeling in breast cancer pathology classification

Authors: Madhumita Sushil, Travis Zack, Divneet Mandair, Zhiwei Zheng, Ahmed Wali, Yan-Ning Yu, Yuwei Quan, Atul J. Butte

Abstract: Although supervised machine learning is popular for information extraction from clinical notes, creating large annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated promising transfer learning capability. In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations. We curated a… ▽ More Although supervised machine learning is popular for information extraction from clinical notes, creating large annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated promising transfer learning capability. In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations. We curated a manually-labeled dataset of 769 breast cancer pathology reports, labeled with 13 categories, to compare zero-shot classification capability of the GPT-4 model and the GPT-3.5 model with supervised classification performance of three model architectures: random forests classifier, long short-term memory networks with attention (LSTM-Att), and the UCSF-BERT model. Across all 13 tasks, the GPT-4 model performed either significantly better than or as well as the best supervised model, the LSTM-Att model (average macro F1 score of 0.83 vs. 0.75). On tasks with high imbalance between labels, the differences were more prominent. Frequent sources of GPT-4 errors included inferences from multiple samples and complex task design. On complex tasks where large annotated datasets cannot be easily collected, LLMs can reduce the burden of large-scale data labeling. However, if the use of LLMs is prohibitive, the use of simpler supervised models with large annotated datasets can provide comparable results. LLMs demonstrated the potential to speed up the execution of clinical NLP studies by reducing the need for curating large annotated datasets. This may result in an increase in the utilization of NLP-based variables and outcomes in observational clinical studies. △ Less

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2401.02614 [pdf, other]

Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment

Authors: Yongxu Liu, Yinghui Quan, Guoyao Xiao, Aobo Li, **jian Wu

Abstract: Quality assessment of images and videos emphasizes both local details and global semantics, whereas general data sampling methods (e.g., resizing, crop** or grid-based fragment) fail to catch them simultaneously. To address the deficiency, current approaches have to adopt multi-branch models and take as input the multi-resolution data, which burdens the model complexity. In this work, instead of… ▽ More Quality assessment of images and videos emphasizes both local details and global semantics, whereas general data sampling methods (e.g., resizing, crop** or grid-based fragment) fail to catch them simultaneously. To address the deficiency, current approaches have to adopt multi-branch models and take as input the multi-resolution data, which burdens the model complexity. In this work, instead of stacking up models, a more elegant data sampling method (named as SAMA, scaling and masking) is explored, which compacts both the local and global content in a regular input size. The basic idea is to scale the data into a pyramid first, and reduce the pyramid into a regular data dimension with a masking strategy. Benefiting from the spatial and temporal redundancy in images and videos, the processed data maintains the multi-scale characteristics with a regular input size, thus can be processed by a single-branch model. We verify the sampling method in image and video quality assessment. Experiments show that our sampling method can improve the performance of current single-branch models significantly, and achieves competitive performance to the multi-branch models without extra model complexity. The source code will be available at https://github.com/Sissuire/SAMA. △ Less

Submitted 4 January, 2024; originally announced January 2024.

Comments: Accepted by AAAI2024. Code has been released at https://github.com/Sissuire/SAMA

arXiv:2312.14159 [pdf, other]

Enhancing Ethereum's Security with LUMEN, a Novel Zero-Knowledge Protocol Generating Transparent and Efficient zk-SNARKs

Authors: Yunjia Quan

Abstract: This paper proposes a novel recursive polynomial commitment scheme (PCS) and a new polynomial interactive oracle proof (PIOP) protocol, which compile into efficient and transparent zk-SNARKs (zero-knowledge succinct non-interactive arguments of knowledge). The Ethereum blockchain utilizes zero-knowledge Rollups (ZKR) to improve its scalability (the ability to handle a large number of transactions)… ▽ More This paper proposes a novel recursive polynomial commitment scheme (PCS) and a new polynomial interactive oracle proof (PIOP) protocol, which compile into efficient and transparent zk-SNARKs (zero-knowledge succinct non-interactive arguments of knowledge). The Ethereum blockchain utilizes zero-knowledge Rollups (ZKR) to improve its scalability (the ability to handle a large number of transactions), and ZKR uses zk-SNARKs to validate transactions. The currently used zk-SNARKs rely on a trusted setup ceremony, where a group of participants uses secret information about transactions to generate the public parameters necessary to verify the zk-SNARKs. This introduces a security risk into Ethereum's system. Thus, researchers have been develo** transparent zk-SNARKs (which do not require a trusted setup), but those are not as efficient as non-transparent zk-SNARKs, so ZKRs do not use them. In this research, I developed LUMEN, a set of novel algorithms that generate transparent zk-SNARKs that improve Ethereum's security without sacrificing its efficiency. Various techniques were creatively incorporated into LUMEN, including groups with hidden orders, Lagrange basis polynomials, and an amortization strategy. I wrote mathematical proofs for LUMEN that convey its completeness, soundness and zero-knowledgeness, and implemented LUMEN by writing around $8000$ lines of Rust and Python code, which conveyed the practicality of LUMEN. Moreover, my implementation revealed the efficiency of LUMEN (measured in proof size, proof computation time, and verification time), which surpasses the efficiency of existing transparent zk-SNARKs and is on par with that of non-transparent zk-SNARKs. Therefore, LUMEN is a promising solution to improve Ethereum's security while maintaining its efficiency. △ Less

Submitted 10 November, 2023; originally announced December 2023.

arXiv:2311.14676 [pdf, other]

doi 10.31219/osf.io/bq6tu

Decoding Social Sentiment in DAO: A Comparative Analysis of Blockchain Governance Communities

Authors: Yutong Quan, Xintong Wu, Wanlin Deng, Luyao Zhang

Abstract: Blockchain technology is leading a revolutionary transformation across diverse industries, with effective governance being critical for the success and sustainability of blockchain projects. Community forums, pivotal in engaging decentralized autonomous organizations (DAOs), significantly impact blockchain governance decisions. Concurrently, Natural Language Processing (NLP), particularly sentimen… ▽ More Blockchain technology is leading a revolutionary transformation across diverse industries, with effective governance being critical for the success and sustainability of blockchain projects. Community forums, pivotal in engaging decentralized autonomous organizations (DAOs), significantly impact blockchain governance decisions. Concurrently, Natural Language Processing (NLP), particularly sentiment analysis, provides powerful insights from textual data. While prior research has explored the potential of NLP tools in social media sentiment analysis, there is a gap in understanding the sentiment landscape of blockchain governance communities. The evolving discourse and sentiment dynamics on the forums of top DAOs remain largely unknown. This paper delves deep into the evolving discourse and sentiment dynamics on the public forums of leading DeFi projects: Aave, Uniswap, Curve DAO, Yearn.finance, Merit Circle, and Balancer, focusing primarily on discussions related to governance issues. Our study shows that participants in decentralized communities generally express positive sentiments during Discord discussions. Furthermore, there is a potential interaction between discussion intensity and sentiment dynamics; higher discussion volume may contribute to a more stable sentiment from code analysis. The insights gained from this study are valuable for decision-makers in blockchain governance, underscoring the pivotal role of sentiment analysis in interpreting community emotions and its evolving impact on the landscape of blockchain governance. This research significantly contributes to the interdisciplinary exploration of the intersection of blockchain and society, specifically emphasizing the decentralized blockchain governance ecosystem. We provide our data and code for replicability as open access on GitHub. △ Less

Submitted 25 May, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

arXiv:2310.05067 [pdf, other]

Robust-GBDT: GBDT with Nonconvex Loss for Tabular Classification in the Presence of Label Noise and Class Imbalance

Authors: Jiaqi Luo, Yuedong Quan, Shixin Xu

Abstract: Dealing with label noise in tabular classification tasks poses a persistent challenge in machine learning. While robust boosting methods have shown promise in binary classification, their effectiveness in complex, multi-class scenarios is often limited. Additionally, issues like imbalanced datasets, missing values, and computational inefficiencies further complicate their practical utility. This s… ▽ More Dealing with label noise in tabular classification tasks poses a persistent challenge in machine learning. While robust boosting methods have shown promise in binary classification, their effectiveness in complex, multi-class scenarios is often limited. Additionally, issues like imbalanced datasets, missing values, and computational inefficiencies further complicate their practical utility. This study introduces Robust-GBDT, a groundbreaking approach that combines the power of Gradient Boosted Decision Trees (GBDT) with the resilience of nonconvex loss functions against label noise. By leveraging local convexity within specific regions, Robust-GBDT demonstrates unprecedented robustness, challenging conventional wisdom. Through seamless integration of advanced GBDT with a novel Robust Focal Loss tailored for class imbalance, Robust-GBDT significantly enhances generalization capabilities, particularly in noisy and imbalanced datasets. Notably, its user-friendly design facilitates integration with existing open-source code, enhancing computational efficiency and scalability. Extensive experiments validate Robust-GBDT's superiority over other noise-robust methods, establishing a new standard for accurate classification amidst label noise. This research heralds a paradigm shift in machine learning, paving the way for a new era of robust and precise classification across diverse real-world applications. △ Less

Submitted 15 March, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

arXiv:2308.14276 [pdf, other]

Alleviating Video-Length Effect for Micro-video Recommendation

Authors: Yuhan Quan, **gtao Ding, Chen Gao, Nian Li, Lingling Yi, Depeng **, Yong Li

Abstract: Micro-videos platforms such as TikTok are extremely popular nowadays. One important feature is that users no longer select interested videos from a set, instead they either watch the recommended video or skip to the next one. As a result, the time length of users' watching behavior becomes the most important signal for identifying preferences. However, our empirical data analysis has shown a video… ▽ More Micro-videos platforms such as TikTok are extremely popular nowadays. One important feature is that users no longer select interested videos from a set, instead they either watch the recommended video or skip to the next one. As a result, the time length of users' watching behavior becomes the most important signal for identifying preferences. However, our empirical data analysis has shown a video-length effect that long videos are easier to receive a higher value of average view time, thus adopting such view-time labels for measuring user preferences can easily induce a biased model that favors the longer videos. In this paper, we propose a Video Length Debiasing Recommendation (VLDRec) method to alleviate such an effect for micro-video recommendation. VLDRec designs the data labeling approach and the sample generation module that better capture user preferences in a view-time oriented manner. It further leverages the multi-task learning technique to jointly optimize the above samples with original biased ones. Extensive experiments show that VLDRec can improve the users' view time by 1.81% and 11.32% on two real-world datasets, given a recommendation list of a fixed overall video length, compared with the best baseline method. Moreover, VLDRec is also more effective in matching users' interests in terms of the video content. △ Less

Submitted 31 August, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

Comments: Accept by TOIS

arXiv:2307.10201 [pdf, other]

doi 10.31219/osf.io/qwpdx

On the Mechanics of NFT Valuation: AI Ethics and Social Media

Authors: Luyao Zhang, Yutong Sun, Yutong Quan, Jiaxun Cao, Xin Tong

Abstract: As CryptoPunks pioneers the innovation of non-fungible tokens (NFTs) in AI and art, the valuation mechanics of NFTs has become a trending topic. Earlier research identifies the impact of ethics and society on the price prediction of CryptoPunks. Since the booming year of the NFT market in 2021, the discussion of CryptoPunks has propagated on social media. Still, existing literature hasn't consider… ▽ More As CryptoPunks pioneers the innovation of non-fungible tokens (NFTs) in AI and art, the valuation mechanics of NFTs has become a trending topic. Earlier research identifies the impact of ethics and society on the price prediction of CryptoPunks. Since the booming year of the NFT market in 2021, the discussion of CryptoPunks has propagated on social media. Still, existing literature hasn't considered the social sentiment factors after the historical turning point on NFT valuation. In this paper, we study how sentiments in social media, together with gender and skin tone, contribute to NFT valuations by an empirical analysis of social media, blockchain, and crypto exchange data. We evidence social sentiments as a significant contributor to the price prediction of CryptoPunks. Furthermore, we document structure changes in the valuation mechanics before and after 2021. Although people's attitudes towards Cryptopunks are primarily positive, our findings reflect imbalances in transaction activities and pricing based on gender and skin tone. Our result is consistent and robust, controlling for the rarity of an NFT based on the set of human-readable attributes, including gender and skin tone. Our research contributes to the interdisciplinary study at the intersection of AI, Ethics, and Society, focusing on the ecosystem of decentralized AI or blockchain. We provide our data and code for replicability as open access on GitHub. △ Less

Submitted 21 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

Comments: Presented at ChainScience Conference, 2003 (arXiv:2307.03277v2 [cs.DC] 11 Jul 2023)

Report number: ChainScience/2023/16

arXiv:2307.06528 [pdf, other]

Optimised Least Squares Approach for Accurate Polygon and Ellipse Fitting

Authors: Yiming Quan, Shian Chen

Abstract: This study presents a generalised least squares based method for fitting polygons and ellipses to data points. The method is based on a trigonometric fitness function that approximates a unit shape accurately, making it applicable to various geometric shapes with minimal fitting parameters. Furthermore, the proposed method does not require any constraints and can handle incomplete data. The method… ▽ More This study presents a generalised least squares based method for fitting polygons and ellipses to data points. The method is based on a trigonometric fitness function that approximates a unit shape accurately, making it applicable to various geometric shapes with minimal fitting parameters. Furthermore, the proposed method does not require any constraints and can handle incomplete data. The method is validated on synthetic and real-world data sets and compared with the existing methods in the literature for polygon and ellipse fitting. The test results show that the method achieves high accuracy and outperforms the referenced methods in terms of root-mean-square error, especially for noise-free data. The proposed method is a powerful tool for shape fitting in computer vision and geometry processing applications. △ Less

Submitted 19 October, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

arXiv:2303.08346 [pdf, other]

doi 10.1145/3543507.3583374

Robust Preference-Guided Denoising for Graph based Social Recommendation

Authors: Yuhan Quan, **gtao Ding, Chen Gao, Lingling Yi, Depeng **, Yong Li

Abstract: Graph Neural Network(GNN) based social recommendation models improve the prediction accuracy of user preference by leveraging GNN in exploiting preference similarity contained in social relations. However, in terms of both effectiveness and efficiency of recommendation, a large portion of social relations can be redundant or even noisy, e.g., it is quite normal that friends share no preference in… ▽ More Graph Neural Network(GNN) based social recommendation models improve the prediction accuracy of user preference by leveraging GNN in exploiting preference similarity contained in social relations. However, in terms of both effectiveness and efficiency of recommendation, a large portion of social relations can be redundant or even noisy, e.g., it is quite normal that friends share no preference in a certain domain. Existing models do not fully solve this problem of relation redundancy and noise, as they directly characterize social influence over the full social network. In this paper, we instead propose to improve graph based social recommendation by only retaining the informative social relations to ensure an efficient and effective influence diffusion, i.e., graph denoising. Our designed denoising method is preference-guided to model social relation confidence and benefits user preference learning in return by providing a denoised but more informative social graph for recommendation models. Moreover, to avoid interference of noisy social relations, it designs a self-correcting curriculum learning module and an adaptive denoising strategy, both favoring highly-confident samples. Experimental results on three public datasets demonstrate its consistent capability of improving two state-of-the-art social recommendation models by robustly removing 10-40% of original relations. We release the source code at https://github.com/tsinghua-fib-lab/Graph-Denoising-SocialRec. △ Less

Submitted 14 March, 2023; originally announced March 2023.

arXiv:2303.03508 [pdf, other]

Memory Maps for Video Object Detection and Tracking on UAVs

Authors: Benjamin Kiefer, Yitong Quan, Andreas Zell

Abstract: This paper introduces a novel approach to video object detection detection and tracking on Unmanned Aerial Vehicles (UAVs). By incorporating metadata, the proposed approach creates a memory map of object locations in actual world coordinates, providing a more robust and interpretable representation of object locations in both, image space and the real world. We use this representation to boost con… ▽ More This paper introduces a novel approach to video object detection detection and tracking on Unmanned Aerial Vehicles (UAVs). By incorporating metadata, the proposed approach creates a memory map of object locations in actual world coordinates, providing a more robust and interpretable representation of object locations in both, image space and the real world. We use this representation to boost confidences, resulting in improved performance for several temporal computer vision tasks, such as video object detection, short and long-term single and multi-object tracking, and video anomaly detection. These findings confirm the benefits of metadata in enhancing the capabilities of UAVs in the field of temporal computer vision and pave the way for further advancements in this area. △ Less

Submitted 6 March, 2023; originally announced March 2023.

arXiv:2211.13508 [pdf, other]

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

Authors: Benjamin Kiefer, Matej Kristan, Janez Perš, Lojze Žust, Fabio Poiesi, Fabio Augusto de Alcantara Andrade, Alexandre Bernardino, Matthew Dawkins, Jenni Raitoharju, Yitong Quan, Adem Atmaca, Timon Höfer, Qiming Zhang, Yufei Xu, **g Zhang, Dacheng Tao, Lars Sommer, Raphael Spraul, Hangyue Zhao, Hongpu Zhang, Yanyun Zhao, Jan Lukas Augustin, Eui-ik Jeon, Impyeong Lee, Luca Zedda , et al. (48 additional authors not shown)

Abstract: The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detec… ▽ More The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi. △ Less

Submitted 28 November, 2022; v1 submitted 24 November, 2022; originally announced November 2022.

Comments: MaCVi 2023 was part of WACV 2023. This report (38 pages) discusses the competition as part of MaCVi

arXiv:2210.02093 [pdf, other]

doi 10.1109/TIP.2023.3297408

Centralized Feature Pyramid for Object Detection

Authors: Yu Quan, Dong Zhang, Liyan Zhang, **hui Tang

Abstract: Visual feature pyramid has shown its superiority in both effectiveness and efficiency in a wide range of applications. However, the existing methods exorbitantly concentrate on the inter-layer feature interactions but ignore the intra-layer feature regulations, which are empirically proved beneficial. Although some methods try to learn a compact intra-layer feature representation with the help of… ▽ More Visual feature pyramid has shown its superiority in both effectiveness and efficiency in a wide range of applications. However, the existing methods exorbitantly concentrate on the inter-layer feature interactions but ignore the intra-layer feature regulations, which are empirically proved beneficial. Although some methods try to learn a compact intra-layer feature representation with the help of the attention mechanism or the vision transformer, they ignore the neglected corner regions that are important for dense prediction tasks. To address this problem, in this paper, we propose a Centralized Feature Pyramid (CFP) for object detection, which is based on a globally explicit centralized feature regulation. Specifically, we first propose a spatial explicit visual center scheme, where a lightweight MLP is used to capture the globally long-range dependencies and a parallel learnable visual center mechanism is used to capture the local corner regions of the input images. Based on this, we then propose a globally centralized regulation for the commonly-used feature pyramid in a top-down fashion, where the explicit visual center information obtained from the deepest intra-layer feature is used to regulate frontal shallow features. Compared to the existing feature pyramids, CFP not only has the ability to capture the global long-range dependencies, but also efficiently obtain an all-round yet discriminative feature representation. Experimental results on the challenging MS-COCO validate that our proposed CFP can achieve the consistent performance gains on the state-of-the-art YOLOv5 and YOLOX object detection baselines. △ Less

Submitted 5 October, 2022; originally announced October 2022.

Comments: Code: https://github.com/QY1994-0919/CFPNet

arXiv:2205.00463 [pdf, other]

doi 10.1088/1361-6420/ac8ac6

A Dataset-free Deep learning Method for Low-Dose CT Image Reconstruction

Authors: Qiaoqiao Ding, Hui Ji, Yuhui Quan, Xiaoqun Zhang

Abstract: Low-dose CT (LDCT) imaging attracted a considerable interest for the reduction of the object's exposure to X-ray radiation. In recent years, supervised deep learning (DL) has been extensively studied for LDCT image reconstruction, which trains a network over a dataset containing many pairs of normal-dose and low-dose images. However, the challenge on collecting many such pairs in the clinical setu… ▽ More Low-dose CT (LDCT) imaging attracted a considerable interest for the reduction of the object's exposure to X-ray radiation. In recent years, supervised deep learning (DL) has been extensively studied for LDCT image reconstruction, which trains a network over a dataset containing many pairs of normal-dose and low-dose images. However, the challenge on collecting many such pairs in the clinical setup limits the application of such supervised-learning-based methods for LDCT image reconstruction in practice. Aiming at addressing the challenges raised by the collection of training dataset, this paper proposed a unsupervised deep learning method for LDCT image reconstruction, which does not require any external training data. The proposed method is built on a re-parametrization technique for Bayesian inference via deep network with random weights, combined with additional total variational~(TV) regularization. The experiments show that the proposed method noticeably outperforms existing dataset-free image reconstruction methods on the test data. △ Less

Submitted 5 October, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

arXiv:2203.00091 [pdf, other]

Dynamic N:M Fine-grained Structured Sparse Attention Mechanism

Authors: Zhaodong Chen, Yuying Quan, Zheng Qu, Liu Liu, Yufei Ding, Yuan Xie

Abstract: Transformers are becoming the mainstream solutions for various tasks like NLP and Computer vision. Despite their success, the high complexity of the attention mechanism hinders them from being applied to latency-sensitive tasks. Tremendous efforts have been made to alleviate this problem, and many of them successfully reduce the asymptotic complexity to linear. Nevertheless, most of them fail to a… ▽ More Transformers are becoming the mainstream solutions for various tasks like NLP and Computer vision. Despite their success, the high complexity of the attention mechanism hinders them from being applied to latency-sensitive tasks. Tremendous efforts have been made to alleviate this problem, and many of them successfully reduce the asymptotic complexity to linear. Nevertheless, most of them fail to achieve practical speedup over the original full attention under moderate sequence lengths and are unfriendly to finetuning. In this paper, we present DFSS, an attention mechanism that dynamically prunes the full attention weight matrix to N:M fine-grained structured sparse pattern. We provide both theoretical and empirical evidence that demonstrates DFSS is a good approximation of the full attention mechanism. We propose a dedicated CUDA kernel design that completely eliminates the dynamic pruning overhead and achieves speedups under arbitrary sequence length. We evaluate the 1:2 and 2:4 sparsity under different configurations and achieve 1.27~ 1.89x speedups over the full-attention mechanism. It only takes a couple of finetuning epochs from the pretrained model to achieve on par accuracy with full attention mechanism on tasks from various domains under different sequence lengths from 384 to 4096. △ Less

Submitted 28 February, 2022; originally announced March 2022.

arXiv:2111.00454 [pdf, other]

Gaussian Kernel Mixture Network for Single Image Defocus Deblurring

Authors: Yuhui Quan, Zicong Wu, Hui Ji

Abstract: Defocus blur is one kind of blur effects often seen in images, which is challenging to remove due to its spatially variant amount. This paper presents an end-to-end deep learning approach for removing defocus blur from a single image, so as to have an all-in-focus image for consequent vision tasks. First, a pixel-wise Gaussian kernel mixture (GKM) model is proposed for representing spatially varia… ▽ More Defocus blur is one kind of blur effects often seen in images, which is challenging to remove due to its spatially variant amount. This paper presents an end-to-end deep learning approach for removing defocus blur from a single image, so as to have an all-in-focus image for consequent vision tasks. First, a pixel-wise Gaussian kernel mixture (GKM) model is proposed for representing spatially variant defocus blur kernels in an efficient linear parametric form, with higher accuracy than existing models. Then, a deep neural network called GKMNet is developed by unrolling a fixed-point iteration of the GKM-based deblurring. The GKMNet is built on a lightweight scale-recurrent architecture, with a scale-recurrent attention module for estimating the mixing coefficients in GKM for defocus deblurring. Extensive experiments show that the GKMNet not only noticeably outperforms existing defocus deblurring methods, but also has its advantages in terms of model complexity and computational efficiency. △ Less

Submitted 31 October, 2021; originally announced November 2021.

Comments: Accepted by NeurIPS 2021

arXiv:2109.12843 [pdf, other]

A Survey of Graph Neural Networks for Recommender Systems: Challenges, Methods, and Directions

Authors: Chen Gao, Yu Zheng, Nian Li, Yinfeng Li, Yingrong Qin, **ghua Piao, Yuhan Quan, Jianxin Chang, Depeng **, Xiangnan He, Yong Li

Abstract: Recommender system is one of the most important information services on today's Internet. Recently, graph neural networks have become the new state-of-the-art approach to recommender systems. In this survey, we conduct a comprehensive review of the literature on graph neural network-based recommender systems. We first introduce the background and the history of the development of both recommender… ▽ More Recommender system is one of the most important information services on today's Internet. Recently, graph neural networks have become the new state-of-the-art approach to recommender systems. In this survey, we conduct a comprehensive review of the literature on graph neural network-based recommender systems. We first introduce the background and the history of the development of both recommender systems and graph neural networks. For recommender systems, in general, there are four aspects for categorizing existing works: stage, scenario, objective, and application. For graph neural networks, the existing methods consist of two categories, spectral models and spatial ones. We then discuss the motivation of applying graph neural networks into recommender systems, mainly consisting of the high-order connectivity, the structural property of data, and the enhanced supervision signal. We then systematically analyze the challenges in graph construction, embedding propagation/aggregation, model optimization, and computation efficiency. Afterward and primarily, we provide a comprehensive overview of a multitude of existing works of graph neural network-based recommender systems, following the taxonomy above. Finally, we raise discussions on the open problems and promising future directions in this area. We summarize the representative papers along with their code repositories in \url{https://github.com/tsinghua-fib-lab/GNN-Recommender-Systems}. △ Less

Submitted 12 January, 2023; v1 submitted 27 September, 2021; originally announced September 2021.

Comments: accepted by ACM Transactions on Recommender Systems

arXiv:2108.09127 [pdf, other]

TabGNN: Multiplex Graph Neural Network for Tabular Data Prediction

Authors: Xiawei Guo, Yuhan Quan, Huan Zhao, Quanming Yao, Yong Li, Weiwei Tu

Abstract: Tabular data prediction (TDP) is one of the most popular industrial applications, and various methods have been designed to improve the prediction performance. However, existing works mainly focus on feature interactions and ignore sample relations, e.g., users with the same education level might have a similar ability to repay the debt. In this work, by explicitly and systematically modeling samp… ▽ More Tabular data prediction (TDP) is one of the most popular industrial applications, and various methods have been designed to improve the prediction performance. However, existing works mainly focus on feature interactions and ignore sample relations, e.g., users with the same education level might have a similar ability to repay the debt. In this work, by explicitly and systematically modeling sample relations, we propose a novel framework TabGNN based on recently popular graph neural networks (GNN). Specifically, we firstly construct a multiplex graph to model the multifaceted sample relations, and then design a multiplex graph neural network to learn enhanced representation for each sample. To integrate TabGNN with the tabular solution in our company, we concatenate the learned embeddings and the original ones, which are then fed to prediction models inside the solution. Experiments on eleven TDP datasets from various domains, including classification and regression ones, show that TabGNN can consistently improve the performance compared to the tabular solution AutoFE in 4Paradigm. △ Less

Submitted 20 August, 2021; originally announced August 2021.

arXiv:2106.06900 [pdf, other]

Deep Reinforcement Learning based Group Recommender System

Authors: Zefang Liu, Shuran Wen, Yinzhu Quan

Abstract: Group recommender systems are widely used in current web applications. In this paper, we propose a novel group recommender system based on the deep reinforcement learning. We introduce the MovieLens data at first and generate one random group dataset, MovieLens-Rand, from it. This randomly generated dataset is described and analyzed. We also present experimental settings and two state-of-art basel… ▽ More Group recommender systems are widely used in current web applications. In this paper, we propose a novel group recommender system based on the deep reinforcement learning. We introduce the MovieLens data at first and generate one random group dataset, MovieLens-Rand, from it. This randomly generated dataset is described and analyzed. We also present experimental settings and two state-of-art baselines, AGREE and GroupIM. The framework of our novel model, the Deep Reinforcement learning based Group Recommender system (DRGR), is proposed. Actor-critic networks are implemented with the deep deterministic policy gradient algorithm. The DRGR model is applied on the MovieLens-Rand dataset with two baselines. Compared with baselines, we conclude that DRGR performs better than GroupIM due to long interaction histories but worse than AGREE because of the self-attention mechanism. We express advantages and shortcomings of DRGR and also give future improvement directions at the end. △ Less

Submitted 12 June, 2021; originally announced June 2021.

arXiv:2009.03376 [pdf, other]

Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering

Authors: **gtao Ding, Yuhan Quan, Quanming Yao, Yong Li, Depeng **

Abstract: Negative sampling approaches are prevalent in implicit collaborative filtering for obtaining negative labels from massive unlabeled data. As two major concerns in negative sampling, efficiency and effectiveness are still not fully achieved by recent works that use complicate structures and overlook risk of false negative instances. In this paper, we first provide a novel understanding of negative… ▽ More Negative sampling approaches are prevalent in implicit collaborative filtering for obtaining negative labels from massive unlabeled data. As two major concerns in negative sampling, efficiency and effectiveness are still not fully achieved by recent works that use complicate structures and overlook risk of false negative instances. In this paper, we first provide a novel understanding of negative instances by empirically observing that only a few instances are potentially important for model learning, and false negatives tend to have stable predictions over many training iterations. Above findings motivate us to simplify the model by sampling from designed memory that only stores a few important candidates and, more importantly, tackle the untouched false negative problem by favouring high-variance samples stored in memory, which achieves efficient sampling of true negatives with high-quality. Empirical results on two synthetic datasets and three real-world datasets demonstrate both robustness and superiorities of our negative sampling method. △ Less

Submitted 7 September, 2020; originally announced September 2020.

Comments: 20 pages, 7 figures, 8 tables

arXiv:2009.01616 [pdf]

Few-shot Object Detection with Feature Attention Highlight Module in Remote Sensing Images

Authors: Zixuan Xiao, ** Zhong, Yuan Quan, Xu** Yin, Wei Xue

Abstract: In recent years, there are many applications of object detection in remote sensing field, which demands a great number of labeled data. However, in many cases, data is extremely rare. In this paper, we proposed a few-shot object detector which is designed for detecting novel objects based on only a few examples. Through fully leveraging labeled base classes, our model that is composed of a feature… ▽ More In recent years, there are many applications of object detection in remote sensing field, which demands a great number of labeled data. However, in many cases, data is extremely rare. In this paper, we proposed a few-shot object detector which is designed for detecting novel objects based on only a few examples. Through fully leveraging labeled base classes, our model that is composed of a feature-extractor, a feature attention highlight module as well as a two-stage detection backend can quickly adapt to novel classes. The pre-trained feature extractor whose parameters are shared produces general features. While the feature attention highlight module is designed to be light-weighted and simple in order to fit the few-shot cases. Although it is simple, the information provided by it in a serial way is helpful to make the general features to be specific for few-shot objects. Then the object-specific features are delivered to the two-stage detection backend for the detection results. The experiments demonstrate the effectiveness of the proposed method for few-shot cases. △ Less

Submitted 3 September, 2020; originally announced September 2020.

arXiv:2007.10963 [pdf, other]

Recurrent Exposure Generation for Low-Light Face Detection

Authors: **xiu Liang, **gwen Wang, Yuhui Quan, Tianyi Chen, Jiaying Liu, Haibin Ling, Yong Xu

Abstract: Face detection from low-light images is challenging due to limited photos and inevitable noise, which, to make the task even harder, are often spatially unevenly distributed. A natural solution is to borrow the idea from multi-exposure, which captures multiple shots to obtain well-exposed images under challenging conditions. High-quality implementation/approximation of multi-exposure from a single… ▽ More Face detection from low-light images is challenging due to limited photos and inevitable noise, which, to make the task even harder, are often spatially unevenly distributed. A natural solution is to borrow the idea from multi-exposure, which captures multiple shots to obtain well-exposed images under challenging conditions. High-quality implementation/approximation of multi-exposure from a single image is however nontrivial. Fortunately, as shown in this paper, neither is such high-quality necessary since our task is face detection rather than image enhancement. Specifically, we propose a novel Recurrent Exposure Generation (REG) module and couple it seamlessly with a Multi-Exposure Detection (MED) module, and thus significantly improve face detection performance by effectively inhibiting non-uniform illumination and noise issues. REG produces progressively and efficiently intermediate images corresponding to various exposure settings, and such pseudo-exposures are then fused by MED to detect faces across different lighting conditions. The proposed method, named REGDet, is the first `detection-with-enhancement' framework for low-light face detection. It not only encourages rich interaction and feature fusion across different illumination levels, but also enables effective end-to-end learning of the REG component to be better tailored for face detection. Moreover, as clearly shown in our experiments, REG can be flexibly coupled with different face detectors without extra low/normal-light image pairs for training. We tested REGDet on the DARK FACE low-light face benchmark with thorough ablation study, where REGDet outperforms previous state-of-the-arts by a significant margin, with only negligible extra parameters. △ Less

Submitted 21 July, 2020; originally announced July 2020.

Comments: 11 pages

arXiv:2007.02018 [pdf, other]

Deep Bilateral Retinex for Low-Light Image Enhancement

Authors: **xiu Liang, Yong Xu, Yuhui Quan, **gwen Wang, Haibin Ling, Hui Ji

Abstract: Low-light images, i.e. the images captured in low-light conditions, suffer from very poor visibility caused by low contrast, color distortion and significant measurement noise. Low-light image enhancement is about improving the visibility of low-light images. As the measurement noise in low-light images is usually significant yet complex with spatially-varying characteristic, how to handle the noi… ▽ More Low-light images, i.e. the images captured in low-light conditions, suffer from very poor visibility caused by low contrast, color distortion and significant measurement noise. Low-light image enhancement is about improving the visibility of low-light images. As the measurement noise in low-light images is usually significant yet complex with spatially-varying characteristic, how to handle the noise effectively is an important yet challenging problem in low-light image enhancement. Based on the Retinex decomposition of natural images, this paper proposes a deep learning method for low-light image enhancement with a particular focus on handling the measurement noise. The basic idea is to train a neural network to generate a set of pixel-wise operators for simultaneously predicting the noise and the illumination layer, where the operators are defined in the bilateral space. Such an integrated approach allows us to have an accurate prediction of the reflectance layer in the presence of significant spatially-varying measurement noise. Extensive experiments on several benchmark datasets have shown that the proposed method is very competitive to the state-of-the-art methods, and has significant advantage over others when processing images captured in extremely low lighting conditions. △ Less

Submitted 4 July, 2020; originally announced July 2020.

Comments: 15 pages

arXiv:2006.16591 [pdf]

A Novel Bistatic Joint Radar-Communication System in Multi-path Environments

Authors: Yuan Quan, Longfei Shi, Jialei Liu, Jiazhi Ma

Abstract: Radar detection and communication can be operated simultaneously in joint radar-communication (JRC) system. In this paper, we propose a bistatic JRC system which is applicable in multi-path environments. Basing on a novel joint waveform, a joint detection process is designed for both target detection and channel estimation. Meanwhile, a low-cost channel equalization method that utilizes the channe… ▽ More Radar detection and communication can be operated simultaneously in joint radar-communication (JRC) system. In this paper, we propose a bistatic JRC system which is applicable in multi-path environments. Basing on a novel joint waveform, a joint detection process is designed for both target detection and channel estimation. Meanwhile, a low-cost channel equalization method that utilizes the channel state information acquired from the detection process is proposed. The numerical results show that the symbol error rate (SER) of the proposed system is similar to that of the binary frequency shift keying system, and the signal to noise ratio requirement in multi-path environments is less than 2 dB higher compared with that in single-path environment to reach a SER of 10-5. Besides, the knowledge of the embedded information is not required for the joint detection process and the detection performance is robust to unknown information. △ Less

Submitted 30 June, 2020; originally announced June 2020.

Comments: 6 pages, 9 figures

arXiv:2006.16096 [pdf]

Joint Radar-Communication Waveform Design Based on Composite Modulation

Authors: Yuan Quan, Fulai Wang, Longfei Shi, Jiazhi Ma

Abstract: Joint radar-communication (JRC) waveform can be used for simultaneous radar detection and communication in the same frequency band. However, radar detection processing requires the prior knowledge of the waveform including the embedded information for matched filtering. To remove this requirement, we propose a unimodular JRC waveform based on composite modulation where the internal modulation embe… ▽ More Joint radar-communication (JRC) waveform can be used for simultaneous radar detection and communication in the same frequency band. However, radar detection processing requires the prior knowledge of the waveform including the embedded information for matched filtering. To remove this requirement, we propose a unimodular JRC waveform based on composite modulation where the internal modulation embeds information by map** the bit sequence to different orthogonal signals, and the external modulation performs phase modulation on the internal waveform to satisfy the demand of detection. By adjusting the number of the orthogonal signals, a trade-off between the detection and the communication performance can be made. Besides, a new parameter, dissimilarity, is defined to evaluate the detection performance robustness to unknown embedded information. The numerical results show that the SER performance of the proposed system is similar to that of the multilevel frequency shift keying system, the ambiguity function resembles that of the phase coded signal, and the dissimilarity performance is better than other JRC systems. △ Less

Submitted 15 September, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

Comments: 11 pages, 13 figures

arXiv:2006.11539 [pdf, other]

On Addressing the Impact of ISO Speed upon PRNU and Forgery Detection

Authors: Yijun Quan, Chang-Tsun Li

Abstract: Photo Response Non-Uniformity (PRNU) has been used as a powerful device fingerprint for image forgery detection because image forgeries can be revealed by finding the absence of the PRNU in the manipulated areas. The correlation between an image's noise residual with the device's reference PRNU is often compared with a decision threshold to check the existence of the PRNU. A PRNU correlation predi… ▽ More Photo Response Non-Uniformity (PRNU) has been used as a powerful device fingerprint for image forgery detection because image forgeries can be revealed by finding the absence of the PRNU in the manipulated areas. The correlation between an image's noise residual with the device's reference PRNU is often compared with a decision threshold to check the existence of the PRNU. A PRNU correlation predictor is usually used to determine this decision threshold assuming the correlation is content-dependent. However, we found that not only the correlation is content-dependent, but it also depends on the camera sensitivity setting. \textit{Camera sensitivity}, commonly known by the name of \textit{ISO speed}, is an important attribute in digital photography. In this work, we will show the PRNU correlation's dependency on ISO speed. Due to such dependency, we postulate that a correlation predictor is ISO speed-specific, i.e. \textit{reliable correlation predictions can only be made when a correlation predictor is trained with images of similar ISO speeds to the image in question}. We report the experiments we conducted to validate the postulate. It is realized that in the real-world, information about the ISO speed may not be available in the metadata to facilitate the implementation of our postulate in the correlation prediction process. We hence propose a method called Content-based Inference of ISO Speeds (CINFISOS) to infer the ISO speed from the image content. △ Less

Submitted 20 June, 2020; originally announced June 2020.

Comments: The paper is accepted to IEEE Transactions on Information Forensics and Security with the supplementary material

arXiv:2005.05519 [pdf]

A Novel Granular-Based Bi-Clustering Method of Deep Mining the Co-Expressed Genes

Authors: Kaijie Xu, Witold Pedrycz, Zhiwu Li, Yinghui Quan, Weike Nie

Abstract: Traditional clustering methods are limited when dealing with huge and heterogeneous groups of gene expression data, which motivates the development of bi-clustering methods. Bi-clustering methods are used to mine bi-clusters whose subsets of samples (genes) are co-regulated under their test conditions. Studies show that mining bi-clusters of consistent trends and trends with similar degrees of flu… ▽ More Traditional clustering methods are limited when dealing with huge and heterogeneous groups of gene expression data, which motivates the development of bi-clustering methods. Bi-clustering methods are used to mine bi-clusters whose subsets of samples (genes) are co-regulated under their test conditions. Studies show that mining bi-clusters of consistent trends and trends with similar degrees of fluctuations from the gene expression data is essential in bioinformatics research. Unfortunately, traditional bi-clustering methods are not fully effective in discovering such bi-clusters. Therefore, we propose a novel bi-clustering method by involving here the theory of Granular Computing. In the proposed scheme, the gene data matrix, considered as a group of time series, is transformed into a series of ordered information granules. With the information granules we build a characteristic matrix of the gene data to capture the fluctuation trend of the expression value between consecutive conditions to mine the ideal bi-clusters. The experimental results are in agreement with the theoretical analysis, and show the excellent performance of the proposed method. △ Less

Submitted 11 May, 2020; originally announced May 2020.

arXiv:2004.10469 [pdf, other]

Warwick Image Forensics Dataset for Device Fingerprinting In Multimedia Forensics

Authors: Yijun Quan, Chang-Tsun Li, Yujue Zhou, Li Li

Abstract: Device fingerprints like sensor pattern noise (SPN) are widely used for provenance analysis and image authentication. Over the past few years, the rapid advancement in digital photography has greatly reshaped the pipeline of image capturing process on consumer-level mobile devices. The flexibility of camera parameter settings and the emergence of multi-frame photography algorithms, especially high… ▽ More Device fingerprints like sensor pattern noise (SPN) are widely used for provenance analysis and image authentication. Over the past few years, the rapid advancement in digital photography has greatly reshaped the pipeline of image capturing process on consumer-level mobile devices. The flexibility of camera parameter settings and the emergence of multi-frame photography algorithms, especially high dynamic range (HDR) imaging, bring new challenges to device fingerprinting. The subsequent study on these topics requires a new purposefully built image dataset. In this paper, we present the Warwick Image Forensics Dataset, an image dataset of more than 58,600 images captured using 14 digital cameras with various exposure settings. Special attention to the exposure settings allows the images to be adopted by different multi-frame computational photography algorithms and for subsequent device fingerprinting. The dataset is released as an open-source, free for use for the digital forensic community. △ Less

Submitted 7 May, 2020; v1 submitted 22 April, 2020; originally announced April 2020.

Comments: Paper accepted to IEEE International Conference on Multimedia and Expo 2020 (ICME 2020)

arXiv:1811.04208 [pdf, other]

Image Cartoon-Texture Decomposition Using Isotropic Patch Recurrence

Authors: Ruotao Xu, Yuhui Quan, Yong Xu

Abstract: Aiming at separating the cartoon and texture layers from an image, cartoon-texture decomposition approaches resort to image priors to model cartoon and texture respectively. In recent years, patch recurrence has emerged as a powerful prior for image recovery. However, the existing strategies of using patch recurrence are ineffective to cartoon-texture decomposition, as both cartoon contours and te… ▽ More Aiming at separating the cartoon and texture layers from an image, cartoon-texture decomposition approaches resort to image priors to model cartoon and texture respectively. In recent years, patch recurrence has emerged as a powerful prior for image recovery. However, the existing strategies of using patch recurrence are ineffective to cartoon-texture decomposition, as both cartoon contours and texture patterns exhibit strong patch recurrence in images. To address this issue, we introduce the isotropy prior of patch recurrence, that the spatial configuration of similar patches in texture exhibits the isotropic structure which is different from that in cartoon, to model the texture component. Based on the isotropic patch recurrence, we construct a nonlocal sparsification system which can effectively distinguish well-patterned features from contour edges. Incorporating the constructed nonlocal system into morphology component analysis, we develop an effective method to both noiseless and noisy cartoon-texture decomposition. The experimental results have demonstrated the superior performance of the proposed method to the existing ones, as well as the effectiveness of the isotropic patch recurrence prior. △ Less

Submitted 10 November, 2018; originally announced November 2018.

Comments: 13 pages, 10 figures

Showing 1–34 of 34 results for author: Quan, Y