
Showing 1–50 of 71 results for author: Wu, F

Searching in archive stat.
  1. arXiv:2406.08709  [pdf, other

    cs.LG stat.ME

    Introducing Diminutive Causal Structure into Graph Representation Learning

    Authors: Hang Gao, Peng Qiao, Yifan **, Fengge Wu, Jiangmeng Li, Changwen Zheng

    Abstract: When engaging in end-to-end graph representation learning with Graph Neural Networks (GNNs), the intricate causal relationships and rules inherent in graph data pose a formidable challenge for the model in accurately capturing authentic data relationships. A proposed mitigating strategy involves the direct integration of rules or relationships corresponding to the graph data into the model. Howeve… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2401.04900  [pdf, other

    astro-ph.SR astro-ph.IM cs.LG stat.ML

    SPT: Spectral Transformer for Red Giant Stars Age and Mass Estimation

    Authors: Mengmeng Zhang, Fan Wu, Yude Bu, Shanshan Li, Zhen** Yi, Meng Liu, Xiaoming Kong

    Abstract: The age and mass of red giants are essential for understanding the structure and evolution of the Milky Way. Traditional isochrone methods for these estimations are inherently limited due to overlapping isochrones in the Hertzsprung-Russell diagram, while asteroseismology, though more precise, requires high-precision, long-term observations. In response to these challenges, we developed a novel fr… ▽ More

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted by A&A

  3. arXiv:2312.09613  [pdf, other

    cs.LG cs.AI stat.ML

    Rethinking Causal Relationships Learning in Graph Neural Networks

    Authors: Hang Gao, Chengyu Yao, Jiangmeng Li, Lingyu Si, Yifan **, Fengge Wu, Changwen Zheng, Hua** Liu

    Abstract: Graph Neural Networks (GNNs) demonstrate their significance by effectively modeling complex interrelationships within graph-structured data. To enhance the credibility and robustness of GNNs, it becomes exceptionally crucial to bolster their ability to capture causal relationships. However, despite recent advancements that have indeed strengthened GNNs from a causal learning perspective, conductin… ▽ More

    Submitted 15 December, 2023; originally announced December 2023.

  4. arXiv:2312.06098  [pdf, other

    stat.ME math.ST

    Mixture Matrix-valued Autoregressive Model

    Authors: Fei Wu, Kung-Sik Chan

    Abstract: Time series of matrix-valued data are increasingly available in various areas including economics, finance, social science, etc. These data may shed light on the inter-dynamical relationships between two sets of attributes, for instance countries and economic indices. The matrix autoregressive (MAR) model provides a parsimonious approach for analyzing such data. However, the MAR model, being a lin… ▽ More

    Submitted 26 May, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  5. arXiv:2308.08148  [pdf, other

    cs.LG stat.ME

    Hierarchical Topological Ordering with Conditional Independence Test for Limited Time Series

    Authors: Anpeng Wu, Haoxuan Li, Kun Kuang, Keli Zhang, Fei Wu

    Abstract: Learning directed acyclic graphs (DAGs) to identify causal relations underlying observational data is crucial but also poses significant challenges. Recently, topology-based methods have emerged as a two-step approach to discovering DAGs by first learning the topological ordering of variables and then eliminating redundant edges, while ensuring that the graph remains acyclic. However, one limitati… ▽ More

    Submitted 16 August, 2023; originally announced August 2023.

  6. arXiv:2306.07480  [pdf, other

    stat.ME

    ACE: Active Learning for Causal Inference with Expensive Experiments

    Authors: Difan Song, Simon Mak, C. F. Jeff Wu

    Abstract: Experiments are the gold standard for causal inference. In many applications, experimental units can often be recruited or chosen sequentially, and the adaptive execution of such experiments may offer greatly improved inference of causal quantities over non-adaptive approaches, particularly when experiments are expensive. We thus propose a novel active learning method called ACE (Active learning f… ▽ More

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 6 pages, 4 figures

  7. arXiv:2306.03679  [pdf, other

    cs.CV cs.AI cs.CR cs.LG stat.ML

    Human-imperceptible, Machine-recognizable Images

    Authors: Fusheng Hao, Fengxiang He, Yikai Wang, Fuxiang Wu, Jing Zhang, Jun Cheng, Dacheng Tao

    Abstract: Massive human-related data is collected to train neural networks for computer vision tasks. A major conflict is exposed relating to software engineers between better developing AI systems and distancing from the sensitive training data. To reconcile this conflict, this paper proposes an efficient privacy-preserving learning paradigm, where images are first encrypted to become ``human-imperceptible… ▽ More

    Submitted 6 June, 2023; originally announced June 2023.

  8. arXiv:2303.02311  [pdf, other

    cs.LG stat.AP

    Traffic State Estimation from Vehicle Trajectories with Anisotropic Gaussian Processes

    Authors: Fan Wu, Zhanhong Cheng, Huiyu Chen, Tony Z. Qiu, Lijun Sun

    Abstract: Accurately monitoring road traffic state is crucial for various applications, including travel time prediction, traffic control, and traffic safety. However, the lack of sensors often results in incomplete traffic state data, making it challenging to obtain reliable information for decision-making. This paper proposes a novel method for imputing traffic state data using Gaussian processes (GP) to… ▽ More

    Submitted 2 April, 2024; v1 submitted 3 March, 2023; originally announced March 2023.

  9. arXiv:2212.07881  [pdf, other

    astro-ph.GA stat.ML

    Identifying AGN host galaxies with convolutional neural networks

    Authors: Ziting Guo, John F. Wu, Chelsea E. Sharon

    Abstract: Active galactic nuclei (AGN) are supermassive black holes with luminous accretion disks found in some galaxies, and are thought to play an important role in galaxy evolution. However, traditional optical spectroscopy for identifying AGN requires time-intensive observations. We train a convolutional neural network (CNN) to distinguish AGN host galaxies from non-active galaxies using a sample of 210… ▽ More

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: 6 pages, 2 figures. Accepted to the 2022 NeurIPS conference ML4PS workshop

  10. arXiv:2212.05778  [pdf, other

    cs.LG cs.AI stat.ME

    Instrumental Variables in Causal Inference and Machine Learning: A Survey

    Authors: Anpeng Wu, Kun Kuang, Ruoxuan Xiong, Fei Wu

    Abstract: Causal inference is the process of using assumptions, study designs, and estimation strategies to draw conclusions about the causal relationships between variables based on data. This allows researchers to better understand the underlying mechanisms at work in complex systems and make more informed decisions. In many settings, we may not fully observe all the confounders that affect both the treat… ▽ More

    Submitted 12 December, 2022; originally announced December 2022.

  11. arXiv:2211.15762  [pdf, other

    cs.LG stat.ML

    Understanding the Impact of Adversarial Robustness on Accuracy Disparity

    Authors: Yuzheng Hu, Fan Wu, Hongyang Zhang, Han Zhao

    Abstract: While it has long been empirically observed that adversarial robustness may be at odds with standard accuracy and may have further disparate impacts on different classes, it remains an open question to what extent such observations hold and how the class imbalance plays a role within. In this paper, we attempt to understand this question of accuracy disparity by taking a closer look at linear clas… ▽ More

    Submitted 28 May, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted at ICML 2023

  12. arXiv:2211.10008  [pdf, other

    cs.AI stat.ME

    Confounder Balancing for Instrumental Variable Regression with Latent Variable

    Authors: Anpeng Wu, Kun Kuang, Ruoxuan Xiong, Bo Li, Fei Wu

    Abstract: This paper studies the confounding effects from the unmeasured confounders and the imbalance of observed confounders in IV regression and aims at unbiased causal effect estimation. Recently, nonlinear IV estimators were proposed to allow for nonlinear model in both stages. However, the observed confounders may be imbalanced in stage 2, which could still lead to biased treatment effect estimation i… ▽ More

    Submitted 17 November, 2022; originally announced November 2022.

  13. arXiv:2210.14080  [pdf, other

    cs.LG cs.AI cs.SI stat.ME

    Learning Individual Treatment Effects under Heterogeneous Interference in Networks

    Authors: Ziyu Zhao, Yuqi Bai, Kun Kuang, Ruoxuan Xiong, Fei Wu

    Abstract: Estimates of individual treatment effects from networked observational data are attracting increasing attention these days. One major challenge in network scenarios is the violation of the stable unit treatment value assumption (SUTVA), which assumes that the treatment assignment of a unit does not influence others' outcomes. In network data, due to interference, the outcome of a unit is influence… ▽ More

    Submitted 25 January, 2024; v1 submitted 25 October, 2022; originally announced October 2022.

  14. arXiv:2210.04958  [pdf, other

    cs.LG stat.ME

    Mining Causality from Continuous-time Dynamics Models: An Application to Tsunami Forecasting

    Authors: Fan Wu, Sanghyun Hong, Donsub Rim, Noseong Park, Kookjin Lee

    Abstract: Continuous-time dynamics models, such as neural ordinary differential equations, have enabled the modeling of underlying dynamics in time-series data and accurate forecasting. However, parameterization of dynamics using a neural network makes it difficult for humans to identify causal structures in the data. In consequence, this opaqueness hinders the use of these models in the domains where captu… ▽ More

    Submitted 13 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

  15. arXiv:2209.13748  [pdf, other

    stat.ME

    Conglomerate Multi-Fidelity Gaussian Process Modeling, with Application to Heavy-Ion Collisions

    Authors: Yi Ji, Henry Shaowu Yuchi, Derek Soeder, J. -F. Paquet, Steffen A. Bass, V. Roshan Joseph, C. F. Jeff Wu, Simon Mak

    Abstract: In an era where scientific experimentation is often costly, multi-fidelity emulation provides a powerful tool for predictive scientific computing. While there has been notable work on multi-fidelity modeling, existing models do not incorporate an important "conglomerate" property of multi-fidelity simulators, where the accuracies of different simulator components are controlled by different fideli… ▽ More

    Submitted 28 September, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

  16. arXiv:2208.10912  [pdf, other

    cs.AI stat.ME

    Learning Instrumental Variable from Data Fusion for Treatment Effect Estimation

    Authors: Anpeng Wu, Kun Kuang, Ruoxuan Xiong, Minqing Zhu, Yuxuan Liu, Bo Li, Furui Liu, Zhihua Wang, Fei Wu

    Abstract: The advent of the big data era brought new opportunities and challenges to draw treatment effect in data fusion, that is, a mixed dataset collected from multiple sources (each source with an independent treatment assignment mechanism). Due to possibly omitted source labels and unmeasured confounders, traditional methods cannot estimate individual treatment assignment probability and infer treatmen… ▽ More

    Submitted 7 December, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

  17. arXiv:2206.05891  [pdf, other

    cs.LG cs.DC stat.ML

    Anchor Sampling for Federated Learning with Partial Client Participation

    Authors: Feijie Wu, Song Guo, Zhihao Qu, Shiqi He, Ziming Liu, Jing Gao

    Abstract: Compared with full client participation, partial client participation is a more practical scenario in federated learning, but it may amplify some challenges in federated learning, such as data heterogeneity. The lack of inactive clients' updates in partial client participation makes it more likely for the model aggregation to deviate from the aggregation based on full client participation. Trainin… ▽ More

    Submitted 28 May, 2023; v1 submitted 12 June, 2022; originally announced June 2022.

    Comments: ICML 2023

  18. arXiv:2205.05800  [pdf, other

    cs.LG math.OC stat.ML

    Stochastic first-order methods for average-reward Markov decision processes

    Authors: Tianjiao Li, Feiyang Wu, Guanghui Lan

    Abstract: We study the problem of average-reward Markov decision processes (AMDPs) and develop novel first-order methods with strong theoretical guarantees for both policy evaluation and optimization. Existing on-policy evaluation methods suffer from sub-optimal convergence rates as well as failure in handling insufficiently random policies, e.g., deterministic policies, for lack of exploration. To remedy t… ▽ More

    Submitted 14 September, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

  19. arXiv:2105.14524  [pdf, other

    stat.ML cs.LG

    Parameter Estimation for the SEIR Model Using Recurrent Nets

    Authors: Chun Fan, Yuxian Meng, Xiaofei Sun, Fei Wu, Tianwei Zhang, Jiwei Li

    Abstract: The standard way to estimate the parameters $Θ_\text{SEIR}$ (e.g., the transmission rate $β$) of an SEIR model is to use grid search, where simulations are performed on each set of parameters, and the parameter set leading to the least $L_2$ distance between predicted number of infections and observed infections is selected. This brute-force strategy is not only time consuming, as simulations are… ▽ More

    Submitted 30 May, 2021; originally announced May 2021.

  20. arXiv:2105.13831  [pdf, other

    stat.ML cs.LG

    Implicit Regularization in Matrix Sensing via Mirror Descent

    Authors: Fan Wu, Patrick Rebeschini

    Abstract: We study discrete-time mirror descent applied to the unregularized empirical risk in matrix sensing. In both the general case of rectangular matrices and the particular case of positive semidefinite matrices, a simple potential-based analysis in terms of the Bregman divergence allows us to establish convergence of mirror descent -- with different choices of the mirror maps -- to a matrix that, amo… ▽ More

    Submitted 27 October, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

  21. arXiv:2105.03678  [pdf, other

    eess.SP cs.LG stat.ML

    Nearly Minimax-Optimal Rates for Noisy Sparse Phase Retrieval via Early-Stopped Mirror Descent

    Authors: Fan Wu, Patrick Rebeschini

    Abstract: This paper studies early-stopped mirror descent applied to noisy sparse phase retrieval, which is the problem of recovering a $k$-sparse signal $\mathbf{x}^\star\in\mathbb{R}^n$ from a set of quadratic Gaussian measurements corrupted by sub-exponential noise. We consider the (non-convex) unregularized empirical risk minimization problem and show that early-stopped mirror descent, when equipped wit… ▽ More

    Submitted 8 May, 2021; originally announced May 2021.

    Comments: arXiv admin note: text overlap with arXiv:2010.10168

  22. arXiv:2101.06592  [pdf, other

    stat.ME cs.LG

    TSEC: a framework for online experimentation under experimental constraints

    Authors: Simon Mak, Yuanshuo Zhou, Lavonne Hoang, C. F. Jeff Wu

    Abstract: Thompson sampling is a popular algorithm for solving multi-armed bandit problems, and has been applied in a wide range of applications, from website design to portfolio optimization. In such applications, however, the number of choices (or arms) $N$ can be large, and the data needed to make adaptive decisions require expensive experimentation. One is then faced with the constraint of experimenting… ▽ More

    Submitted 17 January, 2021; originally announced January 2021.

  23. arXiv:2012.11798  [pdf, other

    stat.ME cs.LG

    APIK: Active Physics-Informed Kriging Model with Partial Differential Equations

    Authors: Jialei Chen, Zhehui Chen, Chuck Zhang, C. F. Jeff Wu

    Abstract: Kriging (or Gaussian process regression) is a popular machine learning method for its flexibility and closed-form prediction expressions. However, one of the key challenges in applying kriging to engineering systems is that the available measurement data is scarce due to the measurement limitations and high sensing costs. On the other hand, physical knowledge of the engineering system is often ava… ▽ More

    Submitted 21 December, 2020; originally announced December 2020.

  24. arXiv:2010.10168  [pdf, other

    stat.ML cs.LG

    A Continuous-Time Mirror Descent Approach to Sparse Phase Retrieval

    Authors: Fan Wu, Patrick Rebeschini

    Abstract: We analyze continuous-time mirror descent applied to sparse phase retrieval, which is the problem of recovering sparse signals from a set of magnitude-only measurements. We apply mirror descent to the unconstrained empirical risk minimization problem (batch setting), using the square loss and square measurements. We provide a convergence analysis of the algorithm in this non-convex setting and pro… ▽ More

    Submitted 20 October, 2020; originally announced October 2020.

  25. arXiv:2009.07708  [pdf, other

    stat.ML cs.LG

    Better Model Selection with a new Definition of Feature Importance

    Authors: Fan Fang, Carmine Ventre, Lingbo Li, Leslie Kanthan, Fan Wu, Michail Basios

    Abstract: Feature importance aims at measuring how crucial each input feature is for model prediction. It is widely used in feature engineering, model selection and explainable artificial intelligence (XAI). In this paper, we propose a new tree-model explanation approach for model selection. Our novel concept leverages the Coefficient of Variation of a feature weight (measured in terms of the contribution o… ▽ More

    Submitted 16 September, 2020; originally announced September 2020.

  26. arXiv:2008.08931  [pdf, other

    cs.SI cs.LG stat.ML

    A Deep Prediction Network for Understanding Advertiser Intent and Satisfaction

    Authors: Liyi Guo, Rui Lu, Haoqi Zhang, Junqi **, Zhenzhe Zheng, Fan Wu, ** Li, Haiyang Xu, Han Li, Wenkai Lu, Jian Xu, Kun Gai

    Abstract: For e-commerce platforms such as Taobao and Amazon, advertisers play an important role in the entire digital ecosystem: their behaviors explicitly influence users' browsing and shopping experience; more importantly, advertiser's expenditure on advertising constitutes a primary source of platform revenue. Therefore, providing better services for advertisers is essential for the long-term prosperity… ▽ More

    Submitted 20 August, 2020; originally announced August 2020.

    Journal ref: CIKM 2020, Virtual Event, Ireland

  27. arXiv:2008.01411  [pdf, other

    cs.LG cs.CV stat.ML

    Memory Efficient Class-Incremental Learning for Image Classification

    Authors: Hanbin Zhao, Hui Wang, Yongjian Fu, Fei Wu, Xi Li

    Abstract: With the memory-resource-limited constraints, class-incremental learning (CIL) usually suffers from the "catastrophic forgetting" problem when updating the joint classification model on the arrival of newly added classes. To cope with the forgetting problem, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples into the size-constrained memory buffer. To utiliz… ▽ More

    Submitted 18 May, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

  28. arXiv:2006.07040  [pdf, other

    stat.ME cs.LG stat.ML

    Learning Decomposed Representation for Counterfactual Inference

    Authors: Anpeng Wu, Kun Kuang, Junkun Yuan, Bo Li, Runze Wu, Qiang Zhu, Yueting Zhuang, Fei Wu

    Abstract: The fundamental problem in treatment effect estimation from observational data is confounder identification and balancing. Most of the previous methods realized confounder balancing by treating all observed pre-treatment variables as confounders, ignoring further identifying confounders and non-confounders. In general, not all the observed pre-treatment variables are confounders that refer to the… ▽ More

    Submitted 11 October, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

  29. arXiv:2006.05076  [pdf, other

    cs.LG cs.AI stat.ML

    Stable Prediction via Leveraging Seed Variable

    Authors: Kun Kuang, Bo Li, Peng Cui, Yue Liu, Jianrong Tao, Yueting Zhuang, Fei Wu

    Abstract: In this paper, we focus on the problem of stable prediction across unknown test data, where the test distribution is agnostic and might be totally different from the training one. In such a case, previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction. Those spurious correlations are changeable across data, leadin… ▽ More

    Submitted 9 June, 2020; originally announced June 2020.

  30. arXiv:2006.04381  [pdf, other

    cs.LG stat.ME stat.ML

    Balance-Subsampled Stable Prediction

    Authors: Kun Kuang, Hengtao Zhang, Fei Wu, Yueting Zhuang, Aijun Zhang

    Abstract: In machine learning, it is commonly assumed that training and test data share the same population distribution. However, this assumption is often violated in practice because the sample selection bias may induce the distribution shift from training data to test data. Such a model-agnostic distribution shift usually leads to prediction instability across unknown test data. In this paper, we propose… ▽ More

    Submitted 8 June, 2020; originally announced June 2020.

  31. arXiv:2006.01065  [pdf, other

    stat.ML cs.LG eess.SP

    Hadamard Wirtinger Flow for Sparse Phase Retrieval

    Authors: Fan Wu, Patrick Rebeschini

    Abstract: We consider the problem of reconstructing an $n$-dimensional $k$-sparse signal from a set of noiseless magnitude-only measurements. Formulating the problem as an unregularized empirical risk minimization task, we study the sample complexity performance of gradient descent with Hadamard parametrization, which we call Hadamard Wirtinger flow (HWF). Provided knowledge of the signal sparsity $k$, we p… ▽ More

    Submitted 24 February, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

  32. arXiv:2002.11102  [pdf, other

    cs.LG cs.CL cs.CV stat.ML

    On Feature Normalization and Data Augmentation

    Authors: Boyi Li, Felix Wu, Ser-Nam Lim, Serge Belongie, Kilian Q. Weinberger

    Abstract: The moments (a.k.a., mean and standard deviation) of latent features are often removed as noise when training image recognition models, to increase stability and reduce training time. However, in the field of image generation, the moments play a much more central role. Studies have shown that the moments extracted from instance normalization and positional normalization can roughly capture style a… ▽ More

    Submitted 30 March, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: CVPR 2021. Code is available at https://github.com/Boyiliee/MoEx

  33. arXiv:2002.07454  [pdf, other

    cs.LG cs.DC math.OC stat.ML

    Distributed Optimization over Block-Cyclic Data

    Authors: Yucheng Ding, Chaoyue Niu, Yikai Yan, Zhenzhe Zheng, Fan Wu, Guihai Chen, Shaojie Tang, Rongfei Jia

    Abstract: We consider practical data characteristics underlying federated learning, where unbalanced and non-i.i.d. data from clients have a block-cyclic structure: each cycle contains several blocks, and each client's training data follow block-specific and non-i.i.d. distributions. Such a data structure would introduce client and block biases during the collaborative training: the single global model woul… ▽ More

    Submitted 18 February, 2020; originally announced February 2020.

  34. arXiv:2002.07399  [pdf, other

    stat.ML cs.DC cs.LG math.OC

    Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

    Authors: Yikai Yan, Chaoyue Niu, Yucheng Ding, Zhenzhe Zheng, Fan Wu, Guihai Chen, Shaojie Tang, Zhihua Wu

    Abstract: Federated learning is a new distributed machine learning framework, where a bunch of heterogeneous clients collaboratively train a model without sharing training data. In this work, we consider a practical and ubiquitous issue when deploying federated learning in mobile environments: intermittent client availability, where the set of eligible clients may change during the training process. Such in… ▽ More

    Submitted 21 December, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

  35. arXiv:2001.06858  [pdf, ps, other

    stat.ML cs.LG stat.ME

    Finding Optimal Points for Expensive Functions Using Adaptive RBF-Based Surrogate Model Via Uncertainty Quantification

    Authors: Ray-Bing Chen, Yuan Wang, C. F. Jeff Wu

    Abstract: Global optimization of expensive functions has important applications in physical and computer experiments. It is a challenging problem to develop efficient optimization scheme, because each function evaluation can be costly and the derivative information of the function is often not available. We propose a novel global optimization framework using adaptive Radial Basis Functions (RBF) based surro… ▽ More

    Submitted 19 January, 2020; originally announced January 2020.

    Comments: 35 pages, 5 figures, 5 tables

  36. arXiv:1912.00949  [pdf, other

    cs.LG cs.AI cs.MA stat.ML

    Multi-Agent Deep Reinforcement Learning with Adaptive Policies

    Authors: Yixiang Wang, Feng Wu

    Abstract: We propose a novel approach to address one aspect of the non-stationarity problem in multi-agent reinforcement learning (RL), where the other agents may alter their policies due to environment changes during execution. This violates the Markov assumption that governs most single-agent RL methods and is one of the key challenges in multi-agent RL. To tackle this, we propose to train multiple polici… ▽ More

    Submitted 28 November, 2019; originally announced December 2019.

    Comments: arXiv admin note: text overlap with arXiv:1706.02275 by other authors

  37. arXiv:1911.07285  [pdf, other

    stat.ME

    A hierarchical expected improvement method for Bayesian optimization

    Authors: Zhehui Chen, Simon Mak, C. F. Jeff Wu

    Abstract: The Expected Improvement (EI) method, proposed by Jones et al. (1998), is a widely-used Bayesian optimization method, which makes use of a fitted Gaussian process model for efficient black-box optimization. However, one key drawback of EI is that it is overly greedy in exploiting the fitted Gaussian process model for optimization, which results in suboptimal solutions even with large sample sizes.… ▽ More

    Submitted 20 April, 2023; v1 submitted 17 November, 2019; originally announced November 2019.

  38. arXiv:1911.07128  [pdf, other

    cs.LG stat.ML

    Scalability vs. Utility: Do We Have to Sacrifice One for the Other in Data Importance Quantification?

    Authors: Ruoxi Jia, Fan Wu, Xuehui Sun, Jiacen Xu, David Dao, Bhavya Kailkhura, Ce Zhang, Bo Li, Dawn Song

    Abstract: Quantifying the importance of each training point to a learning task is a fundamental problem in machine learning and the estimated importance scores have been leveraged to guide a range of data workflows such as data summarization and domain adaption. One simple idea is to use the leave-one-out error of each training point to indicate its importance. Recent work has also proposed to use the Shapl… ▽ More

    Submitted 25 April, 2021; v1 submitted 16 November, 2019; originally announced November 2019.

  39. arXiv:1911.05531  [pdf, other

    q-bio.BM cs.LG stat.ML

    Accurate Protein Structure Prediction by Embeddings and Deep Learning Representations

    Authors: Iddo Drori, Darshan Thaker, Arjun Srivatsa, Daniel Jeong, Yueqi Wang, Linyong Nan, Fan Wu, Dimitri Leggas, **hao Lei, Weiyi Lu, Weilong Fu, Yuan Gao, Sashank Karri, Anand Kannan, Antonio Moretti, Mohammed AlQuraishi, Chen Keasar, Itsik Pe'er

    Abstract: Proteins are the major building blocks of life, and actuators of almost all chemical and biophysical events in living organisms. Their native structures in turn enable their biological functions which have a fundamental role in drug design. This motivates predicting the structure of a protein from its sequence of amino acids, a fundamental problem in computational biology. In this work, we demonst… ▽ More

    Submitted 8 November, 2019; originally announced November 2019.

    Journal ref: Machine Learning in Computational Biology, 2019

  40. arXiv:1911.02254  [pdf, other

    cs.LG cs.CR cs.DC stat.ML

    Secure Federated Submodel Learning

    Authors: Chaoyue Niu, Fan Wu, Shaojie Tang, Lifeng Hua, Rongfei Jia, Chengfei Lv, Zhihua Wu, Guihai Chen

    Abstract: Federated learning was proposed with an intriguing vision of achieving collaborative machine learning among numerous clients without uploading their private data to a cloud server. However, the conventional framework requires each client to leverage the full model for learning, which can be prohibitively inefficient for resource-constrained clients and large-scale deep learning tasks. We thus prop… ▽ More

    Submitted 11 November, 2019; v1 submitted 6 November, 2019; originally announced November 2019.

  41. arXiv:1910.04499  [pdf, other

    cs.LG stat.ML

    DeGNN: Characterizing and Improving Graph Neural Networks with Graph Decomposition

    Authors: Xupeng Miao, Nezihe Merve Gürel, Wentao Zhang, Zhichao Han, Bo Li, Wei Min, Xi Rao, Hansheng Ren, Yinan Shan, Yingxia Shao, Yujie Wang, Fan Wu, Hui Xue, Yaming Yang, Zitao Zhang, Yang Zhao, Shuai Zhang, Yujing Wang, Bin Cui, Ce Zhang

    Abstract: Despite the wide application of Graph Convolutional Network (GCN), one major limitation is that it does not benefit from the increasing depth and suffers from the oversmoothing problem. In this work, we first characterize this phenomenon from the information-theoretic perspective and show that under certain conditions, the mutual information between the output after $l$ layers and the input of GCN… ▽ More

    Submitted 29 June, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: 20 pages, 5 figures, 5 tables

  42. arXiv:1910.00387  [pdf, other

    cs.LG cs.CV stat.ML

    Leveraging Model Interpretability and Stability to increase Model Robustness

    Authors: Fei Wu, Thomas Michel, Alexandre Briot

    Abstract: State of the art Deep Neural Networks (DNN) can now achieve above human level accuracy on image classification tasks. However their outstanding performances come along with a complex inference mechanism making them arduously interpretable models. In order to understand the underlying prediction rules of DNNs, Dhamdhere et al. propose an interpretability method to break down a DNN prediction score… ▽ More

    Submitted 5 November, 2019; v1 submitted 1 October, 2019; originally announced October 2019.

    Comments: 2019 ICCV workshop on Interpreting and Explaining Visual AI models; 8 pages

  43. arXiv:1909.08329  [pdf, ps, other

    cs.LG cs.DC stat.ML

    From Server-Based to Client-Based Machine Learning: A Comprehensive Survey

    Authors: Renjie Gu, Chaoyue Niu, Fan Wu, Guihai Chen, Chun Hu, Chengfei Lyu, Zhihua Wu

    Abstract: In recent years, mobile devices have gained increasing development with stronger computation capability and larger storage space. Some of the computation-intensive machine learning tasks can now be run on mobile devices. To exploit the resources available on mobile devices and preserve personal privacy, the concept of client-based machine learning has been proposed. It leverages the users' local h… ▽ More

    Submitted 27 September, 2021; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: Accepted to ACM CSUR 2021, Volume 54, Issue 1, Pages 6:1 - 6:36

    ACM Class: A.1; I.2.6; I.2.11

    Journal ref: ACM Comput. Surv. 54, 1 (2021), 6:1 - 6:36

  44. arXiv:1908.08868  [pdf, other]

    stat.ME math.ST

    BdryGP: a new Gaussian process model for incorporating boundary information

    Authors: Liang Ding, Simon Mak, C. F. Jeff Wu

    Abstract: Gaussian processes (GPs) are widely used as surrogate models for emulating computer code, which simulate complex physical phenomena. In many problems, additional boundary information (i.e., the behavior of the phenomena along input boundaries) is known beforehand, either from governing physics or scientific knowledge. While there has been recent work on incorporating boundary information within GP…

    Submitted 23 August, 2019; originally announced August 2019.

  45. arXiv:1906.03333  [pdf, other]

    cs.LG cs.CR stat.ML

    Efficient Project Gradient Descent for Ensemble Adversarial Attack

    Authors: Fanyou Wu, Rado Gazo, Eva Haviarova, Bedrich Benes

    Abstract: Recent advances show that deep neural networks are not robust to deliberately crafted adversarial examples which many are generated by adding human imperceptible perturbation to clear input. Consider $l_2$ norms attacks, Project Gradient Descent (PGD) and the Carlini and Wagner (C&W) attacks are the two main methods, where PGD control max perturbation for adversarial examples while C&W approach…

    Submitted 7 June, 2019; originally announced June 2019.

    Comments: 6 pages, 2 figures, submit to IJCAI 19 AIBS workshop

  46. arXiv:1906.00554  [pdf, other]

    cs.LG stat.ML

    Factor Graph Neural Network

    Authors: Zhen Zhang, Fan Wu, Wee Sun Lee

    Abstract: Most of the successful deep neural network architectures are structured, often consisting of elements like convolutional neural networks and gated recurrent neural networks. Recently, graph neural networks have been successfully applied to graph structured data such as point cloud and molecular data. These networks often only consider pairwise dependencies, as they operate on a graph structure. We…

    Submitted 2 June, 2019; originally announced June 2019.

  47. arXiv:1905.01964  [pdf, other]

    cs.CL cs.LG stat.ML

    Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation

    Authors: Fangzhao Wu, Junxin Liu, Chuhan Wu, Yongfeng Huang, Xing Xie

    Abstract: Chinese named entity recognition (CNER) is an important task in Chinese natural language processing field. However, CNER is very challenging since Chinese entity names are highly context-dependent. In addition, Chinese texts lack delimiters to separate words, making it difficult to identify the boundary of entities. Besides, the training data for CNER in many domains is usually insufficient, and a…

    Submitted 26 April, 2019; originally announced May 2019.

    Comments: 7 pages, 3 figures, accepted by the 2019 World Wide Web Conference (WWW'19)

  48. arXiv:1905.01963  [pdf, other]

    cs.CL cs.LG stat.ML

    Neural Chinese Word Segmentation with Lexicon and Unlabeled Data via Posterior Regularization

    Authors: Junxin Liu, Fangzhao Wu, Chuhan Wu, Yongfeng Huang, Xing Xie

    Abstract: Existing methods for CWS usually rely on a large number of labeled sentences to train word segmentation models, which are expensive and time-consuming to annotate. Luckily, the unlabeled data is usually easy to collect and many high-quality Chinese lexicons are off-the-shelf, both of which can provide useful information for CWS. In this paper, we propose a neural approach for Chinese word segmenta…

    Submitted 26 April, 2019; originally announced May 2019.

    Comments: 7 pages, 11 figures, accepted by the 2019 World Wide Web Conference (WWW '19)

  49. arXiv:1903.09239  [pdf, other]

    stat.ML cs.LG

    Multi-Domain Adversarial Learning

    Authors: Alice Schoenauer-Sebag, Louise Heinrich, Marc Schoenauer, Michele Sebag, Lani F. Wu, Steve J. Altschuler

    Abstract: Multi-domain learning (MDL) aims at obtaining a model with minimal average risk across multiple domains. Our empirical motivation is automated microscopy data, where cultured cells are imaged after being exposed to known and unknown chemical perturbations, and each dataset displays significant experimental bias. This paper presents a multi-domain adversarial learning approach, MuLANN, to leverage…

    Submitted 21 March, 2019; originally announced March 2019.

    Comments: Accepted at ICLR'19

    Journal ref: ICLR 2019-Seventh annual International Conference on Learning Representations

  50. arXiv:1902.07903  [pdf, other]

    cs.NI cs.LG stat.ML

    Learning Deterministic Policy with Target for Power Control in Wireless Networks

    Authors: Yujiao Lu, Hancheng Lu, Liangliang Cao, Feng Wu, Daren Zhu

    Abstract: Inter-Cell Interference Coordination (ICIC) is a promising way to improve energy efficiency in wireless networks, especially where small base stations are densely deployed. However, traditional optimization based ICIC schemes suffer from severe performance degradation with complex interference pattern. To address this issue, we propose a Deep Reinforcement Learning with Deterministic Policy and Ta…

    Submitted 21 February, 2019; originally announced February 2019.

    Comments: 7 pages, 7 figures, GlobeCom2018