
Showing 1–40 of 40 results for author: Dinh, L

Searching in archive cs.
  1. arXiv:2405.02798  [pdf, other]

    cs.SI

    Structural Balance in Real-World Social Networks: Incorporating Direction and Transitivity in Measuring Partial Balance

    Authors: Rezvaneh Rezapour, Ly Dinh, Lan Jiang, Jana Diesner

    Abstract: Structural balance theory predicts that triads in networks gravitate towards stable configurations. The theory has been verified for undirected graphs. Since real-world networks are often directed, we introduce a novel method for considering both transitivity and sign consistency for evaluating partial balance in signed digraphs. We test our approach on graphs constructed by using different method…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2006.02565

  2. arXiv:2403.05112  [pdf, other]

    cs.AI

    RLPeri: Accelerating Visual Perimetry Test with Reinforcement Learning and Convolutional Feature Extraction

    Authors: Tanvi Verma, Linh Le Dinh, Nicholas Tan, Xinxing Xu, Chingyu Cheng, Yong Liu

    Abstract: Visual perimetry is an important eye examination that helps detect vision problems caused by ocular or neurological conditions. During the test, a patient's gaze is fixed at a specific location while light stimuli of varying intensities are presented in central and peripheral vision. Based on the patient's responses to the stimuli, the visual field mapping and sensitivity are determined. However,…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Published at AAAI-24

    Journal ref: The 38th Annual AAAI Conference on Artificial Intelligence, 2024

  3. arXiv:2401.05525  [pdf, other]

    cs.NI cs.LG

    Towards Safe Load Balancing based on Control Barrier Functions and Deep Reinforcement Learning

    Authors: Lam Dinh, Pham Tran Anh Quang, Jérémie Leguay

    Abstract: Deep Reinforcement Learning (DRL) algorithms have recently made significant strides in improving network performance. Nonetheless, their practical use is still limited in the absence of safe exploration and safe decision-making. In the context of commercial solutions, reliable and safe-to-operate systems are of paramount importance. Taking this problem into account, we propose a safe learning-base…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: Accepted to IEEE/IFIP NOMS 2024

  4. arXiv:2312.04000  [pdf, other]

    cs.LG cs.CV

    LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures

    Authors: Vimal Thilak, Chen Huang, Omid Saremi, Laurent Dinh, Hanlin Goh, Preetum Nakkiran, Joshua M. Susskind, Etai Littwin

    Abstract: Joint embedding (JE) architectures have emerged as a promising avenue for acquiring transferable data representations. A key obstacle to using JE methods, however, is the inherent challenge of evaluating learned representations without access to a downstream task, and an annotated dataset. Without efficient and reliable evaluation, it is difficult to iterate on architectural and training choices f…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Technical report

  5. arXiv:2310.08866  [pdf, other]

    cs.LG cs.AI

    Adaptivity and Modularity for Efficient Generalization Over Task Complexity

    Authors: Samira Abnar, Omid Saremi, Laurent Dinh, Shantel Wilson, Miguel Angel Bautista, Chen Huang, Vimal Thilak, Etai Littwin, Jiatao Gu, Josh Susskind, Samy Bengio

    Abstract: Can transformers generalize efficiently on problems that require dealing with examples with different levels of difficulty? We introduce a new task tailored to assess generalization over different complexities and present results that indicate that standard transformers face challenges in solving these tasks. These tasks are variations of pointer value retrieval previously introduced by Zhang et a…

    Submitted 13 October, 2023; originally announced October 2023.

  6. arXiv:2310.07805  [pdf, other]

    cs.LG cs.AI

    Generative Modeling with Phase Stochastic Bridges

    Authors: Tianrong Chen, Jiatao Gu, Laurent Dinh, Evangelos A. Theodorou, Joshua Susskind, Shuangfei Zhai

    Abstract: Diffusion models (DMs) represent state-of-the-art generative models for continuous inputs. DMs work by constructing a Stochastic Differential Equation (SDE) in the input space (i.e., position space), and using a neural network to reverse it. In this work, we introduce a novel generative modeling framework grounded in phase space dynamics, where a phase space is defined as an augmented spac…

    Submitted 12 May, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  7. arXiv:2308.02212  [pdf, other]

    cs.DL

    Hyperauthored papers disproportionately amplify important egocentric network metrics

    Authors: Ly Dinh, William C. Barley, Lauren Johnson, Brian F. Allan

    Abstract: Hyperauthorship, a phenomenon whereby there are a disproportionately large number of authors on a single paper, is increasingly common in several scientific disciplines, but with unknown consequences for network metrics used to study scientific collaboration. The validity of co-authorship as a proxy for scientific collaboration is affected by this. Using bibliometric data from publications in the…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: 36 pages, 8 figures, 3 tables, journal preprint

  8. arXiv:2306.13872  [pdf, other]

    cs.RO cs.AI cs.LG

    Learning from Pixels with Expert Observations

    Authors: Minh-Huy Hoang, Long Dinh, Hai Nguyen

    Abstract: In reinforcement learning (RL), sparse rewards can present a significant challenge. Fortunately, expert actions can be utilized to overcome this issue. However, acquiring explicit expert actions can be costly, and expert observations are often more readily available. This paper presents a new approach that uses expert observations for learning in robot manipulation tasks with sparse rewards from p…

    Submitted 15 July, 2023; v1 submitted 24 June, 2023; originally announced June 2023.

    Comments: Accepted at IROS-2023 (Detroit, USA), the first two authors contributed equally

  9. arXiv:2305.17648  [pdf, other]

    cs.CV

    Z-GMOT: Zero-shot Generic Multiple Object Tracking

    Authors: Kim Hoang Tran, Anh Duy Le Dinh, Tien Phat Nguyen, Thinh Phan, Pha Nguyen, Khoa Luu, Donald Adjeroh, Gianfranco Doretto, Ngan Hoang Le

    Abstract: Despite recent significant progress, Multi-Object Tracking (MOT) faces limitations such as reliance on prior knowledge and predefined categories and struggles with unseen objects. To address these issues, Generic Multiple Object Tracking (GMOT) has emerged as an alternative approach, requiring less prior information. However, current GMOT methods often rely on initial bounding boxes and struggle t…

    Submitted 13 June, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

  10. arXiv:2304.10498  [pdf, other]

    cs.GT

    Regret-Minimizing Double Oracle for Extensive-Form Games

    Authors: Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang

    Abstract: By incorporating regret minimization, double oracle methods have demonstrated rapid convergence to Nash Equilibrium (NE) in normal-form games and extensive-form games, through algorithms such as online double oracle (ODO) and extensive-form double oracle (XDO), respectively. In this study, we further examine the theoretical convergence rate and sample complexity of such regret minimization-based d…

    Submitted 13 July, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted at ICML, 2023

  11. arXiv:2302.06652  [pdf, other]

    cs.LG cs.GT cs.MA

    Achieving Better Regret against Strategic Adversaries

    Authors: Le Cong Dinh, Tri-Dung Nguyen, Alain Zemkoho, Long Tran-Thanh

    Abstract: We study online learning problems in which the learner has extra knowledge about the adversary's behaviour, i.e., in game-theoretic settings where opponents typically follow some no-external regret learning algorithms. Under this assumption, we propose two new online learning algorithms, Accurate Follow the Regularized Leader (AFTRL) and Prod-Best Response (Prod-BR), that intensively exploit this…

    Submitted 13 February, 2023; originally announced February 2023.

  12. arXiv:2207.13751  [pdf, other]

    cs.CV cs.GR cs.LG

    GAUDI: A Neural Architect for Immersive 3D Scene Generation

    Authors: Miguel Angel Bautista, Pengsheng Guo, Samira Abnar, Walter Talbott, Alexander Toshev, Zhuoyuan Chen, Laurent Dinh, Shuangfei Zhai, Hanlin Goh, Daniel Ulbricht, Afshin Dehghan, Josh Susskind

    Abstract: We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera. We tackle this challenging problem with a scalable yet powerful approach, where we first optimize a latent representation that disentangles radiance fields and camera poses. This latent representation is then used to learn a generati…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: Project webpage: https://github.com/apple/ml-gaudi

  13. arXiv:2205.01644  [pdf, other]

    cs.NI

    Towards URLLC with Proactive HARQ Adaptation

    Authors: Lam Ngoc Dinh, Ibtissam Labriji, Mickael Maman, Emilio Calvanese Strinati

    Abstract: In this work, we propose a dynamic decision maker algorithm to improve the proactive HARQ protocol for beyond 5G networks. Based on Lyapunov stochastic optimization, our adaptation control framework dynamically selects the number of proactive retransmissions for intermittent URLLC traffic scenarios under time-varying channel conditions without requiring any prior knowledge associated with this sto…

    Submitted 14 April, 2022; originally announced May 2022.

  14. arXiv:2110.13532  [pdf, other]

    cs.GT cs.MA

    Playing Coopetitive Polymatrix Games with Small Manipulation Cost

    Authors: Shivakumar Mahesh, Nicholas Bishop, Le Cong Dinh, Long Tran-Thanh

    Abstract: Iterated coopetitive games capture the situation when one must efficiently balance between cooperation and competition with the other agents over time in order to win the game (e.g., to become the player with highest total utility). Achieving this balance is typically very challenging or even impossible when explicit communication is not feasible (e.g., negotiation or bargaining are not allowed).…

    Submitted 10 March, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

  15. arXiv:2110.03604  [pdf, ps, other]

    cs.LG cs.AI cs.GT cs.MA

    Online Markov Decision Processes with Non-oblivious Strategic Adversary

    Authors: Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang

    Abstract: We study a novel setting in Online Markov Decision Processes (OMDPs) where the loss function is chosen by a non-oblivious strategic adversary who follows a no-external regret algorithm. In this setting, we first demonstrate that MDP-Expert, an existing algorithm that works well with oblivious adversaries can still apply and achieve a policy regret bound of…

    Submitted 27 January, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Accepted at Autonomous Agents and Multi-Agent Systems (2023)

    Report number: 15

  16. arXiv:2103.07780  [pdf, other]

    cs.AI cs.GT

    Online Double Oracle

    Authors: Le Cong Dinh, Yaodong Yang, Stephen McAleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang

    Abstract: Solving strategic games with huge action space is a critical yet under-explored topic in economics, operations research and artificial intelligence. This paper proposes new learning algorithms for solving two-player zero-sum normal-form games where the number of pure strategies is prohibitively large. Specifically, we combine no-regret analysis from online learning with Double Oracle (DO) methods…

    Submitted 15 February, 2023; v1 submitted 13 March, 2021; originally announced March 2021.

    Comments: Accepted at Transactions on Machine Learning Research (TMLR)

    Journal ref: Transactions on Machine Learning Research 2022

  17. arXiv:2101.01045  [pdf, other]

    math.OC cs.LG

    Comparing different subgradient methods for solving convex optimization problems with functional constraints

    Authors: Thi Lan Dinh, Ngoc Hoang Anh Mai

    Abstract: We consider the problem of minimizing a convex, nonsmooth function subject to a closed convex constraint domain. The methods that we propose are reforms of subgradient methods based on Metel--Takeda's paper [Optimization Letters 15.4 (2021): 1491-1504] and Boyd's works [Lecture notes of EE364b, Stanford University, Spring 2013-14, pp. 1-39]. While the former has complexity…

    Submitted 21 January, 2023; v1 submitted 4 January, 2021; originally announced January 2021.

    Comments: 25 pages, 10 tables, 15 figures

  18. arXiv:2012.03808  [pdf, other]

    cs.LG stat.ML

    Perfect density models cannot guarantee anomaly detection

    Authors: Charline Le Lan, Laurent Dinh

    Abstract: Thanks to the tractability of their likelihood, several deep generative models show promise for seemingly straightforward but important applications like anomaly detection, uncertainty estimation, and active learning. However, the likelihood values empirically attributed to anomalies conflict with the expectations these proposed applications suggest. In this paper, we take a closer look at the beh…

    Submitted 15 January, 2022; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: Accepted to the Special Issue "Probabilistic Methods for Deep Learning" of the Journal Entropy. 14 pages and 10 figures in main content, 4 pages of bibliography, and 2 pages in Appendix

    Journal ref: Entropy 23 (2021) 1690

  19. arXiv:2009.12275  [pdf, other]

    cs.NI

    Energy Efficient Resource Allocation Optimization in Fog Radio Access Networks with Outdated Channel Knowledge

    Authors: Thi Ha Ly Dinh, Megumi Kaneko, Ellen Hidemi Fukuda, Lila Boukhatem

    Abstract: Fog Radio Access Networks (F-RAN) are gaining worldwide interests for enabling mobile edge computing for Beyond 5G. However, to realize the future real-time and delay-sensitive applications, F-RAN tailored radio resource allocation and interference management become necessary. This work investigates user association and beamforming issues for providing energy efficient F-RANs. We formulate the ene…

    Submitted 25 September, 2020; originally announced September 2020.

  20. Exploiting No-Regret Algorithms in System Design

    Authors: Le Cong Dinh, Nick Bishop, Long Tran-Thanh

    Abstract: We investigate a repeated two-player zero-sum game setting where the column player is also a designer of the system, and has full control on the design of the payoff matrix. In addition, the row player uses a no-regret algorithm to efficiently learn how to adapt their strategy to the column player's behaviour over time in order to achieve good total payoff. The goal of the column player is to guid…

    Submitted 15 February, 2023; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: Accepted at International Foundation for Autonomous Agents and Multiagent Systems (AAMAS 2022)

    Journal ref: Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems 2022

  21. arXiv:2006.02565  [pdf, other]

    cs.SI physics.soc-ph

    Structural balance in signed digraphs: considering transitivity to measure balance in graphs constructed by using different link signing methods

    Authors: Ly Dinh, Rezvaneh Rezapour, Lan Jiang, Jana Diesner

    Abstract: Structural balance theory assumes triads in networks to gravitate towards stable configurations. The theory has been verified for undirected graphs. Since real-world networks are often directed, we introduce a novel method for considering both transitivity and sign consistency for calculating balance in signed digraphs. We test our approach on graphs that we constructed by using different methods…

    Submitted 3 June, 2020; originally announced June 2020.

    Comments: 27 pages including figures, tables, references

  22. An Empirical Methodology for Detecting and Prioritizing Needs during Crisis Events

    Authors: M. Janina Sarol, Ly Dinh, Rezvaneh Rezapour, Chieh-Li Chin, Pingjing Yang, Jana Diesner

    Abstract: In times of crisis, identifying the essential needs is a crucial step to providing appropriate resources and services to affected entities. Social media platforms such as Twitter contain vast amount of information about the general public's needs. However, the sparsity of the information as well as the amount of noisy content present a challenge to practitioners to effectively identify shared info…

    Submitted 2 June, 2020; originally announced June 2020.

  23. arXiv:2005.09925  [pdf, other]

    cs.SI math.OC physics.soc-ph

    Multilevel Structural Evaluation of Signed Directed Social Networks based on Balance Theory

    Authors: Samin Aref, Ly Dinh, Rezvaneh Rezapour, Jana Diesner

    Abstract: Balance theory explains the forces behind the structure of social systems, which are commonly modeled as static undirected signed networks. We expand this modeling approach to incorporate directionality of edges, and consider three levels of analysis: triads, subgroups, and the whole network. For triad-level balance, we operationalize a new measure by utilizing semicycles that satisfy the conditio…

    Submitted 20 July, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

    Comments: Peer-reviewed author copy, combined 13-page manuscript and 13-page supplementary information

    MSC Class: 05C22; 90C90; 90C09; 90C10; 90C35; 05C15; 65K05

  24. arXiv:2003.11727  [pdf, ps, other]

    cs.GT

    Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information

    Authors: Le Cong Dinh, Long Tran-Thanh, Tri-Dung Nguyen, Alain B. Zemkoho

    Abstract: This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseu…

    Submitted 15 February, 2023; v1 submitted 25 March, 2020; originally announced March 2020.

    Comments: Accepted at International Conference on Algorithmic Learning Theory, PMLR 132:553-577, 2021

    Journal ref: Proceedings of Machine Learning Research 2021

  25. arXiv:2002.07101  [pdf, other]

    cs.LG stat.ML

    Augmented Normalizing Flows: Bridging the Gap Between Generative Flows and Latent Variable Models

    Authors: Chin-Wei Huang, Laurent Dinh, Aaron Courville

    Abstract: In this work, we propose a new family of generative flows on an augmented data space, with an aim to improve expressivity without drastically increasing the computational cost of sampling and evaluation of a lower bound on the likelihood. Theoretically, we prove the proposed flow can approximate a Hamiltonian ODE as a universal transport map. Empirically, we demonstrate state-of-the-art performanc…

    Submitted 17 February, 2020; originally announced February 2020.

    Comments: 27 pages, 12 figures

  26. arXiv:1905.10347  [pdf, other]

    cs.LG stat.ML

    Discrete Flows: Invertible Generative Models of Discrete Data

    Authors: Dustin Tran, Keyon Vafa, Kumar Krishna Agrawal, Laurent Dinh, Ben Poole

    Abstract: While normalizing flows have led to significant advances in modeling high-dimensional continuous distributions, their applicability to discrete distributions remains unknown. In this paper, we show that flows can in fact be extended to discrete events---and under a simple change-of-variables formula not requiring log-determinant-Jacobian computations. Discrete flows have numerous applications. We…

    Submitted 24 May, 2019; originally announced May 2019.

  27. arXiv:1903.07714  [pdf, other]

    cs.LG stat.ML

    A RAD approach to deep mixture models

    Authors: Laurent Dinh, Jascha Sohl-Dickstein, Hugo Larochelle, Razvan Pascanu

    Abstract: Flow based models such as Real NVP are an extremely powerful approach to density estimation. However, existing flow based models are restricted to transforming continuous densities over a continuous input space into similarly continuous distributions over continuous latent variables. This makes them poorly suited for modeling and representing discrete structures in data distributions, for example…

    Submitted 25 August, 2020; v1 submitted 18 March, 2019; originally announced March 2019.

    Comments: 18.5 pages of main content, 3 pages of appendices

  28. arXiv:1903.01434  [pdf, other]

    cs.CV cs.AI cs.LG

    VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

    Authors: Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma

    Abstract: Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions. However, a central challenge in video prediction is that the future is highly uncertain: a sequence of past observations of events can imply many possible futures. Although a number of recent works have studied probabilistic models…

    Submitted 12 February, 2020; v1 submitted 4 March, 2019; originally announced March 2019.

    Comments: ICLR 2020 Camera-Ready. Previous title: VideoFlow: A Flow-Based Generative Model for Video

  29. arXiv:1804.06318  [pdf, other]

    cs.AI cs.NE cs.RO

    Learning Awareness Models

    Authors: Brandon Amos, Laurent Dinh, Serkan Cabi, Thomas Rothörl, Sergio Gómez Colmenarejo, Alistair Muldal, Tom Erez, Yuval Tassa, Nando de Freitas, Misha Denil

    Abstract: We consider the setting of an agent with a fixed body interacting with an unknown and uncertain external world. We show that models trained to predict proprioceptive information about the agent's body come to represent objects in the external world. In spite of being trained with only internally available signals, these dynamic body models come to represent external objects through the necessity o…

    Submitted 17 April, 2018; originally announced April 2018.

    Comments: Accepted to ICLR 2018

  30. arXiv:1802.01104  [pdf, ps, other]

    eess.SP cs.NI

    User Pre-Scheduling and Beamforming with Imperfect CSI in 5G Fog Radio Access Networks

    Authors: Nicolas Pontois, Megumi Kaneko, Thi Ha Ly Dinh, Lila Boukhatem

    Abstract: We investigate the user-to-cell association (or user-clustering) and beamforming design for Cloud Radio Access Networks (CRANs) and Fog Radio Access Networks (FogRANs) for 5G. CRAN enables cloud centralized resource and power allocation optimization over all the small cells served by multiple Access Points (APs). However, the fronthaul links connecting each AP to the cloud introduce delays and cau…

    Submitted 4 February, 2018; originally announced February 2018.

  31. arXiv:1710.02248  [pdf, other]

    cs.LG cs.AI stat.ML

    Learnable Explicit Density for Continuous Latent Space and Variational Inference

    Authors: Chin-Wei Huang, Ahmed Touati, Laurent Dinh, Michal Drozdzal, Mohammad Havaei, Laurent Charlin, Aaron Courville

    Abstract: In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior. First, we decompose the learning of VAEs into layerwise density estimation, and argue that having a flexible prior is beneficial to both sample generation and inference. Second, we analyze the family of inverse autoregressive flows (inverse AF)…

    Submitted 5 October, 2017; originally announced October 2017.

    Comments: 2 figures, 5 pages, submitted to ICML Principled Approaches to Deep Learning workshop

  32. Discovering Business Rules from Business Process Models

    Authors: Thanh Thoa Pham Thi, Markus Helfert, Fakir Hossain, Thang Le Dinh

    Abstract: Discovering business rules from business process models are of advantage to ensure the compliance of business processes with business rules. Furthermore it provides the agility of business processes in case of business rules evolution. Current approaches are limited on types of rules that can be discovered. This paper analyses the expression power of some popular business process modelling languag…

    Submitted 11 April, 2017; originally announced April 2017.

    Comments: International Conference on Computer Systems and Technologies - CompSysTech'11, 7 pages

  33. Modelling collaborative services: The COSEMO model

    Authors: Thanh Thoa Pham Thi, Thang Le Dinh, Markus Helfert, Michel Leonard

    Abstract: Despite the dominance of the service sector in the last decades, there is still a need for a strong foundation on service design and innovation. Little attention has paid on service modelling, particularly in the collaboration context. Collaboration is considered as one of solutions for surviving or sustaining the business in the high competitive atmosphere. Collaborative services require various…

    Submitted 11 April, 2017; originally announced April 2017.

    Comments: 5th International Conference on Software and Data Technologies, 9 pages

  34. arXiv:1703.04933  [pdf, other]

    cs.LG

    Sharp Minima Can Generalize For Deep Nets

    Authors: Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio

    Abstract: Despite their overwhelming capacity to overfit, deep learning architectures tend to generalize relatively well to unseen data, allowing them to be deployed in practice. However, explaining why this is the case is still an open area of research. One standing hypothesis that is gaining popularity, e.g. Hochreiter & Schmidhuber (1997); Keskar et al. (2017), is that the flatness of minima of the loss…

    Submitted 15 May, 2017; v1 submitted 15 March, 2017; originally announced March 2017.

    Comments: 8.5 pages of main content, 2.5 of bibliography and 1 page of appendix

  35. arXiv:1605.08803  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Density estimation using Real NVP

    Authors: Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio

    Abstract: Unsupervised learning of probabilistic models is a central yet challenging problem in machine learning. Specifically, designing models with tractable learning, sampling, inference and evaluation is crucial in solving this task. We extend the space of such models using real-valued non-volume preserving (real NVP) transformations, a set of powerful invertible and learnable transformations, resulting…

    Submitted 27 February, 2017; v1 submitted 27 May, 2016; originally announced May 2016.

    Comments: 10 pages of main content, 3 pages of bibliography, 18 pages of appendix. Accepted at ICLR 2017

  36. arXiv:1605.02688  [pdf, other]

    cs.SC cs.LG cs.MS

    Theano: A Python framework for fast computation of mathematical expressions

    Authors: The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, et al. (88 additional authors not shown)

    Abstract: Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, mu…

    Submitted 9 May, 2016; originally announced May 2016.

    Comments: 19 pages, 5 figures

  37. arXiv:1506.02216  [pdf, other]

    cs.LG

    A Recurrent Latent Variable Model for Sequential Data

    Authors: Junyoung Chung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron Courville, Yoshua Bengio

    Abstract: In this paper, we explore the inclusion of latent random variables into the dynamic hidden state of a recurrent neural network (RNN) by combining elements of the variational autoencoder. We argue that through the use of high-level latent random variables, the variational RNN (VRNN) can model the kind of variability observed in highly structured sequential data such as natural speech. We empirical…

    Submitted 6 April, 2016; v1 submitted 7 June, 2015; originally announced June 2015.

  38. arXiv:1410.8516  [pdf, other]

    cs.LG

    NICE: Non-linear Independent Components Estimation

    Authors: Laurent Dinh, David Krueger, Yoshua Bengio

    Abstract: We propose a deep learning framework for modeling complex high-dimensional densities called Non-linear Independent Component Estimation (NICE). It is based on the idea that a good representation is one in which the data has a distribution that is easy to model. For this purpose, a non-linear deterministic transformation of the data is learned that maps it to a latent space so as to make the transf…

    Submitted 10 April, 2015; v1 submitted 30 October, 2014; originally announced October 2014.

    Comments: 11 pages and 2 pages Appendix, workshop paper at ICLR 2015

  39. arXiv:1406.2989  [pdf, other]

    stat.ML cs.LG cs.NE

    Techniques for Learning Binary Stochastic Feedforward Neural Networks

    Authors: Tapani Raiko, Mathias Berglund, Guillaume Alain, Laurent Dinh

    Abstract: Stochastic binary hidden units in a multi-layer perceptron (MLP) network give at least three potential benefits when compared to deterministic MLP networks. (1) They allow to learn one-to-many type of mappings. (2) They can be used in structured prediction problems, where modeling the internal structure of the output is important. (3) Stochasticity has been shown to be an excellent regularizer, wh…

    Submitted 9 April, 2015; v1 submitted 11 June, 2014; originally announced June 2014.

  40. arXiv:1306.0543  [pdf, other]

    cs.LG cs.NE stat.ML

    Predicting Parameters in Deep Learning

    Authors: Misha Denil, Babak Shakibi, Laurent Dinh, Marc'Aurelio Ranzato, Nando de Freitas

    Abstract: We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small nu…

    Submitted 27 October, 2014; v1 submitted 3 June, 2013; originally announced June 2013.