Showing 1–3 of 3 results for author: Hui, D Y

Searching in archive cs.
  1. arXiv:2007.12770

    cs.AI cs.CL cs.LG

    BabyAI 1.1

    Authors: David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

    Abstract: The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 presents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.1 improves the agent's architecture in three minor ways. This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning…

    Submitted 24 July, 2020; originally announced July 2020.

    Comments: 9 pages, 1 figure, technical report

  2. arXiv:2007.04250

    cs.LG cs.CV stat.ML

    A Benchmark of Medical Out of Distribution Detection

    Authors: Tianshi Cao, Chin-Wei Huang, David Yu-Tung Hui, Joseph Paul Cohen

    Abstract: Motivation: Deep learning models deployed for use on medical tasks can be equipped with Out-of-Distribution Detection (OoDD) methods in order to avoid erroneous predictions. However it is unclear which OoDD method should be used in practice. Specific Problem: Systems trained for one particular domain of images cannot be expected to perform accurately on images of a different domain. These images s…

    Submitted 4 August, 2020; v1 submitted 8 July, 2020; originally announced July 2020.

    Comments: Submitted to Machine Learning for Biomedical Imaging Journal (MELBA)

  3. arXiv:2002.00412

    cs.LG cs.AI stat.ML

    Combating False Negatives in Adversarial Imitation Learning

    Authors: Konrad Zolna, Chitwan Saharia, Leonard Boussioux, David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

    Abstract: In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior. However, as the trained policy learns to be more successful, the negative examples (the ones produced by the agent) become increasingly similar to expert ones. Despite the fact that the task is successfully accomplished in some of the agent's t…

    Submitted 2 February, 2020; originally announced February 2020.

    Comments: This is an extended version of the student abstract published at the 34th AAAI Conference on Artificial Intelligence