Showing 1–50 of 62 results for author: Eslami, M

  1. Public Technologies Transforming Work of the Public and the Public Sector

    Authors: Seyun Kim, Bonnie Fan, Willa Yunqi Yang, Jessie Ramey, Sarah E Fox, Haiyi Zhu, John Zimmerman, Motahhare Eslami

    Abstract: Technologies adopted by the public sector have transformed the work practices of employees in public agencies by creating different means of communication and decision-making. Although much of the recent research in the future of work domain has concentrated on the effects of technological advancements on public sector employees, the influence on work practices of external stakeholders engaging wi…

    Submitted 28 May, 2024; originally announced May 2024.

  2. arXiv:2405.03162  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Advancing Multimodal Medical Capabilities of Gemini

    Authors: Lin Yang, Shawn Xu, Andrew Sellergren, Timo Kohlberger, Yuchen Zhou, Ira Ktena, Atilla Kiraly, Faruk Ahmed, Farhad Hormozdiari, Tiam Jaroensri, Eric Wang, Ellery Wulczyn, Fayaz Jamil, Theo Guidroz, Chuck Lau, Siyuan Qiao, Yun Liu, Akshay Goel, Kendall Park, Arnav Agharwal, Nick George, Yang Wang, Ryutaro Tanno, David G. T. Barrett, Wei-Hung Weng , et al. (22 additional authors not shown)

    Abstract: Many clinical tasks require an understanding of specialized data, such as medical images and genomics, which is not typically found in general-purpose large multimodal models. Building upon Gemini's multimodal models, we develop several models within the new Med-Gemini family that inherit core capabilities of Gemini and are optimized for medical use via fine-tuning with 2D and 3D radiology, histop…

    Submitted 6 May, 2024; originally announced May 2024.

  3. arXiv:2404.18416  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Capabilities of Gemini Models in Medicine

    Authors: Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G. T. Barrett, Cathy Cheung, Basil Mustafa, Anil Palepu, Daniel McDuff, Le Hou, Tomer Golany, Luyang Liu, Jean-baptiste Alayrac, Neil Houlsby , et al. (42 additional authors not shown)

    Abstract: Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-G…

    Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  4. arXiv:2404.14511  [pdf]

    cs.HC

    Children's Overtrust and Shifting Perspectives of Generative AI

    Authors: Jaemarie Solyst, Ellia Yang, Shixian Xie, Jessica Hammer, Amy Ogan, Motahhare Eslami

    Abstract: The capabilities of generative AI (genAI) have dramatically increased in recent times, and there are opportunities for children to leverage new features for personal and school-related endeavors. However, while the future of genAI is taking form, there remain potentially harmful limitations, such as generation of outputs with misinformation and bias. We ran a workshop study focused on ChatGPT to e…

    Submitted 29 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Journal ref: Proceedings of the 18th International Society of the Learning Sciences (ICLS) 2024

  5. The Fall of an Algorithm: Characterizing the Dynamics Toward Abandonment

    Authors: Nari Johnson, Sanika Moharana, Christina N. Harrington, Nazanin Andalibi, Hoda Heidari, Motahhare Eslami

    Abstract: As more algorithmic systems have come under scrutiny for their potential to inflict societal harms, an increasing number of organizations that hold power over harmful algorithms have chosen (or were required under the law) to abandon them. While social movements and calls to abandon harmful algorithms have emerged across application domains, little academic attention has been paid to studying aban…

    Submitted 12 May, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: 10 pages, 2 column format. In proceedings of ACM FAccT 2024

    Journal ref: ACM Conference on Fairness, Accountability, and Transparency 2024

  6. arXiv:2404.10897  [pdf]

    cs.SI

    The Future of Research on Social Technologies: CCC Workshop Visioning Report

    Authors: Motahhare Eslami, Eric Gilbert, Sarita Schoenebeck, Eric P. S. Baumer, Eshwar Chandrasekharan, Michelle De Mooy, Karrie Karahalios, David Karger, Tressie McMillan Cottom, Andrés Monroy-Hernández, Loren Terveen, John Wihbey

    Abstract: Social technologies are the systems, interfaces, features, infrastructures, and architectures that allow people to interact with each other online. These technologies dramatically shape the fabric of our everyday lives, from the information we consume to the people we interact with to the foundations of our culture and politics. While the benefits of social technologies are well documented, the ha…

    Submitted 16 April, 2024; originally announced April 2024.

  7. arXiv:2403.07150  [pdf]

    cs.HC cs.SI

    Breaking Political Filter Bubbles via Social Comparison

    Authors: Nouran Soliman, Motahhare Eslami, Karrie Karahalios

    Abstract: Online social platforms allow users to filter out content they do not like. According to selective exposure theory, people tend to view content they agree with more to get more self-assurance. This causes people to live in ideological filter bubbles. We report on a user study that encourages users to break the political filter bubble of their Twitter feed by reading more diverse viewpoints through…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: * Both of the first two authors contributed equally to this work

  8. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  9. arXiv:2402.12162  [pdf, other]

    cs.CR cs.AR

    SCARF: Securing Chips with a Robust Framework against Fabrication-time Hardware Trojans

    Authors: Mohammad Eslami, Tara Ghasempouri, Samuel Pagliarini

    Abstract: The globalization of the semiconductor industry has introduced security challenges to Integrated Circuits (ICs), particularly those related to the threat of Hardware Trojans (HTs) - malicious logic that can be introduced during IC fabrication. While significant efforts are directed towards verifying the correctness and reliability of ICs, their security is often overlooked. In this paper, we propo…

    Submitted 19 February, 2024; originally announced February 2024.

  10. arXiv:2312.11497  [pdf, other]

    cs.HC cs.CY

    The Public Algorithms Survey in Allegheny County

    Authors: Yu-Ru Lin, Beth Schwanke, Rosta Farzan, Bonnie Fan, Motahhare Eslami, Hong Shen, Sarah Fox

    Abstract: This survey study focuses on public opinion regarding the use of algorithmic decision-making in government sectors, specifically in Allegheny County, Pennsylvania. Algorithms are becoming increasingly prevalent in various public domains, including both routine and high-stakes government functions. Despite their growing use, public sentiment remains divided, with concerns about privacy and accuracy…

    Submitted 7 December, 2023; originally announced December 2023.

  11. arXiv:2308.06201  [pdf, other]

    cs.CR cs.AR

    SALSy: Security-Aware Layout Synthesis

    Authors: Mohammad Eslami, Tiago Perez, Samuel Pagliarini

    Abstract: Integrated Circuits (ICs) are the target of diverse attacks during their lifetime. Fabrication-time attacks, such as the insertion of Hardware Trojans, can give an adversary access to privileged data and/or the means to corrupt the IC's internal computation. Post-fabrication attacks, where the end-user takes a malicious role, also attempt to obtain privileged information through means such as faul…

    Submitted 21 August, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

  12. arXiv:2306.06542  [pdf, ps, other]

    cs.HC cs.CY cs.LG

    Investigating Practices and Opportunities for Cross-functional Collaboration around AI Fairness in Industry Practice

    Authors: Wesley Hanwen Deng, Nur Yildirim, Monica Chang, Motahhare Eslami, Ken Holstein, Michael Madaio

    Abstract: An emerging body of research indicates that ineffective cross-functional collaboration -- the interdisciplinary work done by industry practitioners across roles -- represents a major barrier to addressing issues of fairness in AI design and development. In this research, we sought to better understand practitioners' current practices and tactics to enact cross-functional collaboration for AI fairn…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT '23)

  13. Participation and Division of Labor in User-Driven Algorithm Audits: How Do Everyday Users Work together to Surface Algorithmic Harms?

    Authors: Rena Li, Sara Kingsley, Chelsea Fan, Proteeti Sinha, Nora Wai, Jaimie Lee, Hong Shen, Motahhare Eslami, Jason Hong

    Abstract: Recent years have witnessed an interesting phenomenon in which users come together to interrogate potentially harmful algorithmic behaviors they encounter in their everyday lives. Researchers have started to develop theoretical and empirical understandings of these user driven audits, with a hope to harness the power of users in detecting harmful machine behaviors. However, little is known about u…

    Submitted 4 April, 2023; originally announced April 2023.

  14. arXiv:2304.00167  [pdf, other]

    cs.HC

    Towards "Anytime, Anywhere" Community Learning and Engagement around the Design of Public Sector AI

    Authors: Wesley Hanwen Deng, Motahhare Eslami, Kenneth Holstein

    Abstract: Data-driven algorithmic and AI systems are increasingly being deployed to automate or augment decision processes across a wide range of public service settings. Yet community members are often unaware of the presence, operation, and impacts of these systems on their lives. With the shift towards algorithmic decision-making in public services, technology developers increasingly assume the role of d…

    Submitted 21 April, 2023; v1 submitted 31 March, 2023; originally announced April 2023.

    Journal ref: AI Literacy: Finding Common Threads between Education, Design, Policy, and Explainability Workshop at CHI 2023

  15. arXiv:2302.13947  [pdf]

    cs.HC

    Investigating Girls' Perspectives and Knowledge Gaps on Ethics and Fairness in Artificial Intelligence in a Lightweight Workshop

    Authors: Jaemarie Solyst, Alexis Axon, Angela E. B. Stewart, Motahhare Eslami, Amy Ogan

    Abstract: Artificial intelligence (AI) is everywhere, with many children having increased exposure to AI technologies in daily life. We aimed to understand middle school girls' (a group often excluded in tech) perceptions and knowledge gaps about AI. We created and explored the feasibility of a lightweight (less than 3 hours) educational workshop in which learners considered challenges in their lives…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 8 pages, 2 figures (a table and a graphic with two parts)

    Journal ref: Proceedings of the 16th International Society of the Learning Sciences (ICLS) 2022, pages 807-814

  16. arXiv:2210.06433  [pdf, other]

    cs.CV cs.AI cs.LG

    Self-supervised video pretraining yields human-aligned visual representations

    Authors: Nikhil Parthasarathy, S. M. Ali Eslami, João Carreira, Olivier J. Hénaff

    Abstract: Humans learn powerful representations of objects and scenes by observing how they evolve over time. Yet, outside of specific tasks that require explicit temporal understanding, static image pretraining remains the dominant paradigm for learning visual foundation models. We question this mismatch, and ask whether video pretraining can yield visual representations that bear the hallmarks of human pe…

    Submitted 25 July, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Technical report

  17. arXiv:2210.03709  [pdf, other]

    cs.HC cs.AI cs.LG

    Understanding Practices, Challenges, and Opportunities for User-Engaged Algorithm Auditing in Industry Practice

    Authors: Wesley Hanwen Deng, Bill Boyuan Guo, Alicia DeVrio, Hong Shen, Motahhare Eslami, Kenneth Holstein

    Abstract: Recent years have seen growing interest among both researchers and practitioners in user-engaged approaches to algorithm auditing, which directly engage users in detecting problematic behaviors in algorithmic systems. However, we know little about industry practitioners' current practices and challenges around user-engaged auditing, nor what opportunities exist for them to better leverage such app…

    Submitted 21 February, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: 18 pages. In Proceedings of CHI 2023

    Journal ref: CHI 2023: ACM Conference on Human Factors in Computing Systems. April 23-28, 2023, Hamburg, Germany

  18. arXiv:2205.04599  [pdf, other]

    cs.LG cs.AI

    Affective Medical Estimation and Decision Making via Visualized Learning and Deep Learning

    Authors: Mohammad Eslami, Solale Tabarestani, Ehsan Adeli, Glyn Elwyn, Tobias Elze, Mengyu Wang, Nazlee Zebardast, Nassir Navab, Malek Adjouadi

    Abstract: Sophisticated machine learning (ML) techniques have yielded promising results, especially in medical applications, where they have been investigated for different tasks to enhance the decision-making process. Since visualization is such an effective tool for human comprehension, memorization, and judgment, we have presented a first-of-its-kind estimation approach we refer…

    Submitted 9 May, 2022; originally announced May 2022.

  19. A Novel Service Deployment Policy in Fog Computing Considering The Degree of Availability and Fog Landscape Utilization Using Multiobjective Evolutionary Algorithms

    Authors: Maryam Eslami, Mehdi Sakhaei

    Abstract: Fog computing is a promising paradigm for real-time and mission-critical Internet of Things (IoT) applications. Regarding the high distribution, heterogeneity, and limitation of fog resources, applications should be placed in a distributed manner to fully utilize these resources. In this paper, we propose a linear formulation for assuring the different availability requirements of application serv…

    Submitted 4 February, 2022; originally announced February 2022.

  20. arXiv:2201.12204  [pdf, other]

    cs.LG

    From data to functa: Your data point is a function and you can treat it like one

    Authors: Emilien Dupont, Hyunjik Kim, S. M. Ali Eslami, Danilo Rezende, Dan Rosenbaum

    Abstract: It is common practice in deep learning to represent a measurement of the world on a discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these measurements is often continuous, e.g. the scene depicted in an image. A powerful continuous alternative is then to represent these measurements using an implicit neural representation, a neural function trained to output t…

    Submitted 10 November, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

  21. Reusing Verification Assertions as Security Checkers for Hardware Trojan Detection

    Authors: Mohammad Eslami, Tara Ghasempouri, Samuel Pagliarini

    Abstract: Globalization in the semiconductor industry enables fabless design houses to reduce their costs, save time, and make use of newer technologies. However, the offshoring of Integrated Circuit (IC) fabrication has negative sides, including threats such as Hardware Trojans (HTs) - a type of malicious logic that is not trivial to detect. One aspect of IC design that is not affected by globalization is…

    Submitted 30 January, 2023; v1 submitted 4 January, 2022; originally announced January 2022.

    Comments: 6 pages, 6 figures

    Journal ref: 2022 23rd International Symposium on Quality Electronic Design (ISQED), Santa Clara, CA, USA, 2022, pp. 1-6

  22. arXiv:2109.10777  [pdf, other]

    cs.CV cs.LG eess.IV

    Deep Variational Clustering Framework for Self-labeling of Large-scale Medical Images

    Authors: Farzin Soleymani, Mohammad Eslami, Tobias Elze, Bernd Bischl, Mina Rezaei

    Abstract: We propose a Deep Variational Clustering (DVC) framework for unsupervised representation learning and clustering of large-scale medical images. DVC simultaneously learns the multivariate Gaussian posterior through the probabilistic convolutional encoder and the likelihood distribution with the probabilistic convolutional decoder; and optimizes cluster labels assignment. Here, the learned multivari…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: text overlap with arXiv:2109.05232

  23. arXiv:2106.14108  [pdf, other]

    cs.CE eess.IV

    Inferring a Continuous Distribution of Atom Coordinates from Cryo-EM Images using VAEs

    Authors: Dan Rosenbaum, Marta Garnelo, Michal Zielinski, Charlie Beattie, Ellen Clancy, Andrea Huber, Pushmeet Kohli, Andrew W. Senior, John Jumper, Carl Doersch, S. M. Ali Eslami, Olaf Ronneberger, Jonas Adler

    Abstract: Cryo-electron microscopy (cryo-EM) has revolutionized experimental protein structure determination. Despite advances in high resolution reconstruction, a majority of cryo-EM experiments provide either a single state of the studied macromolecule, or a relatively small number of its conformations. This reduces the effectiveness of the technique for proteins with flexible regions, which are known to…

    Submitted 26 June, 2021; originally announced June 2021.

  24. arXiv:2106.13884  [pdf, other]

    cs.CV cs.CL cs.LG

    Multimodal Few-Shot Learning with Frozen Language Models

    Authors: Maria Tsimpoukelli, Jacob Menick, Serkan Cabi, S. M. Ali Eslami, Oriol Vinyals, Felix Hill

    Abstract: When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few examples. Here, we present a simple, yet effective, approach for transferring this few-shot learning ability to a multimodal setting (vision and language). Using aligned image and caption data, we train a vision encoder to represent each im…

    Submitted 3 July, 2021; v1 submitted 25 June, 2021; originally announced June 2021.

  25. arXiv:2106.07683  [pdf, other]

    math.DS cs.LG

    Extracting Global Dynamics of Loss Landscape in Deep Learning Models

    Authors: Mohammed Eslami, Hamed Eramian, Marcio Gameiro, William Kalies, Konstantin Mischaikow

    Abstract: Deep learning models evolve through training to learn the manifold in which the data exists to satisfy an objective. It is well known that evolution leads to different final states which produce inconsistent predictions of the same test data points. This calls for techniques to be able to empirically quantify the difference in the trajectories and highlight problematic regions. While much focus is…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: 9 pages, 3 figures, Supplementary

  26. arXiv:2105.14986  [pdf]

    eess.IV cs.CV cs.LG physics.med-ph

    Feasibility Assessment of Multitasking in MRI Neuroimaging Analysis: Tissue Segmentation, Cross-Modality Conversion and Bias correction

    Authors: Mohammad Eslami, Solale Tabarestani, Malek Adjouadi

    Abstract: Neuroimaging is essential in brain studies for the diagnosis and identification of disease, structure, and function of the brain in its healthy and disease states. Literature shows that there are advantages of multitasking with some deep learning (DL) schemes in challenging neuroimaging applications. This study examines the feasibility of using multitasking in three different applications, includi…

    Submitted 31 May, 2021; originally announced May 2021.

  27. arXiv:2105.12196  [pdf, other]

    cs.AI cs.MA cs.NE cs.RO

    From Motor Control to Team Play in Simulated Humanoid Football

    Authors: Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

    Abstract: Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, and in terms of relations that extend far beyond the body itself, ultimately involving coordination with other agents…

    Submitted 25 May, 2021; originally announced May 2021.

  28. arXiv:2105.02980  [pdf, other]

    cs.HC cs.CY

    Everyday algorithm auditing: Understanding the power of everyday users in surfacing harmful algorithmic behaviors

    Authors: Hong Shen, Alicia DeVos, Motahhare Eslami, Kenneth Holstein

    Abstract: A growing body of literature has proposed formal approaches to audit algorithmic systems for biased and harmful behaviors. While formal auditing approaches have been greatly impactful, they often suffer major blindspots, with critical issues surfacing only in the context of everyday use once systems are deployed. Recent years have seen many cases in which everyday users of algorithmic systems dete…

    Submitted 24 August, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

    Comments: To appear in CSCW 2021. The co-first authors and co-senior authors each contributed equally to this work

  29. arXiv:2105.00162  [pdf, other]

    cs.AI cs.NE

    Generative Art Using Neural Visual Grammars and Dual Encoders

    Authors: Chrisantha Fernando, S. M. Ali Eslami, Jean-Baptiste Alayrac, Piotr Mirowski, Dylan Banarse, Simon Osindero

    Abstract: Whilst there are perhaps only a few scientific methods, there seem to be almost as many artistic methods as there are artists. Artistic processes appear to inhabit the highest order of open-endedness. To begin to understand some of the processes of art making it is helpful to try to automate them even partially. In this paper, a novel algorithm for producing generative art is described which allow…

    Submitted 3 May, 2021; v1 submitted 1 May, 2021; originally announced May 2021.

  30. arXiv:2101.04230  [pdf, other]

    cs.CV eess.IV

    Explaining the Black-box Smoothly- A Counterfactual Approach

    Authors: Sumedha Singla, Motahhare Eslami, Brian Pollack, Stephen Wallace, Kayhan Batmanghelich

    Abstract: We propose a BlackBox Counterfactual Explainer, designed to explain image classification models for medical applications. Classical approaches (e.g., saliency maps) that assess feature importance do not explain "how" imaging features in important anatomical regions are relevant to the classification decision. Our framework explains the decision for a target class by gradually "exaggerating" the se…

    Submitted 18 November, 2022; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: Preprint accepted in the Medical Image Analysis journal

  31. arXiv:2011.09192  [pdf, other]

    cs.AI cs.GT cs.MA

    Game Plan: What AI can do for Football, and What Football can do for AI

    Authors: Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adria Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Perolat, Bart De Vylder , et al. (11 additional authors not shown)

    Abstract: The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with t…

    Submitted 18 November, 2020; originally announced November 2020.

  32. arXiv:2007.05566  [pdf, other]

    cs.LG stat.ML

    Contrastive Training for Improved Out-of-Distribution Detection

    Authors: Jim Winkens, Rudy Bunel, Abhijit Guha Roy, Robert Stanforth, Vivek Natarajan, Joseph R. Ledsam, Patricia MacWilliams, Pushmeet Kohli, Alan Karthikesalingam, Simon Kohl, Taylan Cemgil, S. M. Ali Eslami, Olaf Ronneberger

    Abstract: Reliable detection of out-of-distribution (OOD) inputs is increasingly understood to be a precondition for deployment of machine learning systems. This paper proposes and investigates the use of contrastive training to boost OOD detection performance. Unlike leading methods for OOD detection, our approach does not require access to examples labeled explicitly as OOD, which can be difficult to coll…

    Submitted 10 July, 2020; originally announced July 2020.

  33. arXiv:2002.10880  [pdf, other]

    cs.GR cs.CV cs.LG stat.ML

    PolyGen: An Autoregressive Generative Model of 3D Meshes

    Authors: Charlie Nash, Yaroslav Ganin, S. M. Ali Eslami, Peter W. Battaglia

    Abstract: Polygon meshes are an efficient representation of 3D geometry, and are of central importance in computer graphics, robotics and games development. Existing learning-based approaches have avoided the challenges of working with 3D meshes, instead using alternative object representations that are more compatible with neural architectures and training approaches. We present an approach which models th…

    Submitted 23 February, 2020; originally announced February 2020.

  34. SignCol: Open-Source Software for Collecting Sign Language Gestures

    Authors: Mohammad Eslami, Mahdi Karami, Sedigheh Eslami, Solale Tabarestani, Farah Torkamani-Azar, Christoph Meinel

    Abstract: Sign(ed) languages use gestures, such as hand or head movements, for communication. Sign language recognition is an assistive technology for individuals with hearing disability and its goal is to improve such individuals' life quality by facilitating their social involvement. Since sign languages are vastly varied in alphabets, also known as signs, sign recognition software should be capable of ha…

    Submitted 31 October, 2019; originally announced November 2019.

    Comments: The paper is presented at ICSESS conference but the published version by them on the IEEE Xplore is impaired and the quality of figures is inappropriate!! This is the preprint version which had appropriate format and figures

  35. arXiv:1910.01007  [pdf, other]

    cs.CV cs.LG stat.ML

    Unsupervised Doodling and Painting with Improved SPIRAL

    Authors: John F. J. Mellor, Eunbyung Park, Yaroslav Ganin, Igor Babuschkin, Tejas Kulkarni, Dan Rosenbaum, Andy Ballard, Theophane Weber, Oriol Vinyals, S. M. Ali Eslami

    Abstract: We investigate using reinforcement learning agents as generative models of images (extending arXiv:1804.01118). A generative agent controls a simulated painting environment, and is trained with rewards provided by a discriminator network simultaneously trained to assess the realism of the agent's samples, either unconditional or reconstructions. Compared to prior work, we make a number of improvem…

    Submitted 2 October, 2019; originally announced October 2019.

    Comments: See https://learning-to-paint.github.io for an interactive version of this paper, with videos

    ACM Class: I.2; I.4

  36. arXiv:1907.07951  [pdf, other]

    eess.IV cs.CV cs.LG cs.SD eess.AS

    Automatic vocal tract landmark localization from midsagittal MRI data

    Authors: Mohammad Eslami, Christiane Neuschaefer-Rube, Antoine Serrurier

    Abstract: The various speech sounds of a language are obtained by varying the shape and position of the articulators surrounding the vocal tract. Analyzing their variations is crucial for understanding speech production, diagnosing speech disorders and planning therapy. Identifying key anatomical landmarks of these structures on medical images is a pre-requisite for any quantitative analysis and the rising…

    Submitted 9 January, 2020; v1 submitted 18 July, 2019; originally announced July 2019.

  37. Image to Images Translation for Multi-Task Organ Segmentation and Bone Suppression in Chest X-Ray Radiography

    Authors: Mohammad Eslami, Solale Tabarestani, Shadi Albarqouni, Ehsan Adeli, Nassir Navab, Malek Adjouadi

    Abstract: Chest X-ray radiography is one of the earliest medical imaging technologies and remains one of the most widely-used for diagnosis, screening, and treatment follow up of diseases related to lungs and heart. The literature in this field of research reports many interesting studies dealing with the challenging tasks of bone suppression and organ segmentation but performed separately, limiting any lea…

    Submitted 31 December, 2019; v1 submitted 24 June, 2019; originally announced June 2019.

  38. arXiv:1905.13077  [pdf, other]

    cs.CV

    A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities

    Authors: Simon A. A. Kohl, Bernardino Romera-Paredes, Klaus H. Maier-Hein, Danilo Jimenez Rezende, S. M. Ali Eslami, Pushmeet Kohli, Andrew Zisserman, Olaf Ronneberger

    Abstract: Medical imaging only indirectly measures the molecular identity of the tissue within each voxel, which often produces only ambiguous image evidence for target measures of interest, like semantic segmentation. This diversity and the variations of plausible interpretations are often specific to given image regions and may thus manifest on various scales, spanning all the way from the pixel to the im…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: 25 pages, 15 figures

  39. arXiv:1905.09272  [pdf, other]

    cs.CV cs.LG

    Data-Efficient Image Recognition with Contrastive Predictive Coding

    Authors: Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord

    Abstract: Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with artificial ones remains an open challenge. We hypothesize that data-efficient recognition is enabled by representations which make the variability in natural signals more predictable. We therefore revisit and improve Contrastive Predictive Coding, an unsupervised objective for learning suc…

    Submitted 1 July, 2020; v1 submitted 22 May, 2019; originally announced May 2019.

  40. arXiv:1903.11907  [pdf, other]

    stat.ML cs.LG

    Meta-Learning surrogate models for sequential decision making

    Authors: Alexandre Galashov, Jonathan Schwarz, Hyunjik Kim, Marta Garnelo, David Saxton, Pushmeet Kohli, S. M. Ali Eslami, Yee Whye Teh

    Abstract: We introduce a unified probabilistic framework for solving sequential decision making problems ranging from Bayesian optimisation to contextual bandits and reinforcement learning. This is accomplished by a probabilistic model-based approach that explains observed data while capturing predictive uncertainty during the decision making process. Crucially, this probabilistic model is chosen to be a Me…

    Submitted 12 June, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

  41. arXiv:1807.03149  [pdf, other]

    cs.CV cs.LG stat.ML

    Learning models for visual 3D localization with implicit mapping

    Authors: Dan Rosenbaum, Frederic Besse, Fabio Viola, Danilo J. Rezende, S. M. Ali Eslami

    Abstract: We consider learning based methods for visual localization that do not require the construction of explicit maps in the form of point clouds or voxels. The goal is to learn an implicit representation of the environment at a higher, more abstract level. We propose to use a generative approach based on Generative Query Networks (GQNs, Eslami et al. 2018), asking the following questions: 1) Can GQN c…

    Submitted 12 December, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

  42. arXiv:1807.02033  [pdf, other]

    cs.CV cs.LG stat.ML

    Consistent Generative Query Networks

    Authors: Ananya Kumar, S. M. Ali Eslami, Danilo J. Rezende, Marta Garnelo, Fabio Viola, Edward Lockhart, Murray Shanahan

    Abstract: Stochastic video prediction models take in a sequence of image frames, and generate a sequence of consecutive future image frames. These models typically generate future frames in an autoregressive fashion, which is slow and requires the input and output frames to be consecutive. We introduce a model that overcomes these drawbacks by generating a latent representation from an arbitrary set of fram…

    Submitted 21 April, 2019; v1 submitted 5 July, 2018; originally announced July 2018.

  43. arXiv:1807.01670  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    Encoding Spatial Relations from Natural Language

    Authors: Tiago Ramalho, Tomáš Kočiský, Frederic Besse, S. M. Ali Eslami, Gábor Melis, Fabio Viola, Phil Blunsom, Karl Moritz Hermann

    Abstract: Natural language processing has made significant inroads into learning the semantics of words through distributional approaches, however representations learnt via these methods fail to capture certain kinds of information implicit in the real world. In particular, spatial relations are encoded in a way that is inconsistent with human spatial reasoning and lacking invariance to viewpoint changes.…

    Submitted 5 July, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

  44. arXiv:1807.01622  [pdf, other]

    cs.LG stat.ML

    Neural Processes

    Authors: Marta Garnelo, Jonathan Schwarz, Dan Rosenbaum, Fabio Viola, Danilo J. Rezende, S. M. Ali Eslami, Yee Whye Teh

    Abstract: A neural network (NN) is a parameterised function that can be tuned via gradient descent to approximate a labelled collection of data with high precision. A Gaussian process (GP), on the other hand, is a probabilistic model that defines a distribution over possible functions, and is updated in light of data via the rules of probabilistic inference. GPs are probabilistic, data-efficient and flexibl…

    Submitted 4 July, 2018; originally announced July 2018.

  45. arXiv:1807.01613  [pdf, other]

    cs.LG stat.ML

    Conditional Neural Processes

    Authors: Marta Garnelo, Dan Rosenbaum, Chris J. Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo J. Rezende, S. M. Ali Eslami

    Abstract: Deep neural networks excel at function approximation, yet they are typically trained from scratch for each new function. On the other hand, Bayesian methods, such as Gaussian Processes (GPs), exploit prior knowledge to quickly infer the shape of a new function at test time. Yet GPs are computationally expensive, and it can be hard to design appropriate priors. In this paper we propose a family of…

    Submitted 4 July, 2018; originally announced July 2018.

  46. arXiv:1806.05034  [pdf, other]

    cs.CV cs.LG cs.NE stat.ML

    A Probabilistic U-Net for Segmentation of Ambiguous Images

    Authors: Simon A. A. Kohl, Bernardino Romera-Paredes, Clemens Meyer, Jeffrey De Fauw, Joseph R. Ledsam, Klaus H. Maier-Hein, S. M. Ali Eslami, Danilo Jimenez Rezende, Olaf Ronneberger

    Abstract: Many real-world vision problems suffer from inherent ambiguities. In clinical applications for example, it might not be clear from a CT scan alone which particular region is cancer tissue. Therefore a group of graders typically produces a set of diverse but plausible segmentations. We consider the task of learning a distribution over segmentations given an input. To this end we propose a generativ…

    Submitted 29 January, 2019; v1 submitted 13 June, 2018; originally announced June 2018.

    Comments: Last update: added further details about the LIDC experiment. 11 pages for the main paper, 28 pages including appendix. 5 figures in the main paper, 18 figures in total, Advances in Neural Information Processing Systems (NeurIPS), 2018

  47. arXiv:1804.09401  [pdf, other]

    stat.ML cs.LG

    Generative Temporal Models with Spatial Memory for Partially Observed Environments

    Authors: Marco Fraccaro, Danilo Jimenez Rezende, Yori Zwols, Alexander Pritzel, S. M. Ali Eslami, Fabio Viola

    Abstract: In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent's representations during training or via use as part of an explicit planning mechanism. However, their application in practice has been limited to simplistic environments, due to the difficulty of training such models in larger, potentially p…

    Submitted 19 July, 2018; v1 submitted 25 April, 2018; originally announced April 2018.

    Comments: ICML 2018

  48. arXiv:1804.01118  [pdf, other]

    cs.CV cs.LG stat.ML

    Synthesizing Programs for Images using Reinforced Adversarial Learning

    Authors: Yaroslav Ganin, Tejas Kulkarni, Igor Babuschkin, S. M. Ali Eslami, Oriol Vinyals

    Abstract: Advances in deep generative networks have led to impressive results in recent years. Nevertheless, such models can often waste their capacity on the minutiae of datasets, presumably due to weak inductive biases in their decoders. This is where graphics engines may come in handy since they abstract away low-level details and represent images as high-level programs. Current methods that combine deep…

    Submitted 3 April, 2018; originally announced April 2018.

    Comments: 12 pages, 13 figures

  49. arXiv:1803.03835  [pdf, other]

    cs.LG

    Kickstarting Deep Reinforcement Learning

    Authors: Simon Schmitt, Jonathan J. Hudson, Augustin Zidek, Simon Osindero, Carl Doersch, Wojciech M. Czarnecki, Joel Z. Leibo, Heinrich Kuttler, Andrew Zisserman, Karen Simonyan, S. M. Ali Eslami

    Abstract: We present a method for using previously-trained 'teacher' agents to kickstart the training of a new 'student' agent. To this end, we leverage ideas from policy distillation and population based training. Our method places no constraints on the architecture of the teacher or student agents, and it regulates itself to allow the students to surpass their teachers in performance. We show that, on a c…

    Submitted 10 March, 2018; originally announced March 2018.

  50. arXiv:1802.07740  [pdf, other]

    cs.AI

    Machine Theory of Mind

    Authors: Neil C. Rabinowitz, Frank Perbet, H. Francis Song, Chiyuan Zhang, S. M. Ali Eslami, Matthew Botvinick

    Abstract: Theory of mind (ToM; Premack & Woodruff, 1978) broadly refers to humans' ability to represent the mental states of others, including their desires, beliefs, and intentions. We propose to train a machine to build such models too. We design a Theory of Mind neural network -- a ToMnet -- which uses meta-learning to build models of the agents it encounters, from observations of their behaviour alone.…

    Submitted 12 March, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: 21 pages, 15 figures