Showing 1–22 of 22 results for author: Rosen, E

Searching in archive cs.
  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2402.11498  [pdf, other]

    cs.RO cs.AI

    Verifiably Following Complex Robot Instructions with Foundation Models

    Authors: Benedict Quartey, Eric Rosen, Stefanie Tellex, George Konidaris

    Abstract: Enabling robots to follow complex natural language instructions is an important yet challenging problem. People want to flexibly express constraints, refer to arbitrary landmarks and verify behavior when instructing robots. Conversely, robots must disambiguate human instructions into specifications and ground instruction referents in the real world. We propose Language Instruction grounding for Mo…

    Submitted 18 February, 2024; originally announced February 2024.

  3. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  4. arXiv:2309.07276  [pdf, other]

    cs.RO cs.AI

    Language-Conditioned Observation Models for Visual Object Search

    Authors: Thao Nguyen, Vladislav Hrosinkov, Eric Rosen, Stefanie Tellex

    Abstract: Object search is a challenging task because when given complex language descriptions (e.g., "find the white cup on the table"), the robot must move its camera through the environment and recognize the described object. Previous works map language descriptions to a set of fixed object detectors with predetermined noise models, but these approaches are challenging to scale because new detectors need…

    Submitted 13 September, 2023; originally announced September 2023.

  5. arXiv:2306.07350  [pdf, ps, other]

    cs.LG

    G-invariant diffusion maps

    Authors: Eitan Rosen, Xiuyuan Cheng, Yoel Shkolnisky

    Abstract: The diffusion maps embedding of data lying on a manifold has shown success in tasks ranging from dimensionality reduction and clustering to data visualization. In this work, we consider embedding data sets which were sampled from a manifold which is closed under the action of a continuous matrix group. An example of such a data set is images whose planar rotations are arbitrary. The G-invariant…

    Submitted 25 July, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

  6. arXiv:2306.01765  [pdf]

    cs.CY physics.ed-ph physics.soc-ph

    Message in a Bottle -- An Update to the Golden Record

    Authors: Jonathan H. Jiang, Anamaria Berea, Heather Bowden, Prithwis Das, Kristen A. Fahy, Joseph Ginsberg, Robert Jew, Xiaoming Jiang, Arik Kershenbaum, David Kipping, Graham Lau, Karen Lewis, C. Isabel Nunez Lendo, Philip E. Rosen, Nick Searra, Stuart F. Taylor, John Traphagan

    Abstract: In this first part of our series, we delve into the foundational aspects of the "Message in a Bottle" (henceforth referred to as MIAB). This study stands as a continuation of the legacy set by the Voyager Golden Records launched aboard Voyager 1 and 2 in 1977, which aimed to communicate with intelligent species beyond our world. These Records continue to serve not only as a snapshot of Earth and h…

    Submitted 16 November, 2023; v1 submitted 27 May, 2023; originally announced June 2023.

  7. arXiv:2305.10960  [pdf, other]

    cs.RO cs.AI

    A Virtual Reality Teleoperation Interface for Industrial Robot Manipulators

    Authors: Eric Rosen, Devesh K. Jha

    Abstract: We address the problem of teleoperating an industrial robot manipulator via a commercially available Virtual Reality (VR) interface. Previous works on VR teleoperation for robot manipulators focus primarily on collaborative or research robot platforms (whose dynamics and constraints differ from industrial robot arms), or only address tasks where the robot's dynamics are not as important (e.g., pick…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 7 pages, 6 figures

  8. arXiv:2303.17001  [pdf, other]

    cs.LG cs.SI

    The G-invariant graph Laplacian

    Authors: Eitan Rosen, Paulina Hoyos, Xiuyuan Cheng, Joe Kileel, Yoel Shkolnisky

    Abstract: Graph Laplacian based algorithms for data lying on a manifold have been proven effective for tasks such as dimensionality reduction, clustering, and denoising. In this work, we consider data sets whose data points lie on a manifold that is closed under the action of a known unitary matrix Lie group G. We propose to construct the graph Laplacian by incorporating the distances between all the pairs…

    Submitted 28 June, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

  9. arXiv:2212.09032  [pdf, ps, other]

    cs.LG cs.DB

    AutoSlicer: Scalable Automated Data Slicing for ML Model Analysis

    Authors: Zifan Liu, Evan Rosen, Paul Suganthan G. C.

    Abstract: Automated slicing aims to identify subsets of evaluation data where a trained model performs anomalously. This is an important problem for machine learning pipelines in production since it plays a key role in model debugging and comparison, as well as the diagnosis of fairness issues. Scalability has become a critical requirement for any automated slicing system due to the large search space of po…

    Submitted 18 December, 2022; originally announced December 2022.

    Comments: 11 pages, 5 figures, NeurIPS 2022 Workshop on Challenges in Deploying and Monitoring Machine Learning Systems

    ACM Class: I.2.0; H.2.0

  10. arXiv:2211.09935  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    CAPE: Corrective Actions from Precondition Errors using Large Language Models

    Authors: Shreyas Sundara Raman, Vanya Cohen, Ifrah Idrees, Eric Rosen, Ray Mooney, Stefanie Tellex, David Paulius

    Abstract: Extracting commonsense knowledge from a large language model (LLM) offers a path to designing intelligent robots. Existing approaches that leverage LLMs for planning are unable to recover when an action fails and often resort to retrying failed actions, without resolving the error's underlying cause. We propose a novel approach (CAPE) that attempts to propose corrective actions to resolve precondi…

    Submitted 9 March, 2024; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: 17 pages, 6 figures, accepted at ICRA 2024

    MSC Class: 68T20; 68T50

    ACM Class: I.2.7; I.2.8; I.2.2; I.2.4

  11. arXiv:2208.06061  [pdf, other]

    cs.CL

    Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages

    Authors: Paul Soulos, Sudha Rao, Caitlin Smith, Eric Rosen, Asli Celikyilmaz, R. Thomas McCoy, Yichen Jiang, Coleman Haley, Roland Fernandez, Hamid Palangi, Jianfeng Gao, Paul Smolensky

    Abstract: Machine translation has seen rapid progress with the advent of Transformer-based models. These models have no explicit linguistic structure built into them, yet they may still implicitly learn structured relationships by attending to relevant tokens. We hypothesize that this structural learning could be made more robust by explicitly endowing Transformers with a structural bias, and we investigate…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: Revised edition to 4th Workshop on Technologies for MT of Low Resource Languages

    Journal ref: Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021)

  12. arXiv:2207.07806  [pdf, ps, other]

    cs.LG eess.SP

    CHARM: A Hierarchical Deep Learning Model for Classification of Complex Human Activities Using Motion Sensors

    Authors: Eric Rosen, Doruk Senkal

    Abstract: In this paper, we report a hierarchical deep learning model for classification of complex human activities using motion sensors. In contrast to traditional Human Activity Recognition (HAR) models used for event-based activity recognition, such as step counting, fall detection, and gesture identification, this new deep learning model, which we refer to as CHARM (Complex Human Activity Recognition M…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: 8 pages, 5 figures

    ACM Class: I.2.1

  13. arXiv:2206.05096  [pdf, other]

    cs.RO

    Skill Transfer for Temporally-Extended Task Specifications

    Authors: Jason Xinyu Liu, Ankit Shah, Eric Rosen, George Konidaris, Stefanie Tellex

    Abstract: Deploying robots in real-world domains, such as households and flexible manufacturing lines, requires the robots to be taskable on demand. Linear temporal logic (LTL) is a widely-used specification language with a compositional grammar that naturally induces commonalities across tasks. However, the majority of prior research on reinforcement learning with LTL specifications treats every new formul…

    Submitted 5 March, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

  14. arXiv:2203.11324  [pdf, other]

    cs.RO cs.LG

    Learning robot motor skills with mixed reality

    Authors: Eric Rosen, Sreehari Rammohan, Devesh Jha

    Abstract: Mixed Reality (MR) has recently shown great success as an intuitive interface for enabling end-users to teach robots. Related works have used MR interfaces to communicate robot intents and beliefs to a co-located human, as well as developed algorithms for taking multi-modal human input and learning complex motor behaviors. Even with these successes, enabling end-users to teach robots complex motor…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: VAM-HRI 2022

  15. arXiv:2110.12341  [pdf, other]

    cs.CL

    Scalable knowledge base completion with superposition memories

    Authors: Matthias Lalisse, Eric Rosen, Paul Smolensky

    Abstract: We present Harmonic Memory Networks (HMem), a neural architecture for knowledge base completion that models entities as weighted sums of pairwise bindings between an entity's neighbors and corresponding relations. Since entities are modeled as aggregated neighborhoods, representations of unseen entities can be generated on the fly. We demonstrate this with two new datasets: WNGen and FBGen. Experi…

    Submitted 23 October, 2021; originally announced October 2021.

  16. A Tool for Organizing Key Characteristics of Virtual, Augmented, and Mixed Reality for Human-Robot Interaction Systems: Synthesizing VAM-HRI Trends and Takeaways

    Authors: Thomas R. Groechel, Michael E. Walker, Christine T. Chang, Eric Rosen, Jessica Zosa Forde

    Abstract: Frameworks have begun to emerge to categorize Virtual, Augmented, and Mixed Reality (VAM) technologies that provide immersive, intuitive interfaces to facilitate Human-Robot Interaction. These frameworks, however, fail to capture key characteristics of the growing subfield of VAM-HRI and can be difficult to consistently apply due to continuous scales. This work builds upon these prior frameworks t…

    Submitted 10 February, 2022; v1 submitted 7 August, 2021; originally announced August 2021.

    Comments: Accepted to Robotics and Automation Magazine Special Issue on Extended Reality in Robotics

  17. arXiv:2107.13356  [pdf, other]

    cs.RO cs.AI

    Value-Based Reinforcement Learning for Continuous Control Robotic Manipulation in Multi-Task Sparse Reward Settings

    Authors: Sreehari Rammohan, Shangqun Yu, Bowen He, Eric Hsiung, Eric Rosen, Stefanie Tellex, George Konidaris

    Abstract: Learning continuous control in high-dimensional sparse reward settings, such as robotic manipulation, is a challenging problem due to the number of samples often required to obtain accurate optimal value and policy estimates. While many deep reinforcement learning methods have aimed at improving sample efficiency through replay or improved exploration techniques, state of the art actor-critic and…

    Submitted 28 July, 2021; originally announced July 2021.

    Comments: 5 pages, 2 figures, published at RSS 2021 workshop: Advancing Artificial Intelligence and Manipulation for Robotics: Understanding Gaps, Industry and Academic Perspectives, and Community Building

  18. arXiv:2105.08961  [pdf, other]

    cs.LG cs.AI cs.CL

    Compositional Processing Emerges in Neural Networks Solving Math Problems

    Authors: Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, Paul Smolensky, Jianfeng Gao

    Abstract: A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition. Humans can infer the structured relationships (e.g., grammatical rules) implicit in their sensory observations (e.g., auditory speech), and use this knowledge to guide the composition of simpler meanings into complex wholes. Recent progress in artificial neural networks has…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: 7 pages, 2 figures, Accepted to CogSci 2021 for poster presentation

  19. arXiv:2101.04736  [pdf, other]

    cs.RO cs.AI

    Bootstrapping Motor Skill Learning with Motion Planning

    Authors: Ben Abbatematteo, Eric Rosen, Stefanie Tellex, George Konidaris

    Abstract: Learning a robot motor skill from scratch is impractically slow; so much so that in practice, learning must be bootstrapped using a good skill policy obtained from human demonstration. However, relying on human demonstration necessarily degrades the autonomy of robots that must learn a wide variety of skills over their operational lifetimes. We propose using kinematic motion planning as a complete…

    Submitted 12 January, 2021; originally announced January 2021.

  20. arXiv:2007.06218  [pdf, ps, other]

    cs.RO cs.HC

    Steps Towards Best Practices For Robot Videos

    Authors: Eric Rosen, Stefanie Tellex, George Konidaris

    Abstract: There are unwritten guidelines for how to make robot videos that researchers learn from their advisors and pass on to their students. We believe that it is important for the community to collaboratively discuss and develop a standard set of best practices when making robot videos. We suggest a starting set of maxims for best robot video practices, and highlight positive examples from the community and neg…

    Submitted 13 July, 2020; originally announced July 2020.

    Comments: 4 pages, 0 figures

    ACM Class: I.2.9

  21. arXiv:1708.03655  [pdf, other]

    cs.RO cs.HC

    Communicating Robot Arm Motion Intent Through Mixed Reality Head-mounted Displays

    Authors: Eric Rosen, David Whitney, Elizabeth Phillips, Gary Chien, James Tompkin, George Konidaris, Stefanie Tellex

    Abstract: Efficient motion intent communication is necessary for safe and collaborative work environments with collocated humans and robots. Humans efficiently communicate their motion intent to other humans through gestures, gaze, and social cues. However, robots often have difficulty efficiently communicating their motion intent to humans via these methods. Many existing methods for robot motion intent co…

    Submitted 11 August, 2017; originally announced August 2017.

  22. arXiv:1703.04481  [pdf, ps, other]

    cs.CL

    Geometrical morphology

    Authors: John Goldsmith, Eric Rosen

    Abstract: We explore inflectional morphology as an example of the relationship of the discrete and the continuous in linguistics. The grammar requests a form of a lexeme by specifying a set of feature values, which corresponds to a corner M of a hypercube in feature value space. The morphology responds to that request by providing a morpheme, or a set of morphemes, whose vector sum is geometrically closest…

    Submitted 13 March, 2017; originally announced March 2017.

    Comments: 42 pages

    Report number: TR-2017-2