Search | arXiv e-print repository

RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Authors: Soroush Nasiriany, Abhiram Maddukuri, Lance Zhang, Adeet Parikh, Aaron Lo, Abhishek Joshi, Ajay Mandlekar, Yuke Zhu

Abstract: Recent advancements in Artificial Intelligence (AI) have largely been propelled by scaling. In Robotics, scaling is hindered by the lack of access to massive robot datasets. We advocate using realistic physical simulation as a means to scale environments, tasks, and datasets for robot learning methods. We present RoboCasa, a large-scale simulation framework for training generalist robots in everyday environments. RoboCasa features realistic and diverse scenes focusing on kitchen environments. We provide thousands of 3D assets across over 150 object categories and dozens of interactable furniture and appliances. We enrich the realism and diversity of our simulation with generative AI tools, such as object assets from text-to-3D models and environment textures from text-to-image models. We design a set of 100 tasks for systematic evaluation, including composite tasks generated by the guidance of large language models. To facilitate learning, we provide high-quality human demonstrations and integrate automated trajectory generation methods to substantially enlarge our datasets with minimal human burden. Our experiments show a clear scaling trend in using synthetically generated robot data for large-scale imitation learning and show great promise in harnessing simulation data in real-world tasks. Videos and open-source code are available at https://robocasa.ai/

Submitted 4 June, 2024; originally announced June 2024.

Comments: RSS 2024

arXiv:2310.11609 [pdf, other]

doi 10.1063/5.0196620

Reflection-Equivariant Diffusion for 3D Structure Determination from Isotopologue Rotational Spectra in Natural Abundance

Authors: Austin Cheng, Alston Lo, Santiago Miret, Brooks Pate, Alán Aspuru-Guzik

Abstract: Structure determination is necessary to identify unknown organic molecules, such as those in natural products, forensic samples, the interstellar medium, and laboratory syntheses. Rotational spectroscopy enables structure determination by providing accurate 3D information about small organic molecules via their moments of inertia. Using these moments, Kraitchman analysis determines isotopic substitution coordinates, which are the unsigned $|x|,|y|,|z|$ coordinates of all atoms with natural isotopic abundance, including carbon, nitrogen, and oxygen. While unsigned substitution coordinates can verify guesses of structures, the missing $+/-$ signs make it challenging to determine the actual structure from the substitution coordinates alone. To tackle this inverse problem, we develop KREED (Kraitchman REflection-Equivariant Diffusion), a generative diffusion model that infers a molecule's complete 3D structure from its molecular formula, moments of inertia, and unsigned substitution coordinates of heavy atoms. KREED's top-1 predictions identify the correct 3D structure with >98% accuracy on the QM9 and GEOM datasets when provided with substitution coordinates of all heavy atoms with natural isotopic abundance. When substitution coordinates are restricted to only a subset of carbons, accuracy is retained at 91% on QM9 and 32% on GEOM. On a test set of experimentally measured substitution coordinates gathered from the literature, KREED predicts the correct all-atom 3D structure in 25 of 33 cases, demonstrating experimental applicability for context-free 3D structure determination with rotational spectroscopy.

Submitted 19 November, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: added software citations

Journal ref: J. Chem. Phys. 160, 124115 (2024)

arXiv:2305.06307 [pdf, other]

Analysis of Adversarial Image Manipulations

Authors: Ahsi Lo, Gabriella Pangelinan, Michael C. King

Abstract: As virtual and physical identity grow increasingly intertwined, the importance of privacy and security in the online sphere becomes paramount. In recent years, multiple news stories have emerged of private companies scraping web content and doing research with or selling the data. Images uploaded online can be scraped without users' consent or knowledge. Users of social media platforms whose images are scraped may be at risk of being identified in other uploaded images or in real-world identification situations. This paper investigates how simple, accessible image manipulation techniques affect the accuracy of facial recognition software in identifying an individual's various face images based on one unique image.

Submitted 10 May, 2023; originally announced May 2023.

arXiv:2305.05833 [pdf, other]

A Statistical Model of Bipartite Networks: Application to Cosponsorship in the United States Senate

Authors: Adeline Lo, Santiago Olivella, Kosuke Imai

Abstract: Many networks in political and social research are bipartite, with edges connecting exclusively across two distinct types of nodes. A common example includes cosponsorship networks, in which legislators are connected indirectly through the bills they support. Yet most existing network models are designed for unipartite networks, where edges can arise between any pair of nodes. However, using a unipartite network model to analyze bipartite networks, as often done in practice, can result in aggregation bias and artificially high-clustering -- a particularly insidious problem when studying the role groups play in network formation. To address these methodological problems, we develop a statistical model of bipartite networks theorized to be generated through group interactions by extending the popular mixed-membership stochastic blockmodel. Our model allows researchers to identify the groups of nodes, within each node type in the bipartite structure, that share common patterns of edge formation. The model also incorporates both node and dyad-level covariates as the predictors of group membership and of observed dyadic relations. We develop an efficient computational algorithm for fitting the model, and apply it to cosponsorship data from the United States Senate. We show that legislators in a Senate that was perfectly split along party lines were able to remain productive and pass major legislation by forming non-partisan, power-brokering coalitions that found common ground through their collaboration on low-stakes bills. We also find evidence for norms of reciprocity, and uncover the substantial role played by policy expertise in the formation of cosponsorships between senators and legislation. We make an open-source software package available that makes it possible for other researchers to uncover similar insights from bipartite networks.

Submitted 27 June, 2024; v1 submitted 9 May, 2023; originally announced May 2023.

Comments: 41 pages (main text), 6 pages (appendix), 19 pages (online SI)

arXiv:2302.03620 [pdf, other]

doi 10.1039/D3DD00044C

Recent advances in the Self-Referencing Embedding Strings (SELFIES) library

Authors: Alston Lo, Robert Pollice, AkshatKumar Nigam, Andrew D. White, Mario Krenn, Alán Aspuru-Guzik

Abstract: String-based molecular representations play a crucial role in cheminformatics applications, and with the growing success of deep learning in chemistry, have been readily adopted into machine learning pipelines. However, traditional string-based representations such as SMILES are often prone to syntactic and semantic errors when produced by generative models. To address these problems, a novel representation, SELF-referencIng Embedded Strings (SELFIES), was proposed that is inherently 100% robust, alongside an accompanying open-source implementation. Since then, we have generalized SELFIES to support a wider range of molecules and semantic constraints and streamlined its underlying grammar. We have implemented this updated representation in subsequent versions of selfies, where we have also made major advances with respect to design, efficiency, and supported features. Hence, we present the current status of selfies (version 2.1.1) in this manuscript.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 11 pages, 2 figures

Journal ref: Digital Discovery 2, 897 (2023)

arXiv:2301.07085 [pdf, other]

Are Language Models Worse than Humans at Following Prompts? It's Complicated

Authors: Albert Webson, Alyssa Marie Loo, Qinan Yu, Ellie Pavlick

Abstract: Prompts have been the center of progress in advancing language models' zero-shot and few-shot performance. However, recent work finds that models can perform surprisingly well when given intentionally irrelevant or misleading prompts. Such results may be interpreted as evidence that model behavior is not "human like". In this study, we challenge a central assumption in such work: that humans would perform badly when given pathological instructions. We find that humans are able to reliably ignore irrelevant instructions and thus, like models, perform well on the underlying task despite an apparent lack of signal regarding the task they are being asked to do. However, when given deliberately misleading instructions, humans follow the instructions faithfully, whereas models do not. Our findings caution that future research should not idealize human behaviors as a monolith and should not train or evaluate models to mimic assumptions about these behaviors without first validating humans' behaviors empirically.

Submitted 11 November, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

Comments: EMNLP 2023

arXiv:2209.05364 [pdf, other]

If Influence Functions are the Answer, Then What is the Question?

Authors: Juhan Bae, Nathan Ng, Alston Lo, Marzyeh Ghassemi, Roger Grosse

Abstract: Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters. While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks. In this work, we investigate the specific factors that cause this discrepancy by decomposing it into five separate terms. We study the contributions of each term on a variety of architectures and datasets and how they vary with factors such as network width and training time. While practical influence function estimates may be a poor match to leave-one-out retraining for nonlinear networks, we show they are often a good approximation to a different object we term the proximal Bregman response function (PBRF). Since the PBRF can still be used to answer many of the questions motivating influence functions, such as identifying influential or mislabeled examples, our results suggest that current algorithms for influence function estimation give more informative results than previous error analyses would suggest.

Submitted 12 September, 2022; originally announced September 2022.

Comments: 28 pages, 6 figures

arXiv:2204.00056 [pdf, other]

doi 10.1016/j.patter.2022.100588

SELFIES and the future of molecular string representations

Authors: Mario Krenn, Qianxiang Ai, Senja Barthel, Nessa Carson, Angelo Frei, Nathan C. Frey, Pascal Friederich, Théophile Gaudin, Alberto Alexander Gayle, Kevin Maik Jablonka, Rafael F. Lameiro, Dominik Lemm, Alston Lo, Seyed Mohamad Moosavi, José Manuel Nápoles-Duarte, AkshatKumar Nigam, Robert Pollice, Kohulan Rajan, Ulrich Schatzschneider, Philippe Schwaller, Marta Skreta, Berend Smit, Felix Strieth-Kalthoff, Chong Sun, Gary Tom , et al. (6 additional authors not shown)

Abstract: Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings -- most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELFIES (SELF-referencIng Embedded Strings). SELFIES has since simplified and enabled numerous new applications in chemistry. In this manuscript, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete Future Projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.

Submitted 31 March, 2022; originally announced April 2022.

Comments: 34 pages, 15 figures, comments and suggestions for additional references are welcome!

Journal ref: Cell Patterns 3(10), 100588(2022)

arXiv:2109.01232 [pdf, other]

A Study of Mixed Precision Strategies for GMRES on GPUs

Authors: Jennifer A. Loe, Christian A. Glusa, Ichitaro Yamazaki, Erik G. Boman, Sivasankaran Rajamanickam

Abstract: Support for lower precision computation is becoming more common in accelerator hardware due to lower power usage, reduced data movement and increased computational performance. However, computational science and engineering (CSE) problems require double precision accuracy in several domains. This conflict between hardware trends and application needs has resulted in a need for mixed precision strategies at the linear algebra algorithms level if we want to exploit the hardware to its full potential while meeting the accuracy requirements. In this paper, we focus on preconditioned sparse iterative linear solvers, a key kernel in several CSE applications. We present a study of mixed precision strategies for accelerating this kernel on an NVIDIA V100 GPU with a Power 9 CPU. We seek the best methods for incorporating multiple precisions into the GMRES linear solver; these include iterative refinement and parallelizable preconditioners. Our work presents strategies to determine when mixed precision GMRES will be effective and to choose parameters for a mixed precision iterative refinement solver to achieve better performance. We use an implementation that is based on the Trilinos library and employs Kokkos Kernels for performance portability of linear algebra kernels. Performance results demonstrate the promise of mixed precision approaches and demonstrate even further improvements are possible by optimizing low-level kernels.

Submitted 2 September, 2021; originally announced September 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2105.07544

arXiv:2105.07544 [pdf, other]

Experimental Evaluation of Multiprecision Strategies for GMRES on GPUs

Authors: Jennifer A. Loe, Christian A. Glusa, Ichitaro Yamazaki, Erik G. Boman, Sivasankaran Rajamanickam

Abstract: Support for lower precision computation is becoming more common in accelerator hardware due to lower power usage, reduced data movement and increased computational performance. However, computational science and engineering (CSE) problems require double precision accuracy in several domains. This conflict between hardware trends and application needs has resulted in a need for multiprecision strategies at the linear algebra algorithms level if we want to exploit the hardware to its full potential while meeting the accuracy requirements. In this paper, we focus on preconditioned sparse iterative linear solvers, a key kernel in several CSE applications. We present a study of multiprecision strategies for accelerating this kernel on GPUs. We seek the best methods for incorporating multiple precisions into the GMRES linear solver; these include iterative refinement and parallelizable preconditioners. Our work presents strategies to determine when multiprecision GMRES will be effective and to choose parameters for a multiprecision iterative refinement solver to achieve better performance. We use an implementation that is based on the Trilinos library and employs Kokkos Kernels for performance portability of linear algebra kernels. Performance results demonstrate the promise of multiprecision approaches and demonstrate even further improvements are possible by optimizing low-level kernels.

Submitted 16 May, 2021; originally announced May 2021.

Comments: Accepted for publication in the IEEE IPDPS Accelerators and Hybrid Emerging Systems (AsHES) 11th Workshop, 2021

arXiv:2010.13385 [pdf, other]

LB Scalability: Achieving the Right Balance Between Being Stateful and Stateless

Authors: Reuven Cohen, Matty Kadosh, Alan Lo, Qasem Sayah

Abstract: A high performance Layer-4 load balancer (LB) is one of the most important components of a cloud service infrastructure. Such an LB uses network and transport layer information for deciding how to distribute client requests across a group of servers. A crucial requirement for a stateful LB is per connection consistency (PCC); namely, that all the packets of the same connection will be forwarded to the same server, as long as the server is alive, even if the pool of servers or the assignment function changes. The challenge is in designing a high throughput, low latency solution that is also scalable. This paper proposes a highly scalable LB, called Prism, implemented using a programmable switch ASIC. As far as we know, Prism is the first reported LB that can process millions of connections per second and hundreds of millions of connections in total, while ensuring PCC. This is due to the fact that Prism forwards all the packets in hardware, even during server pool changes, while avoiding the need to maintain a hardware state per every active connection. We implemented a prototype of the proposed architecture and showed that Prism can scale to 100 million simultaneous connections, and can accommodate more than one pool update per second.

Submitted 26 October, 2020; originally announced October 2020.

arXiv:2008.01937 [pdf, other]

doi 10.1371/journal.pcbi.1008967

Antibody Watch: Text Mining Antibody Specificity from the Literature

Authors: Chun-Nan Hsu, Chia-Hui Chang, Thamolwan Poopradubsil, Amanda Lo, Karen A. William, Ko-Wei Lin, Anita Bandrowski, Ibrahim Burak Ozyurt, Jeffrey S. Grethe, Maryann E. Martone

Abstract: Antibodies are widely used reagents to test for expression of proteins and other antigens. However, they might not always reliably produce results when they do not specifically bind to the target proteins that their providers designed them for, leading to unreliable research results. While many proposals have been developed to deal with the problem of antibody specificity, it is still challenging to cover the millions of antibodies that are available to researchers. In this study, we investigate the feasibility of automatically generating alerts to users of problematic antibodies by extracting statements about antibody specificity reported in the literature. The extracted alerts can be used to construct an "Antibody Watch" knowledge base containing supporting statements of problematic antibodies. We developed a deep neural network system and tested its performance with a corpus of more than two thousand articles that reported uses of antibodies. We divided the problem into two tasks. Given an input article, the first task is to identify snippets about antibody specificity and classify if the snippets report that any antibody exhibits non-specificity, and thus is problematic. The second task is to link each of these snippets to one or more antibodies mentioned in the snippet. The experimental evaluation shows that our system can accurately perform both classification and linking tasks with weighted F-scores over 0.925 and 0.923, respectively, and 0.914 overall when combined to complete the joint task. We leveraged Research Resource Identifiers (RRID) to precisely identify antibodies linked to the extracted specificity snippets. The result shows that it is feasible to construct a reliable knowledge base about problematic antibodies by text mining.

Submitted 11 November, 2020; v1 submitted 5 August, 2020; originally announced August 2020.

Comments: 16 pages, 1 figures

Journal ref: PLOS Computational Biology, 2021

arXiv:1709.04937 [pdf, other]

Spanning trees with few branch vertices

Authors: Louis DeBiasio, Allan Lo

Abstract: A branch vertex in a tree is a vertex of degree at least three. We prove that, for all $s\geq 1$, every connected graph on $n$ vertices with minimum degree at least $(\frac{1}{s+3}+o(1))n$ contains a spanning tree having at most $s$ branch vertices. Asymptotically, this is best possible and solves, in less general form, a problem of Flandrin, Kaiser, Kužel, Li and Ryjáček, which was originally motivated by an optimization problem in the design of optical networks.

Submitted 9 October, 2019; v1 submitted 14 September, 2017; originally announced September 2017.

Comments: 20 pages, 2 figures, to appear in SIAM J. of Discrete Math
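
The definition of a branch vertex in the abstract above can be made concrete with a short sketch. The Python below is an illustration only and is not taken from the paper; the function name `branch_vertices` and the edge-list input format are assumptions chosen for the example.

```python
from collections import defaultdict

def branch_vertices(edges):
    """Return the vertices of degree at least three in a tree given as an edge list.

    `edges` is an iterable of (u, v) pairs; this input format is an assumption
    made for the illustration, not something specified in the abstract.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {vertex for vertex, d in degree.items() if d >= 3}

# A path has no branch vertices; a star with three leaves has exactly one.
path = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]
print(len(branch_vertices(path)))
print(len(branch_vertices(star)))
```

In this vocabulary, a spanning tree with zero branch vertices is a Hamiltonian path, so the bound of at most $s$ branch vertices measures how far a spanning tree may deviate from a path.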

arXiv:1709.04300 [pdf, other]

Is Smaller Better: A Proposal To Consider Bacteria For Biologically Inspired Modeling

Authors: Archana Ram, Andrew Lo

Abstract: Bacteria are easily characterizable model organisms with an impressively complicated set of capabilities. Among their capabilities is quorum sensing, a detailed cell-cell signaling system that may have a common origin with eukaryotic cell-cell signaling. Not only are the two phenomena similar, but quorum sensing, as is the case with any bacterial phenomenon when compared to eukaryotes, is also easier to study in depth than eukaryotic cell-cell signaling. This ease of study is a contrast to the only partially understood cellular dynamics of neurons. Here we review the literature on the strikingly neuron-like qualities of bacterial colonies and biofilms, including ion-based and hormonal signaling, and action potential-like behavior. This allows them to feasibly act as an analog for neurons that could produce more detailed and more accurate biologically-based computational models. Using bacteria as the basis for biologically feasible computational models may allow models to better harness the tremendous ability of biological organisms to make decisions and process information. Additionally, principles gleaned from bacterial function have the potential to influence computational efforts divorced from biology, just as neuronal function has in the abstract influenced countless machine learning efforts.

Submitted 11 September, 2017; originally announced September 2017.

arXiv:1610.07129 [pdf, ps, other]

Developing and Assessing MATLAB Exercises for Active Concept Learning

Authors: S. H. Song, Marco Antonelli, Tony Fung, Brandon D. Armstrong, Amy Chong, Albert Lo, Bertram E. Shi

Abstract: New technologies, such as MOOCs, provide innovative methods to tackle new challenges in teaching and learning, such as globalization and changing contemporary culture and to remove the limits of conventional classrooms. However, they also bring challenges in course delivery and assessment, due to factors such as less direct student-instructor interaction. These challenges are especially severe in engineering education, which relies heavily on experiential learning, such as computer simulations and laboratory exercises, to assist students in understanding concepts. As a result, effective design of experiential learning components is extremely critical for engineering MOOCs. In this paper, we will share our experience gained through developing and offering a MOOC on communication systems, with special focus on the development and assessment of MATLAB exercises for active concept learning. Our approach introduced students to concepts using learning components commonly provided by many MOOC platforms (e.g., online lectures and quizzes), and augmented the student experience with MATLAB based computer simulations and exercises to enable more concrete and detailed understanding of the material. We describe here a systematic approach to MATLAB problem design and assessment, based on our experience with the MATLAB server provided by MathWorks and integrated with the edX MOOC platform. We discuss the effectiveness of the instructional methods as evaluated through students' learning performance. We analyze the impact of the course design tools from both the instructor and the student perspective.

Submitted 23 October, 2016; originally announced October 2016.

Comments: Submitted to IEEE Transactions on Education

arXiv:1111.5228 [pdf, other]

Privacy-Preserving Methods for Sharing Financial Risk Exposures

Authors: Emmanuel A. Abbe, Amir E. Khandani, Andrew W. Lo

Abstract: Unlike other industries in which intellectual property is patentable, the financial industry relies on trade secrecy to protect its business processes and methods, which can obscure critical financial risk exposures from regulators and the public. We develop methods for sharing and aggregating such risk exposures that protect the privacy of all parties involved and without the need for a trusted third party. Our approach employs secure multi-party computation techniques from cryptography in which multiple parties are able to compute joint functions without revealing their individual inputs. In our framework, individual financial institutions evaluate a protocol on their proprietary data which cannot be inverted, leading to secure computations of real-valued statistics such as concentration indexes, pairwise correlations, and other single- and multi-point statistics. The proposed protocols are computationally tractable on realistic sample sizes. Potential financial applications include: the construction of privacy-preserving real-time indexes of bank capital and leverage ratios; the monitoring of delegated portfolio investments; financial audits; and the publication of new indexes of proprietary trading strategies.

Submitted 24 November, 2011; v1 submitted 19 November, 2011; originally announced November 2011.

arXiv:1002.4592 [pdf, ps, other]

Is It Real, or Is It Randomized?: A Financial Turing Test

Authors: Jasmina Hasanhodzic, Andrew W. Lo, Emanuele Viola

Abstract: We construct a financial "Turing test" to determine whether human subjects can differentiate between actual vs. randomized financial returns. The experiment consists of an online video-game (http://arora.ccs.neu.edu) where players are challenged to distinguish actual financial market returns from random temporal permutations of those returns. We find overwhelming statistical evidence (p-values no greater than 0.5%) that subjects can consistently distinguish between the two types of time series, thereby refuting the widespread belief that financial markets "look random." A key feature of the experiment is that subjects are given immediate feedback regarding the validity of their choices, allowing them to learn and adapt. We suggest that such novel interfaces can harness human capabilities to process and extract information from financial data in ways that computers cannot.

Submitted 24 February, 2010; originally announced February 2010.

Comments: 12 pages, 6 figures

arXiv:0908.4580 [pdf, ps, other]

A Computational View of Market Efficiency

Authors: Jasmina Hasanhodzic, Andrew W. Lo, Emanuele Viola

Abstract: We propose to study market efficiency from a computational viewpoint. Borrowing from theoretical computer science, we define a market to be \emph{efficient with respect to resources $S$} (e.g., time, memory) if no strategy using resources $S$ can make a profit. As a first step, we consider memory-$m$ strategies whose action at time $t$ depends only on the $m$ previous observations at times $t-m,...,t-1$. We introduce and study a simple model of market evolution, where strategies impact the market by their decision to buy or sell. We show that the effect of optimal strategies using memory $m$ can lead to "market conditions" that were not present initially, such as (1) market bubbles and (2) the possibility for a strategy using memory $m' > m$ to make a bigger profit than was initially possible. We suggest ours as a framework to rationalize the technological arms race of quantitative trading firms.

Submitted 31 August, 2009; originally announced August 2009.

Showing 1–18 of 18 results for author: Lo, A