Showing 1–9 of 9 results for author: Lentz, M

  1. arXiv:2407.00467  [pdf, other]

    cs.LG cs.DC eess.IV

    VcLLM: Video Codecs are Secretly Tensor Codecs

    Authors: Ceyu Xu, Yongji Wu, Xinyu Yang, Beidi Chen, Matthew Lentz, Danyang Zhuo, Lisa Wu Wills

    Abstract: As the parameter size of large language models (LLMs) continues to expand, the need for a large memory footprint and high communication bandwidth have become significant bottlenecks for the training and inference of LLMs. To mitigate these bottlenecks, various tensor compression techniques have been proposed to reduce the data size, thereby alleviating memory requirements and communication pressur…

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2402.12280  [pdf, other]

    cs.CL cs.AI

    Adaptive Skeleton Graph Decoding

    Authors: Shuowei Jin, Yongji Wu, Haizhong Zheng, Qingzhao Zhang, Matthew Lentz, Z. Morley Mao, Atul Prakash, Feng Qian, Danyang Zhuo

    Abstract: Large language models (LLMs) have seen significant adoption for natural language tasks, owing their success to massive numbers of model parameters (e.g., 70B+); however, LLM inference incurs significant computation and memory costs. Recent approaches propose parallel decoding strategies, such as Skeleton-of-Thought (SoT), to improve performance by breaking prompts down into sub-problems that can b…

    Submitted 19 February, 2024; originally announced February 2024.

  3. arXiv:2401.12230  [pdf, other]

    cs.DC cs.LG

    Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native

    Authors: Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz, Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang, Qizhen Zhang, Tianyi Zhou, Danyang Zhuo

    Abstract: In this paper, we investigate the intersection of large generative AI models and cloud-native computing architectures. Recent large models such as ChatGPT, while revolutionary in their capabilities, face challenges like escalating costs and demand for high-end GPUs. Drawing analogies between large-model-as-a-service (LMaaS) and cloud database-as-a-service (DBaaS), we describe an AI-native computin…

    Submitted 17 January, 2024; originally announced January 2024.

  4. arXiv:2304.07349  [pdf, other]

    cs.NI cs.OS

    Remote Procedure Call as a Managed System Service

    Authors: Jingrong Chen, Yongji Wu, Shihan Lin, Yechen Xu, Xinhao Kong, Thomas Anderson, Matthew Lentz, Xiaowei Yang, Danyang Zhuo

    Abstract: Remote Procedure Call (RPC) is a widely used abstraction for cloud computing. The programmer specifies type information for each remote procedure, and a compiler generates stub code linked into each application to marshal and unmarshal arguments into message buffers. Increasingly, however, application and service operations teams need a high degree of visibility and control over the flow of RPCs b…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: NSDI 2023

  5. arXiv:2207.00592  [pdf, other]

    cs.DC cs.NI

    Dissecting Service Mesh Overheads

    Authors: Xiangfeng Zhu, Guozhen She, Bowen Xue, Yu Zhang, Yongsu Zhang, Xuan Kelvin Zou, Xiongchun Duan, Peng He, Arvind Krishnamurthy, Matthew Lentz, Danyang Zhuo, Ratul Mahajan

    Abstract: Service meshes play a central role in the modern application ecosystem by providing an easy and flexible way to connect different services that form a distributed application. However, because of the way they interpose on application traffic, they can substantially increase application latency and resource consumption. We develop a decompositional approach and a tool, called MeshInsight, to system…

    Submitted 2 July, 2022; originally announced July 2022.

  6. arXiv:2205.04713  [pdf, other]

    cs.LG cs.DB cs.DC

    Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures

    Authors: Yongji Wu, Matthew Lentz, Danyang Zhuo, Yao Lu

    Abstract: With the advent of ubiquitous deployment of smart devices and the Internet of Things, data sources for machine learning inference have increasingly moved to the edge of the network. Existing machine learning inference platforms typically assume a homogeneous infrastructure and do not take into account the more complex and tiered computing infrastructure that includes edge devices, local hubs, edge…

    Submitted 3 August, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

  7. arXiv:2011.08069  [pdf, other]

    cs.CR cs.CY cs.SI q-bio.PE

    Reconciling Security and Utility in Next-Generation Epidemic Risk Mitigation Systems

    Authors: Pierfrancesco Ingo, Nichole Boufford, Ming Cheng Jiang, Rowan Lindsay, Matthew Lentz, Gilles Barthe, Manuel Gomez-Rodriguez, Bernhard Schölkopf, Deepak Garg, Peter Druschel, Aastha Mehta

    Abstract: Epidemics like the recent COVID-19 require proactive contact tracing and epidemiological analysis to predict and subsequently contain infection transmissions. The proactive measures require large scale data collection, which simultaneously raise concerns regarding users' privacy. Digital contact tracing systems developed in response to COVID-19 either collected extensive data for effective analyti…

    Submitted 9 May, 2024; v1 submitted 16 November, 2020; originally announced November 2020.

  8. arXiv:2001.08840  [pdf, other]

    cs.CR

    SeCloak: ARM Trustzone-based Mobile Peripheral Control

    Authors: Matthew Lentz, Rijurekha Sen, Peter Druschel, Bobby Bhattacharjee

    Abstract: Reliable on-off control of peripherals on smart devices is a key to security and privacy in many scenarios. Journalists want to reliably turn off radios to protect their sources during investigative reporting. Users wish to ensure cameras and microphones are reliably off during private meetings. In this paper, we present SeCloak, an ARM TrustZone-based solution that ensures reliable on-off control…

    Submitted 23 January, 2020; originally announced January 2020.

  9. arXiv:1805.01190  [pdf, other]

    cond-mat.mes-hall cond-mat.quant-gas quant-ph

    Floquet Perturbation Theory: Formalism and Application to Low-Frequency Limit

    Authors: M. Rodriguez-Vega, M. Lentz, B. Seradjeh

    Abstract: We develop a low-frequency perturbation theory in the extended Floquet Hilbert space of a periodically driven quantum systems, which puts the high- and low-frequency approximations to the Floquet theory on the same footing. It captures adiabatic perturbation theories recently discussed in the literature as well as diabatic deviation due to Floquet resonances. For illustration, we apply our Floquet…

    Submitted 28 September, 2018; v1 submitted 3 May, 2018; originally announced May 2018.

    Comments: v2: 28 single-column pages, 5 figures; various typos fixed; some notation and connection to other perturbation schemes clarified; new, more descriptive title and abstract. Published version

    Journal ref: New J. Phys. 20, 093022 (2018)