Showing 1–14 of 14 results for author: Uta, A

  1. arXiv:2312.00391  [pdf, ps, other]

    math.OC

    Inverse-Optimization-Based Uncertainty Set for Robust Linear Optimization

    Authors: Ayaka Ueta, Mirai Tanaka, Ken Kobayashi, Kazuhide Nakata

    Abstract: We consider solving linear optimization (LO) problems with uncertain objective coefficients. For such problems, we often employ robust optimization (RO) approaches by introducing an uncertainty set for the unknown coefficients. Typical RO approaches require observations or prior knowledge of the unknown coefficient to define an appropriate uncertainty set. However, such information may not always…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 6 pages, 1 figure, To appear in Proceedings of International Conference on Operations Research 2023

  2. Log Parsing Evaluation in the Era of Modern Software Systems

    Authors: Stefan Petrescu, Floris den Hengst, Alexandru Uta, Jan S. Rellermeyer

    Abstract: Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, thereby requiring automating log analysis to derive insights about the functioning of the systems. Motivated by an industry use-case, we zoom in on one integral part of automated log analysis, log parsing, which is t…

    Submitted 17 August, 2023; originally announced August 2023.

  3. arXiv:2206.03259  [pdf]

    cs.CY

    Future Computer Systems and Networking Research in the Netherlands: A Manifesto

    Authors: Alexandru Iosup, Fernando Kuipers, Ana Lucia Varbanescu, Paola Grosso, Animesh Trivedi, Jan Rellermeyer, Lin Wang, Alexandru Uta, Francesco Regazzoni

    Abstract: Our modern society and competitive economy depend on a strong digital foundation and, in turn, on sustained research and innovation in computer systems and networks (CompSys). With this manifesto, we draw attention to CompSys as a vital part of ICT. Among ICT technologies, CompSys covers all the hardware and all the operational software layers that enable applications; only application-specific de…

    Submitted 26 May, 2022; originally announced June 2022.

    Comments: Position paper: 7 foundational research themes in computer science and networking research, 4 advances with outstanding impact on society, 10 recommendations, 50 pages. Co-signatories from (alphabetical order): ASTRON, CWI, Gaia-X NL, NIKHEF, RU Groningen, SIDN Labs, Solvinity, SURF, TNO, TU/e, TU Delft, UvA, U. Leiden, U. Twente, VU Amsterdam

    ACM Class: A.1; A.m; C.0; D.4; J.0; K.3; K.4; K.6

  4. arXiv:2204.06074  [pdf, other]

    cs.DC

    Skyhook: Towards an Arrow-Native Storage System

    Authors: Jayjeet Chakraborty, Ivo Jimenez, Sebastiaan Alvarez Rodriguez, Alexandru Uta, Jeff LeFevre, Carlos Maltzahn

    Abstract: With the ever-increasing dataset sizes, several file formats such as Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec, the CPU has become the bottleneck trying to keep up feeding…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2105.09894

  5. Tiny Autoscalers for Tiny Workloads: Dynamic CPU Allocation for Serverless Functions

    Authors: Yuxuan Zhao, Alexandru Uta

    Abstract: In serverless computing, applications are executed under lightweight virtualization and isolation environments, such as containers or micro virtual machines. Typically, their memory allocation is set by the user before deployment. All other resources, such as CPU, are allocated by the provider statically and proportionally to memory allocations. This contributes to either under-utilization or thro…

    Submitted 31 October, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: Published in 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2022

  6. arXiv:2112.06280  [pdf, other]

    cs.DC

    In-Memory Indexed Caching for Distributed Data Processing

    Authors: Alexandru Uta, Bogdan Ghit, Ankur Dave, Jan Rellermeyer, Peter Boncz

    Abstract: Powerful abstractions such as dataframes are only as efficient as their underlying runtime system. The de facto distributed data processing framework, Apache Spark, is poorly suited for modern cloud-based data-science workloads due to its outdated assumptions: static datasets analyzed using coarse-grained transformations. In this paper, we introduce the Indexed DataFrame, an in-memory cache th…

    Submitted 8 February, 2022; v1 submitted 12 December, 2021; originally announced December 2021.

    Comments: Accepted for publication at IEEE IPDPS 2022

  7. arXiv:2107.11832  [pdf, other]

    cs.DC

    A Holistic Analysis of Datacenter Operations: Resource Usage, Energy, and Workload Characterization -- Extended Technical Report

    Authors: Laurens Versluis, Mehmet Cetin, Caspar Greeven, Kristian Laursen, Damian Podareanu, Valeriu Codreanu, Alexandru Uta, Alexandru Iosup

    Abstract: Improving datacenter operations is vital for the digital society. We posit that doing so requires our community to shift, from operational aspects taken in isolation to holistic analysis of datacenter resources, energy, and workloads. In turn, this shift will require new analysis methods, and open-access, FAIR datasets with fine temporal and spatial granularity. We leverage in this work one of the…

    Submitted 25 July, 2021; originally announced July 2021.

  8. arXiv:2106.13020  [pdf, other]

    cs.DC

    Zero-Cost, Arrow-Enabled Data Interface for Apache Spark

    Authors: Sebastiaan Alvarez Rodriguez, Jayjeet Chakraborty, Aaron Chu, Ivo Jimenez, Jeff LeFevre, Carlos Maltzahn, Alexandru Uta

    Abstract: Distributed data processing ecosystems are widespread and their components are highly specialized, such that efficient interoperability is urgent. Recently, Apache Arrow was chosen by the community to serve as a format mediator, providing efficient in-memory data representation. Arrow enables efficient data movement between data processing and storage engines, significantly improving interoperabil…

    Submitted 27 November, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

    Comments: 6 pages, 6 figures

  9. arXiv:2105.09894  [pdf, other]

    cs.DC

    Towards an Arrow-native Storage System

    Authors: Jayjeet Chakraborty, Ivo Jimenez, Sebastiaan Alvarez Rodriguez, Alexandru Uta, Jeff LeFevre, Carlos Maltzahn

    Abstract: With the ever-increasing dataset sizes, several file formats like Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec, the CPU has become the bottleneck, trying to keep up feeding d…

    Submitted 21 May, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

    Comments: 7 pages, 6 figures, workshop

  10. arXiv:2012.06171  [pdf, other]

    cs.DC cs.DB

    The Future is Big Graphs! A Community View on Graph Processing Systems

    Authors: Sherif Sakr, Angela Bonifati, Hannes Voigt, Alexandru Iosup, Khaled Ammar, Renzo Angles, Walid Aref, Marcelo Arenas, Maciej Besta, Peter A. Boncz, Khuzaima Daudjee, Emanuele Della Valle, Stefania Dumbrava, Olaf Hartig, Bernhard Haslhofer, Tim Hegeman, Jan Hidders, Katja Hose, Adriana Iamnitchi, Vasiliki Kalavri, Hugo Kapp, Wim Martens, M. Tamer Özsu, Eric Peukert, Stefan Plantikow, et al. (16 additional authors not shown)

    Abstract: Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the next decade for big graph processing to continue t…

    Submitted 11 December, 2020; originally announced December 2020.

    Comments: 12 pages, 3 figures, collaboration between the large-scale systems and data management communities, work started at the Dagstuhl Seminar 19491 on Big Graph Processing Systems, to be published in the Communications of the ACM

    ACM Class: C.3; E.0; H.2; J.0

  11. arXiv:2011.15028  [pdf, other]

    cs.DC cs.DB

    The LDBC Graphalytics Benchmark

    Authors: Alexandru Iosup, Ahmed Musaafir, Alexandru Uta, Arnau Prat Pérez, Gábor Szárnyas, Hassan Chafi, Ilie Gabriel Tănase, Lifeng Nai, Michael Anderson, Mihai Capotă, Narayanan Sundaram, Peter Boncz, Siegfried Depner, Stijn Heldens, Thomas Manhardt, Tim Hegeman, Wing Lung Ngai, Yinglong Xia

    Abstract: In this document, we describe LDBC Graphalytics, an industrial-grade benchmark for graph analysis platforms. The main goal of Graphalytics is to enable the fair and objective comparison of graph analysis platforms. Due to the diversity of bottlenecks and performance issues such platforms need to address, Graphalytics consists of a set of selected deterministic algorithms for full-graph analysis, s…

    Submitted 6 April, 2023; v1 submitted 30 November, 2020; originally announced November 2020.

    ACM Class: C.4; H.2.4

  12. arXiv:2003.04824  [pdf, other]

    cs.PF cs.DC

    In Datacenter Performance, The Only Constant Is Change

    Authors: Dmitry Duplyakin, Alexandru Uta, Aleksander Maricq, Robert Ricci

    Abstract: All computing infrastructure suffers from performance variability, be it bare-metal or virtualized. This phenomenon originates from many sources: some transient, such as noisy neighbors, and others more permanent but sudden, such as changes or wear in hardware, changes in the underlying hypervisor stack, or even undocumented interactions between the policies of the computing resource provider and…

    Submitted 10 March, 2020; originally announced March 2020.

    Comments: To be presented at the 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid, http://cloudbus.org/ccgrid2020/) on May 11-14, 2020 in Melbourne, Victoria, Australia

  13. arXiv:1912.09256  [pdf, other]

    cs.PF cs.DC

    Is Big Data Performance Reproducible in Modern Cloud Networks?

    Authors: Alexandru Uta, Alexandru Custura, Dmitry Duplyakin, Ivo Jimenez, Jan Rellermeyer, Carlos Maltzahn, Robert Ricci, Alexandru Iosup

    Abstract: Performance variability has been acknowledged as a problem for over a decade by cloud practitioners and performance engineers. Yet, our survey of top systems conferences reveals that the research community regularly disregards variability when running experiments in the cloud. Focusing on networks, we assess the impact of variability on cloud-based big-data workloads by gathering traces from mains…

    Submitted 19 December, 2019; originally announced December 2019.

    Comments: 12 pages paper, 3 pages references

  14. arXiv:1802.05465  [pdf, other]

    cs.DC cs.SE

    Massivizing Computer Systems: a Vision to Understand, Design, and Engineer Computer Ecosystems through and beyond Modern Distributed Systems

    Authors: Alexandru Iosup, Alexandru Uta, Laurens Versluis, Georgios Andreadis, Erwin van Eyk, Tim Hegeman, Sacheendra Talluri, Vincent van Beek, Lucian Toader

    Abstract: Our society is digital: industry, science, governance, and individuals depend, often transparently, on the inter-operation of large numbers of distributed computer systems. Although the society takes them almost for granted, these computer ecosystems are not available for all, may not be affordable for long, and raise numerous other research challenges. Inspired by these challenges and by our expe…

    Submitted 22 February, 2018; v1 submitted 15 February, 2018; originally announced February 2018.