Search | arXiv e-print repository

Coarse-To-Fine Tensor Trains for Compact Visual Representations

Authors: Sebastian Loeschcke, Dan Wang, Christian Leth-Espensen, Serge Belongie, Michael J. Kastoryano, Sagie Benaim

Abstract: The ability to learn compact, high-quality, and easy-to-optimize representations for visual data is paramount to many applications such as novel view synthesis and 3D reconstruction. Recent work has shown substantial success in using tensor networks to design such compact and high-quality representations. However, the ability to optimize tensor-based representations, and in particular, the highly compact tensor train representation, is still lacking. This has prevented practitioners from deploying the full potential of tensor networks for visual data. To this end, we propose 'Prolongation Upsampling Tensor Train (PuTT)', a novel method for learning tensor train representations in a coarse-to-fine manner. Our method involves the prolonging or 'upsampling' of a learned tensor train representation, creating a sequence of 'coarse-to-fine' tensor trains that are incrementally refined. We evaluate our representation along three axes: (1) compression, (2) denoising capability, and (3) image completion capability. To assess these axes, we consider the tasks of image fitting, 3D fitting, and novel view synthesis, where our method shows an improved performance compared to state-of-the-art tensor-based methods. For full results see our project webpage: https://sebulo.github.io/PuTT_website/

Submitted 6 June, 2024; originally announced June 2024.

Comments: Project webpage: https://sebulo.github.io/PuTT_website/

arXiv:2405.20322 [pdf, other]

Quantum generalizations of Glauber and Metropolis dynamics

Authors: András Gilyén, Chi-Fang Chen, Joao F. Doriguello, Michael J. Kastoryano

Abstract: Classical Markov Chain Monte Carlo methods have been essential for simulating statistical physical systems and have proven well applicable to other systems with complex degrees of freedom. Motivated by the statistical physics origins, Chen, Kastoryano, and Gilyén [CKG23] proposed a continuous-time quantum thermodynamic analog to Glauber dynamic that is (i) exactly detailed balanced, (ii) efficiently implementable, and (iii) quasi-local for geometrically local systems. Physically, their construction gives a smooth variant of the Davies' generator derived from weak system-bath interaction. In this work, we give an efficiently implementable discrete-time quantum counterpart to Metropolis sampling that also enjoys the desirable features (i)-(iii). Also, we give an alternative highly coherent quantum generalization of detailed balanced dynamics that resembles another physically derived master equation, and propose a smooth interpolation between this and earlier constructions. We study generic properties of all constructions, including the uniqueness of the fixed-point and the locality of the resulting operators. We hope our results provide a systematic approach to the possible quantum generalizations of classical Glauber and Metropolis dynamics.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.16528 [pdf, other]

LoQT: Low Rank Adapters for Quantized Training

Authors: Sebastian Loeschcke, Mads Toftrup, Michael J. Kastoryano, Serge Belongie, Vésteinn Snæbjarnarson

Abstract: Training of large neural networks requires significant computational resources. Despite advances using low-rank adapters and quantization, pretraining of models such as LLMs on consumer hardware has not been possible without model sharding, offloading during training, or per-layer gradient updates. To address these limitations, we propose LoQT, a method for efficiently training quantized models. LoQT uses gradient-based tensor factorization to initialize low-rank trainable weight matrices that are periodically merged into quantized full-rank weight matrices. Our approach is suitable for both pretraining and fine-tuning of models, which we demonstrate experimentally for language modeling and downstream task adaptation. We find that LoQT enables efficient training of models up to 7B parameters on a consumer-grade 24GB GPU. We also demonstrate the feasibility of training a 13B parameter model using per-layer gradient updates on the same hardware.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2404.00295 [pdf, ps, other]

A system of hypergeometric differential equations in $m$ variables of rank $p^m$

Authors: Jyoichi Kaneko, Keiji Matsumoto, Katsuyoshi Ohara, Tomohide Terasoma

Abstract: We define a hypergeometric series in $m$ variables with $p+(p-1)m$ parameters, which reduces to the generalized hypergeometric series $_pF_{p-1}$ when $m=1$, and to Lauricella's hypergeometric series $F_C$ in $m$ variables when $p=2$. We give a system of hypergeometric differential equations annihilating the series. Under some non-integral conditions on parameters, we give an Euler type integral representation of the series, and linearly independent $p^m$ solutions to this system around a point near to the origin. We show that this system is of rank $p^m$, and determine its singular locus.

Submitted 30 March, 2024; originally announced April 2024.

Comments: 22 pages, no figure

MSC Class: 33C70; 32S40

arXiv:2403.19182 [pdf]

doi 10.7566/JPSJ.93.044707

Single-Crystal Growth and Characterization of Cuprate Superconductor (Hg,Re)Ba$_2$Ca$_2$Cu$_3$O$_{8+δ}$

Authors: Yutaro Mino, Shigeyuki Ishida, Junichiro Kato, Shungo Nakagawa, Takanari Kashiwagi, Takahiro Nozue, Nao Takeshita, Kunihiro Kihou, Chul-Ho Lee, Taichiro Nishio, Hiroshi Eisaki

Abstract: We grew (Hg,Re)Ba$_2$Ca$_2$Cu$_3$O$_{8+δ}$ ((Hg,Re)1223) single crystals with good reproducibility via the single-step flux method using monoxides as raw materials. A double-sealing method using a thick-walled quartz tube and a stainless-steel container was adopted for explosion protection. The maximum crystal size was approximately 1 mm x 1 mm in the ab plane and 0.04 mm in thickness. The crystal was square-shaped, reflecting the tetragonal crystal structure of (Hg,Re)1223. Magnetic susceptibility measurements indicated a critical temperature of 130 K. The in-plane resistivity exhibited a linear temperature dependence, indicating that the sample was close to optimal doping level. The out-of-plane resistivity was also measured, and the anisotropy parameter was 250-650 at 300 K.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 21 pages, 9 figures, 2 tables

Journal ref: J. Phys. Soc. Jpn. 93, 044707 (2024)

arXiv:2403.03004 [pdf, other]

Ultralight vector dark matter search using data from the KAGRA O3GK run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, et al. (1778 additional authors not shown)

Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 20 pages, 5 figures

Report number: LIGO-P2300250

arXiv:2401.09855 [pdf, ps, other]

Small energy scattering for radial solutions to the generalized Zakharov system

Authors: Jun Kato, Osamu Tojo

Abstract: We prove the small energy scattering for the three-dimensional generalized Zakharov system with radial symmetry based on the idea by Guo and Nakanishi (2014), which treats the usual Zakharov system. For the proof, we use the frequency-localized normal form reduction, and the radially improved Strichartz estimates. The relation between the solution to the integral equations, which includes the unusual boundary terms, and the original differential equations is also considered.

Submitted 9 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

arXiv:2311.09207 [pdf, other]

An efficient and exact noncommutative quantum Gibbs sampler

Authors: Chi-Fang Chen, Michael J. Kastoryano, András Gilyén

Abstract: Preparing thermal and ground states is an essential quantum algorithmic task for quantum simulation. In this work, we construct the first efficiently implementable and exactly detailed-balanced Lindbladian for Gibbs states of arbitrary noncommutative Hamiltonians. Our construction can also be regarded as a continuous-time quantum analog of the Metropolis-Hastings algorithm. To prepare the quantum Gibbs state, our algorithm invokes Hamiltonian simulation for a time proportional to the mixing time and the inverse temperature $β$, up to polylogarithmic factors. Moreover, the gate complexity reduces significantly for lattice Hamiltonians as the corresponding Lindblad operators are (quasi-) local (with radius $\simβ$) and only depend on local Hamiltonian patches. Meanwhile, purifying our Lindbladians yields a temperature-dependent family of frustration-free "parent Hamiltonians", prescribing an adiabatic path for the canonical purified Gibbs state (i.e., the Thermal Field Double state). These favorable features suggest that our construction is the ideal quantum algorithmic counterpart of classical Markov chain Monte Carlo sampling.

Submitted 15 November, 2023; originally announced November 2023.

Comments: 39 pages, 4 figures

arXiv:2310.14771 [pdf, other]

Evaluating the Knowledge Base Completion Potential of GPT

Authors: Blerta Veseli, Simon Razniewski, Jan-Christoph Kalo, Gerhard Weikum

Abstract: Structured knowledge bases (KBs) are an asset for search engines and other applications, but are inevitably incomplete. Language models (LMs) have been proposed for unsupervised knowledge base completion (KBC), yet their ability to do this at scale and with high accuracy remains an open question. Prior experimental studies mostly fall short because they only evaluate on popular subjects, or sample already existing facts from KBs. In this work, we perform a careful evaluation of GPT's potential to complete the largest public KB: Wikidata. We find that, despite their size and capabilities, models like GPT-3, ChatGPT and GPT-4 do not achieve fully convincing results on this task. Nonetheless, they provide solid improvements over earlier approaches with smaller LMs. In particular, we show that, with proper thresholding, GPT-3 makes it possible to extend Wikidata by 27M facts at 90% precision.

Submitted 23 October, 2023; originally announced October 2023.

Comments: 12 pages, 4 tables

Journal ref: Findings of EMNLP 2023

arXiv:2310.03011 [pdf, other]

Quantum algorithms: A survey of applications and end-to-end complexities

Authors: Alexander M. Dalzell, Sam McArdle, Mario Berta, Przemyslaw Bienias, Chi-Fang Chen, András Gilyén, Connor T. Hann, Michael J. Kastoryano, Emil T. Khabiboulline, Aleksander Kubica, Grant Salton, Samson Wang, Fernando G. S. L. Brandão

Abstract: The anticipated applications of quantum computers span across science and industry, ranging from quantum chemistry and many-body physics to optimization, finance, and machine learning. Proposed quantum solutions in these areas typically combine multiple quantum algorithmic primitives into an overall quantum algorithm, which must then incorporate the methods of quantum error correction and fault tolerance to be implemented correctly on quantum hardware. As such, it can be difficult to assess how much a particular application benefits from quantum computing, as the various approaches are often sensitive to intricate technical details about the underlying primitives and their complexities. Here we present a survey of several potential application areas of quantum algorithms and their underlying algorithmic primitives, carefully considering technical caveats and subtleties. We outline the challenges and opportunities in each area in an "end-to-end" fashion by clearly defining the problem being solved alongside the input-output model, instantiating all "oracles," and spelling out all hidden costs. We also compare quantum solutions against state-of-the-art classical methods and complexity-theoretic limitations to evaluate possible quantum speedups. The survey is written in a modular, wiki-like fashion to facilitate navigation of the content. Each primitive and application area is discussed in a standalone section, with its own bibliography of references and embedded hyperlinks that direct to other relevant sections. This structure mirrors that of complex quantum algorithms that involve several layers of abstraction, and it enables rapid evaluation of how end-to-end complexities are impacted when subroutines are altered.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Survey document with wiki-like modular structure. 337 pages, including bibliography and sub-bibliographies. Comments welcome

arXiv:2309.12075 [pdf, other]

Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation

Authors: Valentin Leonhard Buchner, Lele Cao, Jan-Christoph Kalo, Vilhelm von Ehrenheim

Abstract: Prompt Tuning is emerging as a scalable and cost-effective method to fine-tune Pretrained Language Models (PLMs), which are often referred to as Large Language Models (LLMs). This study benchmarks the performance and computational efficiency of Prompt Tuning and baselines for multi-label text classification. This is applied to the challenging task of classifying companies into an investment firm's proprietary industry taxonomy, supporting their thematic investment strategy. Text-to-text classification is frequently reported to outperform task-specific classification heads, but has several limitations when applied to a multi-label classification problem where each label consists of multiple tokens: (a) Generated labels may not match any label in the label taxonomy; (b) The fine-tuning process lacks permutation invariance and is sensitive to the order of the provided labels; (c) The model provides binary decisions rather than appropriate confidence scores. Limitation (a) is addressed by applying constrained decoding using Trie Search, which slightly improves classification performance. All limitations (a), (b), and (c) are addressed by replacing the PLM's language head with a classification head, which is referred to as Prompt Tuned Embedding Classification (PTEC). This improves performance significantly, while also reducing computational costs during inference. In our industrial application, the training data is skewed towards well-known companies. We confirm that the model's performance is consistent across both well-known and less-known companies. Our overall results indicate the continuing need to adapt state-of-the-art methods to domain-specific tasks, even in the era of PLMs with strong generalization abilities. We release our codebase and a benchmarking dataset at https://github.com/EQTPartners/PTEC.

Submitted 12 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

Comments: Accepted by NAACL 2024 industry track (6 pages, 4 figures). Source code to be found at https://github.com/EQTPartners/PTEC

MSC Class: 68T50

ACM Class: I.2.7; I.2.0

arXiv:2308.12535 [pdf, other]

SCP: Spherical-Coordinate-based Learned Point Cloud Compression

Authors: Ao Luo, Linxin Song, Keisuke Nonaka, Kyohei Unno, Heming Sun, Masayuki Goto, Jiro Katto

Abstract: In recent years, the task of learned point cloud compression has gained prominence. An important type of point cloud, the spinning LiDAR point cloud, is generated by spinning LiDAR on vehicles. This process results in numerous circular shapes and azimuthal angle invariance features within the point clouds. However, these two features have been largely overlooked by previous methodologies. In this paper, we introduce a model-agnostic method called Spherical-Coordinate-based learned Point cloud compression (SCP), designed to leverage the aforementioned features fully. Additionally, we propose a multi-level Octree for SCP to mitigate the reconstruction error for distant areas within the Spherical-coordinate-based Octree. SCP exhibits excellent universality, making it applicable to various learned point cloud compression techniques. Experimental results demonstrate that SCP surpasses previous state-of-the-art methods by up to 29.14% in point-to-point PSNR BD-Rate.

Submitted 8 February, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

arXiv:2308.08249 [pdf, ps, other]

The asymptotic behavior of the Bergman kernel on pseudoconvex model domains

Authors: Joe Kamimoto

Abstract: In this paper, we investigate the asymptotic behavior of the Bergman kernel at the boundary for some pseudoconvex model domains. This behavior can be described by the geometrical information of the Newton polyhedron of the defining function of the respective domains. We deal with not only the finite type cases but also some infinite type cases.

Submitted 16 August, 2023; originally announced August 2023.

arXiv:2308.06374 [pdf, other]

Large Language Models and Knowledge Graphs: Opportunities and Challenges

Authors: Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, Damien Graux

Abstract: Large Language Models (LLMs) have taken Knowledge Representation -- and the world -- by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges.

Submitted 11 August, 2023; originally announced August 2023.

Comments: 30 pages

arXiv:2308.03822 [pdf, other]

Search for Eccentric Black Hole Coalescences during the Third Observing Run of LIGO and Virgo

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1750 additional authors not shown)

Abstract: Despite the growing number of confident binary black hole coalescences observed through gravitational waves so far, the astrophysical origin of these binaries remains uncertain. Orbital eccentricity is one of the clearest tracers of binary formation channels. Identifying binary eccentricity, however, remains challenging due to the limited availability of gravitational waveforms that include effects of eccentricity. Here, we present observational results for a waveform-independent search sensitive to eccentric black hole coalescences, covering the third observing run (O3) of the LIGO and Virgo detectors. We identified no new high-significance candidates beyond those that were already identified with searches focusing on quasi-circular binaries. We determine the sensitivity of our search to high-mass (total mass $M>70$ $M_\odot$) binaries covering eccentricities up to 0.3 at 15 Hz orbital frequency, and use this to compare model predictions to search results. Assuming all detections are indeed quasi-circular, for our fiducial population model, we place an upper limit for the merger rate density of high-mass binaries with eccentricities $0 < e \leq 0.3$ at $0.33$ Gpc$^{-3}$ yr$^{-1}$ at 90\% confidence level.

Submitted 7 August, 2023; originally announced August 2023.

Comments: 24 pages, 5 figures

Report number: LIGO-P2300080

arXiv:2307.12417 [pdf, other]

Practical Commercial 5G Standalone (SA) Uplink Throughput Prediction

Authors: Kasidis Arunruangsirilert, Jiro Katto

Abstract: While the 5G New Radio (NR) network promises a huge uplift of the uplink throughput, the improvement can only be seen when the User Equipment (UE) is connected to the high-frequency millimeter wave (mmWave) band. With the rise of uplink-intensive smartphone applications such as the real-time transmission of UHD 4K/8K videos, and Virtual Reality (VR)/Augmented Reality (AR) contents, uplink throughput prediction plays a huge role in maximizing the users' quality of experience (QoE). In this paper, we propose using a ConvLSTM-based neural network to predict the future uplink throughput based on past uplink throughput and RF parameters. The network is trained using the data from real-world drive tests on commercial 5G SA networks while riding commuter trains, which accounted for various frequency bands, handover, and blind spots. To make sure our model can be practically implemented, we then limited our model to only use the information available via Android API, then evaluate our model using the data from both commuter trains and other methods of transportation. The results show that our model reaches an average prediction accuracy of 98.9\% with an average RMSE of 1.80 Mbps across all unseen evaluation scenarios.

Submitted 23 July, 2023; originally announced July 2023.

arXiv:2306.12141 [pdf, other]

doi 10.1145/3605573.3605588

Recoil: Parallel rANS Decoding with Decoder-Adaptive Scalability

Authors: Fangzheng Lin, Kasidis Arunruangsirilert, Heming Sun, Jiro Katto

Abstract: Entropy coding is essential to data compression, image and video coding, etc. The Range variant of Asymmetric Numeral Systems (rANS) is a modern entropy coder, featuring superior speed and compression rate. As rANS is not designed for parallel execution, the conventional approach to parallel rANS partitions the input symbol sequence and encodes partitions with independent codecs, and more partitions bring extra overhead. This approach is found in state-of-the-art implementations such as DietGPU. It is unsuitable for content-delivery applications, as the parallelism is wasted if the decoder cannot decode all the partitions in parallel, but all the overhead is still transferred. To solve this, we propose Recoil, a parallel rANS decoding approach with decoder-adaptive scalability. We discover that a single rANS-encoded bitstream can be decoded from any arbitrary position if the intermediate states are known. After renormalization, these states also have a smaller upper bound, which can be stored efficiently. We then split the encoded bitstream using a heuristic to evenly distribute the workload, and store the intermediate states and corresponding symbol indices as metadata. The splits can then be combined simply by eliminating extra metadata entries. The main contribution of Recoil is reducing unnecessary data transfer by adaptively scaling parallelism overhead to match the decoder capability. The experiments show that Recoil decoding throughput is comparable to the conventional approach, scaling massively on CPUs and GPUs and greatly outperforming various other ANS-based codecs.

Submitted 26 June, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

Comments: Accepted to the International Conference on Parallel Processing (ICPP) 2023

arXiv:2303.18224 [pdf, other]

Quantum Thermal State Preparation

Authors: Chi-Fang Chen, Michael J. Kastoryano, Fernando G. S. L. Brandão, András Gilyén

Abstract: Preparing ground states and thermal states is essential for simulating quantum systems on quantum computers. Despite the hope for practical quantum advantage in quantum simulation, popular state preparation approaches have been challenged. Monte Carlo-style quantum Gibbs samplers have emerged as an alternative, but prior proposals have been unsatisfactory due to technical obstacles rooted in energy-time uncertainty. We introduce simple continuous-time quantum Gibbs samplers that overcome these obstacles by efficiently simulating Nature-inspired quantum master equations (Lindbladians). In addition, we construct the first provably accurate and efficient algorithm for preparing certain purified Gibbs states (called thermal field double states in high-energy physics) of rapidly thermalizing systems; this algorithm also benefits from a quantum walk speedup. Our algorithms' costs have a provable dependence on temperature, accuracy, and the mixing time (or spectral gap) of the relevant Lindbladian. We complete the first rigorous proof of finite-time thermalization for physically derived Lindbladians by developing a general analytic framework for nonasymptotic secular approximation and approximate detailed balance. Given the success of classical Markov chain Monte Carlo (MCMC) algorithms and the ubiquity of thermodynamics, we anticipate that quantum Gibbs sampling will become indispensable in quantum computing.

Submitted 15 November, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

Comments: 79 pages, 12 figures; v2 modified table 1 and improved presentation and bounds

arXiv:2303.14978 [pdf, other]

Learned Image Compression with Mixed Transformer-CNN Architectures

Authors: Jinming Liu, Heming Sun, Jiro Katto

Abstract: Learned image compression (LIC) methods have exhibited promising progress and superior rate-distortion performance compared with classical image compression standards. Most existing LIC methods are Convolutional Neural Networks-based (CNN-based) or Transformer-based, which have different advantages. Exploiting both advantages is a point worth exploring, which has two challenges: 1) how to effectively fuse the two methods? 2) how to achieve higher performance with a suitable complexity? In this paper, we propose an efficient parallel Transformer-CNN Mixture (TCM) block with a controllable complexity to incorporate the local modeling ability of CNN and the non-local modeling ability of transformers to improve the overall architecture of image compression models. Besides, inspired by the recent progress of entropy estimation models and attention modules, we propose a channel-wise entropy model with parameter-efficient swin-transformer-based attention (SWAtten) modules by using channel squeezing. Experimental results demonstrate our proposed method achieves state-of-the-art rate-distortion performances on three different resolution datasets (i.e., Kodak, Tecnick, CLIC Professional Validation) compared to existing LIC methods. The code is at https://github.com/jmliu206/LIC_TCM.

Submitted 27 March, 2023; originally announced March 2023.

Comments: Accepted by CVPR2023 (Highlight)

arXiv:2302.09263 [pdf, other]

Multistage Spatial Context Models for Learned Image Compression

Authors: Fangzheng Lin, Heming Sun, Jinming Liu, Jiro Katto

Abstract: Recent state-of-the-art Learned Image Compression methods feature spatial context models, achieving great rate-distortion improvements over hyperprior methods. However, the autoregressive context model requires serial decoding, limiting runtime performance. The Checkerboard context model allows parallel decoding at a cost of reduced RD performance. We present a series of multistage spatial context models allowing both fast decoding and better RD performance. We split the latent space into square patches and decode serially within each patch while different patches are decoded in parallel. The proposed method features a comparable decoding speed to Checkerboard while reaching the RD performance of Autoregressive and even also outperforming Autoregressive. Inside each patch, the decoding order must be carefully decided as a bad order negatively impacts performance; therefore, we also propose a decoding order optimization algorithm.

Submitted 18 February, 2023; originally announced February 2023.

Comments: Accepted to IEEE ICASSP 2023

arXiv:2302.04527 [pdf, other]

doi 10.1109/TITS.2022.3217342

Toward Extremely Lightweight Distracted Driver Recognition With Distillation-Based Neural Architecture Search and Knowledge Transfer

Authors: Dichao Liu, Toshihiko Yamasaki, Yu Wang, Kenji Mase, Jien Kato

Abstract: The number of traffic accidents has been continuously increasing in recent years worldwide. Many accidents are caused by distracted drivers, who take their attention away from driving. Motivated by the success of Convolutional Neural Networks (CNNs) in computer vision, many researchers developed CNN-based algorithms to recognize distracted driving from a dashcam and warn the driver against unsafe behaviors. However, current models have too many parameters, which is unfeasible for vehicle-mounted computing. This work proposes a novel knowledge-distillation-based framework to solve this problem. The proposed framework first constructs a high-performance teacher network by progressively strengthening the robustness to illumination changes from shallow to deep layers of a CNN. Then, the teacher network is used to guide the architecture searching process of a student network through knowledge distillation. After that, we use the teacher network again to transfer knowledge to the student network by knowledge distillation. Experimental results on the Statefarm Distracted Driver Detection Dataset and AUC Distracted Driver Dataset show that the proposed approach is highly effective for recognizing distracted driving behaviors from photos: (1) the teacher network's accuracy surpasses the previous best accuracy; (2) the student network achieves very high accuracy with only 0.42M parameters (around 55% of the previous most lightweight model).
Furthermore, the student network architecture can be extended to a spatial-temporal 3D CNN for recognizing distracted driving from video clips. The 3D student network largely surpasses the previous best accuracy with only 2.03M parameters on the Drive&Act Dataset. The source code is available at https://github.com/Dichao-Liu/Lightweight_Distracted_Driver_Recognition_with_Distillation-Based_NAS_and_Knowledge_Transfer.

Submitted 9 February, 2023; originally announced February 2023.

Journal ref: IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 24, NO. 1, JANUARY 2023

arXiv:2302.03676 [pdf, other]

doi 10.3847/1538-4365/acdc9f

Open data from the third observing run of LIGO, Virgo, KAGRA and GEO

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, A. Allocca, et al. (1719 additional authors not shown)

Abstract: The global network of gravitational-wave observatories now includes five detectors, namely LIGO Hanford, LIGO Livingston, Virgo, KAGRA, and GEO 600. These detectors collected data during their third observing run, O3, composed of three phases: O3a starting in April of 2019 and lasting six months, O3b starting in November of 2019 and lasting five months, and O3GK starting in April of 2020 and lasting 2 weeks. In this paper we describe these data and various other science products that can be freely accessed through the Gravitational Wave Open Science Center at https://gwosc.org. The main dataset, consisting of the gravitational-wave strain time series that contains the astrophysical signals, is released together with supporting data useful for their analysis and documentation, tutorials, as well as analysis software packages.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 27 pages, 3 figures

Report number: LIGO-P2200316

arXiv:2212.14480 [pdf, other]

Performance Evaluations of C-Band 5G NR FR1 (Sub-6 GHz) Uplink MIMO on Urban Train

Authors: Kasidis Arunruangsirilert, Pasapong Wongprasert, Jiro Katto

Abstract: Due to the recent demand for huge Uplink throughput on Mobile networks driven by the rapid development of social media platforms, UHD 4K/8K video, and VR/AR contents, Uplink MIMO (UL-MIMO) has now been deployed on commercial 5G networks with reasonable availability of supported User Equipment (UE) for consumers. By utilizing up to 2 Tx antenna ports, UL-MIMO-capable UE promised to achieve up to two times the uplink throughput in ideal conditions, while providing improved uplink performance over UE with 1Tx in challenging conditions. In Japan, SoftBank, one of the carriers, introduced 5G Standalone (SA) services for the Fixed Wireless Access (FWA) application back in October 2021. Mobile services were commenced in May 2022, which provide UL-MIMO for supported UE on C-Band or Band n77 (3.7 GHz). In this paper, the uplink performance of UL-MIMO-capable UE will be compared against the conventional UL-1Tx UE on trains, which is the most popular method of transportation for the Japanese. The results show that UL-MIMO-capable UE delivers an average of 19.8% better throughput on moving trains with up to 33.5% in the more favorable signal conditions. A moderate relationship between downlink 5G NR SS-RSRP and uplink throughput also has been observed.

Submitted 29 December, 2022; originally announced December 2022.

Comments: 2023 IEEE Wireless Communications and Networking Conference (WCNC), 26-29 March 2023, Glasgow, Scotland, UK

arXiv:2212.14479 [pdf, other]

Pensieve 5G: Implementation of RL-based ABR Algorithm for UHD 4K/8K Content Delivery on Commercial 5G SA/NR-DC Network

Authors: Kasidis Arunruangsirilert, Bo Wei, Hang Song, Jiro Katto

Abstract: While the rollout of the fifth-generation mobile network (5G) is underway across the globe with the intention to deliver 4K/8K UHD videos, Augmented Reality (AR), and Virtual Reality (VR) content to the mass amounts of users, the coverage and throughput are still one of the most significant issues, especially in the rural areas, where only 5G in the low-frequency band are being deployed. This called for a high-performance adaptive bitrate (ABR) algorithm that can maximize the user quality of experience given 5G network characteristics and data rate of UHD contents. Recently, many of the newly proposed ABR techniques were machine-learning based. Among that, Pensieve is one of the state-of-the-art techniques, which utilized reinforcement-learning to generate an ABR algorithm based on observation of past decision performance. By incorporating the context of the 5G network and UHD content, Pensieve has been optimized into Pensieve 5G. New QoE metrics that more accurately represent the QoE of UHD video streaming on the different types of devices were proposed and used to evaluate Pensieve 5G against other ABR techniques including the original Pensieve. The results from the simulation based on the real 5G Standalone (SA) network throughput shows that Pensieve 5G outperforms both conventional algorithms and Pensieve with the average QoE improvement of 8.8% and 14.2%, respectively.
Additionally, Pensieve 5G also performed well on the commercial 5G NR-NR Dual Connectivity (NR-DC) Network, despite the training being done solely using the data from the 5G Standalone (SA) network.

Submitted 29 December, 2022; originally announced December 2022.

Comments: 2023 IEEE Wireless Communications and Networking Conference (WCNC), 26-29 March 2023, Glasgow, Scotland, UK

arXiv:2211.06595 [pdf]

ABCAS: Adaptive Bound Control of spectral norm as Automatic Stabilizer

Authors: Shota Hirose, Shiori Maki, Naoki Wada, Heming Sun, Jiro Katto

Abstract: Spectral Normalization is one of the best methods for stabilizing the training of Generative Adversarial Network. Spectral Normalization limits the gradient of discriminator between the distribution between real data and fake data. However, even with this normalization, GAN's training sometimes fails. In this paper, we reveal that more severe restriction is sometimes needed depending on the training dataset, then we propose a novel stabilizer which offers an adaptive normalization method, called ABCAS. Our method decides discriminator's Lipschitz constant adaptively, by checking the distance of distributions of real and fake data. Our method improves the stability of the training of Generative Adversarial Network and achieved better Fréchet Inception Distance score of generated images. We also investigated suitable spectral norm for three datasets. We show the result as an ablation study.

Submitted 12 November, 2022; originally announced November 2022.

Comments: ICCE 2023

arXiv:2211.05100 [pdf, other]

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

arXiv:2209.05683 [pdf, other]

One-shot Network Pruning at Initialization with Discriminative Image Patches

Authors: Yinan Yang, Yu Wang, Ying Ji, Heng Qi, Jien Kato

Abstract: One-shot Network Pruning at Initialization (OPaI) is an effective method to decrease network pruning costs. Recently, there is a growing belief that data is unnecessary in OPaI. However, we obtain an opposite conclusion by ablation experiments in two representative OPaI methods, SNIP and GraSP. Specifically, we find that informative data is crucial to enhancing pruning performance. In this paper, we propose two novel methods, Discriminative One-shot Network Pruning (DOP) and Super Stitching, to prune the network by high-level visual discriminative image patches. Our contributions are as follows. (1) Extensive experiments reveal that OPaI is data-dependent. (2) Super Stitching performs significantly better than the original OPaI method on benchmark ImageNet, especially in a highly compressed model.

Submitted 3 October, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

Comments: BMVC 2022

arXiv:2209.05663 [pdf, ps, other]

doi 10.2206/kyushujm.77.319

Asymptotic expansion of oscillatory integrals with singular phases

Authors: Joe Kamimoto, Hiromichi Mizuno

Abstract: The purpose of this article is to describe the singularities of one-dimensional oscillatory integrals, whose phases have a certain singularity, in the form of an asymptotic expansion. In the case of the Laplace integral, an analogous result is also given.

Submitted 12 September, 2022; originally announced September 2022.

MSC Class: 42A38 (41A60)

Journal ref: Kyushu Journal of Mathematics, 77-2 (2023), 319-329

arXiv:2209.05657 [pdf, ps, other]

doi 10.1016/j.jfa.2023.110185

Resolution of singularities for $C^{\infty}$ functions and meromorphy of local zeta functions

Authors: Joe Kamimoto

Abstract: In this paper, we attempt to resolve the singularities of the zero variety of a $C^{\infty}$ function of two variables as much as possible by using ordinary blowings up. As a result, we formulate an algorithm to locally express the zero variety in the ``almost'' normal crossings form, which is close to the normal crossings form but may include flat functions. As an application, we investigate analytic continuation of local zeta functions associated with $C^{\infty}$ functions of two variables. As is well known, the desingularization theorem of Hironaka implies that the local zeta functions associated with real analytic functions admit the meromorphic continuation to the whole complex plane. On the other hand, it is recently observed that the local zeta function associated with a specific (non-real analytic) $C^{\infty}$ function has a singularity different from the pole. From this observation, the following questions are naturally raised in the $C^{\infty}$ case: how wide the meromorphically extendible region can be and what kinds of information essentially determine this region? This paper shows that this region can be described in terms of some kind of multiplicity of the zero variety of each $C^{\infty}$ function. By using our blowings up algorithm, it suffices to investigate local zeta functions in the almost normal crossings case. This case can be effectively analyzed by using real analysis methods; in particular, a van der Corput-type lemma plays a crucial role in the determination of the above region.

Submitted 12 September, 2022; originally announced September 2022.

MSC Class: 58K05 (26E10; 14H20)

Journal ref: Journal of Functional Analysis, Volume 286, Issue 1, 1 January 2024, 110-185

arXiv:2209.05385 [pdf]

doi 10.1063/5.0137686

Direct measurement of electrocaloric effect based on multi-harmonic lock-in thermography

Authors: Ryo Iguchi, Daisuke Fukuda, Jun Kano, Takashi Teranishi, Ken-ichi Uchida

Abstract: In this study, we report on a direct measurement method for the electrocaloric effect, the heating/cooling upon application/removal of an electric field in dielectric materials, based on a lock-in thermography technique. By use of sinusoidal excitation and multi-harmonic detection, the actual temperature change can be measured by a single measurement in the frequency domain even when the electrocaloric effect shows nonlinear response to the excitation field. We have demonstrated the method by measuring the temperature dependence of the electric-field-induced temperature change for two Sr-doped BaTiO$_3$ systems with different ferroelectric-paraelectric phase transition temperatures, where the procedure for extracting the pure electrocaloric contribution free from heat losses and Joule heating due to leakage currents is introduced. This method can be used irrespective of the type of dielectric materials and enables simultaneous estimation of the polarization change and power dissipation during the application of the electric field, being a convenient imaging measurement method for the electrocaloric effect.

Submitted 12 September, 2022; originally announced September 2022.

arXiv:2209.01355 [pdf, other]

Semantic Segmentation in Learned Compressed Domain

Authors: Jinming Liu, Heming Sun, Jiro Katto

Abstract: Most machine vision tasks (e.g., semantic segmentation) are based on images encoded and decoded by image compression algorithms (e.g., JPEG). However, these decoded images in the pixel domain introduce distortion, and they are optimized for human perception, making the performance of machine vision tasks suboptimal. In this paper, we propose a method based on the compressed domain to improve segmentation tasks. i) A dynamic and a static channel selection method are proposed to reduce the redundancy of compressed representations that are obtained by encoding. ii) Two different transform modules are explored and analyzed to help the compressed representation be transformed as the features in the segmentation network. The experimental results show that we can save up to 15.8\% bitrates compared with a state-of-the-art compressed domain-based work while saving up to about 83.6\% bitrates and 44.8\% inference time compared with the pixel domain-based method.

Submitted 3 September, 2022; originally announced September 2022.

arXiv:2208.13974 [pdf]

Learned Lossless Image Compression With Combined Autoregressive Models And Attention Modules

Authors: Ran Wang, Jinming Liu, Heming Sun, Jiro Katto

Abstract: Lossless image compression is an essential research field in image compression. Recently, learning-based image compression methods achieved impressive performance compared with traditional lossless methods, such as WebP, JPEG2000, and FLIF. However, there are still many impressive lossy compression methods that can be applied to lossless compression. Therefore, in this paper, we explore the methods widely used in lossy compression and apply them to lossless compression. Inspired by the impressive performance of the Gaussian mixture model (GMM) shown in lossy compression, we generate a lossless network architecture with GMM. Besides noticing the successful achievements of attention modules and autoregressive models, we propose to utilize attention modules and add an extra autoregressive model for raw images in our network architecture to boost the performance. Experimental results show that our approach outperforms most classical lossless compression methods and existing learning-based methods.

Submitted 29 August, 2022; originally announced August 2022.

Comments: 5 pages

arXiv:2208.11057 [pdf, other]

Prompting as Probing: Using Language Models for Knowledge Base Construction

Authors: Dimitrios Alivanistos, Selene Báez Santamaría, Michael Cochez, Jan-Christoph Kalo, Emile van Krieken, Thiviyan Thanapalasingam

Abstract: Language Models (LMs) have proven to be useful in various downstream applications, such as summarisation, translation, question answering and text classification. LMs are becoming increasingly important tools in Artificial Intelligence, because of the vast quantity of information they can store. In this work, we present ProP (Prompting as Probing), which utilizes GPT-3, a large Language Model originally proposed by OpenAI in 2020, to perform the task of Knowledge Base Construction (KBC). ProP implements a multi-step approach that combines a variety of prompting techniques to achieve this. Our results show that manual prompt curation is essential, that the LM must be encouraged to give answer sets of variable lengths, in particular including empty answer sets, that true/false questions are a useful device to increase precision on suggestions generated by the LM, that the size of the LM is a crucial factor, and that a dictionary of entity aliases improves the LM score. Our evaluation study indicates that these proposed techniques can substantially enhance the quality of the final predictions: ProP won track 2 of the LM-KBC competition, outperforming the baseline by 36.4 percentage points. Our implementation is available on https://github.com/HEmile/iswc-challenge.

Submitted 19 June, 2023; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Published in LM-KBC 22: Knowledge Base Construction from Pre-trained Language Models, Challenge at ISWC 2022. 12+12 pages

arXiv:2208.01641 [pdf]

Streaming-capable High-performance Architecture of Learned Image Compression Codecs

Authors: Fangzheng Lin, Heming Sun, Jiro Katto

Abstract: Learned image compression allows achieving state-of-the-art accuracy and compression ratios, but their relatively slow runtime performance limits their usage. While previous attempts on optimizing learned image codecs focused more on the neural model and entropy coding, we present an alternative method to improving the runtime performance of various learned image compression models. We introduce multi-threaded pipelining and an optimized memory model to enable GPU and CPU workloads asynchronous execution, fully taking advantage of computational resources. Our architecture alone already produces excellent performance without any change to the neural model itself. We also demonstrate that combining our architecture with previous tweaks to the neural models can further improve runtime performance. We show that our implementations excel in throughput and latency compared to the baseline and demonstrate the performance of our implementations by creating a real-time video streaming encoder-decoder sample application, with the encoder running on an embedded device.

Submitted 1 August, 2022; originally announced August 2022.

Comments: Accepted to IEEE ICIP 2022

arXiv:2206.10083 [pdf, other]

Memory-Efficient Learned Image Compression with Pruned Hyperprior Module

Authors: Ao Luo, Heming Sun, Jinming Liu, Jiro Katto

Abstract: Learned Image Compression (LIC) gradually became more and more famous in these years. The hyperprior-module-based LIC models have achieved remarkable rate-distortion performance. However, the memory cost of these LIC models is too large to actually apply them to various devices, especially to portable or edge devices. The parameter scale is directly linked with memory cost. In our research, we found the hyperprior module is not only highly over-parameterized, but also its latent representation contains redundant information. Therefore, we propose a novel pruning method named ERHP in this paper to efficiently reduce the memory cost of hyperprior module, while improving the network performance. The experiments show our method is effective, reducing at least 22.6% parameters in the whole model while achieving better rate-distortion performance.

Submitted 20 June, 2022; originally announced June 2022.

Comments: Accepted for presentation at IEEE ICIP 2022

arXiv:2206.09427 [pdf]

doi 10.1109/ACCESS.2023.3326326

QuDASH: Quantum-inspired rate adaptation approach for DASH video streaming

Authors: Bo Wei, Hang Song, Makoto Nakamura, Koichi Kimura, Nozomu Togawa, Jiro Katto

Abstract: Internet traffic is dramatically increasing with the development of network technologies and video streaming traffic accounts for large amount within the total traffic, which reveals the importance to guarantee the quality of content delivery service. Based on the network conditions, adaptive bitrate (ABR) control is utilized as a common technique which can choose the proper bitrate to ensure the video streaming quality. In this paper, new bitrate control method, QuDASH is proposed by taking advantage of the emerging quantum technology. In QuDASH, the adaptive control model is developed using the quadratic unconstrained binary optimization (QUBO), which aims at increasing the average bitrate and decreasing the video rebuffering events to maximize the user quality of experience (QoE). In order to formulate the video control model, first the QUBO terms of different factors are defined regarding video quality, bitrate change, and buffer condition. Then, all the individual QUBO terms are merged to generate an objective function. By minimizing the QUBO objective function, the bitrate choice is determined from the solution. The control model is solved by Digital Annealer, which is a quantum-inspired computing technology. The evaluation of the proposed method is carried out by simulation with the throughput traces obtained in real world under different scenarios and the comparison with other methods is conducted. Experiment results demonstrated that the proposed QuDASH method has better performance in terms of QoE compared with other advanced ABR methods.
In 68.2% of the examined cases, QuDASH achieves the highest QoE results, which shows the superiority of the QuDASH over conventional methods.

Submitted 21 October, 2023; v1 submitted 19 June, 2022; originally announced June 2022.

Comments: Accepted Version

Journal ref: IEEE Access, 2023

arXiv:2206.05275 [pdf, other]

Spatial-temporal Concept based Explanation of 3D ConvNets

Authors: Ying Ji, Yu Wang, Kensaku Mori, Jien Kato

Abstract: Recent studies have achieved outstanding success in explaining 2D image recognition ConvNets. On the other hand, due to the computation cost and complexity of video data, the explanation of 3D video recognition ConvNets is relatively less studied. In this paper, we present a 3D ACE (Automatic Concept-based Explanation) framework for interpreting 3D ConvNets. In our approach: (1) videos are represented using high-level supervoxels, which is straightforward for human to understand; and (2) the interpreting framework estimates a score for each voxel, which reflects its importance in the decision procedure. Experiments show that our method can discover spatial-temporal concepts of different importance-levels, and thus can explore the influence of the concepts on a target task, such as action classification, in-depth. The codes are publicly available.

Submitted 9 June, 2022; originally announced June 2022.

arXiv:2205.14510 [pdf, other]

Q-LIC: Quantizing Learned Image Compression with Channel Splitting

Authors: Heming Sun, Lu Yu, Jiro Katto

Abstract: Learned image compression (LIC) has reached a comparable coding gain with traditional hand-crafted methods such as VVC intra. However, the large network complexity prohibits the usage of LIC on resource-limited embedded systems. Network quantization is an efficient way to reduce the network burden. This paper presents a quantized LIC (QLIC) by channel splitting. First, we explore that the influence of quantization error to the reconstruction error is different for various channels. Second, we split the channels whose quantization has larger influence to the reconstruction error. After the splitting, the dynamic range of channels is reduced so that the quantization error can be reduced. Finally, we prune several channels to keep the number of overall channels as origin. By using the proposal, in the case of 8-bit quantization for weight and activation of both main and hyper path, we can reduce the BD-rate by 0.61%-4.74% compared with the previous QLIC. Besides, we can reach better coding gain compared with the state-of-the-art network quantization method when quantizing MS-SSIM models. Moreover, our proposal can be combined with other network quantization methods to further improve the coding gain. The moderate coding loss caused by the quantization validates the feasibility of the hardware implementation for QLIC in the future.

Submitted 28 May, 2022; originally announced May 2022.

arXiv:2204.08898 [pdf, other]

Complexity phase transitions in instantaneous quantum polynomial-time circuits

Authors: Chae-Yeun Park, Michael J. Kastoryano

Abstract: We study a subclass of the Instantaneous Quantum Polynomial-time (IQP) circuit with a varying density of two-qubit gates. In addition to a known anticoncentration regime, we identify novel parameter conditions where the model is classically simulable or the output distribution follows the Porter-Thomas distribution. By showing that those parameter regimes do not coincide, we argue the presence of more than two phases in the model. The learnability of the output distribution of this model is further studied, which indicates that an energy-based model fails to learn the output distribution even when it is not anticoncentrated. Our study reveals that a quantum circuit model can have multiple fine-grained complexity phases, suggesting the potential for quantum advantage even when the output distribution is far from the Porter-Thomas distribution.

Submitted 24 June, 2023; v1 submitted 19 April, 2022; originally announced April 2022.

Comments: 5+17 pages, 10 figures. Significantly revised by adding complexity-theoretical arguments

arXiv:2203.12888 [pdf]

RSSI-CSI Measurement and Variation Mitigation with Commodity WiFi Device

Authors: Bo Wei, Hang Song, Jiro Katto, Takamaro Kikkawa

Abstract: Owing to the plentiful information released by the commodity devices, WiFi signals have been widely studied for various wireless sensing applications. In many works, both received signal strength indicator (RSSI) and the channel state information (CSI) are utilized as the key factors for precise sensing. However, the calculation and relationship between RSSI and CSI is not explained in detail. Furthermore, there are few works focusing on the measurement variation of the WiFi signal which impacts the sensing results. In this paper, the relationship between RSSI and CSI is studied in detail and the measurement variation of amplitude and phase information is investigated by extensive experiments. In the experiments, transmitter and receiver are directly connected by power divider and RF cables and the signal transmission is quantitatively controlled by RF attenuators. By changing the intensity of attenuation, the measurement of RSSI and CSI is carried out under different conditions. From the results, it is found that in order to get a reliable measurement of the signal amplitude and phase by commodity WiFi, the attenuation of the channels should not exceed 60 dB. Meanwhile, the difference between two channels should be lower than 10 dB. An active control mechanism is suggested to ensure the measurement stability. The findings and criteria of this work is promising to facilitate more precise sensing technologies with WiFi signal.

Submitted 24 March, 2022; originally announced March 2022.

arXiv:2111.09348 [pdf, other]

End-to-End Learned Image Compression with Quantized Weights and Activations

Authors: Heming Sun, Lu Yu, Jiro Katto

Abstract: End-to-end Learned image compression (LIC) has reached the traditional hand-crafted methods such as BPG (HEVC intra) in terms of the coding gain. However, the large network size prohibits the usage of LIC on resource-limited embedded systems. This paper reduces the network complexity by quantizing both weights and activations. 1) For the weight quantization, we study different kinds of grouping and quantization scheme at first. A channel-wise non-linear quantization scheme is determined based on the coding gain analysis. After that, we propose a fine tuning scheme to clip the weights within a certain range so that the quantization error can be reduced. 2) For the activation quantization, we first propose multiple non-linear quantization codebooks with different maximum dynamic ranges. By selecting an optimal one through a multiplexer, the quantization range can be saturated to the greatest extent. In addition, we also exploit the mean-removed quantization for the analysis transform outputs in order to reduce the bit-width cost for the specific channel with the large non-zero mean. By quantizing each weight and activation element from 32-bit floating point to 8-bit fixed point, the memory cost for both weight and activation can be reduced by 75% with negligible coding performance loss. As a result, our quantized LIC can still outperform BPG in terms of MS-SSIM.
To our best knowledge, this is the first work to give a complete analysis on the coding gain and the memory cost for a quantized LIC network, which validates the feasibility of the hardware implementation.

Submitted 17 November, 2021; originally announced November 2021.

arXiv:2109.12293 [pdf]

Adaptive video transmission using QUBO method and Digital Annealer based on Ising machine

Authors: Bo Wei, Hang Song, Jiro Katto

Abstract: With the dramatically increasing video streaming in the total network traffic, it is critical to develop effective algorithms to promote the content delivery service of high quality. Adaptive bitrate (ABR) control is the most essential technique which determines the proper bitrate to be chosen based on network conditions, thus realize high-quality video streaming. In this paper, a novel ABR strategy is proposed based on Ising machine by using the quadratic unconstrained binary optimization (QUBO) method and Digital Annealer (DA) for the first time. The proposed method is evaluated by simulation with the real-world measured throughput, and compared with other state-of-the-art methods. Experiment results show that the proposed QUBO-based method can outperform the existing methods, which demonstrating the superior of the proposed QUBO-based method.

Submitted 25 September, 2021; originally announced September 2021.

arXiv:2108.08551 [pdf, other]

Learned Video Compression with Residual Prediction and Loop Filter

Authors: Chao Liu, Heming Sun, Jiro Katto, Xiaoyang Zeng, Yibo Fan

Abstract: In this paper, we propose a learned video codec with a residual prediction network (RP-Net) and a feature-aided loop filter (LF-Net). For the RP-Net, we exploit the residual of previous multiple frames to further eliminate the redundancy of the current frame residual. For the LF-Net, the features from residual decoding network and the motion compensation network are used to aid the reconstruction quality. To reduce the complexity, a light ResNet structure is used as the backbone for both RP-Net and LF-Net. Experimental results illustrate that we can save about 10% BD-rate compared with previous learned video compression frameworks. Moreover, we can achieve faster coding speed due to the ResNet backbone. This project is available at https://github.com/chaoliu18/RPLVC.

Submitted 19 August, 2021; originally announced August 2021.

arXiv:2108.02503 [pdf, other]

Fully Neural Network Mode Based Intra Prediction of Variable Block Size

Authors: Heming Sun, Lu Yu, Jiro Katto

Abstract: Intra prediction is an essential component in the image coding. This paper gives an intra prediction framework completely based on neural network modes (NM). Each NM can be regarded as a regression from the neighboring reference blocks to the current coding block. (1) For variable block size, we utilize different network structures. For small blocks 4x4 and 8x8, fully connected networks are used, while for large blocks 16x16 and 32x32, convolutional neural networks are exploited. (2) For each prediction mode, we develop a specific pre-trained network to boost the regression accuracy. When integrating into HEVC test model, we can save 3.55%, 3.03% and 3.27% BD-rate for Y, U, V components compared with the anchor. As far as we know, this is the first work to explore a fully NM based framework for intra prediction, and we reach a better coding gain with a lower complexity compared with the previous work.

Submitted 5 August, 2021; originally announced August 2021.

Comments: VCIP 2020 Best Paper

arXiv:2107.12567 [pdf, other]

Guided Optimization for Image Processing Pipelines

Authors: Yuka Ikarashi, Jonathan Ragan-Kelley, Tsukasa Fukusato, Jun Kato, Takeo Igarashi

Abstract: Writing high-performance image processing code is challenging and labor-intensive. The Halide programming language simplifies this task by decoupling high-level algorithms from "schedules" which optimize their implementation. However, even with this abstraction, it is still challenging for Halide programmers to understand complicated scheduling strategies and productively write valid, optimized schedules. To address this, we propose a programming support method called "guided optimization." Guided optimization provides programmers a set of valid optimization options and interactive feedback about their current choices, which enables them to comprehend and efficiently optimize image processing code without the time-consuming trial-and-error process of traditional text editors. We implemented a proof-of-concept system, Roly-poly, which integrates guided optimization, program visualization, and schedule cost estimation to support the comprehension and development of efficient Halide image processing code. We conducted a user study with novice Halide programmers and confirmed that Roly-poly and its guided optimization was informative, increased productivity, and resulted in higher-performing schedules in less time.

Submitted 27 July, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

arXiv:2106.06238 [pdf, other]

doi 10.1088/1361-6420/ac346a

Approximation error method for imaging the human head by electrical impedance tomography

Authors: Valentina Candiani, Nuutti Hyvönen, Jari P. Kaipio, Ville Kolehmainen

Abstract: This work considers electrical impedance tomography imaging of the human head, with the ultimate goal of locating and classifying a stroke in emergency care. One of the main difficulties in the envisioned application is that the electrode locations and the shape of the head are not precisely known, leading to significant imaging artifacts due to impedance tomography being sensitive to modeling errors. In this study, the natural variations in the geometry of the head and skull are modeled based on a library of head anatomies. The effect of these variations, as well as that of misplaced electrodes, on (absolute) impedance tomography measurements is in turn modeled by the approximation error method. This enables reliably reconstructing the conductivity perturbation caused by the stroke in an average head model, instead of the actual head, relative to its average conductivity levels. The functionality of a certain edge-preferring reconstruction algorithm for locating the stroke is demonstrated via numerical experiments based on simulated three-dimensional data.

Submitted 11 June, 2021; originally announced June 2021.

Comments: 24 pages, 5 figures

MSC Class: 65N21; 35R30; 62F15

arXiv:2106.03421 [pdf, other]

doi 10.3842/SIGMA.2022.014

$q$-Selberg Integrals and Koornwinder Polynomials

Authors: Jyoichi Kaneko

Abstract: We prove a generalization of the $q$-Selberg integral evaluation formula. The integrand is that of $q$-Selberg integral multiplied by a factor of the same form with respect to part of the variables. The proof relies on the quadratic norm formula of Koornwinder polynomials. We also derive generalizations of Mehta's integral formula as limit cases of our integral.

Submitted 28 February, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

MSC Class: 33D52; 05A30; 11B65

Journal ref: SIGMA 18 (2022), 014, 35 pages

arXiv:2105.02225 [pdf, other]

doi 10.1121/10.0007049

Model reduction in acoustic inversion by artificial neural network

Authors: Janne Koponen, Timo Lähivaara, Jari Kaipio, Marko Vauhkonen

Abstract: In ultrasound tomography, the speed of sound inside an object is estimated based on acoustic measurements carried out by sensors surrounding the object. An accurate forward model is a prominent factor for high-quality image reconstruction, but it can make computations far too time-consuming in many applications. Using approximate forward models, it is possible to speed up the computations, but the quality of the reconstruction may have to be compromised. In this paper, a neural network -based approach is proposed, that can compensate for modeling errors caused by the approximate forward models. The approach is tested with various different imaging scenarios in a simulated two-dimensional domain. The results show that with fairly small training datasets, the proposed approach can be utilized to approximate the modelling errors, and to significantly improve the image reconstruction quality in ultrasound tomography, compared to commonly used inversion algorithms.

Submitted 24 August, 2021; v1 submitted 5 May, 2021; originally announced May 2021.

arXiv:2104.11011 [pdf, other]

Learning Neural Network Quantum States with the Linear Method

Authors: J. Thorben Frank, Michael J. Kastoryano

Abstract: Due to the strong correlations present in quantum systems, classical machine learning algorithms like stochastic gradient descent are often insufficient for the training of neural network quantum states (NQSs). These difficulties can be overcome by using physically inspired learning algorithm, the most prominent of which is the stochastic reconfiguration (SR) which mimics imaginary time evolution. Here we explore an alternative algorithms for the optimization of complex valued NQSs based on the linear method (LM), and present the explicit formulation in terms of complex valued parameters. Beyond the theoretical formulation, we present numerical evidence that the LM can be used successfully for the optimization of complex valued NQSs, to our knowledge for the first time. We compare the LM to the state-of-the-art SR algorithm and find that the LM requires up to an order of magnitude fewer iterations for convergence, albeit at a higher cost per epoch. We further demonstrate that the LM becomes the more efficient training algorithm whenever the cost of sampling is high. This advantage, however, comes at the price of a larger variance.

Submitted 22 April, 2021; originally announced April 2021.

Comments: 16 pages, 7 figures

arXiv:2103.12863 [pdf, other]

doi 10.1109/LRA.2018.2857858

A Plenum-Based Calibration Device for Tactile Sensor Arrays

Authors: Joan Kangro, Anand Vazhapilli Sureshbabu, Silvio Traversaro, Daniele Pucci, Francesco Nori

Abstract: In modern robotic applications, tactile sensor arrays (i.e., artificial skins) are an emergent solution to determine the locations of contacts between a robot and an external agent. Localizing the point of contact is useful but determining the force applied on the skin provides many additional possibilities. This additional feature usually requires time-consuming calibration procedures to relate the sensor readings to the applied forces. This letter presents a novel device that enables the calibration of tactile sensor arrays in a fast and simple way. The key idea is to design a plenum chamber where the skin is inserted, and then the calibration of the tactile sensors is achieved by relating the air pressure and the sensor readings. This general concept is tested experimentally to calibrate the skin of the iCub robot. The validation of the calibration device is achieved by placing the masses of known weight on the artificial skin and comparing the applied force against the one estimated by the sensors.

Submitted 23 March, 2021; originally announced March 2021.

Comments: 8 pages, 18 figures

Journal ref: IEEE Robotics and Automation Letters ( Volume: 3, Issue: 4, Oct. 2018)

Showing 1–50 of 158 results for author: Ka**o, J