Search | arXiv e-print repository

Robust Resource Allocation for STAR-RIS Assisted SWIPT Systems

Authors: Guangyu Zhu, Xidong Mu, Li Guo, Ao Huang, Shibiao Xu

Abstract: A simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted simultaneous wireless information and power transfer (SWIPT) system is proposed. More particularly, an STAR-RIS is deployed to assist in the information/power transfer from a multi-antenna access point (AP) to multiple single-antenna information users (IUs) and energy users (EUs), where two practical STAR-RIS operating protocols, namely energy splitting (ES) and time switching (TS), are employed. Under the imperfect channel state information (CSI) condition, a multi-objective optimization problem (MOOP) framework, that simultaneously maximizes the minimum data rate and minimum harvested power, is employed to investigate the fundamental rate-energy trade-off between IUs and EUs. To obtain the optimal robust resource allocation strategy, the MOOP is first transformed into a single-objective optimization problem (SOOP) via the ε-constraint method, which is then reformulated by approximating semi-infinite inequality constraints with the S-procedure. For ES, an alternating optimization (AO)-based algorithm is proposed to jointly design AP active beamforming and STAR-RIS passive beamforming, where a penalty method is leveraged in STAR-RIS beamforming design. Furthermore, the developed algorithm is extended to optimize the time allocation policy and beamforming vectors in a two-layer iterative manner for TS. Numerical results reveal that: 1) deploying STAR-RISs achieves a significant performance gain over conventional RISs, especially in terms of harvested power for EUs; 2) the ES protocol obtains a better user fairness performance when focusing only on IUs or EUs, while the TS protocol yields a better balance between IUs and EUs; 3) the imperfect CSI affects IUs more significantly than EUs, whereas TS can confer a more robust design to attenuate these effects.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.15130 [pdf, ps, other]

Coexisting Passive RIS and Active Relay Assisted NOMA Systems

Authors: Ao Huang, Li Guo, Xidong Mu, Chao Dong, Yuanwei Liu

Abstract: A novel coexisting passive reconfigurable intelligent surface (RIS) and active decode-and-forward (DF) relay assisted non-orthogonal multiple access (NOMA) transmission framework is proposed. In particular, two communication protocols are conceived, namely Hybrid NOMA (H-NOMA) and Full NOMA (F-NOMA). Based on the proposed two protocols, both the sum rate maximization and max-min rate fairness problems are formulated for jointly optimizing the power allocation at the access point and relay as well as the passive beamforming design at the RIS. To tackle the non-convex problems, an alternating optimization (AO) based algorithm is first developed, where the transmit power and the RIS phase-shift are alternatingly optimized by leveraging the two-dimensional search and rank-relaxed difference-of-convex (DC) programming, respectively. Then, a two-layer penalty based joint optimization (JO) algorithm is developed to jointly optimize the resource allocation coefficients within each iteration. Finally, numerical results demonstrate that: i) the proposed coexisting RIS and relay assisted transmission framework is capable of achieving a significant user performance improvement over conventional schemes without RIS or relay; ii) compared with the AO algorithm, the JO algorithm requires less execution time at the cost of a slight performance loss; and iii) the H-NOMA and F-NOMA protocols are generally preferable for ensuring user rate fairness and enhancing user sum rate, respectively.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.15120 [pdf, ps, other]

STAR-RIS Assisted Downlink Active and Uplink Backscatter Communications with NOMA

Authors: Ao Huang, Xidong Mu, Li Guo

Abstract: A simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted downlink (DL) active and uplink (UL) backscatter communication (BackCom) framework is proposed. More particularly, a full-duplex (FD) base station (BS) communicates with the DL users via the STAR-RIS's transmission link, while exciting and receiving the information from the UL BackCom devices with the aid of the STAR-RIS's reflection link. Non-orthogonal multiple access (NOMA) is exploited in both DL and UL communications for improving the spectrum efficiency. The system weighted sum rate maximization problem is formulated for jointly optimizing the FD BS active receive and transmit beamforming, the STAR-RIS passive beamforming, and the DL NOMA decoding orders, subject to the DL user's individual rate constraint. To tackle this challenging non-convex problem, we propose an alternating optimization (AO) based algorithm for the joint active and passive beamforming design with a given DL NOMA decoding order. To address the potential high computational complexity required for exhaustive searching all the NOMA decoding orders, an efficient NOMA user ordering scheme is further developed. Finally, numerical results demonstrate that: i) compared with the baseline schemes employing conventional RISs or space division multiple access, the proposed scheme achieves higher performance gains; and ii) higher UL rate gain is obtained at a cost of DL performance degradation; as a remedy, a more flexible performance tradeoff can be achieved by introducing the STAR-RIS.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.01956 [pdf, ps, other]

Hybrid Active-Passive RIS Transmitter Enabled Energy-Efficient Multi-User Communications

Authors: Ao Huang, Xidong Mu, Li Guo, Guangyu Zhu

Abstract: A novel hybrid active-passive reconfigurable intelligent surface (RIS) transmitter enabled downlink multi-user communication system is investigated. Specifically, RISs are exploited to serve as transmitter antennas, where each element can flexibly switch between active and passive modes to deliver information to multiple users. The system energy efficiency (EE) maximization problem is formulated by jointly optimizing the RIS element scheduling and beamforming coefficients, as well as the power allocation coefficients, subject to the user's individual rate requirement and the maximum RIS amplification power constraint. Using the Dinkelbach relaxation, the original mixed-integer nonlinear programming problem is transformed into a nonfractional optimization problem with a two-layer structure, which is solved by the alternating optimization approach. In particular, an exhaustive search method is proposed to determine the optimal operating mode for each RIS element. Then, the RIS beamforming and power allocation coefficients are properly designed in an alternating manner. To overcome the potentially high complexity caused by exhaustive searching, we further develop a joint RIS element mode and beamforming optimization scheme by exploiting the Big-M formulation technique. Numerical results validate that: 1) The proposed hybrid RIS scheme yields higher EE than the baseline multi-antenna schemes employing fully active/passive RIS or conventional radio frequency chains; 2) Both proposed algorithms are effective in improving the system performance, especially the latter can achieve precise design of RIS elements with low complexity; and 3) For a fixed-size hybrid RIS, maximum EE can be reaped by setting only a minority of elements to operate in the active mode.

Submitted 4 March, 2024; originally announced March 2024.

arXiv:2303.07821 [pdf, ps, other]

Self-attention for Enhanced OAMP Detection in MIMO Systems

Authors: Alexander Fuchs, Christian Knoll, Nima N. Moghadam, Alexey Pak, Jinliang Huang, Erik Leitinger, Franz Pernkopf

Abstract: Multiple-Input Multiple-Output (MIMO) systems are essential for wireless communications. Since classical algorithms for symbol detection in MIMO setups require large computational resources or provide poor results, data-driven algorithms are becoming more popular. Most of the proposed algorithms, however, introduce approximations leading to degraded performance for realistic MIMO systems. In this paper, we introduce a neural-enhanced hybrid model, augmenting the analytic backbone algorithm with state-of-the-art neural network components. In particular, we introduce a self-attention model for the enhancement of the iterative Orthogonal Approximate Message Passing (OAMP)-based decoding algorithm. In our experiments, we show that the proposed model can outperform existing data-driven approaches for OAMP while having improved generalization to other SNR values at limited computational overhead.

Submitted 14 March, 2023; originally announced March 2023.

Comments: 8 pages, 2 figures, ICASSP 2023

ACM Class: I.2.1; H.1.1

arXiv:2303.07406 [pdf]

Infra-Red, In-Situ (IRIS) Inspection of Silicon

Authors: Andrew 'bunnie' Huang

Abstract: This paper introduces the Infra-Red, In Situ (IRIS) inspection method, which uses short-wave IR (SWIR) light to non-destructively "see through" the backside of chips and image them with lightly modified conventional digital CMOS cameras. With a ~1050 nm light source, IRIS is capable of constraining macro- and meso-scale features of a chip. This hardens existing micro-scale self-test verification techniques by ruling out the existence of extra circuitry that can hide a hardware trojan with a test bypass. Thus, self-test techniques used in conjunction with IRIS can ensure the correct construction of security-critical hardware at all size scales.

Submitted 5 March, 2023; originally announced March 2023.

Comments: 8 pages, 19 figures

ACM Class: B.m

arXiv:2205.03997 [pdf, other]

A Real Time Super Resolution Accelerator with Tilted Layer Fusion

Authors: An-Jung Huang, Kai-Chieh Hsu, Tian-Sheuan Chang

Abstract: Deep learning based superresolution achieves high-quality results, but its heavy computational workload, large buffer, and high external memory bandwidth inhibit its usage in mobile devices. To solve the above issues, this paper proposes a real-time hardware accelerator with the tilted layer fusion method that reduces the external DRAM bandwidth by 92% and just needs 102KB on-chip memory. The design implemented with a 40nm CMOS process achieves 1920x1080@60fps throughput with 544.3K gate count when running at 600MHz; it has higher throughput and lower area cost than previous designs.

Submitted 8 May, 2022; originally announced May 2022.

Comments: 5 pages, 6 figures, published in ISCAS 2022

arXiv:2203.15140 [pdf, other]

Improving Source Separation by Explicitly Modeling Dependencies Between Sources

Authors: Ethan Manilow, Curtis Hawthorne, Cheng-Zhi Anna Huang, Bryan Pardo, Jesse Engel

Abstract: We propose a new method for training a supervised source separation system that aims to learn the interdependent relationships between all combinations of sources in a mixture. Rather than independently estimating each source from a mix, we reframe the source separation problem as an Orderless Neural Autoregressive Density Estimator (NADE), and estimate each source from both the mix and a random subset of the other sources. We adapt a standard source separation architecture, Demucs, with additional inputs for each individual source, in addition to the input mixture. We randomly mask these input sources during training so that the network learns the conditional dependencies between the sources. By pairing this training method with a block Gibbs sampling procedure at inference time, we demonstrate that the network can iteratively improve its separation performance by conditioning a source estimate on its earlier source estimates. Experiments on two source separation datasets show that training a Demucs model with an Orderless NADE approach and using Gibbs sampling (up to 512 steps) at inference time strongly outperforms a Demucs baseline that uses a standard regression loss and direct (one step) estimation of sources.

Submitted 28 March, 2022; originally announced March 2022.

Comments: To appear at ICASSP 2022

arXiv:2112.09312 [pdf, other]

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Authors: Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, Tim Cooijmans, Aaron Courville, Cheng-Zhi Anna Huang, Jesse Engel

Abstract: Musical expression requires control of both what notes are played, and how they are performed. Conventional audio synthesizers provide detailed expressive controls, but at the cost of realism. Black-box neural audio synthesis and concatenative samplers can produce realistic audio, but have few mechanisms for control. In this work, we introduce MIDI-DDSP, a hierarchical model of musical instruments that enables both realistic neural audio synthesis and detailed user control. Starting from interpretable Differentiable Digital Signal Processing (DDSP) synthesis parameters, we infer musical notes and high-level properties of their expressive performance (such as timbre, vibrato, dynamics, and articulation). This creates a 3-level hierarchy (notes, performance, synthesis) that affords individuals the option to intervene at each level, or utilize trained priors (performance given notes, synthesis given performance) for creative assistance. Through quantitative experiments and listening tests, we demonstrate that this hierarchy can reconstruct high-fidelity audio, accurately predict performance attributes for a note sequence, independently manipulate the attributes of a given performance, and as a complete system, generate realistic audio from a novel note sequence. By utilizing an interpretable hierarchy, with multiple levels of granularity, MIDI-DDSP opens the door to assistive tools to empower individuals across a diverse range of musical experience.

Submitted 17 March, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

Comments: Accepted by International Conference on Learning Representations (ICLR) 2022

arXiv:2111.14951 [pdf, other]

Expressive Communication: A Common Framework for Evaluating Developments in Generative Models and Steering Interfaces

Authors: Ryan Louie, Jesse Engel, Anna Huang

Abstract: There is an increasing interest from ML and HCI communities in empowering creators with better generative models and more intuitive interfaces with which to control them. In music, ML researchers have focused on training models capable of generating pieces with increasing long-range structure and musical coherence, while HCI researchers have separately focused on designing steering interfaces that support user control and ownership. In this study, we investigate through a common framework how developments in both models and user interfaces are important for empowering co-creation where the goal is to create music that communicates particular imagery or ideas (e.g., as is common for other purposeful tasks in music creation like establishing mood or creating accompanying music for another media). Our study is distinguished in that it measures communication through both composers' self-reported experiences, and how listeners evaluate this communication through the music. In an evaluation study with 26 composers creating 100+ pieces of music and listeners providing 1000+ head-to-head comparisons, we find that more expressive models and more steerable interfaces are important and complementary ways to make a difference in composers communicating through music and supporting their creative empowerment.

Submitted 29 November, 2021; originally announced November 2021.

Comments: 15 pages, 6 figures, submitted to ACM Intelligent User Interfaces 2022 Conference

arXiv:2106.08846 [pdf, other]

Algorithm to Compilation Co-design: An Integrated View of Neural Network Sparsity

Authors: Fu-Ming Guo, Austin Huang

Abstract: Reducing computation cost, inference latency, and memory footprint of neural networks are frequently cited as research motivations for pruning and sparsity. However, operationalizing those benefits and understanding the end-to-end effect of algorithm design and regularization on the runtime execution is not often examined in depth. Here we apply structured and unstructured pruning to attention weights of transformer blocks of the BERT language model, while also expanding block sparse representation (BSR) operations in the TVM compiler. Integration of BSR operations enables the TVM runtime execution to leverage structured pattern sparsity induced by model regularization. This integrated view of pruning algorithms enables us to study relationships between modeling decisions and their direct impact on sparsity-enhanced execution. Our main findings are: 1) we validate that performance benefits of structured sparsity block regularization must be enabled by the BSR augmentations to TVM, with 4x speedup relative to vanilla PyTorch and 2.2x speedup relative to standard TVM compilation (without expanded BSR support); 2) for BERT attention weights, the end-to-end optimal block sparsity shape in this CPU inference context is not a square block (as in \cite{gray2017gpu}) but rather a linear 32x1 block; 3) the relationship between performance and block size / shape is suggestive of how model regularization parameters interact with task scheduler optimizations resulting in the observed end-to-end performance.

Submitted 17 June, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

arXiv:2010.09776 [pdf, other]

SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving

Authors: Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, et al. (12 additional authors not shown)

Abstract: Multi-agent interaction is a fundamental aspect of autonomous driving in the real world. Despite more than a decade of research and development, the problem of how to competently interact with diverse road users in diverse scenarios remains largely unsolved. Learning methods have much to offer towards solving this problem. But they require a realistic multi-agent simulator that generates diverse and competent driving interactions. To meet this need, we develop a dedicated simulation platform called SMARTS (Scalable Multi-Agent RL Training School). SMARTS supports the training, accumulation, and use of diverse behavior models of road users. These are in turn used to create increasingly more realistic and diverse interactions that enable deeper and broader research on multi-agent interaction. In this paper, we describe the design goals of SMARTS, explain its basic architecture and its key features, and illustrate its use through concrete multi-agent experiments on interactive scenarios. We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving. Our code is available at https://github.com/huawei-noah/SMARTS.

Submitted 31 October, 2020; v1 submitted 19 October, 2020; originally announced October 2020.

Comments: 20 pages, 11 figures. Paper accepted to CoRL 2020

arXiv:2010.05388 [pdf, other]

AI Song Contest: Human-AI Co-Creation in Songwriting

Authors: Cheng-Zhi Anna Huang, Hendrik Vincent Koops, Ed Newton-Rex, Monica Dinculescu, Carrie J. Cai

Abstract: Machine learning is challenging the way we make music. Although research in deep generative models has dramatically improved the capability and fluency of music models, recent work has shown that it can be challenging for humans to partner with this new class of algorithms. In this paper, we present findings on what 13 musician/developer teams, a total of 61 users, needed when co-creating a song with AI, the challenges they faced, and how they leveraged and repurposed existing characteristics of AI to overcome some of these challenges. Many teams adopted modular approaches, such as independently running multiple smaller models that align with the musical building blocks of a song, before re-combining their results. As ML models are not easily steerable, teams also generated massive numbers of samples and curated them post-hoc, or used a range of strategies to direct the generation, or algorithmically ranked the samples. Ultimately, teams not only had to manage the "flare and focus" aspects of the creative process, but also juggle them with a parallel process of exploring and curating multiple ML models and outputs. These findings reflect a need to design machine learning-powered music interfaces that are more decomposable, steerable, interpretable, and adaptive, which in return will enable artists to more effectively explore how AI can extend their personal expression.

Submitted 11 October, 2020; originally announced October 2020.

Comments: 6 pages + 3 pages of references

ACM Class: J.5; I.2

Journal ref: ISMIR 2020

arXiv:2007.05500 [pdf, other]

Scientific Discovery by Generating Counterfactuals using Image Translation

Authors: Arunachalam Narayanaswamy, Subhashini Venugopalan, Dale R. Webster, Lily Peng, Greg Corrado, Paisan Ruamviboonsuk, Pinal Bavishi, Rory Sayres, Abigail Huang, Siva Balasubramanian, Michael Brenner, Philip Nelson, Avinash V. Varadarajan

Abstract: Model explanation techniques play a critical role in understanding the source of a model's performance and making its decisions transparent. Here we investigate if explanation techniques can also be used as a mechanism for scientific discovery. We make three contributions: first, we propose a framework to convert predictions from explanation techniques to a mechanism of discovery. Second, we show how generative models in combination with black-box predictors can be used to generate hypotheses (without human priors) that can be critically examined. Third, with these techniques we study classification models for retinal images predicting Diabetic Macular Edema (DME), where recent work showed that a CNN trained on these images is likely learning novel features in the image. We demonstrate that the proposed framework is able to explain the underlying scientific mechanism, thus bridging the gap between the model's performance and human understanding.

Submitted 19 July, 2020; v1 submitted 10 July, 2020; originally announced July 2020.

Comments: Accepted at MICCAI 2020. This version combines camera-ready and supplement

Journal ref: MICCAI 2020

arXiv:2005.12373 [pdf]

doi 10.1109/TSG.2020.2998041

Large-Signal Stability Criteria in DC Power Grids with Distributed-Controlled Converters and Constant Power Loads

Authors: Fangyuan Chang, Xiaofan Cui, Mengqi Wang, Wencong Su, Alex Q. Huang

Abstract: The increasing adoption of power electronic devices may lead to large disturbance and destabilization of future power systems. However, stability criteria are still an unsolved puzzle, since traditional small-signal stability analysis is not applicable to power electronics-enabled power systems when a large disturbance occurs, such as a fault, a pulse power load, or load switching. To address this issue, this paper presents for the first time the rigorous derivation of the sufficient criteria for large-signal stability in DC microgrids with distributed-controlled DC-DC power converters. A novel type of closed-loop converter controllers is designed and considered. Moreover, this paper is the first to prove that the well-known and frequently cited Brayton-Moser mixed potential theory (published in 1964) is incomplete. Case studies are carried out to illustrate the defects of Brayton-Moser mixed potential theory and verify the effectiveness of the proposed novel stability criteria.

Submitted 25 May, 2020; originally announced May 2020.

arXiv:2002.02451 [pdf, other]

Federated Orchestration for Network Slicing of Bandwidth and Computational Resource

Authors: Yingyu Li, Anqi Huang, Yong Xiao, Xiaohu Ge, Sumei Sun, Han-Chieh Chao

Abstract: Network slicing has been considered as one of the key enablers for 5G to support diversified IoT services and application scenarios. This paper studies the distributed network slicing for a massive scale IoT network supported by 5G with fog computing. Multiple services with various requirements need to be supported by both spectrum resource offered by 5G network and computational resource of the fog computing network. We propose a novel distributed framework based on a new control plane entity, federated-orchestrator (F-orchestrator), which can coordinate the spectrum and computational resources without requiring any exchange of the local data and resource information from BSs. We propose a distributed resource allocation algorithm based on Alternating Direction Method of Multipliers with Partial Variable Splitting (DistADMM-PVS). We prove DistADMM-PVS minimizes the average service response time of the entire network with guaranteed worst-case performance for all supported types of services when the coordination between the F-orchestrator and BSs is perfectly synchronized. Motivated by the observation that coordination synchronization may result in high coordination delay that can be intolerable when the network is large in scale, we propose a novel asynchronized ADMM (AsynADMM) algorithm. We prove that AsynADMM can converge to the global optimal solution with improved scalability and negligible coordination delay. We evaluate the performance of our proposed framework using two months of traffic data collected in an in-campus smart transportation system supported by a 5G network. Extensive simulation has been conducted for both pedestrian and vehicular-related services during peak and non-peak hours. Our results show that the proposed framework offers significant reduction on service response time for both supported services, especially compared to network slicing with only a single resource.

Submitted 6 February, 2020; originally announced February 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2002.01101

arXiv:1908.08669 [pdf]

A Novel Synchronous Reference Frame Frequency-Locked Loop

Authors: Xiangjun Quan, Qinran Hu, Alex Q. Huang, Xiaobo Dou, Zaijun Wu

Abstract: This letter proposes a new design of frequency-locked loop (FLL) which is based on the synchronous (dq) reference frame instead of the stationary (αβ) reference frame. First, a synchronous reference frame FLL (briefly called SRF-FLL0) equivalent to the conventional FLL is proposed. Then the SRF-FLL0 is improved by utilizing the phase error to acquire a better performance. The small-signal modeling and parameter tuning of the improved synchronous reference frame FLL (SRF-FLL) are presented. Finally, the theoretical analysis and experimental results verify the superiority and effectiveness of the proposed SRF-FLL.

Submitted 26 September, 2019; v1 submitted 23 August, 2019; originally announced August 2019.

Comments: 4 pages, 6 figures

arXiv:1907.06637 [pdf, other]

The Bach Doodle: Approachable music composition with machine learning at scale

Authors: Cheng-Zhi Anna Huang, Curtis Hawthorne, Adam Roberts, Monica Dinculescu, James Wexler, Leon Hong, Jacob Howcroft

Abstract: To make music composition more approachable, we designed the first AI-powered Google Doodle, the Bach Doodle, where users can create their own melody and have it harmonized by a machine learning model Coconet (Huang et al., 2017) in the style of Bach. For users to input melodies, we designed a simplified sheet-music based interface. To support an interactive experience at scale, we re-implemented Coconet in TensorFlow.js (Smilkov et al., 2019) to run in the browser and reduced its runtime from 40s to 2s by adopting dilated depth-wise separable convolutions and fusing operations. We also reduced the model download size to approximately 400KB through post-training weight quantization. We calibrated a speed test based on partial model evaluation time to determine if the harmonization request should be performed locally or sent to remote TPU servers. In three days, people spent 350 years worth of time playing with the Bach Doodle, and Coconet received more than 55 million queries. Users could choose to rate their compositions and contribute them to a public dataset, which we are releasing with this paper. We hope that the community finds this dataset useful for applications ranging from ethnomusicological studies, to music education, to improving machine learning models.

Submitted 14 July, 2019; originally announced July 2019.

Comments: Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2019

arXiv:1905.08632 [pdf, other]

Human Vocal Sentiment Analysis

Authors: Andrew Huang, Puwei Bao

Abstract: In this paper, we use several techniques with conventional vocal feature extraction (MFCC, STFT), along with deep-learning approaches such as CNN, and also context-level analysis, by providing the textual data, and combining different approaches for improved emotion-level classification. We explore models that have not been tested to gauge the difference in performance and accuracy. We apply hyperparameter sweeps and data augmentation to improve performance. Finally, we see if a real-time approach is feasible, and can be readily integrated into existing systems.

Submitted 19 May, 2019; originally announced May 2019.

Comments: NYU Shanghai CSCS 2019

arXiv:1903.07227 [pdf, other]

Counterpoint by Convolution

Authors: Cheng-Zhi Anna Huang, Tim Cooijmans, Adam Roberts, Aaron Courville, Douglas Eck

Abstract: Machine learning models of music typically break up the task of composition into a chronological process, composing a piece of music in a single pass from beginning to end. On the contrary, human composers write music in a nonlinear fashion, scribbling motifs here and there, often revisiting choices previously made. In order to better approximate this process, we train a convolutional neural network to complete partial musical scores, and explore the use of blocked Gibbs sampling as an analogue to rewriting. Neither the model nor the generative procedure is tied to a particular causal direction of composition. Our model is an instance of orderless NADE (Uria et al., 2014), which allows more direct ancestral sampling. However, we find that Gibbs sampling greatly improves sample quality, which we demonstrate to be due to some conditional distributions being poorly modeled. Moreover, we show that even the cheap approximate blocked Gibbs procedure from Yao et al. (2014) yields better samples than ancestral sampling, based on both log-likelihood and human evaluation.

Submitted 17 March, 2019; originally announced March 2019.

Comments: Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2017

ACM Class: H.5.5; I.2

arXiv:1811.09914 [pdf, other]

RADMPC: A Fast Decentralized Approach for Chance-Constrained Multi-Vehicle Path-Planning

Authors: Aaron Huang, Benjamin J. Ayton, Brian C. Williams

Abstract: Robust multi-vehicle path-planning is important for ensuring the safety of multi-vehicle systems in applications like transportation, search and rescue, and robotic exploration. Chance-constrained methods like Iterative Risk Allocation (IRA) have been developed for situations where environmental disturbances are unbounded. However, chance-constrained methods for the multi-vehicle case generally use centralized strategies where the vehicle set is planned with couplings between all vehicle pairs. This approach is intractable as fleet size increases because computation time is exponential with respect to the number of vehicles being planned over due to a polynomial increase in coupling constraints between vehicle pairs. We present a faster approach for chance-constrained multi-vehicle path-planning that relies upon a decentralized path-planning method called Risk-Aware Decentralized Model Predictive Control (RADMPC) to rapidly approximate a centralized IRA approach. The RADMPC approximation is evaluated for vehicle interactions to determine the vehicle sets that should be planned in a coupled manner. Applying IRA to the smaller vehicle sets determined from the RADMPC approximation rapidly plans safe paths for the entire fleet. A Monte Carlo simulation analysis demonstrates the correctness of our approach and a significant improvement in computation time compared to a centralized IRA approach.

Submitted 24 November, 2018; originally announced November 2018.

arXiv:1810.12247 [pdf, other]

Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset

Authors: Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, Douglas Eck

Abstract: Generating musical audio directly with neural networks is notoriously difficult because it requires coherently modeling structure at many different timescales. Fortunately, most music is also highly structured and can be represented as discrete note events played on musical instruments. Herein, we show that by using notes as an intermediate representation, we can train a suite of models capable of transcribing, composing, and synthesizing audio waveforms with coherent musical structure on timescales spanning six orders of magnitude (~0.1 ms to ~100 s), a process we call Wave2Midi2Wave. This large advance in the state of the art is enabled by our release of the new MAESTRO (MIDI and Audio Edited for Synchronous TRacks and Organization) dataset, composed of over 172 hours of virtuosic piano performances captured with fine alignment (~3 ms) between note labels and audio waveforms. The networks and the dataset together present a promising approach toward creating new expressive and interpretable neural models of music.

Submitted 17 January, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

Comments: Examples available at https://goo.gl/magenta/maestro-examples

arXiv:1809.04281 [pdf, other]

Music Transformer

Authors: Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck

Abstract: Music relies heavily on repetition to build structure and meaning. Self-reference occurs on multiple timescales, from motifs to phrases to reusing of entire sections of music, such as in pieces with ABA structure. The Transformer (Vaswani et al., 2017), a sequence model based on self-attention, has achieved compelling results in many generation tasks that require maintaining long-range coherence. This suggests that self-attention might also be well-suited to modeling music. In musical composition and performance, however, relative timing is critically important. Existing approaches for representing relative positional information in the Transformer modulate attention based on pairwise distance (Shaw et al., 2018). This is impractical for long sequences such as musical compositions since their memory complexity for intermediate relative information is quadratic in the sequence length. We propose an algorithm that reduces their intermediate memory requirement to linear in the sequence length. This enables us to demonstrate that a Transformer with our modified relative attention mechanism can generate minute-long compositions (thousands of steps, four times the length modeled in Oore et al., 2018) with compelling structure, generate continuations that coherently elaborate on a given motif, and in a seq2seq setup generate accompaniments conditioned on melodies. We evaluate the Transformer with our relative attention mechanism on two datasets, JSB Chorales and Piano-e-Competition, and obtain state-of-the-art results on the latter.

Submitted 12 December, 2018; v1 submitted 12 September, 2018; originally announced September 2018.

Comments: Improved skewing section and accompanying figures. Previous titles are "An Improved Relative Self-Attention Mechanism for Transformer with Application to Music Generation" and "Music Transformer"

arXiv:1410.2792 [pdf, other]

Convex Model Predictive Control for Vehicular Systems

Authors: Tiffany A. Huang, Matanya B. Horowitz, Joel W. Burdick

Abstract: In this work, we present a method to perform Model Predictive Control (MPC) over systems whose state is an element of $SO(n)$ for $n=2,3$. This is done without charts or any local linearization, and instead is performed by operating over the orbitope of rotation matrices. This results in a novel MPC scheme without the drawbacks associated with conventional linearization techniques. Instead, second order cone- or semidefinite-constraints on state variables are the only requirement beyond those of a QP-scheme typical for MPC of linear systems. Of particular emphasis is the application to aeronautical and vehicular systems, wherein the method removes many of the transcendental trigonometric terms associated with these systems' state space equations. Furthermore, the method is shown to be compatible with many existing variants of MPC, including obstacle avoidance via Mixed Integer Linear Programming (MILP).

Submitted 10 October, 2014; originally announced October 2014.

Showing 1–24 of 24 results for author: Huang, A