-
Can Public LLMs be used for Self-Diagnosis of Medical Conditions ?
Authors:
Nikil Sharan Prabahar Balasubramanian,
Sagnik Dakshit
Abstract:
Advancements in deep learning have generated large-scale interest in the development of foundational deep learning models. The development of Large Language Models (LLMs) has emerged as a transformative paradigm in conversational tasks, which has led to their integration and extension even in the critical domain of healthcare. With LLMs becoming widely popular and publicly accessible through open-source models and integration with other applications, there is a need to investigate their potential and limitations. One such crucial task where LLMs are applied but require a deeper understanding is that of self-diagnosis of medical conditions based on bias-validating symptoms in the interest of public health. The widespread integration of Gemini with Google search and GPT-4.0 with Bing search has led to a shift in the trend of self-diagnosis from search engines to conversational LLMs. Owing to the critical nature of the task, it is prudent to investigate and understand the potential and limitations of public LLMs in the task of self-diagnosis. In this study, we prepare a prompt-engineered dataset of 10,000 samples and test the performance on the general task of self-diagnosis. We compare the performance of both the state-of-the-art GPT-4.0 and the free Gemini model on the task of self-diagnosis and record contrasting accuracies of 63.07% and 6.01%, respectively. We also discuss the challenges, limitations, and potential of both Gemini and GPT-4.0 for the task of self-diagnosis to facilitate future research and towards the broader impact of general public knowledge. Furthermore, we demonstrate the potential and improvement in performance for the task of self-diagnosis using Retrieval Augmented Generation.
Submitted 25 June, 2024; v1 submitted 18 May, 2024;
originally announced May 2024.
-
CYGENT: A cybersecurity conversational agent with log summarization powered by GPT-3
Authors:
Prasasthy Balasubramanian,
Justin Seby,
Panos Kostakos
Abstract:
In response to the escalating cyber-attacks in the modern IT and IoT landscape, we developed CYGENT, a conversational agent framework powered by the GPT-3.5 turbo model, designed to aid system administrators in ensuring optimal performance and uninterrupted resource availability. This study focuses on fine-tuning GPT-3 models for cybersecurity tasks, including conversational AI and generative AI tailored specifically for cybersecurity operations. CYGENT assists users by providing cybersecurity information, analyzing and summarizing uploaded log files, detecting specific events, and delivering essential instructions. The conversational agent was developed based on the GPT-3.5 turbo model. We fine-tuned and validated summarizer models (GPT-3) using manually generated data points. Using this approach, we achieved a BERTscore of over 97%, indicating GPT-3's enhanced capability in summarizing log files into human-readable formats and providing necessary information to users. Furthermore, we conducted a comparative analysis of GPT-3 models with other Large Language Models (LLMs), including CodeT5-small, CodeT5-base, and CodeT5-base-multi-sum, with the objective of analyzing log analysis techniques. Our analysis consistently demonstrated that the Davinci (GPT-3) model outperformed all other LLMs. These findings are crucial for improving human comprehension of logs, particularly in light of the increasing numbers of IoT devices. Additionally, our research suggests that the CodeT5-base-multi-sum model exhibits comparable performance to Davinci to some extent in summarizing logs, indicating its potential as an offline model for this task.
Submitted 25 March, 2024;
originally announced March 2024.
-
TSTEM: A Cognitive Platform for Collecting Cyber Threat Intelligence in the Wild
Authors:
Prasasthy Balasubramanian,
Sadaf Nazari,
Danial Khosh Kholgh,
Alireza Mahmoodi,
Justin Seby,
Panos Kostakos
Abstract:
The extraction of cyber threat intelligence (CTI) from open sources is a rapidly expanding defensive strategy that enhances the resilience of both Information Technology (IT) and Operational Technology (OT) environments against large-scale cyber-attacks. While previous research has focused on improving individual components of the extraction process, the community lacks open-source platforms for deploying streaming CTI data pipelines in the wild. To address this gap, the study describes the implementation of an efficient and well-performing platform capable of processing compute-intensive data pipelines based on the cloud computing paradigm for detecting, collecting, and sharing CTI from different online sources in real time. We developed a prototype platform (TSTEM), a containerized microservice architecture that uses Tweepy, Scrapy, Terraform, ELK, Kafka, and MLOps to autonomously search, extract, and index indicators of compromise (IOCs) in the wild. Moreover, the provisioning, monitoring, and management of the TSTEM platform are achieved through infrastructure as code (IaC). Custom focus crawlers collect web content, which is then processed by a first-level classifier to identify potential IOCs. If deemed relevant, the content advances to a second level of extraction for further examination. Throughout this process, state-of-the-art NLP models are utilized for classification and entity extraction, enhancing the overall IOC extraction methodology. Our experimental results indicate that these models exhibit high accuracy (exceeding 98%) in the classification and extraction tasks, achieving this performance within a time frame of less than a minute. The effectiveness of our system can be attributed to a finely-tuned IOC extraction method that operates at multiple stages, ensuring precise identification of relevant information with low false positives.
Submitted 15 February, 2024;
originally announced February 2024.
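The two-level flow described in the TSTEM abstract (a first-level relevance classifier, then a second-level extraction step) can be sketched as follows. This is only an illustration of the control flow: the abstract says the platform uses NLP models for both stages, whereas the keyword filter, the regular expressions, and the names `is_relevant`, `extract_iocs`, and `process` below are hypothetical stand-ins chosen to keep the example self-contained.

```python
import re

# Hypothetical stand-in for the second-level extractor: simple regexes
# instead of the NLP entity-extraction models mentioned in the abstract.
IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
}

# Hypothetical stand-in for the first-level classifier: a keyword filter.
KEYWORDS = ("malware", "ransomware", "phishing", "exploit")


def is_relevant(text: str) -> bool:
    """Stage 1: decide whether the content may contain IOCs."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def extract_iocs(text: str) -> dict:
    """Stage 2: pull candidate IOCs out of content that passed stage 1."""
    found = {}
    for name, pattern in IOC_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


def process(text: str):
    """Run both stages; return None when stage 1 rejects the content."""
    if not is_relevant(text):
        return None
    return extract_iocs(text)
```

Running the cheap relevance check first means the more expensive extraction step only sees content that is likely to matter, which is the design choice the abstract credits for low false positives.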
-
SplitSim: Large-Scale Simulations for Evaluating Network Systems Research
Authors:
He**g Li,
Praneeth Balasubramanian,
Marvin Meiers,
Jialin Li,
Antoine Kaufmann
Abstract:
When physical testbeds are out of reach for evaluating a networked system, we frequently turn to simulation. In today's datacenter networks, bottlenecks are rarely at the network protocol level, but instead in end-host software or hardware components; thus, current protocol-level simulations are an inadequate means of evaluation. End-to-end simulations covering these components, on the other hand, simply cannot achieve the required scale with feasible simulation performance and computational resources.
In this paper, we address this with SplitSim, a simulation framework for end-to-end evaluation of large-scale network and distributed systems. To this end, SplitSim builds on prior work on modular end-to-end simulations and combines this with key elements to achieve scalability. First, mixed-fidelity simulations judiciously reduce detail in the simulation of parts of the system where this can be tolerated, while retaining the necessary detail elsewhere. SplitSim then parallelizes bottleneck simulators by decomposing them into multiple parallel but synchronized processes. Next, SplitSim provides a profiler to help users understand simulation performance and where the bottlenecks are, so users can adjust the configuration. Finally, SplitSim provides abstractions to make it easy for users to build complex large-scale simulations. Our evaluation demonstrates SplitSim in multiple large-scale case studies.
Submitted 7 February, 2024;
originally announced February 2024.
-
Detecting nitrogen-vacancy-hydrogen centers on the nanoscale using nitrogen-vacancy centers in diamond
Authors:
Christoph Findler,
Rémi Blinder,
Karolina Schüle,
Priyadharshini Balasubramanian,
Christian Osterkamp,
Fedor Jelezko
Abstract:
In diamond, nitrogen defects like the substitutional nitrogen defect (Ns) or the nitrogen-vacancy-hydrogen complex (NVH) outnumber the nitrogen-vacancy (NV) defect by at least one order of magnitude, creating a dense spin bath. While neutral Ns has an impact on the coherence of the NV spin state, the atomic structure of NVH resembles an NV center decorated with a hydrogen atom. As a consequence, the formation of NVH centers could compete with that of NV centers, possibly lowering the N-to-NV conversion efficiency in diamond grown with hydrogen-plasma-assisted chemical vapor deposition (CVD). Therefore, monitoring and controlling the spin bath is essential to produce and understand engineered diamond material with high NV concentrations for quantum applications. While the incorporation of Ns in diamond has been investigated on the nano- and mesoscale for years, studies concerning the influence of CVD parameters and the crystal orientation on the NVH formation have been restricted to bulk N-doped diamond providing high-enough spin numbers for electron paramagnetic resonance and optical absorption spectroscopy techniques. Here, we investigate sub-micron-thick (100)-diamond layers with nitrogen contents of (13.8 ± 1.6) ppm and (16.7 ± 3.6) ppm, and, exploiting the NV centers in the layers as local nano-sensors, we demonstrate the detection of NVH- centers using double-electron-electron-resonance (DEER). To determine the NVH- densities, we quantitatively fit the hyperfine structure of NVH- and confirm the results with the DEER method usually used for determining Ns0 densities. With our experiments, we access the spin bath composition on the nanoscale and enable a fast feedback-loop in CVD recipe optimization with thin diamond layers instead of resource- and time-intensive bulk crystals.
Submitted 30 November, 2023;
originally announced November 2023.
-
Fault-Tolerant Design Approach Based on Approximate Computing
Authors:
P Balasubramanian,
D L Maskell
Abstract:
Triple Modular Redundancy (TMR) has been traditionally used to ensure complete tolerance to a single fault or a faulty processing unit, where the processing unit may be a circuit or a system. However, TMR incurs more than 200% overhead in terms of area and power compared to a single processing unit. Hence, alternative redundancy approaches were proposed in the literature to mitigate the design overheads associated with TMR, but they provide only partial or moderate fault tolerance. This research presents a new fault-tolerant design approach based on approximate computing called FAC that has the same fault tolerance as TMR and achieves significant reductions in the design metrics for physical implementation. FAC is suited for a plethora of error-tolerant applications. Here, the performance of TMR and FAC has been evaluated for a digital image processing application. The image processing results obtained confirm the usefulness of FAC. When an example processing unit was implemented using a 28-nm CMOS technology, FAC achieved a 15.3% reduction in delay, a 19.5% reduction in area, and a 24.7% reduction in power compared to TMR.
Submitted 1 November, 2023;
originally announced November 2023.
-
Origins of multi-sublattice magnetism and superexchange interactions in double-double perovskite CaMnCrSbO6
Authors:
Rakshanda Dhawan,
Padmanabhan Balasubramanian,
Tashi Nautiyal
Abstract:
We have deployed density functional theory, Wannier function analysis and mean-field calculations to investigate the double-double perovskite compound CaMnCrSbO_{6}. The crystallographically non-equivalent Mn atoms in the unit cell have tetrahedral and planar oxygen coordinations (labelled as Mn(1) and Mn(2)), while the Cr atom is in the centre of a distorted oxygen octahedron. While the bulk magnetization and neutron diffraction suggest a simpler ferrimagnetic order (T_C=49 K) between Mn2+ and Cr3+ spins, the exchange interactions are more complex than expected for a two-sublattice magnetic system. The electronic structure calculations yield a ferrimagnetic insulating ground state even in the absence of Hubbard U, which persists for a wide range of U. The Mn(1)-O-Mn(2) (out-of-plane and in-plane), Mn(1)-O-Cr and Mn(2)-O-Cr superexchange interactions are found to be antiferromagnetic, while the Cr-O-O-Cr super-superexchange is found to be ferromagnetic. The Mn(2)-O-Cr superexchange is weaker than the Mn(1)-O-Cr superexchange, thus effectively resulting in ferrimagnetism. From a simple 3-site Hubbard model, we derived expressions for the antiferromagnetic superexchange strength J_AFM and the weaker ferromagnetic J_FM. The relative strengths of J_AFM for the various superexchange interactions are in agreement with those obtained from DFT. The expression for the Cr-O-O-Cr super-superexchange strength (J_SS), which is derived considering a 4-site Hubbard model, predicts a ferromagnetic exchange in agreement with DFT. Finally, our mean-field calculations reveal that assuming a set of four magnetic sublattices for Mn2+ spins and a single magnetic sublattice for Cr3+ spins yields a much improved T_C, while a simple two-sublattice model yields a much higher T_C.
Submitted 12 June, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
-
Gate-Level Static Approximate Adders
Authors:
P Balasubramanian,
R Nayar,
D L Maskell
Abstract:
This work compares and analyzes static approximate adders which are suitable for FPGA and ASIC type implementations. We consider many static approximate adders and evaluate their performance with respect to a digital image processing application using standard figures of merit such as peak signal to noise ratio and structural similarity index metric. We provide the error metrics of approximate adders, and the design metrics of accurate and approximate adders corresponding to FPGA and ASIC type implementations. For the FPGA implementation, we considered a Xilinx Artix-7 FPGA, and for an ASIC type implementation, we considered a 32-28 nm CMOS standard digital cell library. While the inferences from this work could serve as a useful reference to determine an optimum static approximate adder for a practical application, in particular, we found approximate adders HOAANED, HERLOA and M-HERLOA to be preferable.
Submitted 17 December, 2021;
originally announced December 2021.
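The evaluation style described in the approximate-adders abstract (apply an approximate adder inside an image operation, then score the result with peak signal to noise ratio) can be sketched in a few lines. The adder below is a generic lower-part-OR style approximation used purely as a stand-in: it is not HOAANED, HERLOA, or M-HERLOA, whose internals the abstract does not give. The names `loa_add` and `psnr` and the default of 3 approximated bits are assumptions for illustration.

```python
import math


def loa_add(a: int, b: int, approx_bits: int = 3) -> int:
    """Approximate unsigned add: OR the low bits, add the high bits exactly.

    Illustrative stand-in only; not one of the adders named in the abstract.
    """
    low_mask = (1 << approx_bits) - 1
    low = (a | b) & low_mask
    # Estimate the carry into the exact part from the top approximated bit.
    if approx_bits > 0:
        carry_in = (a >> (approx_bits - 1)) & (b >> (approx_bits - 1)) & 1
    else:
        carry_in = 0
    high = (a >> approx_bits) + (b >> approx_bits) + carry_in
    return (high << approx_bits) | low


def psnr(reference, test, peak: int = 255) -> float:
    """Peak signal to noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Usage: average two rows of 8-bit pixels with exact and approximate addition.
row_a = [12, 200, 37, 90]
row_b = [45, 13, 250, 91]
exact = [(a + b) >> 1 for a, b in zip(row_a, row_b)]
approx = [loa_add(a, b) >> 1 for a, b in zip(row_a, row_b)]
print(f"PSNR of approximate blend: {psnr(exact, approx):.2f} dB")
```

A higher PSNR means the approximate result is closer to the exact one; sweeping `approx_bits` shows the accuracy side of the trade-off that the paper weighs against area, delay, and power.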
-
Complex interplay of magnetic ordering and spin-lattice coupling in orthochromite Nd$_{0.5}$Dy$_{0.5}$CrO$_{3}$
Authors:
M. Anas,
Padmanabhan Balasubramanian,
K. Vikram,
Ankita Singh,
C. M. N. Kumar,
Andreas Hoser,
Dariusz Rusinek,
A. K. Sinha,
V. Srihari,
Ranjan K. Singh,
Rinku Kumar,
Mukul Gupta,
T. Maitra,
V. K. Malik
Abstract:
The mixed rare-earth orthochromite Nd$_{0.5}$Dy$_{0.5}$CrO$_{3}$ has a Néel temperature ($T_\mathrm{N}$) of ${\sim}$ 175\,K, resulting in the G-type antiferromagnetic ordering of Cr$^{3+}$ spins. The inverse susceptibility shows a deviation from the Curie-Weiss law at 230\,K, with a large effective paramagnetic moment of 8.8\,$μ_{\mathrm{B}}$. The ZFC-FC magnetization curves bifurcate just above $T_\mathrm{N}$ and show a distinct signature of spin reorientation near 60\,K. Neutron diffraction shows that below $T_\mathrm{N}$, the Cr$^{3+}$ spins align in the $Γ_{2}$ representation as ($F_{x}$, $G_{z}$). Below 60\,K, due to spin reorientation, the magnetic structure is in the $Γ_{1}$ ($G_{y}$) configuration. The neutron diffraction does not show any signature of rare-earth ordering even at 1.5\,K. First-principles density functional theory calculations within the GGA+U and GGA+U+SO approximations reveal that the G-type antiferromagnetic order is the ground state magnetic structure of the Cr sublattice and that the spin reorientation of Cr$^{3+}$ spins can happen in the absence of 3d-4f interactions, unlike in the case of orthoferrites. The specific heat shows a `$λ$' anomaly at $T_\mathrm{N}$, while at low temperature two distinct Schottky anomalies are observed: a Schottky peak at 2\,K and an additional step-like feature above 10\,K. Above $T_\mathrm{N}$, the magnetic transition is preceded by structural anomalies as seen in our x-ray diffraction and Raman measurements. The deviation of structural parameters near the Néel temperature is smaller. The phonon frequencies show deviations from the standard anharmonic behaviour: the first near 250\,K, due to magneto-volume effects, and the second near 200\,K, due to spin-phonon coupling.
Submitted 18 October, 2021;
originally announced October 2021.
-
Coexisting magnetic structures and spin-reorientation in Er$_{0.5}$Dy$_{0.5}$FeO$_{3}$: Bulk magnetization, neutron scattering, specific heat, and \emph{Ab-initio} studies
Authors:
Sarita Rajput,
Padmanabhan Balasubramanian,
Ankita Singh,
Francoise Damay,
C. M. N. Kumar,
W. Tabis,
T. Maitra,
V. K. Malik
Abstract:
The complex magnetic structures, spin-reorientation and associated exchange interactions have been investigated in Er$_{0.5}$Dy$_{0.5}$FeO$_3$ using bulk magnetization, neutron diffraction, specific heat measurements and density functional theory calculations. The Fe$^{3+}$ spins order in a G-type antiferromagnetic structure described by the $Γ_{4}$($G_{x}$,$A_{y}$,$F_{z}$) irreducible representation below 700 K, similar to its end compounds. The bulk magnetization data indicate the occurrence of the spin-reorientation and rare-earth magnetic ordering below $\sim$75 K and 10 K, respectively. The neutron diffraction studies confirm an "incomplete" $Γ_{4}$ ${\rightarrow}$ $Γ_{2}$($F_{x}$,$C_{y}$,$G_{z}$) spin-reorientation initiated at $\leq$75 K. Although the relative volume fraction of the two magnetic structures varies with decreasing temperature, both co-exist even at 1.5 K. At 8 K, ordering of the Er$^{3+}$/Dy$^{3+}$ moments in a $c_{y}^R$ arrangement develops, which gradually increases in intensity with decreasing temperature. At 2 K, a magnetic structure associated with a $c_{z}^R$ arrangement of Er$^{3+}$/Dy$^{3+}$ moments also appears. At 1.5 K, the magnetic structure of Fe$^{3+}$ spins is represented by a combination of $Γ_{2}$+$Γ_{4}$+$Γ_{1}$, while the rare-earth moments coexist as $c_{y}^R$ and $c_{z}^R$, corresponding to the $Γ_{2}$ and $Γ_{1}$ representations, respectively. The observed Schottky anomaly at 2.5 K suggests that the "rare-earth ordering" is induced by polarization due to Fe$^{3+}$ spins. The Er$^{3+}$-Fe$^{3+}$ and Er$^{3+}$-Dy$^{3+}$ exchange interactions, obtained from first-principles calculations, primarily cause the complicated spin-reorientation and $c_{y}^R$ rare-earth ordering, respectively, while the dipolar interactions between rare-earth moments result in the $c_{z}^R$ type rare-earth ordering at 2 K.
Submitted 4 October, 2021; v1 submitted 23 August, 2021;
originally announced August 2021.
-
Induction of VEGF secretion from bone marrow stromal cell line (ST-2) by the dissolution products of mesoporous silica glass particles containing CuO and SrO
Authors:
Preethi Balasubramanian,
Antonio J. Salinas,
Sandra Sanchez-Salcedo,
Rainer Detsch,
Maria Vallet-Regi,
Aldo R. Boccaccini
Abstract:
Certain biomaterials are capable of inducing the secretion of Vascular Endothelial Growth Factor (VEGF) from cells exposed to their biochemical influence, which plays a vital role in stimulating angiogenesis. Looking for this capacity, in this study three porous glasses were synthesized and characterized. The objective of this study was to determine the concentration of the glass particles that, being out of the cytotoxic range, could increase VEGF secretion. The viability of cultivated bone marrow stromal cells (ST-2) was assessed. The samples were examined with light microscopy (LM) after the histochemical staining for haematoxylin and eosin (HE). The biological activity of glasses was evaluated in terms of the influence of the Cu2+ and Sr2+ ions on the cells. The dissolution products of CuSr-1 and CuSr-2.5 produced the highest secretion of VEGF from ST-2 cells after 48 h of incubation. The combination of Cu2+ and Sr2+ lays the foundation for engineering a bioactive glass that can lead to vascularized, functional bone tissue when used in bone regeneration applications.
Submitted 22 April, 2021;
originally announced April 2021.
-
Area Optimized Quasi Delay Insensitive Majority Voter for TMR Applications
Authors:
P Balasubramanian,
D L Maskell,
N E Mastorakis
Abstract:
Mission-critical and safety-critical applications generally tend to incorporate triple modular redundancy (TMR) to embed fault tolerance in their physical implementations. In a TMR realization, an original function block, which may be a circuit or a system, and two exact copies of the function block are used to successfully overcome any temporary fault or permanent failure of an arbitrary function block during the routine operation. The corresponding outputs of the function blocks are majority voted using 3-input majority voters whose outputs define the outputs of a TMR realization. Hence, a 3-input majority voter forms an important component of a TMR realization. Many synchronous majority voters and an asynchronous non-delay insensitive majority voter have been presented in the literature. Recently, quasi delay insensitive (QDI) asynchronous majority voters for TMR applications were also discussed in the literature. In this regard, this paper presents a new QDI asynchronous majority voter for TMR applications, which is better optimized in area compared to the existing QDI majority voters. The proposed QDI majority voter requires 30.2% less area compared to the best of the existing QDI majority voters, and this could be useful for resource-constrained fault tolerance applications. The example QDI TMR circuits were implemented using a 32/28nm complementary metal oxide semiconductor (CMOS) process. The delay insensitive dual rail code was used for data encoding, and 4-phase return-to-zero and return-to-one handshake protocols were used for data communication.
Submitted 13 August, 2020;
originally announced August 2020.
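The voting step described in the TMR abstract above can be illustrated with a small functional model: three copies of a function block produce outputs, and a 3-input majority function selects the value that at least two copies agree on. This sketch models only the Boolean behaviour of the voter; it does not represent the paper's QDI asynchronous circuit, dual-rail encoding, or handshaking. The names `majority3`, `tmr`, and `add_one`, and the fault-injection argument, are hypothetical.

```python
def majority3(a: int, b: int, c: int) -> int:
    """Bitwise 3-input majority: an output bit is 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def tmr(block, x: int, fault=None) -> int:
    """Evaluate three copies of `block` on x and majority-vote their outputs.

    `fault` is an optional (copy_index, xor_mask) pair that corrupts the
    output of one copy, to mimic a single faulty function block.
    """
    outputs = [block(x) for _ in range(3)]
    if fault is not None:
        copy_index, xor_mask = fault
        outputs[copy_index] ^= xor_mask
    return majority3(*outputs)


def add_one(x: int) -> int:
    """Example 8-bit function block (hypothetical)."""
    return (x + 1) & 0xFF


# Usage: a fault in any single copy is masked by the voter.
assert tmr(add_one, 41) == 42
assert tmr(add_one, 41, fault=(1, 0b1000)) == 42
```

Because the vote is taken bit by bit, any corruption confined to one copy is outvoted by the two healthy copies, which is the single-fault tolerance that TMR provides.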
-
Sensitivity Factors based Transmission Network Topology Control for Violation Relief
Authors:
Xingpeng Li,
Akshay Korad,
Pranavamoorthy Balasubramanian
Abstract:
Transmission networks consist of thousands of branches for large-scale real power systems. They are built with a high degree of redundancy for reliability. Thus, it is very likely that there exist various network topologies that can deliver continuous power supply to consumers. The optimal transmission network topology could be very different for different system conditions. Transmission network topology control (TNTC) can provide the operator with an additional option to manage network congestion, reduce losses, relieve violations, and achieve cost savings. This paper examines the benefits of TNTC in reducing post-contingency overloads that are identified by real-time contingency analysis (RTCA). The procedure of RTCA with TNTC is presented and two algorithms are proposed to determine the candidate switching solutions. Both algorithms use available system data: sensitivity factors or shifting factors. The two proposed TNTC approaches are based on the transmission switching distribution factor (TSDF) and the flow transfer distribution factor (FTDF), respectively. The FTDF-based TNTC approach is an enhanced version of the TSDF-based TNTC approach that considers network flow distribution. Numerical simulations demonstrate that both methods can effectively relieve flow violations and that FTDF outperforms TSDF.
Submitted 16 December, 2019;
originally announced December 2019.
-
Performance Comparison of Quasi-Delay-Insensitive Asynchronous Adders
Authors:
P Balasubramanian
Abstract:
In this technical note, we provide a comparison of the design metrics of various quasi-delay-insensitive (QDI) asynchronous adders, where the adders correspond to diverse architectures. QDI adders are robust, and the objective of this technical note is to point to those QDI adders which are suitable for low power/energy and less area. This information could be valuable for a resource-constrained low power VLSI design scenario. Non-QDI adders are excluded from the comparison since they are not robust although they may have optimized design metrics. All the QDI adders were realized using a 32/28nm CMOS process.
Submitted 24 July, 2019;
originally announced July 2019.
-
Discovery of ST1 centers in natural diamond
Authors:
Priyadharshini Balasubramanian,
Mathias H. Metsch,
Prithvi Reddy,
Lachlan J. Rogers,
Neil B. Manson,
Marcus W. Doherty,
Fedor Jelezko
Abstract:
The ST1 center is a point defect in diamond with bright fluorescence and a mechanism for optical spin initialization and readout. The center has impressive potential for applications in diamond quantum computing as a quantum bus to a register of nuclear spins. This is because it has an exceptionally high readout contrast and, unlike the well-known nitrogen-vacancy center, it does not have a ground state electronic spin that decoheres the nuclear spins. However, its chemical structure is unknown and there are large gaps in our understanding of its properties. We present the discovery of ST1 centers in natural diamond. Our experiments identify interesting power dependence of the center's optical dynamics and reveal new electronic structure. We also present a theory of its electron-phonon interactions, which we combine with previous experiments, to shortlist likely candidates for its chemical structure.
Submitted 20 June, 2019; v1 submitted 11 June, 2019;
originally announced June 2019.
-
Indicating Asynchronous Multipliers
Authors:
P Balasubramanian,
D L Maskell,
N E Mastorakis
Abstract:
Multiplication is a basic arithmetic operation that is encountered in almost all general-purpose microprocessing and digital signal processing applications, and multiplication is physically realized using a multiplier. This paper discusses the physical implementation of indicating asynchronous multipliers, which are inherently elastic and are robust to timing, process, and parametric variations, and are modular. We consider the physical implementation of many weak-indication asynchronous multipliers using a 32/28-nm CMOS technology by adopting the array multiplier architecture. The multipliers are synthesized in a semi-custom ASIC-design style. The 4-phase return-to-zero (RTZ) and the 4-phase return-to-one (RTO) handshake protocols are considered for the data communication. The multipliers are realized using strong-indication or weak-indication full adders. Strong-indication 2-input AND function is used to generate the partial products in the case of both RTZ and RTO handshaking. The full adders considered are derived from different indicating asynchronous logic design methods. Among the multipliers considered, a weak-indication asynchronous multiplier utilizing the biased weak-indication full adder is found to be efficient in terms of the cycle time and the power-cycle time product with respect to both RTZ and RTO handshaking. Also, the 4-phase RTO handshake protocol is found to be preferable to the 4-phase RTZ handshake protocol for achieving enhanced optimizations in the design metrics.
Submitted 23 May, 2019;
originally announced May 2019.
-
Indicating Asynchronous Array Multipliers
Authors:
P Balasubramanian,
D L Maskell
Abstract:
Multiplication is an important arithmetic operation that is frequently encountered in microprocessing and digital signal processing applications, and multiplication is physically realized using a multiplier. This paper discusses the physical implementation of many indicating asynchronous array multipliers, which are inherently elastic and modular and are robust to timing, process and parametric variations. We consider the physical realization of many indicating asynchronous array multipliers using a 32/28nm CMOS technology. The weak-indication array multipliers comprise strong-indication or weak-indication full adders, and strong-indication 2-input AND functions to realize the partial products. The multipliers were synthesized in a semi-custom ASIC design style using standard library cells including a custom-designed 2-input C-element. 4x4 and 8x8 multiplication operations were considered for the physical implementations. The 4-phase return-to-zero (RTZ) and the 4-phase return-to-one (RTO) handshake protocols were utilized for data communication, and the delay-insensitive dual-rail code was used for data encoding. Among several weak-indication array multipliers, a weak-indication array multiplier utilizing a biased weak-indication full adder and the strong-indication 2-input AND function is found to have reduced cycle time and power-cycle time product with respect to RTZ and RTO handshaking for 4x4 and 8x8 multiplications. Further, the 4-phase RTO handshaking is found to be preferable to the 4-phase RTZ handshaking for achieving enhanced optimizations of the design metrics.
Submitted 14 May, 2019;
originally announced May 2019.
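As an illustration of the delay-insensitive dual-rail code mentioned in the abstract above, the following is a minimal Python sketch of encoding under RTZ handshaking. It assumes the commonly used convention in which each bit travels on two rails, with exactly one rail high for valid data and both rails low for the spacer. The function names and the (rail1, rail0) ordering are hypothetical choices for this sketch and are not taken from the paper; RTO handshaking is not modeled.

```python
# Illustrative model of dual-rail (1-of-2) encoding with an RTZ spacer.
# Rail ordering (rail1, rail0) and all names are assumptions for this sketch.

RTZ_SPACER = (0, 0)  # both rails low: no data present


def encode_bit(bit):
    """Map a single bit to a (rail1, rail0) pair with exactly one rail high."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit == 1 else (0, 1)


def decode_pair(pair):
    """Return the bit for a valid codeword, or None for the spacer."""
    if pair == RTZ_SPACER:
        return None
    if pair == (1, 0):
        return 1
    if pair == (0, 1):
        return 0
    raise ValueError("(1, 1) is not a valid codeword under RTZ")


def encode_word(value, width):
    """Encode an unsigned integer, least significant bit first."""
    return [encode_bit((value >> i) & 1) for i in range(width)]


def is_complete(pairs):
    """Data is complete once every bit position holds a valid codeword."""
    return all(decode_pair(p) is not None for p in pairs)
```

In this model a receiver would wait for `is_complete` to become true before acknowledging, then wait for every position to return to the spacer before accepting the next data word.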
-
Successive spin reorientation and rare earth ordering in Nd$_{0.5}$Dy$_{0.5}$FeO$_{3}$: Experimental and $Ab$-$initio$ investigations
Authors:
Ankita Singh,
Sarita Rajput,
Padmanabhan Balasubramanian,
M. Anas,
Francoise Damay,
C. M. N. Kumar,
Gaku Eguchi,
T. Maitra,
V. K. Malik
Abstract:
In the present study, the magnetic structure and spin reorientation of mixed doped orthoferrite Nd$_{0.5}$Dy$_{0.5}$FeO$_3$ have been investigated. Similar to both parent compounds (NdFeO$_3$ and DyFeO$_3$), the magnetic structure of Fe$^{3+}$ belongs to $Γ_{4}$ irreducible representation (G$_{x}$, F$_{z}$) at room temperature. The experimental measurements confirmed the spin reorientation where the magnetic structure of Fe$^{3+}$ changes from $Γ_{4}$ to $Γ_{2}$ (F$_{x}$, G$_{z}$) between 75 and 20\,K while maintaining G-type configuration. Such a gradual spin reorientation is unusual since the large single ion anisotropy of Dy$^{3+}$ ions causes an abrupt $Γ_{4}$ ${\rightarrow}$ $Γ_{1}$ (G$_{y}$) spin reorientation in DyFeO$_3$. Between 20 and 10\,K, the Fe$^{3+}$ magnetic structure is represented by $Γ_{2}$ (F$_{x}$, G$_{z}$). Unexpectedly, the magnetic structure of Fe$^{3+}$ with $Γ_{4}$ representation re-emerges below 10\,K, which also coincides with the development of rare-earth (Nd$^{3+}$/Dy$^{3+}$) magnetic ordering having C$_{y}$ configuration with magnetic moment of 1.8 $μ_{B}$. The absence of any signature of second order phase transition in the specific heat confirms the role of $R$(Nd$^{3+}$/Dy$^{3+}$)-Fe$^{3+}$ exchange interaction in the observed "rare-earth ordering", unlike DyFeO$_3$ where Dy$^{3+}$ ordering takes place independently of the magnetic ordering of Fe$^{3+}$. Our (DFT+U+SO) calculations show that the C-type arrangement of rare-earth ions (Nd$^{3+}$/Dy$^{3+}$) with $Γ_{2}$ configuration for Fe$^{3+}$ moments is the ground state whereas $Γ_{4}$ phase is energetically very close. Nd-Fe and Nd-Dy exchange interactions, estimated from DFT, are observed to have significant roles in the rare earth ordering and Fe spin reorientation, corroborating our experimental results.
Submitted 30 June, 2020; v1 submitted 11 April, 2019;
originally announced April 2019.
-
Speed and Energy Optimised Quasi-Delay-Insensitive Block Carry Lookahead Adder
Authors:
P. Balasubramanian,
D. L. Maskell,
N. E. Mastorakis
Abstract:
We present a new asynchronous quasi-delay-insensitive (QDI) block carry lookahead adder with redundancy carry (BCLARC) realized using delay-insensitive dual-rail data encoding and 4-phase return-to-zero (RTZ) and 4-phase return-to-one (RTO) handshaking. The proposed QDI BCLARC is found to be faster and more energy-efficient than the existing asynchronous adders which are QDI and non-QDI (i.e., relative-timed). Compared to existing asynchronous adders corresponding to various architectures such as ripple carry adder (RCA), conventional carry lookahead adder (CCLA), carry select adder (CSLA), BCLARC, and hybrid BCLARC-RCA, the proposed BCLARC is found to be faster and more energy-optimised. The cycle time (CT), which is the sum of forward and reverse latencies, governs the speed; and the product of average power dissipation and cycle time viz. the power-cycle time product (PCTP) defines the low power/energy efficiency. For a 32-bit addition, the proposed QDI BCLARC achieves the following average reductions in design metrics over its counterparts when considering RTZ and RTO handshaking: i) 20.5% and 19.6% reductions in CT and PCTP respectively compared to an optimum QDI early output RCA, ii) 16.5% and 15.8% reductions in CT and PCTP respectively compared to an optimum relative-timed RCA, iii) 32.9% and 35.9% reductions in CT and PCTP respectively compared to an optimum uniform input-partitioned QDI early output CSLA, iv) 47.5% and 47.2% reductions in CT and PCTP respectively compared to an optimum QDI early output CCLA, v) 14.2% and 27.3% reductions in CT and PCTP respectively compared to an optimum QDI early output BCLARC, and vi) 12.2% and 11.6% reductions in CT and PCTP respectively compared to an optimum QDI early output hybrid BCLARC-RCA. The adders were implemented using a 32/28nm CMOS technology.
Submitted 22 March, 2019;
originally announced March 2019.
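The two design metrics defined in the abstract above (cycle time as the sum of forward and reverse latencies, and PCTP as average power multiplied by cycle time) can be written out as a short Python sketch. The function names and units are assumptions chosen for illustration, and no measured values from the paper are reproduced here.

```python
# Sketch of the design metrics described in the abstract.
# Names and units (ns, microwatts) are illustrative assumptions.


def cycle_time(forward_latency_ns, reverse_latency_ns):
    """Cycle time (CT) is the sum of the forward and reverse latencies."""
    return forward_latency_ns + reverse_latency_ns


def power_cycle_time_product(avg_power_uw, cycle_time_ns):
    """PCTP is average power times cycle time (microwatts * ns = femtojoules)."""
    return avg_power_uw * cycle_time_ns


def percent_reduction(baseline, proposed):
    """Percentage reduction of a metric relative to a baseline design."""
    return 100.0 * (baseline - proposed) / baseline
```

A comparison between two adders would then compute `percent_reduction` once for CT and once for PCTP, which is the form in which the reductions in the abstract are reported.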
-
Majority and Minority Voted Redundancy for Safety-Critical Applications
Authors:
P Balasubramanian,
D L Maskell,
N E Mastorakis
Abstract:
A new majority and minority voted redundancy (MMR) scheme is proposed that can provide the same degree of fault tolerance as N-modular redundancy (NMR) but with fewer function units and a less sophisticated voting logic. Example NMR and MMR circuits were implemented using a 32/28nm CMOS process and compared. The results show that MMR circuits dissipate less power, occupy less area, and encounter less critical path delay than the corresponding NMR circuits while providing the same degree of fault tolerance. Hence the MMR is a promising alternative to the NMR to efficiently implement high levels of redundancy in safety-critical applications.
Submitted 26 January, 2019;
originally announced January 2019.
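For context on the baseline that the abstract above compares against, the following Python sketch models conventional N-modular redundancy with a simple majority voter. It is only an illustration of NMR under the usual assumption of an odd number of modules; the MMR scheme itself is not described in enough detail in the abstract to be sketched, and the function names here are hypothetical.

```python
# Behavioural sketch of N-modular redundancy (NMR) with a majority voter.
# This models the conventional baseline only, not the proposed MMR scheme.


def majority_vote(outputs):
    """Return the value produced by more than half of the modules."""
    if len(outputs) % 2 == 0:
        raise ValueError("NMR is modeled here with an odd number of modules")
    ones = sum(outputs)
    return 1 if ones > len(outputs) // 2 else 0


def tolerated_faults(num_modules):
    """An NMR system with N = 2k + 1 modules masks up to k faulty modules."""
    return (num_modules - 1) // 2
```

For example, triple modular redundancy (N = 3) masks one faulty module, and N = 5 masks two.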
-
Asynchronous Early Output Block Carry Lookahead Adder with Improved Quality of Results
Authors:
P Balasubramanian,
D L Maskell,
N E Mastorakis
Abstract:
A new asynchronous early output block carry lookahead adder (BCLA) incorporating redundant carries is proposed. Compared to the best of existing semi-custom asynchronous carry lookahead adders (CLAs) employing delay-insensitive data encoding and following a 4-phase handshaking, the proposed BCLA with redundant carries achieves 13% reduction in forward latency and 14.8% reduction in cycle time compared to the best of the existing CLAs featuring redundant carries with no area or power penalty. A hybrid variant involving a ripple carry adder (RCA) in the least significant stages i.e. BCLA-RCA is also considered that achieves a further 4% reduction in the forward latency and a 2.4% reduction in the cycle time compared to the proposed BCLA featuring redundant carries without area or power penalties.
Submitted 26 January, 2019;
originally announced January 2019.
-
Performance Comparison of some Synchronous Adders
Authors:
P Balasubramanian
Abstract:
This technical note compares the performance of some synchronous adders which correspond to the following architectures: i) ripple carry adder (RCA), ii) recursive carry lookahead adder (RCLA), iii) hybrid RCLA-RCA with the RCA used in the least significant adder bit positions, iv) block carry lookahead adder (BCLA), v) hybrid BCLA-RCA with the RCA used in the least significant adder bit positions, and vi) non-uniform input partitioned carry select adders (CSLAs) without and with the binary to excess-1 code (BEC) converter. The 32-bit addition was considered as an example operation. The adder architectures mentioned were implemented by targeting a typical case PVT specification (high threshold voltage, supply voltage of 1.05V and operating temperature of 25 degrees Celsius) of the Synopsys 32/28nm CMOS technology. The comparison leads to the following observations: i) the hybrid CCLA-RCA is preferable to the other adders in terms of the speed, the power-delay product, and the energy-delay product, ii) the non-uniform input partitioned CSLA without the BEC converter is preferable to the other adders in terms of the area-delay product, and iii) the RCA incorporating the full adder present in the standard digital cell library is preferable to the other adders in terms of the power-delay-area product.
Submitted 2 October, 2018;
originally announced October 2018.
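As a reference point for the simplest architecture listed in the abstract above, the following Python sketch models a ripple carry adder at the bit level: a chain of full adders in which each stage's carry output feeds the next stage. It is a functional model only, with hypothetical function names, and says nothing about the timing, power, or area figures that the technical note compares.

```python
# Bit-level functional model of a ripple carry adder (RCA).
# Names are illustrative; this does not model delay, power, or area.


def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, width=32, cin=0):
    """Add two unsigned integers by rippling the carry through each bit."""
    carry = cin
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result, carry
```

Because each stage waits for the carry from the previous one, the worst-case delay of this structure grows with the operand width, which is the motivation for the lookahead and carry select alternatives named in the abstract.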
-
Asynchronous Ripple Carry Adder based on Area Optimized Early Output Dual-Bit Full Adder
Authors:
P Balasubramanian
Abstract:
This technical note presents the design of a new area optimized asynchronous early output dual-bit full adder (DBFA). An asynchronous ripple carry adder (RCA) is constructed based on the new asynchronous DBFAs and existing asynchronous early output single-bit full adders (SBFAs). The asynchronous DBFAs and SBFAs incorporate redundant logic and are encoded using the delay-insensitive dual-rail code (i.e. homogeneous data encoding) and follow a 4-phase return-to-zero handshaking. Compared to the previous asynchronous RCAs involving DBFAs and SBFAs, which are based on homogeneous or heterogeneous delay-insensitive data encodings and which correspond to different timing models, the early output asynchronous RCA incorporating the proposed DBFAs and/or SBFAs is found to result in reduced area for the dual-operand addition operation and feature significantly less latency than the asynchronous RCAs which consist of only SBFAs. The proposed asynchronous DBFA requires 28.6% less silicon than the previously reported asynchronous DBFA. For a 32-bit asynchronous RCA, utilizing 2 stages of SBFAs in the least significant positions and 15 stages of DBFAs in the more significant positions leads to optimization in the latency. The new early output 32-bit asynchronous RCA containing DBFAs and SBFAs reports the following optimizations in design metrics over its counterparts: i) 18.8% reduction in area compared to a previously reported 32-bit early output asynchronous RCA which also has 15 stages of DBFAs and 2 stages of SBFAs, and ii) 29.4% reduction in latency compared to a 32-bit early output asynchronous RCA containing only SBFAs.
Submitted 24 July, 2018;
originally announced July 2018.
-
Comments on "Dual-rail asynchronous logic multi-level implementation"
Authors:
P Balasubramanian
Abstract:
In this research communication, we comment on "Dual-rail asynchronous logic multi-level implementation" [Integration, the VLSI Journal 47 (2014) 148-159] by expounding the problematic issues, and provide some clarifications on delay-insensitivity, robust asynchronous logic, multi-level decomposition, and physical implementation.
Submitted 31 January, 2018;
originally announced February 2018.
-
Approximate Early Output Asynchronous Adders Based on Dual-Rail Data Encoding and 4-Phase Return-to-Zero and Return-to-One Handshaking
Authors:
P Balasubramanian
Abstract:
Approximate computing is emerging as an alternative to accurate computing due to its potential for realizing digital circuits and systems with low power dissipation, less critical path delay, and less area occupancy for an acceptable trade-off in the accuracy of results. In the domain of computer arithmetic, several approximate adders and multipliers have been designed and their potential has been showcased versus accurate adders and multipliers for practical digital signal processing applications. Nevertheless, in the existing literature, almost all the approximate adders and multipliers reported correspond to the synchronous design method. In this work, we consider robust asynchronous i.e. quasi-delay-insensitive realizations of approximate adders by employing delay-insensitive codes for data representation and processing, and the 4-phase handshake protocols for data communication. The 4-phase handshake protocols used are the return-to-zero and the return-to-one protocols. Specifically, we consider the implementations of 32-bit approximate adders based on the return-to-zero and return-to-one handshake protocols by adopting the delay-insensitive dual-rail code for data encoding. We consider a range of approximations varying from 4-bits to 20-bits for the least significant positions of the accurate 32-bit asynchronous adder. The asynchronous adders correspond to early output (i.e. early reset) type, which are based on the well-known ripple carry adder architecture. The experimental results show that approximate asynchronous adders achieve reductions in the design metrics such as latency, cycle time, average power dissipation, and silicon area compared to the accurate asynchronous adders. Further, the reductions in the design metrics are greater for the return-to-one protocol compared to the return-to-zero protocol. The design metrics were estimated using a 32/28nm CMOS technology.
Submitted 17 January, 2018;
originally announced January 2018.
-
Critique of "Asynchronous Logic Implementation Based on Factorized DIMS"
Authors:
P Balasubramanian
Abstract:
This paper comments on "Asynchronous Logic Implementation Based on Factorized DIMS" [Journal of Circuits, Systems, and Computers, vol. 26, no. 5, 1750087: 1-9, May 2017] with respect to two main problematic issues: i) the gate orphan problem implicit in the factorized DIMS approach discussed in the referenced article which affects its strong-indication, and ii) how the enumeration of product terms to represent the synthesis cost is skewed in the referenced article because the logic expression contains sum of products and also product of sums. It is observed that the referenced article has not provided a general logic synthesis algorithm excepting only an example illustration involving a 3-input AND logic function. The absence of a general logic synthesis algorithm would make it difficult to reproduce the research described in the referenced article. Moreover, the example illustration in the referenced article describes an unsafe logic decomposition which is not suitable for the multi-level synthesis of strong-indication asynchronous circuits. Further, a logic synthesis method which safely decomposes the DIMS solution to synthesize multi-level strong-indication asynchronous circuits is available in the existing literature, which was neither cited nor taken up for comparison in the referenced article, which is another drawback. Subsequently, it is concluded that the referenced article has not advanced existing knowledge in the field but, on the contrary, has caused confusion. Hence, in the interest of readers, this paper additionally highlights some important and relevant literature which provides valuable information about robust asynchronous circuit synthesis techniques which employ delay-insensitive codes for data representation and processing and the 4-phase return-to-zero handshake protocol for data communication.
Submitted 23 September, 2018; v1 submitted 7 November, 2017;
originally announced November 2017.
-
Electronic structure of Pr2MnNiO6 from x-ray photoemission, absorption and density functional theory
Authors:
Padmanabhan Balasubramanian,
Shalik Ram Joshi,
Ruchika Yadav,
Frank M. F. de Groot,
Amit Kumar Singh,
Avijeet Ray,
Mukul Gupta,
Ankita Singh,
Suja Elizabeth,
Shikha Varma,
Tulika Maitra,
Vivek Malik
Abstract:
The electronic structure of double perovskite Pr2MnNiO6 is studied using core x-ray photoelectron spectroscopy and x-ray absorption spectroscopy. The 2p x-ray absorption spectra show that Ni and Mn are in 2+ and 4+ states respectively. Using charge transfer multiplet analysis of Ni and Mn 2p XPS spectra, we find charge transfer energies Δ of 3.5 and 2.5 eV for Ni and Mn respectively. The ground states of Ni2+ and Mn4+ reveal higher d electron counts of 8.21 and 3.38 respectively, as compared to the atomic values of 8.00 and 3.00 respectively, thereby indicating the covalent nature of the system. The O 1s edge absorption spectra reveal a band gap of 0.9 eV which is comparable to the value obtained from first principle calculations for U-J >= 2 eV. The density of states clearly reveals a strong p-d type charge transfer character of the system, with band gap proportional to average charge transfer energy of Ni2+ and Mn4+ ions.
Submitted 24 October, 2017;
originally announced October 2017.
-
Approximate Ripple Carry and Carry Lookahead Adders - A Comparative Analysis
Authors:
P Balasubramanian,
C Dang,
D L Maskell,
K Prasad
Abstract:
Approximate ripple carry adders (RCAs) and carry lookahead adders (CLAs) are presented and compared with accurate RCAs and CLAs for performing a 32-bit addition. The accurate and approximate RCAs and CLAs are implemented using a 32/28nm CMOS process. Approximations ranging from 4- to 20-bits are considered for the less significant adder bit positions. The simulation results show that approximate RCAs report reductions in the power-delay product (PDP) ranging from 19.5% to 82% compared to the accurate RCA for approximation sizes varying from 4- to 20-bits. Also, approximate CLAs report reductions in PDP ranging from 16.7% to 74.2% compared to the accurate CLA for approximation sizes varying from 4- to 20-bits. On average, for the approximation sizes considered, it is observed that approximate CLAs achieve a 46.5% reduction in PDP compared to the approximate RCAs. Hence, approximate CLAs are preferable over approximate RCAs for the low power implementation of approximate computer arithmetic.
Submitted 15 October, 2017;
originally announced October 2017.
-
Asynchronous Early Output Section-Carry Based Carry Lookahead Adder with Alias Carry Logic
Authors:
P Balasubramanian,
C Dang,
D L Maskell,
K Prasad
Abstract:
A new asynchronous early output section-carry based carry lookahead adder (SCBCLA) with alias carry output logic is presented in this paper. To evaluate the proposed SCBCLA with alias carry logic and to make a comparison with other CLAs, a 32-bit addition operation is considered. Compared to the weak-indication SCBCLA with alias logic, the proposed early output SCBCLA with alias logic reports a 13% reduction in area without any increases in latency and power dissipation. On the other hand, in comparison with the early output recursive CLA (RCLA), the proposed early output SCBCLA with alias logic reports a 16% reduction in latency while occupying almost the same area and dissipating almost the same average power. All the asynchronous CLAs are quasi-delay-insensitive designs which incorporate the delay-insensitive dual-rail data encoding and adhere to the 4-phase return-to-zero handshaking. The adders were realized and the simulations were performed based on a 32/28nm CMOS process.
Submitted 15 October, 2017;
originally announced October 2017.
-
Mathematical Estimation of Logical Masking Capability of Majority/Minority Gates Used in Nanoelectronic Circuits
Authors:
P Balasubramanian,
R T Naayagi
Abstract:
In nanoelectronic circuit synthesis, the majority gate and the inverter form the basic combinational logic primitives. This paper deduces the mathematical formulae to estimate the logical masking capability of majority gates, which are used extensively in nanoelectronic digital circuit synthesis. The mathematical formulae derived to evaluate the logical masking capability of majority gates hold well for minority gates, and a comparison with the logical masking capability of conventional gates such as NOT, AND/NAND, OR/NOR, and XOR/XNOR is provided. It is inferred from this research work that the logical masking capability of majority/minority gates is similar to that of XOR/XNOR gates, and with an increase of fan-in the logical masking capability of majority/minority gates also increases.
Submitted 21 July, 2017;
originally announced July 2017.
-
Redundant Logic Insertion and Fault Tolerance Improvement in Combinational Circuits
Authors:
P Balasubramanian,
R T Naayagi
Abstract:
This paper presents a novel method to identify and insert redundant logic into a combinational circuit to improve its fault tolerance without having to replicate the entire circuit as is the case with conventional redundancy techniques. In this context, it is discussed how to estimate the fault masking capability of a combinational circuit using the truth-cum-fault enumeration table, and then it is shown how to identify the logic that can be introduced to add redundancy into the original circuit without affecting its native functionality and with the aim of improving its fault tolerance, though this would involve some trade-off in the design metrics. However, care should be taken while introducing redundant logic since redundant logic insertion may give rise to new internal nodes and faults on those may impact the fault tolerance of the resulting circuit. The combinational circuit that is considered and its redundant counterparts are all implemented in semi-custom design style using a 32/28nm CMOS digital cell library and their respective design metrics and fault tolerances are compared.
Submitted 21 July, 2017;
originally announced July 2017.
-
Latency Optimized Asynchronous Early Output Ripple Carry Adder based on Delay-Insensitive Dual-Rail Data Encoding
Authors:
P Balasubramanian,
K Prasad
Abstract:
Asynchronous circuits employing delay-insensitive codes for data representation i.e. encoding and following a 4-phase return-to-zero protocol for handshaking are generally robust. Depending upon whether a single delay-insensitive code or multiple delay-insensitive code(s) are used for data encoding, the encoding scheme is called homogeneous or heterogeneous delay-insensitive data encoding. This article proposes a new latency optimized early output asynchronous ripple carry adder (RCA) that utilizes single-bit asynchronous full adders (SAFAs) and dual-bit asynchronous full adders (DAFAs) which incorporate redundant logic and are based on the delay-insensitive dual-rail code i.e. homogeneous data encoding, and follow a 4-phase return-to-zero handshaking. Amongst various RCA, carry lookahead adder (CLA), and carry select adder (CSLA) designs, which are based on homogeneous or heterogeneous delay-insensitive data encodings which correspond to the weak-indication or the early output timing model, the proposed early output asynchronous RCA that incorporates SAFAs and DAFAs with redundant logic is found to result in reduced latency for a dual-operand addition operation. In particular, for a 32-bit asynchronous RCA, utilizing 15 stages of DAFAs and 2 stages of SAFAs leads to reduced latency. The theoretical worst-case latencies of the different asynchronous adders were calculated by taking into account the typical gate delays of a 32/28nm CMOS digital cell library, and a comparison is made with their practical worst-case latencies estimated. The theoretical and practical worst-case latencies show a close correlation....
Submitted 13 June, 2017;
originally announced June 2017.
-
Asynchronous Early Output Dual-Bit Full Adders Based on Homogeneous and Heterogeneous Delay-Insensitive Data Encoding
Authors:
P Balasubramanian,
K Prasad
Abstract:
This paper presents the designs of asynchronous early output dual-bit full adders without and with redundant logic (implicit) corresponding to homogeneous and heterogeneous delay-insensitive data encoding. For homogeneous delay-insensitive data encoding only dual-rail i.e. 1-of-2 code is used, and for heterogeneous delay-insensitive data encoding 1-of-2 and 1-of-4 codes are used. The 4-phase return-to-zero protocol is used for handshaking. To demonstrate the merits of the proposed dual-bit full adder designs, 32-bit ripple carry adders (RCAs) are constructed comprising dual-bit full adders. The proposed dual-bit full adders based 32-bit RCAs incorporating redundant logic feature reduced latency and area compared to their non-redundant counterparts with no accompanying power penalty. In comparison with the weakly indicating 32-bit RCA constructed using homogeneously encoded dual-bit full adders containing redundant logic, the early output 32-bit RCA comprising the proposed homogeneously encoded dual-bit full adders with redundant logic reports corresponding reductions in latency and area by 22.2% and 15.1% with no associated power penalty. On the other hand, the early output 32-bit RCA constructed using the proposed heterogeneously encoded dual-bit full adder which incorporates redundant logic reports respective decreases in latency and area than the weakly indicating 32-bit RCA that consists of heterogeneously encoded dual-bit full adders with redundant logic by 21.5% and 21.3% with nil power overhead. The simulation results obtained are based on a 32/28nm CMOS process technology.
Submitted 25 April, 2017;
originally announced April 2017.
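The dual-rail (1-of-2) and 1-of-4 codes named in the abstract above can be illustrated with a small sketch. The Python below is only an illustration under stated assumptions: the wire ordering, the all-zeros spacer, and the function names (`encode_dual_rail`, `encode_1_of_4`, `decode`) are hypothetical choices and are not taken from the paper.

```python
# Illustrative sketch of delay-insensitive codes.
# Wire ordering, spacer choice and all names are assumptions, not the paper's.

SPACER_DUAL_RAIL = (0, 0)        # all wires low: no data present
SPACER_1_OF_4 = (0, 0, 0, 0)


def encode_dual_rail(bit):
    """Encode one bit on two wires (d0, d1); exactly one wire goes high."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit == 0 else (0, 1)


def encode_1_of_4(bit_hi, bit_lo):
    """Encode two bits on four wires; the wire at index 2*bit_hi + bit_lo goes high."""
    if bit_hi not in (0, 1) or bit_lo not in (0, 1):
        raise ValueError("bits must be 0 or 1")
    index = 2 * bit_hi + bit_lo
    return tuple(1 if i == index else 0 for i in range(4))


def decode(wires):
    """Return the encoded value, or None for the spacer; reject illegal codewords."""
    high = sum(wires)
    if high == 0:
        return None              # spacer (the return-to-zero phase)
    if high != 1:
        raise ValueError("illegal codeword: more than one wire high")
    return wires.index(1)
```

In a 4-phase return-to-zero handshake, each valid codeword is followed by the spacer before the next data item is applied, which is why `decode` treats the all-zeros word as "no data" rather than as a value.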
-
FPGA Based Implementation of Distributed Minority and Majority Voting Based Redundancy for Mission and Safety-Critical Applications
Authors:
P Balasubramanian,
N E Mastorakis
Abstract:
Electronic circuits and systems used in mission and safety-critical applications usually employ redundancy in the design to overcome arbitrary fault(s) or failure(s) and guarantee the correct operation. In this context, the distributed minority and majority voting based redundancy (DMMR) scheme forms an efficient alternative to the conventional N-modular redundancy (NMR) scheme for implementing mission and safety-critical circuits and systems by significantly minimizing their weight and design cost and also their design metrics whilst providing a similar degree of fault tolerance. This article presents the first FPGA-based implementation of example DMMR circuits and compares them with counterpart NMR circuits on the basis of area occupancy and critical path delay viz. area-delay product (ADP). The example DMMR circuits and counterpart NMR circuits are able to accommodate the faulty or failure states of 2, 3 and 4 function modules. For physical synthesis, two commercial Xilinx FPGAs viz. Spartan 3E and Virtex 5 corresponding to 90nm and 65nm CMOS processes, and two radiation-tolerant and military grade Xilinx FPGAs viz. QPro Virtex 2 and QPro Virtex E corresponding to 150nm and 180nm CMOS processes were considered for the NMR and DMMR circuit realizations which employ the 4-by-4 array multiplier as a representative function module. To achieve a fault tolerance of 2 function modules, both the DMMR and the NMR schemes provide nearly similar mean ADPs across all the four FPGAs. However, while achieving a fault tolerance of 3 function modules, the DMMR features reduced ADP by 44.5% on average compared to the NMR, and in achieving a fault tolerance of 4 function modules, the DMMR reports reduced ADP by 56.5% on average compared to the NMR with respect to all the four FPGAs considered.
Submitted 28 November, 2016;
originally announced November 2016.
-
Bi-modal First Impressions Recognition using Temporally Ordered Deep Audio and Stochastic Visual Features
Authors:
Arulkumar Subramaniam,
Vismay Patel,
Ashish Mishra,
Prashanth Balasubramanian,
Anurag Mittal
Abstract:
We propose a novel approach for First Impressions Recognition in terms of the Big Five personality traits from short videos. The Big Five is a model that describes human personality using five broad categories: Extraversion, Agreeableness, Conscientiousness, Neuroticism and Openness. We train two bi-modal end-to-end deep neural network architectures using temporally ordered audio and novel stochastic visual features from few frames, without over-fitting. We empirically show that the trained models perform exceptionally well, even after training on small sub-portions of the inputs. Our method was evaluated in the ChaLearn LAP 2016 Apparent Personality Analysis (APA) competition using the ChaLearn LAP APA2016 dataset and achieved excellent performance.
Submitted 31 October, 2016;
originally announced October 2016.
-
System Reliability, Fault Tolerance and Design Metrics Tradeoffs in the Distributed Minority and Majority Voting Based Redundancy Scheme
Authors:
P Balasubramanian
Abstract:
The distributed minority and majority voting based redundancy (DMMR) scheme was recently proposed as an efficient alternative to the conventional N-modular redundancy (NMR) scheme for the physical design of mission/safety-critical circuits and systems. The DMMR scheme enables significant improvements in fault tolerance and design metrics compared to the NMR scheme albeit at the expense of a slight decrease in the system reliability. In this context, this paper studies the system reliability, fault tolerance and design metrics tradeoffs in the DMMR scheme compared to the NMR scheme when the majority logic group of the DMMR scheme is increased in size relative to the minority logic group. Some example DMMR and NMR systems were realized using a 32/28nm CMOS process and compared. The results show that 5-of-M DMMR systems have a similar or better fault tolerance whilst requiring similar or fewer function modules than their counterpart NMR systems and simultaneously achieve optimizations in design metrics. Nevertheless, 3-of-M DMMR systems have the upper hand with respect to fault tolerance and design metrics optimizations than the comparable NMR and 5-of-M DMMR systems. With regard to system reliability, NMR systems are closely followed by 5-of-M DMMR systems which are closely followed by 3-of-M DMMR systems. The verdict is 3-of-M DMMR systems are preferable to implement higher levels of redundancy from a combined system reliability, fault tolerance and design metrics perspective to realize mission/safety-critical circuits and systems.
Submitted 8 November, 2017; v1 submitted 25 August, 2016;
originally announced August 2016.
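For orientation, the system reliability of a conventional NMR system can be sketched with the textbook binomial model. The sketch assumes N identical, independent function modules, each with reliability r, a perfect voter, and odd N. It does not model the DMMR scheme, whose reliability expression is not given in the abstract. The function name `nmr_reliability` is hypothetical.

```python
from math import comb


def nmr_reliability(n, r):
    """Reliability of an N-modular redundant system under textbook assumptions.

    Assumes n identical, independent modules with reliability r and a perfect
    majority voter: the system works when more than half of the modules work.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    majority = n // 2 + 1
    # Sum the binomial probabilities of having at least a majority of working modules.
    return sum(comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(majority, n + 1))
```

For n = 3 this reduces to the familiar TMR expression 3r^2 - 2r^3.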
-
Early Output Hybrid Input Encoded Asynchronous Full Adder and Relative-Timed Ripple Carry Adder
Authors:
P Balasubramanian,
K Prasad
Abstract:
This paper presents a new early output hybrid input encoded asynchronous full adder designed using dual-rail and 1-of-4 delay-insensitive data codes. The proposed full adder when cascaded to form a ripple carry adder (RCA) necessitates the use of a small relative-timing assumption with respect to the internal carries, which is independent of the RCA size. The forward latency of the proposed hybrid input encoded full adder based RCA is data-dependent while its reverse latency is the least equaling the propagation delay of just one full adder. Compared to the best of the existing hybrid input encoded full adders based 32-bit RCAs, the proposed early output hybrid input encoded full adder based 32-bit RCA enables respective reductions in forward latency and area by 7.9% and 5.6% whilst dissipating the same average power; in terms of the theoretically computed cycle time, the latter reports a 10.9% reduction compared to the former.
Submitted 3 August, 2016;
originally announced August 2016.
-
Design of Synchronous Section-Carry Based Carry Lookahead Adders with Improved Figure of Merit
Authors:
P Balasubramanian
Abstract:
The section-carry based carry lookahead adder (SCBCLA) architecture was proposed as an efficient alternative to the conventional carry lookahead adder (CCLA) architecture for the physical implementation of computer arithmetic. In previous related works, self-timed SCBCLA architectures and synchronous SCBCLA architectures were realized using standard cells and FPGAs. In this work, we deal with improved realizations of synchronous SCBCLA architectures designed in a semi-custom fashion using standard cells. The improvement is quantified in terms of a figure of merit (FOM), where the FOM is defined as the inverse product of power, delay and area. Since power, delay and area of digital designs are desirable to be minimized, the FOM is desirable to be maximized. Starting from an efficient conventional carry lookahead generator, we show how an optimized section-carry based carry lookahead generator is realized. In comparison with our recent work dealing with standard cells based implementation of SCBCLAs to perform 32-bit addition of two binary operands, we show in this work that with improved section-carry based carry lookahead generators, the resulting SCBCLAs exhibit significant improvements in FOM. Compared to the earlier optimized hybrid SCBCLA, the proposed optimized hybrid SCBCLA improves the FOM by 88.3%. Even the optimized hybrid CCLA features improvement in FOM by 77.3% over the earlier optimized hybrid CCLA. However, the proposed optimized hybrid SCBCLA is still the winner and has a better FOM than the currently optimized hybrid CCLA by 15.3%. All the CCLAs and SCBCLAs are implemented to realize 32-bit dual-operand binary addition using a 32/28nm CMOS process.
Submitted 8 November, 2017; v1 submitted 17 June, 2016;
originally announced June 2016.
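The figure of merit defined in the abstract above, the inverse of the product of power, delay and area, is simple to compute. The sketch below is illustrative only: the function names are hypothetical, and the relative-change formula used for the percentage improvement is an assumption, since the abstract does not state how its percentages were derived.

```python
def figure_of_merit(power, delay, area):
    """FOM = 1 / (power * delay * area); a larger value is better."""
    if min(power, delay, area) <= 0:
        raise ValueError("power, delay and area must be positive")
    return 1.0 / (power * delay * area)


def percent_improvement(fom_new, fom_old):
    """Relative FOM gain in percent (assumed definition: change relative to the old FOM)."""
    if fom_old <= 0:
        raise ValueError("fom_old must be positive")
    return 100.0 * (fom_new - fom_old) / fom_old
```

Because the three metrics are multiplied, the FOM is insensitive to which metric was improved: halving any one of them doubles the FOM, provided the units are kept consistent across the designs being compared.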
-
A Fault Tolerance Improved Majority Voter for TMR System Architectures
Authors:
P Balasubramanian,
K Prasad
Abstract:
For digital system designs, triple modular redundancy (TMR), which is a 3-tuple version of N-modular redundancy, is widely preferred for many mission-control and safety-critical applications. The TMR scheme involves two-times duplication of the simplex system hardware, with a majority voter ensuring correctness provided at least two out of three copies of the system remain operational. Thus the majority voter plays a pivotal role in ensuring the correct operation of the system. The fundamental assumption implicit in the TMR scheme is that the majority voter does not become faulty, which may not hold well for implementations based on the latest technology nodes with dimensions of the order of just tens of nanometers. To overcome the drawbacks of the classical majority voter, some new voter designs were put forward in the literature with the aim of enhancing the fault tolerance. However, these voter designs generally ensure the correct system operation in the presence of either a faulty function module or the faulty voter, considered only in isolation. Since multiple faults may no longer be excluded in the nanoelectronics regime, simultaneous fault occurrences on both the function module and the voter should be considered, and the fault tolerance of the voters has to be analyzed under such a scenario. In this context, this article proposes a new fault-tolerant majority voter which is found to be more robust to faults than the existing voters in the presence of faults occurring internally and/or externally to the voter. Moreover, the proposed voter features lower power dissipation, delay, and area based on the simulation results obtained by using a 32/28nm CMOS process.
Submitted 8 November, 2017; v1 submitted 12 May, 2016;
originally announced May 2016.
-
An Asynchronous Early Output Full Adder and a Relative-Timed Ripple Carry Adder
Authors:
P Balasubramanian
Abstract:
This article presents the design of a new asynchronous early output full adder which when cascaded leads to a relative-timed ripple carry adder (RCA). The relative-timed RCA requires imposing a very small relative-timing assumption to overcome the problem of gate orphans associated with internal carry propagation. The relative-timing assumption is however independent of the RCA size. The primary benefits of the relative-timed RCA are that the processing of valid data incurs data-dependent forward latency, while the processing of the spacer involves a very fast constant-time reverse latency of just 1 full adder delay, which represents the ultimate in the design of an asynchronous RCA with the fastest reset. The secondary benefit of the relative-timed RCA is that it achieves good optimization of power and area metrics simultaneously. A 32-bit relative-timed RCA constructed using the proposed early output full adder achieves respective reductions in forward latency by 67%, 10% and 3.5% compared to the optimized strong-indication, weak-indication, and early output 32-bit asynchronous RCAs existing in the literature. Based on a similar comparison, the proposed 32-bit relative-timed RCA achieves corresponding reductions in cycle time by 83%, 12.7% and 6.4%. In terms of area, the proposed 32-bit relative-timed RCA occupies 27% less silicon than its optimized strong-indication counterpart and 17% less silicon than its optimized weak-indication counterpart, and features increased area occupancy by a meager 1% compared to the optimized early output 32-bit asynchronous RCA. The average power dissipation of all the asynchronous 32-bit RCAs is found to be comparable since they all satisfy the monotonic cover constraint. The simulation results obtained correspond to a 32/28nm CMOS process.
Submitted 8 November, 2017; v1 submitted 12 May, 2016;
originally announced May 2016.
-
Real-Time Contingency Analysis with Corrective Transmission Switching - Part II: Results and Discussion
Authors:
Xingpeng Li,
Mostafa Sahraei-Ardakani,
Pranavamoorthy Balasubramanian,
Mojdeh Abdi-Khorsand,
Kory W. Hedman,
Robin Podmore
Abstract:
This paper presents the performance of an AC transmission switching (TS) based real-time contingency analysis (RTCA) tool that is introduced in Part I of this paper. The approach quickly proposes high quality corrective switching actions for relief of potential post-contingency network violations. The approach is confirmed by testing it on actual EMS snapshots of two large-scale systems, the Electric Reliability Council of Texas (ERCOT) and the Pennsylvania New Jersey Maryland (PJM) Interconnection; the approach is also tested on data provided by the Tennessee Valley Authority (TVA). The results show that the tool effectively reduces post-contingency violations. Fast heuristics are used along with parallel computing to reduce the computational difficulty of the problem. The tool is able to handle the PJM system in about five minutes with a standard desktop computer. Time-domain simulations are performed to check system stability with corrective transmission switching (CTS). In conclusion, the paper shows that corrective switching is ripe for industry adoption. CTS can provide significant reliability benefits that can be translated into significant cost savings.
Submitted 15 April, 2016;
originally announced April 2016.
-
Real-Time Contingency Analysis with Corrective Transmission Switching - Part I: Methodology
Authors:
Xingpeng Li,
Pranavamoorthy Balasubramanian,
Mostafa Sahraei-Ardakani,
Mojdeh Abdi-Khorsand,
Kory W. Hedman,
Robin Podmore
Abstract:
Transmission switching (TS) has gained significant attention recently. However, barriers still remain and must be overcome before the technology can be adopted by the industry. The state of the art challenges include AC feasibility and performance, computational complexity, the ability to handle large-scale real power systems, and dynamic stability. This two-part paper investigates these challenges by developing an AC TS-based real-time contingency analysis (RTCA) tool that can handle large-scale systems within a reasonable time. The tool proposes multiple corrective switching actions, after detection of a contingency with potential violations. To reduce the computational complexity, three heuristic algorithms are proposed to generate a small set of candidates for switching. Parallel computing is implemented to further speed up the solution time. Furthermore, stability analysis is performed to check for dynamic stability of proposed TS solutions. Part I of the paper presents a comprehensive literature review and the methodology. The promising results, tested on the Tennessee Valley Authority (TVA) system and actual energy management system (EMS) snapshots from Pennsylvania New Jersey Maryland (PJM) and the Electric Reliability Council of Texas (ERCOT), are presented in Part II. It is concluded that RTCA with corrective TS significantly reduces potential post-contingency violations and is ripe for industry adoption.
Submitted 15 April, 2016;
originally announced April 2016.
-
Area/latency optimized early output asynchronous full adders and relative-timed ripple carry adders
Authors:
P Balasubramanian,
S Yamashita
Abstract:
This article presents two area/latency optimized gate level asynchronous full adder designs which correspond to early output logic. The proposed full adders are constructed using the delay-insensitive dual-rail code and adhere to the four-phase return-to-zero handshaking. For an asynchronous ripple carry adder (RCA) constructed using the proposed early output full adders, the relative-timing assumption becomes necessary and the inherent advantages of the relative-timed RCA are: (1) computation with valid inputs, i.e., forward latency is data-dependent, and (2) computation with spacer inputs involves a bare minimum constant reverse latency of just one full adder delay, thus resulting in the optimal cycle time. With respect to different 32-bit RCA implementations, and in comparison with the optimized strong-indication, weak-indication, and early output full adder designs, one of the proposed early output full adders achieves respective reductions in latency by 67.8, 12.3 and 6.1 %, while the other proposed early output full adder achieves corresponding reductions in area by 32.6, 24.6 and 6.9 %, with practically no power penalty. Further, the proposed early output full adders based asynchronous RCAs enable minimum reductions in cycle time by 83.4, 15, and 8.8 % when considering carry-propagation over the entire RCA width of 32-bits, and maximum reductions in cycle time by 97.5, 27.4, and 22.4 % for the consideration of a typical carry chain length of 4 full adder stages, when compared to the least of the cycle time estimates of various strong-indication, weak-indication, and early output asynchronous RCAs of similar size. All the asynchronous full adders and RCAs were realized using standard cells in a semi-custom design fashion based on a 32/28 nm CMOS process technology.
Submitted 13 April, 2016;
originally announced April 2016.
-
Power, Delay and Area Comparisons of Majority Voters relevant to TMR Architectures
Authors:
P Balasubramanian,
N E Mastorakis
Abstract:
N-modular redundancy (NMR) is commonly used to enhance the fault tolerance of a circuit/system, when subject to a fault-inducing environment such as in space or military systems, where upsets due to radiation phenomena, temperature and/or other environmental conditions are anticipated. Triple Modular Redundancy (TMR), which is a 3-tuple version of NMR, is widely preferred for mission-control space, military, and aerospace, and safety-critical nuclear, power, medical, and industrial control and automation systems. The TMR scheme involves the two-times duplication of a simplex system hardware, with a majority voter ensuring correctness provided at least two out of three copies of the hardware remain operational. Thus the majority voter plays a pivotal role in ensuring the correct operation of the TMR scheme. In this paper, a number of standard-cell based majority voter designs relevant to TMR architectures are presented, and their power, delay and area parameters are estimated based on physical realization using a 32/28nm CMOS process.
Submitted 25 March, 2016;
originally announced March 2016.
-
Global versus Local Weak-Indication Self-Timed Function Blocks - A Comparative Analysis
Authors:
P Balasubramanian,
N E Mastorakis
Abstract:
This paper analyzes the merits and demerits of global weak-indication self-timed function blocks versus local weak-indication self-timed function blocks, implemented using a delay-insensitive data code and adhering to 4-phase return-to-zero handshaking. A self-timed ripple carry adder is considered as an example function block for the analysis. The analysis shows that while global weak-indication could help in optimizing the power, latency and area parameters, local weak-indication facilitates the optimum performance in terms of realizing the data-dependent cycle time that is characteristic of a weak-indication self-timed design.
Submitted 25 March, 2016;
originally announced March 2016.
-
ASIC-based Implementation of Synchronous Section-Carry Based Carry Lookahead Adders
Authors:
P Balasubramanian,
N E Mastorakis
Abstract:
The section-carry based carry lookahead adder (SCBCLA) topology was proposed as an improved high-speed alternative to the conventional carry lookahead adder (CCLA) topology in previous works. Self-timed and FPGA-based implementations of SCBCLAs and CCLAs were considered earlier, and it was found that SCBCLAs could help in delay reduction i.e. pave the way for improved speed compared to CCLAs at the expense of some increase in area and/or power parameters. In this work, we consider semi-custom ASIC-based implementations of different variants of SCBCLAs and CCLAs to perform 32-bit dual-operand addition. Based on the simulation results for 32-bit dual-operand addition obtained by targeting a high-end 32/28nm CMOS process, it is found that an optimized SCBCLA architecture reports a 9.8% improvement in figure-of-merit (FOM) compared to an optimized CCLA architecture, where the FOM is defined as the inverse of the product of power, delay, and area. It is generally inferred from the simulations that the SCBCLA architecture could be more beneficial compared to the CCLA architecture in terms of the design metrics whilst benefitting a variety of computer arithmetic operations involving dual-operand and/or multi-operand additions. Also, it is observed that heterogeneous CLA architectures tend to fare well compared to homogeneous CLA architectures, as substantiated by the simulation results.
Submitted 25 March, 2016;
originally announced March 2016.
-
Quantum metrology enhanced by repetitive quantum error correction
Authors:
Thomas Unden,
Priya Balasubramanian,
Daniel Louzon,
Yuval Vinkler,
Martin B. Plenio,
Matthew Markham,
Daniel Twitchen,
Igor Lovchinsky,
Alexander O. Sushkov,
Mikhail D. Lukin,
Alex Retzker,
Boris Naydenov,
Liam P. McGuinness,
Fedor Jelezko
Abstract:
The accumulation of quantum phase in response to a signal is the central mechanism of quantum sensing; as such, loss of phase information presents a fundamental limitation. For this reason, approaches to extend quantum coherence in the presence of noise are actively being explored. Here we experimentally protect a room-temperature hybrid spin register against environmental decoherence by performing repeated quantum error correction whilst maintaining sensitivity to signal fields. We use a long-lived nuclear spin to correct multiple phase errors on a sensitive electron spin in diamond and realize magnetic field sensing beyond the timescales set by natural decoherence. The universal extension of sensing time, robust to noise at any frequency, demonstrates the definitive advantage entangled multi-qubit systems provide for quantum sensing and offers an important complement to quantum control techniques. In particular, our work opens the door for detecting minute signals in the presence of high frequency noise, where standard protocols reach their limits.
Submitted 23 February, 2016;
originally announced February 2016.
-
Strong driving of a single spin using arbitrarily polarized fields
Authors:
P. London,
P. Balasubramanian,
B. Naydenov,
L. P. McGuinness,
F. Jelezko
Abstract:
The strong driving regime occurs when a quantum two-level system is driven with an external field whose amplitude is greater than or equal to the energy splitting between the system's states, and is typically identified with the breaking of the rotating wave approximation (RWA). We report an experimental study in which the spin of a single nitrogen-vacancy (NV) center in diamond is strongly driven with microwave (MW) fields of arbitrary polarization. We measure the NV center spin dynamics beyond the RWA, and characterize the limitations of this technique for generating high-fidelity quantum gates. Using circularly polarized MW fields, the NV spin can be harmonically driven in its rotating frame regardless of the field amplitude, thus allowing rotations around arbitrary axes. Our approach can effectively remove the RWA limit in quantum-sensing schemes, and assist in increasing the number of operations in QIP protocols.
Submitted 24 April, 2014;
originally announced April 2014.
-
Critical properties in single crystals of Pr1-xPbxMnO3
Authors:
B. Padmanabhan,
H. L. Bhat,
Suja Elizabeth,
Sahana Roessler,
U. K. Roessler,
K. Doerr,
K. H. Mueller
Abstract:
The critical properties at the ferromagnetic-paramagnetic transition have been analysed from data of static magnetization measurements on single crystals of Pr1-xPbxMnO3, for x = 0.23 and x = 0.30. In Pr1-xPbxMnO3 the ferromagnetic ordering and the metal-insulator transition do not coincide in parts of the phase diagram. The crystal with x = 0.23 is a ferromagnetic insulator with Curie temperature Tc = 173 K, while the crystal with x = 0.30 has Tc = 198 K and remains metallic up to a metal-insulator transition temperature Tmi = 235 K. The dc magnetization measurements were carried out in the field range from 0 to 5 T for an interval in the critical temperature range Tc ± 10 K corresponding to a reduced temperature interval 0.003 < epsilon < 0.6. The exponents beta for spontaneous magnetization, gamma for the initial susceptibility above Tc and delta for the critical magnetization isotherm at Tc were obtained by static scaling analysis from modified Arrott plots and by the Kouvel-Fisher method for the insulating crystal with composition x = 0.23. The data are well described by critical exponents similar to those expected for the Heisenberg universality class relevant for conventional isotropic magnets. Systematic deviations from scaling in the data for the metallic crystal with composition x = 0.30 are demonstrated from effective critical exponents near the assumed ordering transition. The unconventional magnetic ordering in this system indicates the presence of frustrated magnetic couplings that suppress magnetic ordering and lower the transition temperature.
Submitted 10 October, 2006;
originally announced October 2006.
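For reference, the exponents named in the abstract above are conventionally defined by the power laws below, written in LaTeX. The symbols $M_S$ (spontaneous magnetization), $\chi_0$ (initial susceptibility) and $H$ (applied field) follow standard textbook notation and are an assumption about the paper's own symbols. The last line is the Widom scaling relation, which links the three exponents and is commonly used as a consistency check.

```latex
\begin{align}
\epsilon &= \frac{T - T_c}{T_c}, \\
M_S(T) &\propto (-\epsilon)^{\beta}, \qquad \epsilon < 0, \\
\chi_0^{-1}(T) &\propto \epsilon^{\gamma}, \qquad \epsilon > 0, \\
M &\propto H^{1/\delta}, \qquad \epsilon = 0, \\
\delta &= 1 + \frac{\gamma}{\beta}.
\end{align}
```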