-
RTL Interconnect Obfuscation By Polymorphic Switch Boxes For Secure Hardware Generation
Authors:
Haimanti Chakraborty,
Ranga Vemuri
Abstract:
Logic Obfuscation is a well-known design-for-trust solution to protect an Integrated Circuit (IC) from unauthorized use and illegal overproduction by including key-gates to lock the design. This is particularly necessary for ICs manufactured at untrusted third-party foundries, where they are exposed to security threats. In the past, several logic obfuscation methodologies have been proposed that are vulnerable to attacks such as the Boolean Satisfiability Attack. Many of these techniques are implemented at the gate level, which may involve expensive re-synthesis cycles. In this paper, we present an interconnect obfuscation scheme at the Register-Transfer Level (RTL) using Switch Boxes (SBs) constructed of Polymorphic Transistors. A polymorphic SB can be designed using the same transistor count as its Complementary Metal-Oxide-Semiconductor (CMOS) based counterpart, so it incurs no additional area, while offering more key-bit combinations that an attacker must correctly identify to unlock each polymorphic SB. Security-aware high-level synthesis algorithms have also been presented to increase RTL interconnects to Functional Units impacting multiple outputs such that when a polymorphic SB is strategically inserted, those outputs would be corrupted upon incorrect key-bit identification. Finally, we run the SMT (Satisfiability Modulo Theories)-based RTL Logic Attack on the obfuscated design to examine its robustness.
Submitted 10 April, 2024;
originally announced April 2024.
-
Word-Level Structure Identification In FPGA Designs Using Cell Proximity Information
Authors:
Aparajithan Nathamuni-Venkatesan,
Ram-Venkat Narayanan,
Kishore Pula,
Sundarakumar Muthukumaran,
Ranga Vemuri
Abstract:
Reverse engineering of FPGA based designs from the flattened LUT level netlist to high level RTL helps in verification of the design or in understanding legacy designs. We focus on flattened netlists for FPGA devices from ** algorithm that makes use of the location information of the elements on the physical device after place and route. The proposed grouping algorithm gives clusters with average NMI of 0.73 for groupings including all element types. The benchmarks chosen include a range of designs from communication, arithmetic units, processors and DSP processing units.
Submitted 7 March, 2023;
originally announced March 2023.
-
Reverse Engineering Word-Level Models from Look-Up Table Netlists
Authors:
Ram Venkat Narayanan,
Aparajithan Nathamuni Venkatesan,
Kishore Pula,
Sundarakumar Muthukumaran,
Ranga Vemuri
Abstract:
Reverse engineering of FPGA designs from bitstreams to RTL models aids in understanding the high level functionality of the design and for validating and reconstructing legacy designs. Fast carry-chains are commonly used in synthesis of operators in FPGA designs. We propose a method to detect word-level structures by analyzing these carry-chains in LUT (Look-Up Table) level netlists. We also present methods to adapt existing techniques to identify combinational operations and sequential modules in ASIC netlists to LUT netlists. All developed and adapted techniques are consolidated into an integrated tool-chain to aid in reverse engineering of word-level designs from LUT-level netlists. When evaluated on a set of real-world designs, the tool-chain infers 34% to 100% of the elements in the netlist to be part of a known word-level operation or a known sequential module.
Submitted 5 March, 2023;
originally announced March 2023.
-
Some solutions of functional equation $I(I(y,x), I(x,y))=I(x,y)$ involving fuzzy implications
Authors:
Nageswara Rao Vemuri
Abstract:
In this article, a functional equation (IE) involving fuzzy implications is considered. Two different perspectives on this equation are provided to show its significance. As it is very difficult to find the solutions of (IE) in general, the investigation of solutions of (IE) is restricted to the main families of fuzzy implications.
Submitted 28 December, 2020;
originally announced December 2020.
-
Non-Invasive Reverse Engineering of Finite State Machines Using Power Analysis and Boolean Satisfiability
Authors:
Harsh Vamja,
Richa Agrawal,
Ranga Vemuri
Abstract:
In this paper, we present a non-invasive reverse engineering attack based on a novel approach that combines functional and power analysis to recover finite state machines from their synchronous sequential circuit implementations. The proposed technique formulates the machine exploration and state identification problem as a Boolean constraint satisfaction problem and solves it using an SMT (Satisfiability Modulo Theories) solver. It uses power measurements to achieve fast convergence. Experimental results using the LGSynth'91 benchmark suite show that the satisfiability-based approach is several times faster than existing techniques and can successfully recover 90%-100% of the transitions of a target machine.
Submitted 6 August, 2019;
originally announced August 2019.
-
A performance evaluation of CCS QCD Benchmark on the COMA (Intel(R) Xeon Phi$^{TM}$, KNC) system
Authors:
Taisuke Boku,
Ken-Ichi Ishikawa,
Yoshinobu Kuramashi,
Lawrence Meadows,
Michael D'Mello,
Maurice Troute,
Ravi Vemuri
Abstract:
The most computationally demanding part of Lattice QCD simulations is solving quark propagators. Quark propagators are typically obtained with a linear equation solver utilizing HPC machines. The CCS QCD Benchmark is a benchmark program solving the Wilson-Clover quark propagator, and is developed at the Center for Computational Sciences (CCS), University of Tsukuba. We optimized the benchmark program for an Intel Xeon Phi (Knights Corner, KNC) system named "COMA (PACS-IX)" at CCS Tsukuba under the Intel Parallel Computing Center program. A single precision BiCGStab solver with the overlapped Restricted Additive Schwarz (RAS) preconditioner was implemented using SIMD intrinsics, OpenMP and MPI in the offload mode. With the reverse-offloading technique, we could reduce the communication and offloading overheads. We observed a sustained performance of $\sim 200$ GFlops for the Wilson-Clover hopping matrix multiplication on lattice sizes larger than $24^3\times 32$ on a single card of the COMA system. Good weak scaling performance was observed on local lattice sizes larger than $24^3\times 32$.
Submitted 20 December, 2016;
originally announced December 2016.
-
Asymmetry of radiation damage properties in Al-Ti nanolayers
Authors:
Wahyu Setyawan,
Matthew Gerboth,
Bo Yao,
Charles H. Henager Jr.,
Arun Devaraj,
Venkata R. S. R. Vemuri,
Suntharampillai Thevuthasan,
Vaithiyalingam Shutthanandan
Abstract:
Molecular dynamics (MD) simulations were employed with empirical potentials to study the effects of multilayer interfaces and interface spacing in Al-Ti nanolayers. Several model interfaces derived from stacking of close-packed layers or face-centered cubic \{100\} layers were investigated. The simulations reveal significant and important asymmetries in defect production with $\sim$60% of vacancies created in Al layers compared to Ti layers within the Al-Ti multilayer system. The asymmetry in the creation of interstitials is even more pronounced. The asymmetries cause an imbalance in the ratio of vacancies and interstitials in films of dissimilar materials leading to $>$90% of the surviving interstitials located in the Al layers. While in the close-packed nanolayers the interstitials migrate to the atomic layers adjacent to the interface of the Al layers, in the \{100\} nanolayers the interstitials migrate to the center of the Al layers and away from the interfaces. The degree of asymmetry and defect ratio imbalance increases as the layer spacing decreases in the multilayer films. Underlying physical processes are discussed including the interfacial strain fields and the individual elemental layer stopping power in nanolayered systems. In addition, experimental work was performed on low-dose (10$^{16}$ atoms/cm$^2$) helium (He) irradiation on Al/Ti nanolayers (5 nm per film), resulting in He bubble formation $\sim$1 nm in diameter in the Ti film near the interface. The correlation between the preferential flux of displaced atoms from Ti films to Al films during the defect production that is revealed in the simulations and the morphology and location of He bubbles from the experiments is discussed.
Submitted 19 September, 2013;
originally announced September 2013.
-
An Iterative Algorithm for Battery-Aware Task Scheduling on Portable Computing Platforms
Authors:
Jawad Khan,
Ranga Vemuri
Abstract:
In this work we consider battery powered portable systems which either have Field Programmable Gate Arrays (FPGAs) or voltage and frequency scalable processors as their main processing element. An application is modeled in the form of a precedence task graph at a coarse level of granularity. We assume that for each task in the task graph several unique design-points are available which correspond to different hardware implementations for FPGAs and different voltage-frequency combinations for processors. It is assumed that performance and total power consumption estimates for each design-point are available for any given portable platform, including the peripheral components such as memory and display power usage. We present an iterative heuristic algorithm which finds a sequence of tasks along with an appropriate design-point for each task, such that a deadline is met and the amount of battery energy used is as small as possible. A detailed illustrative example along with a case study of a real-world application of a robotic arm controller which demonstrates the usefulness of our algorithm is also presented.
Submitted 25 October, 2007;
originally announced October 2007.
-
Multi-Placement Structures for Fast and Optimized Placement in Analog Circuit Synthesis
Authors:
Raoul F. Badaoui,
Ranga Vemuri
Abstract:
This paper presents the novel idea of multi-placement structures for fast and optimized placement instantiation in analog circuit synthesis. These structures need to be generated only once for a specific circuit topology. When used in synthesis, these pre-generated structures instantiate various layout floorplans for various sizes and parameters of a circuit. Unlike procedural layout generators, they enable fast placement of circuits while keeping the quality of the placements at a high level during a synthesis process. The fast placement is a result of high speed instantiation resulting from the efficiency of the multi-placement structure. The good quality of placements derives from the extensive and intelligent search process that is used to build the multi-placement structure. The target benchmarks of these structures are analog circuits in the vicinity of 25 modules. An algorithm for the generation of such multi-placement structures is presented. Experimental results show placement execution times with an average of a few milliseconds, making them usable during layout-aware synthesis for optimized placements.
Submitted 25 October, 2007;
originally announced October 2007.