VAR-DRAM: Variation-Aware Framework for Efficient Dynamic Random Access Memory Design

Authors: Kaustav Goswami, Hemanta Kumar Mondal, Shirshendu Das, Dip Sankar Banerjee

Abstract: Dynamic Random Access Memory (DRAM) is the de-facto choice for main memory devices due to its cost-effectiveness. It offers a larger capacity and higher bandwidth compared to SRAM but is slower than the latter. With each passing generation, DRAMs are becoming denser. One of its side-effects is the deviation of nominal parameters: process, voltage, and temperature. DRAMs are often considered as the bottleneck of the system as it trades off performance with capacity. With such inherent limitations, further deviation from nominal specifications is undesired. In this paper, we investigate the impact of variations in conventional DRAM devices on the aspects of performance, reliability, and energy requirements. Based on this study, we model a variation-aware framework, called VAR-DRAM, targeted for modern-day DRAM devices. It provides enhanced power management by taking variations into account. VAR-DRAM ensures faster execution of programs as it internally remaps data from variation affected cells to normal cells and also ensures data preservation. On extensive experimentation, we find that VAR-DRAM achieves peak energy savings of up to 48.8% with an average of 29.54% on DDR4 memories while improving the access latency of the DRAM compared to a variation affected device by 7.4%.

Submitted 18 January, 2022; originally announced January 2022.

arXiv:2109.10438 [pdf, other]

Active ploughing through a compressible viscoelastic fluid: Unjamming and emergent nonreciprocity

Authors: Jyoti Prasad Banerjee, Rituparno Mandal, Deb Sankar Banerjee, Shashi Thutupalli, Madan Rao

Abstract: A dilute suspension of active Brownian particles in a dense compressible viscoelastic fluid forms a natural setting to study the emergence of nonreciprocity during a dynamical phase transition. At these densities, the transport of active particles is strongly influenced by the passive medium and shows a dynamical jamming transition as a function of activity and medium density. In the process, the compressible medium is actively churned up: for low activity, the active particle gets self-trapped in a spherical cavity of its own making, while for large activity, the active particle ploughs through the medium, either accompanied by a moving anisotropic wake, or leaving a porous trail. A hydrodynamic approach makes it evident that the active particle generates a long range density wake which breaks fore-aft symmetry, consistent with the simulations. Accounting for the back reaction of the compressible medium leads to (i) dynamical jamming of the active particle, and (ii) a dynamical non-reciprocal attraction between two active particles moving along the same direction, with the trailing particle catching up with the leading one in finite time. We emphasize that these nonreciprocal effects appear only when the active particles are moving and so manifest in the vicinity of the jamming-unjamming transition.

Submitted 12 October, 2021; v1 submitted 21 September, 2021; originally announced September 2021.

Comments: 11 pages, 6 figures

arXiv:2108.04150 [pdf]

Effect of stepwise adjustment of Damping factor upon PageRank

Authors: Subhajit Sahu, Kishore Kothapalli, Dip Sankar Banerjee

Abstract: The effect of adjusting damping factor α, from a small initial value α0 to the final desired αf value, upon the iterations needed for PageRank computation is observed. Adjustment of the damping factor is done in one or more steps. Results show no improvement in performance over a fixed damping factor based PageRank.

Submitted 9 August, 2021; originally announced August 2021.

Comments: 4 pages, 1 figure

ACM Class: G.2.2

arXiv:2108.02997 [pdf]

Adjusting PageRank parameters and Comparing results

Authors: Subhajit Sahu, Kishore Kothapalli, Dip Sankar Banerjee

Abstract: The effect of adjusting damping factor α and tolerance τ on iterations needed for PageRank computation is studied here. Relative performance of PageRank computation with L1, L2, and L∞ norms used as convergence check, are also compared with six possible mean ratios. It is observed that increasing the damping factor α linearly increases the iterations needed almost exponentially. On the other hand, decreasing the tolerance τ exponentially decreases the iterations needed almost exponentially. On average, PageRank with L∞ norm as convergence check is the fastest, quickly followed by L2 norm, and then L1 norm. For large graphs, above certain tolerance τ values, convergence can occur in a single iteration. On the contrary, below certain tolerance τ values, sensitivity issues can begin to appear, causing computation to halt at maximum iteration limit without convergence. The six mean ratios for relative performance comparison are based on arithmetic, geometric, and harmonic mean, as well as the order of ratio calculation. Among them GM-RATIO, geometric mean followed by ratio calculation, is found to be most stable, followed by AM-RATIO.

Submitted 6 August, 2021; originally announced August 2021.

Comments: 13 pages, 10 figures, 2 tables

ACM Class: G.2.2
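
The abstract above relates iteration count to the damping factor α, the tolerance τ, and the norm used for the convergence check. The following is a minimal sketch of that setup, not the authors' implementation: it assumes plain power iteration on an adjacency list, and the function name `pagerank`, its parameters, and the four-vertex example graph are all hypothetical.

```python
import math


def pagerank(out_links, alpha=0.85, tol=1e-6, norm="linf", max_iter=500):
    """Power-iteration PageRank (illustrative sketch).

    out_links[u] lists the vertices that u links to.
    Returns (ranks, iterations_used).
    """
    n = len(out_links)
    ranks = [1.0 / n] * n
    for iteration in range(1, max_iter + 1):
        # Rank held by dangling vertices (no out-links) is spread uniformly.
        dangling = sum(ranks[u] for u in range(n) if not out_links[u])
        base = (1.0 - alpha) / n + alpha * dangling / n
        new_ranks = [base] * n
        for u, targets in enumerate(out_links):
            for v in targets:
                new_ranks[v] += alpha * ranks[u] / len(targets)

        # Convergence check: distance between successive rank vectors.
        diffs = [abs(a - b) for a, b in zip(new_ranks, ranks)]
        if norm == "l1":
            error = sum(diffs)
        elif norm == "l2":
            error = math.sqrt(sum(d * d for d in diffs))
        elif norm == "linf":
            error = max(diffs)
        else:
            raise ValueError("norm must be 'l1', 'l2', or 'linf'")

        ranks = new_ranks
        if error < tol:
            break
    return ranks, iteration


# Hypothetical example graph: 0 -> 1, 2; 1 -> 2; 2 -> 0; 3 -> 2.
graph = [[1, 2], [2], [0], [2]]
for a in (0.5, 0.85, 0.95):
    _, iters = pagerank(graph, alpha=a)
    print(f"alpha={a}: {iters} iterations")
```

Because the L∞ error never exceeds the L2 error, which never exceeds the L1 error, the same tolerance is reached no later under L∞ than under L2 or L1 in this sketch. Running the loop with several α and τ values gives a small-scale view of the trends the abstract reports.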

arXiv:2008.11591 [pdf, other]

doi 10.1145/3386263.3406928

An Approximate Carry Estimating Simultaneous Adder with Rectification

Authors: Rajat Bhattacharjya, Vishesh Mishra, Saurabh Singh, Kaustav Goswami, Dip Sankar Banerjee

Abstract: Approximate computing has in recent times found significant applications towards lowering power, area, and time requirements for arithmetic operations. Several works done in recent years have furthered approximate computing along these directions. In this work, we propose a new approximate adder that employs a carry prediction method. This allows parallel propagation of the carry allowing faster calculations. In addition to the basic adder design, we also propose a rectification logic which would enable higher accuracy for larger computations. Experimental results show that our adder produces results 91.2% faster than the conventional ripple-carry adder. In terms of accuracy, the addition of rectification logic to the basic design produces results that are more accurate than state-of-the-art adders like SARA and BCSA by 74%.

Submitted 26 August, 2020; originally announced August 2020.

Comments: To appear at the 30th ACM Great Lakes Symposium on VLSI

arXiv:2007.03157 [pdf, other]

Size-regulated symmetry breaking in reaction-diffusion models of developmental transitions

Authors: Jake Cornwall Scoones, Deb Sankar Banerjee, Shiladitya Banerjee

Abstract: The development of multicellular organisms proceeds through a series of morphogenetic and cell-state transitions, transforming homogeneous zygotes into complex adults by a process of self-organization. Many of these transitions are achieved by spontaneous symmetry breaking mechanisms, allowing cells and tissues to acquire pattern and polarity by virtue of local interactions without an upstream supply of information. The combined work of theory and experiment has elucidated how these systems break symmetry during developmental transitions. Given such transitions are multiple and their temporal ordering is crucial, an equally important question is how these developmental transitions are coordinated in time. Using a minimal mass-conserved substrate-depletion model for symmetry breaking as our case study, we elucidate mechanisms by which cells and tissues can couple reaction-diffusion driven symmetry breaking to the timing of developmental transitions, arguing that the dependence of patterning mode on system size may be a generic principle by which developing organisms measure time. By analyzing different regimes of our model, simulated on growing domains, we elaborate three distinct behaviours, allowing for clock-, timer-, or switch-like dynamics. By relating these behaviours to experimentally documented case studies of developmental timing, we provide a minimal conceptual framework to interrogate how developing organisms coordinate developmental transitions.

Submitted 6 July, 2020; originally announced July 2020.

Comments: 11 pages, 5 figures, Perspective Article

arXiv:1605.07318 [pdf, ps, other]

Actomyosin pulsation and symmetry breaking flows in a confined active elastomer subject to affine and nonaffine deformations

Authors: Deb Sankar Banerjee, Akankshi Munjal, Thomas Lecuit, Madan Rao

Abstract: Tissue remodelling in diverse developmental contexts requires cell shape changes that have been associated with pulsation and flow of the actomyosin cytoskeleton. Here we describe the dynamics of the actomyosin cytoskeleton as a confined active elastomer embedded in the cytosol and subject to turnover of its components. Under affine deformations (homogeneous deformation over a spatially coarse-grained scale), the active elastomer exhibits spontaneous oscillations, propagating waves, contractile collapse and spatiotemporal chaos. The collective nonlinear dynamics shows nucleation, growth and coalescence of actomyosin-dense regions which, beyond a threshold, spontaneously move as a spatially localized traveling front towards one of the boundaries. However, large myosin-induced contractile stresses can lead to nonaffine deformations due to actin turnover. This results in a transient actin network that naturally accommodates intranetwork flows of the actomyosin-dense regions as a consequence of filament unbinding and rebinding. Our work suggests that the driving force for the spontaneous movement comes from the actomyosin-dense region itself and not the cell boundary. We verify the many predictions of our study in Drosophila embryonic epithelial cells undergoing neighbour exchange during a collective process of tissue extension called germband extension.

Submitted 24 May, 2016; originally announced May 2016.

Comments: For supplementary movies contact us

arXiv:1303.2171 [pdf, ps, other]

CPU and/or GPU: Revisiting the GPU Vs. CPU Myth

Authors: Kishore Kothapalli, Dip Sankar Banerjee, P. J. Narayanan, Surinder Sood, Aman Kumar Bahl, Shashank Sharma, Shrenik Lad, Krishna Kumar Singh, Kiran Matam, Sivaramakrishna Bharadwaj, Rohit Nigam, Parikshit Sakurikar, Aditya Deshpande, Ishan Misra, Siddharth Choudhary, Shubham Gupta

Abstract: Parallel computing using accelerators has gained widespread research attention in the past few years. In particular, using GPUs for general purpose computing has brought forth several success stories with respect to time taken, cost, power, and other metrics. However, accelerator based computing has significantly relegated the role of CPUs in computation. As CPUs evolve and also offer matching computational resources, it is important to also include CPUs in the computation. We call this the hybrid computing model. Indeed, most computer systems of the present age offer a degree of heterogeneity and therefore such a model is quite natural. We reevaluate the claim of a recent paper by Lee et al. (ISCA 2010). We argue that the right question arising out of Lee et al. (ISCA 2010) should be how to use a CPU+GPU platform efficiently, instead of whether one should use a CPU or a GPU exclusively. To this end, we experiment with a set of 13 diverse workloads ranging from databases, image processing, sparse matrix kernels, and graphs. We experiment with two different hybrid platforms: one consisting of a 6-core Intel i7-980X CPU and an NVidia Tesla T10 GPU, and another consisting of an Intel E7400 dual core CPU with an NVidia GT520 GPU. On both these platforms, we show that hybrid solutions offer good advantage over CPU or GPU alone solutions. On both these platforms, we also show that our solutions are 90% resource efficient on average. Our work therefore suggests that hybrid computing can offer tremendous advantages at not only research-scale platforms but also the more realistic scale systems with significant performance gains and resource efficiency to the large scale user community.

Submitted 9 March, 2013; originally announced March 2013.

Comments: 20 pages

Showing 1–8 of 8 results for author: Banerjee, D S