A Wireless, Multicolor Fluorescence Image Sensor Implant for Real-Time Monitoring in Cancer Therapy

Micah Roschelle*, Rozhan Rabbani*, Surin Gweon, Rohan Kumar, Alec Vercruysse, Nam Woo Cho, Matthew H. Spitzer, Ali M. Niknejad, Vladimir M. Stojanović, Mekhail Anwar

This work was supported by the Office of the Director and the National Institute of Dental and Craniofacial Research of the National Institutes of Health under Award DP2DE030713 and the John V. Carbone Jr. Pancreatic Cancer Research Memorial Fund. (Corresponding authors: Micah Roschelle and Mekhail Anwar.) *Equally contributing authors.

Micah Roschelle, Rozhan Rabbani, Surin Gweon, Rohan Kumar, Alec Vercruysse, Ali Niknejad, and Vladimir Stojanović are with the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, CA 94720 USA (email: [email protected]). Nam Woo Cho is with the Department of Radiation Oncology and the Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, CA 94158 USA. Matthew Spitzer is with the Department of Otolaryngology-Head and Neck Surgery and the Department of Microbiology and Immunology, University of California, San Francisco, CA 94158 USA. Mekhail Anwar is with the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, CA 94720 USA and also the Department of Radiation Oncology, University of California, San Francisco, CA 94158 USA (email: [email protected], [email protected]).

Abstract

Real-time monitoring of dynamic biological processes in the body is critical to understanding disease progression and treatment response. This data, for instance, can help address the lower than 50% response rates to cancer immunotherapy. However, current clinical imaging modalities lack the molecular contrast, resolution, and chronic usability for rapid and accurate response assessments. Here, we present a fully wireless image sensor featuring a 2.5×5 mm² CMOS integrated circuit for multicolor fluorescence imaging deep in tissue. The sensor operates wirelessly via ultrasound (US) at 5 cm depth in oil, harvesting energy with 221 mW/cm² incident US power density (31% of FDA limits) and backscattering data at 13 kbps with a bit error rate <10⁻⁶. In-situ fluorescence excitation is provided by micro-laser diodes controlled with a programmable on-chip driver. An optical frontend combining a multi-bandpass interference filter and a fiber optic plate provides >6 OD excitation blocking and enables three-color imaging for detecting multiple cell types. A 36×40-pixel array captures images with <125 µm resolution. We demonstrate wireless, dual-color fluorescence imaging of both effector and suppressor immune cells in ex vivo mouse tumor samples with and without immunotherapy. These results show promise for providing rapid insight into therapeutic response and resistance, guiding personalized medicine.

Index Terms:
Biomedical implant, fluorescence imaging, ultrasound energy harvesting, immunotherapy, personalized medicine.

I Introduction

Wireless, miniaturized, implantable sensors can monitor intricate biological processes unfolding in the body in real time. This data, typically accessible only through highly invasive techniques, is crucial for advancing personalized medicine, in which treatments are tailored to individual responses to address the wide heterogeneity in therapeutic outcomes among patients.

One meaningful application is monitoring tumor response to cancer immunotherapy, a promising treatment that unlocks the patient’s own immune system to fight cancer. For instance, immune checkpoint inhibitors (ICIs), a class of immunotherapy, have been shown to nearly double patient survival rates in melanoma [hodi_f_stephen_improved_2010] and metastatic lung cancer [reck_martin_pembrolizumab_2016] with a lower incidence of adverse effects compared to conventional treatments like chemotherapy [gadgeel_updated_2020]. While more than 40% of US cancer patients are estimated to be eligible for ICIs [haslam_estimation_2019], these therapies face a significant challenge: across most cancer types, less than 30% of patients respond to treatment [morad_hallmarks_2021, das_immune-related_2019]. For non-responders, time spent on ineffective therapies not only allows their cancer to grow and spread, but also exposes them to unnecessary toxicity, with high-grade adverse event rates often exceeding 10% [morad_hallmarks_2021], and to financial burdens of more than $150,000 per year [verma_systematic_2018, chiang_cost-effectiveness_2021]. Rapid assessments of therapeutic response that also provide insight into the underlying mechanisms of resistance can help clinicians quickly identify non-responders and pivot to more effective second-line therapies to overcome resistance. However, such an assessment must capture the complex and dynamic interplay between various effector and suppressor immune cells and cancer that determines response [morad_hallmarks_2021].

Current clinical imaging falls short of this goal. Anatomical imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI) capture changes in tumor size, which take months to manifest and do not reliably correlate with response [chai_challenges_2020]. These limitations are apparent in standard response criteria. For example, iRECIST defines a partial response as at least a 30% reduction in tumor dimensions with a minimum size of 1 cm and recommends confirmation of disease progression at long 4–8 week intervals [nishino_imaging_2019], [seymour_irecist_2017]. Alternatively, positron emission tomography (PET) can image the underlying biology with molecular contrast [unterrainer_petct_2020], but is fundamentally limited to imaging a single cell type or biomarker [pratt_simultaneous_2023] at millimeter-scale resolution [moses_fundamental_2011]. As the immune response depends on interactions between a variety of immune cells, it cannot be reliably predicted by a single biomarker [gibney_predictive_2016, yang_liquid_2023]. Moreover, this millimeter-scale resolution averages out the spatial distributions of different cell populations within the tumor, shown to be increasingly important in understanding therapeutic resistance [gohil_applying_2021, vitale_intratumoral_2021].

Fluorescence microscopy, on the other hand, provides multi-cellular resolution across multiple biomarkers, essential to visualizing a more complete picture of the immune response. In fluorescence microscopy, targeted cells are labeled with fluorescent dyes, or fluorophores, which absorb light near a specific wavelength and emit light at slightly longer wavelengths [lichtman_fluorescence_2005]. Multiple cell types can be imaged simultaneously by labeling each with a different color fluorophore. However, in vivo optical imaging is constrained by scattering in tissue which fundamentally limits the penetration depth of light in the body to a few millimeters, even at near-infrared (NIR) wavelengths where tissue absorption is minimal and scattering is reduced [owens_nir_2015]. Therefore, chronic fluorescence imaging at depth requires implantable imagers with integrated light sources providing in-situ illumination.

Fluorescence imagers can be miniaturized to the scale of a single chip by eliminating bulky lenses through contact imaging [papageorgiou_chip-scale_2020, aghlmand_65-nm_2023, zhu_ingestible_2023, moazeni_mechanically_2021, rustami_needle-type_2020]. To this end, prior work has demonstrated on-chip or in-package integration of focusing optics [papageorgiou_chip-scale_2020, choi_fully_2020] as well as fluorescence filters [aghlmand_65-nm_2023, zhu_ingestible_2023, moazeni_mechanically_2021, rustami_needle-type_2020, papageorgiou_angle-insensitive_2018] and light sources [moazeni_mechanically_2021]. However, these systems are wired, precluding long-term implantation without risk of infection. While a fluorescence sensor with wireless radio-frequency (RF) communication is presented in [zhu_ingestible_2023], it uses a centimeter-scale battery for power and lacks wireless charging. Both wireless power transfer and communication are necessary for chronic use of these devices.

Here we present a fully wireless, miniaturized fluorescence image sensor capable of three-color fluorescence imaging, aiming to enable real-time, chronic monitoring of cellular interactions deep in the body (Fig. 1). Wired connections and batteries are eliminated by power harvesting and bi-directional communication through ultrasound (US). Among wireless power transfer modalities such as near-field inductive coupling, RF, and optical, US offers low loss in tissue (0.5–1 dB/MHz/cm [chen_acoustic_2022]), a high Food and Drug Administration (FDA) regulatory limit for power density (720 mW/cm²), and a short wavelength (~3–4 mm in the PZT material at 1 MHz) enabling power transfer to millimeter-scale implants at centimeter-scale depths [singer_wireless_2021, basaeri_review_2016].

While significant progress toward a wireless fluorescence imaging system using US is presented in our prior work [rabbani_3640_2022, rabbani_towards_2024, rabbani_towards_2021], this system has several limitations. It incorporates a large (0.18 cm³) ~1 mF off-chip capacitor for energy storage. It only operates at 2 cm depth, constraining its application to superficial tumors while exceeding FDA US safety limits by 26% due to high acoustic power requirements. Moreover, the sensor only images a single fluorescent channel, lacking the necessary hardware for multicolor imaging such as a wirelessly programmable laser driver to control multiple excitation lasers and a multi-bandpass optical filter. Additionally, due to in-pixel leakage during readout, the sensitivity of the imager when operating wirelessly is limited to high concentrations of fluorophores, rendering it insufficient for imaging biologically relevant samples.

Figure 1: Concept of a fully wireless, multicolor, implantable imager for real-time monitoring of immune response.

This work demonstrates a new system with significant improvements in performance and size, specifically designed for multicolor imaging. Our new system shows fully wireless operation at 5 cm depth in oil, requiring 221 mW/cm² US power flux density (31% of FDA limits) for power harvesting and transmitting data with a bit error rate (BER) less than 10⁻⁶ through US backscatter. It powers three different-wavelength laser diodes programmed through US downlink and incorporates a multi-bandpass optical frontend expanding on the design in [roschelle_multicolor_2024] to enable three-color fluorescence imaging. Moreover, we illustrate the application of our sensor in assessing response to cancer immunotherapy through multicolor fluorescence imaging of both effector and suppressor immune cells in ex vivo mouse tumor samples with and without immunotherapy. Finally, a proof-of-concept mechanical assembly demonstrates a small form factor of 0.09 cm³.

This article further explains and expands on the work presented in [rabbani_173_2024] and is organized as follows. Section II discusses the components and design specifications for a fully wireless, multicolor fluorescence imager. We describe the design and implementation of our system in Section III. Section IV presents system-level measurement results. We illustrate the application of our sensor with ex vivo imaging results in Section V. Finally, Section VI includes a comparison with the state of the art and the conclusion.

II System Overview

Fig. 2 shows a diagram and mechanical assembly of the full system on a flex PCB with all external components. The system consists of: 1) micro-laser diodes (µLDs) for in-situ illumination; 2) an optical frontend comprising a fiber optic plate and a multi-bandpass interference filter for lens-less multicolor fluorescence imaging; 3) a piezoceramic as the US transceiver; 4) off-chip capacitors for energy storage; and 5) an ASIC to integrate all of this functionality. In this section, we describe the design of the components in the system and derive design requirements for the ASIC.

Figure 2: (a) To-scale diagram of the full system. (b) Mechanical assembly.

II-A Multicolor Fluorescence Imaging

Fig. 3 illustrates the principle of multicolor fluorescence imaging. The fluorophores are first conjugated to a probe (Fig. 3(a)), such as an antibody, targeted toward a cell type of interest [lichtman_fluorescence_2005]. For in vivo imaging, the conjugated probe can be administered systemically through intravenous injection, binding only to targeted cells. Many organic fluorophores have low toxicity at doses relevant for imaging [alford_toxicity_2009] and a number of fluorescent probes are FDA-approved or in clinical trials, including some using Fluorescein (FAM) and Cyanine5 (Cy5) [barth_fluorescence_2020], the fluorophores in our ex vivo studies. Once injected, the half-life of antibody-based probes is days to weeks [freise_vivo_2015] and free-floating unbound probes are cleared through the liver and kidneys in 1–7 days [mieog_fundamentals_2022].

Figure 3: Multicolor fluorescence imaging. (a) Each cell type is labeled with a different color fluorescent probe. (b,c) Fluorophores are excited near the absorption peak and emit light at a slightly longer wavelength. A multi-bandpass filter passes emissions while blocking excitation.

After labeling the cells, the fluorophores are excited near their absorption peak ($\lambda_{EX}$) and emit light at a slightly longer wavelength with a peak at $\lambda_{EM}$ (Fig. 3(b) and (c)). For organic fluorophores, the difference between the absorption and emission peaks, or Stokes shift, is 10–30 nm (26 nm for FAM and 18 nm for Cy5). Moreover, due to the small absorption cross-section of the fluorophores relative to the illuminated field of view (FoV), the excitation light is often 4 to 6 orders of magnitude stronger than the emission light. Thus, in order to detect the weak fluorescence signal, an optical filter with an optical density (OD) $\geq 6$ is required to attenuate out-of-band excitation light that would otherwise saturate the sensor. Avoiding a filter altogether through time-gated imaging [moazeni_mechanically_2021, choi_512-pixel_2019, najafiaghdam_optics-free_2022], where excitation and imaging are separated in the time domain, leads to inadequate excitation rejection and low signal intensities with typical organic fluorophores, which have fluorescence lifetimes less than 10 ns [berezin_fluorescence_2010]. Moreover, background subtraction in the electrical domain [aghlmand_65-nm_2023] adds additional noise sources and is challenging in vivo as the excitation background is dependent on tissue scattering.
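The OD requirement above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming an idealized filter that passes the emission unattenuated; the function names are illustrative and not part of the original work.

```python
def od_to_transmittance(od: float) -> float:
    """Fraction of incident light transmitted by a filter of optical density `od`."""
    return 10.0 ** (-od)


def leakage_to_signal_ratio(excitation_to_emission: float, od: float) -> float:
    """Excitation light leaking through the filter, relative to the emission signal.

    Assumes (idealization) that the emission passes the filter without loss.
    """
    return excitation_to_emission * od_to_transmittance(od)


if __name__ == "__main__":
    # The text quotes excitation that is 4 to 6 orders of magnitude stronger
    # than the emission; compare a weaker (OD 4) and the required (OD 6) filter.
    for ratio in (1e4, 1e5, 1e6):
        for od in (4.0, 6.0):
            leak = leakage_to_signal_ratio(ratio, od)
            print(f"excitation/emission = {ratio:.0e}, OD {od:.0f}: "
                  f"leakage/signal = {leak:.2e}")
```

Under this idealization, an OD 6 filter keeps the leaked excitation at or below the emission level even at the upper end of the quoted range, whereas a weaker filter lets the background exceed the signal.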

For multicolor imaging, a variety of organic fluorophores are available with absorption and emission wavelengths spanning the visible and NIR spectrum [haugland_handbook_1992]. Their narrow absorption and emission spectra allow for multiplexed imaging using a monochrome sensor by taking a separate image at each excitation wavelength. Therefore, multicolor fluorescence imaging requires multiple excitation sources and a multi-bandpass filter to block all excitation wavelengths while passing fluorescence emissions.

II-B Light Sources

For fluorescence excitation, we use µLDs with wavelengths of 650 nm (250×300×100 µm³, CHIP-650-P5, Roithner LaserTechnik GmbH) and 455 nm (120×300×90 µm³, LS0512HBE1, Light Avenue). A third 785 nm laser diode (L785P5, ThorLabs) in a TO-can package is used for proof-of-principle three-color fluorescence imaging and will be replaced by a µLD in the future. Laser diodes are chosen instead of LEDs, which have broader spectral bandwidths that can overlap with fluorescence emissions. These out-of-band emissions necessitate excitation filters on the LEDs that complicate sensor design and waste optical power output [azmer_miniaturized_2021].

Fig. 4(a) and (b) show the measured power-current-voltage (PIV) curves for all three lasers and their calculated wall-plug efficiencies ($P_{Optical}/P_{Electrical}$), respectively. The lasers have different forward voltages: ~2 V for the 650 nm and 785 nm lasers and ~4.5 V for the 455 nm laser. Because of their several-mA threshold currents, the lasers operate most efficiently near their maximum current ratings. These characteristics motivate the design of a laser driver with programmable current that is tolerant of a wide range of forward voltages.

Figure 4: Measured laser diode (a) PIV curves and (b) wall-plug efficiencies.

II-C Optical Frontend Design

The optical frontend design builds on our prior work [roschelle_multicolor_2024] and consists of a multi-bandpass interference filter and a low-numerical-aperture fiber optic plate (FOP). Interference filters offer more-ideal filter characteristics than absorption filters [papageorgiou_angle-insensitive_2018] or CMOS metal filters [aghlmand_65-nm_2023, zhu_ingestible_2023, hong_fully_2017], which do not allow for optimal excitation and imaging of organic fluorophores due to their gradual cutoff transitions, weak out-of-band attenuation, and significant passband losses. Hybrid filters combining interference and absorption filters [moazeni_mechanically_2021, rustami_needle-type_2020, sasagawa_highly_2018] retain the poor passband characteristics of absorption filters. Another major advantage of interference filters is their ability to support multiple passbands across the visible and NIR spectra for multicolor imaging. In contrast, demonstrated dual-color fluorescence sensors with absorption or CMOS filters rely on dedicated pixels for each color [aghlmand_65-nm_2023, hee_lens-free_2019, taal_toward_2022, kulmala_lensless_2022], reducing the sensor sensitivity and resolution.

However, interference filters are sensitive to angle of incidence (AOI) [dandin_optical_2007]. At increasing AOIs, the filter passbands shift towards shorter wavelengths, eventually transmitting the excitation light. This property is problematic for lensless imaging where the AOI is not precisely controlled and the excitation light is often angled between the sensor and the tissue above it. To mitigate this effect, the FOP acts as an angle filter, blocking off-axis excitation light that would otherwise pass through the filter. The FOP also improves resolution by eliminating divergent fluorescent emissions that contribute to blur, albeit at the cost of reducing the overall collected signal.
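The blue shift described above is commonly approximated for thin-film interference filters by $\lambda(\theta) = \lambda_0 \sqrt{1 - (\sin\theta / n_{eff})^2}$. The sketch below uses that textbook approximation; the effective index `n_eff` and the 665 nm band edge are hypothetical values chosen only for illustration, not measured properties of the filter used here.

```python
import math


def shifted_wavelength(lambda0_nm: float, aoi_deg: float, n_eff: float) -> float:
    """Spectral feature position at angle of incidence `aoi_deg` (thin-film approximation)."""
    s = math.sin(math.radians(aoi_deg)) / n_eff
    return lambda0_nm * math.sqrt(1.0 - s * s)


def aoi_for_shift(lambda0_nm: float, target_nm: float, n_eff: float) -> float:
    """Angle (degrees) at which a feature at `lambda0_nm` shifts down to `target_nm`."""
    s = n_eff * math.sqrt(1.0 - (target_nm / lambda0_nm) ** 2)
    if s > 1.0:
        raise ValueError("target wavelength is not reached at any angle of incidence")
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    N_EFF = 1.8          # hypothetical effective index of the filter stack
    EDGE_NM = 665.0      # hypothetical passband edge at normal incidence
    LASER_NM = 650.0     # excitation wavelength from the text

    for aoi in (0, 10, 20, 30, 40):
        print(f"AOI {aoi:2d} deg: edge at {shifted_wavelength(EDGE_NM, aoi, N_EFF):.1f} nm")
    print(f"Edge reaches the laser line at "
          f"{aoi_for_shift(EDGE_NM, LASER_NM, N_EFF):.1f} deg")
```

The point of the sketch is qualitative: a passband edge only a few tens of nanometers above the laser line reaches it at moderate angles, which is why an angle-selective element such as the FOP is needed.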

Here, we expand the dual-bandpass design in [roschelle_multicolor_2024] to three-color fluorescence imaging with a new interference filter. Fig. 5(a) shows the normal incidence (AOI=0°) transmittance spectra of the filter (ZET488/647/780+800lpm, Chroma Technologies Corp) which has three passbands with greater than 93% average transmittance. The first two bands pass the emissions of FAM and Cy5, the fluorophores used in our ex vivo imaging studies. The 800 nm band, added in this work, provides another fluorescence channel in the NIR-I window (700–900 nm), a preferred region for in vivo imaging where tissue scattering, absorption, and autofluorescence are minimal compared to the visible spectrum (400–700 nm) [frangioni_vivo_2003, ji_near-infrared_2020]. At normal incidence, the filter provides sufficient blocking of the lasers: more than 6 OD attenuation at both 450 nm and 650 nm as well as more than 5 OD attenuation at 785 nm.

Figure 5: (a) Normal incidence transmittance spectra of the multi-bandpass interference filter. (b,c) Measured transmittance through the FOP across AOIs. (d) Angular transmittance of the filter with and without the FOP measured at the excitation laser wavelengths.

The 500 µm-thick FOP (LNP121011, Shenzhen Laser, LTD) consists of a matrix of 10 µm optical fibers embedded in black, absorptive glass. It has a normal incidence transmittance of 35% and a full-width at half maximum (FWHM) of 10° at 455 nm, which both reduce at longer wavelengths as shown in Fig. 5(b). The angular transmittance measurements in Fig. 5(c) show that beyond an AOI of 35° the FOP provides more than 6 OD attenuation of all three lasers.

Fig. 5(d) shows the transmittance through the filter with and without the FOP across different AOIs measured at the excitation wavelengths using collimated, fiber-coupled lasers. The filter attenuation at AOI=0° is different from that in Fig. 5(a) due to out-of-band emissions from the lasers. While the filter blocks the excitation lasers near 0°, the laser transmittance rapidly increases beyond AOIs of 20° for 650 nm and 785 nm and 60° for 455 nm. However, with the FOP, the optical frontend provides more than 6 OD of attenuation of all excitation lasers at AOIs greater than 5°. The maximum measured attenuation is limited by the sensitivity of the power meter (PM100D with S120C Photodiode, Thorlabs) used for this measurement.

For fabrication, the interference filter is directly deposited on the FOP, resulting in a total thickness of approximately 510 µm. The optical frontend is fixed to the chip using optically transparent epoxy (SYLGARD 184, Dow Chemicals). The filter is placed in between the chip and the FOP to ensure that it blocks any excitation light scattered through the FOP [roschelle_multicolor_2024].

II-D Ultrasound Link

We use a 1.5×1.5×1.5 mm³ piezoceramic (lead zirconate titanate) as the US transceiver for wireless power transfer and bi-directional communication. The thickness of the piezo is directly proportional to the harvested voltage and inversely proportional to the operation frequency [singer_wireless_2021]. Therefore, we chose a thickness of 1.5 mm to balance minimizing the overall size of the piezo with the need for harvesting a high enough voltage to drive the lasers while operating at a lower frequency with less tissue attenuation. An aspect ratio of one is selected as a compromise between volumetric efficiency and backscattering amplitude, as outlined in [ghanbari_optimizing_2020]. The piezo is mounted on a flex PCB for testing (Fig. 6(a)). On the backside of the piezo, an air gap is created by covering a through-hole via with a 3D-printed lid. The air gap reduces the acoustic impedance of the backside medium from 1.34 MRayl in canola oil to ~0 MRayl in air, decreasing the electrical impedance of the piezo to improve the power transfer efficiency [sonmezoglu_method_2021].

Figure 6: (a) Piezo assembly with the air gap. (b) Measured electrical impedance of the piezo across frequency. (c) Measured harvested voltage across frequency with the piezo in open circuit condition and loaded by the chip.

Fig. 6(b) shows the impedance spectrum of the piezo measured within canola oil. Canola oil has 0.075 dB/cm acoustic attenuation at 920 kHz and 1.34 MRayl acoustic impedance [gladwell_ultrasonic_1985], similar to the impedance (1.4–1.67 MRayl) of tissue [chen_acoustic_2022]. The series and parallel resonance frequencies of the piezo occur at $f_S$ = 894 kHz and $f_P$ = 960 kHz, respectively. Fig. 6(c) shows the normalized harvested voltage across frequency when the piezo is in open-circuit condition and when it is loaded with the chip (see Section IV-A for the setup). While operating near $f_S$ minimizes the impedance, the open-circuit voltage is maximized near $f_P$. Therefore, the maximum harvested voltage with the chip occurs between $f_S$ and $f_P$ at 920 kHz.
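To put the attenuation figures in context, the following sketch compares the one-way acoustic loss over the 5 cm operating depth in canola oil with the loss expected in tissue. It assumes that tissue attenuation scales linearly with frequency, as the dB/MHz/cm units quoted earlier imply, and it ignores reflection, diffraction, and beam-spreading losses; the function names are illustrative.

```python
def path_loss_db(alpha_db_per_cm: float, depth_cm: float) -> float:
    """One-way attenuation in dB over `depth_cm` for a fixed attenuation coefficient."""
    return alpha_db_per_cm * depth_cm


def tissue_alpha_db_per_cm(alpha_db_per_mhz_cm: float, freq_mhz: float) -> float:
    """Attenuation coefficient at `freq_mhz`, assuming linear frequency scaling."""
    return alpha_db_per_mhz_cm * freq_mhz


def surviving_power_fraction(loss_db: float) -> float:
    """Fraction of acoustic power remaining after `loss_db` of attenuation."""
    return 10.0 ** (-loss_db / 10.0)


if __name__ == "__main__":
    DEPTH_CM = 5.0     # operating depth reported in the text
    FREQ_MHZ = 0.92    # operating frequency reported in the text

    oil_db = path_loss_db(0.075, DEPTH_CM)  # canola oil at 920 kHz
    print(f"canola oil: {oil_db:.3f} dB, "
          f"{surviving_power_fraction(oil_db):.1%} of power remains")

    for alpha in (0.5, 1.0):  # tissue range in dB/MHz/cm quoted in the text
        loss = path_loss_db(tissue_alpha_db_per_cm(alpha, FREQ_MHZ), DEPTH_CM)
        print(f"tissue ({alpha} dB/MHz/cm): {loss:.2f} dB, "
              f"{surviving_power_fraction(loss):.1%} of power remains")
```

Under these assumptions, oil is a close acoustic impedance match to tissue but a considerably less lossy medium, so in-tissue operation at the same depth would require a correspondingly larger incident power density.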

II-E System Design Considerations

To derive the required harvested energy per image for sizing the storage capacitor, we estimate the signal detected by a pixel from Cy5-labeled CD8+ T-cells, a type of immune cell imaged in our ex vivo studies. The total emitted optical power, $P_{CELLS}$, from $C$ fluorescently labeled cells as a function of the input excitation intensity, $I_{IN}$, is given by

$$P_{CELLS} = C \cdot N_{FL} \cdot \sigma \cdot QY \cdot I_{IN}. \tag{1}$$

$N_{FL}$ is the number of fluorophores bound to each cell. Typically, between 0.5×10⁶ and 2.1×10⁶ CD8+ antibodies bind to a single CD8+ T-cell [siiman_cell_2000] with each antibody containing 2–8 fluorophores [vira_fluorescent-labeled_2010]. $\sigma$ and $QY$ are the absorption cross-section and quantum yield of the fluorophore, respectively (9.55×10⁻¹⁶ cm² and 20% for Cy5 [aat_bioquest_extinction_2024]). We assume that a single pixel (with 55 µm pitch in our design) subtends a FoV containing $C = 100$ T-cells, considering that a T-cell is 5–10 µm in diameter [jiang_mri_2020]. Assuming that the 650 nm µLD uniformly illuminates the FoV of our sensor (2×2.2 mm²) and outputs 10 mW of optical power at $I_{LD}$ = 20 mA bias (see Fig. 4), $I_{IN}$ is approximately 223 mW/cm². Therefore, the estimated total fluorescence signal is 20 nW. This signal can be converted to the expected photodiode current, $I_{PH}$, according to

$$I_{PH} = P_{CELLS} \cdot \frac{A_{PIXEL}}{4\pi z_{DIST}^{2}} \cdot (1 - L_{FOP}) \cdot R. \tag{2}$$

This equation accounts for both the spreading loss over the $z_{DIST} \approx 500$ µm distance to the pixel with area $A_{PIXEL}$ (44×44 µm² in our design) and the insertion loss of the FOP, $L_{FOP}$ (~75% at 650 nm). Given that the pixel has a responsivity, $R$, of 0.21 A/W at 650 nm, we expect $I_{PH}$ on the order of 6.3 fA.
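As an illustration of how the emitted-power estimate of Eq. (1) depends on the labeling density, the sketch below evaluates it at the lower and upper bounds of $N_{FL}$ implied by the quoted antibody and fluorophore counts. The excitation intensity is taken as the 223 mW/cm² stated in the text; the function and constant names are illustrative.

```python
SIGMA_CM2 = 9.55e-16        # Cy5 absorption cross-section (from the text)
QUANTUM_YIELD = 0.20        # Cy5 quantum yield (from the text)
CELLS_PER_PIXEL = 100       # assumed number of T-cells in one pixel's FoV (from the text)
I_IN_W_PER_CM2 = 0.223      # excitation intensity quoted in the text


def emitted_power_w(cells: float, fluorophores_per_cell: float, sigma_cm2: float,
                    quantum_yield: float, intensity_w_per_cm2: float) -> float:
    """Total emitted fluorescence power according to Eq. (1), in watts."""
    return cells * fluorophores_per_cell * sigma_cm2 * quantum_yield * intensity_w_per_cm2


if __name__ == "__main__":
    # Bounds on N_FL: (antibodies per cell) x (fluorophores per antibody).
    n_fl_low = 0.5e6 * 2
    n_fl_high = 2.1e6 * 8

    for label, n_fl in (("low", n_fl_low), ("high", n_fl_high)):
        p = emitted_power_w(CELLS_PER_PIXEL, n_fl, SIGMA_CM2,
                            QUANTUM_YIELD, I_IN_W_PER_CM2)
        print(f"N_FL {label} = {n_fl:.2e}: P_CELLS = {p * 1e9:.1f} nW")
```

The 20 nW estimate used in the text lies between these two bounds, which corresponds to an intermediate labeling density within the quoted ranges.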

In the capacitive trans-impedance amplifier (CTIA)-based pixel architecture reused from [papageorgiou_chip-scale_2020], the photocurrent is sensed by integrating it on a capacitor, $C_{INT}$, during the exposure time, $T_{EXP}$, resulting in a pixel output voltage of

$$V_{PIXEL} = \frac{I_{PH} \cdot T_{EXP}}{C_{INT}}. \tag{3}$$

Sensing the fluorescence signal relies on $V_{PIXEL}$ exceeding the noise floor, characterized by the signal-to-noise ratio (SNR). Generally, SNR can be improved by increasing the total imaging time either through a longer exposure time, $T_{EXP}$, or by averaging multiple images. Following the derivation in [32], the SNR at the output of a CTIA-based pixel when averaging $n$ images, each with an exposure time of $T_{EXP}/n$, is given by

$$SNR\left(n \cdot \frac{T_{EXP}}{n}\right) = \frac{signal}{noise} = \frac{\frac{I_{PH} T_{EXP}}{C_{INT}}}{\sqrt{\frac{T_{EXP}}{C_{INT}^{2}}\, 2 q_{e} i_{D} + n\, \overline{v_{NR}^{2}}}}. \tag{4}$$

This equation enables study of the SNR tradeoff between (1) taking a single exposure of $T_{EXP}$ ($n = 1$) and (2) averaging $n$ images with exposures of $T_{EXP}/n$. The noise has two components: readout noise, $\overline{v_{NR}^{2}}$, and shot noise from the photocurrent and dark current, $i_{D} = I_{PH} + I_{DARK}$. $q_{e}$ is the charge of an electron. The factor of $n$ only appears in the readout noise term. Therefore, if shot noise is the dominant source of noise, for small $n$, both (1) and (2) result in the same SNR. However, with increasing $n$ and shorter exposure time per frame, readout noise dominates the overall noise of the averaged image, necessitating a greater number of averages to maintain the same SNR as a single exposure.
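The tradeoff between a single long exposure and averaging several shorter ones can be explored numerically. The following is a minimal sketch of the SNR expression of Eq. (4), with the radical taken over the sum of the shot-noise and readout-noise terms. The integration capacitance, dark current, and readout noise used in the example are hypothetical placeholders, not the measured values of this sensor.

```python
import math

Q_E = 1.602176634e-19  # electron charge in coulombs


def snr(i_ph: float, i_dark: float, t_exp: float, c_int: float,
        v_read_rms: float, n: int = 1) -> float:
    """SNR of n combined frames, each exposed for t_exp / n, following Eq. (4).

    Summing or averaging the n frames gives the same SNR; the expression is
    written for the summed signal.
    """
    signal = i_ph * t_exp / c_int
    shot_var = (t_exp / c_int ** 2) * 2.0 * Q_E * (i_ph + i_dark)
    read_var = n * v_read_rms ** 2
    return signal / math.sqrt(shot_var + read_var)


if __name__ == "__main__":
    params = dict(
        i_ph=6.3e-15,    # photocurrent estimate from the text
        i_dark=5e-15,    # hypothetical dark current
        t_exp=98e-3,     # total exposure time from the text
        c_int=10e-15,    # hypothetical integration capacitance
    )
    for n in (1, 2, 4, 8, 16):
        ideal = snr(**params, v_read_rms=0.0, n=n)       # shot-noise limited
        with_read = snr(**params, v_read_rms=1e-3, n=n)  # hypothetical 1 mV rms readout noise
        print(f"n = {n:2d}: shot-limited SNR = {ideal:.1f}, "
              f"with readout noise = {with_read:.1f}")
```

With zero readout noise the result is independent of `n`, while any nonzero readout noise makes the SNR fall as the total exposure is split across more frames.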

Using the estimated $I_{PH}$ and the measured noise values reported in Section IV, we calculate that without averaging, a $T_{EXP}$ of 98 ms is required to achieve an SNR of 20 dB (10×). This result corresponds to a minimum required energy ($I_{LD}\cdot V_{LD}\cdot T_{EXP}$) of 4.16 mJ per image.
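As a rough numerical check of Eq. (4), the sketch below evaluates the SNR for a single long exposure and for a similar total exposure split into $n$ averaged frames. The photocurrent (6.3 fA), dark current (14.9 fA), and $C_{INT}$ (11 fF) are values quoted elsewhere in this paper. The per-frame readout noise of 5.3 mVrms is an assumption, taken to be close to the single-frame dark noise measured at 8 ms, and the function and parameter names are illustrative only.

```python
import math

Q_E = 1.602176634e-19  # electron charge (C)


def snr(t_exp, n=1, i_ph=6.3e-15, i_dark=14.9e-15,
        c_int=11e-15, v_nr_rms=5.3e-3):
    """SNR of Eq. (4) for a total exposure t_exp (s) split into n frames.

    v_nr_rms (V) is an ASSUMED per-frame readout noise, not a value
    stated explicitly in the text.
    """
    signal = i_ph * t_exp / c_int                        # integrated signal (V)
    shot_var = (t_exp / c_int**2) * 2 * Q_E * (i_ph + i_dark)
    read_var = n * v_nr_rms**2                           # readout noise adds once per frame
    return signal / math.sqrt(shot_var + read_var)


def to_db(ratio):
    """Convert a voltage ratio to decibels."""
    return 20 * math.log10(ratio)


if __name__ == "__main__":
    print(f"single 98 ms exposure: {to_db(snr(98e-3)):.1f} dB")
    print(f"12 frames of 8 ms:     {to_db(snr(96e-3, n=12)):.1f} dB")
```

Under these assumptions the single 98 ms exposure lands near the 20 dB target, while splitting a similar total exposure into 8 ms frames gives a lower SNR because only the readout term scales with $n$.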

Delivering $I_{LD}$=20 mA from the incident US signal, given a piezo impedance of 5.4 kΩ at 920 kHz, requires an open-circuit voltage of at least 108 V, which is not practical within FDA limits. Therefore, harvested energy must first be stored on a capacitor to later supply the lasers when taking an image. The size of the storage capacitor, $C_{STORE}$, is determined by $C_{STORE}=\frac{I_{LD}T_{EXP}}{\Delta V_{CSTORE}}$ in order to supply $I_{LD}$ for the duration of $T_{EXP}$. $\Delta V_{CSTORE}$ is the voltage drop on the capacitor during $T_{EXP}$. Maximizing $\Delta V_{CSTORE}$ results in a smaller capacitor size, but is limited by the maximum harvested voltage and the minimum supply requirements for operating the chip or laser.
Assuming $\Delta V_{CSTORE}$=3 V results in a capacitor size of 650 µF. Capacitors of this size are large physical components, increasing implant volume as in [rabbani_towards_2024]. Therefore, the capacitor size can be minimized by reducing the required energy per image through the averaging strategy discussed previously.
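The capacitor sizing relation can be checked with a few lines of arithmetic. The sketch below is illustrative: the function names are not from the paper, and the laser voltage of about 2.12 V is only inferred from the quoted 4.16 mJ at 20 mA and 98 ms.

```python
def storage_capacitance(i_ld, t_exp, delta_v):
    """C_STORE = I_LD * T_EXP / dV_CSTORE, in farads."""
    return i_ld * t_exp / delta_v


def laser_energy(i_ld, v_ld, t_exp):
    """Energy drawn by the laser, I_LD * V_LD * T_EXP, in joules."""
    return i_ld * v_ld * t_exp


if __name__ == "__main__":
    # 20 mA laser current and a 3 V allowed drop, as in the text.
    for t_ms in (98, 16, 8):
        c = storage_capacitance(20e-3, t_ms * 1e-3, 3.0)
        print(f"T_EXP = {t_ms:3d} ms -> C_STORE = {c * 1e6:.0f} uF")
```

The capacitance scales linearly with exposure time, which is why shortening the per-frame exposure shrinks the required capacitor.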

Fig. 7(a) compares the SNR of a pixel with different levels of averaging. The signal is the estimated photocurrent from the above analysis (6.3 fA) and the noise is measured with the sensor from dark images (see Fig. 21(c)). Each data point on the black curve represents an exposure time of $T_{EXP,i}$ and a number of averages $n_i$ such that the total exposure time, $n_i T_{EXP,i}$=96 ms, stays constant. As $T_{EXP,i}$ decreases (and $n_i$ increases), readout noise dominates the pixel output noise (because shot noise decreases with lower $T_{EXP,i}$), requiring additional averages to achieve the same SNR as a single exposure. The orange curve in Fig. 7(a) shows the increased number of averages, $x_i > n_i$, required to reach an SNR (shown in blue) of at least 90% of the initial SNR for $T_{EXP}$=96 ms. Therefore, using averaging to decrease exposure time for individual frames increases the overall imaging time to greater than 96 ms. As shown in Fig. 7(b), the capacitor size decreases linearly with lower $T_{EXP,i}$, ranging from 640 µF for $T_{EXP,i}$=96 ms to 50 µF for $T_{EXP,i}$=8 ms. Charging such a capacitor through US takes several seconds to minutes, dominating the frame time (see Section IV.B). Thus, for small exposure times, the additional required averages can significantly increase the total imaging time. The total imaging time must be less than several minutes to capture the motion of immune cells, which have mean velocities of 10 µm/min in the tumor microenvironment [dupre_t_2015].

Figure 7: (a) In black, the SNR of a pixel with the estimated photocurrent and measured dark noise (see Fig. 21(c)) across different exposure times ($T_{EXP,i}$) and averaging $n_i$ images such that total imaging time $T_{EXP,i}n_{i}=96$ ms remains constant. In blue, the SNR after $x_{i}$ averages (orange) required to maintain 90% of the SNR at $T_{EXP,i}=96$ ms. (b) Capacitor size vs. exposure time.

Following these guidelines, we chose an 0805 100 µF tantalum capacitor for CSTORE with a size of 2×1.25×0.9 mm³ (0.002 cm³). This capacitor can supply 20 mA of laser current for TEXP=16 ms while dropping its voltage by 3 V. Averaging is employed to enhance SNR to levels comparable to those achieved by longer exposure times. We use a tantalum capacitor as opposed to a ceramic capacitor, which can lose up to 40–80% of its initial capacitance as the DC bias voltage increases and reduces the dielectric permittivity [cen_dc_nodate].

III System Design and Implementation

Fig. 8 shows the system block diagram of the ASIC with external connections to the piezo, off-chip storage capacitors, and µLDs. The ASIC has 4 main subsystems: (1) power management unit (PMU), (2) digital control, (3) laser driver, and (4) imaging frontend with readout.

Figure 8: System block diagram.

The PMU consists of an active rectifier for AC-DC conversion of the piezo signal and a charge pump for generating an up to 6 V supply for driving the lasers. Harvested energy is stored on two off-chip capacitors, CVCP=10 µF and CSTORE=100 µF, to separate the power supplies of the lasers from the rest of the sensor throughout its operation. A PTAT develops current and voltage references and several low dropout voltage regulators (LDOs) generate stable DC power supplies for the chip. The sensor is programmed and controlled through a finite state machine (FSM) with 6 states of operation: charging up the storage capacitors (Charge-Up); programming the image sensor and laser driver parameters through US downlink (Set TEXP and Set LD); taking an image (Imaging); digitizing and storing the image (Readout); and wirelessly transmitting the data via US backscatter (Backscattering). To take an image, the laser driver, configured during downlink, supplies a µLD using energy stored in CSTORE. The image is captured on a 36×40-pixel array. During Readout, the pixel data is digitized by 4 parallel ADCs to be saved in the memory. Finally, image data is transmitted by modulating the reflected amplitude of incident US pulses with the SMOD switch.

The design and operation of the subsystems are described in detail below.

III-A Power Management Unit

Fig. 9 shows the schematic of the active rectifier and charge pump. The active rectifier converts the harvested AC signal on the piezo to a 3 V DC voltage (VRECT), which is stabilized by a 4.7 nF off-chip capacitor. VRECT is then multiplied by 1.83× to a 5.5 V supply (VCP) with the cross-coupled charge pump. The cross-coupled topology is chosen for its high power conversion efficiency for an optimized input range [guler_power_2017]. Compared to a rectifier-only architecture used in [rabbani_towards_2024], the charge pump reduces the required harvested AC voltage on the piezo (VPIEZO) to achieve an output voltage (VCP) of 5.5 V by 1.7×, which results in a 3× lower acoustic power density requirement. Acoustic power density is a square function of acoustic pressure, which is linearly proportional to the harvested AC voltage. Therefore, lowering the required harvested piezo voltage reduces the acoustic power density to ensure operation within FDA safety limits. However, with this architecture, the overall charging time increases due to the energy loss from the charge pump.
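The quoted 3× reduction follows from the quadratic relation between acoustic power density and harvested voltage, taking the 1.7× voltage reduction as given. The symbols $I_{ac}$ (acoustic power density), $p$ (acoustic pressure), and the subscripts "rect" and "CP" are introduced here only for illustration:

```latex
I_{ac} \propto p^{2} \propto V_{PIEZO}^{2}
\quad\Rightarrow\quad
\frac{I_{ac,\,rect}}{I_{ac,\,CP}}
  = \left(\frac{V_{PIEZO,\,rect}}{V_{PIEZO,\,CP}}\right)^{2}
  = 1.7^{2} \approx 2.9 \approx 3 .
```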

Figure 9: PMU schematic consisting of (a) a full-wave active rectifier, (b) a cross-coupled charge pump, and (c) storage capacitors.

During Charge-Up, CVCP and CSTORE are connected through the CSTORE switch and are charged through the PMU. CSTORE stores energy for the lasers and imager array and the smaller CVCP stores energy for the readout and digital control. Following manufacturer guidelines, the external US transducer is duty-cycled to reduce its average power dissipation and prevent damage from overheating, while still providing enough US power density to achieve sufficient harvested voltage on the sensor. To minimize power consumption during Charge-Up, the laser driver, pixel array, readout circuits and memory are switched off. A diode-based voltage clamp prevents charging beyond 6 V to protect the devices from overvoltage.

Five LDOs (Fig. S1) regulate the harvested voltage into stable DC power supplies and are compensated with off-chip 0201 surface mount capacitors (10–200 nF). They generate reference voltages of 0.5 V and 2.1 V for the ADCs, separate 1.8 V power supplies for the digital control and for the pixel array and laser driver biasing, and a 3.3 V supply for the readout.

A PTAT circuit generates a 200 nA reference current, IREF, and 1 V and 0.5 V references to bias the chip. The PTAT, with schematic shown in Fig. 10, uses a constant-gm topology to minimize the dependence on threshold voltage process variation. A PMOS core (M1–M4) avoids the body effect as deep N-well transistors were not available in the process. The diode-based start-up circuit (D1–D3) prevents zero current operation. To ensure that generated references are stable across the large voltage drop on VCP from 5.5 to 3.5 V, cascode current mirrors with high output impedance are used throughout the design. The voltage references are buffered and are generated by mirroring IREF (M3, M4, M9, M10) through resistors R4 and R5.

Figure 10: PTAT schematic.

III-B Digital Control

The chip operates according to the system timing diagram shown in Fig. 11. When VCP reaches 3.9 V, ensuring stable operation of the chip, a power-on reset (POR, Fig. S2) circuit initializes the FSM. The FSM is synchronized to the external US transducer by on-off-key modulation of the US envelope, which is demodulated by a watchdog circuit.

Figure 11: System timing diagram.

The schematic of the watchdog circuit is shown in Fig. 12. A latch-based control eliminates glitches in detecting the presence of the US pulses within 3 µs of the initial rising edge. These unwanted transitions result from insufficient drive strength of the AC inputs to transistors M1 and M2 during the gradual ramp-up of the US pulse.

Figure 12: Schematic of watchdog circuit with error-tolerant edge detection.

To relay timing information to the FSM, the clock is extracted from the US carrier frequency (920 kHz). A US pulse longer than 1 ms indicates the end of the Charge-Up state. At this moment, the CSTORE switch is opened to isolate the storage capacitors, allowing VCSTORE to drop to a minimum of 2.5 V during Imaging while maintaining VCP above 3.5 V for the 3.3 V readout circuits. This approach allows for maximum energy usage from CSTORE, resulting in a 33% smaller required capacitance assuming a 5.5 V Charge-Up voltage.
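The 33% figure can be reproduced from the capacitor sizing relation $C_{STORE}=I_{LD}T_{EXP}/\Delta V_{CSTORE}$, under the assumption that without the isolating switch the shared rail could only drop to the 3.5 V floor required by the readout supply. The subscripts "isolated" and "shared" are illustrative labels:

```latex
\frac{C_{isolated}}{C_{shared}}
  = \frac{\Delta V_{shared}}{\Delta V_{isolated}}
  = \frac{5.5\,\mathrm{V}-3.5\,\mathrm{V}}{5.5\,\mathrm{V}-2.5\,\mathrm{V}}
  = \frac{2}{3}
\quad\Rightarrow\quad
1-\frac{2}{3}\approx 33\%\ \text{smaller}.
```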

After Charge-Up, the ASIC is programmed during the Set TEXP and Set LD states. As shown in Fig. 11, the transmitted downlink data is decoded through time-to-digital conversion of the US pulse widths. In each state, 4 LSBs are discarded to account for timing variations in the watchdog signal. In Set TEXP, the exposure time, TEXP, is set through the 5 MSBs and is programmable from 0–248 ms with LSB=8 ms. The next 2 bits set the pixel reset time, TRST, which can be 100, 200, 500, or 1000 µs. In Set LD, 3 MSBs set the 1-hot encoded laser channel and the next 5 bits determine the laser current, ILD. On the falling edge of the watchdog after Set LD, the laser driver and the pixel array bias circuits are turned on to prepare for Imaging.
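The field layout described above can be sketched as a small decoder. This is a minimal illustration, not the on-chip logic: it assumes the time-to-digital conversion yields an integer count whose 4 LSBs are dropped, that the remaining fields are packed MSB-first in the order given in the text, and that the 2-bit reset code maps to the listed times in ascending order. All function and constant names are hypothetical.

```python
T_RST_OPTIONS_US = (100, 200, 500, 1000)  # code-to-value order is ASSUMED


def decode_set_texp(tdc_count):
    """Decode a Set TEXP pulse-width count into (T_EXP in ms, T_RST in us)."""
    word = tdc_count >> 4               # discard the 4 LSBs (timing variation)
    texp_code = (word >> 2) & 0x1F      # 5 MSBs: exposure time, 8 ms per LSB
    trst_code = word & 0x3              # next 2 bits: pixel reset time
    return texp_code * 8, T_RST_OPTIONS_US[trst_code]


def decode_set_ld(tdc_count):
    """Decode a Set LD pulse-width count into (one-hot channel, DAC code)."""
    word = tdc_count >> 4               # discard the 4 LSBs (timing variation)
    channel_one_hot = (word >> 5) & 0x7  # 3 MSBs: one bit per laser channel
    dac_code = word & 0x1F               # next 5 bits: laser current code
    return channel_one_hot, dac_code
```

With this layout, the largest exposure code (31) gives 248 ms, matching the programmable range stated in the text.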

III-C Laser Driver

Fig. 13 shows the schematic of the 3-channel laser driver with programmable output current. To minimize the change in driver current, ILD, across the large voltage drop on VCSTORE (5.5–2.5 V), the driver must have high output impedance. Therefore, a gain-boosted cascode current source topology is used, in which the output impedance of the current source (M8–M15) is multiplied by the 65 dB gain of the cascoded boost amplifier (M4–M7). A 5-bit current DAC (M11–M15) enables a programmable output current from 0–115 mA with a 3.9 mA LSB. While the µLDs in this work operate under 40 mA (see Fig. 4), this range accommodates a variety of commercial µLDs with threshold currents up to 100 mA for future applications. Since only one laser is turned on at a time, the same driver circuitry is used for all three lasers. Thus, the cascode transistors select between the laser channels. For maximum output swing, Vx is set by a level-shifting diode, M3, to bias M11–M15 at the edge of triode. A headroom of at least 400 mV is required at the drains of M8–M10 (VLD−) to ensure operation in saturation.

Figure 13: Schematic of the 3-channel programmable laser driver.

III-D Imaging Frontend and Readout

The imaging frontend is similar to that presented in [papageorgiou_chip-scale_2020], but without the angle selective gratings as image deblurring is now provided by the FOP. The image sensor consists of a 36×40 array of pixels with a 44×44 µm² Nwell/Psub photodiode and a 55 µm pitch, covering a 2×2.2 mm² FoV. The pixel architecture, shown in Fig. 14(a), is based on a CTIA with CINT=11 fF. To reduce low-frequency noise, reset switch sampling noise, and pixel offset, a correlated double-sampling scheme is implemented with the following pixel timing (illustrated in Fig. 14(b)). First, the voltage on CINT is set to zero during the initial reset phase, TRST, with timing configured in the Set TEXP state. For the exposure time, TEXP, the photocurrent is integrated on CINT generating the pixel output voltage, $V_{OUT}=V_{0}+I_{PD}T_{EXP}/C_{INT}$, which is sampled on reset (CR) and signal (CS) sampling capacitors after intervals of 100 µs and $T_{EXP}+100$ µs, respectively. The final pixel value (VPIXEL) is the difference between the signal (VS) and reset (VR) values.

Figure 14: (a) Active pixel architecture with correlated double sampling. (b) Pixel timing diagram.

After Imaging, the analog pixel values are digitized and stored in memory during the Readout state. Readout duration is set to limit the leakage on the in-pixel sampling capacitors to less than an LSB. Therefore, the readout is performed in parallel across 4 channels, each spanning 10 pixel columns. Each channel consists of an 8-bit differential SAR ADC (Fig. S3) driven by a buffer. The ADC has a dynamic range of 500 mV with an LSB of 1.95 mV, which is below the pixel readout noise (see Section IV.E). The readout circuits operate on a 3.3 V supply to ensure sufficient headroom considering that the in-pixel source followers level-shift the sampled pixel voltages up by 1 V. Thus, the size of CVCP is chosen to maintain VCP above 3.5 V throughout this state. The signal (VS) and reset (VR) pixel values are subtracted by the differential ADCs, and the digitized pixel values are stored immediately after conversion in an 11.52 kb latch-based memory. Unlike the work in [32], this design enables a short Readout time of 5.4 ms, which is not limited by the longer Backscattering state (890 ms at 5 cm depth) that increases with depth due to the longer time of flight of the acoustic waves.
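The readout figures quoted above are mutually consistent, as the short check below illustrates. The variable names are illustrative, the assignment of 36 to rows and 40 to columns is an assumption, and the average time per conversion is an inference that assumes the 5.4 ms Readout is divided evenly among the pixels handled by each ADC.

```python
ROWS, COLS = 36, 40      # pixel array; which dimension is "rows" is ASSUMED
ADC_BITS = 8
ADC_RANGE_V = 0.5        # 500 mV dynamic range
N_CHANNELS = 4
READOUT_TIME_S = 5.4e-3

lsb_v = ADC_RANGE_V / 2**ADC_BITS                   # ADC step size (V)
memory_bits = ROWS * COLS * ADC_BITS                # storage for one frame
cols_per_channel = COLS // N_CHANNELS               # columns served by each ADC
conversions_per_adc = ROWS * COLS // N_CHANNELS     # pixels digitized per ADC
time_per_conversion_s = READOUT_TIME_S / conversions_per_adc  # inferred average

if __name__ == "__main__":
    print(f"LSB = {lsb_v * 1e3:.2f} mV, memory = {memory_bits / 1e3:.2f} kb")
    print(f"{cols_per_channel} columns and {conversions_per_adc} conversions per ADC")
    print(f"average time per conversion = {time_per_conversion_s * 1e6:.0f} us")
```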

III-E Data Transmission

During Backscattering, the memory is read serially ($\Theta_{MOD}$ in Fig. 8) and transmitted by modulating the amplitude of the reflected (backscattered) US pulses using a switch (SMOD in Fig. 8). The uplink communication protocol is shown in the timing diagram in Fig. 11. The transmitted data for each pixel comprises a 9-bit packet containing a header (set to 0) followed by 8 data bits. The header pulse allows for a one-pulse delay to make sure memory is read and loaded into the serializer before data transmission. Additionally, the header is set to a known value of zero to help identify the backscattered bit values.

The external transducer generates a sequence of pulses, each spanning a few cycles of the US carrier, for the header and 8 individual bits. After a time of flight (ToF=33 µs for 5 cm depth), the acoustic pulses reach the piezo and reflect with an amplitude proportional to the reflection coefficient of the piezo, $\Gamma$. $\Gamma$ is dependent on the electrical impedance loading the piezo, $R_{LOAD}$, and, therefore, can be controlled through the SMOD switch. Near the parallel resonance frequency of the piezo, $\Gamma\propto R_{PIEZO}/(R_{LOAD}+R_{PIEZO})$, where $R_{PIEZO}$ is the equivalent resistance of the piezo [ghanbari_optimizing_2020]. The SMOD switch impedance can be configured (hard-coded) by 2 bits to account for different $R_{PIEZO}$ values. After a second ToF, the backscattered signal is received by the external transducer and is demodulated to reconstruct the image. To avoid overlap of high voltage Tx and low voltage reflected Rx pulses, the external transducer transmits 2 bits within 2 ToFs and listens for the next 2 ToFs as shown in Fig. 11.
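The transmit/listen scheme sets the uplink duration. The sketch below estimates it under simplifying assumptions: every window lasts exactly 4 ToFs (2 to transmit, 2 to listen) and carries 2 pulses, and no other protocol overhead is modelled. Function and parameter names are hypothetical.

```python
import math


def backscatter_duration(n_pixels=36 * 40, bits_per_packet=9,
                         tof_s=33e-6, pulses_per_window=2):
    """Estimated time (s) to send one frame over the backscatter uplink.

    Each window is 2 ToFs of transmit followed by 2 ToFs of listening
    and carries pulses_per_window pulses (simplified model).
    """
    n_pulses = n_pixels * bits_per_packet            # header + 8 data bits per pixel
    n_windows = math.ceil(n_pulses / pulses_per_window)
    return n_windows * 4 * tof_s


def payload_rate_bps(duration_s, n_pixels=36 * 40, data_bits=8):
    """Useful data rate (bits/s), counting only the 8 data bits per pixel."""
    return n_pixels * data_bits / duration_s


if __name__ == "__main__":
    d = backscatter_duration()
    print(f"frame time ~ {d * 1e3:.0f} ms, payload rate ~ {payload_rate_bps(d) / 1e3:.1f} kbps")
```

With the 33 µs ToF at 5 cm, this simple model gives a frame time somewhat below 0.9 s, on the order of the 890 ms Backscattering duration quoted earlier, and it scales linearly with depth through the ToF.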

IV Measurement Results

Fig. 15(a) shows the die photo of the chip. The ASIC measures 2.5×5 mm² and is fabricated in a TSMC 180 nm high-voltage (1.8/5/32 V) LDMOS CMOS process. 1.8 V transistors are used for the digital, pixel, and laser driver, and 5 V devices are used for the PMU and pixel readout. Fig. 15(b) shows the power breakdown for the chip where the laser driver dominates the power consumption.

This section presents system-level measurement results for the US wireless link, laser driver, and imaging frontend.

Figure 15: (a) Chip micrograph. (b) Breakdown of system power consumption.

IV-A Measurement Setup

Fig. 16 shows the measurement setup for demonstrating fully wireless operation of the chip. In the acoustic setup, the piezo is submerged at 5 cm depth in a tank of canola oil. An external focused transducer (V314-SU-F1.90IN-PTF, Evident Scientific) at the surface of the tank transmits US signals to the piezo. To minimize interference from US reflections on data uplink, an acoustic absorber (Aptflex F28P, Precision Acoustics) is placed at the bottom of the tank. An FPGA (Opal Kelly, XEM7010) generates the desired US pulse sequence (as in Fig. 11) to control the chip. The timing of the pulse sequence is programmed through a custom user interface that interfaces with the FPGA. The generated waveforms are sent to a high-voltage transducer pulser board (Max14808, Maxim Integrated) to drive the external transducer accordingly.

Figure 16: Measurement setup for wireless imaging.

The chip is directly connected with wires to the piezo for wireless power harvesting and data transfer via US. It is located inside a black box to reduce the background signal from ambient light during imaging. Slide-mounted samples are placed directly on top of the chip. The chip drives the µLDs, mounted on separate PCBs, to transilluminate the sample from above. Admittedly, in vivo the tissue must instead be epi-illuminated, with the light delivered from between the sensor and the tissue. Epi-illumination can be accomplished in the future by directing the laser light through a glass separator or light guide plate placed on top of the sensor [sasagawa_front-light_2023, shin_miniaturized_2023].

After taking an image, the backscattered US pulses are received by the external transducer and captured on an oscilloscope for processing and demodulation. To remove the pixel-to-pixel DC offsets due to the photodiode dark current and mismatch in the readout circuitry, a dark image with the same integration time but with the laser off is subtracted from the final fluorescence image. The dark image is averaged to minimize its noise contribution.

IV-B Ultrasound Wireless Power Transfer

Fig. 17(a) shows the measured PMU waveforms (VPIEZO+, VRECT, VCP, VCSTORE), verifying wireless operation of the full system at 5 cm depth. In this measurement, the system operates with an US power density of 221 mW/cm², which is 31% of the FDA safety limit. Under this minimum required acoustic power condition, VCP charges to 5.5 V in 50 s for the initial image. The charging time decreases to 35 s for consecutive frames with a nonzero initial VCP. The Charge-Up time can be further reduced by increasing the US power density, operating closer to the FDA limits. The output voltages of the rectifier (VRECT) and charge pump (VCP) across different input voltages (VPIEZO+) show that a minimum VPIEZO+=2.42 V is required for stable operation of the chip (Fig. S4).

Figure 17: Measured PMU waveforms during (a) Charge-Up and (b) Imaging and Readout. (c) Measured backscatter waveforms.

Measured PMU waveforms during the Imaging and Readout states are presented in Fig. 17(b). During Imaging (TEXP=8 ms), VCSTORE drops from 5.5 V to 2.5 V while supplying the laser with ILD=37.5 mA from the energy stored in CSTORE. VCP remains at 5.5 V throughout Imaging and drops to 3.5 V during Readout.

Fig. 17(c) shows the measured waveforms while transmitting a single pixel data packet via US backscattering. VPIEZO+ is modulated according to the serial output of the memory ($\Theta_{MOD}$) and the backscattered pulses are received by the external transducer (VBACKSCATTER in Fig. 17(c)). The one bits correspond to a smaller load impedance, but appear larger in amplitude than the zero bits because the piezo is operated between series and parallel resonance frequencies for maximum voltage harvesting.

Fig. 18(a) shows the total acoustic power and acoustic power density (ISPTA) incident on the piezo surface area at 5 cm depth for transverse offsets along the X or Y axis. Fig. 18(b) shows a similar measurement as the depth is adjusted along the Z axis. The acoustic power density is measured with a hydrophone (HGL-1000, Onda) and it is integrated over the piezo area to measure the available acoustic power at the piezo surface. The reported spatial-peak time-average intensity (ISPTA) of the acoustic field is the relevant parameter in calculating FDA safety limits for diagnostic US [health_marketing_2023]. For both transverse and depth offsets, the power decreases as the piezo moves away from the focal point (near 5 cm depth) of the external transducer. The measured transverse and axial FWHMs for ISPTA are 4.5 mm and 60 mm, respectively. In the future, misalignment loss can be reduced through dynamic focusing of the US with beam forming [benedict_phased_2022]. It should be noted that angular misalignment of the piezo with respect to the US beam will also reduce the harvested power [sonmezoglu_monitoring_2021, piech_wireless_2020].

Figure 18: Harvested acoustic power vs. (a) transverse offset and (b) depth.

While charging VCP from 0–5.5 V, the overall electrical energy efficiency of the PMU is 12.7%. The efficiency of the system in converting the available acoustic energy on the face of the piezo to the electrical output energy of the PMU is 3.3%. The output energy of the PMU is calculated by measuring the energy stored in the CSTORE and CVCP and the total energy consumption of the ASIC during Charge-Up. The input acoustic energy is calculated by integrating the measured acoustic power density at the surface of the piezo (Fig. 18(a)) throughout this same period.

IV-C Ultrasound Data Uplink

At 5 cm depth, transmission of one image (11.52 kb) takes 890 ms, resulting in a data rate of 13 kbps. The received backscattered waveform is processed and demodulated to reconstruct the image as follows. First, the signal is bandpass-filtered at the carrier frequency, windowed to select the bit intervals, and then reconstructed with sinc interpolation. The peak-to-peak amplitude is then measured for each pulse and compared with a predetermined threshold to predict the bit value. The serial output of the chip serves as the ground truth.
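The thresholding step of this demodulation chain can be sketched as follows. This is a simplified illustration only: it assumes the waveform has already been bandpass-filtered and interpolated, that the sample indices of each bit window are known, and that data bits are sent MSB-first. The function names are hypothetical.

```python
def demodulate(samples, bit_windows, threshold):
    """Return one bit per (start, stop) sample window.

    A pulse whose peak-to-peak amplitude exceeds the threshold is read
    as a one, since one bits appear larger in the received waveform.
    """
    bits = []
    for start, stop in bit_windows:
        segment = samples[start:stop]
        v_pp = max(segment) - min(segment)      # peak-to-peak amplitude
        bits.append(1 if v_pp > threshold else 0)
    return bits


def packet_to_pixel(packet_bits):
    """Convert a 9-bit packet (zero header + 8 data bits) to a pixel value.

    MSB-first ordering of the data bits is an ASSUMPTION.
    """
    if len(packet_bits) != 9 or packet_bits[0] != 0:
        raise ValueError("expected a 9-bit packet with a zero header")
    value = 0
    for bit in packet_bits[1:]:
        value = (value << 1) | bit
    return value
```

Checking the header bit gives a simple sanity test on packet alignment before the 8 data bits are assembled into a pixel value.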

Fig. 19 shows a histogram of the backscattered signal amplitude for each bit normalized to the threshold amplitude, demonstrating a clear separation between one and zero bits. The measurement shows robust error-free transmission of 90 frames, including a combination of dark frames and images taken with the 650 nm and 455 nm lasers. The bimodal nature of the histogram results from combining data across different imaging conditions and differing interference from the high voltage pulsing of the external transducer on the two pulses received within each interval of 2 ToFs. The device achieves a BER better than $10^{-6}$ (0 out of 1,036,800 bits) with an average modulation index of 5.6%.

Figure 19: Measured bit error rate (BER) at 5 cm depth in oil.

IV-D Laser Driver

Fig. 20 shows measurements of the laser driver and PTAT. The output current of the laser driver (ILD) is measured with a precision measurement unit (B2912A, Keysight). Fig. 20(a) shows the measured ILD across all DAC codes and Fig. 20(b) shows the percent change in ILD as the output voltage of the laser driver, VLD−, drops from 3.5 V to 0.4 V. This range corresponds to the VLD− for a 5.5–2.5 V drop on VCSTORE, accounting for the 2 V forward bias voltage of the 650 nm µLD. For DAC=5 (ILD=20 mA), there is less than 1% variation across the 3.1 V drop, corresponding to 1.3% variation in optical power output of the 650 nm µLD.

Fig. 20(c) shows the variation in the 0.5 V PTAT reference across VCP measured through the VADC0.5V LDO. As VCP drops from 5.5 V to 3.5 V, the PTAT reference varies by about 2.5%, which has minimal effect on the ADC during Readout.

These results are an improvement over [rabbani_towards_2024] where the reference current varied 11.5% over a 1.5 V drop, resulting in a 50% reduction in the laser output power.

Figure 20: Measurements of (a) laser driver current (ILD) vs. DAC code, (b) percent change in ILD vs. driver output voltage (VLD-), and (c) PTAT voltage reference (VREF 0.5 V) variation vs. VCP.

IV-E Imaging Frontend

The photodiode responsivity is determined by measuring pixel output voltage across a range of incident optical powers as shown in Fig. 21(a). We use an LED with a collimator and beam expander to ensure uniform illumination of the sensor. A narrow bandpass interference filter placed in front of the LED selects a specific wavelength. Measurements are made at 535 nm and 705 nm, near the center of the optical frontend passbands. The optical power output of the LED is characterized with a power meter (PM100D, ThorLabs). In Fig. 21(a), the slope indicates pixel gain in mV/pW with TEXP=8 ms. The photodiode responsivity is calculated by dividing pixel gain by the transimpedance gain of the CTIA. The pixels have a mean responsivity of 0.13 A/W (quantum efficiency, QE=30%) and 0.21 A/W (QE=37%) at 535 nm and 705 nm, respectively.
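The quoted quantum efficiencies follow from the standard relation between responsivity and QE, QE = R·h·c/(q·λ). The sketch below applies it to the reported responsivities and also evaluates the integrating transimpedance gain TEXP/CINT; the function names are illustrative and not taken from the paper.

```python
H = 6.62607015e-34       # Planck constant (J s)
C = 299792458.0          # speed of light (m/s)
Q_E = 1.602176634e-19    # electron charge (C)


def quantum_efficiency(responsivity_a_per_w, wavelength_m):
    """QE = R * h * c / (q * lambda), as a fraction between 0 and 1."""
    return responsivity_a_per_w * H * C / (Q_E * wavelength_m)


def ctia_transimpedance(t_exp_s, c_int_f):
    """Integrating transimpedance gain T_EXP / C_INT, in V/A."""
    return t_exp_s / c_int_f


if __name__ == "__main__":
    for r, lam in ((0.13, 535e-9), (0.21, 705e-9)):
        print(f"R = {r} A/W at {lam * 1e9:.0f} nm -> QE = {quantum_efficiency(r, lam):.0%}")
    print(f"CTIA gain at 8 ms, 11 fF: {ctia_transimpedance(8e-3, 11e-15):.2e} V/A")
```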

Figure 21: (a) Pixel output voltage vs. incident optical power. (b) Histogram of measured dark current across pixels. (c) Measured pixel noise under dark condition for a single frame and after 8 averages.

A histogram of the measured dark current across pixels with a Gaussian fit is shown in Fig. 21(b). The mean dark current is 14.9 fA (7.7 aA/µm²) with a standard deviation of 0.7 fA (0.4 aA/µm²). Fig. 21(c) shows the measured pixel output noise in dark condition for different exposure times for a single frame and an average of 8 frames. For TEXP=8 ms, the measured pixel output noise is 5.34 mVrms for a single frame and 1.87 mVrms after 8 averages. The output noise increases with the exposure time due to the shot noise from the increased dark signal.
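Two of these figures can be cross-checked with simple arithmetic: the dark current density for a 44×44 µm² photodiode, and the noise expected after averaging if the frame-to-frame noise is uncorrelated (an idealizing assumption). The helper names below are illustrative.

```python
import math


def dark_current_density(i_dark_a, side_um=44.0):
    """Dark current per unit photodiode area (A/um^2) for a square photodiode."""
    return i_dark_a / side_um**2


def averaged_noise(single_frame_rms, n_frames):
    """RMS noise after averaging n frames with uncorrelated noise."""
    return single_frame_rms / math.sqrt(n_frames)


if __name__ == "__main__":
    print(f"dark current density: {dark_current_density(14.9e-15) * 1e18:.1f} aA/um^2")
    print(f"ideal noise after 8 averages: {averaged_noise(5.34e-3, 8) * 1e3:.2f} mVrms")
```

The ideal square-root scaling predicts a value close to the measured 1.87 mVrms, which suggests the single-frame noise is largely uncorrelated between frames.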

The resolution of the imager is measured with a negative USAF target (Fig. 22(a)) overlaying a uniform layer of Cy5 NHS ester ($\lambda_{EX}$=649 nm, $\lambda_{EM}$=670 nm) dissolved in PBS at 10 µM concentration. The dye is contained with a 150 µm-thick glass coverslip and the target is placed on the imager. The resolution measurements were conducted with wired power and data transfer and using a fiber-coupled 650 nm laser for uniform illumination. Fig. 22(b) shows the sensor image of the element with 125 µm line spacing compared to the microscope reference image in Fig. 22(c). The sensor images this element at 50% contrast as calculated with the line scan in Fig. 22(d). Contrast is calculated as $(V_{MAX}-V_{MIN})/(V_{MAX}+V_{MIN}-V_{BK})$, where $V_{MAX}$ and $V_{MIN}$ are the maximum and minimum pixel values in the bright and dark bars, respectively, and $V_{BK}$ is the background signal. Fig. 22(e) shows the full contrast transfer function measured by imaging elements on the target with line spacing ranging from 79–455 µm and calculating the contrast for each. These results demonstrate that with the FOP, the imager can distinguish line spacing as small as 100 µm with greater than 20% contrast.
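The contrast definition above translates directly into a small helper. The sketch implements the expression exactly as written in the text; the function name and the example values in the usage note are illustrative and are not measured data.

```python
def contrast(v_max, v_min, v_bk=0.0):
    """Contrast as defined in the text: (Vmax - Vmin) / (Vmax + Vmin - Vbk)."""
    denom = v_max + v_min - v_bk
    if denom <= 0:
        raise ValueError("background must be smaller than the summed bar signals")
    return (v_max - v_min) / denom
```

For example, bright and dark bar values in a 3:1 ratio with zero background give a contrast of 0.5.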

Figure 22: Resolution measurements using (a) USAF target. Image of element with 125 µm line width with the sensor (b) and a microscope (c). (d) Line scan of image in (b). (e) Measured contrast transfer function.

To demonstrate three-color imaging, we image a sample containing 15 µm-diameter green ($\lambda_{EX}$=505 nm, $\lambda_{EM}$=515 nm, F8844, Thermo Fisher Scientific), red ($\lambda_{EX}$=645 nm, $\lambda_{EM}$=680 nm, F8843, Thermo Fisher Scientific), and NIR ($\lambda_{EX}$=780 nm, $\lambda_{EM}$=820 nm, DNQ-L069, CD Bioparticles) fluorescent beads. The beads are suspended in 1× PBS solution at a concentration of approximately 10 beads/µL. 50 µL of solution is pipetted into a micro-well chamber slide for imaging. Imaging results are shown in Fig. 23. The sensor images are obtained wirelessly with ILD=18.5 mA, TEXP,GREEN=8 ms, TEXP,RED=16 ms, TEXP,NIR=8 ms. For each color channel, 4 frames are averaged and the channels are colored and overlaid to make the multicolor image. The sensor images show good correspondence with the reference image taken with a bench-top fluorescence microscope (Leica DM-IRB). A few beads do not appear in the sensor image due to non-uniform illumination from the µLDs. There is also a line artifact visible in the NIR channel due to reflections off the wire-bonds, which can be mitigated through more careful fabrication as detailed in [roschelle_multicolor_2024].

Figure 23: 3-color imaging of fluorescent beads.
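The frame averaging and channel overlay described above can be sketched as follows. This is an illustrative example under stated assumptions rather than the processing pipeline used for Fig. 23: the helper names, the random stand-in frames, the per-channel peak normalization, and the mapping of the NIR channel to blue are all hypothetical choices. Only the 36×40 array size and the 4-frame average come from the text.

```python
import numpy as np

ROWS, COLS = 36, 40      # pixel array size stated in the abstract
FRAMES_PER_CHANNEL = 4   # frames averaged per color channel in the bead experiment

def average_frames(frames):
    """Average a stack of frames with shape (n_frames, rows, cols) pixel by pixel."""
    return np.asarray(frames, dtype=float).mean(axis=0)

def overlay_channels(green, red, nir):
    """Build an RGB composite; the NIR channel is pseudo-colored blue (assumption)."""
    def normalize(img):
        img = np.asarray(img, dtype=float)
        peak = img.max()
        return img / peak if peak > 0 else np.zeros_like(img)
    return np.stack([normalize(red), normalize(green), normalize(nir)], axis=-1)

# Random 8-bit frames stand in for real sensor data.
rng = np.random.default_rng(0)
raw = {name: rng.integers(0, 256, size=(FRAMES_PER_CHANNEL, ROWS, COLS))
       for name in ("green", "red", "nir")}
averaged = {name: average_frames(stack) for name, stack in raw.items()}
composite = overlay_channels(averaged["green"], averaged["red"], averaged["nir"])
```

Averaging N frames of uncorrelated noise reduces the noise standard deviation by roughly a factor of the square root of N, which is the usual motivation for averaging a few frames per channel before overlaying them.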

V Ex Vivo Imaging of Immune Response

We conducted an ex vivo mouse experiment to demonstrate the application of our sensor to assessing the response to cancer immunotherapy through dual-color fluorescence imaging of both effector and suppressor cells in the tumor microenvironment. In this study, we measure response to immune checkpoint inhibitors (ICIs), a class of immunotherapy that activates the immune system against cancer by blocking interactions between effector and inhibitory immune cells and cancer [ribas_cancer_2018, robert_decade_2020]. A successful immune response to ICIs requires the activation and proliferation of CD8+ T-cells, the most powerful effector cells in the anticancer response, into the tumor microenvironment [raskov_cytotoxic_2021]. Therefore, CD8+ T-cell infiltration has been identified as an indicator of a favorable immune response [spitzer_systemic_2017]. However, CD8+ T-cell activation can be inhibited by suppressor immune cells such as neutrophils, which regulate the immune system and inflammation in the body and are associated with resistance to ICI immunotherapy [faget_neutrophils_2021, kargl_neutrophil_2019]. Dual-color fluorescence imaging enables a differential measurement of these two control mechanisms of the immune response with the same imaging frontend which is not possible with clinical imaging modalities such as MRI, PET, or CT.

V-A Experimental Design

Fig. S5 outlines the ex vivo experiment design, which uses two engineered cancer models from [woo_cho_t_2022], an LLC lung cancer model (engineered to resist ICIs) and a B16F10 melanoma model (engineered to respond to ICIs). Both tumor models show increased CD8+ T-cell infiltration over the course of treatment. However, while the B16F10 tumors reliably respond, the LLC tumors are resistant to ICI therapy. This resistance has been linked to a T-cell-driven inflammatory response that triggers an influx of neutrophils into the tumor, suppressing T-cell activation [woo_cho_t_2022].

The experiment includes two groups of mice each bearing one type of tumor. Each group consists of a mouse treated with a combination of PD-1 and CTLA-4 inhibitors, a class of ICIs [ribas_cancer_2018], and an untreated mouse injected with a non-therapeutic antibody for control. Three weeks after tumor implantation, the tumors are harvested, sectioned to 4 µm-thick samples, and mounted on glass slides. Two adjacent sections from each tumor are labeled separately with fluorescent probes targeting CD8+ T-cells and neutrophils. CD8+ T-cells are stained with a CD8+ antibody labeled with Cy5 ($\lambda_{EX}$=649 nm, $\lambda_{EM}$=670 nm) and neutrophils are stained with a CD11b antibody labeled with FAM ($\lambda_{EX}$=492 nm, $\lambda_{EM}$=518 nm).

Figure 24: Ex vivo imaging of mouse tumors with and without immunotherapy. Imaging results for (a) the resistant tumor model (LLC) and (b) the responsive model (B16F10). (c) Metrics for quantification of cell populations. (d) Quantified results.

V-B Imaging Results

Images of the tumor samples are captured wirelessly with the sensor and compared with reference images from a bench-top fluorescence microscope. Figs. 24(a) and (b) show the imaging results from the LLC (resistant) and B16F10 (responsive) groups, respectively. For each fluorescent channel, 8 frames are acquired with the chip, using imaging parameters of ILD=18.5 mA, TEXP,Cy5=16 ms, and TEXP,FAM=8 ms. The sensor images are averaged across all frames. The microscope images are overlaid with the cell nuclei of the entire sample, stained with DAPI (blue in the image) to highlight the tumor area. The white lines within the images indicate the boundaries of the tumor tissue. The sensor images are qualitatively consistent with the microscope references, albeit at a lower resolution and with varying intensity across the image due to non-uniform illumination from the µLDs.

To quantify the results for each tumor model, the percent change in the density of both cell types between the untreated and treated mice is calculated according to the metrics in Fig. 24(c). Ground truth cell densities are determined using the microscope images by counting the fraction of cell nuclei (DAPI) labeled with the targeted probe (red and green channel). As the sensor does not have single-cell resolution, the cell density in the sensor images is determined by the fluorescence intensity in the tumor normalized by the area bounded by the dashed white lines in Fig. 24(a) and (b). The background signal is mostly canceled out by measuring percent change.
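The sensor-side quantification can be sketched as below. This is a minimal sketch under assumptions, since the exact metrics are defined in Fig. 24(c) and not reproduced in the text: the function names, the boolean tumor masks, and the toy images are hypothetical, and the density proxy is taken to be summed fluorescence intensity divided by tumor area.

```python
import numpy as np

def intensity_density(image, tumor_mask):
    """Fluorescence intensity inside the tumor boundary, normalized by tumor area."""
    image = np.asarray(image, dtype=float)
    mask = np.asarray(tumor_mask, dtype=bool)
    area = mask.sum()  # tumor area in pixels
    if area == 0:
        raise ValueError("tumor mask is empty")
    return image[mask].sum() / area

def percent_change(untreated, treated):
    """Percent change from the untreated to the treated sample."""
    return 100.0 * (treated - untreated) / untreated

# Toy 4x4 images (made-up values) with the tumor occupying the central 2x2 region.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
untreated_img = np.full((4, 4), 10.0)
treated_img = np.full((4, 4), 15.0)

change = percent_change(intensity_density(untreated_img, mask),
                        intensity_density(treated_img, mask))
```

For the microscope ground truth, the same `percent_change` helper would apply to the fraction of DAPI-stained nuclei that are also labeled by the targeted probe.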

The quantified results from the sensor and microscope are shown in Fig. 24(d). The sensor captures the general trends observed with the microscope, corresponding with the results in [woo_cho_t_2022]. The increase in the density of CD8+ T-cells in both the B16F10 samples (sensor: 847%, microscope: 582%) and the LLC samples (sensor: 38%, microscope: 191%) suggests an effector response to immunotherapy in both models. However, a larger increase in CD11b density after treatment in the LLC tumors (sensor: 66%, microscope: 75%) than in the B16F10 tumors (sensor: 42%, microscope: 51%) suggests resistance in the LLC model due to an increase in neutrophils. These trends would better reflect the results in [woo_cho_t_2022] with a larger sample size to account for heterogeneity across the mice and neutrophil-specific biomarkers (CD11b also stains other myeloid cells).

Nevertheless, these results highlight the utility of multicolor fluorescence imaging in evaluating the response to cancer immunotherapy, enabling a differential measurement of both effector (e.g. CD8+ T-cell) and suppressor (e.g. neutrophil) populations. As shown by the increase in CD8+ T-cells in resistant LLC tumors, an increase in effector populations does not always correlate with response as the effector cells may be inhibited by suppressor cells. Therefore, simultaneously imaging suppressor populations such as neutrophils has two advantages: (1) enabling a more accurate assessment of response and (2) revealing the mechanisms of resistance (e.g. neutrophil interference with CD8+ T-cells) that can be targeted with second-line therapies (e.g. blocking T-cell-induced immunosuppressive inflammation signaling as done in [woo_cho_t_2022]). Future in vivo studies can highlight the unique capability of our sensor to analyze real-time dynamics in the spatial interactions of these populations, which is critical for developing a more nuanced understanding of the immune response [vitale_intratumoral_2021].

TABLE I: COMPARISON OF STATE-OF-THE-ART CHIP-SCALE FLUORESCENCE IMAGE SENSORS

VI Conclusion

We present a fully wireless implantable image sensor capable of multicolor fluorescence imaging for real-time monitoring of response to cancer immunotherapy. A comparison of our work with recent chip-scale fluorescence imagers and sensors is shown in Table I. To the knowledge of the authors, our work is the first to demonstrate fully wireless operation of the entire system with biologically relevant samples. In [zhu_ingestible_2023], a battery is used for power. In [rabbani_towards_2024] the US link operates above FDA limits and low imager sensitivity limits wireless imaging to high concentrations of fluorescent dye. With a power harvesting frontend incorporating a cross-coupled charge-pump, we demonstrate safe operation at 5 cm depth in oil with US power densities at 31% of FDA limits. The robust communication link demonstrates a BER better than 10-6 with a 13 kbps data rate. Moreover, optimization of the storage capacitor sizing enables a small form factor of 0.09 cm3 demonstrated with a mechanical assembly of the implant.

Our system is specifically designed for multicolor fluorescence imaging with a three-channel laser driver to drive different color µLDs, a US downlink for programming imaging and laser settings, and an optical frontend design consisting of a multi-bandpass interference filter and a FOP. Our optical frontend provides greater than 6 OD of excitation rejection of lasers within 15 nm of the filter band edge, a significant improvement over the CMOS metal filters reported in [aghlmand_65-nm_2023, zhu_ingestible_2023] and competitive with the combination of absorption and interference filters in [moazeni_mechanically_2021, rustami_needle-type_2020]. To the best of our knowledge, this work is the first chip-scale fluorescence imager capable of three-color imaging, which we demonstrate through imaging fluorescent beads. The pixel noise is on the same order of magnitude as [aghlmand_65-nm_2023, zhu_ingestible_2023] despite these works using pixel sizes accommodating large low-noise readout circuits with higher power consumption.

By imaging CD8+ T-cell and neutrophil populations in ex vivo mouse tumors with or without immunotherapy, we show how multicolor fluorescence imaging can enable accurate identification of non-responders and their underlying resistance mechanisms. Such sub-millimeter imaging of multiple biomarkers is inaccessible to clinical imagers such as MRI, CT, or PET and can inform personalized treatment regimens addressing the wide variability in response to immunotherapy across patients. With future work in biocompatible packaging and integration of optics for epi-illumination, our platform can open the door to real-time, chronic monitoring of the spatial interactions of multiple cell populations deep in the body.

Acknowledgments

The authors would like to thank the sponsors of BSAC (Berkeley Sensors and Actuators Center) and TSMC for chip fabrication. We appreciate technical discussion and advice from Prof. Rikky Muller, Efthymios Papageorgiou, Hossein Najafiaghdam, and Mohammad Meraj Ghanbari. Thank you to Eric Yang, Jade Pinkenburg, and Kingshuk Daschowdhury for their technical assistance. Finally, we acknowledge Dr. Mohammad Naser from the Biological Imaging Development CoLab (BIDC) and Kristine Wong from the Laboratory for Cell Analysis (LCA) for the development of the immunohistochemistry workflow and imaging.

\printbibliography
Micah Roschelle (Graduate Student Member, IEEE) received his B.S. degree in electrical engineering from Columbia University, New York, NY, USA, in 2020. He is currently pursuing a Ph.D. in electrical engineering and computer sciences at the University of California, Berkeley, Berkeley, CA, USA. His research interests include implantable medical devices, lensless fluorescence imaging, and biomedical sensor design.
Rozhan Rabbani (Graduate Student Member, IEEE) received the B.Sc. degree from Sharif University of Technology, Tehran, Iran, in 2018. She received her Ph.D. degree from the Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, CA, USA in 2024. At Sharif University of Technology, she worked on analog and mixed-signal circuit design to optimize power consumption for a wearable ECG sensor. She worked at Apple Inc. during the summers of 2020 and 2022 on calibration and test automation for high-speed applications. Her research at UC Berkeley was focused on developing biomedical circuits and sensors, specifically implantable image sensors for cancer therapy. She was the recipient of the Apple Ph.D. Fellowship in Integrated Circuits in 2022, the 2024 SSCS Rising Stars, and the 2024 SSCS Predoctoral Achievement Award.
Surin Gweon (Graduate Student Member, IEEE) received the B.S. degree in electrical engineering from Korea University, Seoul, South Korea, in 2018, and the M.S. degree in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea, in 2020. She worked with System LSI Business, Samsung Electronics Company Ltd., Hwaseong, South Korea until 2023. She is currently pursuing the Ph.D. degree in electrical engineering and computer sciences at the University of California at Berkeley (UC Berkeley), Berkeley, CA, USA. Her research interests include image sensor front-end and mixed-mode computing for implantable biomedical applications.
Rohan Kumar (Graduate Student Member, IEEE) received his B.S. degree in electrical engineering and computer science (EECS) from the University of California, Berkeley, Berkeley, CA, USA, in 2024. He is currently pursuing a Ph.D. in EECS at UC Berkeley. His interests include electronic design automation, die-to-die interconnects, and open-source hardware.
Alec Vercruysse (Graduate Student Member, IEEE) received a B.S. degree in engineering from Harvey Mudd College in Claremont, CA, USA in 2023. He is currently pursuing a Ph.D. in electrical engineering and computer sciences at the University of California, Berkeley. His interests include the system-level design of circuits for implantable medical devices.
Nam Woo Cho, MD, PhD, is a physician scientist in radiation oncology. Dr. Cho received his undergraduate degree from Harvard College, and MD/PhD degrees from the University of Pennsylvania. He completed his internship in Internal Medicine at St. Mary’s Medical Center in San Francisco, and his residency in radiation oncology at UCSF. Following his postdoctoral work with Dr. Matthew Spitzer, he started his own research laboratory as an Assistant Professor in the Department of Radiation Oncology and Department of Otolaryngology-Head and Neck Surgery. His research focuses on understanding fundamental immunologic mechanisms that govern responses to immune stimulating therapies including radiation therapy and immune checkpoint inhibitors. Dr. Cho leverages molecular, cellular, organismal, and computational platforms to define novel mechanisms, pioneering the next generation of radio- and immune-therapeutics.
Matthew H. Spitzer received the B.S. degree from Georgetown University, Washington, DC, USA and the Ph.D. degree from Stanford University, Stanford, CA, USA, in 2015. In 2016, he joined University of California San Francisco (UCSF), San Francisco, CA, USA as a UCSF Parker Fellow and a Sandler Faculty Fellow. He is currently Associate Professor in the Departments of Otolaryngology-Head and Neck Surgery and Microbiology & Immunology at UCSF and an investigator of the Parker Institute for Cancer Immunotherapy, San Francisco, USA. His research aims to develop an understanding of how the immune system coordinates its responses across the organism with an emphasis on tumor immunology by combining methods in experimental immunology and cancer biology with computation.
Ali M. Niknejad (Fellow, IEEE) received the B.S.E.E. degree from the University of California at Los Angeles, Los Angeles, CA, USA, in 1994, and the master’s and Ph.D. degrees in electrical engineering from the University of California at Berkeley (UC Berkeley), Berkeley, CA, in 1997 and 2000, respectively. He is currently a Professor with the EECS Department, UC Berkeley, the Faculty Director of the Berkeley Wireless Research Center (BWRC), Berkeley, and the Associate Director of the Center for Ubiquitous Connectivity. His research interests include wireless and broadband communications and biomedical imaging and sensors, integrated circuit technology (analog, RF, mixed signal, and mm-wave), device physics and compact modeling, and applied electromagnetics. Prof. Niknejad and his coauthors received the 2017 IEEE Transactions on Circuits and Systems—I: Regular Papers Darlington Best Paper Award, the 2017 Most Frequently Cited Paper Award (2010–2016) at the Symposium on VLSI Circuits, and the CICC 2015 Best Invited Paper Award. He was a recipient of the 2012 ASEE Frederick Emmons Terman Award for his textbook on electromagnetics and RF integrated circuits. He was a co-recipient of the 2013 Jack Kilby Award for Outstanding Student Paper for his work on an efficient Quadrature Digital Spatial Modulator at 60 GHz, the 2010 Jack Kilby Award for Outstanding Student Paper for his work on a 90-GHz pulser with 30 GHz of bandwidth for medical imaging, and the Outstanding Technology Directions Paper at ISSCC 2004 for co-developing a modeling approach for devices up to 65 GHz.
Vladimir M. Stojanović (Fellow, IEEE) received the Dipl. Ing. degree from the University of Belgrade, Belgrade, Serbia, in 1998, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA, USA, in 2005. He was with Rambus, Inc., Los Altos, CA, USA, from 2001 to 2004; and the Massachusetts Institute of Technology, Cambridge, MA, USA, as an Associate Professor, from 2005 to 2013. He is currently a Professor of electrical engineering and computer sciences with the University of California at Berkeley, Berkeley, CA, USA, where he is also a Faculty CoDirector of the Berkeley Wireless Research Center (BWRC). His current research interests include the design, modeling, and optimization of integrated systems, from CMOS-based VLSI blocks and interfaces to system design with emerging devices, such as NEM relays and silicon photonics, design and implementation of energy-efficient electrical and optical networks, and digital communication techniques in high-speed interfaces and high-speed mixed-signal integrated circuit (IC) design. Dr. Stojanović was a recipient of the 2006 IBM Faculty Partnership Award, the 2009 NSF CAREER Award, the 2008 ICCAD William J. McCalla, the 2008 IEEE TRANSACTIONS ON ADVANCED PACKAGING, and the 2010 ISSCC Jack Raper Best Paper and 2020 ISSCC Best Forum Presenter Awards. He was a Distinguished Lecturer of IEEE Solid-State Circuits Society from 2012 to 2013.
Mekhail Anwar (Member, IEEE) received the B.A. degree in physics from the University of California Berkeley (UC Berkeley), Berkeley, CA, USA, where he graduated as the University Medalist, the Ph.D. degree in electrical engineering and computer sciences from the Massachusetts Institute of Technology, Cambridge, MA, USA, in 2007, and the M.D. degree from the University of California San Francisco (UCSF), San Francisco, CA, in 2009. In 2014, he completed a Radiation Oncology residency with UCSF. In 2014, he joined the faculty with the Department of Radiation Oncology, UCSF, with a joint appointment in Electrical Engineering and Computer Sciences at UC Berkeley (in 2021), where he is currently an Associate Professor. His research focuses on developing sensors to guide cancer care using integrated-circuit based platforms. His research centers on directing precision cancer therapy using integrated circuit-based platforms to guide therapy. His work in chip scale imaging has been recognized with awards from the DOD (Physician Research Award) and the NIH (Trailblazer), and in 2020 he was awarded the prestigious DP2 New Innovator Award for work on implantable imagers. At UCB and UCSF he focuses on the development of implantable sensors across both imaging, molecular sensing and radiation therapy. He is board certified in Radiation Oncology and maintains a clinical practice specializing in the treatment of GI malignancies with precision radiotherapy.
Figure 25: (a) LDO schematic. (b–d) different error amplifier topologies. Table SI details design parameters for each of the 5 different LDOs.
Figure 26: Power-on reset (POR) circuit schematic. Initially, when the chip is charging up and VCP is below 3.9 V, the diodes, D1–D5, do not conduct current and the gate of M2 is pulled low. Therefore, M2 is off and the POR signal is pulled low, continuously resetting the finite state machine (FSM). When VCP reaches 3.9 V, D1–D5 turn on and the POR signal will be pulled high by M2, allowing the FSM to function normally. R1 and C1 prevent sudden fluctuations on VCP from triggering changes in the POR signal. As VCP falls below 3.9 V during the Readout state, the feedback transistor, M1, ensures that M2 stays on even as D1–D4 turn off, maintaining a high POR signal as long as the digital 1.8 V LDO is still operational. The 3.9 V POR voltage is selected to be slightly higher than the 3.5 V minimum operational voltage on VCP, to ensure stable operation of the chip as the FSM wakes up.
Figure 27: ADC with conventional differential charge-redistribution SAR architecture. The ADC compares the pixel output voltage (VS) with the CDS reset voltage (VR) and produces an 8-bit digital pixel value. In the pixel architecture shown in Fig. 14(a), the output of the in-pixel CTIA is initially set to 0.6 V during the reset phase (TRST). The CTIA is supplied by the 1.8 V analog LDO such that its maximum output voltage is 1.6 V, resulting in a dynamic range of 1 V. During Readout, the CDS outputs—pixel signal voltage (VS) and reset voltage (VR)—are level-shifted up 1 V by the in-pixel source followers before the ADC samples them. Therefore, the common-mode input of the comparator is set to 2.1 V, which is provided by the VADC2.1V LDO. This voltage is selected to be halfway between the minimum signal at the ADC input, corresponding to the CDS reset voltage (VR=1.6 V), and the maximum achievable pixel signal considering the CTIA dynamic range (VS=1.6+1 V=2.6 V).
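The choice of the 2.1 V comparator common-mode level follows from the voltages quoted above and can be written out as a short worked calculation. The subscripted symbol names ($V_{SHIFT}$, $V_{CM}$, and the ADC-referred quantities) are introduced here only for illustration.

```latex
\begin{align*}
V_{R,\mathrm{ADC}} &= V_{RST} + V_{SHIFT} = 0.6\,\mathrm{V} + 1\,\mathrm{V} = 1.6\,\mathrm{V} \\
V_{S,\mathrm{ADC}}^{\max} &= V_{CTIA}^{\max} + V_{SHIFT} = 1.6\,\mathrm{V} + 1\,\mathrm{V} = 2.6\,\mathrm{V} \\
V_{CM} &= \frac{V_{R,\mathrm{ADC}} + V_{S,\mathrm{ADC}}^{\max}}{2}
        = \frac{1.6\,\mathrm{V} + 2.6\,\mathrm{V}}{2} = 2.1\,\mathrm{V}
\end{align*}
```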

Despite the 1 V headroom for the pixel output signal, and given the typically low photocurrent signals from fluorescence imaging, the ADC dynamic range is set to 0.5 V, which dominates over the dynamic range of the CTIA. The ADC dynamic range is set through the VADC0.5V LDO and is selected to achieve an LSB of 1.95 mV, which adds negligible quantization noise to pixel readout noise (see Fig. 21(c) in the main text). This dynamic range is sufficient for capturing signals in the maximum exposure setting of 248 ms, where dark current uses up 338 mV of the dynamic range.

The strong-arm comparator utilizes PMOS input devices and operates on the 3.3 V supply with digital logic operating in the 1.8 V domain. The capacitive DAC (C1–C10) is implemented with MIM capacitors where the smallest capacitor, C1 (40.56 fF), consists of two minimum-size unit capacitors of 20.28 fF each.

While the ADC has a 9-bit output, the 9th sign bit is discarded and is not stored in the memory or transmitted via US backscatter. This sign bit is unnecessary because VS is always greater than VR even when no light is incident on the pixel: given the average pixel dark current of 14.9 fA and the 11 fF in-pixel integration cap, even at the shortest exposure time, TEXP=8 ms, VS is expected to be 10.8 mV (~5.5 ADC LSBs) greater than VR on average. One conversion cycle of the ADC lasts 15 CLK cycles which is 16.3 µs assuming a 920 kHz CLK from the US carrier. The simulated effective number of bits (ENOB) of the extracted ADC with the buffers, LDOs and PTAT is 7.93 bits.
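The numbers quoted for the ADC can be cross-checked with a short calculation. The sketch below only uses values stated in the text (0.5 V range, 8 bits, 14.9 fA average dark current, 11 fF integration capacitor, 920 kHz clock, 15 clock cycles per conversion). The variable and function names are hypothetical, and the dark signal is modeled simply as charge integrated on the capacitor, V = I·T/C.

```python
ADC_RANGE_V = 0.5            # ADC dynamic range set by the VADC0.5V LDO
ADC_BITS = 8                 # stored bits per pixel (sign bit discarded)
I_DARK_A = 14.9e-15          # average pixel dark current
C_INT_F = 11e-15             # in-pixel integration capacitor
F_CLK_HZ = 920e3             # clock derived from the US carrier
CYCLES_PER_CONVERSION = 15   # clock cycles per ADC conversion

# One LSB of the 8-bit converter over the 0.5 V range (about 1.95 mV).
lsb_v = ADC_RANGE_V / 2 ** ADC_BITS

def dark_signal_v(t_exp_s):
    """Voltage accumulated from dark current alone over one exposure."""
    return I_DARK_A * t_exp_s / C_INT_F

dark_min_v = dark_signal_v(8e-3)     # shortest exposure, about 10.8 mV
dark_min_lsb = dark_min_v / lsb_v    # about 5.5 LSBs
dark_max_v = dark_signal_v(248e-3)   # longest exposure, about 0.34 V
conversion_time_s = CYCLES_PER_CONVERSION / F_CLK_HZ  # about 16.3 microseconds
```

With these rounded inputs the dark signal at the 248 ms exposure comes out near 336 mV, close to the 338 mV quoted for that setting. The small difference is presumably due to rounding of the dark current or capacitance values.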
Figure 28: Measured harvested voltage at the output of the rectifier (VRECT) and the output of the charge pump (VCP) for different voltages at the input of the rectifier (VPIEZO+). A minimum voltage of VPIEZO+=2.42 V is required to harvest VCP=3.9 V, high enough to trigger the POR (Fig. S2) and, thus, ensure stable operation of the ASIC. This voltage is 27% lower than for nominal operation of the chip (VCP=5.5 V), which requires VPIEZO+=3.3 V. VRECT is less than VPIEZO+ due to the nonzero |VDS| of the actively controlled PMOS switches (M3 and M4 in Fig. 9(a)) in the active rectifier.
Figure 29: Experimental design for the ex vivo mouse experiment.