
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k).
Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2402.18932 [pdf, other]

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

Authors: Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Fadi Biadsy, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov

Abstract: Collecting high-quality studio recordings of audio is challenging, which limits the language coverage of text-to-speech (TTS) systems. This paper proposes a framework for scaling a multilingual TTS model to 100+ languages using found data without supervision. The proposed framework combines speech-text encoder pretraining with unsupervised training using untranscribed speech and unspoken text data sources, thereby leveraging massively multilingual joint speech and text representation learning. Without any transcribed speech in a new language, this TTS model can generate intelligible speech in >30 unseen languages (CER difference of <10% to ground truth). With just 15 minutes of transcribed, found data, we can reduce the intelligibility difference to 1% or less from the ground-truth, and achieve naturalness scores that match the ground-truth in several languages.

Submitted 16 July, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: To appear in ICASSP 2024. Demo page: https://google.github.io/tacotron/publications/extending_tts/

arXiv:2311.00945 [pdf, other]

E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Authors: Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen

Abstract: We propose Easy End-to-End Diffusion-based Text to Speech, a simple and efficient end-to-end text-to-speech model based on diffusion. E3 TTS directly takes plain text as input and generates an audio waveform through an iterative refinement process. Unlike many prior work, E3 TTS does not rely on any intermediate representations like spectrogram features or alignment information. Instead, E3 TTS models the temporal structure of the waveform through the diffusion process. Without relying on additional conditioning information, E3 TTS could support flexible latent structure within the given audio. This enables E3 TTS to be easily adapted for zero-shot tasks such as editing without any additional training. Experiments show that E3 TTS can generate high-fidelity audio, approaching the performance of a state-of-the-art neural TTS system. Audio samples are available at https://e3tts.github.io.

Submitted 1 November, 2023; originally announced November 2023.

Comments: Accepted by ASRU 2023

arXiv:2305.18802 [pdf, other]

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

Authors: Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna

Abstract: This paper introduces a new speech dataset called "LibriTTS-R" designed for text-to-speech (TTS) use. It is derived by applying speech restoration to the LibriTTS corpus, which consists of 585 hours of speech data at 24 kHz sampling rate from 2,456 speakers and the corresponding texts. The constituent samples of LibriTTS-R are identical to those of LibriTTS, with only the sound quality improved. Experimental results show that the LibriTTS-R ground-truth samples showed significantly improved sound quality compared to those in LibriTTS. In addition, neural end-to-end TTS trained with LibriTTS-R achieved speech naturalness on par with that of the ground-truth samples. The corpus is freely available for download from http://www.openslr.org/141/.

Submitted 30 May, 2023; originally announced May 2023.

Comments: Accepted to Interspeech 2023

arXiv:2303.01664 [pdf, other]

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

Authors: Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani

Abstract: Speech restoration (SR) is a task of converting degraded speech signals into high-quality ones. In this study, we propose a robust SR model called Miipher, and apply Miipher to a new SR application: increasing the amount of high-quality training data for speech generation by converting speech samples collected from the Web to studio-quality. To make our SR model robust against various degradation, we use (i) a speech representation extracted from w2v-BERT for the input feature, and (ii) a text representation extracted from transcripts via PnG-BERT as a linguistic conditioning feature. Experiments show that Miipher (i) is robust against various audio degradation and (ii) enable us to train a high-quality text-to-speech (TTS) model from restored speech samples collected from the Web. Audio samples are available at our demo page: google.github.io/df-conformer/miipher/

Submitted 14 August, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

Comments: Accepted to WASPAA 2023

arXiv:2210.15868 [pdf, other]

Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation

Authors: Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding

Abstract: Adapting a neural text-to-speech (TTS) model to a target speaker typically involves fine-tuning most if not all of the parameters of a pretrained multi-speaker backbone model. However, serving hundreds of fine-tuned neural TTS models is expensive as each of them requires significant footprint and separate computational resources (e.g., accelerators, memory). To scale speaker adapted neural TTS voices to hundreds of speakers while preserving the naturalness and speaker similarity, this paper proposes a parameter-efficient few-shot speaker adaptation, where the backbone model is augmented with trainable lightweight modules called residual adapters. This architecture allows the backbone model to be shared across different target speakers. Experimental results show that the proposed approach can achieve competitive naturalness and speaker similarity compared to the full fine-tuning approaches, while requiring only $\sim$0.1% of the backbone model parameters for each speaker.

Submitted 27 October, 2022; originally announced October 2022.

Comments: Submitted to ICASSP 2023

arXiv:2210.15447 [pdf, other]

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

Authors: Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

Abstract: This paper proposes Virtuoso, a massively multilingual speech-text joint semi-supervised learning framework for text-to-speech synthesis (TTS) models. Existing multilingual TTS typically supports tens of languages, which are a small fraction of the thousands of languages in the world. One difficulty to scale multilingual TTS to hundreds of languages is collecting high-quality speech-text paired data in low-resource languages. This study extends Maestro, a speech-text joint pretraining framework for automatic speech recognition (ASR), to speech generation tasks. To train a TTS model from various types of speech and text data, different training schemes are designed to handle supervised (paired TTS and ASR data) and unsupervised (untranscribed speech and unspoken text) datasets. Experimental evaluation shows that 1) multilingual TTS models trained on Virtuoso can achieve significantly better naturalness and intelligibility than baseline ones in seen languages, and 2) they can synthesize reasonably intelligible and naturally sounding speech for unseen languages where no high-quality paired TTS data is available.

Submitted 15 March, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

Comments: To appear in ICASSP 2023

arXiv:2203.13339 [pdf, other]

Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation

Authors: Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobuyuki Morioka

Abstract: End-to-end speech-to-speech translation (S2ST) without relying on intermediate text representations is a rapidly emerging frontier of research. Recent works have demonstrated that the performance of such direct S2ST systems is approaching that of conventional cascade S2ST when trained on comparable datasets. However, in practice, the performance of direct S2ST is bounded by the availability of paired S2ST training data. In this work, we explore multiple approaches for leveraging much more widely available unsupervised and weakly-supervised speech and text data to improve the performance of direct S2ST based on Translatotron 2. With our most effective approaches, the average translation quality of direct S2ST on 21 language pairs on the CVSS-C corpus is improved by +13.6 BLEU (or +113% relatively), as compared to the previous state-of-the-art trained without additional data. The improvements on low-resource language are even more significant (+398% relatively on average). Our comparative studies suggest future research directions for S2ST and speech representation learning.

Submitted 27 June, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

Comments: Interspeech 2022

arXiv:2203.08177 [pdf, other]

doi 10.1103/PhysRevApplied.17.054005

Spin-optical dynamics and quantum efficiency of single V1 center in silicon carbide

Authors: Naoya Morioka, Di Liu, Öney O. Soykal, Izel Gediz, Charles Babin, Rainer Stöhr, Takeshi Ohshima, Nguyen Tien Son, Jawad Ul-Hassan, Florian Kaiser, Jörg Wrachtrup

Abstract: Color centers in silicon carbide are emerging candidates for distributed spin-based quantum applications due to the scalability of host materials and the demonstration of integration into nanophotonic resonators. Recently, silicon vacancy centers in silicon carbide have been identified as a promising system with excellent spin and optical properties. Here, we in-depth study the spin-optical dynamics of single silicon vacancy center at hexagonal lattice sites, namely V1, in 4H-polytype silicon carbide. By utilizing resonant and above-resonant sub-lifetime pulsed excitation, we determine spin-dependent excited-state lifetimes and intersystem-crossing rates. Our approach to inferring the intersystem-crossing rates is based on all-optical pulsed initialization and readout scheme, and is applicable to spin-active color centers with similar dynamics models. In addition, the optical transition dipole strength and the quantum efficiency of V1 defect are evaluated based on coherent optical Rabi measurement and local-field calibration employing electric-field simulation. The measured rates well explain the results of spin-state polarization dynamics, and we further discuss the altered photoemission dynamics in resonant enhancement structures such as radiative lifetime shortening and Purcell enhancement. By providing a thorough description of V1 center's spin-optical dynamics, our work provides deep understanding of the system which guides implementations of scalable quantum applications based on silicon vacancy centers in silicon carbide.

Submitted 19 April, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: 32 pages, 12 figures

arXiv:2109.04737 [pdf]

doi 10.1038/s41563-021-01148-3

Nanofabricated and integrated colour centres in silicon carbide with high-coherence spin-optical properties

Authors: Charles Babin, Rainer Stöhr, Naoya Morioka, Tobias Linkewitz, Timo Steidl, Raphael Wörnle, Di Liu, Erik Hesselmeier, Vadim Vorobyov, Andrej Denisenko, Mario Hentschel, Christian Gobert, Patrick Berwian, Georgy V. Astakhov, Wolfgang Knolle, Sridhar Majety, Pranta Saha, Marina Radulaski, Nguyen Tien Son, Jawad Ul-Hassan, Florian Kaiser, Jörg Wrachtrup

Abstract: Optically addressable spin defects in silicon carbide (SiC) are an emerging platform for quantum information processing. Lending themselves to modern semiconductor nanofabrication, they promise scalable high-efficiency spin-photon interfaces. We demonstrate here nanoscale fabrication of silicon vacancy centres (VSi) in 4H-SiC without deterioration of their intrinsic spin-optical properties. In particular, we show nearly transform limited photon emission and record spin coherence times for single defects generated via ion implantation and in triangular cross section waveguides. For the latter, we show further controlled operations on nearby nuclear spin qubits, which is crucial for fault-tolerant quantum information distribution based on cavity quantum electrodynamics.

Submitted 29 September, 2021; v1 submitted 10 September, 2021; originally announced September 2021.

Comments: 18 pages, 4 figures

Journal ref: Nature Materials 21, 67-73 (2022)

arXiv:2003.12591 [pdf, other]

doi 10.1038/s41534-020-00310-0

Spectrally reconfigurable quantum emitters enabled by optimized fast modulation

Authors: Daniil M. Lukin, Alexander D. White, Rahul Trivedi, Melissa A. Guidry, Naoya Morioka, Charles Babin, Öney O. Soykal, Jawad Ul Hassan, Nguyen Tien Son, Takeshi Ohshima, Praful K. Vasireddy, Mamdouh H. Nasr, Shuo Sun, Jean-Phillipe W. MacLean, Constantin Dory, Emilio A. Nanni, Jörg Wrachtrup, Florian Kaiser, Jelena Vučković

Abstract: The ability to shape photon emission facilitates strong photon-mediated interactions between disparate physical systems, thereby enabling applications in quantum information processing, simulation and communication. Spectral control in solid state platforms such as color centers, rare earth ions, and quantum dots is particularly attractive for realizing such applications on-chip. Here we propose the use of frequency-modulated optical transitions for spectral engineering of single photon emission. Using a scattering-matrix formalism, we find that a two-level system, when modulated faster than its optical lifetime, can be treated as a single-photon source with a widely reconfigurable photon spectrum that is amenable to standard numerical optimization techniques. To enable the experimental demonstration of this spectral control scheme, we investigate the Stark tuning properties of the silicon vacancy in silicon carbide, a color center with promise for optical quantum information processing technologies. We find that the silicon vacancy possesses excellent spectral stability and tuning characteristics, allowing us to probe its fast modulation regime, observe the theoretically-predicted two-photon correlations, and demonstrate spectral engineering. Our results suggest that frequency modulation is a powerful technique for the generation of new light states with unprecedented control over the spectral and temporal properties of single photons.

Submitted 27 July, 2020; v1 submitted 27 March, 2020; originally announced March 2020.

Comments: 9 pages, 6 figures; Supplementary Information

Journal ref: npj Quantum Inf 6, 80 (2020)

arXiv:2001.02459 [pdf, ps, other]

doi 10.1103/PhysRevApplied.13.054017

Vibronic states and their effect on the temperature and strain dependence of silicon-vacancy qubits in 4H silicon carbide

Authors: Péter Udvarhelyi, Gergő Thiering, Naoya Morioka, Charles Babin, Florian Kaiser, Daniil Lukin, Takeshi Ohshima, Jawad Ul-Hassan, Nguyen Tien Son, Jelena Vučković, Jörg Wrachtrup, Adam Gali

Abstract: Silicon-vacancy qubits in silicon carbide (SiC) are emerging tools in quantum technology applications due to their excellent optical and spin properties. In this paper, we explore the effect of temperature and strain on these properties by focusing on the two silicon-vacancy qubits, V1 and V2, in 4H SiC. We apply density functional theory beyond the Born-Oppenheimer approximation to describe the temperature dependent mixing of electronic excited states assisted by phonons. We obtain polaronic gap around 5 and 22 meV for V1 and V2 centers, respectively, that results in significant difference in the temperature dependent dephasing and zero-field splitting of the excited states, which explains recent experimental findings. We also compute how crystal deformations affect the zero-phonon-line of these emitters. Our predictions are important ingredients in any quantum applications of these qubits sensitive to these effects.

Submitted 18 April, 2020; v1 submitted 8 January, 2020; originally announced January 2020.

Comments: 8 pages, 5 figures

Journal ref: Phys. Rev. Applied 13, 054017 (2020)

arXiv:2001.02455 [pdf]

doi 10.1038/s41467-020-16330-5

Spin-controlled generation of indistinguishable and distinguishable photons from silicon vacancy centres in silicon carbide

Authors: Naoya Morioka, Charles Babin, Roland Nagy, Izel Gediz, Erik Hesselmeier, Di Liu, Matthew Joliffe, Matthias Niethammer, Durga Dasari, Vadim Vorobyov, Roman Kolesov, Rainer Stöhr, Jawad Ul-Hassan, Nguyen Tien Son, Takeshi Ohshima, Péter Udvarhelyi, Gergő Thiering, Adam Gali, Jörg Wrachtrup, Florian Kaiser

Abstract: Quantum systems combining indistinguishable photon generation and spin-based quantum information processing are essential for remote quantum applications and networking. However, identification of suitable systems in scalable platforms remains a challenge. Here, we investigate the silicon vacancy centre in silicon carbide and demonstrate controlled emission of indistinguishable and distinguishable photons via coherent spin manipulation. Using strong off-resonant excitation and collecting photons from the ultra-stable zero-phonon line optical transitions, we show a two-photon interference contrast close to 90% in Hong-Ou-Mandel type experiments. Further, we exploit the system's intimate spin-photon relation to spin-control the colour and indistinguishability of consecutively emitted photons. Our results provide a deep insight into the system's spin-phonon-photon physics and underline the potential of the industrially compatible silicon carbide platform for measurement-based entanglement distribution and photonic cluster state generation. Additional coupling to quantum registers based on recently demonstrated coupled individual nuclear spins would further allow for high-level network-relevant quantum information processing, such as error correction and entanglement purification.

Submitted 10 January, 2020; v1 submitted 8 January, 2020; originally announced January 2020.

Comments: Manuscript and Methods: 21 pages, 4 figures Supplementary Information: 18 pages, 6 figures, 1 table

Journal ref: Nature Communications 11, 2516 (2020)

arXiv:1906.05964 [pdf]

doi 10.1021/acs.nanolett.9b02774

Electrical charge state manipulation of single silicon vacancies in a silicon carbide quantum optoelectronic device

Authors: Matthias Widmann, Matthias Niethammer, Dmitry Yu. Fedyanin, Igor A. Khramtsov, Torsten Rendler, Ian D. Booker, Jawad Ul Hassan, Naoya Morioka, Yu-Chen Chen, Ivan G. Ivanov, Nguyen Tien Son, Takeshi Ohshima, Michel Bockstedte, Adam Gali, Cristian Bonato, Sang-Yun Lee, Jörg Wrachtrup

Abstract: Colour centres with long-lived spins are established platforms for quantum sensing and quantum information applications. Colour centres exist in different charge states, each of them with distinct optical and spin properties. Application to quantum technology requires the capability to access and stabilize charge states for each specific task. Here, we investigate charge state manipulation of individual silicon vacancies in silicon carbide, a system which has recently shown a unique combination of long spin coherence time and ultrastable spin-selective optical transitions. In particular, we demonstrate charge state switching through the bias applied to the colour centre in an integrated silicon carbide opto-electronic device. We show that the electronic environment defined by the doping profile and the distribution of other defects in the device plays a key role for charge state control. Our experimental results and numerical modeling evidence that control of these complex interactions can, under certain conditions, enhance the photon emission rate. These findings open the way for deterministic control over the charge state of spin-active colour centres for quantum technology and provide novel techniques for monitoring doping profiles and voltage sensing in microscopic devices.

Submitted 23 June, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

arXiv:1903.12236 [pdf, other]

doi 10.1038/s41467-019-13545-z

Coherent electrical readout of defect spins in 4H-SiC by photo-ionization at ambient conditions

Authors: Matthias Niethammer, Matthias Widmann, Torsten Rendler, Naoya Morioka, Yu-Chen Chen, Rainer Stöhr, Jawad Ul Hassan, Shinobu Onoda, Takeshi Ohshima, Sang-Yun Lee, Amlan Mukherjee, Junichi Isoya, Nguyen Tien Son, Jörg Wrachtrup

Abstract: Quantum technology relies on proper hardware, enabling coherent quantum state control as well as efficient quantum state readout. In this regard, wide-bandgap semiconductors are an emerging material platform with scalable wafer fabrication methods, hosting several promising spin-active point defects. Conventional readout protocols for such defect spins rely on fluorescence detection and are limited by a low photon collection efficiency. Here, we demonstrate a photo-electrical detection technique for electron spins of silicon vacancy ensembles in the 4H polytype of silicon carbide (SiC). Further, we show coherent spin state control, proving that this electrical readout technique enables detection of coherent spin motion. Our readout works at ambient conditions, while other electrical readout approaches are often limited to low temperatures or high magnetic fields. Considering the excellent maturity of SiC electronics with the outstanding coherence properties of SiC defects the approach presented here holds promises for scalability of future SiC quantum devices.

Submitted 28 March, 2019; originally announced March 2019.

Journal ref: Nature Communications vol 10, 5569 (2019)

arXiv:1812.04284 [pdf]

Laser writing of scalable single colour centre in silicon carbide

Authors: Yu-Chen Chen, Patrick S. Salter, Matthias Niethammer, Matthias Widmann, Florian Kaiser, Roland Nagy, Naoya Morioka, Charles Babin, Jürgen Erlekampf, Patrick Berwian, Martin Booth, Jörg Wrachtrup

Abstract: Single photon emitters in silicon carbide (SiC) are attracting attention as quantum photonic systems. However, to achieve scalable devices it is essential to generate single photon emitters at desired locations on demand. Here we report the controlled creation of single silicon vacancy ($V_{Si}$) centres in 4H-SiC using laser writing without any post-annealing process. Due to the aberration correction in the writing apparatus and the non-annealing process, we generate single $V_{Si}$ centres with yields up to 30%, located within about 80 nm of the desired position in the transverse plane. We also investigated the photophysics of the laser writing $V_{Si}$ centres and conclude that there are about 16 photons involved in the laser writing $V_{Si}$ centres process. Our results represent a powerful tool in fabrication of single $V_{Si}$ centres in SiC for quantum technologies and provide further insights into laser writing defects in dielectric materials.

Submitted 11 December, 2018; originally announced December 2018.

Showing 1–16 of 16 results for author: Morioka, N