FVLLMONTI: The 3D Neural Network Compute Cube $(N^{2}C^{2})$ Concept for Efficient Transformer Architectures Towards Speech-to-Speech Translation - Équipe Matériaux et Procédés pour la Nanoélectronique
Conference paper, Year: 2024

Jens Trommer
  • Role: Author
Cigdem Cakirlar
  • Role: Author
Thomas Mikolajick
  • Role: Author
Giovanni Ansaloni
  • Role: Author
Alireza Amirshahi
  • Role: Author
David Atienza
  • Role: Author

Abstract

This multi-partner project contribution presents the mid-term results of the Horizon 2020 FVLLMONTI project. In this project, we develop a new, ultra-efficient class of artificial neural network (ANN) accelerators, the neural network compute cube ($N^{2}C^{2}$), which is specifically designed to execute complex machine learning tasks in a 3D technology and thereby provide the high computing power and ultra-high efficiency needed for future edge-AI applications. We showcase its effectiveness by targeting the challenging class of Transformer ANNs, tailored for Automatic Speech Recognition and Machine Translation, the two fundamental components of speech-to-speech translation. To gain the full benefit of the accelerator design, we develop disruptive vertical transistor technologies and execute design-technology co-optimization (DTCO) loops from the single-device level to the cell and compute-cube levels. Further, hardware-software co-optimization is carried out, e.g., by compressing the speech recognition and translation models for energy-efficient execution without substantial loss in precision.
Main file
3010_pdf_upload.pdf (1.78 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04739538, version 1 (16-10-2024)

Cite

Ian O'Connor, Sara Mannaa, Alberto Bosio, Bastien Deveautour, Damien Deleruyelle, et al. FVLLMONTI: The 3D Neural Network Compute Cube $(N^{2}C^{2})$ Concept for Efficient Transformer Architectures Towards Speech-to-Speech Translation. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), Mar 2024, Valencia, Spain. pp.1-6, ⟨10.23919/DATE58400.2024.10546700⟩. ⟨hal-04739538⟩