Journal article, Scientific Reports, Year: 2022

Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem

Alice Valentini
Tony Dejean
Wilfried Thuiller
Jérôme Murienne
Sébastien Brosse

Abstract

High-throughput DNA sequencing is becoming an increasingly important tool to monitor and better understand biodiversity responses to environmental changes in a standardized and reproducible way. Environmental DNA (eDNA) from organisms can be captured in ecosystem samples and sequenced using metabarcoding, but processing large volumes of eDNA data and annotating sequences to recognized taxa remains computationally expensive. Speed and accuracy are two major bottlenecks in this critical step. Here, we evaluated the ability of convolutional neural networks (CNNs) to process short eDNA sequences and associate them with taxonomic labels. Using a unique eDNA data set collected in highly diverse Tropical South America, we compared the speed and accuracy of CNNs with that of a well-known bioinformatic pipeline (OBITools) in processing a small region (60 bp) of the 12S ribosomal DNA targeting freshwater fishes. We found that the taxonomic labels from the CNNs were comparable to those from OBITools, with high correlation levels for the composition of the regional fish fauna. The CNNs enabled the processing of raw fastq files at a rate of approximately 1 million sequences per minute, which was about 150 times faster than with OBITools. Given the good performance of CNNs in the highly diverse ecosystem considered here, the development of more elaborate CNNs promises fast deployment for future biodiversity inventories using eDNA.
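To make the idea of a CNN reading a short eDNA sequence more concrete, here is a minimal Python sketch. It is not the authors' architecture: the abstract does not describe how sequences are encoded, so the one-hot encoding, the helper names `one_hot_encode` and `conv1d_valid`, and the toy "ACG" filter are all assumptions chosen for illustration. The sketch turns a sequence into a position-by-base matrix and slides a single filter along it, which is the basic operation a convolutional layer repeats with many learned filters.

```python
# Illustrative sketch only; names and encoding are assumptions, not the paper's method.
BASES = "ACGT"

def one_hot_encode(seq, length=60):
    """Encode a DNA string as `length` rows of 4 values (A, C, G, T).

    Unknown bases (e.g. N) and padding positions stay all-zero.
    Sequences longer than `length` are truncated.
    """
    rows = [[0.0] * len(BASES) for _ in range(length)]
    for i, base in enumerate(seq.upper()[:length]):
        j = BASES.find(base)
        if j >= 0:
            rows[i][j] = 1.0
    return rows

def conv1d_valid(x, kernel):
    """Slide one filter along the sequence axis without padding.

    Returns one activation per start position: the sum of the
    element-wise product between the filter and the window it covers.
    """
    width = len(kernel)
    activations = []
    for start in range(len(x) - width + 1):
        total = 0.0
        for k in range(width):
            for c in range(len(BASES)):
                total += x[start + k][c] * kernel[k][c]
        activations.append(total)
    return activations

# A hand-written filter that responds most strongly to the motif "ACG".
motif_filter = one_hot_encode("ACG", length=3)
encoded = one_hot_encode("TTACGTN", length=10)
activations = conv1d_valid(encoded, motif_filter)
best_position = activations.index(max(activations))
```

In a trained network the filter weights are learned rather than hand-written, and the activations from many filters feed later layers that output a taxonomic label. The default `length=60` only mirrors the 60 bp marker size mentioned in the abstract.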
Main file
s41598-022-13412-w.pdf (2.56 MB)
Origin: Publisher files allowed on an open archive
License: CC BY - Attribution

Dates and versions

hal-03824009, version 1 (05-06-2023)

Cite

Benjamin Flück, Laëtitia Mathon, Stéphanie Manel, Alice Valentini, Tony Dejean, et al. Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem. Scientific Reports, 2022, 12 (1), pp.10247. ⟨10.1038/s41598-022-13412-w⟩. ⟨hal-03824009⟩