Optimizing CPU cache allocation for cloud functions and high-performance applications

Armel Jeatsa Toulepi

Abstract

Contemporary IT services rely mainly on two paradigms: cluster computing and cloud computing. The former distributes computing tasks across nodes that work together as a single system, while the latter virtualizes computing infrastructure so that it can be provided on demand. This thesis focuses on last-level cache (LLC) allocation in the context of these two paradigms, concentrating specifically on distributed parallel applications and FaaS functions. The LLC is a memory space shared by all processor cores on a NUMA socket. As a shared resource, it is subject to contention, which can significantly affect performance. To alleviate this problem, Intel has implemented a technology in its processors that enables the cache to be partitioned and allocated: Cache Allocation Technology (CAT).

In this work, using CAT, we first examine the impact of LLC contention on the performance of FaaS functions. We then study how this contention, when it occurs in a subset of the nodes of a cluster, affects the overall performance of a running distributed application. Building on these studies, we propose CASY and CADiA, intelligent LLC allocation systems for FaaS functions and distributed applications, respectively. CASY uses supervised machine learning to predict the cache requirements of a FaaS function from the size of its input file, while CADiA dynamically builds the cache usage profile of a distributed application and performs a harmonized allocation across all nodes according to this profile. These two solutions achieved performance gains of up to around 11% for CASY and 13% for CADiA.
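As an illustration of the kind of partitioning CAT makes possible, the sketch below computes disjoint, contiguous capacity bitmasks, one bit per cache way, for two workloads. It is not the allocation logic of CASY or CADiA: the function names, the 11-way cache, and the 3/8 split are assumptions chosen for the example. On Linux, masks of this form are typically applied through the resctrl interface.

```python
def contiguous_mask(first_way: int, n_ways: int) -> int:
    """Return a bitmask with n_ways consecutive bits set, starting at first_way."""
    if first_way < 0 or n_ways < 1:
        raise ValueError("first_way must be >= 0 and n_ways >= 1")
    return ((1 << n_ways) - 1) << first_way


def partition_ways(total_ways: int, shares: dict[str, int]) -> dict[str, int]:
    """Give each workload its own contiguous, non-overlapping block of ways.

    `shares` maps a (hypothetical) workload name to the number of ways it gets.
    """
    if sum(shares.values()) > total_ways:
        raise ValueError("requested more ways than the cache provides")
    masks = {}
    next_way = 0
    for name, n_ways in shares.items():
        masks[name] = contiguous_mask(next_way, n_ways)
        next_way += n_ways
    return masks


# Assumed example: an 11-way LLC split between a FaaS function and a
# distributed application.
masks = partition_ways(11, {"faas_function": 3, "distributed_app": 8})
for name, mask in masks.items():
    print(f"{name}: 0x{mask:03x}")
```

Because the two masks share no bits, neither workload can evict the other's lines from its partition, which is the isolation property the contention studies rely on.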

