HETEROFOR 1.0: a spatially explicit model for exploring the response of structurally complex forests to uncertain future conditions. II. Phenology and water cycle

the response Abstract Climate change affects forest growth in numerous and sometimes opposite ways and the resulting trend is often difficult to predict for a given site. Integrating and structuring the knowledge gained from the monitoring and experimental studies into process-based models is an interesting approach to predict the response of forest ecosystems to climate change. While the first 5 generation of such models operates at stand level, we need now individual-based and spatially-explicit approaches in order to account for structurally complex stands whose importance is increasingly recognized in the changing environment context. Among the climate-sensitive drivers of forest growth, phenology and water availability are often cited as crucial elements. They influence, for example, the length of the vegetation period during which photosynthesis takes place and the stomata opening, which determines the photosynthesis rate. 10 In this paper, we describe the phenology and water balance modules integrated in the tree growth model HETEROFOR and evaluate them on six Belgian sites. More precisely, we assess the ability of the model to reproduce key phenological processes (budburst, leaf development, yellowing and fall) as well as water fluxes. Three variants are used to predict budburst (Uniforc, Unichill and Sequential), which differ regarding the inclusion of chilling and/or forcing periods and the calculation of the coldness or heat accumulation. Among the three, the Sequential approach is 15 the least biased (overestimation of 2.46 days) while Uniforc (chilling not considered) best accounts for the interannual variability (Pearson’s R = 0.68). For the leaf development, yellowing and fall, predictions and observation are in accordance. Regarding the water balance module, the predicted throughfall is also in close agreement with the measurements (Pearson’s R = 0.856, bias = -1.3%) and the soil water dynamics across the year is well-reproduced for all the study sites (Pearson’s R comprised between 0.893 and 0.950, and bias between -1.81 and -9.33%). The positive results from the model 20 assessment will allow us to use it reliably in projection studies to evaluate the impact of climate change on tree growth and test how diverse forestry practices can adapt forests to these changes.


Introduction
Forests play an important role in regulating the climate system as their evapotranspiration and land surface properties (e.g. albedo, roughness) determine water and energy exchanges with the atmosphere (Stocker et al., 2013;Naudts et al., 2016).
Moreover, given the forest ability to sequester carbon in biomass and soil, they also affect climate by acting on the global carbon cycle (Schlamadinger and Marland 1996;Whitehead, 2011;Le Quéré et al., 2017). Forest ecosystems also provide 5 many other services such as biodiversity conservation, soil and water protection and recreation (Millennium Ecosystem Assessment, 2005). The extent to which the provision of these services will be ensured in the future is however quite uncertain and depends on the response and adaptability of these ecosystems to global changes (Lindner et al., 2010).
Forests experience numerous and fast perturbations in the context of anthropogenic global changes: physical environment modifications such as increasing CO2 (Reyer et al., 2014) and O3 concentrations (Lorenz et al., 2010;Ainsworth et al., 2012), 10 rising nitrogen depositions (Solberg et al., 2009) or climate change (Boisvenue and Running, 2006) coupled to landscape fragmentation and the subsequent biodiversity loss (Fahrig, 2003), the appearance of pests (Williams and Liebhold, 1995;Flower et al., 2015), diseases (Desprez-Loustau et al., 2006;Sturrock et al., 2011) and invasive species (Walther et al., 2009) as well as the modification of forest management practices (Noormets et al., 2015) linked to the evolution of forestry paradigms and society (Raum and Potter, 2015). In this study, we focus on climate change. According to European climate projections of 15 the last IPCC report (Kovats et al., 2014), all Europe will face a temperature increase between 1 and 5.5°C depending on the greenhouse gas emission scenario (Jacob et al., 2014). The temperature rise will be especially important in summer in the South of Europe and in winter in Northern Europe, leading among others to a decrease in the frequency of frost day occurrence.
Rainfall projections vary more regionally. A precipitation trend gradient should appear with 25% wetter climate conditions in the North and 15% dryer ones in the South while no clear trend emanates for continental Europe (Jacob et al., 2014). Moreover,20 in most of Europe, rainfall is expected to increase in winter and decrease in summer. Finally, climate extremes are projected to increase for the whole continent. In particular, the frequency of heat waves, the length of droughts and the magnitude of heavy rainfall events are likely to rise while a short increase in wind speed extremes could occur in winter over the Centre and the North of Europe.
The rapidly changing climate has already affected the forest productivity, which has globally increased since the middle of the 25 20th century (Boisvenue and Running, 2006). In North-Eastern France and Belgium, for example, beech productivity increased on average by 50% during the 20th century (Aertsen et al., 2014, Charru et al., 2010. Overall, the two main processes regulating forest growth, photosynthesis and respiration, are both stimulated by climate changes. While the higher temperatures in spring trigger earlier budburst and therefore extend the photosynthesis period (Menzel et al., 2006;Park et al., 2016), the rise in atmospheric CO2 increases the photosynthesis rate due to higher intercellular 30 CO2 concentrations (Ainsworth and Long., 2005;Thompson et al., 2017). For Europe, Menzel et al. (2006)  4 when ambient carbon dioxide concentrations were elevated to 550 ppm (Norby et al., 2005;Norby et al., 2010). In parallel, photosynthesis and maintenance respiration are favoured by the increase in air temperature (Aber et al., 2001;Yamori et al., 2014). Yet, there is no consensus on which of respiration and photosynthesis sensitivity to temperature will have the dominant effect (Zhang et al., 2017). Overall, so far, even if the enhanced photosynthesis has been attenuated by a higher maintenance respiration, the resulting climate change impact has been an increased forest productivity when soil water and nutrient 5 availability were not limiting (Boisvenue and Running, 2006). For the sites with a low extractable water reserve, the water stress experienced by the trees could intensify in the future due to increasing evapotranspiration rates and more frequent summer droughts. With the soil drying, photosynthesis is progressively reduced due to stomatal closure and the net primary production (NPP) is decreased. If the soil water potential approaches the wilting point such as in 1976, 2003 and 2018 in Europe, vitality loss and even tree mortality may occur due to carbon starvation and/or hydraulic failure depending on the tree 10 species strategy to cope with water stress (Ciais et al., 2005;McDowell, 2011;Choat et al., 2012). However, higher CO2 levels increases the water use efficiency (Keenan et al., 2013) and allow the trees to reduce their stomatal conductance while maintaining the photosynthesis active (Leuzinger and Körner, 2007;Franck et al., 2015). Besides this water stress, the response of forest ecosystems to increased atmospheric CO2 is constrained by nutrient availability including nitrogen, to the point of not responding at all on the nutritionally poorest sites (Oren et al., 2001;Fernandez-Martinez et al., 2014). 15 Since climate change affects some processes positively and others negatively and given the interactions among factors as well as the feedback and acclimation mechanisms, it is not easy to predict the resulting effect of climate change on tree growth at a given site (Lindner et al., 2014;Herr et al., 2016). Knowledge about climate change has been acquired based on long-term monitoring studies that are limited to the observed changes and on experiments of environment manipulation generally analysing one or two factors at a time on a limited period (CO2 enrichment, rainfall exclusion…). In order to apprehend the 20 complex functioning of forest ecosystems, the use of process-based modelling is a complementary approach that allows to integrate and structure the existing knowledge and to make extrapolations for unprecedented conditions like those projected for the coming decades.
Process-based models were originally built to predict forest growth response to environmental changes at stand level without accounting for management operations and canopy heterogeneity. Such models were therefore suitable for pure even-aged 25 stands but hardly manage to simulate mixed and structurally-complex stands (Pretzsch et al., 2007). Yet, nowadays, a promising way to adapt forests to climate change is to progressively turn them into uneven-aged and mixed stands using continuous cover forestry and natural-disturbance based management to improve their stress resistance and resilience (Messier et al., 2015). To account for the spatial heterogeneity, some process-based models were designed or adapted to simulate various tree cohorts (characterized by a same species and size class). Yet, this approach only considers the vertical dimension of spatial 30 heterogeneity while implementing innovative forestry practices in structurally-complex stands requires to account for the horizontal dimension through a spatially-explicit approach at tree level (Pretzsch et al., 2007;Fontes et al., 2010). Several papers have demonstrated that this level of spatial description is crucial for addressing hydrological questions. For, example, individual evapotranspiration strongly depends on the radiation intercepted by the tree and on the local resistance to water vapour transfer. For the same open-air climate conditions, two trees with identical dimensions can have different evapotranspiration rates if they experience contrasted light and wind conditions due to the size, density and species composition of their neighbours. Not accounting for this local conditions using one or two dimensions approaches can generate errors in 5 the evapotranspiration calculation at the tree and stand levels (Flerchinger et al., 2015;Vezy et al., 2018).
As the models of this particular type are very few (Simioni et al., 2016) and generally do not take into consideration tree nutrition and nutrient cycling (or in a very simplified way), we decided to develop a new model called HETEROFOR (for HETEROgeneous FORests). This model describes tree growth dynamics based on mechanistic approach in structurallycomplex stands and in a changing environment. It is based on resource sharing and integrates the main abiotic productivity 10 and vitality factors. The creation of a new model was driven as well by the fact that the comparison of models of the same type are interesting to evaluate conceptual differences and uncertainties, to highlight the relative importance of processes and to determine their optimal level of description according to the question addressed.
The processes regulating the carbon fluxes and the dimensional growth constitute the core of the HETEROFOR model and are described in Jonard et al. (in review, 2019). Here, we focus on the description of two modules essential for predicting the 15 impact of climate change on tree growth: phenology and water balance. In addition, we used data from long-term forest monitoring to evaluate the capacity of the model to reproduce key phenological phases (budburst, leaf development, yellowing and fall) and the soil water content dynamics as well as to estimate throughfall and deep drainage. Evaluating each module separately is necessary to ensure the consistency of the whole model and to avoid that different error types compensate each other. Given the number of parameters, good predictions can often be obtained on integrative variables such as the diameter at 20 breast height (dbh) increment but this is not sufficient to guarantee the quality of the model. A realistic evaluation should test each module component separately with independent data and then assess the overall model quality of predictions (Soares et al., 1995).  (Dufour-Kowalski et al., 2012) that provides the execution system and procedures to run 5 simulations and display the outputs. Still, apart from these data structures and operative methods, all initialisation and evolution procedures are specific to HETEROFOR. The initialisation phase of the model consists in loading different files (tree species parameters, tree and stand characteristics, chemical and physical soil properties, meteorological data and fruit production data) in order to create trees and soil horizons. Then, tree growth is calculated yearly according to the HETEROFOR methods presented in Jonard et al. (in review, 2019). So far, HETEROFOR is adapted and calibrated only for deciduous species but the 10 adaptation to evergreen species is under progress.
Once the initialisation is completed, the first routine called is the calculation of phenological periods from meteorological data, which is described is Sect. 2.1.2. This function provides key phenological dates and daily foliage state (proportions of leaf biomass and of green leaves relatively to full leaf development) for each day during the year. These phenological outputs are notably used for the radiation balance carried out using the SAMSARALIGHT library coupled to HETEROFOR (Courbaud 15 et al., 2003). According to a ray tracing approach and based on the solar radiation measurements from the meteorological file, this library differentiates the direct and the diffuse components from the global radiation and determines for both components the part of energy absorbed by the crown and the trunk of each tree and the part that reaches forest floor. All this information is required to estimate evapotranspiration components and tree photosynthesis. All aboveground and belowground water fluxes are calculated according to the processes described in Sect. 2.1.3, which allows to perform a water balance for each soil horizon 20 and to update its soil water content.
GPP is estimated for each individual tree using the photosynthesis method implemented in the model CASTANEA of CAPSIS (Dufrêne et al., 2005). The sunlit and shaded leaf proportions, the direct and diffuse photosynthetically active radiation (PAR) absorbed per unit of leaf area and the mean soil water potential are required as input variables for CASTANEA. A part of the GPP is used for growth and maintenance respiration, the remaining part constituting the NPP. Maintenance respiration can be 25 estimated as a fraction of the GPP or calculated for each tree compartment by a method accounting for the living biomass, its nitrogen concentration and a Q10 function that describes the temperature dependence. Growth respiration corresponds to a fraction of the carbon used to build the new tissues. NPP is then distributed to the different tree compartments (branches, trunk, roots, leaves) giving priority to the functional organs, namely, leaves and fine roots. The carbon sharing between these two sinks depends on the tree nutritional status, trees with a poorer nutrient status allocating relatively more carbon to fine roots. 30 7 allometry relationships. All these processes involving carbon fluxes are described in details in Jonard et al. (in review ,2019).
The HETEROFOR model also contains a tree nutrition and nutrient cycling module that will be described in a future paper.

Phenological module
The phenological module aims at simulating the evolution of leaf from budburst to yellowing and leaf fall in order to update the foliage status at a daily time step, namely, the proportions of leaf biomass and of green leaves relatively to complete leaf 5 development. These two foliage properties are key variables to simulate energy, water and carbon fluxes within the forest ecosystem. The leaf biomass proportion calculated for each tree species allows to predict the seasonal evolution of the individual leaf area. The first leaf appearance triggers the start of the leaved period running until all the leaves have fallen. The proportion of green leaves impacts photosynthesis and tree transpiration, as these processes are not active anymore on discoloured leaves. When leaves start yellowing, they still intercept rainfall while their photosynthetic activity and their 10 transpiration are progressively reduced.
The following phenological phases are distinguished, in chronological order:

-
Chilling period: accumulation of coldness that breaks the dormancy. It is initiated at the chilling starting date (t0) and ends at the forcing starting date (t1).
-Forcing period: accumulation of heat that initiates the leaf development in the bud and leads to the budburst (budburst 15 date = t2a).

-
Leaf development: progressive growth of the leaves from budburst to the complete leaf development (leaf development date = t2b).
-Ageing: accumulation of coldness that is initiated at the ageing starting date (t3) and ends at the yellowing starting date (t4a). 20

-
Yellowing: loss of photosynthetic activity linked to the decrease of day length. This phase ends at the yellowing ending date (t4b).
-Falling: the fall of the dead leaves starts (t5a) when less than 60% of the leaves are still green and continues until the leaf fall ending date (t5b).
Since the phenological timing can vary considerably between species, the phenology dates are calculated for each tree species 25 separately. Intra-specific differences are also likely to occur according to the age or social status (Cole and Sheldon, 2017) though are not considered here.
The phenological module is optional in HETEROFOR. Activating the phenology requires an hourly meteorological file. If not activated, the model uses identical budburst and leaf fall dates for all years and tree species set by the user.
The principle behind the whole phenology module is similar for each phase. A state variable is increasing progressively 30 growing at a rate depending on meteorological conditions (mainly air temperature). When the phase state reaches a certain threshold, the start of a new phase is triggered, except for the leaf yellowing and fall that are partly simultaneous.  (Chuine, 2000), the Unichill (Chuine, 2000) and Sequential (Kramer, 1994) models. The first only considers forcing while the latter ones integrate both chilling and forcing.
The Unichill model starts to operate when the day of year corresponds to the chilling starting date (t0). At this moment, the daily chilling rate (Rc) is calculated according to 5 with Ca, Cb and Cc (°C), chilling parameters T, the daily average temperature (°C).
This rate is summed each day until reaching the chilling threshold (C*) that triggers the forcing starting date (t1). For the 10 Uniforc model, t1 is fixed. Regarding the forcing period, the forcing rate (Rf) is calculated using the following equation in both models : with Fb and Fc (°C), parameters. 15 The budburst is activated when the sum of the daily forcing rates equals the forcing threshold (F*). For the sequential model, the following equations are considered for Rc and Rf : with 20 Tmin, Tmax and Topt, the minimum, maximum and optimal temperatures (°C), respectively, a , b and c (°C), forcing parameters.
The reason for this multi model implementation is that phenological model efficiency is extremely site-dependent (White et al., 1997). For example, studies have often shown that the models including chilling were less precise in Northern locations with generally sufficient cold accumulation to break dormancy (Leinonen and Kramer, 2002). Therefore, the choice of the 25 model should be done by the user with regards to the site.
As the data used for the calibration represented the phenology of an average tree, the model shifts forward the start of the budburst by half the mean budburst period (extending from the budburst date of the earliest tree to that of the latest) in order where T is the daily average temperature of the current day (°C).
The leaf proportion (leafProp, g g -1 ) is calculated for each day according to with 10 t, the current day.
A constant date, defined according to Dufrêne et al. (2005), is considered for the start of the ageing process ( 3 ). This process does not alter leaf quality but is a prerequisite for leaf yellowing (t4a) that is initiated when the cumulated daily ageing rate The leaf yellowing calculation gives the green leaf proportion, greenProp (g g -1 ), which provides the fraction of remaining green leaves compared to the maximum green leaf amount. It is set to 1 before the start of yellowing, and then decreases with day length according to the following equation: 20 with DLt and DLt4a, the day lengths (hours) for the current day and t4a, respectively, DLmin, the minimum day length (hours) value over the year, and Y, a parameter. 25 The day length (hours) is calculated according to Teh (2006): where λ is the site latitude (rad) and δ, the solar declination (rad) determined as = − The yellowing phase ends when the green leaf proportion reaches 0. The leaf fall ( 5 ) is set to start rapidly after yellowing initiation, namely, when greenProp reaches 0.60, considering that leaves no longer photosynthetically active can quickly fall.
The leaf fall rate (Rfall) is calculated daily and is used to update leafProp. It depends on the wind and frost episodes. While the frost weakens the leaf petiole, the wind can break it and take away the leaf. For this reason, leafProp is determined as follows for each day t: 5 with fampl, a frost amplifier coefficient fixed to 1 before the occurrence of five consecutive hours with air temperature below 0°C and is then set to 2 and 3 for oak and beech, respectively, WS is the daily average wind speed (m s -1 ), 10 Rfall is the falling rate (s m -1 d -1 ) calibrated as described in Sect. 2.2.
According to Eq. (10), leafPropt progressively decreases from 1 to 0 while it cannot take a value below greenPropt, accounting for the fact that green leaves are not expected to fall. Finally, when all leaves have fallen, the trees enter in the leafless period until the budburst of the following year.

Water balance module 15
The water balance module operates at an hourly time step and simulates the sharing of incident rainfall into the main forest water fluxes and pools, namely, interception (i.e., water storage on foliage and bark, and evaporation), throughfall, stemflow, water movements between soil horizons and deep drainage, transpiration and soil water uptake in the different soil horizons, and soil evaporation (Fig. 1). Surface runoff and groundwater level rise are not yet included at this stage.
In a first step, the parameters considered as constant during the leaved and leafless periods are estimated: maximum foliage 20 and bark storage capacities, throughfall and stemflow proportions (described hereafter) and absorbed radiation proportions.
Then, the various water fluxes are calculated at an hourly time step.

Foliage and bark storage capacity
The maximum foliage storage capacity of the stand (Cfoliage_max, l) is calculated by summing the storage capacity of each tree 25 species: with A _ , the leaf area of all the trees of species sp (m²), cfoliage_sp, the foliage storage capacity for that species (mm or l per m² of leaf). 30 Bark storage capacity depends on the season (i.e., leafed and leafless periods) and on the tree species. It is derived from a linear model proposed by André et al. (2008a) predicting the individual stemflow (sf, l) produced during a rain event as a function of tree girth (C130, cm) and rainfall amount (R, mm): = + . 130 + . + . 130 . + + + where a (l), b (l cm -1 ), c (m²) and d (m² cm -1 ) are fixed effect parameters varying with the species and the season and 5 and are random factors characterizing the tree and the rain event variability.
As it multiplies the rainfall amount in Eq. (12), the term "c + d.C130" may be interpreted as an estimation for the stemflow rate (sfrate, l mm -1 ). In other respects, André et al. (2008a) determined rainfall thresholds for stemflow appearance (Rmin, mm), defined as the amount of rainfall required to produce stemflow at the base of the trunk. This threshold was found to be independent of tree size while it depends on both the season and the species. Multiplying the sfrate estimations by Rmin values 10 for the corresponding species and season provides estimates of the tree bark storage capacity (cbark, l), namely, the amount of water accumulated on branch and trunk bark before stemflow occurs at tree base: The individual cbark estimates are then summed over all trees for each season to determine leafless and leaved stand bark storage capacity: 15 where subscripts 'll' and 'ld' refer to the leafless and the leaved periods, respectively.

Throughfall and stemflow proportions 20
For a given tree, the proportion of stand rainfall reaching the ground at the base of the trunk as stemflow may be calculated by dividing the stemflow rate (see above) by the stand area (Astand, m²): The stemflow proportion is then calculated separately for each tree species and for the leafless and the leaved periods by summing the corresponding tree stemflow proportions: 25 The stemflow proportion is also calculated at the stand scale for each period: Finally, stand level throughfall proportions are obtained directly from the stemflow proportions:

Absorbed radiation proportions
During the leaved period, the radiation absorbed by the trees is provided by the SAMSARALIGHT library either for the whole period (simplified radiation balance) or for every hour of key phenological dates (detailed radiation balance). It may be 5 determined either by considering absorption by tree crowns as a function of leaf area density and ray path length through the crown by applying the Beer-Lambert law, or by specifying relative crown radiation absorption coefficients for each species.
At the stand scale, the proportion of incident radiation absorbed per unit of leaf area during the vegetation period (% _ ² ) is calculated by summing the radiation absorbed by each crown ( _ , MJ) and dividing by the incident radiation and the leaf area of the whole stand: 10 with RAD the incident radiation cumulated over the whole vegetation period (MJ m -²) and Aleaf is the stand leaf area (m²).
Similarly, the proportion of incident radiation absorbed per unit of bark area is obtained by 15 with _ , the radiation absorbed by the trunk of a given tree (MJ) and is the stand bark area (m²).
At the tree level, the proportion of incident radiation absorbed by the crown expressed per unit of leaf area (% _ _ ² ) 20 may be formulated as where aleaf is the tree leaf area (m²).
The proportion of incident radiation transmitted to the understorey is the mean transmitted radiation ( , MJ −2 ), determined as the difference between the incident radiation and the radiation absorbed by the trees, divided by the incident 25 radiation: The radiation transmitted to the understory is then partitioned into the radiation intercepted by the ground vegetation and that reaching the soil by applying Beer-Lambert law considering the ground vegetation leaf area index (described later in Ground vegetation transpiration and soil evaporation).

13
In the following sections, all these proportions are used to estimate the hourly absorbed or transmitted radiations based on the hourly incident radiation.
For the leafless period, the proportions of incident radiation intercepted by the trunks and the branches and transmitted to the understory are obtained based on the Beer-Lambert law using the bark area index (BAI, m 2 -2 ).

Interception and evaporation of water stored on foliage and bark
Based on the preceding calculations, the water balance module starts updating the different water fluxes and pools for every hourly time step. First water evaporation from foliage and from bark is computed using the Penman Monteith equation 10 (Monteith, 1965), either at the stand scale for foliage or separately for each tree species for the bark. The latent heat flux density is calculated as follows: with λ.E: latent heat flux density (W m -2 ), 15 λ: water latent heat of vaporization = 2454000 J kg -1 (Teh, 2006), γ: psychometric constant = 0.658 mbar K -1 (Teh, 2006), ∆: slope of the saturated vapour pressure curve (mbar K -1 ): : air vapour pressure (mbar): where RH is the relative humidity (10 -2 hPa hPa -1 ) The radiation absorbed hourly per unit of leaf area (ℎ_ _ ² , W.m -2 ) is obtained by multiplying the proportion of 5 incident radiation absorbed per leaf area unit by the hourly incident radiation (h_RAD, W m -2 ): Similarly, the hourly absorbed radiation per unit of bark area (ℎ_ _ ² , W.m -2 ) is obtained by multiplying the proportion of incident radiation absorbed by the bark by the hourly incident radiation: The aerodynamic resistance is defined as the inverse of the aerodynamic conductance which represents the ease for a water vapour molecule to get away from its original location once it has been evaporated. Similarly, the surface resistance is the inverse of surface conductance that represents the ease for water molecules to migrate through the surface-air interface. The aerodynamic resistance depends mainly on wind speed and turbulence while the surface resistance is a function of the water diffusivity through the surface. 15 According to Teh (2006), the mean canopy air resistance may be obtained by integrating the canopy air conductance (ga, m.s -1 ) values estimated at 11 height levels between the mid-canopy height and the dominant height for the foliage and between half of the dominant height and the dominant height for the bark: with 20 l, the mean leaf width, fixed to 0.04 m and WS, the wind speed (m s -1 ).
The mid-canopy height is determined as the mid-height between the dominant height of the stand (hd, m), defined as the mean total height of the 100 biggest trees per ha, and the canopy base height (hcb, m), defined as the mean height to crown base of the 100 smallest trees per ha. 25 WS is estimated at the different heights (h, m) in the stand based on the dominant height wind speed ( ℎ , m s -1 ) and on the wind speed attenuation coefficient ( ): where ℎ is calculated according to Jetten (1996) based on the measured wind speed and its height of measurement: 30 with h is the height at which wind speed is estimated (in this case the dominant height), ze is the reference height (m) fixed to 50 m, zm is the wind speed measurement height (2.5 m), 5 dm is the surface roughness height (m) of the meteorological station fixed to 0.08 m, z0m is the zero plane displacement (m) of the meteorological station fixed to 0.015 m, df is the surface roughness height (m) of the forest and estimated as 0.75 • ℎ and z0f is the zero plane displacement (m) of the meteorological station fixed to 0.1 • ℎ .

10
While no surface resistance is considered for the foliage evaporation (infinite conductance), the bark conductance (m s -1 ) depends on the bark storage at the previous time step ( _ , l) and the bark storage capacity ( _ , l) according to The latent heat flux density is then converted to hourly water evaporation (EV, l per hour per m² of leaf): = .
Evaporation from foliage and from bark cannot be larger than the corresponding amounts of water stored on these surfaces, namely, (l) and _ (l) (see next section). Therefore, the following conditions are set: 25 Finally, stand bark evaporation (EVbark_stand, l h -1 ) is obtained by summing bark evaporation over species:

Partitioning of rainfall into interception, throughfall and stemflow
Rainfall passing through the canopy can be intercepted by the foliage, the branches and the stems of the trees. These reservoirs saturate progressively and the water then flows along the trunks to the tree bases to produce stemflow or drips from the canopy to the ground as throughfall. For some of the parameters (i.e., storage capacities, stemflow proportions) showing contrasting values depending on the season, the leaved and the leafless periods are distinguished to describe these processes. In addition, 5 several intermediate state variables are considered, namely: stand foliage storage ( , l) corresponding to the amount of water stored on the foliage; previous stand foliage storage ( , l) being the stand foliage storage at the previous time step; remaining foliage storage capacity at the stand scale ( , l), defined as 10 non-intercepted rainfall at the stand scale ( , l).
For the leaved period, the stand foliage storage and the non-intercepted rainfall are updated at every time step considering various cases: For the leafless period, we have = 0, which gives = . 25 Throughfall and stemflow fluxes are then calculated separately for the leaved and leafless periods. For both periods, stand throughfall and pre-stemflow ( , l) are considered as complementary fractions of the non-intercepted rainfall. Prestemflow is the amount of rain deviated towards the branches and the trunk but not necessarily reaching the base of the trunk due to storage and evaporation losses. Pre-stemflow is estimated independently for each tree species.
At this stage, the following state variables are used: the species bark storage ( _ , l) = amount of water stored in the bark of all the trees of a given tree species, the previous species bark storage ( _ , l) = species bark storage at the previous time step; the remaining bark storage capacity of a given tree species ( _ , l): Similarly as above for foliage storage and non-intercepted rainfall, various cases are distinguished to hourly update the bark 5 storage and the stemflow volume ( , l) of each species: if ( Finally, stand stemflow is obtained by summing stemflow fluxes over species: Tree transpiration 20 As for evaporation from foliage and bark, the Penman Monteith equation (see Eq. 29) is used to estimate hourly tree transpiration during the vegetation period. In this case, the radiation absorbed per unit of leaf area by each tree (ℎ_ _ _ ² , Watt per m² of leaf) is considered and is obtained by The aerodynamic resistance is determined from Eq. (36) to Eq. (38) applied between the height of largest crown extension 25 (hlce, m) and the dominant height. The surface resistance ( _ , s m -1 ) is defined as the inverse of the foliage stomatal conductance ( _ , m s -1 ) which is estimated based on a potential x modifier approach considering soil and climate conditions as well as individual tree characteristics. This approach allows to account for the increase in stomatal conductance with radiation and for the negative effect of increasing vapour pressure deficit and soil water potential (Granier and Breda, 1996;Buckley, 2017). For similar soil and climate conditions, the stomatal conductance is acknowledged to be higher for trees 30 with a larger sapwood to leaf area ratio and to decreases with crown height as stomata of top leaves close earlier to avoid cavitation when water stress occurs (Ryan and Yoder, 1997;Schäfer et al., 2000).
: the reference stomatal conductance (m s -1 ), : the sapwood to leaf area ratio (m² m -²) calculated at the tree level (see Jonard et al., in review, 2019 for 5 details), where pradiation is a parameter characterizing stomatal response to radiation.
: the soil water modifier = − 1 ( −2.5) 2 when pF > 2.5, 1 otherwise (56) where pF (cm) is the base-10 logarithm of the mean soil water potential (ϕ) (mean value of the various 10 horizons weighted based on root proportion, see below in the "root water uptake" section for calculation details of the soil water potential) and p1SW and p2SW are two parameters characterizing the stomatal response to soil water potential.
where pVPD is a species-dependent parameter characterizing stomatal response to vapour pressure deficit. 15 The latent heat flux density (W m -²) determined by applying this parametrization to Eq. (29) is then converted to tree transpiration (TRtree, l h -1 ) using the same approach as for foliage evaporation that was described in Eq. (40) and Eq. (41).
Finally, TRtree is corrected by multiplying it by the proportion of green leaves (greenProp) and by the fraction of leaves not covered with water (1 − ), considering that transpiration occurs from photosynthetically active and dry leaves only.

Ground vegetation transpiration and soil evaporation
The Penman Monteith equation is also used to estimate ground vegetation transpiration and soil evaporation. For this purpose, the radiation transmitted to the understory is subdivided for each time step into the radiation absorbed by per unit of leaf area of the ground vegetation (ℎ_ _ _ ² , Watt per m² of leaf) and the radiation absorbed by the soil (ℎ_ _ ² , W.m -2 ) through application of the Beer-Lambert law: 25 where k is the extinction coefficient fixed to 0.5 (Teh, 2006), LAIgrd_veg is the leaf area index of the ground vegetation calculated as the difference between the ecosystem LAI and the LAI averaged for the three last years when available (two last years or last year if not), greenPropstand is the proportion of remaining green leaves at the stand level.
The energy effectively available for soil evaporation is obtained by subtracting the soil heat flux density (G, W m -2 ) from ℎ_ _ ² . G is estimated based on the temperature gradient and the soil thermal conductivity (K, fixed to 0.25 W m -1 K -5 1 ) as follows: with (°C), the temperature at the soil surface, considered as equal to air temperature (T) (°C), the temperature at the interface between the organic layers and the mineral soil (see Jonard et al.,in review,10 2019 for more information on the way is obtained), ℎ (m), the thickness of the organic layer.
For ground vegetation transpiration and soil evaporation, the aerodynamic resistance is computed by applying Eq. (36) to (38) between the ground level and the dominant height.
The surface resistances of the ground vegetation ( _ _ ) and of the soil ( _ ) are the reciprocals of the ground vegetation 15 and soil conductances, respectively. The ground vegetation conductance ( _ _ , m s -1 ) is estimated based on the same approach as _ for tree transpiration while the soil conductance ( _ , m s -1 ) depends on the relative extractable water (see below for computation details) of the forest floor at the previous time step ( _ ): The latent heat flux density (W m -²) is then converted to ground vegetation transpiration (TRgrd_veg, l h -1 ) and soil evaporation 20 (EVsoil, l h -1 ) using the same approach as for tree transpiration and foliage evaporation, respectively Eq. (40) and Eq. (41).

Soil hydraulic properties
The modelling of water uptake distribution among soil horizons and of water transfer from a horizon to another requires estimates of the hydraulic properties for all soil horizons. The relationship between the soil water content (θ, m³ m -³) and the 25 absolute matric potential (h, cm) is described by the van Genuchten function that can be rearranged under the form The Mualem-van Genuchten function allows to estimate the soil hydraulic conductivity based on the relative water content 5 and the saturated conductivity.
with clay and sand, the clay and sand content of the soil (10 -2 g g -1 ) respectively , the organic carbon content of the soil (g kg -1 ) and 25 , the bulk density (g cm -³).

Water uptake distribution among soil horizons
Once tree and ground vegetation hourly transpiration has been calculated, the module sums transpiration on all trees and add the ground vegetation transpiration to obtain the hourly stand transpiration, corresponding to the stand water uptake. Then, 30 water uptake is distributed among the horizons according to a method described in Couvreur et al. (2012). This method assumes that water absorption occurs preferentially in horizons where the water potential (matric potential, h, plus a gravimetric component), ϕ, is higher. Moreover, it considers that the amount of water uptake is proportional on the one hand to the difference between the horizon water potential and the averaged water potential weighted by the fine root proportion of the whole soil profile and on the other hand to the fine root proportion of the horizon. This can be transcribed as: ). 10. f ℎ .
with UProot and UProot(hr), the total water uptake and the water uptake of the hr horizon respectively (l h -1 ) fhr, the fine root proportion Kcomp, the compensatory conductivity set to 1.10 -9 (s -1 ) ϕhr, the water potential (cm) 10 The right term of Eq. (71) is null when integrated on all the horizons. Then, it does not change the total amount of water uptake but it refines its distribution. Moreover, this method can generate water uplift that can occur when the top horizons are much dryer than the deep ones if the right term of the Eq. (71) becomes negative enough to override the left term.

Water balance of the soil horizons 15
The module performs an hourly water balance for each soil horizon hr (numbered from the topsoil) and updates its water content ( ℎ , m 3 m -3 ) as follows: with θhr_prev, the water content of the hr horizon at the previous time step (m³ m -³), 20 Vhr, the volume of the hr horizon (m³), INhr, the sum of the input water fluxes (l) and OUThr, the sum of the output water fluxes (l).
The input fluxes are the drainage ( , l) and the water surplus ( , l) from the upper horizon (hr-1) and the capillary rise ( , l) from the lower horizon (hr+1) described hereafter and represented in the figure 2: 25 The output fluxes are the drainage, the soil evaporation ( , l), the root water uptake ( , l) and the capillary rise from the current horizon (hr) (Fig. 2): The water transfer (WT, l) between the horizon hr and hr+1 (considered as drainage if directed downward or as capillary rise 30 if directed upward) is estimated with the Darcy law and the average conductivity between the horizons is calculated according to the upwind scheme that takes into account the water potential, ϕ (Eq. 75) (e.g. An and Noh, 2014).
where th (m) is the horizon thickness 5 1 (cm cm -1 ), an equation element that states for the gravimetric component of the gradient.
To ensure the mass conservation, a variable time step (∆ , s) is considered based on a stability criterion derived from the Peclet number.
This criterion is calculated for each horizon and the minimum value is retained. Still, the mass conservation is tested for the 10 whole soil profile at the end of each hour. If the water balance error exceeds 0.01 mm, the time step is divided by 10 (with 1000 as a maximum). The hourly water transfer is then obtained by cumulating the discretized values of water transfer.
Soil evaporation occurs only in organic horizons. The amount of water evaporated from the horizon hr ( (ℎ ) , l) is 15 obtained by taking the minimum value between the remaining water to evaporate ( (ℎ ) , l) and the volume of extractable water in the horizon ( ℎ = ℎ • , l). For the upper organic horizon, (ℎ ) is initialized to the total amount of water evaporated from the soil and is progressively decremented by subtracting (ℎ ) for the deeper organic horizons: If the water balance leads to a soil horizon water content higher than saturation, the soil horizon water content is set to the value of the saturated water content and a surplus is calculated. Part of this surplus is passed to the next horizon ( ℎ −1 ) while the rest is considered as preferential flows and is added to the deep drainage (DD).
with 25 vhr, the additional coarse fraction of the horizon (m³ m -³), not accounted for in the bulk density.
The deep drainage is calculated as the sum of ℎ and ℎ −1 of the last horizon plus the preferential flows.
Before passing to the next horizon, ℎ −1 takes the value of ℎ and ℎ the value of ℎ +1 .

Absolute and relative extractable water
The absolute extractable water ( , mm) is defined as the amount of water stored in the soil that can be used by the plants: where θwp_hr is the water content of the soil horizon at the wilting point (m³ m -³).
The relative extractable water ( , mm) corresponds to the ratio between this value of extractable water and the reference extractable water ( , mm): 5 where θfc_hr is the water content of the soil horizon at the field capacity (m³ m -³).

Parameter determination 10
Most of the model parameters were taken directly from the literature. In addition, an adjustment of some relationships was conducted using available data, which are described hereafter but no overall calibration of the model was performed (Table   1). For the soil hydraulic properties, the saturation θs was based on the 0.999 quantile of measured soil water contents (see Sect. 2.4 for more details). For horizon without soil water content sensor, θs was extrapolated from the closest horizons. Then, the wilting point water content was determined using the obtained saturated water content and the Eq. (64) with a matric potential, h, of 15000 cm.
The parameters of the phenological module used to calculate the start of budburst were determined using observations from 20 the Pan European Phenology dataset (PEP725) which provides data about phenological observations across different European countries, though not in Belgium. We selected 129 sites on the western border of Germany covering the latitudes of our 6 study plots (49.5-51.0°N), for which the budburst dates of a representative tree were available at least between 1951 and 2015.
The daily minimum, maximum and mean temperatures required to achieve the calibration came from the meteorological stations of the DWD Climate Data Center (Deutscher Wetterdienst). Phenological data from each site were assigned to the 25 nearest meteorological station (5 different stations were sufficient). The calibration was carried out with the Phenological Modeling Platform software (Chuine et al., 2013). This module enables the user to perform a Bayesian calibration procedure using the algorithm of Metropolis et al. (1953). Some of the parameters can also be fixed. In our case, the chilling starting date of the uniChill and sequential models were fixed to the 1 st of November of the previous year (e.g., Roberts et al., 2015;Chiang and Brown, 2007) in order to enhance the effectiveness of the other parameter calibration. The length of the budburst period, 30 the leaf development, yellowing and falling rates were all adjusted from phenological observations made in our study sites.

Site description
Six sessile oak (Quercus petraea (Matt.) Liebl.) and European beech (Fagus sylvatica L.) stands located in Wallonia (Belgium) were used to evaluate the model. They all belong to long-term ecological research sites (Belgium LTER network). Three of them were located in Baileux and were monitored since 2001. The three other stands were part of the level II plot network of ICP Forests since 1998 and were located in Louvain-la-Neuve, Chimay and Virton. These sites were selected as their contrasted 5 stand structure, species composition, soil and climate make them suitable for testing the ability of the model to account for structure complexity in various ecological conditions (at the regional scale).

Stand characteristics
The experimental site of Baileux was installed to study the impact of species mixture on forest ecosystem functioning (Jonard et al., 2006a(Jonard et al., , 2007(Jonard et al., , 2008André et al., 2008aAndré et al., , 2008b and consisted of three plots. Two plots were located in stands dominated 10 either by sessile oak or by beech and the third one presents a mixture of both species. In these plots, oak trees originated from a massive regeneration in 1880 and displayed the typical Gaussian distribution of even-aged stands, while beech trees appeared progressively giving rise to an uneven-aged structure with all diameter classes represented. The stand in Chimay was an ancient coppice-with-standards, presently composed of mature oak trees with an important hornbeam understorey. The stands in Louvain-la-Neuve and Virton were both more or less even-aged stands dominated by beech but differed in their age, with 15 much older trees in Louvain-la-Neuve than in Virton (130 vs 60 years old in 2009). All stand characteristics are provided in Table 2.

Soil properties
The Baileux, Chimay and Virton stands were all located on Cambisol but with some nuances, ranging from Dystric to the Calcaric variants in Chimay and Virton, respectively, while an Abruptic Luvisol was found in Louvain-la-Neuve (FAO soil 20 taxonomy). All sites presented a moder humus, except Virton for which mull was observed. In Baileux, Chimay and Louvainla-Neuve, the soil developed from the parent bedrock mixed with aeolien loess deposition that occurred at the interglacial period. In Virton, the soil originated only from the bedrock weathering. The parent materials were sandstone and shales, clayey sandstone and hard limestone bedrocks in Baileux, Chimay and Virton, respectively. In Louvain-la-Neuve, the soil was almost exclusively built from the loess deposition. These differences in parent material generated contrasted physical and chemical 25 soil properties (Table 3).
The soil textures also varied significantly among sites. Based on the USDA taxonomy, the soil texture was silty clayey loam and silty loam in Baileux and Louvain-la-Neuve, respectively. In Chimay and Virton, finer soil textures were observed with a clayey loam and a clay texture, respectively. In relation to the texture, drainage was good in Baileux and Louvain-la-Neuve, while the presence of inflating clay triggered the appearance of a shallow water table during the wet period and drought cracks 30 during summer in Chimay. In Virton, despite the high clay content in the lower horizons, drainage was good due to the existence of faults in the bedrock (Table 3).
Finally, stoniness and drainage influenced the estimate of the reference extractable water reserve as shown by Eq. (83). While the beech-dominated and mixed stands in Baileux and in Virton showed the lowest water reserve, the highest value was found in Louvain-la-Neuve, with intermediate values for the last stand in Baileux and in Chimay (Table 3). 5

Climate
Even if the same type of climate occurred all over Belgium (temperate oceanic), the study sites were located in different  (Poncelet, 1956). Finally, Virton was part of the "Basse Lorraine" with elevated 15 annual rainfall (1060 mm) and intermediate average temperature values (9.9°C) ( Table 3).
For Chimay, Louvain-la-Neuve and Virton, we used data from the meteorological stations of the PAMESEB network. The records covered the 1999-2018 period. A tipping bucket located at 1 m height was used to monitor rainfall. Global radiation was registered with a pyranometer, air temperature with a resistance sensor thermistor, relative humidity with a psychrometer and wind speed with an anemometer. All these devices were placed at 1.5 m height. Data were collected at 12 min intervals 20 and were then averaged hourly. For Baileux, an independent meteorological station managed by our laboratory was used to collect meteorological data since 2002. The devices were identical to those described before. Air temperature, relative humidity and rainfall were monitored at 1.5 m. Wind speed and global radiation were taken at 2.5 m above the ground.

Model evaluation
For the phenological module, different models to calculate the budburst starting date are available in HETEROFOR. Yet, as 25 the water balance module functioning depends on the proportion of leaf biomass and of green leaves, the model choice potentially influences the results. For the water cycle evaluation, we decided to use the Sequential model to predict budburst as this approach was the least biased (see Sect. 3.1.1).  (Beuker et al., 2016). They consisted of weekly observations of the percentage of budburst, yellowing and leaf fall depending on the season. As the model predicted the budburst for an average tree, we evaluated it with the budburst observations of the median tree. In addition, we visually 5 assessed the agreement between the predicted and observed increase in leaf proportion (leafProp) during the leaf development period and between the predicted and observed decrease in green leaf proportion (greenProp) and in leafProp during leaf yellowing and leaf fall, respectively. We did not perform a statistical evaluation for these latter variables as the corresponding processes were not calibrated independently in the model.

Water balance
Regarding the water balance module, the evaluation was conducted using variables integrating most of the processes described in the model. The observed throughfall, extractable water and deep drainage (considered in the next section) were compared to model predictions.
For the evaluation of throughfall predictions, only throughfall data collected in Chimay, Louvain-la-Neuve and Virton between 15 2000 and 2016 were used as the rainfall partitioning routine was based on the work of André et al. (2008aAndré et al. ( , 2008b using data from the Baileux forest. The collecting devices consisted of three long gutters covering each the radius of a crown and connected to plastic barrels. The throughfall volume was measured weekly based on the height of water in the barrels. A log transformation of both the observations and the predictions was necessary to remove the heteroscedasticity.

Extractable water was estimated based on Eq. (81) using soil water content measurements taken between 2005 and 2017 in 20
Baileux and for the 2015-2018 period in the other sites. Measurements were recorded hourly using TDR in most of the major horizons (measurements at 3 to 5 different soil depths depending on the site). In order to decrease the influence of the soil disturbance due to the instrument installation, the first year of records was discarded. Indeed, Walker et al. (2004) showed that inserting a moisture sensor in a soil disturbed its hydraulic properties and water content during at least 9 months. The electrical signal from the TDR was transformed in relative dielectric permittivity and then converted into soil volumetric water content 25 (m³ m -³) using the equation of Topp et al. (1980) for Baileux and resorting to our own calibration for the other sites (established based on gravimetric measurements of soil water content).

Drainage
Deep drainage can represent a large water output but is difficult to measure directly. Among the existing indirect approaches 30 to estimate this component, we retained the mass-balance method using chloride ion (Cl -) as tracer. This method has been widely used to estimate groundwater recharge (e.g. Scanlon et al., 2002;Ting et al., 1998;Bazuhair and Wood,1996) but can be applied to assess deep percolation as well (Willis et al., 1997). It relies on the fact that Clis not subject to any chemical transformations in the soil and undergoes only temporary storage in soil and biomass (Öberg, 2003). The only Clinput in our study plots comes from throughfall and stemflow and can be determined from Cldeposition data obtained from monthly chemical analyses of throughfall and stemflow samples. For the deep drainage, which constitutes the only output, the Clconcentration is also obtained from monthly chemical analyses of soil solution collected with zero-tension lysimeters at 1 m depth in the three stands of Baileux between 2008 and 2016 and between 2013 and 2016 for the other sites. Deep drainage was 5 estimated yearly by considering that the Clamount leaving the soil through drainage was equal to the Clinput from throughfall and stemflow. This annual time step or even the growing season like in Willis et al. (1997) are considered sufficiently long to avoid storage biases. Based on Eq. (84), the amount of deep drainage was estimated and compared to our model results.

Statistical analyses
To test the quality of the predictions, different statistical tests and indexes were used. The absolute bias, defined as the 15 difference between the observation and prediction means, and the relative bias, corresponding to the ratio between the absolute bias and the observation mean, were calculated to detect any over-or underestimation. To assess the precision of the predictions, the Root Mean Square Error (RMSE) was used and calculated as follows: with 20 n the number of observations. When the range of values differed considerably for one variable between the different sites, the RMSE was divided by the range, i.e. the difference between the maximum and the minimum values. This Normalised Root Mean Square Error (NRMSE) is much more adapted for comparisons in these situations.
The agreement between observations and predictions was also evaluated with the Pearson's correlation coefficient (r) and with 25 a regression test conducted to analyse the linear relationship between observed and predicted values. As both predictions and observations are subject to uncertainties, we used orthogonal regression that minimizes the perpendicular distances from the data points to the regression line instead of the vertical distance as done in ordinary least square regression. Then, we tested whether the regression line confidence interval (95%) included the identity line. These tests were realized with the mcr package in R.

Phenology
On average, the budburst was best predicted with the Sequential model (bias = 2.46 days compared with 8.23 and -5.88 days for Uniforc and Unichill, respectively). However, this option was less appropriate to capture the inter-annual variations 5 (Pearson's r = 0.537) than Uniforc (Pearson's r = 0.680). The temporal variability was very poorly estimated with the Unichill model, which displayed an inverse trend for the ranking among years (Pearson's r = -0.277) (Fig. 3). Moreover, as the Unichill model was not able to predict the end of the chilling period for some years in Louvain-la-Neuve, all results for this site were discarded. The predicted leaf development displayed a good agreement with observations (Fig. 4).
Simulated leaf yellowing and leaf fall were also evaluated by comparison with observations. While the leaf ageing threshold 10 was taken from Dufrêne et al. (2005), the yellowing parameter determining the length of the yellowing period was adjusted with the five years of data from Chimay, Louvain-la-Neuve and Virton. Therefore, only the yellowing start was independently evaluated. The prediction of the start of the yellowing displayed a low absolute bias (2.7 days) and RMSE (7.0 days). However, a weak correlation (0.056) was found between predictions and observations (data not shown).
For the temporal dynamics of leaf yellowing and leaf fall, the agreement between model predictions and observations was just 15 assessed visually since the parameter regulating these processes (yellowing, falling rate and falling frost amplifier) were adjusted with the same data. The overall agreement was good. The simulated decrease of green leaf proportion was similar for all sites as the photoperiod reduction is identical for each site and year (Fig. 5). The only noticeable difference came from the yellowing starting date, which depended on air temperature. For Chimay, a close agreement was found between predictions and observations. For Louvain-la-Neuve, predictions were correctly centred but the predicted trend was more abrupt and the 20 start of the decrease displayed some delay, except in 2012. For Virton, the decreasing trend was correctly displayed but the decrease start was less precise in 2016 (Fig. 5).
Concerning the leaf fall, the temporal dynamics was effectively represented in Chimay. In Louvain-la-Neuve, the model predicted a slightly too slow decrease in leaf proportion in 2012 and 2015. For the other years, the observed and predicted leaf proportion matched well even if the predicted start of the fall appeared later than in the observations for some years. In Virton, 25 the predictions were well centred with regards to the observations but the decrease in leaf proportion was a bit too fast in 2012 (Fig. 5).

Water balance
For each site, the main water fluxes affecting the water balance were calculated daily, summed up and the annual values were averaged for the 2002-2016 period (Table 4) 29 31 to 45% of the water received as rainfall returned in the atmosphere through tree transpiration. The remaining 26 to 44% were lost from the ecosystem through drainage.

Rainfall partitioning
Rainfall partitioning was correctly reproduced by the HETEROFOR model. Across all considered sites (Virton, Chimay and 5 Louvain-la-Neuve), the mean bias of throughfall predictions was very limited (-1.3%) and non-significant (P value of the paired t-test = 0.316). The confidence interval of the linear relationship between the logarithm of the observed and predicted throughfall contained the identity line corresponding to the perfect match (Fig. 6). The correlation between predictions and observations amounted to 0.86 and the RMSE to 16.62 mm which corresponded to 34.2% of the mean througfall (48.6 mm).
The separate examination of the different sites revealed that throughfall in Virton were very well predicted but that a slight 10 underestimation of the throughfall predictions in Chimay was compensating an overestimation of similar magnitude in Louvain-la-Neuve (Supplementary material 2).

Soil water content
As the temporal variation of the extractable water was affected by all the water fluxes, it was used to check the performances 15 of the water balance module (Fig. 7). A clear seasonal pattern appeared. At the beginning of the vegetation period, the extractable water values (EW) were highest. Then, tree and ground vegetation transpiration progressively depleted the water reserve which was partly recharged with rainfall events. Depending on their frequency, duration and intensity, the decline in EW was more or less pronounced and available water could reach levels close to zero. For all the sites, the Pearson's correlation between observed and predicted relative extractable water ranged from 0.893 to 0.950. These high correlation values and the 20 graph inspection show that the seasonal pattern was precisely reproduced by the HETEROFOR model. NRMSE values range from 10.54 to 13.96% while relative bias values were around -2 and -3% in Baileux-oak, Baileux-mixed and Chimay and close to -8 -9 % in Baileux-beech, Louvain-la-Neuve and Virton. These higher negative bias in the latter stands originated mainly from the model underestimation of the high values of EW (i.e. during wet periods). Despite these similar statistical results, the amount of extractable water in Virton displayed some peculiarities with regards to the other stands. Indeed, the observed EW 25 levels fluctuated considerably more than in the other sites with frequent peaks both for high and low values that were not represented by the model. Finally, apart from Virton where some discrepancy between observations and predictions can be pointed out, the model quality did not decrease in Chimay or Virton during the 2018 summer that was categorized as exceptionally dry by the Royal Meteorological Institute of Belgium. In order to predict the impact of global changes on forests, it is crucial to integrate and structure the existing knowledge in process-based models. However, this first step is not sufficient. A detailed documentation of the models as well as an evaluation of their performances are also needed in order to use them knowing exactly their strengths and limits. While most models were described in scientific articles or reports, their evaluation was often limited to one or two sites used to illustrate the model 5 functioning and was generally based on integrative response variables such as radial tree growth (Schmidt et al., 2006;Vanclay and Skovsgaard, 1997). Yet, to provide robust predictions of tree growth under changing conditions, the model must be able to accurately reproduce not only the observed tree growth but also the intermediate processes describing resource availability (light, water and nutrient) (Soares et al., 1995). In the following section, we discuss the quality of the predictions for two main drivers of tree growth (phenology and water balance) in relation with the concepts used to describe them. 10

Phenology
The Sequential model that calculates both chilling and forcing periods was the least biased variant for predicting budburst. However, Uniforc model including only the forcing period better captured the inter-annual variability. While the bias is likely to originate from the model calibration (data used for calibration were observations from western Germany) and could be corrected, the ability of the model to predict temporal variability is more representative of its structural quality. It is common 15 that models accounting only for the forcing period better represent the budburst temporal variability (Leinonen and Kramer, 2002;Yuan et al., 2007;Fu et al., 2014). Indeed, in areas where the chilling requirements are always met, as in Western Europe, the inclusion of chilling in models generally has a negative impact on model predictions. Consequently, we considered the Uniforc model as the most adapted to simulate budburst in future conditions even if Sequential model seems more appropriate for our study sites under current climate. Still, given the expected rise in winter temperatures, accounting for chilling could 20 become essential to make goods predictions (Clarck et al., 2014) but would require more data for calibration. This highlights once again the importance of having several options to describe budburst. Most process-based models listed in Pretzsch et al. (2015) had however only one phenological variant. Apart from 4C that considers the opposite actions of inhibitory and promotory agent concentrations, all the models used a classical approach based on air temperature sum (e.g., Sequential) or sigmoid function (e.g., Uniforc and Unichill). Some of them include the chilling process (4C, ForestV5.1, MAESPA, Hybrid, 25 ANAFORE) while the others only consider forcing (BALANCE, GOTILWA+, CASTANEA) ( Table 5).
Depending on the phenological variant, HETEROFOR explained between 29 and 46% of the budburst variability and the RMSE amounted to 2.46 and 8.23 days for Sequential and Uniforc, respectively. Given the limited number of observations, these model performances are only indicative. By comparison, the phenological model of BALANCE explained 54 and 55% of the budburst variability and displayed a mean absolute error of 4.9 and 4.7 days for beech and oak respectively (Rötzer et 30 al., 2004). In Fu et al. (2014), the R² obtained for budburst prediction ranged from 0.36 to 0.82 and the mean absolute error between 4.8 and 7.5 days for the sequential model. A possible improvement of the phenological models accounting for chilling would be to integrate the photoperiod effect on budburst. Indeed, some recent studies have shown evidences that photoperiod can compensate for a lack of chilling temperature that would prevent the buds to open and for an early frost episode that would trigger budburst before winter (Vitasse and Basler, 2013;Pletsers et al., 2015). This mechanism is particularly present for late-successional species like beech and oak trees and is regularly cited as a key element to simulate the phenology under climate change (Basler and Körner, 2012). Some 5 models tried to account for the photoperiod effect simply by replacing chilling by photoperiod (Kramer, 1994;Schaber and Badeck, 2003) but, in this way, failed to represent the combined effect of these variables. Recently, a few models integrating the compensatory effect of photoperiod on chilling have appeared. However, these models include more phenological parameters for similar predictive ability (Gauzere et al., 2017). Some reasons for this are that in situ measurements make it nearly impossible to disentangle the co-varying effect of chilling and day length (Flynn and Wolkovich, 2018) and that 10 photoperiod variations only occur for sites with different latitudes where other confusing factors play a role as well (Primack et al., 2009). Therefore, many data is necessary to calibrate these models. Then, we decided to privilege the accuracy of our phenological model to a more process-based approach but we are looking forward for improvements in these kinds of models and a more consensual body of literature.

Water balance 15
In a first step, the annual water fluxes predicted by HETEROFOR were compared to measurements and predictions of other studies (Table 4). Then, some water fluxes were individually evaluated when data was available. Finally, some potential improvement of the water balance module were discussed.
Various studies were taken from the literature to compare our water module predictions with observations. They cover a range of annual rainfall comprised between 425 and 1476 mm (Table 4), which is comparable to what can be found in Belgium. The 20 proportions of rainfall converted to stemflow obtained with HETEROFOR (6.1 to 13.1%) are within the range reported in the literature (0.6 to 20.4%). This large observation spectrum comes from the important seasonal (higher stemflow proportion in winter than during the vegetation period) and species differences (stemflow importance is higher for beech than oak trees), which features are accounted for in HETEROFOR. However, the mean value from the literature (7.3% of rainfall) is close to the average value for the six study sites (10.3%). The proportions of intercepted rainfall (15.9 to 22.0%) and throughfall (64.8 25 to 78.0%) are also consistent with the ranges reported in other studies (1.9 to 31.0% and 59.8 to 83.1%). Moreover, we observed a good matching between the average values (respectively 19.5 and 73.8% from literature and 19.4% and 70.2% for our study sites). For transpiration, the range found in the literature is large (14.8 to 52.3% for an average value of 31.9%), which is not surprising since inter-annual and inter-site variabilities are high for this variable (Schipka et al., 2005;Vincke et al., 2005). Comparing predicted and observed throughfall is interesting to evaluate the water balance module since throughfall is an integrative variable depending on the water storage capacity of foliage, on evaporation, and on the proportion of stemflow. 5 The good agreement between observations and predictions indicates that the partitioning of rainfall when passing through the canopy and the evaporation of the water intercepted by foliage and bark are well described. Among the different models of the Table 5, Gotilwa+ and Castanea are the only ones that account separately for stemflow and throughfall. Yet, separating throughfall and stemflow is important, especially for structurally-complex stands. In these stands, rainfall interception cannot be simulated based on a mean foliage storage capacity and a mean allocation between throughfall and stemflow since these 10 parameters vary with stand composition and structure. Our tree-level approach estimating foliage storage capacity and stemflow proportion based on individual tree characteristics allows to overcome this difficulty. Moreover, if one wants to accurately describe the nutrient cycle, partitioning rainfall is essential as nutrient concentrations in stemflow and throughfall can be 10 to 100 times higher than in rainfall due to dry and wet deposition and canopy exchange (Levia and Herwitz, 2000;André et al., 2008c;van Stan and Gordon, 2018). Even if the rainfall partitioning can still be improved from a theoretical 15 perspective (e.g., including canopy drainage after rain events or the impact of wind on the foliage storage capacity like in Muzylo et al. (2009) or Hörmann et al. (1996), we chose to limit the level of complexity in order to avoid parameterizing difficulties due to insufficient or improper data.
The extractable water (EW) is also suitable for evaluating the water balance module as most of the water fluxes influence it.
The temporal dynamics of EW was well captured by HETEROFOR as evidenced by the high correlations (Pearson's coefficient 20 comprised between 0.893 and 0.950) between observed and predicted EW for the various study sites (Fig. 7). These correlations are within the high end of the range reported for similar models. With the BALANCE model, Gröte and Pretzsch (2002) obtained a Pearson's correlation of 0.85 between the observed and predicted soil water content of the upper soil (0-20 cm horizon) in a beech forest in Germany (Freising). Applying BALANCE on three broadleaved stands of oak or beech in Germany, Rötzer et al. (2005) were also able to correctly reproduce soil water content dynamics but they mentioned a 25 significant decrease in the quality of predictions during the 2003 drought due to an overestimation of the soil drying, which was not observed with HETEROFOR in 2018. Comparing the observed soil water content at various soil depths with that predicted by the 4C model in mixed oak and pine forest (Brandeburg, Germany), Gutsch et al. (2015) obtained Pearson's correlations ranging from 0.59 to 0.74. In an oak stand in Tennessee (USA), Hanson et al. (2004) compared the ability of nine process-based forest models to reproduce soil water dynamics in the 0-35 cm horizon of the soil and obtained correlations 30 ranging from 0.81 to 0.96.
In the study of Hanson et al. (2004), relative bias was evaluated as well for soil water content and ranged between -1.3 and 4.0%. These values are comparable to those found in this study yet a bit lower. Furthermore, discrepancies between predicted https://doi.org /10.5194/gmd-2019-201 Preprint. Discussion started: 30 August 2019 c Author(s) 2019. CC BY 4.0 License. and observed EW occurred during limited periods. Several reasons can be advanced to explain them. Errors in the prediction of the budburst date can result in a too early or too late restarting of tree transpiration and induce an inaccurate depletion of the soil extractable water during the vegetation period. In order to distinguish this error source from the others, one could force the model with the observed budburst date. This option is however not yet implemented in the model. The lack of agreement between observed and predicted EW could also be ascribed to the strong heterogeneity of soil properties in forest ecosystems. 5 While the observations reflect the local soil water conditions, the model predict the average EW for the whole stand. Similarly, local rainfall events recharging soil extractable water during summer (often associated with thunderstorms) are sometimes not correctly taken into account when missing meteorological data (due to failed sensors or other technical problems) are replaced by rainfall data of a meteorological station further away.
Simplifications and errors in the model conception may also generate divergence between observations and predictions. 10 However, this structural uncertainty can be limited by selecting the most appropriate concepts. HETEROFOR predicts water transfer between soil horizons using the Darcy law. We tried to implement an approach of intermediate complexity between the simple bucket model and the Richards equations. From a theoretical point of view, the Richards approach is the most stateof-the-art but requires very long calculation times (Fatichi et al., 2016) and is usually implemented in models specifically dedicated to water flow simulations. Forest ecosystem models generally use simpler approaches such as the bucket model 15 declined in a large variety of forms (Table 5). These models consider one or several buckets with a specified water storage capacity that is filled with rainfall and is emptied by evapotranspiration. If the soil water content is at field capacity, water is transferred to the underlying layer and finally lost by drainage. Improved versions can account for transfer between buckets in unsaturated conditions using the Darcy law (leaky bucket model).
Our water transfer routine discretises the soil in horizons whose thickness varies from a few centimetres (upper horizons) to 20 half a meter (deeper horizons). Compared to the numerical resolution of Richards equation which requires thin soil layers (1 to 2 cm), our vertical discretisation of the soil profile is quite coarse and inaccurately predict the advance of the wetting front.
As the tree transpiration and photosynthesis depend on the soil water conditions of the whole soil profile, this inaccuracy has very limited implications on the simulated tree growth. In our approach, water transfer during a time step is calculated based on the horizon water potentials estimated at the end of the previous time step. As such, the model makes the hypothesis that 25 the water content does not change significantly during the time step, which is certainly not the case close to the wetting front and cannot ensure mass conservation. In order to limit this problem, the model uses an adaptive time step estimated based on the Peclet number described in Eq. (78). This allows to ensure mass conservation.
Finally, another reason that could explain the discrepancy between predictions and observations is the presence of macropores that cause preferential flows. These water fluxes defined as water movements in the soil along preferred pathways that bypass 30 the soil matrix (Hardie et al., 2011) can be generated by soil shrinkage, root growth, chemical weathering, cycles of freezing and thawing or bioturbation (Aubertin, 1971). These macropores are more frequent in forest soils than in agricultural soils as the latter are often ploughed and homogenized. They are however difficult to characterize given their strong spatial https://doi.org /10.5194/gmd-2019-201 Preprint. (Aubertin, 1971). Adaptations of the Richards equations can be used to account for the preferential flows (dual porosity and dual permeability) but require a good characterisation of soil macropores (not possible to achieve routinely in forest soils given their heterogeneity) and are still more complicated to solve than the classical Richards equations. We implemented in the model the transfer of the soil water surplus (when water saturation is reached) to the underlying horizon and the possibility to redirect part of this surplus as deep drainage to account 5 empirically for preferential flows. Indeed, preferential flows in macropores become significant only when rainfall exceeds the water infiltration rate in the soil matrix and accumulates in the soil surface. The fraction of the water surplus considered as preferential flows is an empirical parameter taking the macroporosity of the site into account.
The performances of the soil water transfer routine can also be checked based on the deep drainage flux. In this study, we compared the deep drainage estimated with HETEROFOR and with the chloride mass balance approach. The mean drainage 10 predicted with HETEROFOR was 379 mm per year while the average drainage obtained with the chloride approach amounted to 472 mm per year, which corresponds to a bias of -19.9%. The correlation between the two types of estimate amounted to a Pearson's coefficient of 0.963, with a RMSE value of 100.6 mm. These values depict a constant negative bias in the predictions that can easily be seen on figure 8. It is hard to tell whether the gap originates from the model or the method used to estimate drainage from the chloride approach. It is more likely that the bias must be ascribed to both. Indeed, on the one hand, even if 15 the use of chemical tracers to estimate drainage or groundwater recharge is commonly used (Scanlon et al., 2002), its application remains subject to uncertainties. First, the chloride method supposes that the main chloride source is rainfall and that the other sources can be neglected (Murphy et al., 1996). This hypothesis is not always fulfilled due to anthropogenic chloride introduction (road salting, wastewater) or when chloride is present in the bedrock (Ping et al., 2014). Then, preferential flows have been regularly highlighted as an error source since the associated water fluxes are not well sampled by zero-tension 20 lysimeters (Tyler and Walker, 1994;Nkotagu, 1996). Finally, this method displays better results when rainfall and soil water is richer in chloride (e.g., sites close to the sea with high marine deposits or with low drainage flux) because the chemical analyses are more accurate for higher concentrations (Sammis et al., 1982;Grismer et al., 2000).
On the other hand, modelling errors could explain the bias presence. One of them could be the overestimation of the transpired water amount. However, deep drainage tends to produce during winter while transpiration only takes place during the 25 vegetation period (spring and summer). Therefore, if transpiration was overestimated we should observe an underestimation of the EW during spring and summer (low values), which is not the case (Fig. 7). Hanson et al. (2004) measured deep drainage at the watershed level by accounting for rainfall and stream flow outputs and compared their measurements with the predictions of several models. Their multi-model comparison displayed similar RMSE (65.5 to 225.6 mm) and relative bias (-27.6 to 20.5 %) values but the Pearson's coefficient displayed by HETEROFOR is 30 definitely located in the high tail of the study range (0.61 to 0.95). However, the performances of their models are not strictly comparable to ours since the reference method for estimating drainage differs (Sammis et al., 1982;Grismer et al., 2000; Obiefuna and Orazulike, 2011).

Code availability
The source code of CAPSIS and HETEROFOR is accessible to all the members of the CAPSIS co-development community.
Those who want to join this community are welcome but must contact François de Coligny (coligny@cirad.fr) or Nicolas Beudez (nicolas.beudez@inra.fr) and sign the CAPSIS charter (http://capsis.cirad.fr/capsis/charter). This charter grants access on all the models to the modellers of the CAPSIS community but only to them. The modellers may distribute the CAPSIS 5 platform with their own model but not with the models of the others without their agreement. CAPSIS4 is a free software (LGPL licence) which includes the kernel, the generic pilots, the extensions and the libraries. For HETEROFOR, we also choose an LGPL license and decided to freely distribute it through an installer containing the CAPSIS4 kernel and the latest version (or any previous one) of HETEROFOR upon request from Mathieu Jonard (mathieu.jonard@uclouvain.be). The source code for the modules published in Geoscientific Model Development (Jonard et al., submitted, 2019;de Wergifosse et 10 al., submitted) can be downloaded from the CAPSIS website (http://amap-dev.cirad.fr/projects/capsis/files) or obtained by contacting directly Mathieu Jonard.
The end-users who do not need access to the source code can install CAPSIS from an installer containing only the HETEROFOR model while the modellers who signed the CAPSIS charter can have access the complete version of CAPSIS 15 with all the models. Depending on your status (end-user vs modeller or developer), the instructions to install CAPSIS are 15 given on the CAPSIS website (http://capsis.cirad.fr/capsis/documentation). The source code for the modules published in Geoscientific Model Development (Jonard et al., submitted; de Wergifosse et al., submitted) can be downloaded from https://github.com/jonard76/HETEROFOR-1.0 (DOI 10.5281/zenodo.3242014).

Data availability
The data used in this paper are available through the input files for HETEROFOR which are embedded in the installer (see 20 Sect. 6).

Competing interests
The authors declare that they have no conflict of interest.

Acknowledgements
This work was supported by the FRIA grant n°1.E005.18, the Service Public de Wallonie (SPW/DGO 3/DNF) through the 10

Accord-Cadre de Recherche et Vulgarisation Forestières 2014-2019 and by the Fonds de la Recherche Scientifique -FNRS
under the PDR-WISD Grant n°09 (project SustainFor). We are also grateful to RENECOFOR (Réseau National de suivi à long terme des Ecosystèmes Forestier français) for providing us data related to the three sites in Chimay, Louvain-la-Neuve and Virton.