Dispersal behaviour and riverine network connectivity shape the genetic diversity of freshwater amphipod metapopulations

Abstract Theory predicts that the distribution of genetic diversity in a landscape is strongly dependent on the connectivity of the metapopulation and the dispersal of individuals between patches. However, the influence of explicit spatial configurations such as dendritic landscapes on the genetic diversity of metapopulations is still understudied, and theoretical corroborations of empirical patterns are largely lacking. Here, we used microsatellite data and stochastic simulations of two metapopulations of freshwater amphipods in a 28,000 km2 riverine network to study the influence of spatial connectivity and dispersal strategies on the spatial distribution of their genetic diversity. We found a significant imprint of the effects of riverine network connectivity on the local and global genetic diversity of both amphipod species. Data from 95 sites showed that allelic richness significantly increased towards more central nodes of the network. This was also seen for observed heterozygosity, yet not for expected heterozygosity. Genetic differentiation increased with instream distance. In simulation models, depending on the mutational model assumed, upstream movement probability and dispersal rate, respectively, emerged as key factors explaining the empirically observed distribution of local genetic diversity and genetic differentiation. Surprisingly, the role of site‐specific carrying capacities, for example by assuming a direct dependency of population size on local river size, was less clear cut: while our best fitting model scenario included this feature, over all simulations, scaling of carrying capacities did not increase data‐model fit. This highlights the importance of dispersal behaviour along spatial networks in shaping population genetic diversity.


| INTRODUC TI ON
The genetic diversity of populations is shaped by gene flow, selection, mutation, and genetic drift (Hartl & Clark, 2006;Manel et al., 2003). These processes interact with ecological processes, determining the organisms' demography, population size and dynamics.
Understanding both ecological and evolutionary processes affecting natural populations is thus central to the understanding of patterns and dynamics of biological diversity and for implementing appropriate conservation strategies (Balkenhol et al., 2016;Lande, 1988), especially in the context of habitat fragmentation.
Extensive theoretical and empirical work highlights that dispersal has a pronounced effect on the genetic diversity and effective size of populations (Bowler & Benton, 2005;Clobert et al., 2012) and community composition (Vellend, 2016). Dispersal is defined as the movement of organisms with potential consequences for gene flow (Ronce, 2007), and is especially relevant in spatially structured landscapes (Gilpin & Hanski, 1991;Hanski & Simberloff, 1997).
Of course, the effects of dispersal can be modulated by features of landscape connectivity or spatial distributions of patch sizes, for example (Hanski & Gaggiotti, 2004). These need to be considered to understand the overall effects of dispersal on the genetic diversity of natural populations. Understanding the importance of different aspects of dispersal for natural populations is empirically challenging because both the processes influencing individual dispersal and its population genetic consequences need to be explored simultaneously.
The study of dispersal has a long tradition in both landscape ecology and metapopulation ecology, respectively, using slightly different tools and perspectives (Clobert et al., 2012;Hanski & Gaggiotti, 2004;Leibold et al., 2004;Vellend, 2016). Ideally, approaches combine measures of genetic diversity and landscape connectivity, thereby linking physical connectivity to population genetics (Manel & Holderegger, 2013). The metapopulation approach provides the means of including connectivity more explicitly (Hanski & Gaggiotti, 2004). Studies about the influence of landscape connectivity on the genetic diversity of populations mostly study lattice-like landscapes (2D), such as grasslands or forests (Dyer et al., 2012;Fortuna et al., 2009;Rozenfeld et al., 2008), or n-island models (Wright, 1931), using least-cost path methods with landscape resistance to integrate their spatial complexity (Adriaensen et al., 2003;Pinto & Keitt, 2009;Wang et al., 2009). However, this may not be generalized to the spatial structure of all ecosystems, and dispersal of organisms may be more strongly confined in spatially more complex ecosystems.
Riverine systems are a prominent example thereof. Their connectivity is highly characteristic, can be explicitly quantified, and generally follows a universal dendritic network structure. These systems are formed by geological processes leading to general topological patterns (Altermatt, 2013;Carraro et al., 2020;Rodríguez-Iturbe & Rinaldo, 1997). Ecological consequences of the spatial configuration in such networks are well-studied, and encompass effects on species richness, on beta-diversity, as well as on population sizes (Altermatt et Muneepeerakul et al., 2008;Tonkin et al., 2018). In contrast, the evolutionary consequences of the network structure on the intraspecific genetic diversity are less well understood, even though one could apply the same approaches to study them. Comparative studies focusing on the effect of riverine network structures on intraspecific genetic diversity within populations are still rare (Blanchet et al., 2020;Brauer et al., 2018;Fourtune et al., 2016), but generally show an increase in diversity in more downstream parts of the network ("downstream increase in intraspecific genetic diversity" (DIGD;  or in highly-connected sections such as confluences . Importantly, these studies highlight that various processes, such as dendritic connectivity (i.e., habitat fragmentation), asymmetric gene flow, or remnant signals of past colonization histories can lead to the empirically observed patterns (Blanchet et al., 2020;Cyr & Angers, 2011;. In parallel, theoretical models that address the effect of spatial connectivity of riverine networks on genetic variation (Morrissey & de Kerckhove, 2009;, on evolution of dispersal (Henriques-Silva et al., 2015), and emergence of neutral genetic structure (Fronhofer & Altermatt, 2017;Stokes & Perron, 2020;Thomaz et al., 2016) have demonstrated that dispersal along riverine networks has a direct imprint on the genetic structure and diversity of the inhabiting organisms.
While these theoretical models provide direct testable predictions, a direct comparison between within-population genetic diversities estimated from empirical data and predictions from theoretical models assuming an identical riverine network has been largely lacking (but see . Here, we studied the influence of connectivity in a real-world riverine network on within-and between-populations genetic diversity of freshwater amphipods (crustaceans), by combining largescale empirical data on their population genetic diversity with a simulation model making analogue predictions of their population genetic diversity using a graph theoretic approach. Graph theory has not yet been widely used in landscape genetics although it allows concise presentation of spatial configuration of natural populations (Dyer & Nason, 2004;Fortuna et al., 2009;Garroway et al., 2008;Manel & Holderegger, 2013;but see McRae et al., 2008;McRae & Beier, 2007). Based on previous work (Altermatt & Fronhofer, 2018;Blanchet et al., 2020;Fronhofer & Altermatt, 2017;Muneepeerakul et al., 2007), we wanted to gain a better understanding of the relative importance of different ecologically relevant aspects of dispersal on shaping the genetic diversity of populations. In particular, we expected allelic richness, observed, and expected heterozygosity to be higher in more central sections (i.e., downstream or confluences) of the riverine network Ritland, 1989). This increase in genetic diversity in central sections of the network might be caused by a strong signal of dispersal rate, upstream movement probability, and habitat carrying capacity, leading to those sections receiving more migrants and sustain larger populations (centrality aspect; Altermatt, 2013). We addressed this with microsatellite data from 3319 amphipod individuals collected from 95 sites across a riverine network covering more than 28,000 km 2 and compared it to the output of stochastic simulation models examining alternative parameter combinations influencing dispersal, but conducted on the identical riverine network structure.

| Genetic data collection
We conducted the study in the river Rhine drainage within Switzerland, which encompasses about 28,000 km 2 of its headwater area. We sampled Gammarus fossarum amphipods from 281 sites evenly and representatively spaced across the river Rhine headwaters between 2007 and 2015 by a kicknet approach. We morphologically identified all individuals to the species-complex level (Altermatt et al., 2019). We further genotyped a subset of individuals of the G. fossarum complex using microsatellites (Westram et al., 2013), conventional 16S sequencing, or SNP pyrosequencing (Westram et al., 2011). We used the 16S mitochondrial gene sequence, or three SNPs therein, to reconstruct the realized distribution of both G. fossarum type A and type B (see Westram et al., 2011 for detailed methods). We relied on published and unpublished sequences and SNPs (Altermatt et al., 2014;Westram et al., 2013;Westram et al. unpublished data;Alther et al. unpublished data). We used the microsatellite data for subsequent population genetic analyses.
We extracted DNA for microsatellite analyses from complete individuals or their heads using a hotshot approach (Montero-Pau et al., 2008). We amplified fragments using multiplex amplifications with the qiagen Multiplex PCR Kit chemicals. We used nine different microsatellite markers (gf08, gf10, gf13, gf18, gf19, gf22, gf24, gf27, gf28 sensu Westram et al. (2010)), specifically designed for G. fossarum and previously established in several studies (Altermatt et al., 2014;Westram et al., 2013) (note that microsatellite marker gf21 by Westram et al., 2010 was also included initially, but then discarded because of signs of null alleles, as also reported by Westram et al., 2013). We used primers in different concentrations (see Table 1 in and combined them with size standard (GeneScan 500 LIZ, Applied Biosystems). We sequenced fragments on an Applied Biosystems 3730xl DNA Analyser at the Genomic Diversity Centre of ETH Zurich, Switzerland. We analysed and manually edited the electropherograms using softgenetics genemarker software (v. 1.80). In total, we genotyped 3577 individuals. We used the microsatellite data to quantify genetic diversity within these two species. For the detailed molecular procedure on DNA extraction, microsatellite sequencing and microsatellite interpretation, see Westram et al. (2010), Westram et al. (2013), in which some of the individuals used here have already been analysed for different purposes.

| Spatial data preparation
The spatial riverine network used for the subsequent analysis represents a restricted version of the full Rhine network within Switzerland. We constructed a digital representation of the riverine  Figure S1), based on a graph theory approach and following topological connectivity along the river lines. The riverine network is based on a 2 km 2 subcatchment representation of streams and rivers of Switzerland (BAFU, 2012). Specifically, we interpreted these subcatchments as being nodes within the network and stream flow direction being directed vertices between these nodes. Based on the coordinates of the outlet site of each subcatchment (or on the centroid coordinates for headwater subcatchments), we constructed the riverine network connecting the outlet coordinates to each other. Distances from one subcatchment outlet to the adjacent downstream subcatchment outlet can be approximated as Euclidean distances on a small-scale basis, and were included as vertex weights. Additionally, the graph object also contained information on the summed upstream catchment area for each subcatchment.
The detailed methods of how we prepared the extensive graph object are described in Alther and Altermatt (2018).
We then restricted the analysis to the part of the riverine network that is actually inhabitable by either one or both of the studied amphipod species, based on an empirically validated cropping of the network. We used a database on amphipod occurrences in Switzerland with >2000 sites covered (Altermatt et al., 2019) to distinguish between nodes containing G. fossarum from unoccupied nodes. After preparation of the initial complete Rhine riverine network, we selected nodes (subcatchments) that contain one or both species of the G. fossarum complex and all their spatially interconnecting nodes ( Figure S1). This resulted in the truncated riverine network that is empirically validated to be accessible and inhabitable to G. fossarum, containing 2401 nodes (referred to as G. fossarum network). The corresponding graph object is available on GitHub (see data accessibility section).
We subsequently mapped our microsatellite data from G. fossarum complex sites of Switzerland to the nodes of the prepared graph using arcgis 10.5.1 (ESRI Inc.). We refer to sites as nodes hereafter.
Removing nodes that had <15 individuals successfully genotyped resulted in 95 nodes for subsequent analysis, harbouring 3319 individuals. The corresponding microsatellite data are available on GitHub (see data accessibility section). Data were available for 67 nodes for G. fossarum type A (2257 individuals) and 33 nodes for G. fossarum type B (1062 individuals), with five nodes having data on both ( Figure S1). This preparation step resulted in a vector containing the corresponding node IDs where microsatellite data of either one or both species of the G. fossarum complex were available.

| Stochastic simulation
To compare the empirical data to simulated data, we used a discretetime and stochastic individual-based simulation (adapted from Fronhofer & Altermatt, 2017;Fronhofer et al., 2013Fronhofer et al., , 2014 which is governed by a dispersal rate (d), and by the connectivity matrix of the metapopulation that is identical to the one derived for the empirical data (G. fossarum network). Most of the parameters of the simulation were fixed (see Table 1) but informed by the study system or the empirical methods used. We assume ten neutral, diploid loci that can take any of 100 different values as alleles to explore genetic diversity. The mutation rate of those alleles is set to 0.0001, which is in the range of empirically observed values (Estoup & Angers, 1998).
We analysed two different mutational models: (1) a random mutational model, where, upon mutation, the value of the allele is randomly chosen with uniform probability from the 100 possible allele values; and (2) a stepwise mutational model (Kimura & Ohta, 1978), where, upon mutation, the value of the allele changes by +/− 1 with equal probability. In the latter case, we assume reflecting boundary conditions at 0 and 100.
We subsequently explored the full-orthogonal parameter space along three free parameters. This included (1)  individuals). All simulations were run with ten replicates, and for 10,000 generations each, which is sufficient to reach (quasi-)equilibrium (checked by plotting dynamics over 10,000 generations, data not shown). All population genetic analyses where performed on the individuals of the last generation (t = 10,000). The explored parameter space is detailed in Table 1. The simulation code is available on GitHub (see data accessibility section).

| Statistical analyses
For the spatial data (explanatory variables), we calculated a series of network metrics in order to identify the influence of network topology based on the G. fossarum network containing 2401 nodes and the subset of 95 nodes with microsatellite data available. The calculations were done in r 3.6.1 (R Core Team, 2019) with the package igraph 1.2.4.2 (Csárdi & Nepusz, 2006). The network metrics for single nodes were upstream distance from the outlet node, total upstream catchment area, directed and undirected betweenness centrality, directed and undirected closeness centrality, and degree centrality.
Directed and undirected measures correspond to either considering flow direction, or ignoring it. The upstream distance corresponds to the instream distance from the outlet node of the riverine network near Basel, where the river Rhine continues to France and Germany and gets into a different biogeographic zone, thereby naturally separating the catchments considered here. The closeness centrality corresponds to the reciprocal of the sum of the distances between a node and all other nodes in the riverine network. We standardized the closeness centrality (c) for analysis using the following approach: . In biological terms, all these network metrics capture the connectivity of single populations to the other populations, with higher values translating to reduced connectivity.
For the genetic data (response variables), we calculated mean allelic richness, observed heterozygosity, expected heterozygosity, and pairwise genetic differentiation (Nei F ST ;Nei, 1987) for G. fossarum type A and G. fossarum type B using the packages hierfstat 0.04-22 (Goudet, 2005) and adegenet 2.1.1 (Jombart, 2008;Jombart & Ahmed, 2011) within r 3.6.1 (R Core Team, 2019). We calculated these measures for both the empirical data and the simulated data.
Prior to modelling the empirical genetic response variables, we excluded highly correlated explanatory variables (Kendall's Tau >0.8; Figure S2), specifically undirected betweenness centrality, degree centrality, and directed closeness centrality. Additionally, we log-transformed the total upstream catchment area and the directed betweenness centrality to reduce skewedness ( Figure S3). We modelled the genetic response variables separately using linear models (LM) using the lm() function since the residuals followed a Gaussian distribution. We included all network metrics (upstream distance, total catchment area, directed betweenness centrality, undirected closeness centrality) and species as factors with all higher-level interaction terms. We applied backward stepwise selection (function step()) using AIC scores to reduce interaction terms. Additionally, we ran models without interaction terms and selected the most parsimonious one with a dredge approach using function dredge() from mumin package (Bartoń, 2020) based on AICc scores. We also calculated variance-inflation factors (VIF) for all explanatory variables in the interaction and simple linear models in order to detect signals of strong collinearity. We selected the overall best fitting model for each genetic response variable comparing the AIC score of the selected interaction model and the selected model without interaction terms, additionally requiring variance inflation factors to be around 1-2. We additionally conducted separate LMs for all explanatory variables individually for a qualitative comparison. Figures were plotted using the fitted values retrieved from the separate LMs with only one explanatory variable each and species included if AIC was lower, using the predict() function. We modelled pairwise genetic differentiation and isolation-by-distance using linear models, with instream distance as an explanatory variable, including species as factor. We ran models with or without interaction, with untransformed or logtransformed F ST values, or including a power term. Five negative F ST values arose from a calculation artefact and we manually set them to zero prior to modelling. For the models using a power term and for selecting the best fitting one, we ran 100 models varying the power term from 0 to 1 in steps of 0.01, subsequently checking for minimum AIC score. Finally, we selected the overall best fitting model for F ST based on AIC scores. We retrieved F-test statistics and R 2 as coefficient of determination for all models directly from the lm() function. To assess isolation-by-distance, we used a Mantel test using the function mantel() from the package vegan 2.5-6 (Oksanen et al., 2019).
To assess which parameter combination for the simulations best fit the observed data, we correlated simulated to empirical population genetic variables (mean allelic richness, mean observed heterozygosity, expected heterozygosity, and genetic differentiation).
If simulation results and empirical data were identical, they would lie on the 1:1 diagonal line when plotting empirical versus simulated data from identical nodes ( Figure S4a). So in order to formalize simulation-empirical data discrepancies we calculated deviations from this 1:1 line fit using the perpendicular offset (distance). This approach is straightforward and requires very few assumptions.
However, unlike a conventional correlation, this also assesses the fits to both the range (intercept) and the explicit arrangement (slope) of response variables (see Figure S4 for different scenarios of correlations 1, -1, and 0). We used the sum of perpendicular offsets (SPO) as well as the median of the perpendicular offsets (MPO) as goodness-of-fit measures. The SPO takes into account the overall spread of simulated values from their empirical counterpart, where a larger SPO indicates a poorer fit (e.g., Figure S4b vs. S4d).
Considering the median using MPO partially takes into account outliers of individual nodes. An MPO closer to zero indicates that most of the simulated values fell close to the empirical counterpart. Since the perpendicular offset does not take into account if the offset is above or below the vertical (1:1) line, we additionally computed the directed median of the perpendicular offset (DMPO, e.g., Figure S4b vs. S4c). To assess which specific parameter value for each of the varying parameters (dispersal rate, upstream movement probability, scaling of carrying capacity) generally best fitted to the observed data, we compared simulations with a specific parameter value to all corresponding simulations with the remaining parameter values of the same type. Specifically, we subtracted their goodness-of- Repeating this across all simulation combinations and all response variables (mean allelic richness, mean observed heterozygosity, expected heterozygosity) resulted in a fraction of comparisons with a negative sign. If this fraction was higher than 0.5, the former parameter value was considered superior, since models with this parameter value combination performed better in more than half of all comparisons. We calculated the SPO, the MPO, and the DMPO using our own functions, included in the analysis script available on GitHub (see data accessibility section). All calculations were done in 3 | RE SULTS
Both species showed spatial gradients of within-population genetic diversity when using mean allelic richness as a diversity metric, with higher values in more central nodes, visually apparent when plotted on a map (Figure 1a Linear models explained allelic richness and observed heterozygosity by network topology. Models with higher-level interactions performed worse than models without interactions based on AIC and VIF (data not shown) and we used the latter. Mean allelic richness was best explained by upstream distance in combination with directed betweenness centrality and undirected closeness centrality ( Figure 2a; F 3,96 = 8.5; p < .001; R 2 adj = 0.19). It significantly decreased with upstream distance from the outlet node within the riverine network in both species of the Gammarus fossarum complex.
This translates to higher allelic richness in more central and betterconnected nodes of the network. In addition, higher carrying capacity generally increased allelic richness, both in the simulations as well as in the empirical data (data not shown). Mean observed hetero-

F I G U R E 2
Empirically assessed mean allelic richness, mean observed heterozygosity, and expected heterozygosity of both species of the Gammarus fossarum complex (type A: orange points, type B: cyan points) with respect to different river network metrics. Raw data points as well as model fits of linear models (solid lines) and their 95% confidence intervals (shading) are given. Mean allelic richness as a function of (a) upstream distance from the outlet node within the riverine network and as a function of (b) standardized undirected closeness centrality. Mean observed heterozygosity as a function of (c) upstream distance from the outlet node within the riverine network and as a function of (d) standardized undirected closeness centrality. Expected heterozygosity as a function of (e) upstream distance from the outlet node within the riverine network and as a function of (f) standardized undirected closeness centrality [Colour figure can be viewed at wileyonlinelibrary. com]

F I G U R E 3
Maps depicting the predicted mean allelic richness for all 18 stochastic simulation scenarios show different spatial structuring along the Rhine riverine network of Switzerland. The gradient legends show mean allelic richness. Their scale is individually adjusted in each map for the best representation of spatial structuring. d is the dispersal rate and W is the upstream movement probability (W = 0 corresponds to no upstream movement, W = 1 represents equal probability of moving up-and downstream). The corresponding figure for mean observed heterozygosity is given in Figure S5, the one for expected heterozygosity is given in Figure S6.

| Simulation -data comparison
The stochastic simulations assuming a stepwise mutation model resulted in highly differentiated spatial patterns of population genetic diversity depending on the set of parameter values used (Figure 3, Figure S5 and Figure S6). Of the three different parameters considered (dispersal rate, upstream movement probability, scaling of carrying capacity), upstream movement probabilities generally showed the strongest effect on the response variable with respect to the parameter space covered, with unidirectional movement (W = 0) generally resulting in better model fits ( Figure S7). Using different dispersal rates in the stochastic simulations resulted in comparable effects on the genetic diversity (shift of overall median of perpendicular offsets), with low dispersal rates generally resulting in better model fits ( Figure S8). Scaling the carrying capacity (K = 1) consistently worsened the model fits and showed a smaller effect on the response variable compared to upstream movement probability and dispersal rate ( Figure S9).
Comparing the simulation outputs to empirical data showed The best fitting simulations for allelic richness according to SPO for both species were based on high dispersal rates (d = 0.1), no upstream movement (W = 0), and scaling of carrying capacity ( Figure 4 and Figure S10, Table S11). However, the next best fitting simulations were based on low dispersal rate and no scaling of habitat capacity, indicating interactions between parameters. For mean observed heterozygosity we found that the best fitting simulation according to SPO was based on low dispersal rates (d = 0.001), no upstream movement (W = 0), and no scaling of carrying capacity in G. fossarum type A. In G. fossarum type B, the best simulation fit required low dispersal (d = 0.001), moderate upstream movement (W = 0.5), and no scaling of carrying capacity ( Figure S12 and S13,  Figure S15 and Figure S16, Table S17).
When accommodating for outlier nodes by comparing MPO, the best fitting simulations for allelic richness for G. fossarum type A was based on high dispersal rates (d = 0.1), no upstream movement (W = 0), and scaling of carrying capacity (Figure 4 and Table   S18). For mean observed heterozygosity, it was based on low dispersal rates (d = 0.001), no upstream movement (W = 0), and no scaling of carrying capacity (Figure 4 and Table S19) carrying capacity (Figure 4 and  Figure S15 and Table S20) (Table S26). With G. fossarum type B, the best fitting simulation required high dispersal (d = 0.1), equal up-and downstream movement probabilities (W = 1), and no scaling of carrying capacity (Table S26). When considering outliers by comparing MPO, the best fitting simulations were identical to the ones considering SPO (moderate to high dispersal rates (d = 0.01 or d = 0.1), upstream and downstream movements being equally likely (W = 1), and no scaling of carrying capacity ( Figure S25 and Table   S27). Better fits of simulations without scaling of carrying capacity was mostly due to the smaller variance of the perpendicular offset ( Figure S25).
Comparing the simulations based on a random mutational model to the empirical data also confirmed the major influence of upstream movement probability and dispersal rate. However, the effect of dispersal rate was stronger than the effect of upstream movement probability. Models with low dispersal rates (d = 0.001) outperformed the ones with higher dispersal rates (d = 0.01 or d = 0.1) in 88% of the cases, simulations with no upstream dispersal (W = 0) outperformed simulations allowing some level of upstream dispersal (W = 0.5 and W = 1) in 76% of the cases, and simulations with no scaling of carrying capacity outperformed their counterparts with scaling in 65% of the cases. The best models according to SPO and MPO always required low dispersal rates (Figures and Tables   S28-S52).

| DISCUSS ION
Combining stochastic simulations and empirical data from two amphipod species within a large riverine network, we showed a clear  (Garza & Williamson, 2001).
Genetic differentiation between populations increased with increasing instream distance ( Figure 5). We confirmed the isolationby-distance pattern in the studied amphipod species over hundreds of kilometres across a large riverine network (Westram et al., 2013).
Overall, genetic diversity in Gammarus fossarum type A and type B did not differ significantly, despite being functionally (Eisenring et al., 2016) and phylogeographically distinct, with different colonization history in Switzerland and Europe in general (Wattier et al., 2020;Westram et al., 2013). This suggests that the riverine network entails similar constraints on the genetic diversity of organisms with comparable life histories.
We then compared the empirical data to stochastic simulations ran under different scenarios, to identify the main drivers of population genetic diversity ( Figure 3). Importantly, the simulations ran on the same, and spatially realistic, graph representation of the empirical river network from where the population genetics data originated, whereas previous studies on population genetic diversity in riverine networks strongly relied on more artificial representations of riverine networks (e.g. . The comparison using the sum of perpendicular offsets (SPO; Figures S10, S13, and S16) and the median of perpendicular offsets (MPO; Figures 4, S12, and S15) showed that simulations with no upstream movement matched best the observed patterns. We varied upstream movement considerably in our simulations, from upstream and downstream movements being equally likely (W = 1) to excluding upstream movement completely (W = 0). This biologically meaningful change did result in considerable differences in simulation matches. Among the best fitting scenarios, we usually find simulations without or moderate upstream movement, (Tables S11, S14, S17-S20). This highlights that dispersal asymmetry clearly contributes to explaining the observed spatial distribution of genetic diversity of the particular species in the studied river basin. The riverine network imposes a low and restricted connectivity compared to a lattice-type landscape and seems to reinforce the role of dispersal, rendering its directionality an important component, probably interacting with dispersal rate.
Dispersal rate showed a comparable and consistent response, with simulations based on a low dispersal rate (d = 0.001) generally outperforming simulations with higher dispersal rates ( Figure S8).
This strongly suggests that the magnitude of dispersal has a pronounced effect on population genetic diversity. Higher dispersal rates may homogenize populations (Bohonak, 1999 better. This finding is most probably caused by the good fit to the range of empirically observed allelic richness values, whereas the spatial configuration seems less fitting ( Figure S10). An alternative explanation suggests that in order to maintain high allelic richness in central parts of the riverine network, either one assumes similar population sizes coupled with low connectivity (resulting from low dispersal rate and asymmetric dispersal; see also discussion about random mutational model), or one assumes larger populations downstream coupled with high dispersal rates, especially upstream.
Note that these last conclusions depend on the mutational model we assume. They hold only in the stepwise mutational model and not in the random mutational model. The differences between the two mutational models mainly stem from the fact that in the stepwise mutational model there is a 50% chance of back-mutations since mu- Overall, this result clearly supports the notion that the genetic diversity in the fluvial network is shaped by the interaction between the parameters, that is, if dispersal rate is high but there is no upstream movement, the scaling of habitat capacities becomes important.
When comparing simulations with identical dispersal rates, those including restricted upstream movement (lower values of W) fitted better than simulations with no movement directionality ( Figure S7), indicating the influence of asymmetric gene flow in generating the observed patterns (Fraser et al., 2004). With low dispersal rates, the upstream movement probability did not play a major role in improving the model fit. Whereas a previous study already suggested an effect of asymmetric gene flow on the genetic diversity patterns in G. fossarum (Alp et al., 2012), fully directional gene flow seems unrealistic, given the subtle differences within simulations using the same dispersal rate (Morrissey & de Kerckhove, 2009 (Statzner & Holm, 1989).  Figure S24). The influence of upstream movement probability seemed to be overruled by the other two parameters ( Figure S24).
But again, simulations assuming no scaling of carrying capacity usually performed better than their scaled counterpart, confirming our finding from the within population genetic diversity. Therefore the main difference between comparisons of the within and between population genetic diversity was the required dispersal rate in order to maximize simulation fit to empirical data. been found and noted by Westram et al. (2013). Actually, some of the values are so high that a hidden diversity of genetically incompatible lineages cannot be excluded (see also Wattier et al., 2020 on the cryptic diversity within Gammarus fossarum). However, the study of such hidden cryptic diversity may require further markers not yet available.
Simulations that were based on the random mutational model resulted in slightly different fits to empirical data, with dispersal rate being the main driver and upstream movement probability contributing to the observed pattern (Figures and Tables S28-S52). Here, the best models always relied on low dispersal rate, often coupled with no upstream movement and no scaling of habitat capacity (Tables S32, S33, S38, S39, S44, S45). Therefore, if mutations arise randomly with respect to the locally occurring alleles, low dispersal rate presumably maintains genetic diversity by lowering exchange between populations. If coupled with asymmetric movement probability, this effect seems often pronounced.
In conclusion, our study showed a pronounced effect of upstream movement probability and dispersal rate within the riverine network on the genetic diversity of two amphipod species across a large spatial extent. The impact of assuming increasing local population sizes with increasing downstream distance was less clear cut but overall probably less important in shaping population genetic diversity. This suggests that for understanding and protecting genetic diversity of strictly riverine organisms, network connectivity is a key aspect to be considered.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data are deposited on GitHub under the following digital ob-