Hybridization enables the fixation of selfish queen genotypes in eusocial colonies

Abstract A eusocial colony typically consists of two main castes: queens that reproduce and sterile workers that help them. This division of labor, however, is vulnerable to genetic elements that favor the development of their carriers into queens. Several factors, such as intracolonial relatedness, can modulate the spread of such caste‐biasing genotypes. Here we investigate the effects of a notable yet understudied ecological setting: where larvae produced by hybridization develop into sterile workers. Using mathematical modeling, we show that the coevolution of hybridization with caste determination readily triggers an evolutionary arms race between nonhybrid larvae that increasingly develop into queens, and queens that increasingly hybridize to produce workers. Even where hybridization reduces worker function and colony fitness, this race can lead to the loss of developmental plasticity and to genetically hard‐wired caste determination. Overall, our results may help understand the repeated evolution toward remarkable reproductive systems (e.g., social hybridogenesis) observed in several ant species.


A.1.2 Recurrence equations for the evolution of genotype frequencies
Our first goal is to develop recurrence equations for the frequencies of each genotype in males and females (i.e. express p u (t + 1) and p v (t + 1) in terms of p u (t) and p v (t) for u ∈ {a, b} and v ∈ {aa, ab, bb}). By definition, these frequencies can be written as p u (t + 1) = n u (t + 1) n a (t + 1) + n b (t + 1) p v (t + 1) = n v (t + 1) n aa (t + 1) + n ab (t + 1) + n bb (t + 1) , where n u (t + 1) is the number of males of genotype u ∈ {a, b} at generation t + 1, and n v (t + 1) the number of queens of genotype v ∈ {aa, ab, bb} at generation t + 1 in the mating pool. Under our assumption that the probability for a sexual to reach the mating pool increases with the workforce of a colony (section 2 in main text), the numbers of males and females of each genotype can be expressed as: n v (t + 1) = x aa v (t)n aa (t)p aa (t) + x ab v n ab (t)p ab (t) + x bb v n bb (t)p bb (t) n v (t + 1) = x aa v (t)n aa (t)p aa (t) + x ab v (t)n ab (t)p ab (t) + x bb v (t)n bb (t)p bb (t), where x u v (t) is the number of males with genotype v ∈ {a, b}, and x u v (t) the number of queens with genotype v ∈ {aa, ab, bb}, produced by a colony founded by a queen of genotype u ∈ {aa, ab, bb} at generation t. Following Reuter and Keller (2001), we assume that these numbers are proportional to the energy invested into the production of sexuals. So instead of numbers, x u v (t) can be viewed as the investment into the production of males (of genotype v ∈ {a, b}) and x u v (t) into the production of queens (of genotype v ∈ {aa, ab, bb}) by a colony whose queen has genotype u ∈ {aa, ab, bb}. Finally, n u (t) is the effective workforce of a colony whose queen has genotype u ∈ {aa, ab, bb} at generation t. This effective workforce is given by the sum of all types of workers present in a colony, including hybrids (with the latter weighted by their efficiency e), i.e. n u (t) = x u aa (t) + x u ab (t) + x u bb (t) + ex u hyb (t) where x u v (t) is the investment into the production of workers of genotype v ∈ {aa, ab, bb, hyb} (with hyb denoting hybrid genotype) made by a colony whose queen has genotype u ∈ {aa, ab, bb} at generation t. The parameter α > 0 determines the effect of the workforce on the probability for a sexual to reach the mating pool. When α = 1, investment in workers affects the survival of queens and males linearly (i.e. one extra unit of workforce always increases survival by the same amount). By contrast when α < 1, investment in workers show diminishing returns. Conversely when α > 1, investment in workers show increasing returns. For most of our analyses, we assume linear effects of the workforce (α = 1). We relax this assumption in section B.2.3.
We specify the investments into males, x u v (t), queens, x u v (t), and workers, x u v (t), in terms of model parameters in Table S1. For the sake of completeness, we do so for a model that encompasses all the effects explored sequentially in the main text, i.e. we allow for both traits ω and η to coevolve; for a finite number m of mates for each queen; and for a fraction c of a queen's brood to be produced via parthenogenesis. To read Table S1, note that the different investments made by a colony with a queen of type u ∈ {aa, ab, bb} (i.e. x u v (t), x u v (t), and x u v (t)) depend on the types of males she has mated with. To capture this, we let M u,v be the random number of males of genotype v ∈ {a, b, h} (where h denotes allospecific type) that a queen of genotype u ∈ {aa, ab, bb} mates with. Assuming that each mate is independent from one another, these random variables follow a multinomial distribution with parameters, where m is the total number of mates; (1 − η u )p v (t) is the probability that in one mating event a queen of type u mates with a conspecific male of type v ∈ {a, b} (which requires that this queen does not hybridize, with probability (1 − η u ), and encounters a male of type v, with probability given by its proportion, p v (t)); and η u is the probability that in one mating event a queen of type u mates with an allospecific male.
To get to the recurrence equations tracking the proportion of males and queens of each genotype, we first substitute the entries of Table S1 into eq. (A-2) (with α = 1). Doing so we obtain polynomials for the numbers n v (t + 1) (for v ∈ {a, b}) and n v (t + 1) (for v ∈ {aa, ab, bb}) in terms of the random variables M u,a , M u,b , and M u,h (with u ∈ {aa, ab, bb}). We marginalise (i.e. take the expectation of) these polynomials over the joint probability mass function of M u,a , M u,b , and M u,h for each u ∈ {aa, ab, bb}, which is given by eq. (A-3). Finally, the so-obtained numbers of different types of individuals (eq. A-2) are substituted into eq. (A-1). From this operation and using the fact that p a (t) = 1 − p b (t) and p aa (t) = 1 − p bb (t) − p ab (t), we obtain a  consider for e.g. the investment in queens of genotype ab in a colony led by a queen of genotype ab (fourth row, second column). First, queens can arise only from the fraction f of the brood that is diploid. Next, as the laying queen is of genotype ab, the fraction c of diploid eggs that are produced through parthenogenesis will also be of genotype ab. The fraction (1−c) of diploid eggs that are fertilised through regular sex is ab with a probability that depends on the queen's mates: (M ab,a + M ab,b )/(2m) (assuming random chromosomal segregation and fertilisation, e.g. because the amount of sperm provided by each male is the same and well-mixed within a queen's spermathecae). Finally, diploid ab eggs develop into queens with probability 1 − ω ab .
The other entries of the table are derived similarly. recurrence equation, that is characterised by a mapping F : [0, 1] 3 → [0, 1] 3 . This recurrence is too complicated to be presented here for the general case but can straightforwardly be iterated numerically to track allelic frequency changes for given parameter values (see Mathematica notebook for e.g.).
A.2 Long-term evolution: adaptive dynamics To gain more analytical insights, we use the recurrence eq. (A-4) to study the long term adaptive dynamics of both traits under the assumption that traits evolve via mutations that are rare and with weak additive phenotypic effects.

A.2.1 Invasion fitness of rare additive allele
An adaptive dynamics model is typically based on the invasion fitness of a mutant allele in a population that is otherwise fixed for a resident allele (i.e. the asymptotic growth rate of a mutant allele). To obtain this invasion fitness, we first introduce some notation. We denote the resident allele by a vector z = (ω, η) where ω is probability that a larva homozygote for the resident allele develops into a worker, and η is the probability that a mate of queen homozygote for the resident allele is allo-specific. Similarly, the mutant allele is described by a vector ζ = (ω + δ ω , η + δ η ) whose first entry gives the probability that a larva homozygote for the mutant allele develops into a worker, and whose second entry is the probability that a mate of a queen homozygote for the mutant allele is allo-specific (δ ω and δ η thus denote the mutant effect on trait values). Assuming additive genetic effects on phenotypes, a heterozygote then expresses phenotype (ω + δ ω /2, η + δ η /2).
To use the recurrence equations developed in the previous section, we arbitrarily set allele a as the resident and b as the mutant. The allele specific trait values (appearing in table S1 and eq. A-3) are then replaced by: Next, we use the fact that the mutant is rare so that its frequency in the population is of the order of a small parameter denoted 0 < 1. As a rare allele can only be found in heterozygous form in a large panmictic population, the initial dynamics of a mutant allele b can be described through linear approximations of p b (t + 1) and p ab (t + 1) at a near-zero frequency of b (e.g. Brännström et al., 2013). In other words, eq. (A-4) can be linearised to where A(ζ, z) is a 2 × 2 matrix that depends on mutant and resident phenotypes, ζ and z, and is a small parameter of the order of the frequency of the mutant b in males and queens.
The invasion fitness of the mutant phenotype, which we write as W (ζ, z), is then given by the leading eigenvalue of A(ζ, z) (e.g. Caswell, 2000), i.e.
where λ max (M) gives the leading eigenvalue of a matrix M. In a large population, W (ζ, z) tells the fate of the mutant allele. If W (ζ, z) ≤ 1, then the mutant allele is purged by selection and vanishes with probability one. Otherwise if W (ζ, z) > 1, the mutant has a non zero probability of invading the population (e.g. Brännström et al., 2013).

A.2.2 Directional selection
When mutations are rare with weak phenotypic effects, the population first evolves under directional selection whereby an advantageous mutation fixes before a new mutation arises so that the population "jumps" from one monomorphic state to another (Dercole and Rinaldi, 2008).
To study these dynamics, we use the selection gradient, s(z), which is a vector pointing in the direction favoured by selection at every point z ∈ [0, 1] × [0, 1] of the phenotypic space (i.e., the space of all possible phenotypic combinations with ω and η both between 0 and 1 as they are both probabilities) . This vector is given by the marginal effect of each trait on invasion fitness, i.e.
where s ω (z) and s η (z) give the direction of selection on ω and η respectively.
Singular strategies. A singular strategy, z * = (ω * , η * ), is such that all selection gradients are equal to zero, A singular strategy therefore represents a potential equilibrium in the context of adaptive dynamics (Brännström et al., 2013).
Jacobian matrix and convergence stability. Whether the population evolves towards or away from a singular strategy z * depends on the Jacobian matrix, Specifically, one necessary condition for a singular strategy to be an evolutionary attractor is that the greatest real part of the eigenvalues of J(z * ) is negative (Leimar, 2009). Such a singular strategy z * is said to be convergence stable. Otherwise, the population will be repelled away from z * . Even if z * is convergence stable, it is possible for the population to evolve away from z * when both evolving traits are genetically correlated (Leimar, 2009). A sufficient condition for a singular strategy to be an attractor is that the symmetric part of the Jacobian matrix, (J(z * ) + J(z * ) T )/2, is negative-definite, in which case z * is said to be strongly convergence stable (Leimar, 2009). When this is true, the population evolves towards z * , whatever the genetic correlations between both traits (i.e. independently from the statistical distribution of mutational effects on both traits).

A.2.3 Stabilising/disruptive selection.
Once the population is at an equilibrium for directional selection (i.e. a convergence stable phenotype), it either remains monomorphic under stabilising selection (when the equilibrium is evolutionary stable or uninvadable, Parker and Maynard Smith, 1990) or becomes polymorphic due to disruptive selection (when the equilibrium is not evolutionary stable or invadable, Geritz et al., 1998). When two traits are coevolving, this depends on the Hessian matrix (Phillips and Arnold, 1989, Leimar, 2009, Geritz et al., 2016, Otherwise, selection may be disruptive and the population may experience evolutionary branching, whereby it splits among two diverging morphs (Geritz et al., 1998, Leimar, 2009, Geritz et al., 2016.

A.3 Individual-based simulations
To complement our mathematical analysis, we also performed individual based simulations (an R script implementing these is provided as a supplement here: https://zenodo.org/record/ 4434257). These simulations track a population of N q = 10000 diploid queens (with f = 0.5, see figure legends for other parameters). Each queen is characterized by its genotype: a pair of haplotypes, each of which is given by the values of ω and η they code for (so four genotypic values in total). Simulations are initialized by setting both haplotypes of all N q queens to the same arbitrary values (i.e. we start with a monomorphic population). Each generation of a simulation consists of the following steps: 1. Mating. First, queens mate. To model this process, we first compute the propensity η i of each queen i ∈ {1, 2, . . . , N q } to hybridize as the mean of the two relevant alleles it is carrying. Then, each queen i is mated with a number m i of conspecific haploid males. This number m i is drawn from a binomial distribution with m trials and success probability (1 − η i ) (in line with eq. A-3). At the first generation, all males carry the same genetic values for ω and η as queens (i.e. the initial trait values). In subsequent generations, males are sampled (with replacement) as single haplotypes from the 2i haplotypes present in the laying queens of the previous generation. Following eq. (A-2a), the probability to sample a given haplotype is weighted by the investment in workers within its colony of origin (as the investment in workers increases the probability for males to reach the mating pool).
2. Colony development. Each queen i settles to form a colony. We characterise each colony in two steps. First, a list is constructed that contains the 2m i non-hybrid diploid female genotypes produced within each colony (i.e. the combinations of the alleles of a queen and of its conspecific mates). If thelytokous parthenogenesis is included (c > 0), the genotype of the queen itself is added to this list. Second, the investment in workers within each colony is calculated following equations in table S1 and eq. (A-2b). These calculations use the genetic value expressed by each of the 2m i + 1 non-hybrid genotype within the female progeny (characterised in the first step), as well as the proportion of the brood produced sexually and asexually (the parameter c), the proportion of conspecific and allospecific males the queen has mated with (i.e. m i /m and 1 − m i /m), and the efficiency of hybrid workers (the parameter e).
3. Next-generation queens. To generate the next generation of queens, N q new diploid female genotypes are sampled (with replacement) from all non-hybrid genotypes produced within each colony. Following table S1, the probability to sample a given genotype is weighted by its own genetic value of (1 − ω) and by the investment in workers within its colony of origin (as the investment in workers increases the probability for queens to reach the mating pool). Finally, each genotypic value independently mutates with probability 10 −2 . Mutation effects are drawn independently from a normal distribution with mean 0 and standard deviation 10 −2 . Mutated genetic values are capped between 0 and 1 to ensure that traits remain within their domain of definition.

B Analyses
Here, we present the derivations of our results summarised in the main text. These derivations are organised in the same order as they appear in the main text. As a supplement, we also provide a Mathematica (Wolfram Research, 2020) notebook that allows to follow our analyses.

B.1 Baseline model
We first explore the baseline case where females mate with a large (effectively infinite) number of mates and there is no parthenogenesis (i.e. when m → ∞ and c = 0).

B.1.1 Independent evolution of genetic caste determination
As presented in the main text, we initially assume that hybridization η is fixed and only caste determination ω evolves. Using eq. (A-8) with m → ∞ and c = 0, we find that the selection gradient on genetic caste determination is, Accordingly, there is a unique singular strategy ω * for caste determination when hybridization η is fixed (i.e. ω * such that s ω ((ω * , η)) = 0), which is eq. 1 of the main text.
It is straightforward to show that with hybridization fixed, the singular strategy (eq. B-2) is convergence stable (plugging eq. B-2 into the Jacobian, that is eq. A-10, for a single trait with as well as uninvadable (plugging eq. B-2 into the Hessian, that is eq. A-11, for a single trait with m → ∞ and c = 0), Therefore, when hybridization is fixed, our analyses show that genetic caste determination will gradually evolve to the singular value eq. (B-2) and remain monomorphic for this value (which is what we observe when we simulate this scenario, fig. 2A).

B.1.2 Coevolution of genetic caste determination and hybridization
An unstable singularity. When both caste determination ω and hybridization η evolve, their trajectories under directional selection are given by the selection gradient vector, (from eq. A-8 with m → ∞ and c = 0). Solving the above for z * = (ω * , η * ) such that s(z * ) = (0, 0) yields a single singular strategy in two dimensional trait space, which is plotted in fig. 3A against e. However, when we look at the Jacobian matrix of the system eq. (B-5) at this singular value (i.e. substitute eqs. B-5 and B-6 into eq. A-10), we see that this matrix has a negative determinant, det (J(z * )) = − 9 64e 2 < 0 (B-8) so its eigenvalues cannot both be negative (since the product of the eigenvalues of a matrix is equal to its determinant). Hence the singular value z * eq. (B-6) is not convergence stable, but rather an evolutionary repellor.
Our result that evolutionary trajectories will be repelled away from the singular value eq. (B-6) tells us that adaptive dynamics will eventually get to the boundary of the trait space. This trait space consists of the square [0, 1] × [0, 1] (as both traits must be between zero and one). Two edges of this square (when ω = 1 or η = 1) cannot be accessed by evolutionary dynamics as either of these trait values lead to zero fitness (as a population monomorphic for ω = 1 or η = 1 produces no queen in our baseline model). We can therefore focus on dynamics along the edges η = 0 or ω = 0 of the trait space, which respectively correspond to the case of hybridization avoidance and worker-loss.
Convergence to hybridization avoidance. Evolutionary dynamics will settle somewhere on the edge where hybridization is absent in the population (η = 0) only if: (1) selection on hybridization maintains it at zero (i.e. s η (z) ≤ 0 when η = 0); and (2) selection on caste determination settles for an equilibrium ω * (i.e. s ω (z) = 0 for some ω * when η = 0). From eq. (B-5), these two conditions are true when e ≤ 1/2 and the equilibrium for caste determination is simply ω * = 1/3 (in line with eq. B-2). As established in eq. (B-3), this equilibrium is convergence stable and evolutionary stable when η is fixed.

B.1.3 Decomposition of directional selection in terms of inclusive fitness effects
The kin selection approach. In this section, we use the so-called "kin selection" or "inclusive fitness" approach to obtain the selection gradient eq. (B-5) (Taylor and Frank, 1996). This approach, which is based on invasion analyses of alleles in class-structured populations, gives the same quantitative result about directional selection than other common methods in theoretical evolutionary biology such as adaptive dynamics, population or quantitative genetics (assuming genetic variance for traits is small, e.g. Taylor and Frank, 1996, Rousset, 2004, Lehmann et al., 2016. But one particular advantage of a kin selection approach is that it immediately decomposes directional selection on mutant alleles into the sum of: (1) their direct fitness effects on the reproductive success of the individuals that express them; and (2) of their indirect fitness effects on other related individuals that can also transmit them. This decomposition allows to delineate the various forces at play in the evolution of social behaviours (Hamilton, 1964). Here, we use it to better understand the evolution towards worker-loss (and obtain fig. 3E-F).

We follow Taylor and Frank (1996)'s general method. Consider a population with mean trait
values ω and η. In this population, consider a focal colony that is home to a mutant allele that codes for deviant trait values η • in queens and ω • in larvae that carry this allele. Let ω 0 denote the mean trait value expressed by all larvae within this focal colony. Using this notation, the expected number of successful (i.e. that mate) males that are produced by the focal colony and that carry the mutant allele is given by, where the numerator and denominator are the total number of males produced by the focal and a random colony, respectively. For the focal colony (the numerator), (1 − f ) is the probability that an egg is haploid (i.e. male) while the term in square brackets is the colony's investment in workers (which in our model is also the probability that a sexual reaches maturity). The denominator follows the same logic for an average colony in the population.
Similarly, the expected number of successful queens that are produced by the focal colony that carry the mutant allele is, is the number of queens produced in the focal colony and the term in square brackets is the probability that a queen survives till mating (i.e. the colony's investment in workers).
Fitness effects within a mutant colony. With the above notation, the selection gradient vector can then be computed as, where all derivatives are evaluated at ω • = ω 0 = ω and η • = η 0 = η; r lm is the relatedness of a female larva to a brother; r lf is the relatedness of a female larva to a sister; r qm is the relatedness of a queen to its sons; r qf is the relatedness of a queen to its daughters; v is the reproductive

B.1.4 Correspondence with Reuter and Keller (2001)
Here we connect our results to those of Reuter and Keller (2001) in workers on colony productivity. More specifically, if we set their term ∆ c = δs/(δw) × 1/f (their notation in their eq. 3, where ∆ c corresponds to the gain in sexual production brought by one additional worker) to -14) and assume that the population is monogynous and highly polyandrous with balanced sex-ratio (i.e. in their notation, f = 1/2; g f = 1/4; g m = 1/2; v f = 2; v m = 1), then we find that eq. 3 of Reuter and Keller (2001) is proportional to our selection gradient s ω (z) (eq. B-1) with η = 0.
In line with this, both yield the convergence stable equilibrium w * = 1/3.

B.2 Extensions
We now consider several extensions to our baseline model.

B.2.1 Effect of finite matings
First, we relax our assumption that queens mate with an infinite number of mates (i.e. m < ∞).
Selection gradient. Working from eq. (A-8) with c = 0, we find that the selection gradient vector on caste determination ω and hybridization η under finite matings reads as, -15) These gradients are complicated but we can extract relevant information by starting our analysis on the two boundaries of the trait space along which evolutionary dynamics may end up (ω = 0 or η = 0). Using eq. (B-15), we ask first when is worker-loss (ω = 0) stable? And second when is hybridization avoidance (η = 0) stable?
Stability of worker-loss. Worker-loss is stable only if: (1) selection maintains ω at zero (i.e.
s ω (z) ≤ 0 when ω = 0); and (2) selection on hybridization settles for an equilibrium η * (i.e. Stability of hybridization avoidance. Conversely, hybridization avoidance is stable only if: (1) selection on hybridization maintains η at zero (i.e. s η (z) ≤ 0 when η = 0); and (2) selection on caste determination in absence of hybridization settles for an equilibrium ω * (i.e. s ω (z) = 0 for some ω * when η = 0). From eq. (B-15), these two conditions reduce to (region below plain line in fig. 4A) and .  fig. 4A), hybridization cannot evolve when rare and worker-loss cannot be maintained. We therefore focus on the three remaining cases where worker loss can emerge.
Doing so, we find that there are four possible types of evolutionary dynamics.
Type 1: Evolution towards worker-loss. Where condition (B-16) is met but (B-18) is not (dark green region of fig. 4A), selection favours the emergence of hybridization and maintenance of worker-loss. In addition, it can be shown that under these conditions, there exists no singular strategy within the trait space (i.e., there exists no z * = (ω * , η * ) such that 0 < ω * , η * < 1 and s(z * ) = (0, 0), e.g. using the function Reduce[] in Mathematica, see notebook). This means that the phase portrait of evolutionary dynamics is qualitatively the same as in fig. 3D: worker-loss always evolves.
Type 2: Evolution towards worker-loss or hybridization avoidance depending on initial conditions. Where conditions (B-16) and (B-18) are met simultaneously, both workerloss and hybridization avoidance are stable so either strategy is maintained when common (when m ≥ 5, light green region of fig. 4A). Under these conditions, we find that there exists a singular strategy within the trait space (top row, columns m = 5 and m = 6 in fig. S1 for numerical values, see Mathematica notebook for analytical expression). When we compute numerically the leading eigenvalue of the system's Jacobian matrix, we find that it is positive ( fig. S1, second row, columns m = 5 and m = 6, dashed line), revealing that the singularity is an evolutionary repellor. Therefore the phase portrait of evolutionary dynamics is qualitatively the same as in fig. 3C: depending on initial conditions, evolutionary dynamics will lead to worker-loss or hybridization avoidance. This says that disruptive selection in our model is due to correlational selection between caste determination and hybridization (i.e. the selection that associates caste determination and hybridization, Phillips and Arnold, 1989) and only occurs because both traits are coevolving Figure S1: Properties of the internal singular strategy under monoandry and low polyandry. Each column describes the unique internal singular strategy for a specific value of m. Top row: value of the singular strategy (ω * in green, η * in black) within the range of e for which an internal strategy exists (range given by eqs. B-16 and B-18; Mathematica notebook for value of singular strategy). Bottom row: leading eigenvalues of the Jacobian (dashed line; for convergence stability), symmetric part of the Jacobian (dotted line; for strong convergence stability) and Hessian (full line; for evolutionary stability) matrices at the singular strategy (Mathematica notebook for calculations).
(i.e. if either trait evolves while the other is fixed, the population remains monomorphic, e.g. Mullon et al., 2018). We also find that S2A, blue line), which tells us that correlational selection is positive (i.e. selection favours a positive correlation between caste determination and hybridization within individuals, Phillips and Arnold, 1989). This is confirmed by individual based simulations, in which we observe the emergence of a polymorphism characterised by a positive correlation between ω and η within haplotypes ( fig. 4C and fig. S2B-D).  Figure S2: Polymorphism under monandry is due to positive correlational selection. A. Characteristics of the Hessian matrix at the internal singular strategy as a function of e for m = 1 (first column of fig. S1 for singular value): quadratic selection coefficient on ω (h ωω (z * ), in green) and on η h ηη (z * ), in grey); correlational selection (h ωη (z * ), in blue) and its relative strength (h ωη (z * ) 2 −h ωω (z * )h ηη (z * ), in black, Mathematica notebook for calculations).

B.2.2 Effect of thelytokous parthenogenesis
When we allow for a fraction c of a queens brood to be produce parthenogenetically, the selection gradient (obtained from eq. A-8) is too complicated to be displayed or for singular strategies to be found analytically. We therefore go through an invasion analysis similar to above (Appendix B.1.2 and B.2.1) and again ask: (1) under which conditions and values of ω is hybridization avoidance (η = 0) stable? and (2) under which conditions and values of η is worker-loss (ω = 0) stable?
Stability of hybridization avoidance. Hybridization avoidance is stable if selection on caste determination settles for an equilibrium ω * in the absence of hybridization (i.e. s ω (z) = 0 for some ω * when η = 0), and if selection on hybridization at this equilibrium maintains η at zero (i.e. s η (z) ≤ 0 when η = 0 and ω = ω * ). These two conditions respectively reduce to, . (B-23) Condition eq. (B-23) corresponds to the area of the graph below the plain line in fig. 5A where hybridization avoidance is stable. Conversely, the area above the plain line in fig. 5A-B (in blue) is where avoidance is not stable and thus where hybridization evolves.
Worker-loss coupled with complete hybridization. In principle, it is also possible with parthenogenesis for a population to evolve worker-loss (ω = 0) with complete hybridization (η = 1) (as parthenogenesis allows the production of queens in the absence of intraspecific matings). We therefore further need to determine whether worker-loss can also be stable in the case where η = 1 (rather than for some 0 < η * < 1). We find that selection under workerloss (ω = 0) and complete hybridization (η = 1) maintains both worker-loss and complete hybridization (i.e. s ω (z) ≤ 0 and s η (z) ≥ 0 where z = (ω, η) = (0, 1)) when

B.2.3 Effect of non-linear workforce productivity
Our analyses so far have assumed a linear effect of worker number on colony fitness (α = 1 in eq. A-2b). Here we investigate how non-linear effects of the number of workers on the pre-mating survival of virgin queens and males influence our results. We restrict our exploration to the case where queens mate with an infinite number of males and do not reproduce via parthenogenesis for simplicity (m → ∞ and c = 0). With α in eq (A-2b) as a variable, we find from eq. (A-8) that the selection gradient vector now reads as, Solving for both of these gradients to vanish simultaneously, we find that there exists a unique singular strategy,      ω * = e + e − 1 3 η * = 1 + 3e (e − 1)(1 + 2α) (B-28) ( fig. S3). The Jacobian matrix (eq. A-10) of the system eq. (B-27) at this singular value eq. (B-28) reads as It is straightforward to show from eq. (B-29) that the singular strategy eq. (B-28) is a repellor, just as under linear effects (α = 1, eq. B-7). This indicates that as illustrated in fig. 3, the coevolution of caste determination and hybridization under non-linear effects also lead to either hybridization avoidance or worker-loss depending on parameters and initial conditions.
We can gain further insights into the influence of non-linear effects by determining when the singular strategy eq. (B-28) is within the trait space (i.e., when 0 < ω * , η * < 1). We find that this is the case when 1 4 < e < 1 + 2α 4 + 2α (B-30) (light green region in fig. S3). This means that the threshold value for worker efficiency e above which worker-loss can evolve is 1/4 (as under linear effects α = 1). Condition (B-30) further shows that the threshold for e above which worker-loss always evolves (i.e. independently from initial conditions, fig. 3D for e.g.) increases with α (dark green region in fig. S3). In other words, the evolution of worker loss is facilitated under diminishing (α < 1, fig. S3A) and impaired under increasing returns (α > 1, fig. S3C). Figure S3: Non-linear effects of investment in workers. Singular values for η (in black) and ω (in white) as a function of hybrid worker efficiency e (given by eq. B-28).