Hansarsia, Shaw, 2023
publication ID: https://doi.org/10.1093/zoolinnean/zlae031
DOI: https://doi.org/10.5281/zenodo.17523429
persistent identifier: https://treatment.plazi.org/id/0398BA60-F26C-FFE8-4810-7DF2FA39CB3C
treatment provided by: Plazi
scientific name: Hansarsia
Relevance of MLR models
Analysis 1 showed that the detection rate for the true labelling is higher than for random labelling in all three datasets (see the distance between the upper row of ‘true labelling’ boxes and the lower row of ‘random labelling’ boxes in Fig. 2). The detection rate is, therefore, driven by a similarity within molecular clades and does not depend on the number of individuals measured in each clade, which confirms the relevance of MLR models. In addition, the analysis showed that: (i) the detection rate grows rapidly with the number of characters in use, but 8–10 characters are enough for a perfect separation (median detection rate is nearly 1.0) of females and males (Fig. 2A, B) in our dataset; (ii) results for males are less satisfactory than those for females, i.e. the distance between the ‘true labelling’ and ‘random labelling’ boxes is relatively short (compare Fig. 2A, B); and (iii) the combined dataset provides a lower detection rate than the separate male and female datasets (median rate always below 1.0; Fig. 2C). At the same time, MLR models are still relevant for this dataset, which is evident from the significant distance between the ‘true labelling’ and ‘random labelling’ boxes.
Overall, analysis 1 shows the validity of our approach, adequacy of 8–10 characters for a perfect separation of clades within our dataset and redundancy of additional characters, and a preference for the use of ‘single-sex’ (preferably female) datasets.
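The logic of comparing true with random labelling can be illustrated with a minimal sketch. It is not the script used in the study: the data are synthetic (three hypothetical clades, two hypothetical proportion characters), the multinomial logistic regression is fitted by plain gradient descent, and the names `fit_mlr` and `detection_rate` are introduced here only for illustration. The assumption is that ‘random labelling’ means refitting the same model after shuffling the clade labels.

```python
import math
import random

def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_mlr(X, y, n_classes, epochs=400, lr=0.5):
    """Fit multinomial logistic regression by batch gradient descent."""
    n_cols = len(X[0]) + 1  # one weight per character plus an intercept
    W = [[0.0] * n_cols for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_cols for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            row = xi + [1.0]
            p = softmax([sum(w * v for w, v in zip(Wk, row)) for Wk in W])
            for k in range(n_classes):
                err = p[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(row):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_cols):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W

def detection_rate(W, X, y):
    """Share of specimens assigned to the clade given in y."""
    hits = 0
    for xi, yi in zip(X, y):
        row = xi + [1.0]
        scores = [sum(w * v for w, v in zip(Wk, row)) for Wk in W]
        hits += scores.index(max(scores)) == yi
    return hits / len(y)

# Synthetic stand-in data: 3 hypothetical clades x 20 specimens each.
rng = random.Random(0)
centres = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
X, y = [], []
for clade, (cx, cy) in enumerate(centres):
    for _ in range(20):
        X.append([rng.gauss(cx, 0.25), rng.gauss(cy, 0.25)])
        y.append(clade)

# True labelling: fit and score with the real clade labels.
true_rate = detection_rate(fit_mlr(X, y, 3), X, y)

# Random labelling: same specimens, clade labels shuffled before fitting.
shuffled = y[:]
rng.shuffle(shuffled)
random_rate = detection_rate(fit_mlr(X, shuffled, 3), X, shuffled)

print(true_rate, random_rate)
```

A clear gap between the two printed rates indicates that the model exploits real within-clade similarity rather than the mere sizes of the groups, because shuffling preserves the group sizes.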
Ability of MLR models to generalize
Analysis 2 showed a high ability of MLR models to generalize (Fig. 3) for possible extra datasets not included in our analyses. The analysis also showed that: (i) the models are relevant for both sexes and for all molecular clades regardless of the size of the datasets (grey lines indicate individual clades); (ii) the detection rate for females is higher than that for males, with maximal median values of .85–.95 for females (Fig. 3A) against .75 for males (Fig. 3B) and the combined dataset yielding an intermediate result, with median values of .80–.85; and (iii) five to seven characters are enough for a significant detection rate. The addition of new characters does not greatly improve the model: the use of seven female characters results in a detection rate only .1 lower than the use of the whole character set (Fig. 3A), and the use of five male characters results in the same detection rate as the use of the whole character set (Fig. 3B).
Overall, analysis 2 shows that our model is relevant for identification of new (not included here) specimens, the detection rate for females is better than that for males, and five to seven characters are enough for a significant detection rate for clades outside our dataset.
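The difference between in-sample detection and the ability to generalize can be sketched with a held-out evaluation. The example below is an illustration under stated assumptions, not the procedure of analysis 2: it assumes a leave-one-out scheme (the resampling scheme is not specified in this section), uses synthetic data, and swaps in a nearest-centroid classifier as a short stand-in for the MLR model. The function names are hypothetical.

```python
import random

def nearest_centroid_fit(X, y):
    """Stand-in classifier (not MLR): store the mean vector of each clade."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def nearest_centroid_predict(model, xi):
    """Assign a specimen to the clade with the closest mean vector."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(model[c], xi)))

def leave_one_out_rate(X, y, fit, predict):
    """Detection rate on specimens the model has never seen during fitting."""
    hits = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)

# Synthetic stand-in data: 3 hypothetical clades x 20 specimens each.
rng = random.Random(1)
centres = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
X, y = [], []
for clade, (cx, cy) in enumerate(centres):
    for _ in range(20):
        X.append([rng.gauss(cx, 0.25), rng.gauss(cy, 0.25)])
        y.append(clade)

loo_rate = leave_one_out_rate(X, y, nearest_centroid_fit, nearest_centroid_predict)
print(loo_rate)
```

Because each specimen is classified by a model fitted without it, the resulting rate estimates performance on new material rather than post hoc recognition.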
The most ‘powerful’ detection characters
Morphological characters made unequal contributions to the detection of molecular clades. Analysis 3 showed that half of the characters (6 of 12) provided the highest detection rate in most combinations (green–yellow spectra in Fig. 4A, B). Three of these characters were common for females and males: the proportions of the sixth pleonic somite (character 6), the ratio of the carpus length to the propodus length in the first thoracopod (character 8), and the proportions of the propodus in the first thoracopod (character 9). The three additional most powerful characters were different in females and males: the ratio of the length of the sixth pleonic somite to the length of the fourth + fifth pleonic somites (character 4) and the ratio of the length of the sixth pleonic somite to the length of the fifth pleonic somite (character 6) in females (Fig. 4A), and the proportion of the rostrum (character 2) in males (Fig. 4B).
The difference in detection power of various character subsets is shown in Fig. 4C, D. The use of the most ‘powerful’ character subsets (retrieved in analysis 2) provided a mean detection rate of ~.90 for females with six characters used (Fig. 4C), which is higher than the corresponding values for 6–10 random character subsets (Fig. 3A). In addition, the confidence intervals of detection rates for the ‘powerful’ subsets are narrower than for random subsets.
For the male dataset, the detection rates obtained on the most ‘powerful’ character subsets are ~.85 (four characters used; Fig. 4D), which is .1 higher than median detection rates for all possible subsets of the same sizes (Fig. 3B). Further increase of the number of characters did not improve the detection rate and even resulted in a slight decrease in the detection rate for more than six characters. This indicates that the models are likely to be prone to overfitting on the small male dataset and have a potential for improvement by adding new data to the male dataset. When we used the ‘weakest’ character subsets, the detection rate was .20 (females, five characters used) or .35 (males, three characters used) lower than in the case of the most ‘powerful’ characters; addition of new characters to ‘weak’ subsets resulted in an increase in the detection rate.
Overall, six ‘powerful’ characters for the female specimens and four characters for the male specimens present an optimal trade-off between the detection rate and the complexity of measurements.
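The comparison with ‘all possible subsets of the same sizes’ implies an exhaustive enumeration of character combinations. A short sketch shows how many subsets of each size exist for 12 characters; the helper `character_subsets` and the 1–12 numbering are assumptions made for illustration only.

```python
import math
from itertools import combinations

N_CHARACTERS = 12  # size of the full character set

# Number of distinct subsets for every subset size k = 1..12.
subset_counts = {k: math.comb(N_CHARACTERS, k) for k in range(1, N_CHARACTERS + 1)}
total_subsets = sum(subset_counts.values())

def character_subsets(k, n=N_CHARACTERS):
    """Yield every k-character subset, with characters numbered 1..n."""
    return combinations(range(1, n + 1), k)

for k in (4, 6):
    print(k, subset_counts[k])  # 4 -> 495, 6 -> 924
print(total_subsets)            # 4095 non-empty subsets in total
```

With at most 924 subsets of any one size, fitting a model on every subset is computationally feasible, which is what makes a ranking of ‘powerful’ and ‘weak’ subsets possible.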
DISCUSSION
Is our approach relevant?
Probably the most important outcome of our research is the possibility of detecting all genetic clades using only morphological characters, in contrast to the approach by Kulagin et al. (2024), which provided identification of only some of the clades. Qualitative characters that usually help in identification of valid species appear to be ineffective for recognizing the main bulk of molecular clades of Hansarsia (Kulagin et al. 2021, 2024). Conversely, continuous quantitative characters, such as proportions of somites and segments and ratios between them, do provide confident results, especially for females. Although we analysed the biggest currently available dataset of Hansarsia genetic clades, males are less abundant than females, which is usual for krill populations (Vereshchaka 1990, 1995). Results for males in analysis 1 are, therefore, less satisfactory than those for females; the distance between the ‘true labelling’ and ‘random labelling’ boxes is relatively short (Fig. 2A, B). Conversely, the use of females significantly increases the detection rate and provides statistically significant results.
As expected, analysis 1 proved post hoc recognition of all molecular clades within our dataset: we ran MLR models on the basis of the molecular clades previously identified via molecular methods. Analysis 2 resulted in a less expected but promising outcome: the models retrieved in analysis 1 can also be used for a satisfactory identification of the Atlantic Hansarsia in other datasets, i.e. before molecular analyses, with an estimated detection rate of .85–.95 (females) and .75 (males). The lower detection rate of males might be balanced by the use of additional qualitative male-specific characters, such as enlarged photophores and chitin saddles on the pleon (Kulagin et al. 2021).
The obtained MLR models based on morphology might be especially useful when molecular analyses cannot be run, e.g. during field studies or for damaged specimens with an incomplete set of qualitative characters, specimens stored in collections for a long time, and/or specimens fixed in formalin. Nonetheless, distinct morphological characters to diagnose molecular clades and separate them dichotomously are preferable. However, in the case of closely related clades this is impossible; instead, we propose an alternative approach, i.e. measurements of continuous characters and use of our MLR models. Identification scripts for the Atlantic clades of Hansarsia (along with the readme file) are provided in the Supporting Information (Table S2).
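How a fitted MLR model turns measured proportions into a clade assignment can be sketched as follows. This is not the identification script from the Supporting Information: the clade names, the two measurements, and every coefficient are invented placeholders, and `clade_probabilities` is a hypothetical helper. Only the general form (a linear score per clade followed by a softmax) is standard for multinomial logistic regression.

```python
import math

# Hypothetical coefficients: one (weights, intercept) pair per clade.
# Real values would come from a model fitted on measured specimens.
COEFFICIENTS = {
    "clade_A": ([2.0, -1.0], 0.0),
    "clade_B": ([-1.0, 2.0], 0.0),
    "clade_C": ([0.0, 0.0], 0.5),
}

def clade_probabilities(measurements, coefficients=COEFFICIENTS):
    """Return the softmax probability of each clade for one specimen."""
    scores = {
        clade: sum(w * x for w, x in zip(weights, measurements)) + intercept
        for clade, (weights, intercept) in coefficients.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {clade: math.exp(s - top) for clade, s in scores.items()}
    total = sum(exps.values())
    return {clade: e / total for clade, e in exps.items()}

# Two hypothetical morphometric ratios measured on a single specimen.
specimen = [1.5, 0.5]
probs = clade_probabilities(specimen)
best_clade = max(probs, key=probs.get)
print(best_clade, round(probs[best_clade], 3))
```

Reporting the full probability vector, rather than only the most probable clade, lets the user see how ambiguous an identification is for a given specimen.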
Sexual dimorphism in Hansarsia is an important factor calibrating morphometric proportions and, while running the MLR models, we should take this effect into account. The dimorphism drives male and female qualitative characters differently in various molecular clades, which results in a lower detection rate of the combined dataset, where sex-specific proportions are ‘averaged’ and mask each other (Fig. 2C vs. Fig. 2A, B). Differential use of the male and female scripts and separate running of MLR models for males and, especially, females provide a better detection rate.
A bigger dataset would provide an even better detection rate and increase the reliability of the model. Our dataset is limited owing to the deep-sea habitat of the genus, the requirement for special material (fresh alcohol-fixed and morphologically undamaged individuals with a complete set of characters), and the restricted opportunities of a single research team. We expect that application of our approach to other marine groups will result in more successful identification of molecular clades and better understanding of their distribution throughout the ocean.
Does cryptic diversity exist?
The presence of pseudocryptic and cryptic species is not a new finding. Recent papers have provided deeper insight into pseudocryptic and ‘truly’ cryptic diversity within taxa when distinguishing morphological characters can or cannot be found (e.g. Lajus et al. 2015, Sørensen et al. 2020, Feliciano et al. 2021). Our research, however, shows that we can identify morphologically any of nine genetic clades recorded using traditional molecular markers, such as COI, by measuring a limited number (in our case, six) of quantitative characters and using an appropriate mathematical tool. We can state that all recorded genetic clades of the Atlantic Hansarsia have distinct patterns in geographical distribution coupled with morphological variations and are likely to represent incipient species at various stages of divergence. Before this study, we could consider them as cryptic/pseudocryptic lineages or species; now, we identify all of them with a satisfactory detection rate and cannot formally call them cryptic species or complexes.
We hypothesize that any genetic divergence found via traditional mitochondrial and nuclear genetic markers is mirrored in morphological divergence (linked to variations in other genes) that can be detected using a combination of quantitative characters and appropriate mathematical tools. In this respect, ‘true’ cryptic species might not exist. We cannot detect molecular clades visually or by using simple statistical procedures, but MLR models and scripts (Supporting Information, Table S2) solve the problem.
And here we are faced with a taxonomic problem in the diagnosis of these species. The absence of qualitative characters and the greatly overlapping ranges of individual morphological characters make ordinary diagnoses impossible. We invite the scientific community to ponder this problem; an acceptance of scripts for identification of individual molecular clades, as proposed here, might be one of the options.
Our results might encourage other researchers to use quantitative morphological characters (similar to those we used) for post hoc detection of cryptic species in other taxa, which should lead to deeper insight into the distribution, diversity, and biogeography of these taxa. Finally, after a satisfactory accumulation of data, we expect that MLR models might retrieve cryptic species confidently, on the basis of morphological characters alone (i.e. before confirmation of these results with molecular methods), which is a worthy goal for systematic biologists.
The most ‘powerful’ morphological characters and their evolutionary significance
Figures 3 and 4 show that the use of all 12 characters might be redundant for confident detection of molecular clades. Only six characters, if chosen correctly, provide a significant detection rate that does not increase notably after addition of new characters (Fig. 3A, B). Nonetheless, the observed median detection rates of .85–.95 for females and .75 for males are not perfect, and a bigger dataset, if collected, could provide a higher rate. At the same time, our current algorithms provide a confident identification of ~9 out of 10 females (that dominate in krill populations), which is a precision that is not always guaranteed for identification of many tropical taxa even to species level. For molecular clades, this precision might currently be considered to be satisfactory.
Our analyses retrieved the most ‘powerful’ characters making the greatest contribution to the detection of the molecular clades, which might suggest their evolutionary importance. Indeed, the divergence of the Hansarsia molecular clades is coupled with a divergence of continuous quantitative morphological characters that are definitely adaptive.
We observed the greatest sex-independent divergence in the characters linked to proportions/ratios of the sixth pleonic somites and proportions of the terminal segments of the first thoracopod. These characters mirror (at a microevolutionary level) general phylogenetic traits recently observed at a macroevolutionary level in various pelagic eucarids, such as krill (Vereshchaka et al. 2019), Oplophoridae (Lunina et al. 2019), Acanthephyridae (Lunina et al. 2020), and Benthesicymidae (Vereshchaka et al. 2021).
The proportions of the sixth pleonic somite are known to be associated with the escape function; more elongated segments provide more efficient backward flips and successful escape from predators. The proportions of the distal segments of the first thoracopod are adapted to catching prey by carnivorous Nematoscelinae (Vereshchaka et al. 2019) and are likely to mirror fine feeding specialization of the molecular clades. Given that the clades often co-occur, natural selection should favour a divergence in specialization to a certain (and different for various co-occurring clades) prey, i.e. a fine-tuning of feeding strategies. These fine adaptations might illustrate microevolutionary morphological traits that might develop further to a deeper divergence at a macroevolutionary level.
The sex-dependent ‘powerful’ morphological characters are likely to be linked to mating functions. They are linked to the anterior part of males (proportion of the rostrum) and intermediate pleonic somites in females, which might reflect fine-tuning of mating. This fine-tuning is likely to facilitate coupling of mates belonging to the same clade (similar morphology of the coupling parts of the mates) and to complicate mating between different clades, hence providing earlier genetic isolation between the clades. As we know, the mating behaviour of euphausiids involves the anterior part of males and pleonic somites in females and consists of the same ‘chase’, ‘probe’, ‘embrace’, and ‘flex’ stages (Kawaguchi et al. 2011) as those observed in greater detail in the decapod penaeids (Misamore and Browdy 1996). During the ‘probe’ stage, the male probes the ventral surface of a swimming female with its antennules, which might serve to prepare the female for spermatophore placement, determine female receptivity and maturity for mating, and/or assist in species recognition (Bauer 1991, 1994). As a result, the anterior part of krill males and the pleon of females co-evolve at a macroevolutionary level (Vereshchaka et al. 2019), and this trend is visible in our analyses as microevolutionary morphological traits of molecular clades.
No known copyright restrictions apply. See Agosti, D., Egloff, W., 2009. Taxonomic information exchange and copyright: the Plazi approach. BMC Research Notes 2009, 2:53 for further explanation.