MITOCHONDRIAL DNA DIVERSITY, DIFFERENTIATION AND PHYLOGEOGRAPHY OF THE SOUTH AMERICAN RIVERINE AND COASTAL DOLPHINS SOTALIA FLUVIATILIS AND SOTALIA GUIANENSIS

ABSTRACT: Here we consider the phylogeography and population structure of the South American coastal and riverine dolphins, Sotalia guianensis and Sotalia fluviatilis, based on samples (n = 76) collected across more than 9000 km of the species distribution. Phylogenetic reconstruction of 31 distinct haplotypes based on a combined analysis of two mitochondrial gene fragments (1052 bp) revealed clear genetic differences between riverine and coastal individuals consistent with species-level ranking. Within the coastal species, a spatial analysis of molecular variance of the control region sequences showed significant regional population differentiation (F_ST = 0.4; Φ_ST = 0.6; P < 0.001). The highest mitochondrial diversity among coastal population units was found along the Caribbean Coast of Colombia and Venezuela. The genetic distinctiveness of the Maracaibo Lake (Venezuela) population has conservation implications regarding the threats faced by the animals in this region, including oil exploitation. Brazilian populations of Sotalia showed the lowest mitochondrial diversity and differentiation among the coastal species, warranting further investigation. The Amazonian populations showed the highest mitochondrial diversity overall, suggesting a surprisingly large effective population size (N_ef) and relatively high female gene flow throughout the sampled regions of the main river and its tributaries. From our results, at least two different conservation strategies need to be developed, one for each of the proposed sister-species. For the coastal groups, characterized by restricted gene flow and very localized populations along the Caribbean and Atlantic Coast of South America, it is advisable to work at a local level in order to improve fishing practices and prevent frequent dolphin entanglement in nets. For the Amazonian groups, priority must be given to maintaining the connectivity detected between regions. Obstacles to connectivity, including hydroelectric and dam construction, as well as excessive boat traffic, could affect the future of these populations.

Introduction
The coastal and riverine forms of the South American dolphin Sotalia have been recently proposed and recognized as different species (Monteiro-Filho et al., 2002; Cunha et al., 2005; Caballero et al., 2007). The coastal species, S. guianensis, ranges from Nicaragua (Carr and Bonde, 2000) to Southern Brazil (Borobia et al., 1991; da Silva and Best, 1996b), including the Caribbean islands of Trinidad and Tobago. An apparently distinct population has also been described in Lake Maracaibo, Venezuela, with morphological characteristics different from other coastal individuals (Hershkovitz, 1962; Casinos et al., 1981). The riverine species, S. fluviatilis, ranges throughout the Amazon River and most of its tributaries (da Silva and Best, 1994; da Silva et al., 2011; Gómez-Salazar et al., 2010 this volume). Although Sotalia are also reported 250 km up-river in the Orinoco (Gómez-Salazar et al., 2010 this volume), it is unclear if these animals are residents or transients from the coast (Boher et al., 1995).
Sotalia is considered 'data deficient' by the IUCN (Klinowska, 1991; Reeves et al., 2003) and is listed in Appendix I of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). Other researchers consider it endangered and in need of protection (Barros and Teixeira, 1994). The main anthropogenic threat that affects this species is gillnet entanglement, mainly in the Amazonian Estuary (da Silva and Best, 1996a; Beltrán-Pedreros, 1998; Trujillo et al., 2000). In other areas they are killed for shark bait and their eyes and genital organs are sold as magical charms (Siciliano, 1994; Meirelles et al., 2010 this volume). The destruction of their habitat, oil and pesticide pollution (Trujillo et al., 2000; Monteiro-Neto et al., 2003; Yogui et al., 2003; Alonso et al., 2010 this volume), and construction of dams for hydroelectric projects are other factors that may also impact the long-term viability of this species (da Silva and Best, 1996b).
Here we present the first comprehensive description of the phylogeography of Sotalia, investigating the genetic relationships between the sister-species and among various populations along the Caribbean and Atlantic Coast of South America and in the Amazonian region, based on the analysis of two regions of mitochondrial DNA: the control region (CR) and the cytochrome b (Cyt-b) gene.

SAMPLE COLLECTION AND DNA EXTRACTION
A total of 76 samples of skin, liver, bone or teeth were obtained from S. fluviatilis and S. guianensis in 18 locations grouped into nine geographic regions throughout their range (Figure 1 and Table 1). DNA extraction from tissue samples followed the protocol of Sambrook et al. (1989), modified for small samples by Baker et al. (1994). DNA was extracted from bones following a silica-guanidinium thiocyanate based protocol described by Pichler et al. (2001).

PCR AMPLIFICATION AND SEQUENCING
Two mitochondrial genetic markers were analyzed: a 627 base pair (bp) portion of the mitochondrial DNA control region (CR) and a 425 bp fragment of the cytochrome b (Cyt-b) gene. Degradation of DNA or inhibition prevented clean amplification and sequencing of Cyt-b from all teeth and bone (n = 13) and 12 skin samples. These samples are represented only by partial CR sequences. PCR products were cleaned and sequenced on an automated capillary sequencer. For more information, including amplification conditions and primers, please refer to Caballero et al. (2007).

DATA ANALYSES
All sequences were manually edited and aligned using Sequencher 4.1 software (Gene Codes Corporation). For the combined mitochondrial dataset (CR + Cyt-b, 1,052 bp), haplotypes were defined using MacClade (Maddison and Maddison, 2000). For bone samples where it was not possible to obtain Cyt-b sequences, haplotypes were defined using only CR. The model of substitution for the combined mitochondrial dataset was tested in Modeltest v3.06 (Posada and Crandall, 1998), and the settings for this model were used in the phylogenetic reconstructions using Maximum Parsimony, Maximum Likelihood and Neighbor-Joining methods performed in PAUP v4.0b1 (Swofford, 2002). A Partition Homogeneity Test was run in PAUP to determine whether phylogenies reconstructed with each of the mitochondrial genes differed significantly from the phylogeny reconstructed from the combined genes. Steno bredanensis and Sousa chinensis were used as outgroups.
To investigate genealogical relationships among Sotalia guianensis and among Sotalia fluviatilis CR haplotypes, the Union of Maximum Parsimonious Trees (UMP) method (Cassens et al., 2005) was used to construct a network of CR haplotypes. This method requires two consecutive steps. First, a Maximum Parsimony analysis is performed for the CR haplotype data set and all most parsimonious trees are saved with their respective branch lengths. We used the TBR branch-swapping heuristic search option in PAUP (1000 replicates with random sequence addition) (Swofford, 2002). Second, all the saved MP trees are combined into a single reticulated graph that includes all connections from the MP trees, merging branches, sampled or missing, that are identical among different trees (see Cassens et al., 2005 for additional details on this analysis). The haplotype frequency was combined with the CR haplotype network, and the final network was drawn by hand.
Analyses of diversity and population structure were performed in the program Arlequin (Schneider et al., 2000) and restricted to the CR (450 bp) because of the larger sample size for this locus. To evaluate genetic boundaries between the sampling locations studied, we performed a spatial analysis of molecular variance (SAMOVA) (Dupanloup et al., 2002). Genetic differences among the population units detected in the SAMOVA analysis were then quantified by an analysis of molecular variance (AMOVA) as implemented in Arlequin (Excoffier et al., 1992), based on conventional F_ST and Φ_ST statistics and using 10,000 random permutations. The number of female migrants per generation (N_mf), as a measure of gene flow among localities, was estimated from the F_ST value using the equation $F_{ST} = 1/(1 + 2N_{mf})$ (Takahata and Palumbi, 1985), assuming Wright's island model. We calculated the long-term female effective population size (N_ef) for selected populations with the software Fluctuate (Kuhner et al., 1998), using the relationship θ = 2N_ef µ, with µ estimated to range from 1.70 × 10⁻⁷ to 1.96 × 10⁻⁷ bp⁻¹ generation⁻¹ (Caballero, 2006).
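The conversion from θ to N_ef can be sketched in a few lines. This is a minimal illustration rather than the Fluctuate procedure itself: the function name and the θ value are hypothetical, and only the relationship θ = 2N_ef µ and the range of µ come from the text.

```python
def female_effective_size(theta: float, mu: float) -> float:
    """Solve theta = 2 * N_ef * mu for N_ef (long-term female effective size)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return theta / (2.0 * mu)


# Mutation-rate bounds quoted in the text (per bp per generation).
MU_LOW, MU_HIGH = 1.70e-7, 1.96e-7

# Hypothetical theta, for illustration only (not a value from this study).
theta_example = 0.009

# The higher mutation rate gives the lower bound on N_ef, and vice versa.
n_ef_range = (
    female_effective_size(theta_example, MU_HIGH),
    female_effective_size(theta_example, MU_LOW),
)
print(f"N_ef between {n_ef_range[0]:,.0f} and {n_ef_range[1]:,.0f}")
```

Because N_ef is inversely proportional to µ, the uncertainty in the mutation rate translates directly into the width of the reported N_ef interval.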

PHYLOGEOGRAPHY AND POPULATION STRUCTURE
A total of 627 bp of the CR and 425 bp of the Cyt-b gene were analyzed. As the Partition Homogeneity Test found no conflicting phylogenies (p = 0.97), both fragments were combined for haplotype determination. Twenty-nine of the thirty-one haplotypes found were distinguished by substitutions in the highly diverse CR. Two additional haplotypes were distinguished by the Cyt-b gene (Figure 2). Haplotype sequences were submitted to GenBank as accession numbers EF027006 to EF027092. Phylogenetic reconstructions by Maximum Parsimony, Maximum Likelihood (using the model HKY+I+G from Modeltest, proportion of invariable sites = 0.54, gamma shape parameter (α) = 0.5) and Neighbor-Joining showed clear reciprocal monophyly for individual and combined genes between haplotypes of the two sister-species (Figure 3). Given the reciprocal monophyly observed between forms and considering their recent elevation to species level, we examined population structure within each sister-species separately.
Very few haplotypes were shared between different geographic regions within each proposed sister-species. Only two haplotypes were shared among coastal regions: haplotype D was shared among the Colombian Caribbean (CC), Maracaibo Lake (ML) and Nicaragua (NC) samples, and haplotype E was shared between the Colombian Caribbean and Maracaibo Lake regions. Only one haplotype (S) was shared between the Colombian Amazon (CA) and Brazilian Amazon (BA) geographic regions.
Figure 1. Distribution of coastal and riverine Sotalia showing geographic regions, sampling locations and sample sizes included in this study. Also indicated are the proposed genetic boundaries between Sotalia guianensis and Sotalia fluviatilis population units from the SAMOVA analysis. Four units for the coastal species (dashed line, black numbers): I = Colombian Caribbean + Maracaibo Lake, II = French Guiana, III = Amazonian Estuary and IV = Brazilian Coast. Three units for the riverine species (dotted line, grey numbers): 1 = Western Amazon, 2 = Central Amazon, 3 = Eastern Amazon.
For Sotalia guianensis, fourteen out of seventeen haplotypes were included in the UMP analysis. Three were excluded because they contained too much missing data, which can affect the performance of the algorithm used to combine all most parsimonious trees into one network or haplotype genealogy. Ten most parsimonious trees were obtained and these were combined in the haplotype genealogy presented in Figure 4a. The haplotype in central position, connected with a high number of other haplotypes, was D, found in the Colombian Caribbean (CC), Maracaibo Lake (ML) and Nicaragua (NC) geographic regions. Two unknown or missing haplotypes were determined by the UMP analysis. These could be ancestral haplotypes or haplotypes that were missed during the sampling.
For Sotalia fluviatilis, ten haplotypes were included in the UMP analysis. Three haplotypes were excluded because they contained too much missing data. Six most parsimonious trees were obtained and these were combined in the haplotype genealogy presented in Figure 4b. The haplotypes in a central position, connected with a high number of other haplotypes, were X, S and T, and haplotypes DD and EE were the most divergent. In three of the six most parsimonious trees, haplotypes U and V were connected; therefore, we included this haplotype connection in the final figure.
We performed separate SAMOVA analyses for each sister-species, considering sampling regions with n ≥ 2.
Thus, samples from Nicaragua and Ceará (Brazilian Coast) were excluded from this analysis (n = 1 for these two sampling locations). Nine sampling locations were included for the coastal species (Table 1). We applied the SAMOVA algorithm searching for two to eight potential population units. The largest mean F_CT index was found for four population units (F_CT = 0.6253), referred to here as: (i) Northern South America, combining the Colombian Caribbean and Maracaibo Lake geographic regions, (ii) French Guiana, (iii) Amazonian Estuary and (iv) Brazilian Coast (Figure 1). A non-hierarchical AMOVA analysis confirmed significant differences between the population units identified by the SAMOVA, excluding samples from the Amazonian Estuary (AE, n ≤ 2). The high degree of genetic differentiation among coastal Sotalia groups was reflected in the high F_ST and Φ_ST values obtained in the AMOVA (F_ST = 0.4, Φ_ST = 0.6, P < 0.001; pairwise values in Table 2).
Due to the presence of unique haplotypes among the Maracaibo Lake samples, we decided to further investigate possible differentiation within the Northern South American population unit. An additional AMOVA was performed between the Colombian Caribbean and Maracaibo Lake samples. Differentiation was found between these geographic regions at the haplotype level (F_ST = 0.169, P<0.004), but not at the nucleotide level (Φ_ST = 0.075, P<0.1207). For the coastal population units, N_mf was less than one female per generation (using F_ST = 0.38).
For the riverine species, four sampling locations within two geographic regions were considered in the SAMOVA analysis, excluding the Peruvian Amazon (PA, n ≤ 2). The largest mean F_CT index was found for three population units (F_CT = 0.275): (1) Western Amazon, (2) Central Amazon and (3) Eastern Amazon (Figure 1). Samples from the Central Amazon population unit were excluded from the AMOVA analysis (n ≤ 2). For the remaining two riverine Sotalia population units (Western and Eastern Amazon), no significant differences were found at the F_ST level, but differences were significant at the Φ_ST level (Table 3). For the riverine population units, N_mf was 9 females per generation (using F_ST = 0.054).
We found relatively high haplotype and nucleotide diversity in most of the coastal population units considered in this analysis (Table 2), but very low haplotype and nucleotide diversity in the Brazilian Coast population unit. The highest haplotype diversity occurred among riverine population units (Table 3). Including all samples from all geographic regions, haplotype diversity (h) was 0.85 ± 0.04 for coastal Sotalia and 0.90 ± 0.05 for riverine Sotalia. Overall nucleotide diversity (π) was 0.74% for coastal Sotalia and 0.46% for riverine Sotalia (Table 4). Long-term female effective population size (N_ef), calculated as a rough indicator of evolutionary potential and interpreted with its limitations in mind, ranged between 24,400 and 26,900 individuals for the coastal species and between 17,800 and 19,600 for the riverine species. We chose to estimate the effective population size for the Brazilian Coast population unit separately because of its low genetic diversity. The long-term female effective population size for the Brazilian Coast population unit ranged between 12,800 and 14,200 individuals.
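The gene-flow estimates reported above can be checked with a short calculation. The sketch below assumes the haploid island-model relationship F_ST = 1/(1 + 2N_mf), i.e. N_mf = (1/F_ST − 1)/2; this form is an assumption on our part, adopted because it reproduces the reported values, and the function name is hypothetical.

```python
def female_migrants_per_generation(fst: float) -> float:
    """Island-model estimate for a maternally inherited haploid marker.

    Assumes F_ST = 1 / (1 + 2 * N_mf), so N_mf = (1 / F_ST - 1) / 2.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in (0, 1]")
    return 0.5 * (1.0 / fst - 1.0)


# F_ST values quoted in the text for the two species.
for label, fst in (("coastal", 0.38), ("riverine", 0.054)):
    n_mf = female_migrants_per_generation(fst)
    print(f"{label}: F_ST = {fst} -> N_mf = {n_mf:.2f}")
# coastal: F_ST = 0.38 -> N_mf = 0.82
# riverine: F_ST = 0.054 -> N_mf = 8.76
```

Under this assumption the coastal estimate falls below one female migrant per generation and the riverine estimate rounds to nine, in line with the figures given in the text.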

POPULATION STRUCTURE
The population structure, phylogenetic and SAMOVA analyses revealed strong regional structuring among the coastal populations sampled in this study. Most of the CR haplotypes were present in only one geographic region, indicating a low level of female-mediated gene flow between these regions. Our AMOVA results suggest that the Maracaibo Lake population originated from a founder event involving individuals from the Colombian Caribbean, or that these two populations have differentiated recently, as can be deduced from the significant F_ST values but non-significant Φ_ST values. This is also suggested by the haplotype genealogy, where some divergent haplotypes are found in the Maracaibo Lake geographic region (H and J). However, the degree of genetic differentiation observed between these two geographic regions is sufficient argument to consider the Maracaibo Lake population as a separate Genetic Management Unit (GMU) (Moritz, 1994).
The low nucleotide diversity found in the Brazilian Coast population unit, accompanied by the surprisingly high long-term female effective population size estimated from our data, may reflect a historic founder event with a subsequent population expansion, perhaps at the end of the last glacial period (12,000 years ago), as suggested by Cunha et al. (2005), similar to what has been suggested for the Antillean manatee (Trichechus manatus) in the extremes of its distribution range (Vianna et al., 2006). These results (historic founder events followed by population expansions) are also consistent with the genealogy (Figure 4a), where haplotype D seems to be ancestral, considering that it is geographically widespread, is connected to a higher number of other haplotypes, and is located in a central position (Castelloe and Templeton, 1994). More divergent haplotypes (B, C, L, K, R and Q) are found in the extremes of the southern coastal distribution (Brazilian Coast, French Guiana and the Amazonian Estuary).
Table 4. Overall haplotype and nucleotide diversity, as well as the estimated θ, for coastal and riverine Sotalia.
Less regional structure was found among the riverine population units compared to the coastal population units. Although the Western Amazon and the Eastern Amazon population units share only one haplotype, shorter genetic distances separate all riverine lineages, suggesting a lesser degree of differentiation than in the coastal haplotypes. This could be due to the relatively shorter evolutionary history of riverine Sotalia when compared to the possibly longer evolutionary history of the coastal species (Caballero et al., 2007). Higher levels of female gene flow could also be expected between the Amazonian population units due to the scattered distribution of small groups of riverine Sotalia individuals along the main channels and tributaries of the Amazon River.
Interestingly, in our study, statistically significant differences were obtained at the Φ_ST level between the two Amazonian population units considered in the AMOVA analysis (Table 3). This might be due to the presence of a few very distinctive haplotypes with several nucleotide differences among these population units, especially haplotypes from samples from the extremes of the distribution. The haplotype genealogy (Figure 4b) is consistent with these findings, suggesting that haplotypes X, S and T may be ancestral, whereas haplotypes EE and DD are more divergent. This is an interesting finding, since haplotypes X, S and T were determined in samples collected along the main channel of the Amazon River and also in some tributaries located centrally along the distribution of Sotalia fluviatilis (Tefé, Puerto Nariño, Caquetá River), while haplotypes DD and EE were determined in samples from locations at the extremes of the distribution, for example the Cuyabeno River (EE) and Santarém (DD). This result may reflect patterns of connectivity among different Amazonian tributaries and channels, with increasing haplotype and population differentiation in more isolated tributaries. More sampling along other Amazon River tributaries is required in order to rule out artifacts due to our small sample size.
Overall, haplotype and nucleotide diversities for the mitochondrial DNA CR in Sotalia guianensis and Sotalia fluviatilis are similar to those reported for species with similar distributions and habitat ranges, including the Antillean and Amazonian manatees (García-Rodríguez et al., 1998; Vianna et al., 2006) and the Amazon River dolphin Inia geoffrensis (Banguera-Hinestroza et al., 2002).
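For readers less familiar with the two diversity measures compared above, the sketch below computes them on a toy alignment. It assumes the usual definitions, Nei's haplotype diversity h = n/(n − 1) × (1 − Σp_i²) and nucleotide diversity π as the mean proportion of pairwise differences per site; it is not Arlequin's implementation, and the sequences and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's h = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site over all sequence pairs."""
    n = len(sequences)
    length = len(sequences[0])
    if n < 2 or any(len(s) != length for s in sequences):
        raise ValueError("need two or more aligned sequences of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / (n_pairs * length)


# Hypothetical 10 bp alignment of four individuals (three distinct haplotypes).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
print(f"h  = {haplotype_diversity(toy):.3f}")
print(f"pi = {nucleotide_diversity(toy) * 100:.2f}%")
```

The two measures can diverge: many haplotypes that differ at only one or two sites give a high h but a low π, which is the pattern that makes comparing them between population units informative.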
IMPLICATIONS FOR SOTALIA GUIANENSIS AND SOTALIA FLUVIATILIS CONSERVATION AND MANAGEMENT
Our results suggest the existence of several distinct coastal and riverine Sotalia populations with localized distributions. As a result, at least two different conservation strategies need to be developed, one for each of the proposed sister-species.
For the coastal groups, characterized by restricted female gene flow and very localized populations, it is advisable to work at a local level in order to improve fishing practices and prevent frequent dolphin entanglement in nets. This would require greater regulation and law enforcement of both commercial and artisanal fisheries. The extent of direct take and trade needs to be determined in more of these coastal areas, as done by Beltrán-Pedreros (1998) in the Amazonian Estuary region and by other authors in localized areas along the Brazilian Coast (Barros and Teixeira, 1994; Monteiro-Neto et al., 2000; Meirelles et al., 2010 this volume). The relatively low nucleotide diversity found in the Brazilian Coast population needs to be taken into consideration in local management initiatives and requires further investigation. Greater conservation effort should also be directed at the unique Maracaibo Lake population, which is threatened by petroleum production in its environment (Lentino and Bruni, 1994). Research on its demographic status, life history and population estimates needs to be undertaken. Finer-scale analysis of genetic variation of coastal Sotalia is needed to determine male-mediated gene flow between these restricted populations.
In the case of the Amazonian populations, priority must be given to maintaining the connectivity detected between regions. Obstacles to connectivity could affect these population units. Hydroelectric and dam constructions must be evaluated according to the region where they are planned, taking into consideration the distribution of Sotalia and other aquatic mammals and reptiles in the region, as well as fish migration routes and the abundance of prey items needed to sustain these groups (Smith and Smith, 1998). Boat traffic and fishery interactions must also be determined along the Amazon and most of its channels and tributaries, as has been done by researchers in the Colombian Amazon (Trujillo et al., 2000; Diazgranados et al., 2002). Local takes will result in local extinction, but connectivity could mask a wider decline (Taylor, 1997). Regulation of these activities also needs to be implemented with the involvement of local communities.

Figure 2. 49 variable sites over 1,052 bp of the combined mitochondrial data set determining 31 Sotalia fluviatilis and Sotalia guianensis haplotypes. A star (*) denotes fixed site differences and (") designates a haplotype defined by nucleotide substitutions in the Cyt-b gene.

Figure 4. Haplotype genealogy obtained from the Union of Maximum Parsimonious Trees (UMP) analysis. The size of the circles reflects the frequency of a particular haplotype found in: a) the Colombian Caribbean (CC), Maracaibo Lake (ML), Nicaragua (NC), French Guiana (FG), Amazonian Estuary (AE) and Brazilian Coast (BC) geographic regions; and b) the Colombian Amazon (CA), Peruvian Amazon (PA) and Brazilian Amazon (BA) geographic regions. Connections between haplotypes found in all most parsimonious trees are represented by a continuous line, while connections between haplotypes found in half of all most parsimonious trees are represented by a dotted line. Crossbars represent substitutions between haplotypes.

Figure 3. Maximum Parsimony phylogenetic reconstruction of the combined mitochondrial haplotypes (1052 bp), showing bootstrap values (1000 replicates) and the frequency of occurrence in each geographic region. Abbreviations follow Figure 1 and Table 1. Letters on terminal branches represent haplotype codes. (") indicates haplotypes distinguished on the basis of the Cyt-b gene. Percentage divergence was calculated in MEGA2, using the Tamura-Nei distance option and the settings for the HKY+G+I output in Modeltest.

Table 1. Sampling locations and tissue type obtained for coastal (S. guianensis) and riverine (S. fluviatilis) Sotalia. Numbers in parentheses before each sampling location correspond to the number of this sampling location in Figure 1.

Table 2. Pairwise F_ST (below diagonal) and Φ_ST (above diagonal) values for the Control Region among four coastal Sotalia population units. Probability values based on 10,000 permutations are shown in italics. Significantly different values (P < 0.05) are shown in bold. Haplotype (h) and nucleotide (π) diversity (% ± standard deviation, SD) are shown on the diagonal for each population unit.

Table 3. Pairwise F_ST (below diagonal) and Φ_ST (above diagonal) values for the Control Region among two riverine Sotalia population units. Probability values based on 10,000 permutations are shown in italics. Significantly different values (P < 0.05) are shown in bold. Haplotype (h) and nucleotide (π) diversity (% ± standard deviation, SD) are shown on the diagonal for each population unit.