My latest paper has just been published in Proceedings of the Royal Society B! My colleagues and I describe how a partnership between a group of ciliates (a type of single-celled organism) called Kentrophoros and their bacterial symbionts had a single evolutionary origin. This is despite the fact that different species of Kentrophoros can look very different from each other and are found all over the world. The bacteria are also a lineage that is new to science, and that as far as we know is only associated with these ciliates. This means that after the first Kentrophoros and its bacterial partner got together tens or hundreds of millions of years ago, their descendants have diversified into different species and spread themselves throughout the globe, all the while remaining true to each other.
Kentrophoros sp. from the Mediterranean island of Elba. This ciliate carries a few hundred thousand bacterial symbionts (whitish mass) and is almost 2 mm long despite being a single cell.
Recently stumbled across a 2013 paper by Ryan and Irene Newton describing a tool called PhyBin for binning phylogenetic trees, i.e. clustering them by similarity into groups (“bins”). They use the Robinson-Foulds metric to measure the distance between trees.
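To make the idea concrete for myself, here is a minimal sketch of the Robinson-Foulds distance and a naive binning step. It is not PhyBin's code or algorithm: the nested-tuple tree format, the function names (`splits`, `rf_distance`, `bin_trees`) and the greedy grouping rule are all my own assumptions for illustration. The Robinson-Foulds distance is simply the number of bipartitions (splits) found in one tree but not the other.

```python
def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of a tree, treating it as unrooted."""
    all_leaves = leaves(tree)
    anchor = min(all_leaves)  # fixed reference leaf used to normalise each split
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        # Skip trivial splits (a single leaf against everything else).
        if 1 < len(below) < len(all_leaves) - 1:
            # Always store the side that does NOT contain the anchor leaf,
            # so the same split is recorded identically in every tree.
            found.add(all_leaves - below if anchor in below else below)
        return below

    walk(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits present in only one tree."""
    return len(splits(tree_a) ^ splits(tree_b))


def bin_trees(trees, max_dist=0):
    """Greedy binning (hypothetical rule): each tree joins the first bin whose
    first member is within max_dist of it, otherwise it starts a new bin."""
    bins = []
    for name, tree in trees.items():
        for members in bins:
            if rf_distance(tree, trees[members[0]]) <= max_dist:
                members.append(name)
                break
        else:
            bins.append([name])
    return bins


# Toy gene trees on four taxa; geneA and geneB differ only in rooting.
gene_trees = {
    "geneA": (("A", "B"), ("C", "D")),
    "geneB": ((("A", "B"), "C"), "D"),
    "geneC": (("A", "C"), ("B", "D")),
}
print(bin_trees(gene_trees))
```

With `max_dist=0` only trees with identical unrooted topology end up in the same bin, so a gene with an odd history (like `geneC` here) lands in a bin of its own.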
The reason for doing this is to look at the phylogenies of individual gene ortholog clusters in a set of genomes, to find those genes that have a phylogeny different from the others. This might be useful e.g. to detect genes that have undergone horizontal gene transfer. The example they used for their paper was the insect symbiont Wolbachia.
It seems like a nice way to screen a set of genomes for genes that might be interesting. I had wanted to try to do something like this, but with a concordance-factor approach instead. Some other thoughts:
- Each gene is represented by one tree – uncertainty is not taken into account, unlike with concordance factors, as implemented in BUCKy for example
- If there are horizontally transferred genes, they would probably have a patchy distribution and not be present in every species. But genes that are present in only some genomes would be excluded from the analysis up front, and the same is true of concordance analysis. In the PhyBin paper the authors mention the case of the Wolbachia prophage, which runs into precisely this limitation.
- Collapsing short branches is a good idea
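
On the last point, a rough sketch of what collapsing short branches could look like. This is an illustration under my own assumptions, not how PhyBin does it: the tree format (a leaf is `(name, length)`, an internal node is `([children], length)`), the function name `collapse_short` and the threshold are all hypothetical. An internal branch shorter than the threshold is removed and its children are attached directly to the parent, which turns a weakly resolved split into a polytomy so that it no longer counts as a difference between trees.

```python
def collapse_short(node, min_len):
    """Return a copy of the tree with internal branches shorter than min_len collapsed."""
    label, length = node
    if isinstance(label, str):
        return node  # leaves are never collapsed
    new_children = []
    for child in label:
        child = collapse_short(child, min_len)
        c_label, c_len = child
        if not isinstance(c_label, str) and c_len < min_len:
            # Splice the grandchildren into this node, adding the removed
            # branch length so root-to-leaf path lengths are preserved.
            new_children.extend((g_label, g_len + c_len) for g_label, g_len in c_label)
        else:
            new_children.append(child)
    return (new_children, length)


# The (A, B) grouping sits on a very short internal branch.
tree = ([([("A", 0.1), ("B", 0.1)], 0.001), ("C", 0.2), ("D", 0.3)], 0.0)
collapsed = collapse_short(tree, min_len=0.01)
print([name for name, _ in collapsed[0]])
```

After collapsing, A, B, C and D all hang off the same node, so the poorly supported (A, B) split disappears.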