185

Revisiting pangenome openness with k-mers
Luca Parmigiani, Roland Wittler, Jens Stoye
2024
<p style="text-align: justify;">Pangenomics is the study of related genomes collectively, usually from the same species or closely related taxa. Originally, pangenomes were defined for bacterial species. After the concept was extended to eukaryotic genomes, two definitions of the pangenome evolved in parallel: the gene-based approach, which defines the pangenome as the union of all genes, and the sequence-based approach, which defines the pangenome as the set of all nonredundant genomic sequences. Estimating the total size of the pangenome for a given species has been a subject of study since the very first mention of pangenomes. Traditionally, this is done by predicting the rate at which new genes are discovered, referred to as the openness of the species. Here, we abstract each genome as a set of items, an abstraction that is entirely agnostic of the two approaches (gene-based, sequence-based). Genes are a viable option for items, but other possibilities are also feasible, e.g., genome sequence substrings of fixed length k (k-mers). In the present study, we investigate the use of k-mers to estimate the openness as an alternative to genes, and compare the results. An efficient implementation is also provided.</p>
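The set-of-items abstraction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The function names (`kmers`, `average_growth`, `fit_alpha`) are hypothetical. Averaging over all genome orderings and fitting a power law to the number of newly discovered items are assumptions made here for illustration, and the exhaustive enumeration of orderings is only practical for a handful of genomes.

```python
from itertools import permutations
from math import log


def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def average_growth(item_sets):
    """Average size of the union after adding 1..n genomes.

    Averages over every ordering of the genomes (an assumption of this
    sketch; the cost is factorial in the number of genomes).
    """
    totals = [0] * len(item_sets)
    count = 0
    for order in permutations(item_sets):
        seen = set()
        for m, items in enumerate(order):
            seen |= items
            totals[m] += len(seen)
        count += 1
    return [t / count for t in totals]


def fit_alpha(growth):
    """Fit new_items(m) ~ K * m**(-alpha) by least squares in log-log space.

    growth[m - 1] is the pangenome size after m genomes; the number of new
    items at step m is growth[m - 1] - growth[m - 2], for m >= 2.
    """
    xs, ys = [], []
    for m in range(2, len(growth) + 1):
        new = growth[m - 1] - growth[m - 2]
        if new > 0:  # log is undefined for steps that add nothing
            xs.append(log(m))
            ys.append(log(new))
    if len(xs) < 2:
        raise ValueError("need at least two steps with new items")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return -slope


# Toy genomes, invented for illustration only.
genomes = ["ACGTACGTGA", "ACGTTCGTGA", "ACGAACGTGC", "TCGTACGTGA"]
growth = average_growth([kmers(g, 3) for g in genomes])
alpha = fit_alpha(growth)
```

Because the functions only operate on sets, the same code works unchanged if each genome is represented by a set of gene identifiers instead of k-mers, which is the point of the item-based abstraction.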
https://doi.org/10.5281/zenodo.8256094
https://doi.org/10.5281/zenodo.8256094
https://gitlab.ub.uni-bielefeld.de/gi/pangrowth
Bioinformatics Pangenomics
None
Combinatorics, Genomics and Transcriptomics
2022-11-22 14:48:18
Leo van Iersel
Guillaume Marçais, Yadong Zhang