Chromosome-level genome assembly of Odontothrips loti Haliday (Thysanoptera: Thripidae).

Yingning L ¹,

Shuhua W ²,

Wenting D ¹,

Miao M ¹,

Ying W ²,

Rong Z ²,

Liping B ¹

Affiliations

1. College of Grassland Science and Technology, China Agricultural University, Beijing, 100193, China.
2. Institute of Plant Protection, Ningxia Academy of Agriculture and Forestry Sciences, Yinchuan, 750002, China.

Scientific Data, 04 May 2024, 11(1):451
https://doi.org/10.1038/s41597-024-03289-x PMID: 38704405 PMCID: PMC11069530

Abstract

As the predominant pest of alfalfa, Odontothrips loti Haliday causes severe damage across the major alfalfa-growing regions of China. Its high mobility and fecundity allow populations to build up rapidly in the field and make the pest hard to control. Genomic resources for O. loti are scarce, which limits the development of novel pest management strategies. In this study, we constructed a chromosome-level reference genome assembly of O. loti with a genome size of 346.59 Mb and a scaffold N50 of 18.52 Mb, anchored onto 16 chromosomes and containing 20,128 genes, of which 93.59% were functionally annotated. The quality of the genome is supported by 99.20% complete insecta_odb10 genes in the BUSCO analysis, 91.11% of short reads mapping to the reference genome, and a gene-length distribution consistent with those of other thrips. This is the first reported genome for the genus Odontothrips, and it offers a genomic resource for further investigation of the evolution and molecular biology of O. loti, contributing to pest management.

Luo Yingning,¹ Wei Shuhua,² Dai Wenting,¹ Miao Miao,¹ Wang Ying,² Zhang Rong,² and Ban Liping¹

Associated Data

Data Citations

Luo Y. 2024. Chromosome-level reference genome assembly of O. loti. figshare. 10.6084/m9.figshare.24865023.v2
Luo Y, Ban L. 2024. Chromosome-level genome assembly of Odontothrips loti Haliday (Thysanoptera: Thripidae) GenBank. JAZGLN000000000
2024. NGDC Genome Sequence Archive (GSA) https://ngdc.cncb.ac.cn/gsa/browse/CRA014018/CRR997575
2024. NGDC Genome Sequence Archive (GSA) https://ngdc.cncb.ac.cn/gsa/browse/CRA014018/CRR997573
2024. NGDC Genome Sequence Archive (GSA) https://ngdc.cncb.ac.cn/gsa/browse/CRA014018/CRR997574
2024. NGDC Genome Sequence Archive (GSA) https://ngdc.cncb.ac.cn/gsa/browse/CRA014018/CRR997576

Subject terms: Genome assembly algorithms, Agricultural genetics

Background & Summary

Odontothrips loti Haliday (Thysanoptera: Thripidae) is a destructive, oligophagous pest that feeds mainly on leguminous crops, particularly alfalfa (Medicago sativa L.)^1,2. In North China, the major alfalfa-growing region, O. loti is the predominant alfalfa pest and damages 70–100% of plants on average^3,4. Thrips attack host plants throughout the plants' life cycle, causing them to wilt or stop growing and their leaves to dry out (Fig. 1); this not only severely reduces yield and forage quality but also exacerbates the spread of plant viruses^5–7. Several features of thrips, such as small body size, cryptic behavior, and high fecundity, make them difficult to control.

Fig. 1

Odontothrips loti (a), alfalfa with O. loti damage (b) and without O. loti damage (c).

With low-cost next-generation sequencing (NGS) technology, researchers can construct genome maps and use them to identify, at the whole-genome level, functional genes related to virus transmission or pesticide resistance, to understand how pesticide resistance and virus transmission mechanisms evolve, and to control pests through gene regulation, which makes new pest management strategies possible^8–15. Because the genetic information of O. loti remains largely unknown, we aimed to characterize it to support the development of novel O. loti control strategies.

In this study, we present a high-quality chromosome-level genome of O. loti, obtained by combining ONT long-read sequencing, Illumina short-read sequencing and chromosome conformation capture (Hi-C) technologies. Comparative genomic analysis was also performed on O. loti and fourteen other insect species to explore their phylogenetic relationships and genomic features. We provide the first genome assembly for a thrips in the genus Odontothrips, to facilitate a better understanding of thrips genome evolution and the development of novel control strategies for this important alfalfa pest.

Methods

Sample preparation

Odontothrips loti individuals were initially collected from an alfalfa field at the Shangzhuang Experimental Station of China Agricultural University (40°8′15″N, 116°11′18″E). A colony was established and maintained in the laboratory for approximately 10 generations on ‘Zhongmu No.1’ alfalfa at 25 ± 1 °C, 65 ± 5% relative humidity, and a 16 h:8 h light:dark cycle. The developmental stages of the thrips were examined under a light microscope. Individuals were collected, flash frozen in liquid nitrogen, and stored at −80 °C until use. Detailed sampling information for O. loti is shown in Table 1.

Table 1

Sample information of Odontothrips loti in this study.

Sample	Developmental stage	Sex	Number of thrips
DNA for survey	Adult	Female	1
DNA for assembly	Adult	Female and male	800
DNA for Hi-C	Adult	Female and male	800
RNA for annotation	Nymph and adult	Female and male	240

Genomic DNA sequencing

For Illumina short-read sequencing, genomic DNA was isolated from a single adult female following Chen’s protocol¹⁶; briefly, the sample was digested with sodium dodecyl sulfate (SDS) and proteinase K, followed by phenol-chloroform extraction. The library (150 bp inserts) was constructed with the Nextera DNA Flex Library Prep Kit (Illumina, San Diego, CA, USA) and sequenced on the Illumina NovaSeq 6000 (Illumina, San Diego, CA, USA), generating 43.66 Gb of raw data as 150 bp paired-end reads. Adapters and low-quality reads were removed with Fastp (v0.21.0)¹⁷ using default parameters, leaving a total of 42.05 Gb (~123× coverage) of clean data (Table 2). The short-read data were used for the genome survey and for assembly polishing.

Table 2

Library sequencing data and methods used in this study to assemble the Odontothrips loti genome.

Sequencing strategy	Platform	Usage	Insert size	Clean data (Gb)	Coverage (×)
Short-reads	Illumina	Survey, assembly	150 bp	42.05	123
Long-reads	Oxford Nanopore	Assembly	10–20 kb	39.63	116
Hi-C	Illumina	Hi-C assembly	150 bp	31.78	93
RNA-seq	Oxford Nanopore	Annotation	1–15 kb	10.24	30
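The coverage values in Table 2 are consistent with dividing each clean-data yield by the k-mer-estimated genome size (341,303,860 bp, Table 3). The text does not state which denominator was used, so that choice is an assumption here. A minimal Python sketch of the arithmetic (the names `CLEAN_DATA_GB` and `coverage` are illustrative only):

```python
# Assumption: coverage = clean bases / estimated genome size (Table 3).
# The paper does not state the denominator explicitly.
ESTIMATED_GENOME_SIZE_BP = 341_303_860

# Clean data yields in Gb, taken from Table 2.
CLEAN_DATA_GB = {
    "Short-reads": 42.05,
    "Long-reads": 39.63,
    "Hi-C": 31.78,
    "RNA-seq": 10.24,
}


def coverage(clean_gb, genome_size_bp=ESTIMATED_GENOME_SIZE_BP):
    """Return fold coverage for a yield given in gigabases."""
    return clean_gb * 1e9 / genome_size_bp


# Round to whole-number fold coverage, as Table 2 reports it.
COVERAGE = {name: round(coverage(gb)) for name, gb in CLEAN_DATA_GB.items()}

for name, fold in COVERAGE.items():
    print(f"{name}: ~{fold}x")
```

Under this assumption the rounded values match the coverage column of Table 2.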

For long-read genomic DNA sequencing, we used approximately 800 mixed-sex adult thrips. Genomic DNA was extracted using the SDS method¹⁶, and DNA fragment size and degree of degradation were checked on a 0.7% agarose gel. The purity and concentration of the extracted DNA were determined with a NanoDrop One (Thermo Fisher Scientific). The library was constructed with the SQK-LSK109 kit (Oxford Nanopore Technologies, Oxford, UK) according to the manufacturer’s instructions and sequenced on the Oxford Nanopore PromethION platform (Oxford Nanopore Technologies, Oxford, UK). We obtained 41.19 Gb (~120× coverage) of raw long-read data with a mean read length of 6,182.26 bp (N50 = 16,150 bp). We then used Oxford Nanopore GUPPY (v0.3.0, https://timkahlke.github.io/LongRead_tutorials/BS_G.html) to filter out reads with a quality score < 7, obtaining 39.63 Gb (~116× coverage) of clean reads. The clean long-read data were used for contig-level genome assembly (Table 2).

Hi-C library preparation and sequencing

The Hi-C sequencing library was prepared from 800 mixed-sex adult thrips. Samples were cross-linked in a 2% formaldehyde isolation buffer, and the nuclei were then digested with DpnII (New England Biolabs, Beijing, CN). The fragment ends were repaired with biotinylated nucleotides, and the ligated DNA was sheared into fragments of 300–700 bp. The resulting Hi-C library was sequenced on an Illumina NovaSeq 6000 to produce 150 bp paired-end reads. After applying the same filtering criteria as for the short reads, a total of 31.78 Gb (~93× coverage) of clean data was obtained to assist the chromosome-level assembly (Table 2).

ONT-Transcriptome sequencing

For ONT-transcriptome sequencing, approximately 240 thrips, including nymphs and adults, were pooled for RNA extraction with the RNA Easy Fast Tissue/Cell Kit (Tiangen). A NanoDrop (Thermo Fisher Scientific) and a Qubit 3.0 Fluorometer (Life Technologies, Carlsbad, CA, USA) were used to evaluate the quality of the extracted RNA. The SQK-PCS109 and SQK-PBK004 kits (Oxford Nanopore Technologies) were used for reverse transcription and cDNA library construction, and sequencing was performed on the PromethION sequencer (Oxford Nanopore Technologies, Oxford, UK). A total of 10.24 Gb of clean reads with a mean length of 1,034.61 bp (N50 = 1,238 bp) was generated and used to assist genome annotation (Table 2).

Estimation of genomic characteristics

Genomic characteristics were estimated from the 42.05 Gb of short-read data using k-mer-based statistical analysis in Jellyfish (v2.3.0)¹⁸ and GenomeScope2¹⁹ (p=2, k=19). Based on the 19-mer depth analysis, the genome size and heterozygosity were estimated to be 341.3 Mb and 1.49%, respectively; the genome is therefore considered highly heterozygous (Fig. 2).

Fig. 2

Characteristics of the Illumina short-read sequencing of the Odontothrips loti genome.
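The idea behind a k-mer-based size estimate can be illustrated with a much simpler calculation than the model GenomeScope2 fits: divide the total number of k-mer occurrences by the depth of the main peak in the k-mer histogram. The Python sketch below is only that simplified illustration, not the method used in this study. The function name, the `min_depth` cutoff and the toy histogram are hypothetical, and the sketch assumes a single dominant homozygous peak, which a highly heterozygous genome such as this one may not show.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Naive genome size estimate from a k-mer depth histogram.

    histogram maps k-mer depth -> number of distinct k-mers seen at that depth.
    min_depth is a hypothetical cutoff for discarding low-depth k-mers, which
    mostly arise from sequencing errors.
    """
    kept = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not kept:
        raise ValueError("no k-mers at or above min_depth")
    # Depth with the most distinct k-mers: taken as the single-copy peak.
    peak_depth = max(kept, key=kept.get)
    # Total k-mer occurrences = sum over depths of depth * distinct k-mers.
    total_kmers = sum(depth * n for depth, n in kept.items())
    return total_kmers / peak_depth


# Toy histogram (not from the paper): an error spike at low depth
# and a main peak at depth 30.
toy_histogram = {1: 1000, 2: 200, 29: 10, 30: 100, 31: 10}
print(estimate_genome_size(toy_histogram))  # 3600 / 30 = 120.0
```

Model-based tools additionally separate heterozygous and homozygous peaks and repeats, which is why they are preferred for real data.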

Genome assembly

Contig level assembly

We first used NextDenovo (v2.5.0)²⁰ to generate a draft assembly and polished it with the ONT long reads in two rounds of Racon (v1.4.11, https://github.com/lbcb-sci/racon). Illumina reads were then mapped to the assembly using BWA (v0.7.17), and another two rounds of contig polishing were performed with Pilon (v1.23)²¹. Because the genome is highly heterozygous, Purge_haplotigs (v1.0.4, https://github.com/skingan/purge_haplotigs_multiBAM) was applied to remove haplotigs from the draft genome. The final contig-level genome was 346.58 Mb long, similar to the estimated size, with a contig N50 of 8.59 Mb (Table 3).

Table 3

Major indicators of the Odontothrips loti genome.

Features	Values
Estimated genome size (bp)	341,303,860
Contig-level assembly size (bp)	346,577,358
Chromosome-level assembly size (bp)	346,592,158
Anchored to chromosome (bp)	301,277,358
Contig N50(bp)	8,588,564
Scaffold N50(bp)	18,519,078
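For readers unfamiliar with the N50 statistic in Table 3, it is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The Python sketch below illustrates that definition on toy lengths; it is not the code used to produce Table 3, and the function name is illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of contig, scaffold or read lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example: total = 20, half = 10; 6 + 5 = 11 >= 10, so N50 = 5.
print(n50([2, 3, 4, 5, 6]))
```

The same definition applies to the read-length N50 values quoted for the ONT data.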

Hi-C scaffolding

Adaptors and low-quality raw reads (quality score < 20, length shorter than 30 bp) were removed using Fastp (v0.21.0)¹⁷. The clean reads were then mapped to the contig assembly using HiCUP (v0.8.0)²² to filter out unmapped reads, invalid pairs, dangling ends and duplicates resulting from PCR amplification. The valid read pairs were used to cluster, order and orient the contigs with ALLHIC (v0.9.8)²³. Interactions between contig pairs were converted into binary files with 3D-DNA²⁴ and Juicer (v1.6)²⁵. HiCExplorer (v3.6)²⁶ was used to generate heat maps of contig interaction intensity and location, and Juicebox (v1.11.08)²⁷ was then used to review the assembly manually. In summary, the resulting chromosome-level genome was 346.59 Mb long with a scaffold N50 of 18.52 Mb (Table 3); about 86.93% (301.28 Mb) of the genome bases were anchored onto 16 chromosomes (Fig. 3a), and most intra-genomic syntenic blocks were located in regions of low GC content (Fig. 3b).

Fig. 3

Heatmap of genome-wide Hi-C data and circular representation of the chromosomes of Odontothrips loti. (a) The heatmap of chromosome interactions in O. loti. The frequency of Hi-C interaction links is represented by colors, which ranges from yellow (low) to red (high). (b) Circos plot of distribution of the genomic elements in O. loti. The tracks indicate (i) length of the chromosome, (ii) gene density, (iii) distribution of transposable element (TE) density, and (iv) GC density. Center: intra-genomic syntenic blocks of O. loti. The densities of genes, TEs, and GC were calculated in 500kb windows.
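The anchoring rate quoted above follows directly from two values in Table 3. A short Python check of that arithmetic is sketched below; the variable names are illustrative, and the unanchored length is derived here rather than reported in the text.

```python
# Values from Table 3.
CHROMOSOME_LEVEL_ASSEMBLY_BP = 346_592_158
ANCHORED_BP = 301_277_358


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


ANCHORED_PCT = round(percent(ANCHORED_BP, CHROMOSOME_LEVEL_ASSEMBLY_BP), 2)
# Derived value (not stated in the text): bases left outside the 16 chromosomes.
UNANCHORED_BP = CHROMOSOME_LEVEL_ASSEMBLY_BP - ANCHORED_BP

print(f"anchored: {ANCHORED_PCT}% ({ANCHORED_BP / 1e6:.2f} Mb)")
print(f"unanchored: {UNANCHORED_BP:,} bp")
```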

Predicting repeats

We used RepeatModeler (v.1.0.11, https://github.com/Dfam-consortium/RepeatModeler) to predict repeat sequences de novo. LTR_FINDER (vOfficial, -size 1000000 -time 300)²⁸ and LTR_retriever (v2.9.0)²⁹ were used to identify LTR sequences and remove redundancy among them. The two de novo libraries were combined with RepBase³⁰ for further prediction with RepeatMasker (v4.0.9, -nolow -no_is -norna)³¹. RepeatProteinMask (-noLowSimple -pvalue 0.0001) was used for homology-based prediction. All results were merged and made non-redundant to give the final repeat annotation. In summary, 115.26 Mb of repeat sequences were identified, accounting for 33.26% of the O. loti genome (Table 4). DNA transposons were the most abundant class (18.85% of the genome), followed by long terminal repeats (LTRs, 10.13%), long interspersed nuclear elements (LINEs, 3.45%) and short interspersed nuclear elements (SINEs, only 0.40%) (Table 4).

Table 4

Statistics of the repeat sequences annotation in Odontothrips loti genome.

Type	Length (bp)	Percentage in genome (%)
DNA	65,317,630	18.85
LTR	35,092,753	10.13
LINE	11,957,062	3.45
SINE	1,382,412	0.40
Unknown	14,723,706	4.25
Total	115,261,572	33.26
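The percentages in Table 4 can be reproduced by dividing each length by the chromosome-level assembly size from Table 3 (346,592,158 bp). That denominator is an assumption, since the text does not name it; the contig-level size gives the same values to two decimals. A Python sketch of the check, with illustrative names:

```python
# Assumption: percentages are relative to the chromosome-level assembly size.
ASSEMBLY_BP = 346_592_158

# Repeat lengths in bp, from Table 4.
REPEAT_BP = {
    "DNA": 65_317_630,
    "LTR": 35_092_753,
    "LINE": 11_957_062,
    "SINE": 1_382_412,
    "Unknown": 14_723_706,
    "Total": 115_261_572,
}

REPEAT_PCT = {
    name: round(100 * bp / ASSEMBLY_BP, 2) for name, bp in REPEAT_BP.items()
}

# Sum of the per-class rows, for comparison with the reported total.
CLASS_SUM_BP = sum(bp for name, bp in REPEAT_BP.items() if name != "Total")

for name, pct in REPEAT_PCT.items():
    print(f"{name}\t{REPEAT_BP[name]:,}\t{pct:.2f}")
print(f"sum of classes: {CLASS_SUM_BP:,} bp")
```

The per-class lengths sum to more than the reported total. That would be expected if annotations from different classes overlap and the total counts each base once, but this is an inference and is not stated in the text.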

Protein-coding genes and functional predictions

We annotated protein-coding genes with a pipeline that combined three strategies: transcriptome-based prediction, homology-based prediction, and ab initio prediction. For transcriptome-based prediction, the ONT-transcriptome reads were processed with NanoFilt (v2.8.0, -q 7 -l 100 -headcrop 30 -minGC 0.3)³², Pychopper (v2.7.2, https://github.com/epi2me-labs/pychopper), racon (v1.4.11, https://github.com/lbcb-sci/racon), minimap2 (v2.17-r941)³³, stringtie (v2.1.4)³⁴ and TransDecoder (v5.1.0, https://github.com/TransDecoder/TransDecoder) to predict protein-coding genes. For homology-based prediction, tblastn (v2.7.1)³⁵ with an E-value cutoff of 1e-5 and Exonerate (v2.4)³⁶ were used to predict gene structures by comparison with three closely related species (Megalurothrips usitatus, Thrips palmi, Frankliniella occidentalis) and the model species Drosophila melanogaster. Before ab initio prediction, repetitive elements across the whole genome were soft-masked. Augustus (v3.3.2)³⁷, GenScan (v1.0)³⁸ and GlimmerHMM (v3.0.4)³⁹ were used for de novo prediction. Finally, MAKER (v2.31.10)⁴⁰ integrated the results of the three strategies with default weighting, producing a non-redundant gene set. Overall, 20,128 protein-coding genes were obtained (Table 5).

Table 5

Statistics for the Odontothrips loti functionally annotated protein-coding genes.

Database	Number	Percentage (%)
Protein-coding genes	20,128	100.00
Annotated genes	18,837	93.59
Interproscan	17,895	88.91
NR	16,363	81.29
Uniprot	16,241	80.69
Pfam	13,932	69.22
GO	12,229	60.76
KEGG	8,527	42.36
Pathway	4,801	23.85
Unannotated genes	1,291	6.41

For functional annotation, protein sequences were aligned to the Non-Redundant protein (NR), Universal Protein (UniProt), Protein Families Analysis and Modeling (Pfam), Clusters of Orthologous Groups of proteins (COG), Kyoto Encyclopedia of Genes and Genomes (KEGG) and evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) databases. Gene Ontology (GO) terms were obtained from UniProt. InterProScan (v5.52-86.0)⁴¹ was used to search for conserved sequences, motifs and domains. In total, 12,229 (60.76%) and 8,527 (42.36%) genes were annotated with GO terms and KEGG pathways, respectively, and 18,837 genes (93.59%) were annotated in at least one public database (Table 5).
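The "annotated in at least one public database" count is a set union over the per-database hits, which is why it exceeds every single-database count in Table 5. The Python sketch below illustrates this on toy data; the gene identifiers and the function name are hypothetical and do not come from the deposited annotation.

```python
def annotation_summary(all_genes, hits_by_database):
    """Count genes with a hit in at least one database.

    all_genes: iterable of gene IDs.
    hits_by_database: dict mapping database name -> set of annotated gene IDs.
    """
    genes = set(all_genes)
    # Union of all per-database hits, restricted to known genes.
    annotated = set().union(*hits_by_database.values()) & genes
    unannotated = genes - annotated
    pct = round(100 * len(annotated) / len(genes), 2) if genes else 0.0
    return {
        "annotated": len(annotated),
        "unannotated": len(unannotated),
        "annotated_pct": pct,
    }


# Toy example with hypothetical gene IDs.
genes = ["g1", "g2", "g3", "g4", "g5"]
hits = {"NR": {"g1", "g2", "g3"}, "Pfam": {"g2", "g4"}, "GO": {"g1"}}
print(annotation_summary(genes, hits))
```

Applied to the counts in Table 5, the same bookkeeping gives 20,128 − 18,837 = 1,291 unannotated genes.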

Data Records

The assembled genome sequence and annotation data were deposited in Figshare⁴² and GenBank⁴³. Raw data from Nanopore (CRR997575)⁴⁴, Illumina (CRR997573)⁴⁵ and Hi-C (CRR997574)⁴⁶ genome sequencing and from RNA-seq (CRR997576)⁴⁷ were deposited in the Genome Sequence Archive (GSA, https://ngdc.cncb.ac.cn/gsa)⁴⁸ under BioProject PRJCA022165.

Technical Validation

Genome quality assessment

We assessed the quality of the chromosome-level genome in three respects: continuity, consistency, and completeness. First, the scaffold N50 of the O. loti genome is 18.52 Mb (Table 3), indicating high continuity. Second, we evaluated consistency by mapping the Illumina reads to the assembly with BWA (v0.7.17)⁴⁹ and calculating the mapping rate and coverage: 91.11% of the short reads aligned to the reference genome and covered 94.68% of it. Third, we used BUSCO (v4.1.4)⁵⁰ to estimate completeness by searching for the 1,367 BUSCO genes in insecta_odb10 (https://busco-data.ezlab.org/v5/data/lineages/). The results showed high completeness, with 99.2%, 99.2%, 95.6% and 94.4% complete genes found in the contig-level genome, chromosome-level genome, annotated gene sets and protein-coding gene sets, respectively (Fig. 4).

Fig. 4

Benchmarking of the completeness of the Odontothrips loti genome assembly and annotation, evaluated by BUSCO against the insecta_odb10 database, which includes 1,367 genes. C: the number of complete genes, S: the number of complete and single-copy genes, D: the number of complete and duplicated genes, F: the number of fragmented (incomplete) genes, M: the number of missing genes.
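The BUSCO categories in Fig. 4 are related by C = S + D and C + F + M = n, where n is the number of genes searched. The Python sketch below shows how counts translate into the usual one-line percentage summary. The counts are toy values chosen for illustration and are not the values behind Fig. 4; the function name is hypothetical.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Format BUSCO-style percentages from category counts."""
    complete = single + duplicated            # C = S + D
    total = complete + fragmented + missing   # n = C + F + M
    if total == 0:
        raise ValueError("total number of BUSCO genes must be positive")

    def pct(count):
        return f"{100 * count / total:.1f}%"

    return (
        f"C:{pct(complete)}[S:{pct(single)},D:{pct(duplicated)}],"
        f"F:{pct(fragmented)},M:{pct(missing)},n:{total}"
    )


# Toy counts (not from the paper), with n = 200.
print(busco_summary(single=190, duplicated=6, fragmented=3, missing=1))
```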

Evaluation of gene prediction

To verify the accuracy and reliability of the gene prediction, we compared the distributions of gene length, CDS length, exon length and intron length in O. loti with those in D. melanogaster⁵¹ and four related thrips species (M. usitatus⁸, T. palmi¹², F. occidentalis¹⁴, S. biformis¹³). The consistent trends among the thrips support the quality of the annotated gene set of O. loti (Fig. 5).

Fig. 5

Comparison of the distributions of (a) gene length, (b) CDS length, (c) exon length and (d) intron length of annotated genes in Odontothrips loti, Drosophila melanogaster and four closely related species. The x-axis represents the length, and the y-axis represents the density of genes.
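A gene-length distribution such as the one in Fig. 5a can be derived from an annotation file by taking end − start + 1 for every gene feature, since GFF3 coordinates are 1-based and inclusive. The Python sketch below assumes the annotation is available in GFF3 format and uses a toy in-memory example; the function name and records are hypothetical, and this is not the code used to draw Fig. 5.

```python
import statistics


def gene_lengths_from_gff3(lines):
    """Return the lengths of all 'gene' features in GFF3-formatted lines."""
    lengths = []
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # A GFF3 feature line has nine tab-separated columns; column 3 is the type.
        if len(cols) < 9 or cols[2] != "gene":
            continue
        start, end = int(cols[3]), int(cols[4])
        lengths.append(end - start + 1)  # coordinates are 1-based, inclusive
    return lengths


# Toy GFF3 records (hypothetical, not from the O. loti annotation).
toy_gff3 = [
    "##gff-version 3",
    "chr1\tsketch\tgene\t1\t1000\t.\t+\t.\tID=g1",
    "chr1\tsketch\tmRNA\t1\t1000\t.\t+\t.\tID=g1.t1;Parent=g1",
    "chr1\tsketch\tgene\t2000\t2500\t.\t-\t.\tID=g2",
]
lengths = gene_lengths_from_gff3(toy_gff3)
print(lengths, statistics.median(lengths))
```

The same loop, filtered on "CDS", "exon" or a derived intron feature, would give the other three distributions; the resulting lists can then be passed to any density-plotting routine.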

Acknowledgements

This work was supported by the National Natural Science Foundation of China (no. 31971759 to B.L.), the Beijing Innovation Consortium of Modern Agricultural Industry Technology System (no. BAIC02-2024 to B.L.) and the Ningxia Province Sci-Tech Innovation Demonstration Program of High-Quality Agricultural Development and Ecological Conservation (no. NGSB-2021-15-04 to W.S.). We are grateful to Chaoyang Zhao (National Soil Dynamics Laboratory, USDA-ARS, Auburn, AL, USA) for guidance on improving the language of the manuscript. The bioinformatics analysis was supported by the High-performance Computing Platform of China Agricultural University.

Author contributions

B.L. conceived of this project. L.Y. and D.W. participated in the data analysis. L.Y., D.W., M.M., W.S., W.Y. and Z.R. collected the samples. L.Y. wrote the manuscript. L.Y. and B.L. revised the manuscript. All authors have read, revised, and approved the final manuscript for submission.

Code availability

All software and pipelines were executed according to the manuals and protocols of the published bioinformatic tools. The versions and parameters of the software are described in the Methods section. No custom code was used.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Liu Y, Luo Y, Du L, Ban L. Antennal Transcriptome Analysis of Olfactory Genes and Characterization of Odorant Binding Proteins in Odontothrips loti (Thysanoptera: Thripidae). Int J Mol Sci. 2023;24:5284. 10.3390/ijms24065284.

2. Liu Y, Li J, Ban L. Morphology and Distribution of Antennal Sensilla in Three Species of Thripidae (Thysanoptera) Infesting Alfalfa Medicago sativa. Insects. 2021;12:81. 10.3390/insects12010081.

3. Buhe T, Wang X. Breeding research of the variety of anti-thrips alfalfa. Multifunctional Grasslands in a Changing World, Volume II, XXI International Grassland Congress and VIII International Rangeland Congress, Hohhot, China. 2008;29 E 5 Y:5–5.

4. Li N, Song X, Wang X. The complete mitochondrial genome of Odontothrips loti (Haliday, 1852) (Thysanoptera: Thripidae). Mitochondrial DNA B Resour. 2019;5:7–8. 10.1080/23802359.2019.1693296.

5. Wu S, et al. A decade of a thrips invasion in China: lessons learned. Ecotoxicology. 2018;27:1032–1038. 10.1007/s10646-017-1864-6.

6. Li J, et al. Occurrence, Distribution, and Transmission of Alfalfa Viruses in China. Viruses. 2022;14:1519. 10.3390/v14071519.

7. Li J, et al. RNA-seq reveals plant virus composition and diversity in alfalfa, thrips, and aphids in Beijing, China. Arch Virol. 2021;166:1711–1722. 10.1007/s00705-021-05067-1.

8. Ma L, et al. Chromosome-level genome assembly of bean flower thrips Megalurothrips usitatus (Thysanoptera: Thripidae). Sci Data. 2023;10:252. 10.1038/s41597-023-02164-5.

9. Bao W, Kataoka Y, Fukada K, Sonoda S. Imidacloprid resistance of melon thrips, Thrips palmi, is conferred by CYP450-mediated detoxification. J. Pestic. Sci. 2015;40:65–68. 10.1584/jpestics.D15-004.

10. Shi, P. et al. Variable resistance to spinetoram in populations of Thrips palmi across a small area unconnected to genetic similarity. Evolutionary Applications 13, (2020).

11. Xue, B. & Sonoda, S. Resistance to cypermethrin in melon thrips, Thrips palmi (Thysanoptera: Thripidae), is conferred by reduced sensitivity of the sodium channel and CYP450-mediated detoxification. Applied Entomology and Zoology 47, (2012).

12. Guo S, et al. Chromosome-level assembly of the melon thrips genome yields insights into evolution of a sap-sucking lifestyle and pesticide resistance. Molecular Ecology Resources. 2020;20:1110–1125. 10.1111/1755-0998.13189.

13. Hu Q, Ye Z, Zhuo J, Li J-M, Zhang C. A chromosome-level genome assembly of Stenchaetothrips biformis and comparative genomic analysis highlights distinct host adaptations among thrips. Commun Biol. 2023;6:1–10. 10.1038/s42003-023-05187-1.

14. Rotenberg, D. et al. Genome-enabled insights into the biology of thrips as crop pests. BMC Biology 18, (2020).

15. Zhang, Z. et al. The Chromosome-Level Genome Assembly of Bean Blossom Thrips (Megalurothrips usitatus) Reveals an Expansion of Protein Digestion-Related Genes in Adaption to High-Protein Host Plants. Int J Mol Sci 24, (2023).

16. Chen H, Rangasamy M, Tan SY, Wang H, Siegfried BD. Evaluation of Five Methods for Total DNA Extraction from Western Corn Rootworm Beetles. PLoS ONE. 2010;5:e11963. 10.1371/journal.pone.0011963.

17. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–i890. 10.1093/bioinformatics/bty560.

18. Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27:764–770. 10.1093/bioinformatics/btr011.

19. Ranallo-Benavidez TR, Jaron KS, Schatz MC. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 2020;11:1432. 10.1038/s41467-020-14998-3.

20. Hu, J. et al. An efficient error correction and accurate assembly tool for noisy long reads. Preprint at 10.1101/2023.03.09.531669 (2023).

21. Walker BJ, et al. Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement. PLOS ONE. 2014;9:e112963. 10.1371/journal.pone.0112963.

22. Wingett S, et al. HiCUP: pipeline for mapping and processing Hi-C data. F1000Res. 2015;4:1310. 10.12688/f1000research.7334.1.

23. Zhang X, Zhang S, Zhao Q, Ming R, Tang H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plants. 2019;5:833–845. 10.1038/s41477-019-0487-8.

24. Dudchenko O, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 2017;356:92–95. 10.1126/science.aal3327.

25. Durand NC, et al. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 2016;3:95–98. 10.1016/j.cels.2016.07.002.

26. Wolff J, et al. Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization. Nucleic Acids Res. 2020;48:W177–W184. 10.1093/nar/gkaa220.

27. Durand NC, et al. Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst. 2016;3:99–101. 10.1016/j.cels.2015.07.012.

28. Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35:W265–268. 10.1093/nar/gkm286.

29. Ou S, Jiang N. LTR_retriever: A Highly Accurate and Sensitive Program for Identification of Long Terminal Repeat Retrotransposons. Plant Physiol. 2018;176:1410–1422. 10.1104/pp.17.01310.

30. Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA. 2015;6:11. 10.1186/s13100-015-0041-9.

31. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009;Chapter 4:4.10.1–4.10.14.

32. De Coster W, D’Hert S, Schultz DT, Cruts M, Van Broeckhoven C. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics. 2018;34:2666–2669. 10.1093/bioinformatics/bty149.

33. Li H. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics. 2016;32:2103–2110. 10.1093/bioinformatics/btw152.

34. Kovaka S, et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 2019;20:278. 10.1186/s13059-019-1910-1.

35. Camacho C, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. 10.1186/1471-2105-10-421.

36. Slater GSC, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005;6:31. 10.1186/1471-2105-6-31.

37. Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008;24:637–644. 10.1093/bioinformatics/btn013.

38. Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 1997;268:78–94. 10.1006/jmbi.1997.0951.

39. Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007;23:673–679. 10.1093/bioinformatics/btm009.

40. Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 2011;12:491. 10.1186/1471-2105-12-491.

41. Blum M, et al. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res. 2021;49:D344–D354. 10.1093/nar/gkaa977.

42. Luo Y. 2024. Chromosome-level reference genome assembly of O. loti. figshare. 10.6084/m9.figshare.24865023.v2

43. Luo Y, Ban L. 2024. Chromosome-level genome assembly of Odontothrips loti Haliday (Thysanoptera: Thripidae) GenBank. JAZGLN000000000

44. 2024. NGDC Genome Sequence Archive (GSA) https://ngdc.cncb.ac.cn/gsa/browse/CRA014018/CRR997575

45. 2024. NGDC Genome Sequence Archive (GSA) https://ngdc.cncb.ac.cn/gsa/browse/CRA014018/CRR997573

46. 2024. NGDC Genome Sequence Archive (GSA) https://ngdc.cncb.ac.cn/gsa/browse/CRA014018/CRR997574

47. 2024. NGDC Genome Sequence Archive (GSA) https://ngdc.cncb.ac.cn/gsa/browse/CRA014018/CRR997576

48. Chen T, et al. The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types. Genomics, Proteomics & Bioinformatics. 2021;19:578–583. 10.1016/j.gpb.2021.08.001.

49. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at 10.48550/arXiv.1303.3997 (2013).

50. Seppey M, Manni M, Zdobnov EM. BUSCO: Assessing Genome Assembly and Annotation Completeness. Methods Mol Biol. 2019;1962:227–245. 10.1007/978-1-4939-9173-0_14.

51. Hoskins RA, et al. The Release 6 reference sequence of the Drosophila melanogaster genome. Genome Res. 2015;25:445–458. 10.1101/gr.185579.114.

Articles from Scientific Data are provided here courtesy of Nature Publishing Group

Data citation (figshare): DOI 10.6084/m9.figshare.24865023.v2
