Sci Data. 2023; 10: 252.
Published online 2023 May 3. https://doi.org/10.1038/s41597-023-02164-5
PMCID: PMC10156705
PMID: 37137922

Chromosome-level genome assembly of bean flower thrips Megalurothrips usitatus (Thysanoptera: Thripidae)

Abstract

Bean flower thrips Megalurothrips usitatus is a major pest of cowpea and other legumes and causes substantial economic losses. Its small size allows for easy concealment, and its high reproductive capacity readily leads to infestations. Despite the importance of a genome for developing novel management strategies, genetic studies on M. usitatus remain limited. We therefore generated a chromosome-level M. usitatus genome using a combination of PacBio long-read and Hi-C technologies. The assembled genome was 238.14 Mb with a scaffold N50 of 13.85 Mb. The final genome was anchored into 16 pseudo-chromosomes containing 14,000 genes, of which 91.74% were functionally annotated. Comparative genomic analyses revealed that expanded gene families were enriched in fatty acid metabolism and detoxification metabolism (ABC transporters), and contracted gene families were strongly associated with chitin-based cuticle development and sensory perception of taste. In conclusion, this high-quality genome provides a valuable resource for understanding the ecology and genetics of thrips and contributes to pest management.

Subject terms: Genomics, Entomology

Background & Summary

Bean flower thrips Megalurothrips usitatus is a highly harmful pest of leguminous crops in the genera Glycine, Arachis, and Vigna [1–4]. The insect lays eggs in plant tissue and feeds on leaves, flowers and pods, causing economic losses worldwide, particularly in southern China, India, Japan, the Philippines, and Australia [1,3,5,6]. Its small body size, cryptic behavior, and rapid spread make it difficult to control [6,7].

Attempts to mitigate agricultural damage have largely involved chemical insecticides [8–12]. However, excessive pesticide use leaves residues that endanger consumer health and induces resistance in pest insects. Understanding the evolution of pesticide resistance is necessary for developing novel management strategies, but the genetics of M. usitatus remains poorly understood. Filling this knowledge gap will benefit pest control efforts.

In this study, we assembled a chromosome-level genome of M. usitatus using a combination of PacBio long-read sequencing, Illumina short-read sequencing, and chromosome conformation capture (Hi-C) technologies. We compared the genomic features of M. usitatus with those of other insects to explore the genomic signatures of resistance. The high-quality reference genome of the bean flower thrips obtained in this study lays the foundation for future investigations on the ecology of thrips and provides valuable genetic information for its management.

Methods

Sample preparation and genomic DNA sequencing

Megalurothrips usitatus samples were collected from Wanning, Hainan province, and reared for approximately 100 generations in the laboratory. Adults were fed Lablab purpureus and kept at 25 ± 1 °C, 70 ± 5% relative humidity, and a 14:10 h light:dark cycle. Stages were confirmed under a light microscope and verified with pictorial keys [13]. Individuals were then quickly placed into collection tubes, flash-frozen in liquid nitrogen, and stored at −80 °C until use.

We prepared approximately 2,000 mixed-sex M. usitatus individuals for genome sequencing. Genomic DNA was extracted using the CTAB method, followed by purification using a Blood and Cell Culture DNA Midi Kit (QIAGEN, Germany). The purity and concentration of extracted DNA were determined with 0.75% agarose gel electrophoresis and a Qubit 2.0 Fluorometer (Thermo Fisher Scientific, USA), respectively. The library constructed from the extracted DNA had an insert size of approximately 10–20 Kb. A PacBio Sequel sequencer (Pacific Biosciences, Menlo Park, USA) was used to sequence long DNA fragments, and an Illumina NovaSeq 6000 was used to generate 150 bp paired-end short reads. The sequencing yielded 98.30 Gb (412.78× coverage) of long reads with an N50 length of 14,475 bp and an average length of 10,352.68 ± 2.46 bp (mean ± S.E.). The Illumina platform produced 58.80 Gb of raw data, from which adapters and low-quality short reads were removed using Fastp version 0.21.0 [14] with default parameters, resulting in a total of 55.86 Gb (234.57× coverage) of clean data (Table 1).

Table 1

Library sequencing data and methods used in this study to assemble the Megalurothrips usitatus genome.

| Sequencing strategy | Platform | Usage | Insert size | Clean data (Gb) | Coverage (X) |
|---|---|---|---|---|---|
| Short-reads | Illumina | Genome survey | 350 bp | 55.86 | 234.57 |
| Long-reads | PacBio Sequel II | Genome assembly | 10–20 Kb | 98.30 | 412.78 |
| Hi-C | Illumina | Hi-C assembly | 350 bp | 53.90 | 226.34 |
| RNA-seq | Illumina | Anno-evidence | 350 bp | 5.61 | 23.56 |
| Full-length transcriptome | PacBio Sequel II | Anno-evidence | 1–10 Kb | 47.67 | 200.18 |
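
The coverage values in Table 1 appear to follow from dividing each library's clean data yield by the size of the final assembly (238.14 Mb). The short Python sketch below reproduces that arithmetic. The function and variable names are illustrative, and the use of the assembly size as the denominator is inferred from the numbers rather than stated in the Methods.

```python
# Sketch: fold coverage = data yield / genome size.
# Assumption: the denominator is the final assembly size (238.14 Mb).
ASSEMBLY_SIZE_MB = 238.14

# Clean data yield per library in Gb, copied from Table 1.
CLEAN_DATA_GB = {
    "Short-reads": 55.86,
    "Long-reads": 98.30,
    "Hi-C": 53.90,
    "RNA-seq": 5.61,
    "Full-length transcriptome": 47.67,
}


def coverage(yield_gb, genome_mb=ASSEMBLY_SIZE_MB):
    """Return fold coverage rounded to two decimals (1 Gb = 1000 Mb)."""
    return round(yield_gb * 1000 / genome_mb, 2)


if __name__ == "__main__":
    for library, gb in CLEAN_DATA_GB.items():
        print(f"{library}: {coverage(gb)}x")
```

With these inputs the function returns the coverage column of Table 1 for every library.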

Hi-C library preparation and sequencing

Chromosome conformation capture (Hi-C) sequencing used fresh tissues from 1,500 mixed-sex M. usitatus individuals. The samples were cross-linked with a 2% formaldehyde isolation buffer, and nuclei were then digested with DpnII (NEB). Biotinylated nucleotides were used to repair the tails, and the ligated DNA was sheared into fragments of 350 bp in length. The resulting Hi-C library was sequenced on an Illumina NovaSeq 6000 platform with 150 bp paired-end reads. After applying the same filtering criteria as for the short reads, a total of 53.90 Gb (226.34× coverage) of clean data was obtained (Table 1).

Transcriptome sequencing

A pooled M. usitatus sample was prepared using 30 eggs, 20 pseudo-pupae, 10 females, and 10 males. Total RNA was extracted using the TRIzol reagent (Thermo Fisher Scientific, USA). A paired-end library was constructed using the TruSeq RNA Library Preparation Kit (Illumina, USA) and sequenced on an Illumina NovaSeq 6000 platform. This yielded a total of 5.61 Gb of clean RNA-seq data (Table 1). Additionally, total RNA (1 µg) was used to construct a full-length transcript isoform library using the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences, USA). Target-size sequences were generated using the PacBio Sequel II platform. A total of 47.67 Gb of full-length transcriptome data was obtained (Table 1).

Estimation of genomic characteristics

Genomic characteristics were determined based on 55.86 Gb of short-read data using a K-mer-based statistical analysis in JELLYFISH version 2.1.3 [15] with the following parameters: ‘count -m 17 -C -c 7 -s 1G -F 2’. Genome heterozygosity and genome size were estimated in GenomeScope version 2.0 [16] with default parameters. Based on 17-mer depth analysis, genome size and heterozygosity were estimated to be 255.81 Mb and 0.85%, respectively (Fig. 1).

Fig. 1 Genomic characteristics of Megalurothrips usitatus estimated in GenomeScope version 2.0 from 17-mers of Illumina short-read data.

The K-mer distribution showed two peaks: the first peak, at a coverage of about 100, reflects heterozygous K-mers, and the highest peak, at a coverage of about 200, is the main (homozygous) peak used to estimate genome size. Genome size was calculated to be 255.81 Mb with a heterozygosity rate of 0.85%.
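
The basic idea behind K-mer-based genome size estimation can be illustrated with a minimal Python sketch: the total number of K-mers observed is divided by the depth of the main peak. This is a simplification and not what GenomeScope does internally, since GenomeScope fits a mixture model to the whole histogram. The function name, the error cut-off `min_depth`, and the toy histogram are all hypothetical.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are treated as sequencing errors and ignored.
    Returns (main_peak_depth, estimated_genome_size_in_bp).
    """
    kept = {d: n for d, n in histogram.items() if d >= min_depth}
    if not kept:
        raise ValueError("no k-mers at or above min_depth")
    # The main peak is the depth shared by the most distinct k-mers.
    main_peak = max(kept, key=kept.get)
    # Total k-mer occurrences = sum over depths of (depth * distinct k-mers).
    total_kmers = sum(d * n for d, n in kept.items())
    return main_peak, total_kmers / main_peak


# Toy histogram (invented): many error k-mers at depth 1, a minor peak
# at depth 100 and the main peak at depth 200.
TOY_HISTOGRAM = {1: 5000, 100: 200, 200: 1000}

if __name__ == "__main__":
    peak, size = estimate_genome_size(TOY_HISTOGRAM)
    print(f"main peak depth: {peak}, estimated size: {size:.0f} bp")
```

In the toy data, K-mers at the half-depth peak each contribute half a genome position, which is why they are counted by occurrence rather than by distinct sequence.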

Genome assembly

We assembled a draft genome using wtdbg2 version 2.5 [17] with default parameters. We then polished it using RACON version 1.4.13 [18] with parameters ‘-m 8 -x -6 -g -8 -w 500 -u’ and Pilon version 1.14 [19] with default parameters, based on the 98.30 Gb of long reads and 55.86 Gb of short reads.

A scaffolding pipeline based on Durand et al. (2016) [20] was used to generate a high-quality chromosome-scale genome. Initially, Hi-C data were mapped to the contig assembly using BWA-MEM version 0.7.17 [21] with the following parameters: ‘mem -SP5M’. Next, the DpnII sites were generated using the ‘generate_site_positions.py’ script in Juicer version 1.5 [20]. The 3D-DNA pipeline (-r 2) was subsequently employed to order, orient, and cluster the contigs [22]. After viewing Hi-C contact maps, the chromosome-scale genome was assembled in Juicebox version 1.11.08 (https://github.com/aidenlab/Juicebox). The genome assembly was screened for contaminant sequences using the “Contamination in Sequence Databases” screen at NCBI. A total of 33 sequences were labeled as contaminants and removed (available in Figshare). To identify the mitochondrial genome, we amplified the cytochrome oxidase subunit 1 (COI) gene fragment with the primer pair LCO1490 and HCO2198, and obtained a DNA barcode sequence of approximately 610 bp [23]. We then used BLAST version 2.2.28 [24] (-evalue 1e-5) to find assembly sequences highly similar to the COI fragment (>98%), and identified one unplaced sequence (scaffold46) as the mitochondrial sequence. The resulting chromosome-level genome was 238.14 Mb with a scaffold N50 of 13.85 Mb, a maximum scaffold length of 20.88 Mb, and a GC content of 55.90% (Table 2). Of the genome, 91.89% was anchored to 16 pseudo-chromosomes (Table 2), which were well distinguished from each other in the chromatin interaction heatmap (Fig. 2).

Table 2

Statistics for the chromosome-level genome of Megalurothrips usitatus.

| Features | Values |
|---|---|
| Total length (bp) | 238,139,689 |
| Longest scaffold length (bp) | 20,884,914 |
| Scaffold N50 (bp) | 13,852,586 |
| Scaffold N90 (bp) | 10,644,695 |
| GC (%) | 55.90 |
| Anchored to chromosome (Mb, %) | 218.82 (91.89%) |
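
For readers unfamiliar with the N50 and N90 statistics in Table 2, the Python sketch below shows how they are conventionally computed from a list of scaffold lengths: scaffolds are sorted from longest to shortest, and the statistic is the length of the scaffold at which the running total first reaches the chosen fraction of the assembly. The function name `nx` and the toy lengths are illustrative; the actual scaffold lengths are not listed in the text.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx statistic (N50 for fraction=0.5, N90 for fraction=0.9).

    It is the length L such that scaffolds of length >= L together
    cover at least `fraction` of the total assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    raise ValueError("lengths must be non-empty")


def anchored_percent(anchored_mb, total_mb):
    """Percentage of the assembly placed on pseudo-chromosomes."""
    return round(100 * anchored_mb / total_mb, 2)


if __name__ == "__main__":
    toy_scaffolds = [10, 8, 6, 4, 2]          # invented lengths
    print("N50:", nx(toy_scaffolds))          # scaffold reaching 50% of 30
    print("N90:", nx(toy_scaffolds, 0.9))     # scaffold reaching 90% of 30
    # Values from Table 2: 218.82 Mb anchored out of 238.14 Mb.
    print("anchored (%):", anchored_percent(218.82, 238.14))
```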

Fig. 2 Genome-wide contact matrix of Megalurothrips usitatus generated using Hi-C data.

Each black square represents a pseudo-chromosome. The color bar indicates the interaction intensity of Hi-C contacts.

Predicting repeats

Repeat sequences were annotated using Extensive de novo TE Annotator (EDTA) version 1.9.4 [25]. In brief, LTR retrotransposons were identified with LTR_FINDER version 1.07 [26], LTRharvest [27], and LTR_retriever version 2.9.0 [28] with default parameters. Next, TIR-Learner [29] and HelitronScanner [30] were used to classify DNA transposons with default parameters. RepeatMasker version 4.0.7 (-gff -xsmall -no_is) [31] and RepeatProteinMasker version 4.0.7 (-engine wublast) were used to identify repeat sequences based on RepBase edition 20170127 [32]. Repeats were masked with de novo predictions using RepeatModeler version 2.0.1 with parameters ‘-engine ncbi -pa 4’. Additionally, Tandem Repeats Finder [33] was used to annotate tandem repeats with parameters ‘2 7 7 80 10 50 500 -f -d -m’. Overall, 20.20% of the assembled M. usitatus genome was classified as repetitive sequence (Table 3). Tandem repeats were the most abundant class (8.42%), followed by terminal inverted repeats (5.39%) (Table 3).

Table 3

Classification of repeat annotation in the Megalurothrips usitatus genome.

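| Class | Type | Count | Masked length (bp) | Percent (%) |
|---|---|---|---|---|
| LTR-retrotransposon | All | | | 4.28 |
| | Copia | 2,408 | 1,536,255 | 0.65 |
| | Gypsy | 7,985 | 5,493,251 | 2.31 |
| | Unknown | 8,220 | 3,156,422 | 1.33 |
| Terminal inverted repeat | All | | | 5.39 |
| | CACTA | 13,022 | 3,821,143 | 1.60 |
| | Mutator | 15,562 | 5,594,823 | 2.35 |
| | PIF/Harbinger | 264 | 117,748 | 0.05 |
| | Tc1/Mariner | 78 | 47,698 | 0.02 |
| | hAT | 10,259 | 3,260,929 | 1.37 |
| Non-terminal inverted repeat | All | | | 2.11 |
| | Helitron | 15,366 | 5,027,332 | 2.11 |
| Tandem repeat | | 239,289 | 20,056,514 | 8.42 |
| Total | | 73,164 | 28,055,601 | 20.20 |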
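
When reusing Table 3, it can help to recompute the totals. The Python sketch below sums the per-type counts and masked lengths and converts them to percentages of the assembly length given in Table 2. Based on the table values alone, the count and masked length in the Total row match the sum over transposable-element types only, while the 20.20% figure also includes tandem repeats. This is an observation from the arithmetic, not a statement made by the authors, and the variable names are illustrative.

```python
GENOME_BP = 238_139_689  # total assembly length from Table 2

# (count, masked length in bp) per transposable-element type, from Table 3
TE_TYPES = {
    "Copia": (2_408, 1_536_255),
    "Gypsy": (7_985, 5_493_251),
    "Unknown LTR": (8_220, 3_156_422),
    "CACTA": (13_022, 3_821_143),
    "Mutator": (15_562, 5_594_823),
    "PIF/Harbinger": (264, 117_748),
    "Tc1/Mariner": (78, 47_698),
    "hAT": (10_259, 3_260_929),
    "Helitron": (15_366, 5_027_332),
}
TANDEM_REPEATS = (239_289, 20_056_514)


def percent_of_genome(bp):
    """Masked length as a percentage of the assembly, two decimals."""
    return round(100 * bp / GENOME_BP, 2)


te_count = sum(count for count, _ in TE_TYPES.values())
te_bp = sum(bp for _, bp in TE_TYPES.values())
all_repeat_bp = te_bp + TANDEM_REPEATS[1]

if __name__ == "__main__":
    print("TE count:", te_count)
    print("TE masked bp:", te_bp, f"({percent_of_genome(te_bp)}%)")
    print("TE + tandem repeats:", f"{percent_of_genome(all_repeat_bp)}%")
```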

Gene and functional predictions

Genes in the assembled genome were predicted using a combination of homology-based, transcriptome-based, and ab initio methods. Homology-based predictions used peptide and transcript sequences downloaded from Aptinothrips rufus (http://v2.insect-genome.com/Organism/87), Frankliniella occidentalis (https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/697/945/GCF_000697945.3_Focc_3.1), and Thrips palmi (https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/012/932/325/GCF_012932325.1_TpBJ-2018v1). The IsoSeq version 3.4.0 workflow was used to generate 28,608 high-quality transcripts from the full-length transcriptome data, with a quality parameter of 0.99 (https://github.com/PacificBiosciences/IsoSeq). Next, short-read RNA-seq data were mapped to the reference genome using HISAT2 version 2.2.1 [34] with the parameter ‘-k 2’. The mapped reads were then assembled into transcripts using StringTie version 2.4.0 [35] with default parameters. Homologous proteins and transcripts were aligned using Exonerate version 2.4.0 with default parameters to train the gene sets. Meanwhile, a sorted BAM file of mapped RNA-seq reads was converted to a hints file using the bam2hints program in AUGUSTUS version 3.2.3 [36] with the parameter ‘--intronsonly’. The trained gene sets and hints files were combined as inputs for AUGUSTUS version 3.2.3 [36], which predicted coding genes from the assembled genome with default parameters. Finally, homology-based, ab initio, and transcript-based gene models were merged in MAKER version 2.31.10 to generate a high-confidence gene set [37]. This resulted in the annotation of 14,000 M. usitatus genes. The average transcript length was 2,243.30 bp, with an average coding sequence (CDS) length of 1,588.94 bp. The average exon number per gene was 7.38, and the average exon length was 303.85 bp (Table 4).

Table 4

Gene annotation statistics of the Megalurothrips usitatus genome.

| Features | Results |
|---|---|
| Number of genes | 14,000 |
| Average gene length (bp) | 4,612.39 |
| Number of mRNAs | 13,474 |
| Average mRNA length (bp) | 2,243.30 |
| Average mRNA count per gene | 1.10 |
| Average CDS length (bp) | 1,588.94 |
| Average protein sequence length (aa) | 529.65 |
| Average exon length (bp) | 303.85 |
| Average exon count per gene | 7.38 |

Gene structure and annotations were determined through several methods, including eggNOG-mapper [38] (-m diamond --tax_scope auto --go_evidence experimental --target_orthologs all --seed_ortholog_evalue 0.001 --seed_ortholog_score 60 --query-cover 20 --subject-cover 0 --override), InterProScan version 5.0 [39] (-iprlookup -goterms -appl Pfam -f TSV), BLAST version 2.2.28 [24] (-evalue 1e-5), and HMMER version 3.3.2 [40] (--noali --cut_ga Pfam-A.hmm). These methods were used to search against multiple public databases, including NCBI non-redundant protein (Nr), Gene Ontology (GO), Clusters of Orthologous Groups of Proteins (COG), Kyoto Encyclopedia of Genes and Genomes (KEGG), Swiss-Prot, and Pfam. Most genes (91.74%) were successfully annotated in at least one public database (Table 5).

Table 5

Functional annotation of the Megalurothrips usitatus genome.

| Database name | Annotated number | Percent (%) |
|---|---|---|
| NR | 12,804 | 91.46 |
| Swiss-Prot | 10,026 | 71.61 |
| GO | 4,334 | 30.96 |
| KEGG | 6,891 | 49.22 |
| COG | 10,253 | 73.24 |
| eggNOG | 10,253 | 73.24 |
| Pfam | 9,909 | 70.78 |
| Bm | 10,446 | 74.61 |
| Dm | 9,908 | 70.77 |
| Total | 12,843 | 91.74 |

Databases Bm and Dm were locally built in BLAST version 2.2.28 [24] using publicly available sequences of Bombyx mori [69] and Drosophila melanogaster [70], respectively.

Comparative genomic analysis

To identify single-copy orthologous genes, we used the longest protein sequence of each gene from M. usitatus and multiple other species (Table 6), including F. occidentalis [41], T. palmi [42], Acyrthosiphon pisum [43], Triatoma rubrofasciata [44], Columbicola columbae [45], Aedes aegypti [46], Danaus plexippus [47], Tribolium castaneum [48], Apis mellifera [49] and Daphnia galeata [50]. We performed all-to-all single-copy ortholog BLAST comparisons in OrthoFinder version 2.5.4 [51] with the parameters ‘-a blast -M msa’. We aligned the resulting single-copy orthologous genes using MAFFT version 7.487 (--auto) [52] and trimmed poorly aligned regions using Gblocks version 0.91b [53] (-t=p -b4=5). We retained the genes that met the stationary, reversible and homogeneous (SRH) assumptions [54] using IQ-TREE version 2.2.0 [55] with a p-value cut-off of 0.05. This left 1,573 single-copy genes. Next, we used FASconCAT-G version 1.05.1 [56] to concatenate the genes into a supermatrix, which was used for subsequent phylogenetic analysis.
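
The concatenation step can be pictured with a small Python sketch: per-gene alignments are joined end to end for each species, and the position of each gene in the resulting supermatrix is recorded. This is only an illustration of the idea and not the FASconCAT-G implementation. The function name, the gene labels, and the toy sequences are hypothetical, and padding a missing species with gaps is an assumption that should rarely matter for single-copy orthologs present in every species.

```python
def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments into one supermatrix.

    alignments: list of dicts mapping taxon -> aligned sequence; all
                sequences within one gene must have the same length.
    taxa:       taxon names to include, in output order.
    Returns (supermatrix, partitions), where partitions lists
    (gene_label, start, end) with 1-based inclusive coordinates.
    """
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for index, gene in enumerate(alignments, start=1):
        lengths = {len(seq) for seq in gene.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {index}: sequences differ in length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon missing from this gene is padded with gap characters.
            pieces[taxon].append(gene.get(taxon, "-" * width))
        partitions.append((f"gene{index}", start, start + width - 1))
        start += width
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


if __name__ == "__main__":
    toy = [
        {"sp_A": "MK-L", "sp_B": "MKAL"},  # invented trimmed alignment 1
        {"sp_A": "GG"},                    # invented alignment 2, sp_B missing
    ]
    matrix, parts = concatenate_alignments(toy, ["sp_A", "sp_B"])
    print(matrix)
    print(parts)
```

Keeping the partition coordinates alongside the supermatrix makes it possible to assign gene-specific substitution models later if desired.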

Table 6

Genome datasets used for comparative genomic analysis in this study.

| Order | Family | Species | Database | Accession number | Reference |
|---|---|---|---|---|---|
| Thysanoptera | Thripidae | Megalurothrips usitatus | In this study | | |
| Thysanoptera | Thripidae | Frankliniella occidentalis | NCBI | GCA_000697945.5 | [41] |
| Thysanoptera | Thripidae | Thrips palmi | NCBI | GCA_012932325.1 | [42] |
| Hemiptera | Reduviidae | Triatoma rubrofasciata | GigaDB | 100614 | [44] |
| Hemiptera | Aphididae | Acyrthosiphon pisum | NCBI | GCA_005508785.2 | [43] |
| Phthiraptera | Philopteridae | Columbicola columbae | InsectBase | IBG_00199 | [45] |
| Diptera | Culicidae | Aedes aegypti | NCBI | GCA_002204515.1 | [46] |
| Lepidoptera | Nymphalidae | Danaus plexippus | NCBI | GCA_018135715.1 | [47] |
| Coleoptera | Tenebrionidae | Tribolium castaneum | NCBI | GCA_000002335.3 | [48] |
| Hymenoptera | Apidae | Apis mellifera | NCBI | GCA_003254395.2 | [49] |
| Anomopoda | Daphniidae | Daphnia galeata | NCBI | GCA_918697745.1 | [50] |

We performed a maximum likelihood analysis of the concatenated sequences in IQ-TREE version 2.2.0 [55] with 1,000 UFBoot replicates (–bb 1,000 –model JTT+I+G4). The minimum correlation coefficient for the convergence criterion was set at 0.99 (-bcor 0.99). The age of each node was estimated using a correlated rates clock in MCMCTREE of PAML version 4.4 [57]. To estimate the divergence times, we selected the fossil records listed in Table 7.

Table 7

Fossils used for estimating divergence times and calibration-point prior settings in the analysis.

| Node assigned | Fossils | Age (Ma; min–max where both are given) | Remarks |
|---|---|---|---|
| Daphnia galeata + Insecta | | 456–531 | This calibration was based on Rehm et al. (2011), who dated the divergence between Crustacea and Hexapoda to ~510 Mya [71]. |
| Thysanoptera + Hemiptera + Columbicola columbae | | 333–378 | This calibration was based on Wang et al. (2016), who determined that the divergence between Psocodea and Condylognatha occurred around the Devonian and Carboniferous boundary, ~357 Ma (378–333 Ma) [72]. |
| Tribolium castaneum + (Danaus plexippus + Aedes aegypti) | Gallia alsatica (Diptera: Rhagionidae) | 242 | The age of Diptera was determined based on records of immature Diptera from the Triassic period (~242 Ma) [73]. |
| | Moravocoleus permianus (Coleoptera: Tshekardocoleidae) | 293 | This calibration was based on the oldest Palaeozoic beetles, described from the Sakmarian (290–293 Ma) [74]. |
| Triatoma rubrofasciata + Acyrthosiphon pisum | Paraknightia magnifica (Hemiptera: Paraknightiidae) | 241 | This calibration was based on the oldest described fossils of Heteroptera (~241 Ma) [75]. |
| | Aviorrhyncha magnifica (Hemiptera: Aviorrhynchidae) | 307 | The oldest described fossils of Sternorrhyncha are estimated to be from around 307 Ma [76]. |
| (Frankliniella occidentalis + Megalurothrips usitatus) + Thrips palmi | | 70–119 | This calibration was based on Johnson et al. (2018), who determined that Frankliniella and Thrips diverged at 90 Ma (70–119 Ma) [77]. |

Gene-family expansion and contraction were estimated using CAFE version 4.2 with parameters ‘lambda -s -t’, based on maximum likelihood and reduction methods [58]. Phylogenetic tree topology and branch lengths were considered when inferring the significance of changes in gene-family size on each branch. The results revealed 684 expanded gene families and 1,639 contracted gene families in M. usitatus (Fig. 3). Next, functional enrichment analysis (GO enrichment and KEGG pathway) was performed in KOBAS version 3.0 [59]. Significantly enriched GO terms were those with an adjusted p < 0.05 under Fisher’s exact test. Expanded gene families were enriched in the cAMP signaling pathway, fatty acid metabolism, detoxification metabolism (ABC transporters) and the ionotropic glutamate receptor pathway (Fig. 4a, available in Figshare). Contracted gene families were enriched in chitin-based cuticle development, sensory perception of taste and NADP+ activity (Fig. 4b, available in Figshare).
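
To make the enrichment criterion concrete, the following Python sketch computes a one-sided Fisher’s exact test for over-representation of a term and then adjusts a set of p-values for multiple testing. It is not the KOBAS implementation. The text only says “adjusted p”, so the Benjamini–Hochberg procedure used here is an assumption, and the function names and all gene counts in the example are invented.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    N: background genes, K: background genes carrying the term,
    n: input genes (e.g. expanded families), k: input genes with the term.
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented counts: (k, n, K, N) for three hypothetical terms.
    terms = {"term_1": (12, 50, 40, 1000),
             "term_2": (3, 50, 60, 1000),
             "term_3": (8, 50, 45, 1000)}
    raw = [fisher_enrichment_p(*counts) for counts in terms.values()]
    for name, p, q in zip(terms, raw, benjamini_hochberg(raw)):
        print(f"{name}: p={p:.3g}, adjusted p={q:.3g}, enriched={q < 0.05}")
```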

Fig. 3 Genome evolution of Megalurothrips usitatus.

A time-calibrated phylogenetic tree inferred from 1,573 single-copy orthologs using IQ-TREE version 2.2.0 is shown. The upper panel (wheat) represents Paraneoptera and the lower panel (light blue) represents Holometabola. M. usitatus and F. occidentalis diverged 73.93 Mya (million years ago). Nodes with a bootstrap support value of 100, based on 1,000 replicates, are marked with an orange dot. The numbers of expanded (+, red) and contracted (−, blue) gene families are shown for each lineage.

Fig. 4 Functional annotation of expanded and contracted gene families.

(a) Expanded genes. (b) Contracted genes. Each row represents an enriched function, and the bar length represents the enrichment ratio (input gene number/background gene number). Bar colors represent different clusters. If any cluster has more than five terms, the top five with the highest enrichment ratio are displayed.

Data Records

Genomic PacBio sequencing data were deposited in the Sequence Read Archive at NCBI under accession number SRR22137485 [60].

Genomic Illumina sequencing data were deposited in the Sequence Read Archive at NCBI under accession number SRR22137482 [61].

RNA-seq data were deposited in the Sequence Read Archive at NCBI under accession number SRR22137484 [62].

Full-length transcript isoform sequencing data were deposited in the Sequence Read Archive at NCBI under accession number SRR22137483 [63].

Hi-C sequencing data were deposited in the Sequence Read Archive at NCBI under accession number SRR22137481 [64].

The final chromosome assembly was deposited in GenBank at NCBI under accession number JAPTSV000000000 [65].

The contaminant file, single-copy orthologous genes, gene-family expansion and contraction, gene function annotation, and repeat annotation are available in Figshare [66].

Technical Validation

DNA integrity

The integrity of extracted genomic DNA was determined using 0.75% agarose gel electrophoresis and an Agilent 2100 Bioanalyzer (Agilent Technologies, USA). DNA concentration was measured using a Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific, USA) and a Qubit 2.0 Fluorometer (Thermo Fisher Scientific, USA). The 260/280 nm absorbance ratio was approximately 1.8.

Assessment of genome assemblies

We assessed the accuracy of the final genome assembly by mapping Illumina short reads to the M. usitatus genome with BWA-MEM version 0.7.17 [21]. The analysis showed that 96.52% of short reads were successfully mapped to the M. usitatus genome (Table 8). We further assessed the base-level quality of the genome assembly by estimating the quality value score (QVS) using Merqury version 1.1 [67], which gave a QVS of 32.65 (Table 8). Together, the high mapping rate and QVS indicate that the assembly is accurate.

Table 8

Assessment metrics for the final genome assembly of Megalurothrips usitatus.

| Types | Metric | Results |
|---|---|---|
| Genome completeness | Complete BUSCOs (C) | 97.40% |
| | Complete and single-copy BUSCOs (S) | 97.00% |
| | Complete and duplicated BUSCOs (D) | 0.40% |
| | Fragmented BUSCOs (F) | 0.60% |
| | Missing BUSCOs (M) | 2.00% |
| Genome accuracy | Mapping short-reads rate | 96.52% |
| | Quality value scores (QVs) | 32.65 |

Furthermore, we evaluated the completeness of the final genome assembly using Benchmarking Universal Single-Copy Orthologs (BUSCO version 3.0.2) with the insecta_odb10 dataset [68], which includes 1,367 orthologous genes. The analysis revealed a high completeness of 97.40% for the M. usitatus genome, with only 0.60% of BUSCO genes fragmented, 2.00% missing, and 0.40% duplicated (Table 8). These BUSCO results are comparable to the completeness of other thrips genomes, such as T. palmi (97.20%), F. occidentalis (98.50%), and A. rufus (95.00%) (Table 9).
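
The metrics in Table 8 can be read more intuitively with a little arithmetic. Merqury reports consensus quality on the Phred scale, so a quality value converts to an estimated per-base error rate via 10^(−QV/10), and the BUSCO categories should add up consistently (C = S + D, and C + F + M = 100%). The Python sketch below performs both checks using the values from Table 8. The function and variable names are illustrative.

```python
def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


# BUSCO percentages from Table 8.
BUSCO = {"S": 97.00, "D": 0.40, "F": 0.60, "M": 2.00}

# Complete = single-copy + duplicated; all categories should sum to 100%.
complete = round(BUSCO["S"] + BUSCO["D"], 2)
total = round(complete + BUSCO["F"] + BUSCO["M"], 2)

if __name__ == "__main__":
    error = qv_to_error_rate(32.65)
    print(f"QV 32.65 -> about {error:.2e} errors per base")
    print(f"Complete BUSCOs: {complete}%  (sum of categories: {total}%)")
```

A QV of 32.65 therefore corresponds to an estimated error rate on the order of 5 errors per 10,000 bases.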

Table 9

Comparisons of genome assemblies of different thrips.

| Species | Assembly level | Genome size (Mb) | Scaffold N50 (Kb) | BUSCO (%) | GC (%) |
|---|---|---|---|---|---|
| Megalurothrips usitatus | Chromosome | 238.14 | 13,852 | 97.40 | 55.90 |
| Thrips palmi | Chromosome | 237.85 | 14,670 | 97.20 | 53.90 |
| Frankliniella occidentalis | Scaffold | 274.99 | 4,180 | 98.50 | 48.40 |
| Aptinothrips rufus | Contig | 339.92 | 5 | 95.00 | 48.60 |

Acknowledgements

We thank Prof. Wangpeng Shi and Dr. Mingyue Feng for their assistance with sample collection, and Prof. Feng Zhang and Dr. Yingqi Liu for their help with divergence-time estimation. This work was supported by the National Natural Science Foundation of China (No. 31922012), Sanya Yazhou Bay Science and Technology City (No. SYND-2022-04), and the 2115 Talent Development Program of China Agricultural University.

Author contributions

H.L. and W.C. conceived the project. L.M. and Q.L. collected samples and extracted genomic nucleotides. L.M. and H.L. performed data analysis and wrote the manuscript. S.W., S.L., L.T., F.S. and Y.D. contributed to data analyses. All authors contributed to revising the manuscript. All authors have read and approved the final version.

Code availability

No specific codes or scripts were used in this study. All software used is in the public domain, with parameters clearly described in the Methods section.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Ling Ma, Qiaoqiao Liu.

References

1. Oparaeke AM. The sensitivity of flower bud thrips, Megalurothrips sjostedti Trybom (Thysanoptera: Thripidae), on cowpea to three concentrations and spraying schedules of Piper guineense Schum. & Thonn. extracts. Plant Prot. Sci. 2006;42:106. 10.17221/2757-PPS. [CrossRef] [Google Scholar]
2. Tillekaratne K, Edirisinghe J, Gunatilleke C, Karunaratne W. Survey of thrips in Sri Lanka: a checklist of thrips species, their distribution and host plants. Ceylon J. Sci. 2011;40:89–89. 10.4038/cjsbs.v40i2.3926. [CrossRef] [Google Scholar]
3. Tang L-D, et al. The life table parameters of Megalurothrips usitatus (Thysanoptera: Thripidae) on four leguminous crops. Fla. Entomol. 2015;2:620–625. 10.1653/024.098.0235. [CrossRef] [Google Scholar]
4. Zafirah Z, Azidah AA. Diversity and population of thrips species on legumes with special reference to Megalurothrips usitatus. Sains Malays. 2018;47:433–439. 10.17576/jsm-2018-4703-02. [CrossRef] [Google Scholar]
5. Duraimurugan P, Tyagi K. Pest spectra, succession and its yield losses in mungbean and urdbean under changing climatic scenario. Legume Res. 2014;37:212–222. 10.5958/j.0976-0571.37.2.032. [CrossRef] [Google Scholar]
6. Yasmin S, Ali M, Rahman MM, Akter MS, Latif MA. Biological traits of bean flower thrips, Megalurothrips usitatus (Thysanoptera: Thripidae) reared on mung bean. Herit. Sci. 2021;5:29–33. 10.26480/gws.02.2021.29.33. [CrossRef] [Google Scholar]
7. Liu P, et al. The male‐produced aggregation pheromone of the bean flower thrips Megalurothrips usitatus in China: identification and attraction of conspecifics in the laboratory and field. Pest Manage. Sci. 2020;76:2986–2993. 10.1002/ps.5844. [Abstract] [CrossRef] [Google Scholar]
8. Peter C, Govindarajulu V. Management of blossom thrips, Megalurothrips usitatus on pigeonpea. Int. J. Pest Manage. 1990;36:312–313. 10.1080/09670879009371495. [CrossRef] [Google Scholar]
9. Hossain MA. Efficacy of some insecticides against insect pests of mungbean (Vigna radiata L.) Bangladesh J. Agric. Res. 2015;40:657–667. 10.3329/bjar.v40i4.26940. [CrossRef] [Google Scholar]
10. Sujatha B, Bharpoda T. Evaluation of insecticides against sucking pests grown during Kharif. Int. Curr. Microbiol. App. Sci. 2017;6:1258–1268. 10.20546/ijcmas.2017.610.150. [CrossRef] [Google Scholar]
11. Yasmin S, Latif M, Ali M, Rahman M. Management of thrips infesting mung bean using pesticides. SAARC J. Agric. 2019;17:43–52. 10.3329/sja.v17i2.45293. [CrossRef] [Google Scholar]
12. Maradi RM, et al. Evaluation of bio-efficacy of newer molecules of different insecticides against thrips, Aphis craccivora in yard long bean, Vigna unguiculata subsp. sesquipedalis. J. Entomol. Zool. Stud. 2020;15:189–192. 10.55446/IJE.2021.360. [CrossRef] [Google Scholar]
13. Khan, R., Seal, D. & Adhikari, R. Bean flower thrips Megalurothrips usitatus (Bagnall) (Insecta: Thysanoptera: Thripidae). EDIS, 1–7, 10.32473/edis-IN1352-2022 (2022).
14. Chen S, Zhou Y, Chen Y, Gu J. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:884–890. 10.1093/bioinformatics/bty560. [Europe PMC free article] [Abstract] [CrossRef] [Google Scholar]
15. Marcais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27:764–770. 10.1093/bioinformatics/btr011. [Europe PMC free article] [Abstract] [CrossRef] [Google Scholar]
16. Vurture GW, et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics. 2017;33:2202–2204. 10.1093/bioinformatics/btx153. [Europe PMC free article] [Abstract] [CrossRef] [Google Scholar]
17. Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nat. Methods. 2020;17:155–158. 10.1038/s41592-019-0669-3.
18. Vaser R, Sović I, Nagarajan N, Šikić M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27:737–746. 10.1101/gr.214270.116.
19. Walker BJ, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9:e112963. 10.1371/journal.pone.0112963.
20. Durand NC, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3:95–98. 10.1016/j.cels.2016.07.002.
21. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. 10.1093/bioinformatics/btp324.
22. Dudchenko O, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 2017;356:92–95. 10.1126/science.aal3327.
23. Folmer O, Black M, Hoeh W, Lutz R, Vrijenhoek R. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol. Mar. Biol. Biotechnol. 1994;3:294–299.
24. Camacho C, et al. BLAST+: architecture and applications. BMC Bioinform. 2009;10:421–429. 10.1186/1471-2105-10-421.
25. Ou S, et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 2019;20:1–18. 10.1186/s13059-019-1905-y.
26. Ou S, Jiang N. LTR_FINDER_parallel: parallelization of LTR_FINDER enabling rapid identification of long terminal repeat retrotransposons. Mobile DNA. 2019;10:48–48. 10.1186/s13100-019-0193-0.
27. Ellinghaus D, Kurtz S, Willhoeft U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinform. 2008;9:1–14. 10.1186/1471-2105-9-18.
28. Ou S, Jiang N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 2017;176:1410–1422. 10.1104/pp.17.01310.
29. Su W, Gu X, Peterson T. TIR-Learner, a new ensemble method for TIR transposable element annotation, provides evidence for abundant new transposable elements in the maize genome. Mol. Plant. 2019;12:447–460. 10.1016/j.molp.2019.02.008.
30. Xiong W, He L, Lai J, Dooner HK, Du C. Helitronscanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. Proc. Natl. Acad. Sci. USA. 2014;111:10263–10268. 10.1073/pnas.1410068111.
31. Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics. 2004;5:1–14. 10.1002/0471250953.bi0410s25.
32. Jurka J, et al. Repbase update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 2005;110:462–467. 10.1159/000084979.
33. Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–580. 10.1093/nar/27.2.573.
34. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods. 2015;12:357–360. 10.1038/nmeth.3317.
35. Kovaka S, et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 2019;20:1–13. 10.1186/s13059-019-1910-1.
36. Stanke M, et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 2006;34:435–439. 10.1093/nar/gkl200.
37. Cantarel BL, et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2008;18:188–196. 10.1101/gr.6743907.
38. Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 2021;38:5825–5829. 10.1093/molbev/msab293.
39. Jones P, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014;30:1236–1240. 10.1093/bioinformatics/btu031.
40. Finn RD, Clements J, Eddy SR. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 2011;39:29–37. 10.1093/nar/gkr367.
41. Rotenberg D, et al. Genome-enabled insights into the biology of thrips as crop pests. BMC Biol. 2020;18:1–37. 10.1186/s12915-020-00862.
42. Guo SK, et al. Chromosome‐level assembly of the melon thrips genome yields insights into evolution of a sap‐sucking lifestyle and pesticide resistance. Mol. Ecol. Resour. 2020;20:1110–1125. 10.1111/1755-0998.13189.
43. International Aphid Genomics Consortium. Genome sequence of the pea aphid Acyrthosiphon pisum. PLoS Biol. 2010;8:e1000313. 10.1371/journal.pbio.1000313.
44. Liu Q, et al. A chromosomal-level genome assembly for the insect vector for Chagas disease, Triatoma rubrofasciata. GigaScience. 2019;8:giz089. 10.1093/gigascience/giz089.
45. Baldwin-Brown JG, et al. The assembled and annotated genome of the pigeon louse Columbicola columbae, a model ectoparasite. G3. 2021;11:jkab009. 10.1093/g3journal/jkab009.
46. Nene V, et al. Genome sequence of Aedes aegypti, a major arbovirus vector. Science. 2007;316:1718–1723. 10.1126/science.1138878.
47. Mongue AJ, Nguyen P, Voleníková A, Walters JR. Neo-sex chromosomes in the monarch butterfly, Danaus plexippus. G3. 2017;7:3281–3294. 10.1534/g3.117.300187.
48. Richards S, et al. The genome of the model beetle and pest Tribolium castaneum. Nature. 2008;452:949–955. 10.1038/nature06784.
49. Honeybee Genome Sequencing Consortium. Insights into social insects from the genome of the honeybee Apis mellifera. Nature. 2006;443:931–949. 10.1038/nature05260.
50. Nickel J, et al. Hybridization dynamics and extensive introgression in the Daphnia longispina species complex: new insights from a high-quality Daphnia galeata reference genome. Genome Biol. Evol. 2021;13:evab267. 10.1093/gbe/evab267.
51. Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:1–14. 10.1186/s13059-019-1832-y.
52. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 2013;30:772–780. 10.1093/molbev/mst010.
53. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 2000;17:540–552. 10.1093/oxfordjournals.molbev.a026334.
54. Naser-Khdour S, Minh BQ, Zhang W, Stone EA, Lanfear R. The prevalence and impact of model violations in phylogenetic analysis. Genome Biol. Evol. 2019;11:3341–3352. 10.1093/gbe/evz193.
55. Nguyen L-T, Schmidt HA, Von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 2015;32:268–274. 10.1093/molbev/msu300.
56. Kück P, Longo GC. FASconCAT-G: extensive functions for multiple sequence alignment preparations concerning phylogenetic studies. Front. Zool. 2014;11:1–8. 10.1186/s12983-014-0081-x.
57. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007;24:1586–1591. 10.1093/molbev/msm088.
58. De Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: a computational tool for the study of gene family evolution. Bioinformatics. 2006;22:1269–1271. 10.1093/bioinformatics/btl097.
59. Bu D, et al. KOBAS-i: intelligent prioritization and exploratory visualization of biological functions for gene enrichment analysis. Nucleic Acids Res. 2021;49:317–325. 10.1093/nar/gkab447.
60. 2022. NCBI Sequence Read Archive. SRR22137485
61. 2022. NCBI Sequence Read Archive. SRR22137482
62. 2022. NCBI Sequence Read Archive. SRR22137484
63. 2022. NCBI Sequence Read Archive. SRR22137483
64. 2022. NCBI Sequence Read Archive. SRR22137481
65. Ma L, Liu Q, Li H, Cai W. 2022. Megalurothrips usitatus genome sequencing and assembly. GenBank. JAPTSV000000000
66. Ma L. 2023. Chromosome-level genome assembly of bean flower thrips Megalurothrips usitatus. Figshare.
67. Rhie A, Walenz BP, Koren S, Phillippy AM. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 2020;21:245. 10.1186/s13059-020-02134-9.
68. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–3212. 10.1093/bioinformatics/btv351.
69. Mita K, et al. The genome sequence of silkworm, Bombyx mori. DNA Res. 2004;11:27–35. 10.1093/dnares/11.1.27.
70. Adams MD, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195. 10.1126/science.287.5461.2185.
71. Rehm P, et al. Dating the arthropod tree based on large-scale transcriptome data. Mol. Phylogen. Evol. 2011;61:880–887. 10.1016/j.ympev.2011.09.003.
72. Wang Y-h, et al. Fossil record of stem groups employed in evaluating the chronogram of insects (Arthropoda: Hexapoda). Sci. Rep. 2016;6:38939. 10.1038/srep38939.
73. Krzeminski W, Krzeminska E. Triassic Diptera: descriptions, revisions and phylogenetic relations. Acta Zool. Cracov. 2003;46:153–184.
74. Nikolajev G, Ren D. The oldest fossil Ochodaeidae (Coleoptera: Scarabaeoidea) from the middle Jurassic of China. Zootaxa. 2010;2553:65–68. 10.11646/zootaxa.2553.1.4.
75. Grimaldi D, Engel MS. Evolution of the Insects. Cambridge University Press; 2005.
76. Nel A, et al. The earliest known holometabolous insects. Nature. 2013;503:257–261. 10.1038/nature12629.
77. Johnson KP, et al. Phylogenomics and the evolution of hemipteroid insects. Proc. Natl. Acad. Sci. USA. 2018;115:12775–12780. 10.1073/pnas.1815820115.

Articles from Scientific Data are provided here courtesy of Nature Publishing Group

Funding 


Funders who supported this work.

National Natural Science Foundation of China (National Science Foundation of China) (1)