Methods in Molecular Biology 1751
Yejun Wang
Ming-an Sun Editors
Transcriptome Data Analysis
Methods and Protocols
METHODS IN MOLECULAR BIOLOGY
Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK
Edited by
Yejun Wang
Department of Cell Biology and Genetics, School of Basic Medicine, Shenzhen University
Health Science Center, Shenzhen, China
Ming-an Sun
Epigenomics and Computational Biology Lab, Biocomplexity Institute of Virginia Tech,
Blacksburg, VA, USA
Preface
As sequencing technology improves and costs decrease, more and more laboratories are
performing RNA-Seq to explore the molecular mechanisms of various biological phenotypes.
Because of the increased sequencing depth now available, the purposes of transcriptome
studies have also expanded considerably. In addition to the conventional uses for gene
annotation, profiling, and expression comparison, transcriptome studies have been applied
for many other purposes, including but not limited to gene structure analysis, identification
of new genes or regulatory RNAs, RNA editing analysis, co-expression or regulatory
network analysis, biomarker discovery, development-associated imprinting studies,
single-cell RNA sequencing studies, and pathogen–host dual RNA sequencing studies.
The aim of this book is to give comprehensive practical guidance on transcriptome data
analysis for different scientific purposes. It is organized in three parts. In Part I, Chapters 1
and 2 introduce step-by-step protocols for RNA-Seq and microarray data analysis, respectively.
Chapter 3 focuses on downstream pathway and network analysis of the differentially
expressed genes identified from expression profiling data. Unlike most of the other protocols,
which are command-line based, Chapter 4 describes a visualization-based method for
transcriptome data analysis. Chapters 5–11 in Part II give practical protocols for gene
characterization analysis with RNA-Seq data, including alternatively spliced isoform analysis
(Chapter 5), transcript structure analysis (Chapter 6), RNA editing (Chapter 7), and
identification and downstream data analysis of microRNA (Chapters 8 and 9), lincRNA
(Chapter 10), and transposable elements (Chapter 11). In Part III, protocols on several new
applications of transcriptome studies are described: RNA–protein interactions (Chapter 12),
expression noise analysis (Chapter 13), epigenetic imprinting (Chapter 14), single-cell RNA
sequencing applications (Chapter 15), and deconvolution of heterogeneous cells
(Chapter 16). Some chapters cover more than one application. For example, Chapter 5
also presents the analysis of single-molecule sequencing data in addition to alternative
splicing analysis; Chapter 12 also gives solutions for the analysis of small RNAs in bacteria.
Some topics were not included in this volume for various reasons, e.g., analysis of circular
RNAs, metatranscriptomics, biomarker identification, and dual RNA-Seq. For circular
RNAs, there are numerous published papers and books with protocols that can be followed.
Metatranscriptomics is a new technique, and data-oriented methods for its analysis are still
lacking. For most other applications, the core protocols for data processing and analysis are
the same as those presented in the chapters of this volume.
Abstract
With recent advances in next-generation sequencing technology, RNA sequencing (RNA-Seq) has
emerged as a powerful approach for transcriptomic profiling. RNA-Seq has been used in almost every
field of biological study and has greatly extended our view of transcriptomic complexity in different
species. In particular, for nonmodel organisms, which usually lack high-quality reference genomes,
de novo transcriptome assembly from RNA-Seq data provides a solution for comparative
transcriptomic study. In this chapter, we focus on the comparative transcriptomic analysis of nonmodel
organisms. Two analysis strategies (without or with a reference genome) are described step by step,
ending with the identification of differentially expressed genes.
1 Introduction
Yejun Wang and Ming-an Sun (eds.), Transcriptome Data Analysis: Methods and Protocols, Methods in Molecular Biology,
vol. 1751, https://doi.org/10.1007/978-1-4939-7710-9_1, © Springer Science+Business Media, LLC 2018
Han Cheng et al.
2 Materials
2.1 Software Packages
All the software packages need to be installed on your workstation in
advance. Because most bioinformatics tools are designed for Linux
operating systems, we demonstrate each step on 64-bit
Ubuntu. For convenience when running commands in
your working directory, add the folders containing your executables
to your PATH environment variable so that each executable can be
run directly by typing its name. Note that some of the software
used in this protocol may not be the latest version. In such cases, you are
encouraged to download and use the latest version.
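The PATH setup described above can be sketched as a small shell helper. This is only an illustration: add_to_path is a hypothetical function name, not part of any package used in this protocol, and the example directory simply follows the /home/your_home/soft/ layout used in this chapter. The helper prepends a directory only if it is not already listed, so running it repeatedly does not bloat PATH.

```shell
#!/usr/bin/env bash
# add_to_path DIR : prepend DIR to PATH unless it is already listed.
# Hypothetical helper for illustration only.
add_to_path() {
    local dir="$1"
    case ":$PATH:" in
        *":$dir:"*) ;;              # already present: do nothing
        *) PATH="$dir:$PATH" ;;     # otherwise prepend
    esac
    export PATH
}

# Example directory (an assumption); replace it with the folder that
# actually holds your executables.
add_to_path "/home/your_home/soft/FastQC"

# Show the first few PATH entries to confirm the change.
echo "$PATH" | tr ':' '\n' | head -n 3
```

An export typed at the prompt lasts only for the current shell session. To keep the setting across sessions, the same line can be appended to a shell startup file such as ~/.bashrc.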
2.1.1 SRA Toolkit
Download the SRA Toolkit [14], unpack the tarball to your destination
directory (e.g., /home/your_home/soft/), and add the directory containing the
executables to your PATH:
wget http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-ubuntu64.tar.gz
export PATH=/home/your_home/soft/sratoolkit.2.7.0-ubuntu64/bin:$PATH
2.1.2 FastQC
Download the FastQC package [15], unpack it, and add the directory
to your PATH.
wget http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/fastqc_v0.10.1.zip
export PATH=/home/your_home/soft/FastQC:$PATH
2.1.3 Trinity
Download the Trinity package [4], unpack it, and add the directory
to your PATH.
wget https://github.com/trinityrnaseq/trinityrnaseq/archive/v2.2.0.tar.gz
export PATH=/home/your_home/soft/trinityrnaseq-2.2.0:$PATH
export PATH=/home/your_home/soft/trinityrnaseq-2.2.0/util:$PATH
2.1.4 RSEM
Download the RSEM package [7], unpack it, and add the RSEM
directory to your PATH.
wget https://github.com/deweylab/RSEM/archive/v1.2.8.tar.gz
export PATH=/home/your_home/soft/rsem-1.2.8:$PATH
cd /home/your_home/soft/R-3.2.2
make
make check
make install
2.1.6 Bowtie2
Download the Bowtie2 package [17], unpack it, and then add the Bowtie2
directory to your PATH.
wget http://jaist.dl.sourceforge.net/project/bowtie-bio/bowtie2/2.2.6/bowtie2-2.2.6-linux-x86_64.zip
2.1.7 Tophat (See Note 1)
Download Tophat [18], unpack and install it, and then add the
directory to your PATH.
wget http://ccb.jhu.edu/software/tophat/downloads/tophat-2.0.9.Linux_x86_64.tar.gz
tar -xzf tophat-2.0.9.Linux_x86_64.tar.gz
cd tophat-2.0.9.Linux_x86_64
./configure --prefix=/home/your_home/soft/tophat2
make
make install
export PATH=/home/your_home/soft/tophat2:$PATH
2.1.8 Cufflinks
Download Cufflinks [19], unpack it, and then add the directory to
your PATH.
wget http://cole-trapnell-lab.github.io/cufflinks/assets/downloads/cufflinks-2.2.1.Linux_x86_64.tar.gz
export PATH=/home/your_home/soft/cufflinks-2.2.1.Linux_x86_64:$PATH
2.1.9 EBSeq
EBSeq [20] is an R Bioconductor package for gene and isoform
differential expression analysis of RNA-Seq data. For installation,
start R and enter:
source("https://bioconductor.org/biocLite.R")
biocLite("EBSeq")
biocLite("DESeq")
2.2 Data Samples
Most public RNA-Seq data can be downloaded from the NCBI SRA
database (https://www.ncbi.nlm.nih.gov/sra) (see Note 2). In this
protocol, we use an RNA-Seq data set from the rubber tree. The data set
includes six samples: three biological replicates each from the control and
cold-stressed conditions, denoted as “control” and “cold.”
3 Methods
Download the RNA-Seq data from the NCBI SRA database and place
the files in your working directory (e.g., /home/your_name/NGS/SRA).
Run the commands as demonstrated in this protocol
in your working directory (see Notes 3 and 4).
3.1 RNA-Seq Data Quality Control
1. Generate FASTQ files from SRA files. To extract FASTQ files
from the downloaded sra files and put them in a new folder “fq”,
go to your NGS data directory and type (see Note 5):
fastq-dump -O ./fq --split-files ./SRA/SRR*.sra
2. Run quality control with FastQC (see Note 6).
fastqc -o ./qc -f fastq ./fq/Sample*.fastq
3. Remove low-quality reads (optional). In most cases, low-quality
reads have already been removed by the sequencing service
provider before delivery. In this example, the
FASTQ files had been filtered before submission to the NCBI
SRA database (see Note 7). Because fastq_quality_filter processes one
file at a time, loop over the files and write the output to a separate folder:
mkdir -p fq_filtered
for f in fq/Sample*.fastq; do fastq_quality_filter -Q33 -v -q 30 -p 90 -i "$f" -o "fq_filtered/$(basename "$f")"; done
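When read filtering is applied to each mate file separately, the two files of a pair can end up with different numbers of reads, which downstream paired-end tools generally do not accept. The following is a minimal sketch of a sanity check, not part of the original protocol: count_reads and check_pair are hypothetical helper names, the fq/ folder layout is assumed from the steps above, and the count assumes standard four-line FASTQ records.

```shell
#!/usr/bin/env bash
# count_reads FILE : print the number of reads in a FASTQ file,
# assuming standard four-line records (no wrapped sequence lines).
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# check_pair R1 R2 : succeed only if both mate files hold the same number of reads.
check_pair() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}

# Report every *_1.fastq / *_2.fastq pair found in ./fq (assumed layout).
for r1 in fq/*_1.fastq; do
    [ -e "$r1" ] || continue            # skip if the glob matched nothing
    r2="${r1%_1.fastq}_2.fastq"
    if check_pair "$r1" "$r2"; then
        echo "OK       $r1 ($(count_reads "$r1") reads)"
    else
        echo "MISMATCH $r1 vs $r2"
    fi
done
```

Equal counts are a necessary but not a sufficient condition for correctly paired files, so this check only catches the most common failure.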
3.2 Gene Expression Analysis Without Reference Genome
In most cases, nonmodel organisms do not have a reference genome.
We therefore use a reference-free analysis strategy to compare
gene expression profiles and to find differentially expressed (DE) genes. This strategy first
assembles a reference transcriptome from the RNA-Seq data, then
maps the reads to that transcriptome and calculates
gene expression. In this protocol, we use Trinity to assemble the transcriptome,
RSEM to calculate read counts, and finally two
popular packages, EBSeq and DESeq, to find DE genes.
1. Reference transcriptome assembly. The Trinity program [4]
can assemble the reads from all the sample files into one reference
transcriptome, which can then be used
for gene expression analysis. For paired-end RNA-Seq with
read1 (*_1.fastq) and read2 (*_2.fastq), the reference transcriptome
can be assembled by typing:
Trinity.pl --JM 500G --seqType fq --left fq/Sample*_1.fastq --right fq/Sample*_2.fastq --output trinity_out --min_kmer_cov 5 --CPU 32
(see Note 8)
Troubleshooting: In some cases, the Trinity program stops
because of insufficient memory while executing the “butterfly_commands”.
Go to the results directory trinity_out/chrysalis/
and check whether the “butterfly_commands” file exists.
If it does, use the following command to continue the assembly.
cmd_process_forker.pl -c trinity_out/chrysalis/butterfly_commands --CPU 10 --shuffle
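The assembly command in step 1 passes the read files to --left and --right through shell wildcards, which expand to space-separated file names. Some Trinity releases instead document these options as comma-separated lists, so check the help text of your installed version before running. The sketch below, offered only as an assumption-laden illustration, builds such comma-separated lists; join_by_comma is a hypothetical helper and not part of Trinity.

```shell
#!/usr/bin/env bash
# join_by_comma ARG... : print the arguments joined by commas.
# Hypothetical helper for illustration; it is not part of Trinity.
join_by_comma() {
    local IFS=','
    echo "$*"
}

# Globs expand in sorted order, so the two lists stay in matching sample
# order as long as every sample has both a *_1.fastq and a *_2.fastq file.
left=$(join_by_comma fq/Sample*_1.fastq)
right=$(join_by_comma fq/Sample*_2.fastq)

# Print the option strings so they can be inspected before use.
echo "--left $left --right $right"
```

Printing the lists first makes it easy to confirm that the left and right files appear in the same sample order.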
Bowtie [22] for read alignment. The first step is to extract and
preprocess the reference sequences and then build Bowtie
indices.
mkdir rsem
cd rsem
mkdir tmp
extract-transcript-to-gene-map-from-trinity ../trinity_out/Trinity.fasta tmp/unigenes.togenes
for ((k=1;k<6;k+=1));do
done
(see Note 9)
Alternatively, you can use EBSeq natively for DE
gene identification. In the R console, type:
library("EBSeq")
setwd("/path/to/your/directory/rsem/")
Condition = factor(c("Control","Control","Control","Cold","Cold","Cold"))
GeneSizes = MedianNorm(GeneMat)
GeneEBDERes=GetDEResults(GeneEBOut, FDR=0.05)
(see Note 9)
For a more detailed introduction to the functions, please refer to the EBSeq
vignette [20].
4. Differentially expressed gene identification with DESeq. Alternatively,
you can use DESeq for DE gene identification. DESeq
is an R package that analyzes sequence count data from RNA-Seq
and tests for differential expression [21]. DESeq accepts RSEM
output files for analysis. The first step is to merge the per-sample count
files generated by the rsem-calculate-expression script in the
RSEM package. The merging step can be performed with the
merge_RSEM_frag_counts_single_table.pl script from the Trinity
package:
TRINITY_HOME/util/RSEM_util/merge_RSEM_frag_counts_single_table.pl Sample1.genes.results Sample2.genes.results Sample3.genes.results Sample4.genes.results Sample5.genes.results Sample6.genes.results >all.genes.counts
countTable<-read.table("all.genes.counts",header=T,sep="\t",row.names=1)
countTable = round(countTable)
(see Note 10)
conditions<-factor(c("Control","Control","Control",
"Cold","Cold","Cold"))
cds<-newCountDataSet(countTable,conditions)
cds<-estimateSizeFactors(cds)
cds<-estimateDispersions(cds)
write.table(res, 'compare.csv',sep='\t',quote=F,row.names=F)
head(res)
plotMA(res)
res_sig<-subset(res, padj<0.05);
(see Note 11)
dim(res_sig)
res_sig_order<-res_sig[order(res_sig$padj),]
write.table(res_sig_order, 'difference.txt',sep='\t',quote=F,row.names=F)
(see Note 12)
For a detailed introduction, please refer to the DESeq vignette [23].
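The count-merging step at the start of this procedure can also be approximated with plain awk when the Trinity helper script is not at hand. The sketch below is not the Trinity script and makes explicit assumptions: each *.genes.results file is tab-separated with one header line, the gene identifier is in column 1, and the expected count is in column 5. Check the header of your own RSEM output before relying on these column positions. merge_counts is a hypothetical function name.

```shell
#!/usr/bin/env bash
# merge_counts FILE... : print a tab-separated matrix with one row per gene
# and one count column per input file (column header = input file name).
# Assumptions: tab-separated input, one header line per file,
# gene_id in column 1, expected_count in column 5.
merge_counts() {
    awk -F'\t' -v OFS='\t' '
        FNR == 1 { nfile++; name[nfile] = FILENAME; next }   # skip each header
        {
            # Remember genes in order of first appearance.
            if (!($1 in seen)) { seen[$1] = 1; order[++ngene] = $1 }
            count[$1 SUBSEP nfile] = $5
        }
        END {
            header = "gene_id"
            for (i = 1; i <= nfile; i++) header = header OFS name[i]
            print header
            for (g = 1; g <= ngene; g++) {
                row = order[g]
                for (i = 1; i <= nfile; i++) {
                    key = order[g] SUBSEP i
                    # Genes absent from a file get a count of 0.
                    row = row OFS ((key in count) ? count[key] : 0)
                }
                print row
            }
        }
    ' "$@"
}

# Run only if RSEM result files are present in the current directory.
if ls Sample*.genes.results >/dev/null 2>&1; then
    merge_counts Sample*.genes.results > all.genes.counts
fi
```

Keying the counts by gene identifier rather than by line number means the inputs do not have to list genes in the same order.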
3.3 Gene Expression Analysis with Reference Genome
Thanks to genome sequencing projects, reference genomes
have recently been published for many nonmodel organisms. For
these organisms, the analysis strategy with a reference genome can
be adopted. Typically, we first prepare the reference genome files,
then map each read file to the reference genome, and finally call the
DE genes.
1. Prepare reference genome file. Download the genome files
(sequence FASTA file and GFF annotation file) from the GenBank
database, and then build the Bowtie2 index with the “bowtie2-build”
command in the Bowtie2 package:
for ((k=1;k<6;k+=1));do
done
Then merge all the assembled transcript files:
ls *cl/transcripts.gtf >assemblies.txt
cuffmerge -p 32 -g /path/to/gff/HbGenome.gff3 -s /path/to/genome/HbGenome.fas assemblies.txt
(see Note 14)
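Before merging, it can help to confirm that every per-sample assembly actually produced a non-empty transcripts.gtf, since a failed run may leave an empty file behind. The following is a minimal sketch under that assumption: list_assemblies is a hypothetical helper, and the *cl/ directory pattern is taken from the ls command above.

```shell
#!/usr/bin/env bash
# list_assemblies FILE... : print only the files that exist and are non-empty;
# report the others on standard error.
# Hypothetical helper for illustration only.
list_assemblies() {
    local f
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "$f"
        else
            echo "skipping missing or empty file: $f" >&2
        fi
    done
}

# Build the assembly list from the per-sample output directories
# (the *cl/ pattern is assumed from the protocol above).
list_assemblies *cl/transcripts.gtf > assemblies.txt
```

The resulting assemblies.txt has the same one-path-per-line layout as the file produced by ls, so it can be passed to the merge step unchanged.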
3. Call differentially expressed genes with Cuffdiff. Cufflinks
includes a program, “Cuffdiff”, which can be used to find
significant changes in transcript expression, splicing, and promoter
use. Cuffdiff requires two types of files: the SAM (or BAM) files
from the Tophat program and the transcript annotation GTF file from
Cufflinks:
4 Notes
Acknowledgments
References
1. Hoeijmakers WAM, Bártfai R, Stunnenberg HG (2013) Transcriptome analysis using RNA-Seq. Methods Mol Biol 923:221–239
2. Garg R, Jain M (2013) RNA-Seq for transcriptome analysis in non-model plants. Methods Mol Biol 1069:43–58
3. Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10:57–63
4. Grabherr MG, Haas BJ, Yassour M et al (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29:644–652
5. Xie Y, Wu G, Tang J et al (2014) SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads. Bioinformatics 30:1660–1666
6. Simpson JT, Wong K, Jackman SD et al (2009) ABySS: a parallel assembler for short read sequence data. Genome Res 19:1117–1123
7. Li B, Dewey CN (2011) RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12:323
8. Chao J, Chen Y, Wu S, Tian W-M (2015) Comparative transcriptome analysis of latex from rubber tree clone CATAS8-79 and PR107 reveals new cues for the regulation of latex regeneration and duration of latex flow. BMC Plant Biol 15:104
9. Fang Y, Mei H, Zhou B et al (2016) De novo transcriptome analysis reveals distinct defense mechanisms by young and mature leaves of Hevea brasiliensis (Para rubber tree). Sci Rep 6:33151
10. Bevilacqua CB, Basu S, Pereira A et al (2015) Analysis of stress-responsive gene expression in cultivated and weedy rice differing in cold stress tolerance. PLoS One 10:e0132100
11. Fu J, Miao Y, Shao L et al (2016) De novo transcriptome sequencing and gene expression profiling of Elymus nutans under cold stress. BMC Genomics 17:870
12. Nakashima K, Yamaguchi-Shinozaki K, Shinozaki K (2014) The transcriptional regulatory network in the drought response and its crosstalk in abiotic stress responses including drought, cold, and heat. Front Plant Sci 5:170
13. An D, Yang J, Zhang P (2012) Transcriptome profiling of low temperature-treated cassava apical shoots showed dynamic responses of tropical plant to cold stress. BMC Genomics 13:64
14. SRA Toolkit: https://trace.ncbi.nlm.nih.gov/Traces/sra/
15. FastQC: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
16. R: The R Project for Statistical Computing. https://www.r-project.org/
17. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359
18. Kim D, Pertea G, Trapnell C et al (2013) TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14:R36
19. Trapnell C, Roberts A, Goff L et al (2012) Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 7:562–578
20. Leng N, Dawson JA, Thomson JA et al (2013) EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics 29:1035–1043
21. Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11:R106
22. Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25
23. Love MI, Anders S, Kim V, Huber W (2015) RNA-Seq workflow: gene-level exploratory analysis and differential expression. F1000Res 4:1070
Chapter 2
Abstract
Microarray data have accumulated enormously over the past two decades. Because of their high throughput,
microarray techniques have transformed biological studies from the level of specific genes to that of the
transcriptome and have greatly advanced many fields of biological study. While microarrays offer great
advantages for expression profiling, they also pose many challenges for computational analysis. In this chapter, we
demonstrate how to perform standard analyses, including data preprocessing, quality assessment, differential
expression analysis, and general downstream analyses.
1 Introduction
Yejun Wang and Ming-an Sun (eds.), Transcriptome Data Analysis: Methods and Protocols, Methods in Molecular Biology,
vol. 1751, https://doi.org/10.1007/978-1-4939-7710-9_2, © Springer Science+Business Media, LLC 2018
Ming-an Sun et al.
2 Materials
2.1 Microarray Data
This protocol starts with Affymetrix microarray data in CEL format
(see Note 2). The CEL files store the calculated intensity
values. In addition to CEL files newly generated in the lab, a
huge number of published CEL files can be retrieved from several
public resources, in particular ArrayExpress (https://www.ebi.ac.uk/arrayexpress/)
and NCBI Gene Expression Omnibus (GEO;
https://www.ncbi.nlm.nih.gov/geo/). Note that ArrayExpress
is specific to microarray data, while GEO also contains other types
of omics data.
In this protocol, we use a public dataset (GEO accession:
GSE67964) generated on the Affymetrix Mouse Gene 2.0 ST Array (MoGene-2.0-ST)
for demonstration.
2.2 R Packages
This protocol involves a number of R packages, so basic knowledge
of R and Bioconductor is essential. The basics of R can
be learned from resources such as http://tryr.codeschool.com/. R
and Bioconductor can be installed by following the instructions at
http://www.bioconductor.org/install/. Below we briefly summarize
how to install and load R and Bioconductor packages
(see Note 3). The installation of each package used in
this protocol is described in the corresponding section.
2.3 Annotation Files
Two types of annotation files are required: (1) the probe set annotation,
which summarizes the location of all probes on the array as
well as the probes belonging to each probe set; (2) the gene annotation, which
maps the probe sets to their corresponding genes.
For most microarray platforms, R Bioconductor packages
providing the annotation information are ready for use (see Note 1).
For example, the two annotation packages for the MoGene-2.0-ST microarray
are pd.mogene.2.0.st [16] and mogene20sttranscriptcluster.db
[17], respectively. Since this protocol relies on many R Bioconductor
packages, these annotation packages can be incorporated into the
pipeline seamlessly.
3 Methods
3.1.3 Read Data into Memory
The Bioconductor package “oligo” offers a number of tools for
preprocessing Affymetrix CEL files, including data import, background
correction, normalization, data summarization, and visualization
[18]. In addition, you might need to install and load the
2. To get the list of all the CEL files in the directory, type:
3.1.4 Get Normalized Gene Expression
To summarize gene-level expression, the probe set annotation for
the specific array is required. Taking microarray data from the mogene.2.0.st
platform as an example, the Bioconductor package pd.mogene.2.0.st
[16] is needed.
1. To install and load the annotation library pd.mogene.2.0.st,
type:
biocLite("pd.mogene.2.0.st")
library(pd.mogene.2.0.st)
3. To save the expression data in a local file that may be used later
(to be noted, the expression values in the output are normal-
ized and log2 transformed), type:
write.exprs(eset,file="rma_norm_expr.txt")
3.1.5 Gene Annotation
Gene annotation is needed for further interpretation of the results.
Two Bioconductor packages are required: Biobase [15]
and mogene20sttranscriptcluster.db [17].
1. To install and load these two packages, type:
biocLite("Biobase")
biocLite("mogene20sttranscriptcluster.db")
library(Biobase)
library(mogene20sttranscriptcluster.db)
The best English always has a bloom upon it. The danger is that, as
vulgarisms increase on one side, proprieties will increase on the
other, and that conversation may begin to burden itself with a sense
of duty. To be correct is already to be mechanical. The defiance of
correctness, even by the vulgar, has in it something of the virtue and
virility, which, in the work of masters, we recognize as the genius of
the language. It is easy enough to avoid saying “like I do”; but it is
difficult to realize that living language overrides grammatical
distinctions and that the test of a phrase is not whether it has been
tabled at Oxford, but whether it has its share of soil and sun and
dew. Here the indolences of our language, its cautiousness, and
even its propensity to wallow in the mire, may have their saving
influence. They are all symptoms of the instinct to get appearances
on the honourable side, the instinct to appear less, not more, than
you are; they are the tacit acknowledgment of a standard of reality,
and count for ballast and steadiness.
Are there then no means of vitalizing our English speech? One
cannot put the question without seeing that it is unreal. “The answer
is in the negative”, as our officials say. Even education itself,
consciously applied, may defeat its object; for if people are to talk
English, they must talk as they wish to talk; they know that the
majority of their would-be masters talk the worse for talking as they
have been taught. As to the meanings of words, the temptation to
suppose that they can be decided from on high must specially be
resisted. We all have our contribution to make to the meaning of the
words we use, and the greatest words—faith, freedom, sport, spirit—
cannot mean more than we do. These cannot be standardized;
standardization, the name without the thought, is their death, simply.
The Trade Unionists of England are disposed to banish ‘competition’
from our dictionary; will nature vanish it from hers? ‘Religion’,
somewhere in America, is the belief that the world was created in six
days; if truth is a fundamentalist, well and good. Obviously there
must be standardization up to a point if people are to stick together,
and we must be prepared to swallow it in considerable doses now
that English is the language of two hemispheres. But the essential is
that the point should be a point of agreement. The kind of feeling, the
kind of habit, that can be imposed on a man are not worth imposing:
the Germans showed that. We, too, have our outbreaks of the
dragooning impulse: the word ‘Empire’ is a notorious rally, with
hyænas always hot upon its trail. But, on the whole, the tendency to
reduce experience to rule and its expression to a formula, the
tendency to regularize men’s minds and drill them into uniformity,
flatly opposed as it is to all our traditions, wins little success amongst
us. True, we have a certain uniformity of drabness (the livery of the
sparrow) which suggests an army inured to all the degradations of
drill and rebellious only against its smartness. But then, it is the
smartness that kills. Drill is machine-made uniformity, a necessary
evil of which the English hate to make a panache. Their uniformities
are morose, because they are uniformities of submission; their pride
goes out to the things they touch directly and can make their own.
This is the attitude to be cherished at all costs, because the future is
open to it, because it opens to the future. By Heaven’s grace, the
English have it deep ingrained. Thus the future of English presents
itself to the mind as depending, above all, on the survival, in its pre-
eminence, of the spirit of freedom, the more so because the scope of
freedom is determined by the capacity for discipline. The question of
the day is how much machinery a man can stand; and the hope for
English is that the average Englishman can stand so much.
Regulations are necessary everywhere. Language itself must have
its dictionary, grammar its rules. The English rob them of their sting
by toleration. Their order even when they speak is spontaneous and
has a taste of liberty.
That an Englishman should regard England as the life-centre of
the English language is, perhaps, inevitable; yet he is foolish if he
assumes her to be so. The life-centre of English is to be found where
the spirit of those who speak it is in closest accord with developing
realities, and these cannot reveal themselves to minds fixed in any
past, however vital that past may have been when it was present.
Are not, then, the Americans living a more contemporary life than we
are?—has not the focus of development passed over to them? This
is a question so searching that I can touch upon it only with the
greatest diffidence. At the conclusion of his first preface to Leaves of
Grass, Whitman, distinguished among great writers for the forward
view, congratulated himself and the Americans on the qualities of the
language they had inherited. “English”, he wrote, “is the chosen
tongue to express growth, faith, self-esteem, freedom, justice,
equality, friendliness, amplitude, prudence, decision, and courage.” It
is a noble list of virtues which no one would wish to disavow; and yet
the Englishman, of whatever station, would still prefer the briefer
catalogue of Chaucer’s knight, who, five hundred years ago,
loved chivalrye
Trouthe and honour, fredom and curteisye.
Transcriber’s Notes:
Punctuation and spelling inaccuracies were silently
corrected.
Archaic and variable spelling has been preserved.