
Proteomics in The Pharmaceutical and Biotechnology Industry A Look To The Next Decade


Expert Review of Proteomics

ISSN: (Print) (Online) Journal homepage: www.tandfonline.com/journals/ieru20

Proteomics in the pharmaceutical and biotechnology industry: a look to the next decade

Jennie R. Lill, William R. Mathews, Christopher M. Rose & Markus Schirle

To cite this article: Jennie R. Lill, William R. Mathews, Christopher M. Rose & Markus Schirle
(2021) Proteomics in the pharmaceutical and biotechnology industry: a look to the next
decade, Expert Review of Proteomics, 18:7, 503-526, DOI: 10.1080/14789450.2021.1962300

To link to this article: https://doi.org/10.1080/14789450.2021.1962300

© 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.

Published online: 12 Aug 2021.


Full Terms & Conditions of access and use can be found at https://www.tandfonline.com/action/journalInformation?journalCode=ieru20
EXPERT REVIEW OF PROTEOMICS
2021, VOL. 18, NO. 7, 503–526
https://doi.org/10.1080/14789450.2021.1962300

REVIEW

Proteomics in the pharmaceutical and biotechnology industry: a look to the next decade
Jennie R. Lill (a), William R. Mathews (b), Christopher M. Rose (a) and Markus Schirle (c)
(a) Department of Microchemistry, Lipidomics and Next Generation Sequencing, Genentech Inc. DNA Way, South San Francisco, CA, USA; (b) OMNI Department, Genentech Inc. 1 DNA Way, South San Francisco, CA, USA; (c) Chemical Biology and Therapeutics Department, Novartis Institutes for Biomedical Research, Cambridge, MA, USA

ABSTRACT
Introduction: Pioneering technologies such as proteomics have helped fuel the biotechnology and pharmaceutical industry with the discovery of novel targets and an intricate understanding of the activity of therapeutics and their various activities in vitro and in vivo. The field of proteomics is undergoing an inflection point, where new sensitive technologies are allowing intricate biological pathways to be better understood, and novel biochemical tools are pivoting us into a new era of chemical proteomics and biomarker discovery. In this review, we describe these areas of innovation, and discuss where the fields are headed in terms of fueling biotechnological and pharmacological research and discuss current gaps in the proteomic technology landscape.
Areas Covered: Single cell sequencing and single molecule sequencing. Chemoproteomics. Biological matrices and clinical samples including biomarkers. Computational tools including instrument control software, data analysis.
Expert Opinion: Proteomics will likely remain a key technology in the coming decade, but will have to evolve with respect to type and granularity of data, cost and throughput of data generation as well as integration with other technologies to fulfill its promise in drug discovery.

ARTICLE HISTORY
Received 26 April 2021
Accepted 27 July 2021

KEYWORDS
Biological mass spectrometry; biomarkers; chemoproteomics; Proteomics; multi-omics; single cell proteomics; sensitivity; single protein sequencing

1. Introduction
Proteomics has evolved to address increasingly complex biological questions, unravel new intracellular signaling pathways leading to new therapeutic targets and has helped decipher key pathway modulators and biomarkers [1]. In embarking on assembling this review, we dissected the literature and interviewed colleagues for where they see this field evolving and having an influence in biotechnology and pharmaceutical research. Although the responses were diverse, some common themes emerged which have been highlighted above. Figure 1 depicts the current and emerging future state of proteomics in the pharmaceutical and biotechnology industry. Here, we delve into the main technological themes and discuss their current limitations and future possibilities.

2. Sensitivity – advancements in single cell proteomics and its impact on advancing biomedical science
Imagine if one could dissect a metastatic tumor and be able to analyze the T cell epitope repertoire directly for the development of a personalized cancer immunotherapeutic program, rather than rely on a combination of genomic analyses and in silico prediction tools? Imagine if it were possible to analyze post-translational modification events directly from the subsets of immunological cells, or neuronal cells, pre- and post-response to a molecular perturbation? And imagine if one could rapidly analyze all of the proteoforms from just tens of cells from a xenograft model, or a few microliters of biofluid in a high throughput manner? These are all possible, but are far from being routine, and require pooling of samples or heroic efforts to produce meaningful reproducible data.
Sensitivity has long been the ‘Achilles’ heel’ in proteomics- and protein-based mass spectrometry. Unlike our genomic counterpart technologies, proteomics is not blessed with tools such as the polymerase chain reaction (PCR) to amplify low level biomaterial; instead, researchers must rely on advances in technologies to detect low level protein and peptide signals. Recent advances in single-cell proteomics and single protein molecule sequencing have the potential to revolutionize biomedical research by enabling accurate characterization and quantitation of translational and post translational events on cellular samples from challenging sources, for example, from rarer cell types as well as from low quantity clinical materials.
For many years, technologies such as microscopy have allowed dissection of biological events at a cellular level,

CONTACT Jennie R. Lill lill.jennie@gene.com Department of Microchemistry, Lipidomics and Next Generation Sequencing, Genentech Inc. DNA Way, South
San Francisco 94080, CA, USA

Jennie R Lill: Executive Director and Senior Fellow Scientist,


William R Mathews: Director, Scientist.
© 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/),
which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.

Article highlights
This article highlights the current status of the proteomics field, and how it supports drug discovery and development. It also discusses current limitations, and areas of rapid growth in the field in addition to new technologies and approaches on the horizon that have the potential to be highly impactful on how proteomics shapes the next set of drug targets, therapeutic modalities, biomarkers, diagnostics and clinical endpoints, assays and diagnostics associated with the biotherapeutic and small molecule drug research.

however it is only in recent years that genomic sequencing techniques have also advanced to routinely allow analysis of cell-specific mediated events rather than an averaged overview of cellular cluster or tissue-level activities. The next generation sequencing (NGS) field has recently bloomed, encompassing a variety of tools to decipher the content of the mRNA, nucleic DNA and epigenetic events associated with single cells and is now considered an essential technology for unraveling biological mechanisms [2–4].
However, given mounting evidence that transcript abundance does not always correlate with translational and post translational events [5,6], increasing our abilities to detect increasingly lower levels of protein and peptides is imperative if proteomics is to be of maximum utility to biomedical and clinical research and we are to be able to capture a true snapshot of the translational events governing cellular regulation.
Recent advances in sample collection and processing, separation chemistries, ionization and mass spectrometric instrumentation as well as data collection and curation techniques now make it possible to quantify > 1000 proteins from a single cell. In the first section of this review, we investigate the various techniques recently highlighted from the literature for the optimization of each of these parameters; these subcategories of proteomic technologies are illustrated in Figure 2.

2.1. Sample collection, preparation, and separation
A proteomic analysis is only as sensitive and successful as its input material, and from then on, the cumulative gains or losses that are incurred in the analytical journey of a sample. This begins with how a sample is collected in the laboratory or the clinic, how it is then prepared, derivatized and separated, to how it is analyzed both biophysically as well as via data analytics. Reducing sample preparation time and the number of adherent surfaces that come into contact with the sample can both contribute to more sensitive analyses. Over the past decade the field of proteomics has witnessed the emergence of various tools for more efficient introduction of low level materials into the auto sampler or mass spectrometer. ‘NanoPOTS’ (nanodroplet processing in one pot for trace samples) is one such platform recently described to enable small cell number proteomics analysis [7] using a device where proteomic sample preparation (reduction, alkylation, proteolytic digestion) can be performed at a miniaturized scale in a humidified chamber, thus minimizing
a miniaturized scale in a humidified chamber, thus minimizing

Figure 1. Current and future state of proteomics in the pharmaceutical industry. Proteomics plays a critical role in drug discovery and development. At present,
proteomics is used pre-clinically for target identification and characterization, drug candidate selection and characterization, and clinically for biomarker discovery
and development. These are often independent, standalone efforts; for example, proteomics may be used to identify disease specific proteins from clinical samples
and those proteins subsequently used as diagnostic biomarkers. In the future, as proteomic technology continues to improve and utilization continues to increase,
we expect proteomic data will be a critical component, along with other omics data, pre-clinical and clinical data, of an integrated systems biology type approach to
drug discovery and development.

Figure 2. Advances in proteomics technologies that will impact therapeutic development in the coming years. This includes more sensitive sample preparation on
more diverse cellular types and biological fluids, data collection, and analysis.

sample evaporation. Sample manipulation in volumes < 200 nL minimizes sample loss by reducing exposure to potentially adherent surfaces. NanoPOTS glass chips are composed of photolithographically patterned hydrophilic pedestals surrounded by hydrophobic surfaces to serve as nanodroplet reaction vessels. The chip consists of a glass spacer, sealed to a membrane-coated glass slide to minimize evaporation of the nanowell contents during the various incubation steps. The glass substrate facilitates microscopic imaging of samples and minimizes protein and peptide adsorption relative to many other materials due to its hydrophilicity and reduced surface charge at low pH. By combining nanoPOTS with high sensitivity tandem mass spectrometry (MS/MS), Zhu et al. identified ∼1500 to ∼3000 proteins from ∼10 to ∼140 cells, respectively [7]. By also incorporating the ‘Match Between Runs’ (MBR) algorithm [8], > 3000 proteins were consistently identified from as few as 10 cells. MBR matches the MS/MS spectra from one run with the intact parent ion from another run. By extrapolating the MS/MS identification from one run and using it to detect/quantify a peptide in another mass spectrometric run with the same parameters file and chromatographic profile, one can increase the number of quantified species without having to trigger MS/MS in each run. Using this trifecta of technologies, ∼2400 proteins were quantified from single human pancreatic islet thin sections from type 1 diabetic patients and control donors, demonstrating the utility of nanoPOTS for spatially resolved proteome measurements from clinical material.
The same team went on to demonstrate that combining microfluidic nanodroplet technology with tandem mass tag (TMT) isobaric labeling could significantly improve analysis throughput and proteome coverage for low levels of mammalian cells. Dou et al. demonstrated that this combination of analytical approaches allowed single cell-sized protein quantities to be analyzed to a depth of ∼1600 identified proteins with a median CV of 10.9% and correlation coefficient of 0.98 [9].
An alternative method for sample clean-up and its introduction to the ionization source was described by Brunner et al. Here, they used an Evotip for sample clean-up and eluted peptides directly for separation and mass spectrometric analysis in one integrated procedure. This approach reduces analytical time and avoids the losses associated with drying down and reconstituting low-level samples [5].
There are a variety of other methods published on minimizing exposure of low level materials to adherent materials, including oil-air droplet (OAD) [10] or the integrated proteome analysis device (iPAD) [11], and on reducing the number of manipulations the sample encounters prior to analysis. This is a very important parameter to optimize if one wishes to obtain the levels of sensitivity routinely appreciated by our genomic counterparts. With the transformation of material sciences in the next decade, new matrices and substances with more attractive biophysical properties to reduce sample adherence and increase recovery of low level peptides for proteomic analyses are likely to emerge.

2.2. Evolution of mass spectrometric instrumentation and data analyses for increasing the levels of sensitivity in a proteomic analysis
The majority of workhorse mass spectrometric instrumentation currently claims an average limit of detection (LOD) of approximately 10 amol or 6 million 50-kDa protein molecules,

which is an improvement of orders of magnitude over where the field was just a decade ago, but still poses challenges for analyzing very low level biological materials. Signal to noise ratio (S:N) correlates directly with sensitivity, which in turn impacts dynamic range, the metric of the signal available for detecting peptides or proteins from a complex mixture. For a detector such as the Orbitrap, the typical dynamic range is about 5–6 orders of magnitude, whereas peptide and protein concentrations can span > 10 orders of magnitude in a given biological matrix, for example a tumor sample, cell line or biological fluid such as plasma. Whole blood and serum are biological matrices in which the dramatic range of protein concentrations poses analytical challenges: antibodies may be present at concentrations as high as 1 mg/mL, whereas immunomodulatory proteins such as cytokines or chemokines, popular targets for biological exploration or biomarkers of disease etiology, are often present at a mere ng/mL concentration [12]. This vast difference in relative abundance can make the analysis of lower level moieties extremely challenging.
Factors governing the sensitivity of a mass spectrometric analysis include ionization efficiency, ion transfer efficiency into the vacuum system, and how ions are utilized/analyzed in the instrument [13]. Various mass spectrometric techniques have been employed to analyze increasingly less abundant proteins from a complex proteome. The analysis of individual proteins or sets of proteins is reviewed in section 6.2, but here we review the techniques available for global proteomic profiling. The mass spectrometric approaches being utilized to achieve low level analyses can be generalized into two approaches: a label-free approach, and a chemically tagged labeling technique, where reagents such as TMTs are employed for multiplexing samples and collectively amplifying signals from pooled analytes.
A potential benefit of the label-free approach is that there is less sample manipulation, a key parameter for ultra-sensitive analyses. In addition, reduced sample complexity is often correlated with easier data analysis (no chemical tag modification to add to the search parameters, no deconvolution of data needed as is required with a multiplexing approach). However, label-free methods will be challenged by limited throughput, particularly for single-cell experiments where thousands of measurements are required. For multiplexed samples that utilize TMT, the TMT reporter ions are known to potentially suffer from ratio compression [14] which can lead to false negative quantitative results. However, multiplexed approaches analyze many samples at once and provide an avenue to analyzing thousands of samples within a reasonable timeframe. Although there are caveats and advantages to both techniques, each has shown merit in catapulting us closer as a proteomics community to single cell analyses.

2.2.1. Label-free detection of low level proteomes
One of the most sensitive studies to date was described by Brunner et al. [5] whereby a label-free approach was described for high sensitivity global proteomics. In addition to using the Evotip described above, they also employed a trapped ion mobility spectrometry-time of flight (TIMS-TOF) mass spectrometer, which is a time of flight mass spectrometer coupled to an ion mobility analytical unit. Ion mobility spectrometry (IMS) allows for separation of ions in the gas phase based on their mobility in a carrier buffer gas, and IMS prior to mass spectrometric analysis separates the noise (singly charged, often non-peptidic species) from peptides (typically higher charged species). The TIMS-TOF increased sensitivity through a number of analytical modifications that are not yet commercially available, including mechanisms for more efficient trapping of the peptide ions in the instrument. The instrument was run using parallel accumulation-serial fragmentation (PASEF), a mass spectrometric acquisition protocol whereby peptide ions are released from the IMS in the vacuum system in concentrated packages, leading to a tenfold increase in peak capacity. Precursor ions were fragmented in either data dependent acquisition PASEF (ddaPASEF) or data independent acquisition PASEF (diaPASEF) mode and Brunner et al. employed the MBR algorithm (as previously described) to improve the number of proteins identified [5]. Their analyses showed that single cell analyses could define a stable core proteome, a proteome subset in the MS-based proteomics data composed of the top 150 proteins with the lowest CVs of the proteins shared between at least 70% of the more than 420 single-cell measurements in their study, including drug perturbation analyses. A dilution series determined the limits of proteome detection, and the linear signal response throughout the dilution series was highly reproducible between replicates. The samples were prepared in 384 well plates, with cells sorted into 1 µL of buffer; cells were lysed using a freeze-thaw approach with sonication followed by proteolytic digestion. In this study, they identified proteins predicted to be associated with the G2/M phase of the cell cycle and could characterize differentially expressed proteins in G2/M, G1 and S previously reported in the literature.
Earlier, we noted the now general observation that the transcriptome does not always correlate with translated products, and this was also observed by Brunner et al. in their comparison of single cell proteomics to scRNASeq [5], again highlighting the importance of developing this field.

2.2.2. SCoPE-MS and the utilization of chemical labeling to detect low level proteomes
In parallel to label-free detection methods for low level proteomic analysis, a method called Single Cell ProtEomics by Mass Spectrometry (SCoPE-MS) has gained significant momentum. With SCoPE-MS, quantitative chemical labels (e.g., TMTs) are utilized to provide an additive signal from a carrier proteome to boost qualitative and quantitative signals from an experimental sample [15]. The main feature of SCoPE-MS is a carrier proteome that is typically spiked into a multiplexed single cell biological sample of interest at ∼25–500-fold excess of the single cell proteomes, enabling detection of peptides in a survey scan and subsequent selection for identification and quantification. Reporter ions which are revealed during the MS2 or MS3 scans allow quantitation of both the carrier proteome and the low level experimental samples in parallel.

Similar to SCoPE-MS, Tsai et al. developed a technique, ‘boosting to amplify signal with isobaric labeling’ (iBASIL), to quantify phosphorylation in a small number of cells, for highly effective analysis of proteins in single cells. By optimization of several mass spectrometric instrument parameters including MS automatic gain control (AGC) and ion injection time settings in MS/MS analysis (e.g., 5E5 and 300 ms, respectively, which are significantly higher than those used in typical bulk analysis), further improvements in sensitivity were observed. By coupling these instrument setting advancements with nanoPOTS, iBASIL enabled identification of ∼2500 proteins and precise quantification of ∼1500 proteins in the analysis of 104 FACS-isolated single cells [16].
SCoPE-MS is a powerful technique, but it comes with several caveats that have to be taken into consideration before interpreting results. Cheung and colleagues [17] dissected this approach and demonstrated that the accuracy of SCoPE-MS is dependent on the amount of carrier proteome that is employed as well as the mass spectrometric parameters used during data analysis. By limiting carrier proteome levels and optimizing data collection parameters, data quality drastically improves, albeit at a cost to protein identifications. Using these principles, it is clear that early SCoPE-MS data suffered from quantitative ‘noise’ and inaccuracies (CV > 40%), but more recent data such as the iBASIL study (above) appear to be of much higher quality. While high carrier proteome levels can be overcome by optimizing data collection, a recent study by Stopfer et al. suggests that including a carrier proteome decreases the dynamic range of quantification [18]. By focusing on low-level phospho-tyrosine and immunopeptidomic samples, they demonstrate that quantitative dynamic range decreases 2 to 6-fold when a carrier proteome is employed. These data demonstrate that the true impact of a carrier proteome and its utility in analyzing low level and single cell samples is still being understood. As mass spectrometric sensitivity and multiplexing capabilities increase it is possible that carrier proteomes will become obsolete, but until that time researchers should proceed with caution as they collect and interpret data from methods that rely on carrier proteomes to enable deep proteome quantification.

2.3. Development of intelligent data acquisition methods for mass spectrometry analysis
As the sensitivity of mass spectrometers continues to improve, intelligent data acquisition (IDA) enabled by real-time analysis of MS data has enabled more sophisticated data collection methods as well as increased the efficiency and depth of proteomic analyses. Early versions of mass spectrometers ran on rudimentary embedded computers utilizing custom code bases developed specifically for the mass spectrometer control. More recently, mass spectrometers have utilized modern programming languages such as Python and Lua, which enable more sophisticated method construction and execution.
This is exemplified by a recently introduced method called triggered by offset, multiplexed, accurate mass, high resolution, and absolute quantitation (TOMAHAQ) which combines isobaric labels and synthetic peptides to enable sample and peptide multiplexing within a sensitive targeted assay [19]. TOMAHAQ comprises a complex MS scan sequence including a peptide sequencing scan where a synthetic peptide identification triggers an offset analysis on the endogenous target peptide. This peptide is also sequenced, and the corresponding fragment ions are isolated for a final quantitative analysis. This method was initially implemented within the native instrument code, but later adapted to utilize the flexible vendor method file format [20]. While TOMAHAQ is currently limited to just ∼100 peptides per analysis, future improvements to the structure of vendor methods promise to allow techniques such as TOMAHAQ to analyze thousands of peptides per MS analysis. Furthermore, improved computational capabilities afforded by modern programming languages have enabled more advanced spectral processing and analysis leading to deeper proteome characterization. For example, recent improvements in MS instrumentation led to the number of peptides that could be sequenced per unit time outpacing the available candidates for sequencing, leading to lost instrument time [21]. By improving the algorithm that determined which peaks within an MS spectrum are candidates for sequencing, instrument analysis time was optimized and the depth of proteomic analysis was substantially improved [22].
In addition to complex methods implemented through vendor software, IDA has been extended by ‘third-party’ applications that utilize an instrument application-program interface (iAPI) to capture MS data in real time and instruct the mass spectrometer to perform a defined analysis. The most advanced algorithms will map a peptide sequence to the MS data in real time, enabling sequence specific tasks to be performed. This paradigm was first introduced in dual publications that described a real-time implementation of the MaxQuant algorithm [23] and the development of a novel peptide sequencing approach, inSeq [24]. For both applications, the identification of peptide sequences enabled triggering of additional scans to improve stable isotope labeling by amino acids in cell culture (SILAC) quantitation through dedicated selected ion monitoring (SIM) scans, improve isobaric labeling quantitation through additional quantitative scans, or localize post-translational modifications (PTMs) by changing the fragmentation parameters. In recent years, the complexity of the iAPI and the performance of desktop computers attached to mass spectrometers have dramatically improved, enabling more complex algorithms to be performed on the millisecond timescale required for MS analyses. For example, this has enabled a more complete implementation of the MaxQuant ecosystem through MaxQuantLive [25]. The area most impacted by these improvements to computational power has been multiplexed global proteome quantification.
Multiplexing technologies have increased the number of proteomes that can be analyzed in one experiment and have dramatically improved our ability to assay various genotypes, treatments, or time points in one discovery proteomics experiment. As described above, isobaric label-based multiplexing approaches are challenged by ratio compression caused by multiple peptides being isolated simultaneously during fragmentation [14]. This effect can be alleviated by a gas phase purification technique called SPS-MS3 that utilizes dedicated

sequencing and quantitative scans for each candidate peptide [26,27]. However, due to each candidate peptide being analyzed twice, this approach decreases instrument duty cycle and ultimately proteomic depth. Recently, multiple IDA approaches have addressed this limitation by performing a real time database search (RTS) and only performing the slower, more accurate quantitative scans when a peptide is confidently identified [28,29]. Implementation of this approach improves data accuracy and allows for similar proteomic depth to be achieved in half of the analysis time [29]. This is due to a greater fraction of the available instrument duty cycle being used collecting data related to peptides that are identified in post-run data analysis pipelines. However, current iterations of RTS are still limited in the size of the database that can be interrogated within the limited time available between peptide sequencing scans (∼20 ms). In the coming years, as RTS algorithms become more efficient it will be possible to search databases that consider multiple post-translational modifications or nonspecific cleavage events. These developments benefit the quantification of therapeutically relevant peptide modifications such as covalent inhibitor screening or traditionally difficult to identify MHC-associated peptides.

2.4. The emergence of next generation single molecule peptide and protein sequencing
As mass spectrometry based proteomic technologies continue toward enabling single cell sensitivity, the era of next generation peptide and protein sequencing is imminent. Single molecule protein detection is currently possible through DNA-linked antibodies [30] or fluorescently-labeled protein specific aptamers [31]. While powerful techniques, these technologies require validated tool molecules that are selective for the protein of interest and have the potential to produce a false negative signal if the binding epitope on the target protein is not accessible due to post-translational modification. These challenges have driven the current race to introduce platforms for unbiased single molecule peptide and protein sequencing.
Marcotte and colleagues introduced an example of this paradigm by elegantly combining legacy protein sequencing techniques with single molecule fluorescence detection [32]. Here, proteins are digested into peptides with trypsin and subsequently digested with an enzyme that cleaves after specific amino acids (e.g., GluC which cleaves on the C-terminal side of glutamate). Fluorescent labels are added to specific amino acid side chains (e.g., lysine or cysteine) before peptides are affixed to a microscope slide. Through successive rounds of single molecule fluorescence detection and Edman degradation, peptides are monitored to detect losses in fluorescence which indicate a labeled amino acid has been cleaved. Combining the pattern of fluorescence loss and the known enzyme specificity, the peptide sequence can be determined [33]. This approach is attractive because it has the potential to sequence peptides in an unbiased manner and could potentially be used to specifically sequence post-translationally modified peptides. One compelling application of this tech-

spectrometer typically requires an amount of tumor tissue not available within the course of treatment. Single molecule sequencing could enable the direct detection of therapeutically relevant epitopes for inclusion in personalized cancer vaccines or engineered T cell therapies.
Another approach adopts nanopore technology to enable the electrical detection of specific amino acids as a protein is passed through the pore. A number of different techniques have been implemented to feed the protein through the pore including attachment of a DNA tag [34], utilization of an unfoldase [35], or the use of adhering negative ionic detergents [36]. Discriminating the 20 proteinogenic amino acids remains a challenge for nanopore sequencing, due to the fact that amino acids are smaller than a monophosphate nucleotide and thus produce a smaller electrical current blockade [37]. Recently, Ouldali et al. described an approach that links each amino acid to a cationic carrier of seven arginine amino acids and passes this new polypeptide through an aerolysin nanopore for sequencing [38]. The arginine carrier ensured the peptide spent a sufficient amount of time in the pore and enabled sequencing of 13 of 20 proteinogenic amino acids. The authors go on to demonstrate that chemical modification of the amino acids can lead to the detection of the remaining 7 amino acids within this system. Challenges remain for nanopore sequencing, particularly the discrimination of post-translationally modified amino acid residues. Nanopores have demonstrated the ability to discriminate phosphorylated from non-phosphorylated proteins [39], but the challenge of discriminating all possible amino acid side chain modifications (e.g., acetylation, methylation, glycosylation) still remains unmet.
To date, examples from literature are limited, but it is clear that next generation proteomics approaches have been quietly growing behind the scenes [40]. Just recently, stealth-mode startups Nautilus Biotechnology and Quantum-Si have emerged with the stated goals of developing commercial next generation proteomics platforms. While the technologies underlying these platforms have yet to be revealed, it is clear that the coming years will unveil the possibilities of non-mass spectrometry based unbiased and untargeted single molecule sequencing proteomics approaches.

3. The promise of systems biology and multi-omics approaches
In the past decade we have seen advances in various omics techniques including genomics, transcriptomics, proteomics, and metabolomics. An emerging systems biology approach attempts to gain a holistic sense of an organism, cell or biological pathway by analyzing these data sets together to form a comprehensive molecular understanding of a given biological pathway. This is no easy task, as each of these data sets is produced under various biophysical conditions, with nuances to data analysis let alone data integration. Many of these biomolecules are linked in disparate ways, not directly
nology is the direct sequencing of cancer neoantigen epitopes relating to our organized view that is the central dogma for
presented on the surface of tumor cells. These molecules exist these fields. Metabolites and short chain fatty acids for exam­
at low copy numbers per cell and direct detection by mass ple, represent the downstream products of multiple
interactions between various genes, transcripts, and proteins. Identifying metabolites alone does not give one the whole story about how a cell is signaling, what it is interacting with or what cellular state it is in, but it can offer important clues. For some analyses that are routinely performed there is still some ‘guess work’ involved, or at least incorporation of algorithms that make assumptions about the data that is being used as a database or to interpret downstream analyses. One example of this is the recent exploration of ‘dark matter’ material in our genome, or the genome/proteome of an individual that does not conform to the traditional paradigm of proteins being produced due to canonical translation events.

Proteogenomics, which utilizes a combination of proteomics, genomics, and transcriptomics to aid in the discovery and identification of peptides, proteins and pathways, emerged a number of years ago [41]. With rapid advancements in the RNA sequencing field, proteogenomics has been shown to be a powerful tool allowing the generation of customized protein sequence databases using genomic and transcriptomic information. This has allowed easier identification of point mutations, splice variants and other peptides that are not typically represented in reference protein sequence databases. Although it is still not a common practice in most laboratories, proteogenomic analysis has allowed certain biological questions to be answered that would be very time consuming using de novo sequencing or wild card searching approaches. For example, in the rapidly growing field of cancer immunotherapy, where neo-antigens are often the targets for various modalities, the identification of the tumor specific point mutations that occur due to the inherent genetic instability of a malignancy is often required. These point mutations can be easily identified using RNA-sequencing and Exome-seq, and translating these into a protein based FASTA file allows easy peptide characterization [42].

In addition to developing fit-for-purpose proteome databases through RNA- or Exome-sequencing, ribosome profiling (Ribo-seq) has been growing in popularity as a method to understand the translatome of a biological system. Unlike RNA-seq or Exome-seq, Ribo-seq reveals the portions of the genome that are actively being translated, as evidenced by the presence of ribosomes on an RNA molecule. These data can be used alone as evidence of a protein product existing within a cell and in some cases correlate better with protein abundance than RNA-seq [43]. However, Ribo-seq results are more powerful when combined with proteomic analysis that detects the protein product of the translation event. This is particularly true for non-canonical translation events that cannot be predicted from genome sequence alone.

Similar to the proteogenomics approach described above, Ribo-seq data can be used to create a proteome database, including small open reading frames (smORFs), that is used when searching mass spectrometry data. Weissman and colleagues used this approach to identify 3,455 ORFs distinct from annotated coding sequences [44]. Interestingly, only 36 peptides from these distinct ORFs were observed, suggesting that the protein products are not stable and are degraded quickly. This is supported by the fact that the authors found 240 HLA-I associated peptides from these distinct ORFs. If the protein products of these translation events are rapidly turned over, it stands to reason that they would be more readily presented on the cell surface. Building on this finding, studies from Ruiz Cuevas et al. [45] and Ouspenskaia et al. [46] combined RNA-seq, Ribo-seq, and MHC-associated peptidomics to demonstrate that non-canonical proteins are enriched in the immunopeptidome. These therapeutically relevant ‘dark matter’ antigens are of interest for both cancer vaccine and T cell therapy approaches, where common, tumor specific antigens represent ideal targets. These examples demonstrate that while genome annotations generally present an accurate view of what is transcribed and translated, there are specific transcription and translation events that may occur in a disease-specific manner. Due to this, technologies that comprehensively capture the proteome will be important in defining biological systems at the core of drug discovery efforts.

The collection of large scale genomic, proteomic, and lipidomic datasets offers the opportunity to combine these data modalities and build functional networks important in the severity or progression of disease. Performing comparisons of multi-omics data is not trivial and requires a deep understanding of the complexity and caveats of each -omic approach. While multi-omic integration is still evolving, examples of disease-relevant studies are starting to emerge. For example, Overmyer et al. recently demonstrated that combining proteomic, metabolomic, and lipidomic measurements in plasma with transcriptomic analysis of leukocytes revealed 219 biomolecules strongly associated with COVID-19 status and severity [47]. Combining the multiple omics results produced clusters enriched in severe COVID-19 cases, such as a cluster that included the protein gelsolin (GSN) and the metabolite citrate. This association makes functional sense because GSN is a Ca2+-activated actin-severing protein and citrate is a calcium chelator. This example highlights that while multi-omics clustering and analysis is possible, an understanding of the biological roles of biomolecules is important to reveal the importance of enriched clusters. Lastly, the authors used machine learning approaches to build a model that would predict COVID-19 outcome as severe or less severe. This analysis allowed the authors to find additional metabolites associated with COVID-19 severity, kynurenine and quinolinic acid, both of which have roles in immune function and inflammation.

The community is also witnessing the emergence of in depth multi-omic datasets such as the UK Biobank, where extensive data on approximately 500,000 participants has been generated, including genetic data (SNP array, WES already performed and WGS planned) with linked clinical data and full body MRI scans. In addition, NMR metabolomics and Olink data were generated for participants recruited for a long term study that now spans more than a decade. These multi-omics datasets not only can provide insights into differential protein or metabolite expression associated with disease phenotypes and lab measures, but they can also be used to look for protein quantitative trait loci (pQTLs), which help interpret genetic associations. Overlaying these genome-wide multi-omics datasets can reveal novel networks [48].
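
The proteogenomic step described in this section, translating sequence variants into a protein FASTA file for database searching, can be illustrated with a minimal sketch. It assumes the point mutations have already been called (for example from RNA-seq or exome data) and mapped to protein coordinates. The names `PointMutation`, `tryptic_peptides` and `variant_fasta`, the simplified trypsin rule (cleave after K or R, but not before P, no missed cleavages) and the FASTA header layout are illustrative assumptions, not the method or format of any cited study or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointMutation:
    """Hypothetical record for one amino acid substitution."""
    position: int  # 1-based residue position in the reference protein
    ref: str       # reference amino acid (one-letter code)
    alt: str       # variant amino acid (one-letter code)


def apply_mutation(seq: str, mut: PointMutation) -> str:
    """Return the protein sequence carrying the substitution."""
    idx = mut.position - 1
    if not 0 <= idx < len(seq) or seq[idx] != mut.ref:
        raise ValueError(f"{mut.ref}{mut.position} does not match the reference")
    return seq[:idx] + mut.alt + seq[idx + 1:]


def tryptic_peptides(seq: str) -> list[tuple[int, str]]:
    """Simplified in silico trypsin digest.

    Cleaves C-terminal to K or R unless the next residue is P and
    returns (0-based start, peptide) pairs without missed cleavages.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(seq):
        is_last = i == len(seq) - 1
        if is_last or (aa in "KR" and seq[i + 1] != "P"):
            peptides.append((start, seq[start:i + 1]))
            start = i + 1
    return peptides


def variant_fasta(protein_id: str, seq: str,
                  mutations: list[PointMutation], min_len: int = 7) -> str:
    """Build FASTA entries for variant peptides absent from the reference digest.

    Only the peptide that spans the mutated residue is reported.
    The header layout is an illustrative convention, not a standard.
    """
    reference = {pep for _, pep in tryptic_peptides(seq)}
    entries = []
    for mut in mutations:
        mutant = apply_mutation(seq, mut)
        for start, pep in tryptic_peptides(mutant):
            spans_site = start <= mut.position - 1 < start + len(pep)
            if spans_site and len(pep) >= min_len and pep not in reference:
                header = f">{protein_id}|{mut.ref}{mut.position}{mut.alt}|pos{start + 1}"
                entries.append(f"{header}\n{pep}")
    return "\n".join(entries)


if __name__ == "__main__":
    toy_protein = "MAGTKLSDEVGAPLRWQK"  # made-up sequence for illustration
    print(variant_fasta("P1", toy_protein, [PointMutation(11, "G", "V")]))
```

In practice such variant entries would be appended to a reference proteome before searching, and a real pipeline would also need to handle missed cleavages, splice variants and frameshifts, which this sketch deliberately leaves out.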

With the growing application of machine learning techniques, it is likely that utilizing multi-omic data to build predictive models of disease state or progression will become more common. The coming years will define how applicable this approach is within a drug development or clinical setting, but studies such as the one described here are examples of how this approach could relate to important disease models.

4. Applications of machine learning to peptide sequencing and characterization

Like many scientific fields, proteomics is currently undergoing a machine learning revolution. Machine learning algorithms such as linear discriminant analysis (LDA) [49] or support vector machines (SVM) [50,51] have traditionally been used to separate true from false peptide identifications, but recently deep learning approaches (e.g., neural networks) have emerged as useful proteomic tools.

One exciting application is the prediction of peptide fragmentation spectra comprising the m/z and intensity values of each peak. Currently, search algorithms score peptide spectral matches by calculating the m/z value of predetermined fragment ion series (e.g., b- or y-type ions) and matching those to peaks within a spectrum. Generally, search algorithms have ignored the intensity component of matching peaks because there were no clear rules governing the relative intensities of ions upon peptide fragmentation. One of the first MS spectrum prediction algorithms, MS2PIP [52,53], demonstrated that spectral prediction was a possibility. Recently, two deep learning algorithms, Prosit [54] and DeepMass:Prism [55], have demonstrated remarkable accuracy in predicting MS spectra given the peptide sequence, modifications, and fragmentation mode. Currently, this predicted spectrum can be compared to an experimental one and the resulting match score can be used to help discriminate true from false identifications, increasing identifications by as much as 30–50% for searches that utilize a large database (e.g., MHC-associated peptide searches). Spectral prediction has also been used to facilitate DIA experiments without the need to first collect deep proteomics data and build sample-specific spectral libraries. Two separate approaches, DIA-NN [56] and DeepDIA [57], create spectral libraries by predicting fragmentation spectra based on large amounts of training data. Here, spectral libraries are created in silico and used to identify and quantify peptides from DIA spectra that may contain fragments from many peptides. These approaches ultimately produce data of similar or better quality without the upfront costs of acquiring an experimental spectral library and facilitate the utilization of DIA for a wide range of applications. Lastly, in addition to predicting peptide fragmentation, deep learning can also be used to predict other peptide characteristics such as retention time [54] or collisional cross section [58].

In the near future, it is possible that a sufficiently sophisticated deep learning algorithm could consider an MS spectrum, retention time, and collisional cross section to predict a peptide sequence without performing a database search. This truly de novo peptide sequencing approach could enable identification of therapeutically relevant targets that are currently not included in a database search, including single nucleotide variants, rarer post-translational modifications, or biologically relevant protease cleavage events.

5. Availability of proteomics data

The availability of large-scale genetic and transcriptomic data has fueled our understanding of the prevalence of common cancer mutations. This is particularly important as new therapeutic modalities, such as cellular therapies, aim to target proteins upregulated in tumor tissue (tumor associated antigens, TAA) or the mutated cancer proteins. For example, when determining if a protein is a TAA, a common practice is to use data within The Cancer Genome Atlas (TCGA), which has both tumor and normal tissue expression data. An alternative strategy is to analyze candidate transcript expression within databases specialized in normal tissue expression, such as the Genotype-Tissue Expression (GTEx) project. While these resources have proven invaluable to early target identification, as targets get closer to clinical trials protein expression must be validated to limit potential toxic effects of therapeutic intervention.

Unlike genomic sequencing or transcriptome expression data, proteomic data have lacked a well-defined central public repository that could be easily queried. While it is a common practice to deposit raw MS data such that it can be accessed and re-analyzed, the vast diversity of proteomics data collection (e.g., DDA, DIA, targeted) and data analysis (e.g., MaxQuant, Proteome Discoverer, PEAKS, in-house approaches) techniques can make it difficult to quickly determine if a protein was detected and, if so, how much was present. Such databases would prove invaluable for late-stage therapeutic development, where protein expression can often determine the risk of off-target toxicity.

Recently, a number of groups have begun to work toward building such repositories. One such example is GTEx, which recently published a proteomic analysis of 32 normal human tissues [59] and has made the data publicly available. As described above, normal tissue expression is important for understanding the safety of emerging therapies such as cellular therapies targeting TAAs. The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has been collecting proteomics data on tumor and normal adjacent tissue (NAT) for many years [60,61] and recently an application programming interface (API) was released to facilitate programmatic access to the data [62]. Another large scale protein sequencing project associated with the Cancer Cell Line Encyclopedia (CCLE) surveyed 375 cancer cell lines at an average depth of 8,500 proteins [63]. While large, standardized studies offer the best opportunity to collect data that can be directly compared, there is currently an effort to make the numerous, bespoke quantitative proteomic analyses more amenable to re-analysis by non-experts. At the forefront of this movement is MassIVE.quant, a community resource of quantitative mass spectrometry-based proteomics datasets [64]. Building upon the MassIVE Knowledge Base [65], MassIVE.quant includes experimental design information commonly lacking in public data repositories. Due to the various quantitative technologies available (e.g. label-
free quantitation, DIA, isobaric labeling, SILAC, etc.) capturing the experimental metadata associated with a mass spectrometry experiment is vital to re-analysis. Due to this, MassIVE.quant represents an opportunity for biological findings to be more readily discovered in previously acquired, publicly available data.

With the development of more sophisticated therapeutic programs and advanced computational methods, the importance of readily available protein abundance data will continue to increase. For example, emerging engineered T cell therapies target tumor-associated antigens that have increased protein levels in cancer tissue as compared to normal tissue [66,67]. As the number of TAAs targeted for therapeutic intervention increases, the chance that therapies identify low levels of TAA expression in normal tissues increases. Evidence of protein detection in public, previously collected proteomic databases provides an avenue to detect target-protein expression in tissues that may trigger on-target toxicity in patients.

6. Applications of proteomics in drug discovery and development

The development of novel drugs is time consuming, expensive, challenging and risky. The cost of bringing a new drug to market has increased significantly over the last several decades and is now estimated to be between USD 1 billion and USD 2.8 billion [68,69]. One of the major factors driving the cost of drug development is the high cost of failure, in particular failures in clinical development. The probability of a successful launch for drug candidates entering Phase 1 clinical trials is approximately 10% [70]. Analyses of the root cause of drug development failures have consistently found that efficacy and safety are the major contributors to the low success rate in clinical trials [71]. In addition to on- and off-target toxicity, disease heterogeneity and interpatient variability contribute to the challenge of bringing safe, effective new medicines to address unmet medical needs. In addition to better understanding the full target spectrum of a drug early on in the development process, the identification of better biomarkers and personalized medicine approaches are seen as critical areas where proteomics can play a significant role to enable the successful development and use of novel therapeutics. In addition, we will discuss the role of (chemo)proteomics approaches in target identification and selection for different modalities for respective pharmacological intervention.

6.1. Chemoproteomics and other proteomics approaches in early drug discovery

‘Chemoproteomics’ encompasses a number of workflows that aim to identify and characterize drug-target interactions in cells or cell-derived samples such as cell lysates or enriched subcellular fractions. As proteins constitute the majority of targets in drug development, these workflows have become indispensable at various stages of the drug discovery process: in phenotypic or cell-based drug discovery, where screening of large compound libraries in a cellular or organismal model of disease is used to identify chemical starting points, chemoproteomics can generate target hypotheses and by extension provide information about the mechanism of action (MoA) by which the compound exerts its phenotypic effect. Chemoproteomics provides an unbiased map of physical interactions of a compound with cellular proteins that includes the efficacy target, or on-target, for which the physical binding event functionally results in the observed cellular phenotypic response. It should be noted that for the notoriously challenging task of efficacy target identification in phenotypic drug discovery, chemoproteomics is often and most successfully used as part of a multipronged strategy that also includes functional genetic, cellular profiling and computational approaches to generate as much complementary information as possible to home in on the efficacy target amongst the hit lists of physical and functional interactors [72–74]. At the same time, chemoproteomics experiments typically yield additional ‘binding off-targets’ which can be functionally relevant in a different biological context, for example to explain potential toxicity mechanisms, but also provide opportunities for drug repurposing (reviewed in PMID: 33404270). In addition, these approaches offer an experimental framework to demonstrate target engagement in cells, model organisms and ultimately the patient, in this case often using a more targeted detection and quantitation of the protein of interest to increase sensitivity and throughput.

In general, chemoproteomics workflows share four steps, each of which will be the focus of technology development efforts in the coming years to improve comprehensiveness and disease-relevance of generated information as well as throughput and scalability of the workflow (see Figure 3). After 1) selection of an input material, 2) samples are treated with compound or probe to allow for binding events. This is followed by 3) separation of compound-interacting proteins from the rest of the proteome by a variety of means including affinity enrichment or detection of changes in protein stability upon compound binding. Finally, 4) the interacting proteins are detected and quantified vs. an untreated control, typically using quantitative mass spectrometry. While global proteomic profiling to detect compound-induced changes in cellular protein abundance would not fall into the rather narrow definition of chemoproteomics used here, we will briefly mention recent applications in the context of compound target identification and mode of action elucidation.

6.1.1. Affinity enrichment-based chemoproteomic approaches

The ‘classical’ chemoproteomics workflow for target deconvolution is based on a compound pulldown step using an immobilized variant of the compound of interest. This variant is typically generated by installation of a linker carrying either a functional group for immobilization on a bead-based matrix or, e.g., a biotin affinity handle, using structure-activity relationship (SAR) information to ensure that the modification does not interfere with phenotypic activity and thus target binding. The prepared affinity matrix is incubated with cell lysate and the enriched proteins are eluted and analyzed by quantitative mass spectrometry. In order to increase specificity and allow prioritization of hits by likelihood of functional relevance, the experiments are typically performed in

Figure 3. Overview of common steps of the various chemoproteomics workflows described in the text with specific areas of active optimization and method
development. These common steps typically include: 1) selection of an appropriate, disease-relevant input material for the chemoproteomics experiment; 2)
treatment of proteome with either free compound (for competitive workflows or workflows based on a broad specificity enrichment step) or functionalized
probe; 3) separation of proteins interacting with compound or probe in step 2) from background by e.g. affinity enrichment, centrifugation or proteolysis; 4)
identification and quantitation of peptides and proteins by LC-MS/MS and data analysis.

a competitive mode using preincubation of lysate with free parent compound in dose response or using analogs covering a range of cellular activity. This approach has proven to be most successful for soluble proteins that retain binding competence under generic cell lysis conditions, such as the E3 ligase substrate receptor CRBN as the target of thalidomide [75] or Annexin A2 as a target of bleomycin in bleomycin-induced pulmonary fibrosis (PMID: 29172997). As a well-established, robust workflow with known characteristics, for the lysate-based pulldown approach to stay relevant
developments will aim, on the one hand, to decrease input material requirements to allow application to small, disease-relevant cell populations including primary cells and patient-derived material. On the other hand, increasing throughput will enable screening applications to proactively generate protein interaction profiles for compounds in screening libraries. While the former will be mostly driven by progress in sample handling and sensitivity of the analytical platforms as described earlier, the latter poses the key challenge of high-throughput identification and generation of suitable probes. In order to circumvent this step, broad specificity enrichment matrices have been developed for several target classes for use in a competition-based workflow; for example, several variations of pan-kinase affinity matrices using promiscuous ATP-competitive inhibitors have been available for many years [76–78]. While inherently biased toward a given target class and more specifically a conserved binding pocket, the recent characterization of 243 clinical kinase inhibitors for off-target identification and drug repurposing shows the general applicability to higher throughput selectivity profiling [79].

Several pharmacologically relevant target classes such as multispan transmembrane receptors and ion channels are notoriously difficult to access with a lysate-based workflow run in discovery mode since they require the cellular context for binding competence. Therefore, approaches that enable live cell applications are increasingly gaining popularity. Photoaffinity-labeling (PAL) allows the interrogation of compound-protein interactions in living cells since a typical PAL probe consists of three elements: the pharmacophore responsible for target binding, a functional group for installing an affinity handle and a photoreactive moiety (e.g., diazirine, benzophenone) that allows proximity-based covalent labeling of the interacting protein(s) upon cell irradiation. After cell lysis, labeled proteins are enriched, typically using a biotin-based system with the biotin introduced post-lysis using, e.g., click chemistry to ensure cell permeability of the PAL probe. Again, competition-based workflows can help with both specificity and prioritization of functionally relevant interactors [80]. In addition to successful target deconvolution for challenging transmembrane target families of interest such as solute carriers (e.g., SLC39A7/ZIP7 [81], SLC25A20 [82]), the introduced covalent bond also allows application to larger scale mapping of protein interactors and ligandable pockets in live cells for chemical libraries based on the PAL probe design principles mentioned above [83,84]. Besides similar throughput considerations as mentioned for lysate-based pulldowns, efforts to improve process efficiency and ease of hit calling will likely further increase applications of this workflow, e.g., via exploration of alternative bio-orthogonal reaction chemistries for installation of the affinity handle, which has already led, e.g., to the increased use of the inverse electron demand Diels–Alder reaction using trans-cyclooctene tags [85,86]. In addition, the reliable mapping of PAL-probe insertion sites remains a key challenge for this workflow to fulfill its full promise. This is due to the low insertion efficiency of available photo-reactive moieties as well as the fact that the carbene radical-based, random insertion process tends to give rise to a mixture of molecular modification products even for a single binding pocket and a given peptide sequence. Apart from improved data analysis strategies, experimental workflows have been introduced to aid with this process, e.g., the SIM-PAL workflow, which uses introduction of unique isotopic patterns to identify probe-labeled peptides in digested enriched samples [87].

While the chemoproteomics workflows described so far are most often used for non-covalent screening hits, the resurgence of covalent drug discovery, including the use of electrophile libraries in cell-based screens, has led in parallel to an increased interest in covalent chemoproteomics or activity-based protein profiling (ABPP) approaches. A number of approaches are conceptually similar to the target class-specific matrices mentioned above: the compound of interest is used as a competitor for preincubation of cells or lysate, followed by protein enrichment from lysate using a pan-reactive probe. These probes can be target family-specific, such as the fluorophosphonate-based probes for serine hydrolases [88], which have, e.g., recently been used to identify RBBP9 as a valacyclovir-activating enzyme [89], highlighting the fact that chemoproteomics can identify functionally relevant binding events other than the efficacy target. Covering an even larger target spectrum are probes that target solvent exposed reactive amino acids in general, e.g., the iodoacetamide-based probe for cysteine as used in the competitive isoTOP-ABPP workflow [90]. Several variants of the latter have been published (e.g., [91,92]) which differ in aspects including the exact probe design, with either pre-installed or latent affinity handle, as well as the quantitative MS strategy, with the final sample consisting of enriched probe-labeled peptides. Signal reduction for a specific probe-modified peptide upon cell pre-treatment with a compound of interest is used to infer compound labeling of a target residue. As a result, these workflows allow not only the identification of protein interactors for a compound of interest, but more specifically the mapping of modified sites and thus ligandable pockets. Accordingly, the general workflow has been applied successfully not only to target deconvolution for bioactive compounds such as nimbolide (E3 ligase RNF114) [93] or dimethyl fumarate (kinase complexes PKCθ-CD28 [94] and IRAK4-MyD88 [95]) but also to large scale mapping of protein interactors and ligandable pockets in living cells using electrophile libraries [96]. In this case, the covalent library members do not need additional features to be compatible with the workflow (compared to the PAL equivalent mentioned previously), so that throughput becomes a key limiting factor for screening applications. This has led to the recent report of a scaled-down TMT-based streamlined cysteine ABPP (SLC-ABPP) workflow [91] which allows profiling of 8,000 cysteine residues in 18 minutes per compound with reduced input material requirements. While these workflows are used so far predominantly for cysteine-targeting compounds, they can per se be applied to any reactive amino acids for which pan-reactive probes are available. While these are becoming increasingly available, including for lysine [97], methionine [98] and tyrosine [99], the identification of novel probes that are more robust and allow access to additional amino acids remains of high importance. A proof-of-principle study by Hacker and colleagues recently demonstrated that an optimized data
analysis workflow enables the use of 54 different probes covering 9 amino acid and N-terminal modifications in parallel for a direct comparison of probe selectivity and, by extension, more comprehensive monitoring of reactive sites in a proteome [100].

As an alternative to the purely competitive, peptide-based approaches described so far, covalent chemoproteomics workflows can also be based on specific electrophilic probes derived from the original compound of interest, akin to the PAL probes discussed previously. One such approach, the Covalent Inhibitor Target-site Identification (CITe-ID) workflow, enabled the development of a PKN3 probe based on the observation that PKN3 is an off-target of the CDK inhibitor THZ1 [101]. CITe-ID also provides direct evidence of the compound adduct instead of relying on indirect, competition-based information. In addition, such electrophilic probes can be used for protein level enrichment analyses and have been shown to provide overlapping but not identical information to isoTOP-ABPP-like approaches, e.g., for selectivity profiling of KRAS G12C inhibitors [102]. Again, method development in the coming years will aim to further increase throughput, sensitivity and ease of application for the various covalent chemoproteomics workflows.

6.1.2. Protein stability-based chemoproteomic approaches
In addition to affinity enrichment-based approaches, a number of proteomics approaches have been introduced that use compound-induced changes in thermodynamic stability or conformational changes in the target protein to identify and characterize compound-target interactions. For the Cellular Thermal Shift Assay (CETSA) [103] and its coupling with a quantitative MS-based read-out for proteome-wide analysis (also called Thermal Proteome Profiling, TPP) [104], the compound-induced stabilization in cells or lysate is detected as protection from heat-induced denaturation by quantifying non-denatured protein in the supernatant after a centrifugation step. The assay is typically run either as a temperature curve at a single compound dose or, if the melting point of a target is known, in dose response for a more granular picture and to increase the sensitivity of hit calling. Accordingly, for an unbiased analysis of a whole proteome, which will cover a wide range of melting temperatures for individual proteins, a 2D-TPP workflow has been introduced which combines compound dose responses at multiple temperatures to increase coverage of target space and allowed, e.g., the identification of phenylalanine hydroxylase as an off-target of the HDAC inhibitor panobinostat [105]. Further optimized workflows have described the successful application to transmembrane targets [106–108] and even to in vivo models and patient material [109]. Several approaches use differences in susceptibility to limited proteolysis upon compound treatment to identify proteome-wide compound interactions, including DARTS [110] and LiP-MS [111]. In addition to providing protein-level interactions, the latter approach has the potential to enable mapping of the protein regions affected by a binding event and in an ideal case the binding site itself via careful quantitation of individual proteolytic fragments using targeted MS or data-independent acquisition [112,113]. Applications include profiling of metabolite interactions in bacterial lysates (LiP-SMap [113]) as well as target deconvolution in yeast and human cell lysates using the more extensive LiP-Quant workflow based on dose response treatments and machine learning [112]. A key advantage common to all non-affinity enrichment-based approaches is that they do not require the time- and resource-intensive generation and validation of an affinity tool compound and thus are ideal for higher throughput selectivity profiling. On the other hand, the absence of an enrichment step and the need for multiple conditions exacerbate the analytical challenge for low abundance targets and require significant MS instrument time, in particular for the approaches that rely on robust quantitation of individual peptides and therefore high sequence coverage. The resulting throughput challenges have led to the introduction of compressed workflows where individual treatment conditions, e.g. different temperatures in CETSA, are pooled and subjected to MS-based protein quantitation for hit calling [114,115]. In addition, for any given approach run in an unbiased fashion for de-novo target deconvolution, success is to some extent target-dependent, i.e., not every binding event leads to detectable thermal stabilization or conformational change under the selected set of experimental conditions. Compound treatment of intact cells as reported so far for TPP is preferable since it reflects the pharmacologically relevant environment, exemplified by the fact that a study of the targets of ciprofloxacin in E. coli identified the known target DNA gyrase only in live cell experiments where intact DNA, which is required for compound binding, is present [116]. However, it adds an additional layer of complexity to the data: compound-induced changes in, e.g., post-translational modifications, metabolite concentrations and protein–protein interactions can also lead to an assay signal (reviewed in Prabhu [117]). While this complicates target deconvolution, it can on the other hand allow the observation of broader aspects of the compound MoA and effects on downstream processes. As already indicated, the workflows summarized in this section will particularly benefit from improvements in speed and sensitivity of the analytical platform to enable screening applications and fully capitalize on the fact that compounds do not require modification, which is particularly attractive for, e.g., routine off-target profiling and application to later stage compounds.

6.1.3. Other proteomic approaches for target identification and MoA elucidation
Finally, global proteomic profiling has seen renewed interest in the context of compound target identification and mechanism of action studies. This is primarily due to the emergence of targeted protein degradation (TPD) as a novel modality where pharmacological intervention results in modulation of target protein levels by recruitment of a target of interest to a suitable E3 ligase component such as CRBN or VHL to induce proteasome-dependent degradation. Therefore, TPD drug discovery projects rely heavily on proteomics for target identification and compound characterization and optimization. Examples where proteomics provided crucial data toward MoA elucidation include the discovery that the efficacy of lenalidomide in multiple myeloma is explained by CRBN-
dependent degradation of transcription factors IKZF1 and 3 [118]. On the other hand, Gray and colleagues used proteomics to demonstrate the increased selectivity of a promiscuous kinase inhibitor when linked to a ligand for the E3 ligase substrate receptor CRBN [119]. The Multiplexed Proteome Dynamics Profiling (mPDP) workflow further allows differentiation of direct compound-induced protein degradation from downstream effects and has been used, e.g., to compare the effects of the heterobifunctional JQ1-VHL degrader vs. the bromodomain inhibitor JQ1 alone [120]. Recent advances in high-throughput sample preparation and data acquisition including the BoxCar method [121] have also allowed the rapid recording of compound-induced changes at the global proteome level [122] or for a set of phosphorylation sites (P100) [123] as signatures to derive compound MoA hypotheses either directly or via correlation to signatures of compounds with known MoA, akin to, e.g., transcriptional approaches like L1000 [124].

6.1.4. Chemoproteomics in the next decade
We are currently seeing a paradigm shift when considering the application space of chemoproteomics. Historically, the focus has been on identification of functionally relevant interactions such as efficacy target identification, where complementary, in particular genetic, approaches were required to prioritize the physical interactors identified as chemoproteomics hits by functional relevance (and vice versa, since genetic screening hits often include additional components of the target biology network). This has changed with the current rise of chemical biology-inspired modalities and in particular those utilizing compound-induced recruitment of an effector protein to a (neo)substrate. These approaches often utilize heterobifunctional molecules consisting of a target-binding module and a (validated) recruitment module for the enzyme or scaffold protein of interest. The latter will lead to the biological effect, which can range from target degradation in a ubiquitination-dependent manner by the proteasome system [125] or via autophagy [126] to modulation of phosphorylation-dependent events by recruitment of kinases [127] or phosphatases [128]. From a target perspective this means that essentially any small molecule-binding event to a protein of interest can be functionalized, even if the binding event itself is ‘silent.’ While chemoproteomics has made crucial contributions to the identification of recruitment modules for, e.g., E3 ubiquitin ligase components like CRBN [75], it is the large scale identification of ligands for targets of interest where it will likely be most impactful. Since the interrogated target space for each compound subjected to chemoproteomics is the full cellular proteome, databases of chemoproteomics data and their proactive expansion in screening mode will increasingly enable the identification of chemical starting points for these modalities. The fact that chemoproteomics identifies physical interactions independent of functional relevance turns from being a disadvantage in the context of efficacy target identification into an advantage, as it provides the most comprehensive picture of both functional and silent compound-protein interactions that can be exploited using different modalities. At the same time, the increasing interest in more disease-relevant cellular models for phenotypic screening, such as complex organoid systems and patient-derived primary cells, makes the considerations in this review regarding increased sensitivity in MS instrumentation and the development of single cell proteomics workflows particularly relevant in this area as well. In addition, the increased meta-analysis of chemoproteomics data and integration with other MoA-relevant datasets will be crucial to further facilitate hit calling and prioritization of target hypotheses for time- and resource-consuming in-depth validation experiments. Taken together, the specific development efforts tackling individual pain points in chemoproteomics (Figure 3) need to reflect the overall changes in the drug discovery environment for this exciting area of proteomics to continue to be impactful.

6.2. Biomarkers
Proteomics was recognized early on as a powerful tool with great promise for biomarker discovery [129]. Plasma proteomics, in particular, has been an area of intense focus because blood is readily available, because it perfuses the entire body, thus providing the opportunity to identify biomarkers across a broad range of diseases and disorders, and because clinical analysis of blood is already a well-established, common diagnostic procedure. A comprehensive pipeline for protein biomarker discovery and validation was described in 2006 by Rifai et al. [130]. This pipeline involves identification of candidate biomarkers in a discovery phase, typically by shotgun proteomics, using a relatively small number of samples, followed by qualification and verification in larger sample sets using quantitative, multiplex multiple reaction monitoring (MRM) and ultimately validation with a high-throughput immunoassay or MRM assay suitable for the analysis of high volumes of clinical samples. This triangular biomarker discovery strategy has been broadly used. However, despite extensive effort and decades of research, there have been very few success stories. Kearney et al. recently reviewed two MRM-based biomarkers that were discovered via proteomics: Xpresys Lung 2®, a blood test for assessing the cancer risk of lung nodules discovered by radiology, and PreTRM®, a blood test that assesses the risk of spontaneous preterm birth in asymptomatic women in the middle of pregnancy [131]. PromarkerD, a biomarker for predicting diabetic kidney disease based on multiplex immunoaffinity MS measurement of three plasma proteins (CD5L, APOA4, and IBP3) with three clinical variables (age, HDL-cholesterol, and eGFR), has been submitted to the FDA for approval and was discovered using proteomics technology [132].

The challenges associated with proteomics-based biomarker discovery, referred to as the discovery to validation gap, have been reviewed previously [133–136] and a number of factors have been identified that contribute to the failure to validate discovery findings. These include issues related to the discovery sample set, including insufficient size, lack of appropriate controls, and changes in the patient population between discovery and validation experiments. The observation that the number of biomarker candidates identified in the literature is perhaps a quarter of human proteins suggests that the candidate discovery process is often not rigorous
enough [133]. Technical issues such as analytical platform changes, e.g., shotgun proteomics to targeted MRM, also contribute to lack of translation. Perhaps even more significant, in the large majority of cases, discovery experiments are simply not followed up and validation is not even attempted. Validation requires analysis of independent, well characterized clinical samples with robust, quantitative assays. Clinical translation is challenging, with significant regulatory and financial hurdles. The challenges associated with clinical validation are likely enough to discourage replication unless a clear, cost-effective use case can be made.

6.2.1. Biomarkers as drug development tools
The use of biomarkers in drug development has increased in recent years, and a recent analysis reported that more than half of recently approved drugs were supported by biomarker data [137]. This analysis looked at the documents submitted to regulatory agencies, the Food and Drug Administration (FDA) and the European Medicines Agency (EMA), to support drugs approved between 2015 and 2019. The analysis likely underestimates the contributions of biomarkers to drug development as it does not include biomarkers used to make internal decisions that are not included in regulatory packages. While the focus of biomarker discovery reported in the literature has been the identification of diagnostic tools, biomarkers play other critical roles in the clinical development of novel therapeutics. Biomarkers such as pharmacodynamic biomarkers and proof of activity biomarkers are important drug development tools. Proteomics plays an important role in the discovery, validation and implementation of these biomarkers, which require distinct, fit-for-purpose approaches.

The Biomarkers, EndpointS and other Tools (BEST) resource developed by the FDA-NIH Biomarker Working Group is a valuable resource which classifies and defines biomarker categories and also describes biomarker validation and qualification [138]. From the BEST document, validation, for biomarkers and clinical outcome assessments, is “a process to establish that the performance of a test, tool, or instrument is acceptable for its intended purpose.” It is important to demonstrate that the test measures what it was intended to measure (analytical validation) and that the biomarker (through its test) has the ability to predict or measure the relevant clinical concept. Validation is important for biomarker applications, establishing that the biomarkers, and the assays used to measure them, are appropriate for the specific intended use. Depending on the intended use, the requirements for biomarker validation can vary significantly. A biomarker used for internal decision making may need less validation than a biomarker used to support the approval or use of a novel therapeutic. Biomarkers submitted to regulatory agencies may need to be formally reviewed or “qualified.” There are two typical paths for biomarker qualification: either through submission of biomarker data during drug approval, or independently via the FDA biomarker qualification program [139]. Plasma fibrinogen has been qualified as a drug development tool in Chronic Obstructive Pulmonary Disease (COPD) by the COPD foundation biomarker qualification consortium. A perspective article on this process has recently been published [140].

Table 1 describes several types of biomarkers used in drug development, as defined in the BEST document, as well as examples from the BEST document and literature, with an emphasis on protein and proteomics related biomarkers. In addition, an estimate of the level of validation needed to support the biomarker is indicated, ranging from low to high, where low refers to biomarkers used for internal decision making, medium refers to biomarkers that are submitted to regulatory agencies to support the filing, and high refers to biomarkers that impact diagnostics and companion diagnostics.

Pharmacodynamic and monitoring biomarkers are especially valuable in drug development and typically not discussed in the context of proteomics biomarker discovery, so we will describe a few of these examples in more detail. Urinary Type II collagen NeoEpitope (uTIINE) is an example of a pharmacodynamic biomarker discovered using a targeted discovery strategy. Collagen neoepitope peptides were identified by data dependent proteomics in an ex vivo cartilage explant model [141]. Antibodies to the major neoepitope identified in the cartilage explant were then used for immunoaffinity proteomics of human urine and synovial fluid from normal and osteoarthritis (OA) subjects. A 45 amino acid peptide containing 5 hydroxy-proline residues was the most abundant neoepitope peptide in human urine, and a quantitative immunoaffinity MRM assay for this neoepitope (uTIINE) was developed and validated [142]. The uTIINE biomarker was used in a dog model of OA to demonstrate the pharmacological activity of PF152, a selective MMP-13 inhibitor [143]. Subsequently it was demonstrated that the levels of uTIINE could differentiate patients with symptomatic OA of the knee or hip from those with asymptomatic, radiographic OA of the same joints and that longitudinal measures of uTIINE were associated with joint space narrowing in patients with knee OA [144].

N-terminomic proteomic profiling (TAILS) was used to identify novel substrates of HtrA1, a serine hydrolase associated with increased risk of age-related macular degeneration (AMD) in preclinical models. One of these substrates, Dickkopf-related protein 3 (DKK3), was used as a pharmacodynamic biomarker in Phase 1 trials of an anti-HtrA1 Fab (Fab15H6.v4.D221). Analysis of DKK3 cleavage in aqueous humor samples from study subjects provided clear evidence of sustained pharmacological activity of Fab15H6.v4.D221 and an important framework for the design of clinical studies to test the therapeutic hypothesis that inhibition of HtrA1 will slow the progression of geographic atrophy (GA) [145].

Inhibition of type I protein arginine methyltransferases (PRMTs) has been shown to have anti-proliferative effects in multiple tumor types. Substrates of type I PRMT were identified using a methylated arginine enrichment proteomic strategy (MethylScan) [146]. Human peripheral blood mononuclear cells (PBMCs) were treated with the PRMT inhibitor GSK336871, total protein was isolated, digested with trypsin, and immunoprecipitated with antibodies to arginine methylation marks. Heterogeneous nuclear ribonucleoprotein A1 (hnRNP-A1) was identified as a potential pharmacodynamic biomarker. A novel liquid chromatography with tandem mass spectrometry (LC-MS/MS) assay was
Table 1. Biomarker classification and validation (based on the FDA-NIH BEST resource).

Biomarker: Pharmacodynamic
Description: A biomarker used to show that a biological response has occurred in an individual who has been exposed to a medical product or an environmental agent.
Typical drug development use(s): Confirm target engagement, PKPD for dose selection, demonstration of activity, proof of mechanism. Provide supporting evidence of a pharmacodynamic effect or an early therapeutic response.
Validation: Low: internal decision-making use. Medium: submitted to regulatory agencies, included in the label.
Examples: DKK3 as a PD biomarker for HtrA1 in geographic atrophy [141]. hnRNP-A1 as a marker for PRMT inhibitors [142]. uTIINE as a target engagement biomarker for MMP inhibitors in preclinical studies in OA [143].

Biomarker: Monitoring
Description: A biomarker measured serially for assessing status of a disease or medical condition or for evidence of exposure to (or effect of) a medical product or an environmental agent.
Typical drug development use(s): Proof of activity for novel therapeutic.
Validation: Med-High
Examples: B-type natriuretic peptide (BNP) or N-terminal proBNP (NT-proBNP) may be used as monitoring biomarkers during follow-up to supplement clinical decision making in pediatric patients with pulmonary hypertension [144,145].

Biomarker: Diagnostic
Description: A biomarker used to detect or confirm presence of a disease or condition of interest or to identify individuals with a subtype of the disease.
Typical drug development use(s): Clinical trial enrollment criteria.
Validation: High
Examples: Xpresys Lung 2® for differential diagnosis of early stage lung cancer [146]. PreTRM® for risk of preterm delivery [147]. PromarkerD for clinical prediction of diabetic kidney disease [148].

Biomarker: Companion Dx, Complementary Dx
Description: A medical device, usually an in vitro diagnostic (IVD) device, that provides information that is essential for the safe and effective use of a corresponding therapeutic product.
Typical drug development use(s): The use of a companion diagnostic with a therapeutic product is typically stipulated in the instructions for use in the labeling of both the diagnostic device and the corresponding therapeutic product, including the labeling of any generic equivalents of the therapeutic product.
Validation: High
Examples: HER2 test (protein expression in tumor tissue) co-approved with Trastuzumab for breast cancer [149].

Biomarker: Predictive
Description: A biomarker used to identify individuals who are more likely than similar individuals without the biomarker to experience a favorable or unfavorable effect from exposure to a medical product or an environmental agent.
Typical drug development use(s): Enrichment. Stratification. Balance arms.
Validation: Internal use: low. Trial enrichment: med.
Examples: BReast CAncer genes 1 and 2 (BRCA1/2) mutations may be used as predictive biomarkers when evaluating women with platinum-sensitive ovarian cancer, to identify patients likely to respond to Poly (ADP-ribose) polymerase (PARP) inhibitors [150].

Biomarker: Prognostic
Description: A biomarker used to identify likelihood of a clinical event, disease recurrence or progression in patients who have the disease or medical condition of interest.
Typical drug development use(s): Enrichment: enroll patients more likely to have clinical events/progress.
Validation: (not stated)
Examples: Plasma fibrinogen may be used as a prognostic biomarker to select patients with chronic obstructive pulmonary disease at high risk for exacerbation and/or all-cause mortality for inclusion in interventional clinical trials [140].

Biomarker: Surrogate Endpoint
Description: An endpoint supported by a clear mechanistic rationale and clinical data providing strong evidence that an effect on the surrogate endpoint predicts a specific clinical benefit.
Typical drug development use(s): Approvable endpoint in Phase 3 clinical trial.
Validation: High
Examples: Hemoglobin A1c (HbA1c) reduction is a validated surrogate endpoint for reduction of microvascular complications associated with diabetes mellitus and has been used as the basis for approval of drugs intended to treat diabetes mellitus.

developed to quantify arginine methylation changes at a specific residue (R225). This assay was used to characterize GSK336871 activity in xenograft models and is currently being used to assess pharmacodynamics (PD) in a Phase 2 clinical trial [147].

Although there have been over 100 published studies to identify potential diagnostic and prognostic biomarkers for Alzheimer’s Disease (AD) in cerebrospinal fluid (CSF) [148], a key drug development need is for monitoring biomarkers. Wildsmith et al. [149] developed a targeted MRM panel of 30 candidate biomarkers for AD, based on CSF discovery proteomics and literature review. These candidate biomarkers were evaluated in longitudinal CSF samples from aged, cognitively normal control, mild cognitively impaired (MCI) and AD subjects. Of the 28 quantifiable proteins, 10 showed significant differences between diagnostic groups and 4 candidates demonstrated a significant longitudinal change consistent with their utility as potential monitoring biomarkers. Because the panel was designed based on cross-sectional studies, it is perhaps not surprising that while many candidates replicated as diagnostic candidates only a few emerged as monitoring biomarkers. This highlights the importance of aligning the discovery experiments with the ultimate intended use.
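The distinction between a diagnostic and a monitoring candidate can be made concrete with a small calculation. The following is a minimal sketch in Python, using invented numbers rather than data from the cited study; the subject names, visit times, protein levels, and the helper functions `slope`, `baseline_difference` and `mean_slope` are all hypothetical and serve only as illustration. It contrasts a cross-sectional group difference at baseline with the average within-subject slope over time, and it omits the significance testing and covariate adjustment that a real analysis would need.

```python
from statistics import mean

def slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    t_bar, v_bar = mean(times), mean(values)
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den

# Hypothetical data: subject -> (group, visit times in months, protein level).
subjects = {
    "s1": ("control", [0, 12, 24], [10.0, 10.1, 9.9]),
    "s2": ("control", [0, 12, 24], [11.0, 10.9, 11.1]),
    "s3": ("AD", [0, 12, 24], [15.0, 15.1, 14.9]),
    "s4": ("AD", [0, 12, 24], [16.0, 15.9, 16.1]),
}

def baseline_difference(data, group_a, group_b):
    """Cross-sectional view: difference in mean first-visit level between groups."""
    a = [values[0] for group, _, values in data.values() if group == group_a]
    b = [values[0] for group, _, values in data.values() if group == group_b]
    return mean(a) - mean(b)

def mean_slope(data, group):
    """Longitudinal view: average within-subject change per month in one group."""
    return mean(slope(t, v) for g, t, v in data.values() if g == group)

print("baseline difference (AD - control):", baseline_difference(subjects, "AD", "control"))
print("mean within-subject slope in AD:", mean_slope(subjects, "AD"))
```

In this toy data set the protein separates the groups clearly at baseline but shows essentially no change within subjects over time, so it would look like a diagnostic candidate while being of little use as a monitoring biomarker.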
6.2.2. Future perspectives for clinical proteomics and biomarkers
As the above examples illustrate, a variety of different types of biomarkers are important for successful drug development. The discovery of biomarker candidates, analytical validation, and biomarker validation depend on the ultimate intended use of the biomarker and require a more nuanced approach than the triangular paradigm associated with diagnostic discovery. The first step is to define the intended use of the biomarker. This should include a description of the performance characteristics required for success. The second step is biomarker candidate discovery. This step is critical; virtually any differential expression experiment will ‘discover’ proteins that are up or down regulated between groups, but few if any are actually potential clinical biomarkers. Identification of robust candidates, consistent with the intended use, and a high degree of confidence in translation, is essential before proceeding. The third step is an iterative process of biomarker assay development and analytical validation, and biomarker qualification.

Advances in proteomic technology are enabling the discovery of more robust biomarker candidates. In addition to the sensitivity improvements discussed in section 2, these include the development of high throughput techniques that allow the analysis of large cohorts, techniques that increase the depth and breadth of proteome coverage, and techniques that improve quantitation.

The Mann laboratory developed an automated, high-throughput shotgun plasma proteomics workflow suitable for use with very small (1 µL) volumes [150]. Sample preparation was carried out in a single reaction vial, followed by LC-MS/MS using a fast 20 minute gradient and DDA on a Q Exactive HF Orbitrap. Quantitative label-free analysis employed MaxQuant. Approximately 1000 proteins could be analyzed, including nearly 50 known biomarkers which showed good quantitation (CVs < 20%). The method was used to analyze 1294 plasma samples in a human weight loss study [151]. The same group more recently reported an improved method incorporating a novel nano scale LC system using pre-formed gradients and DIA MS and demonstrated the ability to quantify 5200 plasma proteins in 21 min [152]. Bruderer et al. developed a robust high throughput capillary flow DIA method capable of analyzing 31 plasma proteomes/day, measuring over 500 proteins/sample, and used this method to analyze the DioGenes cohort of 1508 samples [153]. Lennon et al. describe a method using short 1 mm scale chromatography coupled to ion mobility MS able to detect over 500 serum proteins in a 15 min run [154]. Messner and coworkers described an ultra-high throughput clinical proteomics platform using short-gradient high-flow LC coupled to a Triple-TOF 6600 (Sciex), theoretically capable of analyzing 180 samples/day. Although fewer proteins (approximately 270 protein groups/sample) are detected using this method, these include clinically relevant proteins such as complement factors, inflammation modulators, and pro-inflammatory factors in the IL6 pathway, which allowed classification of COVID-19 infection [155].

High throughput analysis is critical for the analysis of large clinical cohorts for biomarker discovery but comes at a cost in terms of depth of proteome coverage. The relative importance of throughput vs depth will vary depending on the specific application. In many cases, prior knowledge can inform this decision, pointing toward high sensitivity methods, for example, if chemokines and cytokines are likely potential biomarkers. A commonly used strategy to increase proteome coverage is to use pre-fractionation. G. Kaur et al. compared several different methods utilizing depletion of high-abundant proteins, enrichment of low-abundant proteins, SDS PAGE, and C18 pre-fractionation. All of the methods tested performed well, identifying between 3400 and 3800 plasma proteins. They concluded that the 1D gel-based approach, which allowed for parallel sample processing, represented the best choice for high coverage and throughput [156]. Another orthogonal strategy to increase proteome coverage is to utilize enrichment approaches for PTMs. Combining proteomics and phosphoproteomics is a common, generic strategy for increasing depth and breadth. More specific approaches may be applicable in certain cases, such as the use of TAILS to identify novel protease substrates as discussed previously [145]. Affinity based proteomic technologies have recently emerged as important tools for plasma protein biomarker discovery [157]. Affinity based proteomic technologies are well suited for characterizing low abundance proteins, and combining unbiased MS proteomics with large, targeted affinity-based array technologies is a powerful, emerging strategy for the identification of biomarker candidates. The performance of LC-MS/MS and affinity-based array technologies was evaluated in a study of 173 human plasma samples [158]. LC-MS/MS was performed in the DIA and DDA modes using a Q Exactive HF instrument (Thermo) and affinity proteomics used the Olink PEA platform to measure the relative abundance of 736 protein analytes. DIA-MS quantified a total of 734 plasma proteins, 379 of which were observed in more than 25% of the samples, while Olink detected 728 proteins in at least 25% of the samples. A total of 35 proteins were quantified using both techniques, with good correlation, especially for proteins with significant spread around the mean. The study showed that these two complementary approaches targeting different components of the proteome could have significant advantages for biomarker candidate discovery.

Finally, while it is tempting to do biomarker discovery in plasma, given the complexity of the plasma proteome it may make more sense to analyze other matrices closer to the sites of action such as CSF, stool, urine, synovial fluid, tears, aqueous humor, saliva, skin blister fluid, tissue, etc. Urine [159] and stool [160] in particular have the additional advantage of being noninvasive and simple to collect, store and transport. These matrices are well suited for biomarker discovery and can readily be incorporated into large clinical trials.

Biomarker candidate discovery relies on accurate differential analysis across large sample sets. DIA-MS is emerging as the method of choice for analysis of large, clinical sample sets. While DIA methods have typically been optimized to maximize the number of proteins identified, recent publications have focused on improving quantitation. These
include optimizing instrument data acquisition parameters potential targets for future drug discovery. The same advances
for quantitation [161,162], libraries [163–165], feature selec­ in throughput, proteome coverage, and quantitation that are
tion (peptides, transitions) and lower limit of quantitation improving biomarker candidate discovery will accelerate these
(LLOQ) [166–168], and the use of external or sparse internal applications as well.
standards and calibration curves [169–173]. Although
a general consensus regarding the optimal approach to
quantitative proteomics for biomarker candidate discovery
has not yet emerged, the field is rapidly advancing and the
7. Conclusion
future looks very promising. Based on conversations with our industrial proteomics coun­
While the future may see global, quantitative proteo­ terparts, we have reviewed the technological advances that
mics use as a diagnostic tool, most biomarkers will require we envision being most impactful in the bio-pharma proteo­
validated clinical assays. While immunoassays and targeted mics arena in the next decade. Enhancements in sensitivity,
MRM MS assays are widely used for clinical assays, the integration of proteomics with other ‘omics’ technologies,
choice of platform ultimately depends on the use case expansion and higher utility of chemoproteomic technologies
for the biomarker. Targeted MRM assays represent & advances in biomarker discovery in addition to software and
a logical choice for the analytical validation of biomarker data analysis solutions are all evolving and merging to provide
candidates identified by discovery proteomics. Validation more intricate and informative data to help fuel the drug
of MRM assays are well established and guidance docu­ discovery and development pipeline.
ments are available [174–176]. Resources for developing Despite great strides in technology development, limita­
targeted MRM assays include the NCI’s Clinical Proteomic tions still plague the proteomics community. For example,
Tumor Consortium assay portal and SRMAtlas [177]. the ability to fully characterize and distinguish between pro­
MRMAssayDB is a comprehensive resource for targeted tein-isoforms remains a very important yet problematic area to
assays with information on assays for over 50,000 proteins solve for many studies. While the tools to fully distinguish
[178]. However, despite the availability of these tools, and between these proteinaceous species are lacking, the question
the advantages of using targeted MS to validate promising remains if there are truly functional differences between pro­
biomarker candidates identified using MS based discovery teo-isoforms, and therefore whether investing in this area is
experiments, a recent survey of the literature revealed that worthwhile [181]. Despite Top Down proteomic methods pro­
a large majority of discovery efforts lack validation, and mising to help resolve the isoform conundrum, and decipher­
those that are validated utilize immunoassays and not MS ing protein-isoforms at the purified protein level [182], the
[179]. As discussed above, while many factors may affect community has yet to demonstrate the technology’s utility in
validation in general, the lack of appropriate instrumenta­ a robust manner, particularly at the level of sensitivity and
tion could also be a contributing factor. Rather than tran­ throughput that are of general use for fast pathway analyses.
sitioning from DIA based discovery experiments using A concerted effort in method development, instrument, and
Orbitrap instruments, to MRM validation experiments data analysis is required to make this technology
using triple quadrupole instruments, that requires addi­ a commodity.
tional equipment and expertise, validation could be done Accurate quantitation tools have come a long way in the
on the same Orbitrap instrument using PRM. This simpli­ past decade, moving from binary SILAC experiments to 16-
fied MS workflow was successfully used to validate protein plex TMT and beyond. However, many biological experiments
biomarkers for diagnosis of colorectal cancer [180] and has would benefit from the ability to further increase multiplexing,
the potential to significantly improve the discovery to to allow for biological replicates, time points, or treatment
validation gap. conditions to be analyzed in parallel. Approaches to multiplex
past 30-plex samples in parallel have been proposed [183], but
at this time have yet to become commercially available.
6.3. Clinical proteomics and translational research
The promise of multi-omics workflows to decipher intricate
In addition to being a powerful tool for biomarker discovery, cellular signaling mechanisms at a cellular level has held great
clinical proteomics can add significant value to drug discovery promise, however it is only now that we see the true union of
and development in many other ways independent of biomar­ genomic sequencing technologies with proteomics, metabo­
kers. Successful Phase 3 clinical trials, typically large, well lomics and other cellular readouts as analytical tools become
characterized, longitudinal studies, represent an excellent more sensitive, and software analysis enables integration of
opportunity to combine proteomics, with clinical data, phar­ these data sets in a meaningful way. Much of the bottle neck
macokinetics, biomarker data, and other omics data to better with integrating these technologies is due to limitations in
understand the mechanism of action of a novel therapeutic. integrated data analysis pipelines. Pathway analysis tools often
This can guide the real world use of the novel therapeutic, concentrate on one type of data set at a time, rather than how
without necessarily requiring new biomarkers. Data from to extrapolate these data in concert. For example, it would be
Phase 3 clinical trials is also critical for reverse translation, powerful to examine metabolite changes in combination with
understanding not only what pathways and disease patholo­ profiling of the genes and proteins of their affiliated enzymatic
gies are impacted by the successful drug, but also which pathways in parallel and to decipher network interactions
pathways and pathologies remain unchanged thus providing across omics data sets.
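As a minimal sketch of the kind of joint, pathway-level view of multi-omics data discussed in the Conclusion, the Python example below averages log2 fold-changes per omics layer for each pathway and flags pathways in which every available layer moves in the same direction. Everything in it is an assumption made for illustration: the gene and metabolite names, the fold-change values, the pathway membership, the 0.5 threshold and the function name `summarize_pathways` are hypothetical and do not come from the studies cited here.

```python
from statistics import mean

# All identifiers and numbers below are hypothetical, for illustration only.
# Values are log2 fold-changes (treated vs. control) per omics layer.
layers = {
    "transcript": {"HK2": 1.0, "PFKP": 0.2, "LDHA": 1.2, "CS": 0.0},
    "protein": {"HK2": 1.4, "PFKP": 1.1, "LDHA": 0.9, "CS": -0.1},
    "metabolite": {"glucose-6-phosphate": 1.3, "lactate": 1.6, "citrate": -0.2},
}

# Hypothetical pathway map: enzymes (shared gene/protein names) and metabolites.
pathways = {
    "glycolysis": ["HK2", "PFKP", "LDHA", "glucose-6-phosphate", "lactate"],
    "TCA cycle": ["CS", "citrate"],
}


def summarize_pathways(pathways, layers, threshold=0.5):
    """Average each layer's log2 fold-changes per pathway and flag agreement."""
    summary = {}
    for name, members in pathways.items():
        row = {}
        for layer, values in layers.items():
            hits = [values[m] for m in members if m in values]
            if hits:
                row[layer] = mean(hits)
        means = list(row.values())
        # Concordant: at least two layers, all beyond the threshold, same direction.
        row["concordant"] = len(means) >= 2 and (
            all(v >= threshold for v in means) or all(v <= -threshold for v in means)
        )
        summary[name] = row
    return summary


if __name__ == "__main__":
    for name, row in summarize_pathways(pathways, layers).items():
        print(name, row)
```

A real pipeline would replace the simple mean with a proper enrichment or network statistic and would need identifier mapping between layers; the sketch only shows the idea of reading the layers in concert rather than one at a time.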
Certain subsections of the proteome have been intrinsically difficult to characterize using conventional mass spectrometric proteomic tools. Interactomics of cell–cell interactions, both cis- and trans-mediated ligand–receptor interactions, transient protein interactions and hydrophobic membrane complex assembly, particularly G-protein-coupled receptors (GPCRs) [184], and other classes of notoriously difficult-to-profile proteins remain underrepresented in proteomic studies. Martinez et al. have recently described several suites of biochemical tools to identify cell surface protein interactions, both at large scale and in a pathway-specific manner [185,186]. Further development of screening libraries with increasingly sensitive readouts will continue to allow the biotechnology field to probe hard-to-access parts of the proteome and decipher important cellular interactions. With the emergence of macrocycles [187], aptamers [188] and other new probe-based technologies, additional new areas of the proteome and their interactions will be revealed as these tools become part of the proteomic toolbox. Techniques such as BioID [189], APEX [190] and FLARE [191] have emerged as extremely useful tools to study more transient intracellular interactions; however, their utility is sometimes limited because they require protein tagging, which can change native biological properties of the target protein.

Protein subcellular localization is tightly governed and intimately linked to protein function in health and disease. Capturing the spatial proteome, that is, the localizations of proteins and their dynamics at the subcellular level, is therefore essential for a complete understanding of cell biology. The dynamics of protein complexes also remain a technologically challenging arena. Cross-linking technologies [192] and cellular localization tools such as LOPIT [193] and OOPS [194] are paving the way for investigating how proteins or protein complexes translocate within the cell after specific signals or perturbations or in a cell-specific context.

Spatial proteomics is emerging on a number of fronts, and in-depth resources are now available to the community, mapping proteins and their interacting partners across tissues. Advances in microscopy, mass spectrometry, flow cytometry and machine learning have catapulted technology development to allow for a more granular view of spatial cellular regulation. Various studies have been performed to probe the complex architecture that is the cell, including single-cell variations, dynamic protein translocations, changing interaction networks and proteins that can localize to various sub-cellular compartments, allowing researchers to further unravel human disease biology [195,196]. The Human Protein Atlas has been generated to provide a tissue-based map of the human proteome, a wonderful resource for researchers who want to investigate the location of proteins at the tissue level [197].

A few years ago, the epigenetic era highlighted how our in vivo biological circuitry is often dependent on complex and highly heterogeneous post-translational events [198]. The interplay between various types of PTMs is often poorly understood beyond the histone code, and yet various disease etiologies can be dictated by subtle changes in a single post-translational event [199,200]. Often when we perform database searches, we still rely on standardized public annotations rather than cell-specific databases with pre-defined sets of PTMs. There is a good reason for this, as one needs a contained search environment to mitigate false negative and positive results. However, this does mean that it is standard to identify less than half of the spectra in a typical bottom-up workflow. With the emergence of machine learning algorithms and real-time searching, more de novo sequencing approaches [200] and more ‘on the fly’ database generators might come of age.

8. Expert Opinion

While proteomics has established itself as a crucial suite of technologies in drug discovery, there remains an untapped potential that goes beyond the field incrementally improving current applications. For example, there are ongoing efforts to miniaturize proteomics-capable mass spectrometers and to simplify their usage with the aim of bringing the mass spectrometer to the bedside of a patient or the office of a clinician, for diagnostics and biomarker analysis. In addition, while mass spectrometry currently remains the primary analytical approach for the characterization of peptides and proteins, additional technologies to characterize proteins are emerging: single-molecule sequencing techniques are being developed, and antibody-based readouts are becoming more sophisticated as they merge with DNA-barcoding and other far more sensitive technologies. As new cell biology arenas such as synthetic biology become more mainstream, non-canonical amino acids, used both as tools for spatial and temporal analysis of proteome dynamics and as reagents for engineering new chemistries or functionalities into proteins, will need to be analyzed in a robust and sensitive manner. Metabolic labeling of proteins with non-canonical amino acids allows incorporation of bioorthogonal chemical groups into proteins by taking advantage of both endogenous and heterologous protein synthesis machinery. These proteins can be further selectively conjugated to affinity reagents, nanoparticles or fluorophores, for a variety of biochemical or proteomic applications [201]. From a proteomics point of view, synthetic biology approaches complicate proteomics data analysis as additional masses and unique fragmentation profiles are introduced.

In the proteomics community, we have traditionally navigated our bioanalytical analyses on the assumption that we understand the composition of the proteome. However, the increasing understanding of non-canonical translation events and smORFs and the recognition of their unique cellular functions (e.g. the emergence of additional dark matter antigens in the MHC ligandome world [202] and spliced peptides [203]) have demonstrated that there is a plethora of previously unknown proteinaceous material lurking in our cells that warrants attention, both for understanding what our baseline database for searching should look like and for dissecting the functionality of these new protein-based entities.
New and diverse findings of clinical relevance will emerge in the next decade, and these ‘unknown unknowns’ in terms of how the proteome can be modulated beyond our current understanding will continue to shape the role of proteomics in drug discovery. This required diversification of the proteomic space sampled in our research importantly also relates to the clinical space: as a community, we need to generate data sets that are not just European descent-centric, but ensure inclusion of data generated from participants and patients of African, Asian, or Native Indigenous populations.

Taken together, since translational and post-translational events are primary readouts for the cell’s biological functionality, we expect that proteomics will remain a key technology in the pharmaceutical and biotechnological arena in the coming decade. However, its footprint within the drug discovery process will depend on its adaptability to the changing needs with regard to the type of data it can provide, the ease, cost and throughput of data generation, as well as the ability to contextualize generated data and turn them into clinically relevant information and hypotheses. Therefore, it will be exciting to watch how this scientific area will evolve in terms of methodology, instrumentation and software, as well as data integration: it will no doubt look very different in the future than what we consider feasible right now.

Acknowledgments

We thank Allison Bruce for her help with the graphics and Orit Rosenblatt-Rosen and Mark McCarthy for insightful review.

Declaration of interest

J. Lill, R. Mathews and C. Rose are employees of Genentech Inc. M. Schirle is an employee of Novartis. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

Reviewer disclosures

Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.

References

Papers of special note have been highlighted as either of interest (•) or of considerable interest (••) to readers.
1. Frantzi M, Latosinska A, Mischak H. Proteomics in drug development: the dawn of a new era? PROTEOMICS – Clinical Applications. 2019 Mar;13(2):e1800087.
2. Kim N, Kim HK, Lee K, et al. Single-cell RNA sequencing demonstrates the molecular and cellular reprogramming of metastatic lung adenocarcinoma. Nat Commun. 2020 May 8;11(1):2285.
3. Philpott M, Cribbs AP, Brown T Jr., et al. Advances and challenges in epigenomic single-cell sequencing applications. Curr Opin Chem Biol. 2020 Aug;57:17–26.
4. Slyper M, Porter CBM, Ashenberg O, et al. A single-cell and single-nucleus RNA-Seq toolbox for fresh and frozen human tumors. Nat Med. 2020 May;26(5):792–802.
5. Brunner A-D, Thielert M, Vasilopoulou CG, et al. Ultra-high sensitivity mass spectrometry quantifies single-cell proteome changes upon perturbation. In: bioRxiv. 2021.
6. Liu Y, Beyer A, Aebersold R. On the Dependency Of Cellular Protein Levels on mRNA abundance. Cell. 2016 Apr 21;165(3):535–550.
7. Zhu Y, Piehowski PD, Zhao R, et al. Nanodroplet processing platform for deep and quantitative proteome profiling of 10-100 mammalian cells. Nat Commun. 2018 Feb 28;9(1):882.
8. Cox J, Hein MY, Luber CA, et al. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol Cell Proteomics. 2014 Sep;13(9):2513–2526.
9. Dou M, Clair G, Tsai CF, et al. High-Throughput single cell proteomics enabled by multiplex isobaric labeling in a nanodroplet sample preparation platform. Anal Chem. 2019 Oct 15;91(20):13119–13127.
10. Li Z-Y, Huang M, Wang X-K, et al. Nanoliter-Scale oil-air-droplet chip-based single cell proteomic analysis. Anal Chem. 2018 Apr 17;90(8):5430–5438.
11. Shao X, Wang X, Guan S, et al. Integrated proteome analysis device for fast single-cell protein profiling. Anal Chem. 2018 Dec 4;90(23):14003–14010.
12. Hortin GL, Sviridov D. The dynamic range problem in the analysis of the plasma proteome. J Proteomics. 2010 Jan 3;73(3):629–636.
13. Mun DG, Nam D, Kim H, et al. Accurate precursor mass assignment improves peptide identification in data-independent acquisition mass spectrometry. Anal Chem. 2019 Jul 2;91(13):8453–8460.
14. Ow SY, Salim M, Noirel J, et al. iTRAQ underestimation in simple and complex mixtures: “the good, the bad and the ugly.” J Proteome Res. 2009 Nov;8(11):5347–5355.
15. Budnik B, Levy E, Harmange G, et al. SCoPE-MS: mass spectrometry of single mammalian cells quantifies proteome heterogeneity during cell differentiation. Genome Biol. 2018;19(1):161.
• First description of single cell proteomics with common laboratory cell line strains and the seminal paper that started the current excitement in single cell proteomics. This is the basis for many of the current single cell proteomics workflows.
16. Tsai CF, Zhao R, Williams SM, et al. An improved boosting to amplify signal with isobaric labeling (iBASIL) strategy for precise quantitative single-cell proteomics. Mol Cell Proteomics. 2020 May;19(5):828–838.
17. Cheung TK, Lee CY, Bayer FP, et al. Defining the carrier proteome limit for single-cell proteomics. Nat Methods. 2021 Jan;18(1):76–83.
18. Stopfer LE, Conage-Pough JE, White FM. Quantitative consequences of protein carriers in immunopeptidomics and tyrosine phosphorylation MS2 analyses. In: bioRxiv. 2021 May 28;20.
19. Erickson BK, Rose CM, Braun CR, et al. A strategy to combine sample multiplexing with targeted proteomics assays for high-throughput protein signature characterization. Mol Cell. 2017 Jan 19;65(2):361–370.
20. Rose CM, Erickson BK, Schweppe DK, et al. TomahaqCompanion: a tool for the creation and analysis of isobaric label based multiplexed targeted assays. J Proteome Res. 2019 Feb 1;18(2):594–605.
21. Shishkova E, Hebert AS, Coon JJ. Now, more than ever, proteomics needs better chromatography. Cell Syst. 2016 Oct 26;3(4):321–324.
22. Hebert AS, Thoing C, Riley NM, et al. Improved precursor characterization for data-dependent mass spectrometry. Anal Chem. 2018 Feb 6;90(3):2333–2340.
23. Graumann J, Scheltema RA, Zhang Y, et al. A framework for intelligent data acquisition and real-time database searching for shotgun proteomics. Mol Cell Proteomics. 2012 Mar;11(3):M111.013185.
24. Bailey DJ, Rose CM, McAlister GC, et al. Instant spectral assignment for advanced decision tree-driven mass spectrometry. Proceedings of the National Academy of Sciences of the United States of America. 2012 May 29;109(22):8411–8416.
25. Wichmann C, Meier F, Virreira Winter S, et al. MaxQuant.Live enables global targeting of more than 25,000 peptides. Mol Cell Proteomics. 2019 May;18(5):982–994.
26. McAlister GC, Nusinow DP, Jedrychowski MP, et al. MultiNotch MS3 enables accurate, sensitive, and multiplexed detection of differential expression across cancer cell line proteomes. Anal Chem. 2014 Jul 15;86(14):7150–7158.
27. Ting L, Rad R, Gygi SP, et al. MS3 eliminates ratio distortion in isobaric multiplexed quantitative proteomics. Nat Methods. 2011 Oct 2;8(11):937–940.
28. Erickson BK, Mintseris J, Schweppe DK, et al. Active instrument engagement combined with a real-time database search for improved performance of sample multiplexing workflows. J Proteome Res. 2019 Mar 1;18(3):1299–1306.
29. Schweppe DK, Eng JK, Yu Q, et al. Full-featured, real-time database searching platform enables fast and accurate multiplexed quantitative proteomics. J Proteome Res. 2020 May 1;19(5):2026–2034.
30. Assarsson E, Lundberg M, Holmquist G, et al. Homogenous 96-plex PEA immunoassay exhibiting high sensitivity, specificity, and excellent scalability. PLoS One. 2014;9(4):e95192.
31. Rohloff JC, Gelinas AD, Jarvis TC, et al. Nucleic acid ligands with protein-like side chains: modified aptamers and their use as diagnostic and therapeutic agents. Mol Ther Nucleic Acids. 2014 Oct;7(3):e201.
32. Swaminathan J, Boulgakov AA, Hernandez ET, et al. Highly parallel single-molecule identification of proteins in zeptomole-scale mixtures. Nat Biotechnol. 2018 Oct 22:1038.
• An example of a non-mass spectrometry based proteomics method that enables single molecule detection and quantification of protein molecules. Has the potential to significantly improve sensitivity of proteomics experiments.
33. Swaminathan J, Boulgakov AA, Marcotte EM. A theoretical justification for single molecule peptide sequencing. PLoS Comput Biol. 2015 Feb;11(2):e1004080.
34. Rodriguez-Larrea D, Bayley H. Multistep protein unfolding during nanopore translocation. Nat Nanotechnol. 2013 Apr;8(4):288–295.
35. Helland SJ, Ewan RC, Trenkle A, et al. In vivo metabolism of leucine and alpha-ketoisocaproate in the pig: influence of dietary glucose or sucrose. J Nutr. 1986 Oct;116(10):1902–1909.
36. Kennedy E, Dong Z, Tennant C, et al. Reading the primary structure of a protein with 0.07 nm(3) resolution using a subnanometre-diameter pore. Nat Nanotechnol. 2016 Nov;11(11):968–976.
37. Howorka S, Siwy ZS. Reading amino acids in a nanopore. Nat Biotechnol. 2020 Feb;38(2):159–160.
38. Ouldali H, Sarthak K, Ensslen T, et al. Electrical recognition of the twenty proteinogenic amino acids using an aerolysin nanopore. Nat Biotechnol. 2020 Feb;38(2):176–181.
39. Rosen CB, Rodriguez-Larrea D, Bayley H. Single-molecule site-specific detection of protein phosphorylation with a nanopore. Nat Biotechnol. 2014 Feb;32(2):179–181.
40. Oikonomou P, Salatino R, Tavazoie S. In vivo mRNA display enables large-scale proteomics by next generation sequencing. Proceedings of the National Academy of Sciences of the United States of America. 2020 Oct 27;117(43):26710–26718.
41. Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods. 2014;11(11):1114–1125.
42. Yadav M, Jhunjhunwala S, Phung QT, et al. Predicting immunogenic tumour mutations by combining mass spectrometry and exome sequencing. Nature. 2014 Nov 27;515(7528):572–576.
43. Blevins WR, Tavella T, Moro SG, et al. Extensive post-transcriptional buffering of gene expression in the response to severe oxidative stress in baker’s yeast. Sci Rep. 2019 Jul 29;9(1):11005.
44. Chen J, Brunner AD, Cogan JZ, et al. Pervasive functional translation of noncanonical human open reading frames. Science. 2020 Mar 6;367(6482):1140–1146.
45. Ruiz Cuevas MV, Hardy MP, Holly J, et al. Most non-canonical proteins uniquely populate the proteome or immunopeptidome. Cell Rep. 2021 Mar 9;34(10):108815.
46. Ouspenskaia T, Law T, Clauser KR, et al. Thousands of novel unannotated proteins expand the MHC I immunopeptidome in cancer. bioRxiv. 2020.
• Finding novel candidates for targeted immunotherapies (e.g. personalized cancer vaccines or engineered T cell therapies) has traditionally been limited to tumor associated antigens and shared cancer mutations. This paper describes the discovery of non-canonical peptide targets that could drastically expand therapeutic target space.
47. Overmyer KA, Shishkova E, Miller IJ, et al. Large-scale multi-omic analysis of COVID-19 severity. Cell Syst. 2021 Jan 20;12(1):23–40.e7.
48. Canela-Xandri O, Rawlik K, Tenesa A. An atlas of genetic associations in UK Biobank. Nat Genet. 2018 Nov;50(11):1593–1599.
49. Du X, Yang F, Manes NP, et al. Linear discriminant analysis-based estimation of the false discovery rate for phosphopeptide identifications. J Proteome Res. 2008 Jun;7(6):2195–2203.
50. Fondrie WE, Noble WS. mokapot: fast and flexible semisupervised learning for peptide detection. J Proteome Res. 2021 Feb 17;20(4):1966–1971.
51. Kall L, Canterbury JD, Weston J, et al. Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat Methods. 2007 Nov;4(11):923–925.
52. Degroeve S, Martens L. MS2PIP: a tool for MS/MS peak intensity prediction. Bioinformatics. 2013 Dec 15;29(24):3199–3203.
53. Gabriels R, Martens L, Degroeve S. Updated MS(2)PIP web server delivers fast and accurate MS(2) peak intensity prediction for multiple fragmentation methods, instruments and labeling techniques. Nucleic Acids Res. 2019 Jul 2;47(W1):W295–W299.
54. Gessulat S, Schmidt T, Zolg DP, et al. Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning. Nat Methods. 2019 Jun;16(6):509–518.
55. Tiwary S, Levy R, Gutenbrunner P, et al. High-quality MS/MS spectrum prediction for data-dependent and data-independent acquisition data analysis. Nat Methods. 2019 Jun;16(6):519–525.
56. Demichev V, Messner CB, Vernardis SI, et al. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat Methods. 2020 Jan;17(1):41–44.
57. Yang Y, Liu X, Shen C, et al. In silico spectral libraries by deep learning facilitate data-independent acquisition proteomics. Nat Commun. 2020 Jan 9;11(1):146.
58. Meier F, Kohler ND, Brunner AD, et al. Deep learning the collisional cross sections of the peptide universe from a million experimental values. Nat Commun. 2021 Feb 19;12(1):1185.
59. Jiang L, Wang M, Lin S, et al. A quantitative proteome map of the human body. Cell. 2020 Oct 1;183(1):269–283.e19.
60. Clark DJ, Dhanasekaran SM, Petralia F, et al. Integrated proteogenomic characterization of clear cell renal cell carcinoma. Cell. 2019 Oct 31;179(4):964–983.e31.
61. Gillette MA, Satpathy S, Cao S, et al. Proteogenomic characterization reveals therapeutic vulnerabilities in lung adenocarcinoma. Cell. 2020 Jul 9;182(1):200–225.e35.
62. Lindgren CM, Adams DW, Kimball B, et al. Simplified and unified access to cancer proteogenomic data. J Proteome Res. 2021 Apr 2;20(4):1902–1910.
63. Nusinow DP, Szpyt J, Ghandi M, et al. Quantitative proteomics of the cancer cell line encyclopedia. Cell. 2020 Jan 23;180(2):387–402.e16.
64. Choi M, Carver J, Chiva C, et al. MassIVE.quant: a community resource of quantitative mass spectrometry-based proteomics datasets. Nat Methods. 2020 Oct;17(10):981–984.
65. Wang M, Wang J, Carver J, et al. Assembling the community-scale discoverable human proteome. Cell Syst. 2018 Oct 24;7(4):412–421.e5.
66. Cole DJ, Weil DP, Shamamian P, et al. Identification of MART-1-specific T-cell receptors: T cells utilizing distinct T-cell receptor variable and joining regions recognize the same tumor epitope. Cancer Res. 1994 Oct 15;54(20):5265–5268.
67. Robbins PF, Kassim SH, Tran TL, et al. A pilot trial using lymphocytes genetically engineered with an NY-ESO-1-reactive T-cell receptor: long-term follow-up and correlates with response. Clin Cancer Res. 2015 Mar 1;21(5):1019–1027.
68. DiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: new estimates of R&D costs. J Health Econ. 2016 May;47:20–33.
69. Wouters OJ, McKee M, Luyten J. Estimated research and development investment needed to bring a new medicine to market, 2009-2018. JAMA. 2020 Mar 3;323(9):844–853.
70. Dowden H, Munro J. Trends in clinical success rates and therapeutic focus. Nat Rev Drug Discov. 2019 Jul;18(7):495–496.
71. Waring MJ, Arrowsmith J, Leach AR, et al. An analysis of the attrition of drug candidates from four major pharmaceutical companies. Nat Rev Drug Discov. 2015 Jul;14(7):475–486.
72. Comess KM, McLoughlin SM, Oyer JA, et al. Emerging approaches for the identification of protein targets of small molecules - A practitioners’ perspective. J Med Chem. 2018 Oct 11;61(19):8504–8535.
73. Schirle M, Jenkins JL. Identifying compound efficacy targets in phenotypic drug discovery. Drug Discov Today. 2016 Jan;21(1):82–89.
74. Schirle M, Jenkins JL. Chapter 5. Contemporary techniques for target deconvolution and mode of action elucidation. Phenotypic drug discovery. Drug Discovery. 2020;83–103.
75. Ito T, Ando H, Suzuki T, et al. Identification of a primary target of thalidomide teratogenicity. Science. 2010 Mar 12;327(5971):1345–1350.
76. Bantscheff M, Eberhard D, Abraham Y, et al. Quantitative chemical proteomics reveals mechanisms of action of clinical ABL kinase inhibitors. Nat Biotechnol. 2007 Sep;25(9):1035–1044.
77. Gower CM, Thomas JR, Harrington E, et al. Conversion of a single polypharmacological agent into selective bivalent inhibitors of intracellular kinase activity. ACS Chem Biol. 2016 Jan 15;11(1):121–131.
78. Patricelli MP, Szardenings AK, Liyanage M, et al. Functional interrogation of the kinome using nucleotide acyl phosphates. Biochemistry. 2007 Jan 16;46(2):350–358.
79. Klaeger S, Heinzlmeir S, Wilhelm M, et al. The target landscape of clinical kinase drugs. Science. 2017 Dec 1;358(6367).
80. Thomas JR, Brittain SM, Lipps J, et al. A photoaffinity labeling-based chemoproteomics strategy for unbiased target deconvolution of small molecule drug candidates. Methods Mol Biol. 2017;1647:1–18.
81. Nolin E, Gans S, Llamas L, et al. Discovery of a ZIP7 inhibitor from a Notch pathway screen. Nat Chem Biol. 2019 Feb;15(2):179–188.
• An example of how photoaffinity labeling-based chemoproteomics in combination with complementary approaches to target and MoA elucidation can enable the identification of a member of a challenging protein class as the efficacy target of a phenotypic screening hit.
82. Parker CG, Kuttruff CA, Galmozzi A, et al. Chemical proteomics identifies SLC25A20 as a functional target of the ingenol class of actinic keratosis drugs. ACS Cent Sci. 2017 Dec 27;3(12):1276–1285.
83. Parker CG, Galmozzi A, Wang Y, et al. Ligand and target discovery by fragment-based screening in human cells. Cell. 2017 Jan 26;168(3):527–541.e29.
84. Wang Y, Dix MM, Bianco G, et al. Expedited mapping of the ligandable proteome using fully functionalized enantiomeric probe pairs. Nat Chem. 2019 Dec;11(12):1113–1123.
85. Rossin R, van den Bosch SM, ten Hoeve W, et al. Highly reactive trans-cyclooctene tags with improved stability for Diels-Alder chemistry in living systems. Bioconjug Chem. 2013 Jul 17;24(7):1210–1217.
86. Rutkowska A, Thomson DW, Vappiani J, et al. A modular probe strategy for drug localization, target identification and target occupancy measurement on single cell level. ACS Chem Biol. 2016 Sep 16;11(9):2541–2550.
87. Flaxman HA, Miyamoto DK, Woo CM. Small molecule interactome mapping by photo-affinity labeling (SIM-PAL) to identify binding sites of small molecules on a proteome-wide scale. Curr Protoc Chem Biol. 2019 Dec;11(4):e75.
88. Liu Y, Patricelli MP, Cravatt BF. Activity-based protein profiling: the serine hydrolases. Proceedings of the National Academy of Sciences of the United States of America. 1999 Dec 21;96(26):14694–14699.
89. Shenoy VM, Thompson BR, Shi J, et al. Chemoproteomic identification of serine hydrolase RBBP9 as a valacyclovir-activating enzyme. Mol Pharm. 2020 May 4;17(5):1706–1714.
90. Weerapana E, Wang C, Simon GM, et al. Quantitative reactivity profiling predicts functional cysteines in proteomes. Nature. 2010 Dec 9;468(7325):790–795.
91. Kuljanin M, Mitchell DC, Schweppe DK, et al. Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries. Nat Biotechnol. 2021 Jan;39:630–641.
92. Patricelli MP, Janes MR, Li LS, et al. Selective inhibition of oncogenic KRAS output with small molecules targeting the inactive state. Cancer Discov. 2016 Mar;6(3):316–329.
93. Spradlin JN, Hu X, Ward CC, et al. Harnessing the anti-cancer natural product nimbolide for targeted protein degradation. Nat Chem Biol. 2019 Jul;15(7):747–755.
94. Blewett MM, Xie J, Zaro BW, et al. Chemical proteomic map of dimethyl fumarate-sensitive cysteines in primary human T cells. Sci Signal. 2016 Sep 13;9(445):rs10.
95. Zaro BW, Vinogradova EV, Lazar DC, et al. Dimethyl fumarate disrupts human innate immune signaling by targeting the IRAK4-MyD88 complex. J Immunol. 2019 May 1;202(9):2737–2746.
96. Backus KM, Correia BE, Lum KM, et al. Proteome-wide covalent ligand discovery in native biological systems. Nature. 2016 Jun 23;534(7608):570–574.
• First application of chemoproteomics to screening a compound library to identify ligandable pockets for covalent ligands across a cellular proteome.
97. Hacker SM, Backus KM, Lazear MR, et al. Global profiling of lysine reactivity and ligandability in the human proteome. Nat Chem. 2017 Dec;9(12):1181–1190.
98. Lin S, Yang X, Jia S, et al. Redox-based reagents for chemoselective methionine bioconjugation. Science. 2017 Feb 10;355(6325):597–602.
99. Hahm HS, Toroitich EK, Borne AL, et al. Global targeting of functional tyrosines using sulfur-triazole exchange chemistry. Nat Chem Biol. 2020 Feb;16(2):150–159.
100. Hacker SM, Nesvizhskii AI, Toste FD, et al. Profiling the proteome-wide selectivity of diverse electrophiles. In: ChemRxiv. 2021.
101. Browne CM, Jiang B, Ficarro SB, et al. A chemoproteomic strategy for direct and proteome-wide covalent inhibitor target-site identification. J Am Chem Soc. 2019 Jan 9;141(1):191–203.
102. Wijeratne A, Xiao J, Reutter C, et al. Chemical proteomic characterization of a covalent KRASG12C inhibitor. ACS Med Chem Lett. 2018 Jun 14;9(6):557–562.
103. Martinez Molina D, Jafari R, Ignatushchenko M, et al. Monitoring drug target engagement in cells and tissues using the cellular thermal shift assay. Science. 2013 Jul 5;341(6141):84–87.
104. Savitski MM, Reinhard FB, Franken H, et al. Tracking cancer drugs in living cells by thermal profiling of the proteome. Science. 2014 Oct 3;346(6205):1255784.
105. Becher I, Werner T, Doce C, et al. Thermal profiling reveals phenylalanine hydroxylase as an off-target of panobinostat. Nat Chem Biol. 2016 Nov;12(11):908–910.
106. Kalxdorf M, Gunthner I, Becher I, et al. Cell surface thermal proteome profiling tracks perturbations and drug targets on the plasma membrane. Nat Methods. 2021 Jan;18(1):84–91.
107. Kawatkar A, Schefter M, Hermansson N-O, et al. CETSA beyond soluble targets: a broad application to multipass transmembrane proteins. ACS Chem Biol. 2019 Sep 20;14(9):1913–1920.
108. Reinhard FBM, Eberhard D, Werner T, et al. Thermal proteome profiling monitors ligand interactions with cellular membrane proteins. Nat Methods. 2015 Dec;12(12):1129–1131.
109. Perrin J, Werner T, Kurzawa N, et al. Identifying drug targets in tissues and whole blood with thermal-shift profiling. Nat Biotechnol. 2020 Mar;38(3):303–308.
• Application of Thermal Proteome Profiling-based chemoproteomics to patient-derived samples, opening the door for
524 J. R. LILL ET AL.

clinical applications for target engagement and off-target 131. Kearney P, Boniface JJ, Price ND, et al. The building blocks of
identification. successful translation of proteomics to the clinic. In: Current opi­
110. Lomenick B, Hao R, Jonai N, et al. Target identification using drug nion in biotechnology. Vol. 51. Jun; 2018. p. 123–129.
affinity responsive target stability (DARTS). Proceedings of the 132. Bringans SD, Ito J, Stoll T, et al. Comprehensive mass spectro­
National Academy of Sciences of the United States of America. metry based biomarker discovery and validation platform as
2009 Dec 22;106(51):21984–21989. applied to diabetic kidney disease. EuPA Open Proteom.
111. Feng Y, De Franceschi G, Kahraman A, et al. Global analysis of 2017;14:1–10.
protein structural changes in complex proteomes. Nat Biotechnol. 133. Bradshaw RA, Hondermarck H, Rodriguez H. Cancer proteomics
2014 Oct;32(10):1036–1044. and the elusive diagnostic biomarkers. Proteomics. 2019 Nov;19
112. Piazza I, Beaton N, Bruderer R, et al. A machine learning-based (21–22):e1800445.
chemoproteomic approach to identify drug targets and 134. Drucker E, Krapfenbauer K. Pitfalls and limitations in translation
binding sites in complex proteomes. Nat Commun. 2020 Aug from biomarker discovery to clinical utility in predictive and perso­
21;11(1):4200. nalised medicine. EPMA J. 2013 Feb 25;4(1):7.
113. Piazza I, Kochanowski K, Cappelletti V, et al. A map of 135. Geyer PE, Holdt LM, Teupser D, et al. Revisiting biomarker discovery
protein-metabolite interactions reveals principles of chemical by plasma proteomics. Mol Syst Biol. 2017 Sep 26;13(9):942.
communication. Cell. 2018 Jan 11;172(1–2):358–372 e23. 136. Percy AJ, Byrns S, Pennington SR, et al. Clinical translation of
114. Gaetani M, Sabatier P, Saei AA, et al. Proteome integral solubility MS-based, quantitative plasma proteomics: status, challenges,
alteration: a high-throughput proteomics assay for target requirements, and potential. Expert Rev Proteomics. 2016 Jul;13
deconvolution. J Proteome Res. 2019 Nov 1;18(11):4027–4037. (7):673–684.
115. Liu Y-K, Chen H-Y, Chueh PJ, et al. A one-pot analysis approach to 137. Gromova M, Vaggelas A, Dallmann G, et al. Biomarkers: opportu­
simplify measurements of protein stability and folding kinetics. nities and challenges for drug development in the current regula­
Biochimica Et Biophysica Acta Proteins and Proteomics. 2019 tory landscape. Biomark Insights. 2020;15:1177271920974652.
Mar;1867(3):184–193. 138. BEST (Biomarkers, EndpointS, and other Tools) Resource. Silver
116. Mateus A, Bobonis J, Kurzawa N, et al. Thermal proteome profiling Spring (MD): Food and Drug Administration (US); Bethesda
in bacteria: probing protein state in vivo. Mol Syst Biol. 2018 Jul (MD): National Institutes of Health (US), 2016.
6;14(7):e8242. • Excellent “living” resource from the FDA and NIH with clear,
117. Prabhu N, Dai L, Nordlund P. CETSA in integrated proteomics studies consistent definitions of the different types of biomarkers
of cellular processes. Curr Opin Chem Biol. 2020 Feb;54:54–62. and clinical endpoints including examples, background infor­
118. Kronke J, Udeshi ND, Narla A, et al. Lenalidomide causes selective mation and references.
degradation of IKZF1 and IKZF3 in multiple myeloma cells. Science. 139. Amur S, LaVange L, Zineh I, et al. Biomarker qualification: toward
2014 Jan 17;343(6168):301–305. a multiple stakeholder framework for biomarker development,
119. Huang HT, Dobrovolsky D, Paulk J, et al. A chemoproteomic regulatory acceptance, and utilization. Clin Pharmacol Ther. 2015
approach to query the degradable kinome using a multi-kinase Jul;98(1):34–46.
degrader. Cell Chem Biol. 2018 Jan 18;25(1):88–99 e6. 140. Miller BE, Tal-Singer R, Rennard SI, et al. Plasma fibrinogen qualifi­
120. Savitski MM, Zinn N, Faelth-Savitski M, et al. Multiplexed proteome cation as a drug development tool in chronic obstructive pulmon­
dynamics profiling reveals mechanisms controlling protein ary disease. Perspective of the chronic obstructive pulmonary
homeostasis. Cell. 2018 Mar 22;173(1):260–274 e25. disease biomarker qualification consortium. Am J Respir Crit Care
121. Meier F, Geyer PE, Virreira Winter S, et al. BoxCar acquisition Med. 2016 Mar 15;193(6):607–613.
method enables single-shot proteomics at a depth of 10,000 141. Nemirovskiy OV, Dufield DR, Sunyer T, et al. Discovery and devel­
proteins in 100 minutes. Nat Methods. 2018 Jun;15(6):440–448. opment of a type II collagen neoepitope (TIINE) biomarker for
122. Ruprecht B, Di Bernardo J, Wang Z, et al. A mass matrix metalloproteinase activity: from in vitro to in vivo. Anal
spectrometry-based proteome map of drug action in lung cancer Biochem. 2007 Feb 1;361(1):93–101.
cell lines. Nat Chem Biol. 2020 Oct;16(10):1111–1119. 142. Li WW, Nemirovskiy O, Fountain S, et al. Clinical validation of an
123. Litichevskiy L, Peckner R, Abelin JG, et al. A library of phosphopro­ immunoaffinity LC-MS/MS assay for the quantification of a collagen
teomic and chromatin signatures for characterizing cellular type II neoepitope peptide: a biomarker of matrix metalloproteinase
responses to drug perturbations. Cell Syst. 2018 Apr 25;6(4):424– activity and osteoarthritis in human urine. Anal Biochem. 2007 Oct
443 e7. 1;369(1):41–53.
124. Subramanian A, Narayan R, Corsello SM, et al. A next generation 143. Settle S, Vickery L, Nemirovskiy O, et al. Cartilage degradation
connectivity map: L1000 platform and the first 1,000,000 profiles. biomarkers predict efficacy of a novel, highly selective matrix
Cell. 2017 Nov 30;171(6):1437–1452 e17. metalloproteinase 13 inhibitor in a dog model of osteoarthritis:
125. Lai AC, Crews CM. Induced protein degradation: an emerging confirmation by multivariate analysis that modulation of type II
drug discovery paradigm. Nat Rev Drug Discov. 2017 Feb;16 collagen and aggrecan degradation peptides parallels pathologic
(2):101–114. changes. Arthritis Rheumatism. 2010 Oct;62(10):3006–3015.
126. Banik SM, Pedram K, Wisnovsky S, et al. Lysosome-targeting chi­ 144. MP HLG, KD B, SA M, et al. Association between concentrations of
maeras for degradation of extracellular proteins. Nature. 2020 urinary type II collagen neoepitope (uTIINE) and joint space nar­
Aug;584(7820):291–297. rowing in patients with knee osteoarthritis. Osteoarthritis Cartilage.
127. Siriwardena SU, DNP MG, Shoba VM, et al. Phosphorylation- 2006 Nov;14(11):1189–1195.
inducing chimeric small molecules. J Am Chem Soc. 2020 Aug 145. Tom I, Pham VC, Katschke KJ Jr., et al. Development of
19;142(33):14052–14057. a therapeutic anti-HtrA1 antibody and the identification of DKK3
128. Yamazoe S, Tom J, Fu Y, et al. Heterobifunctional molecules induce as a pharmacodynamic biomarker in geographic atrophy.
dephosphorylation of kinases-A proof of concept study. J Med Proceedings of the National Academy of Sciences of the United
Chem. 2020 Mar 26;63(6):2807–2813. States of America. 2020 May 5;117(18):9952–9963.
129. Anderson NL, Anderson NG. The human plasma proteome: history, 146. Stokes MP, Farnsworth CL, Moritz A, et al. PTMScan direct: identi­
character, and diagnostic prospects. Mol Cell Proteomics. 2002 fication and quantification of peptides from critical signaling pro­
Nov;1(11):845–867. teins by immunoaffinity enrichment coupled with LC-MS/MS. Mol
130. Rifai N, Gillette MA, Carr SA. Protein biomarker discovery and Cell Proteomics. 2012 May;11(5):187–201.
validation: the long and uncertain path to clinical utility. Nat 147. Noto PB, Sikorski TW, Zappacosta F, et al. Identification of
Biotechnol. 2006 Aug;24(8):971–983. hnRNP-A1 as a pharmacodynamic biomarker of type I PRMT
EXPERT REVIEW OF PROTEOMICS 525

inhibition in blood and tumor tissues. Sci Rep. 2020 Dec 17;10 167. Nigjeh EN, Chen R, Brand RE, et al. Quantitative proteomics based
(1):22155. on optimized data-independent acquisition in plasma analysis.
148. Pedrero-Prieto CM, Garcia-Carpintero S, Frontinan-Rubio J, et al. J Proteome Res. 2017 Feb 3;16(2):665–676.
A comprehensive systematic review of CSF proteins and peptides 168. Tsai T-H, Choi M, Banfai B, et al. Selection of features with
that define Alzheimer’s disease. Clin Proteomics. 2020;17:21. consistent profiles improves relative protein quantification in
149. Wildsmith KR, Schauer SP, Smith AM, et al. Identification of mass spectrometry experiments. Mol Cell Proteomics. 2020;19
longitudinally dynamic biomarkers in Alzheimer’s disease cere­ (6):944–959.
brospinal fluid by targeted proteomics. Mol Neurodegener. 2014 169. Anjo SI, Simoes I, Castanheira P, et al. Use of recombinant proteins
6;Jun(9):22. as a simple and robust normalization method for untargeted pro­
150. Geyer PE, Kulak NA, Pichler G, et al. Plasma proteome profiling to teomics screening: exhaustive performance assessment. Talanta.
assess human health and disease. Cell Syst. 2016 Mar 23;2 2019 Dec;1(205):120163.
(3):185–195. 170. Chang CY, Sabido E, Aebersold R, et al. Targeted protein quantifi­
151. Geyer PE, Wewer Albrechtsen NJ, Tyanova S, et al. Proteomics cation using sparse reference labeling. Nat Methods. 2014 Mar;11
reveals the effects of sustained weight loss on the human plasma (3):301–304.
proteome. Mol Syst Biol. 2016 Dec 22;12(12):901. 171. Kotol D, Hunt H, Hober A, et al. Longitudinal plasma protein
152. Bache N, Geyer PE, Bekker-Jensen DB, et al. A novel LC system profiling using targeted proteomics and recombinant
embeds analytes in pre-formed gradients for rapid, ultra-robust protein standards. J Proteome Res. 2020 Dec 4;19(12):4815–4825.
proteomics. Mol Cell Proteomics. 2018 Nov;17(11):2284–2296. 172. Pino LK, Searle BC, Huang EL, et al. Calibration using a single-point
153. Bruderer R, Muntel J, Muller S, et al. Analysis of 1508 plasma external reference material harmonizes quantitative mass spectro­
samples by capillary-flow data-independent acquisition profiles metry proteomics data between platforms and laboratories. Anal
proteomics of weight loss and maintenance. Mol Cell Proteomics. Chem. 2018 Nov 6;90(21):13112–13117.
2019 Jun;18(6):1242–1254. 173. Pino LK, Searle BC, Yang HY, et al. Matrix-matched calibration
154. Lennon S, Hughes CJ, Muazzam A, et al. High-throughput micro­ curves for assessing analytical figures of merit in quantitative
bore ultrahigh-performance liquid chromatography-ion proteomics. J Proteome Res. 2020 Mar 6;19(3):1147–1153.
mobility-enabled-mass spectrometry-based proteomics methodol­ 174. Abbatiello S, Ackermann BL, Borchers C, et al. New guidelines for
ogy for the exploratory analysis of serum samples from large publication of manuscripts describing development and applica­
cohort studies. J Proteome Res. 2021 Mar 5;20(3):1705–1715. tion of targeted mass spectrometry measurements of peptides and
155. Messner CB, Demichev V, Wendisch D, et al. Ultra-high-throughput proteins. Mol Cell Proteomics. 2017 Mar;16(3):327–328.
clinical proteomics reveals classifiers of COVID-19 infection. Cell 175. Neubert H, Shuford CM, Olah TV, et al. Protein biomarker quantifi­
Syst. 2020 Jul 22;11(1):11–24 e4. cation by immunoaffinity liquid chromatography-tandem mass
156. Kaur G, Poljak A, Ali SA, et al. Extending the depth of human spectrometry: current state and future vision. Clin Chem. 2020
plasma proteome coverage using simple fractionation techniques. Feb 1;66(2):282–301.
J Proteome Res. 2021 Feb 5;20(2):1261–1279. 176. Smit NPM, Ruhaak LR, Romijn F, et al. The time has come for
157. Smith JG, Gerszten RE. Emerging affinity-based proteomic technol­ quantitative protein mass spectrometry tests that target
ogies for large-scale plasma profiling in cardiovascular disease. unmet clinical needs. J Am Soc Mass Spectrom. 2021 Mar 3;32
Circulation. 2017 Apr 25;135(17):1651–1664. (3):636–647.
158. Petrera A, von Toerne C, Behler J, et al. Multi-platforms approach 177. Kusebauch U, Campbell DS, Deutsch EW, et al. Human SRMAtlas:
for plasma proteomics: complementarity of Olink PEA technology a resource of targeted assays to quantify the complete human
to mass spectrometry-based protein profiling. J Proteome Res. proteome. Cell. 2016 Jul 28;166(3):766–778.
2021 Jan 1;20(1):751–762. 178. Bhowmick P, Roome S, Borchers CH, et al. An update on
• Comparision of DDA and DIA MS proteomics with Olink affinity MRMAssayDB: a comprehensive resource for targeted proteomics
based proteomics platforms illustrating the signigicant assays in the community. J Proteome Res. 2021 Mar;20(4):2105–
increase in proteome coverage that can be achieved by using 2115.
these complementary approaches. 179. Sobsey CA, Ibrahim S, Richard VR, et al. Targeted and untargeted
159. Thomas S, Hao L, Ricke WA, et al. Biomarker discovery in mass proteomics approaches in biomarker development. Proteomics.
spectrometry-based urinary proteomics. Proteomics Clin Appl. 2020 May;20(9):e1900029.
2016 Apr;10(4):358–370. 180. Marin-Vicente C, Mendes M, de Los Rios V, et al. Identification and
160. Jin P, Wang K, Huang C, et al. Mining the fecal proteome: from validation of stage-associated serum biomarkers in colorectal can­
biomarkers to personalised medicine. Expert Rev Proteomics. 2017 cer using MS-based procedures. Proteomics Clin Appl. 2020 Jan;14
May;14(5):445–459. (1):e1900052.
161. Ludwig C, Gillet L, Rosenberger G, et al. Data-independent 181. Tress ML, Abascal F, Valencia A. Most alternative isoforms are not
acquisition-based SWATH-MS for quantitative proteomics: a functionally important. Trends Biochem Sci. 2017 Jun;42
tutorial. Mol Syst Biol. 2018 Aug 13;14(8):e8126. (6):408–410.
162. Reubsaet L, Sweredoski MJ, Moradian A. Data-independent acquisi­ 182. Brown KA, Melby JA, Roberts DS, et al. Top-down proteomics:
tion for the orbitrap Q exactive HF: a tutorial. J Proteome Res. 2019 challenges, innovations, and applications in basic and clinical
Mar 1;18(3):803–813. research. Expert Rev Proteomics. 2020 Oct;17(10):719–733.
163. Barkovits K, Pacharra S, Pfeiffer K, et al. Reproducibility, specificity 183. Braun CR, Bird GH, Wuhr M, et al. Generation of multiple reporter
and accuracy of relative quantification using spectral library-based ions from a single isobaric reagent increases multiplexing capacity
data-independent acquisition. Mol Cell Proteomics. 2020 Jan;19 for quantitative proteomics. Anal Chem. 2015 Oct 6;87
(1):181–197. (19):9855–9863.
164. Pino LK, Just SC, MacCoss MJ, et al. Acquiring and analyzing data 184. Sokolina K, Kittanakom S, Snider J, et al. Systematic protein-protein
independent acquisition proteomics experiments without spec­ interaction mapping for clinically relevant human GPCRs. Mol Syst
trum libraries. Mol Cell Proteomics. 2020 Jul;19(7):1088–1103. Biol. 2017 Mar 15;13(3):918.
165. Searle BC, Pino LK, Egertson JD, et al. Chromatogram libraries 185. Husain B, Ramani SR, Chiang E, et al. A platform for extracellular
improve peptide detection and quantification by data indepen­ interactome discovery identifies novel functional binding partners
dent acquisition mass spectrometry. Nat Commun. 2018 Dec 3;9 for the immune receptors B7-H3/CD276 and PVR/CD155. Mol Cell
(1):5128. Proteomics. 2019 Nov;18(11):2310–2323.
166. Galitzine C, Egertson JD, Abbatiello S, et al. Nonlinear regression 186. Verschueren E, Husain B, Yuen K, et al. The immunoglobulin super­
improves accuracy of characterization of multiplexed mass spectro­ family receptome defines cancer-relevant networks associated with
metric assays. Mol Cell Proteomics. 2018 May;17(5):913–924. clinical outcome. Cell. 2020 Jul 23;182(2):329–344 e19.
526 J. R. LILL ET AL.

187. Blanco MJ. Building upon nature’s framework: overview of key 195. Gatto L, Breckels LM, Lilley KS. Assessing sub-cellular resolution in
strategies toward increasing drug-like properties of natural product spatial proteomics experiments. Curr Opin Chem Biol. 2019
cyclopeptides and macrocycles. Methods Mol Biol. Feb;48:123–149.
2019;2001:203–233. 196. Lundberg E, Borner GHH. Spatial proteomics: a powerful discovery
188. Gold L, Ayers D, Bertino J, et al. Aptamer-based multiplexed proteomic tool for cell biology. Nat Rev Mol Cell Biol. 2019 May;20(5):285–302.
technology for biomarker discovery. PLoS One. 2010 Dec 7;5(12): 197. Uhlen M, Fagerberg L, Hallstrom BM, et al. Proteomics.
e15004. Tissue-based map of the human proteome. Science. 2015 Jan
189. Samavarchi-Tehrani P, Samson R, Gingras AC. Proximity dependent 23;347(6220):1260419.
biotinylation: key enzymes and adaptation to proteomics 198. Maile TM, Izrael-Tomasevic A, Cheung T, et al. Mass spectrometric
approaches. Mol Cell Proteomics. 2020 May;19(5):757–773. quantification of histone post-translational modifications by
190. Lobingier BT, Huttenhain R, Eichel K, et al. An approach to spatio­ a hybrid chemical labeling method. Mol Cell Proteomics. 2015
temporally resolve protein interaction networks in living cells. Cell. Apr;14(4):1148–1158.
2017 Apr 6;169(2):350–360 e12. 199. Bae EJ, Kim DK, Kim C, et al. LRRK2 kinase regulates alpha-synuclein
191. MI S, Ting AY. Directed evolution improves the catalytic efficiency propagation via RAB35 phosphorylation. Nat Commun. 2018 Aug
of TEV protease. Nat Methods. 2020 Feb;17(2):167–174. 27;9(1):3465.
192. Mintseris J, Gygi SP High-density chemical cross-linking for mod­ 200. Johnson RS, Searle BC, Nunn BL, et al. Assessing protein sequence
eling protein interactions. Proceedings of the National Academy database suitability using de novo sequencing. Mol Cell
of Sciences of the United States of America. 2020 Jan 7;117 Proteomics. 2020 Jan;19(1):198–208.
(1):93–102. 201. Saleh AM, Wilding KM, Calve S, Bundy BC, Kinzer-Ursem TL. Non-
193. Geladaki A, Kocevar Britovsek N, Breckels LM, et al. Combining canonical amino acid labeling in proteomics and biotechnology.
LOPIT with differential ultracentrifugation for high-resolution spa­ J Biol Eng. 2019;13:43.
tial proteomics. Nat Commun. 2019 Jan 18;10(1):331. 202. Granados DP, Laumont CM, Thibault P, et al. The nature of self for T
194. Queiroz RML, Smith T, Villanueva E, et al. Comprehensive identifi­ cells-a systems-level perspective. Curr Opin Immunol. 2015 Jun;34:1–8.
cation of RNA-protein interactions in any organism using orthogo­ 203. Faridi P, Woods K, Ostrouska S, et al. Spliced peptides and
nal organic phase separation (OOPS). Nat Biotechnol. 2019 Feb;37 cytokine-driven changes in the immunopeptidome of melanoma.
(2):169–178. Cancer Immunol Res. 2020 Oct;8(10):1322–1334.