Show item record

dc.contributor.author: Tse, Sirius P. K.
dc.contributor.author: Beauchemin, Mathieu
dc.contributor.author: Morse, David
dc.contributor.author: Lo, Samuel C. L.
dc.date.accessioned: 2024-07-12T13:28:01Z
dc.date.available: NO_RESTRICTION [fr]
dc.date.available: 2024-07-12T13:28:01Z
dc.date.issued: 2017-11-19
dc.identifier.uri: http://hdl.handle.net/1866/33555
dc.publisher: Wiley [fr]
dc.subject: Dinoflagellate [fr]
dc.subject: MS-sequencing [fr]
dc.subject: Proteomics [fr]
dc.subject: Transcriptome [fr]
dc.title: Refining transcriptome gene catalogs by MS-validation of expressed proteins [fr]
dc.type: Article [fr]
dc.contributor.affiliation: Université de Montréal. Faculté des arts et des sciences. Département de sciences biologiques [fr]
dc.identifier.doi: 10.1002/pmic.201700271
dcterms.abstract: Protein sequence identification by tandem mass spectroscopy (LC-MS/MS) identifies thousands of protein sequences even in complex mixtures, and provides valuable insight into the biological functions of different cells. For non-model organisms, transcriptomes are generally used to allow peptide identification, an important addition to their use as a gene catalog allowing the potential metabolic activities of cells to be determined. We used LC-MS/MS data to identify which of the six possible reading frames in the transcriptome was actually used by the cell to make protein, and asked whether this would have an impact on downstream analyses using the dataset. We combined results from several LC-MS/MS experiments designed to identify peptide sequences in extracts from the dinoflagellate Lingulodinium polyedra using a 74 655-sequence transcriptome. We compiled a list of 6628 translated nucleic acid sequences that contained the ensemble of peptide matches (termed MS-validated sequences) and assessed the similarity in downstream analyses between this data set and the 6628 nucleic acid sequences from which they were derived. When compared with BLASTx analyses of the DNA sequences, the MS-validated protein sequences analyzed using BLASTp showed differences in gene ontology, had more identified BLAST hits, and contained more KEGG pathway enzymes. The MS-validated protein sequences also differ from datasets containing longest open reading frame (ORF) protein sequences. We also note a poor correlation between the levels of protein and mRNA abundance, a comparison not previously performed for dinoflagellates. The differences observed between analyses of MS-validated protein sequence and nucleic acid sequence datasets suggest use of the former may provide a more accurate representation of cellular capacity than the latter. Developing MS-validated protein sequence datasets may also speed interpretation of MS/MS spectra in bottom-up proteomics experiments. [fr]
dcterms.isPartOf: urn:ISSN:1615-9853 [fr]
dcterms.isPartOf: urn:ISSN:1615-9861 [fr]
dcterms.language: eng [fr]
UdeM.ReferenceFournieParDeposant: doi.org/10.1002/pmic.201700271 [fr]
UdeM.VersionRioxx: Version originale de l'auteur·e / Author's Original [fr]
oaire.citationTitle: Proteomics : proteomics and systems biology [fr]
oaire.citationVolume: 18 [fr]
oaire.citationIssue: 1 [fr]
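
The abstract describes using peptide matches to decide which of the six possible reading frames of a transcript is actually translated. The record gives no implementation details, so the following is only a minimal sketch of that idea, under stated assumptions: the standard genetic code (NCBI translation table 1) is assumed, a peptide "match" is reduced to an exact substring hit within a stop-free stretch of the translation (real LC-MS/MS search engines score spectra instead), and every function name here (`reverse_complement`, `translate`, `six_frame_translations`, `ms_validated_frames`) is hypothetical and not taken from the article.

```python
from typing import Dict, List

# Assumption: standard genetic code (NCBI table 1), codons ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' is stop, 'X' is unknown."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> Dict[int, str]:
    """Map frame labels +1, +2, +3, -1, -2, -3 to their translations."""
    rc = reverse_complement(dna)
    frames: Dict[int, str] = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


def ms_validated_frames(dna: str, peptides: List[str]) -> List[int]:
    """Return the frames whose stop-free segments contain at least one peptide.

    Simplification: exact substring matching stands in for a real
    spectrum-to-peptide search.
    """
    hits = []
    for frame, protein in six_frame_translations(dna).items():
        segments = protein.split("*")
        if any(pep in seg for pep in peptides for seg in segments):
            hits.append(frame)
    return sorted(hits)


# Toy transcript (invented for illustration) encoding MKTAYLR in frame +1.
transcript = "ATGAAAACCGCTTATCTGCGT"
print(ms_validated_frames(transcript, ["TAYLR"]))  # [1]
```

A translation selected this way, rather than the raw nucleotide sequence or the longest ORF, is what the abstract calls an MS-validated sequence; the toy transcript and peptide above are made up and carry no data from the study.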


Files in this item

PDF

This document disseminated on Papyrus is the exclusive property of the copyright holders and is protected by the Copyright Act (R.S.C. 1985, c. C-42). It may be used for fair dealing and non-commercial purposes, for private study or research, criticism and review as provided by law. For any other use, written authorization from the copyright holders is required.