Comet release 2021.01
Documentation for parameters for release 2021.01
can be found here.
release 2021.01 rev. 0 (2021.01.0), release date 2021/06/23
- Update the expectation value (E-value) calculation by improving the
determination of the tail region of the xcorr cumulative distribution for
the linear regression fit.
- New ThreadPool code by D. Shteynberg. The previous code apparently
had intermittent issues when using many (70+) threads.
- Make flanking (previous and next) residue reporting consistent when a peptide
is present in a protein multiple times and thus could have different flanking
residues within the same protein. Previous versions did not consistently report
the same set of flanking residues in repeated/replicate searches. The flanking
residues for the first occurrence of the peptide in a protein will now always be reported.
- Added a parameter "old_mods_encoding" to enable using the old character-based
modification encodings (e.g. DLYM*NCK) instead of mass-based encodings
(e.g. DLYM[15.9949]NCK) in the SQT output files.
Add "old_mods_encoding = 1" to the comet.params file to use the old modification
character encodings. This functionality was added to support post-processing tools
that have not been updated to handle the numeric modification encodings. This is
a "hidden" parameter in that it is not present in the example params file
generated by "comet.exe -p" nor is it in the sample parameters files available
for download. It must be manually added to your comet.params.
- The "print_expect_score" parameter is now deprecated; it will be treated as a
hidden parameter.
Anyone using SQT output who would rather have the Sp score instead of the E-value
reported will now have to manually add "print_expect_score = 0" to their params file.
- Added a no digestion (aka "no_cut", aka don't cleave anywhere) entry to the
comet.params file.
- The Windows Visual Studio solution is updated to compile with v142 build tools
using Visual Studio 2019.
- This version of Comet will also run using comet.params files from the 2020.01
releases as there have been no significant changes to the parameter entries.
- Known bug: in the mzIdentML output, the attribute "dBSequence_ref" for element
"PeptideEvidence" is incorrectly written as "DBSeqence_ref". On 2021/08/04, the
release files were updated to correct this. Thanks to A. Collins for reporting
the error.
- Here's a list of some known bugs that weren't addressed for this release.
Hopefully I can address some of these in a follow-up maintenance release.
  - Reported PEFF modification and substitution positions are off by 1 when
    the start methionine residue is cleaved (using "clip_nterm_methionine = 1").
  - The program will access restricted memory (negative array position) when
    "precursor_NL_ions = 1" is set. Presumably this can happen with other
    specified neutral loss masses besides "1" although that hasn't been tested
    yet. Until the underlying bug can be identified, Comet will now report a
    warning ("Error3") and skip the analysis of those neutral loss peaks and
    allow the search to complete.
  - The reported calculated peptide masses can vary by one digit in the 6th
    decimal place between replicate searches. This occurs very infrequently
    if at all.
  - The preliminary score rank (Sp rank) can vary between replicate searches.
    I've only observed the ranks differ by 1, e.g. 12 in one run and 13 in the
    other. This is sadly an issue associated with threading that cannot
    be addressed without a huge performance hit so this behavior will continue
    to exist going forward. Fortunately I don't believe this occurs frequently,
    especially for the "good" IDs. Plus the preliminary score rank and score are
    old retrofit values that are added solely for backwards compatibility; they
    were made unnecessary with the fast xcorr calculation from 2008.
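The first item in the list above mentions a linear regression fit on the tail of the xcorr cumulative distribution. As a rough illustration of that kind of fit (a sketch, not Comet's actual code), the Python below builds a histogram of xcorr scores, accumulates it from the right, and fits a line to log10 of the cumulative counts over the upper bins. The bin width, the fixed "tail_fraction" cutoff, and the function names "fit_tail" and "expect_value" are all assumptions made for this example; these notes do not say how the tail region is actually determined.

```python
import math


def fit_tail(xcorrs, bin_width=0.1, tail_fraction=0.5):
    """Fit log10(cumulative count) against xcorr over the upper bins.

    Returns (slope, intercept) of the least-squares line.
    The tail is assumed to be the top `tail_fraction` of the bins.
    """
    # Histogram of xcorr scores (negative scores are clipped into bin 0).
    n_bins = int(max(xcorrs) / bin_width) + 1
    hist = [0] * n_bins
    for x in xcorrs:
        hist[int(max(x, 0.0) / bin_width)] += 1

    # Cumulative counts from the right: candidates scoring at or above each bin.
    cumulative = hist[:]
    for i in range(n_bins - 2, -1, -1):
        cumulative[i] += cumulative[i + 1]

    # Keep only the assumed tail region, on a log10 scale.
    start = int(n_bins * (1.0 - tail_fraction))
    points = [
        (i * bin_width, math.log10(c))
        for i, c in enumerate(cumulative)
        if i >= start and c > 0
    ]

    # Ordinary least-squares fit of a straight line through the tail points.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def expect_value(xcorr, slope, intercept):
    """Extrapolate the fitted line to estimate an E-value for one xcorr."""
    return 10.0 ** (slope * xcorr + intercept)
```

Because the cumulative counts fall as xcorr rises, the fitted slope is negative and a higher xcorr maps to a smaller estimated E-value. How the tail is chosen affects the fit, which is presumably why the tail determination was revisited in this release.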
Go download from the download page.
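For post-processing tools affected by the "old_mods_encoding" change described in these notes, the two peptide string encodings can be converted in a few lines. The Python below is a minimal sketch: only the pairing of "*" with 15.9949 comes from the example in these notes, while the mapping table and the function names "to_mass_encoding" and "to_char_encoding" are hypothetical. In practice the characters and masses depend on the variable modifications set in your comet.params.

```python
import re

# Hypothetical mapping from old modification characters to mass deltas.
# Only '*' -> 15.9949 is taken from the release-note example.
MOD_CHAR_TO_MASS = {"*": 15.9949}


def to_mass_encoding(peptide, char_to_mass=MOD_CHAR_TO_MASS):
    """Rewrite a character-encoded peptide, e.g. 'DLYM*NCK',
    as a mass-encoded one, e.g. 'DLYM[15.9949]NCK'."""
    pattern = "[" + re.escape("".join(char_to_mass)) + "]"

    def repl(match):
        return "[%.4f]" % char_to_mass[match.group(0)]

    return re.sub(pattern, repl, peptide)


def to_char_encoding(peptide, char_to_mass=MOD_CHAR_TO_MASS):
    """Rewrite a mass-encoded peptide back to the old character encoding.
    Raises KeyError for a mass that is not in the mapping."""
    mass_to_char = {"%.4f" % m: c for c, m in char_to_mass.items()}

    def repl(match):
        return mass_to_char["%.4f" % float(match.group(1))]

    return re.sub(r"\[([0-9.]+)\]", repl, peptide)
```

Calling to_mass_encoding("DLYM*NCK") gives "DLYM[15.9949]NCK", and to_char_encoding reverses it. If the downstream tool cannot be changed, adding "old_mods_encoding = 1" to comet.params, as described above, avoids the conversion entirely.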