Spectrum in mgf file not found #2

@nukaemon

Dear developers

Thank you for providing this useful tool.
I set up DeepRescore on AWS (CentOS 7) with a GPU backend, and it ran successfully on the test data.

I have now tried it on my own data, but I encountered an error at the 'process_pDeep2_results' step.
The nextflow command I ran is shown below and should be fine. The identification file (output.2021_04_07_02_59_45.t.xml) was generated from the same MGF file (myown.mgf) using X!Tandem.

nextflow run ${DEEPRESCORE} --id_file output.2021_04_07_02_59_45.t.xml --ms_file myown.mgf --se xtandem --ms_instrument Lumos --ms_energy 0.34 --prefix d2 --decoy_prefix XXX_  --cpu 4 --mem 12

The error message reported by nextflow shows that the failure occurred during the Java step:

Command error:
  Exception in thread "main" java.io.IOException: Spectrum 'File: D:\Discoverer2_2Data\DiscovererDaemon\200611BPB\F24Z019E_1_5ul.raw; SpectrumID: 2220; scans: 2975' in mgf file 'myown.mgf' not found!
        at com.compomics.util.experiment.massspectrometry.SpectrumFactory.getSpectrum(SpectrumFactory.java:788)
        at com.compomics.util.experiment.massspectrometry.SpectrumFactory.getSpectrum(SpectrumFactory.java:730)
        at PDVGUI.GenerateSpectrumTable.process(GenerateSpectrumTable.java:84)
        at PDVGUI.GenerateSpectrumTable.<init>(GenerateSpectrumTable.java:31)
        at PDVGUI.GenerateSpectrumTable.main(GenerateSpectrumTable.java:21)

I looked at myown.mgf, and it does contain an entry for SpectrumID: 2220; scans: 2975:

.
.
.
BEGIN IONS
TITLE=File: "D:\Discoverer2_2Data\DiscovererDaemon\200611BPB\F24Z019E_1_5ul.raw"; SpectrumID: "2220"; scans: "2975"
PEPMASS=496.76181 16035.10645
CHARGE=2+
RTINSECONDS=724
SCANS=2975
168.998 9.06042
171.242 47.312
176.236 15.0943
183.230 16.0298
186.481 12.8785
.
.
.
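
One visible difference between the two strings is that the TITLE line in the MGF file wraps the file path, SpectrumID and scans values in double quotes, while the title in the error message has none. Whether this is what makes the lookup fail is an assumption, not something confirmed here. A minimal Python sketch for checking it is shown below; the helper names (`mgf_titles`, `strip_quotes`, `find_title`) are hypothetical and are not part of DeepRescore or PDV. It reports whether a wanted title matches a TITLE line exactly, only after removing double quotes, or not at all.

```python
def mgf_titles(lines):
    """Yield the value of every TITLE= line found in MGF-formatted lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("TITLE="):
            yield line[len("TITLE="):]


def strip_quotes(title):
    """Remove double quotes so quoted and unquoted titles compare equal."""
    return title.replace('"', "")


def find_title(wanted, titles):
    """Return 'exact', 'quotes-only' or 'missing' for the wanted title."""
    titles = list(titles)
    if wanted in titles:
        return "exact"
    if strip_quotes(wanted) in {strip_quotes(t) for t in titles}:
        return "quotes-only"
    return "missing"


# Title as it appears in the Java error message (no quotes).
wanted = (r"File: D:\Discoverer2_2Data\DiscovererDaemon\200611BPB"
          r"\F24Z019E_1_5ul.raw; SpectrumID: 2220; scans: 2975")

# Entry header copied from myown.mgf (values wrapped in double quotes).
sample_mgf = [
    "BEGIN IONS\n",
    r'TITLE=File: "D:\Discoverer2_2Data\DiscovererDaemon\200611BPB'
    r'\F24Z019E_1_5ul.raw"; SpectrumID: "2220"; scans: "2975"' + "\n",
    "PEPMASS=496.76181 16035.10645\n",
]

print(find_title(wanted, mgf_titles(sample_mgf)))  # prints "quotes-only"
```

To run the same check against the real file, pass an open file handle instead of `sample_mgf`, for example `find_title(wanted, mgf_titles(open("myown.mgf")))`.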

Also, SpectrumID "2220" appears on the first line of 'd2_format_titles.txt', so the error seems to occur immediately on loading 'd2_format_titles.txt'.

I also manually ran each command of the process_pDeep2_results step and confirmed that 'd2_spectrum_pairs.txt' was created but was still empty after running PDV-1.6.1.beta.features-jar-with-dependencies.jar.

Do you have any idea how to solve this problem?
I have pasted the full log message from nextflow below, just in case.

Log message from nextflow:
[37/ea937f] process > xml2mzid (d2)                       [100%] 1 of 1 ✔
[1a/aa62b3] process > calc_basic_features_xt (d2)         [100%] 1 of 1 ✔
[89/0c6d10] process > pga_fdr_control (d2)                [100%] 1 of 1 ✔
[1d/7ff0ff] process > generate_train_prediction_data (d2) [100%] 1 of 1 ✔
[29/3b4f5d] process > run_pdeep2 (d2)                     [100%] 1 of 1 ✔
[f0/b0788f] process > process_pDeep2_results (d2)         [100%] 1 of 1, failed: 1 ✘
[-        ] process > train_autoRT                        -
[-        ] process > predicte_autoRT                     -
[-        ] process > generate_percolator_input           -
[-        ] process > run_percolator                      -
[-        ] process > generate_pdv_input                  -
Error executing process > 'process_pDeep2_results (d2)'

Caused by:
  Process `process_pDeep2_results (d2)` terminated with an error exit status (2)

Command executed:

  #!/bin/sh
  mv d2_pdeep2_prediction_results.txt d2_pdeep2_prediction_results.txt.mgf
  Rscript /home/centos/DeepRescore/bin/format_pDeep2_titile.R d2_pdeep2_prediction.txt d2-rawPSMs.txt ./d2_format_titles.txt

  java -Xmx12g -cp /home/centos/DeepRescore/bin/PDV-1.6.1.beta.features/PDV-1.6.1.beta.features-jar-with-dependencies.jar PDVGUI.GenerateSpectrumTable         ./d2_format_titles.txt myown.mgf d2_pdeep2_prediction_results.txt.mgf ./d2_spectrum_pairs.txt xtandem
  mkdir sections sections_results
  Rscript /home/centos/DeepRescore/bin/similarity/devide_file.R ./d2_spectrum_pairs.txt 4 ./sections/
  for file in ./sections/*
  do
      name=`basename $file`
      Rscript /home/centos/DeepRescore/bin/similarity/calculate_similarity_SA.R $file ./sections_results/${name}_results.txt &
  done
  wait
  awk 'NR==1 {header=$_} FNR==1 && NR!=1 { $_ ~ $header getline; } {print}' ./sections_results/*_results.txt > ./d2_similarity_SA.txt

Command exit status:
  2

Command output:
  (empty)

Command error:
  Exception in thread "main" java.io.IOException: Spectrum 'File: D:\Discoverer2_2Data\DiscovererDaemon\200611BPB\F24Z019E_1_5ul.raw; SpectrumID: 2220; scans: 2975' in mgf file 'myown.mgf' not found!
        at com.compomics.util.experiment.massspectrometry.SpectrumFactory.getSpectrum(SpectrumFactory.java:788)
        at com.compomics.util.experiment.massspectrometry.SpectrumFactory.getSpectrum(SpectrumFactory.java:730)
        at PDVGUI.GenerateSpectrumTable.process(GenerateSpectrumTable.java:84)
        at PDVGUI.GenerateSpectrumTable.<init>(GenerateSpectrumTable.java:31)
        at PDVGUI.GenerateSpectrumTable.main(GenerateSpectrumTable.java:21)
  Bioconductor version 3.10 (BiocManager 1.30.10), ?BiocManager::install for help
  Bioconductor version '3.10' is out-of-date; the current release version '3.12'
    is available with R version '4.0'; see https://bioconductor.org/install
  ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
  ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
  ✔ tibble  2.1.3     ✔ dplyr   0.8.4
  ✔ tidyr   1.0.0     ✔ stringr 1.4.0
  ✔ readr   1.3.1     ✔ forcats 0.4.0
  ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
  ✖ dplyr::filter() masks stats::filter()
  ✖ dplyr::lag()    masks stats::lag()
  
  Attaching package: ‘data.table’
  
  The following objects are masked from ‘package:dplyr’:
  
      between, first, last
  
  The following object is masked from ‘package:purrr’:
  
      transpose
  
  Warning message:
  In fread(args[1]) :
    File './d2_spectrum_pairs.txt' has size 0. Returning a NULL data.table.
  Bioconductor version 3.10 (BiocManager 1.30.10), ?BiocManager::install for help
  Bioconductor version '3.10' is out-of-date; the current release version '3.12'
    is available with R version '4.0'; see https://bioconductor.org/install
  ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
  ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
  ✔ tibble  2.1.3     ✔ dplyr   0.8.4
  ✔ tidyr   1.0.0     ✔ stringr 1.4.0
  ✔ readr   1.3.1     ✔ forcats 0.4.0
  ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
  ✖ dplyr::between()   masks data.table::between()
  ✖ dplyr::filter()    masks stats::filter()
  ✖ dplyr::first()     masks data.table::first()
  ✖ dplyr::lag()       masks stats::lag()
  ✖ dplyr::last()      masks data.table::last()
  ✖ purrr::transpose() masks data.table::transpose()
  Error in fread(args[1]) : 
    File './sections/*' does not exist or is non-readable. getwd()=='/home/centos/Work/test/work/f0/b0788f2a27d40b620301bb2776920b'
  Execution halted
  awk: cannot open ./sections_results/*_results.txt (No such file or directory)

Work dir:
  /home/centos/Work/test/work/f0/b0788f2a27d40b620301bb2776920b

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
