Commit 5a8e985

error in handling of identity exceptions
- Enzymes with high sequence identity to differently working enzymes can be assigned a higher BLAST identity cutoff (dat/exception.tbl).
- Handling could go wrong because the comment column was also used for EC/reaction matching.
- The fix considers only the first column (EC number or reaction name).
- This affects alcohol dehydrogenase (1.1.1.1), which is mentioned in the comment column of the exception file. As a result it was not found by the BLAST search, for example for Enterococcus faecalis, because the higher identity cutoff (70%) was required.
1 parent b4a0c76 commit 5a8e985
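
The false match described in the commit message can be reproduced on a small scale. The sketch below is only an illustration: it assumes a tab-separated table, uses a hypothetical two-row excerpt written to a temporary file, and counts hits with `grep -c` rather than the `| wc -l` used in the script.

```shell
#!/usr/bin/env bash
# Hypothetical two-row excerpt of an exception table (tab-separated: key, comment).
tbl=$(mktemp)
printf '1.1.1.244\tmethanol dehydrogenase (high similarity to EC 1.1.1.1)\n' > "$tbl"
printf '1.1.1.26\tglycolate dehydrogenase\n' >> "$tbl"

ec="1.1.1.1"   # alcohol dehydrogenase, which has no row of its own

# Before the fix: the whole line is searched, so the comment column matches.
old_hits=$(grep -cFw -e "$ec" "$tbl" || true)

# After the fix: only the first column is searched.
new_hits=$(cut -f 1 "$tbl" | grep -cFw -e "$ec" || true)

echo "whole line: $old_hits, first column only: $new_hits"
rm -f "$tbl"
```

With the whole-line search, 1.1.1.1 is counted as an exception only because the comment of the 1.1.1.244 row mentions it; restricting the search to the first column removes that hit.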

2 files changed: +18 -18 lines changed

dat/exception.tbl

Lines changed: 15 additions & 15 deletions
@@ -1,16 +1,16 @@
 # file with enzymes having a high similar sequence identity to differently working enzymes (false friends) => need more strict blast search
-enzyme/reaction comment
-1.9.3.1 cytochrome-c oxidase
-4.1.1.39 RuBisCo
-D-ribulose-1,5-bisphosphate cleaving dioxygenase RuBisCo
-1.9.6.1 nitrate reductase nap
-1.7.2.5 nitric oxide reductase
-4.1.2.22 fructose-6-phosphate phosphoketolase
-2.6.1.96 gamma aminobutyrate transaminase
-2.8.3.5 succinyl-CoA:acetoacetate CoA-transferase
-2.3.1.31 homoserine O-acetyltransferaseInferred from experiment
-3.2.1.51 alpha-L-fucosidase
-4.1.1.17 ornithine decarboxylase (high similarity with other protein: 10.1074/jbc.R500031200)
-1.1.1.244 methanol dehydrogenase (high similarity to EC 1.1.1.1)
-1.1.1.26 glycolate dehydrogenase
-2.6.1.19 4-aminobutyrate aminotransferase
+enzyme/reaction comment
+1.9.3.1 cytochrome-c oxidase
+4.1.1.39 RuBisCo
+D-ribulose-1,5-bisphosphate cleaving dioxygenase RuBisCo
+1.9.6.1 nitrate reductase nap
+1.7.2.5 nitric oxide reductase
+4.1.2.22 fructose-6-phosphate phosphoketolase
+2.6.1.96 gamma aminobutyrate transaminase
+2.8.3.5 succinyl-CoA:acetoacetate CoA-transferase
+2.3.1.31 homoserine O-acetyltransferaseInferred from experiment
+3.2.1.51 alpha-L-fucosidase
+4.1.1.17 ornithine decarboxylase (high similarity with other protein: 10.1074/jbc.R500031200)
+1.1.1.244 methanol dehydrogenase (high similarity to EC 1.1.1.1)
+1.1.1.26 glycolate dehydrogenase
+2.6.1.19 4-aminobutyrate aminotransferase
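
The removed and added rows above look identical in this rendering, so what changed in each line cannot be recovered from this page. Because the fix in src/gapseq_find.sh relies on `cut -f 1`, which splits on tabs by default, a quick check that every data row contains a tab may be useful. The following is a sketch on a hypothetical two-row file, not part of the commit:

```shell
#!/usr/bin/env bash
# Hypothetical sample: one tab-separated row and one space-separated row.
tbl=$(mktemp)
printf '1.9.3.1\tcytochrome-c oxidase\n' > "$tbl"
printf '4.1.1.39 RuBisCo\n' >> "$tbl"    # deliberately lacks a tab

tab=$(printf '\t')

# Count non-comment rows that contain no tab character at all.
bad_rows=$(grep -v '^#' "$tbl" | grep -vc "$tab" || true)
echo "rows without a tab separator: $bad_rows"

# cut passes a line without the delimiter through unchanged,
# so the comment text would still be searched for such a row.
first_field=$(printf '4.1.1.39 RuBisCo\n' | cut -f 1)
echo "first field of the tabless row: $first_field"

rm -f "$tbl"
```

A row without a tab is returned whole by `cut -f 1`, which would bring back the comment-column matching that the commit is meant to avoid.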

src/gapseq_find.sh

Lines changed: 3 additions & 3 deletions
@@ -592,9 +592,9 @@ do
 geneRef=$(grep -wFe $rea $metaGenes | awk -vFPAT='([^,]*)|("[^"]+")' -vOFS=, {'print $5'})
 [[ verbose -ge 1 ]] && echo -e "\t$j) $rea $reaName $ec" $geneName
 [[ -z "$rea" ]] && { continue; }
-[[ -n "$ec" ]] && [[ -n "$reaName" ]] && [[ -n "$EC_test" ]] && { is_exception=$(grep -Fw -e "$ec" -e "$reaName" $dir/../dat/exception.tbl | wc -l); }
-( [[ -z "$ec" ]] || [[ -z "$EC_test" ]] ) && [[ -n "$reaName" ]] && { is_exception=$(grep -Fw "$reaName" $dir/../dat/exception.tbl | wc -l); }
-[[ -n "$ec" ]] && [[ -z "$reaName" ]] && [[ -n "$EC_test" ]] && { is_exception=$(grep -Fw "$ec" $dir/../dat/exception.tbl | wc -l); }
+[[ -n "$ec" ]] && [[ -n "$reaName" ]] && [[ -n "$EC_test" ]] && { is_exception=$(cat $dir/../dat/exception.tbl | cut -f 1 | grep -Fw -e "$ec" -e "$reaName" | wc -l); }
+( [[ -z "$ec" ]] || [[ -z "$EC_test" ]] ) && [[ -n "$reaName" ]] && { is_exception=$(cat $dir/../dat/exception.tbl | cut -f 1 | grep -Fw "$reaName" | wc -l); }
+[[ -n "$ec" ]] && [[ -z "$reaName" ]] && [[ -n "$EC_test" ]] && { is_exception=$(cat $dir/../dat/exception.tbl | cut -f 1 | grep -Fw "$ec" | wc -l); }
 if [[ $is_exception -gt 0 ]] && [[ $identcutoff -lt $identcutoff_exception ]];then # take care of similair enzymes with different function
 identcutoff_tmp=$identcutoff_exception
 [[ verbose -ge 1 ]] && echo -e "\t\tUsing higher identity cutoff for $rea"
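
The three branches in this hunk differ only in which keys are passed to `grep`. One possible way to express the same first-column lookup once is sketched below. `count_exceptions` is a hypothetical helper that does not exist in gapseq, the sample table is invented for the demonstration, and `grep -c` stands in for `| wc -l`. As in the diff, `$ec` is only used as a key when `$EC_test` is non-empty.

```shell
#!/usr/bin/env bash
# count_exceptions TABLE KEY...
# Hypothetical helper: print the number of rows whose first tab-separated
# column contains any non-empty KEY as a fixed-string whole-word match.
count_exceptions() {
    local table=$1; shift
    local args=() key
    for key in "$@"; do
        [[ -n "$key" ]] && args+=(-e "$key")
    done
    # Without any usable key there is nothing to look up.
    if [[ ${#args[@]} -eq 0 ]]; then echo 0; return 0; fi
    cut -f 1 "$table" | grep -cFw "${args[@]}" || true
}

# Invented sample table (tab-separated: key, comment).
tbl=$(mktemp)
printf '4.1.1.39\tRuBisCo\n' > "$tbl"
printf 'D-ribulose-1,5-bisphosphate cleaving dioxygenase\tRuBisCo\n' >> "$tbl"
printf '1.1.1.244\tmethanol dehydrogenase (high similarity to EC 1.1.1.1)\n' >> "$tbl"

# Caller side: collect the keys that the three branches would have used.
ec="1.1.1.1"; reaName=""; EC_test="1"
keys=("$reaName")
[[ -n "$EC_test" ]] && keys+=("$ec")
is_exception=$(count_exceptions "$tbl" "${keys[@]}")
echo "is_exception=$is_exception"

# A few more lookups against the same sample table.
hit_ec=$(count_exceptions "$tbl" "1.1.1.244")
hit_comment=$(count_exceptions "$tbl" "RuBisCo")
hit_name=$(count_exceptions "$tbl" "D-ribulose-1,5-bisphosphate cleaving dioxygenase")
hit_none=$(count_exceptions "$tbl" "")
rm -f "$tbl"
```

Unlike the three conditional assignments, this helper always prints a count, including 0 when no key is usable, so the result never carries over from an earlier loop iteration. Whether that matters in the surrounding script is not visible in this hunk.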
