Details
- Type: Bug
- Status: In Progress
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.7_gsoc11, 2.8, 2.8.1, 2.8.2, 2.9
- Fix Version/s: None
- Component/s: data retrieval services, na
- Labels:
Description
Looking up database entries for RNA sequences assumes that the database holds an exact match for the sequence, rather than its genomic (DNA) version.
E.g.
>ABJT01000033.1/33739-33825
AACACAUCAGAUUUCCUGGUGUAACGAAUUUUUUAAGUGCUUCUUGCUUAAGCAAG-UUUC-AUCC-CGACC
CCCUCA-------GGG-UCGGGAUUU
Fetching [EMBL database references] for the above (by importing the FASTA file above into Jalview, then using fetch db refs->EMBL) results in a 'Sequence not 100% match' error, and no identifier is added to the sequence. The database ref fetcher needs to be taught how to translate between DNA and RNA nucleotide sequences.
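To illustrate the kind of comparison that would avoid the mismatch, here is a minimal sketch in Python. It is not Jalview code (Jalview is written in Java), and the names `normalise_nucleotides` and `matches_database_entry` are hypothetical. It assumes the only differences between the query and the database record are gap characters, letter case, and U versus T.

```python
# Hypothetical sketch, not Jalview's implementation.
# Assumption: gaps are written as '-', '.' or ' ' in the alignment.
GAP_CHARS = "-. "


def normalise_nucleotides(seq: str) -> str:
    """Strip gap characters, upper-case, and map RNA 'U' to DNA 'T'."""
    ungapped = "".join(ch for ch in seq if ch not in GAP_CHARS)
    return ungapped.upper().replace("U", "T")


def matches_database_entry(query: str, db_subsequence: str) -> bool:
    """Compare an aligned RNA (or DNA) query with a database subsequence
    after putting both into the same ungapped DNA alphabet."""
    return normalise_nucleotides(query) == normalise_nucleotides(db_subsequence)


if __name__ == "__main__":
    rna_query = "AACACAUCAG-AUU"   # gapped RNA, as imported from FASTA
    dna_record = "AACACATCAGATT"   # genomic version, as a database might return it
    print(matches_database_entry(rna_query, dna_record))
```

A plain string comparison of the two example sequences fails because of the gap and the U/T difference. Normalising both sides first lets the same check accept the genomic version of the RNA sequence.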