Details
- Type: Improvement
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: 2.10.0, 2.11.0
- Fix Version/s: None
- Component/s: analysis, file format issue
- Labels: None
Description
In 2.9 and earlier, the jalview.analysis.SequenceIdMatcher routine required the following for two IDs to match:
* one ID starts with the other ID string
* if one ID is longer, the next character is non-alphanumeric (e.g. a pipe symbol)
This doesn't quite work for IDs like PDB|1QIP|A and 1QIP|A, since neither ID starts with the other. The matcher also fails if one of the IDs is in lower case. Instead, when matching IDs Jalview should allow case-insensitive, whole-word matching, i.e. _match_ matches 'match', '_match' and 'match_', but not '_match2_'.
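To make the difference concrete, here is a minimal Java sketch of the two rules. It is not the actual jalview.analysis.SequenceIdMatcher code: the class IdMatchSketch and the methods legacyMatch and wholeWordMatch are hypothetical names used only for illustration. It also assumes that "whole word" means the shorter ID is bounded on each side by the start or end of the string or by a non-alphanumeric character, so an underscore counts as a delimiter.

```java
import java.util.Locale;

// Hypothetical illustration only; not the real SequenceIdMatcher.
class IdMatchSketch {

    // Rule as described for 2.9 and earlier: one ID starts with the other,
    // and if one is longer, its next character is non-alphanumeric.
    static boolean legacyMatch(String a, String b) {
        String longer = a.length() >= b.length() ? a : b;
        String shorter = a.length() >= b.length() ? b : a;
        if (shorter.isEmpty() || !longer.startsWith(shorter)) {
            return false;
        }
        if (longer.length() == shorter.length()) {
            return true;
        }
        return !Character.isLetterOrDigit(longer.charAt(shorter.length()));
    }

    // Proposed rule (assumed interpretation): case-insensitive, and the shorter
    // ID may occur anywhere in the longer one as long as it is delimited on
    // both sides by a string boundary or a non-alphanumeric character.
    static boolean wholeWordMatch(String a, String b) {
        String longer = (a.length() >= b.length() ? a : b).toLowerCase(Locale.ROOT);
        String shorter = (a.length() >= b.length() ? b : a).toLowerCase(Locale.ROOT);
        if (shorter.isEmpty()) {
            return false;
        }
        int from = 0;
        while (true) {
            int start = longer.indexOf(shorter, from);
            if (start < 0) {
                return false; // no further occurrences to try
            }
            int end = start + shorter.length();
            boolean leftOk = start == 0
                    || !Character.isLetterOrDigit(longer.charAt(start - 1));
            boolean rightOk = end == longer.length()
                    || !Character.isLetterOrDigit(longer.charAt(end));
            if (leftOk && rightOk) {
                return true;
            }
            from = start + 1; // try the next occurrence
        }
    }

    public static void main(String[] args) {
        System.out.println(legacyMatch("PDB|1QIP|A", "1QIP|A"));    // false
        System.out.println(wholeWordMatch("PDB|1QIP|A", "1QIP|A")); // true
        System.out.println(wholeWordMatch("1qip", "1QIP|A"));       // true
        System.out.println(wholeWordMatch("match", "_match2_"));    // false
    }
}
```

The sketch scans with indexOf rather than a regex word boundary, because \b treats the underscore as a word character and would therefore reject '_match' and 'match_'.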
Prompted by thread on Jalview-discuss:
http://www.jalview.org/pipermail/jalview-discuss/2015-November/001251.html
Attachments
Issue Links
- depends on
  - JAL-1537 relaxed id matches do not associate with all occurences (In Progress)
- related with
  - JAL-2038 Drag-and-drop to associate pdb files with sequences of same name is broken (Closed)
  - JAL-753 allow sequences with partial ID string matches to be annotated from GFF/jalview features files (Resolved)
  - JAL-872 structure associated with alignment fails for alignments of repeats (Open)
  - JAL-1427 Relaxed reference sequence ID matching when loading annotation files (In Progress)