  1. Jalview
  2. JAL-254

Sequence IDs containing both a Uniprot name and an accession code do not work with the discover PDB IDs function

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Mantis ID: 17632

      Description

      When a sequence ID is of the form:
      >q00000|prot_orga
      the discover PDB IDs function fails with a message like:
      prot_orga not found.
      This happens because the Uniprot record retrieved by querying with prot_orga (or q00000 if the order is reversed) must be associated with its corresponding sequences in the alignment's dataset. That association was made by matching the Uniprot record name against the dataset sequence ID, which fails for an ID like the one above. A special-case hack has been added to cope with IDs of the form:
      >Uniprot/SwissProt|q00000|prot_orga
      which is the form used for sequences retrieved by the sequence fetcher, but the matching of Uniprot records to sequence IDs is still not robust.
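
One way to make the association less fragile is to compare every '|'-separated field of the sequence ID against both the record name and its accession codes, instead of matching the whole ID against the name alone. The following is a minimal Python sketch of that idea and not Jalview's actual code: `UniprotRecord`, `id_tokens`, `record_matches` and `match_records` are hypothetical names, and the case-insensitive comparison is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class UniprotRecord:
    """Hypothetical stand-in for a retrieved Uniprot record."""
    name: str                                        # e.g. "PROT_ORGA"
    accessions: list = field(default_factory=list)   # e.g. ["Q00000"]


def id_tokens(sequence_id):
    """Split a sequence ID on '|' and return its non-empty fields, upper-cased."""
    stripped = sequence_id.lstrip(">").strip()
    return [tok.strip().upper() for tok in stripped.split("|") if tok.strip()]


def record_matches(record, sequence_id):
    """True if any field of the ID equals the record name or one of its accessions."""
    tokens = set(id_tokens(sequence_id))
    keys = {record.name.upper()} | {acc.upper() for acc in record.accessions}
    return bool(tokens & keys)


def match_records(records, sequence_ids):
    """Map each sequence ID to the list of records that match it (possibly empty)."""
    return {
        sid: [rec for rec in records if record_matches(rec, sid)]
        for sid in sequence_ids
    }
```

Because each field is tested independently, the order of name and accession in the ID does not matter, and a leading source field such as Uniprot/SwissProt is simply ignored when it matches nothing.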

      ****** ADDITIONAL INFORMATION ******

      This issue relates to others: the DBRef system must be expanded to cover the references provided by the VAMSAS document (DB source, DB accession, and the DB start/end mapping to the sequence's entry). A general system for mass ID renaming and DBRef extraction must also be put in place, even if it is just a regex-based find/replace mechanism.
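
The regex-based rename and DBRef extraction could look roughly like the sketch below. It is illustrative only: `DBRef`, `ID_PATTERN` and `rename_and_extract` are hypothetical names, the pattern assumes the accession precedes the name (the reverse order would need a second pattern), the default source label is an assumption, and the start/end mapping is left out.

```python
import re
from typing import NamedTuple


class DBRef(NamedTuple):
    """Hypothetical minimal database reference: source plus accession."""
    source: str
    accession: str


# Optional '>' prefix, optional source field, then accession|name.
ID_PATTERN = re.compile(
    r"^>?(?:(?P<source>[^|]+)\|)?(?P<accession>[^|]+)\|(?P<name>[^|]+)$"
)


def rename_and_extract(sequence_id, default_source="UNIPROT"):
    """Return (new_id, DBRef), or (sequence_id, None) if the ID does not match."""
    match = ID_PATTERN.match(sequence_id.strip())
    if match is None:
        # Leave IDs that do not fit the pattern untouched.
        return sequence_id, None
    ref = DBRef(match.group("source") or default_source, match.group("accession"))
    return match.group("name"), ref
```

Applied to every sequence in an alignment, this renames each ID to the bare record name while keeping the accession as a separate reference, so later lookups no longer depend on the ID layout.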

              People

              Assignee:
              amwaterhouse Andrew Waterhouse
              Reporter:
              jprocter James Procter