  1. Jalview
  2. JAL-3877

Parse LOCUS line of GenBank file to obtain possibly missing ACCESSION id (and other useful metadata)

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.11.2
    • Component/s: file format issue
    • Labels: None
    • Urgency: Urgent

      Description

      The GenBank file parser src/jalview/io/GenBankFile.java, which is to be merged into 2.11.2 and was adapted (I think) from EmblFlatFile.java, requires a sequence id in order to open a file. The id is taken from the ACCESSION line, but that line has been seen to be absent in some files, in which case the file is not opened at all.

      Description of GenBank file format can be seen at
      https://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html (as provided by David Martínez in JAL-1260), replicated at
      https://www.ncbi.nlm.nih.gov/genbank/samplerecord/
      and original(?) description at https://europepmc.org/article/pmc/147205
      From these it is not explicitly stated whether the ACCESSION line is mandatory, although I believe it _should_ always be present.
      However the example file attached to JAL-1260
      https://issues.jalview.org/plugins/servlet/com.redmoon.jira.documentvault/download-jira-document?issueId=12298&attId=10633
      does NOT have an ACCESSION line and so failed to open.

      The documentation's description of the LOCUS line says that the Locus ID (often/always? similar to the Sequence ID) is the first whitespace-delimited value after the "LOCUS" signature/pragma. This should be a suitable alternative if no ACCESSION is available, although preference should probably be given to a "VERSION" value. (It is probably unlikely that a file with no ACCESSION line will have a VERSION line, though.)

      At a minimum, the Locus ID can stand in for the missing ACCESSION value so that the file can at least be opened.
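
      The fallback described above could be sketched roughly as follows. This is only an illustration, not the actual GenBankFile.java code: the class GenBankIdFallback and its methods are hypothetical names, and the ACCESSION, then VERSION, then LOCUS ordering is one reading of the preference suggested in this issue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/**
 * Hypothetical helper (not part of Jalview): chooses a sequence id from the
 * header lines of a GenBank flat-file record.
 */
public class GenBankIdFallback {

    /**
     * Returns the first whitespace-delimited value following the given keyword,
     * if a line starting with that keyword exists and has a value.
     */
    static Optional<String> firstValue(List<String> lines, String keyword) {
        for (String line : lines) {
            // Indented (continuation or sequence) lines yield an empty first
            // token when split, so they never match a keyword.
            String[] tokens = line.split("\\s+");
            if (tokens[0].equals("ORIGIN")) {
                break; // sequence data follows; stop scanning the header
            }
            if (tokens[0].equals(keyword) && tokens.length > 1) {
                return Optional.of(tokens[1]);
            }
        }
        return Optional.empty();
    }

    /**
     * Assumed preference order: ACCESSION, then VERSION, then the Locus ID
     * taken from the LOCUS line.
     */
    static Optional<String> chooseId(List<String> headerLines) {
        for (String keyword : new String[] {"ACCESSION", "VERSION", "LOCUS"}) {
            Optional<String> id = firstValue(headerLines, keyword);
            if (id.isPresent()) {
                return id;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A header with no ACCESSION or VERSION line, as in the failing file.
        List<String> header = Arrays.asList(
            "LOCUS       SCU49845     5028 bp    DNA             PLN       21-JUN-1999",
            "DEFINITION  Example record without an ACCESSION line.");
        System.out.println(chooseId(header).orElse("(no id found)"));
    }
}
```

      With a header like the one in main, which has no ACCESSION or VERSION line, the sketch falls back to the first value on the LOCUS line, so the parser would still have an id to open the file with.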

              People

              Assignee:
              soares Ben Soares
              Reporter:
              soares Ben Soares