  1. Jalview
  2. JAL-2445

More robust parser selection strategy for input data


    Details

    • Type: Epic
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Epic Name:
      identifyfilev2

      Description

      Jalview 2's 'IdentifyFile' mechanism examines the input data stream to determine which parser should be used. This approach has several problems:
      1. Recognition of the input stream may fail if headers or other formatting 'furniture' are missing or differ slightly from what is expected, even when the data itself conforms to the format.
      2. Some formats can only be identified by parsing the whole stream.
      3. Only one parser is selected for any input stream. If that parser fails or throws an error, the user is offered no fallback.

      Suggested improvements are:
      * Allow users to hint at which parser to use; if that parser fails, try others.
      * Extend the stream identification algorithm so that other parsers can be tried if the first candidate parser fails.
       - This also means being able to roll back modifications to the data model if a parser bails out partway through processing a stream.
      * As a final fallback, provide a stream viewer/editor window (as in UGENE) that allows the user to adjust or omit any headers.
       - The simplest approach would be to show the input stream in a cut-and-paste text box, but that may not scale and risks losing provenance (i.e. the original data source locator needs to be preserved, but it also needs to be marked as modified on import).
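
The first two suggestions could be prototyped along the following lines. This is only a minimal sketch, not Jalview's actual API: `ParserFallbackSketch`, `CandidateParser`, `load`, and the two toy formats are hypothetical names invented for illustration. It assumes the input can be buffered in memory, and it models "rollback" by having each parser write to a throwaway staging list that is merged into the data model only on success.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are part of Jalview's API.
final class ParserFallbackSketch {

    /** A candidate parser: appends records to a staging list, or throws on malformed input. */
    interface CandidateParser {
        void parse(String input, List<String> staging) throws Exception;
    }

    /** Toy registry in default trial order (illustrative formats only). */
    static Map<String, CandidateParser> defaultParsers() {
        Map<String, CandidateParser> parsers = new LinkedHashMap<>();
        parsers.put("fasta", ParserFallbackSketch::parseFastaLike);
        parsers.put("lines", ParserFallbackSketch::parseNonBlankLines);
        return parsers;
    }

    /** Strict FASTA-like parser: may fail part way through, after staging some records. */
    static void parseFastaLike(String input, List<String> staging) throws Exception {
        String name = null;
        StringBuilder seq = new StringBuilder();
        for (String raw : input.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                if (name != null) {
                    staging.add(name + "=" + seq);
                }
                name = line.substring(1).trim();
                seq.setLength(0);
            } else if (name == null || !line.matches("[A-Za-z*.-]+")) {
                throw new Exception("not FASTA-like: " + line);
            } else {
                seq.append(line);
            }
        }
        if (name == null) {
            throw new Exception("no '>' header found");
        }
        staging.add(name + "=" + seq);
    }

    /** Lenient fallback parser: every non-blank line becomes one record. */
    static void parseNonBlankLines(String input, List<String> staging) throws Exception {
        for (String raw : input.split("\\R")) {
            if (!raw.trim().isEmpty()) {
                staging.add(raw.trim());
            }
        }
        if (staging.isEmpty()) {
            throw new Exception("no data");
        }
    }

    /**
     * Tries the hinted parser first (if known), then the remaining parsers in registry order.
     * Returns the name of the parser that succeeded, or null if all failed, in which case
     * the model is left untouched and the caller could offer a viewer/editor window.
     */
    static String load(String input, String hint, List<String> model) {
        Map<String, CandidateParser> parsers = defaultParsers();
        List<String> order = new ArrayList<>();
        if (hint != null && parsers.containsKey(hint)) {
            order.add(hint);
        }
        for (String name : parsers.keySet()) {
            if (!order.contains(name)) {
                order.add(name);
            }
        }
        for (String name : order) {
            List<String> staging = new ArrayList<>(); // fresh staging area per attempt
            try {
                parsers.get(name).parse(input, staging);
            } catch (Exception e) {
                continue; // "rollback": partial results in staging are simply discarded
            }
            model.addAll(staging); // commit only after the whole stream parsed
            return name;
        }
        return null;
    }
}
```

Calling `ParserFallbackSketch.load(text, "fasta", model)` would try the hinted parser first and fall through to the lenient one if it bails out. Staging into a copy sidesteps true transactional rollback, at the cost of holding both the input and the parsed result in memory, so it may not suit very large streams or formats that can only be identified by a full parse.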

              People

              Assignee:
              jprocter James Procter
              Reporter:
              jprocter James Procter