[JAL-2110] Duplicate uniprot entries from Ensembl->Show cross-refs->Uniprot - Jalview

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.10.0
Component/s: data retrieval services, Datamodel
Labels:
None
Environment:
Jalview Version: Development Branch Build
Jalview Installation: webstart git-commit:3afeb8a [develop]
Build Date: 12 May 2016
Java version: 1.8.0_91
x86_64 Mac OS X 10.10.5

Epic Link:
ensembl

Description

To reproduce:
1. Select ENSEMBL datasource and retrieve example ID: ENSG00000157764
2. Show cross-references for 'Uniprot'

With no DAS Uniprot sequence sources available, the result is a split view containing 4 transcripts and 8 protein sequences. Half of the protein sequences are aligned as the transcripts are, with annotations; the others are unaligned.

If DAS sequence sources are available (e.g. using the EBI DAS registry mirror page), then 16 proteins are shown; some sequences have names like 'UNIPROT|H7C5K3|...' whereas others have 'H7C5K3'.

Looking at this further: selecting just the transcripts when showing cross-references yields the expected view, but if the locus is included, then the additional Uniprot sequences are retrieved. So what appears to be happening is that show cross-references is not recognising that it has been asked to retrieve the same ID twice from two different cross-reference origins (the transcript and the locus).
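
For illustration only, here is a minimal sketch of the kind of de-duplication that seems to be missing: collecting the requested cross-references by accession before retrieval, so an ID reached from both a transcript and the locus is fetched once. The class XrefDedup, the Xref type and groupByAccession are hypothetical names invented for this sketch, not Jalview code, and "transcript-1" is a placeholder origin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class XrefDedup {
    /** Hypothetical holder: one cross-reference found on one selected sequence. */
    static class Xref {
        final String origin;     // sequence the xref was found on
        final String source;     // database name, e.g. "Uniprot"
        final String accession;  // e.g. "H7C5K3"

        Xref(String origin, String source, String accession) {
            this.origin = origin;
            this.source = source;
            this.accession = accession;
        }
    }

    /**
     * Groups xrefs by "SOURCE|accession" (source upper-cased), keeping the list
     * of origins for each key. Each key then needs to be retrieved only once.
     */
    static Map<String, List<String>> groupByAccession(List<Xref> xrefs) {
        Map<String, List<String>> origins = new LinkedHashMap<>();
        for (Xref x : xrefs) {
            String key = x.source.toUpperCase(Locale.ROOT) + "|" + x.accession;
            origins.computeIfAbsent(key, k -> new ArrayList<>()).add(x.origin);
        }
        return origins;
    }

    public static void main(String[] args) {
        // Same accession requested from two origins: a transcript and the locus.
        List<Xref> requested = Arrays.asList(
            new Xref("transcript-1", "Uniprot", "H7C5K3"),
            new Xref("ENSG00000157764", "UNIPROT", "H7C5K3"));
        Map<String, List<String>> grouped = groupByAccession(requested);
        System.out.println(grouped.size() + " unique accession(s) to retrieve");
        grouped.forEach((key, from) -> System.out.println(key + " <- " + from));
    }
}
```

Keeping the origins alongside each key means a single retrieved sequence could still be mapped back to every transcript or locus that referenced it.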

Weirdly, if the locus sequence is selected and show Uniprot cross-references is attempted, the following mysterious messages appear:
#Adding 4 ids back to queries list for searching again (UNIPROT)
Failed to make CDS alignment
No Sequences generated for xRef type UNIPROT

Issue Links

depends on

JAL-2154 Wrong start/end on mapped dataset sequence after reloading project

Closed

JAL-2210 Extra sequences appear when showing Uniprot cross-refs from an Ensembl locus view

Closed

related with

JAL-2138 Remove EMBLCDS as a database source

Open

JAL-2023 Error loading project saved with CDS/protein split frame

Closed

JAL-2037 "Get Cross-References" incomplete if performed a second time

Closed

People

Assignee: James Procter

Reporter: James Procter

Votes: 0

Watchers: 2

Dates

Created: 12/May/16 5:33 PM

Updated: 05/Oct/16 3:43 PM

Resolved: 05/Oct/16 3:43 PM