Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.10.0
Component/s: data retrieval services, Datamodel
Labels: None
Environment:
  Jalview Version: Development Branch Build
  Jalview Installation: webstart git-commit:3afeb8a [develop]
  Build Date: 12 May 2016
  Java version: 1.8.0_91
  x86_64 Mac OS X 10.10.5
Epic Link:
Description
To reproduce:
1. Select the ENSEMBL datasource and retrieve the example ID: ENSG00000157764
2. Show cross-references for 'Uniprot'
With no DAS Uniprot sequence sources available, the result is a split view containing 4 transcripts and 8 protein sequences. Half of the protein sequences are aligned in the same way as the transcripts and carry annotations; the others are unaligned.
If DAS sequence sources are available (e.g. using the EBI DAS registry mirror page), then 16 proteins are shown. Some sequences have names like 'UNIPROT|H7C5K3|...' whereas others have 'H7C5K3'.
Looking at this further: selecting just the transcripts when showing cross-references yields the expected view, but if the locus is included, then the additional Uniprot sequences are retrieved. So what appears to be happening is that 'show cross-references' does not recognise that it has been asked to retrieve the same ID twice from two different cross-reference origins (the transcript and the locus).
Weirdly, if the locus sequence is selected and 'show Uniprot cross-references' is attempted, the following mysterious messages appear:
#Adding 4 ids back to queries list for searching again (UNIPROT)
Failed to make CDS alignment
No Sequences generated for xRef type UNIPROT
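The suspected cause described above (the same Uniprot ID being requested once via a transcript and once via the locus, and then appearing under two name forms) could be guarded against by grouping requests by accession before retrieval. The following is only a minimal sketch of that idea, not Jalview's actual code: the class `XrefDedupSketch`, the methods `accessionOf` and `groupByAccession`, the origin labels, and the accession `Q00000` are all hypothetical, and it assumes that in a pipe-delimited name such as 'UNIPROT|H7C5K3|...' the second field is the accession.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Jalview.
public class XrefDedupSketch {

    /**
     * Reduce a name such as "UNIPROT|H7C5K3|..." to the bare accession "H7C5K3".
     * Assumption: in a pipe-delimited name the accession is the second field.
     */
    public static String accessionOf(String name) {
        String[] parts = name.trim().split("\\|");
        return parts.length > 1 ? parts[1] : parts[0];
    }

    /**
     * Group cross-reference requests by accession so that each accession is
     * retrieved once, while remembering every origin that asked for it.
     * Each request is a two-element array: {origin, targetId}.
     */
    public static Map<String, Set<String>> groupByAccession(List<String[]> requests) {
        Map<String, Set<String>> originsByAccession = new LinkedHashMap<>();
        for (String[] request : requests) {
            String origin = request[0];
            String accession = accessionOf(request[1]);
            originsByAccession
                .computeIfAbsent(accession, k -> new LinkedHashSet<>())
                .add(origin);
        }
        return originsByAccession;
    }

    public static void main(String[] args) {
        List<String[]> requests = new ArrayList<>();
        // Origin labels and Q00000 are made-up placeholders.
        requests.add(new String[] {"transcript-1", "H7C5K3"});
        requests.add(new String[] {"locus", "UNIPROT|H7C5K3|..."});
        requests.add(new String[] {"transcript-2", "Q00000"});

        // Two distinct accessions remain; H7C5K3 keeps both of its origins.
        System.out.println(groupByAccession(requests));
    }
}
```

With this grouping, each accession would be fetched once and the retrieved sequence could then be mapped back to every origin that referenced it, rather than being added to the result once per origin.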