No found reference sequences when headers look the same #86
Comments
Hi Mike, Strange. Thanks for the report. Are you able to send a cut down BAM file and cut down fasta file so I can test? I suspect it might have something to do with the spaces in the names, but not sure yet. Maybe just email me. |
Hey Ben, Here is a link to the bam file: https://www.dropbox.com/s/tbuwnzv5ywb7jtn/LM.mapped.sorted.subsample001.bam?dl=0 Happy to move this to email or share full files somewhere else if that is easier. I subsampled to 1% of reads for the bam file and in the fasta file I filtered out contigs with no hits in the subsampled bam. So there are headers in the bam that aren't in the fasta. Hope that isn't a problem. These are contigs generated from megahit which always has those spaces. I didn't run into this problem with contigs from idba_ud which don't have spaces. |
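For anyone wanting to reproduce the cut-down FASTA described above (keeping only contigs with hits in the subsampled BAM), here is a minimal sketch. It assumes the standard tab-separated `samtools idxstats` output (name, length, mapped reads, unmapped reads) and that the BAM reference names equal the full FASTA header text, spaces included. The function names and the megahit-style headers in the example are hypothetical, not taken from the attached files.

```python
def names_with_hits(idxstats_lines):
    """Return reference names that have at least one mapped read."""
    names = set()
    for line in idxstats_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and the trailing '*' row for unplaced reads.
        if len(fields) < 3 or fields[0] == "*":
            continue
        if int(fields[2]) > 0:
            names.add(fields[0])
    return names


def filter_fasta(fasta_lines, keep_names):
    """Yield the lines of FASTA records whose full header is in keep_names."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: compare the whole header, including any spaces.
            keep = line[1:].rstrip("\n") in keep_names
        if keep:
            yield line


# Made-up example data in the style of megahit headers.
idxstats = [
    "k141_1 flag=1 multi=3.0000 len=2600\t2600\t12\t0\n",
    "k141_2 flag=1 multi=2.0000 len=2700\t2700\t0\t0\n",
    "*\t0\t0\t5\n",
]
fasta = [
    ">k141_1 flag=1 multi=3.0000 len=2600\n",
    "ACGT\n",
    ">k141_2 flag=1 multi=2.0000 len=2700\n",
    "GGCC\n",
]
kept = list(filter_fasta(fasta, names_with_hits(idxstats)))
```

With real files, the two line lists would come from an open FASTA file and the saved idxstats text. Note that this only drops FASTA records; reference names without reads still remain in the BAM header, as mentioned above.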
Hey Mike, it is the end of the week and I haven't gotten to this. A workaround might be putting the sequence names as input to |
No problem not getting to this yet. I tried using |
|
Hey sorry about the above comment, not sure what I was thinking. Will get back to you. |
Hi Mike, I "fixed" this by adding a new flag
Hope that settles it? Thanks for the report. I hope to release a new version soon, but not entirely sure when - let me know if you need a static compile in the meantime. |
Hello @wwood,
I am using coverm (version 0.6.1 from bioconda) to get average coverage for some metagenome assemblies to deposit to NCBI. The command I am running is:
coverm genome -v -m mean -t 15 --bam-files /home/projects-wrighton/NIH_Salmonella/Salmonella/Metagenomes/Megahit/LM_megahit/LM.mapped.sorted.bam -f /home/projects-wrighton/NIH_Salmonella/Salmonella/Metagenomes/Megahit/LM_megahit/final.contigs.2500.fa
When I run this I get an error. The mappings were generated from this reference using bbmap. To dig into this I grabbed the headers from the fasta file using grep (uploaded here: LM_headers.txt) and the headers from the bam file using idxstats (uploaded here: LM.mapped.sorted.idxstats.txt). Comparing the two, the headers look like they match. Am I missing something here? Is something in the headers making the matching break?
Thanks,
Mike
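The header comparison described in this post can also be done programmatically rather than by eye. The sketch below assumes one `>`-prefixed line per contig in the grep output and standard tab-separated `samtools idxstats` output. The function names and example header are hypothetical. The `first_token_only` option mimics a tool that cuts FASTA names at the first whitespace, which is only the suspicion raised in this thread, not a confirmed description of how coverm parses names.

```python
def fasta_names(header_lines, first_token_only=False):
    """Collect contig names from '>' header lines."""
    names = set()
    for line in header_lines:
        if not line.startswith(">"):
            continue
        name = line[1:].rstrip("\n")
        if first_token_only:
            # Keep only the text before the first whitespace.
            parts = name.split()
            name = parts[0] if parts else ""
        names.add(name)
    return names


def bam_names(idxstats_lines):
    """Collect reference names from the first idxstats column."""
    names = set()
    for line in idxstats_lines:
        name = line.rstrip("\n").split("\t")[0]
        # '*' is the idxstats row for reads with no reference.
        if name and name != "*":
            names.add(name)
    return names


def compare(header_lines, idxstats_lines, first_token_only=False):
    """Report names shared by both sources and names unique to each."""
    fa = fasta_names(header_lines, first_token_only)
    bam = bam_names(idxstats_lines)
    return {
        "shared": fa & bam,
        "only_in_fasta": fa - bam,
        "only_in_bam": bam - fa,
    }


# Made-up megahit-style header, present in both sources.
headers = [">k141_1 flag=1 multi=3.0000 len=2600\n"]
idxstats = [
    "k141_1 flag=1 multi=3.0000 len=2600\t2600\t12\t0\n",
    "*\t0\t0\t5\n",
]
full = compare(headers, idxstats)
cut = compare(headers, idxstats, first_token_only=True)
print(len(full["shared"]), len(cut["shared"]))  # prints: 1 0
```

In this toy case the names match when compared in full, but nothing is shared once the FASTA side is cut at the first space, which is how two header lists can look identical while a tool still fails to pair them.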