Passing bam::record::Record between threads causes a segfault #293

Open
@DonFreed

Thank you for the very nice library!

I'm working on a tool that passes reads between threads. Unfortunately, the tool produces non-deterministic segmentation faults and other memory errors. I've traced some of these issues back to rust-htslib, which crashes somewhat randomly when bam::record::Records are passed between threads. I am not very familiar with this library, so I am wondering whether this is the expected behavior?

Here is a simplified example that can reproduce the crash:

use std::error::Error;
use std::str;
use std::sync::mpsc::{self, Receiver};
use std::thread;

use rust_htslib::bam::{Read, Reader};
use rust_htslib::bam::record::Record;

fn sum_mapqs(rx: Receiver<Option<Record>>) -> Result<(), Box<dyn Error>> {
    let mut total_mapq = 0u64;
    loop {
        match rx.recv() {
            Ok(x) => {
                match x {
                    Some(read) => {
                        let mapq = read.mapq();
                        total_mapq = total_mapq.saturating_add(mapq as u64);
                    },
                    None => {  // No more data
                        println!("Total MapQ: {}", total_mapq);
                        return Ok(());
                    },
                }
            },
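            Err(e) => {
                // recv() only fails once every sender is gone, so stop instead of looping forever
                eprintln!("Error receiving data: {}", e);
                return Err(Box::new(e));
            }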
        }
    }
}

fn main() -> Result<(), Box<dyn Error>> {
    let mut bam = match Reader::from_stdin() {
        Ok(bam) => { bam },
        Err(e) => { return Err(Box::new(e)); },
    };

    // Initialize the writer thread
    let (tx, rx) = mpsc::channel();
    let writer = thread::spawn(move || {
        if let Err(e) = sum_mapqs(rx) {
            eprintln!("Error writing output - {}", e);
        }
    });

    for read in bam.records() {
        match read {
            Ok(read) => {
                eprintln!("Parsed read: {}", str::from_utf8(read.qname()).unwrap());
                if let Err(e) = tx.send(Some(read)) {
                    eprintln!("Error sending data to writer thread");
                    return Err(Box::new(e));
                }
            },
            Err(e) => {
                return Err(Box::new(e));
            },
        }
    }

    // Close the spawned thread
    let _ = tx.send(None);
    let _ = writer.join().unwrap();
    Ok(())
}

Compiling with RUSTFLAGS="-g" cargo build and then running the binary with a SAM file piped through stdin produces the following:

$ cat test.sam | target/debug/rust-htslib-crash
...
Parsed read: H203:185:D2990ACXX:4:1101:11561:5493
23559 Broken pipe             cat test.sam
23560 Segmentation fault      (core dumped) | target/debug/rust-htslib-crash

Re-running the same command crashes at different points in the program. I've attached a backtrace from one of the crashes: crash backtrace.txt
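As a control for isolating the problem (and a possible stopgap), here is a minimal sketch of the same channel pattern that sends a plain owned struct instead of a Record. It uses only the standard library. `ReadSummary` and `sum_on_worker` are hypothetical names that are not part of rust-htslib; in the real tool the struct would presumably be filled from `read.qname()` and `read.mapq()` on the reader thread. This does not address the root cause; it only avoids moving a Record across the thread boundary.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical owned stand-in for the record fields needed downstream.
/// It holds no pointers into htslib-managed memory.
#[derive(Debug, Clone, PartialEq)]
struct ReadSummary {
    qname: String,
    mapq: u8,
}

/// Sums MAPQ values until the `None` sentinel arrives or every sender is dropped.
fn sum_mapqs(rx: Receiver<Option<ReadSummary>>) -> u64 {
    let mut total = 0u64;
    while let Ok(Some(read)) = rx.recv() {
        total = total.saturating_add(u64::from(read.mapq));
    }
    total
}

/// Sends the summaries to a worker thread and returns the summed MAPQ.
fn sum_on_worker(reads: Vec<ReadSummary>) -> u64 {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || sum_mapqs(rx));
    for read in reads {
        eprintln!("Sending read: {}", read.qname);
        tx.send(Some(read)).expect("worker thread hung up");
    }
    // Sentinel that tells the worker there is no more data
    tx.send(None).expect("worker thread hung up");
    worker.join().expect("worker thread panicked")
}

fn main() {
    let reads = vec![
        ReadSummary { qname: "read1".to_string(), mapq: 60 },
        ReadSummary { qname: "read2".to_string(), mapq: 30 },
    ];
    println!("Total MapQ: {}", sum_on_worker(reads));
}
```

If this variant runs cleanly on the same input while the Record version crashes, that would point at the Record being moved across threads rather than at the channel logic.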
