Skip to content

Panic with follows from span in multithreaded runtime. #183

Open
@mTsBucy1

Description

@mTsBucy1

Bug Report

Version

tracing-opentelemetry: 0.25.0 / 0.27.0 (elaborated below),
tracing: 0.1.40

Platform

Linux mtstestclient03 6.8.0-40-generic #40~22.04.3-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 30 17:30:19 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Description

I have this scenario in my code:

  1. I spawn (multiple) task, which loops and call async instrumented function.
  2. this function gets its tracing::Span::current().id() and sends it through a tokio mpsc channel.
  3. Another task reads this, spawns a instrumented async function, which follows from the received id.

Suddenly my code paniced on startup with

thread 'tokio-runtime-worker' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-opentelemetry-0.25.0/src/layer.rs:953:18:
Missing otel data span extensions
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-subscriber-0.3.18/src/registry/sharded.rs:437:58:
Mutex poisoned: PoisonError { .. }
[stack trace...]
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.

This happened here in the code https://github.com/tokio-rs/tracing-opentelemetry/blob/v0.25.0/src/layer.rs#L953.
My hourly job crashed 4 times this way in the past week. I.e. approximately every 40th time.

I wrote a test, now on the current branch (0.27.0), to try to replicate this behaviour, but it also didn't show the same panic. However, I noticed another odd behaviour: The asserted count fails for a tokio new_multi_thread runtime, but succeeds with the tokio new_current_thread one. I am not sure whether these are indeed related, but it seems to be a bug in its own right.

// in tests/follows_from.rs

#[test]
fn trace_follows_from_diff_thread() {
    use tokio::sync::mpsc;

    #[tracing::instrument]
    async fn span_follows_this(chan: &mpsc::Sender<Option<tracing::Id>>) {
        chan.send(tracing::Span::current().id())
            .await
            .expect("Channel should be open");
    }

    #[tracing::instrument(follows_from = [cause])]
    async fn span(cause: Option<tracing::Id>) {}

    async fn receiver(mut chan: mpsc::Receiver<Option<tracing::Id>>) {
        while let Some(cause) = chan.recv().await {
            span(cause).await;
        }
    }

    let (_tracer, provider, exporter, subscriber) = test_tracer();

    tracing::subscriber::with_default(subscriber, || {
        tokio::runtime::Builder::new_multi_thread()
            .enable_all()
            .build()
            .unwrap()
            .block_on(async {
                let (tx, rx) = mpsc::channel(4);
                tokio::spawn(receiver(rx));
                span_follows_this(&tx).await;
                tokio::time::sleep(std::time::Duration::from_millis(10)).await;
                drop(tx);
            })
    });

    drop(provider); // flush all spans
    let spans = exporter.0.lock().unwrap();
    // The follower and followee
    assert_eq!(spans.len(), 2);
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions