Bug Description
After upgrading from fuser 0.14.0 to 0.16.0, we discovered that BackgroundSession::drop() no longer blocks until the filesystem's destroy() method is called. This behavioral change causes our worker threads to hang indefinitely because they rely on destroy() being called to trigger cleanup.
Environment
- fuser version: 0.16.0 (upgrading from 0.14.0)
- Rust version: 1.88.0
- OS: Linux (Ubuntu 24.04.2 LTS)
- Use case: FUSE filesystem with background worker threads for Azure Blob Storage
Expected Behavior (v0.14.0)
{
    let session = BackgroundSession::new(...);
    // ... use session ...
} // <-- drop() blocks here until:
  //   1. Unmount completes
  //   2. destroy() is called on filesystem
  //   3. Background thread finishes
Worker threads would detect channel closure (from destroy()) and exit cleanly.
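For illustration, the channel-closure pattern described above can be sketched with the standard library alone. `FsModel`, `spawn_worker`, and `run_lifecycle` are hypothetical names, not fuser APIs or our production code. The sketch assumes that `destroy()` drops the only sender, which is what makes the worker's `recv()` return an error.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a filesystem that feeds a worker over a channel.
struct FsModel {
    tx: Option<Sender<u32>>,
}

impl FsModel {
    /// Models `destroy()`: dropping the only sender closes the channel.
    fn destroy(&mut self) {
        self.tx = None;
    }
}

/// Spawns a worker that runs until the channel is closed and drained,
/// then returns the number of items it processed.
fn spawn_worker(rx: Receiver<u32>) -> JoinHandle<usize> {
    thread::spawn(move || {
        let mut processed = 0;
        // `recv` returns Err once every sender is gone and the queue is empty.
        while let Ok(_segment) = rx.recv() {
            processed += 1;
        }
        processed
    })
}

/// Runs one full lifecycle and returns how many items the worker handled.
fn run_lifecycle() -> usize {
    let (tx, rx) = channel();
    let worker = spawn_worker(rx);
    let mut fs = FsModel { tx: Some(tx) };
    for i in 0..3 {
        fs.tx.as_ref().unwrap().send(i).unwrap();
    }
    // If this call never happens, the worker blocks in `recv` forever.
    fs.destroy();
    worker.join().unwrap()
}
```

The final `join` only returns because `destroy()` ran first; removing that call reproduces the hang in miniature.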
Actual Behavior (v0.16.0)
{
    let session = BackgroundSession::new(...);
    // ... use session ...
} // <-- drop() returns IMMEDIATELY
  // destroy() is never called (or called much later)
  // Background thread continues running
  // Worker threads hang forever waiting on channels
Root Cause Analysis
The regression stems from this change in BackgroundSession:
v0.14.0:
pub struct BackgroundSession {
    _mount: Mount, // Always present
    guard: Mutex<JoinHandle<Result<()>>>,
}
v0.16.0:
pub struct BackgroundSession {
    _mount: Option<Mount>, // Now optional!
    guard: Mutex<JoinHandle<Result<()>>>,
}
Impact: When _mount is None during drop:
- No Mount::drop() is triggered
- No blocking occurs waiting for unmount
- destroy() is never called on the filesystem
- Background FUSE thread continues running
- Application worker threads hang indefinitely
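The first item in the list can be reproduced in a small std-only model. `MountModel` and `SessionModel` are hypothetical stand-ins for the types above, not fuser's implementation, and the sketch makes no claim about when fuser actually leaves the field as `None`. It only shows that an `Option` field holding `None` runs no destructor when its owner is dropped.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for `Mount`: its Drop impl records that teardown ran.
struct MountModel {
    unmounted: Arc<AtomicBool>,
}

impl Drop for MountModel {
    fn drop(&mut self) {
        self.unmounted.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical stand-in mirroring the v0.16.0 field layout.
struct SessionModel {
    _mount: Option<MountModel>,
}

/// Drops a session and reports whether the mount's teardown ran.
fn teardown_ran(with_mount: bool) -> bool {
    let unmounted = Arc::new(AtomicBool::new(false));
    let session = SessionModel {
        // `bool::then` builds the mount only when `with_mount` is true.
        _mount: with_mount.then(|| MountModel {
            unmounted: Arc::clone(&unmounted),
        }),
    };
    drop(session);
    unmounted.load(Ordering::SeqCst)
}
```

Here `teardown_ran(true)` observes the teardown and `teardown_ran(false)` does not, which matches the difference between the two struct definitions.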
The BackgroundSession::join() method exists:
pub fn join(self) {
    drop(_mount);
    guard.join().unwrap().unwrap();
}
This provides synchronous cleanup, but in our architecture:
- join() ensures destroy() is called
- Our destroy() implementation is empty (just logging)
- Worker threads still need explicit signaling to exit
So join() helps but doesn't fully restore v0.14 behavior where drop was guaranteed to be blocking.
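One possible way to get a blocking drop back on the application side is a wrapper whose `Drop` performs the join. The sketch below models this with a plain `JoinHandle`. `BlockingSession` is a hypothetical name, and in real code the field would presumably hold an `Option<BackgroundSession>` and call its `join()` instead. It restores only the blocking part; workers would still need their own exit signal.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical wrapper that makes drop block until the background thread ends.
struct BlockingSession {
    // Assumption: real code would store the fuser session here instead.
    handle: Option<JoinHandle<()>>,
}

impl Drop for BlockingSession {
    fn drop(&mut self) {
        // `take()` moves the handle out of `&mut self` so it can be consumed.
        if let Some(handle) = self.handle.take() {
            // Ignore a panicked thread here; drop must not panic in turn.
            let _ = handle.join();
        }
    }
}

/// Drops a wrapper around a slow thread and reports whether the thread
/// had finished by the time drop returned.
fn finished_after_drop() -> bool {
    let done = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&done);
    let handle = thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        flag.store(true, Ordering::SeqCst);
    });
    drop(BlockingSession { handle: Some(handle) });
    done.load(Ordering::SeqCst)
}
```

The `Option` is needed because `join` consumes its receiver while `Drop::drop` only gets `&mut self`.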
Current Workaround
We're using a global AtomicBool flag:
use std::sync::atomic::{AtomicBool, Ordering};

static TEARING_DOWN: AtomicBool = AtomicBool::new(false);

impl Drop for Session {
    fn drop(&mut self) {
        // Signal workers before waiting
        TEARING_DOWN.store(true, Ordering::SeqCst);
        // Wait for workers to exit
        self.on_storage_tasks_complete();
        // Reset for next test (important for sequential tests)
        TEARING_DOWN.store(false, Ordering::SeqCst);
    }
}
Workers check this flag:
loop {
    match channel.recv_timeout(TIMEOUT) {
        Ok(segment) => process(segment),
        Err(_) => {
            if TEARING_DOWN.load(Ordering::SeqCst) {
                break; // Exit worker
            }
        }
    }
}
Questions
- Was this change intentional? The _mount: Option<Mount> change seems significant but isn't documented in release notes.
- Is this a breaking change? Applications that relied on drop() blocking behavior will break silently.
- Should join() be documented as required? If users need synchronous cleanup, should they always use join() instead of relying on drop()?
- Why is _mount now Option<>? What use case requires it to be optional?
Impact: This affects any application that:
- Spawns worker threads coordinated with FUSE lifecycle
- Relies on destroy() being called during teardown
- Runs sequential tests (global state pollution from lingering sessions)
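For the sequential-test case in the last item, one option is to scope the shutdown flag to each session instead of using a global static. The following is a std-only sketch under that assumption. `Workers`, `worker_loop`, and the 10 ms poll interval are hypothetical and not part of fuser. It also treats a disconnected channel as an exit condition so the loop cannot spin once all senders are gone.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical per-session worker state; replaces a global static flag.
struct Workers {
    shutting_down: Arc<AtomicBool>,
    tx: Sender<u32>,
    handle: Option<JoinHandle<usize>>,
}

/// Processes items until shutdown is signaled or the channel is closed.
fn worker_loop(rx: Receiver<u32>, shutting_down: Arc<AtomicBool>) -> usize {
    let mut processed = 0;
    loop {
        match rx.recv_timeout(Duration::from_millis(10)) {
            Ok(_segment) => processed += 1,
            // Queue is idle: exit only if this session is shutting down.
            Err(RecvTimeoutError::Timeout) => {
                if shutting_down.load(Ordering::SeqCst) {
                    break;
                }
            }
            // All senders dropped: nothing more can arrive.
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    processed
}

impl Workers {
    fn start() -> Self {
        let (tx, rx) = channel();
        let shutting_down = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&shutting_down);
        let handle = thread::spawn(move || worker_loop(rx, flag));
        Workers { shutting_down, tx, handle: Some(handle) }
    }

    /// Signals the worker, waits for it, and returns the processed count.
    fn shutdown(mut self) -> usize {
        self.shutting_down.store(true, Ordering::SeqCst);
        self.handle.take().map(|h| h.join().unwrap()).unwrap_or(0)
    }
}
```

Because each `Workers` value owns its own flag, a session that lingers from an earlier test cannot affect the next one, and no reset step is needed.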
We'd appreciate clarification on whether this was intentional and what the recommended migration path is. Thank you!