@AryanBagade
Contributor

Summary

Implements workspace diagnostic mode for Pyrefly, allowing users to see type errors from all files in the workspace, not just open files (similar to Pyright's diagnosticMode setting).

Closes #397

Changes

Backend

  • Wired up the existing DiagnosticMode enum (was marked as dead code)
  • Modified get_diag_if_shown() to respect diagnostic mode setting
  • Updated publish closure to use get_all_errors() in workspace mode
  • Uses incremental transaction errors (no full rechecks)

Frontend & Tests

  • Added python.analysis.diagnosticMode setting to VS Code extension
  • Updated extension.ts to monitor python.analysis configuration changes
  • Added 3 LSP interaction tests:
    • test_workspace_mode_uses_get_all_errors
    • test_open_files_only_mode_filters_correctly
    • test_default_mode_is_open_files_only
  • Added test files for workspace diagnostic mode scenarios

Configuration

Users can now choose between:

  • 'openFilesOnly' (default): Show type errors only in open files
  • 'workspace': Show type errors in all files within the workspace
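A minimal sketch of how the two setting values could map onto an enum with `openFilesOnly` as the default. The `from_setting` helper and the fallback for unrecognised values are assumptions made for illustration, not Pyrefly's actual code.

```rust
/// Hypothetical mirror of the setting described above; not Pyrefly's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum DiagnosticMode {
    /// Show type errors only in open files (the default).
    #[default]
    OpenFilesOnly,
    /// Show type errors in all files within the workspace.
    Workspace,
}

impl DiagnosticMode {
    /// Parse the value of `python.analysis.diagnosticMode`.
    /// Assumption: a missing or unrecognised value falls back to the default.
    pub fn from_setting(value: Option<&str>) -> Self {
        match value {
            Some("workspace") => DiagnosticMode::Workspace,
            Some("openFilesOnly") => DiagnosticMode::OpenFilesOnly,
            _ => DiagnosticMode::default(),
        }
    }
}
```

Falling back to the default keeps existing users on the old behaviour when the setting is absent.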

(I have shared a demo video on Discord in the dev channel.)

  - Remove #[allow(dead_code)] from diagnostic_mode field
  - Add get_diagnostic_mode() helper method to Workspaces impl
  - Returns OpenFilesOnly as default for backward compatibility

  Related to facebook#397
  - Add DiagnosticMode import to server.rs
  - Modify get_diag_if_shown() to respect diagnostic mode setting
  - Update publish closure to use get_all_errors() in Workspace mode
  - Use transaction.get_errors(&handles) in OpenFilesOnly mode (default)

  This allows Pyrefly to analyze all files in the workspace when
  diagnosticMode is set to 'workspace', similar to Pyright's behavior.
  The implementation avoids full rechecks by using the transaction's
  cached errors, following the incremental pattern from run_watch.

  Defaults to OpenFilesOnly mode for backward compatibility.

  Related to facebook#397
…c mode

- Add python.analysis.diagnosticMode setting to VS Code extension
- Update extension.ts to monitor python.analysis configuration changes
- Add 3 LSP interaction tests verifying both diagnostic modes
- Add test files for workspace diagnostic mode scenarios

Allows users to choose between:
- 'openFilesOnly' (default): Show errors only in open files
- 'workspace': Show errors in all workspace files

All tests pass. Manual testing verified both modes work correctly.
@meta-cla meta-cla bot added the cla signed label Oct 29, 2025
@kinto0 kinto0 self-requested a review October 29, 2025 18:10
@meta-codesync
meta-codesync bot commented Oct 29, 2025

@kinto0 has imported this pull request. If you are a Meta employee, you can view this in D85782844.

Contributor

@kinto0 kinto0 left a comment

thank you!! it looks great! a few small nits, but it's basically there!

- Remove unnecessary diagnostic map initialization
- Fix error collection to use get_errors() instead of get_all_errors()
- Rely on get_diag_if_shown() for per file diagnostic mode filtering
- Add test for workspace mode not showing errors outside workspace
@AryanBagade AryanBagade requested a review from kinto0 October 29, 2025 20:24
Contributor

@kinto0 kinto0 left a comment

sorry about the back and forth: I noticed one more issue in a manual test (same as this) and think the test should be modified a little to test that behavior

I also think some of the cargo tests are failing

.server
.diagnostic("workspace_diagnostic_mode/opened_file.py");

// File has no errors
Contributor

@kinto0 kinto0 Oct 29, 2025

can we make sure this test tests that errors appear for files that aren't yet opened? I don't think it would pass (I tested this manually).

I think we need to adjust the arguments to this to include every file that is included in the project. otherwise, we might not end up checking them

it gets a little confusing because files can be in the workspace but not covered by a pyrefly config. or they can be in a config but not covered by a workspace. I think a reasonable approximation is to include all files covered by any config of any opened file. if we actually use the workspace files themselves, anything that's ignored will still appear.

it might make sense to abstract out and reuse the logic in did_open that will get the config to populate_project_files_if_necessary. then you can do what populate_all_project_files_in_config does to get all files in the project paths and validate those files.

if you think we also want to use any workspace file, you can copy what populate_workspace_files_if_necessary does (but I would skip that tbh, I think project files are a good approximation)
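A rough sketch of the approximation suggested above: check every file covered by any config that also covers an open file. `ProjectConfig`, `covers`, and `files_to_check` are hypothetical stand-ins that use simple directory prefixes; Pyrefly's real configs use globs and are looked up through its config finder.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a config: a root directory plus excluded subdirectories.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProjectConfig {
    pub root: PathBuf,
    pub excludes: Vec<PathBuf>,
}

impl ProjectConfig {
    /// True if `path` is under the root and not under any excluded directory.
    pub fn covers(&self, path: &Path) -> bool {
        path.starts_with(&self.root) && !self.excludes.iter().any(|e| path.starts_with(e))
    }
}

/// Every candidate file covered by a config that also covers at least one open file.
pub fn files_to_check(
    configs: &[ProjectConfig],
    open_files: &[PathBuf],
    candidates: &[PathBuf],
) -> BTreeSet<PathBuf> {
    // Configs that are "active" because some open file belongs to them.
    let active: Vec<&ProjectConfig> = configs
        .iter()
        .filter(|c| open_files.iter().any(|f| c.covers(f)))
        .collect();
    candidates
        .iter()
        .filter(|p| active.iter().any(|c| c.covers(p)))
        .cloned()
        .collect()
}
```

Because the selection goes through the configs rather than the raw workspace folder, excluded files stay out of the result, which is the concern raised about using workspace files directly.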

Contributor Author

@AryanBagade AryanBagade Oct 30, 2025

Thanks for the detailed feedback, @kinto0!

Implementation

textDocument/diagnostic request handler

When a diagnostic request comes in for a file in workspace mode:

  • Check if the file is in workspace mode and not currently open
  • Create a handle using handle_from_module_path with a filesystem path (not make_open_handle which expects in-memory content)
  • Run the transaction on that specific file to analyze it on-demand

document_diagnostics function

  • For open files → use make_open_handle (in-memory handle)
  • For unopened files in workspace mode → use handle_from_module_path (filesystem handle)
  • For files neither open nor in workspace mode → return empty diagnostics
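The three cases listed above can be sketched as a small decision function. `HandleKind` and `handle_kind_for` are hypothetical names used only for illustration; the real code builds full handles with `make_open_handle` and `handle_from_module_path`.

```rust
/// Hypothetical handle kinds; Pyrefly's real handles carry more information.
#[derive(Debug, PartialEq, Eq)]
pub enum HandleKind {
    /// Backed by the editor's in-memory buffer (the `make_open_handle` case).
    Memory,
    /// Backed by the file on disk (the `handle_from_module_path` case).
    FileSystem,
}

/// Pick the handle kind for a file, or `None` when no diagnostics should be computed.
pub fn handle_kind_for(is_open: bool, workspace_mode: bool) -> Option<HandleKind> {
    match (is_open, workspace_mode) {
        // Open files always use the in-memory content.
        (true, _) => Some(HandleKind::Memory),
        // Unopened files are read from disk, but only in workspace mode.
        (false, true) => Some(HandleKind::FileSystem),
        // Neither open nor in workspace mode: return empty diagnostics.
        (false, false) => None,
    }
}
```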

Test Results

  • ✅ All 4 workspace diagnostic mode tests pass
  • ✅ The test that checks errors appear for unopened files now passes with actual type errors

Question About Implementation Approach

I've implemented a "pull" model where unopened files are analyzed when textDocument/diagnostic is explicitly requested for them.

This means:

  • The tests pass (because they explicitly request diagnostics for unopened files)
  • But in actual VS Code usage, unopened files don't show errors automatically

Should I also implement a "push" model where validate_in_memory_for_possibly_committable_transaction proactively publishes diagnostics for all workspace files when in workspace mode? Or is the current "pull" model approach acceptable?
Let me know your thoughts!

@AryanBagade
Contributor Author

AryanBagade commented Oct 30, 2025

I've implemented the pull model (where unopened files are analyzed when textDocument/diagnostic is explicitly requested for them, which is what the tests do).

However, I noticed during manual testing that VS Code doesn't automatically show errors from unopened files. Looking at the LSP server output logs, I can see that unopened files ARE being analyzed (logs show "Prepare to check 2 files" and "Populated all files in the project path"), but their diagnostics aren't being published to VS Code's Problems panel because VS Code doesn't request diagnostics for them via textDocument/diagnostic, it only requests diagnostics for open files.

I attempted to implement the push model (adding workspace files to handles in validate_in_memory_for_possibly_committable_transaction so their diagnostics get published via publishDiagnostics), but the LSP server hangs/crashes during initialization even with just 2 files in the workspace.

I've tried several approaches:

  • Adding workspace files to handles and calling transaction.run() - crashes
  • Adding workspace files without calling run() (relying on populate_all_project_files_in_config) - still crashes
  • Using get_all_errors() - works but shows dependency errors (e.g., collections.Counter.__init__)

@kinto0
Contributor

kinto0 commented Oct 30, 2025

I've implemented the pull model (where unopened files are analyzed when textDocument/diagnostic is explicitly requested for them, which is what the tests do).

I think it's necessary to keep our push model for a few reasons:

  • Pyrefly is the source of truth for what has "changed": we only want to recheck when the actual contents change and the naive file watcher will do extra work.
  • Pyrefly will need to error dependent files when a dependency updates, which is not something textDocument/diagnostic supports

However, I noticed during manual testing that VS Code doesn't automatically show errors from unopened files. Looking at the LSP server output logs, I can see that unopened files ARE being analyzed (logs show "Prepare to check 2 files" and "Populated all files in the project path"), but their diagnostics aren't being published to VS Code's Problems panel because VS Code doesn't request diagnostics for them via textDocument/diagnostic, it only requests diagnostics for open files.

This is the case when we need to publishDiagnostics, which your code in validate_in_memory_for_possibly_committable_transaction should handle.

I attempted to implement the push model (adding workspace files to handles in validate_in_memory_for_possibly_committable_transaction so their diagnostics get published via publishDiagnostics), but the LSP server hangs/crashes during initialization even with just 2 files in the workspace.

I've tried several approaches:

  • Adding workspace files to handles and calling transaction.run() - crashes
  • Adding workspace files without calling run() (relying on populate_all_project_files_in_config) - still crashes
  • Using get_all_errors() - works but shows dependency errors (e.g., collections.Counter.__init__)

let's discuss over discord

thanks for all the hard work! this is a fun one

Changes:
- Modified get_diag_if_shown() to respect project includes/excludes for all files
- In workspace mode: shows diagnostics for all project files (open or closed)
- In openFilesOnly mode: shows diagnostics only for open project files
- Filters out stdlib/dependency errors using is_in_project check
- Uses get_all_errors() in workspace mode for efficiency
- Properly handles filesystem handles for unopened files

This implementation uses get_all_errors() as suggested, with filtering done in get_diag_if_shown() to exclude files outside project scope.
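The filtering rule described in this commit can be sketched as follows. `FileError` is a hypothetical, flattened stand-in: in the real code, `is_open` and `is_in_project` are derived from the open-files map and the config's project includes/excludes rather than stored on the error.

```rust
/// Hypothetical mirror of the two modes discussed in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagnosticMode {
    OpenFilesOnly,
    Workspace,
}

/// Hypothetical view of one error: the file it is in plus two facts about that file.
#[derive(Debug, Clone)]
pub struct FileError {
    pub path: String,
    pub is_open: bool,
    pub is_in_project: bool,
}

/// Workspace mode shows any project file; open-files-only mode also requires it to be open.
pub fn should_show(mode: DiagnosticMode, err: &FileError) -> bool {
    match mode {
        DiagnosticMode::Workspace => err.is_in_project,
        DiagnosticMode::OpenFilesOnly => err.is_open && err.is_in_project,
    }
}

/// Paths of the errors that survive the filter, in input order.
pub fn shown_paths(mode: DiagnosticMode, errors: &[FileError]) -> Vec<&str> {
    errors
        .iter()
        .filter(|e| should_show(mode, e))
        .map(|e| e.path.as_str())
        .collect()
}
```

Requiring `is_in_project` in both modes is what keeps stdlib and dependency errors out of the published diagnostics even when every error in the transaction is collected.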
@AryanBagade AryanBagade requested a review from kinto0 October 30, 2025 19:18
transaction: &Transaction<'_>,
params: DocumentDiagnosticParams,
) -> DocumentDiagnosticReport {
let handle = make_open_handle(
Contributor

I actually disagree with this part. Language clients will not send document_diagnostics for every file. If they request it, we should return it regardless of the mode.

Contributor Author

hmm yess

@kinto0
Contributor

kinto0 commented Oct 30, 2025

All of the tests seem to be based on document_diagnostic, but that doesn't actually test the behavior, since the language server operates on publishDiagnostics in the IDE. You should remove the document_diagnostic changes and make the tests check publishDiagnostics.

@AryanBagade AryanBagade requested a review from kinto0 October 31, 2025 07:51
Contributor

@kinto0 kinto0 left a comment

test now looks good, but you still have the logic change for document_diagnostics (the function and the code in as_request::<DocumentDiagnosticRequest>)

@AryanBagade
Contributor Author

test now looks good, but you still have the logic change for document_diagnostics (the function and the code in as_request::<DocumentDiagnosticRequest>)

Removed the mode filtering logic from the DocumentDiagnosticRequest handler.

@AryanBagade AryanBagade requested a review from kinto0 October 31, 2025 18:10
Contributor

@kinto0 kinto0 left a comment

looks great! thanks for the hard work. I'll do some final testing and merge it after a second set of eyes.

a few things I might add in follow-ups:

  • add more documentation (to ide.mdx for our website and readme.md for the extension page)
  • rename diagnostic mode "workspace" to diagnostic mode "project" since it seems to fit the implementation better

@AryanBagade
Contributor Author

Thanks for the review and suggestions!
Would you like me to handle the follow-up items (documentation + renaming to "project" mode) in separate PRs????

@kinto0
Contributor

kinto0 commented Oct 31, 2025

Thanks for the review and suggestions! Would you like me to handle the follow-up items (documentation + renaming to "project" mode) in separate PRs????

sure, if you want, that'd be great!

On a related note, I'm noticing a few more things as I take a closer look and test, and I'm wondering whether addressing them would simplify the code a lot:

  • we have a few redundant checks for whether workspaces include this config or not and which handles to run the transaction on
  • every time we validate_in_memory_for_transaction we likely need to check these additional handles from the config

I'm wondering if we can simplify it by moving all the logic about which handles to run into this function:

    /// Run the transaction with the in-memory content of open files. Returns the handles of open files when the transaction is done.
    fn validate_in_memory_for_transaction(
        &self,
        state: &State,
        open_files: &RwLock<HashMap<PathBuf, Arc<String>>>,
        transaction: &mut Transaction<'_>,
    ) -> Vec<Handle> {
        let handles = open_files
            .read()
            .keys()
            .flat_map(|x| {
                self.workspaces.get_with(x, |(_, w)| {
                    let handle = make_open_handle(state, x);
                    if (w.lsp_analysis_config.is_some_and(|c| {
                        matches!(c.diagnostic_mode, Some(DiagnosticMode::Workspace))
                    })) {
                        let config = self
                            .state
                            .config_finder()
                            .python_file(handle.module(), handle.path());
                        // get all files in config and add them here
                        vec![handle]
                    } else {
                        vec![handle]
                    }
                })
            })
            .collect::<Vec<_>>();
        transaction.set_memory(
            open_files
                .read()
                .iter()
                .map(|x| (x.0.clone(), Some(x.1.dupe())))
                .collect::<Vec<_>>(),
        );
        transaction.run(&handles, Require::Everything);
        handles
    }

then validate_in_memory_for_possibly_committable_transaction can stay very similar to how it originally was

but I'm dealing with lifetime issues. I think it's worth another look, as it might help with clarity and prevent hard-to-discover bugs. I will have more time next week to look into it.
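The idea of deciding in one place which files to run the transaction on can be sketched with plain paths instead of handles. `paths_to_run` and its two closure parameters are hypothetical stand-ins for the workspace lookup and the config's file listing. The sketch only illustrates the control flow and does not attempt to address the lifetime issue mentioned above.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

/// Hypothetical mirror of the two modes discussed in this PR.
pub enum DiagnosticMode {
    OpenFilesOnly,
    Workspace,
}

/// For each open file, include the file itself; in workspace mode also include
/// every file of its project. Duplicates are dropped, first occurrence wins.
pub fn paths_to_run<M, P>(open_files: &[PathBuf], mode_of: M, project_files_of: P) -> Vec<PathBuf>
where
    M: Fn(&Path) -> DiagnosticMode,
    P: Fn(&Path) -> Vec<PathBuf>,
{
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for open in open_files {
        let mut group = vec![open.clone()];
        if matches!(mode_of(open.as_path()), DiagnosticMode::Workspace) {
            group.extend(project_files_of(open.as_path()));
        }
        for path in group {
            // `insert` returns false when the path was already collected.
            if seen.insert(path.clone()) {
                out.push(path);
            }
        }
    }
    out
}
```

With everything funnelled through one function like this, callers would not need their own checks for workspace mode.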

@AryanBagade
Contributor Author

Thanks!
Hmm, I see what you mean about centralizing the logic. Happy to help with the refactor if needed, let me know!

@kinto0
Contributor

kinto0 commented Nov 7, 2025

I'm sorry about this falling through this week. There's a few things that have to happen:

  • merge conflicts
  • centralize logic in one place

I attempted this but ran into the lifetime issue described above. I also made a few changes while I was messing around (shown at bottom). I'm happy to set up time next week if you want to work together synchronously, or I can take another look / work with you next week on discord.

From c3f705ac3bdcc7aa9aedb41910479a7ab693662a Mon Sep 17 00:00:00 2001
From: Kyle Into <[email protected]>
Date: Fri, 7 Nov 2025 10:06:29 -0800
Subject: [PATCH] suggestions

Differential Revision: D86536062

fbshipit-source-id: c3f705ac3bdcc7aa9aedb41910479a7ab693662a


---

diff --git a/pyrefly/lib/lsp/non_wasm/server.rs b/pyrefly/lib/lsp/non_wasm/server.rs
--- a/pyrefly/lib/lsp/non_wasm/server.rs
+++ b/pyrefly/lib/lsp/non_wasm/server.rs
@@ -924,7 +924,6 @@
                         self.validate_in_memory_and_commit_if_possible(ide_transaction_manager);
                         let transaction =
                             ide_transaction_manager.non_committable_transaction(&self.state);
-
                         self.send_response(new_response(
                             x.id,
                             Ok(self.document_diagnostics(&transaction, params)),
@@ -1081,6 +1080,7 @@
 
     /// Run the transaction with the in-memory content of open files. Returns the handles of open files when the transaction is done.
     fn validate_in_memory_for_transaction(
+        &self,
         state: &State,
         open_files: &RwLock<HashMap<PathBuf, Arc<String>>>,
         transaction: &mut Transaction<'_>,
@@ -1088,7 +1088,23 @@
         let handles = open_files
             .read()
             .keys()
-            .map(|x| make_open_handle(state, x))
+            .flat_map(|x| {
+                self.workspaces.get_with(x.to_path_buf(), |(_, w)| {
+                    let handle = make_open_handle(state, x);
+                    if (w.lsp_analysis_config.is_some_and(|c| {
+                        matches!(c.diagnostic_mode, Some(DiagnosticMode::Workspace))
+                    })) {
+                        let config = self
+                            .state
+                            .config_finder()
+                            .python_file(handle.module(), handle.path());
+                        // get all files in config and add them here
+                        vec![handle]
+                    } else {
+                        vec![handle]
+                    }
+                })
+            })
             .collect::<Vec<_>>();
         transaction.set_memory(
             open_files
@@ -1124,27 +1140,20 @@
                 .config_finder()
                 .python_file(ModuleName::unknown(), e.path());
 
-            // Get diagnostic mode for this file's workspace
             let diagnostic_mode = self.workspaces.get_diagnostic_mode(&path);
-
-            // File must be in project (not excluded) to show diagnostics
             let is_in_project =
                 config.project_includes.covers(&path) && !config.project_excludes.covers(&path);
 
-            // Then check based on diagnostic mode
             let is_open = open_files.contains_key(&path);
             let should_show = match diagnostic_mode {
-                // Workspace mode: show if in project (open or closed files)
                 DiagnosticMode::Workspace => is_in_project,
-                // OpenFilesOnly mode: show if open AND in project
+                // OpenilesOnly mode: show if open AND in project
                 DiagnosticMode::OpenFilesOnly => is_open && is_in_project,
-            };
+            } && self
+                .type_error_display_status(e.path().as_path())
+                .is_enabled();
 
-            if should_show
-                && self
-                    .type_error_display_status(e.path().as_path())
-                    .is_enabled()
-            {
+            if should_show {
                 return Some((path.to_path_buf(), e.to_diagnostic()));
             }
         }
@@ -1230,86 +1239,21 @@
             Err(transaction) => transaction,
         };
         let handles =
-            Self::validate_in_memory_for_transaction(&self.state, &self.open_files, transaction);
-
-        // Check if any workspace is in workspace diagnostic mode
-        let has_workspace_mode = self.workspaces.roots().iter().any(|root| {
-            matches!(
-                self.workspaces.get_diagnostic_mode(root),
-                DiagnosticMode::Workspace
-            )
-        });
-
-        // In workspace mode, analyze all project files so get_all_errors() includes unopened files
-        if has_workspace_mode {
-            let open_file_paths: std::collections::HashSet<_> =
-                self.open_files.read().keys().cloned().collect();
-            if let Some(first_open_file) = open_file_paths.iter().next() {
-                let module_path = ModulePath::filesystem(first_open_file.clone());
-                let config = self
-                    .state
-                    .config_finder()
-                    .python_file(ModuleName::unknown(), &module_path);
-                let project_path_blobs = config.get_filtered_globs(None);
-                if let Ok(paths) = project_path_blobs.files() {
-                    let project_handles: Vec<_> = paths
-                        .into_iter()
-                        .filter_map(|path| {
-                            // Skip files that are already open (already in handles)
-                            if open_file_paths.contains(&path) {
-                                return None;
-                            }
-                            let module_path = ModulePath::filesystem(path.clone());
-                            let path_config = self
-                                .state
-                                .config_finder()
-                                .python_file(ModuleName::unknown(), &module_path);
-                            if config == path_config {
-                                Some(handle_from_module_path(&self.state, module_path))
-                            } else {
-                                None
-                            }
-                        })
-                        .collect();
-                    // Analyze only for errors, not full indexing
-                    transaction.run(&project_handles, Require::Errors);
-                }
-            }
-        }
+            self.validate_in_memory_for_transaction(&self.state, &self.open_files, transaction);
 
         let publish = |transaction: &Transaction| {
             let mut diags: SmallMap<PathBuf, Vec<Diagnostic>> = SmallMap::new();
             let open_files = self.open_files.read();
-
-            // Pre-populate with empty arrays for all open files to ensure we send
-            // publishDiagnostics notifications even when errors are cleared
             for x in open_files.keys() {
                 diags.insert(x.as_path().to_owned(), Vec::new());
             }
-
-            // In workspace mode, use get_all_errors() to get errors from all project files.
-            // In open-files-only mode, use get_errors(&handles) to only get errors from open files.
-            // The filtering by diagnostic mode and project includes/excludes is handled in get_diag_if_shown.
-            let errors = if has_workspace_mode {
-                transaction.get_all_errors()
-            } else {
-                transaction.get_errors(&handles)
-            };
-
-            for e in errors.collect_errors().shown {
+            for e in transaction.get_errors(&handles).collect_errors().shown {
                 if let Some((path, diag)) = self.get_diag_if_shown(&e, &open_files) {
                     diags.entry(path.to_owned()).or_default().push(diag);
                 }
             }
-
             for (path, diagnostics) in diags.iter_mut() {
-                // Use appropriate handle type: memory handle for open files, filesystem for others
-                let is_open = open_files.contains_key(path);
-                let handle = if is_open {
-                    make_open_handle(&self.state, path)
-                } else {
-                    handle_from_module_path(&self.state, ModulePath::filesystem(path.clone()))
-                };
+                let handle = make_open_handle(&self.state, path);
                 Self::append_unreachable_diagnostics(transaction, &handle, diagnostics);
             }
             self.connection.publish_diagnostics(diags);
@@ -1354,7 +1298,6 @@
             self.open_files.dupe(),
         );
     }
-
     fn invalidate_find_for_configs(&self, invalidated_configs: SmallSet<ArcId<ConfigFile>>) {
         self.invalidate(|t| t.invalidate_find_for_configs(invalidated_configs));
     }
@@ -1443,7 +1386,7 @@
             let mut transaction = state.new_committable_transaction(Require::indexing(), None);
             f(transaction.as_mut());
 
-            Self::validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
+            self.validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
 
             // Commit will be blocked until there are no ongoing reads.
             // If we have some long running read jobs that can be cancelled, we should cancel them
@@ -1689,7 +1632,7 @@
             let mut transaction = state.new_committable_transaction(Require::indexing(), None);
             transaction.as_mut().set_memory(vec![(uri, None)]);
             let _ =
-                Self::validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
+                self.validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
             state.commit_transaction(transaction);
             queue_source_db_rebuild_and_recheck(
                 state.dupe(),
@@ -1991,7 +1934,7 @@
             cancellation_handles
                 .lock()
                 .insert(request_id.clone(), transaction.get_cancellation_handle());
-            Self::validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
+            self.validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
             match transaction.find_global_references_from_definition(
                 handle.sys_info(),
                 metadata,
@@ -2498,7 +2441,7 @@
             let mut transaction = state.new_committable_transaction(Require::indexing(), None);
             transaction.as_mut().invalidate_config();
 
-            Self::validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
+            self.validate_in_memory_for_transaction(&state, &open_files, transaction.as_mut());
 
             // Commit will be blocked until there are no ongoing reads.
             // If we have some long running read jobs that can be cancelled, we should cancel them

--
1.7.9.5

@AryanBagade
Contributor Author

I'm sorry about this falling through this week. There's a few things that have to happen:

  • merge conflicts
  • centralize logic in one place

I attempted this but ran into the lifetime issue described above. I also made a few changes while I was messing around (shown at bottom). I'm happy to set up time next week if you want to work together synchronously, or I can take another look / work with you next week on discord.

Hmm, I see, sure.
I'll resolve the merge conflicts, no worries.
I'll dig into it and try to centralize the code this weekend.
And yes, we can work together next week. I'll let you know on Discord if I get stuck on something!

Development

Successfully merging this pull request may close these issues.

Pyright diagnosticMode: "workspace" equivalent in Pyrefly

2 participants