Skip to content

Conversation

@sundanc
Copy link

@sundanc sundanc commented Mar 29, 2025

What does this PR do, and why?

This PR adds comprehensive debugging tools to help both users and developers troubleshoot issues in Zulip Terminal:

  1. New zulip-term-debug command-line utility with multiple functions:

    • log: Analyzes debug logs for common error patterns
    • connect: Tests connectivity to Zulip servers
    • terminal: Checks terminal capabilities for compatibility
    • run: Runs Zulip Terminal with debugging enabled
  2. Improved makefile targets for debugging:

    • make debug: Run with debug mode enabled
    • make debug-profile: Run with profiling enabled
    • make debug-clean: Clean debug files
  3. Enhanced documentation in README with examples for:

    • Remote debugging workflows
    • Profiling for performance issues
    • Common troubleshooting steps

As a frequent user of Zulip Terminal, I've sometimes found it challenging to diagnose issues. These tools standardize debugging approaches and make it easier for both users to report problems and developers to solve them.

I'm planning to apply for GSoC 2025 and would love to contribute more to Zulip Terminal in the future!

External discussion & connections

  • Discussed in #zulip-terminal in debugging improvements

How did you test this?

  • Manually - Behavioral changes
  • Manually - Visual changes
  • Existing automated tests should already cover this (only a refactor of tested code)

Self-review checklist for each commit

  • It is a minimal coherent idea
  • It has a commit summary following the documented style (title & body)
  • It has a commit summary describing the motivation and reasoning for the change
  • It individually passes linting and tests
  • It contains test additions for any new behavior
  • It flows clearly from a previous branch commit, and/or prepares for the next commit

@zulipbot zulipbot added the size: XL [Automatic label added by zulipbot] label Mar 29, 2025
@sundanc sundanc marked this pull request as draft March 29, 2025 18:57
@sundanc
Copy link
Author

sundanc commented Mar 29, 2025

@amanagr @neiljp

@neiljp neiljp added the PR needs review PR requires feedback to proceed label Mar 31, 2025
@neiljp
Copy link
Collaborator

neiljp commented Mar 31, 2025

@sundanc It seems that this is not passing the tests; some of this is related to a common error, but others are down to you - if you don't understand why, please investigate and then ask in the channel :)

It would also be helpful to split this into separate commits - these are independent changes, it seems?

@sundanc
Copy link
Author

sundanc commented Mar 31, 2025

Thank you for your feedback. I've addressed the linting issues in debug_helper.py by replacing f-strings with parameter-based logging to fix the G004 issues. This approach is recommended for performance since it defers string formatting until needed.

Regarding the failing tests, I understand my commit message format needs to follow the project's convention with area: description pattern. I'll update the commit message to match this format.

I also see that I should have separated these changes into different commits since they're independent. In the future, I'll split changes to address different concerns into separate commits for better clarity and easier review.

Let me know if there's anything else I need to address.

@sundanc sundanc marked this pull request as ready for review March 31, 2025 17:22
@sundanc sundanc force-pushed the main branch 2 times, most recently from 388c6e5 to 2758132 Compare March 31, 2025 18:39
@sundanc
Copy link
Author

sundanc commented Mar 31, 2025

@neiljp Finally, it passed all the checks. I cleaned up broken & unformatted codes that are in my forked repo commit history.

Copy link
Collaborator

@neiljp neiljp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sundanc There's a lot to cover here, and I can definitely see some of it being helpful 👍

I've left inline comments which could help with understanding some points, but until those are clarified:

  • an update to the README to indicate the profiling technique would clear up what it is (it's only in the source, otherwise?)
  • a connectivity and server-reporting feature could be useful, as a standalone tool

Re my earlier point about commits, if these changes were in individual independent commits, I may have been able to easily merge them separately first. Or some could have been squashed together later into related changes.

Comment on lines +29 to +35
debug: venv
@echo "=== Running with debug enabled ==="
$(PYTHON) -m zulipterminal.cli.run -d

debug-profile: venv
@echo "=== Running with profiling enabled ==="
$(PYTHON) -m zulipterminal.cli.run --profile
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with these is that the extra options available to the run module cannot easily be added.

While eg. themes could be debugged by changing the config file, some options are only available on the command-line, such as 'explore mode' right now.

Copy link
Author

@sundanc sundanc Apr 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about making Makefile to pass things to debug-profile & debug commands? Maybe it would be useful to use ARGS to forward arguments to the ´´run´´ module. What do you think about something like:

debug-profile: venv
	@echo "= Profiling enabled ="
	$(PYTHON) -m zulipterminal.cli.run --profile $(ARGS)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I'm not yet convinced that wrapping what is already in zulip-term into a make target is helpful.

Passing ARGS to make is some way around my original concern, but if I wanted debugging and profiling, which one of these would I do (or prefer - or document?):

  • make debug --profile ?
  • make debug-profile -d ?
  • make && zulip-term -d --profile (or with make venv) ?

The only difference for the latter is the title line establishing profiling is enabled - debugging is mentioned at startup if enabled. A small fix to the main code could make that clearer.

So while I originally said the extra parameters were the issue, it really comes down to whether eg. make debug is cleaner than simply zulip-term -d (or zulip-term --debug). The extra part is running make each time, which does ensure the environment is up to date, but isn't necessarily required frequently.

Simplifying this, ultimately we could end up with a "run" target, which is what tools like pipenv and uv can enable, but we have zulip-term set up for this already.

Comment on lines +37 to +39
debug-clean:
@echo "=== Cleaning debug files ==="
rm -f debug.log zulip-terminal.prof zulip-terminal-tracebacks.log
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This certainly cleans files from the current folder, but not necessarily from where the files are accumulating.

Here I've only seen profile files in /tmp/, and with a random name; have you seen it in the local directory? Do you also see the zulip-terminal.prof name?

I'm also a little wary of making it too easy to delete the other files, since while that information can accumulate and it's not clear which is worth keeping, it can also often be really useful! I suspect a more refined solution could be to build upon the move to use .config and store debugging and tracebacks in different folders by name/date/branch/etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're totally right. I had a messy gh codespace that I was organizing everything at that moment, so I may had a mistake about temp files. Cleaning files, especially the logs right now make me concern about changes more right now too. Yes, logs are long and hard to read files but deleting could make feel losing daily to-do. What about action of keeping an option for normal - developer mode so normal users just delete logs & devs keep them

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are XDG or similar directories for logs and so on, which are similar to the .config standard; there's an open PR I need to check for migrating to using .config. If so, a centralized location might be useful, grouped somehow by the organization and date/time of the session.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I was referring to the last sentence of my first comment in this thread. When we have separate sets of files for each session by date/organization, it would keep generated files organized and out of the current directory, as well as make it easier to clean them up in some defined way :)

Comment on lines +802 to +826
#### Remote debugging example

Here's a complete example of how to debug a specific issue:

1. Let's say you suspect an issue when sending messages. Find the function in the codebase that handles this (e.g., in `zulipterminal/ui/ui.py`).

2. Add the debugger statement just before the suspicious code:
```python
def send_message(self):
from pudb.remote import set_trace
set_trace() # This will pause execution here
# Rest of the function...
```

3. Run Zulip Terminal with debug mode enabled:
```bash
zulip-term -d
```

4. When you trigger the send_message function, check debug.log for telnet connection details:
```bash
tail -f debug.log
```

5. Connect with telnet and you'll get an interactive debugger to step through the code.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this compare to the section above?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created because remote debugging example just takes the pudb tip and puts it into action with a real use case, like debugging send_message. It's not just explaining the tool, but showing exactly how to use it in a scenario that devs can actually relate to. That makes it way easier to follow for anyone who's never set this up before, and helps connect the dots between the theory and how it works in practice too

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't really a big difference in content between your example and the earlier section, and in the interest of avoiding duplication I'd rather that we improve the existing section with more specific text instead?

  • I like the numbered list (if we end up having enough steps - specifically re debug mode!)
  • It's unclear to me that we need to pick a specific existing file or function, since names might change, but including it in a function/location looks better
  • I do like the extra comments in your version
  • It may be worth having the breakpoint further down, to indicate it need not be at the top
  • I know it's in the original documentation, but do we need debug mode for pudb? I suspect these are independent settings, unless one is using print also?
  • The section title can likely be improved by combining the existing and your draft; pudb and telnet are less important in the title

From testing just now:

  • the telnet details output to the main terminal if I breakpoint before the TUI
  • adding a breakpoint elsewhere can cause the app to seemingly lock up?
  • debug mode doesn't appear to affect this ^

Have you tried remote debugging?

It would be useful to have a link to the pudb remote debugging docs - and that may help with understanding how to handle remote debugging more generally, and give a few extra tips on using it effectively for devs.

Please break this change out into a separate commit, or one for base improvements and then another for possible improvements that need more discussion - we can always squash commits together later, or merge one first and continue discussing the other.

Comment on lines +97 to +98
# Added debug helper with proper path
"zulip-term-debug = zulipterminal.scripts.debug_helper:main",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it available to users who install the basic package, not just run it from the source tree; is that the intention?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that was the idea. Making the debug_helper available to users who install the basic package means they can use the debugging tools whether they’re running from the source tree or from an installed environment. If this brings up any concerns, I’m down to discuss it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be helpful because more input about problems helps us solve them more effectively

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is installed for all users, not just devs, then we don't need to update helper_deps with requests; it will already be installed for the main app, indirectly via zulip as it stands, though we could potentially add requests as a direct dependency.

There's no problem with having small extra tools to help users debug issues, though 'debug' is a very broad name, and splitting the tool(s) out may be good.

Comment on lines +153 to +158
elif args.command == "run":
cmd = ["zulip-term", "-d"]
if args.profile:
cmd.append("--profile")
logger.info("Running: %s", " ".join(cmd))
subprocess.run(cmd, check=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to simply wrap commands, as per:

  • zulip-term-debug run vs zulip-term -d
  • zulip-term-debug run --profile vs zulip-term -d --profile

What is the benefit of this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since 2 weeks passed, I think I was trying to clean things up or reframe it, but I didn't add any new functionality. I can roll that back and just stick to the original commands if you want

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be similar to the Makefile changes I commented on elsewhere, so I'd skip it.

Comment on lines +101 to +115
# Check for color support
colors = os.environ.get("TERM", "unknown")
logger.info("TERM environment: %s", colors)

if "COLORTERM" in os.environ:
logger.info("COLORTERM: %s", os.environ["COLORTERM"])

# Check for Unicode support
logger.info("Testing Unicode rendering capabilities:")
test_chars = [
("Basic symbols", "▶ ◀ ✓ ✗"),
("Emoji (simple)", "😀 🙂 👍"),
("Box drawing", "│ ┌ ┐ └ ┘ ├ ┤ ┬ ┴ ┼"),
("Math symbols", "∞ ∑ √ ∫ π"),
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like something that could belong in render_symbols.py, or perhaps an extension of it?

That existing tool checks the symbols that are currently being used, rather than an assortment of sample ones.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this is similar to what render_symbols.py does, but there's a difference. This script tests a range of symbols for general Unicode rendering in the terminal, while render_symbols.py focuses on the symbols used specifically in Zulip Terminal.

If the goal is to extend render_symbols.py for broader Unicode testing, this code could be added as a feature. Otherwise, it could stay separate as part of a general terminal debugging tool. What do you think—integrate it or keep it separate?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have the separate script since it provides an easy way to visually check how the symbols that we actively use are being output.

You've described the difference, but what is the benefit of testing the symbols you listed, compared to those in render_symbols.py?

I could see outputting the terminal environment (TERM, etc.) as being useful for debugging, and potentially outside of a TUI environment, in case the TUI fails for some reason (including due to those parameters?). However, we could output it before zulip-term enters the GUI. If we're in the TUI, then those could go in the about box, or something like it, as could render_symbols I guess, in a 'debug' sub-popup - in addition to the separate script we have now.

It is possible that some symbols aren't included there and would render badly on some terminals, so if you do find some then we can refactor those similarly to the others.

logger.info("No obvious errors found in the log file.")


def test_connectivity(server_url: Optional[str] = None) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the main zulip-term script does check for connectivity, and the server version can be listed from within it, I can see a simple tool along these lines being useful.

However, rather than having a script with semi-duplicate code from run.py, it would be useful to use existing functions from there - it looks very similar to it?

There is plenty of server information available other than the version, which could also be used to help with authentication issues, for example.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think good approach would be to refactor the relevant logic in run.py into reusable utility functions or a module. These functions could then be imported into both run.py and the debugging script, by ensuring shared functionality

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would seem like a reasonable way forward 👍

content = f.read()

# Look for error patterns
error_patterns = [r"ERROR", r"Exception", r"Traceback", r"Failed to"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can find lots of issues (my files are quite large right now!), but could you give an example of how it has helped you? Do you have a specific motivation for this?

I would expect that reading through the file to get multiple lines of context would be more helpful?

If this is also for reporting errors to maintainers/developers, then context can be useful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I added this is to quickly spot common error patterns, which helps a lot with large files. But yeah, I get it — without context, it’s not super helpful. Like, finding an ERROR by itself doesn’t really tell us why it happened or what led up to it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would likely be simpler to apply when we have multiple log files (as mentioned in another comment) since it would likely make it easier to eg. deduplicate errors, or find patterns, between each session.

logger = logging.getLogger(__name__)


def analyze_debug_log(log_file: str = "debug.log") -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this tool were to exist, it would be useful to identify multiple possible files. We have a tracebacks file, an API log, and a thread-exceptions log as it stands.

I wouldn't expect debug.log to necessarily have the error_patterns you mention, since anything can be 'print'ed into them, so having a default of debug.log with those patterns is surprising.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. if the tool's for analyzing logs properly, it should handle multiple log files like tracebacks, API logs, and thread-exceptions. Limiting it to debug.log with hardcoded patterns makes it less useful.

A better approach:
Let users specify which log files to analyze.
Include default or custom patterns for each file.
Maybe add auto-detection to adjust patterns based on the log file.

@neiljp neiljp added PR awaiting update PR has been reviewed & is awaiting update or response to reviewer feedback and removed PR needs review PR requires feedback to proceed labels Apr 5, 2025
@sundanc
Copy link
Author

sundanc commented Apr 23, 2025

Failed a check after sync of latest 2 commits into the fork. Gonna check that out

Copy link
Collaborator

@neiljp neiljp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sundanc Thanks for responding 👍

Particularly since these are different changes being discussed, let's split these into multiple commits, since it will make it easier to apply them individually or incrementally. Note that you currently have a merge commit at the end; please use rebase instead, as various Zulip docs note.

To summarize further discussion in the inline comments:

  • It would definitely be helpful to save our log/traceback/debug files into app-specific locations, similar to .config/; there's no issue for that, but it's likely been discussed in the channel
  • The Remote Interactive debugging docs would be good to improve, as well as adding brief notes on the profiling approach
  • Refactoring any unauthenticated server interaction out of the current app into a module, and extending it for server interrogation and connection troubleshooting would definitely be a useful adjustment (perhaps in another PR) - which could then be an independent tool like you've proposed
  • I'm unsure why we'd want to test symbols that we're not using. However, including those we do use (as per the symbol test tool) in a dev part of the About popup could be useful, as well as various environment variables (eg. TERM, etc, but also others we use for settings), as well as having the latter output on the terminal before starting, if in debug mode
  • I'm open to persuasion, but unconvinced that the makefile changes are worthwhile, or wrapping the dev commands in a debug helper
  • Similarly, while analysis of log files could be helpful to improve the developer experience, we'd benefit from implementing the first bullet beforehand; even then, there's a lot we might want to improve in the app itself instead

I've briefly experimented with using mprof to understand our memory usage, so we We could also add brief notes on that. That's less important, but is in the same area. I mentioned about this in the channel at one point (see a search for memory).

Comment on lines +828 to +830
#### Profiling for performance issues

If you're experiencing performance problems, you can run Zulip Terminal with profiling enabled:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm, adding this is definitely an improvement - split this into a separate commit and with a few changes it's something to add independently of anything else in this PR.

Thoughts on the content:

  • zulip-term indicates the location of the prof file and a way to run snakeviz, after it exits; mention that?
  • As mentioned at debugging functionality added #1569 (comment), the name of the 'prof' file is unlikely to be specifically as in the quoted bash text?
  • Possibly note that for running zulip-term, that you run it as you normally would, but with --profile added; I'd choose that type of wording since the current command line might suggest you run it just as it is, and if someone is using -c path/to/zuliprc then it will behave differently or fail

Comment on lines +802 to +826
#### Remote debugging example

Here's a complete example of how to debug a specific issue:

1. Let's say you suspect an issue when sending messages. Find the function in the codebase that handles this (e.g., in `zulipterminal/ui/ui.py`).

2. Add the debugger statement just before the suspicious code:
```python
def send_message(self):
from pudb.remote import set_trace
set_trace() # This will pause execution here
# Rest of the function...
```

3. Run Zulip Terminal with debug mode enabled:
```bash
zulip-term -d
```

4. When you trigger the send_message function, check debug.log for telnet connection details:
```bash
tail -f debug.log
```

5. Connect with telnet and you'll get an interactive debugger to step through the code.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't really a big difference in content between your example and the earlier section, and in the interest of avoiding duplication I'd rather that we improve the existing section with more specific text instead?

  • I like the numbered list (if we end up having enough steps - specifically re debug mode!)
  • It's unclear to me that we need to pick a specific existing file or function, since names might change, but including it in a function/location looks better
  • I do like the extra comments in your version
  • It may be worth having the breakpoint further down, to indicate it need not be at the top
  • I know it's in the original documentation, but do we need debug mode for pudb? I suspect these are independent settings, unless one is using print also?
  • The section title can likely be improved by combining the existing and your draft; pudb and telnet are less important in the title

From testing just now:

  • the telnet details output to the main terminal if I breakpoint before the TUI
  • adding a breakpoint elsewhere can cause the app to seemingly lock up?
  • debug mode doesn't appear to affect this ^

Have you tried remote debugging?

It would be useful to have a link to the pudb remote debugging docs - and that may help with understanding how to handle remote debugging more generally, and give a few extra tips on using it effectively for devs.

Please break this change out into a separate commit, or one for base improvements and then another for possible improvements that need more discussion - we can always squash commits together later, or merge one first and continue discussing the other.

Comment on lines +37 to +39
debug-clean:
@echo "=== Cleaning debug files ==="
rm -f debug.log zulip-terminal.prof zulip-terminal-tracebacks.log
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I was referring to the last sentence of my first comment in this thread. When we have separate sets of files for each session by date/organization, it would keep generated files organized and out of the current directory, as well as make it easier to clean them up in some defined way :)

Comment on lines +97 to +98
# Added debug helper with proper path
"zulip-term-debug = zulipterminal.scripts.debug_helper:main",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is installed for all users, not just devs, then we don't need to update helper_deps with requests; it will already be installed for the main app, indirectly via zulip as it stands, though we could potentially add requests as a direct dependency.

There's no problem with having small extra tools to help users debug issues, though 'debug' is a very broad name, and splitting the tool(s) out may be good.

content = f.read()

# Look for error patterns
error_patterns = [r"ERROR", r"Exception", r"Traceback", r"Failed to"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would likely be simpler to apply when we have multiple log files (as mentioned in another comment) since it would likely make it easier to eg. deduplicate errors, or find patterns, between each session.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PR awaiting update PR has been reviewed & is awaiting update or response to reviewer feedback size: XL [Automatic label added by zulipbot]

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants