w3mman2html.cgi does not convert https to links #292

@cpaelzer

Hi,
while looking into a bug report about man page rendering, I noticed that w3mman2html.cgi does not convert https URLs into proper href links.

Reproducing the issue

$ cat > test << EOF
> .TH TEST "1"
> .SH "Test"
> Test http URL: <http://www.gnu.org>
> .br
> Test https URL: <https://www.gnu.org>
> EOF

$ /usr/lib/w3m/cgi-bin/w3mman2html.cgi "local=/root/test"
Content-Type: text/html

<html>
<head><title>man </title></head>
<body>
<pre>
<u>TEST</u>(1)                                                                                     General Commands Manual                                                                                     <u>TEST</u>(1)

<b>Test</b>
       Test http URL: &lt;<a href="http://www.gnu.org">http://www.gnu.org</a>&gt;
       Test https URL: &lt;https://www.gnu.org&gt;

                                                                                                                                                                                                        <u><a href="file:///usr/lib/w3m/cgi-bin/w3mman2html.cgi?TEST(1)">TEST</a></u>(1)

As you can see, the http URL was converted to a proper link, while the https URL was left unchanged.
This seems too trivial, so I may be overlooking something, but isn't it just this line:

s@(http|ftp)://[\w.\-/~]+[\w/]@<a href="$&">$&</a>@g;

In my test, this change worked well:

diff -Naur /usr/lib/w3m/cgi-bin/w3mman2html.cgi.orig /usr/lib/w3m/cgi-bin/w3mman2html.cgi.new 
--- /usr/lib/w3m/cgi-bin/w3mman2html.cgi.orig	2024-01-30 08:08:50.278360949 +0000
+++ /usr/lib/w3m/cgi-bin/w3mman2html.cgi.new	2024-01-30 08:15:19.521156596 +0000
@@ -162,7 +162,7 @@
     next;
   }
 
-  s@(http|ftp)://[\w.\-/~]+[\w/]@<a href="$&">$&</a>@g;
+  s@(https|http|ftp)://[\w.\-/~]+[\w/]@<a href="$&">$&</a>@g;
   s@\b(mailto:|)(\w[\w.\-]*\@\w[\w.\-]*\.[\w.\-]*\w)@<a href="mailto:$2">$1$2</a>@g;
   s@(\W)(\~?/[\w.][\w.\-/~]*)@$1 . &file_ref($2)@ge;
   s@(include(<\/?[bu]\>|\s)*\&lt;)([\w.\-/]+)@$1 . &include_ref($3)@ge;

I'll file this trivial change as a PR, but I can't get rid of the feeling that I'll be told why we can't make that change :-)
