Skip to content

Conversation

rdan
Copy link

@rdan rdan commented Sep 21, 2025

I implemented this feature differently from the code you proposed.
The main goal was to be able to retrieve for any "context.write()" in a code block the indentation of "<%". To achieve that, I added a new action in "codegen" which is to get the current position of the last element in the buffer "data" in the current "FastEncodingBuffer". I think that this operation is simple enough to not have an impact on the global performance of the tool even if this new feature is not used.

I tried to minimize the modifications of "FastEncodingBuffer" so I just added a method to get the length of data and I added a parameter of "getvalue()" to be able to use this position while concatenating the strings.
Due to this choice, each time "context.write()" is called with the new parameter equal to True, this concatenation on almost the full content of data will happen twice.
I did some performance tests on different strategies to search for the last "\n" in data and the concatenation can be from 2 times to 14 times slower than doing a search loop on data. If you think that the gain of performance is worth implementing a new method in "FastEncodingBuffer" (I was thinking about something like "get_text_since_last_cr(self, pos=None)" which will return a string without "\n" starting from the end of "data" or from "data[pos-1]") and that this new method makes sense in "FastEncodingBuffer", then I can do an update.

Regarding the added tests, do you see any additional tests that I should implement ?

And regarding the non-regression tests, do you want me to also run the 53 skipped tests ? How can I run them ? What do I need to install ?

Also, if I made any mistakes in the process of development on github, please let me know.

Modifications:

util.py::FastEncodingBuffer

  • deque replaced by list (no specific deque feature used, list faster than deque, deque does not allow slicing which is used by the new feature)
  • Creation of method "getpos()" to get the current location in "data"
  • Update of method "getvalue()" to get either the full content of "data" or only the content of data up to the position retrieved previously by "getpos()"

codegen.py::_GenerateRenderMethod::visitCode()

  • Mark the current location in the current FastEncodingBuffer

runtime.py::Context

  • Creation of the method "code_block_entry_mark()" that marks the current location in the current FastEncodingBuffer
  • Update of the method "write()"
    • New parameter "indent_relative_to_code_block_entry" allowing to activate the new feature
    • Implementation of the new feature

New tests added in test_util.py & test_runtime.py to cover the updates described above

466 tests passed, 53 skipped (linked to babel, lingua, Beaker, dogpile)

…()" relative to the indentation of the entry of a python code block ("<%")

util.py::FastEncodingBuffer
- deque replaced by list (no specific deque feature used, list faster than deque, deque does not allow slicing which is used in the new feature)
- Creation of method "getpos()" to get the current location in "data"
- Update of method "getvalue()" to get either the full content of "data" or only the content of data up to the position retrieved previously by "getpos()"

codegen.py::_GenerateRenderMethod::visitCode()
- Mark the current location in the current FastEncodingBuffer

runtime.py::Context
- Creation of the method "code_block_entry_mark()" that marks the current location in the current FastEncodingBuffer
- Update of the method "write()"
  - New parameter "indent_relative_to_code_block_entry" allowing to activate the new feature
  - Implementation of the new feature

New tests added in test_util.py & test_runtime.py to cover the updates described above

466 tests passed, 53 skipped (linked to babel, lingua, Beaker, dogpile)
Copy link
Member

@zzzeek zzzeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi and thanks for this.

I appreciate the effort into this alternative implementation, but I want to provide context for this change. As I indicated previously, this is a feature that it's very likely to not see much use. Realistically, this is very much the kind of feature where I think you will get some use out of it for as long as it suits your immediate use case, and probably nobody else as this is very obscure stuff and Mako itself is not really at the center of new development (however it is very much embedded in lots of legacy work. So its runtime is used a lot. New features just added, hardly at all).

for this reason, I have a very strong preference towards not adding new things to my libraries that don't have a significant audience. The way I approach these things are:

  1. see if the user can add their desired feature using extension points of the library. that way the library doesn't have to change. For lots of examples of this, see the Alembic cookbook at https://alembic.sqlalchemy.org/en/latest/cookbook.html . Every entry you see here came in as a user requesting a new feature to Alembic, and my solution was to make sure Alembic had adequate extension points so that the user could maintain their own version of the particular functionality. In this case, Mako doesn't have a great extension point for what you're doing.
  2. See if the desired feature can be added in such a way that is only adds new code, not changing anything existing at all, that is not invoked at all for the millions of applications that will never use this feature. The new feature and whatever issues it may have are completely inert unless someone tries to use the new feature. That's what we're going for here, but if it's not possible, I'd rather go back to option 1 and add some kind of way for you to plug in your own context or something like that.
  3. Change how the library works in order to suit the desired feature. We only do this for features that have obvious utility to the majority of users and where the existing architecture has clear space for improvement; for Mako, we have not added any such features in over ten years; we have had a few bug fixes that required library changes, they caused regressions and we had to re-release several times to get things right.

all of this is to say when you approached me with something you had working already and I sought to provide a more succinct way of opening up the API for what you needed, I was very much targeting item 2. The original proposal, to open up some of the elements inside of context._buffer to be public, suffers from the issue of exposing too many fine grained internals.

The flag approach I offered is still not that compelling to me, it's nicer than peeking inside of the context, but i dont think it's worth changing the runtime to an extensive degree like this. it would have to be self-contained and not impacting anything else.


def __init__(self, encoding=None, errors="strict"):
self.data = collections.deque()
self.data = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has to remain a deque(). a deque is much more efficient for what we're doing here

mako/runtime.py Outdated
if (len(string) == 0):
return

print(f"write({string}, {indent_relative_to_code_block_entry}, {self._code_block_entry_mark})")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is all very complicated and I dont know why need to do all this. if it's just for a marginal speedup, then it's not worth it. if you can show me test cases that dont pass with the simple and non-intrusive approach I had, i can see if i can get them to work.

"""Mark the current location in the buffer."""
self._code_block_entry_mark = self._buffer_stack[-1].getpos()

def write(self, string, indent_relative_to_code_block_entry=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this new argument needs to be keyword only:

 def write(self, string, *, new_keyword=False):

I think the name indent_relative_to_code_block_entry is too long of a name. needs to be a short name.

@rdan
Copy link
Author

rdan commented Sep 21, 2025

Thank you for the review.
I understand your answer.
I just tried the code you provided me with an updated version of "test_runtime.py" (test_runtime.py) including the 3 use cases I provided in the Discussion (these are the last 3 tests of "test_runtime.py").
On the 6 tests, test 2 is OK but only this is just a side effect, tests 1 & 3 cannot work because I added a new use case to this feature (each "\n" except if it is the last character in the string written using "context.write(string, indent_current=True)" is replaced by "\n" + indent, like this, everything written by "context.write()" is indented relatively to "<%").
And the tests 4 to 6 are failing also because each "context.write()", I think, is not relying on the indentation of "<%" but on the indentation of the text written by the previous "context.write()" in the code block (and, on "<%" for the first "context.write()" in the code block). And this is not what I need.
I am not sure if a non intrusive solution exists to implement this feature.

Copy link
Member

@zzzeek zzzeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these new test cases seem to introduce a lot of new scope to the original proposal.

I'm still not getting a clear sense of what the purpose of any of this is. Do you really need to use context.write() inside of a block, that is 1. preceded by variable length content (and not just whitespace, like the original example) and 2. sometimes writes with indent, and sometimes not, and figures it all out like that? Even Python builtins like textwrap.dedent() don't get into things like that, usually "indent text" is just that, "indent all the text".

Isn't the original use case simply that you want to have text that's output inside of <% %> to line up with the text outside? without all the complexity of switching on and off like that.

Indeed even using context.write() like this inside of a template is not what that method was really meant for. the original template that we looked at would be written like this usually:

        ENUM
        [
         % for Group in ListOfGroups:
          % if ("Parameter" in Group):
            ${Group['Name']}: ${Group['Parameter']}
          % endif
        % endfor
        ] GROUP;

however, if I for whatever reason wanted to output that content programmatically with indent, the original proposal looked very much like we just wanted a way to know how much the <% block is indented, and that's it.

Can we say that your real use case involves just having all the content in a <% %> block be indented with the outside? Here's a way to embed your original GetIndent approach as a namespace you can make available in templates. it certainly would be easier to add a function called context.get_current_indent() and to suit your original case:

from mako.template import Template
import textwrap
from mako.runtime import supports_caller

t = Template(r"""

## this could be any module where you set up functions
<%namespace name="indenter" module="__main__"/>


Hello ${what} <%indenter:indent_here><%
    context.write("text 1\ntext 2\n")
    context.write("text 3\n")
%></%indenter:indent_here>

    some other text

    <%indenter:indent_here>

        just put text without using context.write().
        this text would be indented four spaces
        on multiple lines

</%indenter:indent_here>


""")


@supports_caller
def indent_here(context):
    capture = context.get('capture')
    caller = context.get('caller')

    indent = " " * len(context._buffer_stack[-1].getvalue().split("\n")[-1])
    text = capture(caller.body)
    text = textwrap.dedent(text).strip()
    context.write(textwrap.indent(text, indent).lstrip())
    return ""


print(t.render(what="day"))

@zzzeek
Copy link
Member

zzzeek commented Sep 22, 2025

also I forgot to mention the main thing I dont want to do is add another function call to all the python blocks, so the change in codegen that adds two function calls (the outer function and the len()) is the biggest problem. The custom block approach I have above has the advantage that we can grab that preceding indent right as our block begins and then we have it within the block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants