Expand discussion of LLM use in contributions #15923
jakelishman wants to merge 3 commits into Qiskit:main
We have some more experience of external and internal use of AI/LLM tooling on Qiskit since the last time we updated this section of the contributing guide. This is a revised version of the guidance, which attempts to more clearly direct use of the tools, based on patterns of use we have observed over the last few months.
The policy espoused here should not be substantially different from the policy before this commit, other than a relaxation of the largely unenforceable rules about only using tools that have code-filtering capabilities. However, this new form should be clearer about expectations on contributors, and give more guidance about when AI-tool use is not appropriate.
I also included a couple of lines that clarify that maintainers are not obliged to accept PRs, even ones that the contributor feels follow the letter of the guidance.
| > [!WARNING] | ||
| > If you use any AI tool while preparing your code contribution, you **must** disclose the name of the tool and its version in the PR description. | ||
| > [!NOTE] | ||
| > By "AI tools", we mean generative langauge tools like large language models (LLMs). |
| > By "AI tools", we mean generative langauge tools like large language models (LLMs). | |
| > By "AI tools", we mean generative language tools like large language models (LLMs). |
|
|
||
| *You are responsible for any code you submit to Qiskit, no matter how it was generated.* | ||
|
|
||
| We discourage the use of LLM code generation, but do not forbid it in high-quality pull requests. |
I'm not convinced we should categorically discourage it like this; I think the above statement about responsibility is clear enough. I don't think we can control people using it; it's a tool in the toolbox by now. If you want to keep it, we could maybe rephrase it in a constructive manner, like:
| We discourage the use of LLM code generation, but do not forbid it in high-quality pull requests. | |
| In particular, LLM contributions must uphold the coding standard of Qiskit and contributors are responsible to provide high-quality pull requests. |
Or something?
There was a problem hiding this comment.
I don't think the responsibility section says anything about code generation. I do want to discourage LLM code generation overall: empirically, code we receive that is LLM generated is worse on average, and it is a negative modifier on the chance of the PR getting accepted.
| (this is not an exhaustive list) are: | ||
|
|
||
| - You have reviewed and fully understand all code you submit, and explain the reasoning for it. | ||
| Asking an LLM to explain the reasoning for you is not acceptable. |
| Asking an LLM to explain the reasoning for you is not acceptable. | |
| Asking an LLM to explain the reasoning is not sufficient. |
maybe this?
There was a problem hiding this comment.
I'd rather be stronger here. It isn't acceptable to use an LLM to provide the reasoning to a question a maintainer asks you - you have to be able to explain it yourself, and you shouldn't need the LLM to explain it to you first once it's reached the review process, or you didn't follow the rules before submission.
|
|
||
| > LLM disclosure: Claude Opus 4.6 was used to generate initial prototypes, which I then modified. | ||
|
|
||
| All non-trivial generated code must be clearly marked inline with code comments that: |
Should we mention that one can also put it in the commit message if there's no clear block that was generated, but it was used throughout the code?
| #### Appropriate use of AI tools | ||
|
|
||
| We understand that using AI tooling is fun, feels productive, and can help communicate, particularly | ||
| for people whose primary language is not English. However, AI tooling is just a tool, and is not | ||
| always appropriate. | ||
|
|
||
| It is generally fine to use AI tooling privately in chat mode to ask about the existing code, | ||
| prototype solutions, and debug issues. Private use, where no LLM output becomes submitted code or | ||
| public communications, does not need disclosure. Be aware that writing code that is based on other | ||
| code (such as re-implementing LLM-output code yourself) may still be subject to licensing | ||
| restrictions from the source. | ||
|
|
||
| It is generally not ok to use AI tooling to produce code that you could not subsequently reproduce | ||
| yourself later without tooling, including if the AI tooling only provided explanations. This would | ||
| indicate that you do not sufficiently understand the code produced. |
A bit like above, but I'm not convinced we should put this here. It's unlikely to stop anyone from using LLMs, and I don't think we should give an opinion on how to use tools, but rather focus on what they should submit -- exactly like the points you're listing below. How about removing these 3 paragraphs and just keeping the bullets below as guidelines?
There was a problem hiding this comment.
I very much want to include this section. I think it's clear examples of what kind of use is totally fine for submission to Qiskit, what definitely isn't, and it's something pre-written and public to point to when responding on PRs.
I'm fine to reword it to make it clear that I mean "Use of AI tools in submissions to Qiskit" and not "for personal use" though.
I think trying to pretend that we don't actually care about how you use such a tool whose improper use has such a negative impact on maintainers is disingenuous and unhelpful to users trying to understand: we do care.
|
|
||
| Remember that Qiskit maintainers have access to AI tools too. If the majority of your involvement | ||
| was to point an AI tool at an open issue and ask it to fix it, consider that we could have done | ||
| that too and there was a reason we didn't. |
I think we can also skip this one -- or make it a bit more constructive 😂
There was a problem hiding this comment.
Imo this is constructive: I think people don't actually consider this, or think that "point an LLM at an issue" is actually helping.
| For example, you might write: | ||
|
|
||
| ```python | ||
| class SomeTranspilerPass: |
Claude Opus 4.6 forgot to derive the base class 😛
| class SomeTranspilerPass: | |
| class SomeTranspilerPass(TransformationPass): |
There was a problem hiding this comment.
I mean, if you're going to nitpick this kind of thing then it also needs the imports to work, a docstring to pass lint, and the code body needs to have a statement and not just comments so it's syntactically valid.
I wouldn't do those, though - they're just noise that's not relevant to the point.
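For reference, a minimal sketch of what the snippet would need in order to be syntactically valid and lint-clean, per the list above: a working import, the base class, docstrings, and a real statement in the body. The `run` method, the fallback stand-in class (used only so the sketch runs without Qiskit installed), and the wording of the disclosure comment are all assumptions for illustration, not the text of the contributing guide.

```python
"""Sketch of a complete, lint-clean version of the example pass."""

try:
    # The real base class, used when Qiskit is installed.
    from qiskit.transpiler.basepasses import TransformationPass
except ImportError:
    # Stand-in so the sketch still runs without Qiskit (assumption, not Qiskit API).
    class TransformationPass:
        """Minimal placeholder for Qiskit's ``TransformationPass``."""


class SomeTranspilerPass(TransformationPass):
    """Illustrative pass that returns its input unchanged."""

    def run(self, dag):
        """Return ``dag`` unmodified."""
        # LLM disclosure (hypothetical wording): this body was drafted with
        # an AI tool, then reviewed and rewritten by the contributor.
        return dag
```

As the reply above notes, this extra scaffolding is noise relative to the point of the guide's example, which is only to show where an inline disclosure comment goes.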
| Qiskit maintainers are not obliged to accept any pull request, even if it follows these guidelines. | ||
| Pull requests that require excessive maintainer time may be closed, even with no proposed | ||
| alternative. |
I believe this may be a more conservative/commanding way of saying this, but just a nit
| Qiskit maintainers are not obliged to accept any pull request, even if it follows these guidelines. | |
| Pull requests that require excessive maintainer time may be closed, even with no proposed | |
| alternative. | |
| Qiskit maintainers reserve the right to reject any pull request, even if it follows these guidelines. | |
| Pull requests that require excessive maintainer time may be closed, even with no proposed | |
| alternative. |
| - You have reviewed and fully understand all code you submit, and explain the reasoning for it. | ||
| Asking an LLM to explain the reasoning for you is not acceptable. |
Nit: It's mostly about using stronger language here, but also okay not to include.
| - You have reviewed and fully understand all code you submit, and explain the reasoning for it. | |
| Asking an LLM to explain the reasoning for you is not acceptable. | |
| - You have fully reviewed and have the capability to explain the reasoning for all code you submit to Qiskit. | |
| Using an LLM to generate an explanation for you is unacceptable. |