-
Notifications
You must be signed in to change notification settings - Fork 8
Casting numeric to string #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
OSL ChatGPT ReviewerNOTE: This is generated by an AI program, so some comments may not make sense. src/irx/builders/llvmliteir.pyGood progress adding numeric-to-string casting and centralizing sprintf/format-string handling. A few important issues and suggested improvements: Correctness and safety
API/IR best practices
Code structure and style
Portability/behavior
Illustrative changes (sketch):
Tests to add
If you address the width/signedness handling, use snprintf, ensure proper lifetime/ownership, and stabilize the format global naming, this will be robust. |
OSL ChatGPT ReviewerNOTE: This is generated by an AI program, so some comments may not make sense. src/irx/builders/llvmliteir.pyThanks for the update—this is a useful addition. A few important correctness and robustness issues to address, plus some style and portability improvements. Major correctness and safety issues
Portability and linkage
Global format string creation
Alloca placement and optimization
Type matching for returned value
Code duplication and style
Suggested code adjustments (sketch)
Example changes for _get_or_create_format_global:
Error handling
Tests to add
Overall, the direction is good, but please fix the lifetime and safety concerns (sprintf + stack buffer) and handle width/signedness correctly for integers. Once those are addressed, the implementation will be much more robust and portable. |
760fdb7
to
e98401d
Compare
OSL ChatGPT ReviewerNOTE: This is generated by an AI program, so some comments may not make sense. src/irx/builders/llvmliteir.pyNo content returned by model. src/irx/system.pyNo content returned by model. tests/test_cast.pyNo content returned by model. |
OSL ChatGPT ReviewerNOTE: This is generated by an AI program, so some comments may not make sense. src/irx/builders/llvmliteir.pyNo content returned by model. src/irx/system.py
Suggested change:
Suggested change: message node to be printedmessage: astx.Expr tests/test_cast.pyNo content returned by model. |
@yuvimittal , thanks for working on that, I will review that later today. in the meantime, this is the output from the ai-reviewer executed locally. src/irx/builders/llvmliteir.py
src/irx/system.py
tests/test_cast.pyLGTM! |
I didn't have time to read the output from the ai reviewer ... so be careful with that .. maybe there are wrong suggestions |
These are the comments i got and i made the changes accordingly! It was a small issue and i changed the annotation of message from astx.AST to astx.Expr |
OSL ChatGPT ReviewerNOTE: This is generated by an AI program, so some comments may not make sense. src/irx/builders/llvmliteir.py
Suggested changes:
Apply in the int/float cast branches to:
src/irx/system.py
tests/test_cast.py
|
d935088
to
16bcee1
Compare
OSL ChatGPT ReviewerNOTE: This is generated by an AI program, so some comments may not make sense. src/irx/builders/llvmliteir.py
Suggested changes:
src/irx/system.py
tests/test_cast.py
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuvimittal , I am still reviewing your PR, but just sharing a few notes here for now.
init_val = ir.Constant(self._llvm.get_data_type(type_str), 0) | ||
|
||
alloca = self.create_entry_block_alloca(node.name, type_str) | ||
if type_str == "string": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this really necessary?
could we just keep the previous code?
alloca = self.create_entry_block_alloca(node.name, type_str)
why we need to make a special case for string?
src/irx/builders/llvmliteir.py
Outdated
|
||
self.result_stack.append(init_val) | ||
|
||
def _create_sprintf_decl(self) -> ir.Function: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yuvimittal , to be honest I don't have the expertise this this part, so I checked it if gpt, and checking the results, it makes sense. Please take a look into the review:
the risk isn’t in the declaration itself, but in how you use it. Declaring sprintf
is fine, but calling it on a fixed stack buffer is unsafe and the review is right about two separate issues:
- Buffer overflow with
%f
on large-magnitude floats (unbounded output). - Width mismatch when printing 64-bit integers with
"%d"
(32-bit) and/or promoting args incorrectly for C varargs.
Below is a tight, actionable plan for IRx with llvmlite.
What to change
1) Declare and use snprintf
, not sprintf
Add a declaration for snprintf
and switch all formatting call sites to it.
def _create_snprintf_decl(self) -> ir.Function:
"""Declare (or return) the external snprintf (varargs)."""
name = "snprintf"
if name in self._llvm.module.globals:
return self._llvm.module.get_global(name)
# int snprintf(char *str, size_t size, const char *fmt, ...);
size_t_ty = getattr(self._llvm, "SIZE_T_TYPE", None)
if size_t_ty is None:
# Fallback: assume LP64/LLP64 by pointer width
size_t_ty = (
self._llvm.INT64_TYPE if self._llvm.POINTER_BITS == 64
else self._llvm.INT32_TYPE
)
snprintf_ty = ir.FunctionType(
self._llvm.INT32_TYPE,
[
self._llvm.INT8_TYPE.as_pointer(), # char *str
size_t_ty, # size_t size
self._llvm.INT8_TYPE.as_pointer(), # const char *fmt
],
var_arg=True,
)
fn = ir.Function(self._llvm.module, snprintf_ty, name=name)
fn.linkage = "external"
return fn
Then, wherever you previously called sprintf(buf, fmt, …)
, do:
snprintf(buf, buf_len, fmt, …)
and check the return value (n >= buf_len
means truncation).
2) Use width-correct format specifiers and ABI-correct promotions
When building varargs calls in LLVM IR you must mirror C’s default promotions:
i8/i16
→ sign-extend or zero-extend toi32
(match signedness).float
→ fpext todouble
.i32
→ pass asi32
.i64
→ pass asi64
, and use"%lld"
/"%llu"
(portable across LP64/LLP64).- Pointers/strings: pass as
i8*
.
Map your format strings to the IR types you’re passing. For 64-bit ints do not use "%ld"
(breaks on Windows LLP64); prefer "%lld"
/"%llu"
universally.
3) Bound float output
Prefer a bounded specifier for doubles to avoid huge strings:
- General:
"%.17g"
(round-trip-safe for double, compact, bounded). - If fixed decimals are desired:
"%.6f"
(or configurable precision).
This mitigates pathological lengths even with snprintf
, and reduces truncation.
4) Consider a tiny C runtime helper (optional but cleanest)
If you need an exact-sized string (no truncation and no guesswork), add a
runtime function you link against and call from IR:
// irx_runtime.c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
char* irx_format(const char* fmt, ...) {
va_list ap;
va_start(ap, fmt);
int n = vsnprintf(NULL, 0, fmt, ap);
va_end(ap);
if (n < 0) return NULL;
char* s = (char*)malloc((size_t)n + 1);
if (!s) return NULL;
va_start(ap, fmt);
vsnprintf(s, (size_t)n + 1, fmt, ap);
va_end(ap);
return s;
}
Then expose a typed declaration in IR and avoid manual varargs at codegen call
sites. This sidesteps promotions/width issues entirely.
Minimal tests to add
- Large-magnitude doubles (e.g.,
1e308
,1e-308
) with"%.17g"
: no crash, no UB, sensible string. i64
min/max values printed with"%lld"
/"%llu"
: full value round-trips.- Truncation path: small buffer, confirm return
>=
capacity and that you handle it. - Mixed types in one call (int, double, string) verifying all promotions.
1c9828a
to
d34d4ca
Compare
OSL ChatGPT ReviewerNOTE: This is generated by an AI program, so some comments may not make sense. src/irx/builders/llvmliteir.py
src/irx/system.py
Suggested patch: tests/test_cast.pyLGTM! |
Pull Request description
Adds casting numeric type to string type.
#85How to test these changes
...
Pull Request checklists
This PR is a:
About this PR:
Author's checklist:
complexity.
Additional information
Reviewer's checklist
Copy and paste this template for your review's note: