enum: add support hash name for unnamed enum #3258

Open
wants to merge 2 commits into
base: main

Conversation

qinghon
Contributor

@qinghon qinghon commented Aug 3, 2025

Occasionally, conflicts between header files necessitate running Bindgen separately on each and then merging the resulting bindings. The bindgen_ty_{number} identifiers that bindgen currently generates for unnamed enums then cause naming conflicts when the outputs are merged.

This PR introduces support for hash-based naming for unnamed enums.

@qinghon
Contributor Author

qinghon commented Aug 3, 2025

Example:

enum {
        A = 1,
        B = 2
};
enum {
        C = -3,
        D = 4,
};

/* automatically generated by rust-bindgen 0.72.0 */

pub const A: bindgen_enum_98a4e4a38f13856d = 1;
pub const B: bindgen_enum_98a4e4a38f13856d = 2;
pub type bindgen_enum_98a4e4a38f13856d = ::std::os::raw::c_uint;
pub const C: bindgen_enum_bb1a5d36cda6e84c = -3;
pub const D: bindgen_enum_bb1a5d36cda6e84c = 4;
pub type bindgen_enum_bb1a5d36cda6e84c = ::std::os::raw::c_int;
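
The thread does not say what the hash is computed from, so the following is only a minimal sketch of the general idea, assuming the name is derived from the enum's variant names and values. The `UnnamedEnum` struct, the `hashed_enum_name` function, and the choice of FNV-1a are all hypothetical and are not taken from the PR or from bindgen's internals; the sketch will not reproduce the hashes shown above.

```rust
/// One unnamed enum as a binding generator might see it: variant names and values.
/// Hypothetical type for illustration, not bindgen's internal representation.
struct UnnamedEnum<'a> {
    variants: &'a [(&'a str, i64)],
}

/// 64-bit FNV-1a. Chosen here only because it is simple and its output does
/// not depend on the Rust version or on per-process state.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Builds a name that depends only on the enum's content (assumption: variant
/// names and values), not on the order in which enums appear in a header.
fn hashed_enum_name(e: &UnnamedEnum) -> String {
    let mut key = String::new();
    for (name, value) in e.variants {
        key.push_str(name);
        key.push('=');
        key.push_str(&value.to_string());
        key.push(';');
    }
    format!("bindgen_enum_{:016x}", fnv1a_64(key.as_bytes()))
}

fn main() {
    let first = UnnamedEnum { variants: &[("A", 1), ("B", 2)] };
    let second = UnnamedEnum { variants: &[("C", -3), ("D", 4)] };
    println!("{}", hashed_enum_name(&first));
    println!("{}", hashed_enum_name(&second));
}
```

The property that matters for merging is that the same enum gets the same name in every bindgen run, while different enums get different names, regardless of how many unnamed items precede them.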

@ojeda
Contributor

ojeda commented Aug 3, 2025

Occasionally, conflicts between header files necessitate running Bindgen separately on each and then merging the resulting bindings.

Could you please expand on the use case?

@qinghon
Contributor Author

qinghon commented Aug 3, 2025

@ojeda Thanks for the reminder.

@ojeda
Contributor

ojeda commented Aug 3, 2025

I wasn't looking at the changes (but, of course, I imagine adding tests and documentation is important for bindgen) -- what I meant by my question above is that it isn't very clear what the use case is.

In other words, why do you need this? e.g. is this about including different generated bindgen files into a single Rust module or similar? Or something else? i.e. what is the problem being solved?

Thanks!

@qinghon
Contributor Author

qinghon commented Aug 4, 2025

@ojeda I'm working on a library (dlibc) to generate bindings for all system libc header files into a single crate.

A major problem I've encountered is:

  1. If all are included in a single wrapper.h file, symbol conflicts between header files occur, leading Clang to complain and making it impossible to proceed.
  2. If bindgen is run separately for each header, the bindgen_ty_{number} identifiers generated by bindgen for unnamed enums cause conflicts during merging.

This problem also extends to common use cases: similar issues arise when running bindgen on two header files that have slight overlaps (if their generated bindings need to be merged).
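
To make problem 2 concrete, here is a small sketch that scans two generated binding texts for `pub type` names that appear in both. The input strings are hypothetical stand-ins for two separate bindgen runs, written with the bindgen_ty_{number} notation used in this thread, and the helper names `declared_type_names` and `colliding_names` are made up for illustration.

```rust
use std::collections::BTreeSet;

/// Collects names declared as `pub type NAME = ...;` in generated bindings text.
/// This is a deliberately naive line scanner, not a Rust parser.
fn declared_type_names(bindings: &str) -> BTreeSet<String> {
    bindings
        .lines()
        .filter_map(|line| line.trim().strip_prefix("pub type "))
        .filter_map(|rest| rest.split_whitespace().next())
        .map(|name| name.to_string())
        .collect()
}

/// Returns the type names declared in both inputs, in sorted order.
fn colliding_names(a: &str, b: &str) -> Vec<String> {
    let left = declared_type_names(a);
    let right = declared_type_names(b);
    left.intersection(&right).cloned().collect()
}

fn main() {
    // Hypothetical outputs of two separate runs: each run starts counting
    // unnamed enums from 1, so unrelated enums end up with the same name.
    let first = "pub const A: bindgen_ty_1 = 1;\n\
                 pub type bindgen_ty_1 = ::std::os::raw::c_uint;\n";
    let second = "pub const C: bindgen_ty_1 = -3;\n\
                  pub type bindgen_ty_1 = ::std::os::raw::c_int;\n";
    println!("{:?}", colliding_names(first, second));
}
```

With content-derived names, two different enums would no longer share a name, so the intersection for this pair would be empty.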

Edit: anonymous unions and structs would also need hash-based names, but I am not very familiar with bindgen and have not run into that case yet, so this PR only covers enums.

@ojeda
Contributor

ojeda commented Aug 4, 2025

If all are included in a single wrapper.h file, symbol conflicts between header files occur, leading Clang to complain and making it impossible to proceed.

Do you mean those C headers cannot be included all together? That sounds a bit strange (it can happen, of course, but normally one would design public headers to avoid that) -- do you have an example? Thanks!

@qinghon
Contributor Author

qinghon commented Aug 4, 2025

Just a simple case:

#include <time.h>
#include <linux/time.h>

@ojeda
Contributor

ojeda commented Aug 4, 2025

Thanks -- so it is indeed about the case where the C compiler would complain, i.e. this is before bindgen, so I would say that is not a use case for the feature here, which leaves us with number 2.

For number 2, I assume you are including the output of bindgen in a single Rust module? If so, is there a particular reason why you cannot put it in different ones, or is it simply that it is convenient to have it in a single namespace? An example for that use case would also be great.

@qinghon
Contributor Author

qinghon commented Aug 5, 2025

@ojeda

When bindgen is run separately on each header and the generated bindings are used directly, the following issues may arise:

1. A large number of duplicate symbols (this isn’t too serious, just a matter of time).
2. The same struct defined in different files actually becomes a different type.

The second issue is particularly severe, as it leads to a large amount of conversion code, making it essentially unusable for humans.

To solve this problem, the shared parts need to be extracted into a separate module and imported. This requires that item names remain stable when running bindgen on different header file entry points.

PS: In fact, the best approach is to keep the structure consistent with the C header files. Therefore, keeping enum names stable in this PR is only part of the series; we also need to split the generated bindings into paths corresponding to the C files when running bindgen. I’ll submit a separate PR for that.

// Example of issue 2: the same C struct becomes two distinct Rust types.
// bindgen /usr/include/stdio.h > stdio.rs
mod stdio {
    include!("stdio.rs");
}
// bindgen /usr/include/wchar.h > wchar.rs
mod wchar {
    include!("wchar.rs");
}

fn main() {
    let fname = std::ffi::CString::new("tmp.txt").unwrap();
    let mode = std::ffi::CString::new("w").unwrap();
    let f: *mut stdio::FILE = unsafe {
        stdio::fopen(fname.as_ptr(), mode.as_ptr())
    };

    // Wide string passed to fputws, so it ends with a NUL terminator.
    let ws: [wchar::wchar_t; 6] = [0x4E2D, 0x6587, 0x0020, 0x0066, 0x0069, 0];

    let n = unsafe {
        wchar::fputws(ws.as_ptr(), f)   // ❌
//      -------------              ^ expected `wchar::_IO_FILE`, found `stdio::_IO_FILE`
//      arguments to this function are incorrect
    };
}
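
For contrast, here is a minimal sketch of the extraction described above, with the shared type defined once and re-exported by both modules. Everything in it is a simplified stand-in: the `shared` module, the opaque `_IO_FILE` definition, and the mock functions `null_file` and `accepts_file` are hypothetical and are not bindgen output or real libc bindings.

```rust
// All items below are simplified stand-ins, not real bindgen output.
mod shared {
    /// Stand-in for the C `struct _IO_FILE`, defined in exactly one place.
    #[allow(dead_code, non_camel_case_types)]
    #[repr(C)]
    pub struct _IO_FILE {
        _private: [u8; 0],
    }
    pub type FILE = _IO_FILE;
}

mod stdio {
    // Re-export instead of redefining, so `stdio::FILE` is `shared::FILE`.
    pub use super::shared::FILE;

    /// Mock of a function that hands out a `FILE` pointer.
    pub fn null_file() -> *mut FILE {
        std::ptr::null_mut()
    }
}

mod wchar {
    // Same re-export, so `wchar::FILE` is the very same type.
    pub use super::shared::FILE;

    /// Mock of a function that consumes a `FILE` pointer.
    pub fn accepts_file(f: *mut FILE) -> bool {
        f.is_null()
    }
}

fn main() {
    let f: *mut stdio::FILE = stdio::null_file();
    // This compiles because both paths name one and the same type.
    println!("{}", wchar::accepts_file(f));
}
```

Doing this automatically requires that an item keeps the same name no matter which header bindgen starts from, which is the stability requirement the PR is aiming at for unnamed enums.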
