Skip to content

Conversation

klueska
Copy link
Contributor

@klueska klueska commented Jul 17, 2025

This interface can be used in conjunction with the standard "wrapper"
interface, including the use its interface abstractions like Devices,
GPUInstances, ComputeInstances, etc.

It is invoked as nvml.CgoAPI.<api-call>, following the original function
signatures of the C-API, where there is only a single return value and
additional return values are passed as arguments.

An example usage is:
```
        // Start with using the CgoAPI
        ret := nvml.CgoAPI.Init()
        defer nvml.Shutdown()
        require.Equal(t, SUCCESS, ret, "Init should succeed")

        var count uint32
        ret = nvml.CgoAPI.DeviceGetCount(&count)
        require.Equal(t, SUCCESS, ret, "DeviceGetCount should succeed")

        var device Device
        ret = nvml.CgoAPI.DeviceGetHandleByIndex(0, &device)
        require.Equal(t, SUCCESS, ret, "DeviceGetHandleByIndex should succeed")

        // Now call Device interface methods directly
        name, ret := device.GetName()
        require.Equal(t, SUCCESS, ret, "Device.GetName should succeed")
        t.Logf("Device.GetName: %s", name)

        uuid, ret := device.GetUUID()
        require.Equal(t, SUCCESS, ret, "Device.GetUUID should succeed")
        t.Logf("Device.GetUUID: %s", uuid)

        brand, ret := device.GetBrand()
        require.Equal(t, SUCCESS, ret, "Device.GetBrand should succeed")
        t.Logf("Device.GetBrand: %v", brand)

        busType, ret := device.GetBusType()
        require.Equal(t, SUCCESS, ret, "Device.GetBusType should succeed")
        t.Logf("Device.GetBusType: %v", busType)
```

@klueska klueska force-pushed the add-direct-cgo-api branch 2 times, most recently from b9b8bd1 to cafb054 Compare July 21, 2025 10:07
var builder strings.Builder

// Function comment
builder.WriteString(fmt.Sprintf("// %sHandle attempts to convert a %s to an %s.\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be easier to read if we use go templates instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitely. Updated.

@klueska klueska force-pushed the add-direct-cgo-api branch 2 times, most recently from c20e2b1 to 9db4a5a Compare July 22, 2025 15:00
klueska added 4 commits July 22, 2025 20:14
This interface can be used in conjunction with the standard "wrapper"
interface, including the use its interface abstractions like Devices,
GPUInstances, ComputeInstances, etc.

It is invoked as nvml.CgoAPI.<api-call>, following the original function
signatures of the C-API, where there is only a single return value and
additional return values are passed as arguments.

An example usage is:
```
	// Start with using the CgoAPI
	ret := nvml.CgoAPI.Init()
	defer nvml.Shutdown()
	require.Equal(t, SUCCESS, ret, "Init should succeed")

	var count uint32
	ret = nvml.CgoAPI.DeviceGetCount(&count)
	require.Equal(t, SUCCESS, ret, "DeviceGetCount should succeed")

	var device Device
	ret = nvml.CgoAPI.DeviceGetHandleByIndex(0, &device)
	require.Equal(t, SUCCESS, ret, "DeviceGetHandleByIndex should succeed")

	// Now call Device interface methods directly
	name, ret := device.GetName()
	require.Equal(t, SUCCESS, ret, "Device.GetName should succeed")
	t.Logf("Device.GetName: %s", name)

	uuid, ret := device.GetUUID()
	require.Equal(t, SUCCESS, ret, "Device.GetUUID should succeed")
	t.Logf("Device.GetUUID: %s", uuid)

	brand, ret := device.GetBrand()
	require.Equal(t, SUCCESS, ret, "Device.GetBrand should succeed")
	t.Logf("Device.GetBrand: %v", brand)

	busType, ret := device.GetBusType()
	require.Equal(t, SUCCESS, ret, "Device.GetBusType should succeed")
	t.Logf("Device.GetBusType: %v", busType)
```

Signed-off-by: Kevin Klues <[email protected]>
@klueska klueska force-pushed the add-direct-cgo-api branch from 9db4a5a to 5913ac9 Compare July 22, 2025 20:15
@klueska
Copy link
Contributor Author

klueska commented Aug 6, 2025

@elezar what are your thoughts on the name of "CgoAPI"? I'm not sold on it, but couldn't come up with anything else.

Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My one question is whether we want to use the cgoapi internally to provide more idiomatic go wrappers? Maybe as part of something like #125?

(This would obviously be a follow-up).

var deviceGetGraphicsRunningProcesses = deviceGetGraphicsRunningProcesses_v1
var nvmlDeviceGetMPSComputeRunningProcesses = nvmlDeviceGetMPSComputeRunningProcesses_v3
var deviceGetMPSComputeRunningProcesses = deviceGetMPSComputeRunningProcesses_v1
var nvmlGetBlacklistDeviceCount = nvmlGetExcludedDeviceCount
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this in #156 too. These are not versioned and are simple aliases. Should we move them out of this block?

}

// Default all versioned APIs to v1 (to infer the types)
// BEGIN_UNVERSIONED_FUNCTIONS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is common for machine readable comments not have a leading spaces. What about:

Suggested change
// BEGIN_UNVERSIONED_FUNCTIONS
//generatecgoapi: BEGIN_UNVERSIONED_FUNCTIONS

(and below)

ret = CgoAPI.DeviceGetCount(&count)
require.Equal(t, SUCCESS, ret, "DeviceGetCount should succeed")

// Test getting a device handle by index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intent to try to get an invalid device? Or do we want to rather do something like:

if count == 0 {
   t.Skip("No devices available for testing, skipping device-specific tests")
}

so that we don't require redundant nesting?

}

// TestCgoAPIDeviceMethods exercises various device-related CgoAPI methods.
func TestCgoAPIDeviceMethods(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not quite clear to me how this is different from the functionality tested in the previous test case -- with the exception of non-overlapping function calls.

var nvmlEventSetWait = nvmlEventSetWait_v1
var nvmlDeviceGetAttributes = nvmlDeviceGetAttributes_v1
var nvmlComputeInstanceGetInfo = nvmlComputeInstanceGetInfo_v1
var nvmlDeviceGetComputeRunningProcesses = nvmlDeviceGetComputeRunningProcesses_v3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we defaulting to the _v3 versions here?

@elezar
Copy link
Member

elezar commented Aug 21, 2025

@elezar what are your thoughts on the name of "CgoAPI"? I'm not sold on it, but couldn't come up with anything else.

Does the name "leak" implementation details in some way? Strictly speaking these are more "accurate" wrappers to the NVML functions themselves since there are fewer modifications to make them more go-like. How much effort would it be to make these the top-level functions and come up with a name for the existing functions. Especially considering proposed changes such as #120

@klueska klueska marked this pull request as draft August 22, 2025 09:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants