cumsum() performance on Arc Pro B70 is lower than on Arc Pro B60 #2800

@caijimin

Description

With the code below, the output on the Intel Arc Pro B60 is (times in seconds):
memalloc_time=18.989502819720656, calculation_time=25.28791649499908
On the B70:
memalloc_time=18.39343382231891, calculation_time=40.41595908906311
The B70 should be faster than the B60. Could you help me pinpoint the problem?
Are there any tools we can use to analyze this?
Thanks.

import time

import dpnp as np

# Build the input as a Python list (67,108,864 values) and convert it to a dpnp array.
start_time = time.perf_counter()
a = np.array([0.1, 0.01, 0.001, 0.0001] * 4096 * 4096, dtype=np.float32)
end_time = time.perf_counter()
memalloc_time = end_time - start_time

# Time 10,000 cumsum() calls on the same input.
start_time = time.perf_counter()
for _ in range(10000):
    b = a.cumsum()
end_time = time.perf_counter()
calculation_time = end_time - start_time

print(
    f"memalloc_time={memalloc_time}, calculation_time={calculation_time}"
)

I also tried other functions such as cumprod(), linalg.pinv(), and linalg.solve(); all of them show a similar symptom.
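One way to narrow this down is to time each call separately, with a few warm-up calls excluded and an explicit wait on the device queue. This is only a sketch and not a confirmed diagnosis. The `benchmark` helper and its parameters are hypothetical names introduced here. The explicit `sycl_queue.wait()` assumes that dpnp may return before queued device work has finished; if the call already blocks, the extra wait should be harmless. The dpnp part runs only when dpnp is installed.

```python
import statistics
import time


def benchmark(fn, sync=None, warmup=3, repeats=20):
    """Time fn() once per repeat and return summary statistics in seconds.

    sync is an optional callable that blocks until queued device work is
    finished, so that each sample covers the whole operation.
    """
    if sync is None:
        sync = lambda: None  # nothing to wait for on a purely host-side fn

    # Warm-up calls are not timed: they absorb one-time costs such as
    # kernel compilation and caching.
    for _ in range(warmup):
        fn()
    sync()

    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        sync()
        samples.append(time.perf_counter() - t0)

    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
        "n": len(samples),
    }


if __name__ == "__main__":
    try:
        import dpnp
    except ImportError:
        dpnp = None

    if dpnp is not None:
        # Same input as in the report above.
        a = dpnp.array([0.1, 0.01, 0.001, 0.0001] * 4096 * 4096,
                       dtype=dpnp.float32)
        # Assumption: waiting on the array's SYCL queue is enough to make
        # sure the cumsum() work has completed before the timer stops.
        stats = benchmark(a.cumsum, sync=a.sycl_queue.wait, repeats=100)
        print(a.sycl_device.name, stats)
```

Comparing `min` and `median` between the B60 and the B70 would show whether the gap is a steady per-call cost or comes from a few slow outliers. Printing the device name also confirms that the script is running on the intended card.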
