Skip to content

ssd1306: avoid unnecessary heap allocations #767

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 13, 2025

Conversation

ysoldak
Copy link
Contributor

@ysoldak ysoldak commented Jun 19, 2025

This re-uses internal buffer to avoid unnecessary heap allocations in ssd1306 driver
Switches away from the legacy I2C interface.

See similar: #766

Draft since not tested in all configurations (I2C and SPI) Tested with both I2C and SPI now.

No unnecessary allocations and better FPS, w/o drops on GC, see results below.


Tested on XIAO BLE (nRF52840 chip) and 128x32 display connected via I2C at 400KHz.

Before change:

$ tinygo flash -target xiao-ble -monitor ./examples/ssd1306/i2c_128x32
$ tinygo monitor -target xiao-ble
...
TS: 4024 FPS: 75 HeapInuse: 191520
TS: 5029 FPS: 74 HeapInuse: 14992
TS: 6030 FPS: 75 HeapInuse: 61792
TS: 7042 FPS: 76 HeapInuse: 109216
TS: 8042 FPS: 75 HeapInuse: 156016
...

After change:

$ tinygo flash -target xiao-ble -monitor ./examples/ssd1306/i2c_128x32
$ tinygo monitor -target xiao-ble
...
TS: 2034 FPS: 77 HeapInuse: 2672
TS: 3045 FPS: 77 HeapInuse: 2672
TS: 4058 FPS: 77 HeapInuse: 2672
...

@soypat
Copy link
Contributor

soypat commented Jun 20, 2025

It is seems to me Buser should have a command() method to avoid the big buffer juggling since commands always write a single byte.

type Buser interface {
    // ... other methods
    command(cmd byte) error
}
func (b *I2CBus) command(cmd byte) error {
	b.cmdbuf[0] = 0x00 // cmdbuf is only length 2 on I2CBus since SPI does some DC pin stuff. in place of 0x40/0x00 byte
	b.cmdbuf[1]= cmd 
	return b.wire.Tx(b.Address,b.cmdbuf[:],nil)
}

This will likely make it much more readable and easier to understand

@ysoldak ysoldak force-pushed the ssd1306-no-legacy branch from 1fac33f to fd49267 Compare June 23, 2025 16:55
@deadprogram deadprogram changed the base branch from release to dev June 23, 2025 17:16
@deadprogram
Copy link
Member

Tested using my Thumby using SPI display.

Before:

$ tinygo flash -target thumby -monitor ./examples/ssd1306/spi_thumby/
Connected to /dev/ttyACM0. Press Ctrl-C to exit.
TS 1019 | FPS 984 | HeapInuse 103568
TS 2019 | FPS 1103 | HeapInuse 209456
TS 3019 | FPS 1093 | HeapInuse 70240
TS 4019 | FPS 1102 | HeapInuse 176032
TS 5019 | FPS 1093 | HeapInuse 36816

After:

$ tinygo flash -target thumby -monitor ./examples/ssd1306/spi_thumby/
Connected to /dev/ttyACM0. Press Ctrl-C to exit.                                                          
TS 1019 | FPS 1004 | HeapInuse 8656                                                                                                                                                                                 
TS 2019 | FPS 1123 | HeapInuse 8656             
TS 3019 | FPS 1124 | HeapInuse 8656 
TS 4019 | FPS 1124 | HeapInuse 8656  
TS 5019 | FPS 1123 | HeapInuse 8656 
TS 6019 | FPS 1123 | HeapInuse 8656  
TS 7019 | FPS 1123 | HeapInuse 8656 
TS 8019 | FPS 1122 | HeapInuse 8656                                                                       
TS 9019 | FPS 1123 | HeapInuse 8656
TS 10019 | FPS 1122 | HeapInuse 8656
TS 11019 | FPS 1123 | HeapInuse 8656
TS 12019 | FPS 1122 | HeapInuse 8656
TS 13019 | FPS 1123 | HeapInuse 8656
TS 14019 | FPS 1122 | HeapInuse 8656
```

@deadprogram
Copy link
Member

Also I switched the target branch to dev @ysoldak

@ysoldak ysoldak marked this pull request as ready for review June 23, 2025 17:27
@ysoldak
Copy link
Contributor Author

ysoldak commented Jun 23, 2025

Thank you, @deadprogram for testing!
@soypat thanks for the suggestion, it sent me down the rabbit hole of complete driver refactoring 😅 PTAL

@sago35
Copy link
Member

sago35 commented Jun 24, 2025

Tested using my RP2040-Zero using I2C display (128x64)
LGTM.

TS 1041 | FPS 39 | HeapInuse 3184
TS 2053 | FPS 40 | HeapInuse 3184
TS 3064 | FPS 40 | HeapInuse 3184
...
TS 289293 | FPS 40 | HeapInuse 3184
TS 290304 | FPS 40 | HeapInuse 3184
TS 291316 | FPS 40 | HeapInuse 3184
TS 292327 | FPS 40 | HeapInuse 3184

@ysoldak ysoldak force-pushed the ssd1306-no-legacy branch from 0cbe343 to 923f36d Compare June 24, 2025 23:56
@ysoldak
Copy link
Contributor Author

ysoldak commented Jun 25, 2025

The lost in refactoring exported methods Command() and Tx() are now back. @soypat

@@ -0,0 +1,52 @@
package common
Copy link
Contributor

@soypat soypat Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple things:

  • common is not a good package name (yes, I may be guilty of doing the same in CYW43439, but I purposefully made it unimportable!)
  • You've made a brand new common package just ot abstract away a very vital part of the example code, which was serving the purpose of an example, by demonstrating how the API can be used. I feel like 50 lines of code does not justify a new package, especially if it is not reusable code, this is just the example code copy-pasted over...

I do understand it is very uncomfy to write multiple examples with the same logic- but I'm pretty sure we can leverage build tags to have the shared logic in the same file and various surrounding files with the specific MCU definitions and probably a instantiateSSD1306 function that returns a initialized SSD1306 device ready to use.

Please revert this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no build tags to use, these displays can be attached to any MCU.

First, I can have a global var and a README file with description how to use -ldflags.
Second, alternatively, invent some build tags like "ssd1306_i2c_128x32" and describe use in README.
Both alternatives are hardly more friendly to newcomers though...

Third alternative is to duplicate the code <- suboptimal for maintenance, but doable.

Forth alternative is to revert changes in examples altogether loosing HeapInUse printout <- sort of defies purpose, as it will be harder to verify the PR does eliminate heap allocations.

Choose your poison, @soypat

Copy link
Contributor

@soypat soypat Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysoldak #773 here's what I meant. Let me know what you think. No global variables needed nor compiler flags, no duplicate code needed, no new packages needed, less folders needed, now the workings of adress are explicit, file names is clearer than 128x32 and 128x64

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so we single out "thumby" and leave the screen size editable by users in other cases.
Probably, good to have an example for non-thumby SPI, but that's details.
Thanks for taking time and making demo, let me adjust my code now.

@ysoldak ysoldak force-pushed the ssd1306-no-legacy branch from 8441db8 to 02f4996 Compare July 4, 2025 09:41
@ysoldak
Copy link
Contributor Author

ysoldak commented Jul 4, 2025

@soypat I've adjusted examples, PTAL.

Since we have thumby already and we do single it out by a build tag, why not add some other concrete examples (instead of generic ones), so I've added:

  • xiao-ble board that drives 128x32 display over I2C (77 FPS @ 400KHz)
  • xiao-rp2040 board that drives 128x64 display over SPI (4900 FPS @ 50Mhz)

Comment on lines +20 to +22
dcPin.Configure(machine.PinConfig{Mode: machine.PinOutput})
resetPin.Configure(machine.PinConfig{Mode: machine.PinOutput})
csPin.Configure(machine.PinConfig{Mode: machine.PinOutput})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to configure pins here? I think going forward we shouldn't configure pins in drivers...

Copy link
Contributor Author

@ysoldak ysoldak Jul 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like we re-define the contact in this repo so pins come initialized to the driver constructors, but we don't want to break existing users and book and tutorials? This is another discussion for another PR #749

This PR does not implement a new driver, this PR is about eliminating heap leaks in an existing driver.

Copy link
Contributor

@soypat soypat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some things that can be simplified. If we can avoid having aliased buffers in the Buser and the Device we can avoid the whole "hmm, is this call going to overwrite the part of my buffer I care about before sending?" problem. I'm particularily interested in keeping this driver as simple as possible because it will likely serve as an example of how we want to write drivers going forward as well as an example of how to design display drivers, so I find this PR to be of extreme importance in my eyes. Let me know if I'm asking for too much, I can take on the work after this is merged, but I believe it must be done, by someone.

Comment on lines +47 to +50
if isCommand {
return b.command(data[0])
}
copy(b.buffer[2:], data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very confusing. we have a command method but tx can also be used to send command. I strongly suggest we let command sending to command and other transactions to tx. This goes for both the SPI and I2C implementations. If we go this way we can avoid the buffer juggling on the Buser side, the buffer lives on the Device side and is sent over the tx method, but is not stored in the Buser (I2CBus or SPIBus). The Buser may still need a 2 or 3 byte length array to avoid heap allocations though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please document the Buser interface! The actual I2C/SPI implementations are not so important to document since they just fulfill the Buser contract. I think documenting the Buser interface will speed up this PR review since the tx and command methods are still not clear on what they do (do they overwrite some part of the buffer? currently they do, but that can very likely be avoided by designing it well)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel Buser must have been un-exported buser, but become exported by mistake in the initial implementation of this driver. We have public NewSPI(), NewI2C() constructors and Device struct, everything else must be internal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, I2C implementation is a pain in the ass, as b.wire.Tx() needs a slice in the write parameter, and that slice have to be pre-allocated otherwise it goes to the heap, as we need to inject commad/data flag byte first, then append actual data.
SPI in this sense much cleaner, as it has a separate pin to indicate command or data mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have the pre-allocated buffer in Device struct, but it must come from the buser implementation, as implementations have different requirements on this buffer (I2C needs it to be larger).

Avoiding heap allocations is not simple, at least not for this dirver.

@soypat
Copy link
Contributor

soypat commented Jul 4, 2025

The examples look good as is btw!

@ysoldak
Copy link
Contributor Author

ysoldak commented Jul 5, 2025

Let me know if I'm asking for too much, I can take on the work after this is merged, but I believe it must be done, by someone.

TBH, I feel like give up by now.
What started as a fun task to eliminate heap allocations in an existing driver by pre-allocating some buffers, has become an ordeal of driver refactoring.
Please, feel free to implement the buffers in some better more understandable way, to everyone's benefit! I'm out of ideas by now.

@soypat
Copy link
Contributor

soypat commented Jul 7, 2025

I will get to this sometime this week. Was out this weekend

@soypat
Copy link
Contributor

soypat commented Jul 8, 2025

OK, we can merge this as is and I'll take a look at making a PR with improvements later on. Will merge by end of week if none against this

@soypat soypat merged commit 7f1bd68 into tinygo-org:dev Jul 13, 2025
1 check passed
@soypat
Copy link
Contributor

soypat commented Jul 13, 2025

Alright merged! Thank you for your contribution @ysoldak and to everyone that helped test @sago35 @deadprogram :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants