[GH-ISSUE #1164] [Bug]: Bambuddy-Printer-Connection fails after a few filament configs #838
Originally created by @RosdasHH on GitHub (Apr 29, 2026).
Original GitHub issue: https://github.com/maziggy/bambuddy/issues/1164
Originally assigned to: @maziggy on GitHub.
Component
Bambuddy
Bug Description
I experienced an issue where the filament slot data from Bambuddy sometimes doesn't reach the printer. This happens fairly often, and for me almost always on the 6th filament change in a row on a printer. After waiting about a minute, the filament colors on the printer jump around between the different slots for 1-2 seconds, then it stops. Shortly afterwards, this mapping also shows up in Bambuddy. In the attached logs I configured a few slots, hit the issue, pressed reconnect, configured a few more slots, and hit the issue again.
Expected Behavior
The filament config from bambuddy should always reach the printer.
Steps to Reproduce
Configure the slots of an AMS 6-8 times
Printer Model
Not printer-related
Bambuddy Version
v0.2.4b1-daily
SpoolBuddy Version
No response
Printer Firmware Version
No response
Installation Method
Docker
Operating System
Linux (Other)
Relevant Logs / Support Package
bambuddy-support-20260429-110759.zip
Screenshots
No response
Additional Context
No response
Checklist
@maziggy commented on GitHub (Apr 29, 2026):
Fixed in the dev branch and available with the next release or daily build. Please let me know if it works for you.
If you find Bambuddy useful, please consider giving it a ⭐ on GitHub — it helps others discover the project!
@RosdasHH commented on GitHub (Apr 29, 2026):
I just pulled the dev branch and tried it, but unfortunately the problem persists.
bambuddy-support-20260429-125928.zip
@maziggy commented on GitHub (Apr 29, 2026):
I traced through the logs carefully; the 0.2.4b2 fix is in place and working as intended:
What's still visible is a separate layer of the original report: the printer itself takes 11–19 s to acknowledge some commands when several are sent in rapid succession. Slot 0 published at 12:55:44, for example, was finally acked ~19 s later. Each slot configure publishes three MQTT messages (ams_filament_setting, extrusion_cali_sel, pushall), so 4–6 configures in under 20 s adds up to a couple of dozen messages hitting the printer's broker. Some firmware (P1S 01.09.01.00 in your case) doesn't pipeline that well and queues acks.
That's not something Bambuddy can fix server-side without slowing down workflows that don't have this issue. The right place to handle it is on the API consumer side: wait for the WebSocket AMS update before sending the next configure, or insert a 1–2 s pause between calls. Bambu Handy / Studio do this implicitly because no human can click that fast.
If you're driving these calls programmatically — e.g. from a scan-and-assign workflow like BamScan — that's where I'd add the pacing. Closing this as the underlying watchdog regression is fixed; happy to reopen if you can show a case where Bambuddy itself is dropping the publish (i.e. no MQTT publish line in the log for a configure that the API accepted).
@RosdasHH commented on GitHub (Apr 29, 2026):
I repeated the test with 10-second intervals between AMS slot configurations. I kept Orca Slicer open alongside Bambuddy to directly compare state updates.
I changed one slot via Bambuddy, waited 10 seconds, then configured the next one, and so on. After the 4th successful change, none of the subsequent configurations were shown in Orca Slicer anymore.
After a few minutes, all previously sent configurations were then applied in rapid succession, causing the filament states in Orca to cycle quickly through each intermediate state until finally reaching the last configuration.
This behavior occurs even with a 10-second delay between requests. During the same interval, I can perform identical configuration changes directly in Orca Slicer without any issues, and they are reflected in Bambuddy. This makes me think that the request rate is not the underlying problem.
Additionally, after the 4th change, Bambuddy appears to enter a state where further configuration changes are no longer processed until the printer is reconnected. In this state, Orca Slicer continues to function normally and its changes are still accepted by the printer and reflected back in Bambuddy, but the reverse path (Bambuddy → printer → Orca) is effectively stalled.
It is the exact same behavior we had here: #887, but not caused by long idle times.
None of the earlier tests were made through any automation; both were done through the Bambuddy UI.
Since Orca Slicer continues to work normally with new filament configs, the issue seems to be specific to the Bambuddy -> printer update rather than the printer itself.
At the end of the logs (from 2026-04-29 18:07:04,219 to the end), I successfully configured 4 times, with a 10 to 13 second delay between the configs, through the Bambuddy UI, but the 5th one didn't make it through. I waited a few minutes, but it never arrived.
bambuddy-support-20260429-181319.zip
@maziggy commented on GitHub (May 1, 2026):
Traced through it line by line; here's what I see.
The 0.2.4b2 fix is doing its job. In this session the unanswered counter reaches count=1 once (18:08:25) and never escalates to count=2, so no force_reconnect fires. The cascade that produced your original "colors jump around after a minute" symptom is gone.
What's left is firmware-side. Your 5 configures at 13–15 s spacing: configures 1–4 get acked in 17 ms – 0.6 s, configure 5 is published successfully but the printer simply never acks it. MQTT stays alive (push_status keeps streaming the whole time), so it's not a connection problem — the printer just stopped processing that command. Your earlier observation that the queue "flushes in rapid succession after a few minutes" is the right read: P1S 01.09.01.00 is queuing, not discarding. The Orca comparison doesn't translate to a Bambuddy fix because Studio/Orca are driven by a human clicking through a dialog — they implicitly wait for the printer to settle between commands. An API consumer firing configures back-to-back doesn't have that natural pause, and even at 13 s the firmware queue fills up.
On the API side. Since you're driving these from your own app rather than the Bambuddy UI, the right place to handle the pacing is there. Two approaches that work in practice: wait for the WebSocket AMS update to confirm the previous configure before sending the next one, or insert a 1–2 s pause between calls.
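For illustration, a minimal sketch of that pacing on the consumer side. The base URL, endpoint paths, and payload/response shapes below are placeholders, not Bambuddy's actual API:

```python
# Sketch only: pace AMS slot configures so each one is confirmed (or has
# clearly timed out) before the next is sent. BASE_URL, the endpoints, and
# the payload/response shapes are placeholders for whatever client you drive.
import time
import requests

BASE_URL = "http://bambuddy.local:8080"  # placeholder

def configure_slot(slot: int, color: str, material: str) -> None:
    # Placeholder endpoint; substitute the real configure call you use.
    r = requests.post(f"{BASE_URL}/api/ams/slots/{slot}",
                      json={"color": color, "material": material}, timeout=10)
    r.raise_for_status()

def wait_for_slot_update(slot: int, color: str, timeout: float = 20.0) -> bool:
    # Poll the reported AMS state until the slot reflects the new color.
    # A WebSocket AMS-update listener would do the same job with less polling.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = requests.get(f"{BASE_URL}/api/ams/state", timeout=10).json()
        if state.get("slots", {}).get(str(slot), {}).get("color") == color:
            return True
        time.sleep(1.0)
    return False

for slot, color in [(0, "FF0000"), (1, "00FF00"), (2, "0000FF")]:
    configure_slot(slot, color, "PLA")
    if not wait_for_slot_update(slot, color):
        time.sleep(2.0)  # fall back to a fixed 1-2 s pause if no confirmation arrives
```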
Closing this one — the regression watchdog fix is in, and the remaining behaviour isn't something Bambuddy can patch without slowing down workflows that don't have this issue. Happy to reopen if you can show a case where Bambuddy itself drops a publish (API accepts it but no Publishing ams_filament_setting line in the log) or the unanswered watchdog mis-fires again.
@RosdasHH commented on GitHub (May 1, 2026):
Sorry, the unanswered counter does reach count=2. I just downloaded the logs too soon, so that part was no longer in them.
All tests were made independently of BamScan or any other API service; everything was done directly through the Bambuddy UI.
The problem occurs independently of the elapsed time. I can configure 4x in the morning and 2x in the evening, where the first config in the evening doesn't get applied and the second one fires a reconnect because the unanswered counter reaches count=2.
I tested the same thing (again manually, through the Bambuddy UI) with 10 min between the configs (which should be overkill in my opinion, and enough time for the printer's queue to empty), but after config 4 (40 min) the problem was there again. When trying to configure the next one, the config modal loads longer than usual (probably due to the K-profile timeout "Failed to get K-profiles after 3 attempts"), and my selected config is not applied. As soon as I try to configure another one after the one that didn't work, the unanswered count goes up to 2 and MQTT reconnects: "Forcing MQTT reconnect: ams_filament_setting unanswered 2×".
Tested on a Linux Docker installation and on the Windows dev branch, with a P1S and an A1 Mini respectively; same results everywhere.
I know Bambuddy appears to send the config to the printer, but from my perspective there has to be a problem somewhere. That's what I wanted to point out with the Orca Slicer comparison: I can click through the dialogs there and configure the slots as much as I want without running into this problem, but when I do the same through the Bambuddy UI (even with long gaps in between) I run into the issue. So there I also have the "natural pause" you mentioned.
The same problem also occurs with assigning spools, but there are more slot changes required to run into this issue.
A third person has also confirmed that it happens for him, so it is not specific to my installation or printers.
Would you be able to try to reproduce this on your setup?
Here is the log with manual configs (through the Bambuddy UI) and a delay of ~10 min between them, so it should show that the problem occurs independently of the elapsed time:
bambuddy-support-20260501-161335.zip
@maziggy commented on GitHub (May 2, 2026):
You're right and my earlier close was wrong on the substance. Apologies for the runaround. Reopening.
Two things in this bundle I missed before:
The watchdog DID escalate to count=2 in this run (count=1 at 15:53:14, count=2 at 16:03:53 -> Forcing MQTT reconnect). So the 0.2.4b2 fix is firing as designed; force_reconnect heals the wedge. But it costs one "wasted" config attempt to trigger, which matches your "first config silently dropped, second triggers reconnect" observation exactly.
More importantly, the wedge isn't ams_filament_setting-specific. Right before config 6, your K-profile fetch also timed out 3x ("Failed to get K-profiles after 3 attempts" at 16:03:18) into the same dead channel — different command, proper incrementing sequence_id, same result. So the wedge is the publish/response path itself, not anything about the AMS command shape or pacing.
The flood-on-reconnect is the giveaway: when force_reconnect fires at 16:03:53, the printer dumps every pending response within 5 seconds, including responses to commands sent ~10 minutes earlier. So the printer did process them; the path back to us was blocked until the MQTT session reset. That also rules out my earlier firmware-queue-saturation theory — 10 minutes between configs is plenty of time for any reasonable queue to drain.
Two near-term improvements I can ship while I hunt the root cause:
Going to try to reproduce on my own P1S over the next sessions. Two questions to help narrow it down:
Thanks for the careful repro and the patience.
@RosdasHH commented on GitHub (May 2, 2026):
Today, out of 10 attempts, I was able to configure successfully 5 times; the 6th attempt didn’t go through, and the 7th triggered a reconnect. Yesterday, I was only able to configure 4 times, so there seems to be some drift.
I tried the spool assignments again. I am able to change the spool 16x before nothing comes through anymore. (2 tries, both came to 16)
The spool assignments and slot configs seem to correlate in some way, because if I change the spool assignments 8x, I get the problem after the third config. If I assign a spool 12x, I have only one successful config left.
Sometimes when I change, for example, the third slot to red, both the first and the third slot briefly turn red (I don't know if it is the first slot every time, but this just happened). However, after about half a second, the first slot reverts to its original color.
Thanks for looking into this.
@maziggy commented on GitHub (May 2, 2026):
Thanks for the patient testing — your data clearly characterizes the problem and rules out everything that's actionable on Bambuddy's side.
What's confirmed:
Where this can actually be fixed: the printer firmware. Bambuddy can only paper over the symptom from outside. I considered tightening the watchdog from count=2 to count=1 (you'd lose zero commands instead of one), but legitimate slow ACKs on big config bursts can take 10–15 s to come back, and a count=1 trigger would loop into spurious reconnects on healthy sessions. The current count=2 is the right balance for the average user; what you're seeing is the tradeoff at the pathological end.
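For readers following along, a rough sketch of what that count=2 tradeoff amounts to. This is illustrative only, not Bambuddy's actual code, and the names are made up:

```python
# Illustrative only: consecutive unanswered ams_filament_setting publishes
# increment a counter, any response resets it, and a reconnect is forced only
# at the threshold. threshold=1 would also fire on legitimately slow
# (10-15 s) acks during big config bursts; threshold=2 tolerates one of them.
class UnansweredWatchdog:
    def __init__(self, threshold: int = 2, ack_timeout: float = 20.0):
        self.threshold = threshold
        self.ack_timeout = ack_timeout
        self.unanswered = 0

    def wait_for_ack(self, ack_event) -> bool:
        # ack_event: a threading.Event set by the MQTT on_message handler
        # when the printer's response for this command arrives.
        if ack_event.wait(self.ack_timeout):
            self.unanswered = 0        # healthy session, clear the counter
            return True
        self.unanswered += 1
        if self.unanswered >= self.threshold:
            self.unanswered = 0
            self.force_reconnect()     # heals a wedged publish/response path
        return False

    def force_reconnect(self) -> None:
        print("Forcing MQTT reconnect: ams_filament_setting unanswered 2x")
```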
I'm going to close this since the auto-heal is the practical fix and there's no further bambuddy-side improvement that doesn't trade one user's experience for another's. If you want to pursue this with Bambu Lab — they're the ones who can fix the firmware queue — point them at:
Thanks again for the thorough repros — bug reports this well-instrumented are rare.
@RosdasHH commented on GitHub (May 2, 2026):
I found out that the problem does not occur at all with qos=0 or qos=2 in bambu_mqtt.py; it only happens with qos=1.
@RosdasHH commented on GitHub (May 2, 2026):
QoS 1 - at least once delivery (resends the packet if delivery is not acknowledged)
Maybe there is an issue with the acknowledgement of the sent packet, causing it to be resent because Bambuddy assumes the message has not been delivered. This could then lead to messages accumulating over time until the printer eventually stops accepting new packets? That would be my guess.
I don't think this is a printer firmware issue, because our slicers don't have this problem, and a quick Python test didn't have it either.
@maziggy commented on GitHub (May 2, 2026):
Now that's a catch! 💯
Your QoS=1 vs QoS=0 vs QoS=2 bisect is the actual signal: the wedge isn't in the printer firmware, it's in paho-mqtt's QoS=1 inflight queue. The default ceiling is 20 messages; on Bambu's non-standard broker the PUBACK matching is racy, so paho's inflight slots don't all get freed, the queue silently fills up, and after ~16-20 commands publish() returns success but the packet just sits in paho's internal queue. force_reconnect heals it because the inflight queue is per-session.
I don't want to switch QoS globally — we spent weeks getting QoS=1 working reliably across all printer models (A1, P1S, X1C, H2D, P2S, X2D, ...) and that wasn't arbitrary. But your bisect points at the right spot: lifting paho's inflight ceiling solves it without touching wire-protocol behaviour. I've pushed max_inflight_messages_set(1000) to dev — that's well above any realistic session command count.
Could you pull the latest dev image and run your repro again? Specifically: configure 30+ slots / spool assignments in a session and see if the wedge still fires. If yes, the diagnosis is incomplete and I'll dig further. If no, we have the fix and the watchdog reconnect from 0.2.4b2 just becomes defence-in-depth.
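For reference, a minimal sketch of what lifting the inflight ceiling looks like with paho-mqtt. This uses the paho-mqtt 1.x constructor style; the host, serial, access code, and payload are placeholders, and it is not Bambuddy's actual bambu_mqtt.py:

```python
# Sketch: raise paho-mqtt's QoS=1 inflight window so publishes are not parked
# in the client-side queue while earlier PUBACKs are still outstanding.
import ssl
import paho.mqtt.client as mqtt

client = mqtt.Client(client_id="inflight-example")   # paho-mqtt 1.x style constructor
client.username_pw_set("bblp", "ACCESS_CODE")         # printer LAN access code (placeholder)
client.tls_set(cert_reqs=ssl.CERT_NONE)               # printers use a self-signed cert
client.tls_insecure_set(True)

# Default inflight ceiling is 20 messages; lift it well above any realistic
# per-session command count so publish() at qos=1 never stalls silently.
client.max_inflight_messages_set(1000)

client.connect("PRINTER_IP", 8883)
client.loop_start()
client.publish("device/SERIAL/request", payload="{...}", qos=1)  # command JSON placeholder
```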
Thanks for the careful bisect — without the QoS test we'd still be looking in the wrong place.
@RosdasHH commented on GitHub (May 2, 2026):
96 configs - No problem
I think that's it! 🎊🎉
@maziggy commented on GitHub (May 2, 2026):
Yeah!