Unfortunately, that seems like a chicken-and-egg scenario.
I want wolfSSL to only return the deciphered bytes from (at most) one fragment.
Interpreting 5 bytes as a header in my receive callback (and possibly validating them there) is not sound software engineering, because the state of the TLS connection is already kept inside wolfSSL. Besides, wolfSSL could ask for 5 bytes even when they are not a header. Having my callback keep state in addition to wolfSSL sounds like unnecessary overhead.
I'm mainly asking where to modify the wolfSSL state machine so that it does NOT try to give me all the bytes the application requested, but only those of one deciphered fragment. The application has no way of knowing how many bytes it should ask for, but it is legal under the API contract of wolfSSL_read() to return fewer bytes than requested.
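
To make the behaviour I'm after concrete, here is a minimal sketch of how my application would consume such short reads. It does not call wolfSSL at all: `read_fn`, `fake_conn`, `fake_read` and `drain` are hypothetical names made up for illustration, and `fake_read` only simulates a reader that returns at most one fragment per call, which is what I would like wolfSSL_read() to do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for wolfSSL_read(): returns the number of bytes
 * written to buf (possibly fewer than sz), or <= 0 when nothing is left. */
typedef int (*read_fn)(void *ctx, void *buf, int sz);

/* Simulated connection that hands out at most one fragment per call. */
struct fake_conn {
    const unsigned char *data; /* all "deciphered" payload bytes */
    int len;                   /* total number of payload bytes  */
    int pos;                   /* bytes already delivered        */
    int frag;                  /* assumed size of one fragment   */
};

/* Returns at most one fragment, never more than the caller asked for. */
static int fake_read(void *ctx, void *buf, int sz)
{
    struct fake_conn *c = ctx;
    int remaining = c->len - c->pos;
    int n = remaining < c->frag ? remaining : c->frag;

    if (n > sz)
        n = sz;
    if (n <= 0)
        return 0;

    memcpy(buf, c->data + c->pos, (size_t)n);
    c->pos += n;
    return n;
}

/* The application asks for "as much as fits" and accepts whatever comes
 * back; it never needs to know the fragment size in advance. */
static int drain(read_fn rd, void *ctx, unsigned char *out, int cap, int *calls)
{
    int total = 0;

    *calls = 0;
    while (total < cap) {
        int n = rd(ctx, out + total, cap - total);
        if (n <= 0)
            break;          /* no more data (or an error) */
        total += n;
        (*calls)++;         /* one call == one fragment in this sketch */
    }
    return total;
}
```

With a 10-byte payload and an assumed fragment size of 4, `drain` needs three calls to the reader, each returning one fragment, even though the caller asked for the whole buffer every time.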