Topic: Mixed reading and writing from wolfSSL_connect is giving me a headache
After literally weeks of work, I finally managed to integrate wolfssl into libevio.
But it was really painful to deal with the dual nature of wolfSSL_connect
which, even when called because it reported WANT_WRITE, happily continues
to read after writing.
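For readers who have not driven a non-blocking handshake before, here is a minimal sketch of the pattern. It does not use wolfSSL itself: `Want`, `WaitFor`, `next_wait` and `drive_handshake` are hypothetical names made up for illustration, and `step` stands in for one call to wolfSSL_connect followed by wolfSSL_get_error. The point is only that a single call may both write and read, so the caller learns what to wait for next exclusively from the result of the last call.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the outcome of one non-blocking handshake call
// (in real code: the result of wolfSSL_connect plus wolfSSL_get_error).
enum class Want { read, write, done };

// Which readiness event the event loop should wait for next.
enum class WaitFor { readable, writable, nothing };

// Map the outcome of the last handshake call to the next event to wait for.
inline WaitFor next_wait(Want w)
{
  switch (w)
  {
    case Want::read:  return WaitFor::readable;
    case Want::write: return WaitFor::writable;
    case Want::done:  return WaitFor::nothing;
  }
  return WaitFor::nothing;
}

// Call `step` until it reports done and record, in order, what the caller
// would have to wait for between the calls. A real event loop would
// actually wait for that event on the socket before calling `step` again.
inline std::vector<WaitFor> drive_handshake(std::function<Want()> const& step)
{
  std::vector<WaitFor> waits;
  for (;;)
  {
    Want w = step();
    if (w == Want::done)
      break;
    waits.push_back(next_wait(w));
  }
  return waits;
}
```

With separate read and write threads, as in libevio, the thread that made the call is not necessarily the one that must wait for the next event, which is where the state bits below come in.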
Here is some of the "code" that I wrote (more comment than code, but
otherwise nobody would understand what is going on):
// Bits of TLS::m_session_state
//  _ post_handshake
// / _ want_write
// |/  _ inside_do_handshake
// || /
// || |   Required action   Possible transitions to   Condition result
// || |
// 00x0   stop              00x0, 00x1, 01x0, 10x0    WasTrue
// 00x1   stop              00x0, 01x0, 10x0          WasTrue
// 01x0   do not stop       01x0                      False
// 01x1   stop              00x0, 01x0, 10x0          WasTrue
// 10x0   stop if           10x0                      obuffer->StreamBufConsumer::nothing_to_get() (False or WasTrue)
// 10x1   not possible
// 11x0   not possible
// 11x1   not possible
//
// If the post_handshake bit is not set, then it was never set before.
// Therefore, since in the immediate past is_blocked_or_handshake_finished()
// returned true, the inside_do_handshake bit must have been set.
// Hence, if we see that neither post_handshake nor inside_do_handshake
// are set then the read thread returned from do_handshake and
// signalled if it wants to continue with reading or writing by clearing
// or setting the want_write bit; however, since we also get here when
// state == 0, it is possible that in that case it was THIS thread that
// just reset the inside_do_handshake bit (after having just executed
// do_handshake). In that case the read thread is still running and
// could cause transitions from 00x0 to any other state, but that doesn't
// change the fact that the required action to stop is a WasTrue.
//
// Hence, if we see that handshake_wants_write_and_not_blocked(state)
// then the read thread must have left do_handshake (resetting
// inside_do_handshake) and set the want_write bit. That means the
// handshake is not finished yet and the handshake wants to continue
// with writing (not reading!).
// Therefore the *read* thread will not re-enter do_handshake and
// thus m_session_state can't change anymore.
//
// Therefore this condition must return fuzzy::WasTrue for all states
// except where handshake_wants_write(state), in which case it should
// return fuzzy::False.
//
// If is_post_handshake(state) then instead it should return
// obuffer->StreamBufConsumer::nothing_to_get(). Note that a state of
// post_handshake can also no longer change, leaving the fuzzy::False
// returned by StreamBufConsumer::nothing_to_get() at fuzzy::False.
utils::FuzzyBool must_stop_output_device(OutputBuffer const* obuffer)
{
  int state = m_session_state.load(std::memory_order_relaxed);
  if (handshake_wants_write_and_not_blocked(state))
    return fuzzy::False;
  utils::FuzzyBool output_buffer_is_empty = obuffer->StreamBufConsumer::nothing_to_get();
  if (output_buffer_is_empty.is_false())
    m_session_state.fetch_or(s_have_plain_text, std::memory_order_release);
  if (is_post_handshake(state))
    return output_buffer_is_empty;
  return fuzzy::WasTrue;
}
// Bits of TLS::m_session_state
//  _ post_handshake
// / _ want_write
// |/  _ inside_do_handshake
// || /
// || |   Required action   Possible transitions to   Condition result
// || |
// 00x0   do not stop       00x0                      False
// 00x1   stop              00x0, 01x0, 10x0          WasTrue
// 01x0   do not stop       00x0, 01x0, 01x1, 10x0    False
// 01x1   do not stop       00x0, 01x0, 10x0          False
// 10x0   do not stop       10x0                      False
// 10x1   not possible
// 11x0   not possible
// 11x1   not possible
//
// We can not stop the input device when the state is 01x0 because
// that means that the write thread is running which might stop
// the output device (and stopping both input and output device
// could terminate the application; or more specifically it means
// that 'we are done' with this device in evio terms - which is
// obviously not true).
//
// Because that state (01x0) can transition to 01x1 (when the
// write thread enters do_handshake()), that state can not cause
// us to stop the input device either or the result of state 01x0
// would need WasFalse to be returned - which is not allowed (or
// rather, that can't work: due to a race condition we could
// *miss* stopping on state 01x1).
//
// Obviously we do not want to stop the input device when the
// state is 00x0 or 10x0; in the first case because the handshake
// is not finished and it wants to read, and in the second because
// the handshake is finished and we can return to the sane strategy
// of never stopping to read an input device until we're done with it.
//
// We HAVE to stop when the state is 00x1 because reading is
// required but the write thread is handling the handshake at
// the moment (so we can't). Not stopping could lead to an immediate
// return to read_from_fd and thus cause a tight loop using 100% cpu.
//
// None of the other states can transition to 00x1, so that we
// can return fuzzy::False for all of them as is required (see above).
//
// The reason that a transition to 00x1 is not possible is because
// 1) When the write thread is inside do_handshake (xxx1) the only
//    possible transition is when the write thread leaves do_handshake
//    which always resets the least significant bit (xxx1 --> xxx0).
// 2) Once the state is 10x0 the handshake is finished and the state
//    won't return to an unfinished handshake.
// 3) If the write thread is outside do_handshake and the state is
//    01x0 then the only possible (first) transition is to 01x1
//    (which subsequently could change back to 01x0, to 10x0 or to
//    00x0), and
// 4) If the write thread is outside do_handshake and the state is
//    00x0 then the write thread is stopped, so it won't make any
//    additional changes anymore.
//
// Not stopping in the state 01x1 is unfortunate, but discussed
// elsewhere.
utils::FuzzyBool must_stop_input_device() const
{
  int state = m_session_state.load(std::memory_order_relaxed);
  return handshake_wants_read_and_blocked(state) ? fuzzy::WasTrue : fuzzy::False;
}
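The predicates used by the two functions above are not shown in this post. As a rough sketch of how they could map onto the "00x0" notation of the tables: the bit values below are assumptions (the real constants in libevio may differ), `s_have_plain_text` is assumed to be the 'x' (don't care) bit, and `s_table_mask` and `check_tables` are hypothetical helpers added only for this illustration.

```cpp
#include <cassert>

// Assumed bit layout, most significant bit first as in the tables.
constexpr int s_inside_do_handshake = 1;
constexpr int s_have_plain_text     = 2;   // Assumed to be the 'x' bit.
constexpr int s_want_write          = 4;
constexpr int s_post_handshake      = 8;

// The three bits that the tables distinguish on (hypothetical helper).
constexpr int s_table_mask = s_post_handshake | s_want_write | s_inside_do_handshake;

// States 1xxx: the handshake has finished.
constexpr bool is_post_handshake(int state)
{
  return (state & s_post_handshake) != 0;
}

// State 01x0: the handshake wants to write and nobody is inside do_handshake.
constexpr bool handshake_wants_write_and_not_blocked(int state)
{
  return (state & s_table_mask) == s_want_write;
}

// State 00x1: the handshake wants to read, but the write thread is
// currently inside do_handshake.
constexpr bool handshake_wants_read_and_blocked(int state)
{
  return (state & s_table_mask) == s_inside_do_handshake;
}

// Check the predicates against the table rows, for both values of 'x'.
inline void check_tables()
{
  for (int x = 0; x <= s_have_plain_text; x += s_have_plain_text)
  {
    assert( handshake_wants_write_and_not_blocked(s_want_write | x));                          // 01x0
    assert(!handshake_wants_write_and_not_blocked(s_want_write | s_inside_do_handshake | x));  // 01x1
    assert( handshake_wants_read_and_blocked(s_inside_do_handshake | x));                      // 00x1
    assert(!handshake_wants_read_and_blocked(x));                                              // 00x0
    assert( is_post_handshake(s_post_handshake | x));                                          // 10x0
  }
}
```

Under these assumptions exactly one row of each table gives a result different from the rest: 01x0 is the only row returning False in the output table, and 00x1 is the only row returning WasTrue in the input table.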
and....
// Did the handshake finish successfully?
if (TLS::handshake_completed(state))
{
  Dout(dc::tls, "Handshake completed!");
  m_connected_flags |= is_connected;
  // Do the m_connected() callback at this point (as opposed to when the TCP connection was established),
  // as in most cases it will be used as a "you can now send/receive data" signal...
  if (m_connected)
  {
    int count = allow_deletion_count;
    m_connected(allow_deletion_count, true);
    if (allow_deletion_count > count)
      // Device is marked for deletion.
      return;
  }
  // It is impossible to test if the output buffer is empty from this thread.
  //
  // It would work to simply start the output device and let the write thread deal with it (that is,
  // the write thread would stop the output device again if the buffer turns out to be empty).
  // However, if there is a way to avoid a needless start and subsequent stop then that would be preferable.
  //
  // If we do some fuzzy test - and based on that start the output device, then
  // nothing is lost. The only thing that we want to avoid is that we end up with
  // a stopped output device while there is something in the output buffer.
  // It is not possible however that by doing nothing we end in that state unless
  // there is already something in the output buffer (that was flushed) and no
  // new flush happens after this point (from another thread).
  //
  // When something is, or was, written to the output buffer and flushed - then that
  // caused the output device to be started. So, it is necessary that subsequently
  // this was ignored from write_to_fd() because the TLS handshake had not finished
  // yet.
  //
  // Moreover, the output device begins started, so it must have been stopped in
  // the meantime (as part of the TLS handshake), which happens exclusively from
  // the write thread.
  //
  // Therefore, it is possible to know if there is something in the (plain text)
  // output buffer as detected by the write thread when it stopped the output
  // device.
  if (m_tls.need_start_output_device(state))
    start_output_device(state_t::wat(m_state));
  // It is very unlikely that there is more to read, immediately after the handshake.
  break;
}
// Stopping the input device could cause the application to exit if
// this is the only device and the output device is stopped too.
// Therefore, we will only stop the input device if
// 1) the handshake is not finished,
// 2) the handshake wants to read,
// 3) the write thread is inside do_handshake.
// See TLS::must_stop_input_device for the detailed argumentation.
utils::FuzzyCondition condition_must_stop_input_device([this]{
  return m_tls.must_stop_input_device();
});
if (condition_must_stop_input_device.is_momentary_false())
{
  Dout(dc::tls, "Trying buffer again because condition_must_stop_input_device.is_momentary_false() returned true (state = " << state << ").");
  continue;
}
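The long comment above need_start_output_device(state) boils down to the write thread leaving a note for the read thread. A stripped-down sketch of that hand-off, under assumptions: `SessionStateSketch` and `note_output_buffer` are hypothetical names, the bit value is made up, and the body of `need_start_output_device` is a guess at what the real function tests.

```cpp
#include <atomic>
#include <cassert>

constexpr int s_have_plain_text = 2;   // Assumed bit value.

struct SessionStateSketch
{
  std::atomic<int> m_session_state{0};

  // Write thread: called when it is about to stop the output device during
  // the handshake. Remembers that plain text is still waiting in the buffer.
  void note_output_buffer(bool buffer_is_empty)
  {
    if (!buffer_is_empty)
      m_session_state.fetch_or(s_have_plain_text, std::memory_order_release);
  }

  // Read thread: called once the handshake has completed, with the state
  // it observed. Only restart the output device if the write thread saw data.
  bool need_start_output_device(int state) const
  {
    return (state & s_have_plain_text) != 0;
  }
};
```

This way the read thread never has to inspect the output buffer itself; it only looks at a bit that was set exclusively by the write thread.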
I suspect nobody actually read all that, but you get the idea.