Skip to content

SSLSocket related improvements#351

Draft
kares wants to merge 9 commits intomasterfrom
ssl-socket-fixes
Draft

SSLSocket related improvements#351
kares wants to merge 9 commits intomasterfrom
ssl-socket-fixes

Conversation

@kares
Copy link
Copy Markdown
Member

@kares kares commented Mar 26, 2026

No description provided.

Copy link
Copy Markdown
Member

@headius headius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I approve of the fixes and they appear to make sense. The bulk of functional changes appear to be handling incomplete reads and writes better, rather than assuming completion and incorrectly dropping data.

The tests are harder to verify. They apear to be correct but there's a lot of duplicate boilerplate and they do not exist in any CRuby tests that I could find. There's some tests that are testing JRuby-specific internals via reflection and those tests may be problematic to maintain over time. They may also may fail as we further adopt JPMS and lose the ability to do arbitrary reflective access.

I'm not opposed to the tests in spirit but it's hundreds of lines of code to test largely the same server/client setup with slightly different behaviors. If these are valid test cases then they should probably be added to CRuby tests or ruby/spec and verified by other ruby/openssl committers who have knowledge of SSLSocket expected behaviors. I suspect some of these already exist in some form in the existing tests. They pass on CRuby (where not masked by a JRuby check) but what they are verifying is sometimes unclear to me.

ssl.sync_close = true
ssl.connect

java_cls = Java::OrgJrubyExtOpenssl::SSLSocket.java_class
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section seems problematic. Access to non-public fields may be rejected on certain levels of JDK and it's not clear to me why we are manipulating SSLSocket internals to verify behavior here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also testing JRuby-specific internals so it's hard to verify whether this actually reflects correct behavior from CRuby.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah haven't made my up yet how to tackle these.

all the buffering bugs are really hard to reproduce, even in a non-isolated test environment.

but the test itself was useful "approximating" the issue.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will look into changing these to Java unit tests, even though the approximation will be even less accurate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I understand the situation a bit better. As Java unit tests this state could be made package visible and not have to use reflection tricks that might fail on newer versions of jvm. I'll leave it up to your judgment.

begin
n = ssl.write_nonblock(line.byteslice(written, line.bytesize - written))
written += n
rescue IO::WaitWritable, OpenSSL::SSL::SSLErrorWaitWritable
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary and proper to rescue both of these?

begin
ssl_conn = ssl_server.accept
server_ready << :ready
sleep 5
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a 5 second unconditional sleep necessary here to test saturation?

@kares
Copy link
Copy Markdown
Member Author

kares commented Mar 30, 2026

I'm not opposed to the tests in spirit but it's hundreds of lines of code to test largely the same server/client setup with slightly different behaviors. If these are valid test cases then they should probably be added to CRuby tests or ruby/spec and verified by other ruby/openssl committers who have knowledge of SSLSocket expected behaviors.

test suite hasn't been aligned with OpenSSL (ever since JOSSL was split off the main repo) and there's a lot of JRuby specific ones or adjusted one among these. the process currently is just picking up test from upstream and adding here as a bug/feature is worked on.

the ultimate plan (a big task) over the years was to have the suite from MRI as is and the JRuby specific ones in a different folder. that also needs exclusion support, like in JRuby repo.

@kares kares marked this pull request as draft March 30, 2026 09:56
kares added 8 commits March 30, 2026 14:22
propagate exception flag through read() and readAndUnwrap()
`read==0` guard only called waitSelect when `status==BUFFER_UNDERFLOW`

after `readAndUnwrap` processes a TLS 1.3 NewSessionTicket (status=OK,
zero app bytes, no buffered network data), the guard was skipped causing
an unnecessary extra loop through `readAndUnwrap` before
`BUFFER_UNDERFLOW` was reached on the second pass
Cover normal (non-error) data flow through the code paths changed by
the read_nonblock exception fix and the netReadData.position() guard:

- Multi-chunk read_nonblock: 30KB echo in 1KB chunks (TLS 1.3 + 1.2)
- Multi-chunk with exception:false (TLS 1.3)
- Partial read_nonblock: read 5 bytes then remainder from appReadData
- Multiple puts/gets cycles: 10 rounds on a single connection
- sysread/syswrite round-trip: 3 blocking cycles with exact byte counts
- Large server write: server sends 48KB, client reads in 4KB chunks;
  regression test for netReadData.position()==0 guard (TLS 1.3 + 1.2)
write() unconditionally called netWriteData.clear() after a non-blocking
flushData() that may not have flushed all encrypted bytes.

This discarded unflushed TLS records, corrupting the encrypted stream
and causing 'Broken pipe' or 'Connection reset' errors on subsequent
writes — most commonly seen during 'gem push' of large gems over HTTPS.

Fix: replace clear() with compact() which preserves unflushed bytes by
moving them to the front of the buffer before engine.wrap() appends new
encrypted output.

Additionally, sysreadImpl() now flushes pending netWriteData before
reading.  After write_nonblock, encrypted bytes could remain unsent;
without flushing first the server would never receive the complete
request body (e.g. net/http POST), causing it to time out or reset.
- writeNonblockDataIntegrity: approximates the gem push scenario from
  #242 — large payload via write_nonblock loop, then read server's byte
  count response, assert data integrity (no bytes lost)
- writeNonblockNetWriteDataState: saturates TCP buffer, then accesses
  the package-private netWriteData field directly to verify buffer
  consistency after the compact() fix
@kares kares force-pushed the ssl-socket-fixes branch from e93753b to 78c4b2d Compare March 30, 2026 12:24
@headius
Copy link
Copy Markdown
Member

headius commented Mar 30, 2026

Okay, I understand better why these are new tests and the disconnect with the CRuby tests is an ongoing challenge. Perhaps it's worth me taking some time this week to set up a testing harness that can fetch the CRuby tests and run with exclusions so we can track that more closely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants