wolfSSL Intel SGX Testing

wolfSSL supports Intel SGX, and we run continuous integration testing on that support. Every night, a process starts up and runs unit tests on crypto operations in a secure Enclave. Here’s a peek at some of the ongoing tests in action:


LINK => App
GEN => trusted/Wolfssl_Enclave_t.c
CC <= trusted/Wolfssl_Enclave_t.c
cc -Wno-implicit-function-declaration -std=c11 -m64 -O2 -nostdinc -fvisibility=hidden -fpie -fstack-protector -IInclude -Itrusted -I../..// -I../..//wolfcrypt/ -I/opt/intel/sgxsdk/include -I/opt/intel/sgxsdk/include/tlibc -I/opt/intel/sgxsdk/include/stlport-fno-builtin -fno-builtin-printf -I. -DWOLFSSL_SGX -DHAVE_WOLFSSL_TEST -c trusted/Wolfssl_Enclave.c -o trusted/Wolfssl_Enclave.o
CC <= trusted/Wolfssl_Enclave.c
-m64 -O2 -Wl,--no-undefined -nostdlib -nodefaultlibs -nostartfiles -L/opt/intel/sgxsdk/lib64 -L../../IDE/LINUX-SGX/ -lwolfssl.sgx.static.lib -Wl,--whole-archive -lsgx_trts -Wl,--no-whole-archive -Wl,--start-group -lsgx_tstdc -lsgx_tstdcxx -lsgx_tcrypto -lsgx_tservice -Wl,--end-group -Wl,-Bstatic -Wl,-Bsymbolic -Wl,--no-undefined -Wl,-pie,-eenclave_entry -Wl,--export-dynamic -Wl,--defsym,__ImageBase=0 -Wl,--version-script=trusted/Wolfssl_Enclave.lds@
LINK => Wolfssl_Enclave.so

SIGN => Wolfssl_Enclave.signed.so
+ ./App -t
Crypt Test:
error test passed!
base64 test passed!
base64 test passed!
MD5 test passed!
MD4 test passed!
SHA test passed!
SHA-256 test passed!
Hash test passed!
HMAC-MD5 test passed!
HMAC-SHA test passed!
HMAC-SHA256 test passed!
GMAC test passed!
ARC4 test passed!
HC-128 test passed!
Rabbit test passed!
DES test passed!
DES3 test passed!
AES test passed!
AES192 test passed!
AES256 test passed!
AES-GCM test passed!
RANDOM test passed!
RSA test passed!
DH test passed!
DSA test passed!
PWDBASED test passed!
ECC test passed!
ECC buffer test passed!
mutex test passed!
memcb test passed!
Crypt Test: Return code 0

Interested in using FIPS with SGX or questions about wolfSSL testing? Contact us at facts@wolfssl.com.

How to use the 0-RTT rope to climb, without hanging yourself!

One of the major new features of TLS v1.3 is the 0-RTT handshake protocol. This variation of the handshake, using Pre-Shared Keys (PSKs), allows the client to send encrypted data to the server in the first flight. This is particularly useful for TLS on embedded devices. Take the example of IoT. There may be thousands or even millions of devices reporting back regularly to the central servers with small updates.

Using 0-RTT, the IoT device can send a ClientHello plus all the update data, known as “early data”, in one flight. The server then responds with the ServerHello, EncryptedExtensions, and Finished messages, plus acknowledgement of the early data, all in one flight. Finally, the device responds with EndOfEarlyData and Finished messages in a final flight to close the loop on the security.

We can see that the data is offloaded without having to wait for the server. The device stores a little state and goes back to its job, ready for the interrupt when the response arrives. If the response times out, the device can resend with an updated ClientHello. When the response does arrive, the device processes the handshake messages and replies, closing the connection, after which the update can be discarded.

This is all very efficient in terms of processing and overall round-trip time. But there are potential security issues, including replay attacks and a lack of forward secrecy.

An attacker can replay messages from a device. The server decrypts the early data using a key directly derived from the PSK and no other authentication is performed. Without the second flight from the client, the server would not recognize the copy is invalid. The recommended defense is single-use tickets. Each ticket contains a fresh PSK. This has the downside of requiring a shared database of tickets across servers. Alternatively, unique values from the ClientHello used with each PSK can be stored instead.
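The second option above, storing unique ClientHello values per PSK, can be sketched as a small lookup table. The following is a minimal illustration in Python, not wolfSSL code: the class name `ReplayCache`, the `accept` method, and the 10-second window are all hypothetical. The sketch also assumes the server separately rejects early data whose ticket age falls outside the window, since an entry that has expired from the table could otherwise be replayed.

```python
import time


class ReplayCache:
    """Hypothetical table of (PSK identity, ClientHello value) pairs seen recently."""

    def __init__(self, window_seconds=10.0, clock=time.monotonic):
        self.window = window_seconds  # assumed acceptance window, not a wolfSSL setting
        self.clock = clock
        self.seen = {}  # (psk_identity, client_random) -> time first seen

    def accept(self, psk_identity: bytes, client_random: bytes) -> bool:
        """Return True the first time a pair is seen, False for a replay."""
        now = self.clock()
        # Forget entries older than the window so the table stays bounded.
        self.seen = {k: t for k, t in self.seen.items() if now - t < self.window}
        key = (psk_identity, client_random)
        if key in self.seen:
            return False  # this ClientHello value was already used with this PSK
        self.seen[key] = now
        return True
```

With several servers behind one address, the dictionary would have to live in a store that all of them share, which is the same cost noted above for single-use tickets.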

The attacker may also intercept the client’s first flight and spam the server with copies. If the early data contains “state modifying” data as in the example above, processing a copy would be disastrous. If the PSK is single-use, the client will get out of sync with the server and a full handshake will be required. The server may well interpret the attack as the client attempting to retry and therefore this must be handled at the application level.

When the PSK is reused for a number of messages, forward secrecy is lost. This means that if a device is compromised all messages encrypted using keys derived from the current PSK are exposed. The recommended defense is to use a short timeout with tickets to limit the period of vulnerability.
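The short-timeout defense amounts to a freshness check on each ticket before its PSK is used. A minimal sketch in Python follows; the function name `ticket_is_fresh` and the 300-second default are assumptions chosen for illustration, not values taken from wolfSSL or from the TLS 1.3 specification.

```python
def ticket_is_fresh(issued_at: float, now: float, max_age_seconds: float = 300.0) -> bool:
    """Return True if a ticket issued at `issued_at` may still be used at `now`.

    Both times are seconds on the same clock. A shorter `max_age_seconds`
    shrinks the period during which a compromised PSK exposes traffic.
    """
    age = now - issued_at
    # A negative age means the clocks disagree; treat that as not fresh.
    return 0 <= age <= max_age_seconds
```

A server that finds a ticket stale would fall back to a full handshake and issue a new ticket.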

Although using 0-RTT does require more careful architecture on the server side, the benefits on the client side are worth it.

Overview of Testing in wolfSSL

The security of wolfSSL products is always on our mind and holds high importance.  Conducting regular, diligent, and well-planned testing helps maintain wolfSSL’s robustness and security.  We strive to write and maintain clean, readable, and understandable code.

Much as the halting problem shows that program behavior cannot be fully decided in general, we know it is impossible to test every single possible path through the software. Instead, we practice an approach that is focused on lowering the risk of failure. In addition to extensive automated testing, we make sure that we specifically test well-known use cases. This post outlines some of our internal testing process.

  1. API Unit Testing:  We have unit tests in place that test API functions for correct behavior. This helps maintain library consistency across releases and as the code evolves.  It helps us to deliver a high-quality, well-tested API to our end users with each software release.  API unit tests are run with each “make check” of wolfSSL.

  1. Cipher Suite Testing: wolfSSL supports an extensive list of cipher suites, which are all tested with every “make check” using the wolfSSL example client and example server.  Each cipher suite is tested not only in the default configuration, but also in non-blocking mode and with client authentication turned both on and off.

  1. Algorithm Testing: The security of our SSL/TLS implementation depends on the correctness and robustness of our underlying cryptography library, wolfCrypt.  We test all algorithms using NIST test vectors in addition to running our CAVP test harness used for our FIPS 140-2 validations.  We also test on both big and little endian platforms for portability.

  1. Benchmark Testing: We engage in another ever expanding universe of benchmark testing, where we look at sizing, transmission rates, connection speeds, and cryptography performance.  A version of our benchmark suite is included in every download for users to enjoy!

  1. Static Analysis: We do static analysis on our entire codebase using not only one, but multiple different static analysis tools.  We currently use Coverity Scan, clang scan-build, and Facebook Infer.  These tools help us to automatically find bugs, including ones on low-traffic code paths.

  1. Detecting Memory Errors:  We mitigate memory errors by using valgrind on a regular and automated basis.  This helps find memory errors including invalid access, use of undefined values, incorrect freeing of dynamic memory, and memory leaks.

  1. Interop Testing: We test for interoperability with other Open Source TLS implementations, including OpenSSL, BoringSSL, and GnuTLS.  This helps us to catch any protocol implementation errors in either wolfSSL or the implementation being tested against.  We also test outside of a closed environment by connecting to servers in the real world running unknown SSL/TLS implementations.

  1. Real World Builds: We build with a series of ‘real’ applications, like cURL, wget, pppd, OpenSSH, stunnel, lighttpd, etc.  For some of our customers with top level support, we build new releases with their application.

  1. Compiler Testing: We have users who compile wolfSSL with a variety of different compilers.  As such, we test compiling wolfSSL with many different compilers and toolchains including gcc/g++, clang, icc, Visual Studio, CodeWarrior, KDS, LPCXpresso, MPLAB XC, TI CCS, Keil, IAR, Cygwin, MinGW, CrossWorks, Arduino, Wind River Workbench, and more.

  1. Peer Review: More eyes on a codebase reduce the number of bugs that end up in a final product.  Internally, we operate using a “Fork and Pull Request” model.  This means that every commit that makes it into our master branch has been reviewed and tested by at least two separate engineers.

  1. Third Party Testing: Our code is regularly reviewed by university researchers, customer and user security teams, FIPS and certification labs, and our Open Source user base.  This helps put more eyes on our code and product architecture.

  1. Fuzz Testing: We test using several different software fuzzers, including an in-memory fuzzer, a network fuzzer, OSS-Fuzz, libFuzzer, tlsfuzzer, and AFL.  Fuzz testing bombards the program with invalid, unexpected, and random data, which lets us observe whether there are potential memory leaks or logic errors.  This allows us to catch bugs that could otherwise turn into vulnerabilities in a final release.

  1. Continuous Integration (CI): Leveraging Jenkins, we run tests on each commit submitted to the wolfSSL code repository.  Tests run on each commit include testing of our FIPS build, numerous build options (customer/user/common), running valgrind, and doing static analysis with scan-build.

  1. Nightly Test Cycle: Each night we run extended tests that last longer than the typical ones run during the work day.  These are more in-depth than our CI testing and put results in our engineers’ inboxes each morning.  Our nightly cycle includes extended build option testing on multiple platforms with multiple compilers, and extended fuzz testing.
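The test-vector approach described under Algorithm Testing can be illustrated with a known-answer test: feed a fixed input to the algorithm and compare the result with a published digest. The sketch below is in Python and exercises Python's `hashlib`, not wolfCrypt; the function name `run_known_answer_tests` is hypothetical. The two SHA-256 digests are the widely published values for the message "abc" and for the empty message.

```python
import hashlib

# (message, expected SHA-256 digest in hex)
KNOWN_ANSWERS = [
    (b"abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"),
    (b"", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"),
]


def run_known_answer_tests(hash_fn=lambda data: hashlib.sha256(data).hexdigest()):
    """Return the list of messages whose digest did not match the expected value."""
    failures = []
    for message, expected in KNOWN_ANSWERS:
        if hash_fn(message) != expected:
            failures.append(message)
    return failures
```

An empty return value means every vector matched. Passing a different `hash_fn` lets the same vectors be run against another implementation.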

If you have specific questions about how we test, please contact us at facts@wolfssl.com.  If you would like us to include your SSL/TLS or crypto implementation in our interop testing, please let us know!  Likewise, if you would like to include wolfSSL in your own test framework, we would be happy to discuss.

wolfSSL STM32F7 Support

We would like to announce that the wolfSSL embedded SSL library now has support for hardware-based cryptography and random number generation offered by the STM32F7.  Supported cryptographic algorithms include AES (CBC, CTR), DES (ECB, CBC), 3DES, MD5, and SHA1.  For details regarding the STM32F7 crypto and hash processors, please see the STM32F7 Hardware Abstraction Layer (HAL) and Low-layer drivers document (linked below).

If you are using the STM32F7 with wolfSSL, you can see substantial speed improvements when using the hardware crypto versus using wolfSSL’s software crypto implementation.  The following benchmarks were gathered from the wolfCrypt benchmark application (wolfcrypt/benchmark/benchmark.c) running on the STM32F777NI board (STM32F7) using the STM32F7 HAL on bare metal (No OS).

wolfSSL Software Crypto, Normal Big Integer Math Library

RNG               3 MB took 1.000 seconds,    3.149 MB/s
AES-Enc           6 MB took 1.000 seconds,    6.494 MB/s
AES-Dec           7 MB took 1.000 seconds,    6.519 MB/s
AES-GCM-Enc       3 MB took 1.004 seconds,    2.553 MB/s
AES-GCM-Dec       3 MB took 1.004 seconds,    2.553 MB/s
AES-CTR           7 MB took 1.000 seconds,    6.543 MB/s
CHACHA           16 MB took 1.000 seconds,   15.723 MB/s
CHA-POLY         10 MB took 1.000 seconds,   10.474 MB/s
3DES              1 MB took 1.008 seconds,    1.405 MB/s
MD5              24 MB took 1.000 seconds,   24.243 MB/s
POLY1305         42 MB took 1.000 seconds,   41.821 MB/s
SHA              14 MB took 1.000 seconds,   14.380 MB/s
SHA-224           8 MB took 1.000 seconds,    8.423 MB/s
SHA-256           8 MB took 1.000 seconds,    8.423 MB/s
SHA-384           2 MB took 1.000 seconds,    2.319 MB/s
SHA-512           2 MB took 1.000 seconds,    2.319 MB/s

STM32F7 Hardware Crypto, Normal Big Integer Math Library

RNG              6 MB took 1.000 seconds,    6.030 MB/s
AES-Enc         30 MB took 1.000 seconds,   30.396 MB/s
AES-Dec         30 MB took 1.000 seconds,   30.371 MB/s
AES-GCM-Enc     42 MB took 1.000 seconds,   42.261 MB/s
AES-GCM-Dec     33 MB took 1.000 seconds,   32.861 MB/s
AES-CTR         48 MB took 1.000 seconds,   47.827 MB/s
CHACHA          16 MB took 1.000 seconds,   15.747 MB/s
CHA-POLY        11 MB took 1.000 seconds,   10.522 MB/s
3DES            13 MB took 1.000 seconds,   12.988 MB/s
MD5             41 MB took 1.000 seconds,   40.894 MB/s
POLY1305        42 MB took 1.000 seconds,   41.846 MB/s
SHA             38 MB took 1.004 seconds,   38.202 MB/s
SHA-224         41 MB took 1.000 seconds,   41.309 MB/s
SHA-256         39 MB took 1.000 seconds,   39.111 MB/s
SHA-384          2 MB took 1.004 seconds,    2.310 MB/s
SHA-512          2 MB took 1.004 seconds,    2.310 MB/s

As the above benchmarks (and chart) show, the hardware-accelerated algorithms on the STM32F7 are significantly faster than their software counterparts.
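To put numbers on the comparison, the two benchmark listings can be parsed and the per-algorithm speedup computed as hardware throughput divided by software throughput. The following Python sketch is one way to do that; the function names `parse_rates` and `speedups` are hypothetical, the regular expression assumes the line layout shown in the listings above, and only four rows are copied in to keep the example short.

```python
import re

# Matches lines such as: "AES-Enc   6 MB took 1.000 seconds,   6.494 MB/s"
LINE = re.compile(r"^(\S+)\s+\d+ MB took [\d.]+ seconds,\s+([\d.]+) MB/s$")

SOFTWARE = """
AES-Enc           6 MB took 1.000 seconds,    6.494 MB/s
AES-GCM-Enc       3 MB took 1.004 seconds,    2.553 MB/s
SHA-256           8 MB took 1.000 seconds,    8.423 MB/s
SHA-512           2 MB took 1.000 seconds,    2.319 MB/s
"""

HARDWARE = """
AES-Enc         30 MB took 1.000 seconds,   30.396 MB/s
AES-GCM-Enc     42 MB took 1.000 seconds,   42.261 MB/s
SHA-256         39 MB took 1.000 seconds,   39.111 MB/s
SHA-512          2 MB took 1.004 seconds,    2.310 MB/s
"""


def parse_rates(report: str) -> dict:
    """Map algorithm name to throughput in MB/s, skipping lines that do not match."""
    rates = {}
    for line in report.splitlines():
        match = LINE.match(line.strip())
        if match:
            rates[match.group(1)] = float(match.group(2))
    return rates


def speedups(software: dict, hardware: dict) -> dict:
    """Hardware throughput divided by software throughput, per shared algorithm."""
    return {name: hardware[name] / software[name] for name in software if name in hardware}


if __name__ == "__main__":
    for name, factor in speedups(parse_rates(SOFTWARE), parse_rates(HARDWARE)).items():
        print(f"{name:12s} {factor:6.2f}x")
```

A factor close to 1, as for SHA-512 here, indicates that the two runs performed about the same for that algorithm.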

To enable STM32F7 hardware crypto and RNG support, define WOLFSSL_STM32F7 when building wolfSSL.  For a more complete list of defines which may be required, please see the WOLFSSL_STM32F7 define in <wolfssl_root>/wolfssl/wolfcrypt/settings.h.  You can find the most recent version of wolfSSL on GitHub, here: https://github.com/wolfssl/wolfssl.

If you would like to use wolfSSL with STM32F7 hardware-based cryptography or RNG, or have any questions, please contact us at facts@wolfssl.com for more information.

wolfSSL embedded SSL library

STM32: http://www.st.com/internet/mcu/class/1734.jsp

STM32F7 HAL and Low-layer drivers documentation: http://www.st.com/content/ccc/resource/technical/document/user_manual/45/27/9c/32/76/57/48/b9/DM00189702.pdf/files/DM00189702.pdf/jcr:content/translations/en.DM00189702.pdf

wolfSSL with PikeOS and ELinOS and TLS 1.3

Are you a user of PikeOS or ELinOS, and interested in a lightweight TLS 1.3 implementation?  The wolfSSL embedded SSL/TLS library now supports TLS 1.3 (drafts 18 and 20).  TLS 1.3 speeds up the establishment of TLS connections by reducing the required number of round trips during the TLS handshake (including a new 0-RTT option where applications can send application data in the first flight!).  It also increases security by removing old legacy algorithms in favor of new, secure, and performant ones.

If you aren’t familiar with these operating systems, here’s a quick summary via Wikipedia:

PikeOS:

“PikeOS is a microkernel-based real-time operating system made by SYSGO AG. It is targeted at safety and security critical embedded systems. It provides a partitioned environment for multiple operating systems with different design goals, safety requirements, or security requirements to coexist in a single machine.”

ELinOS:

“ELinOS is a commercial development environment for embedded Linux. It consists of a Linux distribution for the target embedded system and development tools for a development host computer. ELinOS provides embedded Linux as a standalone operating system or it can be integrated into the PikeOS virtualization platform if safety and security demands cannot be met by Linux alone.”

To learn more about how to use wolfSSL with TLS 1.3, you can visit our TLS 1.3 webpage, or contact us at facts@wolfssl.com!

NXP CAU, mmCAU, and LTC Hardware Cryptography with TLS 1.3

As you may know, wolfSSL includes support for offloading cryptography operations into NXP ColdFire and Kinetis devices that include the CAU, mmCAU, or LTC hardware crypto modules. Taking advantage of these modules improves performance of both the cryptography and the SSL/TLS layer running on top of it.

Here is a quick comparison of performance between software cryptography and the hardware-based cryptography offered by the Kinetis mmCAU on a K60 TWR running at 100MHz:

Algorithm      Software Crypto   Hardware Crypto

AES            0.49 MB/s         2.71 MB/s
DES            0.31 MB/s         3.49 MB/s
3DES           0.12 MB/s         1.74 MB/s
MD5            4.07 MB/s         4.88 MB/s
SHA-1          1.74 MB/s         2.71 MB/s
SHA-256        1.16 MB/s         2.22 MB/s
HMAC-SHA       1.74 MB/s         3.05 MB/s
HMAC-SHA256    1.22 MB/s         2.03 MB/s

And here are some benchmark comparisons between software cryptography and the hardware cryptography offered by the LTC module on an NXP FRDM-K82F, Cortex M4 running at 150 MHz:

Operation                        Software Crypto   Hardware Crypto

RNG                              0.136 MB/s        0.939 MB/s
AES enc                          0.247 MB/s        12.207 MB/s
AES dec                          0.239 MB/s        12.207 MB/s
AES-GCM                          0.016 MB/s        12.207 MB/s
AES-CTR                          0.247 MB/s        8.138 MB/s
AES-CCM                          0.121 MB/s        6.104 MB/s
CHACHA                           0.568 MB/s        3.052 MB/s
CHA-POLY                         0.444 MB/s        1.878 MB/s
POLY1305                         2.441 MB/s        8.138 MB/s
SHA                              0.842 MB/s        4.069 MB/s
SHA-256                          0.309 MB/s        2.713 MB/s
SHA-384                          0.224 MB/s        0.763 MB/s
SHA-512                          0.216 MB/s        0.698 MB/s
RSA 2048 public                  147.000 ms        12.000 ms         (over 1 iteration)
RSA 2048 private                 2363.000 ms       135.000 ms        (over 1 iteration)
ECC 256 key generation           355.400 ms        17.400 ms         (over 5 iterations)
EC-DHE key agreement             352.400 ms        15.200 ms         (over 5 iterations)
EC-DSA sign time                 362.400 ms        20.200 ms         (over 5 iterations)
EC-DSA verify time               703.400 ms        33.000 ms         (over 5 iterations)
CURVE25519 256 key generation    66.200 ms         14.400 ms         (over 5 iterations)
CURVE25519 key agreement         65.400 ms         14.400 ms         (over 5 iterations)
ED25519 key generation           25.000 ms         14.800 ms         (over 5 iterations)
ED25519 sign time                30.400 ms         16.800 ms         (over 5 iterations)
ED25519 verify time              74.400 ms         30.400 ms         (over 5 iterations)
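The LTC figures mix two kinds of measurement: throughput in MB/s, where higher is better, and time per operation in ms, where lower is better. A speedup factor therefore has to be computed in opposite directions for the two. The small Python helper below sketches this; the function name `speedup` is hypothetical, and the numbers in the usage lines are copied from the LTC comparison above.

```python
def speedup(software: float, hardware: float, unit: str) -> float:
    """Return how many times faster the hardware result is than the software one.

    unit "MB/s": throughput, so the factor is hardware / software.
    unit "ms":   time per operation, so the factor is software / hardware.
    """
    if unit == "MB/s":
        return hardware / software
    if unit == "ms":
        return software / hardware
    raise ValueError(f"unknown unit: {unit}")


if __name__ == "__main__":
    # Values taken from the LTC comparison above.
    print(f"AES-GCM          {speedup(0.016, 12.207, 'MB/s'):.1f}x")
    print(f"RSA 2048 private {speedup(2363.0, 135.0, 'ms'):.1f}x")
```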

Did you know that wolfSSL now also supports TLS 1.3? Users can take advantage of this new protocol version for even better TLS connection performance!

TLS 1.3 includes several improvements over TLS 1.2, including reducing the number of round trips required to perform a full handshake, and repurposing the ticketing system to allow for servers to be stateless. These changes mean better performance on Freescale/NXP CAU, mmCAU, and LTC-based devices, and lower memory usage on those devices acting as a TLS server.

To learn more about using TLS 1.3 in wolfSSL, visit our TLS 1.3 webpage today!