This informative write-up was done by Deja Associate Security Consultant, Soumil Deshpande, who came to us from the world-class Information Assurance MS program at Northeastern University in Boston.
What is SSL/TLS?
Secure Sockets Layer (SSL)/Transport Layer Security (TLS) are widely used security protocols that provide security features such as confidentiality, integrity, and authentication of data transferred between two nodes on the internet. SSL/TLS also provides other features, such as perfect forward secrecy and authentication of extra application data, in some of its configurations.
Put simply, SSL/TLS is used to send encrypted data over the public internet. It checks if the data sent was tampered with, and if the data came from the correct source. It also makes sure that if an encryption key is leaked, no previous messages can be decrypted with the leaked key, leaving only the current session vulnerable.
Clients and servers achieve these cryptographic properties by agreeing to certain cryptographic primitives during the beginning of the SSL/TLS session, using the “client hello” and “server hello” messages. This step is referred to as the handshake protocol.
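To see this negotiation from the client side, Python's standard ssl module can list the cipher suites a client context would offer in its "client hello." This is only a sketch; the exact list depends on the local OpenSSL build.

```python
import ssl

# Build a default client context and list the cipher suites it is
# willing to offer during the handshake.
context = ssl.create_default_context()
ciphers = context.get_ciphers()

for c in ciphers[:5]:
    # Each entry describes one suite: its name, protocol version, etc.
    print(c["name"], c["protocol"])
```

Running this on a typical system shows a mix of TLS 1.3 suites and TLS 1.2 ECDHE-based suites, ordered by the library's preference.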
What are these cryptographic primitives?
Cryptographic primitives in SSL/TLS consist of three main components:
The key exchange and signing/authentication algorithm
The symmetric key cipher (with key size and mode of operation)
The hashing algorithm (message authentication)
Here’s a sample “cipher suite” in SSL/TLS which uses these components:
TLS_DHE_RSA_WITH_AES_128_CBC_SHA256
(Note that all TLS cipher suite names are prefixed with TLS_)
In this example, we can see the following details about the cryptographic algorithms that will be used for this TLS session:
Diffie-Hellman (DHE) is the algorithm used for key exchange
RSA is the algorithm used for authenticating server/client identity
AES is the symmetric key cipher, used with a 128-bit key and Cipher Block Chaining (CBC) as its mode of operation
SHA256 is used as the hashing function for message integrity and authentication
These are the main components of cryptographic security in all SSL/TLS sessions. Next, let’s take a deeper look at each element of the cipher suite and discuss in detail the best security recommendations for each field.
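As a quick illustration of how these fields pack into a suite name, here is a small helper (parse_cipher_suite is our own hypothetical name, not a TLS library function) that splits a suite string into its four components:

```python
def parse_cipher_suite(suite: str) -> dict:
    """Split a TLS cipher suite display name into its components.

    Hypothetical helper for illustration; real TLS stacks negotiate
    suites by registered code points, not by parsing display names.
    """
    # Strip the mandatory TLS_ prefix, then split around _WITH_:
    # the left side names key exchange + authentication, the right
    # side names the cipher/mode and the hash.
    body = suite.removeprefix("TLS_")
    kx_auth, cipher_hash = body.split("_WITH_")
    kx, auth = kx_auth.split("_", 1)
    parts = cipher_hash.split("_")
    return {
        "key_exchange": kx,               # e.g. DHE or ECDHE
        "authentication": auth,           # e.g. RSA
        "cipher": "_".join(parts[:-1]),   # e.g. AES_128_CBC
        "hash": parts[-1],                # e.g. SHA256
    }

print(parse_cipher_suite("TLS_DHE_RSA_WITH_AES_128_CBC_SHA256"))
```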
Key exchange algorithms
The Diffie-Hellman (DH) key exchange algorithm is the most popular key exchange algorithm and is used in almost all implementations of SSL/TLS. However, there are many forms of the Diffie-Hellman algorithm that can have different properties. The main property we’ll be looking at is “perfect forward secrecy.”
Perfect forward secrecy is a feature which gives assurance that session keys will not be compromised even if the private key of the server is. Put another way, perfect forward secrecy protects past sessions from any future compromise of private keys or passwords. This is done with ephemeral parameters, which are used for a single key exchange and are never stored on disk. Thus, even if attackers gain access to the victim’s private key and other long-term parameters, they still cannot retrieve the ephemeral parameters which were used to generate past session keys; therefore, they cannot decipher past sessions and their messages.
The most commonly used variants of Diffie-Hellman key exchange which have this property are:
DHE (Ephemeral Diffie-Hellman key exchange)
ECDHE (Elliptic Curve Diffie-Hellman key exchange)
The Ephemeral Diffie-Hellman key exchange improves on regular Diffie-Hellman in that the secret exponents used for the key exchange are freshly generated for every new key exchange that takes place, which ensures perfect forward secrecy.
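The ephemeral idea can be sketched with a toy finite-field Diffie-Hellman exchange. The prime below is deliberately tiny for readability; real TLS groups are 2048 bits or more, and nothing here is secure as written.

```python
import secrets

# Toy Diffie-Hellman parameters (illustrative only; far too small
# to be secure in practice).
P = 0xFFFFFFFB  # a small prime modulus (2**32 - 5)
G = 5           # generator

def new_session_key() -> int:
    # Fresh ("ephemeral") secrets are drawn for every session and
    # discarded afterwards, so no long-lived key can decrypt a
    # recorded session later.
    a = secrets.randbelow(P - 2) + 1   # client's ephemeral secret
    b = secrets.randbelow(P - 2) + 1   # server's ephemeral secret
    A = pow(G, a, P)                   # client's public value
    B = pow(G, b, P)                   # server's public value
    shared_client = pow(B, a, P)       # both sides compute G^(ab) mod P
    shared_server = pow(A, b, P)
    assert shared_client == shared_server
    return shared_client

# Two sessions derive independent keys; compromising one session
# (or a long-term signing key) reveals nothing about the other.
k1, k2 = new_session_key(), new_session_key()
```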
The Elliptic Curve Diffie-Hellman key exchange is similar to the DHE discussed above, but it’s performed over elliptic curves. This is done by sending the base point on the curve (public key) and the description of the curve as parameters.
It is a good security recommendation to use ECDHE over DHE: it fulfills perfect forward secrecy, and it is much faster because elliptic-curve operations achieve equivalent security with far smaller key sizes. The performance of a few common key exchange algorithms is shown in the diagram below (the x-axis is time; lower is better):
Signing/authentication algorithms
Authentication algorithms verify that a message came from the intended sender. In cryptography, this is mainly done using signatures: deterministic strings of data sent with the message, which are generated from that same message and a secret key.
RSA is the most prevalent algorithm used for this. RSA is an asymmetric cipher that generates a public key and a private key. In this public key infrastructure, the public key is readily available to everyone, while the private key is known only to the owner of the key pair. The keys in RSA are reversible in nature: data encrypted with the private key can be decrypted with the public key. “Signing” exploits this property: the owner encrypts a message (typically the hash of the original message) with their private key, and anyone can use the owner’s matching public key to decrypt it and verify the contents. This assures the message came from the correct owner, as no other key pair would produce a signature that the owner’s public key can decrypt.
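The sign/verify relationship can be demonstrated with a textbook RSA toy, using tiny fixed primes. This is purely illustrative: real keys are 2048 bits or more and use padding schemes such as PSS, and textbook RSA like this must never be used in production.

```python
import hashlib

# Toy RSA key generation with tiny, fixed primes (illustration only).
p, q = 61, 53
n = p * q                 # public modulus (3233)
e = 17                    # public exponent
phi = (p - 1) * (q - 1)   # Euler's totient (3120)
d = pow(e, -1, phi)       # private exponent = modular inverse of e

def sign(message: bytes) -> int:
    # Hash the message, reduce the digest into the modulus, then
    # "encrypt" it with the private key to produce the signature.
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(h, d, n)

def verify(message: bytes, signature: int) -> bool:
    # Anyone holding the public key (n, e) can recover the digest
    # from the signature and compare it with a fresh hash.
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(signature, e, n) == h

sig = sign(b"hello")
assert verify(b"hello", sig)
```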
The de facto standard is to use RSA with 2048 or 4096 bits (a smaller key size means faster operations). It is a good security recommendation to use at least 2048-bit keys for typical TLS deployments, and 4096-bit RSA keys for sensitive data.
The security equivalency of RSA keys to symmetric keys is as follows: a 1024-bit RSA key is equivalent to an 80-bit symmetric key, a 2048-bit RSA key to a 112-bit symmetric key, and a 3072-bit RSA key to a 128-bit symmetric key. It is estimated that in the year 2030 it would take 100,000 years to crack a 112-bit key (2048-bit RSA key); or, assuming significant improvements in integer factorization algorithms, a $10 million computer in 2030 could factor a 2048-bit RSA key in as little as 5 months. Though the security lifetime of a key size is not predetermined and depends on technological progress, a 2048-bit key should provide enough security until at least 2030. Key sizes greater than 2048 bits should be used if security is needed beyond 2030.
Symmetric key cipher, key size, and mode of operation
Advanced Encryption Standard (AES) is the most popular symmetric key cipher used by a majority of applications. Replacing DES (Data Encryption Standard), AES supports a larger key size and is at least six times faster. Therefore, using AES as a block cipher in TLS is an effective security practice.
Advanced Encryption Standard (AES) is an iterative block cipher based on a substitution-permutation network, unlike its predecessor DES, which was based on a Feistel network. It works by replacing specific inputs with specific outputs (substitution) and shuffling bits around (permutation). AES performs all its operations on bytes rather than bits; thus, a 128-bit block of plaintext is treated as 16 bytes of plaintext by AES. Unlike DES, the number of rounds of encryption and decryption depends on the key length: 10 rounds for 128-bit keys, 12 rounds for 192-bit keys, and 14 rounds for 256-bit keys.
Advantages of using AES include:
Full specification and design details are publicly available
Expandable key sizes, e.g. 128/192/256-bit keys
Faster performance and stronger encryption than DES and Triple-DES
Easily implemented in popular programming languages like C and Java
Key sizes in AES
For all practical purposes, a 128-bit key is sufficient to make brute-force attacks computationally infeasible. Larger key sizes exist mainly to add a security margin for sensitive data, or to satisfy U.S. military regulations and other information security standards that require heightened security.
Although larger key sizes come with additional security, they do require significant CPU overhead (+20% for 192-bit keys and +40% for 256-bit keys). As discussed in the previous section about AES, the number of rounds of encryption and decryption increases with the key size, which makes each encryption/decryption operation more time-consuming than its smaller-key counterpart.
An advantage of larger key sizes is that they resist future quantum computing attacks. For example, a 256-bit key can still provide security equivalent to a 128-bit key against a quantum adversary and remain uncrackable, whereas a 128-bit key could be cracked in roughly 2^64 quantum operations. Thus, AES-256 is recommended for TLS implementations that handle very sensitive data. For any other typical TLS implementation, AES-128 provides enough security to be functionally unbreakable.
What makes AES post-quantum secure?
Biclique cryptanalysis is likely the best-known viable attack on AES; it reduces the security of AES-256 from 2^256 to 2^254.4, a trivial difference. The best-known theoretical quantum attack on AES is Grover's quantum search algorithm. Grover's algorithm lets us search an unsorted space of n entries in roughly sqrt(n) operations. AES-128 can be broken and AES-192 can potentially be broken using this approach, but AES-256 remains secure for the long term. However, no key size is safe indefinitely: with advances in computing power and algorithms, this estimate can change. Nonetheless, we can say with reasonable certainty that AES-256 is post-quantum secure.
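The square-root speedup is the whole story here, and the arithmetic is easy to check: Grover halves the effective key length, so a 2^k keyspace shrinks to 2^(k/2) quantum operations.

```python
from math import isqrt

# Grover's algorithm searches n possibilities in about sqrt(n) steps,
# so a brute-force keyspace of 2^k effectively shrinks to 2^(k/2).
assert isqrt(2**256) == 2**128  # AES-256: a 128-bit post-quantum margin
assert isqrt(2**128) == 2**64   # AES-128: only a 64-bit margin remains
```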
Mode of operation
As outlined earlier, the AES algorithm works with a specific block size of 128-bits. Thus, any plaintext which exceeds 128-bits needs to be divided into chunks of 128-bits each, and padding needs to be added if the last block is not exactly 128-bits long. Separate AES operations need to be performed on each block of plaintext. A “mode of operation” describes how to repeatedly apply this single block AES operation to securely transform amounts of data larger than the block size of 128-bits.
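The block-and-padding step can be sketched with PKCS#7-style padding, a common scheme for filling the final block (an assumption for illustration; TLS CBC records use their own, similar padding format):

```python
BLOCK = 16  # AES block size in bytes

def pad(data: bytes) -> bytes:
    # Append n bytes, each of value n, so the length becomes a
    # multiple of the block size. A full padding block is added when
    # the data already fits exactly, so unpadding stays unambiguous.
    n = BLOCK - (len(data) % BLOCK)
    return data + bytes([n]) * n

def unpad(data: bytes) -> bytes:
    # The last byte tells us how many padding bytes to strip.
    return data[:-data[-1]]

padded = pad(b"seventeen bytes!!")  # 17 bytes -> two 16-byte blocks
assert len(padded) % BLOCK == 0
assert unpad(padded) == b"seventeen bytes!!"
```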
The two most common and secure modes of operations are:
Cipher Block Chaining (CBC)
In Cipher Block Chaining, the previous block of ciphertext is XOR-ed with the current block of plaintext before encryption. This is called the “chaining property” of this mode of operation, as it links each block to the one before it. This chaining makes the output blocks appear pseudo-random, which makes the ciphertext less vulnerable to statistical attacks (such as frequency analysis), as each modified bit of plaintext affects every subsequent block of ciphertext output. Since no previous ciphertext block exists to be XOR-ed with the first plaintext block, we use an Initialization Vector (IV), a random number (nonce) that can be publicly transferred over the internet without affecting the security of the encryption scheme. It’s important to note here that CBC mode only gives us confidentiality (encryption).
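The chaining step is easiest to see in code. The sketch below builds CBC on a stand-in "block cipher" (XOR with the key) so the chaining logic stays visible; XOR is not a secure cipher, and in real TLS the per-block operation is AES.

```python
import secrets

BLOCK = 16

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    out, prev = [], iv
    for block in blocks:
        # Chaining: XOR the previous ciphertext block (or the IV for
        # the first block) into the plaintext before "encrypting".
        ct = xor(xor(block, prev), key)
        out.append(ct)
        prev = ct
    return out

def cbc_decrypt(key: bytes, iv: bytes, blocks: list) -> list:
    out, prev = [], iv
    for ct in blocks:
        out.append(xor(xor(ct, key), prev))
        prev = ct
    return out

key = secrets.token_bytes(BLOCK)
iv = secrets.token_bytes(BLOCK)
pt = [b"A" * BLOCK, b"A" * BLOCK]      # two identical plaintext blocks
ct = cbc_encrypt(key, iv, pt)
assert ct[0] != ct[1]                  # chaining hides the repetition
assert cbc_decrypt(key, iv, ct) == pt
```

Note how two identical plaintext blocks produce different ciphertext blocks, which is exactly the statistical property the paragraph above describes.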
Galois Counter Mode (GCM)
Galois Counter Mode (GCM) is a mode of operation on block ciphers–like AES–that provides both data authenticity (integrity) and confidentiality. GCM works similarly to the regular counter mode where blocks are numbered sequentially, and every block number is combined with an Initialization Vector (IV) and encrypted with AES. The result of this operation is then XOR-ed with the plaintext to obtain the ciphertext. It is important to use different Initialization Vectors (IV) for each encrypted stream. The resulting ciphertext blocks are then treated as coefficients of a polynomial, which are then used to perform multiplication operations in the Galois Field (finite field). The result of this is encrypted, which produces an authentication tag that can be used to verify data integrity.
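The counter-mode core of GCM can be sketched as follows, with SHA-256 standing in for the AES block encryption (an illustrative substitution; GCM proper encrypts IV-plus-counter blocks with AES and adds the Galois-field authentication tag on top):

```python
import hashlib

def ctr_keystream(key: bytes, iv: bytes, n_blocks: int):
    for counter in range(n_blocks):
        # Each keystream block depends only on the key, the IV, and
        # its own counter value, which is why blocks can be computed
        # in parallel and out of order.
        yield hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()

def ctr_crypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    out = bytearray()
    blocks = ctr_keystream(key, iv, len(data) // 32 + 1)
    block = b""
    for i, byte in enumerate(data):
        if i % 32 == 0:
            block = next(blocks)
        out.append(byte ^ block[i % 32])
    return bytes(out)

key, iv = b"k" * 16, b"unique-iv-per-msg"
ct = ctr_crypt(key, iv, b"counter mode is symmetric")
# XOR-ing with the same keystream decrypts: the same call undoes itself.
assert ctr_crypt(key, iv, ct) == b"counter mode is symmetric"
```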
Why pick Galois Counter Mode over Cipher Block Chaining?
Insecure implementations of CBC mode can be vulnerable to known attacks such as Padding Oracle
In theory, there is an information disclosure in CBC mode if two plaintext blocks encrypt to the same ciphertext blocks
AES-GCM can be computed in parallel (because it essentially works like a stream cipher), which means throughput is significantly higher than AES-CBC mode
AES-GCM operation can be carried out in parallel for both encryption and decryption, which makes it easier to optimize
Each block of AES-GCM can be encrypted/decrypted individually
AES-GCM provides data authenticity (integrity) along with confidentiality (encryption) by generating a Galois Message Authentication Code (GMAC tag) along with the final ciphertext message
AES-GCM also supports sending additional authentication data along with encrypted ciphertext, which can also be verified by the GMAC tag
Why pick Cipher Block Chaining over Galois Counter Mode?
CBC came out before GCM, and is supported by legacy devices and hardware
CBC can perform faster in scenarios where authentication is not required
As GCM is based on CTR mode, it inherits the many-time-pad problem: if the IV is reused, XOR-ing ciphertexts can reveal significant information about the plaintexts, since the IV makes up a critical part of the CTR mode structure. But if the IV is reused in CBC mode, the only thing an attacker can detect is equality of message prefixes, which does not give away any significant information.
CBC can be faster in some hardware implementations
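The many-time-pad pitfall mentioned above is easy to demonstrate: reusing the same key and IV in a counter-mode cipher means both messages are XOR-ed with the same keystream, so XOR-ing the two ciphertexts cancels the keystream entirely. SHA-256 again stands in for the block encryption here, as an illustrative substitution.

```python
import hashlib

def keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    # A minimal CTR-style cipher: XOR the data with a keystream
    # derived from (key, iv, counter).
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

key, iv = b"k" * 16, b"reused-iv"
p1, p2 = b"attack at dawn!", b"retreat at dusk"
c1 = keystream_xor(key, iv, p1)
c2 = keystream_xor(key, iv, p2)  # same key AND same IV: unsafe!

# The keystream cancels out, leaking the XOR of the two plaintexts.
leaked = bytes(a ^ b for a, b in zip(c1, c2))
expected = bytes(a ^ b for a, b in zip(p1, p2))
assert leaked == expected
```

With known structure in either plaintext (headers, greetings, fixed fields), that XOR is often enough to recover both messages, which is why GCM demands a unique IV per encryption.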
Disadvantages of AES include:
Every block in AES is encrypted the same way. Because of this, insecure configurations of AES, e.g. ECB mode or IV reuse in CTR mode, can leak information about the plaintext.
AES is a complicated algorithm with many iterations, so it can be hard to implement in software from scratch
As AES inherently is a symmetric cipher, both the sender and receiver need to have the key before any communication can occur. If this key is compromised by any means, all encrypted text from that implementation is compromised.
The hashing algorithm/message authentication
In cryptography, a hash function is a one-way function which takes an input message and returns a fixed-size alphanumeric string. This string is called the “hash” value or “message digest.” The hash function is cryptographically secure if it has these properties:
Pre-image resistance: Given a hash, it should be computationally infeasible to retrieve the original plaintext input
Collision resistance: It should be infeasible to find any two distinct input messages that produce the same hash value (and, given one input message, it should be infeasible to find another message with the same hash)
Avalanche effect: Changing a single bit in the input message to the hash function should change a large number of bits in the resulting hash
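The avalanche effect is easy to observe with the standard library: flip a single bit of the input and count how many of SHA-256's 256 output bits change (roughly half, on average).

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    # Count the number of bit positions where the two digests differ.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

h1 = hashlib.sha256(b"message").digest()
h2 = hashlib.sha256(b"messagd").digest()  # last byte differs by one bit

flipped = bit_diff(h1, h2)
print(f"{flipped} of 256 output bits changed")
```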
The hash functions used in SSL/TLS are in the SHA (Secure Hashing Algorithm) family of hash functions. The SHA family contains two main categories: SHA1 and SHA2.
SHA1 has been proven to be cryptographically insecure since Google announced the first SHA1 collision on February 23, 2017. Therefore, under no circumstances should SHA1 be used with TLS/SSL as a hashing algorithm.
The SHA2 family of hashes contains six hash functions: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256. If a hash function is cryptographically secure, the only feasible attack is to brute-force to find hash collisions. Due to the large output sizes of SHA2 hashes, it is computationally infeasible to find collisions even in the average-case difficulty of the problem. It is a good security recommendation to use SHA2 hashes for message authentication–especially SHA-256 as it has sufficient output length and it can be computed reasonably fast.
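Message authentication in TLS records is built from these hashes via HMAC; the CBC cipher suites ending in SHA256 use HMAC-SHA256 for record integrity (GCM suites use the GMAC tag instead). The standard library sketch:

```python
import hashlib
import hmac

key = b"shared-secret-key"
message = b"transferred data"

# Compute an HMAC-SHA256 tag over the message with a shared key.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # compare_digest avoids timing side channels in the comparison.
    return hmac.compare_digest(expected, tag)

assert verify(key, message, tag)
assert not verify(key, b"tampered data", tag)
```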
TLS/SSL is a robust and very secure protocol if configured with the proper cryptographic primitives. The key exchange algorithm, signing and authentication algorithm, symmetric key cipher, and hashing algorithm are the crux of SSL/TLS security. The general recommendation for a cipher suite with reasonable security and performance is “TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384” (in the cipher suite registry, AES-256-GCM is paired with SHA-384). A few parameters can be changed to increase or decrease performance according to need, but always keep in mind the level of security a given configuration provides and the sensitivity of the data being transferred.