What is data encryption?
Computer encryption is based on the science of cryptography, which has been used for as long as humans have wanted to keep information secret. Most forms of cryptography in use today rely on computers, simply because a human-based code is too easy for a computer to crack. Cryptosystems use a set of procedures known as cryptographic algorithms, or ciphers, to encrypt plaintext messages into ciphertext and to decrypt ciphertext back into plaintext.
In 1883, Auguste Kerckhoffs asserted that encryption methods should be publicly disclosed, while the "keys" should remain confidential. This is now called Kerckhoffs's principle. Computer encryption systems generally belong in one of two categories: symmetric encryption and asymmetric or public-key encryption.
What is symmetric encryption?
In symmetric encryption, the sender and receiver each hold a copy of the same secret key and use it to both encrypt and decrypt messages. The confidentiality of this key is pivotal. Providing a secure method for key distribution presents a notable challenge in symmetric encryption, commonly known as the "key distribution problem." Losing or misplacing the key has dire consequences: if it ends up with malicious individuals, they can decrypt the messages.
The primary benefit of symmetric cryptography is its speed compared to asymmetric cryptography. However, the major drawbacks of symmetric encryption include challenges in key distribution and key management. As the number of users increases, the number of necessary keys also rises. Handling the growing count of secret keys evolves into what's known as the "key management problem."
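To make the key management problem concrete, here is a small worked calculation. It assumes that every pair of users needs its own secret key, which is an assumption made for illustration and not something the text above specifies. The function names are hypothetical.

```python
def symmetric_keys_needed(users: int) -> int:
    """Number of distinct shared keys if every pair of users has its own key."""
    return users * (users - 1) // 2


def asymmetric_key_pairs_needed(users: int) -> int:
    """With public-key encryption, each user needs just one key pair."""
    return users


for n in (2, 10, 100, 1000):
    print(n, symmetric_keys_needed(n), asymmetric_key_pairs_needed(n))
```

Under this assumption the number of symmetric keys grows roughly with the square of the number of users, while the number of key pairs grows linearly.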
What is asymmetric encryption?
Connecting to a website on the public internet is more complicated. Symmetric encryption by itself won’t work, because you don’t control the other end of the connection. How do two parties share a secret key without the risk of someone on the internet intercepting it in transit? In November 1976, Diffie and Hellman published a paper in the IEEE Transactions on Information Theory journal called "New Directions in Cryptography." This paper tackled that pressing issue and proposed a resolution: public-key encryption.
Also known as asymmetric encryption, public-key cryptography is used to assure the confidentiality, authenticity and non-repudiation of electronic communications and data storage. Public-key encryption uses a pair of mathematically related keys: a private key and a public key. The private key must remain confidential to its owner, while the public key is made available to everyone via a publicly accessible repository or directory. A sender encrypts a message with the recipient's public key, and only the recipient's matching private key can decrypt it. Because the public key is available to everyone, anyone can send an encrypted message, but someone intercepting it would still be unable to decipher it without the corresponding private key.
In schemes such as RSA, the key pair is derived from very large prime numbers. The public and private keys are calculated together through a single mathematical procedure that relies on "trapdoor" functions. The defining feature of a trapdoor function is that it is easy to compute in one direction but hard to reverse (that is, to determine its inverse) without a specific piece of secret knowledge.
The main disadvantage of asymmetric encryption is that it is slow compared with symmetric encryption. Its mathematical complexity requires much more computing power, so it is not well suited to long sessions or large volumes of data.
Advantages and disadvantages of symmetric vs asymmetric encryption
While asymmetric encryption is often recognized as being more advanced than symmetric encryption, organizations still use both cryptographic techniques in their security strategies. For example, symmetric encryption is ideal for maximizing the speed of bulk data encryption or securing communication within closed systems. On the other hand, asymmetric encryption is more beneficial for open systems where the priority is securing key exchanges, digital signatures and authentication.
Because each type of encryption supports specific business requirements, organizations have to decide when to prioritize speed, security, or other relevant factors such as key distribution.
- Speed. Symmetric encryption has the advantage of being faster than asymmetric encryption, as it requires less computational power for both encryption and decryption. This is largely because symmetric algorithms rely on simpler operations and much shorter keys than asymmetric algorithms. And because symmetric encryption requires only one key, the entire encryption process is faster, making it suitable for encrypting large amounts of data. Asymmetric encryption does not share these advantages, so it can be less efficient and possibly create performance issues when network processes get bogged down trying to encrypt or decrypt communications. This can result in slow processes and issues with memory capacity.
- Security. Asymmetric encryption is considered more secure because it uses two different keys: a public key, which is used to encrypt communications, and a private key, which is used to decrypt them. Because the private key never needs to be shared, it acts as the safeguard that ensures that only the intended recipient is able to decrypt encoded communications. Key pairs also enable digital signatures, which make tampering detectable and make it harder for attackers to compromise the system. On the flip side, symmetric encryption is a bit riskier because it uses the same key to encrypt and decrypt communications, which means the key must be shared with anyone who needs to decrypt them. Every time the key is shared, it risks being intercepted by a malicious third party.
- Simplified key distribution. Because symmetric encryption uses the same key for both encryption and decryption, secure key distribution is crucial. The key distribution process is simpler with asymmetric encryption, because only the public key is shared, while the private key remains confidential.
Use cases for symmetric encryption
Banking Sector. Due to the better performance and faster speed of symmetric encryption, symmetric cryptography is typically used for bulk encryption of large amounts of data. Applications of symmetric encryption in the banking sector include:
- Payment applications, such as card transactions where personally identifiable information (PII) needs to be protected to prevent identity theft or fraudulent charges without consuming excessive resources. This helps lower the risk involved in dealing with payment transactions on a daily basis.
- Validations to confirm that the sender of a message is who he claims to be.
Data at rest. Data at rest refers to information that isn't actively being transferred between devices or networks, such as data saved on hard drives, laptops, flash drives, or stored/archived in other forms. The goal of protecting data at rest is to safeguard inactive data wherever it resides. Even though data at rest might be perceived as less exposed than data in transit, for attackers, it often represents a more enticing target. To safeguard data at rest, companies can opt to encrypt individual sensitive files before storage or decide to encrypt the entire storage medium.
Encrypting data at rest is most effectively done through whole disk or full disk encryption. Full disk encryption offers several advantages over standard file or folder encryption and encrypted vaults. It ensures almost everything, including swap space and temporary files, is encrypted. Encrypting these aspects is vital since they might expose sensitive information. In software-based methods, however, the bootstrapping code remains unencrypted. For instance, BitLocker Drive Encryption uses an unencrypted volume for booting, while the volume with the operating system is entirely encrypted. Moreover, the choice of which specific files to encrypt isn't dependent on user judgment, eliminating potential oversights or reluctance in encrypting crucial data.
Use case for asymmetric encryption: Digital signatures
As businesses and organizations transition from paper documents bearing ink signatures or authenticity markers, digital signatures offer enhanced guarantees regarding the origin, identity, and status of an electronic document. Additionally, they affirm a signatory's informed consent and endorsement.
Digital signatures serve to identify any unauthorized alterations to data and verify the identity of the individual signing. Moreover, the recipient of the signed data can present the digital signature as proof to a third party that the said signature genuinely originated from the alleged signatory. This characteristic is termed non-repudiation, as it prevents the signatory from denying the signature at a future point.
The Digital Signature Standard was proposed by NIST and is defined in FIPS 186-4. Digital signatures employ asymmetric cryptography and provide a layer of validation and security for messages sent through a non-secure channel. They enforce the concepts of authentication, integrity, and non-repudiation.
To create a digital signature and use it along with a message between two clients, Alice and Bob, the following steps are followed:
- The message that has to be digitally signed by Alice is hashed, creating a message digest. Hashing is a method employed to maintain data integrity. A hash function takes a message of any length and maps it to a fixed-length value, termed the message digest. Cryptographic hash functions are one-way, meaning it is computationally infeasible to retrieve the original message from the message digest.
- The message digest is encrypted using Alice's private key, resulting in a digital signature.
- The digital signature is then appended to the message and transmitted to Bob.
- Upon receiving the message, Bob uses Alice's public key to decode the digital signature, producing a message digest.
- Bob also processes the message through hashing, leading to the creation of another message digest.
- If the message digests from the two previous steps match, Bob can confidently ascertain that Alice signed the message and that its content remains unaltered. Any discrepancy in the hash values would indicate that the message was tampered with.
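The steps above can be sketched in a few lines of Python. This is a toy illustration, not a real signature scheme: it uses unpadded "textbook" RSA with a deliberately small key built from two known Mersenne primes, and it reduces the SHA-256 digest modulo the key so it fits. Real systems use much larger random primes and a padding scheme. All names are hypothetical, and `pow(e, -1, m)` assumes Python 3.8 or later.

```python
import hashlib

# Alice's toy key pair (assumption: tiny, fixed primes purely for illustration).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent


def digest_as_int(message: bytes) -> int:
    """Hash the message and map the digest into the range of the toy key."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Alice: transform the message digest with her private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Bob: recover the digest with Alice's public key and compare it
    with the digest he computes himself."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"Transfer 100 EUR to Bob"
signature = sign(message)
print(verify(message, signature))                      # signature matches
print(verify(b"Transfer 900 EUR to Bob", signature))   # altered message
```

Only the holder of the private exponent can produce a value that verifies under the public key, which is what ties the signature to Alice.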
Digital signatures are intended for use in electronic mail, electronic funds transfer, electronic data interchange, software distribution, data storage, and other applications that require data integrity assurance and data origin authentication.
Use case for both asymmetric and symmetric encryption: Messaging applications
In end-to-end encryption, only the data is encrypted. The headers, trailers, and routing information are not. In messaging applications such as WhatsApp, end-to-end encryption is based on the Signal Protocol, designed by Open Whisper Systems. This encryption protocol is crafted to stop third parties and the messaging service provider from accessing the actual content of messages or calls. Additionally, if a user's encryption keys are ever breached, they can't be employed to decrypt messages sent in the past.
Messaging end-to-end encryption is implemented using both asymmetric and symmetric cryptography. Asymmetric encryption is used to initialize the encrypted conversation between two users, and symmetric encryption is used for the duration of the communication. The WhatsApp Encryption Overview White Paper provides the details.
Once the application is installed on a user’s smartphone, the public keys of the client are registered with the application server. The private key is not stored on the server and remains secret on the user’s device. A client that wants to initiate a session retrieves the recipient's public keys from the WhatsApp server. Using these keys, the initiator encrypts the first message and sends it to the recipient. This message contains the parameters for establishing a symmetric session key. The recipient uses their own private key to decrypt the message. “Once a session has been established, clients exchange messages that are protected with a Message Key using AES256 in CBC mode for encryption and HMAC-SHA256 for authentication.” The encrypted session needs to be re-created only when the device is changed or when the application software is re-installed.
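As a rough illustration of the authentication half of that quote, the sketch below computes and checks an HMAC-SHA256 tag using Python's standard library. It is not the Signal or WhatsApp implementation: the function names, the key handling, and the choice to compute the tag over an opaque ciphertext are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def make_tag(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Compute an HMAC-SHA256 authentication tag over the ciphertext."""
    return hmac.new(mac_key, ciphertext, hashlib.sha256).digest()


def is_authentic(mac_key: bytes, ciphertext: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(mac_key, ciphertext), received_tag)


mac_key = secrets.token_bytes(32)        # shared by both clients in this sketch
ciphertext = b"opaque encrypted bytes"   # stands in for AES-encrypted content
tag = make_tag(mac_key, ciphertext)

print(is_authentic(mac_key, ciphertext, tag))            # untouched message
print(is_authentic(mac_key, ciphertext + b"x", tag))     # modified in transit
```

A receiver that holds the same key can detect any change to the ciphertext before attempting to decrypt it.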
Use case for both asymmetric and symmetric encryption: HTTPS
While the previous applications were focused on user identities, HTTPS is used for machine identification. In a highly connected world where vast amounts of sensitive data travel across the internet every day, securing the communication channels between clients/browsers and servers is of the utmost importance.
HTTPS is an application layer protocol in the TCP/IP suite: it is HTTP running on top of the SSL/TLS security protocol. An HTTPS connection between a client and a server employs both types of encryption. Asymmetric encryption is used first to establish the connection, and is then replaced with symmetric encryption for the duration of the connection (called the session). A session key is a unique symmetric key employed for a single session of encryption and decryption. These keys are generated randomly and designated for a specific session only. This is how HTTPS works in simple steps:
- To facilitate a secure conversation between the server and client, a TLS certificate must be generated and authenticated by the Certificate Authority (CA).
- The browser then sends a ClientHello message and indicates that it would like to initiate a conversation with the secure server. The ClientHello message contains all the information the server needs in order to connect to the client via TLS, including the various cipher suites and maximum TLS version that it supports.
- The server responds with a ServerHello message which includes the TLS version to be used, the server TLS certificate, and the server’s asymmetric public key.
- The browser verifies the server certificate, and creates a random session key.
- The session key is encrypted using the server’s public key and is sent back to the server.
- The server decrypts the session key with its own private key.
- Now both parties have the session key. The public key encryption is terminated and replaced with symmetric encryption. The session with the server continues using only symmetric encryption.
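From the client side, most of these steps are handled by a TLS library. The following is a minimal sketch using Python's standard `ssl` module. The helper names are hypothetical, the TLS 1.2 minimum is an assumption chosen for the example, and the second function needs network access to run.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context that verifies the server certificate and hostname."""
    ctx = ssl.create_default_context()            # loads the system's trusted CAs
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption: refuse older versions
    return ctx


def negotiated_parameters(host: str, port: int = 443) -> tuple:
    """Perform the handshake and report the agreed TLS version and cipher suite."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]


# Example call (requires network access):
# print(negotiated_parameters("example.com"))
```

The handshake, certificate verification and session key setup all happen inside `wrap_socket`; the application only sees the resulting encrypted channel.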
In both cases above, messaging apps and HTTPS, asymmetric encryption is used only briefly at the beginning to exchange the symmetric session key, which is then used for the rest of the connection. This overcomes the main disadvantage of asymmetric encryption, which is slow and resource-intensive because of its mathematical complexity. At the same time, the use of asymmetric encryption solves the key distribution problem experienced in symmetric encryption.
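The hybrid pattern can be simulated end to end with a toy example. Everything here is a stand-in chosen so the sketch runs on the standard library alone: the "server key" is unpadded RSA built from two small known primes, and the symmetric cipher is a simple XOR keystream derived with SHA-256 in place of a real cipher such as AES. None of it is suitable for real use. Note also that TLS 1.3 no longer transports the session key with RSA and uses an ephemeral Diffie-Hellman exchange instead.

```python
import hashlib
import secrets

# Server's toy RSA key pair (assumption: tiny fixed primes for illustration).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # requires Python 3.8+


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256-derived keystream.
    The same call encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client: create a random 128-bit session key and wrap it with the public key.
session_key = secrets.token_bytes(16)
wrapped = pow(int.from_bytes(session_key, "big"), E, N)

# Server: unwrap the session key with the private key.
recovered = pow(wrapped, D, N).to_bytes(16, "big")

# From here on, both sides use only the fast symmetric cipher.
request = b"GET /account HTTP/1.1"
ciphertext = keystream_xor(session_key, request)
plaintext = keystream_xor(recovered, ciphertext)
print(plaintext == request)
```

The slow public-key operation happens once per session, on 16 bytes of key material, while the bulk traffic is handled by the symmetric cipher.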
Common symmetric encryption algorithms
Certain algorithms are widely used in securing data and communications within the bounds of symmetric encryption. Well-known symmetric encryption algorithms include Advanced Encryption Standard (AES), Data Encryption Standard (DES), Triple Data Encryption Standard (3DES), Blowfish, Twofish and Rivest Cipher (RC4).
Advanced Encryption Standard (AES)
One of the most common symmetric encryption algorithms, AES is a block cipher that encrypts data in fixed-size blocks of 128 bits, with key lengths of 128, 192, or 256 bits. The 128-bit key encrypts data in 10 rounds, the 192-bit key in 12 rounds, and the 256-bit key in 14 rounds. AES employs a substitution-permutation network, making it highly secure and efficient for various applications. It was developed as a replacement for the outdated Data Encryption Standard (DES), which was publicly broken by brute force in the late 1990s and withdrawn as a federal standard in 2005. AES addresses its predecessor’s main weakness: a short 56-bit key that is vulnerable to brute force.
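The relationship between key length and round count follows a simple rule: the number of rounds is the key length in 32-bit words plus 6. A small sketch (the function name is hypothetical):

```python
def aes_rounds(key_bits: int) -> int:
    """Return the number of AES rounds for a given key length in bits."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192, or 256 bits long")
    return key_bits // 32 + 6   # key length in 32-bit words, plus 6


for bits in (128, 192, 256):
    print(bits, aes_rounds(bits))
```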
Triple Data Encryption Standard (3DES)
The Data Encryption Standard (DES) is a symmetric-key algorithm for encrypting digital data. When the original DES algorithm was compromised, 3DES emerged as an improvement that applies the DES algorithm three times in succession, using two or three distinct keys. Every data block is subjected to three consecutive transformations, offering notably better security than the original DES, primarily because of the longer 112-bit and 168-bit keys. Because it applies the same process three times, 3DES is slower than its more modern counterparts. It also operates on small 64-bit blocks, which exposes it to collision (birthday) attacks when large volumes of data are encrypted under one key. NIST guidance deprecated 3DES for new applications and disallows its use for encryption after 2023.
Blowfish
Blowfish is another symmetric-key block cipher recognized for its simplicity, efficiency and resistance to attacks. This variable-key-length, 64-bit block cipher was designed as a "general-purpose algorithm" that would provide a fast and free alternative to the aging DES encryption algorithm. While Blowfish boasts a speed advantage over DES, its small 64-bit block size is now considered a security limitation, which has kept it from fully replacing DES. Blowfish supports key lengths from 32 bits to 448 bits and was designed as a public tool, unlicensed and available at no cost. Primary use cases for Blowfish include password hashing and secure data storage and transmission.
Twofish
As the successor of Blowfish, Twofish addresses its predecessor’s block size limitation with a larger block size of 128 bits. Like AES, Twofish processes data in fixed-size blocks and accommodates key lengths of 128, 192, or 256 bits. It's tailored for 32-bit CPUs and is well-suited for both hardware and software settings. Similar to Blowfish, it's unpatented and can be used without restrictions. A distinguishing feature of Twofish is that it always uses 16 rounds of encryption, regardless of the key or data size.
Rivest Cipher (RC4)
Developed as a stream cipher for RSA Security in 1987 by Ron Rivest, RC4 encrypts data one byte at a time. RC4 was one of the most popular stream ciphers, used in the SSL/TLS protocols and in Wired Equivalent Privacy (WEP), the original security protocol of the IEEE 802.11 wireless LAN standard. While it offers significant advantages in terms of usability and performance speed, RC4 has declined in popularity due to significant flaws that have come to light, and its use in TLS has been prohibited since 2015.
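Because RC4 is so compact, it makes a convenient illustration of how a stream cipher and a shared symmetric key work together. The sketch below is for study only, given the flaws mentioned above. The function name is hypothetical, and the key and plaintext are the commonly published "Key" / "Plaintext" test vector.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is its own inverse)."""
    # Key-scheduling algorithm: permute the state array using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Pseudo-random generation: XOR one keystream byte with each data byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())           # bbf316e8d940af0ad3
print(rc4(b"Key", ciphertext))    # b'Plaintext'
```

The same key and the same function are used in both directions, which is the defining property of symmetric encryption.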
Common Asymmetric Encryption Algorithms
Asymmetric encryption algorithms have become essential in securing digital communications. The most commonly used asymmetric algorithms include Rivest-Shamir-Adleman (RSA), Diffie-Hellman, Elliptic Curve Cryptography (ECC) and Pretty Good Privacy (PGP).
Rivest-Shamir-Adleman (RSA)
RSA is a widely used asymmetric encryption algorithm found in a variety of products and services and is considered a staple of asymmetric encryption. RSA relies on the fact that multiplying two sufficiently large prime numbers is straightforward, whereas recovering the original primes from their product is immensely challenging. One of the values used in both the public and private keys, the modulus, is the product of two large prime numbers, and both keys are derived from these same primes. RSA keys are commonly 2048 bits or longer (1024-bit keys are no longer considered adequate), which makes factoring them extremely difficult. Because RSA works with such long keys, encryption and decryption can be slow, but it provides a high level of security for sensitive information.
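A worked example with deliberately tiny primes shows how the pieces fit together. The numbers below are a classic textbook illustration and are far too small to be secure. The variable names are hypothetical, and `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
# Key generation with two small primes (real keys use primes of 1024+ bits each).
p, q = 61, 53
n = p * q                  # modulus, part of both keys: 3233
phi = (p - 1) * (q - 1)    # 3120, computable only if p and q are known
e = 17                     # public exponent, chosen coprime to phi
d = pow(e, -1, phi)        # private exponent: 2753

# Encrypt with the public key (n, e), decrypt with the private key (n, d).
message = 65
ciphertext = pow(message, e, n)      # 2790
decrypted = pow(ciphertext, d, n)    # 65

print(n, d, ciphertext, decrypted)
```

Anyone who could factor 3233 back into 61 and 53 could recompute `phi` and `d`. With primes hundreds of digits long, that factoring step is what becomes infeasible.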
Diffie-Hellman Key Exchange
Often referred to as an exponential key exchange, Diffie-Hellman is a key agreement method in which each party raises numbers to secret powers to arrive at the same key. The secret exponents themselves are never sent. Recovering them from the values that are exchanged is exceedingly hard from a mathematical perspective, making it difficult for code breakers to crack. The Diffie-Hellman key exchange lets two entities establish a shared secret over an open network, which they can then use to exchange information securely. Essentially, the algorithm uses public-key methods to agree on a secret encryption key without ever transmitting it.
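The exchange can be traced with small numbers. The values below are a common classroom example and are far too small for real use, where the modulus is thousands of bits long. The variable names are hypothetical.

```python
# Public parameters, known to everyone including eavesdroppers.
p = 23   # prime modulus
g = 5    # generator

# Each party picks a private exponent that is never transmitted.
alice_secret = 6
bob_secret = 15

# Each party sends g raised to its secret, modulo p.
alice_public = pow(g, alice_secret, p)   # 8
bob_public = pow(g, bob_secret, p)       # 19

# Each party raises the other's public value to its own secret.
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

print(alice_public, bob_public, alice_shared, bob_shared)   # 8 19 2 2
```

Both sides arrive at the same value, 2, although only 8 and 19 crossed the network. An eavesdropper would have to solve a discrete logarithm to recover either secret exponent.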
Elliptic Curve Digital Signature Algorithm (ECDSA)
ECDSA, or Elliptic Curve Digital Signature Algorithm, is among the more intricate algorithms used in public key cryptography. It builds on elliptic curve cryptography (ECC), which leverages the algebraic structure of elliptic curves over finite fields. ECDSA performs the same function as other digital signature algorithms, but more efficiently, because it uses smaller keys to achieve the same level of security. Because the underlying elliptic curve operation is quick and easy to compute but extremely difficult to reverse, recovering the private key from the public key is computationally infeasible. The primary applications of elliptic curve cryptography include the generation of pseudo-random numbers, digital signatures, and more.
Pretty Good Privacy (PGP)
PGP was a widely-used program for encrypting and decrypting emails online, validating messages via digital signatures, and securing files. Today, PGP is a general term often applied to any software or tool that adheres to the OpenPGP public key cryptography standard.
(This post has been updated. It was originally published on September 16, 2019.)