What is data encryption?
Computer encryption is based on the science of cryptography, which has been used as long as humans have wanted to keep information secret. Most forms of cryptography in use nowadays rely on computers, simply because a human-based code is too easy for a computer to crack. Cryptosystems use a set of procedures known as cryptographic algorithms, or ciphers, to encrypt plain text messages into cipher text or encrypted messages or decrypt cipher text messages into plain text.
In 1883, Auguste Kerckhoffs stated that encryption algorithms should be made public and the “keys” be kept secret, a rule now known as Kerckhoffs’s Principle. Computer encryption systems generally belong in one of two categories: symmetric encryption and asymmetric or public-key encryption.
What is symmetric encryption?
In symmetric encryption, the sender and receiver each hold a copy of the same secret key and use it both to encrypt and to decrypt messages. Symmetric encryption relies entirely on that key staying secret. Distributing the key in a secure way is one of the primary challenges of symmetric encryption, which is known as the “key distribution problem.” The key is the vital component in symmetric cryptography: if it is lost, the data it protects can no longer be decrypted, and if it falls into the wrong hands, malicious actors can decrypt every message it protects.
The main advantage of symmetric cryptography is that it is much faster than asymmetric cryptography. The most important disadvantages of symmetric encryption are the key distribution problem and the key management problem. When the number of connected users grows, so does the number of required keys: if every pair of users needs its own key, n users need n(n-1)/2 keys. Management of an increasing number of secret keys becomes a “key management problem.” Further, symmetric encryption on its own ensures only the ‘confidentiality’ of the transmitted or stored data. Integrity and authenticity require an additional mechanism such as a message authentication code (MAC), and because both parties hold the same key, symmetric cryptography cannot provide non-repudiation.
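To make the key management problem concrete, here is a minimal sketch. It assumes the simplest model, in which every pair of users shares its own secret key, so n users need n(n-1)/2 keys. The function name `pairwise_keys` is made up for this illustration.

```python
def pairwise_keys(users: int) -> int:
    """Number of distinct secret keys if every pair of users shares its own key."""
    return users * (users - 1) // 2

# The key count grows roughly with the square of the number of users.
for n in (2, 10, 100, 1000):
    print(f"{n:>5} users -> {pairwise_keys(n):>7} keys")
```

Ten users already need 45 keys, and a thousand users need almost half a million, which is why large systems rarely rely on pairwise symmetric keys alone.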
What is asymmetric encryption?
When connecting to a website on the public internet, things become more complicated: symmetric encryption, by itself, won’t work because you don’t control the other end of the connection. How do the two sides share a secret key without the risk of someone on the internet intercepting it in the middle? In November 1976, Diffie and Hellman published a paper in the journal IEEE Transactions on Information Theory, titled "New Directions in Cryptography," that addressed this problem and offered a solution: public-key encryption.
Also known as asymmetric encryption, public key cryptography is used as a method of assuring the confidentiality, authenticity and non-repudiation of electronic communications and data storage. Public-key encryption uses a pair of different keys: a private key and a public key. The private key must remain confidential to its respective owner, while the public key is made available to everyone via a publicly accessible repository or directory. To send a confidential message, the originating computer encrypts it with the recipient’s public key; to decode it, the recipient uses its own private key. Because the public key is published, anyone can encrypt a message for the recipient, but anyone who intercepts the encrypted message can’t read it without the private key.
In RSA, for example, the key pair is derived from two very large prime numbers. Both the public and private keys are computed together at the same time, in the same mathematical process, using “trapdoor” functions. The main characteristic of “trapdoor” functions is that they are easy to compute in one direction, yet difficult to compute in the opposite direction (finding its inverse) without special information.
The main disadvantage of asymmetric encryption is that it is slow when compared with symmetric encryption. The mathematics involved is more complex and therefore requires much more computing power. For that reason, it is not suitable for long sessions or for encrypting large volumes of data.
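The idea of a key pair computed together from a trapdoor function can be sketched with textbook RSA and deliberately tiny primes. This is an illustration only: the primes, exponent and message are example values chosen here, real keys use primes hundreds of digits long together with a padding scheme, and `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
# Toy RSA with tiny textbook primes; for illustration only.
p, q = 61, 53                # secret primes (example values)
n = p * q                    # public modulus, part of both keys
phi = (p - 1) * (q - 1)      # easy to compute only if p and q are known
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: the trapdoor's "special information"

def encrypt(m: int) -> int:
    """Easy direction: anyone holding the public key (e, n) can do this."""
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """Inverse direction: practical only with the private exponent d."""
    return pow(c, d, n)

message = 65
ciphertext = encrypt(message)
assert decrypt(ciphertext) == message
print("public key:", (e, n), "private exponent:", d)
```

Both exponents come out of the same calculation on `p` and `q`, which is the sense in which the two keys are generated together.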
Advantages and disadvantages of symmetric vs asymmetric encryption
While asymmetric encryption is often recognized as being more advanced than symmetric encryption, organizations still use both cryptographic techniques in their security strategies. For example, symmetric encryption is ideal for maximizing the speed of bulk data encryption or for securing communication within closed systems. On the other hand, asymmetric encryption is more beneficial for open systems where the priority is securing key exchanges, digital signatures and authentication.
Because each type of encryption supports specific business requirements, organizations have to decide how to weigh speed, security and other relevant factors such as key distribution.
- Speed. Symmetric encryption has the advantage of being faster than asymmetric encryption, as it requires less computational power for both encryption and decryption. This is partly because the keys used in symmetric encryption are much shorter than asymmetric keys, and mainly because symmetric ciphers are built from simple bit-level operations rather than arithmetic on very large numbers. That makes symmetric encryption suitable for encrypting large amounts of data. Asymmetric encryption does not share these advantages, so it can be less efficient and possibly create performance issues when network processes get bogged down trying to encrypt or decrypt communications. This can result in slow processes and higher resource consumption.
- Security. Asymmetric encryption is often considered more secure for communication between parties that have never met, because it uses two different keys: a public key, which is used to encrypt communications, and a private key, which is used to decrypt them. Because the private key never needs to be shared, it acts as the safeguard that ensures that only the intended recipient is able to decrypt encoded communications. The same key pair can also produce tamper-evident digital signatures, which makes it harder for attackers to forge or alter messages. On the flip side, symmetric encryption is a bit riskier because it uses the same key to encrypt and decrypt communications, which means the key must be shared with anyone who needs to decrypt that communication. Every time the key is shared, it risks being intercepted by a malicious third party.
- Simplified key distribution. Because symmetric encryption uses the same key for both encryption and decryption, secure key distribution is crucial. The key distribution process is simpler with asymmetric encryption, because only the public key is shared, while the private key remains confidential.
Use cases for symmetric encryption
Banking Sector. Due to the better performance and faster speed of symmetric encryption, symmetric cryptography is typically used for bulk encryption of large amounts of data. Applications of symmetric encryption in the banking sector include:
- Payment applications, such as card transactions where PII (personally identifiable information) needs to be protected to prevent identity theft or fraudulent charges without consuming excessive resources. This helps lower the risk involved in dealing with payment transactions on a daily basis.
- Validations to confirm that the sender of a message is who he claims to be.
Data at rest. Data at rest is data that is not actively moving from device to device or network-to-network such as data stored on a hard drive, laptop, flash drive, or archived/stored in some other way. Data protection at rest aims to secure inactive data stored on any device or network. While data at rest is sometimes considered to be less vulnerable than data in transit, attackers often find data at rest a more valuable target than data in motion. For protecting data at rest, enterprises can simply encrypt sensitive files prior to storing them and/or choose to encrypt the storage drive itself.
The best way to encrypt data at rest is by whole disk or full disk encryption. Full disk encryption has several benefits compared to regular file or folder encryption, or encrypted vaults. Nearly everything including the swap space and the temporary files is encrypted. Encrypting these files is important, as they can reveal important confidential data. With a software implementation, the bootstrapping code cannot be encrypted, however. For example, BitLocker Drive Encryption leaves an unencrypted volume to boot from, while the volume containing the operating system is fully encrypted. In addition, the decision of which individual files to encrypt is not left up to users' discretion. This is important for situations in which users might not want or might forget to encrypt sensitive files.
Use case for asymmetric encryption: Digital signatures
As organizations move away from paper documents with ink signatures or authenticity stamps, digital signatures can provide added assurance about the provenance, identity, and status of an electronic document, as well as acknowledging informed consent and approval by a signatory. Digital signatures are used to detect unauthorized modifications to data and to authenticate the identity of the signatory. In addition, the recipient of signed data can use a digital signature as evidence in demonstrating to a third party that the signature was, in fact, generated by the claimed signatory. This is known as non-repudiation, since the signatory cannot easily repudiate the signature at a later time.
The digital signatures standard was proposed by NIST and is defined in FIPS 186-4. Digital signatures employ asymmetric cryptography and they provide a layer of validation and security to messages sent through a non-secure channel. They enforce the concepts of authentication, integrity, and non-repudiation. They do not, by themselves, provide confidentiality, because the signed message is not encrypted.
To create a digital signature and use it along with a message between two clients, Alice and Bob, the following steps are followed:
- The message that has to be digitally signed by Alice is hashed, creating a message digest. Hashing is the process that is used to enforce data integrity. A hash function takes a message of any length and maps it to a fixed-length value (the message digest). Hash functions are one-way, which means that the message digest cannot be reverted back to the message.
- The message digest is encrypted with Alice’s private key. This is a digital signature.
- The digital signature is now attached to the message and sent to Bob.
- Once the message is received, Bob decrypts the digital signature with Alice’s public key. This decryption results in a message digest.
- Bob also hashes the message which results in the message digest again.
- If the digest Bob recovered from the signature and the digest he computed himself are the same, then Bob can be sure that Alice has signed the message and that the content of the message is as shown. Any difference in the hash values would reveal tampering of the message.
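The signing and verification steps above can be sketched with the standard library. This is a toy under stated assumptions, not a production scheme: Alice's key is a textbook RSA key built here from two known Mersenne primes, the digest is reduced modulo the (far too small) modulus, and no padding scheme is applied. The names `digest`, `sign` and `verify` are illustrative.

```python
import hashlib

# Toy RSA key for Alice (illustration only; real keys are 2048+ bits with padding).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
E = 65537                             # Alice's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # Alice's private exponent (Python 3.8+)

def digest(message: bytes) -> int:
    # SHA-256 digest as an integer, reduced mod N only because the toy modulus is small
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    # Alice hashes the message and transforms the digest with her private key
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    # Bob recovers the digest with Alice's public key and compares it to his own hash
    return pow(signature, E, N) == digest(message)

msg = b"Pay Bob 10 euros"
sig = sign(msg)
print(verify(msg, sig))                     # True
print(verify(b"Pay Bob 1000 euros", sig))   # False: the message was altered
```

Changing a single byte of the message changes its digest, so the comparison in `verify` fails and the tampering is detected.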
Digital signatures are intended for use in electronic mail, electronic funds transfer, electronic data interchange, software distribution, data storage, and other applications that require data integrity assurance and data origin authentication.
Use case for both asymmetric and symmetric encryption: Messaging applications
In end-to-end encryption, only the message data is encrypted. The headers, trailers, and routing information are not. In WhatsApp, for example, the basis for end-to-end encryption is the Signal Protocol, designed by Open Whisper Systems. This end-to-end encryption protocol is designed to prevent third parties and the messaging vendor from having plaintext access to messages or calls. What’s more, even if encryption keys from a user’s device are ever physically compromised, they cannot be used to go back in time to decrypt previously transmitted messages.
Messaging end-to-end encryption is implemented using both asymmetric and symmetric cryptography. Asymmetric encryption is used to initialize the encrypted conversation between two users, and symmetric encryption is used for the duration of the communication. The WhatsApp Encryption Overview White Paper provides the details.
Once the application is installed on a user’s smartphone, the public keys of the client are registered with the application server. The private key is not stored on the server and remains secret on the user’s device. The client who wants to initiate a session retrieves the recipient’s public keys from the WhatsApp server. Using these keys, the initiator encrypts the first message and sends it to the recipient. This message contains the parameters for establishing a symmetric session key. The recipient uses their own private key to decrypt the message. “Once a session has been established, clients exchange messages that are protected with a Message Key using AES256 in CBC mode for encryption and HMAC-SHA256 for authentication.” The encrypted session needs to be re-created only when the device is changed or when the application software is re-installed.
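The authentication half of the quoted scheme, HMAC-SHA256, can be sketched with Python's standard `hmac` module. This is a minimal sketch, not WhatsApp's implementation: the key and the ciphertext bytes are placeholders invented here, and the AES-CBC encryption step is left out because it is not part of the standard library.

```python
import hashlib
import hmac
import secrets

mac_key = secrets.token_bytes(32)                    # hypothetical per-message MAC key
ciphertext = b"...bytes produced by the cipher..."   # placeholder, not real AES output

def tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the data."""
    return hmac.new(key, data, hashlib.sha256).digest()

def is_authentic(key: bytes, data: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag(key, data), received_tag)

t = tag(mac_key, ciphertext)
print(is_authentic(mac_key, ciphertext, t))           # True
print(is_authentic(mac_key, ciphertext + b"x", t))    # False: data was modified
```

Only someone holding the shared MAC key can produce a valid tag, which is how a symmetric key can authenticate messages between two parties.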
Use case for both asymmetric and symmetric encryption: HTTPS
While the previous applications were focused on user identities, HTTPS is used for machine identification. In a highly connected world where millions of pieces of sensitive data travel across the internet every day, the need to secure the communication channels between clients/browsers and servers is of the utmost importance.
HTTPS is a TCP/IP application layer protocol: it is HTTP running on top of the SSL/TLS security protocol. An HTTPS connection between a client and a server employs both types of encryption. Asymmetric encryption is used first to establish the connection, which is then replaced with symmetric encryption (called the session) for the duration of the connection. A session key is a one-time-use symmetric key that is used for encryption and decryption. Session keys are randomly created and are used only for one particular session. This is how HTTPS works in simple steps, using the classic RSA key exchange (TLS 1.3 instead derives the session key through an ephemeral Diffie-Hellman exchange):
- For the server and client to engage in a secure conversation, the server needs a TLS certificate that has been issued and signed by a Certificate Authority (CA).
- The browser sends a ClientHello message and indicates that it would like to start a conversation with a secure server. The ClientHello message contains all the information the server needs in order to connect to the client via TLS, including the various cipher suites and maximum TLS version that it supports.
- The server responds with a ServerHello message, which includes the TLS version and cipher suite to be used, and sends the server TLS certificate, which carries the server’s asymmetric public key.
- The browser verifies the server certificate, and creates a random session key.
- The session key is encrypted using the server’s public key and is sent back to the server.
- The server decrypts the session key with its own private key.
- Now both parties have the session key. The public key encryption is terminated and replaced with symmetric encryption. The session with the server continues using only symmetric encryption.
In both cases above, messaging apps and HTTPS, the asymmetric encryption is only used briefly in the beginning to exchange the symmetric session key, which is used for the rest of the connection. This is done in order to overcome the main disadvantage of asymmetric encryption: it is slow and resource-intensive because of its mathematical complexity. On the other hand, the use of asymmetric encryption solves the problem of key distribution experienced in symmetric encryption.
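The hybrid pattern described above, a slow asymmetric step to deliver a session key followed by fast symmetric encryption, can be sketched as follows. Everything here is a stand-in chosen for illustration: the server key is a textbook RSA key built from two known Mersenne primes, and `keystream_xor` is a toy cipher based on SHA-256 that takes the place of a real cipher such as AES. It is not how TLS is actually implemented.

```python
import hashlib
import secrets

# Server's toy RSA key pair (illustration only; no padding, far too small).
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))     # server's private exponent (Python 3.8+)

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Client: create a random session key and encrypt it with the server's public key.
session_key = secrets.token_bytes(16)
wrapped = pow(int.from_bytes(session_key, "big"), E, N)

# Server: recover the session key with its private key.
server_key = pow(wrapped, D, N).to_bytes(16, "big")

# Both sides now use only the symmetric cipher.
request = keystream_xor(session_key, b"GET /account HTTP/1.1")
print(keystream_xor(server_key, request))   # b'GET /account HTTP/1.1'
```

The expensive public-key operation runs once per session, while every later message costs only the cheap symmetric operation.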
Common symmetric encryption algorithms
Certain algorithms are widely used, or have been widely used, in securing data and communications within the bounds of symmetric encryption. The best-known symmetric algorithms include Advanced Encryption Standard (AES), Data Encryption Standard (DES), Triple Data Encryption Standard (3DES), Blowfish, Twofish and Rivest Cipher (RC4). Several of them are now considered outdated, as described below.
Advanced Encryption Standard (AES)
One of the most common symmetric encryption algorithms, AES is a block cipher that encrypts data in fixed-size blocks of 128 bits, with key lengths of 128, 192, or 256 bits. The 128-bit key encrypts data in 10 rounds, the 192-bit key in 12 rounds, and the 256-bit key in 14 rounds. AES employs a substitution-permutation network, making it highly secure and efficient for various applications. AES was developed as a replacement for the outdated Data Encryption Standard (DES), whose 56-bit key was shown to be breakable by brute force in the late 1990s and which NIST withdrew as a standard in 2005. AES aimed at solving its predecessor’s main weakness, a short encryption key length vulnerable to brute force.
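The relationship between AES key length and round count follows a simple rule: the number of rounds is the key length in 32-bit words plus 6. A small sketch, with a helper name chosen for this illustration:

```python
def aes_rounds(key_bits: int) -> int:
    """Rounds for an AES key: key length in 32-bit words plus 6."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192 or 256 bits")
    return key_bits // 32 + 6

for bits in (128, 192, 256):
    print(bits, "bit key ->", aes_rounds(bits), "rounds")
```

Longer keys therefore cost a few extra rounds per block, while the block size stays at 128 bits in every case.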
Triple Data Encryption Standard (3DES)
Data Encryption Standard is an algorithm with a symmetric key for encrypting digital data. When the original DES algorithm was cracked, 3DES was developed to provide enhanced security by applying the DES algorithm three times sequentially, using two or three different keys. Each block of data undergoes a series of three transformations, significantly boosting security compared to the original DES, largely due to the use of 112-bit and 168-bit keys. Because it applies the same process three times, 3DES ends up being slower than its more modern counterparts. Also, the process uses small 64-bit blocks of data, which makes it vulnerable to collision (birthday) attacks when large amounts of data are encrypted under one key. NIST deprecated 3DES for new applications and disallowed its use for encryption after 2023.
Blowfish
Blowfish is another block cipher recognized for its simplicity, efficiency and resistance to attacks. This symmetric cipher, with a variable key length and a 64-bit block, was designed as a "general-purpose algorithm" that would provide a fast and free alternative to the aging DES encryption algorithm. Even though Blowfish is significantly faster than DES, it couldn't completely replace DES due to its small block size, which is considered insecure. Blowfish supports key lengths from 32 bits to 448 bits and was designed as a public tool, unpatented and available at no cost. Primary use cases for Blowfish include password hashing (through the Blowfish-derived bcrypt function) and secure data storage and transmission.
Twofish
As the successor of Blowfish, Twofish addresses its predecessor’s security issues with a larger block size of 128 bits. Like AES, Twofish operates on fixed-size blocks and supports key sizes of 128, 192, or 256 bits. This encryption algorithm is optimized for 32-bit central processing units and is suitable for both hardware and software environments. Like Blowfish, it is unpatented and freely available for use. One characteristic of Twofish is that it always uses 16 rounds of encryption, independently of the key or data size.
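The key sizes and the small-block concern for 3DES both come down to simple arithmetic. The sketch below assumes the usual figures: 56 effective key bits per DES key, a 64-bit block, and the birthday bound, under which repeated ciphertext blocks become likely after about 2^(b/2) blocks for a b-bit block cipher.

```python
DES_KEY_BITS = 56        # effective bits in one DES key
BLOCK_BITS = 64          # block size shared by DES and 3DES

print("three-key 3DES:", 3 * DES_KEY_BITS, "key bits")   # 168
print("two-key 3DES:  ", 2 * DES_KEY_BITS, "key bits")   # 112

# Birthday bound: collisions become likely after about 2**(b/2) blocks.
blocks = 2 ** (BLOCK_BITS // 2)
gib = blocks * (BLOCK_BITS // 8) / 2**30
print(f"about {gib:.0f} GiB under one key before collisions are likely")  # about 32 GiB
```

For a 128-bit block cipher such as AES, the same bound lies at 2^64 blocks, which is far beyond what a single key would normally encrypt.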
Rivest Cipher (RC4)
Developed as a stream cipher for RSA Security in 1987 by Ron Rivest, RC4 encrypts data one byte at a time. RC4 was one of the most popular stream ciphers, used in SSL/TLS protocols and in Wired Equivalent Privacy (WEP), the original security protocol of the IEEE 802.11 wireless LAN standard. While it offers significant advantages in terms of simplicity and performance speed, RC4 has declined in popularity due to significant flaws that have come to light, and its use in TLS is now prohibited (RFC 7465).
Common asymmetric encryption algorithms
Asymmetric encryption algorithms have become essential in securing digital communications. The most commonly used asymmetric algorithms include Rivest-Shamir-Adleman (RSA), Diffie-Hellman and Elliptic Curve Cryptography (ECC). Pretty Good Privacy (PGP), a program that builds on such algorithms, is covered here as well.
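RC4's byte-at-a-time operation is simple enough to sketch in a few lines. The code below follows the commonly published description of the algorithm (a key-scheduling pass, then one keystream byte per data byte) and is shown for study only, since RC4 should not be used to protect real data. The key and plaintext are example values.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the same operation in both directions)."""
    # Key scheduling: permute the 256-byte state using the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: one keystream byte per data byte, XORed in.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4(b"Key", ciphertext))    # b'Plaintext'
```

Because encryption is a plain XOR with the keystream, applying the function twice with the same key returns the original data.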
Rivest-Shamir-Adleman (RSA)
RSA is a widely used asymmetric encryption algorithm found in a variety of products and services and is considered to be a staple of asymmetric encryption. RSA is based on the idea that it is simple to generate a number by multiplying two sufficiently large prime numbers, but extremely difficult to factor that number back into the original primes. The public and private keys share one number, the modulus, which is the product of the two large primes; the exponent of each key is calculated using the same two primes. RSA keys are typically 2048 bits or longer (1024-bit keys are no longer considered adequate), making the modulus extremely difficult to factorize. However, since RSA works with such long numbers, the encryption and decryption process can be slow, which is why it is usually reserved for small pieces of data such as session keys and signatures.
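Why key length matters can be shown by attacking a deliberately tiny RSA key. In this sketch, the public key (3233, 17) is a toy example value; an attacker factors the modulus by trial division and then derives the private exponent. The same approach is hopeless against a 2048-bit modulus, which is the whole point.

```python
def factor(n: int):
    """Trial division for an odd n: fine for toy moduli, hopeless for real ones."""
    f = 3
    while n % f:
        f += 2
    return f, n // f

n, e = 3233, 17                       # toy public key (example values)
p, q = factor(n)                      # attacker recovers the two primes
d = pow(e, -1, (p - 1) * (q - 1))     # and with them the private exponent (Python 3.8+)

c = pow(42, e, n)                     # someone encrypts 42 with the public key
print(p, q, pow(c, d, n))             # 53 61 42
```

Once the primes are known, the private key follows immediately, so the security of RSA rests entirely on the modulus being too large to factor.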
Diffie-Hellman Key Exchange
Also called an exponential key exchange, Diffie-Hellman is a method of digital encryption that uses numbers raised to specific powers to produce decryption keys on the basis of components that are never directly transmitted. This approach makes the task of a would-be code breaker mathematically overwhelming. Diffie–Hellman key exchange establishes a shared secret between two parties that can be used for exchanging secret communications over a public network. In fact, the algorithm uses public-key techniques to let two parties agree on a shared secret key.
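A small worked example shows how the shared secret arises without ever being transmitted. The numbers are toy values chosen for readability (real deployments use primes of 2048 bits or more, or elliptic curves), and the variable names are illustrative.

```python
p, g = 23, 5            # public parameters: prime modulus and generator (toy values)
a, b = 6, 15            # Alice's and Bob's private values, never transmitted

A = pow(g, a, p)        # Alice sends this value to Bob
B = pow(g, b, p)        # Bob sends this value to Alice

alice_secret = pow(B, a, p)     # Alice combines Bob's value with her private value
bob_secret = pow(A, b, p)       # Bob combines Alice's value with his private value
print(A, B, alice_secret, bob_secret)   # 8 19 2 2
```

An eavesdropper sees only `p`, `g`, `A` and `B`; recovering `a` or `b` from them is the discrete logarithm problem, which is infeasible at real key sizes.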
Elliptic Curve Digital Signature Algorithm (ECDSA)
ECDSA, or Elliptic Curve Digital Signature Algorithm, is a digital signature algorithm built on elliptic curve cryptography (ECC), one of the more complex forms of public key cryptography. ECC generates keys that are smaller than the keys of other public key algorithms. ECDSA uses the algebraic structure of elliptic curves over finite fields. ECDSA performs the same function as other digital signatures, but more efficiently, because it uses smaller keys to achieve the same level of security. Because ECC rests on a mathematical operation that is quick and easy to complete but extremely difficult to reverse, it is considered infeasible to recover the private key from the public key. The primary applications of elliptic curve cryptography include key agreement, digital signatures, the generation of pseudo-random numbers, and more.
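The operation that is easy to perform but hard to reverse is scalar multiplication of a curve point. The sketch below is not ECDSA itself; it shows only that underlying operation, on a tiny textbook curve (y^2 = x^3 + 2x + 2 over the integers mod 17, base point (5, 1)), and uses it for a toy key agreement. The curve, the private scalars and the function names are illustrative choices; real curves use primes of roughly 256 bits.

```python
P_MOD, A_COEF = 17, 2      # toy curve: y^2 = x^3 + 2x + 2 (mod 17)
G = (5, 1)                 # base point on that curve

def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                          # P + (-P)
    if p1 == p2:
        s = (3 * x1 * x1 + A_COEF) * pow(2 * y1 % P_MOD, -1, P_MOD)   # tangent slope
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)             # chord slope
    s %= P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return x3, (s * (x1 - x3) - y1) % P_MOD

def multiply(k, point):
    """Double-and-add scalar multiplication: the easy direction."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

alice_priv, bob_priv = 3, 7                  # hypothetical private scalars
alice_pub = multiply(alice_priv, G)          # public keys are points on the curve
bob_pub = multiply(bob_priv, G)
assert multiply(alice_priv, bob_pub) == multiply(bob_priv, alice_pub)
print(multiply(2, G))                        # (6, 3)
```

Computing a public point from a private scalar takes a handful of additions, while going from the point back to the scalar has no known shortcut on well-chosen curves.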
Pretty Good Privacy (PGP)
PGP was a popular program used to encrypt and decrypt email over the internet, authenticate messages with digital signatures, and encrypt files. PGP is now commonly used to refer to any encryption application or program that implements the OpenPGP public key cryptography standard.
(This post has been updated. It was originally published on September 16, 2019.)