Format-Preserving Encryption (FPE) encrypts data while preserving its original format. Sensitive information, such as a Social Security number or credit card number, is converted into ciphertext with the same length and character set as the plaintext. Because the encrypted output looks like valid data of the original type, FPE is useful for applications where format consistency matters, such as financial transactions and regulatory compliance.
The problem of protecting data in legacy systems
Ensuring data security in legacy systems, like those in the banking and healthcare sectors, poses a challenge that must be tackled without disrupting their operations.
The core issue is that encrypting structured data, such as credit card numbers, with a method like AES in CBC mode changes the data's format. For instance, a 16-digit credit card number could be transformed into a string like BfA1lytW8I2kflOcQbOCUlX1yH+vAL1/nRoLgKkId+o=. The resulting string is longer than 16 characters and no longer purely numeric. Such changes are a problem in complex legacy environments, where numerous applications expect a 16-digit input and may not handle deviations gracefully.
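The size change can be checked directly. The short Python sketch below decodes the example string above as Base64 and compares its size with what a 16-byte input would produce, assuming PKCS#7 padding and no IV stored alongside the ciphertext. Both are assumptions, since the exact way the string was produced is not stated, and the helper name cbc_base64_length is made up for illustration.

```python
import base64
import math

# The example ciphertext quoted in the text above.
CIPHERTEXT_B64 = "BfA1lytW8I2kflOcQbOCUlX1yH+vAL1/nRoLgKkId+o="


def cbc_base64_length(plaintext_len: int, block_size: int = 16) -> int:
    """Length of the Base64 text for a CBC ciphertext (assumes PKCS#7 padding).

    PKCS#7 always adds between 1 and block_size bytes, so an input that is
    already a multiple of the block size grows by one full block.
    """
    padded_len = (plaintext_len // block_size + 1) * block_size
    return 4 * math.ceil(padded_len / 3)


raw = base64.b64decode(CIPHERTEXT_B64)

print(len(CIPHERTEXT_B64))       # 44 characters instead of 16
print(len(raw))                  # 32 bytes after decoding
print(CIPHERTEXT_B64.isdigit())  # False: letters, '+', '/' and '=' appear
print(cbc_base64_length(16))     # 44, matching the example string
```

A column or input validator that accepts exactly 16 digits would reject this value on both length and character set.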
How Format-Preserving Encryption solves the problem
NIST offers a solution to this issue in Special Publication 800-38G, titled "Recommendation for Block Cipher Modes of Operation: Methods for Format-Preserving Encryption."
The Format-Preserving Encryption (FPE) modes outlined in SP 800-38G enable the encryption of plaintext without altering its format. FPE techniques are tailored for data that isn’t strictly binary. NIST states that “given any finite set of symbols, such as decimal digits, an FPE method transforms data formatted as a sequence of these symbols so that the encrypted data retains the same format, including length, as the original data. Therefore, an FPE-encrypted SSN would still be a sequence of nine decimal digits.”
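To make the idea concrete, here is a minimal Python sketch of a format-preserving cipher over decimal digits. It is a toy, not FF1 or FF3-1: it borrows the general Feistel structure (split the digits into two halves and repeatedly add a keyed pseudorandom value to one half, modulo a power of ten) but swaps in HMAC-SHA256 for the AES-based round function, omits the tweak, and has had no security analysis. The function names, round count, and key are all illustrative assumptions. Do not use it to protect real data.

```python
import hashlib
import hmac

ROUNDS = 10  # illustrative choice; must be even so the halves line up again


def _round_value(key: bytes, round_no: int, total_len: int, half: str) -> int:
    """Keyed pseudorandom integer derived from one half of the state."""
    msg = f"{round_no}|{total_len}|{half}".encode()
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")


def _check(digits: str) -> None:
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of at least two decimal digits")


def toy_fpe_encrypt(key: bytes, digits: str) -> str:
    _check(digits)
    n = len(digits)
    u, v = n // 2, n - n // 2
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v  # width of the half being replaced
        c = (int(a) + _round_value(key, i, n, b)) % 10**m
        a, b = b, str(c).zfill(m)   # swap halves, keep leading zeros
    return a + b


def toy_fpe_decrypt(key: bytes, digits: str) -> str:
    _check(digits)
    n = len(digits)
    u, v = n // 2, n - n // 2
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        c = (int(b) - _round_value(key, i, n, a)) % 10**m
        a, b = str(c).zfill(m), a   # undo the swap from encryption
    return a + b


key = b"demo key, not a real secret"
ssn = "123456789"
encrypted = toy_fpe_encrypt(key, ssn)
print(encrypted)                        # nine decimal digits
print(toy_fpe_decrypt(key, encrypted))  # 123456789
```

The point of the sketch is the shape of the result: the output is again nine decimal digits, so it fits any field or validator that accepted the original value, and decryption with the same key recovers the input.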
SP 800-38G details methods for encrypting sensitive data that can be implemented in cryptographic modules meeting FIPS 140-2, the US government’s "Security Requirements for Cryptographic Modules." The FPE modes described in NIST SP 800-38G can be used to protect sensitive data in support of compliance with privacy and security regulations like CCPA, HIPAA, PCI DSS, or GDPR.
First published in 2016, NIST SP 800-38G described two FPE modes: FF1 and FF3. In 2017, however, researchers identified a cryptanalytic attack on FF3 showing that it did not achieve the intended 128-bit security level, which made it unsuitable as a general-purpose FPE method.
In response, NIST revised FF3 to FF3-1 in early 2019. The revision also addresses vulnerabilities associated with small domain sizes, such as the middle six digits of credit card or Social Security numbers. Inputs like these have too few possible values for the ciphertext to resist being reverse-engineered. The domain size is the number of possible inputs, that is, the number of symbols in the alphabet raised to the power of the input length. The original SP 800-38G required the domain size for FF1 and FF3 to be at least 100, with a recommendation of at least 1,000,000. The revised standard mandates a minimum domain size of 1,000,000.
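The domain-size rule is simple arithmetic: the number of possible inputs is the alphabet size (the radix) raised to the input length. The Python sketch below applies the two thresholds quoted above; the function names and the sample alphabets are illustrative assumptions, not part of the standard.

```python
OLD_MINIMUM = 100            # original requirement quoted above
NEW_MINIMUM = 1_000_000      # requirement in the revised standard


def domain_size(radix: int, length: int) -> int:
    """Number of distinct inputs of the given length over the alphabet."""
    return radix ** length


def min_length(radix: int, minimum: int = NEW_MINIMUM) -> int:
    """Shortest input length whose domain size reaches the minimum."""
    if radix < 2:
        raise ValueError("radix must be at least 2")
    length = 1
    while domain_size(radix, length) < minimum:
        length += 1
    return length


print(min_length(10))  # 6 decimal digits
print(min_length(26))  # 5 lowercase letters
print(min_length(36))  # 4 alphanumeric characters

# A 4-digit PIN passes the old minimum but not the new one.
print(domain_size(10, 4) >= OLD_MINIMUM)  # True
print(domain_size(10, 4) >= NEW_MINIMUM)  # False
```

Note that a six-digit decimal field sits exactly at the 1,000,000 boundary, so anything shorter cannot be encrypted on its own under the revised minimum.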
Benefits of Format-Preserving Encryption
FPE modes enable the integration of encryption technology into existing devices or software where traditional encryption methods might not be practical. This is particularly relevant for database applications that cannot accommodate changes to the length or format of data fields.
FPE is often employed to safeguard sensitive information such as payment card data, bank account details, Social Security Numbers, and personally identifiable information (PII) stored and processed in retail, healthcare, and financial databases and applications.
More broadly, FPE can aid in the "sanitization" of databases by encrypting PII, such as SSNs. Encrypted SSNs can still function as indices for statistical research across multiple databases. This allows extensive processing of FPE-encrypted data to occur while the data remains in its protected state.
Format-Preserving Encryption vs. tokenization
Tokenization is a comparable method for preserving data format during protection. It replaces sensitive information with randomized values that keep the same format but carry no inherent value, while the original data is stored securely in a data vault. Unlike ciphertext, a token cannot be reversed by computation, because there is no mathematical link between the token and the original data; the original can only be recovered by looking the token up in the vault. Encrypted data, by contrast, can be decrypted by anyone holding the correct keys. The absence of a mathematical link also gives tokenization greater flexibility in the range of tokens to which the data can be converted.
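The difference from encryption is easiest to see in code. The Python sketch below is a minimal in-memory stand-in for a token vault, with a hypothetical TokenVault class and method names chosen for illustration; a real vault would be a hardened, access-controlled service. Tokens are drawn at random, so nothing about a token can be computed back to the original without the lookup table.

```python
import secrets


class TokenVault:
    """Toy token vault: maps digit strings to random same-length tokens."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        if not (value.isascii() and value.isdigit()):
            raise ValueError("expected a non-empty string of decimal digits")
        # Reuse the existing token so the same value always maps the same way.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # Draw random digits until the token is unused and differs from the
        # input (a sketch only: very short inputs could exhaust the space).
        while True:
            token = "".join(secrets.choice("0123456789") for _ in value)
            if token not in self._token_to_value and token != value:
                break
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Recovery is a lookup, not a computation; unknown tokens raise KeyError.
        return self._token_to_value[token]


vault = TokenVault()
card = "4111111111111111"  # a widely used test card number
token = vault.tokenize(card)
print(token)                    # 16 random digits
print(vault.detokenize(token))  # 4111111111111111
```

The trade-off is visible in the class itself: there is no key to manage, but the vault holds every original value and becomes the component that must be protected and kept available.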
Conclusion
Data protection is essential for complying with security and privacy regulations and for avoiding costly penalties. However, organizations should evaluate the available encryption methods to verify that their critical systems are not disrupted when processing ciphertext. Whichever method they choose, organizations must protect encryption keys from compromise, because encrypted data is only as secure as the keys that protect it. Once keys are compromised, cybercriminals can decrypt all data encrypted under them, exposing businesses and individuals to threats such as financial fraud, blackmail, impersonation, and business email compromise.