x
Loading
 Loading
Featured Paper: Xen Virtualization with Novell SUSE Linux
Hello, Guest | Login | Register

Secret Handshakes and Telltale Fingerprints

Remember secret handshakes and disappearing ink? Well, random number generators, MACs, and public key encryption do the same thing, but in a 21st century way. Learn how to keep your electronic secrets top secret.

Community Tools
RSS
Recommend This [?]
1 Star2 Stars3 Stars4 Stars5 Stars (No Ratings Yet)
Loading ... Loading ...
Users That Liked This [?]
No one yet. Be the first.
Tags:
Tag This!
 No Comments

cryptography_01

Too often, cryptography is treated like magic pixie dust: spread it around generously and hope for the best. But, there’s nothing mystical or enchanted about cryptography — it’s just another tool.

Like any other tool, cryptography can be very effective when applied appropriately. With a clear, definitive goal in mind, cryptography can prevent something from happening (such as exposing sensitive data on a network), or detect that something has happened (for instance, a critical system file has been tampered with). Used improperly or without expertise or clear intent, cryptography provides little more than a false sense of security.

Let’s take a look at the building blocks of cryptography — random number generators, hash and MAC functions, symmetric and public key encryption algorithms, and digital signatures — and see how each of these powerful tools can be used effectively. We’ll stay away from the mathematical underpinnings of cryptography, and focus instead on how it’s used in practice.

So, put away your pixie dust and get ready to make your own powerful mojo.

Easy as 1, 2, 3, 4

Think back to the last system you deployed. It was probably made up of several applications, each designed to manage and process data according to certain abstract logic (often called “business rules”). In your system, it’s also likely that at least some of your business rules were designed to protect your data. It’s those rules — particularly ones related to secrecy and integrity — that can truly benefit from cryptography.

In some cases, cryptography allows us to prevent something from happening. In other cases, cryptography allows us to detect that something has happened and take appropriate action. Either way, cryptography is a mechanism to enforce business rules. Here’s an example.

Mailing list operators had a big problem: forged subscription requests had grown out of control, and too many people were being subscribed to mailing lists against their will. To thwart forgeries, the operators devised an opt-in rule: you can’t sign up for a list without first confirming your intent.

To enforce this rule, the operators devised the following basic solution: each confirmation request includes a random nonce (nonce is a contraction of the variable n and the word once, which is how many times a nonce should be used) that is stored with the user’s e-mail address on the server. A confirmation is only honored if the user’s nonce matches the server’s copy.

Barring other flaws, the unguessable nature of the nonce prevents someone who can’t read your e-mail from subscribing you to a mailing list. If the attacker can’t read your e-mail, they won’t know the nonce, and without the nonce, they can’t forge the confirmation message.

Applying cryptography isn’t much different than solving any other problem: a methodical approach works wonders. If you understand clearly what you want to enforce, and understand the unique properties of each primitive (such as encryption), you can combine the primitives to meet your requirements. When you diverge from this goal-first approach and throw cryptography at a problem too early, you can find yourself with a complex design that doesn’t protect anything.








cryptography_01
Figure One: Four-step process for applying cryptography

Here’s our four-step process for applying cryptography (shown in Figure One):


  1. Define your problem and goal. Define exactly what you’re protecting against. Until you can describe this clearly, concisely, and in plain words, you’re not ready to start applying cryptography.

  2. Know the tools of the trade. With working knowledge of the six cryptographic primitives described in this artice, you can accomplish nearly everything possible in cryptography.

  3. Design and implement a solution by combining your knowledge of both the problem and the available tools.

  4. Verify your result. Put on your black hat and try to break your system. Invite your friends to try and break it, too. Find and fix your flaws.

Flaws always remain, so rinse and repeat.

Keep ‘em Guessing

In the forged subscriptions example, we used a random nonce to distinguish legitimate requests from forgeries, preventing replay and other attacks. In addition to this public use, we also use random numbers to generate secrets (such as encryption keys). Secure random numbers have three critical properties:


  • The numbers must be large. If the value is too small (less than about 280), someone can simply try all the possible values.

  • The numbers must be uniformly distributed. If some values are more likely than anothers, an attacker can try those first.

  • The numbers must be unpredictable. If the next random number is predictable given the previous, as is the case with the Standard C Library function rand(), then the attacker doesn’t need to guess at all.

True random number generators (RNGs) are special hardware circuits whose output can never be reproduced. While hardware-based RNGs are ideal, we typically use software RNGs, called pseudo-random number generators (PRNGs), that require an initial value, or seed. The same sequence of numbers is always generated for a given seed — so always choose a really unpredictable seed value. The seed should be large, uniform and unpredictable, just like random numbers.

PRNGs are used all over Linux. Whenever you connect to a website over SSL, your browser uses a PRNG to generate an encryption key. When you edit a file, a PRNG is used to choose the name of a temporary file for your changes. The initial sequence number for every TCP connection comes from a PRNG. In fact, good passwords, IPSec keys, and Web server session IDs all come from PRNGs.

Developers have historically preferred simple random number generators like rand(), saving cryptographically-secure generators for creating encryption keys. As a result, secure PRNGs were only available in cryptographic libraries (e.g., BSAFE and OpenSSL) and not in the base operating system.

Over time, however, we learned that using an insecure PRNG, even for mundane tasks, often creates a security vulnerablity (to read about three notable flaws fixed with secure PRNGs, see “When Random Isn’t Random Enough”). Today, most UNIX-like systems have integrated PRNGs, including devices (e.g., /dev/random and /dev/urandom) and library calls (e.g., arc4random()). Additionally, many Intel chipsets include an integrated hardware number generator.




When Random Isn’t Random Enough

Three good examples of historical weaknesses with PRNGs are TCP initial sequence numbers (ISN), temporary file names, and Netscape Navigator’s original SSL implementation.

The ISN is a “starting point'’ for a TCP connection and distinguishes one connection from another between the same endpoints. Initially, a secure PRNG didn’t seem necessary and the ISN was simply a function of time. Years later, we realized that if you can guess the next ISN, you can hijack the next connection, which will use that ISN. As a result, Linux and most UNIX operating systems now use secure PRNGs to create their ISNs.

Similarly, UNIX systems have always made extensive use of temporary files. Initially, it only seemed necessary to prevent accidental race conditions and the names weren’t very random. Over time, however, we learned that if you can guess the next temporary filename, you can piggy-back onto the actions of a more privileged user (e.g., by symlinking that filename to a file you controlled, you could effectively add your own entry to the password file). As a result, Linux and most UNIX operating systems now offer a mkstemp() call that uses a secure PRNG to create the filename.

One of the more memorable random number failures was the flaw in Netscape Navigator’s original SSL implementation. In 1995, two U.C. Berkeley researchers, Ian Goldberg and David Wagner, discovered that Netscape was using an easily guessed PRNG seed — it only consisted of the time (in seconds and microseconds), the process id, and the parent’s process id. Their program was able to find the correct encryption key in less than 25 seconds if the attacker could guess the second in which the PRNG was seeded — something that wasn’t hard if you could watch the network traffic. Even though the SSL protocol itself and the remainder of Netscape’s implementation was sufficiently secure, a simple implementation flaw rendered the whole system vulnerable. Although that particular bug is long-since fixed, it shows how delicate the typical application really is. One small slip and the security melts away.

Watching The Drives








cryptography_03
Figure Two: Computing the hash of a large file

When a malicious hacker breaks into your computer, he (or she) often creates a back door by modifying critical system files. One way to detect such intrusions is to periodically check your files for suspicious changes. You could store a backup copy of all the files and then use cmp to see what files changed, but that’s too expensive and inefficient. Instead, we’ll use the following design: store the hash of each critical file on read-only media (demonstrated in Figure Two,), and then periodically check each critical file by re-computing its hash and comparing the new hash against its stored hash. Any file that doesn’t match its stored hash has been modified. This design allows us to detect tampering with limited amounts of trusted storage. Tripwire and AIDE (http://www.cs.tut.fi/~rammer/aide.html) do exactly this.

Cryptographic hash functions (like MD5 and SHA-1) create a short fingerprint (or hash) of arbitrarily-long data and can ensure that’s data’s integrity. Secure hash functions have four important properties:


  • You can’t find the input given only the output (i.e., a one-way function).

  • You can’t find two different inputs that create the same output.

  • Any change in the input causes significant changes in the output.

  • There’s no key. Given the data anyone can compute the hash value.

Hash functions are one of the most popular cryptographic tools, used throughout Linux. In addition to file integrity checkers, we build PRNGs out of hash functions, and use hash functions in nearly all cryptographic protocols. Most commonly, though, they’re used to store passwords.

For example, rather than storing your password in a human readable form, UNIX-like systems store the MD5 (or other) hash of your password. This prevents anyone who can read the password file from being able to read your password, while still allowing the system to verify that the password you’ve entered is correct. Actually, the system doesn’t just hash your password; it hashes your password concatenated with a salt value. The salt value is generated by a PRNG and ensures that two different users with the same password have different hash values, making dictionary attacks (where a dictionary of common password hashes pre-computed) more expensive.

Timestamping is another example of identifying data with hash functions. Imagine you’ve solved one of life’s great questions. You want to share the details with a few experts who can confirm your discovery, but want evidence that you discovered this prior to giving it away. You can compute the hash of your brilliant paper and publish the hash (but not the paper) in the New York Times. Since the hash function is one-way, this doesn’t risk revealing your discovery, but when you later publish your findings, anyone can verify that its hash matches the published hash value.

Cryptographic hash functions come with one major caveat: they’re public functions that don’t use any kind of key. These functions are vulnerable to forgery if we make both the data and its hash public. Luckily, there’s a variant of the hash function that doesn’t have this weakness.

Secret Handshakes

Web site cookies illustrate one problem that can arise when using hash functions. When you set a cookie in a browser, you generally don’t want the user to modify the cookie (the user is in a different trust boundary than the server). We could use a hash function, but that requires a place to store the hash value (a place that could just as well be used to store the cookie data). Instead, we need a special hash function that’s not so easily forged.

Fortunately, there’s already such a beast, known as a Message Authentication Code (MAC). It has all of the properties as a cryptographic hash function, but adds something significant: MACs require a key and only those with the right key can generate and verify the MAC value.

MAC functions are a lot like the secret handshake you made up in 5th grade — you could easily detect anyone who didn’t know the handshake, but once you taught someone to recognize the handshake, they gained the ability to use it as well. MAC functions can protect the integrity of data that crosses multiple trust boundaries, as long as the MAC values are computed and verified within the same trust boundary.

Wrapping up our example, we can include a MAC of the cookie in the cookie itself, allowing us to detect anyone who changes the cookie. Since only we know the secret, nobody else can create the right MAC value. We can easily share the single secret across all the servers where we want to create or verify our cookies. One minor detail: you can’t tell which server created the cookie since they all use the same secret.

The most trusted form of MAC is built on top of hash functions like MD5 and SHA-1. It’s known as hash-based message authentication code (HMAC) and can be found in RFC 2104. HMAC hashes a key and the data twice in a carefully designed pattern. It’s important to follow the HMAC algorithm exactly, since many ways of combining the key and data are surprisingly insecure.

In addition to being a mainstay of web-based authentication, MACs are used in most network security protocols, including SSL and IPSec, to ensure that data you send over the network hasn’t been modified in transit.

Keeping Secrets

To most people, cryptography and encryption are synonymous: both are ways to hide secrets from prying eyes. And yet nothing we’ve discussed so far has anything to do with encryption. The techniques we’ve discussed so far ensure data integrity and are often overlooked in the quest for keeping secrets. Having said that, let’s share some secrets.

Recall that our first example, mailing list management, stores a random nonce that authenticates a user’s confirmation. If stored on a network file system (NFS) partition, the nonces are exposed as they move between the software and file server. Now everyone on the network can learn the nonce and forge a confirmation message. If we don’t want people on the network to forge confirmations, then we say they’re in a different trust boundary.

We can prevent them from reading our nonces with encryption. Symmetric encryption is like a lockbox — only people who have the key can open the box and learn the contents.

Symmetric encryption uses a secret key to make the data unreadable until it is decrypted with the same key. Anyone who knows the key can both encrypt and decrypt the data. As a result, symmetric encryption is useful for keeping sensitive data secret when it crosses trust boundaries, as long as it’s always encrypted and decrypted within the same trust boundary. Our nonce example meets this constraint since encryption and decryption are always performed by the mailing list software, and only the encrypted nonces (but not the key) are exposed to the unsavory characters lurking on the network.

There are a few surprising limitations. First, encrypting data doesn’t ensure its integrity. Contrary to popular belief, you need a separate MAC (which we just saw) or a digital signature (which we’ll get to in a bit) for that. Though counter-intuitive, this is a lot like a birthday present: you can ruin the present by crushing the box without ever learning what was actually inside.

Second, the same data always encrypts to the same value if you don’t change the key. This may leak information to observers. To protect against this, we typically perform encryption within a larger algorithm, known as a mode, that first encrypts a random initialization vector (IV). The IV serves the same purpose as the salt value in the password hashes discussed above. Common modes include Cipher Block Chaining (CBC) and Counter (CTR).

Encryption is used in nearly all security protocols since keeping private data secret is a core security requirement. In addition to SSL and IPSec, encryption is also used for secure e-mail (like PGP and S/MIME), and encrypted file systems (which encrypt files before writing them to disk). In theory, the strength of an encryption scheme is based on the strength of the algorithm and the secrecy of the key. In practice, there are many other factors, including the implementation quality. Bottom line: choose respected algorithms from a good library, an unpredictable key, and get a expert to review your work.

Digital Dropbox

So far, all of the cryptographic primitives we’ve discussed are symmetric: anyone who can encrypt the data can decrypt it; anyone who can verify a MAC can generate it. Although it’s the epitomy of fairness, it presents a significant limitation by preventing separation of duties. One of the most effective controls, especially against insider threats, is separation of duties. Many corporations don’t allow any single person to sign a check worth more than $50,000. Instead, two people of sufficient rank must sign the check for it to be honored.

Similarly, consider a spam-collection database like Vipul’s Razor. When you receive a piece of spam, you send it to the database and everyone who filters their incoming e-mail with the database’s fingerprints benefits by detecting previously seen spam with no manual effort. Although we trust the database with our messages, we don’t want to expose the messages to everyone on the network. We could encrypt the messages with a single key, but then everyone who talked to the database could read the messages. We could assign everyone a different key, but then the database has to keep track of too many keys.

What we need are unfair primitives that let us separate duties between multiple people. For example, let many people encrypt but only the spam database operator decrypt. Or let one person sign a document, but let anyone verify the signature. For centuries, all we had was symmetric cryptography and this separation wasn’t possible without a trusted third party.

Fortunately, this all changed with a 1976 paper by Whitfield Diffie and Marty Hellman called “New Directions in Cryptography.” The next two sections discuss asymmetric, or public key, cryptography.

Public, But Secret

Recall our desire to encrypt messages before sending them to the spam collection database. We need a way to let anyone encrypt a message that only Vipul can decrypt. In other words, we need a digital dropbox where anyone can put a message in the slot, but only the holder of the key can get the messages back out.

Public key encryption meets just this need. Instead of a single key, as we had with symmetric encryption, there’s now two. One can be shared with the world (known as the public key), while the other is kept secret (known as the private key). Data encrypted with the public key can only be decrypted with the private key, just like our dropbox.

Public key encryption provides the already discussed benefit of separation of duties, but that comes at some cost. Public key operations are inefficient, often requiring an order of magnitude more time and space than symmetric counterparts.

As a result, we don’t actually encrypt messages or data with public keys. Instead we randomly generate a symmetric encryption key (there’s that PRNG again), known as a session key. The session key is used to symmetrically encrypt the message and then the session key is encrypted with the recipient’s public key. This approach is typically called key exchange.

There are also key agreement algorithms that don’t encrypt anything at all, but allow two parties to securely agree on a symmetric encryption key.

RSA is the de facto standard public key encryption algorithm, while Diffie-Hellman is generally used for key agreement.

Public key encryption is used all the time. Once again, the two most widely used examples are network security protocols (SSL and IPSec) and e-mail encryption (PGP and S/MIME). Another common but rarely discussed application is secure backups, where the backups are encrypted using public key encryption before they leave each server. Keep the corresponding private key locked in a safe.

Since the public key on the server can’t be used to decrypt, someone would have to steal the key from the safe to decrypt the backups. This is really useful if you want your colocation provider handling backup tape storage on your behalf.

Digital John Hancock

Remember our technique for signing Web site cookies using a MAC? We simply share a secret key among all servers that need to verify the cookies. Unfortunately, this breaks down when we don’t really trust all the servers that need to perform verification. In other words, the MAC doesn’t work when not all of the servers are within the same trust boundary and we need separation of duties between servers that issue the cookies and servers that verify the cookies. To understand why, recall that anyone who can verify a MAC can also generate the same value, enabling verifiers to also issue cookies.

A good example of this problem is Microsoft’s Passport, a project to provide centralized authentication services on the Web. In such a service, you authenticate to a centralized server and can carry proof of that authentication to many sites without reentering your password. If such a system is successful, then only a few people will have access to the authentication servers, but there may be thousands of different companies who need to verify the cookies.

The solution to this problem is to sign the cookies with a digital signature. A digital signature can only be created with the private key but can subsequently be verified using just the public key. This allows us to enforce our separation of duties by sharing the private key amongst the centralized authentication servers and providing our public key to everyone who needs to verify the cookies (all of whom could forge your identity if just symmetric keys were used).

Digital signatures are often used in secure e-mail (PGP and S/MIME), and sometimes for IPSec, secure shell (SSH), digital rights management (DRM), and electronic contracts.

RSA is the de facto standard signature algorithm, although the United State’s Digital Signature Algorithm (DSA), derived from Elgamal, is also relatively common.

Pulling It Together

Although we’ve discussed each of the major cryptographic primitives in isolation, they’re rarely actually used that way. More often, we combine several primitives in concert as part of larger protocol. To help us get a feel for these more complex protocols, we’ll quickly sketch a typical SSL session.

Before the SSL session can actually start, several prerequisites must be met. First, you always need a trust anchor. In SSL, our trust anchor is the Certification Authority’s (CA) certificate. A certificate is simply a file containing a name and a public key that’s been digitally signed. In effect, the signature binds the name to the key. In the CA’s case, the certificate is self-signed, meaning it was signed with the private key corresponding to the certified public key. Since the CA’s certificate is self-signed and serves as our trust anchor, it must be distributed in a trusted fashion. For most of us, it is installed with our web browser or operating system. It’s not much, but it’s all we’ve got.

The second prerequisite is that the Web server needs it’s own certificate. Since every server on the Internet can’t have their certificate installed with your browser, they need to get their certificates from a CA that already has a certificate installed on your computer. The Web server operator gets this certificate by randomly generating a RSA key pair, sending the public key with supporting documents to the CA. That and a check for $349 gets them a one-year certificate.

Now that the prerequisites have been met, we can actually make an SSL connection. Your browser first connects to the remote server, negotiating algorithms and protocol versions. Both sides send random nonces to prevent replay attacks. Then the server supplies its certificate and your browser checks the certificate’s digital signature against its local copy of the CA’s certificate. Finally, the browser verifies that the server certificate’s name matches the name you entered. At this point, your browser still needs to verify that the server has the private key, which it can do while negotiating the encryption keys. The browser generates a random pre-master secret, encrypts it with the key in the server’s certificate, sending the result to the server.

The server now decrypts the pre-master secret, and both sides use a hash function to derive four session keys from the secret: one for encryption in each direction, and one for integrity in each direction. Each side uses these parameters to send a finished message that the other side must verify. If the server had access to the right private key and there was no tampering on the handshake, then the verification will succeed and actual data can pass between the client and server.

Every message between the client and server is encrypted with a symmetric encryption algorithm. This ensures that the data isn’t divulged when passing over the network. The messages are also protected by a MAC, ensuring that any modification of a message can be detected by the recipient.

As you can see, real world network security protocols rely heavily upon all of the primitives we’ve discussed in this article. And although the details are rather complicated, a careful review shows clearly the benefit of each’s primitive’s unique characteristics.

High-Order Bits

As we’ve seen, cryptography is about much more than just keeping secrets safe from prying eyes. Remember:

1. Know your problem. Don’t apply cryptography until you can describe, in plain words, exactly what you’re protecting against.

2. Know the tools. Using just six basic operations, you can accomplish nearly everything possible in cryptography — but first you have to know what each has to offer and how to apply them.


  • Use random numbers with a really unpredictable seed to generate secret keys. Keep the keys secret.

  • Use secure hashes as fingerprints of critical data. Anyone with the data can compute the hash, so don’t make both the data and the hash public without additional protection.

  • Use a message authentication code (MAC) to ensure the integrity of data while making both the data and the MAC value public. You just need to keep the MAC key secret. Be sure that everyone who needs to verify the MAC value is also authorized to generate it.

  • Use symmetric encryption to keep data secret when it’s ok for everyone with the same key to access each other’s data. Remember that encryption doesn’t provide integrity, and encryption without authentication is worthless against active attackers.

  • Use public key encryption to keep data secret when one or more encrypting parties aren’t also authorized to decrypt other’s data.

  • Use digital signatures to ensure the integrity of data when there are one or more verifiers who aren’t also allowed to generate the signatures.

3. Apply the tools. Combine your knowledge of the problem and the tools to design a protocol that meets your needs.

4. Verify the result. Put on your black hat and attack your design. Get your friends to do the same. Figure out what you missed and fix it. Flaws always remain, so rinse and repeat.

Now, go do that voodoo that you can now do.



Scott Renfro earns his keep by sniffing out and fixing security flaws in large applications, including his own. Scott can be reached at applyingcrypto@yahoo.com.

Read More
  1. Enhance Security with Port Knocking
  2. Linux Magazine Annual Security Survey 2007
  3. Secure Remote Access from Your Desktop
  4. Protecting Linux Systems
  5. Keeping a Watchful Eye with OpenNMS

Comments on Secret Handshakes and Telltale Fingerprints

No comments yet.

Sorry, the comment form is closed at this time.

ActivSupport
Linux Magazine has chosen ActivSupport as IT consultants.
Sponsored Links