11 Secret Key Cryptography
Symmetric key cryptography uses the same key to encrypt data and to decrypt it, so the key must be kept secret between the parties encrypting and decrypting the data. This is why the scheme is also called secret key cryptography. The .NET framework has several classes that give access to symmetric algorithms, and it also provides classes to help you make passwords more secure. In this section I will describe how to use the .NET secret key algorithms, how to use them through streams, and how to encode binary data as base64 characters.
11.1 Symmetric Key Algorithms
There are two types of symmetric algorithm: stream and block. Stream algorithms work on the data a bit (or byte) at a time, whereas block algorithms work on the data a block at a time. All of the algorithms in the .NET framework library are block algorithms and all of them derive from the SymmetricAlgorithm class. As explained on the
previous page, block algorithms can be run in one of several modes, and once data is
encrypted using a particular mode the cyphertext must be decrypted using the
same mode. The mode of an algorithm is indicated by the SymmetricAlgorithm.Mode
property, which takes a value of the CipherMode enumeration. All are
block modes, but since CFB and OFB use a shift
register for the input they behave as a stream cypher with
respect to the data they consume. The classes derived from
SymmetricAlgorithm are shown
in the following table.
| Class | Description |
|---|---|
| DES | Data Encryption Standard, the standard ratified by the US government. Originally designed by IBM. 64-bit blocks with 56-bit keys. Now considered insecure. |
| RC2 | Ron Rivest's algorithm, "Ron's Code". 64-bit blocks with 8- to 128-bit keys. |
| Rijndael | Designed by Vincent Rijmen and Joan Daemen. Adopted by NIST as the Advanced Encryption Standard. 128-bit blocks with 128- to 256-bit keys. |
| TripleDES | DES applied three times. Designed by IBM. 64-bit blocks with 168-bit keys. Considered to be secure. |
The SymmetricAlgorithm base class defines the
CreateEncryptor and CreateDecryptor methods that all
derived classes must implement. This means that all symmetric algorithms are
intended to be accessible through the ICryptoTransform interface
and hence used through a CryptoStream object. This sounds like a
contradiction (block algorithms used through a stream object), but it isn't,
because the stream object only provides a stream interface to that block data.
Indeed, many of the framework stream implementations (for example, FileStream)
are actually based on block data because they are buffered.
Your data is rarely a whole number of blocks; typically there will be a
partial block of data, and so the algorithm must treat this differently by
padding the data. The ICryptoTransform interface has two methods,
TransformBlock and TransformFinalBlock. The second
of these will pad the final block with values determined by the
SymmetricAlgorithm.Padding property. As mentioned earlier, block
algorithms can be used in one of several modes. The default is
CBC: cypher block chaining. This, and most of the other modes,
need an initialization vector (IV), and it is important that the IV is
chosen to be random. By default, the SymmetricAlgorithm class will set the IV
property to a random value.
To see this create the following file (rijndael.cs):
using System;
using System.Security.Cryptography;
class App
{
static void Main()
{
Rijndael r = Rijndael.Create();
Console.WriteLine("current IV is {0}", BitConverter.ToString(r.IV));
}
}
Compile and run this code, you'll see something like the following:
Run this program several times to confirm that each run gives a different IV. Although the IV does not have to be secret, it must be the same for the encryption and the decryption, so if you persist encrypted data you must make sure that you also persist the IV.
Each algorithm also needs a secret key and that key should be a specific
number of bits. You can determine the number of bits from the
LegalKeySizes array which is an array of KeySizes objects.
Add the following code:
foreach (KeySizes keySize in r.LegalKeySizes)
{
Console.WriteLine("Key size {0} to {1} in {2} increments",
keySize.MinSize, keySize.MaxSize, keySize.SkipSize);
}
For Rijndael I get:
Key size 128 to 256 in 64 increments
This means that the key can be 128, 192 or 256 bits.
At this point I want to caution you. There are various properties that are sizes, however, some are sizes in bits, others are sizes in bytes.
You should carefully check what units the size property uses. For example,
SymmetricAlgorithm.BlockSize should be the same as
ICryptoTransform.InputBlockSize, but the former is given in bits and
the latter is given in bytes.
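The units issue can be seen with a minimal sketch like the following, which assumes the default Rijndael settings. The class name SizeUnits is hypothetical and used only for illustration; the properties themselves are the ones named above.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical class name, used only for this illustration.
class SizeUnits
{
    static void Main()
    {
        using (Rijndael r = Rijndael.Create())
        using (ICryptoTransform en = r.CreateEncryptor())
        {
            // Sizes on the algorithm object are reported in bits...
            Console.WriteLine("BlockSize      = {0} bits", r.BlockSize);
            Console.WriteLine("KeySize        = {0} bits", r.KeySize);
            // ...whereas the transform and the raw arrays are measured in bytes.
            Console.WriteLine("InputBlockSize = {0} bytes", en.InputBlockSize);
            Console.WriteLine("Key.Length     = {0} bytes", r.Key.Length);
            Console.WriteLine("IV.Length      = {0} bytes", r.IV.Length);
        }
    }
}
```

With a 128-bit block, InputBlockSize should report 16, that is, BlockSize divided by 8.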
Users rarely give passwords of the required number of bits and even if they
do, they will use a keyboard with printable characters which restrict the
range of each byte in their password. It is better to generate a password from
a pass phrase. The framework library provides PasswordDeriveBytes
to do just this. This class will combine the pass phrase with some extra bytes
(called salt) and then perform a hash on this data. It will do this
repeatedly for a specified number of iterations. The pass phrase cannot be
derived from the result and the salt makes it harder for an attacker to
perform a dictionary attack. The salt should be a randomly generated value,
but again, it does not have to be kept secret. However, since the salt
determines the key used in the encryption it should be available to the code
decrypting the data and is typically stored with the IV and the encrypted
data.
The salt and IV should be randomly generated data. SymmetricAlgorithm
will generate a random IV, so that leaves the salt. The problem with most
random number generators (like System.Random) is that you cannot
guarantee their randomness and any repeatability in the data weakens your
cryptographic security. The framework provides a class called
RandomNumberGenerator that will create a more secure random number
generator. In a similar way to symmetric algorithms, this class has a static
method that will return an instance of a derived class (RNGCryptoServiceProvider)
that is an implementation provided by the CryptoAPI.
Putting all of this together, if you want to generate a password from a pass phrase you could use this code:
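static void CreatePassword(string phrase, int passSize, int saltSize, out byte[] pass, out byte[] salt)
{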
RandomNumberGenerator rand = RandomNumberGenerator.Create();
salt = new byte[saltSize];
rand.GetBytes(salt);
PasswordDeriveBytes pdb = new PasswordDeriveBytes(phrase, salt);
pass = pdb.GetBytes(passSize);
}
PasswordDeriveBytes is an implementation of the
abstract class DeriveBytes.
.NET Version 3.0: In version 3.0/2.0 of the framework the GetBytes method is deprecated,
so you should use the CryptDeriveKey method instead. The new version of
the framework has another implementation of DeriveBytes
called Rfc2898DeriveBytes. This uses the password-based key
derivation function (PBKDF2) described by
RFC 2898 and it uses the
HMACSHA1 class. You use this in much the same way as you would use
PasswordDeriveBytes.
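As a rough sketch of how Rfc2898DeriveBytes might be used in place of PasswordDeriveBytes, the code below derives a 256-bit key from the pass phrase used later in this section. The class name DeriveKey, the 16-byte salt and the iteration count of 10000 are my own assumptions for the illustration, not values taken from the framework documentation.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical class name, used only for this illustration.
class DeriveKey
{
    static void Main()
    {
        string phrase = "daisy, daisy, give me your answer to";

        // Rfc2898DeriveBytes requires a salt of at least 8 bytes.
        byte[] salt = new byte[16];
        RandomNumberGenerator.Create().GetBytes(salt);

        // The iteration count is an arbitrary value chosen for this sketch.
        Rfc2898DeriveBytes kdf = new Rfc2898DeriveBytes(phrase, salt, 10000);
        byte[] key = kdf.GetBytes(32);   // 32 bytes = 256 bits

        // The same phrase, salt and iteration count always give the same key,
        // which is why the salt must be stored for the decrypting code.
        Rfc2898DeriveBytes again = new Rfc2898DeriveBytes(phrase, salt, 10000);
        byte[] key2 = again.GetBytes(32);

        Console.WriteLine(BitConverter.ToString(key));
        Console.WriteLine(BitConverter.ToString(key) == BitConverter.ToString(key2)); // True
    }
}
```

Each run prints a different key because the salt is random, but the two derivations within one run agree.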
Now let's do some encryption and decryption.
First I want to show you the basic use of ICryptoTransform, so
for the time being ignore some of the deliberate security lapses. Create a
file called encrypt.cs:
using System;
using System.Security.Cryptography;
using System.Text;
class App
{
static void Main()
{
Rijndael r = Rijndael.Create();
r.Mode = CipherMode.ECB;
string phrase = "daisy, daisy, give me your answer to";
PasswordDeriveBytes pdb = new PasswordDeriveBytes(phrase, new byte[0]);
r.Key = pdb.GetBytes(r.KeySize>>3);
string data = "The quick brown fox jumps over the lazy dog.";
}
}
This creates an instance using ECB; that is, we won't use an
IV in this example. (Note that you should not write code like this; we will fix
this later.) The pass phrase and the cleartext are hard coded in the example,
and a key is generated from the pass phrase without a salt. The size of the key is given
in bits by KeySize, so I shift right by three bits (divide by 8) to get
the number of bytes.
Next you need to transform the data. To do this obtain an encryptor from
the algorithm and call a member function called CryptoTransform.
This function will apply the cryptographic transform to the data and return
the transformed data. We can also obtain a decryptor and apply that with the
same function:
ICryptoTransform en = r.CreateEncryptor();
byte[] input = Encoding.ASCII.GetBytes(data);
input = CryptoTransform(input, en);
Console.WriteLine(BitConverter.ToString(input));
ICryptoTransform de = r.CreateDecryptor();
byte[] output = CryptoTransform(input, de);
Console.WriteLine(Encoding.ASCII.GetString(output));
This code should print out the hex values of the transformed data and the
last line should print the clear text of the input data. The
CryptoTransform method looks like this:
static byte[] CryptoTransform(byte[] input, ICryptoTransform en)
{
byte[] output = new byte[input.Length*2];
int inputOffset = 0;
int bytesToRead = en.InputBlockSize;
int outputOffset = 0;
int totalBytesTransformed = 0;
// Transform whole blocks while at least one full block remains,
// so an input shorter than one block goes straight to TransformFinalBlock.
while (input.Length - inputOffset >= bytesToRead)
{
   int numTransformed = en.TransformBlock(input, inputOffset, bytesToRead, output, outputOffset);
   inputOffset += bytesToRead;
   outputOffset += numTransformed;
   totalBytesTransformed += numTransformed;
}
byte[] tempBuffer = en.TransformFinalBlock(input, inputOffset, input.Length - inputOffset);
byte[] returnedBuffer = new byte[totalBytesTransformed + tempBuffer.Length];
Array.Copy(output, 0, returnedBuffer, 0, totalBytesTransformed);
Array.Copy(tempBuffer, 0, returnedBuffer, totalBytesTransformed, tempBuffer.Length);
return returnedBuffer;
}
This method transforms the input data according to the interface passed as the
parameter. Since I do not know the size of the output buffer I create one twice
the size of the input buffer. This works in practice, but I will leave it up to
the reader to write code that checks that the buffer is large enough before
calling TransformBlock. This code will call TransformBlock
with a block of data of the size given by InputBlockSize. This
assumes that TransformBlock can be called multiple times with
different data; this is the case with all of the framework's algorithms, which
will return true for CanTransformMultipleBlocks.
Compile this code (csc encrypt.cs) and run it to confirm that
the binary data created by the first call to CryptoTransform will
decrypt to the original data.
As you can see, this is quite involved, so instead let's use a
CryptoStream object and streams. Add a using statement for
System.IO and replace the contents of CryptoTransform with
this:
static byte[] CryptoTransform(byte[] input, ICryptoTransform en)
{
MemoryStream sInput = new MemoryStream(input);
CryptoStream cs = new CryptoStream(sInput, en, CryptoStreamMode.Read);
MemoryStream sOutput = new MemoryStream();
byte[] buffer = new byte[1024];
while(true)
{
int read = cs.Read(buffer, 0, buffer.Length);
if (read == 0) break;
sOutput.Write(buffer, 0, read);
}
cs.Clear();
return sOutput.ToArray();
}
Compile and run this code.
Note that this code has a call to CryptoStream.Clear. The reason is
that if there is buffered data in the stream it might be possible for another
process to get access to the memory used by the stream and so get access to
cleartext. The stream object will live in memory until the next garbage
collection and this could be a long time after the stream object has been used,
so you cannot rely on the GC to remove this object for you. This is the reason
for the call to the Clear method: it will clear any internal
buffers so that sensitive data will not persist in memory.
As you can see this is far simpler code. A MemoryStream object
gives stream access to the input array and this stream and the cryptographic
object are used to create the CryptoStream. The mode is set to
Read so that the data is transformed when it is read from the
underlying stream. This is a purely arbitrary choice and the following code works
just as well:
MemoryStream sInput = new MemoryStream(input);
MemoryStream sOutput = new MemoryStream();
CryptoStream cs = new CryptoStream(sOutput, en, CryptoStreamMode.Write);
byte[] buffer = new byte[1024];
while(true)
{
int read = sInput.Read(buffer, 0, buffer.Length);
if (read == 0) break;
cs.Write(buffer, 0, read);
}
cs.FlushFinalBlock();
cs.Clear();
return sOutput.ToArray();
In both cases the input stream is read until it is empty (so the Read
returns zero bytes). The main difference between these two fragments of code
is that in the first case the ICryptoTransform.TransformFinalBlock
will be called by the call to CryptoStream.Read that returns a
value of zero. The second fragment of code has to explicitly make this call.
Regardless of which version you use, it is clearly simpler code than the version that
calls the ICryptoTransform methods directly. However, be aware
that the CryptoStream methods allocate intermediate arrays, and
if the number of bytes that you request is less than the number transformed it
will cache the excess. The convenience of making your code more readable
results in more memory allocations.
I mentioned earlier that the code uses Electronic Codebook mode (CipherMode.ECB).
There is an inherent weakness in this mode. If there are repeated strings in
the cleartext then these will be apparent in the cyphertext. Change the calling code so that it looks like
this:
r.Mode = CipherMode.ECB;
r.Padding = PaddingMode.None;
string phrase = "daisy, daisy, give me your answer to";
PasswordDeriveBytes pdb = new PasswordDeriveBytes(phrase, new byte[0]);
r.Key = pdb.GetBytes(r.KeySize>>3);
string data = "repeated.text...repeated.text...repeated.text...repeated.text...";
byte[] input = Encoding.ASCII.GetBytes(data);
ICryptoTransform en = r.CreateEncryptor();
input = CryptoTransform(input, en);
Console.WriteLine(BitConverter.ToString(input));
ICryptoTransform de = r.CreateDecryptor();
byte[] output = CryptoTransform(input, de);
Console.WriteLine(Encoding.ASCII.GetString(output));
The clear text contains a repeated phrase. Significantly, the repeated phrase
is 16 characters and the algorithm
converts data in 16 byte blocks. This means that each block will be the same.
Notice that I have also changed the padding mode to be PaddingMode.None
so no padding is used. This is possible because I have a whole number of
blocks. If the input data could not fit into a whole number of blocks then the
algorithm will throw a CryptographicException. There are two other
values that can be used for the padding. PaddingMode.Zeros will pad the input data with zeros so that
it becomes a whole number of blocks; however, this action is not reversible: the
decryptor will not know whether the zeros it obtains are part of the data. The final
type of padding is PaddingMode.PKCS7, which is the default. This
will pad the final block to make a full block with bytes that each have the value of
the number of padding bytes used. For example, if the block is 16 bytes and the
final block is 13 bytes then three bytes with the value 0x03 will
be used as padding. The decryptor can remove the padding after it has done its
work by reading the last byte of the last block and removing that number of
bytes from the end. The only drawback of this mechanism is that if the input
data is already a whole number of blocks (as in this example) then an additional block
consisting entirely of padding is added to the end, so that the decryptor always
has padding to remove.
Compile and run this code and you should get a result like the following (which I have edited slightly to align the values):
EE-98-44-3F-36-6C-E1-5C-70-55-44-59-A5-F7-B2-20-
EE-98-44-3F-36-6C-E1-5C-70-55-44-59-A5-F7-B2-20-
EE-98-44-3F-36-6C-E1-5C-70-55-44-59-A5-F7-B2-20-
EE-98-44-3F-36-6C-E1-5C-70-55-44-59-A5-F7-B2-20
As you can see there are four groups of 16 bytes and each group is the same;
these correspond to the four repeated phrases in the cleartext. A
cryptanalyst (the polite name for a cracker) will exploit repeated
values in the cyphertext. In all languages some words are more popular than
others and so the cryptanalyst will use tables of the probability of words in
the appropriate language to make a guess at the word that could be the
repeated block. Thus repeated blocks represent a serious weakness in the
algorithm. Now change the Mode to CBC:
r.Mode = CipherMode.CBC;
In this case the algorithm will combine each block with the previous encrypted block. As I mentioned earlier, this requires an initialization vector and one will already have been created for you (with random values). If you run this code and rearrange the output into blocks of 16 bytes you'll get data that does not repeat. On my machine I get the following, but you will get something different because of the random IV:
8A-D0-71-ED-AC-A5-9F-4E-DE-12-15-2D-A3-D4-1D-DF-
0B-E7-7A-21-6A-C0-82-F3-CD-37-E0-0E-FB-DE-C1-94-
FB-C8-B2-DB-C7-6C-9B-C1-27-D8-ED-14-7C-7F-24-B5
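The padding behaviour described earlier in this section can be checked with a short sketch like the one below. It assumes the default Rijndael settings (CBC with a random key and IV); the class PaddingDemo and the helper Show are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical class and helper names, used only for this illustration.
class PaddingDemo
{
    // Encrypts and decrypts inputLength bytes and reports the buffer sizes.
    static void Show(PaddingMode padding, int inputLength)
    {
        using (Rijndael r = Rijndael.Create())
        {
            r.Padding = padding;
            byte[] input = new byte[inputLength];
            for (int i = 0; i < input.Length; ++i) input[i] = (byte)(i + 1);

            byte[] cypher, clear;
            using (ICryptoTransform en = r.CreateEncryptor())
                cypher = en.TransformFinalBlock(input, 0, input.Length);
            using (ICryptoTransform de = r.CreateDecryptor())
                clear = de.TransformFinalBlock(cypher, 0, cypher.Length);

            Console.WriteLine("{0,-5} {1} bytes in, {2} encrypted, {3} decrypted",
                padding, inputLength, cypher.Length, clear.Length);
        }
    }

    static void Main()
    {
        Show(PaddingMode.PKCS7, 13);  // 16 encrypted, 13 decrypted
        Show(PaddingMode.PKCS7, 16);  // 32 encrypted, 16 decrypted (extra padding block)
        Show(PaddingMode.Zeros, 13);  // 16 encrypted, 16 decrypted (zeros are not removed)
        Show(PaddingMode.Zeros, 16);  // 16 encrypted, 16 decrypted
    }
}
```

The second line shows the extra block that PKCS7 adds when the input is already a whole number of blocks, and the third shows why zero padding is not reversible.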
11.2 Persisting Data
The last section showed that using the CryptoStream class
simplifies the code that you need to write. In that section I ignored salt
when generating the symmetric key, and I mentioned that this makes your code less secure. However, when you add salt you must use it for
encryption and decryption, so the salt value must be available to both the
encryption and decryption code. Furthermore, the same initialization vector
must also be available to the encryption and decryption code. So if you intend
to persist your encrypted data you must also persist the salt and
initialization vector. In this section I will show you how to do this.
The following example is a simple command line tool that will encrypt the contents of a file and write the encrypted data to another file. The command line looks like this:
encrypt <input file> <output file> <password> [e|d]
The final parameter is optional and indicates whether the action is encryption
or decryption (encryption is the default). Create a file called
encrypt.cs. Here is the code for the Main
function:
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;
class App
{
static void Main(string[] args)
{
if (args.Length < 3)
{
Console.WriteLine("command line: <input file> <output file> <password> [e|d]");
Console.WriteLine("\twhere e means encrypt (the default) and d means decrypt");
return;
}
bool bEncrypt = true;
if (args.Length == 4) bEncrypt = (args[3].ToLower()[0] == 'e');
FileStream fsIn = new FileStream(args[0], FileMode.Open);
FileStream fsOut = new FileStream(args[1], FileMode.Create);
if (bEncrypt) Encrypt(fsIn, fsOut, args[2]);
else Decrypt(fsIn, fsOut, args[2]);
}
}
Most of the code is used to manipulate the command line parameters. Once
the code has determined the names of the input and output files it opens them
and then calls Encrypt or Decrypt depending on the
last command line parameter. Both of these methods will close the files when
they have completed their work, so there is no need to call Close
on these FileStream objects. The Encrypt method
looks like this:
static void Encrypt(Stream sIn, Stream sOut, string password)
{
Rijndael r = Rijndael.Create();
byte[] salt = null;
byte[] pass = null;
CreatePassword(password, r.KeySize >> 3, r.KeySize >> 3, out pass, ref salt);
r.Key = pass;
sOut.Write(salt, 0, salt.Length);
sOut.Write(r.IV, 0, r.IV.Length);
CryptoStream cs = new CryptoStream(sIn, r.CreateEncryptor(), CryptoStreamMode.Read);
byte[] buffer = new byte[1024];
while (true)
{
int read = cs.Read(buffer, 0, buffer.Length);
if (read == 0) break;
sOut.Write(buffer, 0, read);
}
cs.Close(); // closes sIn, no need to call Clear
sOut.Close();
}
The CreatePassword method is similar to (but not the same as) the
method given in the last section and I will show it in a moment. This method will
create a key from a pass phrase and an array of salt. If you pass in a null
array for the salt (as in this case) the method will generate random values and
return the array. Once the key is generated, it is used to initialize the
algorithm. I do not create an initialization vector; instead I allow the
Rijndael object to generate one. The code then writes the salt and the IV
to the output file. It is important that the corresponding Decrypt
code knows the size of these values. So the length of the IV is obtained from
the algorithm, and the salt is the same length as the key (KeySize
is given in bits).
Next the code creates a CryptoStream based on the input stream
and the transform's encryptor. The CryptoStream has a Read
mode, meaning that the data is encrypted as it is read from the input stream.
Finally, the data is read (and encrypted) a kilobyte at a time and then written
out to the output stream. The CryptoStream is then closed,
which will also close the underlying stream.
Decrypt is similar with the difference that it must read the salt and IV from the file. Here is the code:
static void Decrypt(Stream sIn, Stream sOut, string password)
{
Rijndael r = Rijndael.Create();
byte[] salt = new byte[r.KeySize >> 3];
byte[] pass = null;
sIn.Read(salt, 0, salt.Length);
CreatePassword(password, r.KeySize >> 3, r.KeySize >> 3, out pass, ref salt);
r.Key = pass;
byte[] IV = new byte[r.IV.Length];
sIn.Read(IV, 0, IV.Length);
r.IV = IV;
CryptoStream cs = new CryptoStream(sOut, r.CreateDecryptor(), CryptoStreamMode.Write);
byte[] buffer = new byte[1024];
while (true)
{
int read = sIn.Read(buffer, 0, buffer.Length);
if (read == 0) break;
cs.Write(buffer, 0, read);
}
cs.FlushFinalBlock();
sIn.Close();
cs.Close(); // closes sOut, no need to call Clear
}
First the salt is read (and assumed to be the same size as the key) and
then CreatePassword is called and the returned key is used to
initialize the algorithm's key. Next, the IV is read from the file and used to
initialize the algorithm. Finally, the CryptoStream is created
from the output stream and the decryptor in the Write mode. The
data is read a kilobyte at a time from the input stream and written through
the CryptoStream which decrypts the data before writing it to the
output stream. Since the read action of the input stream determines whether
the while block finishes the code must call FlushFinalBlock
to make sure that the final decrypted block is written to the output file.
Here is the CreatePassword method:
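static void CreatePassword(string phrase, int passSize, int saltSize, out byte[] pass, ref byte[] salt)
{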
if (salt == null)
{
RandomNumberGenerator rand = RandomNumberGenerator.Create();
salt = new byte[saltSize];
rand.GetBytes(salt);
}
PasswordDeriveBytes pdb = new PasswordDeriveBytes(phrase, salt);
pass = pdb.GetBytes(passSize);
}
Compile this code and test it. First, encrypt a text file (for example the source code of the utility). Open the resultant file in Notepad to convince yourself that the data is encrypted. Then call the utility again to decrypt the file you have just created, using a new name for the decrypted file. Finally, compare the two files:
encrypt encrypt.cs encrypt.enc secret
encrypt encrypt.enc encrypt_new.cs secret d
comp encrypt.cs encrypt_new.cs /D /L
Here, encrypt.cs and encrypt_new.cs are the cleartext and
decrypted cyphertext files and the /L /D switches will give the location of
any differences and display the differences in decimal. The utility should indicate that the files are the same.
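The Decrypt method above reads the salt and the IV with a single Read call each, which assumes that the stream returns all of the requested bytes at once. Stream.Read is allowed to return fewer bytes than requested, so a more defensive version might use a helper like the sketch below. The names ReadHelper and ReadFully are hypothetical, and the 4-byte "salt" and "IV" are stand-ins used only to keep the demonstration short.

```csharp
using System;
using System.IO;

// Hypothetical class and helper names, used only for this illustration.
class ReadHelper
{
    // Fills the whole buffer from the stream, or throws if the stream ends early.
    static void ReadFully(Stream s, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = s.Read(buffer, offset, buffer.Length - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the buffer was filled.");
            offset += read;
        }
    }

    static void Main()
    {
        // Four bytes standing in for the salt, four for the IV (demonstration only).
        MemoryStream s = new MemoryStream(new byte[] { 1, 2, 3, 4, 5, 6, 7, 8 });
        byte[] salt = new byte[4];
        byte[] iv = new byte[4];
        ReadFully(s, salt);
        ReadFully(s, iv);
        Console.WriteLine(BitConverter.ToString(salt)); // 01-02-03-04
        Console.WriteLine(BitConverter.ToString(iv));   // 05-06-07-08
    }
}
```

A helper like this also gives a clear error when the input file is too short to contain the salt and IV at all.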
11.3 Cryptographic Hash Algorithms
We have already seen the application of a hash function: I presented the PasswordDeriveBytes
class that generates a key from a pass phrase. The idea is that a pass phrase
input from a user is unlikely to be the most secure because it will only
contain characters from a limited character set (those available on the
keyboard). So the PasswordDeriveBytes class overcomes this by
repeatedly applying a hash function on the pass phrase - feeding the result of
the hash back into the hash function.
Hash functions are one-way; that is, you cannot calculate the original data from the hash. This is important because the hash can be made public without disclosing the data that created it. However, a hash is of little use if it is not sufficiently unique. This property of hashes, resistance to collisions, is very important.
Hash functions return a value of a specific size; for example, SHA1 returns a 160-bit (20-byte) value. This means that there are 2^160 (about 1.5 x 10^48) possible hash values. However, the number of possible inputs is far larger than this, so a specific hash value can come from at least two different inputs. The important point is that, given a hash and the original data, it should not be feasible to find another input that creates the same hash. Furthermore, it is vital that two similar inputs do not produce the same hash.
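To make the size and the sensitivity to small changes concrete, here is a minimal sketch that hashes two strings differing in a single character. The class name HashSizes and the two test strings are my own choices for the illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical class name, used only for this illustration.
class HashSizes
{
    static void Main()
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            // The two inputs differ only in one letter ("dog" versus "cog").
            byte[] a = sha1.ComputeHash(Encoding.ASCII.GetBytes("The quick brown fox jumps over the lazy dog"));
            byte[] b = sha1.ComputeHash(Encoding.ASCII.GetBytes("The quick brown fox jumps over the lazy cog"));

            // HashSize is in bits; the array length is in bytes.
            Console.WriteLine("{0} bits, {1} bytes", sha1.HashSize, a.Length); // 160 bits, 20 bytes
            Console.WriteLine(BitConverter.ToString(a));
            Console.WriteLine(BitConverter.ToString(b));
        }
    }
}
```

The two digests printed should look unrelated even though the inputs are nearly identical.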
All of the framework's hash classes derive from the abstract HashAlgorithm
class. These classes are shown in the table below.
| Class | Description | Implementation |
|---|---|---|
| MD5 | Message Digest algorithm 5, designed by Ron Rivest. 128-bit digest. It is now known that it is possible to generate two byte strings that generate the same MD5 hash. MD5 should not be used for new hashes. | MD5CryptoServiceProvider |
| RIPEMD160 | An implementation of the RACE Integrity Primitives Evaluation project's Message Digest 160-bit algorithm. .NET 3.0/2.0 | RIPEMD160Managed |
| SHA1 | Secure Hash Algorithm, designed by the NSA; a US government standard. 160-bit digest. It is more secure than MD5, although some attacks have been suggested but not proven. NIST plans to phase out SHA1 by 2010. | SHA1CryptoServiceProvider |
| SHA256 | SHA with 256-bit digest length | SHA256Managed |
| SHA384 | SHA with 384-bit digest length | SHA384Managed |
| SHA512 | SHA with 512-bit digest length | SHA512Managed |
These classes are abstract and provide a static Create method
that will return an instance of a concrete subclass:
SHA1 sha1 = SHA1.Create();
This creates an instance of the default class to be used for SHA1 and this
default class is defined by the CryptoConfig class. Within
CryptoConfig is a hash table (a collection object, not to be confused
with cryptographic hashes) that maps the name of the algorithm (for example,
SHA, SHA1, System.Security.Cryptography.SHA1)
to the class that implements it (SHA1CryptoServiceProvider). This
mapping is hard coded into the class and since the hash table is a private
member you cannot change it. Hence you cannot change the implementation. The
Implementation column in the table above shows the instances that will be
returned from Create. Note that only two of these are implemented
by the unmanaged CryptoAPI.
There are two ways to create a hash. The first is to call the
ComputeHash method and pass a byte array or a stream:
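byte[] hash;
// File.Open needs a FileMode argument; OpenRead opens an existing file for reading.
using (FileStream fs = File.OpenRead("test.dat"))
{
hash = sha1.ComputeHash(fs);
}
Console.WriteLine(BitConverter.ToString(hash));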
From an earlier page you
saw that HashAlgorithm implements ICryptoTransform,
and this means that an instance can be used to initialize a CryptoStream.
This is the second way to create a hash: you can read (or write) through the CryptoStream
object and the hash will be calculated. The ICryptoTransform
implementation on these hash classes does not affect the data that is read or
written, so a CryptoStream object based on them essentially has
pass-through methods. To obtain the hash you need to access the Hash
property of the hash object.
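The second approach can be sketched in isolation as follows. This is only an illustration under my own assumptions: the class name StreamHash and the in-memory test data are hypothetical, and the comparison with ComputeHash is there purely to show that both routes give the same digest.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical class name, used only for this illustration.
class StreamHash
{
    static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("The quick brown fox jumps over the lazy dog.");

        SHA256 sha256 = SHA256.Create();
        MemoryStream source = new MemoryStream(data);
        CryptoStream cs = new CryptoStream(source, sha256, CryptoStreamMode.Read);
        MemoryStream copy = new MemoryStream();

        // Reading through the CryptoStream passes the data on unchanged
        // while feeding it to the hash object.
        byte[] buffer = new byte[1024];
        while (true)
        {
            int read = cs.Read(buffer, 0, buffer.Length);
            if (read == 0) break;
            copy.Write(buffer, 0, read);
        }

        // The hash is complete once Read has returned zero.
        byte[] streamed = sha256.Hash;
        cs.Close();

        byte[] direct = SHA256.Create().ComputeHash(data);
        Console.WriteLine(BitConverter.ToString(streamed) == BitConverter.ToString(direct)); // True
        Console.WriteLine(copy.Length == data.Length);                                       // True
    }
}
```

Note that the Hash property is read before the stream is closed and only after the final Read has returned zero.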
The class is designed like this to allow you to create a hash while you are encrypting a file. This is a process called hash-and-encrypt and it relies on the assumption that it is not feasible to find two sets of data that have the same hash. The hash is appended to the end of the data and is encrypted with it. An attacker could change some bytes in the cyphertext that corresponds to the data. This change will be picked up when the cyphertext is decrypted because a hash taken on the decrypted data will not agree with the hash appended to it.
To see how this works use the code for symmetric key
encryption shown above (enh.cs):
using System;
using System.Security.Cryptography;
using System.Text;
using System.IO;
class App
{
static void Main()
{
Rijndael r = Rijndael.Create();
string phrase = "daisy, daisy, give me your answer to";
PasswordDeriveBytes pdb = new PasswordDeriveBytes(phrase, new byte[0]);
r.Key = pdb.GetBytes(r.KeySize>>3);
string data = "The quick brown fox jumps over the lazy dog.";
ICryptoTransform en = r.CreateEncryptor();
byte[] input = Encoding.ASCII.GetBytes(data);
input = CryptoTransform(input, en);
Console.WriteLine(BitConverter.ToString(input));
ICryptoTransform de = r.CreateDecryptor();
byte[] output = CryptoTransform(input, de);
Console.WriteLine(Encoding.ASCII.GetString(output));
}
static byte[] CryptoTransform(byte[] input, ICryptoTransform en)
{
MemoryStream sInput = new MemoryStream(input);
MemoryStream sOutput = new MemoryStream();
CryptoStream cs = new CryptoStream(sOutput, en, CryptoStreamMode.Write);
byte[] buffer = new byte[1024];
while(true)
{
int read = sInput.Read(buffer, 0, buffer.Length);
if (read == 0) break;
cs.Write(buffer, 0, read);
}
cs.FlushFinalBlock();
cs.Clear();
return sOutput.ToArray();
}
}
This takes a pass phrase and then creates a key. Then it uses this key to
encrypt some text, dumps the cyphertext to the console before decrypting the
data and printing out the result. CryptoTransform performs the
encryption and decryption using the encryptor or decryptor passed as a
parameter. To add hash-and-encrypt requires a few changes:
static byte[] CryptoTransform(byte[] input, ICryptoTransform en, bool bEncrypt)
{
MemoryStream sInput = new MemoryStream(input);
MemoryStream sOutput = new MemoryStream();
CryptoStream cs = new CryptoStream(sOutput, en, CryptoStreamMode.Write);
SHA256 sha256 = SHA256.Create();
Stream data = null;
if (bEncrypt)
data = new CryptoStream(sInput, sha256, CryptoStreamMode.Read);
else
data = sInput;
byte[] buffer = new byte[1024];
while(true)
{
int read = data.Read(buffer, 0, buffer.Length);
if (read == 0) break;
cs.Write(buffer, 0, read);
}
if (bEncrypt)
cs.Write(sha256.Hash, 0, sha256.Hash.Length);
cs.FlushFinalBlock();
cs.Clear();
if (!bEncrypt)
{
byte[] hash = new byte[sha256.HashSize >> 3];
sOutput.Position = sOutput.Length - hash.Length;
sOutput.Read(hash, 0, hash.Length);
sOutput.Position = 0;
sOutput.SetLength(sOutput.Length - hash.Length);
byte[] newHash = sha256.ComputeHash(sOutput);
for (int x = 0; x < hash.Length; ++x)
{
if (hash[x] != newHash[x])
throw new CryptographicException("Data is corrupt!");
}
}
return sOutput.ToArray();
}
The first change is that there is a boolean that determines whether the
code is being called to encrypt or decrypt the data. (Unfortunately,
ICryptoTransform does not have a property to specify whether the object
is an encryptor or a decryptor; if such a property were available then this
boolean would not be needed and this code would be cleaner.) If the data is being
encrypted then I create a CryptoStream object based on
SHA256 and use this to read the data. The data is encrypted/decrypted
as before by writing it through the CryptoStream object created
on the crypto-transform interface. Once all of the data is processed I check
to see if this is encryption, and if so, the hash is written through the
encryption stream, that is, it is encrypted too. This means that the hash is
appended to the end of the data and this new data is encrypted.
If the data is being decrypted the action is the same as previously. The difference is that the cleartext now contains the data with the hash appended. So the first thing to do is extract the hash:
sOutput.Position = sOutput.Length - hash.Length;
sOutput.Read(hash, 0, hash.Length);
Next, I need to remove this hash from the data. This is actually very
simple to do: all I need to do is tell the MemoryStream that it
is a little bit shorter. After that, a hash is calculated on the data:
sOutput.SetLength(sOutput.Length - hash.Length);
byte[] newHash = sha256.ComputeHash(sOutput);
The final part of the code compares the newly calculated hash with the hash
extracted from the decrypted cyphertext. If the two hashes disagree the data
is corrupted. To get this code to compile you need to make a few minor
adjustments to the Main method:
byte[] input = Encoding.ASCII.GetBytes(data);
input = CryptoTransform(input, en, true);
Console.WriteLine(BitConverter.ToString(input));
ICryptoTransform de = r.CreateDecryptor();
byte[] output = CryptoTransform(input, de, false);
Compile and run this code. You'll find that it works just the same as before. To see the hash in action introduce a deliberate corruption of the cyphertext:
input[0] = 0;
Console.WriteLine(BitConverter.ToString(input));
Now run the code and you'll find that an exception will be thrown. This mechanism has two properties. Firstly, it provides a check on the integrity of the data: if the data is changed between encryption and decryption (for example, while the cyphertext is being transmitted) this is detected after the data is decrypted. The second property is authentication: the cyphertext is decrypted with the secret key known only to trusted personnel, and the hash proves the integrity of the encrypted data. Since the cyphertext can only be created by someone who has access to the secret key, this authenticates the sender of the data.
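The whole round trip can be condensed into a short, self-contained sketch. This is not the CryptoTransform method from the listing; the class and method names (HashThenEncryptSketch, Protect, Unprotect) are made up for illustration, and it assumes the caller passes the same SymmetricAlgorithm object (same key and IV) to both calls.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: appends a SHA-256 hash to the data before encrypting,
// and checks that hash again after decrypting.
static class HashThenEncryptSketch
{
    // Encrypts the data followed by its SHA-256 hash.
    public static byte[] Protect(byte[] data, SymmetricAlgorithm alg)
    {
        using (SHA256 sha = SHA256.Create())
        using (ICryptoTransform en = alg.CreateEncryptor())
        using (MemoryStream output = new MemoryStream())
        using (CryptoStream cs = new CryptoStream(output, en, CryptoStreamMode.Write))
        {
            byte[] hash = sha.ComputeHash(data);
            cs.Write(data, 0, data.Length);
            cs.Write(hash, 0, hash.Length);   // the hash is encrypted too
            cs.FlushFinalBlock();
            return output.ToArray();
        }
    }

    // Decrypts, splits off the trailing hash and verifies it.
    public static byte[] Unprotect(byte[] cypher, SymmetricAlgorithm alg)
    {
        using (SHA256 sha = SHA256.Create())
        using (ICryptoTransform de = alg.CreateDecryptor())
        {
            byte[] clear = de.TransformFinalBlock(cypher, 0, cypher.Length);
            int hashLength = sha.HashSize / 8;
            int dataLength = clear.Length - hashLength;
            if (dataLength < 0)
                throw new CryptographicException("Data is too short to contain a hash.");

            // Recompute the hash over the data part and compare byte by byte.
            byte[] newHash = sha.ComputeHash(clear, 0, dataLength);
            for (int i = 0; i < hashLength; i++)
            {
                if (newHash[i] != clear[dataLength + i])
                    throw new CryptographicException("Hash mismatch: the data is corrupted.");
            }

            byte[] data = new byte[dataLength];
            Buffer.BlockCopy(clear, 0, data, 0, dataLength);
            return data;
        }
    }
}
```

Any SymmetricAlgorithm should do here, for example the Rijndael object used in the earlier examples. Changing a byte of the array returned by Protect before passing it to Unprotect should result in a CryptographicException, either from the hash comparison or from the padding check during decryption.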
The framework provides classes to perform these authentication and integrity
tests in one action. These are called keyed hash classes and they derive from
the KeyedHashAlgorithm abstract class. The idea is that a secret
key, known only to the party who sends the data and the party who receives the
data, is combined with the data in such a way as to generate a message
authentication code (MAC). The scheme relies on the key being kept secret.
The attacker, who does not know the key, will be unable to create a MAC from
the data she has tampered with.
| Class | Description |
|---|---|
| HMACMD5 | Hash-based MAC based on MD5. The key can be any size and the MAC will be 128 bits. (.NET 3.0/2.0) |
| HMACRIPEMD160 | Hash-based MAC based on RIPEMD160. The key can be any size and the MAC will be 160 bits. (.NET 3.0/2.0) |
| HMACSHA1 | Hash-based MAC based on SHA1. The key can be any size and the MAC will be 160 bits. |
| HMACSHA256 | Hash-based MAC based on SHA256. The key can be any size and the MAC will be 256 bits. (.NET 3.0/2.0) |
| HMACSHA384 | Hash-based MAC based on SHA384. The key can be any size and the MAC will be 384 bits. (.NET 3.0/2.0) |
| HMACSHA512 | Hash-based MAC based on SHA512. The key can be any size and the MAC will be 512 bits. (.NET 3.0/2.0) |
| MACTripleDES | A 16 or 24 byte key is used with TripleDES (CBC mode, padding with zeros, IV of zeros) to create cyphertext. The last block (64 bits) is the MAC. |
These classes create a MAC in two very different ways. MACTripleDES
uses a secret key to encrypt the data and then uses part of the encrypted data
(the last 8 bytes) as the MAC. The rest of the classes mix the key into the
data and then perform a hash; the key is then
mixed into that hash and another hash is performed to create the MAC. In
this case the size of the hash determines the size of the MAC.
The user has two choices:
either call ComputeHash on the entire message in one go, or
create a CryptoStream and read the data through the keyed hash
routine as the data is read for another purpose. This is essentially the same
as using the HashAlgorithm classes, with the additional
responsibility of providing a key.
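Both choices can be sketched with HMACSHA256. The class and method names below (KeyedHashSketch, MacInOneGo, MacThroughStream) are invented for this illustration, and the second method assumes that the "other purpose" is simply copying the data to a destination stream.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper showing the two ways of producing a MAC.
static class KeyedHashSketch
{
    // Choice 1: hash the entire message in one call.
    public static byte[] MacInOneGo(byte[] key, byte[] message)
    {
        using (HMACSHA256 hmac = new HMACSHA256(key))
        {
            return hmac.ComputeHash(message);
        }
    }

    // Choice 2: read the data through the keyed hash while it is being
    // copied elsewhere; the MAC is available once the source is exhausted.
    public static byte[] MacThroughStream(byte[] key, Stream source, Stream destination)
    {
        using (HMACSHA256 hmac = new HMACSHA256(key))
        {
            CryptoStream cs = new CryptoStream(source, hmac, CryptoStreamMode.Read);
            byte[] buffer = new byte[1024];
            int read;
            while ((read = cs.Read(buffer, 0, buffer.Length)) > 0)
            {
                // The hash transform passes the bytes through unchanged.
                destination.Write(buffer, 0, read);
            }
            return hmac.Hash;
        }
    }
}
```

For the same key and the same bytes the two methods should return the same 32-byte MAC. The receiver recomputes the MAC with its copy of the key and compares it with the one that arrived with the message.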
11.4 Base64 Encoding
I mentioned earlier that two of the classes that implement
ICryptoTransform are ToBase64Transform and
FromBase64Transform. These classes are used to encode rather
than encrypt data. Encryption transforms the data into a form that is
secure from unauthorised personnel using an encryption key. Encoding does not
use a key, and as such anyone can convert the data back into the decoded form.
You might ask what is the point of encoding? Well, it transforms the data into
a format that is either more acceptable, or in some cases more usable, by other
code. For example, Internet news and mail is based on text (RFC822),
and specifically on a limited range of characters. Binary data, such as images,
music or executables, will contain bytes that are outside of this range, and so
binary data cannot be placed directly in news or email messages. To
get round this restriction the news and mail protocols allow the user to
provide binary data as MIME (Multipurpose Internet Mail Extensions,
RFC2045, RFC2046, RFC2047, RFC2048 and RFC2049) attachments. The
MIME RFC specifies that base64 encoding should be used to encode binary data
into the 7 bit character set used by mail messages, and this ensures that mail
transfer agents will not alter the data during transmission. The base64
algorithm is simple, converting 3 octet binary groups into 4 octet text
groups. For binary data that contains a number of bytes that is not exactly
divisible by three base64 defines a padding scheme.
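The grouping and padding are easiest to see with very short inputs. The sketch below uses the framework's Convert.ToBase64String; the wrapper class Base64PaddingSketch and its Encode method are just names chosen for this example.

```csharp
using System;
using System.Text;

// Hypothetical wrapper used only to show base64 grouping and padding.
static class Base64PaddingSketch
{
    // Encodes the ASCII bytes of the text as base64 characters.
    public static string Encode(string asciiText)
    {
        return Convert.ToBase64String(Encoding.ASCII.GetBytes(asciiText));
    }
}
```

Encode("Man") gives "TWFu" (three octets become four characters with no padding), Encode("Ma") gives "TWE=" (one padding character) and Encode("M") gives "TQ==" (two padding characters).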
Since ToBase64Transform and
FromBase64Transform implement ICryptoTransform they can be
used to initialise a CryptoStream object. This means that you
can chain these classes to apply automatic base64 encoding and decoding as
part of the encryption and decryption code. Here is an altered version of
the CryptoTransform code that I showed above:
static byte[] CryptoTransform(byte[] input, ICryptoTransform en, bool bEncrypt)
{
    MemoryStream sInput = new MemoryStream(input);
    MemoryStream sOutput = new MemoryStream();
    CryptoStream cs;
    if (bEncrypt)
    {
        // Data written to cs is encrypted and then base64 encoded
        cs = new CryptoStream(sOutput, new ToBase64Transform(), CryptoStreamMode.Write);
        cs = new CryptoStream(cs, en, CryptoStreamMode.Write);
    }
    else
    {
        // Data written to cs is base64 decoded and then decrypted
        cs = new CryptoStream(sOutput, en, CryptoStreamMode.Write);
        cs = new CryptoStream(cs, new FromBase64Transform(), CryptoStreamMode.Write);
    }
    byte[] buffer = new byte[1024];
    while (true)
    {
        int read = sInput.Read(buffer, 0, buffer.Length);
        if (read == 0) break;
        cs.Write(buffer, 0, read);
    }
    cs.FlushFinalBlock();
    cs.Clear();
    return sOutput.ToArray();
}
This method takes a parameter that indicates whether the method is called to
encrypt or decrypt data. If the data is being encrypted then the CryptoStream is
created so that a call to Write will first encrypt the data and then base64
encode it. If the data is being decrypted the CryptoStream is
created so that the data is base64 decoded first and then decrypted. If you add this
to the previous example you can change the calling code like this:
input = CryptoTransform(input, en, true);
Console.WriteLine(Encoding.ASCII.GetString(input));
ICryptoTransform de = r.CreateDecryptor();
byte[] output = CryptoTransform(input, de, false);
Console.WriteLine(Encoding.ASCII.GetString(output));
If you run this code you will get the following:
The quick brown fox jumps over the lazy dog.
The first line is the base64 encoded encrypted data, which is suitable to attach to an email or news message.
There are several issues surrounding these two transform classes. The first
issue is that you must create a CryptoStream object to use them, yet
the action of base64 encoding and decoding is so widespread that it really warrants a
standalone base64 stream class. I mentioned above that the MIME RFC indicates
that base64 attachments should have lines no longer than 76 characters.
However, the CryptoStream class and ToBase64Transform
pay no attention to line length. So if
you use ToBase64Transform you will have to write the data to
some intermediate storage (like a MemoryStream) and then read
through the transformed data and add newlines in appropriate places. This will
involve allocating at least one more buffer. FromBase64Transform
will strip out whitespace from the data passed to it, but this is unnecessary
because it is implemented with Convert.FromBase64CharArray, which
ignores whitespace. This extra processing means lower performance.
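When the whole of the data is already in memory, one way to get 76-character lines without a hand-written pass over the output is to use the Convert class directly rather than ToBase64Transform. The sketch below assumes .NET 2.0 or later, where Base64FormattingOptions is available; the class and method names (Base64LinesSketch, EncodeForMime, Decode) are invented for this example. It is an alternative for in-memory data, not a fix for the streaming classes.

```csharp
using System;

// Hypothetical helper: base64 with MIME-style line lengths for in-memory data.
static class Base64LinesSketch
{
    // Encodes the data and inserts a CR/LF pair after every 76 characters.
    public static string EncodeForMime(byte[] data)
    {
        return Convert.ToBase64String(data, Base64FormattingOptions.InsertLineBreaks);
    }

    // Decodes the text; the line breaks are skipped as whitespace.
    public static byte[] Decode(string text)
    {
        return Convert.FromBase64String(text);
    }
}
```

Because the decoder ignores whitespace, the output of EncodeForMime can be passed straight to Decode without removing the line breaks first.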
Furthermore, Convert.FromBase64CharArray (used by
FromBase64Transform) takes an array of Char and
Convert.ToBase64CharArray (used by ToBase64Transform)
returns a Char array but the methods that use them (TransformBlock
and TransformFinalBlock) handle Byte arrays. This
means that there will always be a call to Encoding.ASCII to
convert between these two array types. This involves more array allocation and
iteration through the values in the various input buffers: yet more CPU cycles
are burned.
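The shape of that conversion can be sketched as follows. This is only an illustration of the Char-to-Byte pattern described above, not the library's actual implementation, and the names CharByteSketch and EncodeToAsciiBytes are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical illustration of the extra Char[] to Byte[] step.
static class CharByteSketch
{
    public static byte[] EncodeToAsciiBytes(byte[] input)
    {
        // First allocation: a Char array sized for the base64 output
        // (four characters for every group of up to three bytes).
        char[] chars = new char[((input.Length + 2) / 3) * 4];
        int count = Convert.ToBase64CharArray(input, 0, input.Length, chars, 0);

        // Second allocation: the Byte array that a transform has to return.
        return Encoding.ASCII.GetBytes(chars, 0, count);
    }
}
```

Each call allocates two arrays and walks the encoded characters a second time, which is the overhead the text refers to.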
Since so many temporary buffers are used this means that a lot
of copying must occur between all of these buffers. The library code does make
a concession to optimisation here because instead of using the generic
Array.Copy routine the library methods use the Buffer
class. Array.Copy and Buffer.BlockCopy are
internalcall, which means that the IL is not available, but the files
in Rotor (the Shared Source CLI) show that Array.Copy performs
various tests on the array type to see if the data can be copied and if so,
how the data should be copied, before it copies the data. The Buffer.BlockCopy
method just assumes that the data can be copied. In both cases an array of
bytes copied to another array of bytes will be performed by accessing the
interior pointer to the two arrays and doing (what is essentially) a call to
memmove. However, although these base64 methods make this
concession to performance, better performance can be obtained if the code is
designed not to perform the allocations and copies.
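For reference, the two copy routines are called in much the same way for byte arrays. The sketch below only shows the call shapes; CopySketch and its methods are made-up names, and it makes no claim about relative timings.

```csharp
using System;

// Hypothetical helper contrasting the two copy calls for byte arrays.
static class CopySketch
{
    public static byte[] CopyWithArrayCopy(byte[] source)
    {
        byte[] target = new byte[source.Length];
        // Generic routine: offsets and length are counted in elements.
        Array.Copy(source, 0, target, 0, source.Length);
        return target;
    }

    public static byte[] CopyWithBlockCopy(byte[] source)
    {
        byte[] target = new byte[source.Length];
        // Primitive-array routine: offsets and count are in bytes.
        Buffer.BlockCopy(source, 0, target, 0, source.Length);
        return target;
    }
}
```

For byte arrays the element count and the byte count are the same, so both calls take identical arguments and produce identical copies.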
As you can see this is a huge catalogue of issues and I am very surprised that Microsoft have not addressed them. So I decided that I would create my own base64 stream class. The code and description can be found here.