

Data Hiding in Emails and Applications by Unused ASCII Control Codes

6.3 Embedding ASCII Control Codes into Emails

In this study, we identify five possible ways of embedding secret data in emails using ASCII control codes. They are listed as follows.

(1) White-space coding --- As mentioned previously, there are many different white-space codes, each of which is displayed as a white space and so yields the same visual effect as the original ASCII space code 20. For example, as found in this study, under the Big 5 standard in Outlook Express each of the three ASCII codes 07, 09, and 0C is displayed as a white space. Therefore, in a data hiding process we can use any of them to replace a white space in an email text, and the resulting stego-email will not attract the reader's notice.

(2) Inserting multiple white-space codes at text line ends --- We may place multiple white-space codes before the CRLF at the end of a text line. Since nothing but background white space is shown after the last visible character of a line, these additionally inserted white-space codes, though displayed as white spaces, merge with the background white space and thus have no noticeable effect on the reader.

(3) Null-space coding --- As mentioned previously, there are many null-space codes, which are displayed as nothing. In a data hiding process we can thus insert them at any position in a line, any number of times, without attracting the reader's notice. For example, as found in this study, under the UTF-8 standard in IE the four null-space codes 1C, 1D, 1E, and 1F are invisible.

(4) Inserting multiple null-space codes at text line ends --- We may place null-space codes repeatedly at the end of a text line without causing a noticeable effect, because they are invisible when displayed. The placement is the same as in (2) above.

(5) Combining techniques of the above --- We may combine the above techniques in arbitrary ways if both white-space and null-space coding are applicable in the environment.
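As an illustration of techniques (2) and (4), the following minimal Python sketch places a string of control codes just before the CRLF of each text line and shows that removing the codes restores the cover text. The function names and the per-line payload list are hypothetical and introduced only for this example; whether the inserted codes are actually invisible depends on the mail client and character set, as discussed in this section.

```python
CRLF = "\r\n"
# Codes reported in this study as invisible under UTF-8 in IE.
NULL_SPACE_CODES = "\x1c\x1d\x1e\x1f"


def append_before_crlf(text: str, payloads: list[str]) -> str:
    """Insert payloads[i] immediately before the CRLF that ends line i.

    Lines beyond the end of the payload list are left unchanged.
    """
    lines = text.split(CRLF)
    out = []
    for i, line in enumerate(lines):
        extra = payloads[i] if i < len(payloads) else ""
        out.append(line + extra)
    return CRLF.join(out)


def strip_codes(text: str, codes: str = NULL_SPACE_CODES) -> str:
    """Remove every occurrence of the given control codes from the text."""
    return text.translate({ord(c): None for c in codes})


cover = "Hello" + CRLF + "World" + CRLF
stego = append_before_crlf(cover, ["\x1c\x1d", "\x1f"])
```

Here `stego` carries two codes at the end of the first line and one at the end of the second, and `strip_codes(stego)` returns the original cover text.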

From the above discussion, we see that the ASCII control codes usable for embedding secret data vary with the server, the browser, and the character set. To investigate this systematically, we created an email file that includes all the ASCII control codes shown in Table 1. We used it to find SMTP server software suitable for data embedding and to observe how each control code appears after it is processed and displayed in the environment of that server software. The investigation results are described as follows.

First, we found four SMTP email servers that do not change the text contents of emails and so can be used as standard SMTP servers for data embedding in this study. Their uniform resource locators (URLs) are http://cis.nctu.edu.tw, http://mis.tsint.edu.tw, http://tw.yahoo.com, and http://www.hotmail.com.

The first server is located in the Department of Computer Science at National Chiao Tung University in Taiwan and runs the SMTP software Twig 2.7.7. The department also has another SMTP server system, Horde, for web mail. The second server is located at the Department of Management Information at Technology and Science Institute of Northern Taiwan; its SMTP software is SendMail 8.12. The third server is located in Taiwan and handles web mail under the name Yahoo! Mail. The last server is Hotmail, a web mail server of Microsoft Corporation. After registering at any of these four servers, a user may read, transmit, or receive emails with Outlook Express or IE.

In this study, the email format we use is MIME 1.0, the content-type is text/plain, and the character set is UTF-8. These formats are very commonly used and so are adopted in this study for data hiding applications.

After a systematic test of the ASCII character set on the above-mentioned four servers, we found that the hexadecimal ASCII control codes appropriate for data embedding under both the Outlook Express and the IE environments are 1C, 1D, 1E, and 1F. These four codes are all invisible in the IE browser, and all are shown as white spaces in the Outlook Express window. They can thus be used for data embedding according to technique (2) in Outlook Express, where they behave as white-space codes, and according to technique (4) in IE, where they behave as null-space codes.

However, our goal is to apply techniques (2) and (4) simultaneously rather than separately, which leads to a method of repeatedly placing these four ASCII control codes at the ends of email text lines. The displayed stego-email will then look no different from the original cover email in either environment, thus achieving the steganographic effect.

More specifically, we use the following encoding rules to embed secret data into the text line ends of a cover email.

1. Encode 2-bit binary secret data “00,” “01,” “10,” and “11” with the four ASCII codes 1C, 1D, 1E, and 1F, respectively.

2. Put the unique combined ASCII codes 201E in front of a sequence of secret data as its start signal, and append another copy of it at the sequence tail as the end signal.

3. Use the unique combined ASCII codes 201C to encode the 1-bit data ‘0,’ and the combined codes 201D to encode ‘1.’

4. Use the unique combined ASCII codes 201F as a separator to stop the underline display that starts from a special lexical token of a network protocol, such as http, ftp, or email.
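Rules 1 to 3 can be sketched in Python as below. This is only an illustration under stated assumptions, not the algorithm of the next section: the function names are hypothetical, the whole secret is encoded as one sequence rather than distributed over several line ends, and rule 3 is assumed to apply only to a single leftover bit when the secret has odd length (the rules above do not state when the 1-bit codes are used).

```python
SPACE = "\x20"
# Rule 1: 2-bit groups map to the four control codes.
PAIR_CODES = {"00": "\x1c", "01": "\x1d", "10": "\x1e", "11": "\x1f"}
# Rule 2: the combination 20 1E marks both the start and the end.
SIGNAL = SPACE + "\x1e"
# Rule 3: the combinations 20 1C and 20 1D encode a single bit.
SINGLE_BIT = {"0": SPACE + "\x1c", "1": SPACE + "\x1d"}


def encode_bits(bits: str) -> str:
    """Encode a string of '0'/'1' characters as a control-code sequence."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    body = []
    even_length = len(bits) - len(bits) % 2
    for i in range(0, even_length, 2):
        body.append(PAIR_CODES[bits[i:i + 2]])
    if len(bits) % 2:
        # Assumption: a leftover odd bit is encoded with rule 3.
        body.append(SINGLE_BIT[bits[-1]])
    return SIGNAL + "".join(body) + SIGNAL


def decode_bits(stego: str) -> str:
    """Recover the bit string lying between the start and end signals."""
    start = stego.find(SIGNAL)
    end = stego.rfind(SIGNAL)
    if start == -1 or end <= start:
        raise ValueError("start/end signals not found")
    payload = stego[start + len(SIGNAL):end]
    pair_of = {code: pair for pair, code in PAIR_CODES.items()}
    bits = []
    i = 0
    while i < len(payload):
        ch = payload[i]
        if ch == SPACE:
            # A space inside the payload introduces a 1-bit code (rule 3).
            nxt = payload[i + 1:i + 2]
            if nxt == "\x1c":
                bits.append("0")
            elif nxt == "\x1d":
                bits.append("1")
            else:
                raise ValueError("malformed 1-bit code")
            i += 2
        elif ch in pair_of:
            bits.append(pair_of[ch])
            i += 1
        else:
            raise ValueError("unexpected character in payload")
    return "".join(bits)
```

For instance, `encode_bits("0110")` yields the code sequence 20 1E, 1D, 1E, 20 1E, and `decode_bits` applied to that sequence returns "0110". Inside the payload a space is always followed by 1C or 1D, so the combination 20 1E cannot occur there and the two signals remain unambiguous.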

Rule 4 above is necessary because otherwise, when a text line ends with a network protocol token, the extra white-space codes we insert at the line end become attached to that token and appear as underlined white spaces, as in "http://cis.nctu.edu.tw  " with the trailing blanks underlined. This obviously works against the purpose of steganography.
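A minimal Python sketch of rule 4 follows. The way a trailing network protocol token is detected here (a short prefix list plus an "@" test for email addresses) is an assumption made for illustration, since the exact tokens that a mail client underlines are not specified above. The function names are likewise hypothetical.

```python
SEPARATOR = "\x20\x1f"  # Rule 4: the combination 20 1F.
# Assumed prefixes of tokens that a mail client displays as underlined links.
LINK_PREFIXES = ("http://", "https://", "ftp://")


def ends_with_link(line: str) -> bool:
    """Guess whether the last token of the line would be shown as a link."""
    if not line or line[-1].isspace():
        return False
    last = line.split()[-1]
    # Assumption: any token containing '@' is treated as an email address.
    return last.lower().startswith(LINK_PREFIXES) or "@" in last


def embed_at_line_end(line: str, hidden: str) -> str:
    """Append hidden codes to a line, inserting the separator when needed."""
    prefix = SEPARATOR if ends_with_link(line) else ""
    return line + prefix + hidden
```

With this sketch, a line ending in `http://cis.nctu.edu.tw` receives the separator 20 1F before the hidden codes, while a line ending in ordinary text receives the hidden codes directly.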

Based on the above rules, we describe the proposed data hiding algorithm for the purpose of covert communication and authentication in the next section.