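Chapter 5 Conclusion
5.2 Reflections
Of everything in this project so far, the one-minute synchronization mechanism took the longest to research, and working out how to use the temporary files also took a long time. The MIME format is documented only in the original English, which demanded further effort to study. The project was originally developed on Windows, but after being hit by the Blaster worm we rewrote what had been built in ASP as JSP and moved development to a Unix-like environment. This environment was unfamiliar to us at first, so we borrowed several books from the library and studied them before we were able to write the programs. Along the way we probably read no fewer than ten to twenty books, and we feel we gained a great deal.
In addition, we learned a lot from each other about dividing up the work and communicating. Whenever part of the program was unclear, we discussed it together, so in the end the programs we had each written separately could still be integrated. These are all things we learned from this project.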
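Appendix A User Manual
1.1 The Java Connect Database Engine (jConnector)
jConnector uses the jconnect55 API provided by Sybase to access the Sybase database. It consists of the following class files:
Converts outgoing mail waiting in the database's M_MAIN_OUT table --> the intermediate files between cParser and jConnector.
Converts the intermediate files between cParser and jConnector --> the database's M_MAIN_IN and M_RECEIVER tables.
Steps:
1. Read the value of each row in the database table and save it to files.
2. Then clear the rows that have been read.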
1.2 The Java Dispatcher (DP)
ContentFilter.java uses the regular expression support introduced in Java 1.4.
It reads all the keywords to filter from ~/conf/filter.conf; any mail whose content matches a keyword is filtered.
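The keyword matching described above can be sketched as follows. This is only a minimal illustration, not the actual ContentFilter.java: the class name KeywordFilterSketch and its methods are hypothetical, the keywords are assumed to have already been read from filter.conf, and the code uses generics, so it needs a newer Java than 1.4. Each keyword is quoted so that it is matched literally, and matching is case-sensitive, as the notes in filter.conf require.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: matches mail text against a list of literal keywords.
final class KeywordFilterSketch {
    // null means "no keywords configured", so nothing is ever filtered
    private final Pattern pattern;

    KeywordFilterSketch(List<String> keywords) {
        StringBuilder alternation = new StringBuilder();
        for (String keyword : keywords) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            // Pattern.quote keeps characters such as "." from acting as regex operators
            alternation.append(Pattern.quote(keyword));
        }
        pattern = alternation.length() == 0 ? null : Pattern.compile(alternation.toString());
    }

    // Returns true when the mail text contains at least one keyword (case-sensitive).
    boolean shouldFilter(String mailText) {
        return pattern != null && pattern.matcher(mailText).find();
    }
}
```

For example, new KeywordFilterSketch(Arrays.asList("spam")).shouldFilter(body) would report whether body contains the literal word spam.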
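1.3 Classes shared by jConnector and DP
Shared class files used by both jConnector and DP:
=======================
FileNameFilter.java filters files by extension, because jConnector processes only .eml files and takes no action on other files.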
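An extension filter of this kind could be written with the standard java.io.FilenameFilter interface. The sketch below is an assumption about the approach, not the project's FileNameFilter.java; the class name EmlFileNameFilter is hypothetical, and treating the extension case-insensitively is a choice made here for illustration.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Locale;

// Hypothetical sketch: accepts only files whose name ends in ".eml".
final class EmlFileNameFilter implements FilenameFilter {
    @Override
    public boolean accept(File dir, String name) {
        // Assumption: ".EML" and ".eml" are treated alike
        return name.toLowerCase(Locale.ROOT).endsWith(".eml");
    }
}
```

It would be passed to File.listFiles, for example new File(spoolDir).listFiles(new EmlFileNameFilter()), where spoolDir stands for whichever directory holds the intermediate files.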
loadParameter.java handles system.conf and the intermediate files between cParser and jConnector.
To get an index value from system.conf, use String getIndexOf(String fileIndex).
To get the index values of an intermediate file, use Map collect(FileReader file, String table_name).
eap.getMailDist(Mail_Str)
Example:
...
emlAddrParser eap = new emlAddrParser();
String Mail_Str = "\"aaa\" <[email protected]>";
System.out.println(eap.getNickName(Mail_Str) + ", " + eap.getAddr(Mail_Str) + ", " + eap.getID(Mail_Str) + ", " + eap.getDomain(Mail_Str) + ", " + eap.getMailDist(Mail_Str));
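How such an address string might be split can be sketched with a regular expression. This is not the source of emlAddrParser: the class AddressSketch, its fields, and the address aaa@example.com used in the usage note are hypothetical, and only the nickname, ID, domain, and full address are covered because the text does not describe what getMailDist returns. The sketch assumes the "name" <id@domain> form described in this manual.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: splits a mailbox of the form "name" <id@domain>.
final class AddressSketch {
    // group 1: quoted name, group 2: id (local part), group 3: domain
    private static final Pattern MAILBOX =
            Pattern.compile("^\\s*\"([^\"]*)\"\\s*<([^@<>\\s]+)@([^<>\\s]+)>\\s*$");

    final String nickName;
    final String id;
    final String domain;

    private AddressSketch(String nickName, String id, String domain) {
        this.nickName = nickName;
        this.id = id;
        this.domain = domain;
    }

    // Parses the mailbox, rejecting input that is not in the expected form.
    static AddressSketch parse(String mailbox) {
        Matcher m = MAILBOX.matcher(mailbox);
        if (!m.matches()) {
            throw new IllegalArgumentException("not in \"name\" <id@domain> form: " + mailbox);
        }
        return new AddressSketch(m.group(1), m.group(2), m.group(3));
    }

    // Rebuilds the bare address from its two halves.
    String addr() {
        return id + "@" + domain;
    }
}
```

With the illustrative input "\"aaa\" <aaa@example.com>", parse yields the nickname aaa, the ID aaa, and the domain example.com.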
1.4 The Lex Parser (cParser)
mail_to_txt.l usage:
lex mail_to_txt.l (this generates lex.yy.c)
cc lex.yy.c -o start_parser -ll (this generates the start_parser executable)
Then rename lex.yy.c to mail_to_txt.c.
txt_to_mail.l usage:
lex txt_to_mail.l (this generates lex.yy.c)
cc lex.yy.c -o txt_to_mail -ll (this generates the txt_to_mail executable)
Then rename lex.yy.c to txt_to_mail.c.
Running the parser and unparser:
start_parser (starts parsing e-mail)
start_unparser (starts unparsing e-mail)
Appendix B Filter configuration file - filter.conf
##############################################################
# #
# File: Content Filter Keywords #
# Group: DP (Dispatcher) #
# Introduce: #
# This is a set of keywords #
# that is not allowed in any e-mails. #
# Usage notes: #
# 1. Keywords are case-sensitive #
# 2. Separate keywords with "," #
# 3. If the KeyWord line must wrap and the next line also holds keywords, #
# end the KeyWord line with "," or begin the next line with "," #
# #
##############################################################
KeyWord= 幹, fuck, damage ,陳水扁是笨蛋,
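The three usage notes in the header can be turned into a small reader, sketched below. This is an illustration under stated assumptions rather than the DP code: the class FilterConfSketch and its method are hypothetical, the file is assumed to have been read into a list of lines already, and lines starting with "#" are treated as comments. A line continues the keyword list when the text collected so far ends with "," or when the line itself begins with ",".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: extracts the keyword list from filter.conf lines.
final class FilterConfSketch {
    private static final String KEY = "KeyWord=";

    static List<String> parseKeywords(List<String> lines) {
        StringBuilder value = new StringBuilder();
        boolean collecting = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and comments carry no keywords
            }
            if (line.startsWith(KEY)) {
                if (value.length() > 0) {
                    value.append(','); // keep separate KeyWord lines apart
                }
                value.append(line.substring(KEY.length()));
                collecting = true;
            } else if (collecting
                    && (value.toString().trim().endsWith(",") || line.startsWith(","))) {
                value.append(line); // continuation per usage note 3
            } else {
                collecting = false; // some other setting ends the list
            }
        }
        List<String> keywords = new ArrayList<>();
        for (String part : value.toString().split(",")) {
            String keyword = part.trim();
            if (!keyword.isEmpty()) {
                keywords.add(keyword); // case is preserved (usage note 1)
            }
        }
        return keywords;
    }
}
```

Trailing commas and the spaces around each keyword are dropped, so a trailing "," at the end of the KeyWord line does not produce an empty keyword.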
Appendix C Relevant RFCs and notes
===========================================================
STD 11 (http://www.sri.ucl.ac.be/normes/rfc/rfc822.txt) of the Internet Official Protocol Standards defines the standard structure of e-mail.
===========================================================
RFC 822
STD 11, defines a message representation protocol specifying considerable detail about US-ASCII message headers, and leaves the message content, or message body, as flat US-ASCII text. This set of documents, collectively called the Multipurpose Internet Mail Extensions, or MIME, redefines the format of messages to allow for
(1) textual message bodies in character sets other than US-ASCII,
(2) an extensible set of different formats for non-textual message bodies,
(3) multi-part message bodies, and
(4) textual header information in character sets other than US-ASCII.
RFC 2046
defines the general structure of the MIME media typing system and defines an initial set of media types
RFC 2047
describes extensions to RFC 822 to allow non-US-ASCII text data in Internet mail header fields
===========================================================
As defined in the MESSAGE SPECIFICATION section [Page 16] of RFC 822 (ARPA INTERNET TEXT MESSAGES):
1. Due to an artifact of the notational conventions, the syntax indicates that, when present, some fields must be in a particular order.
2. Header fields are NOT required to occur in any particular order, except that the message body must occur AFTER the headers.
3. It is recommended that, if present, headers be sent in the order "Return-Path", "Received", "Date", "From", "Subject", "Sender", "To", "cc", etc.
4. This specification permits multiple occurrences of most fields. Except as noted, their interpretation is not specified here, and their use is discouraged.
5. Some header fields carry a prefix and some do not.
6. The occurrence of legal "Resent-" fields is treated identically with the occurrence of fields whose names do not contain this prefix.
Function of each header field:
01.MIME Header Fields
a)TRACE FIELDS - it indicates a route back to the sender of the message.
"Return-Path" - is used to identify a path back to the originator
"Reply-To" - is added by the originator and serves to direct replies
"Received" - A copy of this field is added by each transport service that relays the message
b)ORIGINATOR FIELDS
"From"
"Sender"
"Reply-To"
c)RECEIVER FIELDS
"To" - the primary recipients of the message.
"Cc" - the secondary (informational) recipients of the message.
"Bcc" - additional recipients of the message.
d)REFERENCE FIELDS
"Message-ID" - This field contains a unique identifier (the local-part address unit) which refers to THIS version of THIS message. The uniqueness of the message identifier is guaranteed by the host which generates it.
"In-Reply-To" - The contents of this field identify previous correspondence which this message answers.
"References" - The contents of this field identify other correspondence which this message references.
"Keywords" - This field contains keywords or phrases, separated by commas.
e)OTHER FIELDS
"Subject"
"Comments"
"Encrypted"
f)EXTENSION-FIELD
"X-"
g)USER-DEFINED-FIELD - its name must not duplicate any other header field
"***:"
h)DATE AND TIME SPECIFICATION
"Date"
02.MIME-Version Header Field
"MIME-Version"
03.Content-Type Header Field
"Content-Type"
04.Content-Transfer-Encoding Header Field
"Content-Transfer-Encoding"
05.Content-ID Header Field
"Content-ID"
06.Content-Description Header Field
"Content-Description"
07.Additional MIME Header Fields - Any RFC 822 header field which begins with the string "Content-"
"Content-"
Message body:
Multipart:
-- Recognize the "mixed" subtype. Display all relevant information on the message level and the body part header level and then display or offer to display each of the body parts individually.
-- Recognize the "alternative" subtype, and avoid showing the user redundant parts of multipart/alternative mail.
-- Recognize the "multipart/digest" subtype, specifically using "message/rfc822" rather than "text/plain" as the default media type for body parts inside "multipart/digest" entities.
-- Treat any unrecognized subtypes as if they were "mixed".
=====================================================================
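The multipart handling rules listed above amount to a small dispatch on the subtype, which can be sketched as follows. The class MultipartSubtypeSketch, the Handling enum, and the method names are hypothetical and are not part of cParser; the sketch only shows the fallback to "mixed" for unrecognized subtypes and the different default part type inside a digest. Media type names are compared case-insensitively.

```java
import java.util.Locale;

// Hypothetical sketch: chooses how to treat a multipart entity by its subtype.
final class MultipartSubtypeSketch {
    enum Handling { SHOW_EACH_PART, SHOW_ONE_ALTERNATIVE, DIGEST_OF_MESSAGES }

    // contentType is the Content-Type header value, e.g. multipart/mixed; boundary="x"
    static Handling handlingFor(String contentType) {
        String type = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        if (!type.startsWith("multipart/")) {
            throw new IllegalArgumentException("not a multipart type: " + contentType);
        }
        switch (type) {
            case "multipart/alternative":
                return Handling.SHOW_ONE_ALTERNATIVE;
            case "multipart/digest":
                return Handling.DIGEST_OF_MESSAGES;
            default:
                // "mixed" and every unrecognized subtype are treated as "mixed"
                return Handling.SHOW_EACH_PART;
        }
    }

    // Default media type for a body part that has no Content-Type of its own.
    static String defaultPartType(Handling handling) {
        return handling == Handling.DIGEST_OF_MESSAGES ? "message/rfc822" : "text/plain";
    }
}
```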
On encoding:
RFC2047, RFC2049
=====================================================================
RFC 2047 Message Header Extensions
Generally, an "encoded-word" is a sequence of printable ASCII characters that begins with "=?", ends with "?=", and has two "?"s in between. It specifies a character set and an encoding method, and also includes the original text encoded as graphic ASCII characters, according to the rules for that encoding method.
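The encoded-word layout described above (charset, encoding, encoded text, separated by "?") can be illustrated with a short decoder. This is a simplified sketch and not the project's parser: the class EncodedWordSketch is hypothetical, it handles a single encoded-word at a time, and it assumes the two encodings RFC 2047 defines, "B" (Base64) and "Q" (similar to quoted-printable, with "_" standing for a space). Input that does not look like an encoded-word is returned unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: decodes one RFC 2047 encoded-word such as =?UTF-8?B?SGVsbG8=?=
final class EncodedWordSketch {
    // group 1: charset, group 2: encoding (B or Q), group 3: encoded text
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?\\s]+)\\?([BbQq])\\?([^?\\s]*)\\?=");

    static String decode(String word) {
        Matcher m = ENCODED_WORD.matcher(word);
        if (!m.matches()) {
            return word; // plain text passes through untouched
        }
        Charset charset = Charset.forName(m.group(1));
        String text = m.group(3);
        byte[] bytes = m.group(2).equalsIgnoreCase("B")
                ? Base64.getDecoder().decode(text)
                : decodeQ(text);
        return new String(bytes, charset);
    }

    // "Q" encoding: "_" is a space, "=XX" is the byte with hex value XX.
    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < text.length()) {
                out.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }
}
```

A header value may contain several encoded-words mixed with plain text; handling that would mean applying the same pattern with find() across the whole value.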
Appendix D X Window XF86Config configuration file
Section "ServerLayout"
Identifier "XFree86 Configured"
Screen 0 "Screen0" 0 0
InputDevice "Mouse0" "CorePointer"
InputDevice "Keyboard0" "CoreKeyboard"
EndSection
Section "Files"
RgbPath "/usr/X11R6/lib/X11/rgb"
ModulePath "/usr/X11R6/lib/modules"
FontPath "/usr/X11R6/lib/X11/fonts/TrueType"
FontPath "/usr/X11R6/lib/X11/fonts/local"
FontPath "/usr/X11R6/lib/X11/fonts/misc/"
FontPath "/usr/X11R6/lib/X11/fonts/Speedo/"
FontPath "/usr/X11R6/lib/X11/fonts/Type1/"
FontPath "/usr/X11R6/lib/X11/fonts/75dpi/"
FontPath "/usr/X11R6/lib/X11/fonts/100dpi/"
EndSection
Section "Module"
Load "xtt"
Load "extmod"
Load "xie"
Load "dbe"
Load "dri"
Load "glx"
Load "record"
Load "xtrap"
Load "speedo"
Load "type1"
EndSection
Section "InputDevice"
Identifier "Keyboard0"
Driver "keyboard"
EndSection
Section "InputDevice"
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/sysmouse"
EndSection
Section "Monitor"
Identifier "Monitor0"
VendorName "Monitor Vendor"
ModelName "Monitor Model"
Horizsync 31.5-57.0
VertRefresh 50-100
EndSection
Section "Device"
### Available Driver options are:-
### Values: <i>: integer, <f>: float, <bool>: "True"/"False",
### <string>: "String", <freq>: "<f> Hz/kHz/MHz"
### [arg]: arg optional
Identifier "Card0"
Driver "nv"
VendorName "nVidia Corporation"
BoardName "NV5M64 [RIVA TNT2 Model 64/Model 64 Pro]"
BusID "PCI:1:0:0"
EndSection
Section "Screen"
Identifier "Screen0"
Device "Card0"
Monitor "Monitor0"
DefaultColorDepth 16
SubSection "Display"
Depth 1
EndSubSection
SubSection "Display"
Depth 4
EndSubSection
SubSection "Display"
Depth 8
EndSubSection
SubSection "Display"
Depth 15
EndSubSection
SubSection "Display"
Depth 16
Modes "800x600" "1024x768"
Virtual 800 600
ViewPort 0 0
EndSubSection
SubSection "Display"
Depth 24
EndSubSection
EndSection
Appendix E MRTG configuration file
# Created by
# /usr/local/bin/cfgmaker --global 'WorkDir: /home/mrtg/public_html' --global 'Options[_]: growright' --ifref=nr mrtg@localhost
### Global Config Options
# for UNIX
# WorkDir: /home/http/mrtg
# or for NT
# WorkDir: c:\mrtgdata
### Global Defaults
# to get bits instead of bytes and graphs growing to the right
# Options[_]: growright, bits
WorkDir: /home/mrtg/public_html
Options[_]: growright
Language: big5
#####################################################################
#
# System: checkme.adsldns.org
# Description: FreeBSD checkme.adsldns.org 4.9-STABLE FreeBSD 4.9-STABLE
#0: Fri Oct i386
# Contact: mrtg@localhost
# Location: checkme.adsldns.org
#####################################################################
#
### Interface 1 >> Descr: 'vr0' | Name: '' | Ip: '10.0.0.1' | Eth: '' ###
Target[localhost_1]: 1:mrtg@localhost:
SetEnv[localhost_1]: MRTG_INT_IP="10.0.0.1" MRTG_INT_DESCR="vr0"
MaxBytes[localhost_1]: 130000
Title[localhost_1]: Traffic Analysis for 1 -- checkme.adsldns.org
PageTop[localhost_1]: <H1>Traffic Analysis for 1 -- checkme.adsldns.org</H1>
 <TABLE>
 <TR><TD>System:</TD> <TD>checkme.adsldns.org in checkme.adsldns.org</TD></TR>
 <TR><TD>Maintainer:</TD> <TD>mrtg@localhost</TD></TR>
 <TR><TD>Description:</TD><TD>vr0 </TD></TR>
 <TR><TD>ifType:</TD> <TD>ethernetCsmacd (6)</TD></TR>
 <TR><TD>ifName:</TD> <TD></TD></TR>
 <TR><TD>Max Speed:</TD> <TD>130.0 kBytes/s</TD></TR>
 <TR><TD>Ip:</TD> <TD>10.0.0.1 ()</TD></TR>
 </TABLE>
### Interface 2 >> Descr: 'rl0' | Name: '' | Ip: '192.168.0.1' | Eth: '' ###
Target[localhost_2]: 2:mrtg@localhost:
SetEnv[localhost_2]: MRTG_INT_IP="192.168.0.1" MRTG_INT_DESCR="rl0"
MaxBytes[localhost_2]: 100000
Title[localhost_2]: Traffic Analysis for 2 -- checkme.adsldns.org
PageTop[localhost_2]: <H1>Traffic Analysis for 2 -- checkme.adsldns.org</H1>
 <TABLE>
 <TR><TD>System:</TD> <TD>checkme.adsldns.org in checkme.adsldns.org</TD></TR>
 <TR><TD>Maintainer:</TD> <TD>mrtg@localhost</TD></TR>
 <TR><TD>Description:</TD><TD>rl0 </TD></TR>
 <TR><TD>ifType:</TD> <TD>ethernetCsmacd (6)</TD></TR>
 <TR><TD>ifName:</TD> <TD></TD></TR>
 <TR><TD>Max Speed:</TD> <TD>100.0 kBytes/s</TD></TR>
 <TR><TD>Ip:</TD> <TD>192.168.0.1 (checkme.adsldns.org)</TD></TR>
 </TABLE>
### Interface 8 >> Descr: 'tun0' | Name: '' | Ip: '210.244.81.117' | Eth: '' ###
Target[localhost_8]: 8:mrtg@localhost:
SetEnv[localhost_8]: MRTG_INT_IP="210.244.81.117" MRTG_INT_DESCR="tun0"
MaxBytes[localhost_8]: 130000
Title[localhost_8]: Traffic Analysis for 8 -- checkme.adsldns.org
PageTop[localhost_8]: <H1>Traffic Analysis for 8 -- checkme.adsldns.org</H1>
 <TABLE>
 <TR><TD>System:</TD> <TD>checkme.adsldns.org in checkme.adsldns.org</TD></TR>
 <TR><TD>Maintainer:</TD> <TD>mrtg@localhost</TD></TR>
 <TR><TD>Description:</TD><TD>tun0 </TD></TR>
 <TR><TD>ifType:</TD> <TD>ppp (23)</TD></TR>
 <TR><TD>ifName:</TD> <TD></TD></TR>
 <TR><TD>Max Speed:</TD> <TD>130.0 kBytes/s</TD></TR>
 <TR><TD>Ip:</TD> <TD>float</TD></TR>
 </TABLE>
### CPU
LoadMIBs: /usr/local/share/snmp/mibs/UCD-SNMP-MIB.txt
Target[BoBo.cpu]:ssCpuRawUser.0&ssCpuRawIdle.0:mrtg@localhost
RouterUptime[BoBo.cpu]: mrtg@localhost
MaxBytes[BoBo.cpu]: 100
Title[BoBo.cpu]: CPU LOAD
PageTop[BoBo.cpu]: <H1>User vs Idle CPU usage</H1>
Unscaled[BoBo.cpu]: ymwd
ShortLegend[BoBo.cpu]: %
YLegend[BoBo.cpu]: CPU Utilization
Legend1[BoBo.cpu]: User CPU in % (Load)
Legend2[BoBo.cpu]: Idle CPU in % (Load)
Legend3[BoBo.cpu]:
Legend4[BoBo.cpu]:
LegendI[BoBo.cpu]: User
LegendO[BoBo.cpu]: Idle
Options[BoBo.cpu]: growright,nopercent
### Memory
Target[freemem]: .1.3.6.1.4.1.2021.4.11.0&.1.3.6.1.4.1.2021.4.11.0:mrtg@localhost
Options[freemem]: nopercent,growright,gauge,noinfo
Title[freemem]: Free Memory
PageTop[freemem]: <H1>Free Memory</H1>
MaxBytes[freemem]: 134152192
kMG[freemem]: k,M,G,T,P,X
YLegend[freemem]: bytes
ShortLegend[freemem]: bytes
LegendI[freemem]: Free Memory:
LegendO[freemem]:
Legend1[freemem]: Free memory, not including swap, in bytes
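Table 11
M_MAIN_IN: the part of a message that is shared in the database (Note 1)
Field name  Data type  Remarks
FROM_ID  Varchar(16)  Sender account
    Format: the characters A~Z and 0~9 (national ID number)
FROM_NAME  Varchar(20)  Sender name
    Format: Chinese or English characters
    Restrictions:
    1. A user's Name must not contain " (double quote) or , (comma), because these two characters are the delimiters used to parse out the Name
    2. The Name must be enclosed in " "
    3. The e-mail address must be enclosed in < >
    Example: "我是酷司拉" <[email protected]>
TO_LIST  TEXT  List of primary recipients
    Format: "name" <address>
CC_LIST  TEXT  List of secondary recipients
Format: <SysTime:"system time" @ GroupNum:"group number">
The system time is taken to millisecond (ms) precision
The boundary in the Content-Type header on the first line is used to determine the extent of each part in a multipart message
ATTACH  TEXT  Attachments (the attachment portion of the MIME body)
    Note: attachment parser rule: everything other than the body text is treated as an attachment, because mail arriving from outside can carry content-types other than attachment, and all of them are treated as attachments
SIZE  Int  Message size
    Format: size in bytes, attachment size + body size
    Note: computed by cParser and handed to jConnector to store in the database (M_MAIN_IN)
MISSION_ID  Varchar(3)  Mission number
    Note: a header placed at the very front of a MIME-format message, used to tell whether the stored item is an ordinary "mail" or a "file" used for another purpose
INDEX: FROM_ID, MESSAGE_ID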
[Figure: FreeBSD Gateway, DB group, cParser, jConnector, WebMail interface]
When mail arrives from outside, cParser is responsible for joining multiple lines into a single line before the message is stored in WebMail.
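Table 12
M_RECEIVER
The part of a message that is stored separately in the database: everyone listed in To, Cc, or Bcc gets their own copy, rather than sharing one as in M_MAIN_IN
Field name  Data type  Remarks
MESSAGE_ID  Varchar(20)
OWNER_ID  Varchar(10)  Recipient ID
    Example: for "阿貓" <[email protected]> the recipient ID is aaa
OWNER_NAME  Varchar(10)  Recipient nickname
    Example: "阿貓" <[email protected]>
TRASHCAN  Bit  Deleted (flag)
    Format: 0: outbox 1: trash
PERUSE  Bit  Message read (flag)
    Format: 0: unread 1: read
INDEX: MESSAGE_ID + KIND + OWNER_ID + SEVER_ID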
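Table 13
Format: "name" <address>
Example: "我是酷司拉" <[email protected]>
TO_LIST  TEXT  List of primary recipients
    Format: "name" <address>
CC_LIST  TEXT  List of secondary recipients
    Format: same as TO_LIST
BCC_LIST  TEXT  List of hidden recipients
    Format: same as TO_LIST
    Note: besides the TO_LIST and CC_LIST that M_MAIN_IN has, M_MAIN_OUT has the additional BCC field
Format: <SysTime:"system time" @ GroupNum:"group number">
The system time is taken to millisecond precision
Example: <SysTime:20030904014978000@GroupNum:3939889>
BODY  TEXT  Message body (the MIME body, excluding attachments)
    Note: WebMail must generate the Content-Type and boundary inside BODY. Example:
    Content-Type:multipart/mixed; boundary="----=_NextPart_000_000B_01C3666E.ED61C830"
ATTACH  TEXT  Attachments (the attachment portion of the MIME body)
    Note: attachment parser rule: everything other than the body text is treated as an attachment, because mail arriving from outside can carry content-types other than attachment, and all of them are treated as attachments
COPY  Bit  Backup (flag)
    Format: always 0, because the mail has just been created and has not been sent yet
TRASHCAN  Bit  Deleted (flag)
    Format: always 0, because the mail has just been created and has not been sent yet
Format: always 0, because the mail has just been created and has not been sent yet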
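The identifier format shown in the tables, <SysTime:...@GroupNum:...>, could be produced and checked as sketched below. This is an illustration only: the class MessageIdSketch and its methods are hypothetical, and the timestamp layout yyyyMMddHHmmssSSS (17 digits, down to milliseconds) is an assumption, since the tables state only that the system time is taken to the millisecond.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Pattern;

// Hypothetical sketch: builds and checks identifiers of the form
// <SysTime:digits@GroupNum:digits>
final class MessageIdSketch {
    // Assumed layout: year, month, day, hour, minute, second, millisecond
    private static final DateTimeFormatter SYS_TIME =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS");
    private static final Pattern FORMAT =
            Pattern.compile("<SysTime:(\\d+)@GroupNum:(\\d+)>");

    static String build(LocalDateTime sysTime, int groupNum) {
        return "<SysTime:" + SYS_TIME.format(sysTime) + "@GroupNum:" + groupNum + ">";
    }

    // Checks only the overall shape, not whether the digits form a valid date.
    static boolean isWellFormed(String id) {
        return FORMAT.matcher(id).matches();
    }
}
```

Passing the time in as a parameter, rather than reading the clock inside build, keeps the sketch easy to test; a caller would normally supply LocalDateTime.now().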