  1. Network Working Group N. Freed
  2. Request for Comments: 2049 Innosoft
  3. Obsoletes: 1521, 1522, 1590 N. Borenstein
  4. Category: Standards Track First Virtual
  5. November 1996
  6. Multipurpose Internet Mail Extensions
  7. (MIME) Part Five:
  8. Conformance Criteria and Examples
  9. Status of this Memo
  10. This document specifies an Internet standards track protocol for the
  11. Internet community, and requests discussion and suggestions for
  12. improvements. Please refer to the current edition of the "Internet
  13. Official Protocol Standards" (STD 1) for the standardization state
  14. and status of this protocol. Distribution of this memo is unlimited.
  15. Abstract
  16. STD 11, RFC 822, defines a message representation protocol specifying
  17. considerable detail about US-ASCII message headers, and leaves the
  18. message content, or message body, as flat US-ASCII text. This set of
  19. documents, collectively called the Multipurpose Internet Mail
  20. Extensions, or MIME, redefines the format of messages to allow for
  21. (1) textual message bodies in character sets other than
  22. US-ASCII,
  23. (2) an extensible set of different formats for non-textual
  24. message bodies,
  25. (3) multi-part message bodies, and
  26. (4) textual header information in character sets other than
  27. US-ASCII.
  28. These documents are based on earlier work documented in RFC 934, STD
  29. 11, and RFC 1049, but extend and revise them. Because RFC 822 said
  30. so little about message bodies, these documents are largely
  31. orthogonal to (rather than a revision of) RFC 822.
  32. The initial document in this set, RFC 2045, specifies the various
  33. headers used to describe the structure of MIME messages. The second
  34. document, RFC 2046, defines the general structure of the MIME media typing
  35. system and defines an initial set of media types. The third
  36. document, RFC 2047, describes extensions to RFC 822 to allow non-US-
  37. Freed & Borenstein Standards Track [Page 1]
  38. RFC 2049 MIME Conformance November 1996
  39. ASCII text data in Internet mail header fields. The fourth document,
  40. RFC 2048, specifies various IANA registration procedures for MIME-
  41. related facilities. This fifth and final document describes MIME
  42. conformance criteria and provides some illustrative examples
  43. of MIME message formats, acknowledgements, and the bibliography.
  44. These documents are revisions of RFCs 1521, 1522, and 1590, which
  45. themselves were revisions of RFCs 1341 and 1342. Appendix B of this
  46. document describes differences and changes from previous versions.
  47. Table of Contents
  48. 1. Introduction .......................................... 2
  49. 2. MIME Conformance ...................................... 2
  50. 3. Guidelines for Sending Email Data ..................... 6
  51. 4. Canonical Encoding Model .............................. 9
  52. 5. Summary ............................................... 12
  53. 6. Security Considerations ............................... 12
  54. 7. Authors' Addresses .................................... 12
  55. 8. Acknowledgements ...................................... 13
  56. A. A Complex Multipart Example ........................... 15
  57. B. Changes from RFC 1521, 1522, and 1590 ................. 16
  58. C. References ............................................ 20
  59. 1. Introduction
  60. The first and second documents in this set define MIME header fields
  61. and the initial set of MIME media types. The third document
  62. describes extensions to RFC822 formats to allow for character sets
  63. other than US-ASCII. This document describes what portions of MIME
  64. must be supported by a conformant MIME implementation. It also
  65. describes various pitfalls of contemporary messaging systems as well
  66. as the canonical encoding model MIME is based on.
  67. 2. MIME Conformance
  68. The mechanisms described in these documents are open-ended. It is
  69. definitely not expected that all implementations will support all
  70. available media types, nor that they will all share the same
  71. extensions. In order to promote interoperability, however, it is
  72. useful to define the concept of "MIME-conformance" as a
  73. certain level of implementation that allows the useful interworking
  74. of messages with content that differs from US-ASCII text. In this
  75. section, we specify the requirements for such conformance.
  78. A mail user agent that is MIME-conformant MUST:
  79. (1) Always generate a "MIME-Version: 1.0" header field in
  80. any message it creates.
  81. (2) Recognize the Content-Transfer-Encoding header field
  82. and decode all received data encoded by either quoted-
  83. printable or base64 implementations. The identity
  84. transformations 7bit, 8bit, and binary must also be
  85. recognized.
  86. Any non-7bit data that is sent without encoding must be
  87. properly labelled with a content-transfer-encoding of
  88. 8bit or binary, as appropriate. If the underlying
  89. transport does not support 8bit or binary (as SMTP
  90. [RFC-821] does not), the sender is required to both
  91. encode and label data using an appropriate Content-
  92. Transfer-Encoding such as quoted-printable or base64.
  93. (3) Treat any unrecognized Content-Transfer-Encoding
  94. as if it had a Content-Type of "application/octet-
  95. stream", regardless of whether or not the actual
  96. Content-Type is recognized.
  97. (4) Recognize and interpret the Content-Type header field,
  98. and avoid showing users raw data with a Content-Type
  99. field other than text. Implementations must be able
  100. to send at least text/plain messages, with the
  101. character set specified with the charset parameter if
  102. it is not US-ASCII.
  103. (5) Ignore any content type parameters whose names they do
  104. not recognize.
  105. (6) Explicitly handle the following media type values, to
  106. at least the following extents:
  107. Text:
  108. -- Recognize and display "text" mail with the
  109. character set "US-ASCII."
  110. -- Recognize other character sets at least to the
  111. extent of being able to inform the user about what
  112. character set the message uses.
  115. -- Recognize the "ISO-8859-*" character sets to the
  116. extent of being able to display those characters that
  117. are common to ISO-8859-* and US-ASCII, namely all
  118. characters represented by octet values 1-127.
  119. -- For unrecognized subtypes in a known character
  120. set, show or offer to show the user the "raw" version
  121. of the data after conversion of the content from
  122. canonical form to local form.
  123. -- Treat material in an unknown character set as if
  124. it were "application/octet-stream".
  125. Image, audio, and video:
  126. -- At a minimum provide facilities to treat any
  127. unrecognized subtypes as if they were
  128. "application/octet-stream".
  129. Application:
  130. -- Offer the ability to remove either of the quoted-
  131. printable or base64 encodings defined in this
  132. document if they were used and put the resulting
  133. information in a user file.
  134. Multipart:
  135. -- Recognize the mixed subtype. Display all relevant
  136. information on the message level and the body part
  137. header level and then display or offer to display
  138. each of the body parts individually.
  139. -- Recognize the "alternative" subtype, and avoid
  140. showing the user redundant parts of
  141. multipart/alternative mail.
  142. -- Recognize the "multipart/digest" subtype,
  143. specifically using "message/rfc822" rather than
  144. "text/plain" as the default media type for body parts
  145. inside "multipart/digest" entities.
  146. -- Treat any unrecognized subtypes as if they were
  147. "mixed".
  150. Message:
  151. -- Recognize and display at least the RFC822 message
  152. encapsulation (message/rfc822) in such a way as to
  153. preserve any recursive structure, that is, displaying
  154. or offering to display the encapsulated data in
  155. accordance with its media type.
  156. -- Treat any unrecognized subtypes as if they were
  157. "application/octet-stream".
  158. (7) Upon encountering any unrecognized Content-Type field,
  159. an implementation must treat it as if it had a media
  160. type of "application/octet-stream" with no parameter
  161. sub-arguments. How such data are handled is up to an
  162. implementation, but likely options for handling such
  163. unrecognized data include offering to write it
  164. into a file (decoded from its mail transport format) or
  165. asking the user to name a program to which the
  166. decoded data should be passed as input.
  167. (8) Conformant user agents are required, if they provide
  168. non-standard support for non-MIME messages employing
  169. character sets other than US-ASCII, to do so on
  170. received messages only. Conforming user agents must not
  171. send non-MIME messages containing anything other than
  172. US-ASCII text.
  173. In particular, the use of non-US-ASCII text in mail
  174. messages without a MIME-Version field is strongly
  175. discouraged as it impedes interoperability when sending
  176. messages between regions with different localization
  177. conventions. Conforming user agents MUST include proper
  178. MIME labelling when sending anything other than plain
  179. text in the US-ASCII character set.
  180. In addition, non-MIME user agents should be upgraded if
  181. at all possible to include appropriate MIME header
  182. information in the messages they send even if nothing
  183. else in MIME is supported. This upgrade will have
  184. little, if any, effect on non-MIME recipients and will
  185. aid MIME user agents in correctly displaying such messages. It
  186. also provides a smooth transition path to eventual
  187. adoption of other MIME capabilities.
  188. (9) Conforming user agents must ensure that any string of
  189. non-white-space printable US-ASCII characters within a
  190. "*text" or "*ctext" that begins with "=?" and ends with
  193. "?=" be a valid encoded-word. ("begins" means: At the
  194. start of the field-body or immediately following
  195. linear-white-space; "ends" means: At the end of the
  196. field-body or immediately preceding linear-white-
  197. space.) In addition, any "word" within a "phrase" that
  198. begins with "=?" and ends with "?=" must be a valid
  199. encoded-word.
  200. (10) Conforming user agents must be able to distinguish
  201. encoded-words from "text", "ctext", or "word"s,
  202. according to the rules in section 4, anytime they
  203. appear in appropriate places in message headers. They
  204. must support both the "B" and "Q" encodings for any
  205. character set which they support. The program must be
  206. able to display the unencoded text if the character set
  207. is "US-ASCII". For the ISO-8859-* character sets, the
  208. mail reading program must at least be able to display
  209. the characters which are also in the US-ASCII set.
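Requirement (10) can be illustrated with a short Python sketch. It assumes the standard library's email.header.decode_header as the encoded-word parser; the helper name decode_words and the sample strings are hypothetical and not part of this specification.

```python
from email.header import decode_header

def decode_words(value: str) -> str:
    """Return a displayable form of a header field body (hypothetical helper)."""
    pieces = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Encoded-words arrive as bytes plus their declared charset;
            # plain text next to them arrives as bytes with charset None.
            pieces.append(chunk.decode(charset or "us-ascii", errors="replace"))
        else:
            # No encoded-word was found, so the value comes back unchanged.
            pieces.append(chunk)
    return "".join(pieces)

print(decode_words("=?iso-8859-1?q?caf=E9?="))    # "Q" encoding
print(decode_words("=?ISO-8859-1?B?Y2Fm6Q==?="))  # "B" encoding
```

In this sketch an unknown character set raises LookupError; a real user agent would more likely fall back to telling the user which character set the text uses.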
  210. A user agent that meets the above conditions is said to be MIME-
  211. conformant. The meaning of this phrase is that it is assumed to be
  212. "safe" to send virtually any kind of properly-marked data to users of
  213. such mail systems, because such systems will at least be able to
  214. treat the data as undifferentiated binary, and will not simply splash
  215. it onto the screen of unsuspecting users.
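The fallback rules in requirements (3), (6), and (7) can be summarized in a small Python sketch. The function name effective_type, its parameters, and the default set of supported types are assumptions made for illustration. Mapping an unrecognized text subtype to "text/plain" is likewise only one way to model "show the raw version", and character set checks are omitted.

```python
KNOWN_ENCODINGS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}
OCTET_STREAM = "application/octet-stream"

def effective_type(media_type: str, encoding: str = "7bit",
                   supported=frozenset({"text/plain"})) -> str:
    """Return the media type a user agent should act on (hypothetical helper)."""
    media_type = media_type.strip().lower()
    encoding = encoding.strip().lower()
    # Requirement (3): an unrecognized transfer encoding hides the content,
    # whatever the declared Content-Type says.
    if encoding not in KNOWN_ENCODINGS:
        return OCTET_STREAM
    if media_type in supported:
        return media_type
    top_level = media_type.partition("/")[0]
    # Requirement (6): unrecognized multipart subtypes are treated as "mixed".
    if top_level == "multipart":
        return "multipart/mixed"
    # Requirement (6): unrecognized text subtypes are shown raw
    # (modelled here as text/plain; an assumption of this sketch).
    if top_level == "text":
        return "text/plain"
    # Requirements (6) and (7): everything else is undifferentiated binary.
    return OCTET_STREAM

print(effective_type("multipart/x-unknown"))
print(effective_type("image/x-unknown"))
print(effective_type("text/plain", "x-strange"))
```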
  216. There is another sense in which it is always "safe" to send data in a
  217. format that is MIME-conformant, which is that such data will not
  218. break or be broken by any known systems that are conformant with RFC
  219. 821 and RFC 822. User agents that are MIME-conformant have the
  220. additional guarantee that the user will not be shown data that were
  221. never intended to be viewed as text.
  222. 3. Guidelines for Sending Email Data
  223. Internet email is not a perfect, homogeneous system. Mail may become
  224. corrupted at several stages in its travel to a final destination.
  225. Specifically, email sent throughout the Internet may travel across
  226. many networking technologies. Many networking and mail technologies
  227. do not support the full functionality possible in the SMTP transport
  228. environment. Mail traversing these systems is likely to be modified
  229. in order that it can be transported.
  230. There exist many widely-deployed non-conformant MTAs in the Internet.
  231. These MTAs, speaking the SMTP protocol, alter messages on the fly to
  232. take advantage of the internal data structure of the hosts they are
  233. implemented on, or are just plain broken.
  236. The following guidelines may be useful to anyone devising a data
  237. format (media type) that is supposed to survive the widest range of
  238. networking technologies and known broken MTAs unscathed. Note that
  239. anything encoded in the base64 encoding will satisfy these rules, but
  240. that some well-known mechanisms, notably the UNIX uuencode facility,
  241. will not. Note also that anything encoded in the Quoted-Printable
  242. encoding will survive most gateways intact, but possibly not some
  243. gateways to systems that use the EBCDIC character set.
  244. (1) Under some circumstances the encoding used for data may
  245. change as part of normal gateway or user agent
  246. operation. In particular, conversion from base64 to
  247. quoted-printable and vice versa may be necessary. This
  248. may result in the confusion of CRLF sequences with line
  249. breaks in text bodies. As such, the persistence of
  250. CRLF as something other than a line break must not be
  251. relied on.
  252. (2) Many systems may elect to represent and store text data
  253. using local newline conventions. Local newline
  254. conventions may not match the RFC822 CRLF convention --
  255. systems are known that use plain CR, plain LF, CRLF, or
  256. counted records. The result is that isolated CR and LF
  257. characters are not well tolerated in general; they may
  258. be lost or converted to delimiters on some systems, and
  259. hence must not be relied on.
  260. (3) The transmission of NULs (US-ASCII value 0) is
  261. problematic in Internet mail. (This is largely the
  262. result of NULs being used as a termination character by
  263. many of the standard runtime library routines in the C
  264. programming language.) The practice of using NULs as
  265. termination characters is so entrenched now that
  266. messages should not rely on them being preserved.
  267. (4) TAB (HT) characters may be misinterpreted or may be
  268. automatically converted to variable numbers of spaces.
  269. This is unavoidable in some environments, notably those
  270. not based on the US-ASCII character set. Such
  271. conversion is STRONGLY DISCOURAGED, but it may occur,
  272. and mail formats must not rely on the persistence of
  273. TAB (HT) characters.
  274. (5) Lines longer than 76 characters may be wrapped or
  275. truncated in some environments. Line wrapping or line
  276. truncation imposed by mail transports is STRONGLY
  277. DISCOURAGED, but unavoidable in some cases.
  278. Applications which require long lines must somehow
  281. differentiate between soft and hard line breaks. (A
  282. simple way to do this is to use the quoted-printable
  283. encoding.)
  284. (6) Trailing "white space" characters (SPACE, TAB (HT)) on
  285. a line may be discarded by some transport agents, while
  286. other transport agents may pad lines with these
  287. characters so that all lines in a mail file are of
  288. equal length. The persistence of trailing white space,
  289. therefore, must not be relied on.
  290. (7) Many mail domains use variations on the US-ASCII
  291. character set, or use character sets such as EBCDIC
  292. which contain most but not all of the US-ASCII
  293. characters. The correct translation of characters not
  294. in the "invariant" set cannot be depended on across
  295. character converting gateways. For example, this
  296. situation is a problem when sending uuencoded
  297. information across BITNET, an EBCDIC system. Similar
  298. problems can occur without crossing a gateway, since
  299. many Internet hosts use character sets other than US-
  300. ASCII internally. The definition of Printable Strings
  301. in X.400 adds further restrictions in certain special
  302. cases. In particular, the only characters that are
  303. known to be consistent across all gateways are the 73
  304. characters that correspond to the upper and lower case
  305. letters A-Z and a-z, the 10 digits 0-9, and the
  306. following eleven special characters:
  307. "'" (US-ASCII decimal value 39)
  308. "(" (US-ASCII decimal value 40)
  309. ")" (US-ASCII decimal value 41)
  310. "+" (US-ASCII decimal value 43)
  311. "," (US-ASCII decimal value 44)
  312. "-" (US-ASCII decimal value 45)
  313. "." (US-ASCII decimal value 46)
  314. "/" (US-ASCII decimal value 47)
  315. ":" (US-ASCII decimal value 58)
  316. "=" (US-ASCII decimal value 61)
  317. "?" (US-ASCII decimal value 63)
  318. A maximally portable mail representation will confine
  319. itself to relatively short lines of text in which the
  320. only meaningful characters are taken from this set of
  321. 73 characters. The base64 encoding follows this rule.
  322. (8) Some mail transport agents will corrupt data that
  323. includes certain literal strings. In particular, a
  326. period (".") alone on a line is known to be corrupted
  327. by some (incorrect) SMTP implementations, and a line
  328. that starts with the five characters "From " (the fifth
  329. character is a SPACE) is commonly corrupted as well.
  330. A careful composition agent can prevent these
  331. corruptions by encoding the data (e.g., in the quoted-
  332. printable encoding using "=46rom " in place of "From "
  333. at the start of a line, and "=2E" in place of "." alone
  334. on a line).
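The protection described in guideline (8) might look like the following Python sketch. The function name protect_qp_line is hypothetical, and the sketch assumes its input is a single line that has already been quoted-printable encoded, without its line terminator.

```python
def protect_qp_line(line: str) -> str:
    """Guard one quoted-printable line against known corruptions (hypothetical helper)."""
    # A leading "From " is commonly rewritten in transit; encode the "F" (hex 46).
    if line.startswith("From "):
        return "=46" + line[1:]
    # A period alone on a line is corrupted by some SMTP implementations;
    # encode the "." (hex 2E).
    if line == ".":
        return "=2E"
    return line

for sample in ["From here on", ".", "Fromage", "..."]:
    print(repr(protect_qp_line(sample)))
```

Each substitution lengthens the line by two characters, so a complete encoder would apply it before enforcing its line length limit.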
  335. Please note that the above list is NOT a list of recommended
  336. practices for MTAs. RFC 821 MTAs are prohibited from altering the
  337. character of white space or wrapping long lines. These BAD and
  338. invalid practices are known to occur on established networks, and
  339. implementations should be robust in dealing with the bad effects they
  340. can cause.
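The 73-character set from guideline (7) can be checked mechanically. The Python sketch below is illustrative only: the name is_maximally_portable is hypothetical, and the 76-character default for "relatively short lines" is borrowed from guideline (5) as an assumption.

```python
import base64
import string

# Letters, digits, and the eleven special characters listed in guideline (7).
INVARIANT = frozenset(string.ascii_letters + string.digits + "'()+,-./:=?")

def is_maximally_portable(text: str, max_line: int = 76) -> bool:
    """True if every line is short and uses only the invariant characters."""
    for line in text.replace("\r\n", "\n").split("\n"):
        if len(line) > max_line:
            return False
        if any(ch not in INVARIANT for ch in line):
            return False
    return True

encoded = base64.encodebytes(bytes(range(256))).decode("ascii")
print(len(INVARIANT))                   # size of the invariant set
print(is_maximally_portable(encoded))   # base64 output stays inside the set
print(is_maximally_portable("#86)C"))   # "#" is outside the set
```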
  341. 4. Canonical Encoding Model
  342. There was some confusion, in earlier versions of these documents,
  343. regarding the model for when email data was to be converted to
  344. canonical form and encoded, and in particular how this process would
  345. affect the treatment of CRLFs, given that the representation of
  346. newlines varies greatly from system to system. For this reason, a
  347. canonical model for encoding is presented below.
  348. The process of composing a MIME entity can be modeled as being done
  349. in a number of steps. Note that these steps are roughly similar to
  350. those steps used in PEM [RFC-1421] and are performed for each
  351. "innermost level" body:
  352. (1) Creation of local form.
  353. The body to be transmitted is created in the system's
  354. native format. The native character set is used and,
  355. where appropriate, local end of line conventions are
  356. used as well. The body may be a UNIX-style text file,
  357. or a Sun raster image, or a VMS indexed file, or audio
  358. data in a system-dependent format stored only in
  359. memory, or anything else that corresponds to the local
  360. model for the representation of some form of
  361. information. Fundamentally, the data is created in the
  362. "native" form that corresponds to the type specified by
  363. the media type.
  366. (2) Conversion to canonical form.
  367. The entire body, including "out-of-band" information
  368. such as record lengths and possibly file attribute
  369. information, is converted to a universal canonical
  370. form. The specific media type of the body as well as
  371. its associated attributes dictate the nature of the
  372. canonical form that is used. Conversion to the proper
  373. canonical form may involve character set conversion,
  374. transformation of audio data, compression, or various
  375. other operations specific to the various media types.
  376. If character set conversion is involved, however, care
  377. must be taken to understand the semantics of the media
  378. type, which may have strong implications for any
  379. character set conversion, e.g. with regard to
  380. syntactically meaningful characters in a text subtype
  381. other than "plain".
  382. For example, in the case of text/plain data, the text
  383. must be converted to a supported character set and
  384. lines must be delimited with CRLF delimiters in
  385. accordance with RFC 822. Note that the restriction on
  386. line lengths implied by RFC 822 is eliminated if the
  387. next step employs either quoted-printable or base64
  388. encoding.
  389. (3) Apply transfer encoding.
  390. A Content-Transfer-Encoding appropriate for this body
  391. is applied. Note that there is no fixed relationship
  392. between the media type and the transfer encoding. In
  393. particular, it may be appropriate to base the choice of
  394. base64 or quoted-printable on character frequency
  395. counts which are specific to a given instance of a
  396. body.
  397. (4) Insertion into entity.
  398. The encoded body is inserted into a MIME entity with
  399. appropriate headers. The entity is then inserted into
  400. the body of a higher-level entity (message or
  401. multipart) as needed.
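The four steps above can be sketched in Python for a text/plain body. This is a model under stated assumptions, not a blueprint: the local form is taken to be a string with LF line ends, base64 is chosen arbitrarily as the transfer encoding, and the function name compose_text_entity and the default character set are hypothetical.

```python
import base64

def compose_text_entity(local_text: str, charset: str = "iso-8859-1") -> bytes:
    """Build a text/plain entity from local-form text (hypothetical helper)."""
    # (1) Local form: a str using the local "\n" newline convention (assumed).
    # (2) Canonical form: CRLF line delimiters, then charset conversion.
    canonical = local_text.replace("\n", "\r\n").encode(charset)
    # (3) Transfer encoding: base64, with the encoder's own line breaks
    #     rewritten as CRLF.
    encoded = base64.encodebytes(canonical).replace(b"\n", b"\r\n")
    # (4) Insertion into an entity with matching header fields.
    headers = (
        f"Content-Type: text/plain; charset={charset}\r\n"
        "Content-Transfer-Encoding: base64\r\n"
        "\r\n"
    ).encode("ascii")
    return headers + encoded

print(compose_text_entity("caf\u00e9\nau lait\n").decode("ascii"))
```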
  402. Conversion from entity form to local form is accomplished by
  403. reversing these steps. Note that reversal of these steps may produce
  404. differing results since there is no guarantee that the original and
  405. final local forms are the same.
  408. It is vital to note that these steps are only a model; they are
  409. specifically NOT a blueprint for how an actual system would be built.
  410. In particular, the model fails to account for two common designs:
  411. (1) In many cases the conversion to a canonical form prior
  412. to encoding will be subsumed into the encoder itself,
  413. which understands local formats directly. For example,
  414. the local newline convention for text bodies might be
  415. carried through to the encoder itself along with
  416. knowledge of what that format is.
  417. (2) The output of the encoders may have to pass through one
  418. or more additional steps prior to being transmitted as
  419. a message. As such, the output of the encoder may not
  420. be conformant with the formats specified by RFC 822.
  421. In particular, once again it may be appropriate for the
  422. converter's output to be expressed using local newline
  423. conventions rather than using the standard RFC 822 CRLF
  424. delimiters.
  425. Other implementation variations are conceivable as well. The vital
  426. aspect of this discussion is that, in spite of any optimizations,
  427. collapsings of required steps, or insertion of additional processing,
  428. the resulting messages must be consistent with those produced by the
  429. model described here. For example, a message with the following
  430. header fields:
  431. Content-type: text/foo; charset=bar
  432. Content-Transfer-Encoding: base64
  433. must be first represented in the text/foo form, then (if necessary)
  434. represented in the "bar" character set, and finally transformed via
  435. the base64 algorithm into a mail-safe form.
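A receiver undoes the same layers in the opposite order: first the transfer encoding, then the character set, and only then the media type's own interpretation. The Python sketch below uses the standard email package. Because "text/foo" and "bar" are placeholders, it substitutes text/plain and iso-8859-1 as assumptions, and the name decode_text_entity is hypothetical.

```python
import base64
from email import message_from_bytes

def decode_text_entity(raw: bytes) -> str:
    """Recover canonical text from an encoded entity (hypothetical helper)."""
    msg = message_from_bytes(raw)
    # Undo the Content-Transfer-Encoding first.
    octets = msg.get_payload(decode=True)
    # Then interpret the octets in the declared character set.
    charset = msg.get_content_charset("us-ascii")
    return octets.decode(charset)

sample = (
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    + base64.encodebytes("caf\u00e9\r\n".encode("iso-8859-1"))
)
print(repr(decode_text_entity(sample)))
```

The result is still in canonical form with CRLF line ends; converting it to the local newline convention is a separate final step.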
  436. NOTE: Some confusion has been caused by systems that represent
  437. messages in a format which uses local newline conventions which
  438. differ from the RFC822 CRLF convention. It is important to note that
  439. these formats are not canonical RFC822/MIME. These formats are
  440. instead *encodings* of RFC822, where CRLF sequences in the canonical
  441. representation of the message are encoded as the local newline
  442. convention. Note that formats which encode CRLF sequences as, for
  443. example, LF are not capable of representing MIME messages containing
  444. binary data which contains LF octets not part of CRLF line separation
  445. sequences.
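The limitation described in this note is easy to reproduce. In the Python sketch below, the function names to_local and to_canonical are hypothetical and stand for a store that encodes CRLF as a bare LF.

```python
def to_local(canonical: bytes) -> bytes:
    """Encode canonical CRLF sequences using an LF-only local convention."""
    return canonical.replace(b"\r\n", b"\n")

def to_canonical(local: bytes) -> bytes:
    """Decode the LF-only local convention back to CRLF."""
    return local.replace(b"\n", b"\r\n")

text = b"line one\r\nline two\r\n"
binary = b"\x00\n\x01\r\n\x02"   # contains an LF that is not part of a CRLF

print(to_canonical(to_local(text)) == text)      # text survives the round trip
print(to_canonical(to_local(binary)) == binary)  # the bare LF does not
```

After the round trip the bare LF has become a CRLF, so the binary data is no longer the same.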
  448. 5. Summary
  449. This document defines what is meant by MIME Conformance. It also
  450. details various problems known to exist in the Internet email system
  451. and how to use MIME to overcome them. Finally, it describes MIME's
  452. canonical encoding model.
  453. 6. Security Considerations
  454. Security issues are discussed in the second document in this set, RFC
  455. 2046.
  456. 7. Authors' Addresses
  457. For more information, the authors of this document are best contacted
  458. via Internet mail:
  459. Ned Freed
  460. Innosoft International, Inc.
  461. 1050 East Garvey Avenue South
  462. West Covina, CA 91790
  463. USA
  464. Phone: +1 818 919 3600
  465. Fax: +1 818 919 3614
  466. EMail: ned@innosoft.com
  467. Nathaniel S. Borenstein
  468. First Virtual Holdings
  469. 25 Washington Avenue
  470. Morristown, NJ 07960
  471. USA
  472. Phone: +1 201 540 8967
  473. Fax: +1 201 993 3032
  474. EMail: nsb@nsb.fv.com
  475. MIME is a result of the work of the Internet Engineering Task Force
  476. Working Group on RFC 822 Extensions. The chairman of that group,
  477. Greg Vaudreuil, may be reached at:
  478. Gregory M. Vaudreuil
  479. Octel Network Services
  480. 17080 Dallas Parkway
  481. Dallas, TX 75248-1905
  482. USA
  483. EMail: Greg.Vaudreuil@Octel.Com
  486. 8. Acknowledgements
  487. This document is the result of the collective effort of a large
  488. number of people, at several IETF meetings, on the IETF-SMTP and
  489. IETF-822 mailing lists, and elsewhere. Although any enumeration
  490. seems doomed to suffer from egregious omissions, the following are
  491. among the many contributors to this effort:
  492. Harald Tveit Alvestrand Marc Andreessen
  493. Randall Atkinson Bob Braden
  494. Philippe Brandon Brian Capouch
  495. Kevin Carosso Uhhyung Choi
  496. Peter Clitherow Dave Collier-Brown
  497. Cristian Constantinof John Coonrod
  498. Mark Crispin Dave Crocker
  499. Stephen Crocker Terry Crowley
  500. Walt Daniels Jim Davis
  501. Frank Dawson Axel Deininger
  502. Hitoshi Doi Kevin Donnelly
  503. Steve Dorner Keith Edwards
  504. Chris Eich Dana S. Emery
  505. Johnny Eriksson Craig Everhart
  506. Patrik Faltstrom Erik E. Fair
  507. Roger Fajman Alain Fontaine
  508. Martin Forssen James M. Galvin
  509. Stephen Gildea Philip Gladstone
  510. Thomas Gordon Keld Simonsen
  511. Terry Gray Phill Gross
  512. James Hamilton David Herron
  513. Mark Horton Bruce Howard
  514. Bill Janssen Olle Jarnefors
  515. Risto Kankkunen Phil Karn
  516. Alan Katz Tim Kehres
  517. Neil Katin Steve Kille
  518. Kyuho Kim Anders Klemets
  519. John Klensin Valdis Kletniek
  520. Jim Knowles Stev Knowles
  521. Bob Kummerfeld Pekka Kytolaakso
  522. Stellan Lagerstrom Vincent Lau
  523. Timo Lehtinen Donald Lindsay
  524. Warner Losh Carlyn Lowery
  525. Laurence Lundblade Charles Lynn
  526. John R. MacMillan Larry Masinter
  527. Rick McGowan Michael J. McInerny
  528. Leo Mclaughlin Goli Montaser-Kohsari
  529. Tom Moore John Gardiner Myers
  530. Erik Naggum Mark Needleman
  531. Chris Newman John Noerenberg
  534. Mats Ohrman Julian Onions
  535. Michael Patton David J. Pepper
  536. Erik van der Poel Blake C. Ramsdell
  537. Christer Romson Luc Rooijakkers
  538. Marshall T. Rose Jonathan Rosenberg
  539. Guido van Rossum Jan Rynning
  540. Harri Salminen Michael Sanderson
  541. Yutaka Sato Markku Savela
  542. Richard Alan Schafer Masahiro Sekiguchi
  543. Mark Sherman Bob Smart
  544. Peter Speck Henry Spencer
  545. Einar Stefferud Michael Stein
  546. Klaus Steinberger Peter Svanberg
  547. James Thompson Steve Uhler
  548. Stuart Vance Peter Vanderbilt
  549. Greg Vaudreuil Ed Vielmetti
  550. Larry W. Virden Ryan Waldron
  551. Rhys Weatherly Jay Weber
  552. Dave Wecker Wally Wedel
  553. Sven-Ove Westberg Brian Wideen
  554. John Wobus Glenn Wright
  555. Rayan Zachariassen David Zimmerman
  556. The authors apologize for any omissions from this list, which are
  557. certainly unintentional.
  560. Appendix A -- A Complex Multipart Example
  561. What follows is the outline of a complex multipart message. This
  562. message contains five parts that are to be displayed serially: two
  563. introductory plain text objects, an embedded multipart message, a
  564. text/enriched object, and a closing encapsulated text message in a
  565. non-ASCII character set. The embedded multipart message itself
  566. contains two objects to be displayed in parallel, a picture and an
  567. audio fragment.
  568. MIME-Version: 1.0
  569. From: Nathaniel Borenstein <nsb@nsb.fv.com>
  570. To: Ned Freed <ned@innosoft.com>
  571. Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT)
  572. Subject: A multipart example
  573. Content-Type: multipart/mixed;
  574.               boundary=unique-boundary-1

  575. This is the preamble area of a multipart message.
  576. Mail readers that understand multipart format
  577. should ignore this preamble.

  578. If you are reading this text, you might want to
  579. consider changing to a mail reader that understands
  580. how to properly display multipart messages.

  581. --unique-boundary-1

  582. ... Some text appears here ...

  583. [Note that the blank between the boundary and the start
  584. of the text in this part means no header fields were
  585. given and this is text in the US-ASCII character set.
  586. It could have been done with explicit typing as in the
  587. next part.]

  588. --unique-boundary-1
  589. Content-type: text/plain; charset=US-ASCII

  590. This could have been part of the previous part, but
  591. illustrates explicit versus implicit typing of body
  592. parts.

  593. --unique-boundary-1
  594. Content-Type: multipart/parallel; boundary=unique-boundary-2

  595. --unique-boundary-2
  596. Content-Type: audio/basic
  599. Content-Transfer-Encoding: base64

  600. ... base64-encoded 8000 Hz single-channel
  601. mu-law-format audio data goes here ...

  602. --unique-boundary-2
  603. Content-Type: image/jpeg
  604. Content-Transfer-Encoding: base64

  605. ... base64-encoded image data goes here ...

  606. --unique-boundary-2--

  607. --unique-boundary-1
  608. Content-type: text/enriched

  609. This is <bold><italic>enriched.</italic></bold>
  610. <smaller>as defined in RFC 1896</smaller>

  611. Isn't it
  612. <bigger><bigger>cool?</bigger></bigger>

  613. --unique-boundary-1
  614. Content-Type: message/rfc822

  615. From: (mailbox in US-ASCII)
  616. To: (address in US-ASCII)
  617. Subject: (subject in US-ASCII)
  618. Content-Type: Text/plain; charset=ISO-8859-1
  619. Content-Transfer-Encoding: Quoted-printable

  620. ... Additional text in ISO-8859-1 goes here ...

  621. --unique-boundary-1--
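
The nesting of the message above can be checked mechanically. What follows is a minimal sketch, not part of the MIME specification, that feeds an abbreviated copy of the example to the parser in the Python standard library "email" package and lists the media type of each entity. The names RAW and outline are illustrative only, the body text is shortened, and the base64 bodies are replaced by the placeholder "AAAA".

```python
import email

# Abbreviated copy of the Appendix A example. Blank lines separate the
# header fields of each entity from its body; bodies are placeholders.
RAW = """\
MIME-Version: 1.0
Subject: A multipart example
Content-Type: multipart/mixed;
              boundary=unique-boundary-1

This is the preamble area of a multipart message.

--unique-boundary-1

  ... Some text appears here ...

--unique-boundary-1
Content-type: text/plain; charset=US-ASCII

Explicitly typed text.

--unique-boundary-1
Content-Type: multipart/parallel; boundary=unique-boundary-2

--unique-boundary-2
Content-Type: audio/basic
Content-Transfer-Encoding: base64

AAAA

--unique-boundary-2
Content-Type: image/jpeg
Content-Transfer-Encoding: base64

AAAA

--unique-boundary-2--

--unique-boundary-1
Content-type: text/enriched

This is <bold><italic>enriched.</italic></bold>

--unique-boundary-1
Content-Type: message/rfc822

From: (mailbox in US-ASCII)
Subject: (subject in US-ASCII)
Content-Type: Text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: Quoted-printable

  ... Additional text in ISO-8859-1 goes here ...

--unique-boundary-1--
"""


def outline(part, depth=0):
    """Return one indented line per entity, giving its media type."""
    lines = ["  " * depth + part.get_content_type()]
    # Multipart entities and message/rfc822 entities both carry a list
    # of nested entities as their payload.
    if part.is_multipart():
        for child in part.get_payload():
            lines.extend(outline(child, depth + 1))
    return lines


msg = email.message_from_string(RAW)
print("\n".join(outline(msg)))
```

The first body part has no header fields, so the parser reports it with the default type text/plain, which matches the bracketed note in the example. The encapsulated message appears as a message/rfc822 entity containing a single text/plain entity.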
  622. Appendix B -- Changes from RFC 1521, 1522, and 1590
  623. These documents are a revision of RFC 1521, 1522, and 1590. For the
  624. convenience of those familiar with the earlier documents, the changes
  625. from those documents are summarized in this appendix. For further
  626. history, note that Appendix H in RFC 1521 specified how that document
  627. differed from its predecessor, RFC 1341.
  628. (1) This document has been completely reformatted and split
  629. into multiple documents. This was done to improve the
  630. quality of the plain text version of this document,
  631. which is required to be the reference copy.
  634. (2) BNF describing the overall structure of MIME object
  635. headers has been added. This is a documentation change
  636. only -- the underlying syntax has not changed in any
  637. way.
  638. (3) The specific BNF for the seven media types in MIME has
  639. been removed. This BNF was incorrect, incomplete, and
  640. inconsistent with the type-independent BNF. And
  641. since the type-independent BNF already fully specifies
  642. the syntax of the various MIME headers, the type-
  643. specific BNF was, in the final analysis, completely
  644. unnecessary and caused more problems than it solved.
  645. (4) The more specific "US-ASCII" character set name has
  646. replaced the use of the informal term ASCII in many
  647. parts of these documents.
  648. (5) The informal concept of a primary subtype has been
  649. removed.
  650. (6) The term "object" was being used inconsistently. The
  651. definition of this term has been clarified, along with
  652. the related terms "body", "body part", and "entity",
  653. and usage has been corrected where appropriate.
  654. (7) The BNF for the multipart media type has been
  655. rearranged to make it clear that the CRLF preceding
  656. the boundary marker is actually part of the marker
  657. itself rather than the preceding body part.
  658. (8) The prose and BNF describing the multipart media type
  659. have been changed to make it clear that the body parts
  660. within a multipart object MUST NOT contain any lines
  661. beginning with the boundary parameter string.
  662. (9) In the rules on reassembling "message/partial" MIME
  663. entities, "Subject" is added to the list of headers to
  664. take from the inner message, and the example is
  665. modified to clarify this point.
  666. (10) "Message/partial" fragmenters are restricted to
  667. splitting MIME objects only at line boundaries.
  668. (11) In the discussion of the application/postscript type,
  669. an additional paragraph has been added warning about
  670. possible interoperability problems caused by embedding
  671. of binary data inside a PostScript MIME entity.
  674. (12) Added a clarifying note to the basic syntax rules for
  675. the Content-Type header field to make it clear that the
  676. following two forms:
  677. Content-type: text/plain; charset=us-ascii (comment)
  678. Content-type: text/plain; charset="us-ascii"
  679. are completely equivalent.
  680. (13) The following sentence has been removed from the
  681. discussion of the MIME-Version header: "However,
  682. conformant software is encouraged to check the version
  683. number and at least warn the user if an unrecognized
  684. MIME-version is encountered."
  685. (14) A typo was fixed that said "application/external-body"
  686. instead of "message/external-body".
  687. (15) The definition of a character set has been reorganized
  688. to make the requirements clearer.
  689. (16) The definition of the "image/gif" media type has been
  690. moved to a separate document. This change was made
  691. because of potential conflicts with IETF rules
  692. governing the standardization of patented technology.
  693. (17) The definitions of "7bit" and "8bit" have been
  694. tightened so that CR and LF can only be used together
  695. as CRLF end-of-line sequences. The document also no longer
  696. requires that NUL characters be preserved, which brings
  697. MIME into alignment with real-world implementations.
  698. (18) The definition of canonical text in MIME has been
  699. tightened so that line breaks must be represented by a
  700. CRLF sequence. CR and LF characters are not allowed
  701. outside of this usage. The definition of quoted-
  702. printable encoding has been altered accordingly.
  703. (19) The definition of the quoted-printable encoding now
  704. includes a number of suggestions for how quoted-
  705. printable encoders might best handle improperly encoded
  706. material.
  707. (20) Prose was added to clarify the use of the "7bit",
  708. "8bit", and "binary" transfer-encodings on multipart or
  709. message entities encapsulating "8bit" or "binary" data.
  712. (21) In the section on MIME Conformance, "multipart/digest"
  713. support was added to the list of requirements for
  714. minimal MIME conformance. Also, the requirements for
  715. "message/rfc822" support were strengthened to clarify
  716. the importance of recognizing recursive structure.
  717. (22) The various restrictions on subtypes of "message" are
  718. now specified entirely on a subtype by subtype basis.
  719. (23) The definition of "message/rfc822" was changed to
  720. indicate that at least one of the "From", "Subject", or
  721. "Date" headers must be present.
  722. (24) The required handling of unrecognized subtypes as
  723. "application/octet-stream" has been made more explicit
  724. in both the type definitions sections and the
  725. conformance guidelines.
  726. (25) Examples using text/richtext were changed to
  727. text/enriched.
  728. (26) The BNF definition of subtype has been changed to make
  729. it clear that either an IANA registered subtype or a
  730. nonstandard "X-" subtype must be used in a Content-Type
  731. header field.
  732. (27) MIME media types that are simply registered for use and
  733. those that are standardized by the IETF are now
  734. distinguished in the MIME BNF.
  735. (28) All of the various MIME registration procedures have
  736. been extensively revised. IANA registration procedures
  737. for character sets have been moved to a separate
  738. document that is not included in this set of documents.
  739. (29) The use of escape and shift mechanisms in the US-ASCII
  740. and ISO-8859-X character sets these documents define
  741. has been clarified: Such mechanisms should never be
  742. used in conjunction with these character sets and their
  743. effect if they are used is undefined.
  744. (30) The definition of the AFS access-type for
  745. message/external-body has been removed.
  746. (31) The handling of the combination of
  747. multipart/alternative and message/external-body is now
  748. specifically addressed.
  751. (32) Security issues specific to message/external-body are
  752. now discussed in some detail.
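
Item (12) above can be illustrated with a small sketch. The hypothetical function parse_content_type below is a deliberately simplified reading of the Content-Type syntax, not a complete parser: it assumes that comments are not nested and that quoted strings contain no parentheses, semicolons, or backslash escapes. Under those assumptions, it shows why the two forms given in item (12) carry the same information.

```python
import re


def parse_content_type(value):
    """Return (media_type, params) for a simple Content-Type field body.

    Simplifying assumptions (not taken from the specification): comments
    are not nested, and quoted strings contain no parentheses,
    semicolons, or backslash escapes.
    """
    # Comments in parentheses carry no meaning, so drop them.
    value = re.sub(r"\([^()]*\)", "", value)
    media_type, *raw_params = [piece.strip() for piece in value.split(";")]
    params = {}
    for raw in raw_params:
        if not raw:
            continue
        name, _, val = raw.partition("=")
        val = val.strip()
        # A quoted string and a bare token denote the same value.
        if len(val) >= 2 and val[0] == '"' and val[-1] == '"':
            val = val[1:-1]
        params[name.strip().lower()] = val
    return media_type.lower(), params


a = parse_content_type("text/plain; charset=us-ascii (comment)")
b = parse_content_type('text/plain; charset="us-ascii"')
print(a == b)  # prints True
```

Both calls yield the media type text/plain with a single charset parameter whose value is us-ascii.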
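
Items (17) and (18) above concern the use of CR and LF in "7bit" data, "8bit" data, and canonical text. As a rough sketch, assuming that local text may use LF, CR, or CRLF as its line break convention, the hypothetical helpers below detect bare CR or LF octets and convert local line breaks to the canonical CRLF form. Treating a lone CR as a line break is an assumption about the local convention, not something the text above requires.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
_BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")

# Any of the three common local line break conventions.
_LOCAL_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def has_bare_cr_or_lf(data: bytes) -> bool:
    """Report whether CR or LF occurs outside of a CRLF pair."""
    return _BARE_CR_OR_LF.search(data) is not None


def to_canonical_crlf(text: bytes) -> bytes:
    """Rewrite every local line break as a CRLF sequence."""
    return _LOCAL_LINE_BREAK.sub(b"\r\n", text)


local = b"one\ntwo\rthree\r\nfour"
canonical = to_canonical_crlf(local)
print(has_bare_cr_or_lf(local))      # prints True
print(has_bare_cr_or_lf(canonical))  # prints False
```

The conversion is idempotent: applying to_canonical_crlf to text that is already in canonical form leaves it unchanged.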
  753. Appendix C -- References
  754. [ATK]
  755. Borenstein, Nathaniel S., Multimedia Applications
  756. Development with the Andrew Toolkit, Prentice-Hall, 1990.
  757. [ISO-2022]
  758. International Standard -- Information Processing --
  759. Character Code Structure and Extension Techniques,
  760. ISO/IEC 2022:1994, 4th ed.
  761. [ISO-8859]
  762. International Standard -- Information Processing -- 8-bit
  763. Single-Byte Coded Graphic Character Sets
  764. - Part 1: Latin Alphabet No. 1, ISO 8859-1:1987, 1st ed.
  765. - Part 2: Latin Alphabet No. 2, ISO 8859-2:1987, 1st ed.
  766. - Part 3: Latin Alphabet No. 3, ISO 8859-3:1988, 1st ed.
  767. - Part 4: Latin Alphabet No. 4, ISO 8859-4:1988, 1st ed.
  768. - Part 5: Latin/Cyrillic Alphabet, ISO 8859-5:1988, 1st
  769. ed.
  770. - Part 6: Latin/Arabic Alphabet, ISO 8859-6:1987, 1st ed.
  771. - Part 7: Latin/Greek Alphabet, ISO 8859-7:1987, 1st ed.
  772. - Part 8: Latin/Hebrew Alphabet, ISO 8859-8:1988, 1st ed.
  773. - Part 9: Latin Alphabet No. 5, ISO/IEC 8859-9:1989, 1st
  774. ed.
  775. International Standard -- Information Technology -- 8-bit
  776. Single-Byte Coded Graphic Character Sets
  777. - Part 10: Latin Alphabet No. 6, ISO/IEC 8859-10:1992,
  778. 1st ed.
  779. [ISO-646]
  780. International Standard -- Information Technology -- ISO
  781. 7-bit Coded Character Set for Information Interchange,
  782. ISO 646:1991, 3rd ed.
  783. [JPEG]
  784. JPEG Draft Standard ISO 10918-1 CD.
  785. [MPEG]
  786. Video Coding Draft Standard ISO 11172 CD, ISO
  787. IEC/JTC1/SC2/WG11 (Motion Picture Experts Group), May,
  788. 1991.
  791. [PCM]
  792. CCITT, Fascicle III.4 - Recommendation G.711, "Pulse Code
  793. Modulation (PCM) of Voice Frequencies", Geneva, 1972.
  794. [POSTSCRIPT]
  795. Adobe Systems, Inc., PostScript Language Reference
  796. Manual, Addison-Wesley, 1985.
  797. [POSTSCRIPT2]
  798. Adobe Systems, Inc., PostScript Language Reference
  799. Manual, Addison-Wesley, Second Ed., 1990.
  800. [RFC-783]
  801. Sollins, K.R., "TFTP Protocol (revision 2)", RFC-783,
  802. MIT, June 1981.
  803. [RFC-821]
  804. Postel, J.B., "Simple Mail Transfer Protocol", STD 10,
  805. RFC 821, USC/Information Sciences Institute, August 1982.
  806. [RFC-822]
  807. Crocker, D., "Standard for the Format of ARPA Internet
  808. Text Messages", STD 11, RFC 822, UDEL, August 1982.
  809. [RFC-934]
  810. Rose, M. and E. Stefferud, "Proposed Standard for Message
  811. Encapsulation", RFC 934, Delaware and NMA, January 1985.
  812. [RFC-959]
  813. Postel, J. and J. Reynolds, "File Transfer Protocol", STD
  814. 9, RFC 959, USC/Information Sciences Institute, October
  815. 1985.
  816. [RFC-1049]
  817. Sirbu, M., "Content-Type Header Field for Internet
  818. Messages", RFC 1049, CMU, March 1988.
  819. [RFC-1154]
  820. Robinson, D., and R. Ullmann, "Encoding Header Field for
  821. Internet Messages", RFC 1154, Prime Computer, Inc., April
  822. 1990.
  823. [RFC-1341]
  824. Borenstein, N., and N. Freed, "MIME (Multipurpose
  825. Internet Mail Extensions): Mechanisms for Specifying and
  826. Describing the Format of Internet Message Bodies", RFC
  827. 1341, Bellcore, Innosoft, June 1992.
  830. [RFC-1342]
  831. Moore, K., "Representation of Non-Ascii Text in Internet
  832. Message Headers", RFC 1342, University of Tennessee, June
  833. 1992.
  834. [RFC-1344]
  835. Borenstein, N., "Implications of MIME for Internet Mail
  836. Gateways", RFC 1344, Bellcore, June 1992.
  837. [RFC-1345]
  838. Simonsen, K., "Character Mnemonics & Character Sets", RFC
  839. 1345, Rationel Almen Planlaegning, June 1992.
  840. [RFC-1421]
  841. Linn, J., "Privacy Enhancement for Internet Electronic
  842. Mail: Part I -- Message Encryption and Authentication
  843. Procedures", RFC 1421, IAB IRTF PSRG, IETF PEM WG,
  844. February 1993.
  845. [RFC-1422]
  846. Kent, S., "Privacy Enhancement for Internet Electronic
  847. Mail: Part II -- Certificate-Based Key Management", RFC
  848. 1422, IAB IRTF PSRG, IETF PEM WG, February 1993.
  849. [RFC-1423]
  850. Balenson, D., "Privacy Enhancement for Internet
  851. Electronic Mail: Part III -- Algorithms, Modes, and
  852. Identifiers", RFC 1423, IAB IRTF PSRG, IETF PEM WG, February 1993.
  853. [RFC-1424]
  854. Kaliski, B., "Privacy Enhancement for Internet Electronic
  855. Mail: Part IV -- Key Certification and Related
  856. Services", RFC 1424, IAB IRTF PSRG, IETF PEM WG, February 1993.
  857. [RFC-1521]
  858. Borenstein, N., and Freed, N., "MIME (Multipurpose
  859. Internet Mail Extensions): Mechanisms for Specifying and
  860. Describing the Format of Internet Message Bodies", RFC
  861. 1521, Bellcore, Innosoft, September, 1993.
  862. [RFC-1522]
  863. Moore, K., "Representation of Non-ASCII Text in Internet
  864. Message Headers", RFC 1522, University of Tennessee,
  865. September 1993.
  868. [RFC-1524]
  869. Borenstein, N., "A User Agent Configuration Mechanism for
  870. Multimedia Mail Format Information", RFC 1524, Bellcore,
  871. September 1993.
  872. [RFC-1543]
  873. Postel, J., "Instructions to RFC Authors", RFC 1543,
  874. USC/Information Sciences Institute, October 1993.
  875. [RFC-1556]
  876. Nussbacher, H., "Handling of Bi-directional Texts in
  877. MIME", RFC 1556, Israeli Inter-University Computer
  878. Center, December 1993.
  879. [RFC-1590]
  880. Postel, J., "Media Type Registration Procedure", RFC
  881. 1590, USC/Information Sciences Institute, March 1994.
  882. [RFC-1602]
  883. Internet Architecture Board, Internet Engineering
  884. Steering Group, Huitema, C., Gross, P., "The Internet
  885. Standards Process -- Revision 2", RFC 1602, March 1994.
  886. [RFC-1652]
  887. Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M.,
  888. Stefferud, E., and Crocker, D., "SMTP Service Extension
  889. for 8bit-MIME transport", RFC 1652, United Nations
  890. University, Innosoft, Dover Beach Consulting, Inc.,
  891. Network Management Associates, Inc., The Branch Office,
  892. March 1994.
  893. [RFC-1700]
  894. Reynolds, J. and J. Postel, "Assigned Numbers", STD 2,
  895. RFC 1700, USC/Information Sciences Institute, October
  896. 1994.
  897. [RFC-1741]
  898. Faltstrom, P., Crocker, D., and Fair, E., "MIME Content
  899. Type for BinHex Encoded Files", RFC 1741, December 1994.
  900. [RFC-1896]
  901. Resnick, P., and A. Walker, "The text/enriched MIME
  902. Content-type", RFC 1896, February, 1996.
  905. [RFC-2045]
  906. Freed, N., and N. Borenstein, "Multipurpose Internet Mail
  907. Extensions (MIME) Part One: Format of Internet Message
  908. Bodies", RFC 2045, Innosoft, First Virtual Holdings,
  909. November 1996.
  910. [RFC-2046]
  911. Freed, N., and N. Borenstein, "Multipurpose Internet Mail
  912. Extensions (MIME) Part Two: Media Types", RFC 2046,
  913. Innosoft, First Virtual Holdings, November 1996.
  914. [RFC-2047]
  915. Moore, K., "Multipurpose Internet Mail Extensions (MIME)
  916. Part Three: Representation of Non-ASCII Text in Internet
  917. Message Headers", RFC 2047, University of
  918. Tennessee, November 1996.
  919. [RFC-2048]
  920. Freed, N., Klensin, J., and J. Postel, "Multipurpose
  921. Internet Mail Extensions (MIME) Part Four: MIME
  922. Registration Procedures", RFC 2048, Innosoft, MCI,
  923. ISI, November 1996.
  924. [RFC-2049]
  925. Freed, N. and N. Borenstein, "Multipurpose Internet Mail
  926. Extensions (MIME) Part Five: Conformance Criteria and
  927. Examples", RFC 2049 (this document), Innosoft, First
  928. Virtual Holdings, November 1996.
  929. [US-ASCII]
  930. Coded Character Set -- 7-Bit American Standard Code for
  931. Information Interchange, ANSI X3.4-1986.
  932. [X400]
  933. Schicker, Pietro, "Message Handling Systems, X.400",
  934. Message Handling Systems and Distributed Applications, E.
  935. Stefferud, O-j. Jacobsen, and P. Schicker, eds., North-
  936. Holland, 1989, pp. 3-41.