Mail/news 5.0 I18n specification
last update 2/6/2000
nhotta@netscape.com

5.0 I18N specification
| Feature | Description | Milestone | Status (as of M12) |
|---|---|---|---|
| mail/news | | | |
| charset for a new mail | The charset of a new mail is always set from the default charset pref (i.e. no inheritance from the current window). | M4 | Done |
| charset for reply/forward mail | The charset of the original message (main body) is used. | M14 | Done |
| charset for mailto URL | The charset of a new mail is always set from the default charset pref (i.e. no inheritance from the current window). | M15 | Done |
| charset for a new mail by address book | The charset of a new mail is always set from the default charset pref. | | Done |
| attachment/send | No charset label for attachments unless it is HTML with a META charset specified. If the main body charset is ISO-2022-JP, HTML attachments are base64 encoded. | M10 | Done |
| thread pane display | Display subject/address in multiple charsets by using the charsets encoded in the MIME header. | M4 | Done |
| thread pane sorting | Use the application locale, store it in the message db (at folder creation time), then use it for locale-sensitive string comparison. | M5 | Done |
| thread pane date display | Use the locale-sensitive date/time format interface. Always use the application locale. | M5 | Done |
| message body view | libmime converts the message to Unicode before passing it to the layout. | M5 | Done |
| charset override | | M15 | 5938 |
| attachment/view | Display multiple charsets. Decide the charset by the following process: 1) Content-Type charset 2) Charset menu selection 3) Auto charset detection if available (e.g. Japanese). | M6 | Done for 1) and 3) |
| attachment view by browser | Follow the browser's charset/auto-detect setting. | | Done |
| message save as | text/plain -> charset conversion to the platform file charset; text/html -> no charset conversion | M14 | 23418 |
| folder pane view | Multilingual display by Unicode. | M14 | 7844 |
| message search widget | Multilingual display by Unicode. | M15 | |
| local message search | Header search: apply MIME decode and charset conversion before comparison if necessary. Body search: apply decoders (quoted-printable, base64, HTML named entity and NCR), plus charset conversion if necessary. | M15 | 11659 |
| IMAP search | Send a UTF-8 query, falling back to the charset of the mail folder and finally to ASCII. The fallback is supported as of 4.51; 4.5 supports ASCII search only. 5.0 should support UTF-8 queries. | M16 | 5933 |
| IMAP folder name | When a folder name is stored locally, it should be UTF-8 or modified UTF-7 instead of the system charset. | M14 | Done |
| message filter | Same issue as local message search. | M15 | |
| newsgroup search | Whatever is supported by 4.5 (needs investigation). | | |
| address book | | | |
| sorting | Same as the message thread sorting. Use the application locale, store it in the message db, then use it for locale-sensitive string comparison. | M11 | Done |
| address book widgets | Multilingual display by Unicode. | M11 | Done |
| type down | Multilingual support. | | |
| name completion | Multilingual support. | M15 | |
| address book search | | M15 | 16354 |
| LDAP search | | | |
| Preference | | | |
| intl.mailcharset.cyrillic | not needed | | |
| intl.mailcharset.override_1 | not needed | | |
| mailnews.send_hankaku_kana | | M14 | Done |
| default mail send charset | UI | M14 | 23540 |



Attachment view/send in detail

View:

  • Charset-labeled attachments are displayed correctly.
  • Unlabeled attachments may be viewed if charset detection is available.
  • Otherwise, unlabeled attachments that are not ISO-8859-1 cannot be viewed inline (i.e. they are displayed as a link).
  • A post-beta 1 feature (charset override, #5938) enables the charset menu for viewing unlabeled attachments.
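
The view rules can be sketched as a small resolver. This is only an illustration, not the actual mailnews code: the function names and the toy detector are hypothetical, and the order (Content-Type label, then charset menu selection, then auto-detection) follows the attachment/view process stated in this specification. The ISO-8859-1 exception is not modeled; a result of None simply means "show the attachment as a link".

```python
from email.message import Message
from typing import Callable, Optional

# Hypothetical detector type: takes raw bytes, returns a charset name or None.
Detector = Callable[[bytes], Optional[str]]


def content_type_charset(content_type: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type value, if any."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased name, or None when absent


def resolve_view_charset(content_type: Optional[str],
                         data: bytes,
                         menu_charset: Optional[str] = None,
                         detector: Optional[Detector] = None) -> Optional[str]:
    """Return the charset for inline display, or None to display a link."""
    label = content_type_charset(content_type)
    if label:                      # 1) Content-Type charset
        return label
    if menu_charset:               # 2) charset menu selection (override)
        return menu_charset
    if detector is not None:       # 3) auto-detection, if available
        return detector(data)
    return None


def toy_japanese_detector(data: bytes) -> Optional[str]:
    """Toy stand-in for a real detector: only spots ISO-2022-JP escapes."""
    if b"\x1b$B" in data or b"\x1b$@" in data:
        return "iso-2022-jp"
    return None
```

For example, `resolve_view_charset("text/plain", data, detector=toy_japanese_detector)` returns a charset only when the toy detector recognizes the data.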


Send:

  • Use the HTTP charset as the charset label if available (highest priority).
  • Use the META charset for HTML as the charset label if available (second priority).
  • Otherwise, no charset label is attached to the attachment being sent.
  • Apply Base64 encoding to Japanese attachments only.
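
A minimal sketch of the send rules follows. The function name is hypothetical, and so is the way "Japanese" is decided: here it is assumed to mean that the label, or failing that the main body charset, is one of a small set of Japanese charset names. The specification itself does not say how this is determined.

```python
from typing import Optional, Tuple

# Assumption: these names are what counts as a "Japanese" attachment.
JAPANESE_CHARSETS = {"iso-2022-jp", "shift_jis", "euc-jp"}


def plan_attachment_send(http_charset: Optional[str] = None,
                         meta_charset: Optional[str] = None,
                         body_charset: Optional[str] = None
                         ) -> Tuple[Optional[str], bool]:
    """Return (charset label or None, whether to apply Base64)."""
    # HTTP charset has the highest priority, META charset the second;
    # with neither, the attachment is sent without a charset label.
    label = http_charset or meta_charset or None

    # Assumed test for "Japanese": look at the label, else the body charset.
    probe = (label or body_charset or "").lower()
    return label, probe in JAPANESE_CHARSETS
```

So an unlabeled attachment in a message whose main body is ISO-2022-JP would be sent without a label but Base64 encoded, which matches the attachment/send row of the status table.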




Address book charset conversion
  • The storage charset is UTF-8 (the database may escape/unescape 8-bit data).
  • Charset conversion between UCS2 and UTF-8 (in both directions) is needed between RDF and the address book storage.
  • Importing LDIF needs no charset conversion (base64 decoding only) since the LDIF charset is UTF-8.
  • Importing a 4.x address book needs conversion from the pref-specified charset to UTF-8.
  • I18n is to provide a function mapping csid to charset name, since csid is used in the 4.x prefs.

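The conversions listed above can be sketched as follows. All function names are hypothetical, UCS2 is assumed to be little-endian 16-bit code units, and the csid numbers in the table are placeholders for illustration, not the real 4.x values.

```python
import base64

# Placeholder csid -> charset table; the real 4.x csid values differ.
CSID_TO_CHARSET = {1: "iso-8859-1", 2: "shift_jis"}


def ucs2_to_utf8(ucs2: bytes) -> bytes:
    """RDF side (UCS2) to storage side (UTF-8)."""
    return ucs2.decode("utf-16-le").encode("utf-8")


def utf8_to_ucs2(utf8: bytes) -> bytes:
    """Storage side (UTF-8) to RDF side (UCS2)."""
    return utf8.decode("utf-8").encode("utf-16-le")


def ldif_value_to_utf8(line: str) -> bytes:
    """Return the value of one LDIF line as UTF-8 bytes.

    'attr:: xxx' marks a base64 value; decoding it is the only step
    needed because the decoded bytes are already UTF-8.
    """
    _attr, _sep, rest = line.partition(":")
    if rest.startswith(":"):
        return base64.b64decode(rest[1:].strip())
    return rest.strip().encode("utf-8")


def import_4x_field(raw: bytes, csid: int) -> bytes:
    """Convert a 4.x address book field from the pref charset to UTF-8."""
    charset = CSID_TO_CHARSET[csid]
    return raw.decode(charset).encode("utf-8")
```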



Charset conversion fallback for mail send

text/html - Apply charset conversion first. For characters that cannot be converted from Unicode, fall back to a named entity, then to an NCR.

text/plain and message header - Apply charset conversion first. For characters that cannot be converted from Unicode, try transliteration (e.g. EUR, (tm)), then fall back to '?' (question mark).
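
As an illustration of the two fallback chains, the sketch below uses Python codec error handlers. The handler names and the tiny transliteration table are assumptions made for the example; the real transliteration data would be much larger.

```python
import codecs
from html.entities import codepoint2name

# Assumed, deliberately tiny transliteration table.
TRANSLIT = {"\u20ac": "EUR", "\u2122": "(tm)"}


def _entity_then_ncr(err):
    """text/html: named entity if one exists, otherwise a decimal NCR."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    out = []
    for ch in err.object[err.start:err.end]:
        name = codepoint2name.get(ord(ch))
        out.append(f"&{name};" if name else f"&#{ord(ch)};")
    return "".join(out), err.end


def _translit_then_qmark(err):
    """text/plain and headers: transliterate if known, otherwise '?'."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    bad = err.object[err.start:err.end]
    return "".join(TRANSLIT.get(ch, "?") for ch in bad), err.end


codecs.register_error("entity_then_ncr", _entity_then_ncr)
codecs.register_error("translit_then_qmark", _translit_then_qmark)


def encode_html(text: str, charset: str) -> bytes:
    return text.encode(charset, errors="entity_then_ncr")


def encode_plain(text: str, charset: str) -> bytes:
    return text.encode(charset, errors="translit_then_qmark")
```

Characters the target charset can represent pass through the normal conversion untouched; the handlers are only consulted for the ones that fail.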



Local mail search i18n requirement

Header search

  • MIME decode and charset conversion for headers - changed. The search term is now Unicode (it used to be in the folder charset), so it should be compared against MIME-decoded, Unicode-converted header strings (see nsMsgI18NDecodeMimePartIIStr).

Body search

  • Charset conversion for the search term - changed. Use the mail default charset (it used to be the folder charset); see nsMsgI18NGetDefaultMailCharset.
  • Optional international search - new. Parse the Content-Type charset per message and apply charset conversion to the search term (not to the body, for performance reasons).
  • UI (a checkbox) for international search.
  • QP decode for body (plain and HTML).
  • Entity (CER) and NCR decode for HTML body - new; could also be an option (international search).
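
The header and body rules might look roughly like this in Python. It is a sketch under assumptions: the function names and parameters are hypothetical, matching is a plain substring test, and the HTML entity step decodes the body to text, which the specification leaves open as an option.

```python
import html
import quopri
from email.header import decode_header, make_header
from typing import Optional


def header_matches(raw_header: str, term: str) -> bool:
    """Compare a Unicode term against the MIME-decoded header text."""
    decoded = str(make_header(decode_header(raw_header)))
    return term.casefold() in decoded.casefold()


def body_matches(body: bytes, term: str, default_charset: str,
                 message_charset: Optional[str] = None,
                 international: bool = False,
                 qp: bool = False, html_body: bool = False) -> bool:
    """Search a raw body by converting the term, not the body."""
    if qp:
        body = quopri.decodestring(body)  # undo quoted-printable first
    # International search uses the per-message Content-Type charset;
    # otherwise the mail default charset is used.
    charset = message_charset if (international and message_charset) else default_charset
    try:
        needle = term.encode(charset)
    except (UnicodeEncodeError, LookupError):
        return False  # the term cannot be expressed in this charset
    # Byte-level match (case-sensitive). Note that this is only reliable
    # for stateless charsets, not for escape-based ones like ISO-2022-JP.
    if needle in body:
        return True
    if international and html_body:
        # Optional step: decode entities (CER) and NCRs, then compare text.
        text = html.unescape(body.decode(charset, errors="replace"))
        return term in text
    return False
```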


IMAP search i18n requirement

UTF-8 - new
Fallback search charset (mail default charset) - changed; it used to be the mail folder charset
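
The fallback order can be sketched as a loop over candidate charsets. Everything named here is hypothetical: `run_search` stands in for whatever sends the IMAP SEARCH command with a CHARSET argument, and `BadCharset` stands in for the server rejecting that charset. The final US-ASCII step is taken from the IMAP search row of the status table.

```python
from typing import Callable, List, Optional, Tuple


class BadCharset(Exception):
    """Hypothetical signal that the server rejected the search charset."""


def candidate_charsets(default_charset: Optional[str]) -> List[str]:
    """UTF-8 first, then the mail default charset, then US-ASCII."""
    ordered: List[str] = []
    for cs in ("UTF-8", default_charset, "US-ASCII"):
        if cs and cs.upper() not in (c.upper() for c in ordered):
            ordered.append(cs)
    return ordered


def search_with_fallback(run_search: Callable[[str, bytes], List[int]],
                         term: str,
                         default_charset: Optional[str]
                         ) -> Tuple[Optional[str], List[int]]:
    """Try each charset in order; return (charset used, message numbers)."""
    for charset in candidate_charsets(default_charset):
        try:
            encoded = term.encode(charset)
        except (UnicodeEncodeError, LookupError):
            continue  # the term is not representable; try the next charset
        try:
            return charset, run_search(charset, encoded)
        except BadCharset:
            continue  # the server refused this charset; fall back
    return None, []
```

A server that understands UTF-8 is queried exactly once; older servers cause one extra round trip per rejected charset.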



LDAP search i18n requirement

UTF-8 search by server



Charset override

todo


Copyright © 1998-2000 The Mozilla Organization.
Last modified February 7, 2000.