Recovering lost mail by editing Thunderbird mbox files

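It was Saturday. I got up early, had breakfast, made a pot of tea, and happily went through my mail while the laundry ran.
Too much mail had piled up, so I was clicking too fast, and then, for reasons I still don't understand, a whole batch of messages vanished.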

Important:
Everything below assumes you have made a full backup first.
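
A minimal sketch of such a backup in Ruby, assuming the whole profile directory is copied while Thunderbird is closed. The function name `backup_profile` and the `.backup-<timestamp>` naming are my own choices for illustration, not anything Thunderbird prescribes.

```ruby
require "fileutils"

# Copy a whole profile directory to a timestamped sibling directory.
# profile_dir is a hypothetical path; close Thunderbird before running.
def backup_profile(profile_dir, stamp = Time.now.strftime("%Y%m%d-%H%M%S"))
  src = profile_dir.chomp("/")
  dest = "#{src}.backup-#{stamp}"
  raise "backup already exists: #{dest}" if File.exist?(dest)
  # preserve: true keeps modification times and permissions on the copy
  FileUtils.cp_r(src, dest, preserve: true)
  dest
end
```

The function returns the path of the copy, so it is easy to check that the backup exists before touching anything else.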

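First I tried rebuilding the folder's .msf index file. There are two ways:
1. Right-click the folder -> Properties -> Repair Folder
2. Close Thunderbird, go to the mail directory, delete the folder's .msf file, and start Thunderbird again
That brought back only part of the mail.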
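
The second method can be sketched in Ruby as follows. This version renames the index instead of deleting it, which is a slightly more cautious variation of my own; `set_aside_index` is a hypothetical helper name, and the sketch assumes the index sits next to the mbox file as `<folder>.msf`.

```ruby
# Move a folder's .msf index aside so that Thunderbird rebuilds it the
# next time it starts. mbox_path is the extensionless mail file,
# for example ".../Inbox" (a hypothetical path).
def set_aside_index(mbox_path)
  msf = "#{mbox_path}.msf"
  return nil unless File.exist?(msf)
  aside = "#{msf}.bak"
  File.rename(msf, aside)
  aside
end
```

Run it only while Thunderbird is closed; it returns the new name of the index file, or nil when there was no index to move.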

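That wasn't good enough; many of those messages mattered. So I went for the full reset:
Close Thunderbird, rename global-messages-db.sqlite to global-messages-db.sqlite.bak, and restart Thunderbird.
Still no luck~~
I was close to tears.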

Fine, time to look for tools~~
I tried ZMail and Advanced Media Recovery. Testing showed they were no help at all~~

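So the only option left was to read up and do it myself.
Each leaf mail folder in Thunderbird consists of two files:
an mbox file with no extension, and an index file with the .msf extension.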
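
To see that layout on disk, a small sketch like the one below lists every extensionless file under a mail directory and whether a matching .msf index exists. `list_folders` is a hypothetical helper, and "no extension means mbox" is only a heuristic based on the description above.

```ruby
# Return [path, has_index] pairs for every extensionless regular file
# under mail_root. mail_root is a hypothetical path to the mail directory.
def list_folders(mail_root)
  candidates = Dir.glob(File.join(mail_root, "**", "*")).select do |path|
    File.file?(path) && File.extname(path).empty?
  end
  candidates.sort.map { |path| [path, File.exist?("#{path}.msf")] }
end
```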

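The mbox file holds all the messages and their attachments. After some poking around, I found the following rules:
1. An mbox file can be edited with any decent text editor. If you use Notepad, don't bother.
2. Each message is a block of MIME-formatted text. It begins with headers like the ones below and runs until the next "From - " line.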

From - Thu Dec 19 01:26:18 2013
X-Account-Key: account4
X-UIDL: ZC0418-TJAs6xVgg1PigBtJ_pc_a3c
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
X-Mozilla-Keys:                                                                                 
X-QQ-SSF: 00410000000000F0
X-QQ-BUSINESS-ORIGIN: 2
X-Originating-IP: 112.65.5.74
X-QQ-STYLE: 
X-QQ-mid: bizmail35t1387361723t6111701
From: "=?utf-8?B?5pyx6I6J6Imz?=" <xxx@xxx.com>
To: "=?utf-8?B?6auY5pmX?=" <xxx@xxx.com>

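3. Messages differ in encoding. The ones I saw most often were the gb2312 and utf-8 character sets, plus BASE64 as a transfer encoding.
4. Attachments are stored as MIME parts below the message text, for example: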

MIME-Version: 1.0
Content-Type: multipart/related;
	boundary="----=_NextPart_000_0007_01CE78E0.0A87C340"
...

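5. Messages are independent of each other. You can reorder them and cut, copy, and paste them freely, as long as each message stays intact.
6. A message's state is set by the following two flag fields: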

module Mbox
  #X-Mozilla-Status
  #Message has been read.
  MSG_FLAG_READ=0x0001
  #A reply has been successfully sent.
  MSG_FLAG_REPLIED=0x0002
  #The user has flagged this message.
  MSG_FLAG_MARKED=0x0004
  #Already gone (when folder not compacted). Since actually removing a message from a folder is a semi-expensive operation, we tend to delay it; messages with this bit set will be removed the next time folder compaction is done. Once this bit is set, it never gets un-set.
  MSG_FLAG_EXPUNGED=0x0008
  #Whether subject has “Re:” on the front. The folder summary uniquifies all of the strings in it, and to help this, any string which begins with “Re:” has that stripped first. This bit is then set, so that when presenting the message, we know to put it back (since the “Re:” is not itself stored in the file).
  MSG_FLAG_HAS_RE=0x0010
  #Whether the children of this sub-thread are folded in the display.
  MSG_FLAG_ELIDED=0x0020
  #DB has offline news or imap article.
  MSG_FLAG_OFFLINE=0x0080
  #If set, this thread is watched.
  MSG_FLAG_WATCHED=0x0100
  #If set, then this message's sender has been authenticated when sending this msg. This means the POP3 server gave a positive answer to the XSENDER command. Since this command is not standard and is known to only a few servers, this flag is meaningless in most cases.
  MSG_FLAG_SENDER_AUTHED=0x0200
  #If set, then this message's body contains not the whole message, and a link is available in the message to download the rest of it from the POP server. This can be only a few lines of the message (in case of size restriction for the download of messages) or nothing at all (in case of “Fetch headers only”)
  MSG_FLAG_PARTIAL=0x0400
  #If set, this message is queued for delivery. This only ever gets set on messages in the queue folder, but is used to protect against the case of other messages having made their way in there somehow – if some other program put a message in the queue, it won't be delivered later!
  MSG_FLAG_QUEUED=0x0800
  #This message has been forwarded.
  MSG_FLAG_FORWARDED=0x1000
  #These are used to remember the message priority in internal status flags.
  MSG_FLAG_PRIORITIES=0xE000

  
  #X-Mozilla-Status2
  #This message is new since the last time the folder was closed.
  MSG_FLAG2_NEW=0x00010000
  #If set, this thread is ignored.
  MSG_FLAG2_IGNORED=0x00040000
  #If set, this message is marked as deleted on the server. This only applies to messages on IMAP servers.
  MSG_FLAG2_IMAP_DELETED=0x00200000
  #This message requires an MDN (Message Disposition Notification) to be sent to the sender of the message. For information about MDN see Wikipedia:Return receipt.
  MSG_FLAG2_MDN_REPORT_NEEDED=0x00400000
  #An MDN report message has been sent for this message. No more MDN report should be sent to the sender.
  MSG_FLAG2_MDN_REPORT_SENT=0x00800000
  #If set, this message is a template.
  MSG_FLAG2_TEMPLATE=0x01000000
  #These are used to store the message label.
  #label value
  #1 0x02000000
  #2 0x04000000
  #3 0x06000000
  #4 0x08000000
  #5 0x0A000000
  #6 0x0C000000
  #7 0x0E000000
  MSG_FLAG2_LABELS=0x0E000000
  #If set, this message has files attached to it.
  MSG_FLAG2_ATTACHMENT=0x10000000 
end
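
To read an X-Mozilla-Status value at a glance, the single-bit flags above can be turned into names. The sketch below repeats the bit values in a table of its own so that it runs standalone; `STATUS_FLAGS` and `decode_status` are names I made up for illustration, and the multi-bit priority mask is left out.

```ruby
# Single-bit X-Mozilla-Status flags, copied from the list above.
STATUS_FLAGS = {
  0x0001 => "READ",
  0x0002 => "REPLIED",
  0x0004 => "MARKED",
  0x0008 => "EXPUNGED",
  0x0010 => "HAS_RE",
  0x0020 => "ELIDED",
  0x0080 => "OFFLINE",
  0x0100 => "WATCHED",
  0x0200 => "SENDER_AUTHED",
  0x0400 => "PARTIAL",
  0x0800 => "QUEUED",
  0x1000 => "FORWARDED"
}.freeze

# Turn the hex text of an X-Mozilla-Status header into flag names.
def decode_status(hex)
  value = hex.strip.to_i(16)
  STATUS_FLAGS.select { |bit, _name| (value & bit) != 0 }.values
end

p decode_status("0001")  # ["READ"]
p decode_status("0009")  # ["READ", "EXPUNGED"]
```

A message whose status decodes to something containing EXPUNGED is one that Thunderbird treats as already deleted, even though its text is still in the file.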

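That made it straightforward. I wrote a program that walks every message in every mbox file, checks its status, and reports the messages whose status is wrong:
PS: I've only been learning Ruby for a few days in total, haha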

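#!/usr/bin/ruby
# EnumFiles.rb

class EnumFiles
  # List the entries directly inside folderPath.
  # File.join works whether or not folderPath ends with a slash.
  def enumFiles(folderPath)
    Dir.glob(File.join(folderPath, "*"))
  end

  # List every entry under folderPath, recursively.
  def enumFilesAll(folderPath)
    Dir.glob(File.join(folderPath, "**", "*"))
  end
end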
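#!/usr/bin/ruby
# ReadMbox.rb

class ReadMbox
  # Return all lines of the file, or an empty array if it does not exist.
  def readLine(filePath)
    unless File.exist?(filePath)
      puts(filePath + " does not exist")
      return []
    end
    # Binary mode: a single mbox can mix gb2312, utf-8 and other encodings.
    File.open(filePath, "rb") { |f| f.readlines }
  end

  # Count the messages in one mbox file and how many carry a suspicious flag.
  def readMsgNum(filePath, rootPath)
    msgNum = 0
    msgErr = 0
    bError = false  # true once the current message has been counted as bad

    if File.exist?(filePath)
      # Read-only, binary; the block form closes the file when done.
      File.open(filePath, "rb") do |f|
        f.each_line do |l|
          # A "From - " line starts a new message.
          if l.start_with?("From - ")
            msgNum += 1
            bError = false
          end

          if l.start_with?("X-Mozilla-Status: ") && !bError
            flag = l[18..21].to_i(16)
            if (flag & 0x0008) != 0  # MSG_FLAG_EXPUNGED
              puts("]>" + l)
              msgErr += 1
              bError = true
            end
          end

          if l.start_with?("X-Mozilla-Status2: ") && !bError
            flag2 = l[19..26].to_i(16)
            if (flag2 & 0x00040000) != 0  # MSG_FLAG2_IGNORED
              puts("]>" + l)
              msgErr += 1
              bError = true
            end
          end
        end
      end
    end

    puts(filePath.sub(rootPath, "") + " EmailNum=\t" + msgNum.to_s + "\terrNum=\t" + msgErr.to_s)
  end
end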
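#!/usr/bin/ruby
# Main script; expects EnumFiles.rb and ReadMbox.rb in the same directory.

require_relative "EnumFiles"
require_relative "ReadMbox"

# Root of the Thunderbird mail directory; replace with your own path.
rootPath = "PATH/TO/MAIL/"

files = EnumFiles.new.enumFilesAll(rootPath)
#puts files

mbox = ReadMbox.new

files.each do |f|
  # mbox files are regular files without an extension
  if !File.directory?(f) && File.extname(f).empty?
    mbox.readMsgNum(f, rootPath)
  end
end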

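In the end it turned out there were not many affected messages, only 5, so I didn't bother writing more code. I cut and pasted them by hand into the Trash folder, set the flags back to normal, rebuilt the Trash index, and all was well again.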

X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
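
I did this step by hand, but for a larger number of messages it could be scripted. The sketch below is one possible approach, not what I actually ran: it clears only the EXPUNGED bit (0x0008) in every X-Mozilla-Status line and leaves the other bits alone. `clear_expunged` is a hypothetical helper, and it would also rewrite a body line that happens to start with "X-Mozilla-Status: ", so review the result before using it.

```ruby
EXPUNGED = 0x0008

# Return a copy of mbox text with the EXPUNGED bit cleared in every
# "X-Mozilla-Status: hhhh" line. X-Mozilla-Status2 lines are not touched.
def clear_expunged(text)
  text.gsub(/^X-Mozilla-Status: ([0-9a-fA-F]{4})/) do
    value = Regexp.last_match(1).to_i(16) & ~EXPUNGED
    format("X-Mozilla-Status: %04x", value)
  end
end

# Example with hypothetical file names, writing a repaired copy
# instead of editing the original in place:
#   fixed = clear_expunged(File.binread("Trash"))
#   File.binwrite("Trash.fixed", fixed)
```

After swapping in the repaired file, the folder's .msf index still has to be rebuilt so that Thunderbird rereads the flags.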

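Postscript:
When Thunderbird deletes a message (including from Trash), for efficiency it does not necessarily remove the message from the file; it may well just mark it with a flag.

So with a decent editor such as VIM or PSPad, you can easily locate the message you need and then back it up or restore it.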
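
The same lookup can be done in code. Here is a minimal sketch, assuming messages are delimited by lines starting with "From - " as described earlier; `split_messages` and `find_by_header` are hypothetical helper names. The search string should be plain ASCII (an address or a Message-ID, say), because non-ASCII header text is usually stored in encoded form.

```ruby
# Split mbox text into messages; each one starts at a "From - " line.
def split_messages(text)
  messages = []
  text.each_line do |line|
    if messages.empty? || line.start_with?("From - ")
      messages << line.dup
    else
      messages.last << line
    end
  end
  messages
end

# Return the messages whose header block contains needle.
def find_by_header(text, needle)
  split_messages(text).select do |msg|
    headers = msg.split(/\r?\n\r?\n/, 2).first.to_s
    headers.include?(needle)
  end
end

# Example with hypothetical names: save matching messages to a separate file.
#   hits = find_by_header(File.binread("Inbox"), "xxx@xxx.com")
#   File.binwrite("rescued.mbox", hits.join)
```

Since each message is self-contained, the joined result is itself a valid mbox fragment that can be pasted into another folder's file.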

References:
Restoring the global index file
Officially recommended editing tools
Flag descriptions
Flag source code