ToddAndMargo: grep is a decent start, but it actually is much more 
complicated than that.


To get this data in a format you can read/process, you have to deal with 
the fact that the mail standards don't fix the order of headers, and 
every client seems to write them differently.  On top of that, some 
clients might record a "Sent" header instead of a "Date" header, and 
then you have to deal with control fields, etc.

Anyway, assuming Thunderbird on Linux, the MBOX files live under 
~/.thunderbird/<YOUR_PROFILE_ID>/ImapMail/<IMAPSERVERNAME_FOLDER>/<REMOTE_FOLDER> 


YOUR_PROFILE_ID is whatever directory you see there.  It's a random 
string, and if you only have one profile it will end in ".default".
IMAPSERVERNAME_FOLDER is the email account you are looking for, e.g. 
"imap.gmail.com".  If you have multiple accounts it might have a "-2", 
"-3", etc. appended.
REMOTE_FOLDER is the name of the folder you are trying to scrape, so 
"INBOX", "Sent", "Spam", etc.
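
If it is not obvious which files are there, the layout described above 
can be walked with find.  This is only a sketch: list_mbox_files is a 
made-up helper name, ~/.thunderbird is assumed as the default location, 
and the ".msf" files next to each folder are assumed to be Thunderbird's 
index files rather than mail.  The listing may still include a few 
non-mail files.

```shell
#!/bin/bash
# list_mbox_files is a hypothetical helper, not part of Thunderbird.
# It prints every regular file below an ImapMail directory, skipping the
# ".msf" index files that sit next to each folder.
list_mbox_files() {
    root="${1:-$HOME/.thunderbird}"   # assumed default profile location
    find "$root" -path '*/ImapMail/*' -type f ! -name '*.msf' | sort
}

# Usage: list_mbox_files            (searches ~/.thunderbird)
#        list_mbox_files /some/dir  (searches another directory)
```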

I wrote this simple combination of grep/awk to convert everything into a 
CSV that you can import into whatever you want.

If you save this in thunderbird_to_csv.sh, you can execute it like so 
(the first argument is the Thunderbird MBOX file):

$ ./thunderbird_to_csv.sh \
    ~/.thunderbird/<YOUR_PROFILE_ID>/ImapMail/<IMAPSERVERNAME_FOLDER>/<REMOTE_FOLDER>




#!/bin/bash
# Usage: ./thunderbird_to_csv.sh <mbox-file>
# grep keeps only the separator and header lines; awk builds one CSV row
# per message from them.
grep -E "^((Subject|Date|Sent|From): |From - )" "$1" | awk '
# Quote one CSV field: drop a trailing carriage return, double any quotes.
function csv(s) {
    sub(/\r$/, "", s);
    gsub(/"/, "\"\"", s);
    return "\"" s "\"";
}
# Print the row collected for the current message, if there is one.
function emit() {
    if (seen) print csv(from) "," csv(subject) "," csv(date);
}
BEGIN { print "From,Subject,Date"; }
# A "From - " line starts a new message: emit the previous one and reset.
/^From - / { emit(); seen = 1; from = ""; subject = ""; date = ""; next; }
# Keep only the first occurrence of each header within a message.
from == ""    && /^From: /        { from = substr($0, 7); }
subject == "" && /^Subject: /     { subject = substr($0, 10); }
date == ""    && /^(Date|Sent): / { date = substr($0, 7); }
END { emit(); }'




For those curious what this does, the grep command strips everything 
down to lines starting with "Subject: ", "From: ", "Date: ", "Sent: ", 
and "From - ".

The "From " separator line is part of the MBOX format.  As far as I can 
tell, Thunderbird writes a "-" where other programs put the sender 
address, which gives you "From - " followed by a date.  Glad it does, as 
it separates each email pretty nicely.
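
Because every message begins with that separator, counting the 
separator lines gives a quick sanity check on how many rows to expect in 
the CSV.  A minimal sketch (count_messages is a made-up name; the number 
may be higher than what the GUI shows if the file still contains deleted 
messages):

```shell
#!/bin/bash
# count_messages is a hypothetical helper name.
# It counts the lines that start with the "From - " separator, which
# should equal the number of messages stored in the MBOX file.
count_messages() {
    grep -c '^From - ' "$1"
}

# Usage: count_messages <mbox-file>
```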

The awk part then goes through those lines message by message and 
records each header IF AND ONLY IF THAT HEADER HASN'T BEEN SEEN YET for 
the current message.  So for instance, if you have an email that was 
originally from Bob and was forwarded to you by Alice, the first 
"From: " is Alice's.  If I kept searching past it, the CSV would say the 
email was from Bob and not Alice, which is wrong because Alice is who 
actually sent that email.

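After that, it escapes the double quotes inside each field to be two 
double-quotes, the standard for CSV files, and removes the trailing 
line-ending character.  (awk already strips the line feed itself, so 
what is left over is most likely a carriage return from CRLF line 
endings.)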
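
To see the quoting step in isolation, here is a small sketch.  The name 
csv_quote is made up for this illustration, and it assumes the leftover 
character is a carriage return, so it only removes one if it is actually 
there.

```shell
#!/bin/bash
# csv_quote is a hypothetical helper name used only for this illustration.
# It removes one trailing carriage return (if present), doubles every
# embedded double quote, and wraps the result in double quotes.
csv_quote() {
    printf '%s\n' "$1" | awk '{ sub(/\r$/, ""); gsub(/"/, "\"\""); print "\"" $0 "\"" }'
}

csv_quote 'Re: the "final" report'
# prints: "Re: the ""final"" report"
```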

-Brad



On 08/24/2015 06:54 PM, ToddAndMargo wrote:
> On 08/24/2015 04:29 PM, Yasha Karant wrote:
>> My query applies specifically to Mozilla Thunderbird current, but could
>> have a more general solution.
>>
>> I need to convert to a plain text file listing (that could be imported
>> into a word processor, LaTeX or a GUI front end thereto, etc) what
>> appears in the display of Thunderbird as the columns Subject From and
>> Date for an internal activity report that I must write. These columns
>> appear on the end-user GUI display and allow one to then read specific
>> messages by "point and click".  As I cannot find a description of the
>> official Thunderbird nomenclature for the various sections of the GUI
>> display, I am using the above descriptions.
>>
>> I could use a screenshot application, select a rectangular region, save
>> each entity as a PNG image, and then use an OCR application to yield
>> plain text.  I would prefer that the screenshot application simply
>> recognizes the text *AS* text, allowing me to copy and paste into a text
>> editor, etc., all running under X Windows.   Does anyone know of an
>> application that does this?  A brief perusal on the web as well as a
>> quick read of the information on the "default" screenshot applications
>> that come with either MATE or KDE does not seem to reveal a mechanism
>> for this (but rather the PNG or other image, non-text, route).
>>
>> The normal mechanism I use -- highlight (select), pointing device button
>> (to copy), and then point device button (paste) to capture from say a
>> text HTTP file in a web browser to a word processor application -- does
>> not seem to work for the above "column" portion of the Thunderbird
>> display.  This normal mechanism does work if I view source for each
>> message, displaying the SMTP text source and headers in a box, but is
>> very time consuming as the information that I need is available in the
>> "columns" of the basic Thunderbird user interface without having to view
>> the source.
>>
>> Any assistance is appreciated.
>>
>> Yasha Karant
>
> Hi Yasha,
>
>    Something like this?
>
>        grep -i  "subject\|from\|date" Inbox
>
> -T
>
>
>