* Cosmo <cosmo@xxxxxxxxxx> [2003-02-19 09:03:56 +0000]:
> There are plenty of XXXX->html converters for all sorts of doc types
> but I'm not aware of a html->text converter - everyone seems hellbent
> on producing html even if just to be able to print the heading in
> *boldface*.
I'm not sure how resource-intensive this is -- I know it works fine
for my mail and I get over 100 messages a day, most of them spam -- but
I use lynx via procmail to convert text/html emails to text/plain (I
leave multipart ones intact).
If you want my personal opinion (and as the saying goes, everybody's
got one :p), any email with a content-type of text/html should be
stopped cold. If Hotmail is using it, well, their users might finally
have to feel 1/100th of the pain that their usage of that service
inflicts on the rest of the net. Note that text/html is not the same as
multipart, where normal mail clients can at least get at a plain text
version to display (even if we still have to deal with the overall
sluggishness of the Net that results from SMTP-ing around all that extra
HTML crap). As you can probably guess, though, I have no love for that
format either.
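:0
* ^Content-type: text/html
{
    # Keep an untouched copy (c) of the original message, just in case.
    :0 c
    ${MAILDIR}/inc/html-safetynet

    # Filter (f) the body only (b) through lynx to render the HTML as text.
    :0 fb
    |lynx -nopause -force_html -dump /dev/stdin

    # If that succeeded (a), filter (f) the headers only (h) through
    # formail and wait (w) for its exit status.
    :0 afwh
    |formail -i "Content-type: text/plain" -I "X-HTML-Strip: 1.0"
}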
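
If you'd rather see what the two filter steps do before trusting them
with real mail, something like the following should let you run them by
hand. It's only a rough sketch: the sample message and its addresses are
made up, and the split at the first blank line is my approximation of
what procmail's "h" and "b" flags hand to each command. The lynx and
formail command lines are the same ones used in the recipe.

```shell
#!/bin/sh
# Hypothetical sample message; the addresses are placeholders.
msg='From: sender@example.com
To: you@example.com
Subject: HTML test
Content-type: text/html

<html><body><h1>Hello</h1><p>Some <b>bold</b> text.</p></body></html>'

# Split at the first blank line: everything before it is the header,
# everything after it is the body.
headers=$(printf '%s\n' "$msg" | sed '/^$/q')
body=$(printf '%s\n' "$msg" | sed '1,/^$/d')

if command -v lynx >/dev/null 2>&1 && command -v formail >/dev/null 2>&1; then
    # Header step: same formail command as the ":0 afwh" recipe.
    new_headers=$(printf '%s\n\n' "$headers" |
        formail -i "Content-type: text/plain" -I "X-HTML-Strip: 1.0")
    # Body step: same lynx command as the ":0 fb" recipe.
    new_body=$(printf '%s\n' "$body" |
        lynx -nopause -force_html -dump /dev/stdin)
    # Reassemble the message and print it for inspection.
    printf '%s\n\n%s\n' "$new_headers" "$new_body"
else
    echo "lynx and/or formail not installed; skipping the filter steps" >&2
fi
```

Nothing here touches your mailbox; it just prints the converted message
so you can eyeball the result.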
As a list maintainer, you'll probably want to remove that little
clause that writes a copy to the "html-safetynet" file. For everyone's
information, I have never once recovered a legitimate email from that
folder in the year I've been using this recipe, so I'm not sure why I
keep it around myself... Anyway, after running this recipe you'll wind
up with these three headers:
    Old-Content-type: text/html
    Content-type: text/plain
    X-HTML-Strip: 1.0
The first is added by formail when it replaces the "Content-type"
header: the -i option "backs up" the old field by renaming it with an
"Old-" prefix. The second, well, duh. :p The third is just something I
arbitrarily added (with -I, which keeps no backup) so I could see at a
glance whether an email had been "filtered", just in case a legit one
ever came in and I for some reason needed to recover the original HTML
version.
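
Since the marker header is there to be spotted at a glance, here's one
way you might count how many messages in a mailbox have been through the
filter. Treat it as a sketch under assumptions: it expects an mbox-format
file (messages separated by "From " lines), and both the function name
count_stripped and the example path in the comment are just names I
picked for illustration.

```shell
#!/bin/sh
# Count messages in an mbox file that carry an X-HTML-Strip header.
# Only header lines are examined, so the same text appearing in a
# message body is not counted.
count_stripped() {
    awk '
        /^From /          { in_header = 1; next }  # start of a new message
        in_header && /^$/ { in_header = 0 }        # blank line ends the header
        in_header && tolower($0) ~ /^x-html-strip:/ { n++ }
        END               { print n + 0 }
    ' "$1"
}

# Example (hypothetical path):
#   count_stripped "$HOME/Mail/inbox"
```

A plain grep for the header name would be shorter, but it would also
match the string if somebody quoted it in a message body.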
--
------------------------------------------------------------------------
John Buttery
(Web page temporarily unavailable)
------------------------------------------------------------------------