MailNews

From ThorxWiki

Latest revision as of 20:07, 26 January 2014


Nemo's notes on making mutt work for newsgroups, via some helper utilities that talk nntp, and scripts which bind them together.

Reading

slrnpull pulls messages into its own spool. This spool is readable with mutt directly. The best method for that is to create a Maildir, but make 'new' a symlink to the newsgroup spool. The filenames don't start out in the Maildir recommended format, but they are happily read. And when they're written out to 'cur' they get flagged appropriately and everything is happy.
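A minimal sketch of that symlink setup, in Python. The function name and the paths passed to it are illustrative assumptions, not part of any existing tool; only the layout ('cur' and 'tmp' as real directories, 'new' as a symlink to the group's spool directory) comes from the note above.

```python
import os


def make_symlinked_maildir(maildir, spool_group_dir):
    """Create maildir/{cur,tmp} and point maildir/new at the spool directory."""
    for sub in ("cur", "tmp"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)
    new_link = os.path.join(maildir, "new")
    # lexists() is true for a symlink even if its target has gone away,
    # so an existing link is never overwritten.
    if not os.path.lexists(new_link):
        os.symlink(os.path.abspath(spool_group_dir), new_link)
    return new_link
```

It would be called once per group, for example with a Maildir such as ~/Maildir/.nntp.alt.test and that group's spool directory (both paths are assumptions about the local setup).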

newsspool2Maildir

Alternatively, I want to keep my news spool "pure" (as a backup/archive), so I copy the messages over with newsspool2Maildir...

newsspool2Maildir simply reads a news spool (so it's an extremely simple news client?), and caches what it reads to a Maildir (so, it's an MDA).

It operates by:

  1. discovering locally "subscribed" newsgroups (for me, they are all Maildir folders matching the path "~/Maildir/.nntp.*", where the trailing wildcard is the group name)
  2. for each group, finding new messages in the equivalent slrnpull spool and copying them over to the Maildir as new messages. (See Maildir for notes on the filenames I give them ;)

Messages can now be read, flagged, deleted, etc, as normal in mutt (or within any other mail client of choice that can read the Maildir, either directly or via IMAP/etc).

It operates in a read-only fashion on the spool, and keeps state of where it read up to for each group via a state file within the group's Maildir. Flushing the spool is outside the scope of this utility and should be handled by slrnpull.
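The two steps and the state file can be sketched roughly as below. This is not the actual newsspool2Maildir script. The state-file name, the delivered filename scheme and the assumption that a group such as alt.test is spooled under alt/test/ as numbered files are all placeholders for illustration.

```python
import glob
import os
import shutil

PREFIX = ".nntp."
STATE_FILE = "newsspool2maildir.state"  # hypothetical state-file name


def subscribed_groups(maildir_root):
    """Return group names taken from folders matching <maildir_root>/.nntp.*"""
    paths = glob.glob(os.path.join(maildir_root, PREFIX + "*"))
    return sorted(os.path.basename(p)[len(PREFIX):]
                  for p in paths if os.path.isdir(p))


def sync_group(group, spool_root, maildir_root):
    """Copy articles numbered above the saved high-water mark into new/."""
    spool_dir = os.path.join(spool_root, *group.split("."))  # assumed layout
    maildir = os.path.join(maildir_root, PREFIX + group)
    for sub in ("new", "cur", "tmp"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)
    state_path = os.path.join(maildir, STATE_FILE)
    last = 0
    if os.path.exists(state_path):
        with open(state_path) as fh:
            last = int(fh.read().strip() or 0)
    entries = []
    if os.path.isdir(spool_dir):
        entries = sorted((int(n), n) for n in os.listdir(spool_dir)
                         if n.isdigit()
                         and os.path.isfile(os.path.join(spool_dir, n)))
    copied = 0
    for num, fname in entries:
        if num <= last:
            continue
        name = "%s.%d" % (group, num)  # placeholder filename scheme
        tmp_path = os.path.join(maildir, "tmp", name)
        # The spool is only ever read; the copy lands in tmp/ first and is
        # then renamed into new/, as Maildir delivery expects.
        shutil.copyfile(os.path.join(spool_dir, fname), tmp_path)
        os.rename(tmp_path, os.path.join(maildir, "new", name))
        last = num
        copied += 1
    with open(state_path, "w") as fh:
        fh.write("%d\n" % last)
    return copied
```

A cron job could loop over subscribed_groups() and call sync_group() for each; running it twice in a row copies nothing the second time because the state file already holds the highest article number seen.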

We can also check if slrnpull's conf file matches the groups subscribed to within the Maildir and alert to logs if they differ.

  • Future functionality may write the slrnpull conf file to match also!
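The conf-file comparison might look something like this. It assumes slrnpull's conf file has one group per line with the group name as the first field, '#' starting a comment, and a special "default" entry that is not a group; check that against the local slrnpull documentation before relying on it. The function names are made up for the sketch.

```python
import logging


def conf_groups(conf_text):
    """Group names listed in the conf text (assumed format, see above)."""
    groups = set()
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name = line.split()[0]
        if name != "default":
            groups.add(name)
    return groups


def subscription_drift(conf_text, maildir_groups):
    """Return (groups only in the Maildir, groups only in the conf file)."""
    conf = conf_groups(conf_text)
    local = set(maildir_groups)
    return sorted(local - conf), sorted(conf - local)


def log_drift(conf_text, maildir_groups):
    """Write one warning per mismatch to the log."""
    only_local, only_conf = subscription_drift(conf_text, maildir_groups)
    for group in only_local:
        logging.warning("Maildir exists but slrnpull does not fetch %s", group)
    for group in only_conf:
        logging.warning("slrnpull fetches %s but no Maildir exists", group)
```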

Writing

Compose new messages in mutt as normal, but divert them away from the usual bin/sendmail path to SMTP and redirect them (reformatting as needed) into the slrnpull outgoing queue.

This is done by setting the $sendmail variable within mutt to a custom script, and is active only when reading .nntp.* Maildirs (thanks, folder-hook).

Editing in mutt

I wrote a test email in mutt, adding a Newsgroups: header, and kept three copies for comparison: the message as seen in the mutt editor (vim), mutt's own local saved copy, and the copy handed to $sendmail, which I set to a script that would save that copy too!

vim copy:

From: Nemo Thorx <nemo@house>
To: xxx@example.com
Cc: 
Bcc: 
Subject: test
Reply-To: 
Organization: Thorx Enterprises
Newsgroups: alt.test

test
.test1
..test2
.
...test3

-- 
  ------------------------------------------ --------------------------
                                                    earth native

Note that I had to fill in the To: field; mutt won't let it send without that filled in! (It will let it go with as little as 'xxx@', but if I try just 'xxx', then mutt auto-fills the domain.) The custom headers (including 'Newsgroups') are not visible in the mail summary screen, but are not lost.

local saved copy:

Date: Thu, 27 Oct 2011 23:09:05 +1000
From: Nemo Thorx <nemo@house>
To: xxx@example.com
Subject: test
Message-ID: <20111027130905.GB24456@nemo.house>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Organization: Thorx Enterprises
Newsgroups: alt.test
User-Agent: Mutt/1.5.20 (2009-06-14)
Status: RO

test
.test1
..test2
.
...test3

-- 
  ------------------------------------------ --------------------------
                                                    earth native

Notice that the empty headers were removed and new headers were added.

The copy sent to "sendmail" was the same as the local copy, but without the 'Status' header.

So, to make the "sendmail" copy friendly for newsgroups, we need to delete the mail-specific headers, pad the leading .lines, and save into the slrn out.going spool.

Easy huh?
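The header-deleting part could be sketched with Python's standard email package as follows. The function name is an assumption, and the list of headers to drop is just the mail-only set from the headers analysis (To, Cc, Bcc); padding the .lines and writing to the spool are separate steps not shown here.

```python
from email import message_from_string

MAIL_ONLY_HEADERS = ("To", "Cc", "Bcc")


def strip_mail_headers(raw_message):
    """Return the message text with the mail-only headers removed."""
    msg = message_from_string(raw_message)
    for name in MAIL_ONLY_HEADERS:
        # Deleting a header that is absent is a no-op, so no check is needed.
        del msg[name]
    return msg.as_string()
```

Everything else, including Newsgroups: and the body, passes through untouched.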

Thoughts: could the To: header be renamed on the fly to Newsgroups? (But what about the requirement in mutt that To: include email addresses?) Two options:

  1. the To: field is ignored - ie, any address is ignored
  2. user-munged newsgroup names. eg
    • alt@fan.douglas-adams, alt@test,
    • ignorethis@alt.fan.douglas-adams
    • alt.fan.douglas-adams@nntp ...this is probably my favourite, since mutt will auto-append the domain to the group name when entered into To:, and the domain can be easily thrown away again by sendmail2newspool :)
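For the group@nntp style, throwing the domain away again could look like this. It is only a sketch: the fake domain "nntp" is taken from the example above, and the function name is hypothetical.

```python
from email.utils import getaddresses

NEWS_DOMAIN = "nntp"  # the fake domain used in the example above


def split_recipients(header_values):
    """Split address header values into (newsgroups, real email addresses)."""
    groups, emails = [], []
    for _name, addr in getaddresses(header_values):
        local, _sep, domain = addr.rpartition("@")
        if local and domain.lower() == NEWS_DOMAIN:
            groups.append(local)
        elif addr:
            emails.append(addr)
    return groups, emails
```

Passing it the To:, Cc: and Bcc: values together gives one list to join into a Newsgroups: header and one list of addresses that still need ordinary mail delivery.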

Note also that mutt WILL allow sending without a subject, whilst that will fail to nntp (which appears to require at minimum a "Subject: " header)

Caveats

Mutt will <r>eply (or <g>roup reply) to the author's email address. I can set a folder-hook which sets a "my_hdr to" to add an additional address as a reminder, but the 'to' field does ultimately have to be manually set!

Posting a reply to author from mutt?

I'm not sure if this is possible without getting convoluted. (detect a non-group email address and pipe the message through to sendmail?)

Minimum post requirement for slrnpull

A post in the var/spool/slrnpull/out.going/ directory needs only to be named "X", and can get away with only From:, Newsgroups: and Subject: headers, along with a body. Upstream servers will fill out the remainder.

The file needs to be protocol encoded - that is, any line with a leading "." needs that "." to be doubled.

For example, a file named 'X' with these contents:

From: Nobody <devnull@home>
Newsgroups: alt.test
Subject: test 2

test
.testone
..testtwo
...testthree
.
test after a terminating .

returns from the same server, in the same slrnpull loop as when it was posted, as:

Path: news.astraweb.com!news-xref3.astraweb.com!not-for-mail
From: Nobody <devnull@home>
Newsgroups: alt.test
Subject: test 2
Date: 27 Oct 2011 12:17:36 GMT
Lines: 4
Message-ID: <4ea94be0$0$14069$c3e8da3$76491128@news.astraweb.com>
Organization: Unlimited download news at news.astraweb.com
NNTP-Posting-Host: a82854f6.news.astraweb.com
X-Trace: DXC=m2DM4:=`Ee2o^d=IeQT`3>L?0kYOcDh@:[^oANOZPWM0m8]Ha;Fb[[:l9B_G1R=S=7YoIj:2_Q5X<fQ:A^;C6QQ0\M[9B[:?8U:
Xref: news.astraweb.com alt.test:2319994

test
testone
.testtwo
..testthree

Note that the '.' on a line by itself ended the post, and that every other line with a leading '.' lost one of its dots!
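The encoding, and what happened to the unencoded test post, can be reproduced in a few lines of Python. Both function names are made up for this sketch; server_view() only imitates the behaviour observed above (stop at a lone '.', strip one leading '.' from other lines) and is not a real NNTP implementation.

```python
def dot_stuff(body):
    """Double the leading '.' of every line that starts with one."""
    return "\n".join("." + line if line.startswith(".") else line
                     for line in body.split("\n"))


def server_view(stored_body):
    """Imitate the round trip: a lone '.' ends the post, other leading dots lose one."""
    out = []
    for line in stored_body.split("\n"):
        if line == ".":
            break
        out.append(line[1:] if line.startswith(".") else line)
    return "\n".join(out)


raw = "test\n.testone\n..testtwo\n...testthree\n.\ntest after a terminating ."
print(server_view(raw))             # the truncated post seen above
print(server_view(dot_stuff(raw)))  # the full body survives once encoded
```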

The headers provided by slrn in a post are:

Newsgroups: 
From:
Subject:
Reply-To: 
Followup-To: 
Keywords: 
Summary: 

whilst if posted with minimal additions (ie, only Newsgroups:, From: and Subject: are set), then the following headers are posted (content removed except for User-Agent: and Message-ID:)

Newsgroups:
From:
Subject:
User-Agent: slrn/pre1.0.0-18 (Linux)
Message-ID: <slrnjckq9p.6gk.nemo@falcon.house.au>


Headers analysis

Common headers that the user (or MUA) has control over...

mail-only headers (will need removing)

  • To
  • Cc
  • Bcc

Common headers

  • From
  • Subject
  • Organization
  • Date - auto generated
  • Message-ID - auto generated
  • User-Agent - auto generated
  • References - (mutt does not generate this one, though it reads it)
  • In-Reply-To
  • Reply-To

news-only headers (needs adding or converting)

  • Newsgroups
  • X-Newsreader

Headers that appear in every news message

(analysis from one group - 3000 messages over 3 years)

Date
From
Message-ID
Newsgroups
Path
Subject
Xref
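An analysis like this can be repeated with a short script. The sketch below is not the one used for the numbers above; the function names are assumptions, as is the idea that each article is a numbered file in the group's spool directory. Header names are compared case-insensitively, so the result comes back in lower case.

```python
import os
from email.parser import BytesHeaderParser


def spool_messages(spool_dir):
    """Yield the raw bytes of each numbered article file in a spool directory."""
    for name in sorted(os.listdir(spool_dir)):
        path = os.path.join(spool_dir, name)
        if name.isdigit() and os.path.isfile(path):
            with open(path, "rb") as fh:
                yield fh.read()


def headers_in_every_message(raw_messages):
    """Return the sorted, lower-cased header names present in all messages."""
    parser = BytesHeaderParser()
    common = None
    for raw in raw_messages:
        names = {key.lower() for key in parser.parsebytes(raw).keys()}
        common = names if common is None else common & names
    return sorted(common or [])
```

Usage would be along the lines of headers_in_every_message(spool_messages(path_to_group_spool)).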

sendmail2newspool design

My ideas so far:

  • Analyse To/Cc/Bcc for email addresses and if they exist, then send a copy of the message to sendmail (with newsgroups removed from these headers)
  • Coalesce all newsgroups from the To/Cc/Bcc headers into a single Newsgroups header.
    • note: how to distinguish between newsgroup fake email addresses, and real email addresses??
  • Should there be any newsgroups in the 'Reply-To' header, move those into 'Followup-To' header. Leave email addresses in this header intact (assuming this is RFC-OK)
  • possibly munge the message-ID header for the sake of script vanity?
  • prepend . to any .lines in the message
  • save file to a valid unique name within slrnpull out.going directory
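The last step, saving under a unique name, might be sketched like this. What counts as a valid name for slrnpull is an open question here: the test earlier only showed that a file named "X" works, so the "X" plus timestamp, pid and counter scheme below is purely an assumption, as are the function names.

```python
import os
import time


def candidate_name(counter):
    """Hypothetical naming scheme: X<unix time>-<pid>-<counter>."""
    return "X%d-%d-%d" % (int(time.time()), os.getpid(), counter)


def queue_post(outgoing_dir, text):
    """Write text to a new file in outgoing_dir, never overwriting an existing one."""
    counter = 0
    while True:
        path = os.path.join(outgoing_dir, candidate_name(counter))
        try:
            # O_EXCL makes creation fail if the name is already taken.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            counter += 1
            continue
        with os.fdopen(fd, "w") as fh:
            fh.write(text)
        return path
```

The file is written in place, so a slrnpull run starting at exactly the wrong moment could see a partial post; whether that matters in practice is untested.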

Handling bounces

Two options:

  1. sendmail2newspool triggers the slrnpull post (normally otherwise handled by cron?) and if the posting fails, grabs the message out of the reject folder, and moves it into a bounce Maildir of some kind for the user to see
    • pro: instant
    • con: this sort of thing is news2Maildir's job
    • also: dunno if this will work with a system spool
  2. news2Maildir checks the reject folder along with its other checks.
    • pro: this is exactly the point of news2Maildir
    • con: not exactly timely, since this is a cron thing.

Best of both worlds: sendmail2newspool triggers news2Maildir at the end of its posting, and so we satisfy both efficiency of code, and promptness of seeing the post either immediately accepted or bounced :)

  • to check: what happens when there are multiple slrnpull instances running simultaneously?
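Whichever script ends up doing it, the "move rejects into a bounce Maildir" step could be sketched with Python's standard mailbox module. The function name and both directory arguments are assumptions; the location of slrnpull's reject folder is not confirmed here.

```python
import mailbox
import os


def collect_rejects(reject_dir, bounce_maildir):
    """Deliver every file in reject_dir to a bounce Maildir, then delete it."""
    box = mailbox.Maildir(bounce_maildir, create=True)
    moved = 0
    for name in sorted(os.listdir(reject_dir)):
        path = os.path.join(reject_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as fh:
            box.add(fh.read())  # lands in the Maildir's new/ directory
        os.remove(path)
        moved += 1
    box.close()
    return moved
```

The rejected files are still protocol encoded at this point, so doubled leading dots would show up in the bounce copy unless they are undone first.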


leafnode?

This could be used against leafnode without too much difficulty, but leafnode provides a local nntp server - implying the use of a local nntp client. And the whole point of this exercise is to not in fact need that!

However, details?

  • newsspool2Maildir needs some directories changing, and that's about it, I think! The stored files (including magic .overview) otherwise work the same.
  • mutt2newsspool - needs to write to leafnode's out.going directory, and set it chmod u+r (0400). Failed postings are in failed.postings.
    • for more info: http://leafnode.sourceforge.net/doc_en/fetchnews.8.html

Other Leafnodeisms:

  • Even with "allow_8bit_headers" set, some articles are not downloaded at all!
    • they're pedantically non-compliant articles, but slrn gets them. Clearly upstream gets them and was happy with them. Google Groups has also been confirmed to be ok with them... leafnode ends up with an empty file in its message.id tree - allowing me to know the message ID, but nothing more! :(
  • If an article has a header but no body, nothing is kept. slrnpull keeps the header at least, which allows for the Message-ID to be searched (say, in Google Groups) and the message body possibly recovered :)
  • the leafnode method of knowing which groups are 'interesting' would be easier to sync than the slrnpull config file method
