Multi-threaded NewsWatcher



Threading articles

Articles are threaded when they are fetched from the news server, to group related articles together and present them to the user in a grouping and order that makes sense. Normally, this means grouping articles that form a thread of discussion: an original posting, the replies to it, and the responses to those replies.

The original NewsWatcher program identified threads simply by their similar subject line, and arranged articles within the thread based on the time they were received by the news server.

Figure: Example of a thread. Threading by reference allows MT-NewsWatcher to work out the proper grouping of articles, based on who replied to which articles.
The problem with subject-based threading is that articles with the same title may be unrelated. You only have to open a 'test' group (e.g. alt.test) to see this. A second problem is that if someone changes the subject line of a reply, the reply will not be threaded with the original article. This is often done when threads wander off-topic, and a corrective reply might have a title like "Netscape sucks (was Re: Netscape world domination)".
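Both problems can be seen in a small sketch of subject-based grouping. This is only an illustration: the exact matching rules of the original NewsWatcher are not described here, so the helper names `normalize_subject` and `thread_by_subject`, and the rule of stripping leading "Re:" prefixes, are assumptions.

```python
import re
from collections import defaultdict

def normalize_subject(subject):
    # Hypothetical rule: drop any leading "Re:" prefixes and fold case.
    return re.sub(r"^(\s*re:\s*)+", "", subject, flags=re.IGNORECASE).strip().lower()

def thread_by_subject(subjects):
    # Group subject lines whose normalized form is identical.
    groups = defaultdict(list)
    for s in subjects:
        groups[normalize_subject(s)].append(s)
    return dict(groups)

subjects = [
    "test",                       # posted by one user
    "Re: test",
    "test",                       # unrelated post by someone else
    "Netscape world domination",
    "Netscape sucks (was Re: Netscape world domination)",
]
groups = thread_by_subject(subjects)
```

Here the two unrelated "test" postings land in the same group, while the renamed Netscape reply ends up in a group of its own, separated from the article it answers.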

A better solution is to use information provided by the news server, in the form of Message-ID and References headers, to build the thread structure. In this case, changing the subject line in a reply does not confuse the threading process, and unrelated articles with the same subject line do not get linked together.
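The idea of threading by reference can be sketched as follows. This is a minimal illustration, not MT-NewsWatcher's actual code: the dictionary keys and the function name `build_threads` are hypothetical, and it assumes the last entry of the References header names the article being replied to.

```python
from collections import defaultdict

def build_threads(articles):
    """Link each article to its parent via the last References entry.

    `articles` is a list of dicts with hypothetical keys 'message_id',
    'references' (list of Message-IDs, oldest first) and 'subject'.
    Returns (roots, children): thread-starting Message-IDs and a
    parent -> replies mapping.
    """
    by_id = {a["message_id"]: a for a in articles}
    children = defaultdict(list)
    roots = []
    for a in articles:
        refs = a["references"]
        parent = refs[-1] if refs else None
        if parent in by_id:
            children[parent].append(a["message_id"])
        else:
            # No References, or the parent is not among the fetched articles.
            roots.append(a["message_id"])
    return roots, dict(children)

articles = [
    {"message_id": "<1@a>", "references": [],
     "subject": "Netscape world domination"},
    {"message_id": "<2@b>", "references": ["<1@a>"],
     "subject": "Netscape sucks (was Re: Netscape world domination)"},
    {"message_id": "<3@c>", "references": [],
     "subject": "Netscape world domination"},  # unrelated, same subject
]
roots, children = build_threads(articles)
```

The renamed reply `<2@b>` stays attached to `<1@a>`, and `<3@c>` starts a thread of its own despite sharing a subject line.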

Threading-related preferences

To use the threading information from the news server to build threads, check the box labelled 'Thread articles using references' in the 'Newsreading options' section of the preferences dialog.

There is also an option to control how threads are displayed in new subject windows. The 'Show threads collapsed' checkbox in the 'Newsreading options' section of the preferences controls whether threads are shown collapsed (i.e. with only the first article visible) or expanded (i.e. with all the articles visible).

Some points of note

Downloading speed

Threading by references involves the downloading of some extra header information, depending on your other preference settings.

If you have checked the box in the 'Server options' preferences panel labelled 'Use XOVER command to get article headers', then threading by reference incurs no additional overhead.

If you are not using XOVER to fetch article headers, then MT-NewsWatcher will need to get 'Message-ID' and 'References' headers from the server to thread by reference. This will slow down the process of getting article headers. General considerations regarding the speed of downloading headers are discussed elsewhere.
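The reason XOVER adds no overhead is that each overview line returned by the server already carries the Message-ID and References fields alongside the subject. The sketch below parses one such line, assuming the standard tab-separated field order (article number, subject, author, date, Message-ID, References, byte count, line count). The function name and the sample line are made up for illustration, and a particular server may append further fields.

```python
OVERVIEW_FIELDS = ["number", "subject", "from", "date",
                   "message_id", "references", "bytes", "lines"]

def parse_overview_line(line):
    # Fields are tab-separated; extra trailing fields are ignored by zip().
    parts = line.rstrip("\r\n").split("\t")
    record = dict(zip(OVERVIEW_FIELDS, parts))
    # References holds space-separated Message-IDs, oldest first.
    record["references"] = record.get("references", "").split()
    return record

# Made-up overview line for illustration.
line = ("3001\tRe: Threading\tuser@example.com\t"
        "6 Oct 1998 04:38:40 GMT\t<b@example.com>\t<a@example.com>\t1234\t17")
rec = parse_overview_line(line)
```

Without XOVER, the same two fields have to be requested separately for each article, which is where the extra download time comes from.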

Threading artifacts

Handling incomplete threads

Building threads based on references works best when the complete thread is fetched from the server. If, like most users, you mark all the articles in a news group as read when you close it, then often the only articles you receive are the new ones posted since your last reading session. The complete thread is then unlikely to be present.

When the thread is incomplete, MT-NewsWatcher does its best to reconstruct the thread structure of those articles present. When the original article which is referenced by another article is no longer present, MT-NewsWatcher compares the lists of referenced articles of those articles present to build the thread. Two articles that both refer to another, expired, article will be treated as two replies on the same level.
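One way to picture this reconstruction is sketched below. It is an assumption-laden illustration rather than MT-NewsWatcher's actual algorithm: each article is attached to its nearest ancestor that is still present, and articles whose ancestors have all expired are grouped as siblings under the Message-ID they reply to. The function name and dictionary keys are hypothetical.

```python
from collections import defaultdict

def group_with_missing_parents(articles):
    present = {a["message_id"] for a in articles}
    children = defaultdict(list)   # present parent -> replies
    orphans = defaultdict(list)    # expired parent -> replies on the same level
    roots = []                     # articles with no References at all
    for a in articles:
        refs = a["references"]     # Message-IDs, oldest first
        # Nearest ancestor still available, searching newest first.
        parent = next((r for r in reversed(refs) if r in present), None)
        if parent is not None:
            children[parent].append(a["message_id"])
        elif refs:
            orphans[refs[-1]].append(a["message_id"])
        else:
            roots.append(a["message_id"])
    return roots, dict(children), dict(orphans)

# <1@a> has expired; <2@b> and <3@c> both replied to it.
articles = [
    {"message_id": "<2@b>", "references": ["<1@a>"]},
    {"message_id": "<3@c>", "references": ["<1@a>"]},
    {"message_id": "<4@d>", "references": ["<1@a>", "<2@b>"]},
]
roots, children, orphans = group_with_missing_parents(articles)
```

In this example `<2@b>` and `<3@c>` become two replies on the same level under the expired `<1@a>`, and `<4@d>` remains a reply to `<2@b>`.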

Sometimes you will see cases where a thread is split up by MT-NewsWatcher. This can happen because some newsreaders (e.g. Cyberdog) fail to include sufficient References information for proper threading, or because it is a long-running thread with two or more branches which have persisted long enough that they no longer have shared references. There is little that can be done about these problems without going back to subject-based threading.

Unrelated articles threaded together

Occasionally you will see two articles about quite different topics threaded together into one thread. This is almost certainly the result of user error, in that one of the authors has mistakenly replied to an existing article when intending to post a new article to the group. This can be verified by opening the articles and looking at the References: header.

Multi-part binary threads

Multi-part binaries (i.e. those posted in a series of articles with names like "car.jpg [1/2]", "car.jpg [2/2]") need special handling in this scheme, since they are not threads in the sense described above: they are not posted by replying to the previous article in the thread. When MT-NewsWatcher suspects that a series of articles are parts of a multi-part binary thread (by detecting the "part [x/y]" in the subject line), it builds the thread by comparing the subject lines after sorting the articles into alphanumerical order.
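The detection step can be sketched with a regular expression that looks for a trailing "[x/y]" marker. This is an illustration under assumptions: the pattern, the function name `group_multipart` and the grouping key are hypothetical, and the sketch orders parts by their numeric part number rather than by a plain alphanumerical sort of the subject lines, so that part 10 sorts after part 2.

```python
import re
from collections import defaultdict

# Hypothetical pattern: base name followed by "[part/total]" at the end.
PART_RE = re.compile(r"^(?P<base>.*?)\s*\[(?P<part>\d+)/(?P<total>\d+)\]\s*$")

def group_multipart(subjects):
    groups = defaultdict(list)
    for s in subjects:
        m = PART_RE.match(s)
        if m:
            # Parts belong together when base name and total count agree.
            key = (m.group("base"), int(m.group("total")))
            groups[key].append((int(m.group("part")), s))
    # Order each group by part number.
    return {key: [s for _, s in sorted(parts)] for key, parts in groups.items()}

subjects = ["car.jpg [2/2]", "boat.jpg [1/1]", "car.jpg [1/2]", "Re: nice car"]
binaries = group_multipart(subjects)
```

Subjects without a part marker, such as "Re: nice car", are left out and would be threaded by reference as usual.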

