RELEASE NOTES

DIABLO NEWS TRANSIT AND READER SYSTEM

Matthew Dillon
dillon at backplane.com
http://www.backplane.com/diablo/
http://www.backplane.com/xmake/
news.software.nntp

Diablo V1.29

* Fix bugs in handling of the rebuild of the diablo.hosts cache. (Russell Vincent)

* Fix bug in global rate limiting that caused a connection to hang after a period of sucking. (Russell Vincent)

Diablo V1.28

* DReaderd can now handle duplicate articles if it is run with the '-x xrefhost' option.

* Added 'readerxover trackonly' option. This causes dreaderd to maintain the xover control files but not the xover data files, allowing dexpireover to be run but not allowing user reader access. This would be used in a mid-level cache which you are also using to maintain the active file with regard to article expiration. The idea here is that your leaf reader boxes can then partially synchronize their active files to the active file maintained on the mid-level box.

* BSDI configuration now uses anonymous mmap and usleep()

* PATCHES FROM RUSSELL VINCENT:

  - don't log successful connects as errors - change to LOG_INFO

  - add ability to match CIDR blocks in dreader.hosts (incl. example). This patch is modified from a patch I obtained from a forgotten source (apologies to author). The original patch didn't work in all cases.

  - add the ability to limit simultaneous hosts for a matching line in dreader.hosts, rather than for a specific host. Also fix some spelling errors.

  - Keep track of the remote port for client connections so that we close the correct connection each time and we log all connection closes. Also log the port, so that we get a separate log entry for multiple connects/closes instead of syslog logging 'last message repeated n times'.

  - add '-F' command-line option to disable the external spamfilter if it has been compiled in. Include additions to man page.

  - add 'logarts' to dnntpspool.ctl and '-A' option to dnewslink to allow logging of all article accepts/rejects for a particular host.
    Logging is done to "~log/feedlog.hostname".

  - Also accept a response of 201 (readonly access) from a remote server when running dsyncgroups. We only need read access.

  - Allow a group:art1-art2 range to be specified to dreadover for extracting header data for a range of articles in a group.

  - add a '-Q' option to diloadfromspool that reads the spool (ignoring history) for all articles and writes the output in a format that is suitable for drequeue (also included as drequeue.c).

  - An extra program to (usually) requeue articles for all newsfeeds based on specific input. Uses the standard options for specifying an alternative diablo.config file to, for example, only requeue articles for a specific feed. Please feel free to use the usual Diablo copyright for this.

  - add a command-line option to disable ident lookups in dreaderd

  - Rewrite the Authenticate() function of diablo so that it does a background IP lookup of all hosts in diablo.config and caches them. It rewrites the file at a (configurable) interval and when diablo.hosts changes. This fixes a problem where a news server has a given hostname, but the PTR for the IP points to a different host (which resolves to the same IP - hence still reasonably secure). i.e.: 2 hosts point to the same IP and you want to use the host without the reverse mapping.

Diablo V1.27

More bug fixes from Nickolai.

While running dreaderd, I've found two more things that I think might be worth fixing:

-- The overview FD cache wasn't being flushed for feed-only forks, and thus kept certain groups from being properly resized upon expire. I've added the appropriate call to FlushOverCache for feed-only forks; this is my fault, I forgot about this case in my previous patch.

-- Stale clients were being accumulated (ACTV in dreaderd.status but no TCP connection) because a file descriptor which corresponds to a closed TCP connection is not considered writeable by select(), at least on Solaris 2.6.
   I've changed the select() loop to poll all fd's for read, and close the connection if recv() returns 0. I'm not too sure about the efficiency here, but so far it seems to be working OK.

Here are some additional modifications I've made to diablo-1.25-REL, and found quite useful in my situation; perhaps you could integrate these into the next release, if you think these would be useful to others? Some description of what these diffs intend to do:

-- Parallelizing dexpireover: Add an option to make dexpireover fork N times and have each fork process 1/Nth of the overview information. This gives a noticeable speedup when running dexpireover with -e (history based expiration): on our reader Ultra60, we have two 10kRPM disks striped together for dhistory + overview (not spool), and 'dexpireover -e' takes ~16 hours; doing it in parallel with 4 forks cuts the time down to about 6 hours.

   [ As a side note, running dexpireover with -o, using the dexpover.dat file generated by dexpire, takes 10 min ]

-- Generating Lines: headers in headfeed mode: Added an option to dspoolout, and a corresponding switch to dnewslink, to generate a Lines: header if one is not already present, in headfeed mode. Allows always having a meaningful line count in overview.

-- No-initial-response bug in dnewslink: If dnewslink connects to a remote server, and doesn't get any reply, it hangs forever trying to read from the socket. Set the KillFd before trying to read the initial response from the remote.

Diablo V1.26

*** FIXED Y2K BUG IN PARSEDATE() FOR 2-DIGIT YEARS ***

While 2-digit years are not legal in USENET Date: headers, some sites still produce them. My Y2K handling for 2-digit years was broken: I was incorrectly subtracting or adding 50 instead of 100 in my calculations in lib/parsedate.c. 4-digit years (the vast majority of USENET postings) work properly.
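
To illustrate the kind of calculation the 2-digit-year fix above concerns, here is a minimal sketch of a sliding-window expansion. It is not the code from lib/parsedate.c: the function name expand_two_digit_year, the reference-year parameter, and the 50-year window are all assumptions made for illustration. The point it shows is that the window test uses 50 years, but the correction applied must be a whole century (100).

```c
/*
 * Hypothetical helper, not taken from lib/parsedate.c.
 * Expand a 2-digit year (0..99) to the 4-digit year closest to
 * ref_year, assuming a window of 50 years either side.
 */
static int expand_two_digit_year(int yy, int ref_year)
{
    /* Start in the same century as the reference year. */
    int year = (ref_year / 100) * 100 + yy;

    /* The window test is 50 years, but the adjustment is 100. */
    if (year - ref_year > 50)
        year -= 100;
    else if (ref_year - year > 50)
        year += 100;
    return year;
}
```

Under these assumptions, a Date: header carrying year 99 and read in 2000 expands to 1999, and year 00 read in 1999 expands to 2000.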

Also, a huge patch set by Nickolai Zeldovich has been integrated in:

XMakefile.inst           Allow installing from a symlinked build tree
util/dsyncgroups.c       In some cases, group ends up being NULL; check
util/dspoolout.c         Check return value from malloc()
filter/diab-filter.c     Suppress error messages which result from
                         missing articles (invalid filename passed)
dreaderd/control.c       Missing '@' sign in the email address
util/dexpire.c           (1) Unchecked malloc() return value
                         (2) Part of the larger change below:

man/dexpire.8          | Support for expiring overview based on a list of
man/dexpireover.8      | expired article hash values, written by dexpire.
util/dexpireover.c     | It might be possible to distribute this file to
lib/config.c           | remote readers for dexpireover to use, although
lib/global.c           | I haven't tried (hv's might be different in the
samples/diablo.config  | unlikely case of collision).

dreaderd/group.c       | Periodic flush of overview FD cache, to allow
dreaderd/reader.c      | dexpireover to clean more groups

samples/adm/biweekly.atrim
                         Fix path to dkp; also changed to reflect the
                         features below:

lib/newsfeed.c         | Support for dhistory-readonly mode, which allows
util/diablo.c          | the reader to still retrieve articles while the
samples/dnewsfeeds     | dhistory file is being rebuilt (no posting, of
samples/adm/*.reader   | course). Also added a user-configurable delay
samples/adm/           | in the article-read loop to throttle incoming
  biweekly.atrim       | feeds to a certain degree. Uses nanosleep() and
XMakefile.inc          | thus requires -lposix4 under Solaris.

Diablo V1.25

Incorporated patch submitted by Nickolai Zeldovich which adds the -e option to dexpireover, causing dexpireover to expire based on the article content in the local spool rather than by days.

Fixed up some compilation collisions with libc w/ strlcpy.

Fixed bug in dsyncgroups - the creation and last-modified time stamp for groups already in dactive.kp was being improperly updated.

Diablo V1.24

Fixed arts= and bytes= fractional values in log, they were counting 3.1022M, 3.1023M, 4.000M ... oops.

Fixed bug in dreaderd xover code, unterminated article ranges ( such as '45-' ) were not being parsed properly.

Fixed bug in handling of -D option to dsyncgroups.

Added more documentation on the issue of POSTing from a reader back to the feeder and having it propagate down properly.

Diablo V1.23-REL ***** RELEASE ****

Fix bug in partial read from local socket that could cause the Act count to not decrement properly in dreaderd, locking up dreaderd after a while.

Some adjustments for AIX added by request.

Diablo V1.22-REL ***** RELEASE ****

Add -f option to dreaderd, which defaults to ON. This is the FastCopy feature. When dreaderd retrieves an article from a remote spool, it normally buffers it completely before sending it to the client. This is so it can transparently restart the request if the spool server dies. However, this rarely happens. Now dreaderd will copy the data to the requesting client as it receives it. If the spool server dies prior to starting the body of the article, dreaderd can still transparently restart the request. If the spool server dies after starting to send the body, the article will be truncated from the point of view of the client. Use -f0 to turn off this feature.

Diablo V1.21-test05 ***** EXPERIMENTAL RELEASE ****

Oops, broke 'xover artno' ... it would list the wrong article number. Fixed.

Diablo V1.21-test04 ***** EXPERIMENTAL RELEASE ****

Test for and fix user bogosity when user specifies bad article ranges, to prevent infinite loops in the server.

Diablo V1.21-test03 ***** EXPERIMENTAL RELEASE ****

Found and fixed bug in dexpireover reported by Linux users in regards to 'NFS: No free inodes - contact Linus' error. Turned out to be a mmap() that I was forgetting to munmap! Bug finally located by Paul Martin.
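
The article-range fixes noted above under V1.24 (unterminated ranges such as '45-') and V1.21-test04 (bad ranges causing infinite loops) can be illustrated with a small parser. This is only a sketch under stated assumptions, not dreaderd's actual code: the name parse_art_range, the use of LONG_MAX to mean "open-ended", and the choice to reject inverted ranges outright are all hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not taken from dreaderd.
 * Parse "N", "N-" or "N-M" into *first and *last.
 * "N-" (unterminated) sets *last to LONG_MAX, meaning "to the end".
 * Returns 0 on success, -1 on a malformed or inverted range, so a
 * caller never loops over a range whose end precedes its start.
 */
static int parse_art_range(const char *s, long *first, long *last)
{
    char *end;

    if (*s < '0' || *s > '9')
        return -1;                      /* must start with a digit */
    errno = 0;
    *first = strtol(s, &end, 10);
    if (errno == ERANGE)
        return -1;
    if (*end == '\0') {                 /* single article: "45" */
        *last = *first;
        return 0;
    }
    if (*end != '-')
        return -1;
    s = end + 1;
    if (*s == '\0') {                   /* unterminated range: "45-" */
        *last = LONG_MAX;
        return 0;
    }
    if (*s < '0' || *s > '9')
        return -1;
    errno = 0;
    *last = strtol(s, &end, 10);
    if (errno == ERANGE || *end != '\0' || *last < *first)
        return -1;                      /* trailing junk or inverted */
    return 0;
}
```

A caller would clamp an open-ended *last to the group's highest article number before iterating.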

Diablo V1.21-test02 ***** EXPERIMENTAL RELEASE ****

dreaderd main loop polls open descriptors every 5 minutes, workaround for Linux bug. Code will also be used to implement idle timeout later.

Added a little error checking to recvmsg() (should have no realized effect).

Diablo's X-Trace: generation made to conform more closely to INN-2.x's, includes PathHost as first element now and moves the username element. Will probably change the username to a user-id in the next release to prevent scanners from deriving email addresses from X-Trace.

Test for X-PGP-Sig header migrated from group.c to control.c in order to centralize authentication tests in preparation for generalizing the interface.

Control/Authentication interface revamped in preparation for librarization of authentication.

Bug fixed in lib/dkp.c related to seg-faults occurring in dexpireover in certain cases.

TUNING_NOTES file played with.

Removed altfeed (feeds/ directory) optimizations that caused directives in dnewsfeeds for feeds with feeds/ files to be ignored.

Diablo V1.21-test01 ***** EXPERIMENTAL RELEASE **** 23 Oct 1998

( This release is designated experimental due mainly to the change in the implementation of groupdef/groupref in dnewsfeeds. Sites running complex dnewsfeeds files should probably not move to this release. Other sites, including sites running dreaderd, will almost certainly want to upgrade ).

Groups in dnewsfeeds (groupref, groupdef) are now fully recursive and ALL DIRECTIVES ARE SUPPORTED IN THE GROUPDEFs. This should greatly reduce the complexity of some sites' newsfeeds. However, the new recursion implementation has not been fully tested yet.

Added feed-only forks to dreaderd, so you can feed dreaderd without loading down a reader fork, which would affect other readers using that fork due to the load. See samples/dreader.hosts, 'f' flag. Feed-only forks do not make connections to the spool/post servers.
NOTE: the article, group, and head commands may not be issued for 'f'eed only connections (i.e. connections without the 'r' flag set).

Moderated postings that send mail to the moderator were improperly passing certain headers to sendmail, such as Path: and To:, which would either be improperly executed by sendmail or improperly forwarded by the moderator.

Seg-fault in diablo fixed: if diablo tried to retrieve an article past the end of the spool file (e.g., due to the spool file being truncated in a crash), it seg faulted. Fixed.

dilookup now also understands the output from the didump command. (e.g. didump | dilookup -s)

-F option added to dreadart (force read, ignore expired flag)

dilookup prints 'not found' entries to stdout rather than stderr to make piping greps and stuff through it (via -s) easier.

Fixed bug in dexpire: would improperly expire history adds that occurred during the dexpire run itself when reader-mode expiration was configured. dexpire now just looks for the existence of the proper spool directory rather than the D.xxxxxxxx/B.xxxx file when determining whether to expire a history entry.

Added 'diloadfromspool' utility. This utility will scan the entire spool, a specific set of D.xxxxxxxx directories, or a specific set of D.xxxxxxxx/B.xxxx spool files and regenerate the history records for the articles it finds. Currently there is one catch: In order to properly handle 8-bit-clean articles and ensure the robustness of the data, the Lines: header must be correct. Unfortunately, the Lines: header is not always correct so the command needs some more work (or I need to recalculate Lines: in the feeder).

Diablo V1.20-REL ***** RELEASE ****

Better Statistics Reporting in Diablo server - now includes breakdown of why articles were accepted or rejected.

Increased diablo's precommit cache from 8192 to 16384 entries (from 128K to 256K of shared memory), and increased collision window from 30 seconds to 2 minutes, and the pre-commit history pre-cache retention from 10 minutes to 40 minutes.

DReaderD now reports statistics on close: number of groups entered (group commands), number of articles read (head, article, or body commands), and number of bytes sent to client.

Added -R option to dexpireover - causes it to dynamically resize the over.* files and completely rewrite the data.* files in /news/spool/group. This allows fine-grain removal of garbage from the overview files at the cost of rewriting every file. For this reason, 'dexpireover -R -a' should only be run once a week. The standard dexpireover should continue to be run daily. The standard dexpireover is much faster, but leaves wasted space inside the data.* files.

Since dexpireover -R is now able to dynamically resize the overview indexes, Diablo defaults to creating 512-article indexes rather than 1024-article indexes. On a clean start it may take two or three weeks for the article indexes for heavily loaded groups to stabilize.

Added -s option to dexpireover. The -s option is implied by the -a option. This option will allow dexpireover to dynamically resize the index (over.*) files in /news/spool/group but unlike the -R option it will not attempt to rewrite the data.* files.

The combination of -a (implying -s) run on a daily basis and -R run on a weekly basis should keep disk usage in /news/spool/group in check. You can further refine disk utilization by setting explicit expire times in dexpire.ctl (see 'x' option in samples/dexpire.ctl)

A WEEKLY crontab entry has been added, you need to add an entry in your cron to deal with it. See samples/adm/crontab.sample

Updated the cron entries and sample crontab in samples/adm to make use of the new dpath command. Also: dexpire is now run once an hour rather than once every four hours, please double-check your crontabs.

Allow .\n to terminate an article as well as .\r\n, because some dumb newsclients don't pass the CR like they are supposed to. Well, at least dots are supposed to be escaped so this doesn't screw up escaping of 8-bit-clean data.

Add a Distribution: header internally based on distrib.pats if none is specified in the POST or if a blank Distribution: header is specified in the POST.

Added configuration path elements to diablo.config. You can also specify the location of diablo.config in the DIABLO_CONFIG_PATH environment variable or by using the '-C diablo.config.path' option with any command. Barring that, the default location for the config file is /news/diablo.config.

dreaderd's debug-aid, -C, has been changed to -X so -C can be used to specify a different location for the diablo.config file.

Diablo V1.19-REL ***** RELEASE ****

Implemented Supersedes: processing

Fixed bug in GMT handling in newnews NNTP command.

Fixed bug in moderator addressing, wasn't converting dots to dashes.

Major revamping of documentation

Implemented 'list distrib.pats', implemented 'list distributions'.

Diablo V1.18-REL ***** RELEASE ****

Fixed compiler error, some compilers don't like dynamic struct initialization.

Fixed '.' escaping bug with POST command.

POST command now mails the moderator if an unapproved posting is sent to a moderated group. (POST does not yet (should it???) mail any To: lines).

Control messages were generating article number assignments in the Diablo feeder. Since control messages are out of band, this is incorrect and created gaps on the reader. The feeder no longer generates group:artno in the Xref: line for control messages.

'listgroup' command no longer lists article numbers that do not exist in overview (but still may list articles that no longer exist in the spools, since checking spool(s) is expensive).

Dist now includes Iain's feeder-stats-3.80. NOTE: Iain changed the passed date format from a two to a four digit year.
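
The lenient article terminator described under V1.20 and the '.' escaping fix under V1.18 both come down to how lines beginning with a dot are treated. The following is a rough sketch of that logic, not Diablo's POST code: the function names is_terminator and unstuff are hypothetical, and each line is assumed to arrive with its line ending attached and an explicit length (so 8-bit data is not cut short by a NUL).

```c
#include <string.h>

/*
 * Hypothetical helper, not taken from Diablo.
 * Returns 1 if the line is the end-of-article marker: the standard
 * ".\r\n", or the lenient ".\n" sent by clients that drop the CR.
 */
static int is_terminator(const char *line, size_t len)
{
    return (len == 3 && memcmp(line, ".\r\n", 3) == 0) ||
           (len == 2 && memcmp(line, ".\n", 2) == 0);
}

/*
 * Hypothetical helper: undo dot-stuffing on a received line.
 * A non-terminator line that starts with '.' had that dot added by
 * the sender, so skip it and shrink the length by one.
 */
static const char *unstuff(const char *line, size_t *len)
{
    if (*len > 0 && line[0] == '.' && !is_terminator(line, *len)) {
        (*len)--;
        return line + 1;
    }
    return line;
}
```

Because senders escape any data line that starts with a dot, a bare "." followed by a lone newline cannot appear in properly escaped article data, which is why accepting it as a terminator is safe.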

Diablo V1.17-REL ***** RELEASE ****

The XMakefile structure has been reorganized somewhat. Hopefully this will not break builds on other platforms.

The copyright has been changed. The less restrictive BSD copyright is now in effect, with a few modifications. I call it my "screw MickySoft's monopoly" copyright.

Diablo reader code now checks for guard characters in the overview data file for retrieved overview data and logs any corrupt entries that it finds.

Debug code added in test20 removed, diablo now ftruncate()s article files properly again when articles fail the standard filters.

Fixed a bug in xover, xhdr, and other overview-related commands. In certain cases when a range of articles was specified, information pertaining to the last article number in the range would not be output.

Fixed a bug in the cancel code - it would not cancel the last article in a group due to an off-by-one loop test.

Added Joe Greco's Diablo-filter functionality. You have to turn it on in lib/vendor.h. See filter/README. THIS IS UNTESTED.

(late addition to release notes: also adjusted POST code to allow a lone newline to separate headers from body. It previously required a CR+LF).

Diablo V1.16-test23 ***** BETA RELEASE *****

NOTE!! The diablo tar file now extracts into a subdirectory named diablo-$(VERS)-$(SUBVERS)/ rather than diablo/ !!!

Feeder-stats-3.71 from Iain Lee integrated.

contrib/innactive removed (dsyncgroups should be used to generate initial dactive.kp files now).

Diablo V1.16-test22 ***** EXPERIMENTAL RELEASE *****

Oops, fixed a few syntax errors in the test21 codebase.

Fixed a boundary condition in the KPDB code that could cause dsyncgroups to segfault, or dreaderd in some cases if you were looking up the last entry in the database.

Fixed a race condition in the KPDB scanning code that would cause 'list active' to return different sizes (it was cutting off a portion of the tail of the database during the scan because it was not checking for the case where the database needs to be resized).

Fixed 'listgroup' off-by-one error (it wasn't including the last article number). This was causing tin and other readers to get confused.

Diablo V1.16-test21 ***** EXPERIMENTAL RELEASE *****

oops, broke 'article ' retrieval. bleh. fixed.

Diablo V1.16-test20 ***** EXPERIMENTAL RELEASE *****

removed ftruncate() from diablo server article spooler. Instead, fill would-be-deleted bytes with 0xFF. This wastes some space but is required to debug an article corruption problem that occurs with FreeBSD.

Fixed bug in spool/post statistics, again.

Diablo V1.16-test19 ***** EXPERIMENTAL RELEASE *****

Added 'newgroups' command support. Note: it currently scans dactive.kp and therefore isn't very efficient yet, so beware!

Fixed bug in spool/post statistics (in ps output).

Fixed a ddprintf() in DoPipe that was broken.

'article artno' command no longer returns success if the overview information is intact but the article cannot be found in the spool... it now properly returns '430 No such article'.

bug in 'list active' and 'newgroups' fixed that would cause the list to be improperly cut-off (XXX 21 sep 98: actually not quite fixed yet -Matt XXX).

The Diablo server now detects and logs corrupt spool entries (entries with a NUL as the first or last character, or missing the out of band guard-NUL following the article).

Diablo V1.16-test18 ***** EXPERIMENTAL RELEASE *****

Fixed bug in dnewslink that caused it to continue to scan the queue even when its batch file got ripped out from under it, which could result in more dnewslink's running than specified by dnntpspool.ctl

dnewslink now checks for spaces in the message-id.

samples/adm/daily.dclean updated. The 'removal of empty D.XXXXXXXX directories' code has been removed.
This code interferes with reader-mode expiration by creating holes in the spool directory sequence which causes diablo to hit a failsafe. Symptoms will include Diablo not accepting news quickly enough to keep up with a feed. UPDATE YOUR DAILY.DCLEAN SCRIPT!!!!!

Diablo V1.16-test17 ***** EXPERIMENTAL RELEASE *****

Minor source updates to remove compilation warnings.

Spam filter handles garbaged hash table entries better.

Fixed bug in new dhistory file add code that could cause lots of appended blocks of zeros, blowing up the history file's size.

dnewslink now does more strict checking of Message-ID's

Diablo V1.16-test16 ***** EXPERIMENTAL RELEASE *****

Serious bug in POST code for reader fixed.

TCP_NODELAY option set in reader, nagle gets in the way too much when lots of little transactions are made (i.e. 'group' commands by news clients which have large .newsrc files).

MsgId() code now checks for garbage message-id's such as '>'. While this sort of thing doesn't blow up Diablo, it can blow up other news systems. (Not quite finished with this yet, need to incorporate a patch from Terry to finish it up).

Solaris config now expects Solaris to have snprintf. Should get rid of bogus syslog error messages but otherwise has little effect on the code.

Diablo V1.16-test15 ***** EXPERIMENTAL RELEASE *****

'activedrop yes' was causing bogus (non-existent) entries to be written to the queue files for articles caught by that filter.

The Diablo feeder now limits the number of D. spool directories it manages. The default is 14 days (2016 directories). See samples/diablo.config for the override. This is because the expire reader mode of operation doesn't appear to stabilize if the number of spool directories is allowed to grow indefinitely.

Diablo V1.16-test14 ***** EXPERIMENTAL RELEASE *****

Added 10 second reverse ident timeout on read. Some sites would connect but would otherwise just hang the reverse ident, causing a DNS processing backlog.
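
For readers unfamiliar with the idea behind the ident read timeout above, here is one common way to bound a read with a deadline. The notes do not say how Diablo implements its timeout, so this is only a sketch: the use of select(), the name read_with_timeout, and the READ_TIMED_OUT return code are all assumptions for illustration.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_TIMED_OUT (-2)   /* hypothetical return code */

/*
 * Hypothetical helper, not taken from dreaderd.
 * Wait up to 'seconds' for fd to become readable, then read once.
 * Returns the byte count from read() (0 on EOF), -1 on error, or
 * READ_TIMED_OUT if the peer sent nothing before the deadline.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, int seconds)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    if (n == 0)
        return READ_TIMED_OUT;      /* caller gives up on this lookup */
    return read(fd, buf, len);
}
```

With a 10 second limit, a remote that accepts the ident connection but never answers costs one lookup slot for at most 10 seconds instead of stalling it indefinitely.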

Changed history file append mechanism. Diablo now appends a large block of zero'd records then allocates out of the block by scanning the last N records. It is reasonably efficient even though we use read() to locate an empty record.

Added '-SD' option to disable the NNTP-Posting-Host: portion of the spam filter.

Diablo V1.16-test12 ***** EXPERIMENTAL RELEASE *****

dreader.hosts scan changed AGAIN.. sorry. Ok, now it scans the file once checking for matches against the IP address or FQDN from the reverse lookup. Previously it scanned the file checking FQDN's, then scanned the file again checking for IP's, which could cause a "*" wildcard at the end of the file to match an FQDN before an IP address earlier in the file matched the IP. It is now much more intuitive.

'q' flag added for dreaderd.hosts, suppress logging. Useful when you want to get rid of an idiot who connects a lot, and don't want to clutter your logs.

Added 'L' flag - global rate limiting (per connection) to dreader.hosts. The specification is 'Lnnnn' where nnnn is the number of bytes per second the connection is to be limited to. Example: 'L2048' limits the datarate to 2 KBytes/sec.

madvise() support added for diablo feeder. XADV_WILLNEED and XADV_SEQUENTIAL added to both the feeder and dnewslink. (On FreeBSD systems, XADV_SEQUENTIAL will fault in more pages at a time and XADV_WILLNEED will pre-load the page table for pages residing in the buffer cache). This is turned on automatically for FreeBSD (people need to submit patches for other OS's).

dnewslink sets TCP_NODELAY to turn off nagle because it could cause unnecessary write buffering delays. We do a good job flushing our output properly and it's already a transaction oriented protocol, so...

dnewslink reconnects to the destination for each queue file. Now it tries to reconnect to the *same* destination it successfully connected to previously. It does not call gethostbyname() again unless a connect() fails.
This way if a host has multiple IN A's and some don't work, dnewslink will tend to stick to the one that does work. (this problem noticed with sprintlink).

increased dnewslink's internal write gather buffer from 4K to 8K.

Added USE_POLL option for dnewslink to use poll(). It's turned on automatically for FreeBSD-3.0 or later. poll() is used to wait for new data to be appended to the queue file so dnewslink can read it ('realtime' and 'd-1' options in dspoolout.ctl must be specified), and poll()/read() is used rather than alarm(Timeout)/read()/alarm(0) when waiting for responses from the remote.

The Diablo feeder code no longer ftruncate()'s spool files for every post-write reject (spam cache, article too large, in-transit history collision, etc...). Instead it simply seeks back. The feeder will ftruncate trailing 'garbage' from a prior post-write reject on the descriptor close() only. Since ftruncate() must typically fsync() a file (or at least the file metadata), the new algorithm should save disk I/O.

Added 'readerdns' option to diablo.config

syslog logging fixed for dreaderd... it was logging the wrong description process identifier.

debug aid added to dreaderd, -C option - cause SIGSEGV to be caught and infinite-loop the process. Otherwise process just dies without a core due to system setuid() security.

Fixed dreaderd reader crash - was holding KPDB record pointers across state changes (where the KPDB might get remapped) for long scans (i.e. 'list active'). Changed KPDB ScanFirst/Next to operate on offsets rather than pointers.

Fixed dreaderd.status slot assignment for when a reader process dies and is reforked by the parent.

Added forward DNS lookup to verify reverse. hostname matching is not considered legal if reverse lookup does not match forward (if you are doing a public server, FQDN:* would disallow mismatched reverse/forwards while IP:* would allow them.
Put IP:* after FQDN:* to attempt to obtain the fqdn first, but still allow even if the fqdn cannot be obtained)

Diablo V1.16-test11 ***** EXPERIMENTAL RELEASE *****

Added additional errno checks in lib/buffer.c.

Fixed ps status line reporting of number of active spooler and poster servers.

Made fixes to record locking when reading data from dactive.kp... there were a few race conditions where one process could be parsing a group record while another was simultaneously making modifications to it. This will hopefully fix a 'bogus group' race that occurs sometimes... don't know yet, crossing my fingers.

Added 'L' and 'l' rate-limiting options to dreaderd.hosts, added 'h' option to dreaderd.hosts, set maximum number of connections from host. Added 'u' option to dreaderd.hosts, set maximum number of connections from user@host (only applies when identd lookups are successful). (NOTE: 'l' and 'L' not yet implemented)

documented user authentication options in dreaderd.hosts

Fixed wildcard handling in dreaderd.hosts ... you specify that a wildcard is to be applied specifically to an IP address or FQDN by using the IP: or FQDN: prefix. Otherwise it will apply to an FQDN only if the wildcard does not start with a digit.

The sample dcontrol.ctl file was broken. Removed bogus all: entries that were dropping valid checkgroups messages.

Added more explicit instructions in README.READER on how to set up pgp.

Added options for HPUX in lib/config.h, tested w/B.11.00-A. Added errno tests for EAGAIN (previously only checked for EWOULDBLOCK), again to support hpux.

Diablo V1.16-test10 ***** EXPERIMENTAL RELEASE *****

** Upgrade your xmake to 1.05 before compiling Diablo **

** Note, new fields added to dactive.kp. Run 'dexpireover -a -U999' to add new fields. It is then suggested that you shut everything down and run 'dkp -t dactive.kp' to remove all the garbage from the dactive.kp file. **

Gave the diablo *feeder* side the ability to process header-only feeds.
The feeder can process a mix of header-only and full-article feeds. The feeder-side 'article' and 'body' commands will return an article-not-found error if an attempt is made to retrieve an article that was originally stored from a header-only feed. This allows you to run a mix of feeds into a diablo feeder frontend in order to deal with duplicates, and then pass a header-only feed onto the reader.

Fixed minor bug in snprintf().. it should cap the return value at the size of the buffer. The bug didn't affect anything and neither will the fix.

dexpire now has a new option, '-s', which is to be used if your spool is softupdates-mounted. The option causes dexpire to sync/sleep/sync/sleep after removing each directory in order for statfs() to be more reasonable.

Fixed NumPending/NumActive counter error when a dreaderd/DNS process dies or gets stuck.

Added 'activedrop' option to diablo.config .. when 'active on' is set, causing the feeder to assign article numbers from the dactive.kp file, the feeder normally continues to spool articles with no matching groups in dactive.kp. If activedrop is set, the feeder will reject such articles.

Note that when running the feeder and reader on the same box, they can safely share the same dactive.kp file. This is because the feeder assigns article numbers via the "NX" field rather than the "NE" field.

option -X[E,X] added to dsyncgroups to allow the NX field to be adjusted based on a remote NNTP host. Works like the -N option.

The reader no longer creates overview for non-existent groups.

'b' option lines expire in dexpire.ctl was not working quite as advertised. Changed the math a bit to make it more exponential.

Add pre-cancel cache to dreader to allow it to process cancels for articles that have not yet arrived. Only pre-cancels are entered into the cache, so as long as we get the article a reasonable amount of time after receiving the cancel, we will be able to properly detect that it was canceled.
post-article-cancels are already handled.

redid debugging. If you send a USR1 signal to a diablo process, normally a child diablo process, it will syslog a line for each received article indicating whether the article was accepted or not and, if not, why. More USR1's can be sent to bump up the debug level. Send a single USR2 to reset the debug level back to 0.

dexpire's -s option enhanced somewhat. '-s#' specifies the sleep period in between sync()'s. See manual page.

Added CTS and LMTS fields to dactive.kp ... keep track of when a group was created and when a group last received a new article. (see also: -O option to dexpireover). CTS will also eventually be used to implement active.times and related NNTP commands.

Added -O option to dexpireover. When combined with the -w ( wildcard specification ) option, dexpireover will delete stale newsgroups. See the manual page for dexpireover.

Added -h0 option to dexpire. Allows you to expire the spool without updating the history file. Useful in emergencies or in order to expire on a tighter schedule without incurring the history update overhead for every dexpire run.

Fixed bug in feeder where offset/size was being stored in history for articles canceled by size, which could cause dreadart to attempt to map the (non-existent) article.

Diablo V1.16-test9 ***** EXPERIMENTAL RELEASE *****

Added -v option to didump so you can see generally where articles are landing in the D.XXXXXXXX directory tree when using reader-mode expiration.

changed dexpire again to do a sync();sleep(1);sync(); after each directory has been removed, again to try to support softupdates.

Fixed listen code when max number of Dns or max number of readers reached, was previously setting or clearing select bits for all descriptors rather than just the descriptors used to listen for new connections.

Dummy 'control.CTLNAME' group added prior to the creation of the spool file in Diablo, allowing special expirations to be set for control messages.
The sample dexpire.ctl defaults to a nominal 80% retention for control.* but only 20% for control.cancel (applies to diablo's reader expiration mode).

Diablo V1.16-test8    ***** EXPERIMENTAL RELEASE *****

There is an IHAVE protocol case that the feeder handles incorrectly. In the case where the remote sends an IHAVE command and gets a 335 response, and then sends the article, it is expecting a final ack of 500, 501, 502, 436, 400, 235, or 437. It is not expecting a 435 response for the final ack at this point in the protocol, and Diablo may send a 435 if an in-transit article collision occurs. Diablo has been changed to return a '437 Duplicate.' in this case rather than 435. This only affects the IHAVE protocol, not the streaming protocols.

It should be noted that innxmit will handle the 435 response just fine, though not 100% efficiently, and newer innfeeds seem to handle a 435 response fine as well, but older innfeeds cannot handle it. While we lose some accuracy by reporting the reject code instead of the duplicate code, it's not critical in regards to the non-streaming IHAVE protocol.

Added -l option to didump (line-mode flush on output, useful when piping didump -f to something; see the didump man page).

dexpire calls sync() and sleep(1) before calling statfs(), so decoupled filesystems such as FreeBSD/BSDI w/ softupdates return more accurate statfs() results. Without it, as much as 80% of a 40G spool might be removed before the FS bitmaps catch up. Oops!

Fixed segfault bugs in dnewsfeed relating to seg faulting on unexpected EOF.

Diablo V1.16-test7    ***** EXPERIMENTAL RELEASE *****

Fixed serious bug on Diablo feeder side. When operating in 'expire reader' mode, diablo expects /news/spool/news/D.XXXXXXXX directories for each ten-minute period. dexpire must expire the directories in time order and, indeed, it does. But if diablo is killed and restarted, a gap may occur.
Diablo hits a sleep failsafe whenever it can't create an article file, which can cause it to slow down massively. The fix is to have diablo create any missing directories on startup. I also create missing directories at run-time if Diablo locks up for a long enough period of time (which shouldn't happen, but if it does it will log it).

Note that it is very important for Diablo to *not* recreate directories that have been deleted by dexpire. Doing so would result in article corruption by causing previously deleted article files to be recreated. Under no circumstances should you ever manually rm -rf a spool directory in the middle of the sequence range of current spool directories!!! Use dexpire.

Added USE_MADVISE option to lib/config.h (and turned it on for FreeBSD) -- use madvise() syscall to reduce page faults by pre-mapping system-cached pages for articles. On systems which support it, this should reduce page faults taken by dnewslink mmaps, which can be considerable. The reader code also uses this when mapping overview files.

dnewslink process title changed so systems with ps status line support can see the run-time status of various dnewslinks.

Diablo V1.16-test6    ***** EXPERIMENTAL RELEASE *****

Removed dactive.kp cleanup in samples/adm/biweekly.atrim ... this will blow up the reader! Removed manual references to the 'egrep' method of pruning a KP database.

Fixed NumActive accounting. The reader processes were doing reader accounting for server failures, not just reader disconnects.

Fixed POST bug ... oops, I wasn't checking to see if posting was allowed and was always allowing posting. Bleh.

Diablo V1.16-test5    ***** EXPERIMENTAL RELEASE *****

'x' option added to dexpire.ctl, see sample file for more info (as related to dexpireover).

'b' option added to dexpire.ctl ... adjusts expiration based on the number of lines in the article, a la the Lines: header if it exists.
(I can't do it based on size because I create the data file before I know how large the article will be, and I refuse to do a file copy).

Solaris patches for file descriptor passing added, from Kjetil Torgrim Homme and Russell Vincent.

'Distribution' control patch from Lars added, but with some routines rewritten. Any article without a distribution header is assumed to be a distribution of 'world' for purposes of feeding. See samples/dnewsfeeds for more information. THIS HAS NOT BEEN TESTED AND MAY NOT WORK YET.

Fixed server tracking bugs for dserver.hosts and dreader.hosts that prevented proper shutdown of reader daemons when the parent dies. Same bugs also prevented server reconnects from occurring after a failure.

Fixed bug in lib/kpdb.c that caused dsyncgroups to segfault.

Fixed bug in dreaderd mbuf processing.

IP wildcards in dreader.hosts are treated in a special fashion. If a host name in dreader.hosts starts with a number, it is assumed to be an IP address and the wildcard compares are only run against IP addresses, not the FQDN from the reverse lookup. Otherwise something like "10.0.0.*" would match "10.0.0.hackers.domain.com".

README.READER has better instructions now.

Added new directory, sample_reader, with example dreaderd + diablo config.

xmake install will try to install the sample files as softlinks but will not overwrite any custom mods you make. xmake uninstall will attempt to deinstall everything, as long as it matches the source tree (i.e. will not delete files which you modify).

Changed 'fgrep -h ... files' to 'cat files | fgrep ...' in XMakefiles, some machines *still* use 15 year old fgreps. Sheesh.

Implemented spool priority mechanism, see samples/dserver.hosts, and implemented a spool-latency-history mechanism to automatically handle downed servers.

Fixed bug in xover output: the Xref: header name wasn't being printed and it was prefixing each entry with a space, which messes up trn's threading.

Changed diablo's dnewslink -H (i.e.
header-only feed) mechanism to include a Bytes: header. Changed dreaderd's input mechanism to require a Bytes: header for header-only feeds. This header is not required for normal feeds because it can calculate the article size. Why all this? So I can produce the proper Bytes field in xover data output.

Fixed line-too-long buffer overflow bug in diablo.c (bug found by KIZU takashi).

Changed while(accept(..) >= 0) to if (accept(..) >= 0) in unix domain command socket code to work around a problem with very old SunOS releases that do not understand non-blocking accept()'s.

/news/log/dreaderd.status increased to 120 columns, 70 wasn't enough to store readable realtime status.

Diablo V1.16-test4    ***** EXPERIMENTAL RELEASE *****

sin_family not set for certain conditions in dreaderd and diablo, causing problems with AIX (besides which it really needs to be set anyway to be proper).

Diablo V1.16-test3    ***** EXPERIMENTAL RELEASE *****

*** NOTICE *** ALL RELEASES AFTER AND INCLUDING 1.14, READ THE DEXPIRE.CTL DOCUMENTATION CAREFULLY AND REWRITE YOUR DEXPIRE.CTL AS APPROPRIATE!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Fixed pagealloc() & pagefree() code in lib/alloc.c to handle 0-byte allocations (it needs some more work).

Diablo V1.16-test2    ***** EXPERIMENTAL RELEASE *****

HAVE_SNPRINTF cleaned up. It's just too important, so the code now assumes it exists and we have a fudge routine if it doesn't. You may have to fix HAVE_SNPRINTF in lib/config.h for your OS if you get compilation warnings.

Fixed GetAuthUser() code in dreaderd. SIGPIPE was not being handled properly.

Diablo V1.16-test1    ***** EXPERIMENTAL RELEASE *****
(continuing NNTP reader development)

den->d_namlen replaced with strlen(den->d_name). d_namlen does not appear to exist under linux or solaris, they use d_reclen instead.

Didn't include in dreaderd/main.c if NEED_TERMIOS is set. Messed up solaris compile.

bcopy in lib/psstat.c if HAVE_SNPRINTF is not set used bad variable name (undefined symbol error). Fixed.

Diablo V1.15    ***** EXPERIMENTAL RELEASE *****
-- NOTE: basic reader support added

*** NOTICE *** ALL RELEASES AFTER AND INCLUDING 1.14, READ THE DEXPIRE.CTL DOCUMENTATION CAREFULLY AND REWRITE YOUR DEXPIRE.CTL AS APPROPRIATE!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

-H option added to dnewslink for header-only feed. 'mode headfeed' must be supported by the server (as a safety feature to prevent a header-only feed from leaking).

'headfeed' option added to dnntpspool.ctl, used to feed the diablo reader side. Sometimes coupled with Xref: for slave synchronization, or the reader can operate in a standalone fashion. For alpha-testing, do not use Xref: (you probably aren't anyway). Diablo's headfeed still sends the entire article body for control messages.

** ALPHA-TEST DIABLO READER SUPPORT in this release via 'dreaderd', ** see manual page for dreaderd. See README.READER

New TUNING_NOTES for FreeBSD (may apply to other OS's too) in regards to the v_cache_min sysctl.

The diablo and dnewslink binaries are now statically linked. Say what? Reason is simple: a DLL eats more memory than it saves when you are running lots of copies of the same program. You lose pages to the dynamic linking (which prevents them from being shared), to the separate data areas, and to the MMU for the mmap. All told it appears to come to around 80K per process. With statically linked binaries the text areas are 100% shared, there is no MMU overhead for the DLL mmap()s, and the data/bss is packed.

** The default hash mode is now crc rather than crc/prime, see diablo.config for a description.

Spaces no longer tolerated inside message-id's.

Diablo V1.14
-- NOTE: remove /news/dexpire.factor if it exists
-- NOTE: /news/dexpire.ctl must be reorganized a bit
-- NOTE: new REQUIRED configuration file added

*NOTE* NO reader support yet, but I'm getting close.

Added general key-token-value database support and added preliminary active file support through the key-token-value database.
This is good enough to allow diablo to generate Xref: headers. However, no control message support has been added yet (that's going to be a reader-side function). The dactive file is implemented as a shared binary key-token-pair database and is intended to (optionally) hold additional information such as the moderator email, newsgroup description, and control PGP keys. The format is extensible. While the dactive file is human readable, it should never be edited manually while diablo-related processes are active. Use the supplied utilities instead. THIS SUPPORT IS UNDER DEVELOPMENT AND SHOULD NOT BE CONSIDERED PRODUCTION YET.

New configuration file, diablo.config, added. See the sample diablo.config file. This file is REQUIRED. Diablo will refuse to start if this file is missing.

All entries in diablo.hosts are REQUIRED to have a label, and that label is REQUIRED to exist in the dnewsfeeds file. Diablo will now generate a 502 if a label is missing. Note that you can direct incoming-only feeds to the same label if you wish.

DEXPIRE.CTL works again. In prior releases dexpire.ctl served only to reject articles that were too large. The spool storage & expiration scheme has been redone in this release so time-grouped expires work again, allowing you to control expiration based on group. Note that the expiration does not currently work for the control.* pseudo group, but it will in a future release.

There is a cost though: while the number of physical files is not expected to increase, the number of open file descriptors will multiply by about 6x if you use the feature. The new reader-mode expiration code is NOT turned on by default because I do not want to trip people up. Please review the top of samples/diablo.config to see how to turn it on.

The system-wide or max-per-user file descriptor limit should be at least 4096 to accommodate an increase in the number of open descriptors if you configure expiration for reader operation.

The remember time, in days, may be set in diablo.config rather than hacking lib/vendor.h if you wish. This allows many diablo installations to compile straight out of the box without having to edit lib/vendor.h.

The new CRC64 algorithm can be used for the hash. The sample diablo.config includes a 'hash' algorithm specification. The older hash algorithm is called 'prime'. The new one is called 'crc'. The sample diablo.config uses 'crc/prime', which means to use crc but to fall back to prime on lookup if it fails to find a record with crc. To avoid blowing your history, you should leave the method set to crc/prime for about a week and then set it to just crc.

DIABLO.HOSTS check made more sophisticated. Please read the description in the diablo-files manual page for diablo.hosts. Diablo will now attempt to match up the base domain of a reverse lookup against specific hosts in diablo.hosts if the normal reverse+forward lookup security fails, then it will perform a forward lookup of the matching host(s) in diablo.hosts to attempt to match the IP.

While the new spool setup is backwards compatible with the previous setup, more care must be taken when removing spool directories to prevent corruption. If you remove spool directories manually and the diablo server is running, you must rename the directory prior to rm -rf'ing it to be 100% safe. The new dexpire does this, of course.

DEXPIRE.CTL now uses a percentage expire. That is, you specify a percentage of the best case expire (1-100) that you wish and diablo expires articles accordingly. See the samples/dexpire.ctl file. Diablo will dynamically and automatically conform itself to your expiration parameters, but you must edit the file with the new scheme in mind for it to make sense.

Added new utility, dreadart, to read articles and/or verify their consistency in the spool given the message-id.

The maximum number of parallel feeds from any given client may now be specified on a label-by-label basis.
It is still a per-ip limit, so you can still collapse many different incoming feeds into a single label and have the label-specified limit work on a per-host basis. See the 'maxconnect' definition in samples/dnewsfeeds.

Added 'nomismatch' option to dnewsfeeds (see samples/dnewsfeeds) to suppress MISMATCH testing of the incoming Path: header.

Added '-c commonpath' option to diablo. This adds a common path name to the Path: header if it does not already exist in the Path: header. See the manual page for diablo.

Fixed a bug in diablo that would cause the dnewsfeeds file to be read over and over again, eating cpu.

Added sub-second realtime capability: the combination of rtflush in dnewsfeeds, and 'realtime' + 'd-1' in dnntpspool.ctl. If the system supports usleep(), usleep(20000) is called (20 ms) rather than sleep(1). May require some configuration munging for your system. Not suggested.

BODY and ARTICLE nntp commands added. Like the HEAD command, articles can be retrieved by message-id.

dexpire will remove files called 'core' or '*.core' in the spool directory as it comes across them.

Diablo V1.13

Bug in spam cache file descriptor handling fixed. fcntl locks require us to obtain discrete descriptors by calling open() after any fork rather than before, or they will not work properly if a process gets killed. Due to the stability of diablo, this should not have caused any problems in 1.12, but needed to be fixed anyway.

Bug in dnewslink quit code: it would send the quit twice at the end, which would result in a bogus message in the logs. Apart from the bogus log message, the bug had no other effect. This threw a few people off who were looking for errors when, in fact, there were none.

Bug in dnewslink reconnect code. When moving on to a new batchfile, it did not fix StreamPend when pending streaming objects were refiled due to a timeout or remote close. This would result in dnewslink exiting after processing one queue file rather than going on to the next queue file.

Bug in dnewslink close/reopen (for logging) code: it would again lose track of StreamPend. It would also lose track of the number of potential receive bytes (which it calculates to guarantee that the server does not block writing responses back to the dnewslink client).

Iain's latest diablo-stats included.

Diablo V1.12

The spam cache is turned on by default, for real this time.

Added USE_PCOMMIT_SHM, USE_PCOMMIT_RW_MAP, DO_PCOMMIT_POSTCACHE, USE_SPAM_SHM, and USE_SPAM_RW_MAP.

*_RW_MAP causes diablo to use a shared r/w map rather than a read-only/lseek+write() mmap for the precommit and/or spam caches. << this will significantly improve the performance and stability of diablo >>.

If you set USE_PCOMMIT_SHM, Diablo will use SYSV shared memory rather than a file mmap for the precommit cache. If you set USE_SPAM_SHM, Diablo will use SYSV shared memory rather than a file mmap for the spam cache, but it will not be non-volatile storage, so if you kill and restart diablo, the spam cache will get reset. If you have sysv shared memory, you want to set USE_PCOMMIT_SHM and DO_PCOMMIT_POSTCACHE at the very least.

When shared memory is used for the spam filter, the spam.cache file (if it exists) will be loaded into shared memory when diablo is started and written out when the master diablo server exits.

DO_PCOMMIT_POSTCACHE tells diablo to use the precommit cache to also cache dhistory lookups and commits. It is not suggested that you use this feature unless you also set USE_PCOMMIT_SHM or at least set USE_PCOMMIT_RW_MAP. Many lookups will hit the precommit cache and thus avoid hitting the dhistory file, greatly reducing internal kernel filesystem lock contention and disk I/O on the dhistory file.

The FreeBSD 2.2.x, IRIX, and Solaris configs automatically turn on the new precommit features.

Casts the pointer argument to setsockopt() to void * to avoid compiler warnings on solaris (which still uses arcane 'char *' in its prototypes rather than 'void *').

The bytes= logging for the marks by dnewslink was incorrect, it was logging cumulative bytes rather than deltas for the marks.

Added 'rtflush' to dnewsfeeds. This option causes the queue file to be flushed on every line rather than buffered. Useful when used with 'realtime' in dnntpspool.ctl.

Added Path: name checking. If the first element of the Path: received by an article does not match any 'alias' statements for the incoming connection, the IP address is prepended to the Path: with .MISMATCH appended.

* >>> NOTE <<< you should grep through newly created spool directories every so often looking for .MISMATCH in the spool files to locate incoming feeds with improperly configured 'alias'es (in dnewsfeeds). When I turned this feature on, I found that four of my 80+ feeds were misconfigured.

DNewsfeeds file processing moved to the main diablo server and removed from the children, saving parsing and memory overhead of around 45K per child in heavily loaded diablo systems (x 100 processes = 4.5 MBytes saved).

Queue-delay 'q#' option added to dnntpspool.ctl. This allows you to tell diablo to purposefully delay N queue files before transferring a feed to a destination, thus introducing a feed delay on purpose. This feature can be used to hold articles while allowing control messages to propagate, getting the cancels in front of the articles being canceled.

Diablo now uses its own memory allocation code in order to better manage memory. This has simplified certain memory management operations, such as when the parent forks and needs to deallocate memory pools that the child will not use.

Minor fix to dnewslink: it now exits if it gets a 400 return code (ERR_GOODBYE) rather than retrying the connection.

Reverse dns lookup now uses a case-insensitive compare against the forward lookup.

The post-fork openlog() for syslog was being called prior to the mass file descriptor closures. Moved so it is called after. Doh!

Beta 64 bit support (e.g.
linux on 64 bit alphas and such).

Diablo V1.11
-- NOTE: new pseudo groups control.* added to the group list for control messages. This requires some minor modifications to your dnewsfeeds file so you do not get all control messages when taking a partial feed. See below.

Added control.* pseudo groups. If an article is a control message, the control message type is appended to the list of newsgroups as 'control.MESSAGETYPE'. For example, a cancel is appended to the list of newsgroups as 'control.cancel'.

This means that if you do an 'addgroup *' then use delgroup to remove the groups you don't want, YOU WILL STILL RECEIVE CONTROL MESSAGES FOR ALL GROUPS. The solution is to put a 'delgroup control.*' after the 'addgroup *' for all of your normal feeds. If you do this, only control messages for the groups that pass the filter will be propagated.

If you use the more standard 'delgroup *' followed by 'addgroup ...' lines for the groups you want, the delgroup covers it and no modifications are required for that feed.

I have added another filter command called 'requiregroup'. It is similar to addgroup, delgroup, and delgroupany. What it does, however, is require that the specified group BE in the Newsgroups list. This allows you to create a secondary feed to your newsreader boxes containing ONLY the control messages that also pass your normal group filters by appending 'requiregroup control.*' to the end of your addgroup/delgroup filter commands.

Please see the example in the samples/dnewsfeeds file for more information. Note that control bypasses generally require two dnewsfeeds labels, one for non-control messages and one for control messages.

Changed a bunch of printf's for stdout to fprintf's for stderr.

samples/dnewsfeeds incorrectly described the 'alias' command in the comments, fixed.

Added Iain's bytes= stats to diablo and his latest stats stuff to contrib. Also added -h option to diablo (see man page).

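The interaction between the control.* pseudo groups and the addgroup, delgroup, and requiregroup commands described in this release can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Diablo's actual filter code: it assumes that the last matching addgroup/delgroup line decides each group, that an article passes when at least one of its groups passes, and that wildcards behave like shell-style patterns. The function names and the rule representation are hypothetical.

```python
from fnmatch import fnmatchcase

def effective_groups(newsgroups, control_type=None):
    """Return the group list with the control.* pseudo group appended."""
    groups = list(newsgroups)
    if control_type is not None:
        groups.append("control." + control_type)
    return groups

def feed_accepts(rules, groups):
    """rules is an ordered list of (command, pattern) pairs (assumed model)."""
    any_passed = False
    for group in groups:
        passed = False
        for command, pattern in rules:
            if command in ("addgroup", "delgroup") and fnmatchcase(group, pattern):
                passed = (command == "addgroup")  # assumed: last matching line wins
        any_passed = any_passed or passed
    # requiregroup: some group in the list must match the pattern
    for command, pattern in rules:
        if command == "requiregroup":
            if not any(fnmatchcase(g, pattern) for g in groups):
                return False
    return any_passed

# A partial feed that drops alt.*, before and after the suggested fix.
leaky = [("addgroup", "*"), ("delgroup", "alt.*")]
fixed = leaky + [("delgroup", "control.*")]
# Secondary control-only feed built by appending requiregroup.
control_only = fixed + [("requiregroup", "control.*")]

cancel_in_alt = effective_groups(["alt.test"], control_type="cancel")
cancel_in_comp = effective_groups(["comp.lang.c"], control_type="cancel")
```

In this model the 'leaky' feed still accepts a cancel for alt.test because the control.cancel pseudo group passes 'addgroup *', which is the problem the note warns about. Adding 'delgroup control.*' closes the leak, and appending 'requiregroup control.*' yields a feed that carries only control messages for groups that pass the normal filter.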
'T' and 'R' parameters in dnntpspool.ctl now work, allowing you to set the transmit and receive buffer sizes on a connection-by-connection basis.

dnewslink now calls setsockopt() to set the transmit/receive buffer sizes after the connect as well as before, in case the connect() call overrides the first ones (which it does on FreeBSD and linux boxes, since the route table dictates nominal buffering parameters).

Ability to set the dhistory hash table size in the diload command. The default is 4 million entries, equivalent to the '-h 4m' option to diload. Each hash table entry is 4 bytes, so 4 million entries results in a 16 MByte hash table. The hash table size must be a power of 2, so the next logical step is -h 8m (32 MBytes) or -h 16m (64 MBytes). If your news box has a lot of memory, changing your biweekly.atrim script (see the adm directory for a sample) to generate a larger hash table will greatly reduce the load on the /news partition.

'-n' and '-f configfile' options added to dspoolout.

Minor adjustments made to realtime code to handle a race condition between diablo creating the realtime spoolfile and dspoolout trying to open it.

Minor adjustments made to realtime code in dnewslink to handle a race condition between diablo creating the realtime spoolfile and dnewslink closing out the previous one and opening the next one.

DNewslink now sends the 'quit' command at the end of the session and waits for a response.

Fixed another session reporting bug, diablo was not logging the correct number of elapsed seconds (non critical).

New options added for lib/vendor.h. You can set the USE_SHORT_REMEMBER define to 1 to use a shorter history retention for didump -x & diablo, or you can leave that option commented out and specify a specific REMEMBERDAYS. The default remember is 14 days, the default short remember is 7 days. Diablo adds a bit of slop internally to deal with Date: conversion errors.
Usually the bottleneck that develops first is in access to the dhistory file. A shorter remember/retention can significantly improve access times.

    /* #undef USE_SHORT_REMEMBER */
    /* #define USE_SHORT_REMEMBER 1 */
    /* #define REMEMBERDAYS 9 */

NOTE(!) always be sure to recompile diablo completely when making #define option changes: 'xmake clean; xmake'. The history retention is especially sensitive since it must be properly compiled into both the DIDUMP and DIABLO binaries. If you update one but not the other, it is possible to get into article transfer loops with your peers.

Diablo now identifies itself on 502's rather than using the same error message as INN.

Diablo V1.10

The ME label in dnewsfeeds must be renamed to DEFAULT, which is more appropriate to how it works. A new GLOBAL label is now optional. See samples/dnewsfeeds for more information.

SPAM FILTER - based on NNTP-Posting-Host: header frequency in messages. Defaults to ON. See the sample dnewsfeeds file for examples of how to program the spam filter. The -S0 option to diablo will turn it off, or you can simply 'delspam *' in the GLOBAL label to turn it off.

To use this filter, you need to specify 'delspam' lines in the GLOBAL filter to handle non-spam sources that exceed the rate limit (which defaults to 16 postings within 16 minutes). The samples/dnewsfeeds file contains a simple filter set. You will also want to 'delspam *' for the incoming feeds from your leaf node shell machines. However, since this sort of rate filter may become more commonplace on the internet, another solution will have to be found for shell-based news readers if the posting rate exceeds 16 postings per minute.

This spam filter is NOT PERFECT. There are plenty of cases where it can potentially filter non-spam, but it works well enough for us that I've installed it on BEST's newsfeeds box.

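The rate limit behind the spam filter can be pictured as a sliding window keyed on the NNTP-Posting-Host: value. The Python sketch below is an illustration only and makes assumptions the notes do not confirm: it keeps timestamps in an in-memory dictionary (Diablo itself uses a spam cache file or shared memory), it treats the 17th posting inside the window as the first one over the limit, and it does not model 'delspam' exemptions. The class and method names are hypothetical.

```python
from collections import defaultdict, deque

class PostingRateFilter:
    """Sliding-window counter per posting host (illustrative model only)."""

    def __init__(self, max_posts=16, window_seconds=16 * 60):
        # Defaults follow the note: 16 postings within 16 minutes.
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self.seen = defaultdict(deque)  # posting host -> arrival timestamps

    def is_over_limit(self, posting_host, now):
        """Record one posting at time 'now' and report whether it exceeds the limit."""
        times = self.seen[posting_host]
        # Drop timestamps that have aged out of the window.
        while times and now - times[0] >= self.window_seconds:
            times.popleft()
        times.append(now)
        return len(times) > self.max_posts

rate_filter = PostingRateFilter()
# Seventeen postings from one host, one second apart.
verdicts = [rate_filter.is_over_limit("shell.example.com", t) for t in range(17)]
```

Under these assumptions the first sixteen postings pass and the seventeenth is flagged. A busy shell machine would trip the limit quickly, which is why the note suggests 'delspam *' for feeds from leaf node shell hosts.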
NEAR-REALTIME OUTGOING FEEDS - dspoolout/dnewslink now have the capability to 'hang' on diablo's outgoing feed file, initiating transactions with the destination host as the diablo server makes data available. Since the diablo server buffers feed data, this is considered to be only near-realtime. The lag is around 5 seconds for a full feed.

You can also explicitly flush diablo's queue files with 'dicmd flush' from a cron job to support slower n.r.t. feeds. 'dicmd flush' is considerably cheaper to execute than 'dspoolout'. dspoolout would then only need to be run once every 30 minutes or so.

The original queue file mechanism still works and runs in parallel as a failsafe, resulting in double the message-id load on the destinations designated for realtime operation. See the dspoolout manual page for more information.

dspoolout was not handling the min-flush-seconds option exactly right. It should now do a better job with it, only using it if the queue files are truly backed up.

dnewslink now turns on SO_KEEPALIVE to prevent infinite hangs. Diablo already did this.

Fixed minor bug in Date: parsing on incoming feeds. It prevented the article-too-old stuff from working properly.

Fixed minor bug in header scanning - Diablo was not detecting the end-of-headers blank line properly and was scanning headers on into the body of the article.

A bottleneck for dhistory file appends has been removed. Previously, an exclusive lock was used to append to the file (O_APPEND is not dependable). Now we 'allocate' space with a record lock in a non-blocking fashion. History adds are now much faster under extreme loads.

diload throws away history file entries with zero'd gm timestamps or zero'd hash codes.

The size of the send and receive tcp socket buffers can now be specified in dspoolout, dnewslink, and diablo with command line options (-T and -R).

Diablo now properly logs the elapsed time on disconnect.

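For readers unfamiliar with the socket options mentioned in this release, the following Python sketch shows the general technique of turning on SO_KEEPALIVE and requesting specific send and receive buffer sizes. It is not taken from the Diablo source. The function and parameter names are hypothetical, and the mapping of tx_bytes and rx_bytes to the -T and -R options is only an assumption made for illustration.

```python
import socket

def apply_buffer_sizes(sock, tx_bytes=None, rx_bytes=None):
    """Request transmit/receive buffer sizes; the kernel may round or cap them."""
    if tx_bytes:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, tx_bytes)
    if rx_bytes:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rx_bytes)

def make_feed_socket(tx_bytes=None, rx_bytes=None):
    """Create a TCP socket with keepalive enabled and optional buffer sizes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Keepalive lets the kernel detect a dead peer instead of hanging forever.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    apply_buffer_sizes(sock, tx_bytes, rx_bytes)
    return sock

feed_socket = make_feed_socket(tx_bytes=65536, rx_bytes=65536)
keepalive_enabled = feed_socket.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
feed_socket.close()
```

The requested sizes are hints. The operating system is free to adjust them, so code that depends on an exact value should read it back with getsockopt().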
NOTE: addition to adm/biweekly.atrim sample, the dhistory.bak file is now removed before the new dhistory file is generated to make more space on the disk. This allows a 1.5G /news partition to continue to be used in the face of growing history files.

'dicmd status' added, returns diablo's current status.

The unix domain socket is created after diablo switches to the 'news' user rather than before.

-M option added to diablo. This limits the maximum number of simultaneous connections allowed from EACH remote host. The limit cannot yet be set on a per-feed basis. This option is designed to prevent system failures from out of control remotes.

Diablo V1.09

Major bug in dnewslink found by Michael S. McMahon. When dnewslink gets interrupted and refiles pending stream transactions back to the batch file, it was not refiling the offset,size part! A rerun would thus result in the entire article spool file (encompassing many articles) being sent as a single article. Ouch.

Precommit caching added. Diablo now generates and maintains a pcommit.cache file which it mmap's. This file contains a static 4096-entry hash table. Message-id's for check and ihave commands are entered into the hash table and time out after 30 seconds. Hash table collisions are simply overwritten. During the time in which a message-id is active, other check/ihave commands for the id will return a DUPlicate response code.

The precommit caching will get rid of 98% of the article collisions when you have several incoming feeds that are mostly synchronized with each other. In this situation, the precommit caching will reduce your incoming network load by at least a factor of 2 as well as reduce the disk write load.

Three new statistical elements have been added: predup, posdup, and pcoll (predup and posdup were actually added in V1.08). predup counts the number of history collisions that occur as of the beginning of a 'takethis' command.
posdup counts the number of history collisions that occur as of the article commit after a 'takethis' command, and pcoll counts the number of pcommit cache collisions which cause check/ihave to return a DUPlicate status.

The random seed used to generate file names was not being re-randomized after a fork. Oops! It is now.

Replaced stdio routines used for socket input by diablo with custom routines, removing the fileno(fi) = -1 problem and working around apparent stdio bugs in SunOS, Solaris, and IRIX.

Article reception can deal with 8-bit-clean data now, as long as it is properly '.' escaped. This includes removing the CR before an LF for storage and adding it back in for retransmission.

Added 'd' option to dnntpspool.ctl, allowing one to specify a startup delay for the dnewslinks related to a particular site.

statvfs fixes made for sun/solaris. I got several diffs and chose the easiest one. It may still not be entirely compatible with all sunos/solaris implementations.

The control file dspoolout uses (dnntpspool.ctl) can now be overridden on the command line.

You can now specify the outbound (bind) ip address for dnewslink. This is useful if you have multiple interfaces and want to split outgoing traffic amongst them, or if you have multiple interfaces and your peers expect the news to come in from a particular IP. This can also be specified in dnntpspool.ctl. DSpoolout will pass it on to DNewslink. See samples/dnntpspool.ctl for an example.

New version of Iain's stats package.

Diablo V1.08
*** NOTE ** DHISTORY FILE RELOAD REQUIRED IF UPGRADING ***
** HISTORY AND SPOOL FILE FORMAT CHANGE. The format change requires that you ensure that ALL of your diablo binaries are replaced. You must regenerate your dhistory file.

Major stability fixes to dhistory file operation. I have found that if the /news partition runs out of space or many processes are write()ing to dhistory in O_APPEND mode, O_APPEND writes aren't as atomic as I thought they were.
dhistory appends are now serialized and *ACTIVELY* realigned if their alignment gets screwed up. This should make the dhistory file much, much, much more robust.

Major stability fixes to dqueue files. Feeds no longer must be flushed on fork, and files are written in multiples of one line. If a write fails, the file is truncated to remove the partial line in order to prevent corruption if a disk fills up.

Stability fixes to article files: diablo now does a sanity check (looks for the \0 terminator for the article in a multi-article file) to catch corrupted files.

didump -x does a better job filtering out bogus dhistory entries.

Major, Major, Major changes in the spool file format, dhistory file format, and queue file format. Read the section at the end of the INSTALL file for instructions on how to upgrade without blowing up your news server.

* spool files may now contain multiple articles separated by \0. spool files are now named solely based on their gmtime and iteration number.

* queue files include offset/length pairs to allow dnewslink to access the new spool files.

* The dhistory file now has a proper header structure, and History entries now contain two additional fields (offset, bytes), used to store the offset,length information for article lookups.

The dynamic pipe sizing has been removed from dnewslink. It doesn't work very well when feeds get behind and just slows things down more.

dexpire is temporarily braindead: it now expires in straight FIFO fashion, so most of dexpire.ctl is no longer relevant (except for expirations of 0, which cause an article to be rejected). I have a medium term plan to make expiration specifications work again, but it will be a few releases at the very least.

diablo, dnewslink, and dexpire now use chdir() caching to reduce the path lengths for remove(), open(), etc., which is a big win for the kernel.

diablo now records rejection statistics properly. It previously did not count.

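The multi-article spool layout and the \0 sanity check described in this release can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Diablo's actual code: it assumes the stored length excludes the terminator and that the \0 byte sits immediately after the article. The function name is hypothetical.

```python
def read_article(spool_bytes, offset, length):
    """Return the article at (offset, length), or None if the sanity check fails."""
    end = offset + length
    # The article plus its terminator must fit inside the spool file.
    if offset < 0 or length <= 0 or end >= len(spool_bytes):
        return None
    # Assumed layout: a \0 byte directly follows each article.
    if spool_bytes[end] != 0:
        return None
    return spool_bytes[offset:end]

# Two small articles packed into one spool file, each followed by \0.
first = b"Message-ID: <a@example>\n\nfirst body\n"
second = b"Message-ID: <b@example>\n\nsecond body\n"
spool = first + b"\0" + second + b"\0"

second_offset = len(first) + 1
```

With an offset,length pair such as those stored in the queue and dhistory files, a reader can slice one article out of the spool file. A pair that does not land on a terminator is treated as corrupt instead of being returned as a truncated or oversized article.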
diablo now records and returns the errno error string on fatal errors.

Addition of 'delgroupany' command to dnewsfeeds file. Similar to 'delgroup', except that it acts the same as INN's '@group'... i.e. if the group appears in the Newsgroups line at all, the article is rejected, even if there are other groups in the line that pass the filter.

Addition of 'maxpath' command to dnewsfeeds file. Allows you to filter a feed based on the number of elements in the Path: header.

Addition of the 'groupdef' command and 'groupref' command (see the sample dnewsfeeds file). Rather than repeat the same group access list over and over again in each feed, you can now collect them together in one place and reference them from your feeds.

statvfs support added for SUN.

Diablo V1.07 - ** NOTE ** YOU MUST READ THE 'INSTALL' FILE IF UPGRADING TO THIS RELEASE FROM 1.05 OR LOWER !!!

The diablo server now has the ability to bind to a specific host and/or ip address.

diablo converts the reverse-lookup hostname to lowercase before comparing against hosts in the diablo.hosts file.

'nostream' option now available for dnntpspool.ctl for dnewslink runs.

Diablo now rejects articles when the message-id in the article body does not match the message-id in the ihave or takethis header, and logs the mismatch. Diablo previously stored the ihave/takethis message-id into the history file without checking it against the message-body. (should I do a 400+exit instead?)

Misc small bug fixes to diablo.

Diablo now returns a 400 error code and exits if a file error occurs. It previously returned a 431 error code, which was incorrect. 400 isn't exactly correct either, since it requires diablo to exit. We need another return code to allow the server to indicate a problem yet not exit (i.e. let the client decide to exit).

Diablo V1.06 - ** NOTE ** YOU MUST READ THE 'INSTALL' FILE IF UPGRADING TO THIS RELEASE!!!!
The default hash table size for dhistory will become USE_LARGE_HASH in this upcoming release. If you wish to maintain a small hash table size (e.g. you are taking a partial feed), you must set the USE_SMALL_HASH define to 1 in lib/vendor.h.

dnewslink will still attempt to send news if the (nntp) server responds with code 201 rather than code 200.

** It is suggested that the dclean admin script be run twice a week rather than once a day due to its load on the history file.

dnewslink now dynamically adjusts the size of the pipe for streaming feeds. If dnewslink detects an inordinate number of check/ok/takethis/reject collisions, it reduces the size of the pipeline. The pipeline is also dynamically adjusted back up when the collisions subside.

More /bin/sh script fixes in contrib/XMakefile... improper semicolons removed.

Fixed bug in dexpire that caused it to loop endlessly without getting much work done.

Added parallel-tasking option, -p# (as in -p4) to dexpire, causing it to fork N times in order to run multiple unlink()'s in parallel without the processes stepping over each other's feet.

Diablo V1.05

Switched from whole-file locks on dhistory to record locks. Diablo was having lock starvation problems due to the large number of processes accessing the history file. By using record locks, we lock a particular hash chain rather than the entire file. This should make machines with a large number of feeds more efficient.

EFFICIENCY NOTICE: We strongly suggest that you set the USE_LARGE_HASH define to 1 in lib/vendor.h and follow the instructions at the end of the INSTALL file to increase the size of the dhistory file hash table. Doing so will yield a major increase in efficiency on machines with limited buffer caches. Specifically, a 128MB machine may not have enough free space for a good buffer cache, and the large number of seek/reads caused by the smaller hash table can result in seek starvation on /news.
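As a rough illustration of the per-chain record locking described for V1.05, the sketch below uses fcntl() byte-range locks. It is not Diablo's code: HASH_SIZE, HEADER_SIZE, chain_offset(), and lock_chain() are hypothetical, and the file layout (a fixed header followed by one 32-bit slot per hash chain) is assumed purely for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define HASH_SIZE   (1u << 16)  /* hypothetical number of hash chains */
#define HEADER_SIZE 64          /* hypothetical file header length */

/* Assumed layout: one uint32_t chain-head slot per hash value, stored
 * immediately after the header. */
off_t chain_offset(uint32_t hash)
{
    return (off_t)HEADER_SIZE +
           (off_t)(hash % HASH_SIZE) * (off_t)sizeof(uint32_t);
}

/*
 * Lock or unlock only the slot for one hash chain. 'type' is F_RDLCK,
 * F_WRLCK, or F_UNLCK. F_SETLKW blocks until the lock is granted, so
 * processes working on different chains do not wait on each other.
 * Returns 0 on success, -1 on error.
 */
int lock_chain(int fd, uint32_t hash, short type)
{
    struct flock fl = {0};

    assert(type == F_RDLCK || type == F_WRLCK || type == F_UNLCK);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = chain_offset(hash);
    fl.l_len = (off_t)sizeof(uint32_t);
    return fcntl(fd, F_SETLKW, &fl);
}
```

A writer would call lock_chain(fd, hash, F_WRLCK), walk and update that one chain, then call lock_chain(fd, hash, F_UNLCK). With a whole-file lock, by contrast, every process accessing the history file would queue behind the same lock.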
Diablo V1.04

Changed a 436 return code to a 431 return code for streaming 'try it again' responses.

Now uses floating point to store and log byte and article totals.

Fixed bug in the article totaling (for logs) - Diablo was reporting more received articles than were actually received in the DIABLO log line.

Diablo incoming now logs every 1024 articles rather than every 1000 for a better looking K/M/G log line.

Added 'maxcross' and 'maxsize' filters for outgoing feeds.

Diablo V1.03

Uses mmap() to allocate memory, either via MAP_ANON or MAP_PRIVATE, allowing us to easily deallocate it from the page table on fork. We use this for stdio buffers as well as for pipe line buffers.

Added doutq.

More portability fixes: directory stuff, signal stuff, etc...

Diablo V1.02

Fixed two file descriptor leaks in the main server. There are not likely to be any more.

Made portability changes: uses sigaction() rather than signal(), and uses fcntl() rather than flock() (though SUN should get its act together and implement flock().. it's trivial).

Diablo V1.01

Moved some .SET's from XMakefile to XMakefile.inc.

The sample crontab in adm/ ran the once-every-four-hours expire once a minute instead, and I didn't notice!

dnewslink now reports local/remote latency properly. Well, as well as can be done, anyway, and also reports the size of the article in debug mode.

Diablo V1.00

Initial Release