To: Ron Thisted and Michael Wichura
From: Bruce Trumbo
Subject: Treatment of correction notes, acknowledgments of priority, and comments in CIS. Related issues.
My work on the Gap Project over the last few years has given me the opportunity to focus on how we might do a better job in handling correction notes, acknowledgments of priority, discussions, etc.
Also very much in my mind as I write these notes are new and expanded roles for the data base in the future:
Already many libraries are discontinuing subscriptions to some journals that would previously have been considered indispensable. In many instances the CIS record will be the only information a user has about a paper until it is brought forth in a two- or three-day retrieval process.
Soon we will probably be providing auxiliary files for abstracts and citations. Some of the information we currently put into the main CIS record might be handled more elegantly (and more completely) by the use of such auxiliary files.
Many of the really important improvements made in the data base over the past few years have come from the thoughtful imposition of new standards. A few additional ones may be helpful now.
1. Definitions. I think it is best to start by being very clear as to what is meant by the terminology.
Correction note (Corr:). One or more of the authors of a paper or the editor of the journal writes to say that the paper contains error(s). (If a non-author/editor claims to correct something, it is a comment. An exception might be made for a correction by a student or colleague of a deceased author.)
Acknowledgment of priority (Ack:). One or more of the authors of a paper, or possibly the editor, writes to say that some part(s) of the paper have previously been published elsewhere by other author(s). (A dignified way for the editor to proceed when the originator or someone who knows of his/her work complains that a result is not original is as follows: The editor works with the author to formulate a statement of apology crediting the source of the information on priority and making an unequivocal statement of who has priority for what results. This statement is published, clearly labeled as an Ack:. The letter of complaint is not published unless unresolvable issues remain. Even where this procedure is not followed, CIS should come as close as possible to pretending that it was.)
Addendum (Add:). One or more of the authors of a paper writes to expand or improve the paper with a clarification, application, generalization, etc. (Such remarks by a non-author constitute a comment, not an Add:. Even if so labeled in the journal, an item is not indexed as an Add: if it is really a Corr: or an Ack:.)
Comment (Com:). Someone other than the author(s) of the original paper writes to express agreement or disagreement or to clarify or extend what is in the original paper.
Reply (Reply:). Reply to a Comment by the author(s) of the original paper.
2. Treatment of author-generated modifications. In recent years of CIS we have sometimes failed to distinguish usefully among Corr:, Ack:, Add:, and Reply:. Most often we have used Corr: where Ack: or Add: is really what is meant. I regret this trend both on the grounds of bibliographic purity and because the distinctions may offer important clues to users about what modifying notes they need to look up and when. (If a result I am depending upon may be wrong, I need to know it before I do anything more. If priority is in question or further information may be available, I can safely continue my work while the library takes a few days to retrieve the note from wherever.)
Unfortunately, one cannot always take the terminology that appears in the journal at face value, and I think we have a responsibility to present the truth as far as we can discern it, especially since the CIS editor can usually find the truth just by scanning a few lines.
Often authors seek to blunt their embarrassment about an Ack: or Corr: by combining it with a trivial extension and calling the result an Add:. In a few instances the admission of priority for a result has come ("oh yes, and by the way") after a list of several inconsequential typographical errors and the whole note is designated as a Corr:. In appropriate situations I suggest we use Corr/Add:, Ack/Corr:, etc.
Editors sometimes obscure an Ack: by publishing the letter of complaint from the originator as a "Comment" and the apology from the author as a "Reply." In these instances I suggest we ignore the comment and treat the reply as Ack:. (A user may legitimately decide not to look up every comment especially in a journal that allows a free exchange of opinionated, but scientifically inconsequential comments. However, the user is obligated to give proper credit in his own paper and so must know about the Ack:.)
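The definitions in item 1 and the relabeling rules in item 2 can be read as a small decision procedure. The following is only a minimal sketch under stated assumptions: the field names, the function name, and the fixed ordering of combined labels are illustrative and not part of any existing CIS software. It treats authors and editors alike and does not model the exception for a student or colleague of a deceased author.

```python
from dataclasses import dataclass

# Display order for combined labels. The memo shows "Corr/Add:" and "Ack/Corr:"
# but gives no ordering rule, so this order is an assumption.
LABEL_ORDER = ["Ack", "Corr", "Add"]


@dataclass
class Note:
    by_author_or_editor: bool      # written by an author of the paper or the editor
    admits_error: bool = False     # says the paper contains error(s)
    admits_priority: bool = False  # says part of the paper was published earlier by others
    extends_paper: bool = False    # clarification, application, generalization, etc.
    answers_comment: bool = False  # responds to a Comment on the paper


def classify(note: Note) -> str:
    """Label a note by its content, not by the heading the journal gave it."""
    if not note.by_author_or_editor:
        return "Com:"  # remarks by a non-author are comments, whatever they are called
    found = set()
    if note.admits_priority:
        found.add("Ack")
    if note.admits_error:
        found.add("Corr")
    if note.extends_paper:
        found.add("Add")
    if found:
        # An apology printed as a "Reply" still comes out as Ack: here.
        return "/".join(label for label in LABEL_ORDER if label in found) + ":"
    if note.answers_comment:
        return "Reply:"
    raise ValueError("author note fits none of the defined categories")
```

For example, an author's "Reply" that concedes priority is labeled Ack:, and a list of typographical errors padded with a trivial extension is labeled Corr/Add:.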
3. Distinguishing among types of comments. As a prelude to suggesting some modified ways of doing things, I distinguish three types of comments--there may be more.
A. Discussion from the floor. (Example: JRSS.) This kind of discussion is published immediately following the paper to which it refers. It may or may not be based on the opportunity to give deep thought to the issues before commenting. It is not refereed. Authors may or may not have the opportunity to edit their remarks before publication. There are often many comments, some of which may be very brief or inconsequential.
B. Invited discussion. (Example: Statistical Science.) Invited discussants read the referent paper and provide written comments. These may or may not be refereed. They almost always appear immediately following the referent paper. (On rare occasion an editor may come to realize that competent discussion of a paper is needed and organize invited discussion in a later issue.) The length and depth of these commentaries vary, but many of them approach the level of brief papers in their thoughtfulness, scope, originality, and use of references.
C. Unsolicited discussion. (Examples: Technometrics and The American Statistician.) These appear in later issues of the journal. They are often classified as letters to the editor, which is usually a fair characterization even when they are not officially so designated. Meaningful refereeing rarely, if ever, takes place. In some journals there may be many comments continuing over several years. The length and scientific content of the comments may vary greatly. (Occasionally, a comment may be scientifically superior to the referent paper. However, some editors encourage extended discussions on controversial topics by printing comments that are unsupported expressions of personal opinion, which generate rebuttals in kind, etc., ad infinitum--or thereabouts.)
I suggest that Type A discussion be handled by showing "(with discussion)" after the title of the referent paper, and including the page numbers of the discussion within the span of pages in Field 1. Many of the comments are so brief or off-hand that it seems inappropriate to dignify them all with individual records. I have used the with-discussion format in all recent Gap Project records, at first as a compromise to save effort and money, but eventually, after accumulating some experience, with no regret at all.
Type B discussion should continue to be handled in the data base by showing the pages of the discussion in parentheses after the title of the referent paper, by including a "Comments on" record for each commentary and, where appropriate, by including a "Reply to comments on" record for the author's reply.
Our notation in parentheses after the title has not been standard. We have used Com: (for comments only), C/R: (for comments and reply), Disc: (ambiguous), and perhaps other notations. I prefer to indicate whether there is a reply mainly because it is so easy to do. However, we should adopt a standard to be used in all new records.
I do not recommend trying to retrofit this new standard for existing records. (If Disc: becomes the standard, I would oppose the loss of information involved in changing both Com: and C/R: to Disc:. If Com: and C/R: are to be the standard, it will not be easy to change records with Disc: because additional information would be required.)
I propose that Type C records be handled in the data base by showing the locations of all discussions in parentheses after the title of the referent paper, much as at present. However:
I recommend that the printed volumes simply ignore such unsolicited and detached discussions--both in the subject and author sections.
We should decide whether the data base would carry a separate record for each detached, unsolicited discussion or not. After some consideration, I have come to the conclusion that we should not do so for the time being. (But see items 4 and 5 below.)
The current practice is so awkward that editors seem to have given up altogether covering freewheeling discussions in some journals. A coverage criterion based on whether the discussion is called a "Letter" (if yes, out; otherwise, in) is simple, easy to follow, and almost totally irrational.
Ordinarily, the only information that would be lost by omitting records for delayed discussions is the identity of the discussants. (Individual titles given to discussions are never used in CIS.) Considerations about including records to inform the user of the identity of a commentator: Is a user likely to be influenced as to whether to read a comment depending on who wrote it? If so, is this a valid criterion in general use that we wish to validate by creation of a large number of records almost solely for its support?
Additional considerations: What is the disadvantage of having results of searches on words-in-title cluttered up with potentially long lists of comment records? What confusion results from retrieval of comment records when the search word is in the title but not when it is a key word? Is it ever really useful for a comment to carry a key word not carried by the referent paper, so that comments are retrieved without their referent papers? (This latter event is rare, but there are instances. They provide the only example in which a "Comment on" record provides non-redundant information other than the commentator's name.)
4. Automation of forward references to modifying notes and discussion. The process of inserting the parenthetical information about modifying notes and discussion after the titles of referent articles is tedious, time consuming, and extremely prone to error when done by hand. We need to establish formats and computer algorithms for:
(a) creating records that carry the necessary information when each journal is "abstracted,"
(b) attaching the necessary forward references to referent papers, and
(c) deleting and archiving records that have served their immediate purpose after their information has been attached to the referent paper. (Some information from such deleted records may find its way back into auxiliary files attached to the data base at a future date. See item 5.)
In preparing earlier annual editions of the data base and in my work on the Gap Project, I tried to keep items in parentheses in time order and to avoid repeating abbreviations where possible. For example:
(Corr: 67V7 pxxx; Com: 68V8 pxxx-xxx; 70V10 pxxx; Corr: 72V12 pxxx)
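A minimal sketch of how such a parenthetical string might be generated follows. The function name and the tuple layout are hypothetical, and the two-digit-year "67V7" location format is inferred from the example above rather than from a written standard.

```python
def format_forward_refs(notes):
    """Build the parenthetical string from (year, volume, pages, label) tuples."""
    parts = []
    previous_label = None
    # Time order: sort by year, then by volume within a year.
    for year, volume, pages, label in sorted(notes, key=lambda n: (n[0], n[1])):
        location = f"{year % 100:02d}V{volume} p{pages}"
        # Repeat the abbreviation only when it differs from the preceding item.
        parts.append(location if label == previous_label else f"{label}: {location}")
        previous_label = label
    return "(" + "; ".join(parts) + ")" if parts else ""


# Same items as the example above, supplied out of order.
notes = [
    (1972, 12, "xxx", "Corr"),
    (1968, 8, "xxx-xxx", "Com"),
    (1967, 7, "xxx", "Corr"),
    (1970, 10, "xxx", "Com"),
]
print(format_forward_refs(notes))
```

The second Com: is suppressed because it directly follows another comment, while the second Corr: is repeated because a different label intervenes.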
It should not be difficult to write a program that would automate the inclusion of information from each year's work into the relevant earlier records. We have never kept Corr: records in the data base after they are "attached." Depending on what is decided about retaining detached discussion records in the data base, it might be necessary for each record to carry a flag to determine whether it is fated for retention in the data base or eventual deletion/archival.
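Steps (a) through (c) and the retention flag might fit together roughly as sketched below. This is an illustration under assumptions, not a description of existing CIS software: the record fields, the dictionary-based data base, and the choice to keep a note whose referent cannot be found are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    key: str                          # hypothetical unique identifier for a record
    title: str
    refers_to: Optional[str] = None   # key of the referent paper, if a modifying note
    label: str = ""                   # "Corr", "Ack", "Add", "Com", ...
    location: str = ""                # e.g. "72V12 pxxx"
    retain: bool = False              # flag: keep in the data base after attachment?
    forward_refs: list = field(default_factory=list)


def attach_and_archive(database, new_records):
    """Attach each note to its referent paper; return the records to be archived."""
    archived = []
    for rec in new_records:
        referent = database.get(rec.refers_to) if rec.refers_to else None
        if referent is None:
            # Ordinary paper, or a note whose referent is not in the data base:
            # keep it so that nothing is silently lost (an assumption).
            database[rec.key] = rec
            continue
        referent.forward_refs.append((rec.label, rec.location))
        if rec.retain:
            database[rec.key] = rec
        else:
            archived.append(rec)  # archived, not discarded, for possible later use
    return archived


db = {"jones": Record("jones", "A theorem")}
incoming = [
    Record("c1", "Correction", refers_to="jones", label="Corr", location="72V12 pxxx"),
    Record("d1", "Comments on A theorem", refers_to="jones", label="Com",
           location="73V13 pxxx", retain=True),
    Record("p1", "An unrelated paper"),
]
archived = attach_and_archive(db, incoming)
```

Here the correction note is attached and archived, the flagged comment is attached and retained, and the ordinary paper is simply added.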
A slightly different but related issue is the handling of what I consider to be mandatory cross references. For example, a paper by Smith is entitled "An extension of a theorem of Jones." In that case, if possible, I include a reference "(Ref: ....)" after the title of Smith's paper showing the location of the theorem being extended. Even more important, I have tried to include similar, cross-referencing parenthetical information after the title of the referent paper by Jones. (Users retrieving Jones's paper deserve to know that Smith claims to have extended the result.) In one sense, this constitutes treating Smith's paper almost as if it were a detached comment on Jones's, with the difference that Smith's paper is presumably fully refereed. In another sense, this is the beginning of a citation index. In any sense, I think it would be worthwhile to do more, not less, of this kind of cross-referencing. (See item 5.)
I have also tried to cross-reference sequences of papers labeled as Part I, Part II, etc. I am not so sure about the importance of this except in the case where Part I is not so labeled. Otherwise, the alert user will do the required search on important words in the title to find all parts. Remember that different parts may have widely different key words.
5. Possibilities for the future. In the long run I suppose and hope we will start to carry abstracts for a wide selection of journals, and eventually to have some citation information--ideally captured automatically from appropriately formatted references. (The Subcommittee on Electronic Communications of the ASA Committee on Publications has urged the quickest possible implementation for both abstracts and citations; the report was approved by the full Committee and by the ASA Board. Contact Dan Solomon, subcommittee chair, if this report has not reached you.) If we go down this path it will no longer be possible to put all of the information on an item into its main CIS record. Ron has already established some standards for abstract files.
Auxiliary files such as those containing abstract and citation information on items in main records would seem to be the ideal place to put detailed information about discussions, addenda, and cross-references as well.
One day soon it may be the case that much discussion of papers will take place in electronic media. In that case especially, pointer information in addition to what will fit in the main record seems called for.
Title fields of main records for some papers are already very cluttered with parenthetical information about delayed discussions, and it seems natural to seek a more elegant way to make this information available.
It is worthwhile to give serious consideration to the eventual complete and efficient handling of delayed discussion because the most interesting and valuable papers are often the ones that generate the most discussion.
Perhaps Corr: and Ack: information--or at least an indication of its existence--should always remain prominently displayed with the title in the main record so that even the most unsophisticated of users cannot fail to find it.
Depending on the shape of future electronic retrieval capabilities, it may be appropriate for us to include some brief information on the nature of a correction. We could start doing this now by putting brief notations into the KW field of Correction Note records, annotated so as not to show in the printed volume. It may be several years before we find a way to include these in the data base, but we could archive the deleted correction notes containing such information for the day when we can show it. Even such brief notations as the following could be very helpful, without being too hard for CIS editors to create:
Additional condition required for Theorem 4
Revised proof for weakened version of Theorem 2
Irreparably false Lemma 9 and Theorem 10
Confused headings in Table 1
Legends for Figures 7 and 8 reversed
Typographical error in Equation (3.4)
Incorrect acknowledgment of support
Non-substantive corrections of punctuation
You have my permission to distribute this memo, or contextually coherent excerpts of it, within the CIS organization as you see fit.