File Format Changes in Release 8
Last updated Dec 14, 1999
Release 8 of CISED will be the last release with two-digit year identifiers in the officially released database records. With Release 8, the official database records will be in exactly the same format as in previous releases. In addition to these official records, however, a complete set of database records and indexes with four-digit year identifiers will also be included with Release 8. The format for the four-digit-year records included with Release 8 will be the format employed in future releases of CISED.
Both sets of records are included in the Release 8 distribution so that developers of search software can more easily make the necessary modifications and test their products before Release 9 of CISED.
Details of the current record formats and file-naming conventions are contained in the Technical Reference Manual. The current version of the Manual was last updated in 1997, but the information it contains about database structure remains current. The Manual is available on current editions of the CISED CD-ROM, and is also available for download in PDF format.
Changes in record contents. Currently, the year of publication occupies the first two positions of Field 1 of each CISED record, and Field 1 is 28 bytes in length. In the new format, Field 1 will be 30 bytes in length: its first four positions will contain the four-digit year of publication, and all subsequent subfields in Field 1 will be shifted two bytes to the right.
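As a rough illustration of the Field 1 change described above, the following Python sketch widens a 28-byte Field 1 to the 30-byte form. It is not taken from the Technical Reference Manual. The function name widen_field1 and the century parameter are hypothetical, and the sketch assumes that the year is stored as ASCII digits and that the caller already knows which century each record belongs to.

```python
OLD_FIELD1_LEN = 28  # current Field 1 length, per the text above
NEW_FIELD1_LEN = 30  # Field 1 length in the four-digit-year format


def widen_field1(old_field1: bytes, century: int = 19) -> bytes:
    """Return a 30-byte Field 1 built from a 28-byte Field 1.

    Assumption: `century` (for example 19 or 20) is supplied by the caller;
    this notice does not say how a century should be inferred.
    """
    if len(old_field1) != OLD_FIELD1_LEN:
        raise ValueError("expected a 28-byte Field 1")
    if not old_field1[:2].isdigit():
        raise ValueError("first two positions are not a two-digit year")
    if not 10 <= century <= 99:
        raise ValueError("century must be a two-digit number")

    # Prepending the two century digits turns YY into CCYY and moves every
    # later subfield exactly two bytes to the right.
    new_field1 = str(century).encode("ascii") + old_field1
    assert len(new_field1) == NEW_FIELD1_LEN
    return new_field1
```

For example, a Field 1 that begins with 97 would begin with 1997 after conversion, with the remaining 26 bytes unchanged in content but starting at position 5 instead of position 3.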
Changes in file-naming conventions. Currently, files containing CIS records have names of the form CISnn.Vrr, where nn denotes a two-digit year and rr denotes the particular release. Corresponding index files have names of the form CISnn.Xrr, where nn and rr are the same as for the database record files.
The new file names will have the form CISnnnn.Vrr and CISnnnn.Xrr, where nnnn is a four-digit year, and rr denotes the particular release, as before.
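The renaming rule can be sketched in Python as follows. This is an illustration only: the function name new_file_name and the century parameter are hypothetical, the sketch assumes that rr is exactly two alphanumeric characters, and the sample name CIS97.V08 is invented for the example rather than taken from the distribution.

```python
import re

# Old-style names: CISnn.Vrr (records) or CISnn.Xrr (indexes).
# Assumption: rr is exactly two alphanumeric characters.
OLD_NAME = re.compile(r"^CIS([0-9]{2})\.([VX])([0-9A-Za-z]{2})$")


def new_file_name(old_name: str, century: int = 19) -> str:
    """Map CISnn.Vrr or CISnn.Xrr to the four-digit-year form CISnnnn.Vrr or CISnnnn.Xrr.

    Assumption: the caller supplies the century; it cannot be derived
    from the old file name alone.
    """
    match = OLD_NAME.match(old_name)
    if match is None:
        raise ValueError(f"not an old-style CIS file name: {old_name!r}")
    if not 10 <= century <= 99:
        raise ValueError("century must be a two-digit number")
    year2, kind, release = match.groups()
    return f"CIS{century}{year2}.{kind}{release}"


print(new_file_name("CIS97.V08"))  # prints CIS1997.V08
print(new_file_name("CIS97.X08"))  # prints CIS1997.X08
```

Search software that must handle both formats during the transition could use the length of the year portion of the file name to decide whether to expect a 28-byte or a 30-byte Field 1.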
Changes to abstracts and abstract files. There is no change in the file-naming conventions for abstract files. The contents of the four-digit version of abstracts differ from those of two-digit files only in that Line B of each abstract will contain the 30-byte Field 1 rather than the current 28-byte Field 1 for the corresponding CIS entry.
Changes to abbreviation files. No changes are needed in the abbreviation files, although a complete set of abbreviation files is included in the directories that also contain the four-digit records, again to facilitate the development, testing, and deployment of search software using the new formats.
Last updated: 02/13/2004