ILL Standard Formatted Data

R. Ghosh Version 3.0, November 2006, minor updates March 2018
  • Storage and archiving
  • General Layout of data files
  • Accessing data
  • Technical notes
  • Descriptions of instrument specific data
  • Tools for checking data and directories
  • Evolution of the Standard Format

    1. Introduction

    A decision was taken in 1994 to change the storage of ILL raw data from binary to text representation (US-ASCII) when it became evident that a wide variety of machine architectures would be in daily use, and that sufficient power was now available to rapidly decode the formatted data for re-use in calculations. In addition the transfer of text files across networks was not only simpler, but the subsequent printing out of the results would rapidly show up any errors.

    In-house scientists have insisted that the policy of maintaining a complete archive of raw data should continue. The transcription of most early binary data has been completed, with a few absent data due to unrecoverable media errors. In certain cases, notably Small Angle Scattering, the data from D11 and D17 have been reformatted to match current standards, and can thus be re-examined with existing programs. The data are accessible on-line with year and cycle directories as described below.

    2. Storage and Archiving

    One disadvantage of storing data as text is the necessary increase in space required to avoid loss of precision due to truncation effects. In addition, while fixed formatting improves clarity for re-reading and printing, it does so at the expense of additional disk space. Given the CPU power available in present workstations and personal computers, we envisage using data compression techniques to partially offset these extra space requirements. A compression algorithm has been tested with packing and unpacking routines which are in common use and available for all systems at ILL (Unix compress/uncompress, GNU compress/uncompress). It is possible to invoke automatic decompression of packed data when requesting access to the data; this feature may be invoked in some versions of the data access routines. Certain instruments, notably the Three-axis spectrometers, produce only small volumes of data, and no compression of these data is envisaged.

    In the past, data have been acquired during separate measurements or scans (runs), identified with a sequential run-number (numor). On most instruments these data have been stored separately. When transferred to the Central VMS-Cluster the data were concatenated into a single file, and an index-file was constructed to help routines access, and general utility programs manage, the data. The specificity of such utilities to VMS was one reason for deciding to revert to separate files for each numor in the future system, though the format described later shows that it is still possible to combine sequences at a later stage if desired.

    The sequence of data generation, network transfer and archiving is as follows:

    The data are written to disk by the instrument computer. These data are usually accessed by the basic inspection programs, which run on the scientific workstation adjacent to the control computer and are used to inspect the progress of the experiments.

    The data are transferred to a central data server; further treatment then is performed on distributed workstations which all access this central file-server.

    The raw data are thus stored on the data server serdon. The server is defined on standard workstations and the directories can be examined with standard commands.

    The data may be found in the directory trees from

     
    /usr/illdata/data/instrument         for the current cycle
    /usr/illdata/data-1/instrument       for the preceding cycle
    /usr/illdata/975/instrument          for cycle 975 (compressed data)
                                                etc.
    
    e.g.
    tunis:~> ls /usr/illdata
    001  033  082  122  171  761  784  807	841  873  902  971  data
    002  041  083  123  181  762  785  811	842  874  903  972  data-1
    003  042  084  131  731  763  786  812	843  875  904  973  DATA_CATALOG
    004  043  091  132  732  764  791  813	844  881  905  974  log
    005  051  092  133  733  765  792  814	845  882  911  975  logfiles
    011  052  093  141  734  766  793  815	851  883  951  981  misc
    012  053  094  142  741  771  794  821	852  884  952  982  MyData
    013  061  101  143  743  772  795  822	853  885  953  983  Numors
    014  062  102  151  744  773  796  823	861  891  954  984  power
    021  063  103  152  751  774  801  824	862  892  955  985  processed
    022  071  111  153  752  775  802  825	863  893  961  991  temp
    023  072  112  154  753  776  803  831	864  894  962  992  xray
    024  073  113  161  754  781  804  832	865  895  963  993
    031  074  114  162  755  782  805  833	871  896  964  994
    032  081  121  163  756  783  806  834	872  901  965  995
    tunis:~> 
    
    

    The cost of disk storage now allows the ILL to keep all data on-line, though in most cases each run file is compressed, with the exception of the current and last cycles.

    Filenames

    To allow compatibility with the largest number of systems, we decided that each data file would be identified simply by its sequential run number, written as six digits (padded with leading zeros), e.g. 000123. The remaining information (reactor cycle and instrument name) is used in defining the path to the sub-directory containing the run data. This scheme also matches the name changes which are automatically invoked by compaction routines on the different systems. All data files include further identification information within the file (see below). On a Unix system the final compressed archived file will be identified by a pathname of the form:

    	              ../951/in6/000123.Z
    

    for the IN6 run number 123 from cycle 951.

    At present data are stored uncompressed, except for instruments like D19 where some reduction in space requirements is necessary. The data server disks are mounted on standardly configured workstations as:

       Current Cycle   /usr/illdata/data/instrument/011151    etc
    
       Previous Cycle  /usr/illdata/data-1/instrument/000659  etc
    

    Certain instruments, namely D1B, D4, D9, D15, D16, D19, D20, DB21, produce such a large number of files that they have been subdivided into sets of 10000 in subdirectories. Thus:

    /usr/illdata/data/d1b/d1b_0/......    contains runs 1 to 9999
    /usr/illdata/data/d1b/d1b_1/......    contains runs 10000 to 19999 etc.
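
    The path conventions above can be sketched as a small helper. This is only an illustration, not an ILL utility: the function name numor_path and the SUBDIVIDED set are hypothetical, and the sketch assumes that run files sit directly inside the d1b_0-style subdirectory (the final path component is elided in the listing above) and that the same subdivision applies to every cycle directory. Compressed archive files would carry an additional .Z suffix.

```python
from pathlib import PurePosixPath

# Instruments whose runs are split into subdirectories of 10000 (list taken from the text).
SUBDIVIDED = {"d1b", "d4", "d9", "d15", "d16", "d19", "d20", "db21"}


def numor_path(instrument, numor, cycle="data", root="/usr/illdata"):
    """Build the expected path of a run file (hypothetical helper).

    cycle is "data" (current), "data-1" (previous) or a cycle number such as "951".
    """
    inst = instrument.lower()
    path = PurePosixPath(root) / str(cycle) / inst
    if inst in SUBDIVIDED:
        # Assumption: runs 1-9999 live in <inst>_0, 10000-19999 in <inst>_1, and so on.
        path = path / f"{inst}_{numor // 10000}"
    # The file name is the run number written as six digits with leading zeros.
    return str(path / f"{numor:06d}")


print(numor_path("IN6", 123, cycle="951"))   # /usr/illdata/951/in6/000123
print(numor_path("D1B", 10042))              # /usr/illdata/data/d1b/d1b_1/010042
```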
    

    3. General layout of data files

    The earlier binary data archive format included a number of parameters at the start which served to identify the data, and the structure of the following binary or alpha-numeric fields. The identifiers were used by the ILL utilities which managed the database. The data keys permitted access to different parts of the file to be optimised.

    The text data files can be handled using standard file-utilities available on each system. The internal structure of the data files is delimited by key records indicating the format of the following field. The first part of the file contains information which is common to all runs, and includes the run number, time of recording, experiment name etc. in a specific format, so new ILL utilities (typically Unix shell-scripts) can be developed to emulate functions of the previous database management program SPECTRA. Following the title records, the contents vary from instrument to instrument.

    Parameters and data from the counting system are written in blocks of floating-point numbers or integers, each line being padded out to exactly 80 characters of text. Such files can be read in two ways: either using simple sequential Fortran READs, or using direct access methods (which have some advantages, allowing data to be skipped or read in arbitrary order). To allow instruments further flexibility, variable length data (unspecified text) is also acceptable, provided that the first two fields correspond to the standards described here.

    Each instrument, or group of instruments, has a specific data format, which evolves in time. To describe the meaning of the data elements, an optional description may be stored with the raw data within the file.

    The layout of the standard files is a minor extension of the format developed for export of data on tape from the ILL database (TAPDAT, 1981; SPECTRA, 1987), now including the possibility of adding a text description to most fields. Data written in ASCII by TAPDAT or SPECTRA remain compatible when read, since there are no lines of description in these files and the corresponding blank variable is read as zero lines of descriptive text.

    The second extension is the introduction of a new type of field, the variable length data signified by the key letter V. Standard ILL utilities will not try to interpret data from this key to the end of the file, since the internal structure will be deemed to be specific to the instrument concerned.

    Keys, data and text are written in 80 character fixed length strings (data following the V descriptor have variable length).

    A key field signifies that a certain type of data field follows, and gives information on the size of the following field and on how much text (if any) is present describing the field of data.

    The text (if present) then follows (new feature).

    The next records then contain the data.

    There then follows another key record for the next data field.

    The next record contains information on the size of the following field, and how much text (if any) is present describing the field of data, and so on.

    Key fields

    These identifying fields consist of two 80 character strings with a fixed format. The first is completely filled with one of the seven key letters (R, S, A, F, I, J or V), written with the Fortran format (80A1); the second contains up to 10 integers in Fortran format (10I8). The first record can always be read using the A1 format and checked before any attempt is made to read the following integers. These integers contain control information.

    The seven key types are described below:

    RRRRRRRRRR..			..RRR	(80A1)
    NRUN	NTEXT	NVERS				(10I8)
    
    NRUN	is the run number (numor) for the data following
    NTEXT	is the number of lines of descriptive text which follow
    NVERS	is the version of the data (modified as data structure changes)
    
    SSSSSSSSSS..	...		..SSS	(80A1)
    ISPEC	NREST	NTOT	NRUN	NTEXT	NPARS	(10I8)
    
    ISPEC	is the following sub-spectrum number
    NREST	is the number of subspectra remaining after ISPEC 
    NTOT	is the total number of subspectra in the run 
    NRUN	is the current run number
    NTEXT	is the number of lines of descriptive text
    NPARS	is the number of parameter sections (F, I etc, preceding the
            counts data), typically for step-scanning multi-detector
            instruments where additional information is stored at each step
    
    AAAAAAAAAA..	...		..AAA	(80A1)
    NCHARS	NTEXT				(10I8)
    
    NCHARS 	is the number of characters to be read from the next data field
    	using the format (80A1)
    NTEXT	is the number of lines of descriptive text before this data 
    
    FFFFFFFFFF..	...		..FFF	(80A1)
    NFLOAT	NTEXT				(10I8)
    
    NFLOAT 	is the number of floating point numbers to be read from the 
    	next data field using format (5E16.8)
    NTEXT	is the number of lines of descriptive text before the data
    
    
    IIIIIIIIII..	...		..III	(80A1)
    NINTGR	NTEXT				(10I8)
    
    NINTGR 	is the number of integer numbers to be read from the next data 
    	field using the format (10I8)
    NTEXT	is the number of lines of descriptive text before the data
    
    JJJJJJJJJJ..	...		..JJJ	(80A1)
    NINTGR	NTEXT				(8I10)
    
    NINTGR 	is the number of integer numbers to be read from the next data 
    	field using the format (8I10), for use where the data are
            likely to overwrite white space if written in I8 format.
    NTEXT	is the number of lines of descriptive text before the data
    
    VVVVVVVVVV..	...		..VVV	(80A1)
    
    Text data following are in a variable length format, and no further 
    standard fields are expected.
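
    Because the records are written with fixed Fortran formats, adjacent values are not guaranteed to be separated by white space, so a reader in another language is safest cutting each record at fixed column widths. The following is a minimal sketch of that idea; the function names are hypothetical and not part of any ILL library, and blank fields are simply skipped rather than read as zero as Fortran would do.

```python
def split_fixed(record, width):
    """Cut a record into fixed-width fields, dropping blank (padding) fields."""
    fields = [record[i:i + width] for i in range(0, len(record), width)]
    return [f for f in fields if f.strip()]


def decode_ints(record, width=8):
    """Decode a (10I8) record; pass width=10 for the (8I10) records of a J field."""
    return [int(f) for f in split_fixed(record, width)]


def decode_floats(record):
    """Decode a (5E16.8) record."""
    return [float(f) for f in split_fixed(record, 16)]


def is_key(record):
    """Return the key letter if the record is a key line (80 identical letters), else None."""
    text = record.rstrip("\r\n")
    if len(text) == 80 and text[0] in "RSAFIJV" and text == text[0] * 80:
        return text[0]
    return None


control = "%8d%8d%8d" % (485, 1, 0)          # NRUN, NTEXT, NVERS following an R key
print(decode_ints(control))                   # [485, 1, 0]
print(decode_ints("1234567812345678"))        # [12345678, 12345678], no white space needed
print(decode_floats("  0.15000000E+04  0.00000000E+00"))  # [1500.0, 0.0]
print(is_key("F" * 80))                       # F
```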
    
    Examples

    The sequence of key strings and data for typical instruments may be described in an abbreviated form where each capital letter, R,A,S,F,I,J,V denotes the initial key string, the small letter the length information, and t and d denote descriptive text and data strings respectively, all of fixed 80 characters total string length. The data strings v are of variable length (usually less than 256 characters, and most often less than 132 characters).

    One spectrum/run
    
    	RrtAatddFftttdddSsIittdddddddddddddddddddd
    
    The program /usr/ill/bin/anadat can be used to analyse a file,
    e.g. for D20 with one frame per run:
    
    % /usr/ill/bin/anadat 050750
     anadat - version 3.1  June 2001  (R.E. Ghosh)
    
    
     Scanning file :050750                                                                          
      Run  Format............ from  record      1
     50750 80A 480A  30F  25F  30F  15F  55F  20F +  1 x (  5F  1600J )                                                                                                              
     End of file after    310  records
    
    
    Several subspectra/run
    
    	RrtAatddFftttdddSsIitdddddddSsIitdddddSsIitddddd..
    
    A second example of D20 with 31 frames in a single file:
    % /usr/ill/bin/anadat 044450
     anadat - version 3.1  June 2001  (R.E. Ghosh)
    
    
     Scanning file :044450                                                                          
      Run  Format............ from  record      1
     44450 80A 480A  30F  25F  30F  15F  55F  20F + 31 x ( 30F  1600J )                                                                                                              
     End of file after   6692  records
    
    Variable format
    
    	RrtAatddVvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
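
    The abbreviated sequences above follow directly from the control records, so a file can be walked field by field without decoding the data. The sketch below produces a summary in the spirit of the anadat output; it is not the anadat algorithm, the function name summarise is hypothetical, and it assumes the records are supplied as a list of lines and that each control record can be split on white space.

```python
from math import ceil

# Number of values held by one 80-character data record for each field type.
VALUES_PER_RECORD = {"A": 80, "F": 5, "I": 10, "J": 8}


def summarise(lines):
    """Return e.g. ['R', '80A', '7F', 'S', '12I'] for the fields found in `lines`."""
    summary, pos = [], 0
    while pos < len(lines):
        key = lines[pos][:1]
        if key == "V":
            # Everything after a V key is instrument specific; stop interpreting.
            summary.append("V")
            break
        if key not in ("R", "S", "A", "F", "I", "J"):
            raise ValueError(f"record {pos + 1} is not a key record")
        control = [int(v) for v in lines[pos + 1].split()]
        if key == "R":                      # NRUN NTEXT NVERS
            ntext, ndata = control[1], 0
            summary.append("R")
        elif key == "S":                    # ISPEC NREST NTOT NRUN NTEXT NPARS
            ntext, ndata = control[4], 0
            summary.append("S")
        else:                               # count NTEXT
            count, ntext = control[0], control[1]
            ndata = ceil(count / VALUES_PER_RECORD[key])
            summary.append(f"{count}{key}")
        pos += 2 + ntext + ndata            # key + control + text + data records
    return summary


demo = (["R" * 80, "%8d%8d%8d" % (485, 1, 0), "descriptive text"]
        + ["A" * 80, "%8d%8d" % (80, 0), "IN20  KULDA       31-MAR-94 16:12:34"]
        + ["V" * 80, "INSTR: IN20"])
print(summarise(demo))                      # ['R', '80A', 'V']
```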
    
    

    Apart from the run number, the data are identified by the name of the instrument, the date and time of the initial recording, and a short experiment name. This information appears as a text field immediately after the run number key. The text field thus follows an AAA key; at present (September 1994) the 80 characters are used as follows:

    INSTexpt.-nameDD-MMM-YY.hh:mm:ss---48 blank ---- (80 total)

    where:

    INST        instrument (4 characters)
    expt.-name  experiment name (10 characters)
    DD-MMM-YY   date of recording (9 characters, one space)
    hh:mm:ss    time of recording (8 characters)

    Example of a data file from D11 Small-angle scattering spectrometer

    In addition to the standard header containing the instrument name etc., the following 5 data fields are present:

    	156I, 512A, 128F, 256I, 4096I	
    
    The formatted structure is:
    
    RrAadIitddddddddddddddddAatdddddddFfttttttttddddddddddddddddddddddddddIitdd etc.
    
    The data appear as follows:
    
    RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 018983 1 0 ILL-SANS Data transcribed Spring 1994: ILL ASCII-Formatted Data (GM-REG 1994) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 80 0 D11 OTTEWILL 22-MAR-89 10:56:50 IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 156 1 Key values required to re-create original binary file ref:89GH03T (R. Ghosh) 1 4096 1 120 2 64 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 2 120 120 3 256 64 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 20 20 200 1 4096 5 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 512 1 Title(60a) Subtitle(20a) Start time(20a) Stop time(20a) OTTEWILL LATEX UNDER SHEAR SP12 1E-5 NEW 22-MAR-89 10:29:5022-MAR-89 10:56:49 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF 128 8 Parameters . . . . Preset 1 Preset 2 Time(secs) Detector sum Monitor sum Select. rev sum Chop. rev sum Counter 6/12 Counter 7/12 Counter 8/12 Counter 9/12 Counter 10/12 Counter 11/12 Counter 12/12 Offset Det. deg. Sample coder 1 Sample coder 2 Sample coder 3 Sample coder 4 Selector speed Chopper1 speed Chopper2 speed DVM channel 1 DVM channel 2 Selector A-B-C Sample-Det (m) Run cycle # Run serial # Temp. set point Temp. regulation Sample temp. K control IEEE-1 at start IEEE-1 at end IEEE(2-7)... 
0.15000000E+04 0.00000000E+00 0.16137000E+05 0.28602810E+07 0.15000000E+07 0.49283000E+05 0.00000000E+00 0.00000000E+00 0.28602810E+07 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.65536000E+05 0.00000000E+00 0.00000000E+00 0.20000000E+05 0.43023000E+05 0.13250000E+04 0.23330000E+04 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.10000000E+01 0.20000000E+02 0.10000000E+01 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 0.00000000E+00 
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 256 1 Overflows (>65536) address..contents..address..contents..address..contents 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS 1 0 1 18983 0 IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 4096 1 Detector Counts C(x,y) x=1,y=1 x=2,y=1 3,1 ... 1,2 2,2 3,2 ... x=n,y=n 23145 0 0 0 0 0 1 2 0 0 0 1 2 6 37 35 38 46 58 44 65 63 50 36 53 42 51 71 71 63 67 90 101 95 89 75 73 76 83 42 59 62 49 49 51 48 70 37 42 17 16 0 1 0 0 0 1 0 0 0 19 13 4 9 0 0 0 0 0 0 0 0 0 0 0 10 32 38 41 37 49 42 38 57 58 67 54 54 69 73 88 90 84 94 86 118 114 107 118 63 74 77 86 80 54 78 55 48 57 43 32 56 49 41 42 25 7 0 0 0 0 0 0 0 0 0 0 0 0 0 The full multi-detector spectrum occupies 410 lines of integer data.... 
38 70 88 118 151 146 132 107 99 186 356 439 380 312 273 461 762 932 816 485 487 731 1495 2542 3308 3030 2749 3945 9635 19771 23258 13611 4971 2765 2961 3556 3012 1862 968 552 510 774 915 870 512 357 256 259 270 259 173 118 93 135 149 168 145 103 65 64 55 56 42 42 50 58 85 82 127 130 98 100 146 335 574 769 628 371 304 355 540 637 578 451 517 1454 5303 10310 10622 5867 2450 756 1456 2681 3160 1631 685 824 4107 7988 8929 4997 1954 698 500 654 676 593 366 300 266 357 470 399 293 131 106 106 141 161 111 87 67 63 48 51 60 63 51 59 73 76 101 104 115 101 204 476 876 1102 841 504 297 279 357 439 520 531 786 3516 14624 28021 25994 11460 2592 75 82 68 72 77 67 679 7163 18254 24764 14393 4975 1096 545 487 515 443 321 304 337 527 758 685 432 219 115 102 129 131 102 84 72 70 57 58 51 57 58 71 70 76 87 112 80 97 175 448 914 1091 899 434 243 237 357 517 518 622 878 4230 16535 30370 27178 11369 2254 56 55 36 63 46 52 700 8628 20949 29901 17647 6397 1362 643 577 476 385 298 310 331 601 922 870 528 242 113 113 93 103 94 73 60 67 55 70 49 51 50 58 72 101 133 133 104 87 187 417 807 932 743 408 239 319 574 994 1006 819 1067 3671 13439 23908 19587 7581 1883 76 47 51 39 46 66 778 7225 16013 23155 14036 5341 1450 869 724 735 490 325 260 346 527 856 785 493 266 127 80 103 100 93 85 54 55 58 64 48 51 53 67 82 116 133 151 125 102 158 313 547 542 447 277 226 434 1165 2253 2044 1092 1158 2476 6592 10953 8768 3597 1567 48 56 45 52 48 57 571 4121 8506 11742 7002 3007 1444 969 1187 1565 1038 452 323 277 422 626 593 380 182 107 94 92 113 97 56 60 59 only a part of this integer field is shown here..... 
0 0 0 0 0 0 0 0 0 13 36 30 35 48 31 24 36 26 28 29 34 30 27 50 49 54 45 36 41 60 49 42 49 54 43 52 55 51 41 39 43 47 38 35 29 30 26 33 28 42 36 33 35 33 28 8 0 0 0 2 4 0 1 2 0 0 0 0 0 0 0 1 0 0 19 26 40 37 26 33 25 20 26 29 29 34 27 28 38 33 31 34 33 44 28 45 44 46 37 39 33 35 29 39 29 32 33 29 32 24 23 36 31 33 36 35 25 4 1 0 1 0 0 1 14 6 4 2 1 0 0 0 0 0 0 0 0 0 0 1 30 44 36 28 33 31 31 26 30 33 39 30 38 32 33 25 39 22 38 29 36 31 27 24 33 36 33 37 41 34 36 31 30 21 35 36 29 27 26 26 4 0 0 0 0 0 0 1 0 1 2 6 85 0 0 0 0 0 0 0 1 0 1 1 0 11 22 31 24 28 20 26 21 26 24 29 26 34 29 26 36 19 47 30 31 29 22 30 27 28 23 24 32 26 38 30 30 22 25 36 24 23 10 0 0 0 1 3 0 2 1 2 1 0 8 21

    Example of a data file containing variable length strings :

    Three Axis Spectrometers

    
    RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
      000485       1       0                                                        
    ILL TAS data in the new ASCII format follow after the line VV...V
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
          80       0                                                                 
    IN20  KULDA       31-MAR-94 16:12:34  
    VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
    INSTR: IN20 
    EXPNO: 4-7-35	
    USER_: KULDA
    LOCAL: CURRAT
    FILE_: 000485.Z 
    DATE_: 31-MAR-94 16:12:34
    TITLE: TAs
    COMND: SC EN=2.8 DEN=.2 NP=3 MN=200 
    COMM_: This is just a comment.
    [only in PA mode:
    POLAN: ON F1; CO MN=10000; OFF F1; CO MN=2000]
    
    POSQE: QH=  2.1000, QK=  0.0000, QL=  0.0001, EN=  -1.002, UN=meV
    STEPS: DQH=  0.0000, DQK=  0.0000, DQL=  0.0000, DEN=    .100,
    [or STEPS: DA3=    0.10]
    
    PARAM: DM= 3.3551, DA= 3.3550, SM= 1., SS=-1., SA= 1., ETAM= 35.1, ETAA= 24.0,
    PARAM: FX= 2., KFIX=  2.6620,
    PARAM: ALF1= 60., ALF2= 40., ALF3= 43., ALF4= 60.,
    PARAM: BET1=120., BET2= 90., BET3= 90., BET4=240., 
    PARAM: AS=  6.2832, BS=  6.2832, CS=  6.2832, 
    PARAM: AA= 90.000, BB= 90.000, CC=120.000, ETAS= 17.1,
    PARAM: AX=   1.000, AY=   0.000, AZ=   0.000, 
    PARAM: BX=   0.000, BY=   0.000, BZ=  1.000, 
    
    VARIA: A1=  35.43, A2=  70.85, A3=-185.23, A4= -56.05, A5=  35.43, A6=  70.86,
    VARIA: TM=  20.00, GM=  19.98, RM=  10.62, GL=  15.43, GU=  -2.34, 
    VARIA: TA=  15.94, GA=  13.23, RA=   5.34, CH= 134.00, LM=  23.00, 
    [only in PA mode:
    VARIA: I1=   0.00, I2=   0.00, I3=   1.00, I4=   0.80, I5=   0.00, I6=   0.00, 
    VARIA: I7=   0.00, I8=   0.00,]
    
    ZEROS: -131.02, -145.98, -180.00, -130.98,  -90.70, -131.02, etc
    ZEROS: cont.
    
    FORMT: (I4,1X,F8.3,1X,I7,1X,I6,1X,I6,1X,F7.2,1X,F7.2,1X,F7.2,1X,F8.4,1X,F7.2)
    [in PA: PNT is F6.1]
    DATA_:
    PNT    EN(meV)   CNTS     M1     M2   TIME      A3      A4      KI      T       
                                            
       1    2.610     263    200    794   72.77   22.50   -2.99   2.8890   19.87
       2    2.797     175    200    761   72.13   22.50   -2.99   2.9040   19.87
       3    2.995     161    200    806   71.69   22.50   -2.99   2.9210   19.87
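
    The variable-length part of a three-axis file is a sequence of "KEYWD: content" lines followed by a table, which makes it straightforward to read outside Fortran. The sketch below is one possible reader, not an ILL routine: the function name parse_tas and the choice of returned structures are assumptions, and only the keywords shown in the example above are treated specially.

```python
def parse_tas(lines):
    """Split the text after the VV...V key into header, named values, columns and rows."""
    header, values, columns, rows = {}, {}, [], []
    in_data = False
    for line in lines:
        if in_data:
            fields = line.split()
            if not fields:
                continue                      # blank line inside the table
            if not columns:
                columns = fields              # first non-blank line names the columns
            else:
                rows.append([float(f) for f in fields])
            continue
        keyword, sep, rest = line.partition(":")
        keyword = keyword.strip()
        if not sep:
            continue                          # not a "KEYWD:" line
        if keyword == "DATA_":
            in_data = True
        elif keyword in ("PARAM", "VARIA", "POSQE", "STEPS"):
            # Comma-separated NAME=value pairs; keep non-numeric values as text.
            for item in rest.split(","):
                name, eq, value = item.partition("=")
                if eq:
                    try:
                        values[name.strip()] = float(value)
                    except ValueError:
                        values[name.strip()] = value.strip()
        else:
            header[keyword] = rest.strip()
    return header, values, columns, rows


sample = [
    "INSTR: IN20",
    "DATE_: 31-MAR-94 16:12:34",
    "POSQE: QH=  2.1000, QK=  0.0000, QL=  0.0001, EN=  -1.002, UN=meV",
    "PARAM: FX= 2., KFIX=  2.6620,",
    "DATA_:",
    "PNT    EN(meV)   CNTS     M1     M2   TIME      A3      A4      KI      T",
    "   1    2.610     263    200    794   72.77   22.50   -2.99   2.8890   19.87",
    "   2    2.797     175    200    761   72.13   22.50   -2.99   2.9040   19.87",
]
header, values, columns, rows = parse_tas(sample)
print(header["INSTR"], values["KFIX"], columns[2], rows[1][2])   # IN20 2.662 CNTS 175.0
```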
       
       
    
    

    4. Accessing Data

    As created, the data files may be displayed using standard TYPE or cat commands.

    If the file is to be read in the simplest manner, i.e. sequentially from the start, it may be opened in simple Fortran as follows:

    	CHARACTER*50 FNAM
    	:
    	FNAM='/data-1/in6/000123'               ! Unix
    or	FNAM='PREVIOUS:[IN6]000123.'            ! VMS
    	:
    	OPEN(UNIT=10,FILE=FNAM,STATUS='OLD',ERR=999)
    	:
    
    It is then the task of the program to step through the data.
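
    As an illustration of stepping through the first records sequentially, the sketch below reads the run header and the identification record from an open file. It is a minimal example in Python rather than Fortran, the function name read_identification is hypothetical, and it assumes that the identification record carries no descriptive text and that the experiment name contains no embedded blanks, so the record can be split on white space.

```python
import io
from datetime import datetime


def read_identification(stream):
    """Read the R field and the following A field; return the run identification."""
    if not stream.readline().startswith("R"):
        raise ValueError("file does not start with an R key record")
    nrun, ntext, nvers = (int(v) for v in stream.readline().split()[:3])
    for _ in range(ntext):
        stream.readline()                     # skip descriptive text
    if not stream.readline().startswith("A"):
        raise ValueError("expected an A key record after the run header")
    stream.readline()                         # NCHARS, NTEXT (assumed 80 and 0)
    inst, name, date, time = stream.readline().split()[:4]
    recorded = datetime.strptime(f"{date} {time}", "%d-%b-%y %H:%M:%S")
    return {"run": nrun, "version": nvers, "instrument": inst,
            "experiment": name, "recorded": recorded}


sample = "\n".join([
    "R" * 80,
    "%8d%8d%8d" % (485, 1, 0),
    "ILL TAS data in the new ASCII format follow after the line VV...V",
    "A" * 80,
    "%8d%8d" % (80, 0),
    "IN20  KULDA       31-MAR-94 16:12:34",
]) + "\n"
info = read_identification(io.StringIO(sample))
print(info["run"], info["instrument"], info["experiment"], info["recorded"].isoformat())
# 485 IN20 KULDA 1994-03-31T16:12:34
```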

    Accessing Compressed Data

    A number of treatment programs use the following strategy to access compressed data:

           is data file present uncompressed? 
              yes - open file
              no    
              is data file present compressed (nnnnnn.Z)?
              yes - execute command and create temporary file locally
                    zcat nnnnnn.Z > data.tmp
                    open file data.tmp
              :
              read open file
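
    The strategy above might be written as follows. This is a sketch only, not the code used by the ILL treatment programs: the function name open_run is hypothetical, zcat is assumed to be on the search path, and the temporary file is left for the caller to delete.

```python
import os
import subprocess
import tempfile


def open_run(path):
    """Open a run file, unpacking path + '.Z' to a temporary file when necessary."""
    if os.path.exists(path):
        # Data file present uncompressed: open it directly.
        return open(path, "r", encoding="ascii", errors="replace")
    packed = path + ".Z"
    if not os.path.exists(packed):
        raise FileNotFoundError(path)
    tmp = tempfile.NamedTemporaryFile("w+b", suffix=".tmp", delete=False)
    with tmp:
        # Equivalent of: zcat nnnnnn.Z > data.tmp
        subprocess.run(["zcat", packed], stdout=tmp, check=True)
    return open(tmp.name, "r", encoding="ascii", errors="replace")


# Small demonstration with an uncompressed file in a scratch directory.
with tempfile.TemporaryDirectory() as folder:
    run = os.path.join(folder, "000123")
    with open(run, "w") as handle:
        handle.write("R" * 80 + "\n")
    with open_run(run) as handle:
        print(handle.readline()[:5])          # RRRRR
```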
    
    Some data transcribed to the ASCII archive lack end of line markers. A small utility, /usr/ill/bin/dzarch, can be used instead of zcat to decompress these data. (This routine itself uses zcat during its work.)

    5. Technical Notes

    Typical compressions achieved are:

    D11
    	basic format	 156I, 512A, 128F, 256I, 4096I
    
    		VMS binary	10 kbytes
    		ASCII file	42 kbytes
    		Compressed ASCII	10 kbytes
    
    IN6
    
    	basic format	156I, 512A, 384F, 128F, 95(512I)
    
    		VMS binary	200 kbytes
    		ASCII file	420 kbytes
    		Compressed ASCII	  60 kbytes
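
    The ASCII size quoted for D11 can be cross-checked from the field layout, since every field costs a key record, a control record, its text lines and its data records. The sketch below does this arithmetic using the text line counts of the D11 example file shown earlier; it assumes each record occupies 80 characters plus a single newline, and the names used are illustrative only.

```python
from math import ceil

VALUES_PER_RECORD = {"A": 80, "F": 5, "I": 10, "J": 8}
RECORD_BYTES = 81   # assumption: 80 characters plus one newline character


def field_records(key, count=0, ntext=0):
    """Records used by one field: key + control + descriptive text + data."""
    ndata = ceil(count / VALUES_PER_RECORD[key]) if key in VALUES_PER_RECORD else 0
    return 2 + ntext + ndata


# D11 example file: (key, number of values, lines of descriptive text)
d11 = [("R", 0, 1), ("A", 80, 0), ("I", 156, 1), ("A", 512, 1),
       ("F", 128, 8), ("I", 256, 1), ("S", 0, 0), ("I", 4096, 1)]

records = sum(field_records(*field) for field in d11)
size = records * RECORD_BYTES
print(records, size)                          # 515 41715, i.e. about 42 kbytes

# Compression factors from the figures quoted above (ASCII size / compressed size).
for name, ascii_kb, packed_kb in [("D11", 42, 10), ("IN6", 420, 60)]:
    print(name, f"{ascii_kb / packed_kb:.1f}x")   # D11 4.2x, then IN6 7.0x
```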
    

    6. Evolution of the Standard Data Format

    1983 TAPDAT
    The initial key structure was defined in the ILL report 83PA29T, Pater & Ghosh, describing the format for exporting data from the binary experimental data-base on magnetic tape.
    1987 SPECTRA
    The data extraction routines were rewritten for VAX-VMS. When only one "count results" field (following the header parameters) was found, this was then treated as a single subspectrum, being preceded by an SSSSS field.
    1993 sharing data between VMS and Unix
    The solution of using the fixed length record format from TAPDAT was adopted to simplify the progressive implementation of the changeover to Unix.
    1994 addition of VVVVV field
    To integrate variable formatted data into the management structure (notably from 3-Axis spectrometers and spin-echo where the data are stored in a form resembling the instrument log) the VVVVV key field was added.
    1997 multiple sub-spectrum fields
    For transcription of binary to ascii data only one data field was permitted in the subspectrum. Once ascii data were written directly under Unix this constraint (in the transcription routine) was irrelevant. For D20 it was now possible to include scan parameters as a separate field in each subspectrum.
    1997 new field key JJJJJ
    More and more data access routines depend on "white space" separating the datum values. Newer fast counting instruments were capable of such high count rates that it was possible to exceed 9999999 counts, though occasionally these values result from electronic errors. Though such values are easy to read in Fortran in fixed format, a simplification for other languages could be obtained by increasing the field width to ten characters for certain instruments, notably D20.
    2006 closure of VMS system and access to binary archive
    A number of previously untransferred data from D11A were recovered, and the opportunity taken to reformat the early data from D11 and D17 to match current standards. The utility program dzarch was written to ensure all archive files could be correctly uncompressed.
    2007 first NeXus (HDF5) data
    BRISP was the first instrument to save data in NeXus (HDF5) format. This was followed by the other time-of-flight instruments from 2009.
    2014 NeXus data on D33
    The new SANS instrument, D33, started operations with NeXus data. The ILL prescription is a variant on the strict definition, since the metadata are in a subset of the instrument name. A specific dictionary file for each instrument in treatment software can help follow the evolution of these data (e.g. D33 stores multidetector data as (time,y,x); initially D22 had (time,x,y)). Some tools have been written to rewrite D11 and D22 data in an ASCII format. The compression of the binary data is less efficient than the ASCII compression (especially for small integer count data).