Documentation for the EBCDIC Conversion Package (Level 3) --------------------------------------------------------- Written by: Ted Green and Thomas C. Burt Last change: 10/16/2003 Introduction ------------ The VEDIT Level-3 EBCDIC package converts EBCDIC files into ASCII with any number of packed-decimal, binary, zoned, text and "ignore" fields. It can do a limited amount of data rearrangement and can output constant strings. The Level-3 EBCDIC package also supports multiple record types per file. Each record type can vary in record length and field specifications. Each record type can be optionally output to its own file. Groups of field specifications, called "segments" herein, may be named and referred to by their name. These segments may consist of multiple, fixed length, data dependent subrecords. The converted ASCII records can optionally have newline characters, e.g. Carriage-Return and Line-Feed, appended to the end of each converted record. They can also be output in DBASE format, with the DBASE header record being automatically generated. Other output options include quoting-and-comma delimiting each field or appending a user specified delimiter string after each field; converting an all-blank or null numeric field to zeros; emitting an explicit decimal point; and prefixing or appending padding characters to any specified field. For EBCDIC to ASCII conversions, the converted file is normally saved as "filename.ASC"; for DBASE output, the default extension is ".DBF". Or, the user may explicitly give the output filename. In no case is the source data file altered. Any conversion errors can optionally be reported in the file "ebcdic.err". This file is initially deleted; therefore, a batch file can test for its existence to determine whether there were any errors. The conversion is completely controlled by a ".LAY" data layout file which the user must create once for each type of file to be converted. 
This layout file primarily lists the beginning and ending columns for each field and the data type of the field. It also specifies the record length and specifies whether newline characters are to be appended to the end of each converted record.

As described below, a COBOL "copy-book" can often be automatically converted into a .LAY data layout file. Therefore, we highly recommend that you procure the "copy-book" for your data file(s).

Once the .LAY data layout file has been created, the conversion macro can be manually run from within VEDIT, or automatically run from either a Windows icon or a DOS or NT command line. The macro runs without any user interaction and, when done, saves the converted file and displays it on the screen. Optionally, the macro can automatically exit, in which case the conversion requires no user interaction whatsoever.

To convert many files, EBCDIC-3.VDM can be used with the menu function {MISC, WILDFILE macro} to automatically convert entire groups of files, e.g. all files in a directory.

The speed of the conversion depends both on the size of the file and the number of packed fields per record. The conversion speed is typically 10 megabytes per minute on a 1000 MHz Pentium.

NOTES: A custom conversion can also be performed on any desired fields using a custom written VEDIT macro. This is described in more detail later.

The conversion supports additional technical options which are not described here. Please examine the description at the beginning of the EBCDIC-3.VDM file for additional details.

The following capabilities of the Level-3 package are not described here:

    * Preprocessing and postprocessing.

Greenview Data is available on a contract basis to perform any part of the conversion for you, including creating the data layout file or even translating the EBCDIC file(s) that you supply. If you do not have the packed field locations for your file, we can often examine the file and determine them for you.
Installation ------------ Simply copy the supplied files to the VEDIT Home Directory, e.g. "c:\vedit". The files are: VPW.EXE VEDIT (32-bit Windows) which contains the additional packed field EBCDIC support. The {HELP, About} box displays "With EBCDIC support". This file will replace your existing vpw.exe file. EBCDIC-3.VDM The EBCDIC conversion macro described here. (Do not modify this file). EBCDIC-3.VSM The EBCDIC conversion submacros. Loaded by EBCDIC-3.VDM at runtime. (Do not modify this file). EBCDIC-3.VCM More EBCDIC conversion submacros; used only by custom macros and the rare "ki" field; loaded automatically during preprocessing if a "c" or "ki" field is encountered. (Do not modify this file). COBOL2V.VDM The COBOL copy-book preprocessor. This is used by EBCDIC-3.VDM. It may also be run independently. (Do not modify this file). RELAY.VDM Generate output column numbers from a data layout (.lay) file. Handles quoted-comma-delimited fields and user specified field separators. This is used by COBOL2V.VDM. It may also be run independently. (Do not modify this file). REGPREP.VDM Used by EBCDIC-3.VDM to streamline the macro code, making it run faster. (Do not modify this file). EBCDIC.LAY A simple prototype data layout file to help you get started. It supports EBCDIC files with uniform fixed-length records. NOTE: You must edit this file, as described below, before converting any files. MULTIREC.LAY A prototype data layout file to help you get started. It supports EBCDIC files containing several types of records. EXTRACT.LAY A prototype data layout file to extract records into files according to their record type. (See topic "Divide and Conquer"). NEWLINE.LAY A prototype data layout file to append ASCII carriage-return and line-feed characters to the end of each EBCDIC data record. See the topic "Viewing EBCDIC Data Files". MRSAMP.COP A prototype COBOL copy-book that describes two record types that has been edited for input to COBOL2V.VDM. 
The MRSAMP set of files provide examples of newer features added to the conversion package, including DBASE output. MRSAMP.EBC A sample multi-record EBCDIC data file described by the copy-book MRSAMP.COP. MRSAMP.L The output of COBOL2V.VDM on the file MRSAMP.COP. MRSAMP.LAY Result of editing MRSAMP.L. Fully functional layout for translating the sample file MRSAMP.EBC. MRSAMPD.LAY A layout for outputting data into a DBASE formatted file. SGSAMP.EBC Same as MRSAMP.EBC. Used with SGSAMP.LAY. SGSAMP.LAY A prototype, multi-record, data layout file using segments. COBOL.COP A sample COBOL "Copy-Book" corresponding to sample data file COBOL.EBC. COBMIX.LAY A prototype data layout file produced from COBOL.COP, above. Includes some initial .LAY specifications. COBOL.LAY A prototype data layout file produced from COBMIX.LAY, above. Converted by COBOL2V.VDM and RELAY.VDM into standard .LAY format. COBOLD.LAY A prototype layout file to produce DBASE output. Not supplied. Generated as stated in "Quick DBASE Test", q.v. COBOL.EBC A small example EBCDIC file with a layout corresponding to COBOL.COP. COBOL.ASC The converted ASCII file made from the example COBOL.EBC file. COBOL.DBF The converted ASCII file in DBASE format made from COBOL.EBC via COBOLD.LAY. EBCDIC-3.BAT A batch file for automating the conversion process. Win 95/98/NT/2000/XP users can create a shortcut icon to this batch file for "drag and drop" conversion. EBCDIC-3.TXT This documentation file. LEADZERO.CUS An example of a custom macro for processing custom fields. This macro converts unpacked EBCDIC numbers to ASCII and then strips any leading zeroes. It optionally inserts a period before the second-last or third-last digit. EBCDIL.TBL Alternative EBCDIC to ASCII translation table to override the default table EBCDIC.TBL. Converts low bytes (0x00 - 0x3f) in text fields to spaces. See topic "Alternative EBCDIC to ASCII Translation Tables", below. 
EBCDIQ.TBL Alternative translation table that converts quotes to apostrophes. May be needed when output is quoted and comma delimited. EBCDILQ.TBL Alternative translation table that combines the above two tables. Quick Test ---------- You can perform a quick test by converting the supplied COBOL.EBC file using the supplied COBOL.LAY data layout (COBOL copy-book) file. From a DOS/NT command box, change to the directory containing VEDIT and into which you placed the EBCDIC conversion file. This is typically "c:\vedit". Then, give the command: vpw -x ebcdic-3.vdm cobol.ebc cobol.lay This will leave you in the VEDIT editor with the converted file displayed. Alternatively, to automatically save the file and exit, give the command: vpw -y -x ebcdic-3.vdm cobol.ebc cobol.lay The following section "Performing the Conversion (Automatically)" describes this in more detail. NOTE: The easiest way to perform this quick test is to open a DOS box in Windows, change to the directory containing the VEDIT file, (e.g. with the DOS command "cd \vedit") and then enter the commands above. You could run the quick test by selecting "Run" from the "Start" button, but then you will have to enter the full pathname to each file in the command. Once you have everything set up, you can create an icon on the Windows Desktop to perform the conversion via drag-and-drop. 1. Right-click on the desktop and select New -> Shortcut. 2. For the "Command line" or "Location of item", enter: c:\vedit\ebcdic-3.bat "%1" (This assumes VEDIT was installed into "c:\vedit"). 3. For the "Name of shortcut", enter any desired name, such as "EBCDIC conversion". 4. You will now have an icon on your desktop. Creating the .LAY Layout File ----------------------------- The user must specify via a ".LAY" data layout file, typically EBCDIC.LAY, the location of the packed, zoned and other special fields. The file's record length and several options are also specified in the data layout file. 
The data layout file is best understood from a simple example. Assume an EBCDIC file with the following specs:

    Record length is 114
    Field between columns 10 and 14 is a packed-decimal number (unsigned)
    Field between columns 20 and 21 is a packed-decimal number (unsigned)
    Field between columns 22 and 23 is a packed-decimal number
    Field between columns 30 and 33 is a packed-binary number
    Field between columns 42 and 46 is a zoned number
    Field between columns 51 and 55 is an unsigned number
    Field between columns 56 and 59 is a short (single precision) floating point number
    Field between columns 60 and 67 is a long (double precision) floating point number
    All other fields are simple EBCDIC text

To convert this file and create a Windows/DOS text file with a Carriage-Return and Line-Feed at the end of each record, the following layout file is used: (This is the supplied prototype EBCDIC.LAY file)

r=114,0          //Record length is 114, convert to DOS newlines
o=z,b2z          //Output leading zeros (blank '+' sign; sign at end)
                 //Convert all-blank/null numeric fields to zeros
e=0,ebcdic.err   //Only report serious errors in EBCDIC.ERR
d 10-14 u v2     //Packed-decimal (no sign, explicit decimal point)
d 20-21 u        //Packed-decimal (no sign)
d 22-23          //Packed-decimal
b 30-33,=9;      //Packed-binary (9 digits; default is 8)
z 42-46          //Zoned number (5 digits plus sign)
u 51-55          //Unsigned number; want leading blanks converted to zeros
                 //(o=b2z); otherwise redundant
l 56-59,=7; v2   //Short (single precision) floating point; up to 7 digits with two past a decimal point
l 60-67,=12; v2  //Long (double precision) floating point; up to 12 digits, 2 fractional

As shown above, comments can be added to the layout file by preceding them with "//".

The "r=114,0" indicates that this file has fixed-length records of 114 bytes. Most EBCDIC files have fixed-length records. Note that unpacking fields will increase the record length.
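As an illustration of why unpacking increases the record length, the following Python sketch decodes a packed-decimal field. It is not part of the VEDIT package and does not show how EBCDIC-3.VDM works internally. The function name unpack_packed_decimal is hypothetical, and the sketch assumes the common IBM convention of two 4-bit digits per byte with the sign code in the final half-byte (C, A or E for positive, D or B for negative, F for unsigned).

```python
def unpack_packed_decimal(field: bytes) -> tuple[str, str]:
    """Return (digits, sign) for a packed-decimal field.

    Assumption: every byte holds two 4-bit digits, except the last
    byte, whose low half-byte is the sign code.
    """
    nibbles = []
    for byte in field:
        nibbles.append(byte >> 4)      # high half-byte
        nibbles.append(byte & 0x0F)    # low half-byte
    sign_code = nibbles.pop()          # last half-byte is the sign
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit in packed-decimal field")
    if sign_code in (0x0D, 0x0B):
        sign = "-"
    elif sign_code in (0x0C, 0x0A, 0x0E):
        sign = "+"
    elif sign_code == 0x0F:
        sign = ""                      # unsigned value
    else:
        raise ValueError("invalid sign code in packed-decimal field")
    return "".join(str(n) for n in nibbles), sign


if __name__ == "__main__":
    # A 5-byte packed field such as columns 10-14 in the example above.
    digits, sign = unpack_packed_decimal(bytes.fromhex("001234567F"))
    print(len(digits), repr(digits), repr(sign))
```

Under these assumptions a 5-byte packed field holds 9 digits, so its unpacked text form is nearly twice as long as the input field, before any sign or decimal point is added.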
The ",0" indicates that a Carriage-Return and Line-Feed will be appended after each record in order to create a Windows/DOS text file. Use ",1" to append just a Line-Feed to create a UNIX text file. Note: With multiple types of records, each record type can have a different record length specified with the "l=" parameter, as described below. In this case, the record length set with "r=" is ignored. The "o=z,b2z" specifies that leading zeros are to be output, rather than spaces and that all-blank/null numeric fields are to be converted to text zeros. The conversion options are described below. The "d" specifies a packed-decimal field. It must be followed by the beginning and ending column numbers of the field. The "u" in the parameter field specifies that this value is unsigned, so no sign is output, not even a space. Note: a positive or negative sign in the input data will generate an error. The "v2" specifies that an explicit decimal point is to be output, followed by the last two digits of the packed decimal value. The "b" specifies a packed-binary field. The maximum size of a packed- binary field is eight bytes. Ordinarily, a 4-byte binary field is output in an eight digit field. This can be changed with the ",=ndigits;" parameter as described in the later topic "Formatting Numeric Data". The "z" specifies a zoned number field. In a zoned field, the sign is packed into the last digit; the other digits are not packed. The output field is one greater than the input field so that the sign can be expressed. The "u" specifies a simple unsigned (non-packed) number. Since this is no different from other EBCDIC text, it need not be included. Using it here ensures that any leading spaces will be converted to zeros and allows all-blank/null numeric fields to be converted to zeros with the "o=b2z" instruction. Finally, the "l" specifies a short (single precision) or long (double precision) floating point number. The short value is four bytes in length; the long is eight. 
In the above example, the short value is expressed in a seven digit field including the two fractional digits. The output field is 9 bytes long to account for the sign and the explicit decimal point. Similarly, the long floating point value is output in a twelve digit field. Any fractional digits contained in the actual data beyond the two that are expressed are truncated. If the first truncated digit is five or greater the final expressed digit is rounded up. The complete list of possible field types is: b Packed-binary field d Packed-decimal field n Packed-No-Zone (PNZ); no sign; additional digit can be packed. z Zoned number field s Signed ASCII decimal field; analogous to zoned decimal. '{', 'A'-'R' and '}' represent signed digits. Used when converting an ASCII file with compressed fields. Should be used with "u=ignore" to set the default conversion. u Unsigned numeric field. Simple EBCDIC. Using it allows explicit decimal points to be output or leading spaces converted to zeros or having an all blank/null field converted to zeros. e EBCDIC field. This is redundant and slows down conversion, but may help some users better visualize their file. Necessary when quoted-and-comma-delimited or user-specified delimiter or DBASE output is specified. Also required for the first or last item of an "out" bracketed section. h Hex field. Output each byte as 2 hexadecimal digits. i Ignore field. The bytes in this field will be passed on "as is". l Floating point number. See topic "Floating Point Numbers". f Fill field. The bytes in this field are replaced (filled) with ASCII spaces, or another specified padding character. x Delete the field. c Custom field. A custom macro is used to process this field. The "b", "d", "l", "u" and "z" specifications can also take several options which specify exactly how the number is converted. 
If no options are specified, numbers are converted as follows: * Output space instead of '+' for positive values * Place the sign at the end of the number * Output numbers as signed values * Output leading zeros instead of spaces (due to "o=z") The following options, which are specified after the column numbers, control the conversion: (These override any "o=..." command settings). + Use an explicit "+" sign for positive numbers, else a Space is used. Negative numbers are always converted with "-". b Place the sign at the beginning of the number, else the sign is at the end of the number. u The data is expressed as an unsigned number. This option overrides the "+" and "b" options. z Use leading "0" in numbers; else leading zeros are replaced with Spaces. b2z Convert all-blank/null numeric fields to zeros. See also the topics "Formatting Numeric Data" and "Floating Point Numbers". Since it would be tedious to specify the same conversion options for every field, you can specify the default options with the "o=..." command. For example, "o=z" specifies that the converted numbers should have leading "0" instead of spaces. The command "o=+bz" would place the sign, including '+', and any leading zeros ahead of all unpacked numeric values by default. A .LAY file should have at most one "o=..." command. For some fields, you may want to disable one or more of the default options set with the "o=..." command. This can be done with the following options: - To use Space instead of "+" for positive numbers. e To place the sign at the end of the number. s The data is expressed as a signed value. p To use blank padding instead of zeros to right justify small numbers. NOTE: These field options are only needed to override the default options set with the "o=..." command. 
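The effect of the "+", "b", "u" and "z" options and of an explicit decimal position can be sketched in Python as follows. This is only an illustration under stated assumptions, not the macro's actual code: the function format_number and its parameter names are hypothetical, and the exact field widths and the position of the sign relative to any blank padding in real VEDIT output may differ.

```python
from typing import Optional


def format_number(digits: str, negative: bool = False, *,
                  plus: bool = False,           # "+" option
                  sign_first: bool = False,     # "b" option
                  unsigned: bool = False,       # "u" option
                  leading_zeros: bool = False,  # "z" option
                  decimals: Optional[int] = None) -> str:  # "vN" option
    """Format an unpacked digit string according to the field options."""
    body = digits
    if not leading_zeros:
        # Blank out leading zeros, but always keep one digit ahead of
        # the fractional digits (assumption).
        keep = (decimals or 0) + 1
        head, tail = body[:-keep], body[-keep:]
        body = head.lstrip("0").rjust(len(head)) + tail
    if decimals is not None:
        split = len(body) - decimals
        body = body[:split] + "." + body[split:]
    if unsigned:
        sign = ""                      # "u" overrides "+" and "b"
    elif negative:
        sign = "-"
    else:
        sign = "+" if plus else " "
    return sign + body if sign_first else body + sign


if __name__ == "__main__":
    print(repr(format_number("00123", leading_zeros=True)))
    print(repr(format_number("00123", negative=True, sign_first=True)))
    print(repr(format_number("001234567", unsigned=True,
                             leading_zeros=True, decimals=2)))
```

With no options other than leading zeros, the sketch appends a space in place of the "+" sign, which corresponds to the default behavior listed above when "o=z" is in effect.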
The numeric field specifications "b", "d", "l", "n", "u" and "z" (binary, packed decimal, floating point, packed-no-zone, unsigned numeric and zoned decimal) can also take an optional explicit decimal point position. v1 Place the decimal point "." one digit from the right. v2 Place the decimal point "." two digits from the right. v3 Place the decimal point "." three digits from the right. For example, the layout file above could be modified with these options: r=114,0 //Record length is 114, convert to DOS newlines o=-ep,b2z //Standard output default options //Also, convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR d 10-14 b //Packed-decimal; place sign at beginning d 20-21 zv2 //Packed-decimal; include leading "0" // place decimal point two digits from right d 22-23 +b //Packed-decimal; include "+" at beginning b 30-33 z //Packed-binary; include leading "0" //Note: output in an eight-digit field z 42-46 b //Zoned number; place sign at beginning u 51-55 z //Unsigned number in columns 51 thru 55; want leading blanks //and all-blank/null fields converted to zeros; else redundant l 56-59 //Short (single precision) floating point integer; 7 digit // field with no decimal point l 60-67 v0 //Long (double precision) floating point integer; 16-digits; // decimal point but no fractional digits Sometimes it is more convenient to specify the type and size of each field instead of its beginning and ending columns. The following variation of the original example shows this. 
Note how the "+" is used:

r=114,0          //Record length is 114, convert to DOS newlines
o=z,b2z          //Output leading zeros (blank '+' sign; sign at end)
                 //Convert all-blank/null numeric fields to zeros
e=0,ebcdic.err   //Only report serious errors in EBCDIC.ERR
e +9             // 9 columns of EBCDIC text (1 - 9)
d +5             // 5 columns of Packed-decimal (10-14)
e +5             // 5 columns of EBCDIC text (15-19)
d +2             // 2 columns of Packed-decimal (20-21)
d +2             // 2 columns of Packed-decimal (22-23)
e +6             // 6 columns of EBCDIC text (24-29)
b +4             // 4 columns of Packed-binary (30-33)
e +8             // 8 columns of EBCDIC text (34-41)
z +5             // 5 columns of Zoned number (42-46)
e +4             // 4 columns of EBCDIC text (47-50)
u +5             // 5 columns of EBCDIC digits (51-55)
l +4             // 4 columns of short (single precision) floating point (56-59)
l +8             // 8 columns of long (double precision) floating point (60-67)
e +47            //47 columns of EBCDIC text (68-114)

The name of the data layout file can be either EBCDIC.LAY or filename.LAY, where 'filename' is the name of the file being converted. The name of the layout file can also be explicitly specified. This is described below under "Conversion Options".

Floating Point Numbers
----------------------

The EBCDIC conversion packages support IBM 360-style floating point numbers in the range 10E-18 through 10E18. Output is fixed format, right justified. The field is specified like the other numeric fields; e.g.,

    l bc,ec[,=ndigits;] [sop] [v[n]] // comment

where the field between the beginning and ending columns must be four or eight bytes long; "sop" are the standard options "+buz-esp", 'v' specifies that an explicit decimal point is to be output and 'n' specifies the number of digits to output past the decimal point.

The optional "ndigits" specifies the total number of digits output, both before and after the decimal point (See "Formatting Numeric Data"). It must be sufficiently large to express the whole-number digits as well as the fractional digits "vn".
There will always be at least one digit before the decimal point, even if just zero. The default "ndigits" is 7 for short floats and 16 for long floats.

Floating point numbers with magnitudes greater than 10E18 are treated as erroneous; their output fields are filled with BAD_CHARs and error messages are generated in EBCDIC.ERR. Numbers with magnitudes between zero and 10E-18 are output as zeros.

(Technical) Floating point numbers have an intrinsic truncation error associated with them. IBM's standard floating point number has 6 significant hexadecimal digits and a sign and exponent packed into 4 bytes, with a certain amount of truncation error associated with the 6th digit. This can be transformed into a decimal representation of about 7 digits with a significant amount of error in the eighth digit. Accordingly, the eighth digit is truncated and the preceding digits are rounded up when the truncated digit was greater than or equal to five. For double precision (eight bytes), seventeen significant decimal digits can be output.

NOTE: The EBCDIC conversion packages correctly handle these truncation issues according to the IBM documentation. If the conversion instruction specifies fewer significant digits than can be validly expressed, the last digit output is rounded up when the non-displayed subsequent digit is five or greater. Double rounding (from 4 to 5 during conversion and then from 5 to 6 for displaying) is prevented. If the conversion instruction specifies more digits than can be validly expressed, zeros are used for the excess positions.

Using a COBOL Copy-Book for the .LAY Layout File
------------------------------------------------

For simple data files containing neither REDEFINES nor OCCURS clauses, it is possible to "cut and paste" the COBOL copy-book into the .LAY layout file. The COBOL copy-book must be preceded by a "p=" line and followed by a "q=" line.
The following example converts the same file in the same way as the original example (this is the supplied prototype COBMIX.LAY file): r=114,0 //Record length is 114, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR // p= //Start of COBOL data specifications //Vedit comments and blank lines are OK after the "p=" line *************************************************************** 000100 * * 000110 * SAMPLE COPY-BOOK * 000120 * * 000130 * RECSIZE = 114 BYTES * 000140 * * 000150 *************************************************************** 000160 000170 01 SAMPLE-REC. 000180 03 SA-KEY. 000190 10 SA-KEY-ALPHA PIC X(7). 000200 10 SA-KEY-NUMERIC PIC 9(2). 000250 03 SA-PAC-DEC-1 PIC 9(7)V99 COMP-3. 000300 03 FILLER PIC X(5). 000350 03 SA-PAC-DEC-2 PIC 999 COMP-3. 000400 03 SA-PAC-DEC-3 PIC S9(3) COMP-3. 000450 03 FILLER PIC X(6). 000500 03 SA-SIGNED-BINARY PIC S9(9) COMP. 000550 03 FILLER PIC X(8). 000600 03 SA-ZONED-DECIMAL PIC S9(5). 000650 03 FILLER PIC X(4). 000700 03 SA-ANOTHER-NUMERIC PIC 9(5). 000750 03 SA-FLOAT PIC S9(5)V99 COMP-1. 000800 03 SA-DOUBLE PIC S9(10)V99 COMP-2. 000850 03 FILLER PIC X(47). 000900 q= //End of COBOL data specifications Notes: The COBOL statement lines must be included verbatim, maintaining the 1-6,7,8-72,73-80 columnar format. For statements covering more than 1 line, the entire statement must be included, not just the picture format. Other than COBOL statement lines, only blank lines, Vedit comment lines and Column setting "c=col" lines (see below) may appear between the "p=" and "q=" lines. Alternatively, each individual packed field's COBOL statement can be cut and pasted into the layout file. 
It is necessary to indicate the beginning column number of each non-contiguous field, as in the following example: r=114,0 //Record length is 114, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR // p= //Start COBOL section //Columns 1-9 are simple EBCDIC c=10 //Starting column # for upcoming packed-decimal field 03 SA-PAC-DEC-1 PIC 9(7)V99 COMP-3. 000300 //Columns 15-19 are simple EBCDIC c=20 //Starting column # for upcoming packed-decimal fields 03 SA-PAC-DEC-2 PIC 999 COMP-3. 000400 03 SA-PAC-DEC-3 PIC S9(3) COMP-3. 000450 //Columns 24-29 are simple EBCDIC c=30 //Starting column # 03 SA-SIGNED-BINARY PIC S9(9) COMP. 000550 //Columns 34-41 are simple EBCDIC c=42 //Starting column # 03 SA-ZONED-DECIMAL PIC S9(5). 000650 //Columns 47-50 are simple EBCDIC c=51 //Starting column # 03 SA-ANOTHER-NUMERIC PIC 9(5). 000750 03 SA-FLOAT PIC S9(5)V99 COMP-1. 000800 03 SA-DOUBLE PIC S9(10)V99 COMP-2. 000850 //Columns 68-114 are simple EBCDIC q= //End COBOL section More complicated copy-books with "OCCURS" and "REDEFINES" clauses should first be converted to a normal .LAY file with the supplied COBOL2V.VDM macro as described below. Performing the Conversion (Manually) ------------------------- NOTE: The conversion is normally performed automatically (see below), but you can also run it manually. To convert files manually, it is necessary to have the data layout file named EBCDIC.LAY in the current or VEDIT Home Directory. 1. Open the EBCDIC file to be converted. NOTE: Steps 2. thru 4. are optional; they just let you review the EBCDIC file before it is converted. 2. Assuming the file has fixed-length records, set the correct record length with {CONFIG, File handling, File type}. 3. Press (the hot-key for {VIEW, Toggle display mode}) several times to switch to EBCDIC mode. The non-packed fields should now be readable. 4. 
Open the file EBCDIC.LAY and ensure that it is correct. In particular, the "r=" should be set to the same record length that you set above. Switch back to the data file's buffer. 5. Select {MISC, Load/Execute macro}. Enter the file name "ebcdic-3.vdm" or select this file in the dialog box. Use the default register number "100". 6. The macro will start, display its sign-on message and immediately begin performing the conversion. It displays a progress message of how many records have been converted. The conversion creates a new file with the same filename, but an ".ASC" extension. 7. When done, the beginning of the converted file will be displayed on the screen. 8. Select {FILE, Exit} to exit the editor. Performing the Conversion (Automatically) ------------------------- The conversion can be performed automatically from a command line or with a DOS batch file. Win 95/98/NT/2000/XP users can create a shortcut icon to the batch file; you can then drag-and-drop the file to be converted onto the icon. 1a. From a DOS or NT command line, use the following command to run the 32-bit Windows VEDIT with the conversion macro: c:\vedit\vpw -x ebcdic-3.vdm filename where 'filename' is the name of the file to be converted. (This assumes VEDIT was installed in c:\vedit.) 2a. This will immediately start the conversion. When done, the converted file will be saved and shown on the screen. You must still exit the editor. To also automatically exit, use the command: c:\vedit\vpw -y -x ebcdic-3.vdm filename This will convert and save the file, and exit VEDIT without any user intervention whatsoever. -OR- 1b. Use the supplied EBCDIC-3.BAT batch file which contains the command to run the 32-bit Windows VEDIT and automatically convert the specified file: c:\vedit\vpw -x ebcdic-3.vdm %1 %2 Therefore, to convert the file with name 'filename', give the command: ebcdic-3 filename 2b. This will convert and save the file, and exit VEDIT without any user intervention whatsoever. 
If desired, you can have VEDIT display the converted file before saving and exiting. The comments in the EBCDIC-3.BAT file describe how to do this. NOTE: These commands assume that the EBCDIC data file, EBCDIC-3.BAT and EBCDIC.LAY are in the current directory. If not, you may need to specify the full pathname to the files. EBCDIC-3.BAT also assumes that VEDIT was installed into the "C:\VEDIT" directory. If you installed VEDIT into another directory, e.g. "C:\Program Files\VEDIT", you must edit the EBCDIC-3.BAT file to specify the correct directory. Conversion Options ------------------ By default, the name of the data layout file is EBCDIC.LAY. Optionally, you can use "filename.LAY", where 'filename' is the name of the file being converted. You can also explicitly specify the name of the .LAY file. EBCDIC.LAY is convenient when you have many files of the same type; filename.LAY is better when each file is of a different format. The "-n" invocation switch causes filename.LAY to be used as the layout file. Therefore, use the following command to automatically run the conversion macro: c:\vedit\vpw -n -x ebcdic-3.vdm datafile.ebc where 'datafile.ebc' is the name of the file to be converted. datafile.lay will be used as the name of the layout file. Alternatively, you can explicitly specify the name of the layout file. This is necessary when the .LAY layout file is in a different directory. Use the command form: c:\vedit\vpw -x ebcdic-3.vdm datafile.ebc layout.lay where 'datafile.ebc' is the name of the file to be converted. 'layout.lay' is the name of the data layout file. Using Different Directories --------------------------- By default, the EBCDIC source file "datafile.ebc", the data layout file EBCDIC.LAY and the resulting ASCII file "datafile.asc" must all be in the same directory. 
However, you can specify separate directories with the command: vpw -x ebcdic-3.vdm \dir1\datafile.ebc -a \dir2\datafile.asc \dir3\layout.lay where '\dir1\datafile.ebc' is the pathname of the file to be converted. '\dir2\datafile.asc' is the pathname of the converted file. '\dir3\layout.lay' is the name of the data layout file. This format lets you specify the full pathname of the converted file and of the data layout file. -------------------------------- SUPPORTING MULTIPLE RECORD TYPES -------------------------------- The EBCDIC Level-3 conversion package also supports multiple record types per file. Each record type can vary in length and field specifications. Any record type can optionally be output to its own file. The data description for each record type begins with a 't' and ends at the next 't' or 'seg' specification or end of file. Each 't' is followed by three or four numbers which are used to identify each record type using a record ID. t col-t,mask,min,[max] col-t Specifies the starting column (counting from 1) which contains the record ID. The record ID can be 1, 2 or 3 bytes long. mask This mask is AND'd with the record ID before making the comparison. This allows matching complex record ID ranges. For simple ID's, use a mask of "0xFF" for a 1-byte ID, "0xFFFF" for a 2-byte ID and "0xFFFFFF" for a 3-byte ID. min This is the minimum value that the record ID (after masking) must be. For simple ID's, this is simply the value of the ID. For example, if the ID is "21", this value should be set to "0xF2F1" which is the EBCDIC representation of "21". max This is the maximum value that the record ID (after masking) can be. This number is not needed for simple ID's. It is only needed for complex ID's, where one type of record might have a range of ID's. The 't' line is then followed with the field specifications as described above. The next 't' line starts a new record description. 
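The matching rule for a 't' line can be sketched in Python as follows. This is a hedged illustration, not the macro's actual code: the function matches_type is hypothetical, the ID length is assumed to follow from the width of the mask unless given explicitly, and the ID bytes are assumed to be compared as a single big-endian number.

```python
from typing import Optional


def matches_type(record: bytes, col: int, mask: int, lo: int,
                 hi: Optional[int] = None,
                 id_len: Optional[int] = None) -> bool:
    """Return True if the record ID selects this record type.

    col    : starting column of the ID, counting from 1 ("col-t")
    mask   : value AND'd with the ID before the comparison
    lo, hi : "min" and optional "max"; without "max" the masked ID
             must equal "min"
    id_len : ID length in bytes (1 to 3); assumed from the mask width
             when omitted
    """
    if id_len is None:
        id_len = max(1, (mask.bit_length() + 7) // 8)
    start = col - 1                     # columns are 1-based
    raw = record[start:start + id_len]
    if len(raw) < id_len:
        return False                    # record too short to hold the ID
    masked = int.from_bytes(raw, "big") & mask
    if hi is None:
        return masked == lo
    return lo <= masked <= hi


if __name__ == "__main__":
    # EBCDIC "21" (0xF2 0xF1) in columns 5-6, padded with EBCDIC spaces.
    rec = b"\x40" * 4 + bytes([0xF2, 0xF1]) + b"\x40" * 10
    print(matches_type(rec, 5, 0xFFFF, 0xF2F1))          # simple ID
    print(matches_type(rec, 5, 0xFFFF, 0xF2F0, 0xF2F9))  # ID range "20"-"29"
```

A mask such as 0xFFFF leaves the two ID bytes unchanged, so a simple ID needs only the "min" value; a "max" value turns the test into a range check.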
Notes:

  For ID's longer than 3 bytes, or for record types determined by more
  than one field, the record sub-type specification 'st' can be used
  immediately following the 't' specification. See EBCDIC-3.VDM for
  details.

  A custom macro selected with "tc" can be used to identify more complex
  types of records. This topic is too complex for this discussion, but
  you may see it if you contracted Greenview to create a custom .LAY file.

The 't' line is typically immediately followed by an "l=" line which sets
the record length for this record type. If all record types have the same
record length, the "l=" lines can be omitted; the record length is then set
by the initial "r=" line.

In some EBCDIC files the first record is a header which has a different
layout from the following records. Although it may be possible to set up
the correct "t" specification to identify the header, it is often easier to
use "th", which describes the first record in the file. "th" does not take
any parameters, but is typically followed by an 'l=' line to specify the
header length.

An optional "default" record type can also be specified. If used, it must
be the last record description except for any "segment" definitions
(regarding which, see below). The default record data description begins
with "td". The "td" does not take any arguments.

The following sample .LAY layout file specifies three explicit types of
records, a header and a default record type. It is supplied as the file
MULTIREC.LAY.

r=114,0                 //Record length is 114, convert to DOS newlines
o=z,b2z                 //Output leading zeros (blank '+' sign; sign at end)
                        //Convert all-blank/null numeric fields to zeros
e=0,ebcdic.err          //Only report serious errors in EBCDIC.ERR

th                      //Treat first record as a header
l=64                    //  Header is 64 bytes long
d 20-24                 //  Packed-decimal in columns 20-24
e 25-64                 //  Remaining columns are simple EBCDIC

t 5,0xFFFF,0xF2F1       //Record ID "21" in column 5 (and 6) and...
st 15,0xFFFF,0xF8F0     //  "80" in column 15 (and 16)
l=82                    //  Record length is 82
d 10-14                 //  Packed-decimal in columns 10-14
d 20-21                 //  Packed-decimal in columns 20-21
d 22-23                 //  Packed-decimal in columns 22-23
d 24-29                 //  Packed-decimal in columns 24-29
b 30-33                 //  Packed-binary in columns 30-33
z 42-46                 //  Zoned number in columns 42 thru 46
e 47-82                 //  Remaining columns are simple EBCDIC

t 5,0xFFFF,0xF2F2       //Record ID "22" in column 5 (and 6)
l=110                   //  Record length is 110
b 15-18                 //  Packed-binary in columns 15-18
d 19-21                 //  Packed-decimal in columns 19-21
d 60-66                 //  Packed-decimal in columns 60-66
e 67-110                //  Remaining columns are simple EBCDIC

t 5,0xFFFF,0xF2F3       //Record ID "23" in column 5 (and 6)
l=148                   //  Record length is 148
b 30-33                 //  Packed-binary in columns 30-33
b 45-46                 //  Packed-binary in columns 45-46
d 98-102                //  Packed-decimal in columns 98-102
d 110-114               //  Packed-decimal in columns 110-114
e 115-148               //  Remaining columns are simple EBCDIC

td                      //The default record type
l=114                   //  Record length is 114
d 16-19 +bz             //  Packed-decimal in columns 16-19


Extracting Data by Record Type
------------------------------

It is often desirable to output each kind of record into a separate file.
For example, if your EBCDIC file contains three different kinds of records,
you might want to create three ASCII files, one per record type.

To select a separate output file for each record type, precede the "t",
"th", or "td" lines with "tx" followed by the filename in double quotes.

    tx "filename.ext"

where 'filename.ext' is the desired filename.
Consider the following example .LAY file which converts two types of records: r=80,0 //Record length is 80, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR tx "id1.asc" //Output the next record type into ID1.ASC t 5,0xFF,0xF1 //Record ID "1" in column 5 l=80 // Record length is 80 e 1-9 // Simple EBCDIC text in columns 1-9 d 10-14 // Packed-decimal in columns 10-14 d 20-21 // Packed-decimal in columns 20-21 d 22-23 // Packed-decimal in columns 22-23 e 24-80 // Remaining columns are simple EBCDIC tx "id2.asc" //Output the next record type into ID2.ASC t 5,0xFF,0xF2 //Record ID "2" in column 5 l=80 // Record length is 80 e 1-9 // Simple EBCDIC text in columns 1-9 d 10-14 // Packed-decimal in columns 10-14 e 20-21 // Simple EBCDIC text in columns 20-21 d 22-23 // Packed-decimal in columns 22-23 e 24-80 // Remaining columns are simple EBCDIC This conversion will create the two output files ID1.ASC and ID2.ASC; each will only contain one kind of record. 
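To make the effect of the "tx" lines concrete, here is a rough Python sketch of the same routing idea applied to raw, unconverted records. It is only an illustration and not how EBCDIC-3.VDM works internally; the function name and the "routes" table are hypothetical, and records whose ID is not listed are simply skipped here, which may not match the package's behavior.

```python
from collections import defaultdict


def split_records(data: bytes, reclen: int, id_col: int, routes: dict) -> dict:
    """Group fixed-length records by the 1-byte record ID found in id_col.

    'routes' maps an ID byte value (e.g. 0xF1 for EBCDIC "1") to an output
    file name. Assumption: unmatched records are dropped in this sketch.
    """
    groups = defaultdict(list)
    for start in range(0, len(data) - reclen + 1, reclen):
        record = data[start:start + reclen]
        name = routes.get(record[id_col - 1])   # columns count from 1
        if name is not None:
            groups[name].append(record)
    return dict(groups)


if __name__ == "__main__":
    def make(id_byte: int) -> bytes:
        # 80-byte record of EBCDIC spaces with the ID in column 5
        return bytes([0x40] * 4) + bytes([id_byte]) + bytes([0x40] * 75)

    data = make(0xF1) + make(0xF2) + make(0xF1)
    out = split_records(data, 80, 5, {0xF1: "id1.asc", 0xF2: "id2.asc"})
    for name, records in sorted(out.items()):
        print(name, len(records))
```

With the three test records above, two land under "id1.asc" and one under "id2.asc", mirroring the one-file-per-record-type result described for the .LAY example.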
--------------- ADVANCED TOPICS --------------- Quoted-Comma-Delimited ASCII File --------------------------------- Consider the following relatively simple .LAY file: r=76,0 //Record length is 76, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR e +9 // 9 columns of EBCDIC text (1 - 9) d +5 // 5 columns of Packed-decimal (10-14) e +5 // 5 columns of EBCDIC text (15-19) d +2 // 2 columns of Packed-decimal (20-21) d +2 // 2 columns of Packed-decimal (22-23) e +6 // 6 columns of EBCDIC text (24-29) b +4 // 4 columns of Packed-binary (30-33) e +8 // 8 columns of EBCDIC text (34-41) z +5 // 5 columns of Zoned number (42-46) e +4 // 4 columns of EBCDIC text (47-50) u +5 // 5 columns of EBCDIC digits (51-55) e +21 //21 columns of EBCDIC text (56-76) One converted ASCII record might look like this: ABCDEFGHI123456789 ABCDE123 123 ABCDEF1234567890ABCDEFGH12345 ABCD12345ABCDE... It is difficult to tell where one field ends and the next begins. Therefore, the conversion can optionally quote each field and separate them with commas. Therefore, the converted ASCII record would look like this: "ABCDEFGHI","123456789 ","ABCDE","123 ","123 ","ABCDEF","123456789 ","ABCDEFGH","12345 ","ABCD","12345","ABCDE..." To select this option, add the "qcd" command to the .LAY file: r=76,0 //Record length is 76, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros qcd //Create quoted-comma-delimited ASCII e +9 //9 columns of EBCDIC text (1 - 9) ... Please be aware of these notes when creating a quoted-comma-delimited file: * The "qcd" command must appear in the .LAY file before any field specs. * All fields, including simple EBCDIC text fields, must be specified. * Only use the "qcd" command when necessary - it slows down the conversion by about 20%. 
  Although it makes the ASCII file more humanly readable, most database
  programs, such as Microsoft Access (tm), can import based on beginning
  and ending columns and don't need a quoted-comma-delimited file.

* When quoting and comma delimiting output fields, any quotes that appear
  in text fields may confuse programs that later try to process the output.
  To automatically convert quotes to apostrophes, use the EBCDIQ.TBL
  conversion table as explained in the topic "Alternative EBCDIC to ASCII
  Translation Tables", below.


User Specified Field Delimiter String
-------------------------------------

As an alternative to quoting and comma delimiting each field, a custom
string can be used to separate fields. E.g., to separate the fields with
" | " (space, pipe, space), place the following command at the beginning of
the .LAY layout file:

    v=Reg_Set(XDREG,/ | /);

This is an example of the "v=" specification, which executes VEDIT macro
commands. The text register specified by internal value XDREG (currently
17, when not being run under WILDFILE.VDM) is set to the specified string.
EBCDIC-3.VDM appends T-Reg(XDREG) to the end of each specified output field
and to the end of each set of unspecified (default) fields.

Note: To use the "/" character as part of the string, the string delimiting
      "/" may be replaced by any of '"`\.


Alternative EBCDIC to ASCII Translation Tables
----------------------------------------------

It is sometimes necessary or desirable to translate EBCDIC text characters
to a different ASCII character than normal. Three alternative translation
tables are supplied for this purpose: EBCDIQ.TBL, which converts quotes to
apostrophes; EBCDIL.TBL, which converts "low values" (0x00 - 0x3f) into
ASCII spaces; and EBCDILQ.TBL, which does both.

The "Q" tables are provided for use when output is being quoted and comma
delimited.
The "L" tables are provided for cases when uninitialized text fields or, perhaps, undocumented packed fields are present in supposedly simple text fields. The result of treating them as text is gibberish in the ASCII output. To use one of these tables, include the following, appropriately edited, line near the beginning of your .lay file: v=Translate_Load("EBCDIQ.TBL") // Convert EBCDIC " to ASCII ' This assumes that the table has been copied to your VEDIT home directory or is in the directory containing the data file being translated. Formatting Numeric Data ----------------------- The numeric field specifiers "b", "d", "l", "n", "u" and "z" have an associated default output width dependent on the individual data type, the size of the input field, the presence and location of a sign and the presence of an explicit decimal point. The location of the sign (plus, minus or blank) has been well explained in the topic "Creating the .LAY Layout File". The sign may also be suppressed by the unsigned option 'u'. E.g., "b 1,4 u". The output for this binary field is eight columns - the standard number of digits stored in a 4-byte binary. There is no sign to be expressed for the unsigned numeric fields 'u' (simple numeric) and 'n' (packed, no zone). To output an explicit decimal point, use the "v[n]" option, where 'n' is the number of digits following the decimal point. E.g.,"d 1,4 v2". The output for this packed decimal field is nine columns in the format "ddddd.ffs" where 's' is the sign. If 'n' is omitted or 0, nothing is output past the decimal point. To control the actual number of digits output, use the ",=ndigits;" option, where "ndigits" is the total number of digits to be output both before and after any explicit decimal point. E.g., "l 1,4,=6; v2". The output of this floating point field is eight columns in the format "dddd.ffs". The default size is nine columns. I.e., "l 1,4 v2" generates "ddddd.ffs". 
The number of digits output can be greater or less than the default number
of digits contained in a numeric field. If greater, leading zeros/blanks
are normally output. For the case of floating point fields only, specifying
more digits past the decimal point than are contained in the actual data
results in trailing zeros being output.

Specifying fewer digits than the default must be done with care. If fewer
digits are specified than are actually contained in any field other than a
floating point's fractional digits, a BAD_CHAR will be placed into the
field's first output column and an error message will be output to
EBCDIC.ERR.

The defaults for binary numbers are two, four, eight and eighteen digits
for one-, two-, four- and eight-byte binary numbers. (Note: IBM does not
support one-byte binary numbers).

The default for packed decimals is
2*(ending column - beginning column) + 1.

The default for packed, no zone is 2*input size, where
input size = (ending column - beginning column + 1).

The default for short (single precision) floating point (input size = 4)
is 7.

The default for long (double precision) floating point (input size = 8)
is 16.

Following are examples of binary fields with the total output field size
given in the comment field:

o=+z                //Output leading zeros and '+' for nonnegative values
b 1,4,=9;           //10 bytes (9 digits + sign)      PIC S9(9) COMP.
b 1,4 u             // 8 bytes (8 digits, no sign)    PIC 9(8) COMP.
b 1,4,=10;          //11 bytes (10 digits including any leading zeros + sign)
b 1,1 u             // 1 byte  (1 digit, no sign)     PIC X.
b 1,1,=2; u raw     // 2 bytes (2 digits, no sign)    PIC X.
b 1,1,=2; raw       // 3 bytes (2 digits + sign)
b 1,1 -             // 2 bytes (1 digit, trailing sign, space if positive)
b 1,1 -b            // 2 bytes (1 digit, leading sign, space if positive)
b 1,2               // 5 bytes (4 digits + sign)      PIC S9(4) COMP.
b 1,8               //19 bytes (18 digits + sign)     PIC S9(18) COMP.
b 1,8,=14;          //15 bytes (14 digits + sign)     PIC S9(14) COMP.

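The default digit counts given above can be summarized in a few lines of Python. This is only a sketch of the stated rules, not code from the package: the function names are hypothetical, the field codes ('b', 'd', 'n', 'l') follow this document, and the width rule (digits, plus one column for a sign unless unsigned, plus one column for an explicit decimal point) is inferred from the examples rather than taken from EBCDIC-3.VDM. One-byte binary fields are left out of the usage lines because the examples above show them with a single digit.

```python
from typing import Optional

BINARY_DIGITS = {1: 2, 2: 4, 4: 8, 8: 18}   # input size in bytes -> digits
FLOAT_DIGITS = {4: 7, 8: 16}                # short and long floating point


def default_digits(kind: str, cb: int, ce: int) -> int:
    """Default number of digits for a numeric field in columns cb..ce."""
    size = ce - cb + 1
    if kind == "b" and size in BINARY_DIGITS:      # binary
        return BINARY_DIGITS[size]
    if kind == "d":                                # packed decimal
        return 2 * (ce - cb) + 1
    if kind == "n":                                # packed, no zone
        return 2 * size
    if kind == "l" and size in FLOAT_DIGITS:       # floating point
        return FLOAT_DIGITS[size]
    raise ValueError(f"no default stated for {kind!r} with size {size}")


def output_width(kind: str, cb: int, ce: int, ndigits: Optional[int] = None,
                 unsigned: bool = False, explicit_point: bool = False) -> int:
    """Estimated output columns: digits + sign column + decimal point.

    Assumption: 'n' fields never carry a sign; ndigits plays the role of
    the ",=ndigits;" option.
    """
    digits = default_digits(kind, cb, ce) if ndigits is None else ndigits
    signed = not unsigned and kind != "n"
    return digits + (1 if signed else 0) + (1 if explicit_point else 0)


if __name__ == "__main__":
    print(output_width("d", 1, 4, explicit_point=True))             # 9
    print(output_width("b", 1, 4, unsigned=True))                   # 8
    print(output_width("l", 1, 4, ndigits=6, explicit_point=True))  # 8
```

The three printed values correspond to the "d 1,4 v2", "b 1,4 u" and "l 1,4,=6; v2" examples in this topic.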
Reporting Errors ---------------- Any conversion errors can optionally be reported in a file; the default filename is "ebcdic.err" in the same directory as the converted output file. Alternatively, you can specify a different filename or a full pathname with the "e=" command (see below). Three kinds of errors can be reported: * Syntax errors in the ".LAY" file and other serious errors which prevent any data from being converted. The ebcdic.err file will contain a description of the error. * Errors in the data file. When bad data, e.g. an invalid packed-decimal number is encountered, it is converted to spaces and the conversion continues. Optionally, up to the first 'n' data errors can be reported. The ebcdic.err file will contain a description and the position in the data file of the error. * Any unexpected conditions which cause the conversion to stop. These are reported with "***** Unexpected breakout occurred." in the ebcdic.err file. Our experience is that many data files contain insignificant errors in some fields. In some cases the field is defined in the copy-book, but is never used and contains garbage. Therefore, we suggest ignoring errors in the data file unless the data is really critical and you are willing to take the time to track down the errors on the mainframe side. When data error reporting is enabled, only the first 'n' errors are reported to prevent creating huge error files. The "e=" command in the ".LAY" file controls which errors are reported in the error file and can override the name of the file. Some examples are: e=1000,ebc-data.err Up to the first 1000 errors in the data file will be reported. Specifies that the name of the error file is "ebc-data.err". This will be written to the main output directory which is the same as the source data file unless altered by the "-a outpathname" invocation option. e=200 Up to the first 200 errors in the data file will be reported. The default error file "ebcdic.err" will be used. 
e=0 Most errors in the data file will not be reported and will not cause an error file to be created. Severe errors will still be reported. These are mostly encountered by the preprocessor and will have been fixed before production runs are made. e=0,\errdir\ebcdic.err Exactly the same as "e=0". It explicitly gives the location and name of the error file. When the conversion starts, any existing "ebcdic.err" file is deleted. A batch file can therefore test for the existence of ebcdic.err to determine if the conversion was successful or not. The supplied EBCDIC-3.BAT file does this with the batch commands: IF NOT EXIST ebcdic.err GOTO DONE ECHO - ECHO Conversion had errors! ECHO See the file EBCDIC.ERR for details. ECHO - PAUSE : :DONE Advanced Commands ----------------- Some newer features allow better control over outputting data, including diverting just part of a record to its own file (out) and rearranging and/or duplicating data and/or injecting constant strings into the output file (ki, ko). To link extracted data to its parent record, a unique ID # is automatically generated. Associated with this are 4 new commands, "b=", "d=", "a=" and "n=" (see below). Outputting a record section to its own file (out) ------------------------------------------------- It is possible to output a section of a record to its own file by using the 'out "fname"' ... 'out' brackets. This can be done even when the parent record is being output to its own file. The unique ID #, discussed below, is appended to both the end of the section and the end of the parent record. out ["fname"] Use the "out" commands as brackets around a section to divert the output to the file "fname". The first "out" requires the filename. The first and last fields must be explicitly defined. For example, ... out "special.asc" e 510,579 out ... converts columns 510 through 579 from simple EBCDIC to ASCII and outputs the results into file "special.asc". 
The unique ID will be appended to the end of the line in "special.asc" and to the end of the parent record's output line. The first and last fields (here, the same) have been explicitly specified, even though ordinarily they could have been omitted. Rearranging, duplicating, outputting-constant-strings (ki & ko) --------------------------------------------------------------- It is now possible to save translated data into one of nine "key" registers (so called because of the way they were first used). Data may be optionally accumulated in the registers or it may replace the existing contents. The data is also translated or ignored (copied) or deleted as usual. Later, this data may be injected into the output stream. Constant strings may also be injected. ki key field definition - ki[K[O]][n[+]]; cb,ce ... The field specified by 'cb,ce ...' is translated twice. The first translation is specified by 'K' (default is 'e' for EBCDIC to ASCII conversions or as specified by the "u=" line). The results are saved into key register 'n' (1 to 9, default=1). The results will be appended to key register 'n' if '+' is included, allowing more than 9 fields to be manipulated. The second translation is performed normally as specified by 'O' (default = 'K') in conjunction with "cb,ce ...". Regarding default codes, 'O' may be omitted and both 'K' and 'O' may be omitted. It is not possible to omit 'O' while specifying 'K'. ko key out - ko cb,cb [i[nflation]=amt;] list - ko cb,cb,=amt; list // archaic Injects "list" into the output stream associated with input column 'cb'. (Note that the "end" column is the same as the "beginning" column). "list" consists of key register ID's and/or quoted string constants separated by commas. The optional "i[nflation]=amt;" should be included in order for RELAY.VDM to properly generate output column numbers in the comment fields of the .LAY file. 
As an alternative to listing the output elements on one line, the "ko" line
can be replicated once for each item in the list. This is necessary when
quoting and comma delimiting, so that each item in the list is separately
quoted.

Example of one-line list:

    ...
    ko 1,1 i=17; 1," is ",6 //  1 Output "key" data strings 1 and 6
                            //    around constant string " is "
    d 1,4                   // 18 Convert 4-byte packed decimal to ASCII
    ...

Key registers one and six are output surrounding the constant string
" is ". The length of the entire list is purported to be 17 characters.
The translation proceeds as usual from there on.

Notes: The stated output length does not affect the actual translation. It
       is only needed for RELAY.VDM to properly generate the output column
       numbers in the comment field.

       The injection column number (here '1') appears twice; once within
       the 'ko' line and again within the following field specification.
       Also note that this is an input related column, even though 'ko' is
       an output command.

Example of list spread over several lines:

    ...
    ko 1,1 i=5; 1           //  1 Injects key register[1] (5 characters)
    ko 1,1 i=4; " is "      //  6 Injects constant string " is "
    ko 1,1 i=8; 6           // 10 Injects key register[6] (8 characters)
    d 1,4                   // 18 Convert 4-byte packed decimal to ASCII
    ...


Unique ID #'s for Extracted Fields
----------------------------------

When data fields are being output to their own files, it is helpful and
sometimes necessary to generate a unique ID # for associating the extracted
portions of the data record with the main portion. The unique ID is output
at the end of the extracted portion and at the end of the parent record.
For Level-3 this occurs for 'out "fname"' ... 'out' bracketed sections (see
above).

The unique ID # consists of a 5-digit "# days elapsed since base date"
followed by a 1-digit "daily run #" (default value is 0) followed by a
6-digit "record #". The five-digit elapsed days count allows for 270+
unique years.
The 6-digit record count allows for 999,999 records to be processed in a single day. The "daily_run_#" allows more than one such run per day. This value must be explicitly set (see below). The ID's are generated automatically, based on the day of the run and the input record #. To control the ID generation, 4 commands have been added: "b=base_date" (default = 1/1/2000); "d=run_date" (default = current date); "a=additional_offset" (default = 0) and "n=l,m[,n]" to control the number of digits in the unique ID (default = 5 digits for days elapsed, 6 digits for the record # and 1 digit for the run #). To change the run date, add the "d=run_date" command to the .LAY file. Dates are expressed in mm/dd/yyyy format. The year must be completely expressed. The separator may be any of "_./\:-". To change the base date from its default of 1/1/2000, add the "b=base_date" command; e.g., b=1/1/1900 // Include dates from the previous century To change the number of digits in the unique ID to 4 digits for # days elapsed since base date and 5 digits for the record # use the "n=d,r" command; e.g., n=4,5 // 4 digits for days elapsed since base date // 5 digits for record count // There is still 1 digit reserved for the // daily run # In case more than one data set is processed for a given base_date and run_date, one can include the "a=additional_offset" command in order to generate unique ID's in the 2nd and later runs. Set the offset greater than or equal to the last record # processed in the previous run(s); e.g. a=9999 // Previous run ended with record 9,999 As an alternative, one can employ the digit reserved for the daily run # by adding the run # to the invocation command line after the "-u" parameter, which is placed after all specified filenames. 
For the second run of the day this might be: c:\vedit\vpw -x ebcdic-3.vdm datafile.ebc layout.lay -u 2 Defining and Referring to Groups of Field Specifications (Segments) by Name --------------------------------------------------------------------------- Any group of field specifications can be placed at the end of the data layout file after a "seg segname" line; they may be accessed by "use cb,ce segname" where "cb" is the first column of the first field described by segment "segname" as it occurs in the source data file and "ce" is the last column of the last field so described. More than one segment may be defined. All must be placed together at the end of the layout file. The first column of the first field in each segment must be numbered as "1". Using named segments allows a more compact record specification when more than one record refers to the segment. It has the disadvantage of not having the ASCII output columns displayed. There is no significant speed degradation since the definitions are expanded during preprocessing. But, the main purpose for the segments is to allow multiple, variable, data dependent segment processing as described next. These segments are not expanded during preprocessing. 
The original simple layout has been reorganized, below: r=114,0 //Record length is 114, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR //Columns 1-9 are simple EBCDIC // // <---- Decode columns 10-23 by segment "common_data" use 10,23 common_data //Columns 24-29 are simple EBCDIC // <---- Decode columns 30-46 by segment "more_data" use 30,46 more_data // //Columns 47-114 are simple EBCDIC // seg common_data// <---- Segment definition d 1-5 //Packed-decimal //Next 5 columns are simple EBCDIC d 11-12 //Packed-decimal d 13-14 //Packed-decimal seg more_data // <---- Segment definition b 1-4 //Packed-binary //Next 8 columns are simple EBCDIC z 13-17 //Zoned number (5 digits plus sign) Using Variable, Data Dependent, Named Segments ---------------------------------------------- It sometimes happens that a given number of bytes within some record can contain information in one of several different formats and the format can be determined at run time by examining a specific location or locations within the bytes themselves. This is exactly similar to specifying the main record formats. It is possible to place these sub-record specifications together after a segment label and then refer to them by the segment label from within the body of the main record. Precede each set of sub-record specifications with a "t" statement as in "Supporting Multiple Record Types" above. To extract data to its own file, precede the appropriate "t" statement with a 'tx "filename"' statement. One default record type "td" may be defined per segment; it must be the last record type defined in the segment. Any "st" and "so" specifications may be used. It is not necessary to use "l=" specifications since the length is known from the "use cb,ce segname" statement. Field columns within a record type begin numbering from "1". 
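Because field columns inside a segment are numbered from 1, a segment column maps onto a record column by adding the starting column of the "use cb,ce segname" line and subtracting one. The following Python sketch (a hypothetical helper, not taken from the preprocessor) applies that arithmetic to the "common_data" and "more_data" segments shown earlier.

```python
def expand_segment(use_cb: int, use_ce: int, seg_fields: list) -> list:
    """Translate segment-relative fields into record columns.

    seg_fields is a list of (kind, cb, ce) tuples numbered from column 1 of
    the segment; the result uses columns of the parent record. This only
    illustrates the column arithmetic, not the actual expansion code.
    """
    seg_len = use_ce - use_cb + 1
    expanded = []
    for kind, cb, ce in seg_fields:
        if not 1 <= cb <= ce <= seg_len:
            raise ValueError(f"field {kind} {cb}-{ce} is outside the segment")
        expanded.append((kind, use_cb + cb - 1, use_cb + ce - 1))
    return expanded


if __name__ == "__main__":
    common_data = [("d", 1, 5), ("d", 11, 12), ("d", 13, 14)]
    more_data = [("b", 1, 4), ("z", 13, 17)]
    print(expand_segment(10, 23, common_data))
    print(expand_segment(30, 46, more_data))
```

For "use 10,23 common_data" this gives packed-decimal fields in columns 10-14, 20-21 and 22-23, and for "use 30,46 more_data" a binary field in 30-33 and a zoned field in 42-46.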
Although a segment record may have a defined extraction output file (tx), the parent record does not know about it. In order to force the generation of the unique ID # at the end of the parent record, in the absence of any other extractions occurring in the parent record, it is necessary to declare a formal extraction file. This can be done by bracketing the "use" statement with 'out "dummy.asc"'...'out'. The following .LAY file is nearly identical to the Level-3 MRSAMP.LAY. It includes the "use" and "seg" lines and omits the "l=114" lines. It also includes as comments the 'out "dummy.asc" ... out' brackets. Since there is no data in the record other than that in samp_seg, there is no need to produce unique ID #'s in the otherwise empty parent record. // // SGSAMP.LAY - Specify fields and options for converting EBCDIC files with // packed fields into ASCII. Uses segments. This file adapted from // MRSAMP.LAY. // r=114,0 // Max reclen, emit DOS newlines o=z,b2z // Output leading zeros (blank '+' sign; sign at end) // Convert all-blank/null numeric fields to zeros e=0,ebcdic.err // Only report serious errors in EBCDIC.ERR // out "dummy.asc" // Not wanted here, since no data other than that in samp_seg use 1,114 samp_seg // 1 // out // 114 (RecLen) seg samp_seg // ********************************************************* 000100 // * * 000110 // * SAMPLE COPY-BOOK WITH MULTIPLE RECORDS * 000120 // * * 000130 // * RECSIZE = 114 BYTES * 000140 // * * 000150 // ********************************************************* 000160 tx "keyrec.asc" // Divert SAMPLE records to file "keyrec.asc" t 1,0xFFFFFF,0xE2C1D4 // EBCDIC "SAM" in columns 1-3 (up to 3 bytes can be masked) st 4,0xFFFFFF,0xD7D3C5 // EBCDIC "PLE" in columns 4-6 (but wanted to examine 6) // 000170 // 01 SAMPLE-REC. 000180 // 03 SA-KEY. 000190 e 1,7 // 1 10 SA-KEY-ALPHA PIC X(7). 000200 u 8,9 // 8 10 SA-KEY-NUMERIC PIC 9(2). 000250 d 10,14 u v2 // 10 03 SA-PAC-DEC-1 PIC 9(7)V99 COMP-3. 
000300 x 15,19 // 20 03 FILLER PIC X(5). 000350 d 20,21 u // 20 03 SA-PAC-DEC-2 PIC 999 COMP-3. 000400 d 22,23 // 23 03 SA-PAC-DEC-3 PIC S9(3) COMP-3. 000450 x 24,29 // 27 03 FILLER PIC X(6). 000500 b 30,33 // 27 03 SA-SIGNED-BINARY PIC S9(8) COMP. 000550 x 34,41 // 36 03 FILLER PIC X(8). 000600 z 42,46 // 36 03 SA-ZONED-DECIMAL PIC S9(5). 000650 x 47,114 // 42 03 FILLER PIC X(68). 000700 // 41 (RecLen) tx "datarec.asc" // Divert remaining records to file "datarec.asc" td l=114 // . 01 SAMPLE-DATA REDEFINES SAMPLE-REC. 000750 // 03 SA-DATA. 000800 e 1,4 // 1 10 SA-DAT-PRODUCT-CD PIC X(4). 000850 u 5,9 // 5 10 SA-DAT-PRODUCT-NUM PIC 9(5). 000900 e 10,14 // 10 03 SA-DAT-LOCATION-CD PIC X(5). 000950 x 15,19 // 15 03 FILLER PIC X(5). 001000 d 20,21 // 15 03 SA-DAT-DEC-1 PIC S9(3) COMP-3. 001100 d 22,23 u // 19 03 SA-DAT-DEC-2 PIC 999 COMP-3. 001150 x 24,29 // 22 03 FILLER PIC X(6). 001200 z 30,34 // 22 03 SA-DAT-ZDECIMAL PIC S9(5). 001250 x 35,42 // 28 03 FILLER PIC X(8). 001300 b 43,46,=9; // 28 03 SA-DAT-SBINARY PIC S9(9) COMP. 001350 x 47,114 // 38 03 FILLER PIC X(68). 001400 // 37 (RecLen) Using the above .LAY file on the provided data file SGSAMP.EBC vpw -n -x ebcdic-3 sgsamp.ebc produces the following output files: SGSAMP.ASC: (empty) KEYREC.ASC: SAMPLE 01987654321 123 321-88888888 12345- SAMPLE 02876543210 234 432-88888888-12345 DATAREC.ASC: MREC12345MI103987 654 43210 999999999 MREC12346MI103876-543 43210-999999999- Uncommenting the "out" brackets and rerunning, additionally produced: DUMMY.ASC: (empty) SGSAMP.ASC: (unique ID #'s; not very useful by themselves) 00718000001 00718000002 00718000003 00718000004 Padding Fields -------------- Some EBCDIC files contain multiple types of records which are very similar and have the same record length. (They often result from "REDEFINES" in the COBOL file layout copy-book.) If one record type has, e.g. 
two packed-decimal fields and another record type has three packed-decimal fields, then the converted ASCII file will have different record lengths. Optionally, you can add padding to individual fields, so that all records in the converted ASCII file will have the same record length. Consider the following example .LAY file which converts two types of records: r=80,0 //Record length is 80, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR t 5,0xFF,0xF1 //Record ID "1" in column 5 l=80 // Record length is 80 e 1-9 // Simple EBCDIC text in columns 1-9 d 10-14 // Packed-decimal in columns 10-14 d 20-21 // Packed-decimal in columns 20-21 d 22-23 // Packed-decimal in columns 22-23 e 24-80 // Remaining columns are simple EBCDIC t 5,0xFF,0xF2 //Record ID "2" in column 5 l=80 // Record length is 80 e 1-9 // Simple EBCDIC text in columns 1-9 d 10-14 // Packed-decimal in columns 10-14 e 20-21 // Simple EBCDIC text in columns 20-21 d 22-23 // Packed-decimal in columns 22-23 e 24-80 // Remaining columns are simple EBCDIC Since record type "1" contains one additional packed-decimal field, it will convert to longer ASCII records than those of type "2". Also the last packed-decimal field will be in different columns in the ASCII file for the two record types. By adding some padding to record type "2", all fields will be aligned in the converted ASCII file and all records will have the same length. 
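The amount of padding needed can be estimated by adding up the converted width of each field. The Python sketch below does this for the two record types above. It is an illustration under stated assumptions, not part of the package: the helper names are hypothetical, a packed-decimal field is taken to produce 2*(ce - cb) + 1 digits plus one sign column, text fields and unspecified columns are taken to produce one ASCII column per byte, and the newline is not counted.

```python
def field_width(kind: str, cb: int, ce: int) -> int:
    """Converted width of one field (assumed rules, see lead-in)."""
    if kind == "d":                       # packed decimal: digits + sign
        return 2 * (ce - cb) + 1 + 1
    return ce - cb + 1                    # 'e' text: one column per byte


def converted_length(fields: list, reclen: int) -> int:
    """Sum the converted widths, treating unspecified gaps as plain text."""
    total, next_col = 0, 1
    for kind, cb, ce in fields:
        total += cb - next_col            # gap before this field
        total += field_width(kind, cb, ce)
        next_col = ce + 1
    return total + (reclen - next_col + 1)   # any unspecified tail


if __name__ == "__main__":
    type1 = [("e", 1, 9), ("d", 10, 14), ("d", 20, 21), ("d", 22, 23), ("e", 24, 80)]
    type2 = [("e", 1, 9), ("d", 10, 14), ("e", 20, 21), ("d", 22, 23), ("e", 24, 80)]
    len1 = converted_length(type1, 80)
    len2 = converted_length(type2, 80)
    print(len1, len2, len1 - len2)        # 89 87 2
```

Under these assumptions record type "2" comes out two columns shorter than type "1", so two padding characters on the field in columns 20-21 (or before the field in columns 22-23) bring the two record types into alignment.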
The following .LAY file shows how to add padding after a field: r=80,0 //Record length is 80, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR t 5,0xFF,0xF1 //Record ID "1" in column 5 l=80 // Record length is 80 e 1-9 // Simple EBCDIC text in columns 1-9 d 10-14 // Packed-decimal in columns 10-14 d 20-21 // Packed-decimal in columns 20-21 d 22-23 // Packed-decimal in columns 22-23 e 24-80 // Remaining columns are simple EBCDIC t 5,0xFF,0xF2 //Record ID "2" in column 5 l=80 // Record length is 80 e 1-9 // Simple EBCDIC text in columns 1-9 d 10-14 // Packed-decimal in columns 10-14 e 20-21,2; //<--- Simple EBCDIC in columns 20-21; pad with 2 spaces d 22-23 // Packed-decimal in columns 22-23 e 24-80 // Remaining columns are simple EBCDIC Alternatively, you can add padding before a field as in this partial .LAY: ... t 5,0xFF,0xF2 //Record ID "2" in column 5 l=80 // Record length is 80 e 1-9 // Simple EBCDIC text in columns 1-9 d 10-14 // Packed-decimal in columns 10-14 e 20-21 // Simple EBCDIC in columns 20-21 d 22-23,-2; //<--- Packed-decimal in columns 22-23 pre-pad 2 spaces e 24-80 // Remaining columns are simple EBCDIC The padding is by default ASCII spaces, but you can change it to any other character. The following partial .LAY file uses "$" as the padding character: ... t 5,0xFF,0xF2 //Record ID "2" in column 5 l=80 // Record length is 80 e 1-9 // Simple EBCDIC text in columns 1-9 d 10-14 // Packed-decimal in columns 10-14 e 20-21 // Simple EBCDIC in columns 20-21 d 22-23,-2; '$' // Packed-decimal in columns 22-23 pre-pad 2 "$" e 24-80 // Remaining columns are simple EBCDIC DBASE output ------------ To easily produce DBASE-style output, set the second parameter in the "r=reclen,parm2" specification in the .LAY file to "DBASE-III". 
During preprocessing, a DBASE header record - including automatically
generated or explicitly entered field names - is generated for each output
file. The automatically generated names are of the form "Ln", where 'n' is
the field's index, counting from 1, and 'L' is either automatically generated
('A', 'B', 'C', etc.) or is taken from the second parameter on the 'tx'
specification:

    tx "fname",L

A second letter 'A', 'B', ... is automatically generated between 'L' and 'n'
for each section being output to its own file via an "out" command.
Alternatively, specify the letter, e.g. 'N', on the "out" command:

    out "fname",LN

See also "Defining Explicit Field Names for DBASE", below.

Other details, such as the header terminating 0x0d, file terminating 0x1a and
record leading space, are also automatically generated. Currently, all fields
are specified as type 'C' (character).

Each field in the record must be specified in the .LAY file, the same as for
quoting and comma delimiting. However, the preprocessor will automatically
strip the default field specifications after generating the header data for
the field. This provides a dramatic increase in run-time performance for the
normal case where there are only a few packed fields per record.

See also COBOL2V.VDM and RELAY.VDM for DBASE options.


Quick DBASE Test
----------------

You can perform a quick test by converting the supplied COBOL.EBC file after
generating COBOLD.LAY from COBOL.COP. From a DOS/NT command box, change to
the directory that contains VEDIT and into which you placed the EBCDIC
conversion files. This is typically "c:\vedit". Then give the command:

    vpw -x cobol2v.vdm cobol.cop -a cobold.lay -u dbase

This will leave you in the VEDIT editor with the converted layout file
displayed. A preliminary .XRF file is also generated which is of no practical
interest. Select {FILE, Exit} to exit the editor, saving the layout file and
doing as you will with the .XRF file.
Now, to convert the EBCDIC data, producing an ASCII file in DBASE format, give the command: vpw -x ebcdic-3.vdm cobol.ebc cobold.lay Since COBOLD.LAY does not specify otherwise, EBCDIC-3.VDM automatically generates names for each field, creates the DBASE header record, translates COBOL.EBC into ASCII and displays the results. To view the DBASE header record, select {MISC, More macros, DBASE}. This pops up a second edit window with the DBASE header information converted into meaningful text. Viewing DBASE Files ------------------- Greenview Data, Inc. provides two macros to aid viewing DBASE files. For information, see the VEDIT "Help" topic "DBASEKEY.VDM Macro": from within VEDIT, select {Help, Search for help on...} and enter DBA into the "Index" input box; click [Display]; then select "DBASE.VDM Macro" and click [Display]. The first topic gives a more complete discussion of DBASE.VDM. The immediately following topic discusses "DBASEKEY.VDM" which you may prefer for common use. Defining Explicit Field Names for DBASE --------------------------------------- To have explicit field names entered into the DBASE header record, enter them into the .LAY file comment section just past the ASCII-output-column numbers. Also, include the line "xrf=1" or "xrf=on" at the beginning of the .LAY file. Then, run RELAY.VDM to produce an .XRF file that contains the explicit names in the first column group, auto-generated names in the second column group, and COBOL copy-book names in the third column group. When EBCDIC-3.VDM is run, the "xrf=on" specification will cause the .XRF file to be loaded during preprocessing and the first short name encountered from each line to be placed into the DBASE header record. Thus, it is possible to give explicit names to just a few important fields, using auto-generated names for the remaining fields. DBASE names are limited to ten characters; they must begin with a letter and may contain only letters and digits. 
See "COBOL2V.VDM and DBASE Output", below, for an example.


Using the COBOL2V.VDM Macro
---------------------------

Although a COBOL copy-book can often be pasted directly into a .LAY file, it
is usually better to first convert the copy-book into a normal .LAY file.
There are several reasons for this:

  * A normal .LAY file allows more flexibility in how each field is
    converted. E.g., you can specify different "+bz" options for each packed
    field.

  * The "X" PIC specification frequently contains binary or custom coded
    data. It also is used for undefined data. Undefined data frequently
    translates not merely into strange looking text but also into unwanted
    control codes. These fields must be deleted or blank filled in the .LAY
    file.

  * Removing simple EBCDIC (non-numeric) field specifications speeds up the
    conversion.

  * More complicated copy-books with "OCCURS" and "OCCURS DEPENDING ON ..."
    clauses may require manual changes to the .LAY file. (The latter also
    requires the EBCDIC Level-4 package.)

  * "REDEFINES" in the copy-book are not automatically supported. Most can be
    ignored, but others require manually setting up a different record type.

The supplied COBOL2V.VDM macro converts COBOL copy-book statements, e.g.
"PIC X(7)", into normal ".LAY" specifications, e.g. "e 1 7". It also
generates the "r=max_rec_size,0" specification. It can also generate
prototype record specifications and record-specific output filenames. The
original copy-book statements are maintained in commented form.

When COBOL2V.VDM is being run by EBCDIC-3.VDM on copy-book statements that
have been pasted into a .LAY file, the COBOL statements must be surrounded by
"p=" and "q=" brackets. Specifications other than "c=n" must not occur within
the bracketed copy-book sections.
When COBOL2V.VDM is being run directly on a copy-book, these brackets are not required unless .LAY data layout specifications other than "c=0" are included; the "c=0" specifications themselves are required when multiple record types are present in order to reset the input column counter. Because some copy-book "X" pictures actually contain binary data, you may want to change them to "B" for binary or "H" for hex or "N" for Packed-No-Zone (PNZ) which are recognized by the COBOL2V.VDM macro. Otherwise, edit the .LAY file after conversion and change the 'e' specifications to 'b' for binary, 'h' for hex or 'n' for PNZ. NOTES: This macro comments out all "REDEFINES" clauses. If a redefinition contains packed/zoned/binary data that differs from the original definition, a separate record description must be compiled for each definition. See "Supporting Multiple Record Types", above and "COBOL2V.VDM and Multiple Record Types", below. The rare COMP-1 (FLOAT) and COMP-2 (DOUBLE) formats are supported only in the range 10E-18 thru 10E18, in fixed form only. Please contact us if you need larger powers of 10 or scientific notation. Unsigned numeric fields (PIC 9) are translated to a "u" specification. This allows specifying explicit output decimal points, leading zeros and converting all-blank/null fields to zeros. COBOL2V.VDM is internally used by the EBCDIC-3.VDM macro, but can also be run separately. >>> To convert a COBOL copy-book into a .LAY data layout file: 1. Save the copy-book as a .LAY file, e.g. "myfile.lay". 2. Edit the .LAY file as necessary. See also "COBOL2V.VDM and Multiple Record Types". 3. Save the file. 4. Convert the .LAY file, e.g. "myfile.lay" with the command: vpw -x cobol2v.vdm myfile.lay The following is the result of running COBOL2V on the supplied COBOL.LAY file: // // COBOL.LAY - (Slightly) modified sample COBOL copy-book for converting // EBCDIC files with packed fields into ASCII. 
// // (If there are REDEFINES present, this will probably // be just the first of many passes). // r=114,0 //Record length is 114, convert to DOS newlines o=z,b2z //Output leading zeros (blank '+' sign; sign at end) //Convert all-blank/null numeric fields to zeros e=0,ebcdic.err //Only report serious errors in EBCDIC.ERR // *************************************************** 000100 // * * 000110 // * SAMPLE COPY-BOOK * 000120 // * * 000130 // * RECSIZE = 114 BYTES * 000140 // * * 000150 // *************************************************** 000160 // 000170 // 01 SAMPLE-REC. 000180 // 03 SA-KEY. 000190 e 1,7 // 1 10 SA-KEY-ALPHA PIC X(7). 000200 u 8,9 // 8 10 SA-KEY-NUMERIC PIC 9(2). 000250 d 10,14 u v2 // 10 03 SA-PAC-DEC-1 PIC 9(7)V99 COMP-3. 000300 e 15,19 // 20 03 FILLER PIC X(5). 000350 d 20,21 u // 25 03 SA-PAC-DEC-2 PIC 999 COMP-3. 000400 d 22,23 // 28 03 SA-PAC-DEC-3 PIC S9(3) COMP-3. 000450 e 24,29 // 32 03 FILLER PIC X(6). 000500 b 30,33,=9; // 38 03 SA-SIGNED-BINARY PIC S9(9) COMP. 000550 e 34,41 // 48 03 FILLER PIC X(8). 000600 z 42,46 // 56 03 SA-ZONED-DECIMAL PIC S9(5). 000650 e 47,50 // 62 03 FILLER PIC X(4). 000700 u 51,55 // 66 03 SA-ANOTHER-NUMERIC PIC 9(5). 000750 l 56,59,=7; v2 // 71 03 SA-FLOAT PIC S9(5)V99 COMP-1. 000800 l 60,67,=12; v2 // 80 03 SA-DOUBLE PIC S9(10)V99 COMP-2. 000850 e 68,114 // 94 03 FILLER PIC X(47). 000900 // 140 (RecLen) Alternatively, for simple copy-books, run COBOL2V.VDM on the copy-book directly: vpw -x cobol2v.vdm cobol.cop -a myfile.lay Following is the result of the above process on the supplied file "COBOL.COP" which produced the file "MYFILE.LAY": r=114,0 // Input reclen, emit DOS newlines // * ************************************************************** 000100 // * * 000110 // * SAMPLE COPY-BOOK * 000120 // * * 000130 // * RECSIZE = 114 BYTES * 000140 // * * 000150 // * ************************************************************** 000160 // 000170 // 01 SAMPLE-REC. 000180 // 03 SA-KEY. 
000190 e 1,7 // 1 10 SA-KEY-ALPHA PIC X(7). 000200 u 8,9 // 8 10 SA-KEY-NUMERIC PIC 9(2). 000250 d 10,14 u v2 // 10 03 SA-PAC-DEC-1 PIC 9(7)V99 COMP-3. 000300 e 15,19 // 20 03 FILLER PIC X(5). 000350 d 20,21 u // 25 03 SA-PAC-DEC-2 PIC 999 COMP-3. 000400 d 22,23 // 28 03 SA-PAC-DEC-3 PIC S9(3) COMP-3. 000450 e 24,29 // 32 03 FILLER PIC X(6). 000500 b 30,33,=9; // 38 03 SA-SIGNED-BINARY PIC S9(9) COMP. 000550 e 34,41 // 48 03 FILLER PIC X(8). 000600 z 42,46 // 56 03 SA-ZONED-DECIMAL PIC S9(5). 000650 e 47,50 // 62 03 FILLER PIC X(4). 000700 u 51,55 // 66 03 SA-ANOTHER-NUMERIC PIC 9(5). 000750 l 56,59,=7; v2 // 71 03 SA-FLOAT PIC S9(5)V99 COMP-1. 000800 l 60,67,=12; v2 // 80 03 SA-DOUBLE PIC S9(10)V99 COMP-2. 000850 e 68,114 // 94 03 FILLER PIC X(47). 000900 // 140 (RecLen) Note that the "r=114,0" line was produced automatically. The first column of numbers past the comment lead-in characters gives the beginning column numbers for the translated fields. The final comment gives the length of the translated record, not counting the newline terminators. Options for COBOL2V.VDM ----------------------- COBOL2V.VDM now supports command line options via the "-u option_list" parameter on its invocation line: vpw -x cobol2v.vdm myfile.cob -a myfile.lay -u option_list The option_list can be any of the following, separated by spaces: DOS UNIX MAC DBASE ISHORT NORELAY CC=column_number NAMES When "-u" is not specified, the defaults are DOS and CC=20. The "-a myfile.lay" parameter, above, renames the output file to "myfile.lay" while leaving the source file "myfile.cob" unaltered. COBOL2V.VDM now generates the "r=reclen" specification automatically. The default is to generate "r=reclen,0" for DOS-terminated output lines. Specify "-u UNIX" to generate "r=reclen,1" to append just a Line-Feed character to the translated output records. The default starting column for comments is currently 20. To change this to column 'n', specify "cc=n". 
Ordinarily, COBOL2V.VDM invokes RELAY.VDM, which converts "+size" to "begcol,endcol" format and places the output column numbers into the comment field. To keep the "+size" format use "-u NORELAY". To then generate output column numbers, run RELAY.VDM, perhaps with "-u INSERT". To later convert the "+size" format to "begcol,endcol", run RELAY.VDM with the "-u CONVERT" option. COBOL2V.VDM and Multiple Record Types ------------------------------------- When the COBOL Copy-Book defines multiple record types, a certain amount of editing must be done before running COBOL2V.VDM. Any "REDEFINES" statements must be examined. When they indicate the presence of a separate record type, editing may consist simply of preceding the REDEFINES statement with a "c=0" specification and commenting out the REDEFINES statement. This is done in the supplied MRSAMP.COP. It may involve replicating an entire section of statements and commenting out the redefining statements in one section and the main section in the other. The activated redefinition must be further edited to convert the "REDEFINES" statement into a simple statement by removing the "REDEFINES xxxx" clause. COBOL2V.VDM now generates templates to help in creating .LAY specifications for multiple record types. Precede each COBOL record type with a "c=0" to generate t 1,0xFF,0xF1 // Proto Record Type - EBCDIC 1 in column 1 l=reclen at the start of each record. The "t" specification can be edited later to properly specify the record. Also, a "tx" template can be generated. "c=0,R" will generate the above specified lines preceded by 'tx "R.asc"'. The supplied MRSAMP.COP shows the first stage editing of a multiple record copy-book: *************************************************************** 000100 * * 000110 * SAMPLE COPY-BOOK WITH MULTIPLE RECORDS * 000120 * * 000130 * RECSIZE = 114 BYTES * 000140 * * 000150 *************************************************************** 000160 . . Edited for COBOL2V.VDM . 
Note: periods used in column 7 to indicate edited statements . c=0,K // <----- Start of record; want "tx" template generated (K for key) 000170 01 SAMPLE-REC. 000180 03 SA-KEY. 000190 10 SA-KEY-ALPHA PIC X(7). 000200 10 SA-KEY-NUMERIC PIC 9(2). 000250 03 SA-PAC-DEC-1 PIC 9(7)V99 COMP-3. 000300 03 FILLER PIC X(5). 000350 03 SA-PAC-DEC-2 PIC 999 COMP-3. 000400 03 SA-PAC-DEC-3 PIC S9(3) COMP-3. 000450 03 FILLER PIC X(6). 000500 03 SA-SIGNED-BINARY PIC S9(8) COMP. 000550 03 FILLER PIC X(8). 000600 03 SA-ZONED-DECIMAL PIC S9(5). 000650 03 FILLER PIC X(68). 000700 c=0,D // <----- Start of record; want "tx" template generated (D for data) . . Following Copy-Book line commented out to prevent COBOL2V.VDM from . commenting out the entire paragraph. Alternatively, " REDEFINES SAMPLE-REC" . could have been deleted. . .01 SAMPLE-DATA REDEFINES SAMPLE-REC. 000750 03 SA-DATA. 000800 10 SA-DAT-PRODUCT-CD PIC X(4). 000850 10 SA-DAT-PRODUCT-NUM PIC 9(5). 000900 03 SA-DAT-LOCATION-CD PIC X(5). 000950 03 FILLER PIC X(5). 001000 03 SA-DAT-DEC-1 PIC S9(3) COMP-3. 001100 03 SA-DAT-DEC-2 PIC 999 COMP-3. 001150 03 FILLER PIC X(6). 001200 03 SA-DAT-ZDECIMAL PIC S9(5). 001250 03 FILLER PIC X(8). 001300 03 SA-DAT-SBINARY PIC S9(9) COMP. 001350 03 FILLER PIC X(68). 001400 Running COBOL2V.VDM on mrsamp.cop: vpw -x cobol2v.vdm mrsamp.cop -a mrsamp.l produces the file MRSAMP.L, portions of which are included, below: r=114,0 // Max reclen, emit DOS newlines *************************************************************** 000100 * * 000110 * SAMPLE COPY-BOOK WITH MULTIPLE RECORDS * 000120 * * 000130 * RECSIZE = 114 BYTES * 000140 * * 000150 *************************************************************** 000160 ... tx "K.asc" t 1,0xFF,0xF1 // Proto Record Type - EBCDIC 1 in column 1 l=114 // 000170 // 01 SAMPLE-REC. 000180 // 03 SA-KEY. 000190 e 1,7 // 1 10 SA-KEY-ALPHA PIC X(7). 000200 u 8,9 // 8 10 SA-KEY-NUMERIC PIC 9(2). 000250 ... 
// 129 (RecLen) tx "D.asc" t 1,0xFF,0xF1 // Proto Record Type - EBCDIC 1 in column 1 l=114 // . 01 SAMPLE-DATA REDEFINES SAMPLE-REC. 000750 // 03 SA-DATA. 000800 e 1,4 // 1 10 SA-DAT-PRODUCT-CD PIC X(4). 000850 e 5,9 // 5 10 SA-DAT-PRODUCT-NUM PIC 9(5). 000900 ... // 125 (RecLen) The above can be edited to create the final .LAY file MRSAMP.LAY, shown below. // // MRSAMP.LAY - Specify fields and options for converting EBCDIC files with // packed fields into ASCII. Handles multiple record types. // r=114,0 // Max reclen, emit DOS newlines o=z,b2z // Output leading zeros (blank '+' sign; sign at end) // Convert all-blank/null numeric fields to zeros e=0,ebcdic.err // Only report serious errors in EBCDIC.ERR // ********************************************************* 000100 // * * 000110 // * SAMPLE COPY-BOOK WITH MULTIPLE RECORDS * 000120 // * * 000130 // * RECSIZE = 114 BYTES * 000140 // * * 000150 // ********************************************************* 000160 tx "keyrec.asc" // Divert SAMPLE records to file "keyrec.asc" t 1,0xFFFFFF,0xE2C1D4 // EBCDIC "SAM" in columns 1-3 (up to 3 bytes can be masked) st 4,0xFFFFFF,0xD7D3C5 // EBCDIC "PLE" in columns 4-6 (but wanted to examine 6) l=114 // 000170 // 01 SAMPLE-REC. 000180 // 03 SA-KEY. 000190 e 1,7 // 1 10 SA-KEY-ALPHA PIC X(7). 000200 u 8,9 // 8 10 SA-KEY-NUMERIC PIC 9(2). 000250 d 10,14 u v2 // 10 03 SA-PAC-DEC-1 PIC 9(7)V99 COMP-3. 000300 x 15,19 // 20 03 FILLER PIC X(5). 000350 d 20,21 u // 20 03 SA-PAC-DEC-2 PIC 999 COMP-3. 000400 d 22,23 // 23 03 SA-PAC-DEC-3 PIC S9(3) COMP-3. 000450 x 24,29 // 27 03 FILLER PIC X(6). 000500 b 30,33 // 27 03 SA-SIGNED-BINARY PIC S9(8) COMP. 000550 x 34,41 // 36 03 FILLER PIC X(8). 000600 z 42,46 // 36 03 SA-ZONED-DECIMAL PIC S9(5). 000650 x 47,114 // 42 03 FILLER PIC X(68). 000700 // 41 (RecLen) tx "datarec.asc" // Divert remaining records to file "datarec.asc" td l=114 // . 01 SAMPLE-DATA REDEFINES SAMPLE-REC. 000750 // 03 SA-DATA. 
000800
    e 1,4        //   1  10  SA-DAT-PRODUCT-CD    PIC X(4).          000850
    u 5,9        //   5  10  SA-DAT-PRODUCT-NUM   PIC 9(5).          000900
    e 10,14      //  10  03  SA-DAT-LOCATION-CD   PIC X(5).          000950
    x 15,19      //  15  03  FILLER               PIC X(5).          001000
    d 20,21      //  15  03  SA-DAT-DEC-1         PIC S9(3) COMP-3.  001100
    d 22,23 u    //  19  03  SA-DAT-DEC-2         PIC 999 COMP-3.    001150
    x 24,29      //  22  03  FILLER               PIC X(6).          001200
    z 30,34      //  22  03  SA-DAT-ZDECIMAL      PIC S9(5).         001250
    x 35,42      //  28  03  FILLER               PIC X(8).          001300
    b 43,46,=9;  //  28  03  SA-DAT-SBINARY       PIC S9(9) COMP.    001350
    x 47,114     //  38  03  FILLER               PIC X(68).         001400
                 //  37 (RecLen)

Final editing included adding the "o=" and "e=" lines, changing 'e' to 'x'
for "FILLER" fields, changing the output filenames in the two "tx"
specifications, and editing the two "t" specifications. Note, in the first
case, this involved adding an "st" line to fully specify the record type. In
the second case, the "t" specification was replaced by the default "td" since
there are no other record types. Furthermore, freestanding COBOL comments
were either prefixed with "//" or deleted entirely. A possible third
alternative would be to bracket them with "p=" and "q=".

Note the difference in expressing the two binary fields SA-SIGNED-BINARY and
SA-DAT-SBINARY. The first record outputs eight decimal digits, our default,
which corresponds to the COBOL copy-book's PIC S9(8). Since this is a signed
value, the output field size is nine bytes. The second record explicitly sets
the output size to nine digits to agree with the copy-book. Since this is a
signed number, the output field is ten bytes.

After editing, RELAY.VDM was run to produce correct ASCII output columns in
the comment fields.

The file MRSAMP.LAY is now fully functional and can be used to translate the
supplied sample data file MRSAMP.EBC:

    vpw -x ebcdic-3.vdm mrsamp.ebc mrsamp.lay


COBOL2V.VDM and DBASE Output
----------------------------

The Level-3 translation package can now output records in DBASE format.
Specifying the "-u DBASE" DOS command line invocation option will generate the "r=reclen,DBASE-III" specification in the .LAY file. (It also generates a .XRF file containing short field names alongside the COBOL long names). The Level-3 package can automatically generate short field names to put into the DBASE header record. Or, the user can create the short names within the comment section of the .LAY file. If this is desired, the "-u DBASE ISHORT" options should be included on the COBOL2V.VDM invocation line. For example, the following output (compressed horizontally) was produced by: vpw -x cobol2v.vdm cobol.cop -a myfile.lay -u dbase ishort cc=17 r=114,DBASE-III // Input reclen, emit DBASE-3 records // * ************************************************* 000100 // * * 000110 // * SAMPLE COPY-BOOK * 000120 // * * 000130 // * RECSIZE = 114 BYTES * 000140 // * * 000150 // * ************************************************* 000160 // 000170 // 01 SAMPLE-REC. 000180 // 03 SA-KEY. 000190 e 1,7 // 2 F1 10 SA-KEY-ALPHA PIC X(7). 000200 u 8,9 // 9 F2 10 SA-KEY-NUMERIC PIC 9(2). 000250 d 10,14 u v2 // 11 F3 03 SA-PAC-DEC-1 PIC 9(7)V99 COMP-3. 000300 e 15,19 // 21 F4 03 FILLER PIC X(5). 000350 d 20,21 u // 26 F5 03 SA-PAC-DEC-2 PIC 999 COMP-3. 000400 d 22,23 // 29 F6 03 SA-PAC-DEC-3 PIC S9(3) COMP-3. 000450 e 24,29 // 33 F7 03 FILLER PIC X(6). 000500 b 30,33,=9; // 39 F8 03 SA-SIGNED-BINARY PIC S9(9) COMP. 000550 e 34,41 // 49 F9 03 FILLER PIC X(8). 000600 z 42,46 // 57 F10 03 SA-ZONED-DECIMAL PIC S9(5). 000650 e 47,50 // 63 F11 03 FILLER PIC X(4). 000700 u 51,55 // 67 F12 03 SA-ANOTHER-NUMERIC PIC 9(5). 000750 l 56,59,=7; v2 // 72 F13 03 SA-FLOAT PIC S9(5)V99 COMP-1. 000800 l 60,67,=12; v2 // 81 F14 03 SA-DOUBLE PIC S9(10)V99 COMP-2. 000850 e 68,114 // 95 F15 03 FILLER PIC X(47). 000900 // 141 (RecLen) Notice the location of the DBASE names. Names are limited to 10 characters; they must begin with a letter and consist only of letters and digits. 
The letter 'F', above, stands for "field". Also, notice that the first output
field starts in column 2. This is due to the initial DBASE "deleted-record"
field which contains an asterisk to denote a deleted record or is blank
otherwise.

The purpose of the ISHORT parameter is to produce a uniform
11-character-plus-one-space name field that the user can then easily
overwrite with the desired names. (This field was trimmed above for display
purposes.) If auto-generated names are acceptable, this name field is
unnecessary. Further, the field can be optionally produced later by
RELAY.VDM; but, since inactive lines are not adjusted, the commented-out
copy-book becomes misaligned.

After editing the DBASE names, it is necessary to run RELAY.VDM to produce a
.XRF file containing the names. E.g., continuing the above example,

    vpw -x relay.vdm myfile.lay

produces the file MYFILE.XRF as well as adjusting output column numbers for
any changes made to the .LAY file.

To instruct EBCDIC-3.VDM to use MYFILE.XRF, include "xrf=1" or "xrf=on" in
the .LAY file. Thus, a fully edited .LAY file might begin:

    //
    // MYFILE.LAY - Sample data layout file for converting EBCDIC files with
    //              packed fields into ASCII, output into DBASE style records.
    //              Produced from COBOL.COP.
    //
    r=114,DBASE-III   // Input reclen, emit DBASE-3 records
    e=0,ebcdic.err    // Only report serious errors in EBCDIC.ERR
    o=z,b2z           // Output leading zeros (blank '+' sign; sign at end)
                      // Convert all-blank/null numeric fields to zeros
    xrf=on            // <----- Necessary to force use of .XRF file with explicit field names
                      //        Otherwise, field names will be auto-generated.
    //
    // * *****************************************                    000100
    // *                                         *                    000110
    // *            SAMPLE COPY-BOOK             *                    000120
    // *                                         *                    000130
    // *          RECSIZE = 114 BYTES            *                    000140
    // *                                         *                    000150
    // * *****************************************                    000160
    //                                                                000170
    // 01  SAMPLE-REC.                                                000180
    //     03  SA-KEY.                                                000190
    e 1,7   //   2 KEYNAME   10  SA-KEY-ALPHA     PIC X(7).           000200
    u 8,9   //   9 KEYNUM    10  SA-KEY-NUMERIC   PIC 9(2).
000250 ... For files with multiple record types, be sure to include the "c=0,R" specifications before each COBOL record type, where 'R' should be a unique letter for each record type. This is especially important for DBASE output. This will generate 'tx "R.dbf",R' and "t" templates and the "l=reclen" specification. It is used to automatically generate 'out "RX.dbf",RX' ... 'out' brackets for records larger than 250 fields, where 'X' = 'A', 'B', 'C', etc. It is also used to automatically generate DBASE field names. For example, running COBOL2V.VDM on the initially discussed copy-book vpw -x cobol2v.vdm mrsamp.cop -a mrsamp.ld0 -u dbase ishort produces output that is partially shown, below: r=114,DBASE-III // Max reclen, emit DBASE-3 records *************************************************************** 000100 * * 000110 * SAMPLE COPY-BOOK WITH MULTIPLE RECORDS * 000120 * * 000130 * RECSIZE = 114 BYTES * 000140 * * 000150 *************************************************************** 000160 ... tx "K.dbf",K t 1,0xFF,0xF1 // Proto Record Type - EBCDIC 1 in column 1 l=114 // 000170 // 01 SAMPLE-REC. 000180 // 03 SA-KEY. 000190 e 1,7 // 2 K1 10 SA-KEY-ALPHA PIC X(7). 000200 u 8,9 // 9 K2 10 SA-KEY-NUMERIC PIC 9(2). 000250 ... // 130 (RecLen) tx "D.dbf",D t 1,0xFF,0xF1 // Proto Record Type - EBCDIC 1 in column 1 l=114 ... // . 01 SAMPLE-DATA REDEFINES SAMPLE-REC. 000750 // 03 SA-DATA. 000800 e 1,4 // 2 D1 10 SA-DAT-PRODUCT-CD PIC X(4). 000850 u 5,9 // 6 D2 10 SA-DAT-PRODUCT-NUM PIC 9(5). 000900 ... // 126 (RecLen) Note that the file names have extension ".dbf". Also note that the template letters 'K' and 'D' have been appended to the end of the "tx" specifications. Also, due to the "ishort" invocation parameter, the auto-generated field names have been included in the comment area past the ASCII-output-column numbers. They can be overwritten with explicit names. Refer to the general discussion, above. 
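When overwriting auto-generated names with explicit ones, it can be handy to
check each name against the DBASE naming rule stated above (at most ten
characters, beginning with a letter, letters and digits only). The following
Python sketch is for illustration only and is not part of the VEDIT package;
the function name is hypothetical, and treating "letters" as the ASCII letters
A-Z and a-z is an assumption.

```python
import re

# At most ten characters: one leading letter followed by up to nine
# letters or digits. ASCII-only letters are an assumption.
_DBASE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]{0,9}")


def is_valid_dbase_name(name):
    """Return True if 'name' satisfies the naming rule described in the text."""
    return _DBASE_NAME.fullmatch(name) is not None


for candidate in ("KEYNAME", "K1", "SA-KEY-ALPHA", "1KEY", "ABCDEFGHIJK"):
    print(candidate, is_valid_dbase_name(candidate))
```

The COBOL long names, such as SA-KEY-ALPHA, fail this check because of their
hyphens and length, which is why short names are needed for the DBASE header.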
Using the RELAY.VDM Macro
-------------------------

When an EBCDIC file with packed fields is converted into an ASCII file, the
records are longer in the ASCII file and the fields do not start in the same
column as in the EBCDIC file.

The supplied RELAY.VDM macro calculates the (ASCII) output column number for
each field in a .LAY file and adds the output column number to the .LAY file
in the comment field. When the DBASE option is selected, it also generates a
.XRF file containing user supplied field names. See topics "Options for
RELAY.VDM" and "RELAY.VDM and DBASE Output", below.

The ASCII output column numbers are very useful for viewing the ASCII file
and for creating a database "import filter".

For example, consider the sample .LAY from above:

    r=114,0           //Record length is 114, convert to DOS newlines
    o=z,b2z           //Output leading zeros (blank '+' sign; sign at end)
                      //Convert all-blank/null numeric fields to zeros
    e=0,ebcdic.err    //Only report serious errors in EBCDIC.ERR

    d 10-14 u v2      //Packed-decimal (no sign, explicit decimal point)
    d 20-21 u         //Packed-decimal (no sign)
    d 22-23           //Packed-decimal
    b 30-33,=9;       //Packed-binary (9 digits; default is 8)
    z 42-46           //Zoned number (5 digits plus sign)
    u 51-55           //Unsigned number; want leading blanks converted to zeros
                      //(o=b2z); otherwise redundant
    l 56-59,=7; v2    //Short (single precision) floating point; up to 7 digits
                      // with two past a decimal point
    l 60-67,=12; v2   //Long (double precision) floating point; up to 12 digits,
                      // 2 fractional

RELAY.VDM changes the sample .LAY file above to:

    r=114,0           //Record length is 114, convert to DOS newlines
    o=z,b2z           //Output leading zeros (blank '+' sign; sign at end)
                      //Convert all-blank/null numeric fields to zeros
    e=0,ebcdic.err    //Only report serious errors in EBCDIC.ERR

    d 10-14 u v2      //  10 Packed-decimal (no sign, explicit decimal point)
    d 20-21 u         //  25 Packed-decimal (no sign)
    d 22-23           //  28 Packed-decimal
    b 30-33,=9;       //  38 Packed-binary (9 digits; default is 8)
    z 42-46           //  56 Zoned number (5 digits plus sign)
    u 51-55           //  66 Unsigned number; want leading blanks converted to zeros
                      //(o=b2z); otherwise redundant
    l 56-59,=7; v2    //  71 Short (single precision) floating point; up to 7 digits
                      // with two past a decimal point
    l 60-67,=12; v2   //  80 Long (double precision) floating point; up to 12 digits,
                      // 2 fractional
                      //  93 (RecLen)

The RELAY.VDM macro is used by COBOL2V.VDM but may be run stand-alone on any
.LAY file:

    c:\vedit\vpw -x relay.vdm layout.lay

RELAY.VDM handles numeric output options, taking into account the signed or
unsigned option, explicit decimal points, the maximum number of digits to be
output, field padding options, the "quoted-comma-delimited" command "qcd",
any user specified output field delimiter string, e.g.
"v=Reg_Set(XDREG,/ | /)", and the DBASE output option (regarding which, see
below). It also handles multiple records.

Use this macro any time a substantive change is made to the layout file.


Options for RELAY.VDM
---------------------

RELAY.VDM now supports options on its invocation command line:

    c:\vedit\vpw -x relay.vdm layout.lay -u option_list

where option_list is any sequence of the following, blank separated:

    DOS UNIX MAC DBASE
    cc=col_num INSERT OVERWRITE DETAB RETAB ISHORT OSHORT CONVERT NOXREF NOD

The options from the first line cause the "r=reclen" line to be edited to
form:

    r=reclen,0            // DOS
    r=reclen,1            // UNIX
    r=reclen,2            // MAC
    r=reclen,DBASE-III    // DBASE

respectively. The default is to use the "r=reclen" line "as is". Output
column numbers are affected by the presence or absence of DBASE on the
"r=reclen" line. Also, see "RELAY.VDM and DBASE Output", below.

Specifying "col_num" via the "cc=col_num" option sets the start of the
comment fields to the specified column.

Ordinarily, when run "stand alone", output-column-numbers overwrite the first
several columns past the comment indicators. Use the "INSERT" option to
insert the column numbers instead of overwriting.

RELAY.VDM detabs the .LAY file when it begins.
It again detabs the file when finished, unless more than 6 tabs were present
in the source file. Use one of DETAB, to force, or RETAB, to prevent, the
second detabbing.

CONVERT forces RELAY.VDM to change any "+len" field specifications into the
standard "begin_col,end_col" format.

NOXREF prevents RELAY.VDM from generating/overwriting any .xrf file when the
DBASE option is present, whether specified by the "-u DBASE" option or by the
"r=len,DBASE-III" line in the .lay file.

NOD forces RELAY.VDM to ignore the presence of any explicit decimal point
"vn" option on any field specification.


RELAY.VDM and DBASE Output
--------------------------

Even when DBASE style output is wanted eventually, it may be desirable, while
producing the .LAY specifications, to output normal line termination
characters for easier viewing. To later convert the .LAY file for DBASE
output, just run RELAY.VDM with the "-u DBASE" option:

    vpw -x relay.vdm myfile.lay -a mydbf.lay -u DBASE

or

    copy myfile.lay myfile.sav
    vpw -x relay.vdm myfile.lay -u DBASE

This will edit the "r=reclen" specification to include the DBASE-III output
parameter and update the ASCII output column numbers to reflect the DBASE
initial "deleted-record" field. For example:

    r=114,DBASE-III  // Input reclen, emit DOS newlines
    ...
    e 1,7            //   2  10  SA-KEY-ALPHA   PIC X(7).            000200
    ...
                     // 132 (RecLen)

The (unedited) comment in the "r=" specification is now incorrect. The first
ASCII output field starts in column 2 because of the DBASE "deleted-record"
column.

Alternatively, one could edit the .LAY file, replacing ",0" with
",DBASE-III" and then run RELAY.VDM to update the ASCII output column
numbers:

    vpw -x relay.vdm myfile.lay

If explicit DBASE field names are desired, you might want to use the "ISHORT"
parameter:

    vpw -x relay.vdm myfile.lay -u dbase ishort

This will generate short names for each field in a uniform 12-space field.
One can then overwrite the names with the desired explicit field names.
You must also include the "xrf=on" or "xrf=1" specification in the .LAY file. When finished creating the short names, it is necessary to run RELAY.VDM again to generate a .XRF file containing the explicit names: copy myfile.lay myfile.sav // Be safe! vpw -x relay.vdm myfile.lay (Notice the absence of the "-u" parameter). The pertinent lines from the .LAY file are shown below. See "COBOL2V.VDM and DBASE Output", above, for a more complete example. r=114,DBASE-III // Input reclen, emit DBASE-3 records xrf=on // <----- Necessary to force use of .XRF file with explicit field names // Otherwise, field names will be auto-generated. // Also, RELAY.VDM must have been run to generate the .XRF file e 1,7 // 2 KEYNAME 10 SA-KEY-ALPHA PIC X(7). 000200 u 8,9 // 9 KEYNUM 10 SA-KEY-NUMERIC PIC 9(2). 000250 ... Note: "ISHORT" inserts; whereas, "OSHORT" overstrikes. Custom Field Conversion ----------------------- The layout file can specify the field type of "c" to convert a field using a custom VEDIT macro. This topic is complex and specialized; only an overview is given here. Custom conversion fields don't necessarily have anything to do with EBCDIC. They could be used to clean up a file which has already been converted to ASCII, or even to convert a file which was exported from a Window/DOS database or spreadsheet program. For example, it could be used to create a file with quoted and comma-delimited fields. The supplied custom macro LEADZERO.CUS converts unpacked EBCDIC numbers to ASCII and then strips any leading zeroes. It optionally inserts a period before the second last or third last digit. When specifying custom fields, the name of the custom macro must also be specified. For example, the following lines in the layout file use LEADZERO.CUS to convert one field: v=Reg_Load(EBC_Settings(CustomMacro),'leadzero.cus') c 15,19 :2 //Insert "." before last 2 digits The ":2" is an option that can be passed to the custom macro. 
Its meaning will depend upon the particular custom macro.

To enable RELAY.VDM to properly generate ASCII output columns for each
field into the comment field of the .LAY file, any "inflation" must be
indicated using the "i=inflation;" option. E.g.,

    c 15,19 i=1; :2   // Inserting '.' before the last 2 digits inflates
                      // the output by one column.

If any custom field is encountered in the .LAY file during
preprocessing, the file "EBCDIC-3.VCM" is loaded. This contains code to
process the standard fields. Thus, if a field is to be treated specially
only under certain conditions, the custom code can process it in those
cases and otherwise just pass the field along. To do this, just
"call('code_letter'-32)". E.g., if the custom macro wishes to simply
unpack a packed decimal field with standard options: "call('d'-32)".
Both standard and custom options are available to the custom macro.

The syntax for the custom field in the .LAY file is:

    c colspec [sop] [i=inflation;] [#r] [:cusops] [// comments]

where "sop" are the standard options for any field; "i=inflation" has
been discussed above; "r" is the text register containing the custom
macro (default is EBC_Settings(CustomMacro) which is Text_Register 12
unless the conversion is being performed under WILDFILE.VDM or
WILDFWIZ.VDM).

To access the custom parameters, switch to the .LAY buffer; CurPos will
be just past the ":". E.g.,

    Buf_Switch(XLAY)
    process custom parameters starting at CurPos.
    Buf_Switch(XDATA)
    process XInpFieldSize data bytes starting at CurPos.

The custom macro must completely process the specified bytes, leaving
CurPos at the start of the next data field.

Users who need custom conversions and are very familiar with the VEDIT
macro language can use LEADZERO.CUS as a template for creating their own
custom macros. LEADZERO.CUS and EBCDIC-3.VDM contain internal (terse)
documentation on how custom macros work. However, users may want to
contract us to write any needed custom macros.
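
The effect of a LEADZERO-style custom field, and the reason the
"i=inflation;" option is needed, can be illustrated outside of VEDIT.
The following is only a rough Python sketch, not the LEADZERO.CUS macro
itself. The function name leadzero and its parameters are hypothetical.
It assumes the field holds unsigned unpacked digits in EBCDIC code page
037, and it assumes that an all-zero integer part is kept as a single
"0"; the actual macro may behave differently in these edge cases.

```python
def leadzero(field: bytes, decimals: int = 0) -> str:
    """Convert unsigned unpacked EBCDIC digits to ASCII, strip leading
    zeroes and optionally insert '.' before the last `decimals` digits.
    (Hypothetical helper; an illustration only.)"""
    digits = field.decode("cp037")      # EBCDIC 0xF0..0xF9 -> '0'..'9'
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("field is not unsigned unpacked EBCDIC digits")
    if decimals:
        whole, frac = digits[:-decimals], digits[-decimals:]
        # Assumption: keep one zero when the integer part is all zeroes.
        return (whole.lstrip("0") or "0") + "." + frac
    return digits.lstrip("0") or "0"


if __name__ == "__main__":
    raw = bytes([0xF0, 0xF0, 0xF1, 0xF2, 0xF3])   # EBCDIC "00123"
    print(leadzero(raw, 2))                       # period before last 2 digits
```

A 5-byte input field such as EBCDIC "12345" becomes the 6-character
string "123.45" when the period is inserted, which is the one column of
inflation that "i=1;" declares to RELAY.VDM.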
Divide and Conquer
------------------

Much time and effort can be saved when creating a .LAY file to convert
EBCDIC files containing multiple record types by first extracting each
record type into its own file.

Properly specifying the various record types is an absolute requirement
for successful translation. By creating a simple extraction .LAY file,
one can focus one's attention on this task alone; the extraction process
is ten times faster than a full translation (and one will probably do
this more than once before the records are fully described); and the
result is a group of EBCDIC files each containing just one record type.

These smaller files can then be used as the source files as one builds
the full translation layout specification. Also, they can be viewed in
VEDIT (see next topic) to help resolve any discrepancies between the
COBOL copy-book, the actual data, and the layout specifications.

The file EXTRACT.LAY has been provided as an example of an extraction
layout file. It is a trimmed down version of MULTIREC.LAY with a few
other changes.


Viewing EBCDIC Data Files
-------------------------

Viewing EBCDIC data files in native mode is difficult when these files
contain multiple record types of varying size. Appending ASCII newline
characters (Carriage-Return and Line-Feed) to the end of each record
makes this much easier. One can modify the extraction layout file
created in the above topic to do this.

File NEWLINE.LAY has been provided as an example of this. It was created
from EXTRACT.LAY by adding ",0" to the "r=" line and deleting all "tx"
lines.

This, of course, creates a file where every record is two bytes longer
than documented. The file should only be used for viewing purposes
unless one modifies the full conversion .LAY file to account for these
two bytes. (The simplest change would be to add 2 to every record length
specification and to delete the extra two bytes at the end of each
record with "x" commands.)
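
The two-byte growth per record can be seen in a small sketch written in
Python rather than as a layout file. It is only an illustration under a
simplifying assumption: every record has the same fixed length, whereas
NEWLINE.LAY handles multiple record types. The function name
add_newlines is hypothetical.

```python
def add_newlines(data: bytes, reclen: int) -> bytes:
    """Append ASCII CR/LF after every fixed-length record.
    (Hypothetical helper; assumes a single record length.)"""
    if reclen <= 0:
        raise ValueError("record length must be positive")
    if len(data) % reclen:
        raise ValueError("data is not a whole number of records")
    # Slice the data into records and append the two newline bytes to each.
    return b"".join(
        data[i:i + reclen] + b"\r\n" for i in range(0, len(data), reclen)
    )


if __name__ == "__main__":
    sample = bytes(range(0xF0, 0xFA)) * 2          # two 10-byte EBCDIC records
    viewed = add_newlines(sample, 10)
    print(len(sample), len(viewed))                # each record grew by 2 bytes
```

Because every record in the result is reclen + 2 bytes long, any layout
file applied to it must use the longer record length and discard the two
trailing bytes.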
To view the EBCDIC file containing ASCII newline characters: after
loading it into VEDIT, set {CONFIG, File Handling, File Type} to '3'.
Select {VIEW, Toggle hex mode split} from the menu. Click on the smaller
window and toggle it into EBCDIC mode via hot-key . You can hold down
the key and press a few times until "EBCDIC mode" appears on the left
side of the status bar.

You can now view the EBCDIC text in the smaller window and packed values
in the hex window. These windows are linked. As you move about in one
window, the other window follows.

It is possible that some binary field may contain the value 0x0d0a. This
will cause the record to wrap at this point; but the vast majority of
records should appear on their own lines.


Contact Greenview Data, Inc
---------------------------

Sales:         1-800-458-3348    sales@vedit.com
Tech support:  (734) 996-1300    support@vedit.com
Fax:           (734) 996-1308
Website:       www.vedit.com

Greenview Data, Inc.
PO Box 1586
Ann Arbor, MI 48106