
MAINFRAME – Special characters in the data (EBCDIC values)


We sometimes receive a file from another team or company that is fed as input to a program. The file may contain name fields with special characters embedded in them. We need to clean out those special characters so that only valid numeric and alphabetic characters remain.

We can do this by checking each character in the name field:
– If it is a special character, replace it with a space
– If it is lowercase, convert it to uppercase
– If it is uppercase or numeric, it is good to go

Below is the code snippet for the declaration:

01  WS-TEXT                      PIC X(60).
01  FILLER REDEFINES WS-TEXT.
    05  WS-TEXT-CHAR             PIC X(01) OCCURS 60
                                 INDEXED BY IDX.
        88  WS-PUNCTUATION-CHAR  VALUES '¢', '.', '<', '(', '+', '|',
                                        '&', '!', '$', '*', ')', ';',
                                        '¬', '-', '/', '¦', ',', '%',
                                        '_', '>', '?', '`', ':', '#',
                                        '@', '''', '=', '"', '~', '{',
                                        '}', '\'.
        88  WS-ALPHA-CHAR        VALUES 'A' THRU 'I',
                                        'J' THRU 'R',
                                        'S' THRU 'Z'.
        88  WS-NUMERIC-CHAR      VALUES '0' THRU '9'.
        88  WS-LOWERCASE-CHAR    VALUES 'a' THRU 'z'.

The above declaration has an issue with how the WS-LOWERCASE-CHAR condition is defined: it lets a few special characters enter the system as valid. The explanation follows.

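1. Note that in the ASCII and EBCDIC tables below, the letters a-z do not have consecutive hex values in EBCDIC. They fall into three separate blocks (a-i, j-r and s-z) with other code points in the gaps between them.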

2. Also note that the uppercase characters A-Z are split in the declaration along the same hex-value gaps, so only the letters themselves are considered valid characters.

88  WS-ALPHA-CHAR        VALUES 'A' THRU 'I',
                                'J' THRU 'R',
                                'S' THRU 'Z'.

3. For lowercase a-z, however, the code says VALUES 'a' THRU 'z', so the check accepts every value between a and z, which includes characters such as æ and Æ. These values slipped through the logic and ended up in DB2 tables.

88  WS-LOWERCASE-CHAR    VALUES 'a' THRU 'z'.
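One possible fix, mirroring the split already used for WS-ALPHA-CHAR, is to declare the lowercase range as three sub-ranges so the code points in the EBCDIC gaps no longer qualify. The program below is only a minimal sketch of that idea together with the character-by-character clean-up described at the top of this post. The program name CLEANTXT, the paragraph name and the sample input are made up for illustration, and the use of WHEN OTHER (which blanks anything that is not a letter or digit, rather than testing WS-PUNCTUATION-CHAR explicitly) is an assumption about the desired behaviour, not the original program's logic.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLEANTXT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TEXT                   PIC X(60).
       01  FILLER REDEFINES WS-TEXT.
           05  WS-TEXT-CHAR          PIC X(01) OCCURS 60
                                     INDEXED BY IDX.
               88  WS-ALPHA-CHAR     VALUES 'A' THRU 'I',
                                            'J' THRU 'R',
                                            'S' THRU 'Z'.
               88  WS-NUMERIC-CHAR   VALUES '0' THRU '9'.
      * Lowercase is split at the same points as uppercase, so the
      * EBCDIC code points between i/j and r/s are excluded.
               88  WS-LOWERCASE-CHAR VALUES 'a' THRU 'i',
                                            'j' THRU 'r',
                                            's' THRU 'z'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Hypothetical sample input, for illustration only.
           MOVE 'john o''neil-smith #42' TO WS-TEXT
           PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX > 60
               EVALUATE TRUE
      *            Uppercase letters and digits are kept as they are.
                   WHEN WS-ALPHA-CHAR (IDX)
                   WHEN WS-NUMERIC-CHAR (IDX)
                       CONTINUE
      *            Lowercase letters are converted to uppercase.
                   WHEN WS-LOWERCASE-CHAR (IDX)
                       MOVE FUNCTION UPPER-CASE (WS-TEXT-CHAR (IDX))
                         TO WS-TEXT-CHAR (IDX)
      *            Assumption: anything else is replaced by a space.
                   WHEN OTHER
                       MOVE SPACE TO WS-TEXT-CHAR (IDX)
               END-EVALUATE
           END-PERFORM
           DISPLAY WS-TEXT
           GOBACK.
```

With this declaration a character such as æ falls into the WHEN OTHER branch and is blanked out instead of being treated as a lowercase letter. In the sample input, the apostrophe, the hyphen and the # are replaced by spaces and the letters are converted to uppercase.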

The ASCII and EBCDIC Tables

EBCDIC HEX TABLE
