CS 4/53111 - Project 1
Symbol Table Package & Lexical Analyzer
Due Thursday, February 16, 2006

Design and code a symbol table package, a lexical analyzer, and a main program to test the project. Projects 2 and 3 will use the symbol table package and lexical analyzer that you code in this project.

Symbol Table Package: The structure of the symbol table should follow the suggestions in section [7.6] of the course notes. As a minimum the structure of each entry in the symbol table should contain the following fields:

To prepare for Projects 2 and 3, you can also add two more fields, Low and High, to the structure of each table entry. These fields will hold pointers to other entries in the symbol table.

The symbol table package should include the following functions:

Both FIND and INSERT hash their arguments to select one of the linked lists in the symbol table. INSERT should always insert the new entry at the start of the selected list, so that if the same lexeme has multiple entries, FIND will return the entry that was inserted last.
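The insert-at-head behaviour described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the required design: the Entry layout, the bucket count, the hash function, the zero-based entry numbering, and the exact signatures of find and insert are all hypothetical, and the real fields and functions should follow the lists given in this handout and the course notes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 211   /* assumed bucket count; not specified by the handout */

/* Hypothetical entry layout: only a lexeme, a token type, and an entry
   number are assumed here. The required fields may differ. */
typedef struct Entry {
    char *lexeme;
    int token_type;
    int entry_number;
    struct Entry *next;   /* next entry in the same bucket's list */
} Entry;

static Entry *buckets[TABLE_SIZE];   /* one linked list per hash value */
static int next_entry_number = 0;    /* assumed numbering scheme, starting at 0 */

/* Simple multiplicative string hash (an arbitrary choice for this sketch). */
static unsigned hash(const char *s) {
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Return the most recently inserted entry for lexeme, or NULL if none. */
Entry *find(const char *lexeme) {
    assert(lexeme != NULL);
    for (Entry *e = buckets[hash(lexeme)]; e != NULL; e = e->next)
        if (strcmp(e->lexeme, lexeme) == 0)
            return e;   /* first match in the list is the newest one */
    return NULL;
}

/* Create a new entry and link it at the start of its bucket's list. */
Entry *insert(const char *lexeme, int token_type) {
    assert(lexeme != NULL);
    Entry *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->lexeme = malloc(strlen(lexeme) + 1);
    if (e->lexeme == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->lexeme, lexeme);
    e->token_type = token_type;
    e->entry_number = next_entry_number++;
    unsigned h = hash(lexeme);
    e->next = buckets[h];   /* new entry points at the old head ... */
    buckets[h] = e;         /* ... and becomes the new head */
    return e;
}
```

Because a new entry always becomes the head of its list, a linear scan from the head meets the newest entry for a lexeme first, which is what lets find return the last one inserted without any extra bookkeeping.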

Lexical Analyzer: Write the lexical analyzer as a function with no arguments and no return value. Each time the function is called, it stores the token-type of the next input token in a global variable, lookahead, and a pointer to a symbol table entry in another global variable, attributes. Comments are delimited by braces, { and }, and should be treated as white-space. Spaces, tabs, and newlines are also white-space.
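As a small illustration of the white-space rule, the helper below skips spaces, tabs, newlines, and brace-delimited comments. It is a sketch under assumptions: the name skip_blanks is hypothetical, it scans a string in memory rather than the input file, it treats the first } as the end of a comment (no nesting), and it silently stops at the end of the input if a comment is never closed.

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the first character of p that is neither
   white-space nor part of a { ... } comment (hypothetical helper). */
const char *skip_blanks(const char *p) {
    assert(p != NULL);
    for (;;) {
        if (*p == ' ' || *p == '\t' || *p == '\n') {
            p++;                              /* ordinary white-space */
        } else if (*p == '{') {
            while (*p != '\0' && *p != '}')   /* skip the comment body */
                p++;
            if (*p == '}')
                p++;                          /* consume the closing brace */
        } else {
            return p;                         /* start of a token, or end of input */
        }
    }
}
```

The lexical analyzer itself could call a routine like this at the start of each invocation, before deciding which token-type to store in lookahead.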

Testing: Test the code with a main program that writes a text file containing all tokens found in a Pascal source input file. Your output file will be checked electronically so be careful how you format it. Output a line of text for each token found in the source file. Each output line has at least three fields with spaces and/or tabs between the fields:

  1. the token-type (spelled as shown in this list);
  2. the lexeme;
  3. the entry-number of the symbol table entry.
Other information can be added to an output line as long as it is separated from the entry-number by spaces and/or tabs. You can debug your project using test1in as a source file: the output file for this particular input file should look like test1out. Follow these guidelines to submit your project.
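One way to produce an output line with the three required fields is sketched below. The function name format_token_line and its parameters are hypothetical, and the choice of a single tab between fields is just one of the separators the format allows; the token-type spelling must come from the list of token-types linked from this handout.

```c
#include <assert.h>
#include <stdio.h>

/* Write "token-type <tab> lexeme <tab> entry-number <newline>" into buf.
   Returns the value of snprintf: the number of characters the full line
   needs, excluding the terminating null character. */
int format_token_line(char *buf, size_t size, const char *token_type,
                      const char *lexeme, int entry_number) {
    assert(buf != NULL && token_type != NULL && lexeme != NULL);
    return snprintf(buf, size, "%s\t%s\t%d\n",
                    token_type, lexeme, entry_number);
}
```

Any extra debugging information would go after the entry-number, separated from it by another space or tab, so that the first three fields can still be checked electronically.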
Click here for some hints you can use.
Click here for a list of all token-types and their corresponding lexemes.
Kenneth E. Batcher - 1/3/2006