Project 2 (100 points)
Project 2 is due 9/9/15 on or before 11:59:59pm MST.
You will be given a lexer that reads tokens from standard input. Your
goal is to write, in C or C++, a program that reads all tokens from
standard input by calling the lexer function getToken()
and
stores certain tokens in a linked list. After all tokens are read,
your program will print out the content of the linked list in a
specific order.
Implementation
Your implementation must conform to the CSE 340 projects guideline that the lead TA, Mohsen, has written.
Lexer API
There are two functions that the lexer defines. These two functions compose the application
programming interface (API) of our lexer. These functions are declared
in lexer.h
(and implemented in lexer.c
). You will find the files
lexer.h
and lexer.c
on the
submission site for project 2.
getToken()
reads the next token from standard input and returns its type as atoken_type
enum. If the token is of typeID
,NUM
,IF
,WHILE
,DO
,THEN
, orPRINT
then the actual token value is stored in the global variablecurrent_token
as a null-terminated character array and the length of the string is stored in the global variabletoken_length
. There are two specialtoken_type
values,END_OF_FILE
, which is returned when the lexer encounters the end of standard input andERROR
, which is returned when the lexer encounters an unrecognized character in the input.ungetToken()
causes the next call togetToken()
to return the last token read by the previous call togetToken()
. Note that this means the next call togetToken()
will not read from standard input.
There are four global variables declared in lexer.h
that are set
when getToken()
is called:
t_type
the token type is stored here (note that this will be the same value that was returned bygetToken()
.current_token
the token value is stored in the arraycurrent_token
. If the token is of typeID
,NUM
,IF
,WHILE
,DO
,THEN
, orPRINT
, thencurrent_token
contains the token string. For all other token types,current_token
contains the empty string.token_length
: the length of the string stored incurrent_token
.line
: the current line number of the input when the token was read.
Requirements
Your program should use the provided lexer and read all tokens from
the input by repeatedly calling the getToken()
function. Certain
token strings and additional data should be stored in a linked list.
Specifically, if either of the following conditions is true:
The token is of type
NUM
The token is of type
ID
and the actual token is equal to one of the following values:"cse340"
,"programming"
, or"language"
Then the token string and other information needs to be stored in a node of a linked list. The information that needs to be stored about each of these tokens in the linked list is the following:
Token type (from
t_type
)Token value (from
current_token
)Line number from the input where token was read (from
line
)
After reading all tokens from the input and storing information about tokens that match the criteria, your program should go over the linked list and print the information in reverse order from when that token was encountered. Each of the tokens in the linked list must be printed to standard output on a separate line with the following format:
<line> <token_type_string> <token value>
Note that <token_type_string>
is the textual representation of the
token type. In this case, the possible values are ID
and NUM
.
Evaluation
Your submission will be graded on the automated test cases passing, however simply passing all the test cases is not enough for full credit.
Your submission will be inspected by the TAs to give you feedback on
potential errors in your code. You are required to store the
information in a linked list that you write (either single or double
linked list). The nodes in the linked list must be allocated on the
heap (using malloc
or other similar functions like calloc
), and
the allocated memory must be freed after printing the output. You are
not allowed to use the STL linked list libraries.
Note that this means you are not allowed to use the C++ new
operator
to allocate memory (you must use the C memory allocation functions).
Example
Here is an example input with four lines:
cse340 < < + 123 *
456 programming
- cse 340 , LANGUAGE 100
. ; WHILE 200 IF
Here is your program’s expected output:
4 NUM 200
3 NUM 100
3 NUM 340
2 ID programming
2 NUM 456
1 NUM 123
1 ID cse340
Submission
Submit your project here before the deadline.
Please do not submit lexer.h
or lexer.c
, as these will already be
included when you submit. You are not allowed to make modifications to
lexer.h
or lexer.c
.
Credit
This project description was adapted from Prof. Rida Bazzi, and is used with permission.