Latest update Jan 9, 2020.

Assignment 3 - Some Simple Natural Language Processing

The task of this lab assignment is to implement some extremely rudimentary natural language processing. Your program will read strings from the console or a text file, interpret each string as a list of words, and parse each list of words as a sentence. The sentences are to be represented in symbolic form, so that they can be processed at a high level.

More specifically, we have the following restrictions:

Your system should not be case-sensitive: it must recognize words written in any mix of upper- and lower-case letters. However, sentences should be output in proper format, with the first letter of the first word capitalized and all other letters in lower case. When written by your system, claims should be terminated by a period and questions by a question mark. Input sentences, however, should be recognized as claims or questions regardless of how they are terminated.
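
As an illustration of the output format only, here is a minimal F# sketch. The name formatSentence and the choice of passing the terminator as a string are assumptions made for this example; they are not prescribed by the assignment.

```fsharp
// Hypothetical helper: formats a list of words as an output sentence.
// `terminator` is assumed to be "." for a claim and "?" for a question.
let formatSentence (terminator: string) (words: string list) : string =
    match words with
    | [] -> ""
    | first :: rest ->
        let lower = first.ToLower()
        // Capitalize only the first letter of the first word.
        let capitalized =
            if lower.Length = 0 then lower
            else lower.Substring(0, 1).ToUpper() + lower.Substring(1)
        // All remaining words are written in lower case.
        let body = capitalized :: (rest |> List.map (fun w -> w.ToLower()))
        String.concat " " body + terminator
```

For example, formatSentence "?" ["IS"; "the"; "Cat"; "black"] evaluates to "Is the cat black?".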

You decide the rules for how to transform a string into a list of words. However, your decision must be sensible, and you must be able to justify it.
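
One possible rule, shown here only as a sketch, is to split on spaces and tabs, strip punctuation from both ends of each word, and convert everything to lower case. The name toWords and the set of punctuation characters are assumptions; other rules may be just as sensible.

```fsharp
open System

// Assumed rule (one of many possible): words are separated by spaces or tabs,
// and punctuation at the start or end of a word is not part of the word.
let toWords (line: string) : string list =
    line.Split([| ' '; '\t' |], StringSplitOptions.RemoveEmptyEntries)
    |> Array.map (fun w -> w.Trim([| '.'; '?'; '!'; ','; ';'; ':' |]).ToLower())
    // A token consisting only of punctuation becomes empty and is dropped.
    |> Array.filter (fun w -> w <> "")
    |> Array.toList
```

With this rule, toWords "  Is the CAT black? " yields ["is"; "the"; "cat"; "black"]. Note that the terminator is discarded, which fits the requirement that input sentences are classified regardless of how they are terminated.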

Now write a program that does the following:

  1. Reads a line from either the console or a text file whose lines contain sentences (your solution must be able to handle both),
  2. Transforms it into a list of words,
  3. Tries to interpret the list of words as either a claim or a question,
  4. If it is interpreted as a claim, prints the corresponding question on the screen,
  5. If it is interpreted as a question, prints the corresponding claim on the screen,
  6. If it fails to interpret the list of words, terminates. Otherwise, repeats from 1.
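
The steps above could be organized roughly as in the following sketch. It deliberately leaves out the interpretation itself: interpret is a hypothetical function, supplied by the caller, that is assumed to return Some reply when a line is understood and None otherwise. Reading through a TextReader is one way to treat the console (System.Console.In) and a file (System.IO.File.OpenText) uniformly; the names run and Interpreter and the message texts are assumptions.

```fsharp
open System
open System.IO

// Hypothetical signature: maps an input line to the sentence to print,
// or None if the line cannot be interpreted.
type Interpreter = string -> string option

// Processes lines until end of input or the first uninterpretable line.
// Returns an exit message instead of raising an exception.
let rec run (interpret: Interpreter) (input: TextReader) (output: TextWriter) : string =
    match input.ReadLine() with
    | null -> "End of input reached. Goodbye."   // ReadLine returns null at end of input
    | line ->
        match interpret line with
        | Some reply ->
            output.WriteLine(reply: string)
            run interpret input output            // repeat from step 1
        | None -> sprintf "Could not interpret \"%s\". Goodbye." line
```

A caller could then write printfn "%s" (run myInterpreter Console.In Console.Out), where myInterpreter stands for your own function. Because the recursive call is in tail position, long input files do not grow the stack.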

The program should (of course) also terminate when reaching end-of-file while reading input from a file.

In all cases the program should terminate in an orderly fashion, with a relevant exit message. Simply aborting execution with failwith will not be accepted.
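
Orderly termination also covers the case where the input file cannot be opened. The sketch below shows one way to turn such a failure into a value that the caller can report, rather than an unhandled exception. The name tryOpenInput and the convention that None means "read from the console" are assumptions made for this example.

```fsharp
open System
open System.IO

// Tries to obtain a reader for the input.
// None is assumed to mean "use the console"; Some path means "use this file".
// I/O failures are returned as Error with a message instead of being thrown.
let tryOpenInput (path: string option) : Result<TextReader, string> =
    match path with
    | None -> Ok Console.In
    | Some p ->
        try
            Ok (File.OpenText p :> TextReader)
        with
        | :? IOException as ex ->
            // Covers missing files and directories, among other I/O errors.
            Error (sprintf "Cannot read '%s': %s" p ex.Message)
        | :? UnauthorizedAccessException as ex ->
            Error (sprintf "Access to '%s' denied: %s" p ex.Message)
```

The caller can match on the result, print the message in the Error case, and return from the entry point normally. The sketch does not handle malformed paths such as the empty string, which raise other exception types.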

Think carefully about how to structure your solution into different functions, and how to represent sentences symbolically.

Hints: System.Console.ReadLine() : string reads a line from the console; it returns null when no more input is available. The methods .ToLower() and .ToUpper() on strings can also come in handy.


Björn Lisper
bjorn.lisper (at) mdh.se