Your employer Amalgamated Data Associates has a large body of shell scripts, originally developed under Unix, and is engaged in a project to port them to an embedded device with a very small footprint. The operating system on this device is so stripped down that it supports only the minimal utilities required by the POSIX standards. Unfortunately some of the scripts date back a decade or more, and though they work just fine under Unix they use facilities that POSIX doesn't standardize, or standardizes in a different way.
Now, one could just try running these scripts on the small device and see what breaks, but as it happens this device is an automated stent repairer, a medical device that resides inside a heart patient's coronary arteries; if it misbehaves, patients might get very sick. So your boss decides on a more-systematic approach: she wants to instrument the application in a test environment to capture traces of all the commands that it executes, to check each trace for conformance to the POSIX rules, and to report any usages that don't conform to POSIX.
Your boss splits this job into several pieces. Yours is to write the option usage checker, which verifies that all options are used in conformance with the POSIX guidelines and that all option names are specified by the standard. The checker need not worry about whether the command "makes sense" or is allowed by POSIX; it needs only to check that options (with or without option-arguments) are limited to those described by POSIX. Another part of the checking system, not part of this assignment, will do the other tests.
For example, your checker should reject the command "cd -x" because POSIX does not specify the -x option for the cd command, but it should accept the command "cd a b" because it contains no option-related errors, even though POSIX says the command is invalid because it has too many operands.
Since the overall software-engineering environment uses OCaml, your checker will be written in OCaml. Your boss gives you the following interface specification.
Write an OCaml function conforming_options synopses command that takes two arguments. The first argument synopses is a list of command synopses as described below; the second argument command is a single command, represented as a list of argument strings. The function conforming_options should return true if the command is a valid command that follows one of the synopses and conforms to the POSIX Utility Conventions; it should return false otherwise.
A synopsis is a 3-tuple of strings (name, sansargs, withargs). The first string is the name of the utility; the second contains the letters or digits corresponding to options that do not have option-arguments; the third contains the letters or digits for options that have option-arguments. You may assume that the second and third strings have no two characters in common, and contain only ASCII letters and digits. You may also assume that no two synopses in the list of synopses have the same utility name.
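As an illustration only, the synopsis representation described above could be handled with a few small helpers. The names synopsis, find_synopsis, is_flag, and takes_argument are hypothetical and are not part of the required interface; this is a minimal sketch, not a reference solution.

```ocaml
(* Hypothetical alias for the (name, sansargs, withargs) triple. *)
type synopsis = string * string * string

(* Return the synopsis whose utility name equals [name], if there is one.
   Utility names are assumed unique, so the first match is the only match. *)
let rec find_synopsis (synopses : synopsis list) (name : string)
    : synopsis option =
  match synopses with
  | [] -> None
  | ((n, _, _) as s) :: rest ->
      if n = name then Some s else find_synopsis rest name

(* True if [c] is an option letter that has no option-argument. *)
let is_flag ((_, sansargs, _) : synopsis) (c : char) : bool =
  String.contains sansargs c

(* True if [c] is an option letter that has an option-argument. *)
let takes_argument ((_, _, withargs) : synopsis) (c : char) : bool =
  String.contains withargs c
```

With the synopsis ("sort", "cmbdfinru", "tko"), is_flag holds for 'm' and takes_argument holds for 'o'; a utility name that appears in no synopsis yields None.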
For this assignment, you need not worry about OCaml's module facilities: we will test your code by including it directly in the test environment.
To turn in your assignment, submit a file util.ml containing your definitions. The first line of util.ml should be a comment containing your name and student ID. Make sure that your implementation works with the OCaml implementation installed on SEASnet.
# #use "util.ml";;
…
# let synopses =
[("cat", "u", "");
("ls", "CFRacdilqrtu1HLfgmnopsx", "");
("sort", "cmbdfinru", "tko");
("cd", "LP", "");
("more", "ceisu", "npt");
("uniq", "cdu", "fs")];;
val synopses : (string * string * string) list =
[("cat", "u", ""); ("ls", "CFRacdilqrtu1HLfgmnopsx", "");
("sort", "cmbdfinru", "tko"); ("cd", "LP", ""); ("more", "ceisu", "npt");
("uniq", "cdu", "fs")]
# let t command = conforming_options synopses command;;
val t : string list -> bool = <fun>
# t ["cd"; "-x"];;
- : bool = false
# t ["cd"; "a"; "b"];;
- : bool = true
# t [];;
- : bool = false
# t [""];;
- : bool = false
# t ["foo"];;
- : bool = false
# t ["cd"];;
- : bool = true
# t ["cd"; ""];;
- : bool = true
# t ["ls"; "-l"; "-qCF"];;
- : bool = true
# t ["cd"; "--"; "dir"];;
- : bool = true
# t ["sort"; "-m"; "-o"; "output"; "input"];;
- : bool = true
# t ["sort"; "-o"];;
- : bool = false
# t ["sort"; "--"; "-o"];;
- : bool = true
# t ["sort"; "-mo"; "output"; "input"];;
- : bool = false
This section answers questions that came up after the assignment was originally published.
Q1. Is it a valid command to have a utility name followed by an argument and an option without argument? Is it ok to have a command like "ls a -l"? It is invalid by POSIX but I am not sure whether you consider it as valid from the option usage checker's point of view.
A1. Good question. This turns out to be a controversial issue, and I plan to submit a request for clarification to the POSIX committee about this. For the purpose of this assignment, please reject any attempts to use any argument beginning with "-" after an operand, unless the operands are preceded by a "--" delimiter. The only exception is the single-character argument "-", which always counts as an operand and is allowed anywhere any other operand is allowed. For example, "ls a -l" and "ls a --" are invalid as far as your option usage checker is concerned, but "ls -- a -l", "ls -- a --", and "ls -a -" are valid.
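One way to picture the rule in A1 is to classify each argument by its shape and then require that, once an operand has appeared without a preceding "--" delimiter, every later argument is also an operand. The names arg_kind, classify, and only_operands below are hypothetical; this is a sketch under that reading of A1, not a prescribed design.

```ocaml
(* Hypothetical classification of an argument by its shape alone. *)
type arg_kind =
  | Delimiter     (* exactly "--" *)
  | Option_group  (* "-" followed by at least one more character *)
  | Operand       (* anything else, including "" and the lone "-" *)

let classify (arg : string) : arg_kind =
  if arg = "--" then Delimiter
  else if String.length arg >= 2 && arg.[0] = '-' then Option_group
  else Operand

(* After the first operand, with no "--" delimiter seen earlier,
   every remaining argument must itself be an operand. *)
let rec only_operands (args : string list) : bool =
  match args with
  | [] -> true
  | a :: rest -> classify a = Operand && only_operands rest
```

Under this sketch the tail ["a"; "-l"] of "ls a -l" fails only_operands, while ["a"; "-"] passes. Arguments that follow a "--" delimiter would bypass this check entirely, since they are all operands by definition.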
Q2. Can "--" be considered as an argument? For example, is the command "sort -o --" valid?
A2. It is more accurate to ask two different questions. First, can "--" be an option-argument? The answer to this question is yes, and "sort -o --" is valid and is equivalent to "sort -o--". Second, can "--" be an operand? Here the answer is yes, but only if it is preceded by another "--" that acts as a delimiter. For example, "sort -- --" is valid: its first "--" is a delimiter and its second "--" is an operand. Conversely, "sort input --" is invalid, as discussed in A1 above.
Q3 (Albert Chern). Why does sort -mo output input fail? I tried this on my system and it works, though I don't know if it's POSIX compliant or not.
A3. It's not POSIX-compliant, because the guidelines let you combine options that take no option-arguments, but they do not let you combine an option that takes an option-argument with options that do not. Implementations are allowed to support extensions to POSIX, and "sort -mo output input" is such an extension. But the point of the assignment is to reject all attempts to invoke extensions like that.
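The grouping rule in A3 can be sketched as a check on a single argument that begins with "-". The type group_result and the function check_group are hypothetical names. The sketch assumes, following A2's remark that "sort -o --" is equivalent to "sort -o--", that an option-argument may be attached directly to its option letter; it also assumes the caller has already handled the "--" delimiter and the lone "-".

```ocaml
(* Hypothetical outcome of examining one argument that starts with "-". *)
type group_result =
  | Bad             (* unknown letter, or a disallowed combination *)
  | Complete        (* valid, and it consumes no further argument *)
  | Needs_argument  (* "-x" where x takes the NEXT argument as its value *)

let check_group (sansargs : string) (withargs : string) (arg : string)
    : group_result =
  let n = String.length arg in
  if n < 2 || arg.[0] <> '-' then Bad
  else if String.contains withargs arg.[1] then
    (* An option with an option-argument must come first in its group;
       any text after the letter is taken as the attached option-argument. *)
    if n = 2 then Needs_argument else Complete
  else
    (* Otherwise every letter must be an option without an option-argument. *)
    let rec all_flags i =
      i >= n || (String.contains sansargs arg.[i] && all_flags (i + 1))
    in
    if all_flags 1 then Complete else Bad
```

For the sort synopsis, check_group "cmbdfinru" "tko" "-mo" is Bad because 'o' follows a no-argument option inside the same group, which matches the false result in the sample session. A caller that receives Needs_argument would still have to confirm that a following argument exists, which is why "sort -o" alone is rejected.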