CSCI 131 - Techniques of Programming, Fall 2009

    Home | | Requirements | | Syllabus | | Assignments | | Lectures

    Project Three Specifications
    Simple Code Compression

    Introduction. This project focuses on loops (for, while, do...while) and on writing at least one function.

    For this programming assignment, you will write at least one separate function (in addition to main) and make use of C++ Programs with Arguments. You may write any additional functions you deem necessary.

    When large amounts of text are stored or transmitted, it often pays to look for ways to compress the text into a smaller number of bits. The time needed to transmit a message is proportional to the number of bits in the message, so compacting the data reduces transmission time and requires fewer bits to store. One way to compress text is to replace each long run of a repeated character with a flag, the character, and a count.

    Simple Code Compression.
    Your project involves writing a Simple Code Compression Program. A single character may be repeated over and over again in a long sequence. This type of repetition does not generally take place in English text, but often occurs in large data streams. In run-length encoding, a sequence of repeated characters is replaced by a flag character, followed by the repeated character, followed by a single digit that indicates how many times the character is repeated.

    • The flag character cannot be a character included in the text.
    • Repetitions of 3 or fewer characters should not be encoded.
    • Repetitions longer than 9 characters cannot be encoded as one run, because only a single digit is used for the count. Split them into runs of at most 9.
      • If, for example, your flag is @ and the input to your compression function is
        AAAAAAA
        then the output from the compression function would be
        @A7
      • Whereas, if the input to your compression function is
        nnnnnxxxxxxxxxxccchhhhhh
        then the output from the compression function would be
        @n5@x9xccc@h6
      • Furthermore, if the input to your compression function is
        aaaaaaaaaaaaaaaaaaaabbb
        then the output from the compression function would be
        @a9@a9aabbb
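
    The three rules above can be sketched on an in-memory string. This is a minimal illustration, not the required program structure: the names EncodeRun and CompressText are hypothetical, and the project itself reads from a file rather than a string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the assignment): encodes one run of
// `count` copies of `ch`, where count is between 1 and 9.
std::string EncodeRun(int count, char ch, char flag) {
    assert(count >= 1 && count <= 9);
    if (count > 3) {
        // Long enough to be worth encoding: flag, character, one-digit count.
        return std::string{flag, ch, static_cast<char>('0' + count)};
    }
    // Runs of 3 or fewer are copied unchanged.
    return std::string(static_cast<std::size_t>(count), ch);
}

// Hypothetical in-memory version of the compression described above.
// Assumes `flag` does not occur anywhere in `text`.
std::string CompressText(const std::string& text, char flag) {
    std::string result;
    std::size_t i = 0;
    while (i < text.size()) {
        // Measure the run that starts at position i, stopping at 9 so the
        // count always fits in a single digit.
        std::size_t count = 1;
        while (count < 9 && i + count < text.size() && text[i + count] == text[i]) {
            ++count;
        }
        result += EncodeRun(static_cast<int>(count), text[i], flag);
        i += count;
    }
    return result;
}
```

    With flag @, calling CompressText on the second example input above splits the ten x characters into a run of 9 and a leftover single x, which is copied unchanged.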

    The Specifications.
    As in previous projects, your program should be contained in a single file, called "proj3.cc". This file will contain main(), as well as at least one prototype and definition of another function.

    1. In an effort to encourage generality and give you another method of supplying input to your program, this time your main program will contain parameters to supply the input file name and the flag character for compression.
      Instead of the usual int main (void),
      this time you'll have int main (int argc, char *argv[]). To invoke your program with input file warmish.dat and flag character %, type
      ./proj3 warmish.dat %
      Likewise, to invoke your program with input file warm3.dat and flag character @, type
      ./proj3 warm3.dat @
      If the argument containing the input filename is a "bad" input filename (for example, the file cannot be opened), your program should terminate immediately. More details about programs with arguments and examples can be found in C++ Programs with Arguments.
    2. Use a loop to process the input file character by character. You will need to "remember" whether the character you are processing matches the previous character in order to count a sequence of identical characters.
    3. Design, write, and at the appropriate spot in main(), call a function with prototype:
      void WriteChar( ofstream&, int, char, char);
      This function writes the character (the third parameter) appropriately to the output file stream (the first parameter). The integer parameter (the second parameter) determines how the character parameter is written to the output file stream. The fourth parameter is the flag character.
    4. The compressed text will be stored in the file w.compress.
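
    Items 2 and 3 can be sketched together as follows. This is one possible reading, not the official solution: it assumes the integer parameter of WriteChar is the length of the run (1 to 9), and the helper CompressStream is a hypothetical name introduced only for this illustration.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
using namespace std;  // matches the unqualified ofstream in the prototype

// ASSUMPTION: `count` is the run length. Runs longer than 3 are written as
// flag, character, digit; shorter runs are written out character by character.
void WriteChar(ofstream& out, int count, char ch, char flag) {
    assert(count >= 1 && count <= 9);
    if (count > 3) {
        out << flag << ch << count;  // count prints as a single digit
    } else {
        for (int i = 0; i < count; ++i) {
            out << ch;
        }
    }
}

// Hypothetical helper: reads `in` one character at a time, remembering the
// previous character and how many times it has repeated so far.
void CompressStream(istream& in, ofstream& out, char flag) {
    char current;
    char previous = '\0';
    int count = 0;  // 0 means no run is in progress yet
    while (in.get(current)) {
        if (count > 0 && current == previous && count < 9) {
            ++count;  // same character, and the run still fits in one digit
        } else {
            if (count > 0) {
                WriteChar(out, count, previous, flag);  // finish the old run
            }
            previous = current;
            count = 1;
        }
    }
    if (count > 0) {
        WriteChar(out, count, previous, flag);  // flush the final run
    }
}
```

    Because in.get reads every character, blanks and newlines are counted like any other character. In main, CompressStream would be called with the opened input file and an ofstream opened on w.compress.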

    To complete this program you are required to use functions with parameters, selection statements, loops, and nested structures. (Note: no global variables are allowed, and you must use functions to keep your main program short. You are therefore required to use both value and reference parameters.)
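
    As an illustration of value and reference parameters applied to the command-line arguments, here is a minimal sketch. The helpers ReadArguments and OpenInput are hypothetical names, and treating a "bad" filename as one that cannot be opened is an assumption.

```cpp
#include <cassert>
#include <fstream>
#include <string>
using namespace std;

// Hypothetical helper: argc and argv are value parameters (inputs);
// fileName and flag are reference parameters (outputs).
// Returns false unless exactly two arguments follow the program name
// and the flag argument is a single character.
bool ReadArguments(int argc, char* argv[], string& fileName, char& flag) {
    if (argc != 3 || argv[2][0] == '\0' || argv[2][1] != '\0') {
        return false;
    }
    fileName = argv[1];
    flag = argv[2][0];
    return true;
}

// Hypothetical helper: tries to open the input file.
// ASSUMPTION: a "bad" filename is one that cannot be opened for reading.
bool OpenInput(ifstream& in, const string& fileName) {
    assert(!in.is_open());  // the stream must not already be attached to a file
    in.open(fileName.c_str());
    return in.is_open();
}
```

    In main, a false result from either helper would be the point at which the program prints a message and terminates immediately.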

    Sample Interaction and Screen Output.
    A typical session might look like the following (the command after the radius% prompt is what has been entered from the keyboard):

    radius%./proj3 text3.dat @

    Welcome to the Run-Length Compression Program.

    The input file name: text3.dat
    The flag character: @

    ----------------------
    Processing is complete.
    Check w.compress for results.
    ----------------------

    Corresponding Sample Input and Output Files.
    If the file, text3.dat, contained the following information:

    --------------------Contents of text3.dat------------------
    
    
    hello, not much should happen to this line iiiiiiiiii except here at the end
    nnnnnnnnnnnnnnnnnnnnnnn now that should be interestinggg oops......
    white space counts too                 see??
    that's all folks
    
    
    -----------------------End of text3.dat--------------------
    


    Then the file, w.compress, would contain the following information:

    --------------------Contents of w.compress------------------
    
    
    hello, not much should happen to this line @i9i except here at the end
    @n9@n9@n5 now that should be interestinggg oops@.6
    white space counts too@ 9@ 8see??
    that's all folks
    
    
    -----------------------End of w.compress--------------------
    

    Program testing.

    The directory ~csci131/PROJECTS/PROJ3 contains, among other things, files named test3, simple3.dat and text3.dat. The ".dat" files are sample input files, used both for your own testing and by the script test3. Copy these files into a directory that also contains your proj3.cc. Then type ./test3. The test3 script will compile your program and run it twice, once using the file simple3.dat as the file to be compressed and once using a bad filename. The output of both runs will be concatenated into a file named output3. The perfect output, well, the output produced by my own proj3.cc, can be seen in ~csci131/PROJECTS/PROJ3/model3.out. You should also test your program yourself using the file text3.dat.

    To submit your finished project:

    1. Submit your program file electronically in the directory which contains your (presumably thoroughly tested) proj3.cc. (Depending on the settings, the submit program may not accept a program that does not compile.)

    2. Hand in a hard copy of the file you submitted electronically. Hand this to your instructor in class on the project's due date.

    3. Print out the grading header, put your name at the top of it, and hand it in with your hard copy.

    Get started early and have fun!

    Honor code: Please review the collaboration policy on the main course webpage. Also refer to the math and CS department honor code policy.




    Laurie King--lking at holycross.edu
    Computer Science 131
    Date Created: August 17, 2009
    Last Modified: August, 2009
    Page Expires: September 8, 2010