CMPSC 473 - Project #2 - Signature Scanning and Replacement

Due Date: 4/4/16. 100 points

Single person project. Do your own work!

This program scans input files within a single directory to detect whether signature values are present in each file. When a signature match is found, a job is created to modify the signature text (e.g., make it upper case), creating a version of the file with the signature text replaced.

The directory files are scanned by a set of reader threads. Each reader thread will obtain a single signature and scan the file for that signature. If a reader identifies a signature match, it will queue a job for a writer thread. A writer thread checks the shared queue for jobs and modifies the file to replace the signature value (e.g., with an uppercase version). The reader will then continue to scan the file until all copies of the signature in the file have been replaced.
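As a rough illustration of the reader-to-writer handoff described above (not the code in the provided skeleton), a shared bounded queue can be guarded with one mutex and two condition variables. Every name below (job_queue, queue_put, queue_get, QUEUE_CAP, run_queue_demo) is hypothetical, and plain ints stand in for the real job nodes.

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_CAP 4   /* hypothetical capacity, not taken from the project */
#define N_JOBS 100    /* number of demo jobs */

typedef struct {
    int jobs[QUEUE_CAP];          /* ints stand in for job nodes */
    int head, tail, count;
    pthread_mutex_t lock;         /* exclusive access to the queue */
    pthread_cond_t not_full;      /* signaled when a slot frees up */
    pthread_cond_t not_empty;     /* signaled when a job is added */
} job_queue;

static void queue_init(job_queue *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_full, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Reader side: wait for a free slot, then append the job. */
static void queue_put(job_queue *q, int job) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)              /* re-check after every wakeup */
        pthread_cond_wait(&q->not_full, &q->lock);
    q->jobs[q->tail] = job;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    assert(q->count <= QUEUE_CAP);
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Writer side: wait for a job, then remove it. */
static int queue_get(job_queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)                      /* re-check after every wakeup */
        pthread_cond_wait(&q->not_empty, &q->lock);
    int job = q->jobs[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return job;
}

static job_queue demo_q;
static long demo_sum;             /* touched only by the consumer thread */

static void *demo_writer(void *arg) {
    (void)arg;
    for (int i = 0; i < N_JOBS; i++)
        demo_sum += queue_get(&demo_q);
    return NULL;
}

/* One producer and one consumer; returns the sum of the jobs consumed. */
long run_queue_demo(void) {
    pthread_t w;
    demo_sum = 0;
    queue_init(&demo_q);
    pthread_create(&w, NULL, demo_writer, NULL);
    for (int i = 1; i <= N_JOBS; i++)
        queue_put(&demo_q, i);
    pthread_join(w, NULL);
    return demo_sum;
}
```

The waits sit in while loops so that a thread that wakes up and finds the condition no longer true simply waits again. The project's queue may be built on different primitives (for example, semaphores), so treat this only as one possible shape.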

Download the Project 2 code tarball to your CSE account file space. You should have one file, p2.tgz. Please use the test files in test.tgz to test your code.

The program works as follows: type ./cse473-p2 input_directory number_of_writer_threads > log_file at the prompt. The input directory and its files will be provided. The number of threads will depend on the test. The log file will contain the results of the processing - call this "log". See README for additional info.

There is a Makefile in directory p2, which makes the executable file cse473-p2. Add your PSU ID to the Makefile at the variable PSUID (currently XXXXXXXXX). This will ensure that your submission file is named by your PSU ID - please do not change the name of your submission file.

We will test your Project 2 submissions on machines in the IST 218 Linux Lab. Please make sure that your code runs as expected on a machine in that lab.

IMPORTANT: Testing will examine the entry and exit of critical sections, so the printf statements for entry and exit for QUEUE, SIGN, READ, and WRITE must be at the start and end of critical sections, respectively. You only have to move the printfs significantly for Task #2, but make sure that your placement of the printfs accurately reflects the entry/exit of critical sections or you may lose points unnecessarily.
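To make the placement rule concrete, here is a minimal sketch in which the entry message is the first statement after the lock is acquired and the exit message is the last statement before it is released. The messages, the lock name sign_lock, the counter next_idx (a stand-in for sign_idx), and the functions claim_index and run_claim_demo are all hypothetical; in the project, keep the provided printf calls unchanged.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define N_READERS 4   /* hypothetical thread count for the demo */

static pthread_mutex_t sign_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_idx;  /* shared counter, stand-in for the signature index */

/* Each thread claims one distinct index inside the critical section. */
static void *claim_index(void *arg) {
    int *mine = arg;
    pthread_mutex_lock(&sign_lock);
    printf("ENTRY (illustrative message)\n");  /* first statement inside */
    *mine = next_idx++;
    printf("EXIT (illustrative message)\n");   /* last statement inside */
    pthread_mutex_unlock(&sign_lock);
    return NULL;
}

/* Launches the threads and returns how many indices were claimed exactly once. */
int run_claim_demo(void) {
    pthread_t t[N_READERS];
    int idx[N_READERS];
    int seen[N_READERS] = {0};
    int distinct = 0;

    next_idx = 0;
    for (int i = 0; i < N_READERS; i++)
        pthread_create(&t[i], NULL, claim_index, &idx[i]);
    for (int i = 0; i < N_READERS; i++)
        pthread_join(t[i], NULL);

    for (int i = 0; i < N_READERS; i++) {
        assert(idx[i] >= 0 && idx[i] < N_READERS);
        seen[idx[i]]++;
    }
    for (int i = 0; i < N_READERS; i++)
        if (seen[i] == 1)
            distinct++;
    return distinct;
}
```

Because both messages are printed while the lock is held, an entry line in the log is always followed by the matching exit line before any other thread's entry line appears.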

The project will consist of the following tasks - each is marked in the code provided:

  1. In Task #1, you will write the code to create and launch the reader and writer threads from the main function in p2_main.c.

  2. In Task #2, you will implement mutual exclusion over access to the signature data: sign_idx in thread_reader in p2_thread_reader.c. Each reader must obtain one distinct index - which corresponds to a single signature value for which to scan. Please place the printf statements for Task #2 (provided in the function thread_reader) at the beginning and end of the critical section. Do not modify the content of these printfs.

  3. In Task #3, you will implement reader-writer synchronization to protect the critical section around the do_read call in the function thread_reader: multiple readers may scan a file concurrently as long as no writer is writing. Do not modify the printf calls for "READ Entry" and "READ Exit", but ensure that they are in the critical section.

  4. In Task #4, ensure that the reader threads allow writers to handle any queued jobs after every 1000 characters read in do_read. The reader sleeps as specified in usleep, but then (after the sleep) must resume scanning in the critical section in do_read (when allowed).

  5. In Task #5, ensure that the reader threads allow writers to handle any queued jobs when the reader finds a signature match and attempts to queue a job by calling write_queue from the function do_read. After queueing, the reader must resume scanning in the critical section in do_read (when allowed).

  6. In Task #6, ensure that only one reader (or writer) thread has access to the job queue. In this task, we will focus on the reader. In the function write_queue, each reader thread must obtain exclusive access to the queue to add a new job (called a node) - a signature match that is to be replaced - to the queue. In addition, the reader thread must ensure that there is a free space on the queue to add the job before adding it and coordinate with the writer. All this logic for controlling access to the queue must be added in this function.

  7. In Task #7, ensure that only one writer (or reader) thread has access to the job queue. In this task, we will focus on the writer whose code is in p2_thread_writer.c. In the function read_queue, each writer thread must obtain exclusive access to the queue to fetch a new job (node). If there is nothing in the queue, the writer must block awaiting an entry. The "If" conditional may result in a writer thread being awoken and finding the queue empty as you will see. Be sure that the writer coordinates correctly with the reader to manage the queue and avoid deadlock.

  8. In Task #8, ensure that the writer only obtains a reader-writer lock in do_write when no readers are scanning that particular file (a reader may be scanning another file).
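
The reader-writer rule in Tasks #3 and #8 (many readers at once, or one writer alone) can be sketched with a mutex, a condition variable, and two counters. This is only one possible construction, not the one the skeleton expects; the type rw_lock and the functions read_enter, read_exit, write_enter, write_exit, and run_rw_demo are hypothetical names. Using one such lock per file is likewise an assumption.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t changed;   /* broadcast whenever the state below changes */
    int active_readers;       /* readers currently inside */
    int writer_active;        /* 1 while a writer is inside */
} rw_lock;

static void rw_init(rw_lock *rw) {
    pthread_mutex_init(&rw->lock, NULL);
    pthread_cond_init(&rw->changed, NULL);
    rw->active_readers = 0;
    rw->writer_active = 0;
}

static void read_enter(rw_lock *rw) {
    pthread_mutex_lock(&rw->lock);
    while (rw->writer_active)                     /* readers wait only for a writer */
        pthread_cond_wait(&rw->changed, &rw->lock);
    rw->active_readers++;
    pthread_mutex_unlock(&rw->lock);
}

static void read_exit(rw_lock *rw) {
    pthread_mutex_lock(&rw->lock);
    assert(rw->active_readers > 0);
    if (--rw->active_readers == 0)                /* last reader out wakes waiters */
        pthread_cond_broadcast(&rw->changed);
    pthread_mutex_unlock(&rw->lock);
}

static void write_enter(rw_lock *rw) {
    pthread_mutex_lock(&rw->lock);
    while (rw->writer_active || rw->active_readers > 0)
        pthread_cond_wait(&rw->changed, &rw->lock);
    rw->writer_active = 1;
    pthread_mutex_unlock(&rw->lock);
}

static void write_exit(rw_lock *rw) {
    pthread_mutex_lock(&rw->lock);
    rw->writer_active = 0;
    pthread_cond_broadcast(&rw->changed);
    pthread_mutex_unlock(&rw->lock);
}

static rw_lock demo_rw;
static int demo_a, demo_b;    /* the writer keeps these two values equal */

static void *demo_reader(void *arg) {
    int bad = 0;
    for (int i = 0; i < 1000; i++) {
        read_enter(&demo_rw);
        if (demo_a != demo_b)                     /* would mean a writer was inside */
            bad++;
        read_exit(&demo_rw);                      /* leaving lets a waiting writer in */
    }
    *(int *)arg = bad;
    return NULL;
}

static void *demo_writer(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        write_enter(&demo_rw);
        demo_a++;
        demo_b++;
        write_exit(&demo_rw);
    }
    return NULL;
}

/* Three readers and one writer; returns the number of inconsistent reads. */
int run_rw_demo(void) {
    pthread_t r[3], w;
    int bad[3] = {0, 0, 0};

    rw_init(&demo_rw);
    demo_a = demo_b = 0;
    pthread_create(&w, NULL, demo_writer, NULL);
    for (int i = 0; i < 3; i++)
        pthread_create(&r[i], NULL, demo_reader, &bad[i]);
    pthread_join(w, NULL);
    for (int i = 0; i < 3; i++)
        pthread_join(r[i], NULL);
    assert(demo_a == 1000 && demo_b == 1000);
    return bad[0] + bad[1] + bad[2];
}
```

In this sketch a writer can enter only when the reader count reaches zero, so a reader that calls read_exit, pauses, and later calls read_enter again (as Tasks #4 and #5 ask readers to do around the sleep and around write_queue) gives a waiting writer its chance to run.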

  • Your submission will consist of a tarball of your code, made from make tar. We will run your code and look for correct solutions to the key parts of the program. We will provide a test program that will tell you when you have passed each task.

  • Grading:

    Extra notes/explanations/reminders:
    1. TBA: Regarding using the test program

    Trent Jaeger