-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
uniq Command #93
base: main
Are you sure you want to change the base?
uniq Command #93
Conversation
I will code review this |
I will code review this. |
I will code review this! |
uniq.md: sort.c: split.c: ulib.c: uniq.c: There are minor formatting inconsistencies (e.g., missing comments for new code and typos in function prototypes). Consistency in formatting helps with code readability and maintainability. High-Level Checks: |
High-Level Checks: This is a really good implementation of the uniq command. After testing, it seems like all of your intended implementation works besides the -f flag like you mentioned in your bugs. There are a few minor commenting/documentation issues in split.c and uniq.c. There is some commented out/unused code in your split.c code, so I think it would be good to just remove it. All your files have Javadoc style commenting, but I think it would be helpful to expand on most of those comments, adding more to how the function works. In uniq.c, a few of your functions do not have comments, so I think it would be helpful to keep consistent documentation for all of your functions. Besides those few documentation issues, your code looks great! All of your files and tests fulfill their intended purpose and are easy to follow along with/understand. You also account for any potential errors like malloc failure, file reading errors, and sort failure. Great job on this project! |
High-Level Checks: This implementation looks great to me! I also tested this after you added the -f flag, and that seems to work as well. This was a great project idea, and you did very well at achieving your goal. Sort.c Uniq.c I also like the formatting again, looks very clean. I don't have any other issues here. Split.c You have a bit of extra code at the bottom that you commented out here and I suggest getting rid of that or finishing it up if you were going to use it. |
This is a nice refactoring of the original. I like how you now have a separate implementation of split, uniq assumes the input is sorted so you don't need a built-in sort function, etc. I think the reviewers made some good points here. My main feedback revolves around efficiency of the approach. As it is, we currently read all the lines into memory, but that isn't necessary since we are assuming the input is sorted. So implementing void
uniq(int fd)
{
char *last_line = NULL;
char *line;
uint sz = 0;
int dupe_count = 1;
while (getline(&line, &sz, fd) > 0) {
if (last_line == NULL) {
goto done;
}
if (strcmp(line, last_line) == 0) {
dupe_count += 1;
} else {
printf("%d\t%s", dupe_count, last_line);
dupe_count = 1;
}
done:
if (last_line != NULL) {
free(last_line);
}
last_line = strdup(line);
}
} This isn't as efficient as it could be because the string gets dup'd and freed each cycle, and that could be done with a single buffer that is resized as necessary... but that's a lot of complexity to add. This also expects the standard fgets, getline, and strdup -- no changes as were done in the original source you started with (I'm not a fan of adding checks for tabs etc like they did, ideally those changes wouldn't be in the standard library) If you want to refactor around a memory-efficient version as shown above that'd be +1.5 to your project score. |
Modified Files
SummaryThis update includes improvements to the |
checked uniq and sort for memory leaks, none found. -- Summary -- uniq -- Summary -- |
Issue #72
uniq Command
Purpose
Reports or deletes repeated lines in a file.
Syntax
uniq [-c | -t | -f]
Description
The
uniq
command removes consecutive duplicate lines from a file. It reads inputfrom either standard input or a specified file and compares adjacent lines to eliminate
repeated occurrences. Because only adjacent duplicate lines are removed it is typically
used in conjunction with the
sort
command to ensure all duplicates are adjacent. Theunique lines are then written to standard output.
Flags
Example
To remove repeated adjacent lines, enter:
uniq README.md
To remove nonadjacent repeated lines, enter:
sort README.md | uniq
To remove nonadjacent repeated words, enter:
split README.md | sort | uniq
Limitations
uniq
command will break. Thesplit
command will fix this if pipped intosort
thenuniq
but forces the user to specifically search for unique words rather than lines.split
andsort
commands currently do not support any flags.uniq
command cannot chain multiple flags together.