It is worth mentioning that we are not just going through each API’s usage (that is easily available in the documentation too); we also discuss how each function can be dangerous, so that one can avoid the pitfalls of incorrect usage.
So let’s talk about functions such as strcmp, strtok, strcat etc.
One unpleasant surprise for somebody familiar with a modern language (Java, C#, Python etc.) is that you cannot simply compare two strings like this in C:
char *first = "hello";   // pointer to a string constant
char sec[] = "hello";    // array initialized from a string literal
if (first == sec)        // this will never be true because we are comparing pointers
    printf("Strings are equal\n");
else
    printf("Not equal\n");
The above code will print “Not equal”, but not because the characters differ. first is a pointer to a string, whereas sec, being the name of an array, also evaluates to a pointer (i.e. the address of its first element ‘h’).
Hence what the above ‘if’ effectively does is compare two pointers holding different addresses. Since the addresses are unequal, it prints “Not equal”.
The correct way to compare two strings is to use the standard library function strcmp() as follows:
if (strcmp(first, sec) == 0)   // this is true if both strings are equal
    printf("Strings are equal\n");
else
    printf("Not equal\n");
The function returns 0 if the two strings are equal, i.e. they have the same characters and length, a positive value if the first string is greater than the second, and a negative value otherwise.
But it is important to understand what the criterion is for “first” to be greater than “second”.
Some examples should help:
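“abc” and “b” // “abc” is smaller because, comparing the first characters, ‘b’ is greater than ‘a’
“abcd” and “ab” // “abcd” is greater because “ab” ends first while all its characters match
“abcd” and “abd” // “abd” is greater because, at the third character, ‘d’ is greater than ‘c’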
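So this makes it clear that length alone does not make a string greater. The first differing character decides, and length only matters when one string is a prefix of the other.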
The other important point is that both strings must be null terminated, because strcmp() has no other way to know where a string ends.
We can run into problems or unexpected results when a string is not terminated by a null (‘\0’), or when we only want to compare part of the strings, say the first 3, 4 or n characters. The function that comes to the rescue is strncmp().
The only difference is a third parameter: the maximum number of characters to compare. Comparison still stops early at a null, but nothing beyond that count is examined, so it doesn’t matter whether a null follows the intended length.
An example usage is:
char *first = "hellohi";          // pointer to a string constant
char sec[] = "hello";             // array initialized from a string literal
if (strncmp(first, sec, 5) == 0)  // this will be true as we only compare "hello"
    printf("Strings are equal\n");
else
    printf("Not equal\n");
Since we compare only the first five characters, “hello”, the strings are reported as equal. This also removes the worry about a missing null terminator, provided both arrays hold at least as many characters as the count passed in.
Real library implementations of strcmp() can be a little more challenging to grasp, so we only outline the logic here (a detailed implementation can be put in a separate tutorial if there is interest). The logic is to have two pointers iterate through the strings, comparing the characters at the same index. As soon as they differ, or both strings end, return the appropriate value: positive if the current character from first is greater than the one from sec, negative if it is smaller, and 0 if both strings ended without any difference.
strcpy() and strncpy()
strcpy is the infamous string copying function that is frequently termed dangerous, with strncpy as a better substitute (though it has its own issues). Let’s see their basic usage and why they are considered dangerous.
char fir[20];         // should have enough space to accommodate the copy
char sec[] = "hello";
strcpy(fir, sec);     // copies the second param into the first
printf("%s\n", fir);
So the second parameter is copied into the first, assuming the first has enough memory. It is considered dangerous because the copy keeps going until a null (\0) is encountered in the source, with no regard for the size of the destination. So if the source is longer than the destination, or if by any chance the null was omitted, it will most likely write into memory it does not own and crash the program or show undefined behavior.
strncpy, like the comparison function, adds a ‘number of characters to be copied’ parameter to the function prototype.
strncpy(fir, sec, 20);  // pads 0s after the string up to 20 elements
printf("%s\n", fir);
If the second string has fewer characters than that number, say 5 as in the above case, the function pads the first string with null characters (\0) up to the number. This is good since you are sure you will always write a fixed number of elements, though the null might appear much earlier to mark the actual string length.
However, this function has a more serious problem. Consider copying the string “hellohi” into a first array of only 5 elements, with 5 passed as the number of characters.
The string to be copied, i.e. “hellohi”, is bigger than the allocated array’s size (5), but we gave the maximum size as 5. The ideal behavior would be to copy only 5 elements in total (4 from the string and one terminating null). However, strncpy only writes a null (\0) if it encounters a \0 in the original string while there are still elements remaining in num. This means that in the above scenario no null is copied into the first string. As a result we have a string that is not null terminated, which could potentially be disastrous.
As you can see in the above image, on my machine it only prints garbage up to some length and terminates normally. But this is unpredictable behavior: it can crash the program and cannot be relied upon. This is the reason strncpy is not safe either. If you ever use it, pass a count at least one less than the size of the destination and write the terminating null into the last element yourself.
We will continue on with strcat and strtok etc. in the next and final part 3 of this tutorial.