C Library – <string.h>

The string.h library defines the size_t type and the NULL macro for the null pointer. It provides several functions for analyzing and manipulating character strings and a few that deal with memory more generally. The table below lists the functions.

Prototype Description
void *memchr(const void *s, int c, size_t n); Searches for the first occurrence of c (converted to unsigned char) in the initial n characters of the object pointed to by s; returns a pointer to the first occurrence, NULL if none is found.
int memcmp(const void *s1, const void *s2, size_t n); Compares the first n characters of the object pointed to by s1 to the first n characters of the object pointed to by s2, interpreting each value as unsigned char; the two objects are identical if all n pairs match; otherwise, the result is determined by the first pair that differs; returns zero if the objects are the same, less than zero if the first object is numerically less than the second, and greater than zero if the first object is greater.
void *memcpy(void *s1, const void *s2, size_t n); Copies n bytes from the location pointed to by s2 to the location pointed to by s1; behavior is undefined if the two locations overlap; returns the value of s1.
void *memmove(void *s1, const void *s2, size_t n); Copies n bytes from the location pointed to by s2 to the location pointed to by s1; behaves as if the bytes were first copied to a temporary location, so copying between overlapping locations works; returns the value of s1.
void *memset(void *s, int v, size_t n); Copies the value v (converted to type unsigned char) to the first n bytes pointed to by s; returns s.
char *strcat(char *s1, const char *s2); Appends a copy of the string pointed to by s2 (including the null character) to the location pointed to by s1; the first character of the s2 string overwrites the null character of the s1 string; returns s1.
char *strncat(char *s1, const char *s2, size_t n); Appends a copy of up to n characters, or up to the null character, from the string pointed to by s2 to the location pointed to by s1, with the first character of s2 overwriting the null character of s1; a null character is always appended; the function returns s1.
char *strcpy(char *s1, const char *s2); Copies the string pointed to by s2 (including the null character) to the location pointed to by s1; returns s1.
char *strncpy(char *s1, const char *s2, size_t n); Copies up to n characters or up to the null character from the string pointed to by s2 to the location pointed to by s1; if the null character in s2 occurs before n characters are copied, null characters are appended to bring the total to n; if n characters are copied before reaching a null character, no null character is appended; the function returns s1.
int strcmp(const char *s1, const char *s2); Compares the strings pointed to by s1 and s2; two strings are identical if all pairs match; otherwise, the result is determined by the first pair that differs; characters are compared using the character code values; the function returns zero if the strings are the same, less than zero if the first string is less than the second, and greater than zero if the first string is greater.
int strcoll(const char *s1, const char *s2); Works like strcmp() except that it uses the collating sequence specified by the LC_COLLATE category of the current locale as set by the setlocale() function.
int strncmp(const char *s1, const char *s2, size_t n); Compares up to the first n characters or up to the first null character of the arrays pointed to by s1 and s2; two arrays are identical if all tested pairs match; otherwise, the arrays compare as the first unmatching pair; characters are compared using the character code values; the function returns zero if the arrays are the same, less than zero if the first array is less than the second, and greater than zero if the first array is greater.
size_t strxfrm(char *s1, const char *s2, size_t n); Transforms the string in s2 and copies up to n characters, including a terminating null character, to the array pointed to by s1; the criterion for the transformation is that two transformed strings will be placed in the same order by strcmp() as strcoll() would place the untransformed strings; the function returns the length of the transformed string (not including the terminal null character).
char *strchr(const char *s, int c); Searches for the first occurrence of c (converted to char) in the string pointed to by s; the null character is part of the string; returns a pointer to the first occurrence, or NULL if none is found.
size_t strcspn(const char *s1, const char *s2); Returns the length of the maximum initial segment of s1 that does not contain any of the characters found in s2.
char *strpbrk(const char *s1, const char *s2); Returns a pointer to the location of the first character in s1 to match any of the characters in s2; returns NULL if no match is found.
char *strrchr(const char *s, int c); Searches for the last occurrence of c (converted to char) in the string pointed to by s; the null character is part of the string; returns a pointer to the last occurrence, or NULL if none is found.
size_t strspn(const char *s1, const char *s2); Returns the length of the maximum initial segment of s1 that consists entirely of characters from s2.
char *strstr(const char *s1, const char *s2); Returns a pointer to the location of the first occurrence in s1 of the sequence of characters in s2 (excluding the terminating null character); returns NULL if no match is found.
char *strtok(char *s1, const char *s2); This function decomposes the string s1 into separate tokens; the string s2 contains the characters that are recognized as token separators. The function is called sequentially. For the initial call, s1 should point to the string to be separated into tokens. The function locates the first token separator that follows a non-separator character and replaces it with a null character. It returns a pointer to a string holding the first token. If no tokens are found, it returns NULL. To find further tokens in the string, call strtok() again, but with NULL as the first argument. Each subsequent call returns a pointer to the next token or to NULL if no further tokens are found. See the strtok example at the end of the String Handling section.
char * strerror(int errnum); Returns a pointer to an implementation-dependent error message string corresponding to the error number stored in errnum.
size_t strlen(const char *s); Returns the number of characters (excluding the terminating null character) in the string s.

String Handling

The C language has no built-in string data type and therefore no built-in operators for string handling. Instead, the string library provides functions that operate on character arrays holding null-terminated strings, along with a few that treat memory as raw blocks of bytes.

This last point is important—all strings that are to be processed with the string library must be null terminated. In essence, this just means that the last character in the array must be a \0 character. Assuming you have a character array with the characters Hello in it, you could do this as follows:

szMyString[5] = '\0';

Two details are easy to get wrong here. Arrays are zero indexed, so the first element is at index 0 and the last is at index size - 1. And because one element must be reserved for the null terminator, an array can hold one fewer usable character than its size.

One of the most useful functions in the library returns the length of any null-terminated string:

strlen( <string> )     // returns a size_t, the length of the string

Note that this does not return the size of the array, but the actual number of characters before the null terminator. If you had a 15-character array, placed the characters Hello into it, and put a null character right after the o, strlen would return 5. The elements between the terminator and the end of the array keep whatever data they held before, which is indeterminate if the array was never initialized.

The usual way to create a null-terminated string, given a string constant and a character array large enough to hold it, is to use the strcpy function:

strcpy( <string>, "constant" ) // copy the constant into string

For the hello text example, you would write:

strcpy ( szMyString, "Hello" );

This will result in a null-terminated string in szMyString containing the word Hello, assuming that there is enough space. To create a blank string (for initialization purposes) you would write:

strcpy( szMyString, "" ) ;

To add two null-terminated strings together (or a string variable and a constant), use the strcat function:

strcat( <target>, <source> )     // append source to target

Both arguments are defined as character arrays (pointers to characters), and the source is appended to the target, with the result being a concatenation of the two:

strcpy( szMyString, "Hel");
strcat( szMyString, "lo"); // Result is 'Hello'

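You can append one string variable to another in the same way. In every case, the first parameter must refer to a writable array with enough room for the combined result. A string constant must not be used as the target: the compiler will issue a diagnostic for a const-qualified array, and although it may accept a string literal without complaint, modifying a literal is undefined behavior.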

The library also provides functions for searching and comparing strings. To search a string for a character, you have two options, strchr and strrchr. The first is:

strchr( <string>, <character> )

This function returns a pointer to the first occurrence of the character in the string supplied in the first parameter, or NULL if the character is not found. Because the result is a pointer, you can look for subsequent occurrences by calling strchr again with the pointer advanced past the previous match:

char * pChr;
pChr = strchr ( szString, 'l' ); // Initial call
while (pChr != NULL)
{
     // Do some processing
     pChr = pChr + 1; // Start at next character
     pChr = strchr ( pChr, 'l' );
}

You can also find the zero-based index of the place in the character array where the character was found by using pointer arithmetic. This is valid only while pChr is not NULL, for example inside the loop above:

nIndex = pChr - szString;

Also note that there is a companion function, strrchr, that returns a pointer to the last occurrence of the character in the string:

strrchr ( <string>, <character> )   // returns pointer to last occurrence

To find a substring within a string and to return a pointer where that substring starts, use the strstr function:

strstr ( <src>, <tar> )    // returns pointer to tar in src

The function takes two null-terminated strings. NULL is returned if the substring is not found. Either parameter could feasibly be a constant value, but this might not make sense, depending on the application.

To compare strings for equality, character by character, you can use one of two functions:

strcmp ( <string 1>, <string 2> )
     // compares string 1 with string 2
strcmpi( <string 1>, <string 2> )
     // compares string 1 with string 2, ignoring case (non-standard)

These functions return 0 if the two strings are the same. If the first string is lower than (that is, would sort before) the second, a negative value is returned; if it is “higher,” a positive value is returned. Only the sign of the result is guaranteed, not the specific values -1 and 1. The comparison is strictly character value based, so certain alphabetization principles might not be respected.

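The strcmpi function is simply a case-insensitive version of strcmp. This means that strcmp treats A and a as having different values, whereas strcmpi treats them as equal. Note that strcmpi is not part of the standard C library. Some compilers provide it as an extension, and equivalents include strcasecmp (POSIX, declared in <strings.h>) and _stricmp (Microsoft). The choice between the two kinds of comparison also affects the order produced when sorting strings with these functions.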

You can also overwrite part of a string with a substring using the strncpy function. This function takes a pointer to the place within the string where the new characters should go, the string to copy from, and a third parameter that indicates how many characters will be copied. The existing characters are replaced rather than shifted, so the string does not grow. This will likely make more sense with an example:

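pChr = strstr ( szMyString, "llo" );
if ( pChr != NULL ) // Only overwrite if the substring was found
     strncpy ( pChr, "lp!", 3 ); // Copies 3 characters, adds no null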

Assuming that szMyString contains “Hello”, this code will turn the string into “Help!”. This is useful for substituting parts of strings with other values, as in a search-and-replace function.

Finally, I present a useful but dangerous function. It is dangerous because it modifies the source argument, so it should only be used on a copy of the string, and because it remembers its position in internal static storage between calls, so two tokenizing loops cannot be interleaved. The function, strtok, tokenizes a source string, using a user-defined set of characters as delimiters.

So if you wanted to break a line down into tokens, and you knew that the input was comma separated, you might use code such as:

char * szToken;
szToken = strtok ( szMyString, "," ); // Initial call

while ( szToken != NULL ) // NULL when no more found
{
  // Do something with szToken

  szToken = strtok ( NULL, "," ); // Next token
}

You could change the set of delimiters between calls to strtok if you wanted to. Each call after the initial call must pass a NULL pointer as the first parameter; passing a string instead restarts tokenization at the beginning of that string. Because strtok overwrites each delimiter it finds with a null character, the contents of szMyString are not restored when tokenization finishes, so don’t count on reusing the original string afterward.
