String
1. C-Style String
A C-style string is a null-terminated byte string (NTBS): a sequence of nonzero bytes followed by a byte with the value zero (the terminating null character).
- The terminating null character is written as the character literal `'\0'`.
- The length of an NTBS is the number of elements that precede the terminating null character. An empty NTBS has a length of zero.
- The size of an NTBS is the size of the entire array, including the terminating null character.
- Single quotes (`'`) identify character literals.
- Double quotes (`"`) identify string literals. String literals are stored in the program image, usually in a read-only section (typically `.rodata`).
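The difference between the length and the size of an NTBS can be checked directly. This is a minimal sketch; the helper names `ntbs_length` and `ntbs_size` are made up for this illustration and are not part of any library.

```cpp
#include <cstddef>
#include <cstring>

// Length: number of characters before the terminating '\0'.
std::size_t ntbs_length(const char* s) {
    return std::strlen(s);
}

// Size: number of elements in the whole array, including the '\0'.
// Taking the array by reference keeps its extent N visible to the compiler.
template <std::size_t N>
std::size_t ntbs_size(const char (&)[N]) {
    return N;
}
```

For `char msg[] = "abc";`, `ntbs_length(msg)` is 3 and `ntbs_size(msg)` is 4. For the empty string `""` they are 0 and 1.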
1.1. Create a String
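// *1. Ptr
const char* strMessagePtr = "abc"; // C++ requires const; a plain char* is accepted only in C
// sizeof(strMessagePtr) = 4 or 8 bytes (the size of a pointer)
// strMessagePtr[0] = 'x'; // error: the literal is const (undefined behavior through a plain char* in C)
// *2. Arrays
char strMessage[] = "abc";
// sizeof(strMessage) = 4 (3 characters + '\0')
strMessage[1] = 'a'; // OK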
- For `strMessage`, the array itself lives on the stack (when it is a local variable), and the contents of the string literal are copied into it when it is initialized.
- For `strMessagePtr`, only the address of the string literal is held on the stack; the literal itself is not copied.
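A small sketch of the two cases, to show that the array is a writable copy while the pointer only refers to the literal. The function names are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <cstring>

// Returns true if writing through the array leaves the literal untouched,
// i.e. the array really is a separate copy.
bool array_is_independent_copy() {
    const char* literal = "abc";   // points at the literal; read-only
    char copy[] = "abc";           // 4-byte array initialized from the literal
    copy[1] = 'X';                 // OK: modifies only the array
    return std::strcmp(copy, "aXc") == 0 && std::strcmp(literal, "abc") == 0;
}

// sizeof sees the whole array (3 chars + '\0') but only the pointer itself.
std::size_t array_bytes()   { char a[] = "abc"; return sizeof a; }
std::size_t pointer_bytes() { const char* p = "abc"; return sizeof p; }
```

`array_bytes()` is always 4 here, while `pointer_bytes()` depends on the platform (commonly 4 or 8).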
1.2. Characters
- Null: `\0`, `0x00` (the NUL character; the `NULL` macro is a null pointer constant, not a character)
- Carriage return and new line: `\r\n`
- Case switching (ASCII): `'A' ^ ' '` gives `'a'`, and `'a' ^ ' '` gives `'A'`
- Special: `\\`
- Escape sequences:

| Name | Symbol | Meaning |
|---|---|---|
| Alert | `\a` | Makes an alert, such as a beep |
| Backspace | `\b` | Moves the cursor back one space |
| Formfeed | `\f` | Moves the cursor to the next logical page |
| Newline | `\n` | Moves the cursor to the next line |
| Carriage return | `\r` | Moves the cursor to the beginning of the line |
| Horizontal tab | `\t` | Prints a horizontal tab |
| Vertical tab | `\v` | Prints a vertical tab |
| Single quote | `\'` | Prints a single quote |
| Double quote | `\"` | Prints a double quote |
| Backslash | `\\` | Prints a backslash |
| Question mark | `\?` | Prints a question mark (no longer relevant) |
| Octal number | `\{number}` | Translates into the char represented by the octal number |
| Hex number | `\x{number}` | Translates into the char represented by the hex number |
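
The case-switching trick works because upper- and lower-case ASCII letters differ only in bit 5 (`0x20`, which is also the value of `' '`). A minimal sketch, assuming an ASCII-compatible encoding; `toggle_case` is a hypothetical helper name.

```cpp
// Flip the case of an ASCII letter by toggling bit 5 (0x20, the value of ' ').
// Non-letters are returned unchanged. Assumes an ASCII-compatible encoding.
char toggle_case(char c) {
    const bool is_letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    return is_letter ? static_cast<char>(c ^ ' ') : c;
}
```

The letter check matters: applying the XOR to digits or punctuation would silently turn them into other characters.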
1.3. C String Libraries
- Copying strings: `strcpy`, `strncpy`
- Concatenating strings: `strcat`, `strncat`
- Comparing strings: `strcmp`, `strncmp`
- Parsing strings: `strtok`, `strcspn`
- Length: `strlen`
- Examples:
#include <stdio.h>
#include <string.h>
int main() {
// -------------------------------
// 1. Copying strings
// -------------------------------
char src[] = "Hello";
char dst[20];
strcpy(dst, src); // copy full string
// dst = "Hello"
strncpy(dst, "World", 3); // copy only 3 chars
dst[3] = '\0'; // ensure null-termination
// dst = "Wor"
// -------------------------------
// 2. Concatenating strings
// -------------------------------
char text[20] = "Hi";
strcat(text, " there"); // append full string
// text = "Hi there"
strncat(text, "!!!", 2); // append only 2 chars
// text = "Hi there!!"
// -------------------------------
// 3. Comparing strings
// -------------------------------
int r1 = strcmp("abc", "abc"); // r1 = 0 (equal)
int r2 = strcmp("abc", "abd"); // r2 < 0 (abc < abd)
int r3 = strncmp("abcdef", "abcxyz", 3); // r3 = 0 (first 3 chars equal)
// -------------------------------
// 4. Parsing strings
// -------------------------------
char line[] = "A,B,C";
char* token = strtok(line, ","); // first token: "A"
while (token != NULL) {
printf("token: %s\n", token);
token = strtok(NULL, ",");
}
// strcspn: find first occurrence of any chars in reject set
char sample[] = "hello123world";
size_t pos = strcspn(sample, "0123456789");
// pos = 5 (first digit is at index 5)
// -------------------------------
// 5. Length
// -------------------------------
size_t len = strlen("abc"); // len = 3
return 0;
}
1.4. String/Numbers Conversion
- Integer to String: `itoa()` (non-standard)
- String to Double: `atof()`
- String to Double (with error checking): `strtod()`
- String to Long (with base + error checking): `strtol()`
- e.g.
#include <stdio.h>
#include <stdlib.h> // atof, strtod, strtol
int main() {
// -------------------------------
// 1. Integer → String (itoa is non-standard; snprintf is the portable option)
// -------------------------------
char buf[32];
snprintf(buf, sizeof buf, "%d", 1234); // convert integer to string in base 10
// buf = "1234"
snprintf(buf, sizeof buf, "%x", 255); // convert to hex
// buf = "ff"
// -------------------------------
// 2. String → Double (atof)
// -------------------------------
double d1 = atof("3.14159");
// d1 = 3.14159
double d2 = atof("12.5xyz");
// d2 = 12.5 (atof stops at non-numeric chars)
// -------------------------------
// 3. String → Double (strtod)
// -------------------------------
char* end;
double d3 = strtod("45.67abc", &end);
// d3 = 45.67
// end -> "abc"
// -------------------------------
// 4. String → Long (strtol)
// -------------------------------
long v1 = strtol("1234", NULL, 10);
// v1 = 1234 (decimal)
long v2 = strtol("FF", NULL, 16);
// v2 = 255 (hex → decimal)
char* end2;
long v3 = strtol("100xyz", &end2, 10);
// v3 = 100
// end2 -> "xyz"
return 0;
}
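The calls above do not yet use the error checking that `strtol` makes possible. The following sketch shows one way to do it, checking the end pointer and `errno`; `parse_long` is a hypothetical helper, not a standard function.

```cpp
#include <cerrno>
#include <cstdlib>

// Parses `text` as a base-10 long. Returns true and stores the value in `out`
// only if the whole string is a valid number that fits in a long.
bool parse_long(const char* text, long& out) {
    char* end = nullptr;
    errno = 0;                                // strtol reports overflow via errno
    const long value = std::strtol(text, &end, 10);
    if (end == text) return false;            // no digits were consumed
    if (*end != '\0') return false;           // trailing junk such as "100xyz"
    if (errno == ERANGE) return false;        // value out of range for long
    out = value;
    return true;
}
```

With this helper, `"1234"` and `"-42"` parse successfully, while `""`, `"100xyz"`, and a number too large for `long` are rejected. `atof` offers no comparable way to tell a failed parse from a genuine `0.0`.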
2. C++ String
Strings are objects that represent sequences of characters.
2.1. Create a String
std::string s = "hello"; ///< initialize from a string literal
std::string s2("hello"); ///< direct initialization from a string literal
std::string s3(5, 'a'); ///< create a string with 5 copies of 'a' -> "aaaaa"
2.2. Basic Properties
| Function | Description |
|---|---|
| `s.size()` | number of characters |
| `s.length()` | same as `size()` |
| `s.empty()` | check whether the string is empty |
| `s.clear()` | remove all characters |
2.3. Access Characters
s[0] // no bounds check
s.at(0) // bounds check
s.front() // first char
s.back() // last char
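The practical difference between `s[i]` and `s.at(i)` shows up when the index is out of range. A minimal sketch; `char_at_or` is a hypothetical helper written for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the character at `i`, or `fallback` when `i` is out of range.
// s.at(i) throws std::out_of_range for i >= s.size();
// s[i] with i > s.size() would be undefined behavior instead.
char char_at_or(const std::string& s, std::size_t i, char fallback) {
    try {
        return s.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

For example, `char_at_or("abc", 1, '?')` gives `'b'`, and `char_at_or("abc", 3, '?')` gives `'?'`.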
2.4. Modify String
/// @brief append
s.append(" world");
s += " world";
/// @brief insert
s.insert(5, "XXX");
/// @brief erase
s.erase(0,2);
/// @brief replace
s.replace(0,5,"hi");
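/// @brief remove `\n` (erase-remove idiom, needs <algorithm>)
std::string s = "hello\nworld\n";
auto new_end = std::remove(s.begin(), s.end(), '\n');
s.erase(new_end, s.end()); // erase the leftover tail; s.erase(new_end) would remove only one character
// the same in one line:
// s.erase(std::remove(s.begin(), s.end(), '\n'), s.end());
// C++20: std::erase(s, '\n');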
2.5. Substring
/// @brief s.substr(pos, len) returns a new string; it does NOT modify the original string
std::string s = "abcdef";
s.substr(2,3); // "cde"
2.6. Find / Search
/// @brief find character
s.find('a');
/// @brief find last
s.rfind('a');
/// @brief find substring
size_t pos = s.find("abc");
if (pos != std::string::npos)
std::cout << "found";
2.6. Compare Strings
if (a == b)
if (a != b)
if (a < b)
// 0 -> equal
// <0 -> smaller
// >0 -> larger
auto ret = a.compare(b);
2.7. Start/End Checking
/// @brief C++20
s.starts_with("abc");
s.ends_with("xyz");
/// @brief before C++20
s.rfind("abc", 0) == 0; // starts with "abc"
s.size() >= 3 &&
s.compare(s.size() - 3, 3, "xyz") == 0; // ends with "xyz"
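The pre-C++20 expressions can be wrapped in two small helpers so the prefix and suffix lengths are not hard-coded. This is a sketch; `has_prefix` and `has_suffix` are names chosen for this example.

```cpp
#include <string>

// Pre-C++20 stand-ins for s.starts_with(prefix) / s.ends_with(suffix).
bool has_prefix(const std::string& s, const std::string& prefix) {
    // rfind with pos = 0 can only match at index 0, so it never scans the rest.
    return s.rfind(prefix, 0) == 0;
}

bool has_suffix(const std::string& s, const std::string& suffix) {
    // The size check prevents s.size() - suffix.size() from wrapping around.
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Using `rfind(prefix, 0)` rather than `find(prefix) == 0` avoids searching the whole string when the prefix is absent.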
2.8. Convert String
/// @brief string to int
int n = std::stoi(s);
double x = std::stod("3.14");
/// @brief int to string
std::string text = std::to_string(123); // "123"
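`std::stoi` and `std::stod` report failures by throwing, so real input usually needs a guard. A minimal sketch of one possible wrapper; `to_int` is a hypothetical name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Converts `text` to int. Returns false instead of throwing when the text is
// not a number, has trailing characters, or does not fit in an int.
bool to_int(const std::string& text, int& out) {
    try {
        std::size_t used = 0;
        const int value = std::stoi(text, &used);   // used = chars consumed
        if (used != text.size()) return false;      // e.g. "12abc"
        out = value;
        return true;
    } catch (const std::invalid_argument&) {        // e.g. "abc" or ""
        return false;
    } catch (const std::out_of_range&) {            // e.g. "99999999999999"
        return false;
    }
}
```

Without the `used` check, `std::stoi("12abc")` would quietly return 12, much like `atof` does in the C section.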
2.9. String Parsing
#include <sstream>
std::string line = "a,b,c";
std::stringstream ss(line);
std::string item;
while(std::getline(ss,item,',')) {
std::cout << item << std::endl;
}
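The same loop can be packaged as a reusable function that returns the pieces instead of printing them. A sketch under the assumption that a single-character delimiter is enough; `split` is a hypothetical helper.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits `line` on `delim` and returns the pieces in order.
// Note: std::getline drops a trailing empty field, so "a,b," yields {"a", "b"}.
std::vector<std::string> split(const std::string& line, char delim) {
    std::vector<std::string> items;
    std::istringstream ss(line);
    std::string item;
    while (std::getline(ss, item, delim)) {
        items.push_back(item);
    }
    return items;
}
```

Empty fields in the middle are preserved, so `split("a,,c", ',')` returns three items, the second of which is empty.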
2.10. Trim Whitespace
s.erase(0, s.find_first_not_of(" \t\n\r"));
s.erase(s.find_last_not_of(" \t\n\r") + 1);
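A non-mutating variant of the two `erase` calls, for cases where the original string should stay intact. This is a sketch; `trim` is a hypothetical helper and treats only space, tab, newline, and carriage return as whitespace.

```cpp
#include <string>

// Returns a copy of `s` without leading/trailing spaces, tabs, and newlines.
std::string trim(const std::string& s) {
    const char* ws = " \t\n\r";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";   // empty or all whitespace
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```

The explicit `npos` check handles the all-whitespace case, which the in-place version covers implicitly because `npos + 1` wraps around to 0.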
2.11. Convert Case
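#include <algorithm>
#include <cctype>
// take the character as unsigned char: passing a negative char to tolower/toupper is undefined behavior
std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) { return std::tolower(c); });
std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) { return std::toupper(c); });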
2.12. String Formatting
| Method | Standard | Pros | Cons |
|---|---|---|---|
| `std::format` | C++20 | Clean, safe, Python-like | Needs newer compiler |
| `+` concatenation | C++98 | Simple | Hard to format numbers |
| `stringstream` | C++98 | Flexible streaming | Slow, verbose |
| `snprintf` | C | Very fast, classic | Unsafe if misused |
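
To make the trade-offs in the table concrete, here is the same two-decimal output produced with `stringstream` and with `snprintf`. This is an illustrative sketch; the function names and the 64-byte buffer size are assumptions made for this example.

```cpp
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// stringstream: type-safe, but the formatting state is set through manipulators.
std::string format_with_stream(const std::string& name, double price) {
    std::ostringstream out;
    out << name << ": " << std::fixed << std::setprecision(2) << price;
    return out.str();
}

// snprintf: compact format string, but the caller owns the buffer size and
// must keep the format specifiers in sync with the argument types.
std::string format_with_snprintf(const std::string& name, double price) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%s: %.2f", name.c_str(), price);
    return buf;   // output is truncated (never overflowed) if it exceeds 63 chars
}
```

With a C++20 compiler and library, `std::format("{}: {:.2f}", name, price)` should produce the same text without a fixed buffer or stream manipulators.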
3. std::cout <<, std::cin >>, std::getline(std::cin >> std::ws, str)
- `std::ws` tells `std::cin` to ignore leading whitespace (spaces, tabs, newlines) before extraction.
- `std::string::length` returns the length of the string, not including the null terminator.
- The `s` suffix makes a `std::string` literal; a literal with no suffix is a C-style string literal, e.g. `std::cout << "goo\n"s` (requires `using namespace std::string_literals;`).
- Initializing or copying a `std::string` is comparatively slow, because every copy duplicates the character data.
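The usual reason for `std::ws` is mixing `>>` with `std::getline`. The sketch below takes any `std::istream`, so it can be tried with a `std::istringstream` in place of `std::cin`; `read_count_then_line` is a hypothetical name.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads an integer and then the rest of the next line from `in`.
// Without std::ws, getline would stop at the newline left behind by `>>`
// and return an empty string.
std::string read_count_then_line(std::istream& in, int& count) {
    std::string line;
    in >> count;
    std::getline(in >> std::ws, line);
    return line;
}
```

Given the input `"3\n  hello world\n"`, `count` becomes 3 and the returned line is `"hello world"`, with the leading spaces skipped as well.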
4. std::string_view (C++17)
std::string_view provides read-only access to an existing string (a C-style string literal, a std::string, or a char array) without making a copy.
- A `std::string_view` that is viewing a string that has been destroyed is sometimes called a dangling view.
- Modifying a `std::string` is likely to invalidate all `std::string_view` objects that view into it. Using an invalidated view (other than to revalidate it) produces undefined behavior.
- The `sv` suffix makes a `std::string_view` literal (requires `using namespace std::string_view_literals;`).
- A `std::string_view` may or may not be null-terminated.
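A short sketch of typical `std::string_view` use: a read-only parameter that accepts literals and `std::string` alike, and a function that returns a sub-view. Both function names are hypothetical, and C++17 is assumed.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Takes any string-like argument without copying its characters.
std::size_t count_char(std::string_view text, char c) {
    std::size_t n = 0;
    for (char ch : text) {
        if (ch == c) ++n;
    }
    return n;
}

// Returns a view of the first word. The result points into the caller's
// buffer, so it is only valid while that buffer is alive and unmodified.
std::string_view first_word(std::string_view text) {
    return text.substr(0, text.find(' '));
}
```

The view returned by `first_word("hello world")` covers only `hello` and is not followed by a `'\0'`, which is why a `std::string_view` should not be handed to functions that expect a C-style string.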