Strings in C#

In this part of the C# tutorial, we will work with string data in more detail. Strings are very important in computer languages. That is why we dedicate a whole chapter to working with strings in C#.

C# string definition

A string is a sequence of characters. More generally, it is a data type which stores a sequence of data values in which the elements stand for characters according to a character encoding. In C#, a string is a sequence of Unicode characters, stored as UTF-16 code units (char values). When a string appears literally in the source code, it is known as a string literal.
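Because a char value holds one UTF-16 code unit, the Length property counts code units rather than user-perceived characters. The following is a minimal sketch of that behaviour; the class name SurrogateSketch and the helper TextElements are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Globalization;

public class SurrogateSketch
{
    // Hypothetical helper: counts text elements (user-perceived characters).
    public static int TextElements(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }

    static void Main()
    {
        // U+1F600 lies outside the Basic Multilingual Plane,
        // so it is stored as a surrogate pair of two char values.
        string smiley = "\U0001F600";

        Console.WriteLine(smiley.Length);                   // 2
        Console.WriteLine(char.IsSurrogatePair(smiley, 0)); // True
        Console.WriteLine(TextElements(smiley));            // 1
        Console.WriteLine("falcon".Length);                 // 6
    }
}
```

For plain ASCII text such as "falcon", the two counts are the same.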

Strings are objects. There are two basic classes for working with strings:

The String is an immutable sequence of characters.
The StringBuilder is a mutable sequence of characters.

In C#, string is an alias for System.String. The string is a language keyword and the System.String is a .NET type.
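The alias relationship can be checked at run time. This is a small sketch; the class name AliasSketch and the helper SameType are hypothetical names used only here.

```csharp
using System;

public class AliasSketch
{
    // Hypothetical helper: reports whether the keyword and the .NET type match.
    public static bool SameType()
    {
        return typeof(string) == typeof(System.String);
    }

    static void Main()
    {
        string a = "falcon";
        String b = "eagle";

        Console.WriteLine(SameType());                   // True
        Console.WriteLine(a.GetType() == b.GetType());   // True
        Console.WriteLine(typeof(string).FullName);      // System.String
    }
}
```

Both spellings compile to the same type, so the choice between them is a matter of style.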

C# initializing strings

There are multiple ways of creating strings, both immutable and mutable. We will show a few of them.

Program.cs
using System;
using System.Text;

namespace Initialization
{
    class Program
    {
        static void Main(string[] args)
        {
            char[] cdb = { 'M', 'y', 'S', 'q', 'l' };

            string lang = "C#";
            String ide = "NetBeans";
            string db = new string(cdb);

            Console.WriteLine(lang);
            Console.WriteLine(ide);
            Console.WriteLine(db);

            StringBuilder sb1 = new StringBuilder(lang);
            StringBuilder sb2 = new StringBuilder();

            sb2.Append("Fields");
            sb2.Append(" of ");
            sb2.Append("glory");

            Console.WriteLine(sb1);
            Console.WriteLine(sb2);
        }
    }
}   

The example shows a few ways of creating System.String and System.Text.StringBuilder objects.

using System.Text;

This statement enables us to use the System.Text.StringBuilder type without qualification.

string lang = "C#";
String ide = "NetBeans";

The most common way is to create a string object from a string literal.

string db = new string(cdb);

Here we create a string object from an array of characters. The string is an alias for the System.String.

StringBuilder sb1 = new StringBuilder(lang);

A StringBuilder object is created from a String.

StringBuilder sb2 = new StringBuilder();
sb2.Append("Fields");
sb2.Append(" of ");
sb2.Append("glory");

We create an empty StringBuilder object and append three strings to it.

$ dotnet run
C#
NetBeans
MySql
C#
Fields of glory

Running the example gives this result.

C# string interpolation

The $ special character prefix identifies a string literal as an interpolated string. An interpolated string is a string literal that might contain interpolated expressions.

String formatting is a similar feature to string interpolation; it is covered later in the chapter.

Program.cs
using System;

namespace Interpolation
{
    class Program
    {
        static void Main(string[] args)
        {
            int age = 23;
            string name = "Peter";

            DateTime now = DateTime.Now;

            Console.WriteLine($"{name} is {age} years old");
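            // A regular interpolated string cannot span several source lines,
            // so the long message is built from two concatenated parts.
            Console.WriteLine($"Hello, {name}! Today is {now.DayOfWeek}, " +
                    $"it's {now:HH:mm} now");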
        }
    }
}

The example presents C# string interpolation.

Console.WriteLine($"{name} is {age} years old");

The interpolated expressions are placed between curly braces {}.

Console.WriteLine($"Hello, {name}! Today is {now.DayOfWeek}, " +
    $"it's {now:HH:mm} now");

An interpolated item can hold an arbitrary expression, such as now.DayOfWeek, and an optional format specifier after a colon, such as HH:mm.

$ dotnet run
Peter is 23 years old
Hello, Peter! Today is Friday, it's 14:58 now

This is the output.
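The following sketch shows an arithmetic expression and a numeric format specifier inside interpolated strings. The class InterpolationSketch and its helpers Area and Price are hypothetical names; FormattableString.Invariant is used here only as an assumption to keep the decimal separator independent of the machine's culture.

```csharp
using System;

public class InterpolationSketch
{
    // An expression (w * h) is evaluated inside the braces.
    public static string Area(int w, int h)
    {
        return $"{w} x {h} = {w * h}";
    }

    // F2 formats the value with two decimal places;
    // the invariant culture always uses a dot as the separator.
    public static string Price(double price)
    {
        return FormattableString.Invariant($"Price: {price:F2}");
    }

    static void Main()
    {
        Console.WriteLine(Area(3, 4));   // 3 x 4 = 12
        Console.WriteLine(Price(12.5));  // Price: 12.50
    }
}
```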

C# regular strings

Regular strings can contain escape sequences, such as new line or tab character, which are interpreted. Regular strings are placed between a pair of double quotes.

Program.cs
using System;
using System.Text;

namespace RegularLiterals
{
    class Program
    {
        static void Main(string[] args)
        {
            string s1 = "deep \t forest";
            string s2 = "deep \n forest";

            Console.WriteLine(s1);
            Console.WriteLine(s2);

            Console.WriteLine("C:\\Users\\Admin\\Documents");
        }
    }
}

The example prints two strings which contain \t and \n escape sequences.

Console.WriteLine("C:\\Users\\Admin\\Documents");

When working with, for example, Windows paths, the backslashes must be escaped.

$ dotnet run
deep     forest
deep
    forest
C:\Users\Admin\Documents    

This is the output.

C# verbatim strings

Verbatim strings do not interpret escape sequences. Verbatim strings are preceded with the @ character. Verbatim strings can be used to work with multiline strings.

Program.cs
using System;

namespace VerbatimLiterals
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(@"deep \t forest");
            Console.WriteLine(@"C:\Users\Admin\Documents");

            var text = @"
            Not marble, nor the gilded monuments
Of princes, shall outlive this powerful rhyme;
But you shall shine more bright in these contents
Than unswept stone, besmeared with sluttish time.";

            Console.WriteLine(text);
        }
    }
}

In this code example we work with verbatim strings.

Console.WriteLine(@"deep \t forest");

The \t special character is not interpreted; it is only printed to the console.

Console.WriteLine(@"C:\Users\Admin\Documents");

Verbatim strings are convenient when we work with paths; the backslashes do not have to be escaped.

var text = @"
    Not marble, nor the gilded monuments
Of princes, shall outlive this powerful rhyme;
But you shall shine more bright in these contents
Than unswept stone, besmeared with sluttish time.";

Verbatim strings allow us to create multiline strings.

$ dotnet run
deep \t forest
C:\Users\Admin\Documents

            Not marble, nor the gilded monuments
Of princes, shall outlive this powerful rhyme;
But you shall shine more bright in these contents
Than unswept stone, besmeared with sluttish time.

This is the output.

C# strings are objects

Strings are objects. They are reference types. Strings are instances of the System.String or System.Text.StringBuilder class. Since they are objects, they have multiple methods available for doing various work.

Program.cs
using System;

namespace Objects
{
    class Program
    {
        static void Main(string[] args)
        {
            string lang = "Java";

            string bclass = lang.GetType().Name;
            Console.WriteLine(bclass);

            string parclass = lang.GetType().BaseType.Name;
            Console.WriteLine(parclass);

            if (lang.Equals(String.Empty))
            {

                Console.WriteLine("The string is empty");
            }
            else
            {

                Console.WriteLine("The string is not empty");
            }

            int len = lang.Length;
            Console.WriteLine("The string has {0} characters", len);
        }
    }
}

In this program, we demonstrate that strings are objects. Objects must have a class name, a parent class and they must also have some methods that we can call or properties to access.

string lang = "Java";

An object of System.String type is created.

string bclass = lang.GetType().Name;
Console.WriteLine(bclass);

We determine the class name of the object to which the lang variable refers.

string parclass = lang.GetType().BaseType.Name;
Console.WriteLine(parclass);

The parent class of our object is retrieved. All objects have at least one parent: the Object.

if (lang.Equals(String.Empty)) 
{    
    Console.WriteLine("The string is empty");
} else 
{    
    Console.WriteLine("The string is not empty");
}

Objects have various methods. With the Equals() method we check if the string is empty.

int len = lang.Length;
Console.WriteLine("The string has {0} characters", len);

The Length property returns the number of characters (char values) in the string.

$ dotnet run
String
Object
The string is not empty
The string has 4 characters

This is the output of the program.

C# mutable & immutable strings

The String is an immutable sequence of characters, while the StringBuilder is a mutable sequence of characters. The next example will show the difference.

Program.cs
using System;
using System.Text;

namespace MutableImmutable
{
    class Program
    {
        static void Main(string[] args)
        {
            string name = "Jane";
            string name2 = name.Replace('J', 'K');
            string name3 = name2.Replace('n', 't');

            Console.WriteLine(name);
            Console.WriteLine(name3);

            StringBuilder sb = new StringBuilder("Jane");
            Console.WriteLine(sb);

            sb.Replace('J', 'K', 0, 1);
            sb.Replace('n', 't', 2, 1);

            Console.WriteLine(sb);
        }
    }
}

Both objects have methods for replacing characters in a string.

string name = "Jane";
string name2 = name.Replace('J', 'K');
string name3 = name2.Replace('n', 't');

Calling a Replace() method on a String results in returning a new modified string. The original string is not changed.

sb.Replace('J', 'K', 0, 1);
sb.Replace('n', 't', 2, 1);

The Replace() method of a StringBuilder replaces all occurrences of a character with a new character within the range given by a start index and a count. Here each range is one character long. The StringBuilder object itself is modified.

$ dotnet run
Jane
Kate
Jane
Kate

This is the output of the program.
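The practical consequence of immutability can be sketched as follows: appending to a string variable creates a new object, while a StringBuilder keeps growing in place. The class MutabilitySketch and the helper BuildDigits are hypothetical names for this illustration.

```csharp
using System;
using System.Text;

public class MutabilitySketch
{
    // Hypothetical helper: builds "0123..." in a single StringBuilder
    // instead of creating a new string in every iteration.
    public static string BuildDigits(int n)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < n; i++)
        {
            sb.Append(i);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string s = "Jane";
        string before = s;

        // += creates a new string; 'before' still refers to "Jane".
        s += " Doe";

        Console.WriteLine(before);                     // Jane
        Console.WriteLine(s);                          // Jane Doe
        Console.WriteLine(ReferenceEquals(before, s)); // False
        Console.WriteLine(BuildDigits(5));             // 01234
    }
}
```

When many pieces are appended in a loop, a StringBuilder is usually the more economical choice, since it avoids allocating an intermediate string per step.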

C# concatenating strings

Immutable strings can be added using the + operator or the Concat() method. They will form a new string which is a chain of all concatenated strings. Mutable strings have the Append() method which builds a string from any number of other strings.

It is also possible to concatenate strings using string formatting and interpolation.

Program.cs
using System;
using System.Text;

namespace Concatenate
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Return" + " of " + "the king.");

            Console.WriteLine(string.Concat(string.Concat("Return", " of "),
                "the king."));

            StringBuilder sb = new StringBuilder();
            sb.Append("Return");
            sb.Append(" of ");
            sb.Append("the king.");

            Console.WriteLine(sb);

            string s1 = "Return";
            string s2 = "of";
            string s3 = "the king.";

            Console.WriteLine("{0} {1} {2}", s1, s2, s3);
            Console.WriteLine($"{s1} {s2} {s3}");
        }
    }
}

The example creates five sentences by concatenating strings.

Console.WriteLine("Return" + " of " + "the king.");

A new string is formed by using the + operator.

Console.WriteLine(string.Concat(string.Concat("Return", " of "), 
    "the king."));

The Concat() method concatenates two strings. The method is a static method of the System.String class.

StringBuilder sb = new StringBuilder();
sb.Append("Return");
sb.Append(" of ");
sb.Append("the king.");

An empty object of the StringBuilder type is created, and its contents are built by calling the Append() method three times.

Console.WriteLine("{0} {1} {2}", s1, s2, s3);

A string is formed with string formatting.

Console.WriteLine($"{s1} {s2} {s3}");

Finally, the strings are added with the interpolation syntax.

$ dotnet run
Return of the king.
Return of the king.
Return of the king.
Return of the king.
Return of the king.

This is the example output.
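As a variation on the example above, the Concat() method also has overloads that accept more than two strings, and the Join() method can insert a separator between the parts. The sketch below uses the hypothetical class ConcatSketch with two illustrative helpers.

```csharp
using System;

public class ConcatSketch
{
    // Three-argument overload of string.Concat; no nesting is needed.
    public static string WithConcat()
    {
        return string.Concat("Return", " of ", "the king.");
    }

    // string.Join puts the separator between the elements.
    public static string WithJoin()
    {
        string[] words = { "Return", "of", "the", "king." };
        return string.Join(" ", words);
    }

    static void Main()
    {
        Console.WriteLine(WithConcat()); // Return of the king.
        Console.WriteLine(WithJoin());   // Return of the king.
    }
}
```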

C# using quotes

When we want to display quotes, for instance in direct speech, the inner quotes must be escaped.

Program.cs
using System;

namespace Quotes
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("There are many stars.");
            Console.WriteLine("He said, \"Which one is your favourite?\"");

            Console.WriteLine(@"
            Lao Tzu has said: 
            ""If you do not change direction, you may end up 
            where you are heading.""
            ");
        }
    }
}

This example prints direct speech.

Console.WriteLine("He said, \"Which one is your favourite?\"");

Inside a regular string, the quote character is escaped with a backslash (\).

Console.WriteLine(@"
Lao Tzu has said: 
""If you do not change direction, you may end up 
where you are heading.""
");

Inside a verbatim string, the quote is preceded with another quote.

$ dotnet run
There are many stars.
He said, "Which one is your favourite?"

            Lao Tzu has said:
            "If you do not change direction, you may end up
            where you are heading."

This is the output of the program.

C# comparing strings

We can compare two strings with the == operator.

Program.cs
using System;

namespace CompareString
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("12" == "12");
            Console.WriteLine("17" == "9");
            Console.WriteLine("aa" == "ab");
        }
    }
}

In the example program, we compare strings.

$ dotnet run
True
False
False

This is the output of the program.

The string.Compare() method compares two specified strings and returns an integer that indicates their relative position in the sort order. If the returned value is less than zero, the first string is less than the second. If it returns zero, both strings are equal. Finally, if the returned value is greater than zero, the first string is greater than the second.

Program.cs
using System;

namespace CompareString2
{
    class Program
    {
        static void Main(string[] args)
        {
            string str1 = "ZetCode";
            string str2 = "zetcode";

            Console.WriteLine(string.Compare(str1, str2, true));
            Console.WriteLine(string.Compare(str1, str2, false));
        }
    }
}

There is an optional third ignoreCase argument. It determines if the case should be honored or not.

Console.WriteLine(string.Compare(str1, str2, true));

We compare the two strings and ignore the case. This line prints 0 to the console.
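The Compare() overload with the ignoreCase flag uses the current culture, so its result for differently cased strings can vary between systems. When a culture-independent answer is wanted, an ordinal comparison can be requested explicitly. The following is a sketch under that assumption; OrdinalCompareSketch and SameIgnoringCase are hypothetical names.

```csharp
using System;

public class OrdinalCompareSketch
{
    // Hypothetical helper: case-insensitive, culture-independent equality.
    public static bool SameIgnoringCase(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        string str1 = "ZetCode";
        string str2 = "zetcode";

        Console.WriteLine(str1 == str2);                 // False
        Console.WriteLine(SameIgnoringCase(str1, str2)); // True

        // Ordinal comparison looks at the numeric char values:
        // 'Z' (90) comes before 'z' (122), so the result is negative.
        int res = string.Compare(str1, str2, StringComparison.Ordinal);
        Console.WriteLine(res < 0);                      // True
    }
}
```

Only the sign of the returned integer is meaningful; the exact value should not be relied upon.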

C# string elements

A string is a sequence of characters. A character is a basic element of a string.

Program.cs
using System;

namespace StringElements
{
    class Program
    {
        static void Main(string[] args)
        {
            char[] crs = { 'Z', 'e', 't', 'C', 'o', 'd', 'e' };
            String s = new String(crs);

            char c1 = s[0];
            char c2 = s[(s.Length - 1)];

            Console.WriteLine(c1);
            Console.WriteLine(c2);

            int i1 = s.IndexOf('e');
            int i2 = s.LastIndexOf('e');

            Console.WriteLine("The first index of character e is " + i1);
            Console.WriteLine("The last index of character e is " + i2);

            Console.WriteLine(s.Contains("t"));
            Console.WriteLine(s.Contains("f"));

            char[] elements = s.ToCharArray();

            foreach (char el in elements)
            {
                Console.WriteLine(el);
            }
        }
    }
}

In the first example, we will work with an immutable string.

char[] crs = {'Z', 'e', 't', 'C', 'o', 'd', 'e' };
String s = new String(crs);

A new immutable string is formed from an array of characters.

char c1 = s[0];
char c2 = s[(s.Length-1)];

Using the array access notation, we get the first and the last char value of the string.

int i1 = s.IndexOf('e');
int i2 = s.LastIndexOf('e');

With the above methods, we get the first and the last occurrence of the character 'e'.

Console.WriteLine(s.Contains("t"));
Console.WriteLine(s.Contains("f"));

With the Contains() method we check if the string contains the given substring, here "t" and "f". The method returns a boolean value.

char[] elements = s.ToCharArray();

foreach (char el in elements) 
{    
    Console.WriteLine(el);
} 

The ToCharArray() method creates a character array from the string. We go through the array and print each of the characters.

$ dotnet run
Z
e
The first index of character e is 1
The last index of character e is 6
True
False
Z
e
t
C
o
d
e

This is the example output.

In the second example, we will work with the elements of a mutable string.

Program.cs
using System;
using System.Text;

public class StringBuilderElements
{
    static void Main() 
    {
        StringBuilder sb = new StringBuilder("Misty mountains");
        Console.WriteLine(sb);
        
        sb.Remove(sb.Length-1, 1);
        Console.WriteLine(sb);
        
        sb.Append('s');
        Console.WriteLine(sb);
        
        sb.Insert(0, 'T');
        sb.Insert(1, 'h');
        sb.Insert(2, 'e');
        sb.Insert(3, ' ');
        Console.WriteLine(sb);
        
        sb.Replace('M', 'm', 4, 1);
        Console.WriteLine(sb); 
    }
}

A mutable string is formed. We modify the contents of the string by deleting, appending, inserting, and replacing characters.

sb.Remove(sb.Length-1, 1);

This line deletes the last character.

sb.Append('s');

The deleted character is appended back to the string.

sb.Insert(0, 'T');
sb.Insert(1, 'h');
sb.Insert(2, 'e');
sb.Insert(3, ' ');

We insert four characters at the beginning of the string.

sb.Replace('M', 'm', 4, 1);

Finally, we replace a character at index 4.

$ dotnet run
Misty mountains
Misty mountain
Misty mountains
The Misty mountains
The misty mountains

From the output we can see how the mutable string is changing.
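Besides the methods shown above, a StringBuilder also lets us read and assign single characters through its indexer. The sketch below illustrates this; the class BuilderIndexerSketch and the helper Capitalize are hypothetical names.

```csharp
using System;
using System.Text;

public class BuilderIndexerSketch
{
    // Hypothetical helper: upper-cases the first character in place.
    public static string Capitalize(string s)
    {
        var sb = new StringBuilder(s);

        if (sb.Length > 0)
        {
            // The indexer is writable on a StringBuilder,
            // unlike the read-only indexer of a string.
            sb[0] = char.ToUpperInvariant(sb[0]);
        }

        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Capitalize("misty mountains")); // Misty mountains
    }
}
```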

C# string Join and Split

The Join() method joins strings into one string, and the Split() method splits a string into an array of substrings.

Program.cs
using System;

namespace JoinSplit
{
    class Program
    {
        static void Main(string[] args)
        {
            var items = new string[] { "C#", "Visual Basic", "Java", "Perl" };

            var langs = string.Join(",", items);
            Console.WriteLine(langs);

            string[] langs2 = langs.Split(',');

            foreach (string lang in langs2)
            {
                Console.WriteLine(lang);
            }
        }
    }
}  

In our program, we will join and split strings.

var items = new string[] { "C#", "Visual Basic", "Java", "Perl" };

This is an array of strings. These strings are going to be joined.

var langs = string.Join(",", items);

All words from the array are joined. We build one string from them. There will be a comma character between each two words.

string[] langs2 = langs.Split(',');

As a reverse operation, we split the langs string. The Split() method returns an array of words, delimited by a character. In our case it is a comma character.

foreach (string lang in langs2)
{
    Console.WriteLine(lang);
}

We go through the array and print its elements.

$ dotnet run
C#,Visual Basic,Java,Perl
C#
Visual Basic
Java
Perl

This is the output of the example.
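Real input often contains stray spaces or empty fields between delimiters. A possible way to handle this, sketched below, is to pass StringSplitOptions.RemoveEmptyEntries to Split() and trim each part. The class SplitOptionsSketch, the helper SplitClean and the sample input are assumptions made for this illustration.

```csharp
using System;

public class SplitOptionsSketch
{
    // Hypothetical helper: splits on commas, drops empty fields,
    // and trims the surrounding white space of each part.
    public static string[] SplitClean(string input)
    {
        string[] parts = input.Split(new[] { ',' },
            StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < parts.Length; i++)
        {
            parts[i] = parts[i].Trim();
        }

        return parts;
    }

    static void Main()
    {
        string[] langs = SplitClean("C#, Visual Basic,,Java, Perl");

        Console.WriteLine(langs.Length);            // 4
        Console.WriteLine(string.Join("|", langs)); // C#|Visual Basic|Java|Perl
    }
}
```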

C# common methods

Next, we present a couple of additional common string methods.

Program.cs
using System;

namespace CommonMethods
{
    class Program
    {
        static void Main(string[] args)
        {
            string word = "Determination";

            Console.WriteLine(word.Contains("e"));
            Console.WriteLine(word.IndexOf("e"));
            Console.WriteLine(word.LastIndexOf("i"));

            Console.WriteLine(word.ToUpper());
            Console.WriteLine(word.ToLower());
        }
    }
}

We introduce five string methods in the above example.

Console.WriteLine(word.Contains("e"));

The Contains() method returns True if the string contains the specified substring.

Console.WriteLine(word.IndexOf("e"));

The IndexOf() method returns the index of the first occurrence of a substring in the string.

Console.WriteLine(word.LastIndexOf("i"));

The LastIndexOf() method returns the index of the last occurrence of a substring in the string.

Console.WriteLine(word.ToUpper());
Console.WriteLine(word.ToLower());

Letters of the string are converted to uppercase with the ToUpper() method and to lowercase with the ToLower() method.

$ dotnet run
True
1
10
DETERMINATION
determination

This is the output of the program.

C# string Copy vs Clone

We will describe the difference between two methods: Copy() and Clone(). The Copy() method creates a new instance of string with the same value as the specified string. The Clone() method returns a reference to the string which is being cloned. It is not an independent copy of the string on the heap; it is another reference to the same string. Note that recent .NET versions mark string.Copy() as obsolete, so the compiler may issue a warning for this example.

Program.cs
using System;

namespace CopyClone
{
    class Program
    {
        static void Main(string[] args)
        {
            string str = "ZetCode";

            string cloned = (string) str.Clone();
            string copied = string.Copy(str);

            Console.WriteLine(str.Equals(cloned)); // prints True
            Console.WriteLine(str.Equals(copied)); // prints True

            Console.WriteLine(ReferenceEquals(str, cloned)); // prints True
            Console.WriteLine(ReferenceEquals(str, copied)); // prints False
        }
    }
}

Our example demonstrates the difference between the two methods.

string cloned = (string) str.Clone();
string copied = string.Copy(str);

The string value is cloned and copied.

Console.WriteLine(str.Equals(cloned)); // prints True
Console.WriteLine(str.Equals(copied)); // prints True

The Equals() method determines whether two string objects have the same value. The contents of all three strings are the same.

Console.WriteLine(ReferenceEquals(str, cloned)); // prints True
Console.WriteLine(ReferenceEquals(str, copied)); // prints False

The ReferenceEquals() method compares two references. Comparing the copied string to the original string returns False because they are two distinct objects, while the cloned string is the very same object as the original.
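Since recent .NET versions flag string.Copy() as obsolete, a distinct string instance with the same value can instead be produced from a character array. This is only a sketch of one possible alternative; the class CopySketch and the helper CopyOf are hypothetical names, and the approach assumes a non-empty input string.

```csharp
using System;

public class CopySketch
{
    // Hypothetical helper: builds a new string object from the characters
    // of the given (non-empty) string.
    public static string CopyOf(string s)
    {
        return new string(s.ToCharArray());
    }

    static void Main()
    {
        string str = "ZetCode";
        string copied = CopyOf(str);

        Console.WriteLine(str.Equals(copied));           // True
        Console.WriteLine(ReferenceEquals(str, copied)); // False
    }
}
```

Because strings are immutable, such an independent copy is rarely needed in practice.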

C# formatting strings

In the next examples, we will format strings. .NET has a feature called composite formatting. It is supported by the Format() and WriteLine() methods. Such a method takes a composite format string and a list of objects as input. The format string consists of fixed text and format items. These format items are indexed placeholders which correspond to the objects in the list.

The format item has the following syntax:

{index[,length][:formatString]}

The index component is mandatory. It is a number starting from 0 that refers to an item from the list of objects. Multiple items can refer to the same element of the list of objects. An object is ignored if it is not referenced by a format item. If we refer outside the bounds of the list of objects, a runtime exception is thrown.

The length component is optional. It is the minimum number of characters in the string representation of the parameter. If positive, the parameter is right-aligned; if negative, it is left-aligned. If it is specified, there must be a comma separating the index and the length.

The formatString is optional. It is a string that formats a value in a specific way. It can be used to format dates, times, numbers or enumerations.
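The effect of a negative and a positive length component can be sketched as follows. The class AlignmentSketch and the helper Row are hypothetical names; the vertical bars are added only to make the padding visible.

```csharp
using System;

public class AlignmentSketch
{
    // {0,-10} left-aligns the name in a field of 10 characters,
    // {1,5} right-aligns the quantity in a field of 5 characters.
    public static string Row(string name, int qty)
    {
        return string.Format("|{0,-10}|{1,5}|", name, qty);
    }

    static void Main()
    {
        Console.WriteLine(Row("oranges", 2));  // |oranges   |    2|
        Console.WriteLine(Row("apples", 14));  // |apples    |   14|
    }
}
```

If a value is longer than the given length, it is not truncated; the length is only a minimum.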

We start with a simple example that builds messages from indexed format items.

Program.cs
using System;

namespace Format1
{
    class Program
    {
        static void Main(string[] args)
        {
            int oranges = 2;
            int apples = 4;
            int bananas = 3;

            string str1 = "There are {0} oranges, {1} apples and {2} bananas";
            string str2 = "There are {1} oranges, {2} bananas and {0} apples";

            Console.WriteLine(str1, oranges, apples, bananas);
            Console.WriteLine(str2, apples, oranges, bananas);
        }
    }
}   

We print a simple message to the console. We use only the index component of the format item.

string str1 = "There are {0} oranges, {1} apples and {2} bananas";

The {0}, {1}, and {2} are format items. We specify the index component. Other components are optional.

Console.WriteLine(str1, oranges, apples, bananas);

Now we put together the composite formatting. We have the string and the list of objects (oranges, apples, bananas). The {0} format item refers to the oranges. The WriteLine() method replaces the {0} format item with the contents of the oranges variable.

string str2 = "There are {1} oranges, {2} bananas and {0} apples";

The order of the format items referring to the objects is not important.

$ dotnet run
There are 2 oranges, 4 apples and 3 bananas
There are 2 oranges, 3 bananas and 4 apples

We can see the outcome of the program.

The next example will format numeric data.

Program.cs
using System;

namespace Format2
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("{0}  {1, 12}", "Decimal", "Hexadecimal");

            Console.WriteLine("{0:D}  {1,8:X}", 502, 546);
            Console.WriteLine("{0:D}  {1,8:X}", 345, 765);
            Console.WriteLine("{0:D}  {1,8:X}", 320, 654);
            Console.WriteLine("{0:D}  {1,8:X}", 120, 834);
            Console.WriteLine("{0:D}  {1,8:X}", 620, 454);
        }
    }
}

We print numbers in a decimal and hexadecimal format. We also align the numbers using the length component.

Console.WriteLine("{0:D}  {1,8:X}", 502, 546);

The {0:D} format item specifies that the first item from the list of supplied objects is taken and formatted in the decimal format. The {1,8:X} format item takes the second item and formats it in the hexadecimal format (:X) with a minimum length of 8 characters (,8). Because the number has only three characters, it is right-aligned and padded with spaces.

$ dotnet run
Decimal   Hexadecimal
502       222
345       2FD
320       28E
120       342
620       1C6

Running the example we get this outcome.

The last two examples will format numeric and date data.

Program.cs
using System;

namespace Format3
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(string.Format("Number: {0:N}", 126));
            Console.WriteLine(string.Format("Scientific: {0:E}", 126));
            Console.WriteLine(string.Format("Currency: {0:C}", 126));
            Console.WriteLine(string.Format("Percent: {0:P}", 126));
            Console.WriteLine(string.Format("Hexadecimal: {0:X}", 126));
        }
    }
}

The example demonstrates the standard formatting specifiers for numbers. Number 126 is printed in five different formats: normal, scientific, currency, percent and hexadecimal. The exact output depends on the current culture of the system.

$ dotnet run
Number: 126.00
Scientific: 1.260000E+002
Currency: $126.00
Percent: 12,600.00%
Hexadecimal: 7E

This is the output of the program.

Finally, we will format date and time data.

Program.cs
using System;

namespace Format4
{
    class Program
    {
        static void Main(string[] args)
        {
            DateTime today = DateTime.Now;

            Console.WriteLine(string.Format("Short date: {0:d}", today));
            Console.WriteLine(string.Format("Long date: {0:D}", today));
            Console.WriteLine(string.Format("Short time: {0:t}", today));
            Console.WriteLine(string.Format("Long time: {0:T}", today));
            Console.WriteLine(string.Format("Month: {0:M}", today));
            Console.WriteLine(string.Format("Year: {0:Y}", today));
        }
    }
}

The code example shows six different formats for the current date and time.

$ dotnet run
Short date: 12/14/2018
Long date: Friday, December 14, 2018
Short time: 7:22 PM
Long time: 7:22:00 PM
Month: December 14
Year: December 2018

This is the output of the example.
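The output above reflects the culture settings of the machine on which the example was run. When a culture-independent result is needed, an explicit culture can be supplied. The following sketch assumes the invariant culture and a fixed date; the class InvariantDateSketch and the helper Stamp are hypothetical names.

```csharp
using System;
using System.Globalization;

public class InvariantDateSketch
{
    // Hypothetical helper: formats a date with a custom pattern
    // using the invariant culture.
    public static string Stamp(DateTime dt)
    {
        return dt.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        // A fixed date is used so that the output is reproducible.
        var dt = new DateTime(2018, 12, 14, 19, 22, 0);

        Console.WriteLine(Stamp(dt)); // 2018-12-14 19:22

        // string.Format also accepts a format provider as its first argument.
        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "Short date: {0:d}", dt)); // Short date: 12/14/2018
    }
}
```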

This part of the C# tutorial covered strings.