C# Professional - Processing Text

talent-agile

Strings and Encoding

Basics

String is a type from the System namespace that is used for most text-related operations.

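The String type has a few particularities compared to other types:

  • It is a reference type
  • It is immutable: once created, the value of a String cannot be changed, and operations that appear to modify it return a new String
  • It behaves like a value type in comparisons: the == operator and Equals compare the characters, not the references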
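
The three points above can be checked with a small sketch. The class and method names (StringBasics, EqualByValue, and so on) are hypothetical and only serve the illustration; the String members used are standard.

```csharp
using System;

public static class StringBasics
{
    // Two strings built separately but holding the same characters
    // compare equal with ==, because == compares contents for strings.
    public static bool EqualByValue()
    {
        string a = "hello";
        string b = new string(new[] { 'h', 'e', 'l', 'l', 'o' });
        return a == b;
    }

    // The same two strings are still distinct objects on the heap,
    // which shows that String is a reference type.
    public static bool DistinctObjects()
    {
        string a = "hello";
        string b = new string(new[] { 'h', 'e', 'l', 'l', 'o' });
        return !object.ReferenceEquals(a, b);
    }

    // ToUpperInvariant does not modify the original string;
    // it returns a new instance and leaves the original untouched.
    public static string OriginalAfterToUpper()
    {
        string original = "hello";
        string upper = original.ToUpperInvariant();
        return original + "/" + upper;
    }
}
```

Calling StringBasics.OriginalAfterToUpper() returns "hello/HELLO": the original value is still there after the call, which is what immutability means in practice.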

Encodings

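An encoding specifies:

  • How characters are converted to bytes when text is stored or transmitted
  • How those bytes are converted back to characters when text is read, for example before it is displayed on screen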

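The Encoding type (in the System.Text namespace) exposes several common encodings as static properties:

  • Default (avoid using this one: its result depends on the system configuration)
  • ASCII
  • Unicode (UTF-16, little-endian byte order)
  • UTF7 (obsolete in recent .NET versions)
  • UTF8
  • UTF32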
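
To see how the choice of encoding changes the bytes produced for the same text, here is a minimal sketch. EncodingDemo and its two methods are hypothetical helper names; Encoding.GetBytes and Encoding.GetString are the standard System.Text calls.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Number of bytes needed to represent the text in the given encoding.
    public static int ByteCount(Encoding encoding, string text)
    {
        return encoding.GetBytes(text).Length;
    }

    // Encodes the text to bytes and decodes it back with the same encoding.
    // Characters the encoding cannot represent are lost in the process.
    public static string RoundTrip(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }
}
```

For the text "caf\u00e9" (café), ByteCount gives 5 with Encoding.UTF8, 8 with Encoding.Unicode and 16 with Encoding.UTF32. With Encoding.ASCII the accented character cannot be represented, so RoundTrip returns "caf?".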

Text in String instances is stored as UTF-16. You can include a specific Unicode character in a String using the escape syntax \u followed by four hexadecimal digits, such as \u03a0 (the Greek capital letter pi, Π).
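
A short sketch of what this storage format means for the Length property. Utf16Demo and its methods are hypothetical names. The \U escape with eight hexadecimal digits is not covered in the text above; it is used here as an assumption-free standard C# feature to reach a character outside the range that a single \u escape can express.

```csharp
using System;

public static class Utf16Demo
{
    // "\u03a0" is one character stored in one UTF-16 code unit.
    public static int PiLength()
    {
        string pi = "\u03a0";
        return pi.Length;
    }

    // The numeric value of that code unit is the code point 0x03A0.
    public static int PiCodePoint()
    {
        string pi = "\u03a0";
        return (int)pi[0];
    }

    // A character above U+FFFF needs two code units (a surrogate pair),
    // so Length counts code units, not user-visible characters.
    public static int EmojiLength()
    {
        string emoji = "\U0001F600";
        return emoji.Length;
    }

    // char.ConvertToUtf32 recombines the surrogate pair into one code point.
    public static int EmojiCodePoint()
    {
        string emoji = "\U0001F600";
        return char.ConvertToUtf32(emoji, 0);
    }
}
```

PiLength() returns 1 while EmojiLength() returns 2, which is worth keeping in mind when indexing into or truncating strings that may contain such characters.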

Building Strings

As we saw before, String is immutable, which implies that every time you modify a String, a new instance is created. If you have a large number of modifications to make, for example appending in a loop, this allocates many short-lived intermediate strings, which increases memory consumption and puts pressure on the garbage collector.

To avoid this, the .NET Framework provides the StringBuilder class (in the System.Text namespace), which is designed for such scenarios.
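
The difference between the two approaches can be sketched as follows. BuildingStrings and its method names are hypothetical; StringBuilder.Append and ToString are the standard API.

```csharp
using System;
using System.Text;

public static class BuildingStrings
{
    // Repeated concatenation: each += creates a new String instance
    // and copies everything accumulated so far.
    public static string JoinWithConcatenation(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += i + ";";
        }
        return result;
    }

    // StringBuilder appends into an internal buffer and only creates
    // the final String when ToString is called.
    public static string JoinWithStringBuilder(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i).Append(';');
        }
        return builder.ToString();
    }
}
```

Both methods return the same text ("0;1;2;" for a count of 3). For a handful of concatenations the simpler += form is fine; the StringBuilder version pays off when the number of modifications is large or unknown in advance.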
