Showing posts with label Advanced. Show all posts

Tuesday, March 2, 2021

Pointers in Go

Understand all about declaring and using pointers in Go

If you're coming from Python, Java, JavaScript, C# and others, talking pointers may scare you. But fear not! Go's approach to pointers is very elegant and definitely very easy to understand.

Variables

We cannot talk pointers without understanding variables. Every variable you declare in Go (or any programming language for that matter) is essentially a way to identify a block of memory that contains a value assigned to it.

There are multiple ways to declare variables in Go. For example:

package main

import "fmt"

func main() {
     var name = "John Smith"
     var email string = "john@smith.com"
     fmt.Printf("Name: %v, Email: %v", name, email)
}

We could have used the & operator to access the memory addresses of our variables:

fmt.Printf("&name: %v, &email: %v", &name, &email)
&name: 0xc00010a040, &email: 0xc00010a050

Pointers

But what about pointers? Pointers are nothing more than variables that hold the memory address of a value, in other words, the location at which a value is stored. In Go, the * character is used both to declare a pointer and to read the value it points to. For example, the type *T is a pointer to a T value.

The zero value for an unassigned pointer is always nil.

Declaring Pointers

To declare a pointer, put the * character before the type it points to. Like other variables, pointers have to be declared before they are used, following this syntax:

var myVar *type

For example:

var p *int // declares a pointer to int

Assigning values to pointers

To assign a value to a pointer, point it to the address of another variable. That's done by using the & operator we saw previously:

i := 22 // assigns 22 to i (i is an int)
p = &i  // makes p point to the address of i

Reading pointer values

To read the values of our pointers we use the * operator:

fmt.Println(*p) // read i through the pointer p
What you see above is called "dereferencing" or "indirecting".

Nil Pointers

In Go, a pointer that has been declared but not yet assigned holds its zero value, nil. A nil pointer does not point to any variable, and it prints as address zero, as we can see in the example below:

var q *int
fmt.Printf("Q: %v %x", q, q)
Q: <nil> 0

But what happens if we try to dereference it before it points to anything? Yes, a run-time panic:

fmt.Printf("Q: %v", *q)

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4990a3]

Checking if a pointer is assigned

To check if a pointer has been assigned, compare it to nil. If p != nil is true, your pointer holds a memory address and it's safe to access its value:

fmt.Println("Is p assigned?", p != nil)
Is p assigned? true
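
As a minimal sketch of the nil check above, the snippet below wraps it in a small function. The name valueOr and its fallback parameter are made up for this illustration and are not part of Go's standard library.

```go
package main

import "fmt"

// valueOr returns the value p points to, or fallback when p is nil.
// It is a hypothetical helper written for this illustration.
func valueOr(p *int, fallback int) int {
	if p == nil {
		return fallback
	}
	return *p
}

func main() {
	var q *int // never assigned, so it is nil
	i := 22
	p := &i

	fmt.Println(valueOr(q, -1)) // -1
	fmt.Println(valueOr(p, -1)) // 22
}
```

Checking before dereferencing keeps the nil pointer panic shown earlier from ever being triggered.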

Comparing pointers

Can we compare pointers? Definitely, by using the == and != operators. Two pointers are equal if they point to the same variable or if both are nil:

var x, y int
fmt.Println(&x == &x, &x == &y, &x == nil)
true false false
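
One detail worth illustrating: comparing pointers compares addresses, not the values stored at those addresses. The sketch below uses two hypothetical helpers, sameVariable and sameValue, to show the difference.

```go
package main

import "fmt"

// sameVariable reports whether a and b point to the same variable.
func sameVariable(a, b *int) bool {
	return a == b
}

// sameValue reports whether a and b are both non-nil and point to equal values.
func sameValue(a, b *int) bool {
	return a != nil && b != nil && *a == *b
}

func main() {
	x, y := 5, 5
	fmt.Println(sameVariable(&x, &y)) // false: two distinct variables
	fmt.Println(sameValue(&x, &y))    // true: both hold 5
	fmt.Println(sameVariable(&x, &x)) // true
}
```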

When to use Pointers

So when should you use pointers? Since pointers allow accessing variables indirectly, you may think you need them whenever you access your values. But the rule of thumb is that you should use pointers for sharing data, that is, whenever you want another function to be able to modify your data.
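
To make the rule of thumb concrete, here is a small sketch comparing a function that receives a copy with one that receives a pointer. The function names incrementByValue and incrementByPointer are invented for this example.

```go
package main

import "fmt"

// incrementByValue receives a copy of n, so the caller's variable is unchanged.
func incrementByValue(n int) {
	n++
}

// incrementByPointer receives the address of the caller's variable
// and modifies that variable through the pointer.
func incrementByPointer(n *int) {
	*n++
}

func main() {
	count := 1

	incrementByValue(count)
	fmt.Println(count) // 1

	incrementByPointer(&count)
	fmt.Println(count) // 2
}
```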

Limitations of Pointers

If you're coming from C, there's one important detail: unlike C, Go has no pointer arithmetic.

Conclusion

In this post we learned about pointers in Go. If you're coming from Python, Java, JavaScript, C# and others, talking pointers may scare you, but fear not! Go's approach to pointers is very elegant and definitely very easy to understand.

Remember that pointers make your code a little less legible, so use them only when you need to share some value.

See Also

Tuesday, January 26, 2021

Strings in Go

Learn the most important aspects of Strings in Go

In previous posts we discussed Runes and Variables. Today, let's continue our study of Go's basic types by learning more about Strings in Go. Since Strings in Go are not as obvious as in your favorite programming language, we recommend exploring this article at your own pace.

Declaring Strings in Go

You probably know how to declare variables in Go. Declaring strings is as simple as:

s := "Hello Gopher" // or...
var s2 string

A string value can be written as a string literal, a sequence of bytes enclosed in double quotes. Strings in Go can also contain any Unicode character:

s2 = "Hello 😀"

We can also index and slice Strings like arrays to access parts of them. For example:

fmt.Println(s[:5]) // "Hello"
fmt.Println(s[6:]) // "Gopher"
fmt.Println(s[:])  // "Hello Gopher"

Concatenating Strings

Concatenating Strings in Go is similar to Java, JavaScript and Python as Go also utilizes the + operator:

s3 := "Goodbye, " + s[6:] // "Goodbye, Gopher"

Because strings are immutable, each concatenation allocates a new string and copies both operands into it. Avoid using it in loops over large inputs as it will impact the performance of your application.
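
When many pieces have to be joined in a loop, one common alternative is strings.Builder from the standard library, which appends to an internal buffer. The sketch below assumes a hypothetical helper named joinWithBuilder; only strings.Builder and its methods come from the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWithBuilder joins words with a single space between them.
// It appends to a strings.Builder instead of using + in the loop.
func joinWithBuilder(words []string) string {
	var b strings.Builder
	for i, w := range words {
		if i > 0 {
			b.WriteByte(' ')
		}
		b.WriteString(w)
	}
	return b.String()
}

func main() {
	fmt.Println(joinWithBuilder([]string{"Hello", "Gopher"})) // Hello Gopher
}
```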

Comparing Strings

Strings may be compared with operators like == and <. And since the comparison is done byte by byte, the result is a sweet natural lexicographic ordering:  

name1 := "john"
name2 := "smith"
fmt.Println(name1 == name2) // false
fmt.Println(name1 > name2)  // false

Substrings

Go also gives easy access to parts of your string. Indexes count bytes and start at 0. For example:

s[3] - returns the byte at index 3 (the fourth byte of the string)
s[5:] - returns a substring from index 5 to the end
s[:5] - returns a substring from index 0 up to, but not including, index 5
s[2:5] - returns a substring from index 2 up to, but not including, index 5
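
Because these indexes count bytes, slicing can cut a multi-byte character in half. A possible workaround, sketched below, is to convert the string to a slice of runes first. The helper firstRunes is hypothetical and written only for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRunes returns the first n characters (runes) of s,
// or all of s when it has fewer than n runes.
func firstRunes(s string, n int) string {
	r := []rune(s)
	if n < 0 {
		n = 0
	}
	if n > len(r) {
		n = len(r)
	}
	return string(r[:n])
}

func main() {
	s := "h\u00e9llo" // "héllo": the é takes two bytes in UTF-8

	// Slicing by bytes stops in the middle of é, leaving invalid UTF-8.
	fmt.Println(utf8.ValidString(s[:2])) // false

	// Slicing by runes keeps whole characters.
	fmt.Println(firstRunes(s, 2)) // hé
}
```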

String Length

If you thought that Go's built-in len function returns the number of characters in a string, think again. As per the official documentation, len applied to a string returns the number of bytes in the string (not the number of characters). So if your string contains any character that takes more than one byte in UTF-8, the result will be larger than the character count. For example:

s := "Hello, 😊"
fmt.Println(len(s)) // returns 11. Did you expect 8?

To solve the above problem, we should resort to the package unicode/utf8:

s := "Hello, 😊"
fmt.Println(utf8.RuneCountInString(s)) // yes! now we have an 8!

Loops over Strings

As per the above, loops over strings should use range instead of len. Example:

// incorrect for non-ASCII text: s[i] is a single byte, not a character
for i := 0; i < len(s); i++ {
    fmt.Printf("%d %q\n", i, s[i])
}

// correct
for i, r := range s {
    fmt.Printf("%d\t%q\t%d\n", i, r, r)
}

Immutability

Another important concept of Strings in Go is that they are immutable. That means that once assigned, the byte sequence contained in a string value cannot be changed:

s[7] = 'a'   // compiler error: cannot assign to s[7]

But, as expected, a string can be reassigned another value:

s = "Hello Again"
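
If a modified version of a string is needed, one option is to copy it into a byte slice, change the copy, and convert back. The sketch below uses a hypothetical helper named replaceByteAt; it works on bytes, so it is only appropriate for single-byte (ASCII) positions.

```go
package main

import "fmt"

// replaceByteAt returns a copy of s with the byte at index i replaced by b.
// The original string is left untouched.
func replaceByteAt(s string, i int, b byte) string {
	buf := []byte(s) // the conversion copies the bytes of s
	buf[i] = b
	return string(buf)
}

func main() {
	s := "Hello Gopher"
	changed := replaceByteAt(s, 0, 'J')

	fmt.Println(s)       // Hello Gopher
	fmt.Println(changed) // Jello Gopher
}
```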

Escape Sequences

Within a double-quoted string literal, escape sequences that begin with a backslash (\) can be used to insert arbitrary byte values into the string. The most common are:

  • \a - “alert” or bell
  • \b - backspace
  • \f - form feed
  • \n - newline
  • \r - carriage return
  • \t - tab
  • \v - vertical tab
  • \' - single quote (only in the rune literal '\'')
  • \" - double quote (only within "..." literals)
  • \\ - backslash

Runes, ASCII, Unicode and UTF

And since we're talking Go Strings, Runes, ASCII and Unicode, let's review a little about these topics.

ASCII

ASCII (American Standard Code for Information Interchange) is a character encoding standard created in the 1960s and still widely used. ASCII only supports 128 characters, such as unaccented letters, numbers and a few other characters.

Unicode

Due to ASCII's limitations, Unicode was created as a superset of it. Today it defines over 140k characters (but capable of more than a million code points), more than sufficient to handle most of the characters and symbols present in the world. The Unicode standard defines Unicode Transformation Formats (UTF) UTF-8, UTF-16, and UTF-32, and several other encodings.

UTF-8

Today, UTF-8 is the most common encoding on the internet. UTF-8 was invented by Ken Thompson and Rob Pike, two of the creators of Go. It uses between 1 and 4 bytes to represent each rune: only one byte for ASCII characters, and 2 or 3 bytes for most other runes in common use. The first 128 Unicode code points represent the ASCII characters, which means that any ASCII text is also a UTF-8 text.
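
To see the variable-length encoding in practice, the sketch below prints the UTF-8 bytes of a few runes. The helper utf8Bytes is a made-up name for this illustration; it relies only on Go's built-in string and byte slice conversions.

```go
package main

import "fmt"

// utf8Bytes returns the UTF-8 encoding of r as a byte slice.
func utf8Bytes(r rune) []byte {
	return []byte(string(r))
}

func main() {
	for _, r := range []rune{'a', '\u00e9', '世', '😀'} {
		b := utf8Bytes(r)
		fmt.Printf("%U %d byte(s): % x\n", r, len(b), b)
	}
	// Output:
	// U+0061 1 byte(s): 61
	// U+00E9 2 byte(s): c3 a9
	// U+4E16 3 byte(s): e4 b8 96
	// U+1F600 4 byte(s): f0 9f 98 80
}
```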

Unicode Standard Notation

Unicode has a standard notation for code points: U+ followed by the code point in hexadecimal. For example, U+1F600 represents the Unicode character 😀. To print a rune in this notation in Go, use the %U verb.

Printing Runes

Runes are usually printed with the following verbs:

  • %c: to print the character
  • %q: to print the character within quotes
  • %U: to print the value of the character in Unicode notation (U+<value>) 

For example:

ascii := 'a'
unicode := '😀'
newline := '\n'
fmt.Printf("%d %[1]c %[1]q\n", ascii)   // 97 a 'a'
fmt.Printf("%d %[1]c %[1]q\n", unicode) // 128512 😀 '😀'
fmt.Printf("%U\n", unicode)             // U+1F600
fmt.Printf("%d %[1]q\n", newline)       // 10 '\n'

Other formats can also be used, including:

  • %b: base 2
  • %o: base 8
  • %d: base 10
  • %x: base 16, with lower-case letters for a-f

Raw String Literals

A raw string literal is written using backticks (`). Within raw string literals, no escape sequences are processed; the contents are taken literally. For example:

s := `{
"name": "john"
}`
fmt.Println(s)

Prints:

{
"name": "john"
}

Standard Library Support

Strings are also widely supported by Go's standard library. The most important packages for manipulating strings are: bytes, strings, strconv, and unicode. We'll study them in future posts but feel free to explore and learn more about them at your own pace.
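
As a small taste of two of these packages, the sketch below combines a few functions from strings and strconv. The helper parseAndDouble is hypothetical; the strings and strconv calls are the standard ones.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseAndDouble trims surrounding spaces, parses s as a decimal integer,
// and returns twice its value.
func parseAndDouble(s string) (int, error) {
	n, err := strconv.Atoi(strings.TrimSpace(s))
	if err != nil {
		return 0, err
	}
	return n * 2, nil
}

func main() {
	fmt.Println(strings.ToUpper("gopher"))              // GOPHER
	fmt.Println(strings.Contains("Hello Gopher", "Go")) // true
	fmt.Println(strings.Fields(" a b  c "))             // [a b c]
	fmt.Println(parseAndDouble(" 21 "))                 // 42 <nil>
}
```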

Conclusion

In this post we learned a little more about Strings in Go. Since manipulating Strings is an essential part of a programmer's life, understanding their particularities is important to master the Go programming language.

To summarize, here are some important particularities that you should know:

  • strings in Go are immutable sequences of bytes
  • strings in Go can contain arbitrary data, including bytes that are not human-readable text
  • text strings in Go are conventionally interpreted as UTF-8-encoded sequences of Unicode code points (runes)
  • because Go source files are always encoded in UTF-8, string literals can include Unicode code points directly
  • strings in Go accept both ASCII characters and Unicode code points
  • a rune whose value is less than 256 can be written with a single hexadecimal escape (e.g., '\x41' for 'A') but a \u or \U escape must be used for higher values

See Also

Tuesday, January 19, 2021

Runes in Go

Understand what's a rune in Go and when to use it

If you are new to Go, you have probably seen the word rune being used. But would you be able to explain precisely what it is?

Runes in Go

A rune in Go is essentially a synonym for the type int32 which by convention holds a Unicode code point. A code point is a numerical value that can represent single characters but can also have other meanings, such as formatting. With UTF-8 encoding, different code points are encoded as sequences from one to four bytes long.

For example, the rune literal 'a' is the ASCII code 97 or Unicode U+0061. In summary, a rune in Go is:

  • a synonym for the type int32
  • a predeclared type name (rune) that is an alias for int32, not a separate type
  • by convention, a Unicode code point
  • informally, a character
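
Since rune is just another name for int32, the two can be mixed without conversions, and arithmetic on runes works like arithmetic on integers. A minimal sketch, where nextLetter is a hypothetical helper:

```go
package main

import "fmt"

// nextLetter returns the code point that follows r.
func nextLetter(r rune) rune {
	return r + 1
}

func main() {
	var r rune = 'a'
	var i int32 = r // no conversion needed: rune and int32 are the same type

	fmt.Println(i)                    // 97
	fmt.Printf("%T\n", r)             // int32
	fmt.Printf("%c\n", nextLetter(r)) // b
}
```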

Rune Literals

Another important point to remember is that Go source code is encoded as UTF-8, meaning that string literals are UTF-8 encoded by default. A rune literal is written as a character within single quotes.

And as we'll see later, Go also accepts any ASCII character as well as Unicode code points either directly or with numeric escapes. For example, a rune whose value is less than 256 can be written with a single hexadecimal escape (e.g., '\x41' for 'A') but \u or \U escape must be used for higher values.
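
The three escape forms mentioned above can be sketched as follows. The helper describe is an invented name used only to format the output.

```go
package main

import "fmt"

// describe formats a rune as its character followed by its Unicode notation.
func describe(r rune) string {
	return fmt.Sprintf("%c %U", r, r)
}

func main() {
	fmt.Println(describe('\x41'))       // A U+0041 (hexadecimal escape, value below 256)
	fmt.Println(describe('\u00e9'))     // é U+00E9 (4-digit \u escape)
	fmt.Println(describe('\U0001F600')) // 😀 U+1F600 (8-digit \U escape)
}
```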

ASCII, Unicode and UTF

And since we're talking ASCII and Unicode, let's understand how they differ.

ASCII

ASCII (abbreviated from American Standard Code for Information Interchange) is a character encoding standard for electronic communication. Its development started in the 1960s and it is still widely used today.

But ASCII is limited to only 128 characters (7 bits, with code points ranging from 0 to 127), which means that it only has room for unaccented letters, numbers and a few other characters, leaving out accents and most of the characters used by Eastern languages.

Unicode

For that reason, a new standard called Unicode was created as a superset of ASCII. It defines over 140k characters (and has room for more than a million code points), more than sufficient to handle most of the characters in all languages present in the world, plus new ones if necessary.

Unicode can be implemented by different character encodings. The Unicode standard defines the Unicode Transformation Formats (UTF) UTF-8, UTF-16, and UTF-32, and several other encodings. The most commonly used encodings are UTF-8, UTF-16, UCS-2 and GB18030, which is standardized in China and implements Unicode fully, while not being an official Unicode standard.

UTF-8

Today, UTF-8 is the most common encoding on the internet. UTF-8 was invented by Ken Thompson and Rob Pike, two of the creators of Go, and is now a Unicode standard. It uses between 1 and 4 bytes to represent each rune: only one byte for ASCII characters, and 2 or 3 bytes for most other runes in common use. The first 128 Unicode code points represent the ASCII characters, which means that any ASCII text is also a UTF-8 text.

Unicode Standard Notation

Unicode has a standard notation for code points: U+ followed by the code point in hexadecimal. For example, U+1F600 represents the Unicode character 😀. To print a rune in this notation in Go, use the %U format.

Printing Runes

Runes are usually printed with the following formats:

  • %c: to print the character
  • %q: to print the character within quotes
  • %U: to print the value of the character in Unicode notation (U+<value>) 

For example:

ascii := 'a'
unicode := '😀'
newline := '\n'
fmt.Printf("%d %[1]c %[1]q\n", ascii)   // 97 a 'a'
fmt.Printf("%d %[1]c %[1]q\n", unicode) // 128512 😀 '😀'
fmt.Printf("%U\n", unicode)             // U+1F600
fmt.Printf("%d %[1]q\n", newline)       // 10 '\n'

But other formats can also be used, including:

  • %b: base 2
  • %o: base 8
  • %d: base 10
  • %x: base 16, with lower-case letters for a-f

Conclusion

In this post we learned about runes in Go. The rune type is an alias for int32 and is equivalent to int32 in all ways. It is used, by convention, to distinguish character values from integer values.

See Also

Tuesday, December 29, 2020

Go Operators

Want to know more about operators in Go? Read to understand.

If you come from JavaScript, Java, C, C# or Python you will feel at home with Go operators. In this article we will provide a summary of the operators in Go and some examples.

What are operators?

But what are operators? Operators are constructs which behave generally like functions, but which differ syntactically or semantically. Common simple examples include arithmetic, comparison, and logical operators.

Which operators are supported in Go?

The following are the operators (including assignment operators) and punctuation supported in Go:

+    &     +=    &=     &&    ==    !=    (    )
-    |     -=    |=     ||    <     <=    [    ]
*    ^     *=    ^=     <-    >     >=    {    }
/    <<    /=    <<=    ++    =     :=    ,    ;
%    >>    %=    >>=    --    !     ...   .    :
     &^          &^=

Types of operators

Go operators are usually categorized based on functionality:

  • Arithmetic Operators
  • Comparison (Relational) Operators
  • Logical Operators
  • Bitwise Operators
  • Assignment Operators
  • Misc Operators

Arithmetic operators

Arithmetic operators apply to numeric values and yield a result of the same type as the first operand. The four standard arithmetic operators (+, -, *, /) apply to integer, floating-point, and complex types; + also applies to strings. The bitwise logical and shift operators apply to integers only.
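
Before the reference table, here is a short sketch of how a few of these operators behave. The helper divMod is a hypothetical name, and the 0b binary literals assume Go 1.13 or later.

```go
package main

import "fmt"

// divMod returns the integer quotient and remainder of a divided by b.
// Integer division truncates toward zero, and the remainder takes
// the sign of the dividend.
func divMod(a, b int) (quotient, remainder int) {
	return a / b, a % b
}

func main() {
	fmt.Println(divMod(7, 2))  // 3 1
	fmt.Println(divMod(-7, 2)) // -3 -1
	fmt.Println(7.0 / 2)       // 3.5
	fmt.Println("Go" + "pher") // Gopher

	var flags uint8 = 0b1100
	fmt.Println(flags&0b1010, flags|0b1010, flags^0b1010, flags&^0b1010) // 8 14 6 4
	fmt.Println(flags<<1, flags>>2)                                      // 24 3
}
```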

+    sum                    integers, floats, complex values, strings
-    difference             integers, floats, complex values
*    product                integers, floats, complex values
/    quotient               integers, floats, complex values
%    remainder              integers

&    bitwise AND            integers
|    bitwise OR             integers
^    bitwise XOR            integers
&^   bit clear (AND NOT)    integers

<<   left shift             integer << unsigned integer
>>   right shift            integer >> unsigned integer 

Comparison operators

Comparison operators compare two operands and yield an untyped boolean value. They are:

==    equal
!=    not equal
<     less
<=    less or equal
>     greater
>=    greater or equal

Logical operators

Logical operators apply to boolean values and yield a result of the same type as the operands. The right operand is evaluated conditionally.

&&    conditional AND    p && q  is  "if p then q else false"
||    conditional OR     p || q  is  "if p then true else q"
!     NOT                !p      is  "not p" 
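
Conditional evaluation of the right operand is what makes guard expressions safe. In the sketch below, firstIsZero is a hypothetical helper: the index expression on the right of && is only evaluated when the slice is non-empty.

```go
package main

import "fmt"

// firstIsZero reports whether nums is non-empty and its first element is 0.
// nums[0] is never evaluated for an empty slice, so it cannot panic.
func firstIsZero(nums []int) bool {
	return len(nums) > 0 && nums[0] == 0
}

func main() {
	fmt.Println(firstIsZero(nil))         // false
	fmt.Println(firstIsZero([]int{0, 1})) // true
	fmt.Println(firstIsZero([]int{3}))    // false
}
```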

Address operators

For an operand x of type T, the address operation &x generates a pointer of type *T to x. The operand must be addressable, that is, either a variable, pointer indirection, or slice indexing operation; or a field selector of an addressable struct operand; or an array indexing operation of an addressable array. As an exception to the addressability requirement, x may also be a (possibly parenthesized) composite literal. If the evaluation of x would cause a run-time panic, then the evaluation of &x does too.

For an operand x of pointer type *T, the pointer indirection *x denotes the variable of type T pointed to by x. If x is nil, an attempt to evaluate *x will cause a run-time panic.

&x
&a[f(2)]
&Point{2, 3}
*p
*pf(x)

var x *int = nil
*x   // causes a run-time panic
&*x  // causes a run-time panic 

Operator precedence

Unary operators

Unary operators have the highest precedence. As the ++ and -- operators form statements, not expressions, they fall outside the operator hierarchy. As a consequence, statement *p++ is the same as (*p)++.
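
A quick sketch of that last point, using a hypothetical function named bump: the statement increments the value the pointer refers to, not the pointer itself.

```go
package main

import "fmt"

// bump increments the int that p points to.
func bump(p *int) {
	*p++ // same as (*p)++
}

func main() {
	n := 41
	bump(&n)
	fmt.Println(n) // 42
}
```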

Binary operators

There are five precedence levels for binary operators. Multiplication operators bind strongest, followed by addition operators, comparison operators, && (logical AND), and finally || (logical OR).

Precedence    Operator
    5             *  /  %  <<  >>  &  &^
    4             +  -  |  ^
    3             ==  !=  <  <=  >  >=
    2             &&
    1             ||

Binary operators of the same precedence associate from left to right. For instance, x / y * z is the same as (x / y) * z.

+x
23 + 3*x[i]
x <= f()
^a >> b
f() || g()
x == y+1 && <-chanPtr > 0
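
The precedence levels can be checked with a few small expressions. In the sketch below, shiftThenAdd and lowBitSet are hypothetical helpers; note that << sits on the multiplication level in Go, unlike in C, where + binds tighter than <<.

```go
package main

import "fmt"

// shiftThenAdd is parsed as (x << 2) + 1 because << has precedence 5
// and + has precedence 4.
func shiftThenAdd(x int) int {
	return x<<2 + 1
}

// lowBitSet is parsed as (x & 1) == 1 because & has precedence 5
// and == has precedence 3.
func lowBitSet(x int) bool {
	return x&1 == 1
}

func main() {
	fmt.Println(shiftThenAdd(1)) // 5
	fmt.Println(lowBitSet(5))    // true
	fmt.Println(2 + 3*4)         // 14
	fmt.Println(6 / 2 * 3)       // 9 (left to right)
}
```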

Conclusion

In this article we discussed a little about Go operators. If you come from JavaScript, Java, C, C# or Python you will feel at home with Go operators. Knowing and understanding operators in any programming language is important for writing better and more readable code, so we recommend that you read the Go spec for more information.

Source

Tuesday, December 15, 2020

Escape Sequences in Go

Want to know more about escape sequences in Go? Read to understand.

Go, like other programming languages, has the concept of escaping. An escape sequence is a combination of characters that has a meaning other than the literal characters contained therein. An escape sequence commonly starts with an escape character, which in Go is the \ character. Let's understand more about escaping in Go's context.

Why escape?

Escaping is a commonly used technique that developers resort to in order to:

  • encode commands or special data which cannot be directly represented by the alphabet.
  • represent characters which cannot be typed in the current context, or would have an undesired interpretation

Escaping in Go

Like other programming languages, Go utilizes the backslash (\) character to escape. What to use next? Depends on what you need. Here are some useful values to get you started:

Escape Sequence Value
\\ the \ character
\' the ' character (only in rune literals)
\" the " character (only in string literals)
\a an alert
\b backspace
\f form feed
\n a new line
\r carriage return
\t a horizontal tab
\v a vertical tab
\xFF the byte with hexadecimal value FF

Examples

So let's check some examples:

package main

import (
    "fmt"
)

func main() {
    dec := 22
    octal := 033
    hex := 0xFF
    fmt.Printf("Decimal %v, Hex: %v, Octal: %v\n", dec, hex, octal)
    fmt.Println("Some\ttab")
    fmt.Println("A quote: \"")
    fmt.Println("What\nabout\nline\nbreaks")
}

Decimal 22, Hex: 255, Octal: 27
Some tab
A quote: "
What
about
line
breaks
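
Numeric escapes work inside string literals as well. The sketch below shows \x, \u and \U escapes next to a raw string literal (backticks), where the backslash is kept as an ordinary character.

```go
package main

import "fmt"

func main() {
	fmt.Println("\x47\x6f")   // Go (two bytes given in hexadecimal)
	fmt.Println("\u00e9")     // é
	fmt.Println("\U0001F600") // 😀

	// In an interpreted string \t is one tab byte; in a raw string
	// it stays as the two characters \ and t.
	fmt.Println(len("a\tb"), len(`a\tb`)) // 3 4
	fmt.Println(`a\tb`)                   // a\tb
}
```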

Conclusion

In this article we learned about escaping. Escaping in Go is very similar to other programming languages and is extensively used to encode commands or special data which cannot be directly represented by the alphabet, or to represent characters which cannot be typed in the current context or would have an undesired interpretation.

Reference

See Also
