Pointers, References and Arrays

Describe the built-in types available in C++
Introduce the generic pointer type
Distinguish lvalue and rvalue references
Introduce aggregate initialization and range-based for syntax for array types

"The C++ programming language ... provides powerful and flexible mechanisms for abstraction; that is, language constructs that allow the programmer to introduce and use new types of objects that match the concepts of an application." Stroustrup (1999)

A built-in type is a type that can be constructed from a fundamental type by using qualifiers and operators. Built-in types access their underlying types through their addresses and are implemented to reflect the hardware capabilities directly and efficiently. The built-in types in the C++ language are pointers, references and arrays. A C++ built-in type can be constructed from a fundamental type, the const qualifier and the declarator operators ([], * and &).

This chapter describes the built-in types for the C++ core language in some detail, introduces generic pointers and distinguishes references into lvalue and rvalue categories. The array section introduces aggregate initialization and range-based iteration syntax.

Pointer Types

A pointer type is a type that holds an address as its value. Each type, whether fundamental, built-in or user-defined, has a pointer type associated with it. The pointer types associated with the C++ fundamental types are:

signed char*
short int*
int*
long int*
long long int*
float*
unsigned char*
unsigned short int*
unsigned int*
unsigned long int*
unsigned long long int*
double*
char*
bool*
wchar_t*
char16_t*
char32_t*
long double*

In C++, different pointer types are not assignment compatible.

Null Address Constant

The null address is an address in memory with special significance. By convention, the null address cannot be dereferenced. The C++ keyword nullptr refers to the constant that stores this address. Any attempt to dereference a pointer that holds the value nullptr causes a run-time error.

Good Design Style

A wild pointer is a pointer that has not been initialized to any valid address. It is good style to initialize every wild pointer to nullptr. Any run-time attempt to dereference the nullptr address will terminate execution rather than allow execution to continue and generate erroneous results.

Synonym Pointer Types

Each synonym type also has a pointer type associated with it:

typedef unsigned long long int ullint;
ullint* p;

A synonym pointer type simplifies repeated definitions of the pointer type. For example,

typedef unsigned long long int* ullint_ptr;
ullint_ptr p; // a pointer to ullint

Unlike a pointer declaration, this synonym type does not require an * before each identifier. Compare this syntax to direct-definition syntax:

unsigned long long int* pp, * qq;  // we need * before each identifier
ullint_ptr p, q;                   // no need for a repeated *

Synonym pointer syntax is more readable and less error prone.

Generic Pointer Type

The generic pointer type is distinct from any other pointer type defined in an application: a pointer of generic type can hold the address of an object without holding the type information of that object itself.

A generic pointer type holds an address but is not associated with any object type. The keyword void identifies a generic pointer type:

void* p; // generic pointer type

Converting a pointer of any type into a generic type and vice versa does not lose its address. The program:

// Generic Pointer Type
// void.cpp

#include <iostream>

int main()
{
    int i;
    void* v = &i;
    int* j;
    j = static_cast<int*>(v);  // OK - j now holds the address of i
    std::cout << &i << std::endl;
    std::cout << j << std::endl;
}

outputs (the values will be different in each run, but both lines will have the same value)

0024FA18
0024FA18

Converting from one pointer type to another requires an explicit cast:

int* i;
char* c;
i = c; // ERROR - Incompatible: Different Pointer Types

int* i;
char* c;
i = static_cast<int*>(static_cast<void*>(c)); // OK

Dereferencing a Generic Pointer

Since a generic pointer lacks the type information of the object pointed to, the compiler cannot dereference the pointer and interpret the data stored in the region of memory starting at the address pointed to. To access the information stored at that address, the type of that information is required; we cast a generic pointer to the type that is associated with its address.

Hexadecimal Dump Example

The function hexDump() listed below displays the contents of a region of memory regardless of the type associated with that region. The function receives the region's address and its size in number of bytes. It casts the generic pointer a to a pointer to an unsigned char and displays the contents of c[i] in hexadecimal notation:

// Hexadecimal Representation at an Address
// hexDump.cpp

#include <iostream>
#include <iomanip>

void hexDump(void*, int);

int main()
{
    int i;
    double x;

    std::cout << "Integer value: ";
    std::cin >> i;
    std::cout << "is : ";
    hexDump(&i, 4);
    std::cout << std::endl;

    std::cout << "Floating-point value: ";
    std::cin >> x;
    std::cout << "is : ";
    hexDump(&x, 8);
    std::cout << std::endl;
}

// Dump the first "n" bytes starting from the address stored in "a"
void hexDump(void* a, int n)
{
    unsigned char* c = static_cast<unsigned char*>(a);

    auto oldFill = std::cout.fill('0'); // zero fill
    std::cout << std::hex;              // hexadecimal output

    for (int i = 0; i < n; i++)
        std::cout << std::setw(2) << static_cast<int>(c[i]) << ' ';

    std::cout.fill(oldFill);  // restore the filling character
    std::cout << std::dec;    // restore to decimal output
}

If the user inputs the values 2456 and 4.56, the output is

Integer value: 2456
is : 98 09 00 00

Floating-point value: 4.56
is : 3d 0a d7 a3 70 3d 12 40

Note that in this example the platform is little-endian (little end first).

References

A reference is an alias for an existing object. That is, it is an alternative name for the entity defined elsewhere. Each object defined in source code can have a reference associated with it. C++17 admits two reference declarations:

an *lvalue reference* - denoted by &
an *rvalue reference* - denoted by &&

The declaration of an lvalue reference identifies an accessible region of memory. The declaration of an rvalue reference identifies:

an object near the end of its lifetime
a temporary object or subobject
a value not associated with an object

The C++ language does not support references to references, arrays of references, or pointers to references.

lvalue References

An lvalue reference requires an initializer unless it

has external linkage
is a class member within a class definition
is a function parameter or a function return type

// Module B Header
// B.h
class B
{
    int b;
public:
    B(int n);
    void display() const;
};

// Module B Implementation
// B.cpp
#include <iostream>
#include "B.h"

B::B(int n) : b(n) {}

void B::display() const { std::cout << b << std::endl; }

// Module A Header
// A.h
class B;

class A
{
    B& d; // no initializer required: class member declaration
public:
    A(B& bb); // no initializer required: function parameter declaration
};

A& foo(A&);

// Module A Implementation
// A.cpp
#include <iostream>
#include "A.h"
#include "B.h"

extern B& b; // no initializer required: external linkage

A::A(B& bb) : d(bb) // initialization is in constructor definition
{
    d.display();
}

A& foo(A& a) // no initializers required:
{
    // parameter and return type declarations
    A& aa = a; // LVALUE REQUIRES AN INITIALIZER
    b.display();
    return aa;
}

// lvalue References
// lvalue.cpp
#include "A.h"
#include "B.h"

B e(8), f(6);
B& b = e; // LVALUE REQUIRES INITIALIZER

int main()
{
    A a(f);
    foo(a);
}

rvalue Reference

An rvalue reference declaration identifies an object that is less permanent, possibly temporary. For example, the program

// rvalue Reference
// rvalue.cpp

#include <iostream>

class A
{
    int a;
public:
    A(int aa) : a(aa) {}
    void display(const char* str) const
    {
        std::cout << str << ' ' << a << std::endl;
    }
};

void print(const A& a) { a.display("lvalue "); }
void print(A&& a)      { a.display("rvalue "); }

int main()
{
    A a(10);
    print(a);
    print(A(20));
}

produces the output

lvalue  10
rvalue  20

Practical uses for rvalue references include the move-constructors and move-assignment operators described in the chapter entitled Classes and Scoped Enumerations.

Standard Library

The standard library provides two functions for specifying the type of reference when that is important:

std::ref() - returns an lvalue reference to its argument (important of functions in the standard library)
std::move() - returns an rvalue reference to its argument

Their prototypes are declared in <utility> header.

The program

// rvalue Reference using std::move()
// std_move.cpp

#include <iostream>
#include <utility>

class A
{
    int a;
public:
    A(int aa) : a(aa) {}
    void display(const char* str) const
    {
        std::cout << str << ' ' << a << std::endl;
    }
};

void print(const A& a) { a.display("lvalue "); }
void print(A&& a)      { a.display("rvalue "); }

int main()
{
    A a(10), b(20);
    print(a);
    print(std::move(b));
}

outputs

lvalue  10
rvalue  20

Practical uses of std::ref are discussed in the chapter entitled Algorithms.

Array Types

An array type is a built-in type that consists of elements of identical type arranged contiguously in memory. Each element is a subobject of the array type. We declare an array type using the [] declarator operator. An array can be constructed from any one of the

fundamental types (except void)
pointer types
pointer to member types
class types
enumeration types (see the chapter entitled Classes and Scoped Enumerations)

Note that an array cannot be constructed directly from reference types.

One-Dimensional Array

The definition of a one-dimensional array takes one of the following forms

Type identifier[ c ];             // allocated on the stack
Type* identifier = new Type[ n ]; // allocated on the heap

where:

Type is the type of each of the elements in the array
c is the number of elements in the array and is an integer constant or constant integer expression.
n is the number of elements in the array and is an integer variable, integer constant or an integer constant expression.

Aggregate Initialization

We can initialize an array through aggregate initilization. Aggregate initialization takes one of the following forms:

Type identifier[ c ] = { initializer-list };
Type identifier[ c ] = { };
Type identifier[ c ] { initializer-list };
Type identifier[ c ] { };
Type identifier[ ] = { initializer-list };
Type identifier[ ] { initializer-list };
Type* identifier = new Type[ n ] { initializer-list };
Type* identifier = new Type[ n ] { };

initializer-list is a comma-separated list of initial values. If this list is present, c is optional. If c exceeds the number of values in the list, the compiler initializes the uninitialized elements to 0. If the initialization list is absent, c is required. If the braces are present but there is no initialization list, the compiler initializes all elements to 0.

For example, the program

// Aggregate Initialization
// initializers.cpp
#include <iostream>

int main()
{
    const int n = 6;
    int  a[ ] = { 1,2,3 };
    int  b[ ]{ 1,2,3 };
    int  c[5]{ 1,2,3 };
    int  d[5]{};
    int* f = new int[n]{ 1,2,3 };
    int* g = new int[n]{};

    for (int e : a) // range-based for (see below)
        std::cout << e;
    std::cout << '|' << std::endl;

    for (int e : b)
        std::cout << e;
    std::cout << '|' << std::endl;

    for (int e : c)
        std::cout << e;
    std::cout << '|' << std::endl;

    for (int e : d)
        std::cout << e;
    std::cout << '|' << std::endl;

    for (int i = 0; i < n; ++i) // cannot use range-based for with pointers
        std::cout << f[i];
    std::cout << '|' << std::endl;

    for (int i = 0; i < n; ++i)
        std::cout << g[i];
    std::cout << '|' << std::endl;

    delete[] f;
    delete[] g;
}

produces the output

Range-Based `for`

A range-based for is an iteration construct specifically designed for use with collections. This construct steps sequentially through the elements of a collection; the collection type must carry information about its size or a mechanism to detect when the boundary has been reached (statically-allocated array types carry such information, but pointers don't).

The program:

// Range-Based for
// for_each.cpp

#include <iostream>

int main ()
{
    int a[]{1, 2, 3, 4, 5, 6};

    for (int& e : a)
        std::cout << e << ' ';
    std::cout << std::endl;
}

outputs

1 2 3 4 5 6

Range-Based for with Type Inference

A range-based for can infer the type of each element in the array from the array declaration itself. The program

// Range-Based for
// for_each_auto.cpp

#include <iostream>

int main ()
{
    int a[]{1, 2, 3, 4, 5, 6};

    for (auto& e : a) // the type of an element will be inferred by the compiler
        std::cout << e << ' ';
    std::cout << std::endl;
}

outputs

1 2 3 4 5 6

Exercises

Read Wikipedia on Array Data Structures.

Pointers, References and Arrays

Pointer Types​

Null Address Constant​

Good Design Style​

Synonym Pointer Types​

Generic Pointer Type​

Dereferencing a Generic Pointer​

Hexadecimal Dump Example​

References​

lvalue References​

rvalue Reference​

Standard Library​

Array Types​

One-Dimensional Array​

Aggregate Initialization​

Range-Based for​

Range-Based for with Type Inference​

Exercises​

Pointer Types

Null Address Constant

Good Design Style

Synonym Pointer Types

Generic Pointer Type

Dereferencing a Generic Pointer

Hexadecimal Dump Example

References

lvalue References

rvalue Reference

Standard Library

Array Types

One-Dimensional Array

Aggregate Initialization

Range-Based `for`

Range-Based for with Type Inference

Exercises