Scalars, Vectors, Matrices, and Tensors
- (The University of Chicago - Alvin Wei-Cheng Wong)
- Overview
Linear algebra is the backbone of many areas of computing, allowing us to model and solve a wide variety of problems efficiently. From image processing and machine learning (ML) to computer graphics and cryptography, its principles appear throughout computer science.
Scalars are zero-dimensional, vectors are one-dimensional, matrices are two-dimensional, and tensors can have any number of dimensions. Tensors are often used to represent multi-dimensional data, such as color images, volumetric data, or time series data.
In machine learning (ML), a vector is an element of a vector space: a collection of objects equipped with an addition rule and a scalar-multiplication rule. A vector is often pictured as a directed line segment, though not every vector is one.
Scalars, vectors, and matrices are fundamental structures of linear algebra, and a working knowledge of them is essential for understanding deep learning (DL). These structures are used to represent inputs such as text and images, allowing ML and DL models to be trained and deployed.
Broadly speaking, linear algebra represents data in the form of linear equations, which are in turn expressed using matrices and vectors.
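As a minimal illustration (the coefficients here are arbitrary examples), the system 2x + 3y = 5 and x − y = 1 can be packed into a coefficient matrix A and a constant vector b and solved with NumPy:
Python
import numpy as np
# The system 2x + 3y = 5 and x - y = 1 in matrix form A @ v = b
A = np.array([[2.0, 3.0],
              [1.0, -1.0]])  # coefficient matrix
b = np.array([5.0, 1.0])     # constants
v = np.linalg.solve(A, b)    # v = [x, y]
print(v)  # [1.6 0.6]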
Please refer to the following for more details:
- Wikipedia: Outline of linear algebra
- Wikipedia: Scalar
- Wikipedia: Vector space
- Wikipedia: Matrix (mathematics)
- Wikipedia: Tensor
- Computational Linear Algebra: Scalars, Vectors, Matrices and Tensors
In machine learning, scalars, vectors, matrices, and tensors are fundamental data structures rooted in linear algebra. Scalars are single numerical values (0-dimensional), vectors are ordered lists of numbers (1-dimensional), matrices are two-dimensional arrays, and tensors generalize matrices to any number of dimensions.
These structures are essential for representing and manipulating various types of data, including text, images, and time series, within ML models.
- Scalars: Represent single numerical values (e.g., temperature, price) and are considered 0-dimensional.
- Vectors: Represent ordered lists of numbers (e.g., coordinates in a space) and are 1-dimensional.
- Matrices: Represent 2D arrays of numbers (e.g., tabular data, images).
- Tensors: Generalize matrices to any number of dimensions, allowing for the representation of complex multi-dimensional data like color images, volumetric data, or time series (see the sketch after this list).
- Linear Equations: In linear algebra, data is often represented as linear equations, which can be expressed using matrices and vectors.
- Deep Learning: Understanding these linear algebra concepts is crucial for grasping the fundamentals of deep learning, as these structures are used to represent inputs and manipulate data within neural networks.
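To make the dimensionality concrete, the following minimal NumPy sketch (the values are arbitrary placeholders) shows how each structure reports its number of dimensions and shape:
Python
import numpy as np
scalar = np.array(3.5)              # 0-dimensional
vector = np.array([1.0, 2.0, 3.0])  # 1-dimensional
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])     # 2-dimensional
tensor = np.zeros((3, 4, 4))        # 3-dimensional, e.g. a 4x4 RGB image stored channels-first
for name, a in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor", tensor)]:
    print(name, "ndim =", a.ndim, "shape =", a.shape)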
- Arrays and Linear Algebra
Arrays serve as the building blocks of numerical linear algebra, enabling the representation and manipulation of mathematical structures and the execution of key operations for solving real-world problems.
In this context, an array is a multi-dimensional data structure that can represent mathematical objects such as vectors, matrices, and tensors.
Arrays underpin numerical implementations of linear algebra, such as solving systems of linear equations, computing eigenvalues and eigenvectors of matrices, singular value decomposition, and matrix factorizations.
Here's a closer look at how arrays are used in the numerical implementations of linear algebra:
1. Representing Structures:
- Vectors: 1D arrays are used to represent vectors, where the array's length corresponds to the vector's dimension.
- Matrices: Matrices are represented by 2D arrays, with the elements organized in rows and columns.
- Tensors: For higher dimensions, multi-dimensional arrays are used to represent tensors.
2. Numerical Implementations:
- Solving systems of linear equations: Augmented matrices, arrays holding the coefficients and constants of the equations, are the standard representation. Techniques like Gaussian elimination or Gauss-Jordan elimination apply row operations to the augmented matrix until it reaches a simpler form that reveals the solutions. Libraries like NumPy provide routines such as numpy.linalg.solve, and environments such as MATLAB can apply mldivide to distributed arrays to handle very large systems in parallel.
- Eigenvalues and Eigenvectors: For a square matrix A, the eigenvalues λ and eigenvectors x satisfy Ax = λx, or equivalently (A − λI)x = 0, where I is the identity matrix. The eigenvalues are the roots of the characteristic equation det(A − λI) = 0; each corresponding eigenvector is then found by solving the linear system (A − λI)x = 0. The matrix, the identity, and the eigenvectors are all represented as arrays.
- Singular Value Decomposition (SVD): SVD decomposes a matrix into three simpler matrices (U, S, and Vᵀ), providing insight into the structure of the data. Arrays represent both the original matrix and the decomposed factors, and libraries like NumPy compute the SVD of an array directly (numpy.linalg.svd).
- Matrix Factorizations: Techniques like LU, Cholesky, and QR factorization decompose a matrix into simpler matrices and are fundamental for solving linear systems, simplifying linear transformations, and performing eigenvalue decomposition. Arrays represent every matrix involved; see the sketch after this list.
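Solving a linear system with numpy.linalg.solve was sketched earlier; the sketch below runs the remaining operations with NumPy on a small symmetric positive-definite matrix chosen purely for illustration (Cholesky factorization requires that property). It is a minimal demonstration, not a production implementation:
Python
import numpy as np
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])  # symmetric positive definite
# Eigenvalues and eigenvectors: A @ x = lam * x
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)   # eigenvalues of A
print(eigvecs)   # columns are the corresponding eigenvectors
# Singular value decomposition: A = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(A)
print(S)         # singular values
# QR factorization: A = Q @ R, with Q orthogonal and R upper triangular
Q, R = np.linalg.qr(A)
print(R)
# Cholesky factorization: A = L @ L.T, with L lower triangular
L = np.linalg.cholesky(A)
print(L)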
- Linear Equations
In linear algebra, a linear equation is an algebraic equation in which the highest power of each variable is 1. It is also known as a first-degree equation.
The standard form of a linear equation in one variable is Ax + B = 0, where x is a variable, A is a coefficient, and B is a constant. The standard form for linear equations in two variables is Ax + By = C; for example, 2x + 3y = 5 is a linear equation in standard form.
When graphed, a linear equation in one or two variables always represents a straight line. For example, x + 2y = 4 is a linear equation, and its graph is a straight line, as the sketch below verifies numerically.
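As a quick check of this claim (a minimal sketch; the sample points are chosen arbitrarily), the slope between consecutive points on x + 2y = 4 is constant, which is exactly what characterizes a straight line:
Python
import numpy as np
# Points on the line x + 2y = 4, i.e. y = (4 - x) / 2
x = np.linspace(-2.0, 4.0, 7)
y = (4.0 - x) / 2.0
# The slope between consecutive points is constant (-1/2)
slopes = np.diff(y) / np.diff(x)
print(slopes)  # [-0.5 -0.5 -0.5 -0.5 -0.5 -0.5]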
- Linear Algebra in TensorFlow (Scalars, Vectors and Matrices)
In TensorFlow, scalars, vectors, and matrices are fundamental data structures used to represent and manipulate data in ML models.
Scalars are 0-dimensional tensors, vectors are 1-dimensional tensors, and matrices are 2-dimensional tensors. These tensors are used to represent various aspects of the data, such as image features (vectors), neural network weights (matrices), and more.
- Scalars: A scalar is a single numerical value, like a temperature reading or a single pixel intensity. In TensorFlow, scalars are represented as 0-dimensional tensors.
- Vectors: A vector is an ordered collection of numbers, like the features of an image (e.g., color values, texture). In TensorFlow, vectors are 1-dimensional tensors.
- Matrices: A matrix is a rectangular array of numbers, often used to represent relationships between data points or transformations. In TensorFlow, matrices are 2-dimensional tensors. For example, a matrix can represent the connections between neurons in a neural network, where each element represents the strength of a connection.
- Tensors: Tensors are the general data structure in TensorFlow, encompassing scalars, vectors, matrices, and higher-dimensional arrays. They are used to represent and manipulate data in various machine learning tasks.
- Example
The code below performs basic tensor operations in TensorFlow: scalar addition, scalar-vector multiplication, and matrix-vector multiplication. The results of these operations are then printed to the console.
In TensorFlow, computation can be described using data flow graphs: each node of the graph represents an instance of a mathematical operation (like addition, division, or multiplication), and each edge is a multi-dimensional data set (tensor) on which the operations are performed.
Python
import tensorflow as tf
# Create a scalar (0-dimensional tensor)
scalar = tf.constant(1.0)
# Create a vector (1-dimensional tensor)
vector = tf.constant([1.0, 2.0, 3.0])
# Create a 2x2 matrix (2-dimensional tensor)
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])
# Add two scalars ("scalar_sum" avoids shadowing Python's built-in sum)
scalar_sum = scalar + scalar
# Multiply a vector by a scalar (element-wise)
product = vector * scalar
# Multiply the 2x2 matrix by a length-2 vector
# (the length-3 vector above is incompatible with a 2x2 matrix)
vector2 = tf.constant([1.0, 2.0])
matvec = tf.linalg.matvec(matrix, vector2)
# Print the results
print(scalar_sum)
print(product)
print(matvec)
The code first initializes three TensorFlow tensors: a scalar (1.0), a vector [1.0, 2.0, 3.0], and a 2x2 matrix [[1.0, 2.0], [3.0, 4.0]]. Then, it performs three operations:
- scalar + scalar: Adds the scalar 1.0 to itself. The result (scalar_sum) is also a scalar.
- vector * scalar: Multiplies each element of the vector by the scalar 1.0. The result is a new vector with the same values as the original.
- tf.linalg.matvec(matrix, vector2): Multiplies the 2x2 matrix by the length-2 vector [1.0, 2.0]. (A length-2 vector is required here, since the length-3 vector is incompatible with a 2x2 matrix.) The result is a new length-2 vector.
The print statements then output the results of each operation, which will be tensors. The output will be:
Code
tf.Tensor(2.0, shape=(), dtype=float32)
tf.Tensor([1. 2. 3.], shape=(3,), dtype=float32)
tf.Tensor([ 5. 11.], shape=(2,), dtype=float32)
[More to come ...]