
Essential

Mathematics
for Quantum
Computing
A beginner's guide to just the math you need without
needless complexities

Leonard S. Woody III

BIRMINGHAM—MUMBAI
Essential Mathematics for Quantum
Computing
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means, without the prior written permission of the publisher,
except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold without
warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and
distributors, will be held liable for any damages caused or alleged to have been caused directly or
indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies
and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing
cannot guarantee the accuracy of this information.

Publishing Product Manager: Sunith Shetty


Senior Editor: Nathanya Dias
Content Development Editor: Sean Lobo
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Nilesh Mohite
Marketing Coordinator: Abeer Dawe

First published: April 2022


Production reference: 1170322

Published by Packt Publishing Ltd.


Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-80107-314-1
www.packt.com
To my wife Jeanette, I owe you a debt of gratitude that I can only repay by
loving you every day for the rest of my life, and fortunately for us, that will
be easy.

I dedicate this book to my mom, Georgia Chandler Mapes, and my dad,


Leonard Spencer Woody, Jr.
You raised me right!
And to my grammy, Patricia Dana Woody. You were my second mother
and I love you and miss you terribly.
Acknowledgements
I would first like to acknowledge my technical reviewer, Emmanuel Knafo, Ph.D. He spent
tireless hours reviewing this text and it would not be the book it is without him. Secondly,
I would like to thank my close friend Sam Smith, who reviewed many chapters quickly
and eagerly. Sam, Robin Smith, Rory Woods, and I came up together at Microsoft. Thank
you for your friendship, our many happy hours, and help with the book. My first manager
at Microsoft, my friend and mentor Omar Kouatly, allowed me to get started in this
venture of quantum computing, encouraged me, and helped with the book as well. Thank
you. Delbert Murphy, Darius Zakrzewski, and Jon Skerrett have been my "partners in
crime" in exploring, learning, and sharing a passion for quantum computing. Thank you
for your inspiration. Finally, my friend Matthew A. Kirsch helped with early copies of this
text and earlier parts of my life. I thank you for those immeasurable contributions as well.
In the one year plus that it took to write this book, I needed support and advice. My great
friend and spiritual mentor, Art Thompson, provided that in spades. Other close friends
such as Graham Eddy, Carmel Maddox, Heather Downey, Patrick Sweet, Eli Rosenblatt,
Rich Chetelat, Paul Varela, Benjamin Maddox, Nacho Dave, and Andy Brown have been
there every step of the way during this tumultuous year.
No book is written alone and I would like to thank the people at Packt for working with
me to make this book a reality. I would especially like to thank Sean Lobo, my editor, for
sticking with me all the way through and his many hours spent reviewing this text.
Finally, I would like to thank my family, which includes Brandi Zahir and her children
Zachary, Benjamin, and Caitlyn. To my children, you allowed me to write this book and
gave up many hours with daddy so that I could finish it. I will always love you and you are
the reason I exist. Thank you, Eva-Maria, Sophia, Johnny, and Alex. To my wife Jeanette
of 17 years, you are the love of my life, my rock, my person. We have built quite a family
together and I can't wait to live the rest of my life with you. And to who made this all
possible, thank you, God.
Contributors
About the author
Leonard S. Woody III is a senior consultant with 20 years of experience explaining
complex subjects to software development clients. For the last 3 years, he has worked at
Microsoft, most currently as a program manager for Azure Quantum. He was awarded a
BS in computer science and a BS in physics from the University of Virginia. He attained
his MS in software engineering from George Mason University. Woody lives in Northern
Virginia with his wife and four children. His biggest love is spending time with his family.
About the reviewers
Emmanuel Knafo focuses on DevOps innovation and cloud architecture, helping
organizations transform how they ideate, plan, execute, and learn from their technology
investments. He obtained his Ph.D. in mathematics, in number theory, at the University of
Toronto. He is a published author in various mathematical journals and has published IT
articles on the Microsoft Premier Developer Blog.

I would like to thank the author for this opportunity to re-ignite my passion
for mathematics and physics by making me the technical reviewer for this
book. It has been a thoroughly enjoyable experience! My passion for math
was instilled by my father, Emile, and nurtured by my mother, Evelyne.
Finally, I'm grateful to Audrey and the lights of our lives: Ethan and Adam.

Devika Mehra started her programming journey when she was 15 years old, which led to
her never-ending zest to explore the boundless field of technology. She has an immense
interest in the fields of security and quantum computing. She initially flexed her muscles
in different programming languages and then focused on the development of Android
applications. She is currently working with Microsoft Sentinel as a software engineer and
develops security integration and analysis content for the end customer. She wishes to
make the world a better place to live in and believes that technology can be a great catalyst
to achieve this.

Srinjoy Ganguly works as a quantum AI research scientist at Fractal Analytics. He has 4+


years of experience in quantum computing, and is an IBM Qiskit advocate and educator.
He also teaches quantum computing at Woxsen University as a visiting professor. His
research interests include QNLP, category theory with compositionality, variational
quantum algorithms and their applications, and machine learning.
Table of Contents

Preface

Section 1: Introduction

Chapter 1, Superposition with Euclid
• Vectors: Vector addition, Scalar multiplication
• Linear combinations
• Superposition
• Measurement
• Summary
• Answers to exercises

Chapter 2, The Matrix
• Defining a matrix: Notation, Redefining vectors
• Simple matrix operations: Addition, Scalar multiplication, Transposing a matrix
• Defining matrix multiplication: Multiplying vectors, Matrix-vector multiplication, Matrix multiplication, Properties of matrix multiplication
• Special types of matrices: Square matrices, Identity matrices
• Quantum gates: Logic gates, Circuit model
• Summary
• Answers to exercises
• References

Section 2: Elementary Linear Algebra

Chapter 3, Foundations
• Sets: The definition of a set, Notation, Important sets of numbers, Tuples, The Cartesian product
• Functions: The definition of a function, Invertible functions
• Binary operations: The definition of a binary operation, Properties
• Groups
• Fields
• Vector space
• Summary
• Answers to exercises
• Works cited

Chapter 4, Vector Spaces
• Subspaces: Definition, Examples
• Linear independence: Linear combination, Linear dependence
• Span
• Basis
• Dimension
• Summary
• Answers to exercises

Chapter 5, Using Matrices to Transform Space
• Linearity: What is a linear transformation?, Describing linear transformations
• Representing linear transformations with matrices: Matrices depend on the bases chosen
• Matrix multiplication and multiple transformations: The commutator
• Transformations inspired by Euclid: Translation, Rotation, Projection
• Linear operators
• Linear functionals
• A change of basis
• Summary
• Answers to exercises
• Works cited

Section 3: Adding Complexity

Chapter 6, Complex Numbers
• Three forms, one number: Definition of complex numbers
• Cartesian form: Addition, Multiplication, Complex conjugate, Absolute value or modulus, Division, Powers of i
• Polar form: Polar coordinates, Defining complex numbers in polar form, Multiplication and division in polar form, De Moivre's theorem
• The most beautiful equation in mathematics
• Exponential form: Conjugation, Multiplication
• Conjugate transpose of a matrix
• Bloch sphere
• Summary
• Exercises
• References

Chapter 7, EigenStuff
• The inverse of a matrix: Determinants
• The invertible matrix theorem
• Calculating the inverse of a matrix
• Eigenvalues and eigenvectors: Definition, Example with a matrix, The characteristic equation, Finding eigenvectors, Multiplicity
• Trace
• The special properties of eigenvalues
• Summary
• Answers to exercises

Chapter 8, Our Space in the Universe
• The inner product
• Orthonormality: The norm, Orthogonality, Orthonormal vectors, The Kronecker delta function
• The outer product
• Operators: Representing an operator using the outer product, The completeness relation, The adjoint of an operator
• Types of operators: Normal operators, Hermitian operators, Unitary operators, Projection operators, Positive operators
• Tensor products: The tensor product of vectors, The basis of tensor product space, The tensor product of operators, The inner product of composite vectors
• Summary
• Answers to exercises

Chapter 9, Advanced Concepts
• Gram-Schmidt
• Cauchy-Schwarz and triangle inequalities
• Spectral decomposition: Diagonal matrices, Spectral theory, Bra-ket notation
• Singular value decomposition
• Polar decomposition
• Operator functions and the matrix exponential
• Summary
• Works cited

Section 4: Appendices
• Appendix 1, Bra-ket Notation: Operators, Bras
• Appendix 2, Sigma Notation: Sigma, Variations, Summation rules
• Appendix 3, Trigonometry: Measuring angles, Degrees, Radians, Trigonometric functions, Formulas, The trig cheat sheet, Summary, Works cited
• Appendix 4, Probability: Definitions, Random variables, Discrete random variables, The measures of a random variable, Summary, Works cited
• Appendix 5, References

Index

Other Books You May Enjoy
Preface
This book is written for software developers and tech enthusiasts who either have not used
the math required for quantum computing in many years or possibly have never learned it at all.
Quantum computing is based on a combination of quantum mechanics and computer
science. These two subjects, quantum mechanics and computer science, are built on a
foundation of math, as the following diagram illustrates:

Figure 1 – Diagram of relationship of math to quantum computing


Making sure your foundation is well built as you dive into quantum computing is
paramount to your long-term success in the field. Notice that I said "as you dive" instead
of "before you dive," because you should do cool quantum computing stuff as you are
learning the relevant math. We do that in the very first chapter, Chapter 1, Superposition
with Euclid, and almost every chapter after that. It's important that you see how the math
connects to actual quantum computing.

How to use this book


Let's answer the question of how you should learn the math of quantum computing using
this book. Everyone is different, but the steps we used in school work really well (with a
few tweaks):

1. READ THIS AWESOME BOOK!


2. Watch YouTube videos – here are a couple of playlists to get you started:

• Essence of linear algebra by 3Blue1Brown (https://tinyurl.com/233ruczb)
• Linear Algebra: An In-Depth Introduction by MathTheBeautiful (https://tinyurl.com/464dvc4b)

3. Exercise – do actual math problems. You don't learn a sport by just reading a book
and watching some videos. You have to do it! This book has you covered with
exercises in every chapter.

I'd like to add three more steps to make sure you don't lose enthusiasm:

1. Go do some cool quantum computing stuff with your newfound knowledge.


2. Get stuck.
3. Come back to the book and start again at Step 1 to get unstuck.
Now I'd like to quickly talk about who this book is for.

Who this book is for


I don't assume much in terms of mathematical training. A general high school study
of math is all you need and even then, I include appendices to review subjects such
as trigonometry if you need it. The most important prerequisite is an enthusiasm for
quantum computing.
I'll quickly say this book is not for graduate students, mathematicians, physicists, and
rocket scientists in general. You probably know all this stuff already. But it might show you
a new way to teach it.

What this book is not


This is not an overall introduction to quantum computing. We will most certainly connect
the math to quantum computing and do some actual quantum computing. But in the end,
this is a math book. There are great books out there that give a general introduction to
quantum computing. A personal favorite of mine is Quantum Computing Explained by
David McMahon and there are more included in the appendix of references I have used
for the book.

Download the color images


We also provide a PDF file that has color images of the screenshots and diagrams used
in this book. You can download it here: https://static.packt-cdn.com/downloads/9781801073141_ColorImages.pdf.

Conventions used
There are a number of text conventions used throughout this book.
Bold: Indicates a new term, an important word, or words that you see onscreen. For
instance, words in menus or dialog boxes appear in bold. Here is an example: "Select
System info from the Administration panel."

Tips or Important Notes


Appear like this.

Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us
at [email protected] and mention the book title in the subject of
your message.
Errata: Although we have taken every care to ensure the accuracy of our content,
mistakes do happen. If you have found a mistake in this book, we would be grateful if
you would report this to us. Please visit www.packtpub.com/support/errata and
fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet,
we would be grateful if you would provide us with the location address or website name.
Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise
in and you are interested in either writing or contributing to a book, please visit
authors.packtpub.com.

Share Your Thoughts


Once you've read Essential Mathematics for Quantum Computing, we'd love to hear your
thoughts! Please click here to go straight to the Amazon review page for this book and
share your feedback.
Your review is important to us and the tech community and will help us make sure we're
delivering excellent quality content.
Section 1:
Introduction

This section starts the book off with easy concepts such as vectors and matrices.
The following chapters are included in this section:

• Chapter 1, Superposition with Euclid


• Chapter 2, The Matrix
1
Superposition
with Euclid
Mathematics is the language of physics and the foundation of computer science. Since
quantum computing evolved from these two disciplines, it is essential to understand
the mathematics behind it. The math you need is linear in nature, and that is where
we will start. By the time we are done, you will have the mathematical foundation to
fundamentally understand quantum computing. Let's get started!
In this chapter, we are going to cover the following main topics:

• Vectors
• Linear combinations
• Superposition

Vectors
A long time ago in a country far, far away, there lived an ancient Greek mathematician
named Euclid. He wrote a book that defined space using only three dimensions. We
will use his vector space to define superposition in quantum computing. Don't be
fooled—vector spaces have evolved tremendously since Euclid's days, and our definition
of them will evolve too as the book progresses. But for now, we will stick to real numbers,
and we'll actually only need two out of the three dimensions Euclid proposed.

To start, we will define a Euclidean vector as being a line segment with a length or
magnitude and pointing in a certain direction, as shown in the following screenshot:

Figure 1.1 – Euclidean vector


Two vectors are equal if they have the same length and direction, so the following vectors
are all equal:

Figure 1.2 – Equal vectors


Vectors can be represented algebraically by their components. The simplest way to do this
is to have them start at the origin (the point (0,0)) and use their x and y coordinates, as
shown in the following screenshot:

Figure 1.3 – Vectors represented geometrically and algebraically


You should note that I am using a special notation to label the vectors. It is called bra-ket
notation. The appendix has more information on this notation, but for now, we will use
a vertical bar or pipe, |, followed by the variable name for the vector and then an angle
bracket, ⟩, to denote a vector (for example, |a⟩). The coordinates of our vectors will be
enclosed in brackets [ ]. The x coordinate will be on top and the y coordinate on the
bottom. Vectors are also called "kets" in this notation—for example, ket a, but for now, we
will stick with the name vector.

Vector addition
So, it ends up that we can add vectors together both geometrically and algebraically, as
shown in the following screenshot:

Figure 1.4 – Vector addition



As you can see, we can take vectors and move them in the XY-plane as long as we preserve
their length and direction. We have taken the vector |b⟩ from our first graph and moved
its start position to the end of vector |a⟩. Once we do that, we can draw a third vector
|c⟩ that connects the start of |a⟩ and the end of |b⟩ to form their sum. If we look at the
coordinates of |c⟩, it is four units in the x direction and zero units in the y direction. This
corresponds to the answer we see on the right of Figure 1.4.
We can also do this addition without the help of a graph, as shown on the right of Figure 1.4.
Just adding the first components (3 and 1) gives 4, and adding the second components of the
vectors (2 and -2) gives 0. Thus, vector addition works both geometrically and algebraically
in two dimensions. So, let's look at an example.

Example
What is the sum of |m⟩ and |n⟩ here?
5  2
m =  n = 
1   −4 
The solution is:
5  2   7 
m + n =  +  = 
1   −4   −3

Exercise 1
Now, you try. The answers are at the end of this chapter:

• What is |m⟩ - |n⟩?


• What is |n⟩ - |m⟩?
• Solve the following expression (notice we use three-dimensional (3D) vectors, but
everything works the same):

3 5
 −2  +  3 
   
 1   −3

Scalar multiplication
We can also multiply our vectors by numbers or scalars. They are called scalars because
they "scale" a vector, as we will see. The following screenshot shows a vector that is
multiplied by a number on the left and the same thing algebraically on the right:

Figure 1.5 – Scalar multiplication


The vector |b⟩ is doubled or multiplied by two. Geometrically, we take the vector |b⟩ and
scale its length by two while preserving its direction. Algebraically, we can just multiply
the components of the vector by the number or scalar two.

Example
What is triple the vector |x⟩ shown here?
 −4 
x = 
2
The solution is:
 −4   −12 
3 x =3 ⋅  = 
2  6 

Exercise 2
• What is 4|x⟩?
• What is -2|x⟩?

Linear combinations
Once we have established that we can add our vectors and multiply them by scalars, we
can start to talk about linear combinations. Linear combinations are just the scaling and
addition of vectors to form new vectors. Let's start with our two vectors we have been
working with the whole time, |a⟩ and |b⟩. I want to scale my vector |a⟩ by two to get a new
vector |c⟩, as shown in the following screenshot:

Figure 1.6 – |a⟩ scaled by two to produce |c⟩


As we have said, we can do this algebraically as well, as the following equation shows:

3  6 
c = 2 a = 2⋅  =  
2 4

Then, I want to take my vector |b⟩ and scale it by two to get a new vector, |d⟩, as shown in
the following screenshot:

Figure 1.7 – |b⟩ scaled by two to produce |d⟩


So, now, we have a vector |c⟩ that is two times |a⟩, and a vector |d⟩ that is two times |b⟩:
1 2
d = 2 b = 2⋅   =  
 −2   −4 
Can I add these two new vectors, |c⟩ and |d⟩? Certainly! I will do that, but I will express |e⟩
as a linear combination of |a⟩ and |b⟩ in the following way:
$$|e\rangle = 2|a\rangle + 2|b\rangle = |c\rangle + |d\rangle$$

Vector |e⟩ is a linear combination of vectors |a⟩ and |b⟩! Now, I can show this all
geometrically, as follows:

Figure 1.8 – Linear combination


This can also be represented in the following equation:

3   1  6   2  8 
2⋅  + 2⋅  =   +   =  
2  −2   4   −4  0 
So, we now have a firm grasp on Euclidean vectors, the algebra you can perform with
them, and the concept of a linear combination. We will use that in this next section to
describe a quantum phenomenon called superposition.

Superposition
Superposition can be a very imposing term, so before we delve into it, let's take a step
back and talk about the computers we use today. In quantum computing, we call these
computers "classical computers" to distinguish them from quantum computers. Classical
computers use binary digits—or bits, for short—to store ones and zeros. These ones and
zeros can represent anything, from truth values to characters to pixel values on a screen!
They are physically implemented using any two-state device such as an electrical switch
that is either on or off.
A quantum bit, or qubit for short, is the analogous building block of quantum computers.
They are implemented by anything that demonstrates quantum phenomena, which means
they are very, very small. In the following screenshot, we show how a property of an
electron—namely spin—can be used to represent a one or zero of a qubit:

Figure 1.9 – Pair of electrons with a spin labeled 1 and 0


Physicists use mathematics to model quantum phenomena, and guess what they use to
model the state of a quantum particle? That's right! Vectors! Quantum computer scientists
have taken two of these states and labeled them as the canonical one and zero for qubits.
They are shown in the following screenshot:

Figure 1.10 – Zero and one states



As you can see, the zero and one states are just vectors on the x and y axes with a length
of one unit each. When you combine a lot of ones and zeros in classical computing,
wonderful, complex things can be done. The same is true of the zero and one state of
qubits in quantum computing.

Greek Letters
Mathematicians and physicists love Greek letters, and they have found their
way into quantum computing in several places. The Greek letter "Psi", ψ, is
often used to represent the state of a qubit. The Greek letters "alpha", α, and
"beta", β, are used to represent numbers or scalars.

While qubits can represent a one or a zero, they have a superpower in that they can
represent a combination of a zero and one as well! "How?" you might ask. Well, this
is where superposition comes in. Understanding it is actually quite simple from a
mathematical standpoint. In fact, you already know what it is! It's just a fancy way of
saying that a qubit is in a linear combination of states.
If you recall, we defined the vector |e⟩ as a linear combination of the aforementioned |a⟩
and |b⟩, like so:

Figure 1.11 – Definition of |e⟩


If we replace those letters and numbers with the Greek letters and the zero and one states
we just introduced, we get an equation like this:

Figure 1.12 – Greek letters being transposed onto a linear combination equation
The bottom equation represents a qubit in the state |ψ⟩, which is a superposition of the
states zero and one! You now know what superposition is mathematically! This, by the
way, is the only way that counts because math is the language of physics and, therefore,
quantum computing.

Measurement
But wait—there's more! With only the simple mathematics you have acquired so far, you
also get a look at the weird act of measuring qubits. The scalars α and β shown previously
play a crucial role when measuring qubits. In fact, if we were to set this qubit up in the
state |ψ⟩ an infinite number of times, when we measured it for a zero or a one, |α|² would
give us the probability of getting a zero, and |β|² would give us the probability of getting a
one. Pretty cool, eh!?!
So, here is a question. For the qubit state |ψ⟩ in the following equation, what is the
probability of getting a zero or a one when we measure it?
$$|\psi\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle$$
Well, if we said |α|² gives us the probability of getting a zero, then the answer would look like this:

$$\left|\frac{1}{\sqrt{2}}\right|^{2} = \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} = \frac{1}{2}$$
This tells us that one half or 50% of the time when we measure for a zero or a one, we
will get a zero. We can do the same exact math for β and derive that the other half of the
time when we measure, we will get a one. The state |ψ⟩ shown previously represents the
proverbial coin being flipped into the air and landing heads for a one and tails for a zero.
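To make the probability rule concrete, here is a small Python sketch (NumPy assumed) that takes the amplitudes α and β from the state above and turns them into measurement probabilities:

```python
import numpy as np

alpha = 1 / np.sqrt(2)   # amplitude of |0>
beta = 1 / np.sqrt(2)    # amplitude of |1>

p_zero = abs(alpha) ** 2
p_one = abs(beta) ** 2

print(p_zero, p_one)                   # roughly 0.5 and 0.5
print(np.isclose(p_zero + p_one, 1))   # the probabilities sum to 1 -> True
```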

Summary
In a short amount of time, we have developed enough mathematics to explain
superposition and its effects on measurement. We did this by introducing Euclidean
vectors and the operations of addition and scalar multiplication upon them. Putting these
operations together, we were able to get a definition for a linear combination and then
apply that definition to what is termed superposition. In the end, we could use all of this
to predict the probability of getting a zero or one when measuring a qubit.
In the next chapter, we will introduce the concept of a matrix and use it to
manipulate qubits!

History (Optional)
Euclidean vectors are named after the Greek mathematician Euclid circa 300 BC.
In his book, The Elements, he puts together postulates and theories from other
Greek mathematicians, including Pythagoras, that defined Euclidean geometry.
The book was a required textbook for math students for over 2,000 years.

Figure 1.13 – Euclid with other Greek mathematicians in Raphael's School of Athens

Answers to exercises
Exercise 1
3
 
a) 5

 −3 
 
b)  −5

8
1
 
 
c)  −2 

Exercise 2
 −16 
 
a)  8 

8
 
b)  −4 
2
The Matrix
In the famous movie, The Matrix, Morpheus says, "The Matrix is everywhere. It is all
around us. Even now, in this very room." Morpheus was not far off the mark, for there is
a theory in physics called the matrix string theory where essentially, reality is governed
by a set of matrices. But while a matrix is one entity in the movie, a matrix in physics is a
concept that is used again and again to model reality.

Figure 2.1 – This screenshot of the GLmatrix program by Jamie Zawinski is licensed with his permission

As you will see, the definition of a matrix is deceptively simple, but its power derives
from all the ways mathematicians have defined that it can be used. It is a central object in
quantum mechanics and, hence, quantum computing. Indeed, if I were forced to select the
most important mathematical tool in quantum computing, it would be a matrix.
In this chapter, we are going to cover the following main topics:

• Defining a matrix
• Simple matrix operations
• Defining matrix multiplication
• Special types of matrices
• Quantum gates

Defining a matrix
Mathematicians define a matrix as simply a rectangular array that has m rows and n
columns, like the one shown in the following screenshot:

Figure 2.2 – Model of a matrix with m rows and n columns


In math, matrices are written out a particular way. An example 4 × 5 matrix is shown in
the following expression. Notice that it has four rows and five columns:

Figure 2.3 – Example of a 4 x 5 matrix

Notation
In math and quantum computing, matrix variable names are in capital letters, and each
entry in a matrix is referred to by a lowercase letter that corresponds to the variable name
with subscripts (aij). Subscript i refers to the row the entry is in and subscript j refers to
the column it is in. The following formula shows this for a 3 × 3 matrix:
 a11 a12 a13   5 4 3 
   
A =  a21 a22 a23  =  2 1 7 
 a a32 a33   8 9 0 
 31   

In our example matrix A, a₂₂ = 1. What is a₃₂? Hint—it's the only number that begins with
the letter n.

Redefining vectors
One thing we will do in this book is iteratively define things so that we start simple, and as
we learn more, we will add or even redefine objects to make them more advanced. So, in
our previous chapter, our Euclidean vectors only had two dimensions. What if they have
more? Well, we can represent them with n × 1 matrices, like so:
$$\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}$$

What do we call a 1 × n matrix such as the following one?


$$\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}$$

Well, we will call it a row vector. To distinguish between the two types, we will call the
n × 1 matrix a column vector. Also, while we have been using kets (for example, |x⟩) to
notate column vectors so far, we will use something different to notate row vectors. We
will introduce the bra, which is the other side of the bra-ket notation. A bra is denoted by
an opening angle bracket, the name of the vector, and a pipe or vertical bar. For example,
a row vector with the name b would be denoted like this: ⟨b|. This is the other side of the
bra-ket notation explained further in the appendix. To make things clearer, here are our
definitions of a column vector and a row vector for now:
$$\text{column vector} := |x\rangle = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \qquad \text{row vector} := \langle y| = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}$$

Important Note
A bra has a deeper definition that we will look at later in this book. For now,
this is enough for us to tackle matrix multiplication.

Now that we have introduced how column and row vectors can be represented by
one-dimensional (1D) matrices, let's next look at some operations we can do on matrices.

Simple matrix operations


As mentioned in the introduction to this chapter, the power of matrices is the operations
defined on them. Here, we go through some of the basic operations for matrices that
we will build on as the book progresses. You have already encountered some of these
operations with vectors in the previous chapter, but we will now expand them to matrices.

Addition
Addition is one of the easiest operations, along with its inverse subtraction. You basically
just perform addition on each entry of one matrix that corresponds with another entry in
the other matrix, as shown in the following formula. Addition is only defined for matrices
with the same dimensions:

 a11 a12 a13   b11 b12 b13   a11 + b11 a12 + b12 a13 + b13 
     
A + B =  a21 a22 a23  +  b21 b22 b23  =  a21 + b21 a22 + b22 a23 + b23 
 a a32 a33   b31 b32 b33   a +b a32 + b32 a33 + b33 
 31     31 31 

Example
Here is an example of matrix addition:
 2 3   5 3   2+5 3+3   7 6 
  +   =   =  
 4 5   0 −1   4 + 0 5 − 1   4 4 

Exercise 1
What is the sum of the following two matrices? (Answers to exercises are at the end of
the chapter.)
1 2   2 1 
3 4  +  0 2 
   

Scalar multiplication
Scalar multiplication is also rather easy. A scalar is just a number, and so scalar
multiplication is just multiplying a matrix by a number. We define it for a scalar b and
matrix A thusly:
$$b \cdot A = bA = b \cdot \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} b \cdot a_{11} & b \cdot a_{12} & \cdots & b \cdot a_{1n} \\ b \cdot a_{21} & b \cdot a_{22} & \cdots & b \cdot a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b \cdot a_{m1} & b \cdot a_{m2} & \cdots & b \cdot a_{mn} \end{bmatrix}$$

Example
As always, an example can definitely help:
 4 7   3⋅4 3⋅7   12 21 
3⋅  =   =  
 2 −3   3 ⋅ 2 3 ⋅ −3  
6 −9 


Exercise 2
Calculate the following:
 3 4 
3 
 2 1 

Transposing a matrix
An important operation involving matrices and vectors is to transpose them. To transpose
a matrix, you essentially convert the rows into columns and the columns into rows. It is
denoted by a superscript T. Here is the definition:
 a11 a12 ⋯ a1n   a11 a21 ⋯ am1 
   
 a a22 ⋯ a2 n   a12 a22 ⋯ am2 
If A =  21  , then A =  ⋮
T

⋮ ⋮ ⋱ ⋮ ⋮ ⋱ ⋮
   
 am1 am 2 ⋯ amn   a1n a2 n ⋯ amn 

Notice how the subscripts for the diagonal entries stay the same, but the subscripts
for all other entries are switched (for example, the entry at a₁₂ becomes a₂₁). Also, as a
consequence of this operation, the dimensions of the matrix are switched. A 2 × 4 matrix
becomes a 4 × 2 matrix.

Examples
Here is an example of a 3 × 4 matrix transposed:
T  1 5 9 
 1 2 3 4   
   2 6 0 
 5 6 7 8  = 
 9 0 1 2  3 7 1 
 
   4 8 2 

Here is an example of a square matrix transposed:


$$\begin{bmatrix} 3 & 5 \\ 2 & 0 \end{bmatrix}^{T} = \begin{bmatrix} 3 & 2 \\ 5 & 0 \end{bmatrix}$$

Now, let's move on to matrix multiplication.



Defining matrix multiplication


Matrix multiplication can be a complicated procedure, and we will build up to it
gradually. It is defined as an operation between an m × n matrix and an n × p matrix that
produces an m × p matrix. The following screenshot shows this well:

Figure 2.4 – Schematic of matrix multiplication


Notice that matrix multiplication is only defined if the number of columns in the first
matrix equals the number of rows in the second matrix—or, in other words, the ns have
to match in our preceding figure. This is so important that we will give it a special name:
the matrix multiplication definition rule, or definition rule for short. Based on this, the
first thing you should do when presented with two matrices to multiply is to make sure
they pass the definition rule. Otherwise, the operation is undefined. For example, do the
following two matrices pass the definition rule?
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} c & d & z \\ e & f & x \\ g & h & y \end{bmatrix}$$

The answer is no because you have a 2 × 2 matrix being multiplied by a 3 × 3 matrix. The
number of columns of the first matrix does not equal the number of rows of the second
matrix. What about the following matrix multiplication?

$$\begin{bmatrix} a & b & e \\ c & d & f \end{bmatrix} \cdot \begin{bmatrix} c & d \\ g & h \\ i & k \end{bmatrix} \qquad (1)$$

Yes, it is defined! It is a 2 × 3 matrix multiplied by a 3 × 2 matrix. The number of columns


in the first equals the number of rows in the second!

If the matrices pass the definition rule, the second step you should take when multiplying
two matrices is to draw out the answer's dimensions. For instance, when presented with
the preceding matrix multiplication from Equation (1) of a 2 × 3 matrix and a 3 × 2
matrix, we would draw out a 2 × 2 matrix, like so. I like to write the dimensions of each
matrix on the top:

Figure 2.5 – Writing dimensions of matrix multiplication


Remember—the number of rows in the first matrix determines the number in rows of the
product or resultant matrix. The number of columns in the second matrix determines the
number of columns in the product.
Okay—to summarize, here are the first two steps you should take when doing matrix
multiplication:

1. Does it pass the definition rule? Do the number of columns in the first matrix equal
the number of rows in the second matrix?
2. Draw out the dimensions of the product matrix. For an m × n matrix and an n × p
matrix, the dimensions of the resulting matrix will be m × p.

Based on all of this, do you think matrix multiplication is commutative (that is,
A ⋅ B = B ⋅ A)? Think about it—I'll give you the answer later in this chapter. Now, let's
look at how to multiply two vectors to produce a scalar.

Multiplying vectors
Earlier, we defined column and row vectors as one-dimensional matrices. Since they are
one-dimensional, they are the easiest matrices to multiply. A bracket, denoted by ⟨x|y⟩, is
essentially matrix multiplication of a row vector and a column vector. Here is our definition:

$$\langle x|y\rangle = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = x_1 \cdot y_1 + x_2 \cdot y_2 + \cdots + x_n \cdot y_n$$

Important Note
A bracket has a deeper definition that we will look at later in this book. For
now, this is enough for us to tackle matrix multiplication.

Let's look at an example to make this more concrete.

Examples
Let's say ⟨y| and |x⟩ are defined this way:
1 
2
y = [3 2 1 4] x =  
3 
 
4

Now, let's calculate the bracket ⟨y| x⟩:


1 
2 3 ⋅ 1 + 2 ⋅ 2 + 1 ⋅ 3 + 4 ⋅ 4

y x = [3 2 1 4] ⋅   =  3 + 4 + 3 + 16
3  
   26
4

Here are two more examples of matrix multiplication of a row vector with a
column vector:
4
[3 2]   = 3 ⋅ 4 + 2 ⋅ 1 = 12 + 2 = 14
1 

4
[1 2 3] 5  = 1 ⋅ 4 + 2 ⋅ 5 + 3 ⋅ 6 = 4 + 10 + 18 = 32
 6 

Exercise 3
What is the answer to the matrix multiplication of this row vector and column vector?
1 
2
[1 2 3]
 
 3 

Matrix-vector multiplication
We are building up to pure matrix multiplication, and the next step to getting there
is matrix-vector multiplication. Let's look at a typical expression for matrix-vector
multiplication:
2 1
3 4 ⋅ 2
   3 
 4 5 

Now, this is a 3 × 2 matrix multiplied by a 2 × 1 column vector. Does this pass the matrix
multiplication rule? Yes! 2=2. What are the dimensions of the product? Well, taking the
outer dimensions of the two matrices involved in the product, it will be a 3 × 1 column
vector. If the matrix and column vector are variables, we can write out the product this way:
$$A|x\rangle \qquad (2)$$
This denotes the matrix A multiplying the vector |x⟩.
Alright—how do we actually do the multiplication? Well, first, we separate the rows of the
matrix into row vectors. Wait—there are vectors in matrices?! Yes—you can see a matrix
as a set of row vectors or column vectors when it is convenient to do so. Let's look at an
example of doing this:
$$\begin{bmatrix} 4 & 3 & 2 \\ 4 & 1 & 3 \\ 2 & 4 & 2 \end{bmatrix} \Rightarrow \begin{matrix} \begin{bmatrix} 4 & 3 & 2 \end{bmatrix} = \langle R_1| \\ \begin{bmatrix} 4 & 1 & 3 \end{bmatrix} = \langle R_2| \\ \begin{bmatrix} 2 & 4 & 2 \end{bmatrix} = \langle R_3| \end{matrix} \qquad (3)$$

See how I separated the three rows of the matrix into three row vectors? I even gave them
names, with the letter R standing for row and the subscript number showing which row it
came from. So, after performing the first two steps of matrix multiplication, the next step is:

3. Separate the matrix on the left into row vectors, ⟨R1|, ⟨R2|, … , ⟨Rm|.

From there, we will calculate the bracket between each row vector from the separated
matrix and the column vector we are multiplying by. Let's see an example.
Let's say we are trying to find the answer to the matrix-vector multiplication between
matrix A and vector |x⟩ as in Equation (2) and matrix A is the one defined in Equation (3).
We will create a simple vector |x⟩ to give us:

 4 3 2   w a 
A x = 4 1 3 ⋅ y = b 
   
     
 2 4 2   z   c 

Okay—so, how do we figure out what a, b, and c are? Let's calculate the brackets!
$$a = \langle R_1|x\rangle = \begin{bmatrix} 4 & 3 & 2 \end{bmatrix} \begin{bmatrix} w \\ y \\ z \end{bmatrix} = 4w + 3y + 2z$$

$$b = \langle R_2|x\rangle = \begin{bmatrix} 4 & 1 & 3 \end{bmatrix} \begin{bmatrix} w \\ y \\ z \end{bmatrix} = 4w + y + 3z$$

$$c = \langle R_3|x\rangle = \begin{bmatrix} 2 & 4 & 2 \end{bmatrix} \begin{bmatrix} w \\ y \\ z \end{bmatrix} = 2w + 4y + 2z$$

Now that we have seen an example, let's generalize this for the final step of matrix-vector
multiplication:

4. For a matrix-vector multiplication, A|x⟩, compute the bracket for each row in A
with the column vector |x⟩. Put the result of the bracket in the corresponding row
for the resultant column vector.

So, let's write this all out for our example:


$$A = \begin{bmatrix} 4 & 3 & 2 \\ 4 & 1 & 3 \\ 2 & 4 & 2 \end{bmatrix} \Rightarrow \begin{matrix} \begin{bmatrix} 4 & 3 & 2 \end{bmatrix} = \langle R_1| \\ \begin{bmatrix} 4 & 1 & 3 \end{bmatrix} = \langle R_2| \\ \begin{bmatrix} 2 & 4 & 2 \end{bmatrix} = \langle R_3| \end{matrix} \qquad |x\rangle = \begin{bmatrix} w \\ y \\ z \end{bmatrix}$$

$$A|x\rangle = \begin{bmatrix} 4 & 3 & 2 \\ 4 & 1 & 3 \\ 2 & 4 & 2 \end{bmatrix} \cdot \begin{bmatrix} w \\ y \\ z \end{bmatrix} = \begin{bmatrix} \langle R_1|x\rangle \\ \langle R_2|x\rangle \\ \langle R_3|x\rangle \end{bmatrix}$$

Now, let's put this all together for a proper definition of matrix-vector multiplication.

Matrix-vector multiplication definition


Given an m × n matrix A and an n × 1 vector |x⟩, where A is made up of m row vectors,
like so:
$$A = \begin{bmatrix} \langle R_1| \\ \langle R_2| \\ \vdots \\ \langle R_m| \end{bmatrix}$$

Matrix-vector multiplication is defined as:


 
$$A|x\rangle = |y\rangle = \begin{bmatrix} \langle R_1|x\rangle \\ \langle R_2|x\rangle \\ \vdots \\ \langle R_m|x\rangle \end{bmatrix}$$

Now, let's apply this to an example with real numbers:


 2 
 [ 2 1]   
 3 
2 1    2 ⋅ 2 + 3 ⋅ 1 7
 3 4  ⋅  2  =  3 4  2   = 3 ⋅ 2 + 4 ⋅ 3 = 18 
   3  [ ]     
 3 
 4 5    
 4 ⋅ 2 + 5 ⋅ 3 
  23
[ 4 5]  2  
 3 
 
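The same matrix-vector product can be checked with a few lines of Python (NumPy assumed, just a sketch):

```python
import numpy as np

A = np.array([[2, 1],
              [3, 4],
              [4, 5]])
x = np.array([2, 3])

# The @ operator performs matrix-vector multiplication
print(A @ x)   # [ 7 18 23]
```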

Now, it's your turn to do matrix-vector multiplication.

Exercise 4
If you have matrices A, B, and C defined as so:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \qquad C = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$

and three vectors defined as so:

$$|x\rangle = \begin{bmatrix} 0 \\ -1 \\ -2 \end{bmatrix} \qquad |y\rangle = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \qquad |z\rangle = \begin{bmatrix} 2 \\ 2 \end{bmatrix}$$

what are the following matrix-vector products? If the operation is undefined, say so.

$$A|x\rangle \qquad C|x\rangle \qquad A|y\rangle \qquad B|z\rangle$$

Matrix multiplication
Alright—we have finally arrived at matrix multiplication! Trust me, it is worth the wait
because matrix multiplication is used all over quantum computing and you now have the
basis to do the calculations correctly and succinctly.
Remember in the previous section that I said you could view a matrix as a set of row
vectors? Well, it ends up you can view them as a set of column vectors as well. Let's take our
matrix from before and repurpose it to see matrices as a set of column vectors:
 4 3 2   4   3   2 
       
 4 1 3  ⇒  4   1   3 
 2 4 2   2   4   2 
       
C1 C2 C3

So, this time, I separated the matrix into three column vectors. I gave each one a name
with the letter C, standing for column, and the subscript number showing which column
it came from.
We will use the first three steps we have defined so far for matrix multiplication as well
and replace the fourth step from matrix-vector multiplication. Here are the first four steps
of matrix multiplication:

1. Does it pass the definition rule? Do the number of columns in the first matrix equal
the number of rows in the second matrix?
2. Draw out the dimensions of the product matrix. For an m × n matrix and an n × p
matrix, the dimensions of the resulting matrix will be m × p.

3. Separate the matrix on the left into row vectors, ⟨R1|, ⟨R2|, … , ⟨Rm|.
4. Separate the matrix on the right into column vectors, |C1⟩, |C2⟩,…,|Cp⟩.

Alright—we have arrived at the last step! Can you guess what it is? Well, it definitely
involves brackets. Without further ado, here it is:

5. For each entry aij in the resultant matrix, compute the bracket of the ith row vector
and the jth column vector, ⟨Ri|Cj⟩.

If matrix A is the left matrix and matrix B is the right matrix (that is, A ⋅ B), then the
following diagram is a good way to look at Step 5 graphically:

Figure 2.6 – Depiction of the matrix product AB [1]


Let's look at an example.

Example
Say we have the following two matrices:
1 2  1 2 3
A =   B =  
3 4  4 5 6

Let's go through our five steps.

1. Does it pass the definition rule? Do the number of columns in the first matrix equal
the number of rows in the second matrix?
Yes! A is a 2 × 2 matrix and B is a 2 × 3 matrix. So, the operation is defined.
2. Draw out the dimensions of the product matrix. For an m × n matrix and an n × p
matrix, the dimensions of the resulting matrix will be m × p.
$$A \cdot B = \overbrace{\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}}^{2 \times 2} \cdot \overbrace{\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}}^{2 \times 3} = \overbrace{\begin{bmatrix} \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \end{bmatrix}}^{2 \times 3}$$

3. Separate the matrix on the left into row vectors, ⟨R1|, ⟨R2|, … , ⟨Rm|.

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \Rightarrow \begin{matrix} \begin{bmatrix} 1 & 2 \end{bmatrix} = \langle R_1| \\ \begin{bmatrix} 3 & 4 \end{bmatrix} = \langle R_2| \end{matrix}$$

4. Separate the matrix on the right into column vectors, |C1⟩, |C2⟩,…,|Cp⟩.
 1 2 3   1   2   3 
  ⇒      
 4 5 6   4   5   6 
C1 C2 C3

5. For each entry aij in the resultant matrix, compute the bracket of the ith row vector
and the jth column vector, ⟨Ri|Cj⟩.
 
$$A \cdot B = \begin{bmatrix} \langle R_1|C_1\rangle & \langle R_1|C_2\rangle & \langle R_1|C_3\rangle \\ \langle R_2|C_1\rangle & \langle R_2|C_2\rangle & \langle R_2|C_3\rangle \end{bmatrix}$$

So, the computations will look like this:


 1  2 3 
 [1 2]   [1 2]   [1 2]   
1 2  1 2 3   4 5 6 
3 4  ⋅  4 5 6  = 
    1  2 3 
[3 4]   [3 4]   [3 4 ]   
 4 5   6  

 1 + 8 2 + 10 3 + 12   9 12 15 
=   =  
3 + 16 6 + 20 9 + 24  19 26 33

And there's our answer! That may have taken a while, but it will become quicker and more
intuitive as you practice it. Speaking of practice…

Exercise 5
Given the following three matrices:
 −1 2 
1 2  1 0 4
D =  3 0 E =   F =  
   0 −2   −1 5 −2 
 5 2 

what are the following products? Follow the steps! And if the operation is undefined,
say so.
$$D \cdot E \qquad E \cdot F \qquad F \cdot D$$

Properties of matrix multiplication


It's good to know some general properties of matrix multiplication. These properties
assume that the matrices all have the right dimensions to pass the matrix multiplication
definition rule. Here they are—matrix multiplication is:

• Not commutative: A · B ≠ B · A
• Distributive with respect to matrix addition: A(B + C) = AB + AC and (B + C)A = BA + CA

• Associative: (AB)C = A(BC)

• The transpose of a matrix product A · B is the product of the transpose of each matrix in reverse:
$$(AB)^T = B^T A^T$$
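Before moving on, here is a short Python sketch (NumPy assumed, just for checking your work) that multiplies the example matrices from this section and spot-checks two of these properties:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[1, 2, 3],
              [4, 5, 6]])

print(A @ B)   # [[ 9 12 15]
               #  [19 26 33]]

# The transpose of the product equals the product of the transposes in reverse
print(np.array_equal((A @ B).T, B.T @ A.T))   # True
```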

That concludes our section on matrix multiplication. You might want to take a
break—you've been through a lot! When you come back, we'll look at some special
matrices that are good to know.

Special types of matrices


In the world of matrices, some are so special that they have been singled out. Here
they are.

Square matrices
A special type of matrix is a square matrix. A square matrix is one where the number
of rows equals the number of columns. In other words, it is an m × n matrix in which
m = n. Square matrices show up all over the place in quantum computing due to special
properties that they can have—for example, symmetry, which is discussed later in the
book. As we progress in the book, they will become one of the central types of matrices
we will use. Some examples of square matrices are:
 3 8 4 6 
 3 8 8   
 2 5     2 9 2 9 
   9 1 6  
 1 7   3 5 6  1 6 0 5 
 
   0 7 5 4 

Identity matrices
An important type of square matrix is an identity matrix, named I. It is defined so that it
acts as the number 1 in matrix multiplication so that the following holds true:
A⋅I = A

It has ones all down its principal diagonal and zeros everywhere else. Its dimensions need
to change based on the matrix it is being multiplied by. Here are some examples of I in
different dimensions:
 1 0 0 
 1 0   
I1 = [1] , I 2 =   , I 3 =  0 1 0 
 0 1
  0 0 1 
 

You should multiply some of the matrices we have used before with the identity matrix to
convince yourself that it does indeed return the matrix it is multiplied by. Now that we've
gone over some special matrices, let's get into why matrix multiplication is important in
quantum computing.
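Before getting to quantum gates, here is the suggested check done in Python (NumPy assumed); np.eye builds an identity matrix of a given size:

```python
import numpy as np

A = np.array([[3, 8, 8],
              [9, 1, 6],
              [3, 5, 6]])
I = np.eye(3)   # the 3 x 3 identity matrix

# Multiplying by the identity returns the original matrix
print(np.array_equal(A @ I, A))   # True
```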

Quantum gates
In this section, I'd like to take the math you have learned in this chapter around matrices
and connect it to actual quantum computing—namely, quantum gates. Please remember
that this book is not about teaching you everything in quantum computing, but rather
the mathematics needed to do and learn quantum computing. That being said, I want to
connect the math to quantum computing and show the motivation for learning it. Do not
be frustrated if this does not all make sense, and please consult the reference books in the
appendix for more information on quantum gates.

Logic gates
In classical computing, we use logic gates to put together circuits that will implement
algorithms, such as adding two numbers. The logic gates represent Boolean logic. Here
are some simple logic operations:
• AND
• OR
• NOT

In a circuit, you have input, output, and logic gates. The input and outputs are represented
by a binary number, with 1 being true and 0 being false. Here is an example of a NOT gate:

Figure 2.7 – NOT gate



Truth tables can be created that correspond to the inputs and outputs of the circuit.
Here is a truth table for the NOT gate just described:

Figure 2.8 – Truth table for NOT gate


A slightly more complicated circuit involves the AND gate, as shown in the
following figure:

Figure 2.9 – AND gate


Here is a truth table for this circuit:

Figure 2.10 – Truth table for AND gate



Circuit model
Much of quantum computing is modeled using quantum circuits that are similar to,
but not the same as, the classical circuits we just went through. In quantum circuits, the
inputs are qubits (vectors), and the gates are matrices. An example quantum logic gate is
shown here:

Figure 2.11 – Quantum circuit with NOT gate


The output qubit is derived through matrix-vector multiplication! So, if we remember
from Chapter 1, Superposition with Euclid, qubits are just vectors and the binary states
for qubits are:
1  0
0 =   and 1 =  
0 1 

The NOT gate in quantum computing is represented by the following matrix:


0 1 
X =  
1 0 
So, if our input qubit is a |1⟩, then the output would be:
0 1  0  1 
X 1 =     =   = 0
1 0  1  0
If we did the computation with |0⟩ as the input, the output would be |1⟩. Basically, this is
the quantum version of the NOT gate!
There are many quantum gates, but they are all modeled as matrices. Thus, the math you
have learned in this chapter is directly applicable to quantum computing!
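To see this for yourself, here is the quantum NOT gate as plain matrix-vector multiplication in Python (NumPy assumed; no quantum SDK is needed to check the math):

```python
import numpy as np

zero = np.array([1, 0])   # the state |0>
one = np.array([0, 1])    # the state |1>
X = np.array([[0, 1],
              [1, 0]])    # the quantum NOT gate

print(X @ one)    # [1 0] -> the state |0>
print(X @ zero)   # [0 1] -> the state |1>
```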

Summary
In this chapter, we have learned a good number of operations on matrices and vectors.
We can now do basic computation with them, and we saw their application to quantum
computing. Please note that not all matrices can be used as quantum gates. Please keep
reading the book to find out which ones can.
In the next chapter, we start to go deeper and look at the foundations of mathematics.

Answers to exercises
Exercise 1
3 3
 
3 6

Exercise 2
 9 12 
 
6 3 

Exercise 3
14

Exercise 4
 −8 
A x isundefined C x =  
 −17 
 −1
6
A y =  −1 Bz =  
  14 
 −1

Exercise 5
 −1 2 
1 2 1 0 4
D =  3 0 E =  F = 
  0 −2  
 −1 5 −2 
 5 2 
 −1 2   −1 −6 
1 2 
D ⋅ E =  3 0  = 3 6
  0 −2   
 5 2    5 6 
1 2   1 0 4  −1 10 0 
E ⋅F =   =
0 −2   −1 5 −2   2 −10 4 
 
 −1 2 
 1 0 4   19 10 
F ⋅D =    3 0 =  6 −6 
 − 1 5 − 2   5 2  
 

References
[1] Matrix multiplication diagram by Bilou is licensed under CC BY-SA 3.0.
Section 2:
Elementary
Linear Algebra

This section digs deeper into the heart of quantum computing: linear algebra.
The following chapters are included in this section:

• Chapter 3, Foundations
• Chapter 4, Vector Spaces
• Chapter 5, Using Matrices to Transform Space
3
Foundations
Up until this point, we have introduced our mathematics with as little rigor as possible.
This chapter – and this part of the book – will change that. You may ask why? Well, this
rigor and foundational material is needed when we get to much more complex concepts
such as Hilbert spaces and tensor products. Without this chapter, these advanced
concepts will not make sense, and you won't have the context to understand them.
This chapter goes through the field of abstract algebra. As you might expect, there will
be some abstract concepts that will be explored. Abstract algebra takes a step back from
all other forms of algebra, such as linear, Boolean, and elementary algebra, and it tries
to see what can be generalized between them. Mathematicians have found that they can
generalize a few foundational concepts that, when put together, allow us to go further in
math than we have before and help us understand it at a more fundamental level. Within
this chapter, our ultimate goal will be to define vector spaces rigorously.
So, without further ado, let's get into the material. We will cover the following topics:

• Sets
• Functions
• Binary operations
• Groups
• Fields
• Vector spaces

Sets
Sets are very intuitive and are really about grouping things together. For example, all
mammals – taken together – form a set. In this set, its members are things such as the fox,
squirrel, and dog. Sets don't care about duplication – so, if we have 5,000 dogs in our first
mammals set, this set is equal to a set of mammals that has only one dog. Let's make this
more formal.

The definition of a set


A set is a collection of objects. This collection can be finite or infinite. Mathematical
objects are abstract, have properties, and can be acted upon by operations. Examples of
objects are numbers, functions, shapes, and matrices. Objects in a set are called elements
or members.

Notation
There are multiple ways to denote a set. The easiest way is to just describe it, as I did with
the set of mammals. Another example for doing this would be to describe a set S of all US
States. Some examples of elements in this set would be Virginia and Alabama. Let's look at
a more formal way to notate sets called set-builder notation.

Set-builder notation
Set-builder notation is definitely more formal, but with this formality, you gain preciseness
(which mathematicians covet). The easiest way to denote a set in set-builder notation is
just to enumerate all the members of a set. You do this by giving the variable name of the
set, followed by an equals sign, and then the members of the set are put into curly brackets
and separated by commas. Here is an example:
N = {He, Ne, Ar, Kr, Xe, Rn, Og}

Any guesses on what this set is? Extra bonus points if you guessed the noble gases.
An ellipsis (…) is used to skip listing elements if a pattern is clear or to denote an infinite
set, shown as follows:
X = {1, 2, 3, …, 100}    Y = {…, −2, −1, 0, 1, 2, …}

The final way you can denote a set is to include conditions for the members of your set.
Here is an annotated example to explain each part of the notation:

Figure 3.1 – An annotated description of set-builder notation


You can also describe what type of number you are dealing with by writing the type before
the vertical bar, like so:
{x is a prime number | x < 8}

This is equivalent to the set {2, 3, 5, 7}.
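Set-builder notation has a close cousin in programming: the set comprehension. Here is a small Python sketch (standard library only; the helper function is mine, not the book's) that builds the same set {2, 3, 5, 7}:

```python
def is_prime(n):
    # A simple (not efficient) primality test for small n
    return n > 1 and all(n % d != 0 for d in range(2, n))

# {x is a prime number | x < 8}
primes_below_8 = {x for x in range(8) if is_prime(x)}
print(primes_below_8)   # {2, 3, 5, 7}
```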

Other set notation


An important symbol in set notation is ∈, which denotes membership. If there is a slash
through it, ∉, it means the object is not a member of the set. For example, the following
denotes that He (helium) is a member of the noble gases and O (oxygen) is not:

He ∈ N    O ∉ N

The next symbol to consider is used to denote subsets. If X and Y are sets, and every
element of X is also an element of Y, then:

• X is a subset of Y, denoted by X ⊆ Y.
• Y is a superset of X, denoted by Y ⊇ X.

For example, the set N of the noble gases is a subset of the set E of all elements. Or,
mammals are a subset of the animal kingdom and a superset of the primates.
Finally, it is important to define the empty set that has no members and is denoted by ∅
or {}. The empty set is a subset of all sets.

Important sets of numbers


Since mathematics is all about numbers, special attention should be given to certain sets of
numbers that we will see in this book. Each one has a special double-struck capital letter
to represent them. So, without further ado, here is the list:

• ℕ, which is the set of natural numbers defined as {0, 1, 2, 3, …}. This is the first set
of numbers you learn as a child, and they are used for counting.

• ℤ, which is the set of integers defined as {…, -3, -2, -1, 0, 1, 2, 3, …}. This is a superset
of ℕ. The letter Z comes from the German word Zahlen, which means numbers.
• ℚ, which is the set of rational numbers, where a rational number is defined as
any number that can be expressed as a ratio or quotient of two integers. Since all
integers are divisible by 1, ℤ is a subset of ℚ.
• ℝ, which is the set of real numbers. The real numbers are composed of ℚ and all
the irrational numbers. Irrational numbers, when represented as a decimal, do not
terminate, nor do they end with a repeating sequence. For example, the rational
number 1/3 is represented as 0.33333… in decimal, but the irrational number π
starts with 3.14159, but it never terminates nor repeats a sequence. Some other
examples of irrational numbers are as follows: all square roots of natural numbers
that are not perfect squares; the golden ratio, ϕ; Euler's number, e.
• ℂ, which is the set of complex numbers. We will go in-depth into complex
numbers in a later chapter, but for now, we will just say that a complex number is
represented by:
a + bi, where a and b are real numbers and i² = −1

If we set b = 0, then we have the set of all real numbers, so ℝ ⊆ ℂ.


To sum up:
ℕ⊆ℤ⊆ℚ⊆ℝ⊆ℂ

The following diagram shows this very well graphically:

Figure 3.2 – A diagram showing all the sets of numbers [1]


Alright, let's move on to tuples!

Tuples
It is important to note that sets do not care about order. So, if set A = {1, 2, 3} and set
B = {2, 3, 1}, A and B are equal. Sets also do not care about duplication. So, if set
C = {1, 2, 3, 3, 3} and set D = {1, 2, 3}, C and D are also equal. A mathematical object that
does care about these things is called a tuple.
A tuple is a finite, ordered list of elements, which is denoted with open and close
parentheses, as shown here:
E = (3, 4, 5, 5)   F = (3, 4, 5)   H = (5, 3, 4)

Since order and duplication matter to tuples, none of the preceding examples are equal to
each other. The number of elements in a tuple is defined as n, and we use this number to
refer to a tuple as an n-tuple. For example, in the preceding example, E is a 4-tuple and F
is a 3-tuple. Some n-tuples have special names; for example, a 2-tuple is also known as an
ordered pair.

The Cartesian product


The Cartesian product may not be as familiar to you as sets, but it is still important. The
Cartesian product takes two sets and creates a third set of ordered pairs (that is, 2-tuples)
from those two sets. It is denoted by the × symbol. The following figure shows an example
for the Cartesian product of two sets A={x, y, z} and B={1, 2, 3}.

Figure 3.3 – An example of the Cartesian product of A × B [2]


Here's another example: if I have a set A = {1, 2} and a set B = {6, 7, 8, 9}, then A × B is
equal to {(1, 6), (1, 7), (1, 8), (1, 9), (2, 6), (2, 7), (2, 8), (2, 9)}. It should be noted that
the Cartesian product is not commutative, so in general, A × B ≠ B × A.
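For readers who want to play with this, Python's itertools.product computes exactly this construction. This is only an illustrative sketch (the sets A and B are the ones from the example above):

```python
# An illustrative sketch using the standard library: itertools.product
# computes the Cartesian product of two (or more) iterables as tuples.
from itertools import product

A = {1, 2}
B = {6, 7, 8, 9}

AxB = set(product(A, B))
BxA = set(product(B, A))

print(sorted(AxB))    # [(1, 6), (1, 7), (1, 8), (1, 9), (2, 6), (2, 7), (2, 8), (2, 9)]
print(AxB == BxA)     # False - the Cartesian product is not commutative
print(len(set(product(A, A, A))))   # 8, since A x A x A has 2^3 ordered 3-tuples
```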

The greatest example of this operation is from René Descartes (after whom the Cartesian product is named). You have probably heard of the Cartesian plane, as shown in the
following diagram. Well, this is the Cartesian product of the set X of the real numbers ℝ
and the set Y of the real numbers ℝ, and it is denoted by ℝ × ℝ = ℝ2.

Figure 3.4 – A Cartesian plane with Cartesian coordinates [3]


The Cartesian product can be done multiple times, so if you have three sets, for example,
X, Y, and Z, then X × Y × Z is the set of all 3-tuples for every combination of elements of
X, Y, and Z. Again, an example is helpful: let's say ℝ × ℝ × ℝ produces ℝ3, which is the set
of all 3-tuples of real numbers or three-dimensional space. In general:
Xⁿ = X × X × … × X  (n times)

So, for shorthand, we write ℝn to denote all of the n-tuples of real numbers.
Now that we have covered everything to do with sets and tuples, let's look at another
fundamental object: functions.

Functions
Functions are fundamental to mathematics, and there is no doubt that you have been
exposed to them before. However, I want to go over certain aspects of them in depth, as
we will define things such as matrices as representations of functions later in the book.

The definition of a function


A function, for example, y = f(x), maps every element x in a set A to another element y
in set B. Each element y is called the image of x under the function f(x). Set A is called the
domain of the function and set B is called the codomain of the function. The domain and
codomain of a function are denoted by f: A → B . The following mapping diagram shows
the function f: X → Y.

Figure 3.5 – An example function [4]


All the images of f(x) form a set called the range. The range is a subset of the codomain.
In the previous diagram of our function f: X → Y, the codomain was the set Y, but the
range was the set {D, C}. The image of the element 1 in the domain was the element D in
the range.
I could define another function, f:ℝ → ℝ, where f(x) = 2x. Here, the domain and
codomain are the real numbers, as well as the range. The image of 3 under f(x) is f (3)=6.
There are two rules that functions must follow:

1. Every member of the domain must be mapped.


2. Every member of the domain cannot be mapped to more than one element in
the codomain.

The mapping shown in the following figure is illegal because it doesn't follow these two
rules. It breaks the first rule by not mapping the elements 3 and 4 in the domain. Can you
spot how it breaks the second rule?

Figure 3.6 – An example of an illegal function [5]



I'm sure you pointed out that it breaks the second rule by mapping the element 2 to
B and C.
Let's say I have a set C = {1, 2, 3} and a set D = {4, 5, 6}. One of the many ways I can
define a function is with a table. This table defines a function, f:C → D.

Figure 3.7 – A function table


Now, imagine I delete the last row from the table. Is it still a function? No, because I do
not have a mapping for every element of the domain set C, namely, the number 3.

Exercise 1
For the sets E = {a, b, c} and F = {4, 5, 6, 7, 8}, and a function, f:E → F, which of the
following tables do not represent a function?

Figure 3.8 – The Exercise 1 tables

Invertible functions
Invertible functions are key in quantum computing because the laws of quantum
mechanics only allow these types of functions in certain situations. Before going into
invertibility, I'd like to look at three other properties of functions.

Injective functions
An injective function, also known as a one-to-one function, is a function where each
element in the range is the image of only one element in the domain. It is important to
note that not every element in the codomain needs to be in the range, so a function is
still injective if there are members of the codomain that are not mapped. Let's look at
an example.

Figure 3.9 – The function on the left is injective and the function on the right is not
The function on the left in Figure 3.9 is injective because A, B, and C are the image of only
one element in the domain X. The function on the right is not injective because B is the
image of both the numbers 2 and 3 in the domain X.

Surjective functions
A surjective function, also known as an onto function, is a function where the range of the function is equal to its codomain. Another way to say this is that every
element in the codomain is mapped to by at least one element in the domain.
Neither of the functions from Figure 3.9 is surjective because they leave elements in the
codomain unmapped. However, the following functions are surjective:

Figure 3.10 – Two surjective functions



The function f:ℝ →ℝ, where f(x) = x2, is not surjective because not all of the elements
in the codomain (namely, all negative real numbers) are mapped to, as the square of a
number (that is, x2) can only be non-negative. The negative real numbers are left out.

Bijective functions
Now that we know what injective and surjective functions are, defining a bijective
function is quite easy! A bijective function is one that is both injective and surjective.
And guess what else we get in this deal!? A function is invertible if it is bijective!
Now, did you notice any bijective functions in our preceding examples? Extra, extra bonus
points if you pointed to the function on the right in Figure 3.10, which is reproduced in
the following figure:

Figure 3.11 – A bijective, invertible function


The inverse of a function f(x) is usually denoted by f⁻¹(x). It makes sense that an invertible function has to be bijective. The only way I can make f⁻¹(x) a function is to
make it follow the two rules for functions that we described before, namely:

• Every member of the domain must be mapped.


• Every member of the domain cannot be mapped to more than one element in
the codomain.

For f⁻¹(x), the domain and codomain are flipped. For it to follow the first rule, every element of the codomain for f(x) must be mapped (that is, f(x) has to be surjective). And for it to follow the second rule, every member of the range of f(x) cannot be mapped to more than one element in its domain (that is, f(x) is injective). The following figure shows f⁻¹(x) graphically:

Figure 3.12 – The inverse of f(x)
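If you want to check these properties on small, finite functions, a function can be stored as a dictionary from domain elements to images. The following sketch (the helper names are mine, assuming this dictionary representation) tests injectivity, surjectivity, and therefore invertibility:

```python
# An illustrative sketch for finite functions stored as Python dicts
# (domain element -> image). The helper names are mine, not the book's.

def is_injective(f):
    """No two domain elements share the same image."""
    images = list(f.values())
    return len(images) == len(set(images))

def is_surjective(f, codomain):
    """Every element of the codomain is the image of some domain element."""
    return set(f.values()) == set(codomain)

def is_invertible(f, codomain):
    """A function is invertible exactly when it is bijective."""
    return is_injective(f) and is_surjective(f, codomain)

f = {1: 'A', 2: 'B', 3: 'C'}
print(is_invertible(f, {'A', 'B', 'C'}))   # True - bijective, so f has an inverse
print({v: k for k, v in f.items()})        # {'A': 1, 'B': 2, 'C': 3} - the inverse

g = {1: 'A', 2: 'B', 3: 'B'}
print(is_injective(g))                     # False - B is the image of both 2 and 3
```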


Alright, that concludes our discussion of functions. Let's move on to binary operations.

Binary operations
You are probably familiar with some binary operations, for example, addition and
multiplication, but we are going to look at binary operations in more depth.

The definition of a binary operation


A binary operation is simply a function that takes two input values and outputs one
value. More precisely, it takes an ordered pair (whose entries are known as operands) from the Cartesian product of two sets and produces an element in another set. Using our notation:
f: A × B → C

An operation can be anything! For example, sexual reproduction within the set of
mammals can be considered a binary operation. It takes an ordered pair from the subsets of
males and females and produces another member of the set of mammals. More formally:
sexual reproduction : male mammals × female mammals → mammal

Within the number systems, addition is a good example of a binary operation. Let's define
it for the real numbers:
f: ℝ × ℝ → ℝ
f(x, y) = x + y

You'll notice that with binary operations, we don't use the usual function notation of
f(x, y), but instead, we use a symbol with the first element on the left side and the second
element on the right side. I want to remind you that binary operations are a general
concept – that is, they can be anything – so I will use the unusual ֎ symbol when I want
to talk about operations in general.

Properties
Operations can have several properties, most of which are probably familiar to you from
grade school. But again, it's important to spell these out to understand abstract algebra.
Here are the properties for a set S:

1. Identity: There exists an identity element, e ∈ S, such that for all a ∈ S, a ֎ e = e ֎ a = a. This element, e, is unique, and it is called the identity element of the group.
2. Associativity: If a, b, c ∈ S, then a ֎ (b ֎ c) = (a ֎ b) ֎ c.
3. Invertibility: For every a ∈ S, there exists an a⁻¹, such that a ֎ a⁻¹ = a⁻¹ ֎ a = e, where e is the identity element identified in rule one.
4. Closure: For every a, b ∈ S, a ֎ b produces an element c that is also in the set S. That is, f: S × S → S.
5. Commutativity: If a, b ∈ S, then a ֎ b = b ֎ a.

Okay, now that we defined binary operations and their properties, we can move on to
discuss important algebraic structures.

Groups
A group builds upon the concept of a set by adding a binary operation to it. We denote a
group by putting the set and the operation in angle brackets (⟨⟩). For example, ⟨A, ֎⟩ for
set A and operation ֎. The operation has to follow certain rules to be considered a group,
namely, the rules of identity, associativity, invertibility, and closure. If the operation ֎
also has the property of commutativity, then it is called an Abelian group (also known as
a commutative group).
In our example set of mammals, the operation of sexual reproduction would not make it a
group because the only property it has is commutativity.
Now, let's look at a mathematical example. What if we define ֎ to be addition over the
natural numbers ℕ denoted ⟨ℕ, ֎⟩ – is this a group? Well, let's go through the properties
and see if it fulfills each one.

1. Identity: Does there exist an identity element e, such that a + e = e + a = a? Well, yes, if we define e = 0!
2. Associativity: Does a + (b + c) = (a + b) + c? Yes!

3. Invertibility: For every a ∈ ℕ, is there an a-1, such that a + a-1 = a-1 + a = e, where
e is the identity element identified in rule one (e = 0)? Hmmm, this is a tough one.
So, ℕ starts at zero and goes to positive infinity, but it does not include the negative
numbers. Without negative numbers, there is no way to define an inverse of a that
when added to a, will always equal zero.
4. Closure: If we take two numbers, a and b, in ℕ, then does a + b produce a natural
number, c? Well, zero is taken care of because it is our identity element for rule one.
1 + 2 = 3, and 3 is also in ℕ. How about 100,000 + 200,000? Well, that equals
300,000, and that is also in ℕ. So, no matter how large we pick two numbers in ℕ,
they will always produce another number in ℕ, as it goes to positive infinity!

So, there you have it. It ends up that addition with the set ℕ is not a group. Is there a set
that would work? Why yes! The set of all integers, ℤ! We can then define a-1 to be –a,
and suddenly, invertibility is fulfilled because a + (-a) = 0! Therefore, the set ℤ with the
operation of addition qualifies as a group! Since addition is commutative as well, the
group is also an Abelian group.
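If a numerical sanity check helps, the following Python sketch (my own illustration, not a proof) samples some integers and confirms each property of ⟨ℤ, +⟩ that we just walked through:

```python
# A numerical spot check (an illustration, not a proof): sample some integers
# and test each group property of <Z, +> discussed above.
import random

a_vals = [random.randint(-50, 50) for _ in range(100)]
b_vals = [random.randint(-50, 50) for _ in range(100)]
c_vals = [random.randint(-50, 50) for _ in range(100)]
e = 0                                                               # the identity element

assert all(a + e == a == e + a for a in a_vals)                     # identity
assert all(a + (b + c) == (a + b) + c
           for a, b, c in zip(a_vals, b_vals, c_vals))              # associativity
assert all(a + (-a) == e for a in a_vals)                           # invertibility
assert all(isinstance(a + b, int) for a, b in zip(a_vals, b_vals))  # closure
assert all(a + b == b + a for a, b in zip(a_vals, b_vals))          # commutativity
print("All sampled properties hold, so <Z, +> looks like an Abelian group")
```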

Fields
Fields extend the concept of groups to include another operation. Now, mathematicians
end up defining fields with the familiar symbols of ⋅ and +, and they even call them
multiplication and addition, but hopefully by now, you can see that in abstract algebra, these
are just general terms that can mean anything. So, without further ado, let's define a field.
A field is a set (denoted by S) and two operations (+ and ⋅) that we will notate as {S, +, ⋅},
which follows these rules:

1. ⟨S, +⟩ is an Abelian group with the identity element e = 0.


2. If you exclude the number 0 from the set S to produce a new set S', then ⟨S', ⋅ ⟩ is an
Abelian group with the identity element e = 1.
3. For the rule of distributivity, let a, b, c ∈ S. Then, a ⋅ (b + c) = a ⋅ b + a ⋅ c.

The set of real numbers ℝ, with the operations of addition and multiplication, is the most
obvious example of a field, but there are plenty of others.

Exercise 2
Is the set ℝ, with the operations of subtraction and division {ℝ, −, ÷}, a field?

Vector space
Now that we have covered all of the abstract concepts we need to understand, we can give
a formal definition of a vector space, before looking at the implications of these in the
following chapters.
A vector space is defined as having the following mathematical objects:

1. An Abelian group ⟨V,+⟩ with an identity element e. We call members of the set V
vectors. We define the identity element to be the zero vector, and we denote this by
0. The operation + is called vector addition.
2. A field {F, +, ⋅}. We say that V is a vector space over the field F, and we call the
members of F scalars.

The Zero Vector Is Not Denoted by |0⟩


It is important to note that we denote the zero vector with a bold zero – 0 – and
it is totally different from the vector |0⟩ we defined earlier in the book. This is
the convention in quantum computing.

We can define an additional operation as scalar multiplication, which is an operation between a scalar and a vector defined as follows:

• Let a scalar s ∈ F and the vector |v⟩ ∈ V. Scalar multiplication is a binary operation, f: F × V → V. The multiplicative identity element of the field F of scalars is 1: |v⟩ ⋅ 1 = 1 ⋅ |v⟩ = |v⟩.

This new operation – scalar multiplication – must also be compatible or work with addition
and multiplication from our field F of scalars in rule two. It also has to be compatible with
the operation of the vector addition defined in rule one. More formally, let α, β ∈ F and |u⟩,
|v⟩ ∈ V. Or in other words, α and β are scalars and |u⟩ and |v⟩ are vectors:

• Scalar multiplication is compatible with field multiplication:
α(β|v⟩) = (αβ)|v⟩
• Distributivity of scalar multiplication with respect to vector addition:
α(|u⟩ + |v⟩) = α|u⟩ + α|v⟩
• Distributivity of scalar multiplication with respect to field addition:
(α + β)|v⟩ = α|v⟩ + β|v⟩

These two mathematical objects – an Abelian group ⟨V, +⟩ of vectors and a field {F, +, ⋅}
of scalars along with the operation of scalar multiplication – are all we need to define a
vector space! That's it! That wasn't too bad, was it?
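As a quick numerical illustration (a sketch of mine using NumPy arrays over the field ℝ, not something required by the definition), you can check the scalar multiplication rules on concrete vectors and scalars:

```python
# A numerical sanity check of the scalar multiplication rules, using NumPy
# arrays over the field R. This is only an illustration of the axioms above.
import numpy as np

alpha, beta = 2.0, -3.5                 # scalars from the field R
u = np.array([1.0, 4.0, -2.0])          # vectors from V = R^3
v = np.array([0.5, -1.0, 3.0])

assert np.allclose(1 * v, v)                                  # multiplicative identity of the scalars
assert np.allclose(alpha * (beta * v), (alpha * beta) * v)    # compatible with field multiplication
assert np.allclose(alpha * (u + v), alpha * u + alpha * v)    # distributes over vector addition
assert np.allclose((alpha + beta) * v, alpha * v + beta * v)  # distributes over field addition
print("All scalar multiplication rules check out on this example")
```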
Any guesses which field our vector spaces in quantum computing are concerned with?
If you answered the field of complex numbers, ℂ, give yourself three extra bonus points!
However, for the next few chapters, I will stick to the field of real numbers, ℝ, as this
makes it easier to get the concepts across without getting caught up in the extraneous
complexities inherent in ℂ. Don't worry, there will be a whole chapter on complex
numbers, and the latter half of the book will use this field almost exclusively.
Also, since vectors are the name we give to members of the set V, they can be any
mathematical object. In quantum computing, they are n-tuples, but in mathematics
and quantum mechanics, they can be anything, including functions, matrices, and
polynomials. You do not need to worry about this for quantum computing, but you
should know about it, as it will clarify why definitions are given in a generalized way to
accommodate all of these possible things being vectors in a vector space.

Summary
In this chapter, we have built a solid foundation that will carry us through the rest of this
book. We started with two fundamental mathematical concepts: sets and functions. From
there, we defined a binary operation as being a function with two input values from sets.
We combined all of these concepts to create groups, fields, and our ultimate goal, vector
spaces. In the next chapter, we will look at all the great things we can do with these vectors
that live in vector spaces!

Answers to Exercises
Exercise 1
Parts B and D are not functions.

Exercise 2
No, this is not a field. ⟨ℝ, −⟩ is not an Abelian group; subtraction is not commutative.

Works cited
• "NumberSetinC" by HB is licensed under CC BY-SA 4.0.
• File:Cartesian Product qtl1.svg - Wikimedia Commons.
• "Cartesian Coordinate System" by K. Bolino is licensed under CC BY-SA 3.0.
• "Example Function" by Bin im Garten is licensed under CC BY-SA 3.0.
• "Example of Illegal Function" by Bin im Garten is licensed under CC BY-SA 3.0.
4
Vector Spaces
The entire last chapter led up to defining vector spaces. Now we will see how some vector
spaces can be subsumed by other vector spaces. We'll revisit linear combinations to talk
about linear independence. We will also learn new ways to define vector spaces with just
a small set of vectors. Finally, while we used the word dimension previously to describe a
vector space, we will attain a mathematical definition for it in this chapter.
In this chapter, we are going to cover the following main topics:

• Subspaces
• Linear independence
• Span
• Basis
• Dimension

Subspaces
Let's say you have a set, U, of vectors and it is a subset of a set, V, of vectors (U ⊆ V). This
situation is shown in the following diagram:

Figure 4.1 – The set U as a subset of V


Is it possible that U is a subspace of V? Well, yes. It has met the first condition for
subspaces, namely, that the potential subspace has to be a subset of the bigger vector
space's set of vectors. What's next? Well, U also has to be a vector space using the same
field as the vector space V. This seems like it might be an exhaustive thing to do, but it
has been proven that we only need to check for three small conditions to make sure U is
a subspace of V, and two of them have to do with the closure property from Chapter 3,
Foundations. As a reminder, here it is:

• Closure: For every a,b ∈ A, a ֎ b produces an element, c, that is also in the set
A. f: A × A → A

Armed with the concept of a subset and closure, we can now define a subspace.

Definition
For a subset, U, of a vector space, V, with an associated field, F, of scalars, U is a
subspace if:

1. The 0 vector is included in the subset U (0 ∈ U).
2. The subset U is closed for vector addition. If |u⟩ and |v⟩ ∈ U, then |u⟩ + |v⟩ ∈ U.
3. The subset U is closed for scalar multiplication. If |u⟩ ∈ U and s is a scalar, then s|u⟩ ∈ U.

We are basically trying to ensure that the following situation doesn't occur:

Figure 4.2 – Ensuring closure for a potential subspace


Alright, enough with formality! Let's look at concrete examples!

Examples
Let's use two-dimensional real space, ℝ2, as our overall vector space to keep things easy.
As you know, ℝ2 is the complete Cartesian coordinate system. What if we picked a line on
that X-Y plane to possibly be a subspace, shown as follows:

Figure 4.3 – The graph of y = x



As you can see, we picked the line y = x. Let's call every vector on that line (for example,
(1,1), (2,2), (-3, -3)) the set, W, of vectors. Let's represent the set in our set builder notation
from Chapter 3, Foundations:
$$W = \left\{ \begin{bmatrix} a \\ b \end{bmatrix} : a, b \in \mathbb{R},\ a = b \right\}$$

Equivalently, we could write:


$$W = \left\{ \begin{bmatrix} a \\ a \end{bmatrix} : a \in \mathbb{R} \right\}$$

Using our graph, we can see that our set, W, is a subset of ℝ2 but is not equal to
ℝ2 since vectors such as (1,3) and (2,3) are not in our set. Now, let's check whether W is a
subspace of ℝ2.

1. The 0 vector is included in the subset U (0 ∈ U).
Check! The 0 vector in ℝ2 is (0,0), and that meets the condition of our subset that the coordinates equal one another; so we're good here!
2. The subset U is closed for vector addition. If |u⟩ and |v⟩ ∈ U, then |u⟩ + |v⟩ ∈ U. Hmmm, we'll have to work this one out:
$$\begin{bmatrix} x \\ x \end{bmatrix} + \begin{bmatrix} y \\ y \end{bmatrix} = \begin{bmatrix} x+y \\ x+y \end{bmatrix}$$
It should be obvious that x + y = x + y, so we're all good and we can check this condition off the list.
3. The subset U is closed for scalar multiplication. If |u⟩ ∈ U and s is a scalar, then s|u⟩ ∈ U. Alright, let's work this one out too:
$$s\begin{bmatrix} a \\ a \end{bmatrix} = \begin{bmatrix} sa \\ sa \end{bmatrix}$$
Again, it should be obvious that sa = sa, so, baa-ding! Check again!

So, all three conditions were true, and therefore W is a subspace of ℝ2.
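Here is a small NumPy sketch (my own illustration; checking a few sample vectors is not a proof) that mirrors the three conditions we just verified for the line y = x:

```python
# A NumPy sketch (checking sample vectors is an illustration, not a proof)
# mirroring the three subspace conditions for the line y = x.
import numpy as np

def in_W(vec):
    """Membership test for W = {(a, a) : a in R}."""
    return np.isclose(vec[0], vec[1])

u = np.array([2.0, 2.0])
v = np.array([-3.0, -3.0])

print(in_W(np.zeros(2)))    # True - condition 1: W contains the zero vector
print(in_W(u + v))          # True - condition 2: closed under vector addition
print(in_W(4.2 * u))        # True - condition 3: closed under scalar multiplication
```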
Let's try another line on the X-Y plane like the one in the following diagram:

Figure 4.4 – The graph of y = x + 1


So, what do you think; is this one a subspace of ℝ2? I can tell you right now, the answer
is no. It fails the very first condition: it does not contain the zero vector. Alrighty, now it's
your turn.

Exercise 1
Test the following subsets of ℝ2 and determine which ones are subspaces and which are
not. Assume all variables are real numbers.
1. $\left\{ \begin{bmatrix} a \\ b \end{bmatrix} : a > 0 \right\}$
2. $\left\{ \begin{bmatrix} 0 \\ a \end{bmatrix} \right\}$
3. $\left\{ \begin{bmatrix} a \\ a \end{bmatrix} : a \geq 0 \right\}$

Linear independence
So, it ends up that these vectors got together and wrote a declaration of independence
and that's what we'll cover here. Just joking! We do need humor every so often in a math
book. To explain linear independence, we need to go back to the concept of a linear
combination that we introduced earlier in this book.

Linear combination
We learned in Chapter 2, Superposition with Euclid, that linear combinations are the
scaling and addition of vectors. I would like to give a more precise definition as we go
beyond three-dimensional space.
A linear combination for vectors |x1⟩,|x2⟩, … |xn⟩ and scalars c1, c2, … cn in a vector space,
V, is a vector of the form:
c1|x1⟩ + c2|x2⟩ + ⋯ + cn|xn⟩    (1)
Basically, it is still scaling and addition, but now we can do it for vectors of any dimension
and with as many finite numbers of vectors as we wish.
Let's look at an example:
 1   −1   2   2   −3   −4   −5 
             
0  2  3 0   6   −6 0 
2 +3 −2 = + + =
 −1   −2   −2   −2   −6   4   −4 
             
 2   0   1   4   0   −2   2 

So now that we have defined linear combinations, let's look at the antonym of linear
independence – linear dependence.

Linear dependence
If you have a set of vectors and you can create a linear combination of one of the vectors
from a subset of the other vectors, then all of those vectors are linearly dependent. Let's
look at some examples.
The following vectors are linearly dependent:
1  5   13 
     
d = 2  e = 6  f =  18 
3  7   23 
     

This is because we can create |f⟩ from a linear combination of |d⟩ and |e⟩ using the scalars
3 and 2, as shown here:
3   10   13 
     
3 | d 〉 + 2 | e〉 =  6  +  12  =  18 
9   14   23 
     

Are |0⟩ and |1⟩ linearly dependent? No, because there are no scalars that you can multiply
|0⟩ by to get |1⟩ and vice-versa. You should quickly verify this.
How about these vectors?
 5   −10   4 
     
−7 −3 8 
g =  h =  f =
 3   6   9 
     
 1   8   −3 

These are not linearly dependent as there are no scalars that you can use to create one
out of a subset of all three. Now, you might rightly ask, how do you know? That is a very
good question and there are whole branches of computational linear algebra dedicated
to just this problem. The short answer is that there are several methods, some easier than
others. However, I'm going to give you a pass. The methods are not needed for quantum
computing, just the concept. You can use a very nice calculator to do the computation
for you, such as Wolfram Alpha (https://www.wolframalpha.com/). Here is a
screenshot from there asking about those three vectors we just defined:

Figure 4.5 – Screenshot from Wolfram Alpha



If you read the screenshot carefully, it gives you a hint to this next part. A set of vectors is
either linearly dependent or linearly independent! They cannot be both and they have to
be one or the other. In fact, that is how we will define linear independence:

A set of vectors is linearly independent if they are not linearly dependent.


So, there you have it, you now know what linear independence is. Be sure not to forget
this concept as it will be important later on.
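If you prefer code to an online calculator, one common approach (sketched below with NumPy; the variable names are mine) is to stack the vectors as the columns of a matrix: the set is linearly independent exactly when the matrix rank equals the number of vectors.

```python
# A NumPy sketch (vector names are mine): stack the vectors as columns and
# compare the matrix rank with the number of vectors.
import numpy as np

d = [1, 2, 3]
e = [5, 6, 7]
f = [13, 18, 23]                       # |f> = 3|d> + 2|e>, so these are dependent
M = np.column_stack([d, e, f])
print(np.linalg.matrix_rank(M) == 3)   # False -> linearly dependent

g = [5, -7, 3, 1]
h = [-10, -3, 6, 8]
k = [4, 8, 9, -3]                      # the vector called |f> in the second example
N = np.column_stack([g, h, k])
print(np.linalg.matrix_rank(N) == 3)   # True -> linearly independent
```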

Span
In the section on subspaces, we used set builder notation to define possible candidates for
subspaces. There is a better way to do this, however, using something called the span. The
span uses a set of distinct, indexed vectors to generate a vector space. How does it do this?
It uses every possible linear combination of the set of vectors. As you hopefully see by
now, linear combinations are at the heart of linear algebra.
So, let's start with a set, S, of vectors with just one vector, like the following:
  2  
S =   
2
   

Let's look at our one vector graphically:

Figure 4.6 – Graph of the one vector in S


Okay, what would be its span? Or, in other words, what are all the vectors that are linear
combinations of this one vector? Well, we can't add because all we have is one vector. So
all we can do is scale this vector. If we scale it for all real numbers, it becomes a ….. line!
We would say that S spans the space, or span(S) generates the space. So, our subspace
would look like the following:

Figure 4.7 – Graph of the span of S


Hopefully, this line looks familiar, it is the line y = x from the previous section. So, we
have just created a subspace of ℝ2 with one vector using the span operator.
Let's add a vector and see whether we can generate a more interesting subspace:
  2   −2  
T =  , 
2 −2  
    

Let's look at these vectors graphically:

Figure 4.8 – Graph of the two vectors in set T



As you can see, they lie on the same line. Now, the big question is, what are all their
linear combinations? The answer ends up being disappointing. Yes, we do have another
vector to add now, but since the two vectors are linearly dependent, it buys us nothing.
All their linear combinations end up being the same line we had before, y = x. So,
span(S) = span(T).
To really get something, we need to add vectors that are linearly independent to the vectors
already in our spanning set. Let's go ahead and do that:
  2   −2   0  
U =  , ,  
2 −2   1  
    
The third vector is not a linear combination of the first two. Let's draw them on a graph:

Figure 4.9 – Graph of the three vectors in set U



Alright, now we can do something! What are all the linear combinations of the three
vectors in our set U? I've drawn a few as follows:

Figure 4.10 – Graph of the linear combinations of vectors in the set U


If we continued to draw all the vectors that are linear combinations of the set U, we
would end up filling the entire vector space of ℝ2! That's right; we just defined ℝ2 with
only three vectors:
span(U) = ℝ2

Can we find a set with fewer vectors that spans ℝ2? It just so happens that we can! The
following set will do the job:
  0   3  
W =  ,  
3 3
     
This set of vectors is linearly independent. This condition of linear independence
for a spanning set ends up being so important that we give spanning sets of linearly
independent vectors a special name – a basis.

Basis
The word basis is used often in English speech and its colloquial definition is actually a
good way to look at the word basis in linear algebra:

Basis, ba·sis \ ˈbā-səs \ plural bases\ ˈbā-ˌsēz \ Noun


Something on which something else is established or based. Example 1: Stories
with little basis in reality. Example 2: No legal basis for a new trial.

The reason for this is that you can choose different bases for a vector space. While the
vector space itself does not change when you choose a different basis, the way things are
described with numbers does.
Let's look at an example in ℝ2. Consider the vector |u⟩, given as follows:

Figure 4.11 – Graph of the vector |u⟩


Clearly, its coordinates are (3,3). What if I told you I could describe the same vector
with the coordinates (3,0)? Wait a minute; that should disturb you. We never talk about
the basis in most math classes because we implicitly use the standard basis. What is the
standard basis, you may ask? It is the set of vectors whose components are all zeros, except
one component that has the number one. For ℝ2, they are:
1 0
0 =  1 = 
0 1
You will notice that these vectors are our familiar zero- and one-state vectors. This is no
accident. In quantum computing, we call the standard basis the computational basis, and
that is the term we will use in this book.

Okay, now remember when I said you could write any vector as a linear combination of a
spanning set? Well, since a basis is a spanning set, this is true for a basis as well. But what's
special about a basis is that there is a unique linear combination to describe each vector
and the scalars used in this unique linear combination are called coordinates.
Let's look at this for our example vector |u⟩. The unique linear combination to describe |u⟩
using the standard basis is:
3
x 0 +y 1 = 
3
1 0 3
3 +3  =  
0 1 3
I purposely named the scalars x and y because they correspond to the x and y coordinates
according to the computational basis.
Now, I am going to describe ℝ2 with a different basis, F, shown as follows:
1  −1 
F= { f1 , f 2 } where f1 =   and f 2 = 
1  1 

Let's look at |f1⟩ and |f2⟩ on our standard coordinate system:

Figure 4.12 – Graph of the vectors in set F



This basis is just as valid as the computational basis from before because every vector
in ℝ2 can be uniquely defined by a linear combination of these two vectors. So, the next
question is: What is the linear combination that describes |u⟩? You can probably make out
geometrically what the unique linear combination is that describes |u⟩, but let's write it out
algebraically as well:
3
x f1 + y f 2 =  
3
1  −1   3 
3  +0 = 
1  1  3
So, the scalars for this linear combination are (3,0) and these are the coordinates of |u⟩
according to our basis, F. Let's look at |u⟩ in our new coordinate system:

Figure 4.13 – Graph of |u⟩ in the F basis


Notice that |u⟩ is still the same vector according to our very first definition of a vector
as being a line segment with a length and direction. But with a new basis, we just use
different coordinates to describe it. As Shakespeare wrote in Romeo and Juliet:

What's in a name? That which we call a rose


By any other name would smell as sweet;
Or, in other words:

What's in coordinates? That which we call a vector


By any other coordinates would still be the same vector;

So, I have now shown you how I can describe the same vector with different components
or coordinates. To denote the basis we are writing a vector in, we use a subscript like so,
where C denotes the computational basis:
3 3
  = 
 3 C  0  F
If no basis is given, then we assume we are using the computational basis. It should be
noted that the basis has to be indexed and ordered so that our coordinates do not get
mixed up (for example, (0,3) does not equal (3,0) in the computational basis).
I know all of this may be a little confusing, but I need to blow your mind a little to prepare
you for what lies ahead. In the next chapter, I will show you how to use matrices to travel
between bases easily.
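Until then, here is a small NumPy sketch (the variable names are mine) showing that finding coordinates in a new basis is just solving a small linear system:

```python
# A NumPy sketch (the names are mine): the coordinates of |u> in the basis F
# solve the linear system B @ coords = u, where the columns of B are |f1>, |f2>.
import numpy as np

f1 = np.array([1.0, 1.0])
f2 = np.array([-1.0, 1.0])
u = np.array([3.0, 3.0])            # |u> written in the computational basis

B = np.column_stack([f1, f2])
coords = np.linalg.solve(B, u)
print(coords)                        # [3. 0.]  ->  |u> = 3|f1> + 0|f2>

# Sanity check: rebuild |u> from its F-coordinates.
print(coords[0] * f1 + coords[1] * f2)   # [3. 3.]
```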

Dimension
The dimension of a vector space ends up being very easy to define once you know what
a basis is. One thing we didn't talk about in our previous section is that the basis is the
minimum set of vectors needed to span a space, and that the number of vectors in a basis
for a particular vector space is always the same.
From this, we define the dimension of a vector space to be equal to the number of vectors
it takes as a basis to describe a vector space. Equivalently, we could say that it is the
number of coordinates it takes to describe a vector in the vector space. It follows that the
dimension of ℝ2 is two, ℝ3 is three, and ℝn is n.

Summary
We have covered a bit of ground around describing vector spaces in this chapter. We've
seen how a vector space has subspaces and how to test whether a set of vectors is a
subspace. We've rigorously defined linear combinations and derived the concept of linear
independence from it. We've also learned multiple ways to describe a vector space through
the span and basis. From this, we've learned the true meaning of coordinates and put all
that together to define the dimension of a vector space. In the next chapter, we will look at
how to transform vectors in these vector spaces using matrices!

Answers to exercises
Exercise 1
1. No, it is not a subspace. It does not contain the zero vector.
2. Yes, it is a subspace!
3. No, it is not a subspace. It is not closed under scalar multiplication.
5
Using Matrices to Transform Space
Linear transformations are one of the most central topics in linear algebra. Now that
we have defined vectors and vector spaces, we need to be able to do things with them. In
Chapter 3, Foundations, we manipulated mathematical objects with functions. When we
manipulate vectors in vector spaces, mathematicians use the term linear transformations.
Why the change of terminology? As with most things in linear algebra, the wording
is inspired by Euclidean geometry. We will see that, geometrically, these "functions"
actually "transform" vectors from one direction and length to another. But this visual
transformation has been generalized algebraically to all types of vectors (n-tuples of
numbers, functions, and so on).
We also go through the crucial link between linear transformations and matrices.
The most important point of this chapter is that linear transformations can always be
represented by matrices when the vector spaces are finite (which are the only ones we
use in this book). The only caveat is that this is not a one-to-one relationship but rather
a one-to-many relationship in that each linear transformation can be represented by
multiple matrices.

A word of caution about terminology. Some mathematicians call a linear transformation a linear mapping. Physicists and quantum computing practitioners use the term linear operator, which we will consider as a special type of linear transformation at the end of this chapter.
In this chapter, we are going to cover the following main topics:

• Linearity
• What is a linear transformation?
• Representing linear transformations with matrices
• Transformations inspired by Euclid
• Linear operators
• Linear functionals
• A change of basis

Linearity
What makes a transform linear? This question gets to the heart of linear algebra. The
concept of linearity ties together all the other concepts we have considered so far and the
ones to come. Indeed, quantum mechanics is a linear theory. That's what makes linear
algebra crucial to understanding quantum computing.
Before I define linearity, let's look at what it is not. Real-life examples of non-linearity
abound. For example, exercising 1 hour a day for 24 days does not give the same result as
exercising 24 hours in 1 day. Watering a plant is another good non-linear example. Giving
a plant 1 gallon of water a day for 100 days will be much better than giving it 100 gallons
in 1 day. These are both examples of non-linear relationships. How much you put in does
not always translate to what you get out.
Linear relationships, on the other hand, are proportional. Speed is a good example. If you
go 20 mph for 1 hour, you'll cover 20 miles. If you go 1 mph for 20 hours, you'll still cover
20 miles. Exchange rates for money are also a good example. If the current rate is 2 dollars
to a euro, then if I give you 4 dollars, you'll give me 2 euros. I can also give you 1 dollar 4
times and get the same result.

Graphs for these types of relationships are straight lines, such as this one for the
euro example:

Figure 5.1 – A graph of the dollars to euros function


Crucially though, linearity requires functions to return 0 if 0 is the input to the function.
So, straight-line functions that don't pass through the origin do not get ascribed the
property of linearity. In other words, if I go 0 mph, I should cover 0 miles, and hopefully,
you won't give me euros back if I give you 0 dollars (but you'd be my best friend if you did).
In mathematics, we want to generalize this concept so that it works in many situations.
We are all familiar with a line through the origin. Let's take the simplest example, y = x, as
shown in the following graph:

Figure 5.2 – The line y = x



What can we generalize about this line? Well, the line has a constant slope, namely one,
so that if I increase its slope, it consequently raises the output by a proportional amount.
For example, let's say I increase the slope to three (y = 3x). Well, now instead of y being 3 at x = 3, it will be 3 · 3, or 9. This property has been generalized into something called
homogeneity and is defined thusly:

αf(x) = f(αx), where α is a scalar
Here is a small table showing values for our function y = x and the different values of α:

Table 5.1 – αf(x) and f(αx) at different values of x and α


What else can we generalize about our line y = x? Well, if I added the value of y at x = 3
to the value of y at x = 4, it would equal the value of y at x = 7. This works no matter what
slope I give the line, too. Here's another table of values to prove it to you:

Table 5.2 – f(x), f(z), and f(x+z) at different values of x and z



This property is called additivity and is defined this way:

f(x + y) = f(x) + f(y)
These two properties, additivity and homogeneity, define linearity. To be a linear
transformation, a transformation must have linearity.

What is a linear transformation?


To be precise, a linear transformation is a function T from a vector space U to a vector
space V. A capital letter "T" is traditionally used by mathematicians to denote a generic
transformation, and we use the same syntax that we introduced for functions in
Chapter 3, Foundations:
T :U → V
Similarly, the vector space U is the domain, and the vector space V is the codomain. Each
vector that is transformed is called the image of the original vector in the domain. The set
of all images is the range.
To be linear, the transformation must preserve the operations of vector addition and
scalar multiplication by meeting the conditions for linearity. Here, we express them in
terms of vectors:

T(|x⟩ + |y⟩) = T(|x⟩) + T(|y⟩)
T(s|x⟩) = sT(|x⟩), where s is a scalar
It follows from these axioms that for any linear transformation T, T (0) has to equal the
zero vector 0. Let's look at how we describe transformations in the next section.

Describing linear transformations


There are many ways to describe a linear transformation. In the case of Euclidean vectors,
you can describe a linear transformation geometrically. If you are dealing with n-tuples
of numbers, you can describe the effect of the transformation on those numbers. Due to
the concept of linear combinations, you can also describe how the transformation changes
just the basis vectors. Let's go through each of these in depth.

A geometric description
To show an example of a geometric description, let's use reflection. Reflection is a very
easy and intuitive transformation, as seen in the following diagram. We will call our
reflection transformation "R":

Figure 5.3 – A graphical depiction of the reflection transformation


One vector is the axis of reflection, and you reflect vectors about it by drawing a dashed
line that is perpendicular to the axis of reflection from the tip of the original vector (|x⟩
in the diagram). Then, place the tip of the reflected vector R(|x⟩) equidistant from the axis
of reflection.
Okay, now that we've described this transformation, let's check to see whether it is linear.
First, we'll check for homogeneity. Homogeneity for vector transformations is defined as:
T(s|x⟩) = sT(|x⟩)

We will set s = 2, and you can see in the following diagram that our reflection
transformation, R, does indeed pass the test for homogeneity:

Figure 5.4 – A test reflection for homogeneity



Now, let's test the other condition for linearity, additivity. As you should recall, additivity
for a linear transformation is defined as:

T(|x⟩ + |y⟩) = T(|x⟩) + T(|y⟩)
The following diagram shows a reflection for two vectors, |x⟩ and |y⟩. Note that R(|y⟩) is
the same as |y⟩ because |y⟩ is on the axis of reflection:

Figure 5.5 – A reflection of |y⟩


Now, look at the following diagram closely:

Figure 5.6 – A test for additivity


We moved the start point of |y⟩ and R(|y⟩) to the end of |x⟩ and R(|x⟩), respectively, because Euclidean vectors are equal as long as they retain their length and direction,
as we explained in Chapter 1, Superposition with Euclid. From the diagram, you should
be able to make out that additivity holds for reflections as well. Therefore, our reflection
transformation is a linear transformation.

An algebraic description
Let's look at another way you can describe a linear transformation. If you are dealing
with n-tuples of numbers in ℝn, then you can say explicitly what the transform does to an
n-tuple. Let's look at an example.
First, I need to define the domain and codomain of the transform, like so:
T: ℝ² → ℝ³

Then, I can define what the transform does:


T(x, y) = (2x + y, x − y, x + y)

We represent n-tuples as column vectors, so this can be rewritten as:


$$T(|x\rangle) = T\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 2x + y \\ x - y \\ x + y \end{bmatrix}$$
The preceding equations fully define the linear transformation for every vector in our
domain, ℝ2. Here are a few instances of the transformation in action:

 7 
 2   
T     =  −1 
 3 
   5 
 
 −3 
  −2    
T    = −3 
  1   
   −1 
 
 15 
 6   
T   = 3 
 3 
   9 
 
We should make sure this transformation is indeed linear as well. Let's do homogeneity first:
T(s|x⟩) = sT(|x⟩)

Let's see how our transformation does with this:

$$T(s|x\rangle) = T\left( \begin{bmatrix} sx \\ sy \end{bmatrix} \right) = \begin{bmatrix} 2sx + sy \\ sx - sy \\ sx + sy \end{bmatrix}$$
$$sT(|x\rangle) = sT\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = s\begin{bmatrix} 2x + y \\ x - y \\ x + y \end{bmatrix} = \begin{bmatrix} 2sx + sy \\ sx - sy \\ sx + sy \end{bmatrix}$$

Alright, it passes! Let's try additivity:

T(|x⟩ + |y⟩) = T(|x⟩) + T(|y⟩)

Here's the test:

$$T(|x\rangle + |y\rangle) = T\left( \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} w \\ z \end{bmatrix} \right) = T\left( \begin{bmatrix} x+w \\ y+z \end{bmatrix} \right) = \begin{bmatrix} 2(x+w) + (y+z) \\ (x+w) - (y+z) \\ (x+w) + (y+z) \end{bmatrix} = \begin{bmatrix} 2x + 2w + y + z \\ x + w - y - z \\ x + w + y + z \end{bmatrix}$$
$$T(|x\rangle) + T(|y\rangle) = T\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) + T\left( \begin{bmatrix} w \\ z \end{bmatrix} \right) = \begin{bmatrix} 2x + y \\ x - y \\ x + y \end{bmatrix} + \begin{bmatrix} 2w + z \\ w - z \\ w + z \end{bmatrix} = \begin{bmatrix} 2x + y + 2w + z \\ x - y + w - z \\ x + y + w + z \end{bmatrix}$$

Again, it passes, so this transformation is linear. Now it's your turn – are the following
transforms linear?

Exercise one
$$T: \mathbb{R}^4 \to \mathbb{R}^3 \qquad T(|x\rangle) = T\left( \begin{bmatrix} x \\ y \\ w \\ z \end{bmatrix} \right) = \begin{bmatrix} w \\ y \\ z \end{bmatrix}$$
$$U: \mathbb{R}^2 \to \mathbb{R}^3 \qquad U(|x\rangle) = U\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} x + y \\ y \\ x^2 \end{bmatrix}$$
$$Y: \mathbb{R}^3 \to \mathbb{R}^3 \qquad Y(|x\rangle) = Y\left( \begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x + y \\ 2y \\ z + 2 \end{bmatrix}$$

A basis vectors description


I'd like to show you one more way to describe a linear transformation. There are many
more ways to describe a linear transformation, but I think these three are a good way
to start.

Since any vector in a vector space can be expressed as a linear combination of a set of basis
vectors, if you describe what the transform does to a set of basis vectors, you've described
the complete transformation. Let's show this through an example.
We'll start with our computational basis vectors |0⟩ and |1⟩. I'll describe what my
transform does to these two vectors:
T: ℝ² → ℝ²
$$T(|0\rangle) = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad T(|1\rangle) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

I have now fully described the transformation for every vector in my domain ℝ2. Let's take
a random vector, |x⟩ in ℝ2, and work out its transformation:

3
| x〉 =   = 3 | 0〉 + 2 | 1〉
2

You'll notice that I've expressed the vector |x⟩ as a linear combination of our basis
vectors. By definition of a basis, I can do this for any vector in ℝ2. Now, I will apply the
transformation to the linear combination:

T(|x⟩) = T(3|0⟩ + 2|1⟩)
Due to additivity, I can do this:
T(3|0⟩ + 2|1⟩) = T(3|0⟩) + T(2|1⟩)
Due to homogeneity, I can do this:
T(3|0⟩) + T(2|1⟩) = 3T(|0⟩) + 2T(|1⟩)
I can then substitute the values of T(|0⟩) and T(|1⟩) in:
$$3T(|0\rangle) + 2T(|1\rangle) = 3\begin{bmatrix} 1 \\ 2 \end{bmatrix} + 2\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 6 \end{bmatrix} + \begin{bmatrix} 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 7 \\ 8 \end{bmatrix}$$
So $T(|x\rangle) = \begin{bmatrix} 7 \\ 8 \end{bmatrix}$.

Through this example, I hope I've shown that you can describe a linear transformation by
just stating what it does to a set of basis vectors. We'll move on now to matrices!

Representing linear transformations with matrices
Now for the most common and important way of describing a linear transformation, the
matrix. Through the magic of matrix-vector multiplication, a matrix is all you need to
describe a linear transformation.
Again, let's start with an example. I'm going to describe the linear transformation we used
in the An algebraic description section with a matrix. To jog your memory, here is the
aforementioned linear transformation:
T: ℝ² → ℝ³
$$T(|x\rangle) = T\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 2x + y \\ x - y \\ x + y \end{bmatrix}$$
 
Now, here is how I can describe it with a matrix:

T: ℝ² → ℝ³
$$T(|x\rangle) = T\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 2 & 1 \\ 1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

 

I don't even need to be that formal, other than telling you that we are using real numbers;
I can just give you the matrix, and that describes everything. The dimension of the domain
is the number of columns of the matrix, the dimension of the codomain is the number of
rows of the matrix, and the actual transformation is the matrix itself. That is the power of
a matrix!

Let's apply this transformation to the same example vectors we used in the An algebraic
description section using matrix-vector multiplication:

2 1   7 
 2    2   
T     =  1 −1    =  −1 
 3  3
  1 1    5 
   
2 1   −3 
  −2      −2   
T T   = 1 −1    =  −3 
  1    1 
  1 1   −1 
   
2 1   15 
 6    6   
T     =  1 −1    =  3 
 3  3
  1 1    9 
   

We get the same exact answers!
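The same computation is easy to reproduce numerically. This NumPy sketch (mine, not code from the text) builds the matrix above and applies it to the three example vectors:

```python
# A NumPy sketch (not code from the text): the standard matrix of T applied
# to the same three vectors via matrix-vector multiplication.
import numpy as np

A = np.array([[2,  1],
              [1, -1],
              [1,  1]])

for v in ([2, 3], [-2, 1], [6, 3]):
    print(A @ np.array(v))    # [ 7 -1  5], [-3 -3 -1], [15  3  9]
```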

Matrices depend on the bases chosen


Now, let's look to see how we can use a matrix to describe our reflection transformation
from before. The crucial point of this example is that our matrix depends on the basis we
choose to represent it. Look at the transformation and see whether you can determine a good basis to use for it:

Figure 5.7 – A reflection transformation



Let's use the basis set E of |e1⟩ and |e2⟩ in the following diagram:

Figure 5.8 – A reflection transformation with basis vectors


Using this basis, we can see that:
|x⟩ = 2|e1⟩ + |e2⟩
and
R(|x⟩) = 2|e1⟩ − |e2⟩

We can use the scalars of this linear combination as coordinates, as we learned in Chapter 4, Vector Spaces. This gives us the following:
$$|x\rangle = \begin{bmatrix} 2 \\ 1 \end{bmatrix}_E \qquad R(|x\rangle) = \begin{bmatrix} 2 \\ -1 \end{bmatrix}_E$$

where E is our set of basis vectors.

So now, for our transformation R, we need to find a matrix A that represents it. Here is
what I'm trying to say mathematically:

2  2 
Let | x〉 =   then R (| x〉 ) =  
1  −1 
a a12   2   2a11 + 1a12   2 
R(| x〉 ) = A | x〉 =  11  =  = 
 a21 a22   1   2a21 + 1a22  .  −1 

If you look at the equation closely, you should be able to make out what the entries of the
matrix should be. Here they are in all their glory!
1 0  2   2 
R(| x〉 ) = A | x〉 =   = 
 0 −1   1   −1 
This is great! We have now found a matrix that we can multiply any vector by in our
2D space to get its reflection using our basis vectors E. Let's do it for the vector |y⟩ in
this diagram:

Figure 5.9 – A reflection of |y⟩ with basis vectors


Since |y⟩ is on the axis of reflection, it is its own reflection. In terms of our basis vectors
E, |y⟩ and its reflection have the following coordinates:
2
| y〉 = R (| y〉 ) =  
 0 E

Alright, it's time for the moment of truth. Will our matrix A give us the right vector back?
Let's check:
1 0  2   2 
R(| y〉 ) = A | y〉 =   = 
 0 −1   0   0 

It does! I will go ahead and tell you that this matrix will work for any vector that is
expressed in terms of our basis set E.
You should also be able to see that if we picked a different set of basis vectors, the matrix
would be different as well. In our first example, we chose a matrix based implicitly on
the canonical computational basis. This is called the standard matrix of the linear
transformation. But if you change the basis, say to |+⟩ and |-⟩, the matrix will change
as well. You can even get really complex and change the input and output bases of the
transformation to affect a new matrix, but this is rarely done in practice. The takeaway
point of this section is that a matrix can represent a linear transformation, but there are
many matrices that can represent it based on the basis chosen. Finally, matrices that
represent the same linear transformation are called similar.

Matrix multiplication and multiple transformations


The last superpower we will go over for matrices is the fact that not only can they
represent transformations, but they can also represent multiple transformations through
matrix multiplication!
Let's say that we wanted to do our reflection transformation twice. Intuitively, we should
get back the same vector. Using matrix multiplication, we can show that algebraically:

1 0  a   a 
R(| z 〉 ) = A | z 〉 =   = 
 0 −1   b   −b 
1 0 1 0  a  1 0  a   a 
R( R (| z〉 )) = A( A | z 〉 ) =    =  = 
 0 −1   0 −1   b   0 −1   −b   b 

And there you have it – we have just proven that for any vector in ℝ2, doing our reflection
twice returns the same vector.

The commutator
You should remember from Chapter 1, Superposition with Euclid, that matrices do not,
in general, commute. This also means that linear transformations do not commute in
general. Physicists use something called the commutator to represent "how much" two
transformations or matrices commute.
The commutator is defined to be:
[A, B] = AB − BA
for two n × n matrices. This holds for any matrices that represent a linear transformation.
If the commutator is zero for two transformations, then they commute. If it is non-zero,
the operators are said to be incompatible. In quantum mechanics, observables such as
momentum are represented by linear transformations. All of this leads to the famous
uncertainty principle that states that two observables that do not commute cannot be
measured simultaneously.
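Here is a small NumPy sketch (my own example matrices, chosen for illustration) that computes a commutator for two reflections that do not commute:

```python
# A NumPy sketch with example matrices of my choosing: A reflects about the
# x axis and B reflects about the line y = x. Their commutator is non-zero,
# so the two transformations do not commute.
import numpy as np

def commutator(A, B):
    """[A, B] = AB - BA for two square matrices."""
    return A @ B - B @ A

A = np.array([[1,  0],
              [0, -1]])
B = np.array([[0, 1],
              [1, 0]])

print(commutator(A, B))    # [[ 0  2] [-2  0]] -> non-zero, so A and B do not commute
print(commutator(A, A))    # the zero matrix   -> every matrix commutes with itself
```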
Okay, let's move on to something a little less heady and talk about translations, rotations,
and projections.

Transformations inspired by Euclid


In linear algebra, there are many "special" types of linear transformations that have names
that connote concepts we have in our real world, such as reflections and projections. These
concepts have been generalized to apply to all types of vectors, but the geometric description
of them with Euclidean vectors gives us an idea as to why they work the way they do. This
intuition can then be taken and applied to all types of vectors and vector spaces.

Translation
The first transform we will look at is translation. It transforms all vectors in a vector space
by a displacement vector. More precisely:
T(|x⟩) = |x⟩ + |d⟩, where |d⟩ is the vector of displacement
In the following graph, the vector |x⟩ is translated to the right by |d⟩ to form T(|x⟩):

Figure 5.10 – A graphical depiction of translation


What's interesting about this type of translation is that it turns out to be non-linear! I will
quickly show you.
Let's start with additivity and work it out algebraically. First, we will transform two vectors
and add them together:
T(|u⟩) = |u⟩ + |d⟩    T(|v⟩) = |v⟩ + |d⟩

T(|u⟩) + T(|v⟩) = |u⟩ + |v⟩ + 2|d⟩

This should equal the transformation of the two vectors added together:
T(|u⟩ + |v⟩) = |u⟩ + |v⟩ + |d⟩
Unfortunately, they are not equal. In other words:

T(|u⟩) + T(|v⟩) ≠ T(|u⟩ + |v⟩).

Another quick way to prove that a transformation is not linear is to show that the
transform of the zero vector does not return the zero vector:

T(0) ≠ 0
T(0) = |d⟩
Finally, if we draw out the vectors, we can see that the transformation is not linear. The
following diagram is a test for homogeneity and, as you can see, the transform T(2|x⟩)
does not equal 2T(|x⟩):

Figure 5.11 – A test for homogeneity


I could have left this transformation out of this section, but I thought it important to
show you that even intuitive concepts such as translation can be non-linear. Now, let's
look at the linear transformations of projection and rotation. No more trickery – all the
remaining transformations are linear!

Rotation
Everyone has a concept of what a rotation is. We need to take that concept and express
it mathematically. This section will rely a lot on trigonometry. If you need to brush up,
please consult the Appendix chapter on trigonometry.

Okay, let's start with two-dimensional rotations. I will actually define them using the
following graph:

Figure 5.12 – A graphical depiction of a rotation transformation


Describing this in words, given a vector |v⟩ and an angle θ, R(θ) will rotate |v⟩ counterclockwise
through an angle θ about the origin, with the angle measured from the x axis.

Let's see what this transformation does to our two computational basis vectors |0⟩ and |1⟩.
First, |0⟩ – if I rotate |0⟩ by θ radians, what do I get? Let's look at a graph:

Figure 5.13 – The effect of rotation by θ on |0⟩


From the graph, we can tell that the new coordinates for |0⟩ will be:

$$R(\theta)|0\rangle = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$$

Now, on to |1⟩. Let's look at its graph:

Figure 5.14 – The effect of rotation by θ on |1⟩


From the graph, we can tell that:
$$R(\theta)|1\rangle = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix}$$

I have now fully described the transformation both geometrically and through the basis
vectors. What if I want to come up with a matrix for this transformation? Well, there is a
theorem in linear algebra that if I give you the results of a transformation according to the
computational basis, I can then compute the matrix according to this formula:

$$T(|x\rangle) = A|x\rangle$$

if I know

$$T(|0\rangle) \text{ and } T(|1\rangle)$$

then

$$A = \begin{bmatrix} T(|0\rangle) & T(|1\rangle) \end{bmatrix} \qquad (1)$$

In other words, I can use the results I found before of the transform's effect on the
computational basis vectors. Taking those results as column vectors and putting them into
a matrix gives me the standard matrix for the linear transformation! Without further ado,
here is our result:
$$R(\theta)|0\rangle = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} \qquad R(\theta)|1\rangle = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix}$$

$$R(\theta)(|x\rangle) = A|x\rangle$$

then

$$A = \begin{bmatrix} R(\theta)(|0\rangle) & R(\theta)(|1\rangle) \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Based on this, I can give you the result of a rotation for any vector in ℝ2:

$$|v\rangle = a|0\rangle + b|1\rangle = \begin{bmatrix} a \\ b \end{bmatrix}$$

$$R(\theta)|v\rangle = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a\cos\theta - b\sin\theta \\ a\sin\theta + b\cos\theta \end{bmatrix}$$
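As a quick sketch (my own example, not from the text), you can build this matrix in Python with NumPy and apply it to a vector:

import numpy as np

# Build R(theta) and rotate a vector with it.
def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])            # the basis vector |0>
print(rotation(np.pi / 2) @ v)      # ~[0. 1.] -- a 90-degree rotation takes |0> to |1>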

Note that Equation (1) works for any finite number of computational basis vectors as well:

$$T(|x\rangle) = A|x\rangle$$

if I know

$$T(|e_1\rangle), T(|e_2\rangle), \ldots, T(|e_n\rangle)$$

then

$$A = \begin{bmatrix} T(|e_1\rangle) & \cdots & T(|e_n\rangle) \end{bmatrix}$$

where $|e_1\rangle, |e_2\rangle, \ldots, |e_n\rangle$ are the standard computational basis vectors.

Now, on to another intuitive transformation, projection.



Projection
Projection is another linear transformation that is good to be acquainted with in
quantum computing, since it is used heavily in the measurement of qubits. A good way to
conceptually look at it is the way we think about projection in the everyday world. Let's
take the process of taking a picture with a camera. When you do this, you are projecting a
3D world onto a 2D surface.
Also, if you were to take a picture of the picture, you would get the same picture. Doing a
projection twice does not yield a different result.
In this figure, I am projecting a 3D cube onto a two-dimensional plane:

Figure 5.15 – A projection of a cube on a 2-D plane


In the following figure, I am projecting a two-dimensional circle onto a 1D line:

Figure 5.16 – A projection of a two-dimensional circle onto a one-dimensional line



If I project the line again, I will get the same line. It is this feature of projection that
mathematicians have generalized to create a definition for projections. If you have a linear
transformation P, then if the following condition holds, it is a projection:
$$P^2 = P \qquad (2)$$
That's it! Not that bad, huh?
Okay, let's say that we want to come up with a projection matrix for the projection of the
cube onto a plane in Figure 5.15. Let's say that the plane is the X-Y plane and all the points
or vectors for the cube are in ℝ3. So, we need to keep the X-Y coordinates but set the Z
coordinates to zero. We need a matrix that does this:

x x 
   
P y = y 
z  0 
   

If we set P to the following matrix, that should do it:

1 0 0 
 
P= 0 1 0 
0 0 0
 
Let's see if Equation (2) holds:
1 0 0  1 0 0  1 0 0 
     
P =  0 1 0 ⋅ 0 1 0  =  0 1 0 
2

0 0 0 0 0 0 0 0 0
     
Indeed, it does. Now, you get to test it out.
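Before you do, here is an optional one-line check of the same multiplication (a sketch in Python with NumPy, not part of the exercise):

import numpy as np

# The X-Y plane projection matrix is idempotent: P @ P equals P.
P = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 0]])
print(np.array_equal(P @ P, P))   # True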

Exercise two
What is the answer to this problem for a random three-dimensional vector?
1 0 0  3 
  
P | v〉 =  0 1 0   −4 
0 0 0  1 
  

Now, on to a special type of linear transformation.



Linear operators
Linear operators are linear transformations that map vectors from and to the same vector
space. Indeed, reflections, rotations, and projections are all linear operators. In quantum,
we put a "hat" or caret on the top of the letter of the linear operator when we want to
distinguish it from its representation as a matrix. For instance, all the following linear
transformations are linear operators:

$$\hat{A}: \mathbb{R}^2 \to \mathbb{R}^2 \qquad \hat{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

$$\hat{B}: \mathbb{R}^3 \to \mathbb{R}^3 \qquad \hat{B} = \begin{bmatrix} 4 & 5 & 1 \\ 3 & 9 & 8 \\ 4 & 5 & 3 \end{bmatrix}$$

$$\hat{C}: \mathbb{C}^4 \to \mathbb{C}^4 \qquad \hat{C} = \begin{bmatrix} \; & 4 & 2 & -1 \\ -2 & 3 & 6 & 7 \\ 9 & 1 & 2 & 4 \\ 5 & 6 & 7 & 8 \end{bmatrix}$$
Most of the time, it is clear from the context that we are referring to a matrix or a linear
operator, so the caret or "hat" is not used.
The following linear transformations are not linear operators:
$$T: \mathbb{R}^2 \to \mathbb{R}^3 \qquad T = \begin{bmatrix} 1 & 2 \\ 3 & 9 \\ 83 & 7 \end{bmatrix}$$

$$R: \mathbb{R}^6 \to \mathbb{R}^3 \qquad R = \begin{bmatrix} 8 & 9 & 8 & 5 & 4 & 6 \\ 1 & 5 & 3 & 4 & 9 & 4 \\ 6 & 3 & 3 & 5 & 4 & 6 \end{bmatrix}$$

You probably noticed that all linear operators are represented by square matrices.
This leads to all types of special properties that they can have, such as determinants,
eigenvalues, and invertibility. We will take up all these topics in later chapters, but I
wanted to make sure that you knew this term. In quantum computing, you will rarely
see the term linear transformation, but it is a common term used in mathematics. It will
almost always be linear operator in quantum computing, and now you know that it is
just a special type of linear transformation. Also, there is one other special type of linear
transformation I would like to look at.

Linear functionals
A linear functional is a special case of a linear transformation that takes in a vector and
spits out a scalar:
$$f: V \to F \text{ where } V \text{ is a vector space and } F \text{ is the field of scalars } \mathbb{R} \text{ or } \mathbb{C}$$

For instance, I could define a linear functional for every vector in ℝ2:
$$f(|v\rangle) = a + b \text{ where } |v\rangle = \begin{bmatrix} a \\ b \end{bmatrix}$$

So that:

$$f\left(\begin{bmatrix} 3 \\ 2 \end{bmatrix}\right) = 3 + 2 = 5$$

$$f\left(\begin{bmatrix} 5 \\ -2 \end{bmatrix}\right) = 5 - 2 = 3$$

There are many linear functionals that can be defined for a vector space. Here's
another one:
$$g: \mathbb{R}^2 \to \mathbb{R}$$

$$g(|v\rangle) = 2a - 3b \text{ where } |v\rangle = \begin{bmatrix} a \\ b \end{bmatrix}$$
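In matrix terms, a linear functional on ℝ² is just a 1 × 2 row vector. Here is a small sketch (my own illustration) of f and g acting on a vector:

import numpy as np

# Linear functionals on R^2 represented as row vectors.
f = np.array([1, 1])      # f(|v>) = a + b
g = np.array([2, -3])     # g(|v>) = 2a - 3b

v = np.array([3, 2])
print(f @ v)              # 5
print(g @ v)              # 0   (2*3 - 3*2)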
The set of all linear functionals that can be defined on a vector space actually form their
own vector space called the dual vector space. This concept is important to fully define
a bra in bra-ket notation. Please see the Appendix section on bra-ket notation if you are
interested in more information.

A change of basis
We learned in Chapter 4, Vector Spaces, that a vector can have different coordinates
depending on the basis that was chosen, but we didn't tell you how to go back and forth
between bases. In this section, we will.
We want to come up with a matrix – let's call it B for a change of basis – that takes us from
one basis to another. In other words, we want this mathematical formula to work:

 x1   y1 
   
B  = 
 xn   yn 
 C  F
This matrix B will convert the coordinates of a vector according to a basis C to the
coordinates for the vector in the basis F. Now, how do we find this matrix?
Let's look at an example. We will define the basis C as the computational basis and the
basis F this way:

1 0
Basis C = {| 0〉 ,| 1〉} where | 0〉 =   and | 1〉 =  
0 1

1  −1 
Basis F = { f1 , f 2 } where f1 =   and f 2 = 
1  0 

Now, let's look at a random vector, |v⟩, defined in the computational basis C:

1  0  3
| v〉 = 3 | 0〉 + 4 | 1〉 = 3  +4 = 
0  1   4 C (3)
So, what we want to do is to find the coordinates of |v⟩ in the basis F. In other words, we
want to find the variables a and b in the following equation:

$$|v\rangle = a f_1 + b f_2 = \begin{bmatrix} a \\ b \end{bmatrix}_F$$
What would happen if we took our basis vectors in C and multiplied them by our change
of basis matrix B? We would get our basis vectors in C expressed as coordinates in the
basis F, as shown in the following equation:

$$B|0\rangle = |0\rangle_F \qquad (4)$$
$$B|1\rangle = |1\rangle_F$$

Let's take our original Equation (3), multiply it by our change of basis matrix B, and
express it again based on our new-found knowledge from Equation (4):
$$|v\rangle = 3|0\rangle + 4|1\rangle$$

$$B|v\rangle = 3(B|0\rangle) + 4(B|1\rangle) = 3|0\rangle_F + 4|1\rangle_F$$

We can express that very conveniently as matrix multiplication, like so:

$$B|v\rangle = \begin{bmatrix} |0\rangle_F & |1\rangle_F \end{bmatrix}\begin{bmatrix} 3 \\ 4 \end{bmatrix}_C \qquad (5)$$

The next step is to find our basis vectors in C, |0⟩ and |1⟩, expressed as coordinates in F.
To do this, we have to find them expressed as a linear combination of the vectors in the
basis F! We'll start with |0⟩:
$$|0\rangle = c|f_1\rangle + d|f_2\rangle$$

Let's work it all out:

$$|0\rangle = c|f_1\rangle + d|f_2\rangle = c\begin{bmatrix} 1 \\ 1 \end{bmatrix} + d\begin{bmatrix} -1 \\ 0 \end{bmatrix} = 0\begin{bmatrix} 1 \\ 1 \end{bmatrix} + (-1)\begin{bmatrix} -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

$$|0\rangle_F = \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \end{bmatrix}$$

And here are the calculations to obtain the coordinates in F for |1⟩:

$$|1\rangle = g|f_1\rangle + h|f_2\rangle = g\begin{bmatrix} 1 \\ 1 \end{bmatrix} + h\begin{bmatrix} -1 \\ 0 \end{bmatrix} = 1\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 1\begin{bmatrix} -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

$$|1\rangle_F = \begin{bmatrix} g \\ h \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
Now that we have that, let's plug these vectors back into Equation (5) and see what we get,
using our change of basis matrix B on our random vector |v⟩:

$$B|v\rangle = \begin{bmatrix} |0\rangle_F & |1\rangle_F \end{bmatrix}\begin{bmatrix} 3 \\ 4 \end{bmatrix}_C = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ 4 \end{bmatrix}_C = \begin{bmatrix} 4 \\ 1 \end{bmatrix}_F$$

So, there you have it – we have changed a vector expressed in C to a vector expressed in F
using a matrix. Due to this, a change of basis is a linear transformation. With this change
of basis matrix, we can change any vector expressed as coordinates in C to coordinates
in F. But how do we know we're right? Well, the vector |v⟩ should be equal as a linear
combination in either basis, so:

$$|v\rangle = \begin{bmatrix} 3 \\ 4 \end{bmatrix}_C = \begin{bmatrix} 4 \\ 1 \end{bmatrix}_F$$

$$|v\rangle = 3|0\rangle + 4|1\rangle = 4|f_1\rangle + 1|f_2\rangle$$

$$|v\rangle = 3\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 4\begin{bmatrix} 0 \\ 1 \end{bmatrix} = 4\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 1\begin{bmatrix} -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$$

Okay, now that we've worked out an example, I'll give you the general formula for
transforming a basis, and hence, you will be able to transform the coordinates for every
vector in one basis to another:

$$G = \{|g_1\rangle, \ldots, |g_n\rangle\} \qquad H = \{|h_1\rangle, \ldots, |h_n\rangle\}$$

$$C_{G \to H} = \begin{bmatrix} |g_1\rangle_H & \cdots & |g_n\rangle_H \end{bmatrix}$$

This is basically saying that you have to express the basis vectors in the input basis as
coordinates in the output basis.
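Putting the worked example into code (a sketch of the same numbers, using NumPy):

import numpy as np

# Change of basis from C (computational) to F. The columns of B are
# |0> and |1> written as coordinates in the basis F.
B = np.array([[0, 1],
              [-1, 1]])

v_C = np.array([3, 4])    # |v> in the computational basis C
print(B @ v_C)            # [4 1] -- the coordinates of |v> in the basis F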

Summary
We have covered the breadth of linear transformations in this chapter. They are key to
understanding the linear algebra that is pervasive in quantum computing. We've also seen
how these transformations have been inspired by Euclidean geometry and considered
special transformations such as linear operators and linear functionals. Finally, we saw
how to do a change of basis, which is a linear transformation as well! Next up, we will go
from real numbers into the field of complex numbers.

Answers to exercises
Exercise one
$$T: \mathbb{R}^4 \to \mathbb{R}^3 \qquad T(|x\rangle) = T\left(\begin{bmatrix} x \\ y \\ w \\ z \end{bmatrix}\right) = \begin{bmatrix} w \\ y \\ z \end{bmatrix} \qquad \text{Yes, this is linear}$$

$$U: \mathbb{R}^2 \to \mathbb{R}^3 \qquad U(|x\rangle) = U\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x + y \\ y \\ x^2 \end{bmatrix} \qquad \text{No, this is non-linear}$$

$$Y: \mathbb{R}^3 \to \mathbb{R}^3 \qquad Y(|x\rangle) = Y\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} x + y \\ 2y \\ z + 2 \end{bmatrix} \qquad \text{No, this is non-linear}$$

Exercise two
 3 
 
 −4 
 0 
 

Works cited
Geometry – What is the position of a unit cube so that its projection on the ground has maximal area? – Mathematics Stack Exchange: https://math.stackexchange.com/questions/1894106/what-is-the-position-of-a-unit-cube-so-that-its-projection-on-the-ground-has-max
Section 3:
Adding Complexity

In this section, we add complex numbers to the mix. We use complex numbers to get
into Eigenstuff. Then, we bring it all together in Chapter 8, Our Space in the Universe, and
add on a layer of extra credit in Chapter 9, Advanced Concepts.
The following chapters are included in this section:

• Chapter 6, Complex Numbers


• Chapter 7, Eigenstuff
• Chapter 8, Our Space in the Universe
• Chapter 9, Advanced Concepts
6
Complex Numbers
"What is unpleasant here, and indeed directly to be objected to, is the use of
complex numbers. Ψ is surely fundamentally a real function."
– Letter from Schrödinger to Lorentz. June 6, 1926.
Even the great physicist Erwin Schrödinger was perplexed by the occurrence of complex
numbers in quantum mechanics. Yet, complex numbers have been found to be inherent to
quantum mechanics and, hence, quantum computing. Up until now, we have concentrated
on the real numbers to make concepts easier to get across. It is time now to cross the
Rubicon and make our way into the complex plane. You have probably come across
complex numbers before, but we will go into great depths about them in this chapter.
I think it is unfortunate that René Descartes named i an imaginary number, as it makes
it seem that this topic should be otherworldly. But so was the number 0 when it was
introduced, and the same with negative numbers. No one gives them a second thought
today and you should treat complex numbers the same way. They are just another set
of numbers.

In this chapter, we are going to cover the following main topics:

• Three forms, one number


• Cartesian form
• Polar form
• The most beautiful equation in mathematics
• Exponential form
• Bloch sphere

Three forms, one number


There are three main ways of representing a complex number:

• Cartesian form (aka the general form)


• Polar form
• Exponential form

Each one has its advantages and disadvantages, depending on what we are trying to do.
These will become evident as we go through them.

Definition of complex numbers


A complex number is a number that can be expressed in the following way:
$$z = a + bi \qquad (1)$$

where a and b are real numbers and i is the imaginary unit. The imaginary unit is
defined as:

$$i^2 = -1$$
It follows from this that there are two square roots of -1, i and -i.
The real part of a complex number is denoted by Re(z) and the imaginary part is
denoted by Im(z). For our complex number, z, defined in Equation (1), Re(z) = a and
Im(z) = b. Two complex numbers, z and w, are equal if, and only if, Re(z) = Re(w)
and Im(z) = Im(w).
It is important to remember that the set of real numbers ℝ is a subset of ℂ. Hence, if
Im(z) = 0 for a complex number z, then z is also a real number.

Let's quickly look at some examples of complex numbers:


$$z = 3 + 2i$$
$$w = -4.3 - \frac{2}{3}i$$
$$x = 6$$
$$y = 4i$$

Now, let's move on to describing the three forms of complex numbers.

Cartesian form
We used the Cartesian form to define a complex number. To see why it is called Cartesian,
notice we can also use an ordered pair of real numbers to represent the complex number z.
The first number of the ordered pair will be the real part of the complex number, and the
second number will be the imaginary part:
$$z = a + bi = (a, b)$$

Given this, we can represent complex numbers on a Cartesian coordinate system since a
and b are just real numbers. We will need to make a couple of modifications though.
We will replace the x axis with an axis for the real part of a complex number ( Re(z) ), and
the y axis with an axis for the imaginary part of a complex number ( Im(z) ), like so:

Figure 6.1 – The complex plane



This is called the complex plane. Here is an example involving actual complex numbers:

Figure 6.2 – Complex numbers on the complex plane


Keep this in mind as we go through the basic operations of complex numbers as we can
think of them both algebraically and geometrically, as we did in Chapter 2, Superposition
with Euclid.

Addition
Addition is rather easy for complex numbers; just add Re(z) and Im(z) of the two numbers
together to get the sum. We will be using the following two complex numbers in our
following definitions:
$$z_1 = a_1 + b_1 i$$
$$z_2 = a_2 + b_2 i$$

Here is the definition:

$$z_1 + z_2 = (a_1 + a_2) + (b_1 + b_2)i$$

Subtraction is defined as:

$$z_1 - z_2 = (a_1 - a_2) + (b_1 - b_2)i$$

Here are some examples:


$$(5 + 4i) + (7 + 2i) = 12 + 6i$$

$$(-6 + 3i) + (5 + i) = -1 + 4i$$

$$i + (-5 + 6i) = -5 + 7i$$

Remember that you can view complex numbers as vectors on the complex plane, so
addition, scalar multiplication, and subtraction can be viewed graphically as well, as in
the following:

Figure 6.3 – Vector addition and scalar multiplication [1]


Alright, let's move on to multiplication.

Multiplication
I will show you another way to do complex multiplication later in this chapter, but you
should know how to do it with the Cartesian form of a complex number. Hopefully, you
remember the FOIL method from high school algebra. If you do, you can skip the next
section. If not, here's a quick refresher:

FOIL method (optional)


The FOIL method is used to multiply two binomials together. It stands for:

• First terms
• Outer terms
• Inner terms
• Last terms

You add these all up, and there you are! This figure should jog your memory:

Figure 6.4 – FOIL method illustrated [2]


Now that we've jogged your memory, on to the definition of multiplication for
complex numbers.

Definition
Here is the definition of the multiplication of two complex numbers in Cartesian form.
It is:
$$z_1 \cdot z_2 = (a_1 + b_1 i)(a_2 + b_2 i)$$
$$= a_1 a_2 + a_1 b_2 i + a_2 b_1 i + b_1 b_2 i^2$$
$$= a_1 a_2 + a_1 b_2 i + a_2 b_1 i - b_1 b_2$$
$$= (a_1 a_2 - b_1 b_2) + (a_1 b_2 + a_2 b_1)i$$

As always, here is an example for your viewing pleasure:


$$(3 - 4i)(4 + 5i) = 3 \cdot 4 + 3 \cdot 5i - 4i \cdot 4 - 4i \cdot 5i \qquad \text{Use the FOIL method.}$$
$$= 12 + 15i - 16i - 20i^2 \qquad \text{Substitute } i^2 = -1.$$
$$= 12 + 15i - 16i - 20(-1)$$
$$= 12 - i + 20$$
$$= 32 - i$$
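Python has complex numbers built in (it writes the imaginary unit as j rather than i), so you can always check a calculation like this one:

# A quick check of the FOIL example above using Python's complex type.
z1 = 3 - 4j
z2 = 4 + 5j
print(z1 * z2)     # (32-1j)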

Now it's your turn.



Exercise 1
What is:
$$(2 - 3i)(5 + i)$$
$$(1 - i)(2 + i)$$
$$(-2 + 3i)(-4 - i)$$

Now for a new concept that doesn't exist for real numbers.

Complex conjugate
While the definition of the complex conjugate is very simple, store it somewhere safe
in your brain as it will become very important to us as we move forward. The complex
conjugate of a complex number a + bi is a – bi. That's it! It is written as z* for a complex
number z. Here it is again, just to drill it into your skull :)
$$z = a + bi$$
$$z^* = a - bi$$

It is interesting to view complex conjugation on the complex plane, as it is just a reflection
across the real axis, as you can see in the following figure:

Figure 6.5 – Complex vector reflected on the real axis


Now we'll use the complex conjugate to get the absolute value of a complex number.

Absolute value or modulus


Again, simple, but very important. The absolute value or modulus of a complex number z,
denoted |z|, is the square root of z multiplied by its conjugate, z*:
$$|z| = \sqrt{z \cdot z^*}$$

It can also be defined thus for a complex number z = x+ iy:


$$|z| = \sqrt{[\mathrm{Re}(z)]^2 + [\mathrm{Im}(z)]^2} = \sqrt{x^2 + y^2}$$

Here's an example:
$$|5 + 6i| = \sqrt{5^2 + 6^2} = \sqrt{25 + 36} = \sqrt{61}$$

Exercise 2
Compute the following absolute values:
$$|3 - 2i|$$
$$|i|$$
$$|6 + 3i|$$

Division
The way to compute the division of two complex numbers is unfortunately much harder
than multiplication. However, there is a process named "rationalizing the denominator,"
which makes it easier.
Let's define our two complex numbers as:
$$\frac{c + di}{a + bi}$$

where a and b are not both zero (so the denominator is non-zero). First, we multiply the numerator and denominator by the
complex conjugate of the denominator:

$$\frac{(c + di)}{(a + bi)} \cdot \frac{(a - bi)}{(a - bi)} = \frac{(c + di)(a - bi)}{(a + bi)(a - bi)}$$

Then we use the FOIL method:


$$= \frac{ca - cbi + adi - bdi^2}{a^2 - abi + abi - b^2 i^2}$$

Finally, we substitute the terms that have i² with -1:

$$= \frac{ca - cbi + adi - bd(-1)}{a^2 - abi + abi - b^2(-1)}$$

$$= \frac{(ca + bd) + (ad - cb)i}{a^2 + b^2}$$

Hopefully, that wasn't too bad. Let's look at an example for this quotient:
$$\frac{2 + 5i}{4 - i}$$

From there, let's solve it together:


$$\frac{(2 + 5i)}{(4 - i)} \cdot \frac{(4 + i)}{(4 + i)} = \frac{8 + 2i + 20i + 5i^2}{16 + 4i - 4i - i^2}$$

$$= \frac{8 + 2i + 20i + 5(-1)}{16 + 4i - 4i - (-1)} \qquad \text{Because } i^2 = -1$$

$$= \frac{3 + 22i}{17}$$

$$= \frac{3}{17} + \frac{22}{17}i \qquad \text{Separate real and imaginary parts.}$$

As you can see, it's a little more than division in the real numbers.
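Again, Python will happily confirm the worked example for us:

# Checking the quotient (2 + 5i)/(4 - i) with Python's complex type.
print((2 + 5j) / (4 - 1j))    # (0.17647...+1.29411...j)
print(3 / 17, 22 / 17)        # 0.17647..., 1.29411... -- the same real and imaginary parts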

Powers of i
I wanted to make sure that you could calculate the powers of i in your complex number
toolkit. The positive powers of i follow this pattern:
$$i^0 = 1$$
$$i^1 = i$$
$$i^2 = -1$$
$$i^3 = i^2 \cdot i = -i$$
$$i^4 = i^3 \cdot i = 1$$
$$i^5 = i^4 \cdot i = i$$
$$i^6 = i^5 \cdot i = -1$$
$$i^7 = i^6 \cdot i = -i$$

The negative powers of i follow a very similar pattern:


$$i^0 = 1$$
$$i^{-1} = -i$$
$$i^{-2} = -1$$
$$i^{-3} = i$$
$$i^{-4} = 1$$
$$i^{-5} = -i$$
$$i^{-6} = -1$$
$$i^{-7} = i$$
Hopefully, you see the patterns here and can derive any power of i. Alright, off to the next
form of complex numbers – polar!

Polar form
The polar form is based on polar coordinates, which you may or may not be used to. If
not, the next section goes through these, otherwise, you can skip it. Also, the rest of the
chapter is heavy on trigonometry and we use radians for all angles. If you require a quick
refresher on these, please consult the Appendix.

Polar coordinates
Polar coordinates are another way of representing points in ℝ2. We are very familiar with
the Cartesian coordinate system and its points, such as (x,y). Now we will represent a
point with two coordinates called r and θ. The following diagram is very helpful in terms
of putting this all together:

Figure 6.6 – Polar coordinates on a graph



As you can see, r is the hypotenuse of a right triangle with the other two sides being the
Cartesian coordinates x and y. Because of this, it is easy to derive the equation to find r
given the Cartesian coordinates using the Pythagorean theorem:

$$r^2 = x^2 + y^2 \qquad (2)$$

We can use the trigonometric function tangent to derive θ:

$$\tan\theta = \frac{y}{x}$$

$$\theta = \arctan\left(\frac{y}{x}\right) = \tan^{-1}\left(\frac{y}{x}\right) \qquad (3)$$
Let's look at an example. Let's say we have the point (3,4), as shown in the following figure:

Figure 6.7 – The point (3,4)


Now we need to convert from Cartesian coordinates to polar coordinates using our
formulas. So, using Equation (2), r would be:
$$r^2 = x^2 + y^2$$
$$r^2 = 3^2 + 4^2$$
$$r^2 = 9 + 16$$
$$\sqrt{r^2} = \sqrt{25}$$
$$r = 5$$

Now that we've found r, we need θ. Again, we'll use our formula from Equation (3):
$$\tan\theta = \frac{4}{3}$$

$$\theta = \tan^{-1}\left(\frac{4}{3}\right) \approx 0.9273 \text{ rad}$$
There you go! We have found that (3, 4) in Cartesian coordinates is (5, .9273) in polar
coordinates. Now it is your turn.
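Before you try the exercise, here is a short sketch of the same conversion in Python. Note that math.atan2 takes the quadrant into account, which a bare arctangent of y/x does not:

import math

# Convert Cartesian (x, y) to polar (r, theta).
x, y = 3, 4
r = math.hypot(x, y)          # sqrt(x^2 + y^2)
theta = math.atan2(y, x)      # quadrant-aware angle
print(r, theta)               # 5.0 0.9272952180016122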

Exercise 3
Convert the following into polar coordinates:
(-3,3)
(4, 1)
(-2, π)

Defining complex numbers in polar form


Since we can use Cartesian coordinates with complex numbers, we can also use polar
coordinates with complex numbers. Here is Figure 6.6 from the last section again, but this
time using the complex plane:

Figure 6.8 – Polar coordinates of the complex number z



Several things have changed in the graph. The real and imaginary axes have replaced the x
and y axes, along with the x coordinate being Re(z) and the y coordinate being Im(z) now.
The r in the hypotenuse represents the modulus of our complex number, z. This makes
sense if we remember that for a complex number, z = a + bi:
$$r = |z| = \sqrt{[\mathrm{Re}(z)]^2 + [\mathrm{Im}(z)]^2} = \sqrt{a^2 + b^2}$$

The Greek letter Θ (pronounced "theta") is called the argument of the complex number z
by mathematicians, but in quantum computing, it is often called the "phase." We denote
this in math as follows:
$$\arg(z) = \theta$$

From the graph, we can derive the following:


$$\cos\theta = \frac{\mathrm{Re}(z)}{|z|} \qquad \sin\theta = \frac{\mathrm{Im}(z)}{|z|} \qquad \tan\theta = \frac{\mathrm{Im}(z)}{\mathrm{Re}(z)}$$

For $z = a + ib$, where $r = |z|$:

$$a = r\cos\theta \qquad b = r\sin\theta$$

Putting this all together, we can say that for a complex number, z, it can be represented in
polar form as:
Polar Form of a Complex Number

$$z = r(\cos\theta + i\sin\theta)$$

Let's explore some more concepts of complex numbers in polar form.

Example
Okay, let's put this into an example. Let's find the polar form of z = -4 + 4i. First, we need
to find the value of r:
$$r = \sqrt{a^2 + b^2}$$
$$r = \sqrt{(-4)^2 + 4^2}$$
$$r = \sqrt{16 + 16}$$
$$r = \sqrt{32} = 4\sqrt{2}$$

Now we have to find θ:


$$\tan\theta = \frac{\mathrm{Im}(z)}{\mathrm{Re}(z)} = \frac{4}{-4} = -1$$

Since z = -4 + 4i lies in the second quadrant of the complex plane, we take the angle there whose tangent is -1:

$$\theta = \frac{3\pi}{4}$$

Now that we have r and θ, we can express z in polar form:


$$z = 4\sqrt{2}\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right)$$

Let's now explore some more concepts of complex numbers in polar form.

Multiplication and division in polar form


While the polar form is not recommended for addition and subtraction, it is
recommended for multiplication and division. It is much easier to perform these
operations, as you will quickly realize.
Given the two complex numbers below:
$$z_1 = r_1(\cos\theta_1 + i\sin\theta_1) \quad \text{and} \quad z_2 = r_2(\cos\theta_2 + i\sin\theta_2)$$

The product of these two numbers is:

$$z_1 z_2 = r_1 r_2[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)] \qquad (4)$$

Notice that all we had to do was add the angles and multiply the moduli. Pretty easy, right!?!
Dividing the two complex numbers is similarly easy. It is defined as:

$$\frac{z_1}{z_2} = \frac{r_1}{r_2}[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)], \quad z_2 \neq 0$$

Here, all you have to do is subtract the angles and divide the moduli.

Example
Say we have two complex numbers:
$$z_1 = 2 + 2i$$
$$z_2 = \sqrt{3} - i$$

Converting to polar form, we have:


$$2 + 2i = 2\sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right)$$

$$\sqrt{3} - i = 2\left[\cos\left(-\frac{\pi}{6}\right) + i\sin\left(-\frac{\pi}{6}\right)\right]$$

Applying our formula to the product of the two complex numbers in Equation (4), we get
the following for the product:

$$(2 + 2i)(\sqrt{3} - i) = 4\sqrt{2}\left[\cos\left(\frac{\pi}{4} - \frac{\pi}{6}\right) + i\sin\left(\frac{\pi}{4} - \frac{\pi}{6}\right)\right]$$

$$= 4\sqrt{2}\left(\cos\frac{\pi}{12} + i\sin\frac{\pi}{12}\right)$$

De Moivre's theorem
If we use Equation (4) to repeatedly multiply one complex number by itself, we get:
$$z = r(\cos\theta + i\sin\theta)$$
$$z^2 = r^2(\cos 2\theta + i\sin 2\theta)$$
$$z^3 = z \cdot z^2 = r^3(\cos 3\theta + i\sin 3\theta)$$

You should see a pattern whereby, in order to get a power of a complex number, we
take the power of the modulus and multiply the angles by the power. This is known as
de Moivre's theorem. It states that:
If $z = r(\cos\theta + i\sin\theta)$ and n is a positive integer, then

$$z^n = [r(\cos\theta + i\sin\theta)]^n = r^n(\cos n\theta + i\sin n\theta)$$

Make sure to tuck this away somewhere. Let's now move on to the most beautiful equation
in mathematics!

The most beautiful equation in mathematics


In 1748, Leonhard Euler (pronounced "oy-lr") published his most famous formula, aptly
called Euler's Formula:
$$e^{i\theta} = \cos\theta + i\sin\theta$$
This is true for any real number θ. Substituting θ = π into this equation gives the most
beautiful equation in math, called Euler's identity:
$$e^{i\pi} + 1 = 0$$

It is hard to overstate the beauty of this equation. It combines in one equation what are
arguably the most important symbols and operations in mathematics. Along with the
operations of addition, multiplication, and exponentiation, you have 0, the additive
identity, 1, the multiplicative identity, i, the imaginary unit, and two of the most important
mathematical constants, e and π. This equation is also integral to quantum computing.
You will see eiθ all over the place in quantum computing, so you better get used to it!
So, how does this equation help us in quantum computing? You're about to find out, but
first, we need to use it to express complex numbers in exponential form.

Exponential form
Complex numbers written in terms of eiθ are said to be in the exponential form, as
opposed to the polar or Cartesian form we have seen earlier. Using Euler's formula,
we can express a complex number, z, as:
$$z = r(\cos\theta + i\sin\theta) = re^{i\theta}$$

So

$$z = re^{i\theta} \text{ in which } r = |z| \text{ and } \theta = \arg(z)$$

As you can see, the exponential form is very close to polar form, but now you have θ in
one place instead of two!

Exercise 4
Express the following complex numbers in exponential form:
$$z = 1 - i$$
$$z = 2 + 3i$$
$$z = -6$$

Conjugation
As we have seen, the conjugation of a complex number is represented as a reflection
around the real axis. For complex numbers in exponential form, this means we just change
the sign of the angle to get the complex conjugate:
If $z = r(\cos\theta + i\sin\theta) = re^{i\theta}$,
then $z^* = r(\cos\theta - i\sin\theta) = r(\cos(-\theta) + i\sin(-\theta)) = re^{-i\theta}$.
Note that $\cos\theta = \cos(-\theta)$ and $-\sin\theta = \sin(-\theta)$.

Multiplication
Multiplication and division are even easier in exponential form and are one of the reasons
why it is so preferred to work with. We can take our steps for multiplication from the
polar form and easily restate them in exponential form.
Given the two complex numbers below:
$$z_1 = r_1(\cos\theta_1 + i\sin\theta_1) \quad \text{and} \quad z_2 = r_2(\cos\theta_2 + i\sin\theta_2)$$

In exponential form, they are represented as:

$$z_1 = r_1 e^{i\theta_1} \quad \text{and} \quad z_2 = r_2 e^{i\theta_2}$$

The product of these two numbers in polar form is then:

$$z_1 z_2 = r_1 r_2[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)]$$

Their product in exponential form is then:

$$z_1 z_2 = r_1 r_2 e^{i(\theta_1 + \theta_2)}$$

Without going through all that, I will simply state that for division:

$$\frac{z_1}{z_2} = \frac{r_1}{r_2} e^{i(\theta_1 - \theta_2)}, \quad z_2 \neq 0$$

Example
Let's reuse our example from the section on the polar form, but do it in the
exponential form!
$$z_1 = 2 + 2i$$
$$z_2 = \sqrt{3} - i$$

Going from the Cartesian form to the polar form and then the exponential form, we get:

$$2 + 2i = 2\sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right) = 2\sqrt{2}\, e^{i\pi/4}$$

$$\sqrt{3} - i = 2\left[\cos\left(-\frac{\pi}{6}\right) + i\sin\left(-\frac{\pi}{6}\right)\right] = 2 e^{-i\pi/6}$$

Finally, using our definition of multiplication in the exponential form from before, we get:

$$(2 + 2i)(\sqrt{3} - i) = 4\sqrt{2}\, e^{i\pi/4} e^{-i\pi/6} = 4\sqrt{2}\, e^{i\pi/12}$$
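A quick numerical check with Python's cmath module (a sketch of the same product):

import cmath, math

# Verify the exponential-form product: modulus 4*sqrt(2) and phase pi/12.
z1 = 2 + 2j
z2 = math.sqrt(3) - 1j
r, theta = cmath.polar(z1 * z2)
print(r, 4 * math.sqrt(2))     # both ~5.6569
print(theta, math.pi / 12)     # both ~0.2618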

Conjugate transpose of a matrix


Since we now have the definition of the complex conjugate of a number, I'd like to quickly
go over the conjugate transpose of a matrix as we will use this later in the book. The
conjugate transpose is exactly as it sounds. It combines the notions of complex conjugates
and the transposition of a matrix into one operation. If you remember from Chapter 2,
The Matrix, we defined the transpose to be:
$$\text{If } A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \text{ then } A^T = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}$$

This is where we essentially convert the rows into columns and the columns into rows.
The conjugate of a matrix is just the conjugation of every entry:

$$\text{If } A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \text{ then } A^* = \begin{bmatrix} a_{11}^* & a_{12}^* & \cdots & a_{1n}^* \\ a_{21}^* & a_{22}^* & \cdots & a_{2n}^* \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}^* & a_{m2}^* & \cdots & a_{mn}^* \end{bmatrix}$$

For example, if the matrix M equals


$$M = \begin{bmatrix} 1 + 2i & 4 \\ e^{i\pi/2} & -i \end{bmatrix},$$

then M* equals

$$M^* = \begin{bmatrix} 1 - 2i & 4 \\ e^{-i\pi/2} & i \end{bmatrix}.$$
So here is the big payoff. The conjugate transpose of a matrix A is defined to be:
$$A^\dagger = (A^*)^T$$

The cross symbol at the top right of A is pronounced "dagger," and therefore when you
hear "A dagger," the conjugate transpose of A is being referred to.
A quick example should get this all sorted. Let's use our matrix M from before. The
conjugate transpose of M would be:
$$M = \begin{bmatrix} 1 + 2i & 4 \\ e^{i\pi/2} & -i \end{bmatrix} \qquad M^\dagger = \begin{bmatrix} 1 - 2i & e^{-i\pi/2} \\ 4 & i \end{bmatrix}$$
Please note that the conjugate transpose of a matrix also goes under the names Hermitian
conjugate and adjoint matrix.
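In NumPy, the conjugate transpose of this same matrix is one line (a sketch; the printed entries will carry tiny floating-point errors):

import numpy as np

# The conjugate transpose ("dagger") of the matrix M from above.
M = np.array([[1 + 2j, 4],
              [np.exp(1j * np.pi / 2), -1j]])

M_dagger = M.conj().T
print(M_dagger)
# rows are approximately [1-2i, -i] and [4, i], since e^{i pi/2} = i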

Okay, with all that squared away, we can get to some cool quantum computing stuff called
the Bloch sphere!

Bloch sphere
This is the big payoff of the chapter, understanding the Bloch sphere! The Bloch sphere,
named after Felix Bloch, is a way to visualize a single qubit. From Chapter 1, Superposition
with Euclid, we know that a qubit can be represented in the following way:
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$$

We did not say this before, but now that we have introduced complex numbers, we can say
that α and β are actually complex numbers.
Now that we know complex numbers take two real numbers to represent, it looks as
though we will need four real numbers to characterize a qubit state. This is very hard to
graph as we cannot visualize 4D space. Let's see whether we can decrease the number of
real numbers required to represent a qubit state.
First, let's replace α and β with their exponential form to get:
$$|\psi\rangle = r_\alpha e^{i\phi_\alpha}|0\rangle + r_\beta e^{i\phi_\beta}|1\rangle$$

Now, let's rearrange the right side of the equation by factoring out $e^{i\phi_\alpha}$.
Notice that I need to subtract the phase $\phi_\alpha$ from the second term:

$$|\psi\rangle = e^{i\phi_\alpha}\left(r_\alpha|0\rangle + r_\beta e^{i(\phi_\beta - \phi_\alpha)}|1\rangle\right) \qquad (5)$$
It turns out that quantum mechanics (QM) calls $e^{i\phi_\alpha}$ in Equation (5) the "global phase"
and that it has no measurable effect on the qubit state. Therefore, QM lets us drop it. So
now we have:

$$|\psi\rangle = r_\alpha|0\rangle + r_\beta e^{i(\phi_\beta - \phi_\alpha)}|1\rangle$$

The (φβ - φα) in the second term is called the "relative phase." We will replace it with just
one φ and also restrict it to go from 0 to 2π radians. Here is the new equation:
$$|\psi\rangle = r_\alpha|0\rangle + r_\beta e^{i\varphi}|1\rangle$$

Okay, hopefully, you're keeping up. If you're keeping score at home, with all this
mathematical contortion, we are now down from four dimensions to three dimensions to
describe the qubit state. Can we get down to two dimensions? Let's try!

Another constraint we know is that α and β are probability amplitudes: the probabilities
|α|² and |β|² of measuring each basis state must add up to one, so:

$$|\alpha|^2 + |\beta|^2 = 1$$

Because of this, the following must also be true:

$$r_\alpha^2 + r_\beta^2 = 1$$

We can represent this on the first quarter of a unit circle like so:

Figure 6.9 – Unit circle with a right triangle showing the parameters of our qubit
You may have noticed that we use θ/2 rather than just θ. The reason for this goes into very
deep physics and math that are beyond the scope of this book. If you are very interested
in why this is the case, please consult https://physics.stackexchange.com/
questions/174562/why-is-theta-over-2-used-for-a-bloch-sphere-
instead-of-theta.
Now we can represent our two coefficients for r as:

$$r_\alpha = \cos\left(\frac{\theta}{2}\right)$$
$$r_\beta = \sin\left(\frac{\theta}{2}\right)$$
$$0 \leq \theta \leq \pi$$
Okay, here's the big payoff!

$$|\psi\rangle = \cos\left(\frac{\theta}{2}\right)|0\rangle + \sin\left(\frac{\theta}{2}\right)e^{i\varphi}|1\rangle$$

We're down to only two dimensions (θ and φ). Using these two angles, we can represent
the qubit state on a unit sphere like so:

Figure 6.10 – Bloch sphere [3]


Θ is like the latitude on a map, but it goes from zero on the north pole, represented by the
zero state, to π/2 on the equator and π on the south pole, which represents the one state.
I was only able to show you how to get to the sphere mathematically, which is still a lot!
For more information on why we need the Bloch sphere, please consult Packt's other great
book, Dancing with Qubits, by Robert Sutor. For a cool, interactive visualization of the
Bloch sphere, please go to https://www.st-andrews.ac.uk/physics/quvis/
simulations_html5/sims/blochsphere/blochsphere.html.
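To tie the algebra together, here is a small sketch (my own example, not from the text) that recovers the two Bloch angles from a concrete pair of amplitudes:

import numpy as np

# Recover the Bloch angles theta and phi from a state alpha|0> + beta|1>.
# Here we use the state (1/sqrt(2))(|0> + |1>), a point on the equator.
alpha, beta = 1 / np.sqrt(2), 1 / np.sqrt(2)

theta = 2 * np.arccos(np.abs(alpha))
phi = (np.angle(beta) - np.angle(alpha)) % (2 * np.pi)   # the relative phase
print(theta, phi)     # ~1.5708 (pi/2) and 0.0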

Summary
Well, we've come a long way in a few pages. We've now seen how complex numbers can be
expressed in the three forms of Cartesian, polar, and exponential. We've also seen which
forms are better for different operations. You've learned how to multiply, divide, add, and
subtract complex numbers. Finally, we've put it all together to mathematically derive the
Bloch sphere. Bravo!
In the next chapter, we will explore the world of eigenvalues, eigenvectors, and all kinds of
eigenstuff. Get ready!

Exercises
Exercise 1
$$(2 - 3i)(5 + i) = 13 - 13i$$
$$(1 - i)(2 + i) = 3 - i$$
$$(-2 + 3i)(-4 - i) = 11 - 10i$$

Exercise 2
$$|3 - 2i| = \sqrt{13}$$
$$|i| = 1$$
$$|6 + 3i| = 3\sqrt{5}$$

Exercise 3
(-3, 3)
$$\left(3\sqrt{2},\ \frac{3\pi}{4}\right)$$

(4, 1)
$$\left(\sqrt{17},\ 0.24498\right)$$

(-2, π)
$$\left(\sqrt{4 + \pi^2},\ \pi - \tan^{-1}\left(\frac{\pi}{2}\right)\right) \text{ or, with a negative radius, } \left(-\sqrt{4 + \pi^2},\ 2\pi - \tan^{-1}\left(\frac{\pi}{2}\right)\right)$$

Exercise 4
$$z = 1 - i = \sqrt{2}\, e^{-i\pi/4}$$
$$z = 2 + 3i = \sqrt{13}\, e^{0.9828 i}$$
$$z = -6 = 6 e^{i\pi}$$

References
[1] – File:Vector add scale.svg – Wikimedia Commons (https://commons.wikimedia.org/wiki/File:Vector_add_scale.svg)
[2] – MonkeyFaceFOILRule – FOIL method – Wikipedia (https://en.wikipedia.org/wiki/FOIL_method#/media/File:MonkeyFaceFOILRule.JPG)
[3] – File:Bloch Sphere.svg – Wikimedia Commons (https://commons.wikimedia.org/wiki/File:Bloch_Sphere.svg)
The quote on quantum physics needing complex numbers is from https://arxiv.org/pdf/2101.10873.pdf
7
EigenStuff
Eigen (pronounced EYE-GUN) is a German prefix to words such as eigentum (property),
eigenschaft (a feature or characteristic), and eigensinn (an idiosyncrasy). To sum up,
we are looking for some values and vectors that are characteristic, idiosyncratic, and
a property of something. What is that something? That something is our old friend the
linear operator and its representations as square matrices. But before we get there, we'll
need to look at some other concepts such as the matrix inverse and determinant. We'll
wrap it all up with the trace of a matrix and some properties that the trace, determinant,
and eigenvalues all share. These concepts will allow us to reach even further heights in the
chapters that follow.
In this chapter, we are going to cover the following main topics:

• The inverse of a matrix


• Determinants
• Invertible matrix theorem
• Eigenvalues and eigenvectors
• Trace
• The special properties of eigenvalues

The inverse of a matrix


It would be nice to have a way to do algebra on matrices the way we do for simple
algebraic expressions, like so:

$$3xy = 6$$

divide both sides by 3x

$$\frac{3xy}{3x} = \frac{6}{3x}$$

$$y = \frac{2}{x}$$
The inverse of a matrix provides us with a way to do this. It is very similar to the reciprocal
for rational numbers. For rational numbers, the following is true:
$$x^{-1} = \frac{1}{x}$$
$$x \cdot x^{-1} = 1$$

In a similar way, the inverse of a matrix is defined to be a matrix that when multiplied by
the original matrix, you get the identity matrix. Here it is mathematically:

$$A \cdot A^{-1} = I \text{ where } A^{-1} \text{ denotes the matrix inverse of } A$$

The matrix inverse can then be used when trying to algebraically modify a matrix
equation. Let's say we are trying to find the vector |x⟩ in the following equation:

$$A|x\rangle = |y\rangle$$

Since we now have a multiplicative inverse of a matrix, we can multiply both sides by it to
get the following:

$$A^{-1}A|x\rangle = A^{-1}|y\rangle$$
$$I|x\rangle = A^{-1}|y\rangle$$
$$|x\rangle = A^{-1}|y\rangle$$

Please remember that matrix multiplication is not commutative, so if you left multiply
a matrix on one side of an equation, you must left multiply on the other side. The same
applies if you right multiply a matrix. We now have a way to find |x⟩ by using the inverse
of matrix A. But there is a catch – not all matrices have an inverse. Determinants will help
us here though.

Determinants
Determinants determine whether a square matrix is invertible. This is a huge help to
us, as we will see. In the literature, you will see either a function abbreviation for the
determinant or vertical bars, like so:
det( A) = | A |

The determinant is a function from $\mathbb{C}^{n \times n}$ to $\mathbb{C}$. In other words, it takes an n × n square


matrix as input and spits out a scalar. For a 1 × 1 matrix, the determinant is just the
number (easy enough). For a 2 × 2 matrix, this is the formula. You should probably just
commit it to memory if you can:

$$\text{If } A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \text{ then } \det(A) = |A| = ad - bc$$

I will give you exercises at the end of this section to help with the memorization part,
which will also give you a feel for the determinant itself.
There is a method for calculating determinants for bigger matrices, but it is rather
involved, and once you've mastered 2 × 2 matrices, I would suggest using a matrix
calculator. It's just like arithmetic; you should know how to do it for small numbers before
using a calculator for everything else. Just so you get a glimpse of how it gets increasingly
difficult to calculate determinants, here is the formula for a 3 × 3 matrix:

$$|A| = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a\begin{vmatrix} e & f \\ h & i \end{vmatrix} - b\begin{vmatrix} d & f \\ g & i \end{vmatrix} + c\begin{vmatrix} d & e \\ g & h \end{vmatrix} = aei + bfg + cdh - ceg - bdi - afh$$

For reference, a good matrix calculator is WolframAlpha's (https://www.wolframalpha.com/calculators/determinant-calculator), as shown here:

Figure 7.1 – The WolframAlpha online determinant calculator


Okay, let's see this in action and work through a quick example for determinants. What is
the determinant of the following matrix?

 −5 −4 
 
 −2 −3 
Using our formula, we can calculate it as follows:

−5 −4
= ( −5)( −3) − ( −4)( −2)
−2 −3
= 15 − 8
=7

See? Easy-peasy!
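If you want to double-check with NumPy (a sketch):

import numpy as np

# The same 2 x 2 determinant computed numerically.
A = np.array([[-5, -4],
              [-2, -3]])
print(np.linalg.det(A))   # ~7.0 (up to floating-point rounding)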
So, at the start of this section, we said that the determinant helps to determine whether a
matrix is invertible. How is that possible? I'll put this in a callout, as it is quite important:

Determinants and Invertibility


If the determinant of a matrix equals zero, it is not invertible. Otherwise, the
matrix is invertible.

If a matrix is not invertible, we call it singular or degenerate. We will now use the
determinant in the next sections on calculating the inverse of a matrix and calculating
eigenvalues. But first, it leads us to a very important and useful theorem in linear algebra,
the invertible matrix theorem.

Exercise one
What is the determinant of the following 2 × 2 matrices?

1 2
 
3 4

1 0
 
0 1

 −1 3 2 
 
 1 −1 

The invertible matrix theorem


The invertible matrix theorem is a great result in linear algebra because based on the
invertibility of a matrix, we can say a great many things about that matrix that are also
true. Since we now know a quick way to determine the invertibility of a matrix through
the computation of the determinant, we get all these other properties for free!
Here is the actual definition. Let A be a square n × n matrix. If A is invertible (det(A) ≠ 0),
then the following properties follow:

• The column vectors of A are linearly independent.


• The column vectors of A form a basis for ℂn.
• The row vectors of A form a basis for ℂn and are also linearly independent.
• The linear transformation mapping |x⟩ to A|x⟩ is a bijection from ℂn to ℂn (we
studied bijections in Chapter 3, Foundations).

If the matrix A is not invertible (det(A) = 0), then all the preceding properties are false.

While it may not seem that these are a lot of properties, there are many more I did not
include because we didn't cover the concepts in this book. But even with the four properties
listed here, it should show how powerful the determinant is when it is computed.

Calculating the inverse of a matrix


But wait – there's more! We can use the determinant of a matrix to calculate its inverse.
Here is the formula to calculate the inverse of a 2 × 2 matrix:
$$A^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{\det(A)}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

The first thing to notice is that the reciprocal of the determinant is used. Now, what if the
determinant is zero? We can't divide by zero! So the formula is undefined, but remember
that if the determinant is zero, it doesn't matter because the matrix is not invertible to
begin with!
The other thing to notice is that a and d just switch position. Then, b and c are just
multiplied by -1.
That was a lot of words! Let's look at an example with the following matrix:

 2 4
D= 
 −2 2 
Let's calculate the determinant first:

$$\det(D) = |D| = 2 \cdot 2 - (-2)(4) = 4 + 8 = 12$$

Alright, let's use that to calculate the inverse:

1  2 −4   1 6 −1 3 
D −1 =  = 
12  2 2   1 6 1 6 
And there you have it. Let's make sure that it is the inverse by using the definition we had
for the inverse of a matrix:

 2 4  1 −1   13 + 2 3 − 2 3 + 2 3  1 0
DD −1 = 
6 3
 = 1 1 = =I
 −2 2     − 3+ 3
2 + 1
 0 1
1 1
6 6 3 3
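NumPy will compute the same inverse for us (a sketch):

import numpy as np

# Compute D's inverse numerically and confirm D times it gives the identity.
D = np.array([[2.0, 4.0],
              [-2.0, 2.0]])
D_inv = np.linalg.inv(D)
print(D_inv)            # approximately [[ 0.1667 -0.3333], [ 0.1667  0.1667]]
print(D @ D_inv)        # the 2 x 2 identity, up to rounding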

Now, let's do some exercises.



Exercise two
Calculate the inverse of the following matrices:

1 2
 
3 4
4 3 
 
 2 −1 
1 i 
 
 2 2i 
You should note that in the last exercise, I included complex numbers in the matrix. You
should get used to this now that we have covered complex numbers.
Now let's move on to the main feature, eigenvalues and eigenvectors!

Eigenvalues and eigenvectors


Ah, yes – we have arrived at the meat of this chapter, the eigen-stuff! We know that linear
transformations are involved in this somehow, but how? Well, let's start with something
we are familiar with, the reflection transformation we covered in Chapter 5, Using Matrices
to Transform Space. Here is the diagram that describes it:

Figure 7.2 – Reflection transformation



Now, the special vectors we are looking for, called eigenvectors, are the ones that only get
multiplied by a scalar when they are transformed. Here's a look at a number of vectors
being reflected, with the reflections being the dashed lines. Stare at the diagram and see if
you see any reflections that are the same as multiplying by a scalar:

Figure 7.3 – Vectors and their reflections


Well, I don't want you to stare too long, so here is the first one:

Figure 7.4 – The first eigenvector


Any vector along this line going through a reflection transformation is the same as being
multiplied by the scalar -1. This scalar is called the eigenvalue.
Did you notice any other eigenvectors? This one is a little harder to tease out, but here it is:

Figure 7.5 – The second eigenvector



If you remember in our discussion about reflection, we said that any vector that is on the
axis of reflection will be reflected onto itself. In other words, it is multiplied by the scalar
1. The set of eigenvectors is any vector on this line, and the corresponding eigenvalue is 1.
So, there you have it – for reflection, an eigenvector is any vector perpendicular to the axis
of reflection with an eigenvalue of -1 or any vector parallel to the axis of reflection with an
eigenvalue of 1.

Definition
Let's get rigorous and define eigenvectors and eigenvalues. Given a linear operator Â, an
eigenvector is a non-zero vector that, when transformed by Â, is equal to being multiplied
by a scalar λ (pronounced lamb-da). This can be written as the following:

$$\hat{A}|v\rangle = \lambda|v\rangle$$

This is also true for any matrix A that represents the linear operator, Â:

A | x〉 = λ | x〉

where A is a matrix. The scalar λ is called an eigenvalue.


Now for some comic relief from all this rigor!
"What is a baby eigensheep?"
"A lamb, duh!"

Example with a matrix


Alright, let's see this eigen-stuff in action with an example matrix. Here's our chosen
matrix:

3 0
B= 
0 2

Let's see what it does to a variety of vectors:

 3 0 1   3 
  = 
 0 2  2   4 
 3 0   −2   −6 
  = 
 0 2  4   8 
 3 0 1   3 
  = 
 0 2 0   0 

Do any of the resultant vectors look like a scalar multiple of the original? If you said the
last one, you are correct! Let's write it again with its eigenvalue, 3:

 3 0 1  3
  = 
 0 2 0  0
1 3 
3  =  
0 0 
You may have noticed that there is more than one eigenvector for an eigenvalue – in fact,
there is a set. This leads us to the fact that an eigenvalue with its set of eigenvectors, plus
the zero vector, constitutes a subspace of the overall vector space that a linear operator is
transforming. For our matrix B, this happens to be a one-dimensional line through (1,0) if
we are dealing with a real vector space. Let's see some more eigenvectors of the eigenvalue
3 to drive the point home:

 3 0  −1   −1   −3 
   =3 = 
 0 2  0   0   0 
 3 0  2  2 6
   =3  =  
 0 2  0  0 0
This subspace of vectors is called the eigenspace. To define the eigenspace mathematically
for the matrix B and the eigenvalue 3 in ℝn, it is as follows:
$$\left\{ \begin{bmatrix} x \\ 0 \end{bmatrix} : x \in \mathbb{R} \right\}$$

We can also just give the vector (1,0) as a basis vector. When doing this, the basis set of
vectors is called the eigenbasis.

The characteristic equation


So, how do we find eigenvalues when given a square matrix? Well, let's start with what we
are trying to find:
A | x〉 = λ | x〉 (1)
We want to find λ. So, let's start with manipulating Equation (1) algebraically. Let's
subtract λ|x⟩ from both sides of the equation:

A | x〉 − λ | x〉 = 0

Okay, after doing that, let's take out the vector |x⟩:

( A − λ ) | x〉 = 0

Hmm – this is not a valid equation because λ is a scalar and A is a matrix. What should we
do? Let's multiply λ by the identity matrix so that we are subtracting two matrices:
( A − λ I ) | x〉 = 0 (2)
Okay, we now have one vector multiplied by the difference of two matrices on the left
side of our equation and the zero vector on the other. It just so happens that a part of
the invertible matrix theorem that I left unstated says that for Equation (2) to have a
non-trivial solution (a solution other than the zero vector), the determinant of the matrix
on the left side has to equal zero:
det( A − λ I ) = 0 (3)
Equation (3) is called the characteristic equation of matrix A. If we can solve it, we will
get all the eigenvalues of A! Let's see how we can do it with an example.

Example
Okay, let's use the matrix A, defined as follows:

 3 −2 
A= 
 1 −1 

Now, let's solve its characteristic equation by first finding the matrix on the left in
Equation (3):

 3 −2  1 0
A − λI =  −λ  
 1 −1  0 1
 3 −2   λ 0 
= − 
 1 −1   0 λ 
 3−λ −2 
= 
 1 −1 − λ 

Okay, now that we've found that, let's set the determinant of that matrix to 0 and solve
for λ:

$$\det(A - \lambda I) = (3 - \lambda)(-1 - \lambda) - (-2) = 0$$
$$\det(A - \lambda I) = -3 - 3\lambda + \lambda + \lambda^2 + 2 = 0$$
$$\det(A - \lambda I) = \lambda^2 - 2\lambda - 1 = 0$$
The polynomial we have found in the preceding equation is called the characteristic
polynomial of the matrix A for reference. There seems to be a lot of characteristics
floating around, which gets back to our discussion of what eigen stands for in German! If
we put the characteristic polynomial into the quadratic formula, we will find that our two
eigenvalues for the matrix A are as follows:

$$\lambda = 1 \pm \sqrt{2}$$

And there you go – you now know how to find eigenvalues for 2 × 2 matrices.
What about bigger matrices? Well, the characteristic polynomials get harder and harder
to solve symbolically, and we have to rely on computers to find them numerically. Once
again, a handy matrix calculator comes to save the day. For getting eigenvalues and
eigenvectors, I'd like to recommend Symbolab (https://www.symbolab.com/) as
an online calculator as well (another tool in your toolbelt); a screenshot is shown in the
following figure:

Figure 7.6 – The Symbolab online calculator


Now that we have found the eigenvalues, how do we find the eigenvectors?

Finding eigenvectors
In this section, we will answer that question. In a nutshell, we put the eigenvalues back
into the definition and solve for the eigenvectors. Let's say I use the linear operator Y,
which is a common gate in quantum computing. Here it is in the computational basis:

$$Y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}$$
I'll go ahead and tell you that the eigenvalues are 1 and -1. The definition for eigenvalues is
as follows:

Y | x〉 = λ | x〉

Now, let's solve for the eigenvalue 1, to find its eigenvectors. First, we put in the matrix
and eigenvalue:

 0 −i   x1   x1 
   =1 
 i 0   x2   x2 
Multiplying this out gives us the following:

0 x1 − ix2 = x1
ix1 + 0 x2 = x2

This is a system of two linear equations, which is pretty simple to solve:

If 0 x1 − ix2 = x1 then x1 = −ix2

This means that we can choose any value for x2 and derive x1. Putting this back into the
vector |x⟩, we get the following:

 −ix2 
| x〉 =  
 x2 
So, any vector that has the components of |x⟩ will be an eigenvector for the eigenvalue
of 1 for the matrix Y. Let's set one component to get a "representative eigenvector" for
the eigenvalue:

$$\text{Let } x_2 = 1, \text{ then } |x\rangle = \begin{bmatrix} -i \\ 1 \end{bmatrix}$$

To find the other set of eigenvectors for Y, you repeat the process with the other eigenvalue
of -1.
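NumPy can find both the eigenvalues and the eigenvectors in one call (a sketch using the same Y matrix):

import numpy as np

# Eigenvalues and eigenvectors of the Y matrix.
Y = np.array([[0, -1j],
              [1j, 0]])

values, vectors = np.linalg.eig(Y)
print(values)           # the eigenvalues 1 and -1 (order may vary)
print(vectors[:, 0])    # column k of "vectors" is an eigenvector for values[k];
                        # the one for eigenvalue 1 is a complex multiple of (-i, 1)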

Multiplicity
Sometimes, when we compute the characteristic polynomial for a matrix, it can output
something like this:

$$(\lambda - 1)^2 = 0$$

In this case, the only eigenvalue is 1, but it is the root of the characteristic polynomial
twice. Therefore, this particular eigenvalue has an algebraic multiplicity of 2. In general,
the algebraic multiplicity of a particular eigenvalue is the number of times it appears as a
root of the characteristic polynomial.

Trace
The trace of a matrix is very easy to compute, but it helps to have its value, as it has special
properties. It is defined as the sum of the values on the main diagonal of a matrix. So, let's
say we have the following:

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} 1 & 3 & 5 \\ i & -i & 6 \\ 2 & 7 & 3 \end{bmatrix}$$

The trace of A will then be the following:


$$\mathrm{tr}(A) = \sum_{i=1}^{3} a_{ii} = a_{11} + a_{22} + a_{33} = 1 - i + 3 = 4 - i$$

In general, the definition of the trace for an n × n square matrix is as follows:


$$\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii} = a_{11} + a_{22} + \ldots + a_{nn}$$

The trace of a linear operator is the same for any matrix that represents it.

The special properties of eigenvalues


Now that we have gone through the concept of the trace and determinant, let's quickly
look at some cool properties they have in relation to eigenvalues.
It happens that the sum of the eigenvalues of a matrix equals the trace of the matrix:

$$\mathrm{tr}(A) = \sum_i \lambda_i$$

Additionally, the product of all the eigenvalues of a matrix is equal to the determinant:

$$\det(A) = \lambda_1 \cdot \lambda_2 \cdot \ldots \cdot \lambda_n = \prod_i \lambda_i$$

I encourage you to go back to the matrices we have used as examples with their
eigenvalues and prove to yourself that this is indeed true!
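One quick way to do that is with NumPy (a sketch, reusing the matrix A = [[3, -2], [1, -1]] from earlier, whose eigenvalues are 1 ± √2):

import numpy as np

# Check that trace = sum of eigenvalues and determinant = product of eigenvalues.
A = np.array([[3.0, -2.0],
              [1.0, -1.0]])
eigenvalues = np.linalg.eigvals(A)

print(np.trace(A), eigenvalues.sum())         # both 2.0
print(np.linalg.det(A), eigenvalues.prod())   # both -1.0, up to rounding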

Summary
We have covered quite a bit of ground here in relatively few pages. We saw how to
calculate the determinant and how it plays a role in finding the inverse of matrices. Many
types of eigen-stuff were discussed, and ways to find them were also given. Finally, we saw
how the trace is calculated and how it relates to everything else we covered in this chapter.
These concepts will equip you to do serious linear algebra and quantum computing going
forward. In the next chapter, we will bring everything together that we learned in the last
seven chapters to explore the space where qubits live!

Answers to exercises
Exercise one
1 2
det   = −2
 3 4 

1 0
det   =1
0 1

 −1 3 2 
det   = − 12
 1 −1 

Exercise two
$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}^{-1} = \frac{1}{2}\begin{bmatrix} -4 & 2 \\ 3 & -1 \end{bmatrix}$$

$$\begin{bmatrix} 4 & 3 \\ 2 & -1 \end{bmatrix}^{-1} = \frac{1}{10}\begin{bmatrix} 1 & 3 \\ 2 & -4 \end{bmatrix}$$

$$\begin{bmatrix} 1 & i \\ 2 & 2i \end{bmatrix} \text{ -- this matrix is not invertible as its determinant is } 0$$
8
Our Space in the Universe
Our space in the universe is called a Hilbert space. A Hilbert space is a type of vector
space that has certain properties – properties that we will develop in this chapter. The
most important will be defining an inner product. Once this is done, we will be able to
measure distance and angles between vectors in an n-dimensional complex space. We will
also be able to measure the length of vectors in these spaces. Later in the chapter, we will
look at putting these Hilbert spaces together into even bigger Hilbert spaces through the
tensor product!
The other main topic of this chapter is linear operators. We will go through many types,
showing the distinct properties of each.
In this chapter, we are going to cover the following main topics:

• The inner product


• Orthonormality
• The outer product
• Operators
• Types of operators
• Tensor products

The inner product


An inner product can actually be any function that follows a few properties, but we
are going to zero in on one definition of the inner product that we will use in quantum
computing. Here it is:

 x1   y1 
    n
| x〉 ,| y〉 ≡  ⋮ ,
  ⋮  = ∑ xi* yi = x1* y1 + ⋯ + xn* yn
i =1
 xn   yn 
   
Mathematicians use the preceding notation for the inner product, but Dirac defined it
with a bra and ket, calling it a bracket:

$$\langle x|y\rangle \equiv \langle |x\rangle, |y\rangle \rangle$$

Now, if we define a bra to be the conjugate transpose of its corresponding ket, so that if:

$$|x\rangle = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$
Then, ⟨x| is now:

$$\begin{bmatrix} x_1^* & \cdots & x_n^* \end{bmatrix}$$
We can then define a bracket as just matrix multiplication!

 y1 
  n
〈 x | y〉 =  x1* ⋯ xn*  ⋮ = ∑ xi* yi = x1* y1 + ⋯ + xn* yn
   i =1
 yn 
 
Pretty cool, eh? That is one of the reasons why bra-ket notation is so convenient! You
should notice something else too. The bra ⟨x| is a linear functional. It will take any
vector |y⟩ and give you a scalar according to the inner product formula!

Let's look at an example. Let's say |x⟩ and |y⟩ are defined this way:

3   2 
   
| x〉 =  i  | y〉 =  −i 
 2i   3 
   

Then, the bras are going to be:

〈 x |=  3 −i −2i  〈 y |=  2 i 3 

Now, let's calculate the inner product ⟨y|x⟩:

 3   2 ⋅ 3 + i ⋅ i + 3 ⋅ 2i
  
y x =  2 i 3  ⋅  i  =  6 − 1 + 6i
 2i   5 + 6i
  

Let's reverse the inner product and calculate ⟨x|y⟩:

 2   3 ⋅ 2 + −i ⋅ −i + −2i ⋅ 3
  
x y =  3 −i −2i  ⋅  −i  =  6 − 1 − 6i
 3   5 − 6i
  

You might have noticed that the preceding answers are complex conjugates of each other;
that's because:

〈 x | y 〉 = 〈 y | x〉 ∗ .
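A quick way to see this conjugate symmetry on a computer is NumPy's vdot function, which conjugates its first argument, just like a bra. Here is a minimal sketch, assuming NumPy is available:

```python
import numpy as np

# |x> and |y> from the example above.
x = np.array([3, 1j, 2j])
y = np.array([2, -1j, 3])

print(np.vdot(y, x))   # <y|x> = (5+6j)
print(np.vdot(x, y))   # <x|y> = (5-6j), the complex conjugate of <y|x>
```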
Since bras are linear functionals, brackets are left-distributive (α and β are scalars):

$$\langle y|\left(\alpha|x\rangle + \beta|z\rangle\right) = \alpha\langle y|x\rangle + \beta\langle y|z\rangle \qquad (1)$$

And they are also right-distributive:

$$\left(\alpha\langle x| + \beta\langle z|\right)|y\rangle = \alpha\langle x|y\rangle + \beta\langle z|y\rangle$$

Let's test the left-distributive property in an example where we evaluate both sides of
Equation (1) separately. We will define all the variables first. Our two scalars will be α = 3
and β = 2. Our vectors are defined as follows:

2 6
   
x = 1  z = 4  y =  1 3 4 
0 2
   
Alright, here we go. Let's evaluate the left side of Equation (1) first:

 2  6 
    
y (α x +β z ) =  1 3 4   3  1
  0

+2 4  
  2  
     

  6   12  
     
 1 3 4    3 + 8  
 0  4  
     

 18 
 
 1 3 4   11  = 18 + 33 + 16 = 67
 4 
 

And now for the right side of Equation (1):

 2    6 
       
α y x + β y z = 3   1 3 4   1   + 2   1 3 4   4  
 0     2  
       

3(2 + 3 + 0) + 2(6 + 12 + 8)

3 ⋅ 5 + 2 ⋅ 26

15 + 52 = 67

Luckily, our answers match. Whew! I didn't want to have to go back and do that again!
Let's move on to see how the inner product can be used in vector spaces.

Orthonormality
In this section, we will look at the concepts of the norm and orthogonality to come up
with orthonormality.

The norm
We can define a metric on our vector spaces called the norm and denote it this way,
∥x∥, where x is the vector on which the norm is being measured. In two- and three-
dimensional Euclidean spaces, it is often called the length of a vector, but in higher
dimensions, we use the term norm. It gives us a way to measure vectors.
We define the norm using our inner product from the previous section, like so:

$$\|x\| = \sqrt{\langle x|x\rangle}$$

As always, let's look at an example. What is the norm of the vector |x⟩ here?

$$|x\rangle = \begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix}$$

Well, let's work it out:

$$\|x\| = \sqrt{\langle x|x\rangle} = \sqrt{\begin{pmatrix} 2 & 3 & 4 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix}} = \sqrt{4 + 9 + 16} = \sqrt{29}$$
As you can see, the norm, ∥x∥, of |x⟩ is the square root of 29.

Normalization and unit vectors


Oftentimes, especially in quantum computing, we will want to represent our vectors by
something called a unit vector. The word unit refers to the fact that the norm of these
vectors is one. How do we achieve this? Well, we divide a vector by its norm in a process
called normalization.

A unit vector |u⟩ is found for the vector |v⟩ using the following formula:
$$|u\rangle = \frac{|v\rangle}{\|v\|}$$

What is the unit vector for the vector |x⟩ from the previous section? Well, we just divide
|x⟩ by its norm to obtain:

$$|u\rangle = \frac{|x\rangle}{\|x\|} = \frac{1}{\sqrt{29}}\begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix} = \begin{pmatrix} 2/\sqrt{29} \\ 3/\sqrt{29} \\ 4/\sqrt{29} \end{pmatrix}$$

You should take the norm of |u⟩ to ensure that it is indeed 1.
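Here is a small NumPy sketch of exactly that check, computing the norm and then normalizing:

```python
import numpy as np

x = np.array([2, 3, 4], dtype=complex)

norm = np.linalg.norm(x)     # sqrt(29)
u = x / norm                 # the unit vector |u>

print(norm)                  # 5.385...
print(np.linalg.norm(u))     # 1.0, so |u> really is a unit vector
```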

Orthogonality
Orthogonal may not be a word you've seen before, but I bet you are familiar with
the word perpendicular. For instance, the two vectors in the following graph are
perpendicular to each other:

Figure 8.1 – Two perpendicular vectors


Orthogonality is taking the concept of vectors being perpendicular to each other to higher
dimensions of ℂn.

The inner product is helpful in this regard as well. If two vectors are orthogonal to each
other, their inner product will be 0. Let's see whether the two vectors in Figure 8.1 are
orthogonal according to the inner product:

1
 −1 1    = −1 ⋅ 1 + 1 ⋅ 1 = −1 + 1 = 0
1

Indeed, they are orthogonal, as their inner product is zero. Are the next two
vectors orthogonal?

2 6
   
x = 1  z = 4 
0 2
   
Well, let's see!

6
 
x z =  2 1 0   4 
2
 

x z = 12 + 4 + 0 = 16

So, no – they're not! Now, it's your turn.

Exercise one
Determine whether each pair of vectors is orthogonal using the inner product:
1  2 
   
 2  ,  −2 
 3   −1 
1.    
 0   −1 
   
 1   1 
,
 2   2 
   
2.  −1   5 

Orthonormal vectors
So, if we combine the concept of orthogonality and normalization, we get orthonormal
vectors. These are very important in the study of vector spaces and quantum computing.
Let's see whether we can come up with some orthonormal vectors in ℂ2. We'll try the
computational basis vectors:
1 0
0 =  1 = 
0 1
Are they orthogonal?
1
1 0 =  0 1    = 0 ⋅ 1 + 1 ⋅ 0 = 0
0
Well, their inner product is zero, so they are orthogonal! Are they unit vectors?

2 1
0 = 0 0 =  1 0    = 1 ⋅ 1 + 0 ⋅ 0 = 1
0

2 0
1 = 1 1 =  0 1    = 0 ⋅ 0 + 1 ⋅ 1 = 1
1
Yes! The square root of their own inner product (aka their norm) is one. So |0⟩ and |1⟩
are orthonormal (hint: that's one of the reasons they were chosen as the canonical 0
and 1 vectors).
Let's look at another set of vectors that are important in quantum computing. They are
labeled with plus and minus signs:

 1   1 
   
2  2 
+ = − =
 1   1 
  − 
 2   2 

What do you think? Orthonormal? Or just normal? How do we know? That's right – we
use our formulas. Let's see whether they are orthogonal first:

 1   1 1  1   1 
 1     ⋅ + − ⋅ 
− + = −
1  2  =  2 2  2   2 

 2 2   1  
   1 1
− =0
 2   2 2
Okay, so their inner product is zero. Therefore, they are orthogonal. Are they normal or
abnormal (he-he)?

 1 
 1   
+
2
= + + =
1
  2  = 1 + 1 =1
 2 2   1  2 2
 
 2 
 1 
 1 1   2  1 1
   = + =1
2
− = − − = −
 − 1  2 2
 2 2
 
 2 
Looks like they're normal! So |+⟩ and |-⟩ are orthonormal vectors.

The Kronecker delta function


Almost all vectors in quantum computing are unit vectors and many are orthogonal to
each other. When you are dealing with orthonormal vectors and doing computations, it is
convenient to use the Kronecker delta function. It is defined thusly:

$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}$$

So, when you see that symbol, if the indices of i and j are equal, then it is one. Otherwise,
it is zero.

A good example is the identity matrix. If you replace the entries with the Kronecker delta
function, you will get ones on the main diagonal when the indices of the entries equal
each other:

$$I = \begin{pmatrix} \delta_{11} & \delta_{12} \\ \delta_{21} & \delta_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
More importantly, when you have a basic set of orthonormal vectors, the inner product
of a vector with itself is one because they are normalized. The inner product is zero when
you use any other basis vector because they are orthogonal. For instance, take the basis
set B with orthonormal vectors:

B = { | 0〉 ,| 1〉,| 2〉, ..,| n〉 } ,

We can then use the Kronecker delta function to succinctly represent their inner product:

〈i | j 〉 = δ ij

I want you to know this function because you will see it in quantum computing literature,
and it helps us in succinctly describing computations.

The outer product


The outer product is interesting because it is again the matrix multiplication of two
vectors, but this time, the vectors are reversed, producing a matrix. The inner product uses
the matrix multiplication of two vectors to get a scalar. The outer product uses two vectors
to produce a matrix. Formally, the outer product is defined to be:

$$|u\rangle\langle v| = \begin{pmatrix} u_1 v_1^* & u_1 v_2^* & \cdots & u_1 v_n^* \\ u_2 v_1^* & u_2 v_2^* & \cdots & u_2 v_n^* \\ \vdots & \vdots & \ddots & \vdots \\ u_m v_1^* & u_m v_2^* & \cdots & u_m v_n^* \end{pmatrix}$$

If you remember matrix multiplication from Chapter 2, The Matrix, we have the following
situation when multiplying an m × n matrix and an n × p matrix. They produce an m × p
matrix, as shown in the following diagram. Since we are dealing with vectors, we have an
m × 1 matrix and a 1 × p matrix:

Figure 8.2 – The schematics of matrix multiplication


Let's look at an example. First, we have two vectors |u⟩ and |v⟩:

 7
 3   
  2
u = 2  v =
 3
 −1   
  1 


Now, let's do the outer product with them:

 3 
 
u v =  2   7 2 3 1 
 −1 
 
 3 ⋅ 7 3 ⋅ 2 3 ⋅ 3 3 ⋅1 
 
=  2 ⋅ 7 2 ⋅ 2 2 ⋅ 3 2 ⋅1 
 −1 ⋅ 7 −1 ⋅ 2 −1 ⋅ 3 −1 ⋅ 1 
 
 21 6 9 3 
 
=  14 4 6 2 
 −7 −2 −3 −1 
 

Is the outer product commutative? Well, let's try:

7 
 
2
v u =   3 2 −1 
3  
 
 1 
 7⋅3 7⋅ 27 ⋅ −1 
 
2⋅3 2⋅ 22 ⋅ −1 
=
 3⋅3 3⋅ 23 ⋅ −1 
 
 1⋅ 3 1⋅ 2
1 ⋅ −1 
 21 14 −7 
 
6 4 −2 
=
 9 6 −3 
 
 3 2 −1 
Apparently not! So, in general, |u⟩⟨v| ≠|v⟩⟨u|. Look closely at the matrices for the final
answers though. Do you see any similarities? That's right – they are the transpose of each
other! This leads us to the following property of the outer product:

$$(|u\rangle\langle v|)^\dagger = |v\rangle\langle u|$$
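If you want to reproduce these outer products numerically, here is a minimal NumPy sketch (the second argument is conjugated explicitly to match the definition above; the entries here happen to be real, so it makes no visible difference):

```python
import numpy as np

u = np.array([3, 2, -1], dtype=complex)
v = np.array([7, 2, 3, 1], dtype=complex)

uv = np.outer(u, v.conj())   # |u><v|, a 3 x 4 matrix
vu = np.outer(v, u.conj())   # |v><u|, a 4 x 3 matrix

print(uv)
print(np.allclose(uv.conj().T, vu))   # True: (|u><v|)^dagger = |v><u|
```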
You should also note that the outer product is distributive over addition, so:

$$\left(|v\rangle + |w\rangle\right)\langle u| = |v\rangle\langle u| + |w\rangle\langle u| \qquad |u\rangle\left(\langle v| + \langle w|\right) = |u\rangle\langle v| + |u\rangle\langle w| \qquad (2)$$

So, let's try out Equation (2) on our friends |0⟩ and |1⟩. Here's the left side of the first equality:

$$\left(|0\rangle + |1\rangle\right)\langle 1| = \left( \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) \begin{pmatrix} 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$$

As an exercise, compute the right side of Equation (2), and make sure that you get the
same answer as you just got for the left side.

Exercise two
Compute the following:

$$|0\rangle\langle 1| + |1\rangle\langle 1|$$

Operators
In this section, we will consider linear operators. We first described these in Chapter 5,
Transforming Space with Matrices. To reiterate, linear operators are linear transformations
that map vectors from and to the same vector space. They are represented by square
matrices. For just this section, I will put a "hat" or caret on the top of operators and use just
the uppercase letter for matrices, as I want to be deliberate when referencing operators.
For instance, let's look at the X̂ operator that transforms the zero and one states:

$$\hat{X}|0\rangle = |1\rangle \qquad \hat{X}|1\rangle = |0\rangle$$

Now, let's come up with a matrix that represents this operator. The question becomes,
which basis will we use? Let's use the computational basis, which is |0⟩ and |1⟩. I will
denote this set of basis vectors by the letter C. So, the X̂ operator in the C basis is
represented by:

0 1
XC =  
1 0

Now, I want to come up with a matrix representation of X̂ in the |+⟩, |-⟩ basis, which I will
denote as +,-. Here is the matrix representation of the X̂ operator in the +,- basis:

1 0 
X + ,− =  
 0 −1 

Now that we have refreshed our knowledge around linear operators (which I will
sometimes just call operators), let's look at another way we can represent an operator.

Representing an operator using the outer product


Can we use the outer product to represent an operator? Well, since the computation of an
outer product results in a matrix, we most certainly can. But, given the previous discussion,
since the outer product is a matrix, multiple outer products can represent an operator.
Let's show that we can by writing an equation in which a matrix A transforms a vector:
A | x〉 = | y 〉 (3)
Now, let's try to represent our matrix A by an outer product between two vectors
|u⟩ and |v⟩:

A = | u〉〈v |
Okay, let's rewrite Equation (3) using our new outer product representation of the
matrix A:

( | u 〉 〈 v | ) | x〉 = | y 〉
Due to the properties of the inner product and outer product that we have enumerated in
the previous sections, we can rewrite this as:

| u 〉 ( 〈 v || x〉 ) =| y〉
Due to bra-ket notation, we can rewrite this with a bracket:
| u 〉 ( 〈 v | x〉 ) = | y 〉 (4)
We know that a bracket denotes an inner product and therefore produces a scalar, like so:
〈 v | x〉 = α
α ∈ℂ
We can take our scalar α and put it back into Equation (4) and rearrange it:
| u 〉α = | y〉
α | u〉 = | y〉

We now see that the ket |y⟩ that was a product of the matrix multiplication in Equation (3)
is just another ket proportional to |u⟩. This is great because it shows that we can represent
a linear operator with an outer product in bra-ket notation. Not only that, the resulting ket
will be proportional to the ket we use in the outer product.

So, how can this be used? Let's say we have an operator Ô defined with our plus and minus
kets from before. This gives us a matrix O defined in an outer product, like so:
O = | +〉 〈− |

Now, we want to know what our new operator Ô does to the minus ket. We can do it
without ever resorting to matrices, like so:
𝑂𝑂̂|−⟩ = 𝑂𝑂|−⟩ = (|+⟩⟨−|)|−⟩
𝑂𝑂̂|−⟩ = |+⟩⟨−|−⟩
𝑂𝑂̂|−⟩ = |+⟩ ⋅ 1 = |+⟩

So, our operator Ô turns the minus ket into the plus ket. You try it now.

Exercise 3
What does the operator Ô do to the plus ket? In other words, what is the following?
𝑂𝑂̂|+⟩

The completeness relation


The following relation goes under a few names (the closure relation and the resolution of
identity), but I will go with the name completeness relation. I will just state it without
proving it, as the proof is rather laborious. We start with an orthonormal basis B where:

B = { | 0〉 ,| 1〉,| 2〉, ..,| n〉 } ,

Then, the identity operator Î can be written as an outer product, like so:
$$\hat{I} = \sum_{i=0}^{n} |i\rangle\langle i| \qquad (5)$$

That's it. Doesn't seem like much, does it? But it is used over and over again in quantum
computing to manipulate bra-ket expressions.
So, let's apply this to an example. Let's take our basis to be the plus and minus kets
{|+⟩,|-⟩}. How then do we write the identity operator in this basis using the outer product
representation? Well, using Equation (5), we get:

I = | +〉 〈+ | + | −〉 〈− |

Now, let's test the completeness relation. If we apply the identity operator to the plus ket,
we should get the plus ket back. Let's see:

I | +〉 = ( | +〉 〈+ | + | −〉 〈− | ) | +〉
Distribute the plus ket
I | +〉 = ( | +〉 〈+ | +〉 + | −〉 〈− | +〉 )
Calculate the inner products
I | + 〉 = ( | + 〉 ⋅ 1 + | −〉 ⋅ 0 ) = | + 〉

Indeed, we do get the plus ket back, and hence, we can see how the completeness relation
works in a simple example.

The adjoint of an operator


As we saw, we can manipulate operators using bra-ket notation without resorting to a
matrix representation. We also saw the use of the dagger symbol when looking at the
conjugate transpose. Now, I'd like to take a step back and put this all together to define the
adjoint of an operator.
In this definition, we need to remember that for every ket |x⟩, there is a bra ⟨x|. Let's say
we have an operator  that acts on |x⟩ to give us |y⟩:

𝐴𝐴̂|𝑥𝑥⟩ = |𝑦𝑦⟩ ,
Is there an associated operator we can use on the bra ⟨x| to give us the associated bra ⟨y|of
the ket |y⟩? It happens that there is, and it is named the adjoint of the operator  and
written this way:

$$\langle y| = \langle x|\hat{A}^\dagger$$
Now, there are rules when manipulating adjoint operators that you must keep in mind.
First, the adjoint of a scalar is its complex conjugate:

$$\alpha^\dagger = \alpha^*, \qquad \alpha \in \mathbb{C}$$
You can distribute the adjoint operation amongst scalar multiplication of operators, like so:

$$(\alpha\hat{A})^\dagger = \alpha^*\hat{A}^\dagger$$

I will reiterate that bras and kets are the adjoints of each other:

$$(|x\rangle)^\dagger = \langle x| \qquad (\langle x|)^\dagger = |x\rangle$$
If you take the adjoint of an adjoint, you get back the original operator:

(𝐴𝐴̂† )† = 𝐴𝐴̂
You cannot distribute the adjoint across the multiplication of operators; rather, they
anti-commute:

(𝐴𝐴̂𝐵𝐵̂)† = 𝐵𝐵̂† 𝐴𝐴̂†


Finally, the adjoint can be distributed across a sum of operators:

(𝐴𝐴̂ + 𝐵𝐵̂ + 𝐶𝐶̂ )† = 𝐴𝐴̂† + 𝐵𝐵̂† + 𝐶𝐶̂ †

I know this seems like a lot of rules, but you will need them when you manipulate bra-ket
expressions in quantum computing.
Finally, if an operator is expressed as an outer product, we can take its adjoint in the
following way:

𝐴𝐴̂ = |𝑥𝑥⟩⟨𝑦𝑦|
𝐴𝐴̂† = |𝑦𝑦⟩⟨𝑥𝑥|

When an operator is represented as a matrix on an orthonormal basis, then its adjoint is


the conjugate transpose of the matrix:
𝐴𝐴̂† = 𝐴𝐴†

Let's look at a quick example. The X̂ operator we have looked at previously can be
represented in the computational basis as an outer product thusly:

X =| 0〉 〈1 | + | 1〉 〈 0 |
Now, let's take the adjoint of X:

$$X^\dagger = (|0\rangle\langle 1| + |1\rangle\langle 0|)^\dagger = (|0\rangle\langle 1|)^\dagger + (|1\rangle\langle 0|)^\dagger = |1\rangle\langle 0| + |0\rangle\langle 1|$$

If you look closely, you should notice that the adjoint of X is equal to the original X. That's
because X̂ is Hermitian, which we will discuss in our next section.

Types of operators
There are certain types of linear operators that are special and need to be defined so
that we can refer to them later in the book. You will also hear of them all the time in
quantum computing.

Normal operators
Normal operators are ones that commute with their adjoint. For an operator Â, if:
𝐴𝐴̂𝐴𝐴̂† = 𝐴𝐴̂† 𝐴𝐴̂ (6)
then  is normal. They are important because a normal operator is diagonalizable, which
is something we will consider later in the book. The following operators (Hermitian,
unitary, positive, and positive semi-definite) are all normal operators.
A normal matrix represents a normal operator, and it commutes with its conjugate
transpose. Let's look at an example normal matrix A:

$$A = \begin{pmatrix} -i & 2 + 3i \\ -2 + 3i & 0 \end{pmatrix}$$

Its conjugate transpose is:

$$A^\dagger = \begin{pmatrix} i & -2 - 3i \\ 2 - 3i & 0 \end{pmatrix}$$

Now, let's see if A commutes with its conjugate transpose. We'll calculate the left side of
Equation (6) first:
$$AA^\dagger = \begin{pmatrix} -i & 2 + 3i \\ -2 + 3i & 0 \end{pmatrix} \begin{pmatrix} i & -2 - 3i \\ 2 - 3i & 0 \end{pmatrix} = \begin{pmatrix} (-i)i + (2 + 3i)(2 - 3i) & (-i)(-2 - 3i) + (2 + 3i) \cdot 0 \\ (-2 + 3i)i + 0 \cdot (2 - 3i) & (-2 + 3i)(-2 - 3i) + 0 \cdot 0 \end{pmatrix} = \begin{pmatrix} 14 & -3 + 2i \\ -3 - 2i & 13 \end{pmatrix}$$

And now we do the same for the right side of Equation (6):

$$A^\dagger A = \begin{pmatrix} i & -2 - 3i \\ 2 - 3i & 0 \end{pmatrix} \begin{pmatrix} -i & 2 + 3i \\ -2 + 3i & 0 \end{pmatrix} = \begin{pmatrix} i(-i) + (-2 - 3i)(-2 + 3i) & i(2 + 3i) + (-2 - 3i) \cdot 0 \\ (2 - 3i)(-i) + 0 \cdot (-2 + 3i) & (2 - 3i)(2 + 3i) + 0 \cdot 0 \end{pmatrix} = \begin{pmatrix} 14 & -3 + 2i \\ -3 - 2i & 13 \end{pmatrix}$$

And there you go – the answers match and, therefore, our example matrix A is a
normal matrix!
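Here is the same check as a minimal NumPy sketch, assuming NumPy is available:

```python
import numpy as np

# The example matrix A from above.
A = np.array([[-1j, 2 + 3j],
              [-2 + 3j, 0]])

left = A @ A.conj().T    # A A^dagger
right = A.conj().T @ A   # A^dagger A

print(left)
print(np.allclose(left, right))   # True, so A is normal
```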
Normal operators and matrices have special properties, namely:

• They are diagonalizable; this will come up in the next chapter.


• Their eigenvalues are the conjugates of the eigenvalues of their adjoints.
• The eigenvectors associated with different eigenvalues are orthogonal.
• A vector space can be defined by an orthogonal basis set of these eigenvectors.

The operators that follow are all special types of normal operators.

Hermitian operators
The definition of a Hermitian operator is rather terse and simple as well. A Hermitian
operator is an operator that is equal to its adjoint:
𝐴𝐴̂ = 𝐴𝐴̂†

You will also hear these referred to as self-adjoint operators.


Since Hermitian operators are normal, they have all their properties plus one more:

• All their eigenvalues are real numbers

Hermitian operators play an important part in quantum computing, as all measurements


of a quantum state are done via a Hermitian operator.

Unitary operators
Unitary operators are very important, as they describe the evolution of a quantum
state and therefore all gates in quantum computing are unitary. They also have a very
simple definition:

$$\hat{U}^{-1} = \hat{U}^\dagger$$
Using this definition, we can also derive that:

$$\hat{U}^{-1}\hat{U} = \hat{U}\hat{U}^{-1} = \hat{I} \qquad\qquad \hat{U}^\dagger\hat{U} = \hat{U}\hat{U}^\dagger = \hat{I}$$
Unitary operators have two unique properties. First, their eigenvalues are complex
numbers of modulus one:
$$|\lambda| = 1, \qquad \lambda = e^{i\theta}, \ \theta \in \mathbb{R}$$

And they preserve the inner product.


Let's quickly prove that they preserve the inner product using bra-ket notation:

$$\hat{U}|x\rangle = |x'\rangle, \qquad \hat{U}|y\rangle = |y'\rangle$$

$$\langle x'|y'\rangle = \left(\langle x|\hat{U}^\dagger\right)\left(\hat{U}|y\rangle\right) = \langle x|\underbrace{\hat{U}^\dagger\hat{U}}_{\hat{I}}|y\rangle = \langle x|\hat{I}|y\rangle = \langle x|y\rangle$$


A consequence of unitary operators preserving the inner product is that they also preserve
the norm of transformed vectors.
Unitary operators are represented by unitary matrices, and similarly:

$$U^{-1} = U^\dagger$$
In general, unitary matrices are expressed this way:

$$U = \begin{pmatrix} a & b \\ -e^{i\varphi}b^* & e^{i\varphi}a^* \end{pmatrix}, \qquad |a|^2 + |b|^2 = 1$$

The determinant is:

$$\det(U) = e^{i\varphi}$$
Since all quantum computing gates are unitary, let's look at one and test it. The phase shift
gate is represented in the computational basis by the following matrix:

1 0 
P (ϕ ) =  iϕ 
0 e 
where ϕ is the phase shift with a period of 2π
To find the inverse of a matrix, we use our formula from Chapter 7, Eigen-Stuff:

$$A^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{\det(A)}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
So, first, we have to find its determinant:

$$\det(P(\varphi)) = 1 \cdot e^{i\varphi} - 0 = e^{i\varphi}$$


Now, we just plug and play:

 0 
−1 1  eiϕ 0  1 
P (ϕ ) = iϕ  = 1 
e  0 1   0
 eiϕ 
Now for the moment of truth – is it unitary?

$$P^\dagger(\varphi) = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\varphi} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{e^{i\varphi}} \end{pmatrix} = P(\varphi)^{-1}$$

Indeed, it is! A quick numerical check follows, and then let's move on to projection operators.
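Here is that numerical check as a minimal NumPy sketch, using an arbitrarily chosen phase:

```python
import numpy as np

phi = np.pi / 3   # any phase works
P = np.array([[1, 0],
              [0, np.exp(1j * phi)]])

print(np.allclose(P @ P.conj().T, np.eye(2)))      # True: P P^dagger = I
print(np.allclose(np.linalg.inv(P), P.conj().T))   # True: P^-1 = P^dagger
```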

Projection operators
We covered projection linear transformations in Chapter 5, Using Matrices to Transform
Space. In that chapter, we defined a projection this way. If you have a linear transformation
P, then if the following condition holds, it is a projection:
$$P^2 = P \qquad (7)$$
In quantum computing, they are defined a little differently:
Given a normalized state |ψ⟩, the projection operator for this state is:

$$\hat{P} = |\psi\rangle\langle\psi| \qquad (8)$$
Equation (7) still holds for projection operators, but given the definition in Equation (8),
you also get that the projection operator is Hermitian:

$$\hat{P} = \hat{P}^\dagger$$

All Projections Are Orthogonal in Quantum Computing


While there are non-orthogonal projection transformations in mathematics, we
do not use them in quantum computing, so when you hear projection, assume
that it is an orthogonal projection unless explicitly told otherwise.

If two projection operators commute, then their product is also a projection operator:

If $\hat{P}_1\hat{P}_2 = \hat{P}_2\hat{P}_1$, then $\hat{P}_1\hat{P}_2$ is also a projection operator.
In quantum computing, we often project one vector space onto another. Let's say I have
a vector in an n-dimensional vector space defined by the basis {|0⟩,|1⟩,…|n⟩} and I want
to project it onto an m-dimensional subspace defined by the basis {|0⟩,|1⟩,…|m⟩}. Both
bases are orthonormal. Then, the projection operator that projects onto our subspace is
defined by:
$$P_m \equiv \sum_{i=1}^{m} |i\rangle\langle i|$$

The only eigenvalues a projection operator can have are zero and one. Now, let's move on
to positive operators.

Positive operators
Positive operators are happy and optimistic, always looking on the bright side of life. Okay,
maybe that's a joke :) Mathematically, positive operators are Hermitian. There are actually
two types of positive operators, and they are only slightly different.
An operator  is said to be positive definite if:
$$\langle\psi|\hat{A}|\psi\rangle > 0 \qquad (9)$$
Please consult the appendix on Bra-Ket notation if you are unfamiliar with the notation in
Equation (9).
An operator  is said to be positive semidefinite if:

$$\langle\psi|\hat{A}|\psi\rangle \geq 0$$
So the only difference is that positive definite operators do not include zero in their
definition. All eigenvalues of positive operators are non-negative.
Okay, well, we're done with types of operators! There are quite a few, but they will come up
in different segments of quantum computing, so try to keep them all straight.

No More Hats
As I said at the beginning of the section on operators, I will drop the hat or
caret on top of operators, and you should be able to derive whether I mean an
operator or a matrix from the context in which it is used.

Tensor products
Tensor products are a way to combine vector spaces. One of the postulates of quantum
mechanics is that the state of a qubit is completely described by a unit vector in a Hilbert
space. The problem then becomes how to deal with more than one qubit. This is where a
tensor product comes in. Each qubit has its own Hilbert space, and to describe many qubits
as a system, we need to combine all their Hilbert spaces into one bigger Hilbert space.
Mathematically, that means that if we have a Hilbert space H and another Hilbert space J,
we denote their tensor product as:

M =H ⊗J

If H is an h dimensional space and J is a j dimensional space, then the dimension of the


combined space M is h ⋅ j. In other words:

dim( M ) = dim( H ) ⋅ dim( J )


Before we go any farther, let's look at the tensor product of two vectors.

The tensor product of vectors


The tensor product of two vectors is denoted in the following way in bra-ket notation.
You'll notice that there are four different ways to notate it, so be careful when you come
across it in quantum computing literature and know exactly what you are dealing with:

$$|u\rangle|v\rangle \quad\text{or}\quad |u\rangle \otimes |v\rangle \quad\text{or}\quad |uv\rangle \quad\text{or}\quad |u, v\rangle$$

Here are some of the properties of the tensor product:

• For a scalar s and a vector |u⟩ in U and vector |v⟩ in V, then:

s (| u 〉⊗ | v〉 ) = ( s | u 〉 )⊗ | v〉 =| u 〉 ⊗ ( s | v〉 )
• It is both right and left distributive. For vectors |v⟩ and |w⟩ in V and vectors |u⟩ and
|z⟩ in U, then:

( | v〉 + | w〉 ) ⊗ | u〉 = | v〉⊗ | u〉 + | w〉⊗ | u〉
| v〉 ⊗ ( | u 〉 + | z 〉 ) =| v〉 ⊗ | u 〉 + | v〉 ⊗ | z〉

• It is not commutative, so in general:

| v〉 ⊗ | u〉 ≠| u 〉 ⊗ | v〉
Now, let's look at how we do the tensor product for actual column vectors. When the
tensor product is implemented with arrays of numbers, it is actually the Kronecker
product. Keep this in mind when researching and reading about the tensor product.

So, here is the Kronecker product defined mathematically for vectors. The tensor product
of vectors is defined as:

$$|x\rangle \otimes |y\rangle = |xy\rangle = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_h \end{pmatrix} \otimes \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} = \begin{pmatrix} x_1 \cdot y_1 \\ x_1 \cdot y_2 \\ \vdots \\ x_1 \cdot y_k \\ x_2 \cdot y_1 \\ x_2 \cdot y_2 \\ \vdots \\ x_h \cdot y_1 \\ x_h \cdot y_2 \\ \vdots \\ x_h \cdot y_k \end{pmatrix}$$
In other words, take the first component of |x⟩ and multiply it by every component of |y⟩,
then take the second component of |x⟩ and multiply it by every component of |y⟩, and
repeat this procedure until all of the components of |x⟩ are exhausted! That's a lot to say!
Let's look at the definition with just 2 × 1 vectors:

$$|\psi\rangle = \begin{pmatrix} a \\ b \end{pmatrix} \otimes \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} ac \\ ad \\ bc \\ bd \end{pmatrix}$$

That's not bad, right? By the way, if you noticed, the tensor product of two vectors always
produces another vector. Here is an example for you:

 2⋅3   6
   
 2   3   2 ⋅1   2
 ⊗  = =
 
 0   1   0⋅3   0

 0 ⋅ 1   0 

Alright, how about some examples with our two favorite vectors |0⟩ and |1⟩? Let's do a
tensor product between them:

1  0 
   
1 1 0  1 0 1 
0 ⊗ 0 = 00 =   ⊗   = 0 ⊗ 1 = 01 =   ⊗   =
   
0 0 0  0 1 0 
 0   0 

Exercise four
• What is |11⟩?
• What is |10⟩?

The basis of tensor product space


You will remember that we can completely describe a vector space by its basis. The same
is true of a tensor product space. To get the basis of the tensor product space, you take the
tensor product of each basis vector in one space with every basis vector in the other space
involved in the tensor product. I hope you remember the Cartesian product from Chapter
3, Foundations. A quick reminder is that if I have two sets A={x, y, z} and B={1, 2, 3},
their Cartesian product is all the ordered pairs shown in the following graphic:

Figure 8.3 – An example of the Cartesian product of A × B [1]


So, another way to describe the basis of the new tensor product space is to first take the
Cartesian product of all basis vectors and then do the tensor product between all the
ordered pairs. And there you go!

Let's look at an example. What if we had two vector spaces, U and V, that were both
two-dimensional? The basis for U is {|0⟩,|1⟩}, and the basis for V is {|+⟩,|-⟩}. Then, what
is the basis for the tensor product of U and V? We have to calculate all the following
tensor products:

| 0〉 ⊗ | +〉
| 0〉 ⊗ | −〉
| 1〉 ⊗ | +〉
| 1〉 ⊗ | −〉

So let's look at one of these and calculate it using the Kronecker product:

 12 
 
    1 
1
1
⊗ = 2 
2
| 0〉 ⊗ | +〉 = 
    0 
1
0 2 
 0 
 

So, that is one of the four vectors that are in the basis of our tensor product. Of course,
once we have a basis, we can describe every vector in the space as a linear combination of
our new basis vectors.

Exercise five
Calculate the Kronecker product of the following:

| 0〉 ⊗ | −〉
| 1〉 ⊗ | +〉
| 1〉 ⊗ | −〉

The tensor product of operators


In bra-ket notation, there are rules to follow about the tensor product of linear operators.
First, let's define the tensor product of operators in bra-ket notation. Let's say we have a
vector |v⟩ and linear operator A in vector space V. We also have a vector |u⟩ and linear
operator B in vector space U. Then, A ⊗ B on the tensor product space of U ⊗ V is
defined as:

( A ⊗ B ) ( | v〉⊗ | u〉 ) ≡ A | v〉 ⊗ B | u〉
Let's look at an example. We'll say we have two familiar linear operators X and Z and
vector states |0⟩ and |1⟩. Here is the math of our example:
$$(X \otimes Z)\left(|0\rangle \otimes |1\rangle\right) = X|0\rangle \otimes Z|1\rangle$$
$$X|0\rangle = |1\rangle \ \text{ and } \ Z|1\rangle = -|1\rangle, \text{ so} \qquad (10)$$
$$X|0\rangle \otimes Z|1\rangle = |1\rangle \otimes -|1\rangle = -|11\rangle$$
The Kronecker product of matrices
When we represent operators with matrices, we can calculate the tensor product using
the Kronecker product. It is very similar to the Kronecker product for vectors, as column
vectors are just n × 1 matrices. You basically take the first matrix entries and multiply
them by the second matrix. Here's the definition. Given an m x n matrix A and a p × q
matrix B, then their Kronecker product is an (m ⋅ p) × (n ⋅ q) matrix, like so:

 a11B ⋯ a1n B 
 
A⊗B =  ⋮ ⋱ ⋮ 
 am1B ⋯ amn B 
 
That's pretty abstract; let's look at an example. Let's make matrix A a 2 × 2 matrix and B
a 2 × 2 matrix as well. What will be the dimension of their Kronecker product? From the
definition, we see it will be 4 × 4. Here's the example:

 0 5 0 5    1⋅ 0 1⋅ 5 1⋅ 0 1⋅ 5   0 5 0 5 
 1  1   
1 1   0 5    6 0 6 0    1⋅ 6 1⋅ 0 1⋅ 6
 
1⋅ 0   6 0 6 0 

 ⊗ = = =
3 4 6 0  0 5 0 5    3⋅0 3⋅5 4⋅0 4 ⋅ 5   0 15 0 20 
   
 3 6  4
0

0    3 ⋅ 6 3⋅ 0 4 ⋅ 6 4 ⋅ 0   18 0 24 0 
  6 

Exercise six
What is B ⊗ A?
Now, let's redo our example from earlier, Equation (10), in matrix form to double-check
our work:

$$(X \otimes Z)\left(|0\rangle \otimes |1\rangle\right) = \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right) \left( \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right)$$

$$= \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ -1 \end{pmatrix} = -|1\rangle \otimes |1\rangle = -|11\rangle$$
Indeed, we get the same answer, which is assuring! Finally, let's look at the inner product
for tensor product spaces.

The inner product of composite vectors


This is actually probably the most important definition when it comes to the tensor
product because it enables us to have a Hilbert space as the output of our tensor product.
Without further ado, here is the definition of the inner product for the tensor product of
two vector spaces.
Two Hilbert spaces V and U create a tensor product space W:
W = V ⊗U
For vectors |v⟩ and |x⟩ in V and vectors |u⟩ and |z⟩ in U, we define two composite
vectors in W:
| w〉 =| v〉 ⊗ | x〉
| y〉 =| u〉 ⊗ | z 〉

We now define the inner product of these two composite vectors |w⟩ and |y⟩ as:
〈 w | y〉 = ( 〈 v | ⊗〈 x | ) ( | u 〉 ⊗ | z〉 ) = 〈 v | u 〉 〈 x | z〉 (11)
We were able to simplify in Equation (11) due to the mixed-product property of the
Kronecker product:

( A ⊗ B )(C ⊗ D) = ( AC ) ⊗ ( BD)

The inner product of two of our computation basis vectors in ℂ4 should be zero.
Let's check:

〈00 | 11〉 = ( 〈 0 | 〈 0 | )( | 1〉 | 1〉 ) = 〈0 | 1〉 〈0 | 1〉 = 0 ⋅ 0 = 0
Indeed, it is! Whew!

Exercise seven
Calculate the following inner products:

〈00 | 01〉
〈00 | 10〉
〈10 | 01〉

This concludes our discussion of tensor products.

Summary
We have covered a lot of ground in this chapter, but I hope you feel it's been worth it.
I think it has because it brings everything we have worked on in the previous chapters
together into one framework. Also, we can do real math with quantum computing now
that we just couldn't do before. We will build on this in the last chapter to reach new
heights in quantum computing that, at first, probably looked unattainable!

Answers to exercises
Exercise one
1. No, their inner product is -5.
2. Yes!

Exercise two
0 1
 
0 1

Exercise three
O | +〉 = 0

Exercise four
1. What is |11⟩?

0 
 
0 
0 
 
 1 

2. What is |10⟩?
0 
 
0 
1 
 
 0 

Exercise five
 12 
 
 − 12 
| 0〉 ⊗ | −〉 =  
 0 
 0 
 
 0 
 
 0 
| 1〉 ⊗ | +〉 =  1 
 2 
 12 
 
 0 
 
 0 
| 1〉 ⊗ | −〉 =  1 
 2 
 − 12 
 

Exercise six
 0 0 5 5 
 
 0 0 15 20 
 6 6 0 0 
 
 18 24 0 0 

Exercise seven
They all equal zero, as the states are orthogonal to each other.
9
Advanced Concepts
In this chapter, we will go into some advanced linear algebra concepts. These will not
come up all the time in quantum computing, but when they do, you should know what
they are and where to find information about them. Almost all the topics are about
decomposing a matrix. This becomes important in quantum computing because when we
come up with a unitary transformation that we'd like to do on a quantum computer, we
will only have certain unitary operators to use on it. Then, it becomes a question of which
combination of available operators we should use so that we can perform our overall
unitary transformation. Along the way, we will also look at important inequalities and
how to represent functions that have matrices in them.
In this chapter, we are going to cover the following main topics:

• Gram-Schmidt
• Cauchy-Schwarz and triangle inequalities
• Spectral decomposition
• Singular value decomposition
• Polar decomposition
• Operator functions and the matrix exponential

Gram-Schmidt
The Gram-Schmidt process is an algorithm in which you input a basis set of vectors and
it outputs a basis set that is orthogonal. We can then normalize that set of vectors, and
suddenly, we have an orthonormal set of basis vectors! This is very helpful in quantum
computing and other areas of applied math, as an orthonormal basis is usually the best
basis for computations and representing vectors with coordinates.

Gram-Schmidt Is a Decomposition Tool


While we won't go into it in this book, the Gram-Schmidt process is used in
certain decompositions, so it's good to know from that vantage point too.

Let's look at an example before getting into the nitty-gritty of the actual procedure
(which can be dry and dull). Let's say I have a basis for ℂ2, such as the following:
2  1 
x1 =   x2 =  
0  −2i 

These vectors are not orthogonal, since their inner product does not equal 0:
 1 
x1 x2 =  2 0    =2−0=2
 −2i 

They are also not normalized. Now, I want to get an orthonormal basis for ℂ2 using
these two vectors. Here's the process. The first step is the easiest; we just choose the
first vector in our set, |x1⟩, to be the first vector in our soon to be orthogonal basis set
(denoted by |v1⟩):
2
v1 = x1 =  
0
That was easy enough. This is the easiest step of Gram-Schmidt, which is always the first
step. Now for the second step. Here is the formula:

x2 v1
v2 = x2 − v1
v1 v1
Let's calculate the numerator of the right part of the equation first:
2
x2 v1 =  1 2i    = 2
0

Now for the denominator:

2 
v1 v1 =  2 0   =4
0 
Finally, let's put it all together!

x2 v1
v2 = x2 − v1
v1 v1
x2 v1 2 1
= =
v1 v1 4 2

 1  12
v2 =   −  
 −2i  20
 1  1   0 
v2 =   −  = 
 −2i   0   −2i 

So, our new orthogonal basis for ℂ2 is the following basis set:

2  0 
| v1 〉 =   , | v2 〉 =  
0  −2i 
You should calculate their inner product to ensure it is zero and that they are indeed
orthogonal. So, now all we need to do is normalize them to get an orthonormal basis!
Let's do |v1⟩ first:

$$\|v_1\|^2 = \langle v_1|v_1\rangle = 4, \qquad \|v_1\| = \sqrt{4} = 2$$

$$|e_1\rangle = \frac{|v_1\rangle}{\|v_1\|} = \frac{1}{2}\begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

Now for |v2⟩:

2  0 
v2 = 〈 v2 | v2 〉 =  0 2i   =4
 −2i 
v2 = 4 = 2
v2  0  1  0 
e2 = = ⋅ =  
v2  −2i  2  −i 
We now have an orthonormal basis for ℂ2, as shown in the following:
1  0 
e1 =   , e2 =  
0  −i 
And this all started from two linearly independent vectors using the Gram-Schmidt
process. Pretty cool, eh?
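For larger bases, you would usually let a computer run the procedure. Here is a minimal Python/NumPy sketch of Gram-Schmidt (it assumes the input vectors are linearly independent and normalizes each vector as it goes, which is equivalent to the process above):

```python
import numpy as np

def gram_schmidt(vectors):
    orthonormal = []
    for x in vectors:
        v = x.astype(complex)
        for e in orthonormal:
            v = v - np.vdot(e, v) * e      # subtract the projection onto e
        orthonormal.append(v / np.linalg.norm(v))
    return orthonormal

x1 = np.array([2, 0])
x2 = np.array([1, -2j])
e1, e2 = gram_schmidt([x1, x2])
print(e1)               # [1, 0]
print(e2)               # [0, -1j]
print(np.vdot(e1, e2))  # 0, so they are orthogonal
```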
Now, it's time to lay out the entire process:
Given a basis $B = \{|x_1\rangle, |x_2\rangle, \ldots, |x_p\rangle\}$ for a vector space $V$, define

$$|v_1\rangle = |x_1\rangle$$
$$|v_2\rangle = |x_2\rangle - \frac{\langle x_2|v_1\rangle}{\langle v_1|v_1\rangle}|v_1\rangle$$
$$|v_3\rangle = |x_3\rangle - \frac{\langle x_3|v_1\rangle}{\langle v_1|v_1\rangle}|v_1\rangle - \frac{\langle x_3|v_2\rangle}{\langle v_2|v_2\rangle}|v_2\rangle$$
$$\vdots$$
$$|v_p\rangle = |x_p\rangle - \frac{\langle x_p|v_1\rangle}{\langle v_1|v_1\rangle}|v_1\rangle - \frac{\langle x_p|v_2\rangle}{\langle v_2|v_2\rangle}|v_2\rangle - \cdots - \frac{\langle x_p|v_{p-1}\rangle}{\langle v_{p-1}|v_{p-1}\rangle}|v_{p-1}\rangle$$

Then $\{|v_1\rangle, |v_2\rangle, \ldots, |v_p\rangle\}$ is an orthogonal basis for $V$.
This algorithm will give us an orthogonal basis; then, all we have to do is normalize each
vector, and we get an orthonormal basis! Let's move on to two important inequalities.

Cauchy-Schwarz and triangle inequalities


The Cauchy-Schwarz inequality is one of the most important inequalities in
mathematics. Succinctly stated, it says that the absolute value of the inner product of two
vectors is less than or equal to the norm of those two vectors multiplied together. In fact,
they are only equal if the two vectors are linearly dependent:

$$|\langle u|v\rangle| \leq \|u\|\,\|v\|$$
There are several proofs of this inequality, which I encourage you to seek out if you are
interested. But, in the totality of things, knowing this inequality is all that is really required
for quantum computing.
The other major inequality is the triangle inequality. It comes from our old friend
Euclid in his book The Elements. Succinctly stated, it says that the length of two sides
of a triangle must always be more than the length of one side. They will only be equal
in the corner case when the triangle has zero area. It is very intuitive once you see some
example triangles. Here are some triangles that show how the side z is less than the sum
of sides x and y:

Figure 9.1 – Three example triangles for the triangle inequality [1]

Now, let's state the triangle inequality:

$$\|x + y\| \leq \|x\| + \|y\|$$
You'll notice that the way it is notated is using the norms of vectors. Let's look at an
example. Let's say we have the two following kets:

| x〉 = 2 | 0〉 + 3i | 1〉
| y〉 = 2i | 0〉− | 1〉

Their corresponding bras (that is, their conjugate transpose) are:

〈 x |= 2〈 0 | −3i 〈1 |
〈 y |= −2i 〈0 | −〈1 |
Now, the norm of |x⟩ is:

$$\|x\|^2 = \langle x|x\rangle = \left(2\langle 0| - 3i\langle 1|\right)\left(2|0\rangle + 3i|1\rangle\right) = 4\langle 0|0\rangle + 6i\langle 0|1\rangle - 6i\langle 1|0\rangle + 9\langle 1|1\rangle = 4 + 9 = 13$$
$$\|x\| = \sqrt{13}$$

The norm of |y⟩ is:

$$\|y\|^2 = \langle y|y\rangle = \left(-2i\langle 0| - \langle 1|\right)\left(2i|0\rangle - |1\rangle\right) = 4\langle 0|0\rangle + 2i\langle 0|1\rangle - 2i\langle 1|0\rangle + 1\langle 1|1\rangle = 4 + 1 = 5$$
$$\|y\| = \sqrt{5}$$

So, the right side of the triangle inequality is:

$$\|x\| + \|y\| = \sqrt{13} + \sqrt{5} \approx 5.84$$

Let's calculate the left side of the triangle inequality:

$$|x\rangle + |y\rangle = 2|0\rangle + 3i|1\rangle + 2i|0\rangle - |1\rangle = (2 + 2i)|0\rangle + (-1 + 3i)|1\rangle$$

$$\|x + y\|^2 = \left((2 - 2i)\langle 0| + (-1 - 3i)\langle 1|\right)\left((2 + 2i)|0\rangle + (-1 + 3i)|1\rangle\right) = (4 + 4)\langle 0|0\rangle + (1 + 9)\langle 1|1\rangle = 8 + 10 = 18$$

$$\|x + y\| = \sqrt{18} = 3\sqrt{2} \approx 4.24$$

So, the triangle inequality holds for our two kets |x⟩ and |y⟩ since 4.24 ≤ 5.84. A quick
numerical check of both inequalities is sketched below; after that, we will move on to look at
decompositions of matrices.
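Here is that check as a minimal NumPy sketch, writing the two kets in the computational basis:

```python
import numpy as np

x = np.array([2, 3j])    # |x> = 2|0> + 3i|1>
y = np.array([2j, -1])   # |y> = 2i|0> - |1>

# Cauchy-Schwarz and triangle inequalities both hold:
print(abs(np.vdot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y))      # True
print(np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y))   # True
```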

Spectral decomposition
The spectrum of a square matrix is the set of its eigenvalues. There is a cool theorem in
linear algebra that states that all matrices representing a linear operator have the same
spectrum. Before we use the spectrum though, we need to talk about diagonal matrices.

Diagonal matrices
The main diagonal of a matrix is every entry where the row index equals the column
index. Examples make this very easy to see. All the following matrices have the letter d on
their main diagonal:
$$\begin{pmatrix} d & 3 \\ 2 & d \end{pmatrix} \qquad \begin{pmatrix} d & 2 & 2 \\ 4 & d & 4 \\ 2 & 1 & d \end{pmatrix} \qquad \begin{pmatrix} d & 4 & 2 & 3 \\ 4 & d & 2 & 4 \\ 5 & 1 & d & 4 \\ 1 & 2 & 2 & d \end{pmatrix}$$

Now, a diagonal matrix has zero on all entries outside the main diagonal. Here are
examples of diagonal matrices:
 3 0 0 0
3 0 0  
2 0    0 3 0 0
   0 −2 0   0 0 i 0
0 i  0 0 4  
 
 0 0 0 5 

Here are two cool features of diagonal matrices that make them all the rage at linear
algebra parties. One, all their eigenvalues are on their main diagonal. Two, they are very
easy to exponentiate. Let's see the latter in action real quick:
$$\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}^2 = \begin{pmatrix} 2^2 & 0 \\ 0 & 2^2 \end{pmatrix} = \begin{pmatrix} 4 & 0 \\ 0 & 4 \end{pmatrix}$$

$$\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}^3 = \begin{pmatrix} 2^3 & 0 \\ 0 & 2^3 \end{pmatrix} = \begin{pmatrix} 8 & 0 \\ 0 & 8 \end{pmatrix}$$

$$\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}^n = \begin{pmatrix} 2^n & 0 \\ 0 & 2^n \end{pmatrix}$$
Try to exponentiate any regular old random matrix… it'll take you a while! Now, back to
the main feature – spectral decomposition.

Spectral theory
The spectral theorem lets us know when an operator can be represented by a diagonal
matrix. These operators and all the matrices that represent them are called diagonalizable.
This means that we can factor a diagonalizable matrix A into:

A = PDP −1

where P is an invertible matrix and D is a diagonal matrix.


If you remember from the last chapter, all normal operators and their associated normal
matrices are diagonalizable. If this is the case, then we can decompose a matrix A into
even more special factors:

$$A = U\Lambda U^\dagger$$
Looks like a fraternity name, right? Well, U is a unitary matrix, and the capital Greek
letter lambda (Λ) is a diagonal matrix with all the eigenvalues of A on its main diagonal!
Remember that we used the lowercase lambda (λ) for each eigenvalue, so it makes sense
to use the uppercase lambda for all of them together in one matrix. Now, here comes the
kicker – the column vectors making up U are the eigenvectors of A!

Putting this all together means that we can decompose any normal matrix into a unitary
matrix U, its conjugate transpose, and a diagonal matrix Λ that has the eigenvalues down
its main diagonal. Please note that the eigenvalues placement on the main diagonal must
correspond with its eigenvector's placement in U. Since the set of eigenvalues is called the
spectrum, I hope you can see why this is called spectral decomposition.
Okay, enough talk – more action! Let's do an example.

Diagonalizable Matrices
Please note that there are diagonalizable matrices that are not normal, but we
do not often come across these matrices in quantum computing.

Example
Let's use spectral decomposition to decompose the quantum gate Y. Here is its matrix in
the computational basis:

 0 −i 
 
i 0 

Now, I will go ahead and tell you that the eigenvalues of Y are 1 and -1. The eigenvectors
are as follows where "+" denotes the eigenvector for the eigenvalue of 1 and "–" denotes the
eigenvector of -1:

 1   1 
   
 2   2 
| λY+ 〉 = , | λY− 〉 =
 i   −i 
   
 2   2 
So, according to the spectral theorem, I can represent Y in terms of a matrix U that has
its eigenvectors as the columns, a diagonal matrix with the eigenvalues down its main
diagonal, and the conjugate transpose of U. So, here is U and U dagger:
$$U = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} & \frac{-i}{\sqrt{2}} \end{pmatrix}, \qquad U^\dagger = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{-i}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{i}{\sqrt{2}} \end{pmatrix}$$

Here is the eigenvalue matrix:

1 0 
Λ= 
 0 −1 
Putting this all together, we can now decompose Y into:
$$Y = U\Lambda U^\dagger = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} & \frac{-i}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{-i}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{i}{\sqrt{2}} \end{pmatrix}$$

I encourage you to work out the matrix multiplication and make sure I'm right!
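Alternatively, here is a minimal NumPy sketch that does the multiplication for you:

```python
import numpy as np

Y = np.array([[0, -1j],
              [1j, 0]])

U = np.array([[1, 1],
              [1j, -1j]]) / np.sqrt(2)   # columns are the eigenvectors of Y
L = np.diag([1, -1])                     # eigenvalues on the main diagonal

print(np.allclose(U @ L @ U.conj().T, Y))   # True
```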
But, hey – wait a minute. We're in quantum computing and we like bra-ket notation! So,
let's do it in bra-ket notation!

Bra-ket notation
All the way back in Chapter 2, The Matrix, we talked about how you can represent
a matrix as a set of column vectors (kets) or row vectors (bras). Well, we're going to use
that here. I'll prove this using 2 × 2 matrices, and then you'll have to trust me that it
works for n × n matrices. I can represent the unitary matrix U that has eigenvectors for
its column vectors like so:

$$U = \begin{pmatrix} |\lambda_1\rangle & |\lambda_2\rangle \end{pmatrix}$$
U dagger then becomes:

$$U^\dagger = \begin{pmatrix} \langle\lambda_1| \\ \langle\lambda_2| \end{pmatrix}$$

The eigenvalue matrix will look like this:

$$\Lambda = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$

Putting it all together, it looks like:

$$A = U\Lambda U^\dagger = \begin{pmatrix} |\lambda_1\rangle & |\lambda_2\rangle \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} \langle\lambda_1| \\ \langle\lambda_2| \end{pmatrix}$$

Multiplying the first two matrices, we get:

$$A = U\Lambda U^\dagger = \begin{pmatrix} \lambda_1|\lambda_1\rangle & \lambda_2|\lambda_2\rangle \end{pmatrix} \begin{pmatrix} \langle\lambda_1| \\ \langle\lambda_2| \end{pmatrix}$$

Then, when we do the final multiplication, we get:

$$A = U\Lambda U^\dagger = \lambda_1|\lambda_1\rangle\langle\lambda_1| + \lambda_2|\lambda_2\rangle\langle\lambda_2|$$

And this is the bra-ket equation for spectral decomposition! For an n × n matrix,
it becomes:
$$A = \sum_{i=1}^{n} \lambda_i|\lambda_i\rangle\langle\lambda_i|$$

This is a very important result! It means that any normal operator can be represented as a
linear combination of outer products composed of just its eigenvalues and eigenvectors!
Let's see how this plays out with our Y operator.

Example take two


You may have noticed that the eigenvectors of Y look eerily familiar; that's because they
are the i and minus i states!
 1   1 
   
2  2 
| λY + 〉 =| i〉 ≡  , | λY − 〉 =| −i〉 ≡ 
 i   −i 
   
 2   2 
So, using spectral decomposition, we can represent the quantum Y gate this way:
Y = | i〉 〈i | − | −i〉 〈−i | (1)

If you want to express this in the computational basis, you have to write i and minus i in it:

$$|i\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + i|1\rangle\right) \qquad |-i\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle - i|1\rangle\right)$$
Then, substitute this with Equation (1) and work out the math. You will get:

Y = i ( | 1〉 〈 0 | − | 0〉 〈1 | )
Now, let's look at another decomposition.

Singular value decomposition


Singular Value Decomposition (SVD) is probably the most famous decomposition you
can do for linear operators and matrices. It is at the core of search engines and machine
learning algorithms. Additionally, it can be used on any type of matrix, even rectangular
ones. However, we will only look at square matrices.
Succinctly stated, it guarantees that for any matrix A, it can be decomposed into
three matrices:

$$A = U\Sigma V^\dagger$$

where U is a unitary matrix, Σ (sigma) is a diagonal matrix with what is known as the
singular values of A on its diagonal, and V is also a unitary matrix. It should be noted that
this decomposition is not unique, and different matrices can be used for U, Σ, and V.
Let's look at an example. We have the following matrix A:

 −2 0 
A= 
 0 0 
Without going through the math, I'm going to tell you that SVD can be used to get
this decomposition:

 −1 0   2 0   1 0 
A=   
 0 1  0 0  0 1 

Let's make sure that U and V are unitary matrices:


$$U = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad U^\dagger = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$$

Verify that $UU^\dagger = I$:

$$\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$$

Since V is the identity matrix, I think we can safely assume it is unitary.


Now for these pesky singular values on the main diagonal of Σ. How do we get those?
Those are found by taking the square root of the eigenvalues of A multiplied by its
conjugate transpose. So for us, this is:
$$A = \begin{pmatrix} -2 & 0 \\ 0 & 0 \end{pmatrix}, \qquad A^\dagger = \begin{pmatrix} -2 & 0 \\ 0 & 0 \end{pmatrix}$$

$$AA^\dagger = \begin{pmatrix} -2 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} -2 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 4 & 0 \\ 0 & 0 \end{pmatrix}$$

The eigenvalues are on the main diagonal, and they are 4 and 0. So, the singular values
of A are the square root of these eigenvalues, namely 2 and 0. If you look at the middle
matrix, Σ, you'll notice that it has these singular values on its main diagonal. Your next
question might be, how do we find U and V? Well, you may be disappointed, but I'm
not going to go through the algorithm here. Suffice it to say that it involves finding
an orthonormal set of eigenvectors for AA†. If you are interested in learning more,
I encourage you to look into one of the linear algebra books in my appendix of references.
Though, to be quite truthful, we almost always use computers to calculate SVD. Let's move
onto another decomposition – polar decomposition.

Polar decomposition
Polar decomposition allows you to factor any matrix into unitary and positive
semi-definite Hermitian matrices. It can be seen as breaking down a linear transformation
into a rotation or reflection and scaling in ℝn. Formally, it is as follows:

A = UP ,

for any matrix A. U is a unitary matrix and P is a positive semi-definite matrix. Let's look
at an example:

2 0 
A= 
 0 −1 

Using polar decomposition, this matrix can be decomposed into:


1 0  2 0 
A = UP =   
 0 −1   0 1 
This may not seem like much, but we took a random matrix and turned it into a reflection
matrix times a scaling matrix. Pretty cool!
Again, I will not go through the algorithm here because we will use calculators.
Calculators for polar decomposition are not as plentiful as SVD, but I have found using
the SciPy Python library to be the best way.
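Here is a minimal sketch using SciPy's polar routine on the example matrix above:

```python
import numpy as np
from scipy.linalg import polar

A = np.array([[2, 0],
              [0, -1]], dtype=float)

U, P = polar(A)                # U is unitary, P is positive semi-definite
print(U)                       # [[1, 0], [0, -1]]
print(P)                       # [[2, 0], [0, 1]]
print(np.allclose(U @ P, A))   # True: A = UP
```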

Operator functions and the matrix


exponential
Now, we get to look at functions that involve an operator and their matrix representations.
What types of functions are we talking about? Well, really any function that can be
defined, such as the sine of x. You have probably never seen a function like this:
f ( A) = sin( A)

where A is a matrix. The first question is, does this even make sense? Well,
mathematicians have come up with ways for this to make sense, and it has applications
in quantum computing.
As we have said, if a matrix A is diagonalizable, it can be decomposed into an invertible
matrix P and diagonal matrix D as:

A = PDP −1 (2)

Given that, we can represent a function involving such a matrix like so:

$$f(A) = P \begin{pmatrix} f(d_1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & f(d_n) \end{pmatrix} P^{-1} \qquad (3)$$

where the matrix in the middle is the diagonal matrix D in Equation (2). The function is
evaluated for every value on the main diagonal of D.
Let's look at an example. Let's do the easiest case, where we are already dealing with
a diagonal matrix:

$$A = \begin{pmatrix} \pi & 0 \\ 0 & \frac{\pi}{2} \end{pmatrix}$$

Now, we want to find the sine of A. Following the process from Equation (3), this is what
we get:

$$\sin(A) = \begin{pmatrix} \sin\pi & 0 \\ 0 & \sin\frac{\pi}{2} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

How about we check our answer by finding the cosine of A as well and then summing
their squares? Here is the cosine of A:

$$\cos(A) = \begin{pmatrix} \cos\pi & 0 \\ 0 & \cos\frac{\pi}{2} \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix}$$

We know the trigonometric identity:

$$\sin^2 x + \cos^2 x = 1$$

The number 1 in that identity becomes the identity matrix for us when dealing with
matrices. Now, let's verify our calculations for sine and cosine!

0 0
sin( A) =  
0 1
0 0
sin 2 ( A) =  
0 1
 −1 0
cos( A) =  
 0 0
1 0
cos 2 ( A) =  
0 0
0 0 1 0 1 0
sin 2 ( A) + cos 2 ( A) =  + = =I
0 1 0 0 0 1

So, our calculations are correct and this makes sense!


A function that comes up a lot in quantum computing is the exponential function:

$$f(x) = e^x = \exp(x)$$
You'll notice that we use exp(x) to also denote the exponential function, as it is easier
to write sometimes. If the matrix is diagonal, such as the Z operator, this becomes easy
to calculate:

1 0 
Z = 
 0 −1 
 e1 0 
eZ =  
 0 e−1 

Let's see a slightly harder example. What is exp(A) when A is:

1 2 
A= 
 0 −1 

Well, first we need to diagonalize it, and the first step of that is finding its eigenvalues:

1 2   1 0   1− λ 2 
 −λ  = 
 0 −1  0 1  0 −1 − λ 
 1− λ 2 
det   = (1 − λ )( −1 − λ ) − 2 ⋅ 0
 0 −1 − λ 
 1− λ 2 
 = λ −1
2
det 
 0 − 1 − λ 
λ =1
2

λ1 = 1 , λ2 = −1

Then, we have to find its eigenvectors. I will go ahead and tell you that they are:
1  −1 
| λ1 〉 =   , | λ2 〉 =  
0  1 
Given all of this, we can write its diagonal representation like so:

 1 −1   1 0   1 1 
A = PDP −1 =    
 0 1   0 −1   0 1 
Evaluating the exponential function on the main diagonal of D gives us:

 1 −1   e 0   1 1 
exp( A) = exp( PDP −1 ) =   1   
 0 1   0 e   0 1 
Multiplying this all out gives us:

 e2 − 1 
e 
exp( A) =  e 
 1 
0 
 e 
While this was a long example, it should show you the power of being able to calculate the
functions of operators.
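If you would rather let a computer do the work, SciPy's expm computes the matrix exponential directly. Here is a minimal sketch checking the result above:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1, 2],
              [0, -1]], dtype=float)

e = np.e
expected = np.array([[e, (e**2 - 1) / e],
                     [0, 1 / e]])

print(np.allclose(expm(A), expected))   # True, matching the hand calculation
```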

Summary
Well, that wraps up this chapter. I hope you can see all the wonderful decompositions you
can do with matrices and some of the more advanced things you can do with them. This
chapter also concludes the book. I hope you have enjoyed it and learned as much as I did.
Take this math and go forth to infinity and beyond in the universe of quantum computing!

Works cited
[1] - Triangle inequality - Wikipedia:
https://en.wikipedia.org/wiki/Triangle_inequality#/media/
File:TriangleInequality.svg
Section 4:
Appendices

The following chapters are included in this section:

• Appendix 1, Bra-ket Notation


• Appendix 2, Sigma Notation
• Appendix 3, Trigonometry
• Appendix 4, Probability
• Appendix 5, References
Appendix 1
Bra–ket Notation
Bra-ket notation was introduced by Paul Dirac in 1939 and is sometimes called Dirac
notation consequently. Kets are denoted by a pipe ("|") and right-angle bracket ("⟩"), like
so – |Label⟩, while bras are denoted by a left-angle bracket ("⟨") and pipe ("|"), like so
⟨Label|. Kets represent vectors in a Hilbert space and bras represent their covectors in
a dual Hilbert space. The labels for kets and bras can be lowercase letters, numbers,
and greek letters. Uppercase letters are usually reserved for operators, which we will get
to later.
The computational basis vectors are represented by |0⟩ and |1⟩. It is important to note that
the zero vector is denoted by 0 and is totally different than |0⟩. The zero vector, sometimes
referred to as the null vector, is the only one that is not represented as a ket.
An inner product between two kets, |ϕ⟩ and |ψ⟩, is notated this way – ⟨ϕ|ψ⟩. This can be
called a "bracket" and brings the notation full circle.

Other notations used are shown in the following table:

Operators
Operators are represented by capital letters such as A, B, and C. Operators can be
represented by matrices numerically, as shown in the following diagram:

 1 3
A= 
7 9

The rest of bra-ket notation will be explained as the book progresses. The next section is
a very advanced treatise on bras and is optional.

Bras
A bra is a linear functional. We talk about these in Chapter 5, Transforming Space with
Matrices. To help jog your memory, they are a special case of linear transformation that
takes in a vector and spits out a scalar:
$$f : V \to F \text{ where } V \text{ is a vector space and } F \text{ is the field of scalars } \mathbb{R} \text{ or } \mathbb{C}$$

For instance, I could define a linear functional for every vector in ℝ2:

$$f(|v\rangle) = a + b \quad \text{where} \quad |v\rangle = \begin{pmatrix} a \\ b \end{pmatrix}$$

So that:

$$f\left(\begin{pmatrix} 3 \\ 2 \end{pmatrix}\right) = 3 + 2 = 5 \qquad\qquad f\left(\begin{pmatrix} 5 \\ -2 \end{pmatrix}\right) = 5 - 2 = 3$$
 
There are many linear functionals that can be defined for a vector space. Here's
another one:

$$g : \mathbb{R}^2 \to \mathbb{R}, \qquad g(|v\rangle) = 2a - 3b \quad \text{where} \quad |v\rangle = \begin{pmatrix} a \\ b \end{pmatrix}$$
The set of all linear functionals that can be defined on a vector space actually form their
own vector space called the dual vector space.
Instead of using the usual function notation for these linear functionals, Paul Dirac came
up with a notation that he called a bra:

$$f_{|v\rangle} \equiv \langle v|$$
Since every vector has its own linear functional (called its dual vector or covector), the
label between the angle bracket and vertical bar or pipe is the dual vector for the ket with
the same label. In other words, every ket |v⟩ has a linear functional 〈v| defined for it.

Now the big question is, what is the function that is defined for each ket? That, my friend, is
the inner product, defined as follows and explained in Chapter 8, Our Place in the Universe:

 x1   y1 
    n
| x〉 ,| y〉 ≡  ⋮ ,
  ⋮  = ∑ xi* yi = x1* y1 + ⋯ + xn* yn
i =1
 xn   yn 
   
Appendix 2:
Sigma Notation
Many quantum computing books will introduce sigma notation without ever really
explaining it, and you will also see shorthand for it as well. I hope to demystify this
notation when you encounter it in this book and others.

Sigma
Sigma is the 18th letter of the Greek alphabet, and we are talking about the capital
version, . It signifies a summation, so that:
$$1 + 2 + 3 = \sum_{i=1}^{3} i$$

More generally,

$$1 + 2 + \cdots + n = \sum_{i=1}^{n} i$$

Here are the different parts of sigma notation:

Figure 11.1 – A diagram of sigma notation


Here is an easy example to make the point better:

$$\sum_{i=1}^{3} i = 1 + 2 + 3 = 6$$

Variations
You will see shorthand in a lot of quantum computing books, but don't let it throw you.
The following are all equivalent:

$$\sum_{i=1}^{n} x_i = \sum_{i} x_i = \sum x_i$$

Summation rules
There are many properties of summations, but the two most important ones I think are:

$$\sum_{i=0}^{n} c\,a_i = c \sum_{i=0}^{n} a_i$$

where c is a constant, and

$$\left(\sum_{i=0}^{n} a_i\right)\left(\sum_{j=0}^{m} b_j\right) = \sum_{i=0}^{n} \sum_{j=0}^{m} a_i b_j$$

These are both based on the distributive property.
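If you ever want to convince yourself of these rules numerically, a few lines of Python will do it (the values of a, b, and c below are arbitrary):

```python
# Arbitrary example values to check the two rules numerically.
a = [1, 2, 3, 4]
b = [5, 6, 7]
c = 10

# Rule 1: sum_i c*a_i == c * sum_i a_i
print(sum(c * x for x in a) == c * sum(a))                  # True

# Rule 2: (sum_i a_i)(sum_j b_j) == sum_i sum_j a_i*b_j
print(sum(a) * sum(b) == sum(x * y for x in a for y in b))  # True
```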


Appendix 3
Trigonometry
In some of the chapters, we have relied heavily on trigonometry. You probably took trig
(the shortened alias for trigonometry) in high school but may not have used it for a long
time. This chapter is meant as a refresher or just an introduction.
In this chapter, we are going to cover the following main topics:

• Measuring angles
• Trigonometric functions
• Formulas

Measuring angles
It all starts with angles in trigonometry. There are two main ways to measure angles –
degrees and radians. Radians are almost exclusively used in quantum computing, but you
are probably more familiar with degrees, so we'll start there.

Degrees
There are 360 degrees in a circle. Probably the most familiar use of degrees is in a compass,
as shown in the following:

Figure 12.1 – Compass [1]


However, on the X-Y plane, zero degrees starts on the right side of the x-axis and increases
from there. In fact, the X-Y plane can be broken up into four quadrants, with each taking
up 90 degrees, as shown in the following:

Figure 12.2 – Four quadrants [2]



Additionally, angle measures can be expressed in negative form. When degrees or radians
are positive, you go in the counterclockwise direction, and when they are negative, you go
in a clockwise direction on the X-Y plane, as shown in the following diagram:

Figure 12.3 – Positive and negative angles [3]


This can lead to interesting angle measures that can be equivalently expressed in positive
and negative forms. For instance, the following angle can be expressed as 45 degrees or
-315 degrees:

Figure 12.4 – Positive and negative angles [4]


After introducing degrees, we now get to the meat of the matter – radians!

Radians
As I said, radians are the most used measure in mathematics and physics because they
make calculations much easier. The key part to converting from degrees to radians is
that 360 degrees equals 2π radians. Using this, we can obtain the following formulas for
converting degrees to radians:

 180   π 


1 rad =   ≈ 57.3

1 =  rad  ≈ 0.017 rad
 π   180 

You should note that we use the abbreviation "rad" to represent a radian when we need to
distinguish it from a degree.
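Python's standard math module already knows how to convert between the two, which makes for an easy sanity check of these formulas:

```python
import math

print(math.degrees(1))       # one radian is about 57.29578 degrees
print(math.radians(1))       # one degree is about 0.01745 radians
print(math.isclose(math.radians(360), 2 * math.pi))   # True: 360 degrees = 2*pi rad
```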

The following diagram does a good job of showing the relationship of degrees and radians
on the X-Y plane:

Figure 12.5 – Degrees and equivalent radians on the X-Y plane [5]

Trigonometric functions
There are three main trigonometric functions, and they are all based on right triangles.
Let's use the right triangle, as follows:

Figure 12.6 – A right triangle [6]



We will call the angle at A on the triangle θ. The angle at C is 90 degrees or a right angle.
The three trig functions are called sine, cosine, and tangent. Here are their definitions
along with their abbreviations:

$$\sin\theta = \frac{\text{opposite}}{\text{hypotenuse}} \qquad \cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}} \qquad \tan\theta = \frac{\text{opposite}}{\text{adjacent}}$$

There are three other trig functions, which are the reciprocals of the main three and have the
special names of cosecant, secant, and cotangent. Here are their definitions:

$$\csc\theta = \frac{\text{hypotenuse}}{\text{opposite}} \qquad \sec\theta = \frac{\text{hypotenuse}}{\text{adjacent}} \qquad \cot\theta = \frac{\text{adjacent}}{\text{opposite}}$$

Given this, common values for these functions can be seen in the following table:

Figure 12.7 – The common values for sin, cos, and tan

As angles get bigger, they move from the first quadrant into the other three quadrants, and
the trig functions become positive or negative based on the values of x and y in those
quadrants. Here is a cheat sheet to know which are positive and negative:

Figure 12.8 – The trig functions by quadrant [7]

Formulas
There are some common formulas you should know. The first is about tangent and
its relationship to the sine and cosine functions. We can derive the formula in the
following way:
$$\frac{\sin\theta}{\cos\theta} = \frac{\ \dfrac{\text{opposite}}{\text{hypotenuse}}\ }{\ \dfrac{\text{adjacent}}{\text{hypotenuse}}\ } = \frac{\text{opposite}}{\text{hypotenuse}} \cdot \frac{\text{hypotenuse}}{\text{adjacent}} = \frac{\text{opposite}}{\text{adjacent}} = \tan\theta$$

This leads us to this:

$$\tan\theta = \frac{\sin\theta}{\cos\theta}$$

The next one we get from studying the graphs of sine and cosine as functions of θ.
Here are those graphs:

Figure 12.9 – The graphs of sine and cosine [8]


You may notice that they are "out of phase" by π/2. This leads us to the next formula:

$$\sin\theta = \cos\left(\frac{\pi}{2} - \theta\right) \qquad\qquad \cos\theta = \sin\left(\frac{\pi}{2} - \theta\right)$$

While I will not derive it, the following is called the Pythagorean identity, and you should
commit it to memory:

$$\sin^2\theta + \cos^2\theta = 1$$
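If you want to verify these last few formulas numerically, here is a tiny Python check (the angle 0.7 rad is arbitrary):

```python
import math

theta = 0.7   # an arbitrary angle in radians

print(math.isclose(math.tan(theta), math.sin(theta) / math.cos(theta)))   # True
print(math.isclose(math.sin(theta), math.cos(math.pi / 2 - theta)))       # True
print(math.isclose(math.sin(theta) ** 2 + math.cos(theta) ** 2, 1.0))     # True
```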

Summary
Alright, that's all you need to know for trigonometry. Please consult a textbook or other
resources if you feel like you need more help. The next page contains a cheat sheet of
formulas you will most likely need in your journey through quantum computing.

The trig cheat sheet


Pythagorean identities
$$\cos^2(x) + \sin^2(x) = 1 \qquad \sec^2(x) - \tan^2(x) = 1 \qquad \csc^2(x) - \cot^2(x) = 1$$

Double angle identities


$$\sin(2x) = 2\sin(x)\cos(x) \qquad \cos(2x) = 1 - 2\sin^2(x)$$

$$\cos(2x) = 2\cos^2(x) - 1 \qquad \cos(2x) = \cos^2(x) - \sin^2(x)$$

$$\tan(2x) = \frac{2\tan(x)}{1 - \tan^2(x)}$$

Sum/difference identities
$$\sin(s + t) = \sin(s)\cos(t) + \cos(s)\sin(t)$$
$$\sin(s - t) = \sin(s)\cos(t) - \cos(s)\sin(t)$$
$$\cos(s + t) = \cos(s)\cos(t) - \sin(s)\sin(t)$$
$$\cos(s - t) = \cos(s)\cos(t) + \sin(s)\sin(t)$$

$$\tan(s + t) = \frac{\tan(s) + \tan(t)}{1 - \tan(s)\tan(t)} \qquad \tan(s - t) = \frac{\tan(s) - \tan(t)}{1 + \tan(s)\tan(t)}$$

Product-to-sum identities
$$\cos(s)\cos(t) = \frac{\cos(s - t) + \cos(s + t)}{2} \qquad \sin(s)\sin(t) = \frac{\cos(s - t) - \cos(s + t)}{2}$$

$$\sin(s)\cos(t) = \frac{\sin(s + t) + \sin(s - t)}{2} \qquad \cos(s)\sin(t) = \frac{\sin(s + t) - \sin(s - t)}{2}$$

Works cited
[1] - File:Compass Card B+W.svg – Wikimedia Commons (https://commons.wikimedia.org/wiki/File:Compass_Card_B%2BW.svg)
[2] - 8.png (438×315) (opencurriculum.org) (http://media.opencurriculum.org/articles_manual/michael_corral_trigonometry/trigonometric-functions-of-any-angle/8.png)
[3] - Trigonometric Functions of Any Angle – OpenCurriculum (2.png) (https://opencurriculum.org/5484/trigonometric-functions-of-any-angle/)
[4] - File:Positive, negative angle.svg – Wikimedia Commons (https://commons.wikimedia.org/wiki/File:Positive,_negative_angle.svg)
[5] - File:30 degree rotations expressed in radian measure.svg – Wikimedia Commons (https://commons.wikimedia.org/wiki/File:30_degree_rotations_expressed_in_radian_measure.svg)
[6] - File:TrigonometryTriangle.svg – Wikimedia Commons (https://commons.wikimedia.org/wiki/File:TrigonometryTriangle.svg)
[7] - Trigonometric Functions of Any Angle – OpenCurriculum (9.png) (https://opencurriculum.org/5484/trigonometric-functions-of-any-angle/)
[8] - File:Sine cosine one period.svg – Wikimedia Commons (https://commons.wikimedia.org/wiki/File:Sine_cosine_one_period.svg)
Appendix 4
Probability
The study of gambling, specifically the throwing of dice, led to the mathematical field
of probability. Probability is the study of how likely an event is to occur given a number
of possible outcomes. A real number between 0 and 1 is assigned to each event where
0 signifies the event has no chance of happening and 1 signifies the event will always
happen. You can also multiply these numbers by 100 to get a percentage that the event
will happen. All the probabilities for all possible outcomes must sum to 1. For instance,
the probability of a coin flip landing on heads is 0.5 or 50%. For tails, it is also 0.5 or
50%. Both of these numbers add up to 1. From these basics, this chapter will go over the
probability needed in the study of quantum computing.
In this chapter, we are going to cover the following main topics:

• Definitions
• Random variables

Definitions
Let's start by getting some basic definitions out of the way. The word experiment is used
in probability theory to denote the execution of a procedure that produces a random
outcome. Examples of experiments are flipping a coin or rolling dice. In quantum
computing, an experiment is measuring a qubit.

A sample space is the set of all possible outcomes of an experiment. It is usually denoted
by Ω (the upper case Greek letter omega). The set Ω for a fair coin is {Heads, Tails}. The
set Ω for one die is {1, 2, 3, 4, 5, 6}. The set Ω for a qubit when measured in the Z basis is
{|0⟩, |1⟩}.
An event (E) is a subset of Ω. Every individual outcome forms an event of size 1 – for example, {Heads}
and {Tails} are events for a fair coin. But as we saw in Chapter 3, Foundations, subsets
also include the empty set ∅ and the whole set itself, which is Ω in this case. The set of all
events is called an event space and is usually denoted with ℱ. Here is the event space for
a fair coin:

• ∅
• {Heads}
• {Tails}
• {Heads, Tails} = Ω

Because of the definition of an event, you can also group outcomes together. For instance,
{1, 2, 3} is the event that a die is 3 or below after a roll (aka experiment).
Finally, there is a probability function (P) that maps each event to a real number between
0 and 1:

$$P : \mathcal{F} \to [0, 1]$$
The probability function has these properties:

$$P(\Omega) = 1 \qquad P(\emptyset) = 0$$

For all $E$ such that $E \subseteq \Omega$:

$$0 \le P(E) \le 1$$

Here is an example for a flip of a coin using these definitions, where H stands for Heads
and T stands for Tails:

$$\Omega = \{H, T\}$$
$$\mathcal{F} = \{\emptyset, \{H\}, \{T\}, \{H, T\}\}$$
$$P(\emptyset) = 0 \qquad P(\{H\}) = \frac{1}{2} \qquad P(\{T\}) = \frac{1}{2} \qquad P(\{H, T\}) = 1$$
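To make these definitions concrete, here is an illustrative Python sketch that stores the probability function for a fair coin as a dictionary keyed by events (the use of frozenset is just one convenient way to make sets usable as keys):

```python
# The probability function for a fair coin as a dictionary from events to numbers.
# frozenset is used so that events (which are sets) can serve as dictionary keys.
H, T = "H", "T"
omega = frozenset({H, T})

P = {
    frozenset(): 0.0,          # the empty event
    frozenset({H}): 0.5,
    frozenset({T}): 0.5,
    omega: 1.0,                # the whole sample space
}

print(P[frozenset({H})])                        # 0.5
print(sum(P[frozenset({o})] for o in omega))    # 1.0 -- outcome probabilities sum to 1
```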

Let's move on to see how we can analyze these events further.

Random variables
An important concept in probability is a random variable. Oftentimes, we are not
interested in the actual outcome of an experiment but some function of the outcome. For
instance, let's define the function S to be the number of tails when flipping two coins. We
know that Ω is {(H,H), (H,T), (T,H), (T,T)} where H stands for heads and T stands for
tails. We also know that the probability of each of these outcomes is 1/4th. However, I want
to know the amount of tails in my outcomes, which I define as this:

$$S : \Omega \to \mathbb{R}$$
$$S(HH) = 0 \qquad S(HT) = 1 \qquad S(TH) = 1 \qquad S(TT) = 2$$

If I define S to be a random variable, then:


$$P(S = 0) = \frac{1}{4} \qquad P(S = 1) = \frac{1}{2} \qquad P(S = 2) = \frac{1}{4}$$
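Here is a short Python sketch that enumerates the four outcomes and tallies the probabilities of S for you (purely illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))   # ('H','H'), ('H','T'), ('T','H'), ('T','T')

def S(outcome):
    return outcome.count("T")              # number of tails in the outcome

counts = Counter(S(o) for o in outcomes)
for k in sorted(counts):
    print(k, Fraction(counts[k], len(outcomes)))   # 0 1/4, 1 1/2, 2 1/4
```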

In general, random variables are written with capital letters such as X, Y, and Z. Random
variables are functions from the sample space Ω to a measurable space, which is not a
trivial thing. Fortunately for us, for most of our random variables, the measurable space
will be the real numbers.

Discrete random variables


There are continuous and discrete random variables. Most random variables in quantum
computing are discrete, and hence, we will only deal with this type. Discrete random
variables have distinct values, and their sample spaces are finite or countably infinite.
Instead of writing P(X = z) all the time, we define a new function called the Probability
Mass Function (PMF) this way:

$$p(z) = P(X = z)$$

Two properties of the PMF are:

$$\sum_{i=1}^{n} p(x_i) = 1, \text{ where } n \text{ is the number of possible values}$$

$$p(x) \ge 0$$

Histograms are often used to show the PMF graphically for a random variable. Here is a
nice histogram showing the PMF for a random variable S, defined as the sum of two dice
being rolled:

Figure 13.1 – The PMF of S, the sum of two dice rolled [1]
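If you would like to reproduce the numbers behind a histogram like this, the following Python sketch computes the PMF of the sum of two dice by brute-force enumeration:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # all 36 equally likely (die1, die2) pairs
pmf = Counter(d1 + d2 for d1, d2 in rolls)

for s in sorted(pmf):
    print(s, Fraction(pmf[s], len(rolls)))     # p(2) = 1/36, ..., p(7) = 1/6, ..., p(12) = 1/36
```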



Let's look at how we can describe our random variables.

The measures of a random variable


There are a few measures that are important to know for random variables. The first is the
expected value of a random variable, denoted as E[X], where X is the random variable.
Let's start with an example and look at the roll of a single die. The expected value
takes each possible outcome, multiplies it by the PMF value for that outcome, and sums the
results. So, if we let X represent the value of the outcome of a die roll, then the expected value will be:

$$p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = \frac{1}{6}$$

$$E[X] = 1 \cdot p(1) + 2 \cdot p(2) + 3 \cdot p(3) + 4 \cdot p(4) + 5 \cdot p(5) + 6 \cdot p(6)$$

$$E[X] = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6}$$

$$E[X] = \frac{1}{6} + \frac{2}{6} + \frac{3}{6} + \frac{4}{6} + \frac{5}{6} + \frac{6}{6} = \frac{21}{6} = \frac{7}{2} = 3.5$$

You should note that the expected value is not actually a possible value of the random variable. This
will often be true. Intuitively, the expected value should also look like the average or mean, and it is
when all outcomes have the same PMF value. Let's define this mathematically. The expected value
for a random variable X is defined as:

$$E[X] \equiv \sum_{i=1}^{n} x_i\, p(x_i)$$
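As a quick check of this definition, here is the single-die example in Python (using exact fractions to avoid rounding):

```python
from fractions import Fraction

outcomes = range(1, 7)
p = {x: Fraction(1, 6) for x in outcomes}   # the uniform PMF of a fair die

expected_value = sum(x * p[x] for x in outcomes)
print(expected_value)                       # 7/2, that is, 3.5
```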

The other important measure of a random variable is its variance, which measures the
spread of its possible values around the expected value. It is defined as:

$$\text{var}(X) \equiv E[(X - E[X])^2] = E[X^2] - E[X]^2$$

Let's calculate the variance for our roll of one die example. We already know the expected
value of X, but we also need the expected value of X2. Here's the calculation:

$$p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = \frac{1}{6}$$

$$E[X^2] = 1^2 \cdot p(1) + 2^2 \cdot p(2) + 3^2 \cdot p(3) + 4^2 \cdot p(4) + 5^2 \cdot p(5) + 6^2 \cdot p(6)$$

$$E[X^2] = 1 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 9 \cdot \frac{1}{6} + 16 \cdot \frac{1}{6} + 25 \cdot \frac{1}{6} + 36 \cdot \frac{1}{6}$$

$$E[X^2] = \frac{1}{6} + \frac{4}{6} + \frac{9}{6} + \frac{16}{6} + \frac{25}{6} + \frac{36}{6} = \frac{91}{6}$$
Now, we can calculate the variance of our roll of one die:

$$\text{var}(X) = E[X^2] - E[X]^2 = \frac{91}{6} - \left(\frac{7}{2}\right)^2 = \frac{91}{6} - \frac{49}{4} = \frac{35}{12}$$

The last measure to consider is called the standard deviation of a random variable, and it
is quite easy once you have the variance. It is just the square root of the variance.
And there you go – you now know the three most important measures of a random
variable: the expected value, the variance, and the standard deviation.
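Here is the same die example carried through the variance and standard deviation in Python (again, just an illustrative check):

```python
import math
from fractions import Fraction

outcomes = range(1, 7)
p = {x: Fraction(1, 6) for x in outcomes}

E_X = sum(x * p[x] for x in outcomes)        # 7/2
E_X2 = sum(x ** 2 * p[x] for x in outcomes)  # 91/6
variance = E_X2 - E_X ** 2                   # 35/12
std_dev = math.sqrt(variance)                # about 1.708

print(variance, std_dev)
```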

Summary
The field of probability is vast, but you now have the necessary tools to understand it
as it applies to quantum computing. The foundation for understanding probability is
the definitions we went through, and in quantum computing, it all revolves around
random variables.

Works cited
[1] - File:Dice Distribution (bar).svg – Wikimedia Commons (https://commons.wikimedia.org/wiki/File:Dice_Distribution_(bar).svg)
Appendix 5
References
"Abstract Algebra." YouTube, uploaded by Socratica, January 2, 2021, www.youtube.
com/c/Socratica.
Andreescu, Titu, and Andrica, Dorin. Complex Numbers from A to...Z. Birkhäuser
Boston, 2014.
Axler, Sheldon. Linear Algebra Done Right (Undergraduate Texts in Mathematics).
3rd edition. 2015, Springer, 2014.
Byron, Frederick, and Fuller, Robert. Mathematics of Classical and Quantum Physics.
Dover Publications, 1992.
Dirac, P. A. M. "A New Notation for Quantum Mechanics." Mathematical Proceedings
of the Cambridge Philosophical Society, vol. 35, no. 3, 1939, pp. 416–18. Crossref,
https://doi.org/10.1017/s0305004100021162.
Encyclopedia of Mathematics. Encyclopedia of Mathematics community,
encyclopediaofmath.org/wiki/Main_Page. Accessed 1 Feb. 2021.
Gowers, Timothy, et al. The Princeton Companion to Mathematics. Amsterdam University
Press, 2008.
Greenfield, Pavel. "Linear Algebra." YouTube, uploaded by MathTheBeautiful, 4 Jan. 2021,
www.youtube.com/c/MathTheBeautiful.

Griffiths, David. Introduction to Quantum Mechanics. 2nd edition, Cambridge University Press, 2016.
Halmos, Paul. Naive Set Theory (Dover Books on Mathematics). Reprint, Dover
Publications, 2017.
Hidary, Jack. Quantum Computing: An Applied Approach. 1st edition 2019, Springer, 2019.
Lay, David, et al. Linear Algebra and Its Applications. 5th edition, Pearson, 2014.
"Math on YouTube." YouTube, uploaded by Eddie Woo, 2 Jan. 2021, www.youtube.
com/c/misterwootube.
McMahon, David. Quantum Computing Explained. Wiley, 2007.
Nielsen, Michael A., and Chuang, Isaac. Quantum Computation and Quantum
Information: 10th Anniversary Edition. 1st edition, Cambridge University Press, 2011.
Pinter, Charles. A Book of Abstract Algebra: Second Edition (Dover Books on Mathematics).
Second, Dover Publications, 2010.
"Professor M does Science." YouTube, uploaded by Professor M, 2 Jan. 2021,
www.youtube.com/c/ProfessorMdoesScience.
Rieffel, Eleanor, and Polak, Wolfgang. Quantum Computing: A Gentle Introduction
(Scientific and Engineering Computation). Illustrated, The MIT Press, 2014.
Ross, Sheldon. A First Course in Probability. Prentice Hall, 1998.
Sanderson, Grant. Various math videos. YouTube, uploaded by 3Blue1Brown, 1 Jan. 2020,
www.youtube.com/c/3blue1brown.
Shankar. Principles of Quantum Mechanics. 2nd edition, Plenum Press, 1994.
Stewart, Ian, and Tall, David. The Foundations of Mathematics. 2nd edition, Oxford
University Press, 2015.
Stewart, James. Single Variable Calculus: Early Transcendentals, Volume I. 8th edition,
Cengage Learning, 2015.
Strang, Gilbert. Introduction to Linear Algebra (Gilbert Strang). 5th edition, Wellesley-
Cambridge Press, 2016.
Susskind, Leonard, and Friedman, Art. Quantum Mechanics: The Theoretical Minimum.
Illustrated, Basic Books, 2015.

Sutor, Robert. Dancing with Qubits: How Quantum Computing Works and How It Can
Change the World. Packt Publishing, 2019.
Weisstein, Eric. "Wolfram MathWorld: The Web's Most Extensive Mathematics Resource."
Wolfram MathWorld, Wolfram Research, mathworld.wolfram.com. Accessed
1 Feb. 2021.
Wikipedia contributors. Various articles. Wikipedia, en.wikipedia.org/wiki/Main_Page. Accessed 1 Feb. 2021.
Index
A Bloch sphere
about 123-125
Abelian group 52
reference link 125
absolute value 112
bra 198-200
additivity 77
bra-ket notation 5, 186, 187, 197
adjoint matrix 122
adjoint, of operators 160-162
algebraic description, linear C
transformations 79-81
Cartesian form
angles
about 107, 108
degrees 206, 207
absolute value 112
measuring 205
addition 108, 109
radians 207, 208
complex conjugate 111
associativity 52
division 112, 113
average 219
exercise 111
modulus 112
B multiplication 109
powers of i 113
basis 68-71
Cartesian plane 46
bijective functions
Cartesian product 45, 46
about 50, 51
Cauchy-Schwarz inequality 181
rules 50
characteristic equation 139
binary operations
characteristic polynomial 140
about 51
circuit model 36
definition 51
closure 52, 53, 58
properties 52
codomain 47, 77
bits 11

column vector 20 eigensinn 129


commutative group 52 eigenspace 138
commutativity 52 eigentum 129
commutator 87 eigenvalues
completeness relation 159 about 137
complex conjugate properties 143
defining 111 eigenvectors
complex number about 136, 137
defining 44, 106, 107 finding 141, 142
defining, in polar form 116, 117 elements 42
complex plane 108 Euclidean vector 4
composite vectors Euler's Formula 119, 120
inner product 173, 174 Euler's identity 119, 120
computational basis 68 event 216, 217
coordinates 69 experiment 215
exponential form
D about 120
complex number 120
degrees 206, 207 conjugation 120
de Moivre's theorem 119 division 121
determinants example 121
about 131 multiplication 121
example 132
exercise 133
diagonalizable matrices 184
F
diagonal matrices 183 fields 53
dimension 71 FOIL method 109, 110
discrete random variables 218, 219 formulas 210, 211
domain 47, 77 functions
about 46
E illegal function 47, 48
invertible functions 48
eigen rules 47
about 129
example matrix 137, 138
eigenbasis 138
G
eigenschaft 129 geometric description, linear
transformations 78, 79

Gram-Schmidt process 178-180 linear dependence 62-64


group 52, 53 linear functional 97
linear independence 62
H linearity 74-77
linear operators 96, 97
Hermitian conjugate 122 linear transformations
Hermitian operators 163 about 73, 77
Hilbert space 145 algebraic description 79-81
homogeneity 76 basis vectors description 81-83
change of basis, performing 98-100
I geometric description 78, 79
similar 86
identity element 52 standard matrix 86
identity matrix 33, 34 linear transformations, with matrices
image 47 commutator 87
imaginary unit 106 multiple transformations, through
injective function 49 matrix multiplication 87
inner product presentation 83, 84
about 146-148 logic gates
of composite vectors 173, 174 AND 35
invertibility 52, 53 NOT 34, 35
invertible functions
about 48
bijective functions 50, 51
M
injective function 49 matrix
surjective function 49 about 18
invertible matrix theorem 133 conjugate transpose 122
identity matrix 33, 34
K Kronecker product 172, 173
notation 19
Kronecker delta function 153, 154 square matrix 33
Kronecker product 168 transposing 22
of matrices 172, 173 transposing, examples 22
types 33
L vectors, redefining 19, 20
matrix addition
length 149 example 21
linear combinations 8-10, 62 exercises 21

matrix exponential function 192, 193 normalization 149


matrix inverse normal matrix 162
about 130 normal operators
calculating 134 about 162, 163
exercise 135 properties 163
matrix multiplication n-tuple 45
about 23, 24, 29, 30, 131
example 30-32
exercise 32
O
multiple transformations, through 87 one-dimensional (1D) matrices 20
properties 32, 33 one-to-one function 49
vectors, multiplying 24 onto function 49
matrix operations operand 51
about 20 operator functions 190, 191
addition 20 operators
scalar multiplication 21 about 157, 198
matrix-vector multiplication adjoint 160-162
about 26, 27 Hermitian operators 163
definition 28 normal operators 162
exercise 28, 29 positive operators 167
mean 219 projection operators 166
measures, random variable representing, with outer
average 219 product 158, 159
expected value 219 tensor product 172
mean 219 unitary operators 164, 165
standard deviation 220 ordered pair 45
variance 219 orthogonality 150, 151
members 42 orthonormality
modulus 112 about 149
multiplication for complex number norm 149
defining 110 orthonormal vectors 152, 153
multiplicity 142, 143 outer product 154-156

N P
non-linearity perpendicular 150
real-life examples 74 polar decomposition 189, 190
norm 149

polar form rotation 89-93


about 114 row vector 20
complex number, defining 116, 117
complex number, example 117, 118
division 118
S
multiplication 118 sample space 216
polar coordinates 114-116 scalar multiplication
positive definite 167 about 7, 21, 54
positive operators 167 example 7, 21
positive semidefinite 167 exercise 22
Probability Mass Function (PMF) 218 scalars 7, 54
projection 94, 95 self-adjoint operators 163
projection operators 166 set-builder notation 42, 43
sets
Q about 42
Cartesian product 45, 46
quantum computing 166 definition 42
quantum gates elements 42
about 34 members 42
circuit model 36 notation 42, 43
logic gates 34, 35 sets of numbers 43, 44
qubit 11 tuples 45
Sigma
R about 201, 202
variations 202
radians 207, 208 Singular Value Decomposition
random variable (SVD) 188, 189
about 217, 218 singular values 188
discrete random variables 218, 219 span 64-67
measures 219, 220 spectral decomposition
range 47, 77 about 183
rational number 44 example 185-187
real numbers 44 spectral theorem 184, 185
reflection 78 square matrix 33
reflection transformation standard basis 68
about 84, 135 standard deviation 220
with basis vectors 85, 86 standard matrix, of linear
transformation 86

subset 43
subspaces
U
about 58 unitary operators 164, 165
definition 58, 59 unit vector 149
examples 59-61
exercise 61
summation rules 203
V
superposition variance 219
about 11, 12 vector addition 5, 6, 54
measurement 13 vectors
superset 43 about 4, 54, 136
surjective functions 49, 50 multiplying 24
Symbolab multiplying, examples 25
URL 140 redefining 19, 20
tensor product 168-170
T vectors description, linear
transformations 82, 83
tensor products vector space 54, 55, 77
about 167, 168
of operators 172
of vectors 168-170
W
tensor product space 170, 171 Wolfram Alpha
trace 143 reference link 63
transformations, by Euclid
about 88
projection 94, 95
rotation 89-93
translation 88, 89
triangle inequality 181-183
trigonometric functions 208-210
tuple 45
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as
well as industry leading tools to help you plan your personal development and advance
your career. For more information, please visit our website.

Why subscribe?
• Spend less time learning and more time coding with practical eBooks and Videos
from over 4,000 industry professionals
• Improve your learning with Skill Plans built especially for you
• Get a free eBook or video every month
• Fully searchable for easy access to vital information
• Copy and paste, print, and bookmark content

Did you know that Packt offers eBook versions of every book published, with PDF and
ePub files available? You can upgrade to the eBook version at packt.com and as a print
book customer, you are entitled to a discount on the eBook copy. Get in touch with us at
[email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up
for a range of free newsletters, and receive exclusive discounts and offers on Packt books
and eBooks.

Other Books You May Enjoy
If you enjoyed this book, you may be interested in these other books by Packt:

Quantum Computing with Silq Programming


Srinjoy Ganguly, Thomas Cambier
ISBN: 978-1-80056-966-9
• Identify the challenges that researchers face in quantum programming
• Understand quantum computing concepts and learn how to make quantum circuits
• Explore Silq programming constructs and use them to create quantum programs
• Use Silq to code quantum algorithms such as Grover's and Simon's
• Discover the practicalities of quantum error correction with Silq
• Explore useful applications such as quantum machine learning in a practical way

Learn Quantum Computing with Python and IBM Quantum Experience


Robert Loredo
ISBN: 978-1-83898-100-6

• Explore quantum computational principles such as superposition and quantum entanglement
• Become familiar with the contents and layout of the IBM Quantum Experience
• Understand quantum gates and how they operate on qubits
• Discover the quantum information science kit and its elements such as Terra
and Aer
• Get to grips with quantum algorithms such as Bell State, Deutsch-Jozsa, Grover's
algorithm, and Shor's algorithm
• How to create and visualize a quantum circuit

Packt is searching for authors like you


If you're interested in becoming an author for Packt, please visit authors.packtpub.com
and apply today. We have worked with thousands of developers and
tech professionals, just like you, to help them share their insight with the global tech
community. You can make a general application, apply for a specific hot topic that we are
recruiting an author for, or submit your own idea.

Share Your Thoughts


Now you've finished Essential Mathematics for Quantum Computing, we'd love to hear
your thoughts! If you purchased the book from Amazon, please click here to go straight
to the Amazon review page for this book and share your feedback or leave a review on the
site that you purchased it from.
Your review is important to us and the tech community and will help us make sure we're
delivering excellent quality content.
