Vector spaces
We keep hearing that artificial intelligence (AI) runs on vectors. Fair enough. But what does that really mean for us when we’re coding?
Data as points
Picture every piece of data as a point. A movie, an image, a line of text—each one lands somewhere in a high-dimensional space. The coordinates are numbers, lots of them, and together they form the vector. Not magic. Just math.
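To make that concrete, here's a toy sketch that turns short texts into vectors using bag-of-words counts. The vocabulary is made up for illustration; real systems use learned embeddings with far more dimensions.

```python
# Toy vocabulary — each word is one coordinate of the vector.
# (Made up for illustration; real embeddings are learned, not counted.)
vocab = ["cat", "dog", "sat", "ran", "mat"]

def to_vector(text):
    """Map a text to a point: one count per vocabulary word."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

print(to_vector("the cat sat on the mat"))  # [1, 0, 1, 0, 1]
```

Each sentence now lands at a specific point in 5-dimensional space, which is all "data as points" really means.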
Similarity matters
AI figures out that two things are “alike” by checking how close their vectors sit. Closer usually means more similar. We measure that closeness with metrics like cosine similarity or Euclidean distance. It’s basically “which neighbor do you live near in vector-land.” That’s how a model can tell two sentences mean the same thing even when the words are different.
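Both metrics fit in a few lines of plain Python. A minimal sketch, with toy vectors chosen just to show the contrast:

```python
def cosine_similarity(a, b):
    """Angle-based similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two points: smaller means closer."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Toy vectors: the first two point the same way, the third doesn't.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

Cosine similarity ignores length and only compares direction, which is why it's the usual pick for comparing embeddings.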
Too many dimensions
High-dimensional space sounds cool until it’s a mess. Models give us vectors with hundreds or thousands of numbers. Hard to see, hard to work with. We use dimensionality reduction—PCA, t-SNE, UMAP—so we can drop some numbers without losing the big picture. It’s like folding a roadmap so we can actually carry it.
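Here's a rough sketch of the idea behind PCA, the simplest of the three: find the direction the data varies most along, then keep only the coordinate along that direction. This uses power iteration in plain Python on made-up data; in practice we'd reach for a library like scikit-learn instead.

```python
import random

def pca_1d(points, iters=200):
    """Project d-dimensional points onto their first principal component,
    found by power iteration on the (implicit) covariance matrix."""
    n, d = len(points), len(points[0])
    # Center the data around its mean.
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    X = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Power iteration: repeatedly apply X^T X to converge on the
    # direction of greatest variance.
    v = [random.random() + 0.1 for _ in range(d)]
    for _ in range(iters):
        Xv = [sum(x[j] * v[j] for j in range(d)) for x in X]
        w = [sum(X[i][j] * Xv[i] for i in range(n)) for j in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    # One coordinate per point: its position along that direction.
    return [sum(x[j] * v[j] for j in range(d)) for x in X]

# Toy 3-D points that actually lie on a line — 1-D captures everything.
points = [[i, 2.0 * i, 3.0 * i] for i in range(5)]
print(pca_1d(points))
```

t-SNE and UMAP are cleverer about preserving neighborhoods, but the spirit is the same: fewer numbers, same big picture.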
Seeing the shapes
Once the vectors shrink, we can plot them. Clusters appear. Sentences with the same meaning huddle together. Outliers drift off. It’s not just pretty pictures; it’s how we sanity-check what the model is really learning. If nothing clusters, something’s off.
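That gut check doesn't even need a plot. A minimal sketch, with two made-up "meaning" groups: if points sharing a meaning sit closer to each other than to the other group, the clusters are real.

```python
def dist(a, b):
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def mean_pairwise(group_a, group_b):
    """Average distance over all pairs drawn from the two groups."""
    pairs = [(a, b) for a in group_a for b in group_b if a is not b]
    return sum(dist(a, b) for a, b in pairs) / len(pairs)

# Two toy clusters in 2-D (e.g. vectors after dimensionality reduction).
group_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
group_b = [[5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]

within = mean_pairwise(group_a, group_a)
between = mean_pairwise(group_a, group_b)
print(within < between)  # True — same-meaning points huddle together
```

If `within` isn't clearly smaller than `between`, the vectors aren't encoding the similarity we hoped for.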
What we take away
We don’t need to love linear algebra. We just need to know vectors are how the model thinks. Similarity is distance. Reduction is cleanup. Visualization is the gut check. That’s enough to get us coding without faking it.