0.1.11 GPU β€” the artist and the parallel calculator

In one line: the GPU started as a chip for drawing triangles fast β€” and accidentally became the engine of the AI revolution.

An Nvidia graphics card with the heatsink removed, exposing the GPU chip.
A GPU is not just a display part anymore. It is a parallel calculator, which is why password cracking, crypto mining, simulation, and AI all care about it. Image: Wikimedia Commons, Nvidia Geforce 6600GT GPU 2009-01-27.jpg.

A story β€” from Quake to ChatGPT

In 1993, three engineers in a Denny’s restaurant in California sketched a chip that would do 3D graphics so fast that PC games could finally look like arcade games. They called the company NVIDIA. The name comes from β€œinvidia” β€” Latin for envy.

For a decade, GPUs (Graphics Processing Units) had one job: take a list of triangles, calculate where each pixel of each triangle should be drawn, and shade them. They were narrow but brutally good at it β€” hundreds of tiny processors all doing the same simple math in parallel.

In 2006 NVIDIA released CUDA β€” a way to write general-purpose code that ran on the GPU. Researchers started realising: β€œwait, neural networks are mostly just multiplying huge matrices, and that’s the exact same shape as drawing triangles.” A 2012 paper called AlexNet trained a neural network on two GeForce GTX 580s β€” and obliterated every previous image-recognition record. The deep-learning era began.
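The "same shape as drawing triangles" observation is concrete: a neural-network layer is a matrix multiply, the same multiply-accumulate pattern a GPU uses to transform vertices, just with much bigger matrices. A minimal NumPy sketch (the layer sizes are illustrative, not from any real model):

```python
import numpy as np

# A dense neural-network layer is just y = activation(W @ x + b).
# This is the workload GPUs accidentally turned out to be perfect for.
rng = np.random.default_rng(0)

x = rng.standard_normal(784)         # input: a flattened 28x28 image
W = rng.standard_normal((128, 784))  # weights: 128 neurons, 784 inputs each
b = np.zeros(128)                    # biases

y = np.maximum(W @ x + b, 0)         # matrix multiply, then ReLU activation

print(y.shape)  # (128,) -- one output per neuron
```

Every one of the 128 Γ— 784 multiply-adds in that `W @ x` is independent of the others β€” which is exactly what lets a GPU do them all at once.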

By 2024, NVIDIA was the most valuable company in the world. Not because of games. Because every AI lab on Earth was queueing for H100 chips at $30,000 each. The GPU went from β€œthing that draws Doom” to β€œthing that runs civilisation” in 30 years.

What’s actually going on

A CPU has a few powerful cores (4-64) optimised for latency β€” finishing one complex task as fast as possible. A GPU has thousands of simple cores optimised for throughput β€” doing the same simple operation on millions of pieces of data simultaneously.

                    CPU                        GPU
    Cores           4-64                       1,000-20,000
    Per-core speed  Very fast                  Slow
    Best at         Branchy, sequential code   Parallel arithmetic on huge data
    Memory          Shares system RAM          Has its own dedicated VRAM (8-80 GB)

Modern GPUs sit in a PCIe slot with their own fans. They have their own VRAM (Video RAM, very fast β€” typically GDDR6 or HBM3) and their own driver. Programs explicitly upload data to the GPU, ask the GPU to compute, and download the result.
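That explicit upload/compute/download dance is visible in GPU libraries like CuPy, where `cp.asarray` copies a buffer into VRAM and `cp.asnumpy` copies the result back. A sketch of the pattern, with NumPy standing in for the GPU steps so it runs on any machine (CuPy itself needs a CUDA GPU):

```python
import numpy as np

# The GPU programming pattern: upload -> compute -> download.
# On a real GPU with CuPy, step 1 is cp.asarray (host RAM to VRAM
# over PCIe), step 2 is ordinary array math executed on the GPU,
# and step 3 is cp.asnumpy (VRAM back to host RAM).

host_data = np.arange(1_000_000, dtype=np.float32)

# 1. Upload: cp.asarray(host_data) would copy this buffer into VRAM.
device_data = host_data.copy()

# 2. Compute: one elementwise operation over a million values --
#    the shape of work thousands of tiny cores chew through at once.
device_result = np.sqrt(device_data) * 2.0

# 3. Download: cp.asnumpy(device_result) would copy the answer back.
result = device_result

print(result[:3])
```

The copies in steps 1 and 3 are why small jobs often run *slower* on a GPU: the PCIe round trip costs more than the computation saves.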

If you’re not doing graphics, scientific simulation, crypto mining, or AI β€” your GPU is mostly idle. But for those workloads it’s 10-100Γ— faster than a CPU.

Why a hacker cares

  • GPU password cracking β€” a tool called Hashcat uses GPUs to try billions of password guesses per second against fast hashes like NTLM or MD5. An 8-character password? Cracked in hours, not years. This is why everything modern uses deliberately slow key-derivation functions (bcrypt, scrypt, Argon2) β€” to make GPU brute-force impractical.
  • Crypto-mining malware β€” quietly hijacks the GPU to mine cryptocurrency. Symptoms: slow GPU, hot card, big power bill.
  • AI/ML attack surface β€” the entire AI security niche (which we’re aiming for) is built on GPU-trained models. Attacks on ML models include adversarial examples, model extraction, training data poisoning, prompt injection β€” all happen at the GPU/model layer.
  • Side channels β€” power and timing side-channels on GPUs are a research area. GPU-resident malware that hides from CPU-based detection has been demonstrated in academic papers.
  • VRAM forensics β€” GPU memory often holds decrypted keys, frame buffers, and model weights long after the user thinks they’re gone.
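The bcrypt/scrypt/Argon2 point is just arithmetic: slowing each guess from nanoseconds to milliseconds turns an hours-long crack into centuries. A back-of-envelope calculation (the guess rates are rough ballpark figures for a single modern GPU, not benchmarks):

```python
# Brute-force time for an 8-character upper+lower+digit password
# against a fast hash vs. a deliberately slow key-derivation function.

keyspace = 62 ** 8               # A-Z, a-z, 0-9: every possible 8-char password

fast_hash_rate = 10_000_000_000  # ~10 billion NTLM/MD5 guesses/sec (ballpark)
kdf_rate = 10_000                # ~10k bcrypt guesses/sec at a high cost factor

fast_hours = keyspace / fast_hash_rate / 3600
kdf_years = keyspace / kdf_rate / (3600 * 24 * 365)

print(f"fast hash: {fast_hours:.1f} hours to exhaust")   # a few hours
print(f"bcrypt:    {kdf_years:.0f} years to exhaust")    # centuries
```

Same keyspace, same attacker β€” the only variable that changed is the cost of one guess. That is the entire design argument for slow KDFs.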

In one sketch

   CPU                            GPU
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”                      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚ 4 to  β”‚  big cores           β”‚ β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘ β”‚
   β”‚ 64    β”‚  good at one thing   β”‚ β–‘ thousands of tiny β–‘ β”‚
   β”‚ cores β”‚  at a time           β”‚ β–‘ cores all doing   β–‘ β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”˜                      β”‚ β–‘ the same simple   β–‘ β”‚
                                  β”‚ β–‘ thing in parallel β–‘ β”‚
                                  β”‚ β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘ β”‚
                                  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
   sequential, low-latency        parallel, high-throughput

Memory peg

CPU is one chef cooking ten different dishes. GPU is a thousand line cooks all making the same omelette. AI is a lot of omelettes.