What Is Hashing?

Last updated May 10th, 2022

Hashing is a technique used to convert a key into another value (typically for cryptographic or data storage purposes). It works by running a mathematical function (called a hash function) on a key to create a new value: the hash (or the hash value).
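To make that concrete, here is a minimal sketch of a hash function that turns a string key into a number. The name hashString and the djb2-style formula are assumptions picked for illustration; this is not a cryptographically secure hash.

```typescript
// Hypothetical helper: a djb2-style string hash, for illustration only.
// It is NOT cryptographically secure.
function hashString(key: string): number {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    // hash * 33 + character code, kept within the unsigned 32-bit range
    hash = (Math.imul(hash, 33) + key.charCodeAt(i)) >>> 0;
  }
  return hash;
}

const firstRun = hashString("khalil@example.com");
const secondRun = hashString("khalil@example.com");
console.log(firstRun === secondRun); // the same key always yields the same hash value
```

The important property is determinism: hashing the same key twice always gives the same value, which is what lets us find the data again later.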

One-way hashes

A cryptographically secure hash function should be one-way (a one-way hash). This means that it is computationally infeasible to recover the original key from the hash value.

Collisions

If a hash function yields the same hash value for two or more keys, we end up with a collision. This is not ideal, but there are ways to handle it (see this article).
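A collision is easy to reproduce with a deliberately weak hash. The sketch below uses a hypothetical toy function, sumOfCharCodes, that adds up character codes, so any two anagrams produce the same hash value.

```typescript
// Hypothetical toy hash: sums character codes, so anagrams always collide.
function sumOfCharCodes(key: string): number {
  let sum = 0;
  for (let i = 0; i < key.length; i++) {
    sum += key.charCodeAt(i);
  }
  return sum;
}

const hashA = sumOfCharCodes("listen");
const hashB = sumOfCharCodes("silent");
console.log(hashA === hashB); // true: two different keys, one hash value
```

Better hash functions make collisions rarer, but whenever there are more possible keys than possible hash values, some keys must share a value.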

When do we use hashing?

Implementing hash tables

Hash tables are a data structure that helps us store key/value data. They are composed of two parts: a storage structure (be it an array, object, etc.) and a hash function.

The hash function is responsible for fitting data within the data structure so that it can be retrieved later.

If the storage is bounded (fixed), such as an array, then the hash function must generate a value that can be used as the index and that fits within the size constraints of the array.

For example, a trivial formula to compute the index for an integer key (or for a string key that has first been converted to an integer) is:

index = key % sizeOfTable

This would work, but for any fixed-size table, we’re likely to run into collisions eventually, because there are more possible keys than there are slots.
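The formula above can be sketched as a tiny fixed-size table. The class name FixedHashTable is hypothetical, and separate chaining (each slot holds a small list of entries) is an assumed collision strategy chosen only to keep the example short; it is one of several ways to handle collisions.

```typescript
// A minimal fixed-size hash table sketch for non-negative integer keys.
// Separate chaining is an assumption made for this example.
class FixedHashTable<V> {
  private sizeOfTable: number;
  private buckets: Array<Array<[number, V]>>;

  constructor(sizeOfTable: number) {
    this.sizeOfTable = sizeOfTable;
    this.buckets = Array.from({ length: sizeOfTable }, (): Array<[number, V]> => []);
  }

  // index = key % sizeOfTable
  private indexFor(key: number): number {
    return key % this.sizeOfTable;
  }

  set(key: number, value: V): void {
    const bucket = this.buckets[this.indexFor(key)];
    const entry = bucket.find(([k]) => k === key);
    if (entry) {
      entry[1] = value; // overwrite the value for an existing key
    } else {
      bucket.push([key, value]); // colliding keys share the same bucket
    }
  }

  get(key: number): V | undefined {
    const bucket = this.buckets[this.indexFor(key)];
    const entry = bucket.find(([k]) => k === key);
    return entry ? entry[1] : undefined;
  }
}

const table = new FixedHashTable<string>(10);
table.set(12, "twelve");
table.set(22, "twenty-two"); // 12 % 10 and 22 % 10 are both 2, so these collide
console.log(table.get(12), table.get(22)); // both values are still retrievable
```

Because each bucket stores the original key next to the value, a lookup can tell colliding entries apart even though they share an index.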

See here for “Collision detection and dynamic array resizing”.

When storage is unbounded, like when using a JavaScript object, we don’t have to manage collisions ourselves, because the runtime takes care of them. We can just use some unique aspect of the data (such as an email address, phone number, or a compound key) as the key under which we store it.
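As a rough sketch of the unbounded case, the example below stores users under their email address. The User shape and the helper names are hypothetical, and a Map is used in place of a plain object so that inherited property names (such as "constructor") cannot clash with keys.

```typescript
// Hypothetical record shape for this example.
interface User {
  email: string;
  displayName: string;
}

// A Map grows as needed; the runtime handles hashing and collisions internally.
const usersByEmail = new Map<string, User>();

function addUser(user: User): void {
  // Assumes the email is unique per user; lower-casing keeps lookups consistent.
  usersByEmail.set(user.email.toLowerCase(), user);
}

function findUser(email: string): User | undefined {
  return usersByEmail.get(email.toLowerCase());
}

addUser({ email: "Ada@example.com", displayName: "Ada" });
console.log(findUser("ada@example.com")?.displayName);
```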

However, if security is important, we should still use a cryptographic function to create the hash.

That leads to the final point.

Cryptography & data encryption

Hashing is a very common technique in cryptography. The most common reason we hash is to convert passwords into something more secure than plain text before we store them in a database. We do this for two reasons:

To make sure that a hacker who steals the records from the database doesn’t get the actual passwords, just the hashed ones (which cannot feasibly be reversed, since they are one-way hashes).
To prevent staff from accidentally viewing plain-text passwords in the production database.

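MD5 and SHA-2 are two widely known cryptographic hash functions, but MD5 is considered broken, and fast general-purpose hashes like these are a poor fit for passwords on their own. To hash passwords, check out a dedicated, salted password-hashing function such as bcrypt, scrypt, or Argon2.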
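For stored passwords, a salted, deliberately slow password-hashing function is generally preferred over a fast general-purpose hash. Here is a minimal sketch, assuming Node.js and its built-in crypto module. Using scrypt and the "salt:hash" storage format are choices made for this example, not something the article prescribes.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Returns "salt:hash" in hex. The storage format is an assumption for this sketch.
function hashPassword(password: string): string {
  const salt = randomBytes(16); // a fresh random salt per password
  const derived = scryptSync(password, salt, 64);
  return `${salt.toString("hex")}:${derived.toString("hex")}`;
}

function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  const expected = Buffer.from(hashHex, "hex");
  // Re-derive the hash with the stored salt, then compare in constant time.
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  return timingSafeEqual(actual, expected);
}

const storedRecord = hashPassword("correct horse battery staple");
console.log(verifyPassword("correct horse battery staple", storedRecord)); // true
console.log(verifyPassword("wrong guess", storedRecord)); // false
```

Only the salt and the derived hash are stored. To check a login attempt, we hash the submitted password with the same salt and compare the results; the plain-text password never touches the database.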

About the author

Khalil Stemmler,
Software Essentialist ⚡
