What Is Hashing?

Last updated May 10th, 2022
Hashing is a technique used to convert a key into another value (typically for cryptographic or data storage purposes).

It works by running a mathematical function (called a hash function) on a key to create a new value: the hash (or the hash value).
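To make that concrete, here is a minimal sketch of a hash function for string keys. It follows the well-known djb2 scheme (start at 5381, multiply by 33, add each character code). The name `simpleHash` is our own, and this is a non-cryptographic illustration rather than something to use for security.

```typescript
// A small, non-cryptographic hash function (djb2-style).
// `simpleHash` is a hypothetical name used only for this illustration.
function simpleHash(key: string): number {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    // Multiply by 33, add the character code, and keep the result
    // within an unsigned 32-bit integer.
    hash = (hash * 33 + key.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// The same key always produces the same hash value.
console.log(simpleHash("khalil") === simpleHash("khalil")); // true
```

The important property is determinism: as long as the key doesn't change, neither does the hash.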

[Figure: hash-function.png]

One-way hashes

A cryptographically secure hash function is considered good when it is one-way (a one-way hash). This means that it is computationally infeasible to recover the original key from the hash value.
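As a rough sketch of what a one-way hash looks like in practice, the snippet below uses SHA-256 from Node's built-in `crypto` module. We are assuming a Node.js environment, and the helper name `sha256Hex` is our own.

```typescript
import { createHash } from "node:crypto";

// Hash a key with SHA-256 and return the digest as a hex string.
// `sha256Hex` is a hypothetical helper name, not part of Node's API.
function sha256Hex(key: string): string {
  return createHash("sha256").update(key, "utf8").digest("hex");
}

const digest = sha256Hex("abc");
console.log(digest.length); // 64 hex characters, whatever the input size
```

Notice that there is no matching "unhash" function. Going from the digest back to the key means guessing inputs until one matches.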

Collisions

If a hash function yields the same hash value for two or more different keys, we end up with a collision, which is not ideal. There are ways to handle this (see this article).
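Here is a small sketch of a collision. The deliberately weak `weakHash` function (a name we made up for this example) just adds up character codes, so any two keys containing the same letters in a different order collide.

```typescript
// A deliberately weak hash: the sum of the character codes.
// `weakHash` is a hypothetical function used only to show a collision.
function weakHash(key: string): number {
  let total = 0;
  for (let i = 0; i < key.length; i++) {
    total += key.charCodeAt(i);
  }
  return total;
}

// "ab" and "ba" are different keys but share the same hash value.
console.log(weakHash("ab")); // 195
console.log(weakHash("ba")); // 195
```

Better hash functions make collisions much rarer, but any function that maps a large set of keys onto a smaller set of values will have some.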

When do we use hashing?

Implementing hash tables

Hash tables are a data structure that helps us store key/value data. They are composed of two parts: a storage structure (be it an array, object, etc.) and a hash function.

The hash function decides where each piece of data goes within the storage structure so that it can be retrieved later.

If the storage is bounded (fixed), such as an array, then the hash function must generate a value that can be used as the index and that fits within the size constraints of the array.

For example, a trivial formula to compute the index of an integer key (or a string key that has first been converted to an integer) is:

index = key % sizeOfTable

This would work, but for any fixed-size table, we’re likely to run into collisions eventually.

See here for “Collision detection and dynamic array resizing”.
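The modulo formula above can be sketched like this. The helper names (`toInteger`, `indexFor`) and the way we turn a string into an integer (summing character codes) are assumptions made for illustration; the original formula doesn't specify them.

```typescript
// Convert a key to a non-negative integer so we can apply the modulo.
// Summing character codes for strings is an assumption of this sketch.
function toInteger(key: number | string): number {
  if (typeof key === "number") {
    return Math.abs(Math.trunc(key));
  }
  let total = 0;
  for (let i = 0; i < key.length; i++) {
    total += key.charCodeAt(i);
  }
  return total;
}

// index = key % sizeOfTable
function indexFor(key: number | string, sizeOfTable: number): number {
  return toInteger(key) % sizeOfTable;
}

const sizeOfTable = 5;
const table = new Array<string | undefined>(sizeOfTable).fill(undefined);

table[indexFor(7, sizeOfTable)] = "seven"; // 7 % 5 = 2

// 12 % 5 is also 2, so key 12 would land in the slot key 7 already uses.
console.log(indexFor(7, sizeOfTable) === indexFor(12, sizeOfTable)); // true
```

With only five slots, the sixth distinct key is guaranteed to share a slot with an earlier one, which is why a fixed-size table needs a collision strategy.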

When storage is unbounded, like when using a JavaScript object, we don’t have to handle collisions ourselves. We can just use some unique aspect of the data (such as an email address, phone number, or a compound key) to create the hash value to use as an index. As long as those values are unique, no two entries compete for the same slot.

However, if security is important, we should still use a cryptographic hash function to create the hash.
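As a quick sketch of the unbounded case, here we key records by email address. We use a `Map` rather than a plain object (our choice for this example; both act as unbounded key/value stores), and the `User` shape and function names are hypothetical.

```typescript
// Hypothetical record shape for this illustration.
interface User {
  email: string;
  name: string;
}

// Unbounded storage keyed by a unique aspect of the data (the email).
const usersByEmail = new Map<string, User>();

function saveUser(user: User): void {
  // Normalise the key so "A@x.com" and "a@x.com" refer to the same entry.
  usersByEmail.set(user.email.toLowerCase(), user);
}

function findUser(email: string): User | undefined {
  return usersByEmail.get(email.toLowerCase());
}

saveUser({ email: "Ada@example.com", name: "Ada" });
console.log(findUser("ada@example.com")?.name); // "Ada"
```

The runtime decides where each entry lives, so we never compute an index ourselves.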

That leads to the final point.

Cryptography & data encryption

Hashing is a very common technique in cryptography. The most common reason we hash is to convert passwords into something more secure than plain text before we store them in a database. We do this for two reasons:

  1. If a hacker steals the records from the database, they don’t get the actual passwords, just the hashed ones (which cannot practically be reversed, since they are one-way hashes).
  2. To prevent staff from accidentally viewing plain-text passwords in the production database.

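MD5 and SHA-2 are two widely used cryptographic hashes, but neither is a good choice for passwords on its own: MD5 is considered broken, and both are designed to be fast, which makes guessing passwords cheap for an attacker. To hash passwords, check out a slow, salted password-hashing function such as bcrypt, scrypt, Argon2, or PBKDF2.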
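Here is a minimal sketch of password hashing, assuming a Node.js environment. It uses scrypt from Node's built-in `crypto` module, a slow, salted function designed for passwords, in place of a bare MD5 or SHA-2 digest. The function names and the `salt:hash` storage format are our own assumptions, not a standard.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Hash a password with a fresh random salt.
// The "salt:hash" hex format is a convention invented for this sketch.
function hashPassword(password: string): string {
  const salt = randomBytes(16);
  const derived = scryptSync(password, salt, 64);
  return `${salt.toString("hex")}:${derived.toString("hex")}`;
}

// Re-hash the attempt with the stored salt and compare the results.
function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  if (!saltHex || !hashHex) {
    return false;
  }
  const expected = Buffer.from(hashHex, "hex");
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(actual, expected);
}

const stored = hashPassword("correct horse battery staple");
console.log(verifyPassword("correct horse battery staple", stored)); // true
console.log(verifyPassword("wrong password", stored)); // false
```

Only the `stored` string goes into the database. Because each password gets its own salt, two users with the same password end up with different stored values.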




About the author

Khalil Stemmler,
Developer Advocate @ Apollo GraphQL ⚡

Khalil is a software developer, writer, and musician. He frequently publishes articles about Domain-Driven Design, software design and Advanced TypeScript & Node.js best practices for large-scale applications.


