From the cryptography introduction, we are already familiar with hashing! If you don't remember it exactly, here is a quick recap!
Hashing, or a hash function, is a type of operation that takes arbitrary data as input and converts it to an output of fixed size. The fixed-size output is known as a hash digest. The digest size is defined by the hashing algorithm itself and does not depend on the size of the input. Hashing is used in cryptographic applications to store sensitive user details, such as passwords, in a database as digests rather than as plain text. Hashing is completely different from encryption since it is one-directional. Meaning, once the plain text is converted to a hash digest, it is computationally infeasible to recover the plain text from the digest.
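As a rough illustration of the fixed-size property, here is a minimal Python sketch using the standard `hashlib` module. The helper name `sha256_hex` and the sample inputs are made up for this example, and SHA-256 is just one possible choice of hash function.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# A one-byte input and a one-megabyte input (illustrative values only).
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Both digests are 64 hex characters (256 bits), whatever the input size.
print(len(short_digest), len(long_digest))
print(short_digest)
print(long_digest)
```

The two digests have the same length but different values, and there is no function in `hashlib` that turns a digest back into its input.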
Hash functions should be deterministic, that is, they should always yield the same hash value when the input is the same. Even a minute change in the input should produce a completely different hash value. For authentication, the password a user enters is fed to the hash function and the resulting hash digest is compared to the stored hash value. We might have heard of hash tables. Hashing can also be used to identify duplicate data in databases, to speed up table lookups, or to remove duplicate data to save space. Like symmetric key block ciphers, hash functions operate on blocks of data.
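The properties above can be sketched in a few lines of Python. This is only a sketch under some assumptions: the function names, sample passwords, and iteration count are invented for illustration, and the password part swaps in the salted PBKDF2 function from `hashlib` rather than a single bare hash, since that is closer to how password digests are usually stored.

```python
import hashlib
import hmac
import os


def digest(data: bytes) -> bytes:
    """Plain SHA-256 digest, used here to show determinism."""
    return hashlib.sha256(data).digest()


def bit_difference(a: bytes, b: bytes) -> int:
    """Count how many bits differ between two equal-length digests."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Same input, same digest.
same = digest(b"password123") == digest(b"password123")

# Inputs differing in one character give digests that differ in many bits.
changed_bits = bit_difference(digest(b"password123"), digest(b"password124"))


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a digest for storage (assumed parameters: SHA-256, 100,000 rounds)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def verify_password(candidate: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the candidate and compare it to the stored digest."""
    return hmac.compare_digest(hash_password(candidate, salt), stored)


salt = os.urandom(16)                         # random per-user salt
stored = hash_password("correct horse", salt)  # what the database would keep

print(same, changed_bits)
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Note that the database only ever holds `salt` and `stored`; the login check works by hashing the candidate again, never by decoding the stored value.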
Hashing Algorithms
Message Digest 5 (MD5) is a widely used hash function designed by Ronald Rivest. It operates on 512-bit blocks and generates 128-bit hash digests. In 1996, a design flaw was discovered in MD5, and cryptographers recommended using the SHA-1 hash instead. Since the flaw was not considered critical at the time, MD5 continued to be used. In 2004, it was discovered that MD5 is susceptible to hash collisions, and security researchers were then able to generate two different files that have matching MD5 hash digests.
In 2008, security researchers took this a step further and demonstrated the ability to create a fake SSL certificate that validated due to an MD5 hash collision. Due to these serious vulnerabilities in the hash function, it was recommended to stop using MD5. In 2012, an MD5 hash collision was used for nefarious purposes in the Flame malware.
| Name of the Algorithm | Input block size (in bits) | Output hash length (in bits) |
|---|---|---|
| Message Digest 5 (MD5) | 512 | 128 |
| Secure Hash Algorithm 1 (SHA-1) | 512 | 160 |
| SHA-256 (SHA-2 family) | 512 | 256 |
| SHA-384 (SHA-2 family) | 1024 | 384 |
| SHA-512 (SHA-2 family) | 1024 | 512 |
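The block and digest sizes in the table above can be checked against Python's `hashlib`, which exposes them (in bytes) as the `block_size` and `digest_size` attributes of a hash object. This is a small sketch; the variable names are illustrative, and it assumes the local Python build still provides MD5, which some restricted builds disable.

```python
import hashlib

# Algorithm names as understood by hashlib.new().
ALGORITHMS = ["md5", "sha1", "sha256", "sha384", "sha512"]

sizes = {}
for name in ALGORITHMS:
    h = hashlib.new(name)
    # hashlib reports sizes in bytes; multiply by 8 to get bits.
    sizes[name] = (h.block_size * 8, h.digest_size * 8)

for name, (block_bits, digest_bits) in sizes.items():
    print(f"{name:8s} block={block_bits:5d} bits  digest={digest_bits:4d} bits")
```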
Secure Hash Algorithm 1 (SHA-1) was designed by the National Security Agency (NSA) and published in 1995. It operates on 512-bit blocks and generates a 160-bit hash digest. SHA-1 has been widely used in network protocols like TLS/SSL, PGP, SSH, and IPsec. SHA-1 is also used in version control systems like Git, which uses hashes to identify revisions and ensure data integrity by detecting corruption or tampering. In 2010, the US National Institute of Standards and Technology (NIST) recommended stopping the use of SHA-1 and relying on SHA-2 instead. Many other organizations have also recommended replacing SHA-1 with SHA-2 or SHA-3.
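To make the Git usage a little more concrete, here is a minimal sketch of how Git derives the identifier of a file's contents (a "blob") when a repository uses its default SHA-1 object format: it hashes a short header followed by the content. The function name `git_blob_id` is invented for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object id Git assigns to a blob with this content."""
    # Git prefixes the content with "blob <size in bytes>" and a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The id of an empty file; it should match `git hash-object` on an empty file.
print(git_blob_id(b""))

# Any change to the content changes the id, which is how corruption shows up.
print(git_blob_id(b"hello\n") == git_blob_id(b"hello!\n"))
```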
SHA-1 also has its vulnerabilities. During the 2000s, a number of theoretical attacks were formulated and some partial collisions were demonstrated, but full collisions using these methods required significant computing power. In early 2017, the first full collision of SHA-1 was published. Using significant CPU and GPU resources, two different PDF files were created that result in the same SHA-1 hash.
There’s also the concept of a Message Integrity Check (MIC). A MIC is essentially a hash digest of the message that lets the receiver check whether the data was corrupted during transmission. It doesn’t use secret keys, which means the message isn’t authenticated: anyone who alters the message can simply recompute the digest. MICs therefore only protect against accidental corruption or loss and do not protect against tampering or other malicious actions.
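The limitation of an unkeyed MIC can be sketched as follows. All names, messages, and keys here are made up for illustration, and the keyed variant uses HMAC from Python's standard `hmac` module purely as a contrast; it is not something the text above describes.

```python
import hashlib
import hmac


def mic(message: bytes) -> bytes:
    """Unkeyed integrity check: just a SHA-256 digest of the message."""
    return hashlib.sha256(message).digest()


def mic_ok(message: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(mic(message), tag)


def keyed_tag(message: bytes, key: bytes) -> bytes:
    """Keyed check (HMAC-SHA256), shown only for comparison."""
    return hmac.new(key, message, hashlib.sha256).digest()


original = b"transfer 10 to alice"
tag = mic(original)

# Accidental corruption: one byte flipped in transit, tag unchanged.
corrupted = b"transfer 10 to alicf"
detects_corruption = not mic_ok(corrupted, tag)

# Deliberate tampering: the attacker replaces the message AND recomputes the tag.
forged = b"transfer 9999 to mallory"
forgery_accepted = mic_ok(forged, mic(forged))

# With a shared secret key, the attacker cannot produce a matching tag.
key = b"shared-secret-for-illustration"
attacker_tag = keyed_tag(forged, b"attacker-guess")
keyed_forgery_accepted = hmac.compare_digest(attacker_tag, keyed_tag(forged, key))

print(detects_corruption, forgery_accepted, keyed_forgery_accepted)
```

The unkeyed check catches the flipped byte but happily accepts the forged message, because nothing stops the attacker from hashing their own text.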
Terminology and Related Terms:
- Block Cipher: It is an encryption method that applies a deterministic algorithm along with a symmetric key to encrypt a block of text, rather than encrypting one bit at a time as in stream ciphers.
- Stream Cipher: It is a method of encrypting text (to produce ciphertext) in which a cryptographic key and algorithm are applied to each binary digit in a data stream, one bit at a time.
- Symmetric Key: It is an encryption system where the sender and receiver of messages use a single common key to encrypt and decrypt messages. This is also known as private-key or secret-key cryptography.
- Hash collision: When two different inputs or files generate the same hash digest, it is known as a hash collision.
- Secure Sockets Layer (SSL) Certificate: It is a data file installed on a web server that ties the server's identity to a cryptographic key. The browser checks it when connecting, which allows the site to move from HTTP to HTTPS over a secured SSL/TLS connection.
- Pretty Good Privacy (PGP): It is a data encryption program that gives cryptographic privacy and authentication for online communication. It is often used to encrypt and decrypt texts, emails, and files.
- Secure Shell (SSH): It is a protocol for communicating securely with a remote computer over an unsecured network. It has many applications, including remote login, remote command execution, and modifying data on the host computer.
- Internet Protocol Security (IPsec): It is a secure network protocol suite that authenticates and encrypts the packets of data to provide secure encrypted communication between two computers over an Internet Protocol network.
- Git: It is a version control system designed to track the changes in source code during software development.
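The "hash collision" entry above can be illustrated with a toy experiment. Finding a collision for a full modern digest is far out of reach for a short script, so this sketch deliberately weakens the hash by keeping only the first 5 hex characters (20 bits) of SHA-256. The names `tiny_hash` and `find_collision` and the generated messages are all made up for this example.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes, hex_chars: int = 5) -> str:
    """A deliberately weak hash: SHA-256 truncated to a few hex characters."""
    return hashlib.sha256(data).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 5):
    """Hash successive messages until two different ones share a tiny_hash."""
    seen = {}  # maps truncated digest -> first message that produced it
    for i in count():
        message = f"message-{i}".encode("ascii")
        h = tiny_hash(message, hex_chars)
        if h in seen:
            return seen[h], message, h
        seen[h] = message


first, second, shared = find_collision()
print(first, second, shared)

# The full SHA-256 digests of the two messages still differ.
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
```

With only 20 bits of output a collision usually turns up after roughly a thousand messages, which gives a feel for why short digests are unsafe and why longer ones are recommended.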
That is all about hashing in the context of cybersecurity. There are more cryptographic algorithms, and we shall get to them gradually. Till then, have a safe and healthy learning journey!
Stay Safe and Spread Knowledge!!