File hashing is widely used in applications to support data integrity, security, and efficiency.
Common reasons to use file hashing include:
- Data Integrity Verification
  When transferring or storing files, there is always a risk of data corruption or tampering. By generating a hash value based on the file's contents using a hashing algorithm (e.g., MD5, SHA-1, SHA-256), the application can later recompute the hash and compare it to the original hash to verify if the file has been altered or corrupted during transmission or storage.
- Deduplication
  In certain applications like cloud storage or backup systems, file deduplication is used to save storage space. Hashing enables you to identify duplicate files efficiently. Instead of storing multiple copies of the same file, the application can use a single hash to represent the duplicated content.
- Cache Management
  In caching systems, file hashing can be used to quickly check if a particular file has been accessed before and is available in the cache. The hash serves as a unique identifier for the file, allowing for faster retrieval and reducing unnecessary data transfer.
- Data Indexing and Searching
  In large-scale applications where files need to be indexed and searched quickly, file hashing can be used to build efficient data structures like hash tables, enabling fast lookup and retrieval based on file content.
- Malware Detection
  Antivirus and security applications use file hashing to identify known malware and viruses. By comparing the hash of a file against a database of known malicious hashes, these applications can quickly determine if a file is potentially harmful.
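The integrity check described above can be sketched with Python's standard `hashlib` module. This is a minimal illustration and not part of Glasswall Halo; the helper names `hash_file` and `is_unchanged` and the chunk size are assumptions chosen for the example.

```python
import hashlib

def hash_file(path, algorithms=("md5", "sha1", "sha256"), chunk_size=65536):
    """Return {algorithm: hex digest} for the file at `path`.

    Hypothetical helper: the file is read in chunks so that large
    files do not have to fit in memory.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}

def is_unchanged(path, expected_sha256):
    """Compare a freshly computed SHA-256 with a previously stored value."""
    # hexdigest() is lowercase, so normalise the stored value before comparing.
    return hash_file(path, ("sha256",))["sha256"] == expected_sha256.lower()
```

Storing the digest when a file is first received and calling `is_unchanged` later is one simple way to detect corruption or tampering.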
Glasswall Halo Events
- The Glasswall Halo API allows you to interact programmatically by sending requests to a specific endpoint, depending on how the file is supplied:
  - For binary files, use the api/v3/cdr-file endpoint.
  - For Base64-encoded files, use the api/v3/cdr endpoint.
- With each request, you can ask for file hashes to be generated in any combination of three formats: SHA-256, SHA-1, and MD5.
- Once the request is received, the input hash is generated, and Glasswall Halo then uses the Glasswall Embedded Engine to rebuild and analyse the input file.
- When the file is rebuilt successfully, the platform generates the output file hashes, and both the input and output hashes are returned in the response headers.
- If a request fails, the input hash has still been generated and is returned in the response headers.
- You can use these input and output hashes within your application or system, for example for security checks or further processing.
API Authentication
Glasswall Halo supports two types of authentication: **Basic** and Bearer authentication. Before making any request, you must authenticate using the appropriate scheme based on your configuration.
Basic
If your system is configured with Basic authentication, you need to obtain an Organisation ID and token from the system administrator. Combine these values in the format <OrganisationId>:<Token> and then Base64 encode them.
The resulting encoded value should be included in the Request Header:
Authorization: Basic ZGVtbzpwQDU1dzByZA==
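As a minimal sketch of the encoding step, the header value can be built with Python's standard `base64` module. The function name `basic_auth_header` is an assumption made for this example; the sample value shown above is the Base64 encoding of the placeholder pair `demo:p@55w0rd`.

```python
import base64

def basic_auth_header(organisation_id, token):
    """Build the Authorization header value for Basic authentication."""
    credentials = f"{organisation_id}:{token}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

# Placeholder credentials only; use the values from your system administrator.
headers = {"Authorization": basic_auth_header("demo", "p@55w0rd")}
```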
Bearer
For Bearer authentication, you will require a Bearer token obtained from your identity provider.
Once you have obtained the token, include it in the request header:
Authorization: Bearer <BearerToken>
Note: use the authentication method that matches your system's configuration. Requests that use the wrong scheme will not be authorized.
Learn more about authenticating Glasswall Halo
Request Construction
When processing binary or Base64 encoded files with Glasswall Halo, you can request any combination of SHA-256, SHA-1, and MD5 hashes in the API request. The generated hash values for both the input and output files are returned in the response headers, which lets you verify file integrity and track files within your application or system.
Binary File Processing
POST {baseUrl}/api/v3/cdr-file?generate-hash-types=SHA256,SHA1,MD5
Base64 Encoded File Processing
Submit the Base64 encoded string in the Request body to the following endpoint:
POST {baseUrl}/api/v3/cdr?generate-hash-types=SHA256,SHA1,MD5
Request Body Format
The body of the request should be in JSON format and include the Base64 field containing the Base64 encoded string of the file, and the fileName field specifying the original filename (including the appropriate file extension).
{
  "Base64": "string",
  "fileName": "filename.pdf"
}
Note
- Replace {baseUrl} with the actual base URL of the Glasswall Halo API.
- Provide the correct authentication header with each request.
- Setting generate-hash-types to a comma-separated list of hash types in the URL ensures that the requested hashes are generated and returned in the response.
- For binary file processing, use a multipart form post; for Base64 encoded file processing, provide the file content in the JSON request body with the appropriate filename.
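The Base64 request described above could be assembled as follows. This is a sketch using only the Python standard library: the function name `build_cdr_request`, the host `halo.example.com`, and the explicit `Content-Type` header are assumptions for illustration, while the endpoint path, query parameter, and body fields are those given above. The request is prepared but not sent.

```python
import base64
import json
import urllib.request

def build_cdr_request(base_url, auth_header, file_name, file_bytes,
                      hash_types=("SHA256", "SHA1", "MD5")):
    """Prepare (but do not send) a POST to the Base64 endpoint."""
    url = (f"{base_url.rstrip('/')}/api/v3/cdr"
           f"?generate-hash-types={','.join(hash_types)}")
    body = json.dumps({
        "Base64": base64.b64encode(file_bytes).decode("ascii"),
        "fileName": file_name,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Content-Type is assumed here because the body is JSON.
        headers={"Authorization": auth_header,
                 "Content-Type": "application/json"},
    )

# Hypothetical base URL and placeholder token; replace with your own values.
request = build_cdr_request(
    "https://halo.example.com", "Bearer <token>", "filename.pdf", b"%PDF-1.7"
)
# To send it: response = urllib.request.urlopen(request)
```

Passing a shorter tuple such as `("SHA256",)` requests only that hash type.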
Response Handling
A successful request to the Glasswall Halo API returns a 201 status code. If you requested hashes, both the input and output hashes are available in the response headers. If the request fails, only the input hash values are returned.
You can identify the hash values with the following header keys:
For SHA-256:
- Input File Hash: x-hash-sha256-input
- Output File Hash: x-hash-sha256-output
For SHA-1:
- Input File Hash: x-hash-sha1-input
- Output File Hash: x-hash-sha1-output
For MD5:
- Input File Hash: x-hash-md5-input
- Output File Hash: x-hash-md5-output
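One way to collect these headers in client code is sketched below. The function name `extract_hashes` is an assumption for this example, and the input is assumed to be a plain mapping of header names to values, as most HTTP clients can provide.

```python
HASH_TYPES = ("sha256", "sha1", "md5")

def extract_hashes(headers):
    """Collect x-hash-* values from a header mapping into a nested dict.

    Header names are matched case-insensitively. Entries that are absent
    (for example, output hashes after a failed request) are omitted.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    hashes = {}
    for hash_type in HASH_TYPES:
        for side in ("input", "output"):
            value = lowered.get(f"x-hash-{hash_type}-{side}")
            if value is not None:
                hashes.setdefault(hash_type, {})[side] = value
    return hashes
```

For a failed request, the result contains only `input` entries, which makes the missing output hashes easy to detect.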
Example
access-control-allow-origin: *
access-control-expose-headers: *
content-disposition: attachment; filename=18402777-7826-457f-bc02-6446611495e6.zip; filename*=UTF-8''18402777-7826-457f-bc02-6446611495e6.zip
content-length: 288843
content-type: application/octet-stream
date: Fri, 21 Jul 2023 12:42:15 GMT
strict-transport-security: max-age=31536000; includeSubDomains
x-applied-policy: service-dynamic
x-file-size: 301189
x-filetype: pdf
x-hash-md5-input: AF5EB421F8AFC69AC4178ACF747E618C
x-hash-md5-output: 8018FC4F37AB858442F354430203CEB4
x-hash-sha1-input: 9C714CAC1F82A450728F3C046CEFEFAF0BBD2C07
x-hash-sha1-output: 058991F3A252F0A4FD0E7F8B3178D3F15494099B
x-hash-sha256-input: 5F83AB533252C00AD1C60A0EFC016BF273497F3179FD6CE642AD8A4857148B8D
x-hash-sha256-output: 9D5D88F073D74160828F92E9E8DADAC5712706C65D49448D9ABA91D8FDC6FC28
x-processing-id: 18402777-7826-457f-bc02-6446611495e6
x-processing-status: rebuilt
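A client can use the returned output hash to confirm that the rebuilt file it holds is the one the platform produced. The sketch below assumes you already have the rebuilt file's bytes; because the example response delivers a .zip attachment, the rebuilt file may first need to be extracted, and which bytes the output hash covers should be confirmed against your deployment. The function name `output_matches_header` is hypothetical.

```python
import hashlib

def output_matches_header(rebuilt_file_bytes, headers):
    """Check rebuilt file bytes against the x-hash-sha256-output header."""
    lowered = {key.lower(): value for key, value in headers.items()}
    expected = lowered.get("x-hash-sha256-output")
    if expected is None:
        # No output hash: SHA-256 was not requested, or the request failed.
        return False
    actual = hashlib.sha256(rebuilt_file_bytes).hexdigest()
    # The example headers use uppercase hex; hexdigest() returns lowercase.
    return actual.lower() == expected.lower()
```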
Summary
By using the hashes that Glasswall Halo generates in your file processing workflow, you can verify file integrity, support security checks, and track files before and after they are rebuilt, making your application more reliable.
Quick Start
To try Glasswall Halo yourself, please refer to our Quick Start Guide.