Use file hashes to improve security

    File hashing is a technique used in applications for many reasons, mainly to ensure data integrity, security, and efficiency.

    Common reasons to use file hashing include:

    • Data Integrity Verification
      When transferring or storing files, there is always a risk of data corruption or tampering. By generating a hash value based on the file's contents using a hashing algorithm (e.g., MD5, SHA-1, SHA-256), the application can later recompute the hash and compare it to the original hash to verify if the file has been altered or corrupted during transmission or storage.

    • Deduplication
      In applications such as cloud storage or backup systems, deduplication saves storage space. Because identical files produce identical hashes, the application can identify duplicates efficiently, store the content once, and reference it by its hash.

    • Cache Management
      In caching systems, file hashing can be used to quickly check if a particular file has been accessed before and is available in the cache. The hash serves as a unique identifier for the file, allowing for faster retrieval and reducing unnecessary data transfer.

    • Data Indexing and Searching
      In large-scale applications where files need to be indexed and searched quickly, file hashing can be used to build efficient data structures like hash tables, enabling fast lookup and retrieval based on file content.

    • Malware Detection
      Antivirus and security applications use file hashing to identify known malware and viruses. By comparing the hash of a file against a database of known malicious hashes, these applications can quickly determine if a file is potentially harmful.

    Glasswall Halo Events

    1. The Glasswall Halo API lets you interact with the platform programmatically by sending requests to the endpoint that matches how the file is supplied:
    • For binary files, use the api/v3/cdr-file endpoint.
    • For Base64-encoded files, use the api/v3/cdr endpoint.

    2. With each request, you can ask for file hashes to be generated in any combination of three formats: SHA-256, SHA-1, and MD5.

    3. Once the request is received, the input hash is generated, and Glasswall Halo then uses the Glasswall Embedded Engine to analyse and rebuild the input file.

    4. When the file is rebuilt successfully, the platform generates the output file hashes, and both the input and output hashes are returned in the response headers.

    5. If a request fails, the input hash has still been generated and is returned in the response headers.

    6. You can use the input and output hashes within your application or system for security checks, further processing, or any other necessary actions.

    API Authentication

    Glasswall Halo supports two types of authentication: Basic and Bearer authentication. Before making any request, you must authenticate using the appropriate scheme based on your configuration.

    Basic

    If your system is configured with Basic authentication, you need to obtain an Organisation ID and token from the system administrator. Combine these values in the format <OrganisationId>:<Token> and then Base64 encode them.

    Include the resulting encoded value in the Authorization request header:

    Authorization: Basic ZGVtbzpwQDU1dzByZA==
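
    The encoding step can be sketched in Python as follows. The helper name basic_auth_header is hypothetical, and encoding the credentials as UTF-8 before Base64 encoding is an assumption. The sample values "demo" and "p@55w0rd" are simply what the example header above decodes to; substitute the Organisation ID and token supplied by your administrator.

```python
import base64


def basic_auth_header(organisation_id, token):
    """Build the Authorization header from <OrganisationId>:<Token>."""
    # Assumption: credentials are UTF-8 encoded before Base64 encoding.
    credentials = f"{organisation_id}:{token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


header = basic_auth_header("demo", "p@55w0rd")
print(header["Authorization"])  # prints: Basic ZGVtbzpwQDU1dzByZA==
```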
    

    Bearer

    For Bearer authentication, you will require a Bearer token obtained from your identity provider.

    Once you have obtained the token, include it in the request header:

    Authorization: Bearer <BearerToken>
    

    Note: use the authentication method that matches your system's configuration. Requests that use the wrong scheme will fail authorisation.

    Learn more about authenticating Glasswall Halo

    Request Construction

    When you submit binary or Base64-encoded files to Glasswall Halo, you can request any combination of the SHA-256, SHA-1, and MD5 hash types. The hash values for both the input and output files are returned in the response headers, so you can verify integrity and handle files efficiently within your application or system.

    Binary File Processing

    POST {baseUrl}/api/v3/cdr-file?generate-hash-types=SHA256,SHA1,MD5
    

    Base64 Encoded File Processing

    Submit the Base64 encoded string in the Request body to the following endpoint:

    POST {baseUrl}/api/v3/cdr?generate-hash-types=SHA256,SHA1,MD5
    

    Request Body Format

    The body of the request should be in JSON format and include the Base64 field containing the Base64 encoded string of the file, and the fileName field specifying the original filename (including the appropriate file extension).

    {
      "Base64": "string",
      "fileName": "filename.pdf"
    }
    

    Note

    • Replace {baseUrl} with the actual base URL of the Glasswall Halo API.
    • Provide the correct Authorization header with each request.
    • Set generate-hash-types in the URL to a comma-separated list of hash types to control which hashes are generated and returned in the response.
    • For binary file processing, use a multipart form post. For Base64-encoded file processing, provide the file content in the JSON request body together with the appropriate filename.
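
    The following Python sketch shows one way the Base64 request described above could be assembled with the standard library. The function name build_cdr_request, the host halo.example.com, and the Content-Type: application/json header are assumptions made for illustration; only the endpoint path, the generate-hash-types parameter, and the Base64 and fileName fields come from this article. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_cdr_request(base_url, file_bytes, file_name, authorization,
                      hash_types=("SHA256", "SHA1", "MD5")):
    """Build (but do not send) a POST request for the Base64 endpoint."""
    url = (f"{base_url.rstrip('/')}/api/v3/cdr"
           f"?generate-hash-types={','.join(hash_types)}")
    body = json.dumps({
        "Base64": base64.b64encode(file_bytes).decode("ascii"),
        "fileName": file_name,
    }).encode("utf-8")
    headers = {
        "Authorization": authorization,
        # Assumption: the JSON body is declared with this content type.
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical base URL and sample bytes, for illustration only.
request = build_cdr_request(
    "https://halo.example.com",
    b"%PDF-1.7 sample",
    "filename.pdf",
    "Basic ZGVtbzpwQDU1dzByZA==",
)
print(request.get_method(), request.full_url)
```

    Passing the resulting object to urllib.request.urlopen would send it; the hash values would then be read from the response headers.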

    Response Handling

    A successful request to the Glasswall Halo API returns a 201 status code. If you requested hashes, both the input and output hashes are available in the response headers. If the request is unsuccessful, only the input hash values are returned.

    You can identify the hash values with the following header keys:

    For SHA-256:

    • Input File Hash: x-hash-sha256-input
    • Output File Hash: x-hash-sha256-output

    For SHA-1:

    • Input File Hash: x-hash-sha1-input
    • Output File Hash: x-hash-sha1-output

    For MD5:

    • Input File Hash: x-hash-md5-input
    • Output File Hash: x-hash-md5-output

    Example

    access-control-allow-origin: * 
    access-control-expose-headers: * 
    content-disposition: attachment; filename=18402777-7826-457f-bc02-6446611495e6.zip; filename*=UTF-8''18402777-7826-457f-bc02-6446611495e6.zip 
    content-length: 288843 
    content-type: application/octet-stream 
    date: Fri, 21 Jul 2023 12:42:15 GMT
    strict-transport-security: max-age=31536000; includeSubDomains 
    x-applied-policy: service-dynamic 
    x-file-size: 301189 
    x-filetype: pdf 
    x-hash-md5-input: AF5EB421F8AFC69AC4178ACF747E618C 
    x-hash-md5-output: 8018FC4F37AB858442F354430203CEB4 
    x-hash-sha1-input: 9C714CAC1F82A450728F3C046CEFEFAF0BBD2C07 
    x-hash-sha1-output: 058991F3A252F0A4FD0E7F8B3178D3F15494099B 
    x-hash-sha256-input: 5F83AB533252C00AD1C60A0EFC016BF273497F3179FD6CE642AD8A4857148B8D 
    x-hash-sha256-output: 9D5D88F073D74160828F92E9E8DADAC5712706C65D49448D9ABA91D8FDC6FC28 
    x-processing-id: 18402777-7826-457f-bc02-6446611495e6 
    x-processing-status: rebuilt 
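
    A minimal Python sketch of reading these headers is shown below. The helper names extract_hashes and digest_matches are hypothetical, and header names are matched case-insensitively as an assumption about how your HTTP client presents them. This article does not state exactly which bytes the output hash covers (for example, the response body itself or a file contained within it), so confirm that against your deployment before relying on digest_matches for verification.

```python
import hashlib


def extract_hashes(headers):
    """Collect x-hash-<algorithm>-<input|output> headers into a nested dict."""
    hashes = {}
    for key, value in headers.items():
        parts = key.lower().split("-")
        if (len(parts) == 4 and parts[:2] == ["x", "hash"]
                and parts[3] in ("input", "output")):
            hashes.setdefault(parts[2], {})[parts[3]] = value.strip()
    return hashes


def digest_matches(data, expected_hex, algorithm="sha256"):
    """Hash `data` locally and compare with a hex digest, ignoring case."""
    actual = hashlib.new(algorithm, data).hexdigest()
    return actual.upper() == expected_hex.strip().upper()


# Header values copied from the example response above.
example_headers = {
    "x-hash-md5-input": "AF5EB421F8AFC69AC4178ACF747E618C ",
    "x-hash-md5-output": "8018FC4F37AB858442F354430203CEB4 ",
    "x-processing-status": "rebuilt ",
}
hashes = extract_hashes(example_headers)
print(hashes["md5"]["input"])   # prints: AF5EB421F8AFC69AC4178ACF747E618C
print("output" in hashes["md5"])  # prints: True
```

    An entry with an input value but no output value corresponds to the unsuccessful case described earlier, where only input hashes are returned.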
    

    Summary

    Requesting hashes from Glasswall Halo gives you a verifiable fingerprint of each file before and after processing. You can use these values in your file processing workflow to confirm integrity, identify duplicates, and drive further security checks.

    Quick Start

    To try Glasswall Halo yourself, please refer to our Quick Start Guide.

