Policy Management

    When using Glasswall's Embedded Engine, you have the option to set your own sanitisation preferences by changing the policy settings for each of the supported file types.

    This means you can shape your organisation’s security policy in accordance with its risk tolerance.

    Policy settings

    A Policy is a set of switches, each with its own switch setting. wordConfig and pptConfig are example Policy names, relating to DOC/DOCX and PPT/PPTX files respectively.

    The following switch settings can be applied to active content:

    • Sanitise: as a general rule, we'll analyse the file and remove this type of risky content, then rebuild the file. See Embedded Files and Embedded Images for exceptions.

      • Example: you can sanitise (remove) macros from Word files.
    • Allow: we'll analyse and rebuild the file, but we won't remove this type of risky active content.

      • Example: you can allow macros for Word files. (This presents a risk to you if an attacker has placed malware within a file.)
    • Disallow: we'll analyse the file, but if we find risky content, we won't process the file at all.

      • Example: you can specify that Word files with macros aren't processed at all.
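
    As a rough illustration of how switches and switch settings fit together, the sketch below models the three settings as an enumeration and renders a policy as XML. The Policy names (wordConfig, pptConfig) and switch names (macros, metadata) come from this article; the XML layout itself (a `config` root with one child element per switch) and the `build_policy` helper are assumptions made for the example, so check the Engine's own configuration reference before relying on them.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class Setting(Enum):
    """The three switch settings described above."""
    SANITISE = "sanitise"
    ALLOW = "allow"
    DISALLOW = "disallow"


def build_policy(policies: dict[str, dict[str, Setting]]) -> str:
    """Render {policy name: {switch: setting}} as an XML string.

    Hypothetical helper: the <config> root and element-per-switch layout
    are assumptions for illustration, not a documented schema.
    """
    root = ET.Element("config")
    for policy_name, switches in policies.items():
        policy = ET.SubElement(root, policy_name)
        for switch, setting in switches.items():
            ET.SubElement(policy, switch).text = setting.value
    return ET.tostring(root, encoding="unicode")


# Remove macros from Word files but keep their metadata;
# refuse to process PowerPoint files that contain macros.
policy_xml = build_policy({
    "wordConfig": {"macros": Setting.SANITISE, "metadata": Setting.ALLOW},
    "pptConfig": {"macros": Setting.DISALLOW},
})
print(policy_xml)
```

    Keeping the settings in an enumeration means a misspelt value fails when the policy is built rather than when a file is processed.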

    The switches currently available for each format are depicted in the table below:

    Switch | PDF | PPT/PPTX | DOC/DOCX | XLS/XLSX | TIFF | GIF | WEBP | SVG
    acroform
    actions_all
    connections *
    digital_signatures *
    dynamic_data_exchange
    embedded_files
    embedded_images
    external_hyperlinks
    foreign_objects
    geotiff
    hyperlinks
    internal_hyperlinks
    javascript
    macros
    metadata
    retain_exported_streams *
    review_comments
    scripts
    tracked_changes * *
    hidden_data * * *
    slide_notes *
    in_text_comment *
    value_outside_reasonable_limits
    watermark

    [*]: Content management switch available in Editor's "enableRebuild" (default) mode or Rebuild only

    All content types not represented by a content management type for a specific file format will be automatically remediated by the Glasswall engine if identified as malicious.

    Embedded Files

    The "Embedded Files" content management type applies only to non-image file formats that are either unsupported by the Glasswall engine or obfuscated by the containing file. For MS Office formats, embedded files in supported formats are processed as standalone files. If the embedded supported file is conforming, it is regenerated regardless of content management settings; otherwise, the containing file is rejected.

    The Embedded Engine currently supports the processing of OOXML files up to a depth of nine embedded files. This means that the Engine is capable of traversing and processing the content of files nested within files, down to the ninth level of embedding. OOXML files containing more than ten levels of embedded files are not supported and will not be processed.
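
    The depth limit can be pictured as a check on a tree of nested files. The sketch below is illustrative only: `FileNode`, `embed_depth`, `nest` and the `MAX_EMBED_DEPTH` constant are hypothetical names, and the limit of nine is taken from the "depth of nine" figure in the paragraph above rather than from any Engine API.

```python
from dataclasses import dataclass, field

# Assumption: the supported limit is the "ninth level of embedding" quoted above.
MAX_EMBED_DEPTH = 9


@dataclass
class FileNode:
    """A file and the files embedded directly inside it (hypothetical model)."""
    name: str
    embedded: list["FileNode"] = field(default_factory=list)


def embed_depth(node: FileNode) -> int:
    """Number of embedding levels below `node` (0 if nothing is embedded)."""
    if not node.embedded:
        return 0
    return 1 + max(embed_depth(child) for child in node.embedded)


def within_supported_depth(root: FileNode) -> bool:
    """True if the deepest chain of embedded files stays within the limit."""
    return embed_depth(root) <= MAX_EMBED_DEPTH


def nest(levels: int) -> FileNode:
    """Build a container with `levels` files nested one inside another."""
    node = FileNode("innermost.docx")
    for level in range(levels):
        node = FileNode(f"wrapper{level}.docx", [node])
    return node


print(within_supported_depth(nest(9)))   # nine levels of embedding
print(within_supported_depth(nest(10)))  # one level too many
```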

    • Sanitise: for MS Office formats, embedded MS Office files are processed as standalone files. If the embedded file is conforming, the embedded file will be regenerated; otherwise, both the containing and embedded file will be rejected. For all other container or embedded formats, embedded files are removed without being processed.
    • Disallow: embedded files are forbidden. If one is found, both the embedded and the containing file are rejected.
    • Allow: for MS Office formats, embedded MS Office files are processed as standalone files. If one is non-conforming, both the embedded and the containing file are rejected. For all other container or embedded formats, embedded files are regenerated without being processed.
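
    The three settings above can be read as a small decision table. The following sketch encodes that table as a function; the names (`Outcome`, `embedded_file_outcome`, `office_in_office`) are hypothetical, and the Allow-and-conforming branch is inferred from the text, which only states what happens to non-conforming files.

```python
from enum import Enum


class Setting(Enum):
    SANITISE = "sanitise"
    ALLOW = "allow"
    DISALLOW = "disallow"


class Outcome(Enum):
    REGENERATED_PROCESSED = "embedded file processed and regenerated"
    REGENERATED_UNPROCESSED = "embedded file regenerated without processing"
    REMOVED = "embedded file removed without processing"
    REJECT_BOTH = "containing and embedded file rejected"


def embedded_file_outcome(setting: Setting, office_in_office: bool,
                          conforming: bool) -> Outcome:
    """Outcome for one embedded file.

    office_in_office: an MS Office file embedded in an MS Office container.
    conforming: whether the embedded file passes standalone processing
                (only meaningful when office_in_office is True).
    """
    if setting is Setting.DISALLOW:
        return Outcome.REJECT_BOTH
    if office_in_office:
        # Sanitise and Allow both process the embedded file as a standalone
        # file. Regeneration under Allow is inferred, not stated outright.
        return Outcome.REGENERATED_PROCESSED if conforming else Outcome.REJECT_BOTH
    if setting is Setting.SANITISE:
        return Outcome.REMOVED
    return Outcome.REGENERATED_UNPROCESSED


print(embedded_file_outcome(Setting.SANITISE, office_in_office=False, conforming=True).value)
```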

    The table below shows which embedded file formats we attempt to regenerate when "Embedded Files" is set to "Sanitise" versus those which are removed:

    Embedded File Format | DOCX/XLSX/PPTX | DOC/XLS/PPT | PDF
    Office 2003
    Office 1997
    PDF
    MP3 n/a
    MP4 n/a
    MPEG n/a
    WAV
    Formats unsupported by Glasswall

    [†]: Disallowed by container format

    [‡]: Not removed by Embedded Files switch, but may be removed by All Actions switch. Embedded file is regenerated without being processed.

    Embedded Images

    For image file formats, the "Embedded Images" content management switch should be used. This has the following behaviour depending on switch setting:

    • Sanitise: for MS Office, embedded images in supported formats are processed as standalone files. If the embedded image is conforming, the embedded file will be regenerated; otherwise, both the containing and embedded file will be rejected. Unsupported image formats are removed. In PDFs, embedded images are not processed and will always be regenerated if the entry is structurally correct.
    • Disallow: embedded images are forbidden. If one is found, the containing file is rejected.
    • Allow: embedded images are not processed and are always regenerated as long as they are a supported file format.

    The table below shows which image formats we attempt to regenerate when "Embedded Images" is set to "Sanitise" versus those which are removed:

    Embedded Image Format | DOCX/XLSX/PPTX | DOC/XLS/PPT | PDF
    BMP, JPEG, GIF, PNG, EMF, SVG, TIFF
    WMF, EMF
    WebP
    Formats unsupported by Glasswall

    [⸸]: Will be converted to a different format by container file

    Note: when "Embedded Images" is set to "Disallow", any image encountered will result in the rejection of the containing file. This includes thumbnails of the containing or embedded documents, and so may supersede the "Embedded Files" content management switch.

    Hidden Data

    Office file formats offer several ways of legitimately "hiding" text or data, including whole Excel sheets, PowerPoint slides or lines of text in a Word document. The Glasswall engine deals with hidden data for OOXML files in the following ways, depending on the content management switch setting:

    • Sanitise - The file is regenerated with all hidden data "unhidden" so it is completely visible to the user.
    • Disallow - Hidden data is forbidden. If any hidden data is found, the containing file is rejected.
    • Allow - Any hidden data is regenerated and remains hidden.

    When the switch is set to "Sanitise", any hidden data is made visible. The hidden data attributes sanitised for each file type are:

    • Word (DOCX) - attributes vanish and webHidden (used to hide objects, text, tables, etc.)
    • PowerPoint (PPTX) - attribute show (used to hide a slide) and bwMode (which controls how to handle the shape in black and white printer settings)
    • Excel (XLSX) - attributes hidden, width, customWidth, customHeight, ht and hidden (used to hide sheets, columns, rows or cells)
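
    To make the Word case concrete, the sketch below "unhides" text in a WordprocessingML fragment by dropping the vanish and webHidden markers, which WordprocessingML stores as child elements of a run's properties (w:rPr). It is a simplified illustration using Python's standard library, not the Glasswall engine's implementation; `unhide_runs` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the conventional "w:" prefix on output

# Run properties that hide content in DOCX files, as listed above.
HIDING_TAGS = {f"{{{W}}}vanish", f"{{{W}}}webHidden"}


def unhide_runs(document_xml: str) -> str:
    """Return the XML with all hiding markers removed from run properties."""
    root = ET.fromstring(document_xml)
    for run_props in list(root.iter(f"{{{W}}}rPr")):
        for prop in list(run_props):
            if prop.tag in HIDING_TAGS:
                run_props.remove(prop)
    return ET.tostring(root, encoding="unicode")


sample = (
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:rPr><w:b/><w:vanish/></w:rPr><w:t>secret</w:t></w:r>"
    "</w:p>"
)
visible = unhide_runs(sample)
print(visible)
```

    Other run properties, such as the bold marker in the sample, are left untouched, so the text keeps its formatting and simply becomes visible.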

    Macros

    The macros content switch for MS Office files applies to both Microsoft Visual Basic for Applications (VBA) and Excel 4.0 macros.

    Microsoft Visual Basic for Applications

    VBA macros are written in the Visual Basic programming language and can be included in any MS Office file format. The handling of VBA macros can be configured as follows:

    • Sanitise - VBA macros are removed from files.
    • Disallow - VBA macros are forbidden. If one is found, the containing file is rejected.
    • Allow - VBA macros are processed and regenerated as part of the containing file providing they conform to specification.

    Excel 4.0 Macros

    Excel 4.0 macros are a legacy feature included in XLSX and XLS files. XLSX files containing Excel 4.0 macros will be saved using the ".xlsm" file extension and will produce an error if this extension is modified. The handling of Excel 4.0 macros can be configured as follows:

    • Sanitise: in XLS files, the file will be blocked and "Excel 4.0 Macro found: Not supported" reported as an issue item. In XLSX/XLSM files, sheets containing macros will be removed from the document and reported as a sanitisation item. If this causes the file to be malformed (for example, by reducing the number of visible sheets to zero), the file will be rejected and an appropriate issue item reported.
    • Disallow: Excel 4.0 macros are forbidden. If one is found, the containing file is rejected.
    • Allow: in XLS files, the file will be blocked and "Excel 4.0 Macro found: Not supported" reported as an issue item. In XLSX/XLSM files, the file will be regenerated with macros intact.
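
    Because the outcome depends on both the switch setting and the file format, it can help to see the rules as a single function. This is a sketch of the behaviour described above for a file in which an Excel 4.0 macro has been found; the function name, its parameters and the returned strings (other than the quoted issue text) are hypothetical.

```python
from enum import Enum


class Setting(Enum):
    SANITISE = "sanitise"
    ALLOW = "allow"
    DISALLOW = "disallow"


# Issue text quoted from the article.
XLS_ISSUE = "Excel 4.0 Macro found: Not supported"


def excel4_macro_outcome(setting: Setting, extension: str,
                         visible_sheets_left: int) -> str:
    """Outcome for a workbook that contains Excel 4.0 macros.

    visible_sheets_left: visible sheets that would remain once the sheets
    containing macros are removed (only used for Sanitise on XLSX/XLSM).
    """
    if setting is Setting.DISALLOW:
        return "rejected"
    if extension.lower() == "xls":
        # Legacy XLS files are blocked under both Sanitise and Allow.
        return f"blocked: {XLS_ISSUE}"
    if setting is Setting.ALLOW:
        return "regenerated with macros intact"
    if visible_sheets_left == 0:
        return "rejected: removing macro sheets leaves no visible sheet"
    return "regenerated with macro sheets removed"


print(excel4_macro_outcome(Setting.SANITISE, "xlsm", visible_sheets_left=2))
```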

    Metadata

    In OOXML, metadata refers to information that describes the content, structure, and properties of a document but is not part of the document's main content. Metadata in OOXML documents is primarily stored in XML files located within the docProps directory:

    1. core.xml: Contains core properties based on the Dublin Core Metadata Element Set.
    2. app.xml: Contains extended properties specific to Microsoft Office applications.
    3. custom.xml: Contains custom properties.

    The handling of OOXML metadata can be configured as follows:

    • Sanitise - The file is regenerated with metadata removed (see below for all the properties currently sanitised).
    • Disallow - Metadata is forbidden. If any metadata (properties listed below) is found, the containing file is rejected.
    • Allow - The file is processed and the metadata is regenerated.

    As part of the 'metadata' content management switch, we currently sanitise the following properties in each file:

    • core.xml: title, subject, creator, keywords, description, lastModifiedBy, revision, lastPrinted, created, modified, category, contentStatus, language, and version.
    • app.xml: manager, company, and hyperlinkBase.
    • custom.xml: any custom properties added to the OOXML document.
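
    As an illustration of what sanitising core.xml involves, the sketch below removes the listed core properties from a core.xml document using Python's standard library. It assumes the properties are dropped entirely (the article does not say whether the engine removes or blanks them), matches elements by local name regardless of namespace, and uses a hypothetical helper name, `strip_core_properties`.

```python
import xml.etree.ElementTree as ET

# Core properties listed above as sanitised by the metadata switch.
SANITISED_CORE = {
    "title", "subject", "creator", "keywords", "description",
    "lastModifiedBy", "revision", "lastPrinted", "created", "modified",
    "category", "contentStatus", "language", "version",
}


def strip_core_properties(core_xml: str) -> str:
    """Return core.xml with every listed property element removed."""
    root = ET.fromstring(core_xml)
    for prop in list(root):
        local_name = prop.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" part
        if local_name in SANITISED_CORE:
            root.remove(prop)
    return ET.tostring(root, encoding="unicode")


sample_core = (
    '<cp:coreProperties'
    ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Q3 plan</dc:title>"
    "<dc:creator>Alice</dc:creator>"
    "<cp:revision>7</cp:revision>"
    "<dc:identifier>DOC-42</dc:identifier>"
    "</cp:coreProperties>"
)
cleaned = strip_core_properties(sample_core)
print(cleaned)
```

    In the sample, the identifier property survives because it is not in the list above.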

    OOXML Files (DOCX, XLSX, PPTX)

    Tracked Changes

    The tracked_changes content management switch refers to content added by the "Track Changes" functionality in DOCX and XLSX files, also known as "revisions". These can contain historic data related to previous versions of the document, including names of contributors and records of content that has since been removed or obfuscated. The handling of tracked changes can be configured as follows:

    • Sanitise - All historic data is removed and "Track Changes" disabled. The regenerated document will be equivalent to the final state of the original document.
    • Disallow - Tracked changes are forbidden. If there is any evidence of previous revisions or tracked changes still present in the file, the file will be rejected.
    • Allow - The file is regenerated with all historic changes, revisions and tracked changes intact.
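
    In DOCX files, tracked insertions and deletions are recorded as w:ins and w:del wrapper elements, so reaching the "final state" amounts to keeping inserted content and dropping deleted content. The sketch below shows that idea on a single paragraph. It is deliberately simplified (it ignores other revision markup such as formatting changes and moves, and does not handle deletions nested inside insertions) and is not the engine's implementation; `accept_revisions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
INS, DEL = f"{{{W}}}ins", f"{{{W}}}del"


def accept_revisions(root: ET.Element) -> None:
    """Keep inserted content and drop deleted content, in place."""
    for parent in list(root.iter()):
        kept = []
        for child in parent:
            if child.tag == DEL:
                continue               # deleted content is discarded
            if child.tag == INS:
                kept.extend(child)     # unwrap: keep what was inserted
            else:
                kept.append(child)
        parent[:] = kept


paragraph = ET.fromstring(
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:t>Hello</w:t></w:r>"
    '<w:ins w:id="1" w:author="A"><w:r><w:t> world</w:t></w:r></w:ins>'
    '<w:del w:id="2" w:author="A"><w:r><w:delText> there</w:delText></w:r></w:del>'
    "</w:p>"
)
accept_revisions(paragraph)
final_text = "".join(paragraph.itertext())
print(final_text)
```

    Removing the wrappers also removes the author names and revision identifiers they carry, which is the historic data this switch is concerned with.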

    Slide Notes

    The slide_notes content management switch refers to content added by the "Notes" functionality in PPTX files, also known as "slide notes" (and/or "speaker notes"). The Glasswall engine deals with these slide notes in the following ways, depending on the configuration of the content management switch setting:

    • Sanitise - The file is regenerated with all slide notes removed.
    • Disallow - Slide notes are forbidden. If any slide notes are found, the containing file is rejected.
    • Allow - Any slide notes are regenerated and remain in the file.

    In-Text Comments

    The in_text_comment content management switch refers to content added by the "In-Text Comments" functionality in DOCX files. The handling of the switch can be configured as follows:

    • Sanitise - In-text comments are removed, along with the corresponding document metadata found in core.xml.
    • Disallow - In-text comments are forbidden. Any DOCX file containing an in-text comment is rejected and will not be regenerated.
    • Allow - The file is regenerated with its in-text comments intact.

    Note: when the in_text_comment switch is set to "Allow" and the metadata switch is set to "Sanitise", the regenerated file will retain the in-text comment but without any associated data, because the metadata switch sanitises the corresponding description in the core.xml file.

