CLI Tool [Embedded Engine]

    Article Summary

    Note: The GWCLI (Command Line Tool) will be excluded as standard from Release 10 of the Embedded Engine. Please contact support@glasswall.com should you require any additional support.

    This documentation explains how to use the Glasswall CLI using the Windows command prompt or the terminal window for Linux, with an explanation of the parameters and configuration.

    Learn more about the Glasswall Embedded Engine.

    Purpose

    The Glasswall CLI is the quickest way to get the CDR engine up and running, and is often used as a test or evaluation tool. The results provided can be examined and compared to give you a clear understanding of any issues or unwanted content within the files. Any invalid files will have the issues reported in the XML documents to show why they were non-conforming. You will also be able to see if any disallowed content was present that made the file non-conforming and decide if you want to rerun it in sanitise mode to remove that content.

    Config Files

    Configuration Settings

    If an XML policy is not specified on the command line, GWFileConfigXML() is not called before the specified document processing function.

    XML Policy File Description

    Configuration Format

    The format of this file is described formally in the XSD located in the Glasswall SDK documentation. The policy file is passed unchanged to the GWFileConfigXML() function. An example of a full policy file is shown below.

    <?xml version="1.0" encoding="UTF-8"?>
    <config>
    
        <pdfConfig>    
            <acroform>sanitise</acroform>
            <metadata>sanitise</metadata>
            <javascript>sanitise</javascript>
            <actions_all>sanitise</actions_all>
            <embedded_files>sanitise</embedded_files>
            <embedded_images>sanitise</embedded_images>
            <internal_hyperlinks>sanitise</internal_hyperlinks>
            <external_hyperlinks>sanitise</external_hyperlinks>
            <digital_signatures>sanitise</digital_signatures>
        </pdfConfig>
      
        <wordConfig>
            <macros>sanitise</macros>
            <metadata>sanitise</metadata>
            <embedded_files>sanitise</embedded_files>
            <embedded_images>sanitise</embedded_images>
            <review_comments>sanitise</review_comments>
            <internal_hyperlinks>sanitise</internal_hyperlinks>
            <external_hyperlinks>sanitise</external_hyperlinks>
            <dynamic_data_exchange>sanitise</dynamic_data_exchange>
        </wordConfig>
      
        <pptConfig>
            <macros>sanitise</macros>
            <metadata>sanitise</metadata>
            <embedded_files>sanitise</embedded_files>
            <embedded_images>sanitise</embedded_images>
            <review_comments>sanitise</review_comments>
            <internal_hyperlinks>sanitise</internal_hyperlinks>
            <external_hyperlinks>sanitise</external_hyperlinks>
        </pptConfig>
      
        <xlsConfig>
            <macros>sanitise</macros>
            <metadata>sanitise</metadata>
            <embedded_files>sanitise</embedded_files>
            <embedded_images>sanitise</embedded_images>
            <review_comments>sanitise</review_comments>
            <internal_hyperlinks>sanitise</internal_hyperlinks>
            <external_hyperlinks>sanitise</external_hyperlinks>
            <dynamic_data_exchange>sanitise</dynamic_data_exchange>
            <connections>sanitise</connections>
        </xlsConfig> 
      
        <tiffConfig>
            <geotiff>sanitise</geotiff>
        </tiffConfig>
      
        <webpConfig>
            <metadata>sanitise</metadata>
        </webpConfig>
       
        <svgConfig>
            <scripts>sanitise</scripts>
            <foreign_objects>sanitise</foreign_objects>
            <hyperlinks>sanitise</hyperlinks>
        </svgConfig>
    
        <sysConfig>
            <interchange_type>xml</interchange_type>
            <interchange_pretty>false</interchange_pretty>
            <interchange_best_compression>false</interchange_best_compression>
            <export_embedded_images>true</export_embedded_images>
        </sysConfig>
    
    </config>
    

    Note: The xlsConfig, pptConfig, and wordConfig profiles (also known as "policies") cover both Office XML and Office Binary file types.
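
    A policy file with this layout can also be generated from a script rather than edited by hand. The following is a minimal Python sketch, not part of the Glasswall tooling: the helper name build_policy is hypothetical, and the element names are copied from the example policy above. Whether the engine accepts a policy that lists only some sections or switches is not stated here, so treat the partial policy below as an assumption and validate real files against the XSD.

```python
import xml.etree.ElementTree as ET


def build_policy(sections):
    """Return policy XML text for a mapping such as
    {"pdfConfig": {"javascript": "disallow"}}.

    Hypothetical helper: it only mirrors the <config>/<section>/<switch>
    layout of the example policy and performs no validation.
    """
    root = ET.Element("config")
    for section_name, switches in sections.items():
        section = ET.SubElement(root, section_name)
        for switch_name, setting in switches.items():
            ET.SubElement(section, switch_name).text = setting
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root, space="    ")
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


policy_text = build_policy({
    "pdfConfig": {"javascript": "disallow", "metadata": "sanitise"},
    "sysConfig": {"interchange_type": "xml"},
})
print(policy_text)
```

    The returned text can be written to a file and passed to gwcli with the -c option.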

    Policy Settings

    | Content Management Switch | Content Management Switch Setting | Notes |
    | --- | --- | --- |
    | Other switches | sanitise | Configures Glasswall to remove or clean document element types associated with this content management switch type from any document being processed. This removal will be logged in analysis reports as a sanitisation item. |
    | Other switches | allow | Configures Glasswall to leave document element types associated with this content management switch type in any document being processed. |
    | Other switches | disallow | Configures Glasswall to raise an issue if document element types associated with this content management flag are found within any document being processed. |
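
    Because every content management switch takes one of the three settings sanitise, allow or disallow, a policy file can be checked for misspelt settings before it is handed to the CLI. The sketch below is an illustration only: find_invalid_switches is a hypothetical helper, and skipping sysConfig is an assumption based on that section using different value types.

```python
import xml.etree.ElementTree as ET

ALLOWED_SETTINGS = {"sanitise", "allow", "disallow"}


def find_invalid_switches(policy_xml):
    """Return (section, switch, value) for each switch whose value
    is not one of the three documented settings."""
    root = ET.fromstring(policy_xml)
    problems = []
    for section in root:
        if section.tag == "sysConfig":
            continue  # sysConfig switches use other value types
        for switch in section:
            value = (switch.text or "").strip()
            if value not in ALLOWED_SETTINGS:
                problems.append((section.tag, switch.tag, value))
    return problems


sample = """<config>
    <pdfConfig>
        <javascript>disallow</javascript>
        <metadata>sanitize</metadata>
    </pdfConfig>
    <sysConfig>
        <interchange_type>xml</interchange_type>
    </sysConfig>
</config>"""

# The American spelling "sanitize" is reported as invalid.
print(find_invalid_switches(sample))
```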

    Policy Details

    In addition to the behaviours recorded above, the following is a breakdown of specific content management switches that behave slightly differently:

    | Content Management Switch | Content Management Switch Setting | Notes |
    | --- | --- | --- |
    | hyperlinks | sanitise | The hyperlink targets will be removed but the text will remain in the document. This removal will be logged in analysis reports as a sanitisation item. |
    | embedded_files | sanitise | Embedded files within the document are passed to the Glasswall engine and processed in accordance with their policy. Files not supported by Glasswall will be removed. |
    | embedded_images | sanitise | Embedded images within the document are passed to the Glasswall engine and processed in accordance with their policy. Images not supported by Glasswall will be removed. |

    Content Management "sysConfig" Switch

    | Content Management Switch | Content Management Switch Setting | Notes |
    | --- | --- | --- |
    | interchange_type | sisl/xml | Export/Import file format. Currently this can be XML or SISL. Default: 'sisl' |
    | interchange_pretty | true/false | If set to 'true', the Export/Import file is formatted to be human readable. Default: 'false' |
    | interchange_best_compression | true/false | If set to 'true', the resulting Export/Import archive package is compressed at the maximum compression level, but compression takes more time. Default: 'false' |
    | export_embedded_images | true/false | If set to 'true', embedded images within the document are processed and saved as XML or SISL streams in the Export archive package. If set to 'false', they are saved as raw images. |
    | run_mode | enablerebuild/editoronly | If set to 'enablerebuild', all content is sent through our older Rebuild engine. If set to 'editoronly', the newer implementation is used. Rebuild elements will be switched off as the newer ones reach the same or better maturity. |
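
    The documented defaults mean that a policy does not have to spell out every sysConfig switch. As an illustration, the following Python sketch computes the settings that would be in effect for a given policy. The helper name effective_sys_config is hypothetical, and the sketch assumes the engine falls back to the documented default whenever a switch is absent; switches with no documented default are simply left out.

```python
import xml.etree.ElementTree as ET

# Defaults as documented for sysConfig; export_embedded_images and
# run_mode have no documented default and are therefore omitted.
SYS_DEFAULTS = {
    "interchange_type": "sisl",
    "interchange_pretty": "false",
    "interchange_best_compression": "false",
}


def effective_sys_config(policy_xml):
    """Merge the <sysConfig> section of a policy over the documented defaults."""
    root = ET.fromstring(policy_xml)
    effective = dict(SYS_DEFAULTS)
    sys_config = root.find("sysConfig")
    if sys_config is not None:
        for switch in sys_config:
            effective[switch.tag] = (switch.text or "").strip()
    return effective


policy = """<config>
    <sysConfig>
        <interchange_type>xml</interchange_type>
        <export_embedded_images>true</export_embedded_images>
    </sysConfig>
</config>"""

for name, value in sorted(effective_sys_config(policy).items()):
    print(name, "=", value)
```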

    CLI Commands

    SYNOPSIS

    The recursion option -r will become a default and its use is 
    deprecated. It is shown below pending its removal.
    
    This synopsis section shows several different use cases.
    
    'get all ids'
    gwcli -a FILE 
    
    'get specific id'
    gwcli -b ISSUE_ID -o FILE
    
    'run editor mode'
    gwcli -i INDIR -o OUTDIR [-c FILE] [-r] [-x (export|import)]
    
    'run archive mode'
    gwcli -c FILE -i INDIR -o OUTDIR -j (noarchive|analysis|protect|export|import) [-r]
    
    'use Rebuild style INI config file'
    gwcli -n FILE [-c FILE]
    
    'run wordsearch'
    gwcli -i INDIR -o OUTDIR [-c FILE] -g FILE -w [-r]
    
    'get licence information'
    gwcli --licence
    

    DESCRIPTION

    gwcli allows the user to process files using Glasswall Editor via a command-line interface.

    OPTIONS
    
    These options require values.
    -a FILE
        Retrieve all ID information and place in FILE.
    
    -b ID -o FILE
        Request information for the issue with identifier ID. Requires the -o option.
    
    -c FILE
        Use the content management settings specified in FILE.
    	
    -f FILE
        Specify consolidated summary XML report file. -e is optional.
    
    -g FILE
        Specify homoglyphs in FILE. -w must be specified as well.
    
    -i DIRECTORY
        Process all files in DIRECTORY and its subdirectories.
    
    -j MODE
        Use archive mode. MODE is one of noarchive, analysis, protect, export, import.
    
    --lic FILE
        Use the licence file at FILE.
    
    -n FILE
        Get subset of options from Rebuild style INI file FILE.
    
    -o DIRECTORY
        Place processed files in DIRECTORY. Create subdirectories to
        mirror the directory structure specified by -i.
    
    -x MODE
        Set process mode to export or import. MODE is one of export or import. --export_text_dump is optional.
    
    -z FILE
        Enable archive manager analysis report and place in FILE.
    
    These options do not require values.
    -e
        Skip unsupported file types.
    
    --export_text_dump
        Enable a text dump file to be produced when processing files in export mode.
    
    --libvers
        Display the Glasswall library version.
    
    --licence, --license
        Display licence information.
    
    --man
        Display this User Manual.
    
    -r
        Recurse through directories in the directory specified by -i.
        Recursion will become the default and this option will be
        removed.
    
    -w
        Enable wordsearch; -g must be specified as well.
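
    Several of the options above depend on one another: -b requires -o, and -g and -w must be given together. A wrapper script can check these rules before launching the tool. The Python sketch below is illustrative only; check_option_dependencies is a hypothetical helper and encodes just the dependencies stated in this section, not the tool's full argument validation.

```python
def check_option_dependencies(args):
    """Return a list of problems found in a gwcli argument list.

    Only the dependencies stated in the OPTIONS text are checked.
    """
    present = {arg for arg in args if arg.startswith("-")}
    problems = []
    if "-b" in present and "-o" not in present:
        problems.append("-b requires -o")
    if "-g" in present and "-w" not in present:
        problems.append("-g requires -w")
    if "-w" in present and "-g" not in present:
        problems.append("-w requires -g")
    return problems


# Missing -w: the homoglyph file is given but wordsearch is not enabled.
print(check_option_dependencies(
    ["gwcli", "-i", "in", "-o", "out", "-g", "homoglyphs.json"]
))
```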
    

    EXIT STATUS

    0   Success
    1   Failure
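
    Since the exit status is either 0 or 1, an automation script only needs to test for zero. The following Python sketch shows one way to assemble a command line and act on the result. The function names are hypothetical, the executable is assumed to be on the PATH under the name gwcli, and only the -i, -o and -c options documented above are used.

```python
import subprocess
import sys


def build_gwcli_command(input_dir, output_dir, policy_file=None, executable="gwcli"):
    """Assemble an argument list using only the -i, -o and -c options."""
    command = [executable, "-i", input_dir, "-o", output_dir]
    if policy_file is not None:
        command += ["-c", policy_file]
    return command


def run_gwcli(command):
    """Run the command; True means exit status 0, False means failure."""
    completed = subprocess.run(command)
    if completed.returncode != 0:
        print("gwcli reported failure", file=sys.stderr)
    return completed.returncode == 0


# Building the command does not require gwcli to be installed.
print(build_gwcli_command(r"C:\TestFiles\Test_Set_01",
                          r"C:\Output\Test_Set_01",
                          policy_file="policies.xml"))
```

    Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.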
    

    EXAMPLES

    The following Windows terminal session shows how to process files from C:\TestFiles\Test_Set_01 and place the processed files and analysis reports in C:\Output\Test_Set_01:

    CMD> gwcli -i C:\TestFiles\Test_Set_01 -o C:\Output\Test_Set_01
    

    To process files using your own policy settings, specify the policy file with the -c option:

    CMD> gwcli -i C:\TestFiles\Test_Set_01 -o C:\Output\Test_Set_01 -c policiesfile.xml
    

    To use the export functionality, use this format:

    CMD> gwcli -i C:\TestFiles\Test_Set_01 -o C:\Output\Test_Set_01-EXPORTED -x export
    

    To import files from the exported directory, use:

    CMD> gwcli -i C:\Output\Test_Set_01-EXPORTED -o C:\Output\Test_Set_01-IMPORTED -x import
    

    To perform a word search on a set of files, use:

    CMD> gwcli -i C:\TestFiles\Test_Set_01 -o C:\Output\Test_Set_01 -c policies.xml -g homoglyphs.json -w
    

    You can specify an INI file, as used for Rebuild. Bear in mind, this is deprecated and support will be removed from future releases.

    Optionally, you can specify a policies file as well:

    CMD> gwcli -n config.ini -c policies.xml
    

    NOTES

    -r is deprecated. Planned work includes defaulting to recursive descent through the input directory, rendering the option obsolete.

    INI Config File

    CLI config files are optional, but can be used to pre-configure and store settings for future runs.

    The INI config file can be an INI or TXT file. It contains most of the parameters available in the terminal: you can set the report mode, storage mode, input location and output location.

    The config files are backwards compatible with the Glasswall legacy CLI.

    Configuration Format

    The configuration file is a text file that enables the operation of the CLI to be configured. The file is made up of sections and associated configuration item name/value pairs.

    For example:

    [sectionname1] 
    itemname11 = itemvalue11
    itemname12 = itemvalue12 
    [sectionname2] 
    itemname21 = itemvalue21 
    itemname22 = itemvalue22 
    

    All configuration values default to either 0 or an empty string, where appropriate.
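
    This section/item layout is close to what Python's standard configparser module reads, so a config file can be inspected from a script. The sketch below is an approximation, not the CLI's own parser: it assumes gwcli treats the file much as configparser does, and the helpers load_sections, get_int and get_str are hypothetical. The fallbacks mirror the documented defaults of 0 and an empty string.

```python
import configparser


def load_sections(text):
    """Parse INI text into {section: {item: value}} with item names kept as written."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep case, e.g. inputLocation
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def get_int(section, name):
    """Return a numeric item, falling back to the documented default of 0."""
    return int(section.get(name, "0") or "0")


def get_str(section, name):
    """Return a text item, falling back to the documented default of ''."""
    return section.get(name, "")


sample = """
[sectionname1]
itemname11 = itemvalue11
itemname12=itemvalue12
"""

first = load_sections(sample)["sectionname1"]
print(first["itemname11"], first["itemname12"])
print(get_int(first, "useSubfolders"), repr(get_str(first, "inputLocation")))
```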

    Each entry below gives the identifier, its permitted values in square brackets, and notes.

    fileType [pdf|jpg|png|gif|doc|ppt|xls|docx|pptx|xlsx|emf|wmf|*]
        Type of documents for processing. Specifying the wildcard option '*' will enable automatic file type detection. Process mode 5 does not support the wildcard option.

    inputLocation [valid path]
        Location of documents for processing.

    useSubfolders [0|1]
        0 = Process documents only in inputLocation.
        1 = Process documents in inputLocation and any subfolders.

    processMode [0|1|2|4|5|6|7|8]
        0 = Analysis
            GWFileAnalysisAudit
            GWFileAnalysisAuditAndReport
            GWFileToFileAnalysisAudit
            GWFileToFileAnalysisAuditAndReport
        1 = Manage & Protect
            GWFileProtect
            GWFileProtectAndReport
            GWFileToFileProtect
            GWFileToFileProtectAndReport
        2 = Manage & Protect Lite
            GWFileProtectLite
            GWFileProtectLiteAndReport
            GWFileToFileProtectLite
            GWFileToFileProtectLiteAndReport
        4 = Export Mode
            Not supported.
        5 = Import Mode
            Not supported.
        6 = Archive Analysis
            Runs analysis process modes on files within archives such as ZIP.
        7 = Archive Protect
            Runs protect process modes on files within archives such as ZIP.
        8 = Word Search
            Not supported.
        Note: The short form of the report has no content items listed, only Sanitisations/Issues and Remedies.

    reportMode [0|1]
        1 = Invokes the APIs that also produce a report, for example GWFileAnalysisAuditAndReport. If writeOutput is set to zero, no reports will be produced. These logs are for low-level analysis of how Glasswall handles the files and should not be made available to end users.

    outputLocation [valid path]
        Root folder for all processing output.

    nonconformingDirName [valid folder name]
        Name of the subfolder within outputLocation that is the destination for all output from the processing of nonconforming files.

    managedDirName [valid folder name]
        Name of the subfolder within outputLocation that is the destination for all output from the processing of managed files.

    fileToFileMode [1]
        1 = File to File

    pathToSummaryReport [valid path]
        Produces a consolidated summary XML report for all processed files (process mode 0 only).

    skipUnsupportedFiles [0|1]
        0 = Report unsupported files in the summary report.
        1 = Suppress the reporting of unsupported files in the summary report.

    Note: When using the INI config file, fileToFileMode must currently be set to 1. This option will be deprecated in a subsequent release.
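
    The constraints listed for these identifiers can be checked before a run. The Python sketch below is a hypothetical pre-flight check, not something gwcli provides: validate_gwconfig takes the items of one section as a plain dictionary of strings and reports only the rules stated above (allowed process modes, the modes marked as not supported, the 0/1 flags, the wildcard restriction for process mode 5, and fileToFileMode being 1).

```python
KNOWN_PROCESS_MODES = {"0", "1", "2", "4", "5", "6", "7", "8"}
UNSUPPORTED_PROCESS_MODES = {"4", "5", "8"}  # listed as "Not supported"
BINARY_FLAGS = ("useSubfolders", "reportMode", "skipUnsupportedFiles")


def validate_gwconfig(items):
    """Return a list of problems for a dict of INI items (all values as strings)."""
    problems = []
    mode = items.get("processMode", "0").strip()
    if mode not in KNOWN_PROCESS_MODES:
        problems.append("processMode %r is not a documented value" % mode)
    elif mode in UNSUPPORTED_PROCESS_MODES:
        problems.append("processMode %s is listed as not supported" % mode)
    if mode == "5" and items.get("fileType", "").strip() == "*":
        problems.append("fileType wildcard is not supported in process mode 5")
    for flag in BINARY_FLAGS:
        if items.get(flag, "0").strip() not in ("0", "1"):
            problems.append("%s must be 0 or 1" % flag)
    if items.get("fileToFileMode", "").strip() != "1":
        problems.append("fileToFileMode must be set to 1")
    return problems


good = {"processMode": "0", "reportMode": "1", "fileToFileMode": "1", "fileType": "*"}
bad = {"processMode": "5", "fileType": "*", "reportMode": "2"}
print(validate_gwconfig(good))
for problem in validate_gwconfig(bad):
    print(problem)
```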

    EXAMPLE INI FILE

    Item names must be unique within a section. Zero or more whitespace characters are allowed around the literal =. An example of a valid configuration file is shown below, mixing spaces and no spaces around the = for demonstration purposes.

    [GWConfig] 
    processMode = 0 
    reportMode = 1 
    fileToFileMode = 1
    fileType = * 
    inputLocation=C:\Test\bug_301 
    useSubfolders=1 
    
    pathToSummaryReport=summary.xml
     
    outputLocation       = Report
    nonConformingDirName = NonConforming 
    managedDirName       = Managed 
    
    
