Handle special characters in requests #24

@dandye

Description

Version 2.32.x of the Python requests library does not work with the unstructuredlogentries API when the entries contain special characters (e.g. "Microsoft® Windows® Operating System").

CL opened to pin the version of requests below 2.32:
https://critique.corp.google.com/cl/645031111

Details

The reported symptom was data missing from the Events that the Logstory project loads periodically. Investigation showed that the missing logs were the lines containing the special character “®”, which is found in “Microsoft® Windows® Operating System”. The problem began when the installed version of requests moved to 2.32.x.

The 2.31.x version of requests does work (i.e. requests < 2.32).
Removing the ® character, or replacing it with ?, also fixes the issue and allows requests 2.32.x to be used.
The API does not raise an error. Instead, it times out after 120 seconds.
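
The replacement workaround mentioned above can be sketched as follows. This is a minimal illustration, not code from Logstory: the helper name `replace_non_ascii` is hypothetical, and it replaces every non-ASCII character (not only ®) with `?`.

```python
def replace_non_ascii(line: str, placeholder: str = "?") -> str:
    """Return `line` with every non-ASCII character replaced by `placeholder`.

    Hypothetical helper for illustration; it is not part of Logstory
    or of the ingestion scripts.
    """
    return "".join(ch if ch.isascii() else placeholder for ch in line)


sample = "Product: Microsoft® Windows® Operating System"
print(replace_non_ascii(sample))
# Product: Microsoft? Windows? Operating System
```

This changes the log content, so pinning requests is probably the less invasive fix; the sketch only shows what "replacing with ?" means in practice.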

Not sure if this is relevant, but 2.32 introduced:

Requests now supports optional use of character detection (chardet or charset_normalizer) when repackaged or vendored. This enables pip and other projects to minimize their vendoring surface area. The Response.text() and apparent_encoding APIs will default to utf-8 if neither library is present. (#6702)

charset-normalizer==3.3.2 is installed with requests 2.31.0
Logstory was affected by this issue because the requests version in requirements.txt was specified as: requests >= 2.28.1
Changing this to requests < 2.32 should fix it
The only special character in logstory logs is ®
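
To check whether an environment already satisfies the proposed `requests < 2.32` pin, something like the sketch below could be used. The function name `satisfies_pin` is hypothetical, and the version parsing is deliberately simple: it only looks at the major and minor components and treats anything it cannot parse as not satisfying the pin.

```python
from importlib.metadata import PackageNotFoundError, version


def satisfies_pin(version_string: str) -> bool:
    """Return True if `version_string` is below 2.32 (major.minor only).

    Hypothetical helper; unparsable versions are reported as False.
    """
    parts = version_string.split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return False
    return (major, minor) < (2, 32)


try:
    installed = version("requests")
    print(f"requests {installed} satisfies pin: {satisfies_pin(installed)}")
except PackageNotFoundError:
    print("requests is not installed in this environment")
```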
The affected use cases are:

(Pdb) for afile in files_affected: print(afile, files_affected[afile])
./MISP/EVENTS/WINDOWS_SYSMON.log 208
./EDR_WORKSHOP/EVENTS/CS_EDR.log 10
./EDR_WORKSHOP/EVENTS/WINDOWS_SYSMON.log 16
./RAT/EVENTS/WINDOWS_SYSMON.log 148
./RAT/EVENTS/WINEVTLOG.log 90
./THW/EVENTS/GCP_FIREWALL.log 91
./THW/EVENTS/WINDOWS_SYSMON.log 1566
./THW/EVENTS/POWERSHELL.log 20
./RECON_CISA/EVENTS/WINDOWS_SYSMON.log 1554
./RECON_CISA/EVENTS/WINEVTLOG.log 660
./MANDIANT_FRONTLINE/EVENTS/WINDOWS_SYSMON.log 78
./GCTI/EVENTS/WINDOWS_DEFENDER_ATP.log 36
./GCTI/EVENTS/WINDOWS_SYSMON.log 24
./GCTI/EVENTS/MICROSOFT_DEFENDER_ENDPOINT.log 6
./SOAR_RECON_CISA/EVENTS/WINDOWS_SYSMON.log 1364
./RULES_SEARCH_WORKSHOP/EVENTS/WINDOWS_SYSMON.log 1078
./RULES_SEARCH_WORKSHOP/EVENTS/WINDOWS_SYSMON_1good_1bad.log 4
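
How the `files_affected` mapping above was built is not shown. A listing of this kind could be produced with a sketch like the following, which walks a directory tree and counts non-ASCII characters in each `.log` file. Both function names are hypothetical, and counting characters (rather than lines) is an assumption.

```python
import os


def count_non_ascii(text: str) -> int:
    """Count the characters in `text` that are outside the ASCII range."""
    return sum(1 for ch in text if not ch.isascii())


def scan_logs(root: str) -> dict:
    """Map each .log file under `root` to its number of non-ASCII characters.

    Files without any non-ASCII characters are omitted from the result.
    """
    files_affected = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".log"):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on undecodable bytes;
            # those bytes are then counted as non-ASCII characters too.
            with open(path, encoding="utf-8", errors="replace") as handle:
                count = sum(count_non_ascii(line) for line in handle)
            if count:
                files_affected[path] = count
    return files_affected
```

Calling `scan_logs(".")` from the directory that holds the use cases would return a dictionary that can be printed in the same way as in the pdb session.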

This two-line file may be used to test (usage follows):

<14>Jun 16 13:37:42 wrk-shasek.stackedpads.local Microsoft-Windows-Sysmon[2568]: {"EventTime":1718545020,"Hostname":"wrk-shasek.stackedpads.local","Keywords":-9223372036854775808,"EventType":"INFO","SeverityValue":2,"Severity":"INFO","EventID":7,"SourceName":"Microsoft-Windows-Sysmon","ProviderGuid":"{5770385F-C22A-43E0-BF4C-06F5698FFBD9}","Version":3,"Task":7,"OpcodeValue":0,"RecordNumber":72325,"ProcessID":2568,"ThreadID":3284,"Channel":"Microsoft-Windows-Sysmon/Operational","Domain":"NT AUTHORITY","AccountName":"SYSTEM","UserID":"S-1-5-18","AccountType":"User","Message":"Image loaded:\r\nRuleName: technique_id=T1053,technique_name=Scheduled Task\r\nUtcTime: 2024-06-16 13:37:42.456\r\nProcessGuid: {6b7cbb53-33b3-62c8-b900-000000000e00}\r\nProcessId: 3620\r\nImage: C:\\Windows\\System32\\MoUsoCoreWorker.exe\r\nImageLoaded: C:\\Windows\\System32\\taskschd.dll\r\nFileVersion: 10.0.19041.1266 (WinBuild.160101.0800)\r\nDescription: Task Scheduler COM API\r\nProduct: Microsoft® Windows® Operating System\r\nCompany: Microsoft Corporation\r\nOriginalFileName: taskschd.dll\r\nHashes: SHA1=27EFA81247501EBA6603842F476C899B5DAAB8C7,MD5=49E93FA14D4E09AAFD418AB616AD1BB1,SHA256=35E3F44C587DE8BFF62095E768C77E12E2C522FB7EFD038FFFCC0DD2AE960A57,IMPHASH=B7A4477FA36E2E5287EE76AC4AFCB05B\r\nSigned: true\r\nSignature: Microsoft Windows\r\nSignatureStatus: Valid\r\nUser: NT AUTHORITY\\SYSTEM","Category":"Image loaded (rule: ImageLoad)","Opcode":"Info","RuleName":"technique_id=T1053,technique_name=Scheduled Task","UtcTime":"2024-06-16 13:37:42.456","ProcessGuid":"{6b7cbb53-33b3-62c8-b900-000000000e00}","Image":"C:\\Windows\\System32\\MoUsoCoreWorker.exe","ImageLoaded":"C:\\Windows\\System32\\taskschd.dll","FileVersion":"10.0.19041.1266 (WinBuild.160101.0800)","Description":"Task Scheduler COM API","Product":"Microsoft® Windows® Operating System","Company":"Microsoft Corporation","OriginalFileName":"taskschd.dll","Hashes":"SHA1=27EFA81247501EBA6603842F476C899B5DAAB8C7,MD5=49E93FA14D4E09AAFD418AB616AD1BB1,SHA256=35E3F44C587DE8BFF62095E768C77E12E2C522FB7EFD038FFFCC0DD2AE960A57,IMPHASH=B7A4477FA36E2E5287EE76AC4AFCB05B","Signed":"true","Signature":"Microsoft Windows","SignatureStatus":"Valid","User":"NT AUTHORITY\\SYSTEM","EventReceivedTime":1718545080,"SourceModuleName":"in_sysmon","SourceModuleType":"im_msvistalog"}
<14>Jun 16 13:37:43 wrk-shasek.stackedpads.local Microsoft-Windows-Sysmon[2568]: {"EventTime":1718545020,"Hostname":"wrk-shasek.stackedpads.local","Keywords":-9223372036854775808,"EventType":"INFO","SeverityValue":2,"Severity":"INFO","EventID":7,"SourceName":"Microsoft-Windows-Sysmon","ProviderGuid":"{5770385F-C22A-43E0-BF4C-06F5698FFBD9}","Version":3,"Task":7,"OpcodeValue":0,"RecordNumber":72330,"ProcessID":2568,"ThreadID":3284,"Channel":"Microsoft-Windows-Sysmon/Operational","Domain":"NT AUTHORITY","AccountName":"SYSTEM","UserID":"S-1-5-18","AccountType":"User","Message":"Image loaded:\r\nRuleName: technique_id=T1053,technique_name=Scheduled Task\r\nUtcTime: 2024-06-16 13:37:43.789\r\nProcessGuid: {6b7cbb53-33b3-62c8-ba00-000000000e00}\r\nProcessId: 5528\r\nImage: C:\\Windows\\System32\\svchost.exe\r\nImageLoaded: C:\\Windows\\System32\\taskschd.dll\r\nFileVersion: 10.0.19041.1266 (WinBuild.160101.0800)\r\nDescription: Task Scheduler COM API\r\nProduct: Microsoft® Windows® Operating System\r\nCompany: Microsoft Corporation\r\nOriginalFileName: taskschd.dll\r\nHashes: SHA1=27EFA81247501EBA6603842F476C899B5DAAB8C7,MD5=49E93FA14D4E09AAFD418AB616AD1BB1,SHA256=35E3F44C587DE8BFF62095E768C77E12E2C522FB7EFD038FFFCC0DD2AE960A57,IMPHASH=B7A4477FA36E2E5287EE76AC4AFCB05B\r\nSigned: true\r\nSignature: Microsoft Windows\r\nSignatureStatus: Valid\r\nUser: NT AUTHORITY\\SYSTEM","Category":"Image loaded (rule: ImageLoad)","Opcode":"Info","RuleName":"technique_id=T1053,technique_name=Scheduled Task","UtcTime":"2024-06-16 13:37:42.789","ProcessGuid":"{6b7cbb53-33b3-62c8-ba00-000000000e00}","Image":"C:\\Windows\\System32\\svchost.exe","ImageLoaded":"C:\\Windows\\System32\\taskschd.dll","FileVersion":"10.0.19041.1266 (WinBuild.160101.0800)","Description":"Task Scheduler COM API","Product":"Microsoft® Windows® Operating System","Company":"Microsoft Corporation","OriginalFileName":"taskschd.dll","Hashes":"SHA1=27EFA81247501EBA6603842F476C899B5DAAB8C7,MD5=49E93FA14D4E09AAFD418AB616AD1BB1,SHA256=35E3F44C587DE8BFF62095E768C77E12E2C522FB7EFD038FFFCC0DD2AE960A57,IMPHASH=B7A4477FA36E2E5287EE76AC4AFCB05B","Signed":"true","Signature":"Microsoft Windows","SignatureStatus":"Valid","User":"NT AUTHORITY\\SYSTEM","EventReceivedTime":1718545080,"SourceModuleName":"in_sysmon","SourceModuleType":"im_msvistalog"}

Usage:

python -m ingestion.create_unstructured_log_entries --customer_id=$PROJECT_GUID --log_type=WINDOWS_SYSMON --logs_file=./ingestion/example_input/sysmon_unstructured_log_entries_bad_only.txt --credentials_file=/Users/dandye/.ssh/malachite-ltstr740-5526c600e791-ingestion-api.json
