Identifying Split or Part Archive File Transfers
Detecting split or part archive files in a dataset can be useful for identifying potential data exfiltration or malicious activity. The following is a KQL query that works with the DeviceFileEvents table in Microsoft Sentinel to discover split or part archive files based on naming patterns:
// Define patterns for split or part archive file names
let SplitArchivePatterns = dynamic(["*.part*", "*.zip.*", "*.rar.*", "*.z*.*", "*.tar.*", "*.gz.*"]);
// Query the FileEvents table
DeviceFileEvents
| extend FileExtension = tolower(split(FileName, ".")[-1]) // Extract file extension
| where FileName matches regex @"(.*\.(part[0-9]+|zip\.[0-9]+|rar\.[0-9]+|z\.[0-9]+|tar\.[0-9]+|gz\.[0-9]+))$"
or FileName has_any (SplitArchivePatterns) // Match patterns or dynamic list
| summarize
TotalFiles = count(),
UniqueDevices = dcount(DeviceName),
UniqueUsers = dcount(RequestAccountName),
FileSizeSum = sum(FileSize)
by FileName, FolderPath, FileExtension, bin(Timestamp, 1h)
| order by TotalFiles desc
| project Timestamp, FileName, FolderPath, FileExtension, TotalFiles, UniqueDevices, UniqueUsers, FileSizeSumExplanation:
Patterns for Split/Part Files:
SplitArchivePatterns: Defines patterns that identify split or part archive files, such as.part1,.zip.001,.rar.002,.z.003, etc.Uses
matches regexandhas_anyfor flexible pattern matching.
File Extension Extraction:
Extracts the file extension from the
FileNamefield usingsplit()and converts it to lowercase for case-insensitive comparison.
Filters:
Filters the
FileNamefield to match naming conventions for split or part archive files usingmatches regexor the predefined pattern list.
Aggregation:
Summarises:
TotalFiles: Number of files matching the pattern.UniqueDevices: Number of unique devices involved.UniqueUsers: Number of distinct users associated with the files.FileSizeSum: Sum of file sizes for the detected files.
Time Binning:
Group results into 1-hour intervals using
bin(Timestamp, 1h)for temporal analysis.
Results:
Displays key fields such as
Timestamp,FileName,FilePath,FileExtension,TotalFiles,UniqueDevices,UniqueUsers, andFileSizeSum.
Customisation:
Patterns:
Add or modify patterns in
SplitArchivePatternsto align with your organisation's requirements.
Time Filtering:
Add a specific time range filter, e.g.,
| where Timestamp between (startTime .. endTime).
Additional Fields:
Include fields like
UserPrincipalName,SourceIP, orDestinationIPfor more context.
This query can help detect potentially suspicious activity related to split or part archive files in your environment.
This query will look for files with common split archive extensions and patterns in their names:
// Define patterns for split or part archive file names
let SplitArchivePatterns = dynamic(["*.part*", "*.zip.*", "*.rar.*", "*.z*.*", "*.tar.*", "*.gz.*"]);
// Query the FileEvents table
DeviceFileEvents
| extend FileExtension = tolower(split(FileName, ".")[-1]) // Extract file extension
| where FileName matches regex @"(.*\.(part[0-9]+|zip\.[0-9]+|rar\.[0-9]+|z\.[0-9]+|tar\.[0-9]+|gz\.[0-9]+))$"
or FileName has_any (SplitArchivePatterns) // Match patterns or dynamic list
| summarize
TotalFiles = count(),
UniqueDevices = dcount(DeviceName),
UniqueUsers = dcount(RequestAccountName),
FileSizeSum = sum(FileSize)
by FileName, FolderPath, FileExtension, bin(Timestamp, 1h)
| order by TotalFiles desc
| project Timestamp, FileName, FolderPath, FileExtension, TotalFiles, UniqueDevices, UniqueUsers, FileSizeSumExplanation:
Pattern Matching: The
SplitArchivePatternsdynamic array contains common extensions for split or part archive files.Filtering: The
whereclause filters theFileEventstable to retain only files matching the specified patterns.Summarization: The
summarizestatement aggregates the data to count the total number of files, unique devices, and unique users for each file name and folder path.Ordering: The results are ordered by the total number of files in descending order.
Projection: The
projectstatement selects the relevant columns for the final output.
This query should help identify split or part archive files in an environment.
Splunk query to identify split or part archive files using Sysmon logs:
Index=sysmon sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational EventCode=11
| eval FileExtension=lower(mvindex(split(FileName, "."), -1)) // Extract file extension
| search FileName IN ("*.part*", "*.zip.*", "*.rar.*", "*.z.*", "*.tar.*", "*.gz.*", "*.001", "*.002", "*.003") // Match split or part archive patterns
| stats count AS TotalFiles,
dc(Computer) AS UniqueHosts,
dc(User) AS UniqueUsers,
sum(FileSize) AS TotalFileSize
by FileName, FilePath, FileExtension
| sort - TotalFiles
| table FileName, FilePath, FileExtension, TotalFiles, UniqueHosts, UniqueUsers, TotalFileSizeExplanation:
Index and Sourcetype:
The query assumes
index=sysmonandsourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational. Adjust as per your environment.
Event Code:
EventCode=11corresponds to Sysmon FileCreate events, capturing file creation activity.
File Extension Extraction:
Uses
split()andmvindex()to extract the file extension and normalise it to lowercase for uniform comparison.
Split/Part File Patterns:
Searches for files matching common split or part archive patterns:
Examples:
.part*,.zip.*,.rar.*,.001,.002, etc.
Statistics:
Aggregates data using
statsto show:TotalFiles: Number of files matching the pattern.UniqueHosts: Number of unique hosts involved.UniqueUsers: Number of distinct users associated with file creation.TotalFileSize: Sum of file sizes for matched files.
Sorting and Display:
Sorts results by
TotalFilesin descending order.Displays relevant fields:
FileName,FilePath,FileExtension,TotalFiles,UniqueHosts,UniqueUsers, andTotalFileSize.
Customisation:
Patterns:
Expand or adjust file patterns to include additional split archive naming conventions.
Fields:
Verify the field names (
FileName,FilePath,FileSize,Computer,User) and adjust them to match your Sysmon log schema.
Time Filters:
Use Splunk's time picker or add time range filters like
earliest=-24h.
Use Case:
This query helps detect the creation of split or part archive files on endpoints, which could indicate potential data staging for exfiltration or malicious activity.
Last updated