Lookup table creation for scalable anomaly detection with JA3/JA3s hashes (2024)

You can run a search that uses JA3 and JA3s hashes and probabilities to detect abnormal activity on critical servers, which are often targeted in supply chain attacks. JA3 is an open-source methodology that allows for creating an MD5 hash of specific values found in the SSL/TLS handshake process, and JA3s is a similar methodology for calculating the JA3 hash of a server session.
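As a rough illustration of the methodology, the sketch below builds a JA3-style fingerprint in Python. It assumes the field layout described by the public JA3 project: TLS version, ciphers, extensions, elliptic curves, and elliptic curve point formats for the client, and version, cipher, and extensions for the server, with values inside a field joined by hyphens and fields joined by commas. The function name and all sample values are hypothetical; in practice a tool such as Zeek computes these hashes for you.

```python
import hashlib


def ja3_style_hash(fields):
    """MD5 a list of handshake fields, each given as a list of integers.

    Assumed layout: values within a field are joined by "-",
    and the fields themselves are joined by ",".
    """
    fingerprint = ",".join("-".join(str(v) for v in field) for field in fields)
    return fingerprint, hashlib.md5(fingerprint.encode("ascii")).hexdigest()


# Hypothetical client hello values: version, ciphers, extensions,
# elliptic curves, elliptic curve point formats.
client_fields = [[771], [4865, 4866, 49195], [0, 10, 11], [29, 23], [0]]
# Hypothetical server hello values: version, chosen cipher, extensions.
server_fields = [[771], [4865], [43, 51]]

ja3_string, ja3 = ja3_style_hash(client_fields)
ja3s_string, ja3s = ja3_style_hash(server_fields)
print(ja3_string, ja3)
print(ja3s_string, ja3s)
```

Each result is a 32-character hexadecimal string, which is the form the ja3 and ja3s fields take in the searches that follow.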

Required data

Deep packet inspection data

In this example, Zeek is used to generate JA3 and JA3s data, but you can use any other tool that can generate that data.

Procedure

These searches are most effectively run in the following circumstances:

  • with an allow list that limits the number of perceived false positives.
  • against network connectivity that is not encrypted over SSL/TLS.
  • with internal hosts or netblocks that have limited outbound connectivity as a client.
  • in networks without SSL/TLS interceptions or inspection.
  1. Run the following search to generate your lookup table. You can optimize it by specifying an index and adjusting the time range.

    sourcetype="bro:ssl:json" ja3="*" ja3s="*" src_ip IN (192.168.70.0/24)
    | eval id=md5(src_ip+ja3+ja3s)
    | stats count BY id,ja3,ja3s,src_ip
    | eventstats sum(count) AS total_host_count BY src_ip,ja3
    | eval hash_pair_likelihood=exact(count/total_host_count)
    | sort src_ip ja3 hash_pair_likelihood
    | streamstats sum(hash_pair_likelihood) AS cumulative_likelihood BY src_ip,ja3
    | eval log_cumulative_like=log(cumulative_likelihood)
    | eval log_hash_pair_like=log(hash_pair_likelihood)
    | outputlookup hash_count_by_host_baselines.csv

    Search explanation

    The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

    Splunk Search Explanation
    sourcetype="bro:ssl:json" ja3="*" ja3s="*" src_ip IN (192.168.70.0/24)

    Search for JA3 and JA3s hashes from the critical servers defined.

    This part of the search uses the critical server netblock 192.168.70.0/24. Adjust this part of the search to include your own critical servers.

    | eval id=md5(src_ip+ja3+ja3s)

    Create a new field, id, containing the 128-bit MD5 (message-digest) hash of the concatenated src_ip, ja3, and ja3s values.

    | stats count BY id,ja3,ja3s,src_ip

    Count events by id, ja3, ja3s, and src_ip.

    To keep the probabilities up to date, you must run an additional query that refreshes the lookup table. To do this, insert the additional SPL shown here after this line of the original search. The initial outputlookup query should cover the previous 7 days, while the update query should run every 24 hours over the last 24 hours' worth of data. The update appends the counts already stored in the lookup table to the new counts, and its time window should start when the previous run completed.

    | append
    [| inputlookup hash_count_by_host_baselines.csv]
    | stats sum(count) as count by id,ja3,ja3s,src_ip

    | eventstats sum(count) AS total_host_count BY src_ip,ja3

    Sum the counts into a new field, total_host_count, for each combination of src_ip and ja3.

    | eval hash_pair_likelihood=exact(count/total_host_count)

    Calculate the likelihood of each JA3/JA3s pair by dividing count by total_host_count. The exact function returns the result with greater precision.

    | sort src_ip ja3 hash_pair_likelihood

    Sort the results by src_ip, ja3, and hash_pair_likelihood in ascending order.

    | streamstats sum(hash_pair_likelihood) AS cumulative_likelihood BY src_ip,ja3

    Calculate a cumulative probability for the events, grouped by src_ip and ja3.

    | eval log_cumulative_like=log(cumulative_likelihood)
    | eval log_hash_pair_like=log(hash_pair_likelihood)

    Create a field called log_cumulative_like that holds the base-10 logarithm of the cumulative_likelihood value. Then do the same with hash_pair_likelihood to create log_hash_pair_like.

    | outputlookup hash_count_by_host_baselines.csv

    Write the results to a lookup table CSV file.
  2. Run the following search with the lookup command to identify anomalous activity. You can optimize it by specifying an index and adjusting the time range.

    sourcetype="bro:ssl:json" ja3="*" ja3s="*" src_ip IN (192.168.70.0/24)
    | eval id=md5(src_ip+ja3+ja3s)
    | lookup hash_count_by_host_baselines.csv id AS id OUTPUT count, total_host_count, log_cumulative_like, log_hash_pair_like
    | table _time, src_ip, ja3s, server_name, subject, issuer, dest_ip, ja3, log_cumulative_like, log_hash_pair_like, count, total_host_count
    | sort log_hash_pair_like

    Search explanation

    The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

    Splunk Search Explanation
    sourcetype="bro:ssl:json" ja3="*" ja3s="*" src_ip IN (192.168.70.0/24)

    Search for JA3 and JA3s hashes from the critical servers defined.

    This part of the search uses the critical server netblock 192.168.70.0/24. Adjust this part of the search to include your own critical servers.

    | eval id=md5(src_ip+ja3+ja3s)

    Create a new field, id, containing the 128-bit MD5 (message-digest) hash of the concatenated src_ip, ja3, and ja3s values.

    | lookup hash_count_by_host_baselines.csv id AS id OUTPUT count, total_host_count, log_cumulative_like, log_hash_pair_like

    Look up the id hash in the previously generated table and output the fields shown.

    | table _time, src_ip, ja3s, server_name, subject, issuer, dest_ip, ja3, log_cumulative_like, log_hash_pair_like, count, total_host_count

    Display the results in a table with columns in the order shown.

    | sort log_hash_pair_like

    Sort the results with the smallest log_hash_pair_like value first.
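The arithmetic behind the lookup table in step 1, including the daily update that adds new counts to the stored ones, can be sketched outside Splunk. The Python below is only an approximation under stated assumptions: the function names and all counts are hypothetical, and src_ip is sorted as a plain string, which may differ from how Splunk orders IP addresses.

```python
import math
from collections import defaultdict


def merge_counts(baseline_counts, new_counts):
    """Rough mirror of: append [inputlookup ...] | stats sum(count) BY ..."""
    merged = defaultdict(int)
    for source in (baseline_counts, new_counts):
        for key, n in source.items():
            merged[key] += n
    return dict(merged)


def build_baseline(counts):
    """counts maps (src_ip, ja3, ja3s) -> number of sessions seen."""
    # eventstats sum(count) AS total_host_count BY src_ip,ja3
    totals = defaultdict(int)
    for (src_ip, ja3, _ja3s), n in counts.items():
        totals[(src_ip, ja3)] += n

    rows = []
    for (src_ip, ja3, ja3s), n in counts.items():
        total = totals[(src_ip, ja3)]
        rows.append({
            "src_ip": src_ip, "ja3": ja3, "ja3s": ja3s,
            "count": n, "total_host_count": total,
            "hash_pair_likelihood": n / total,
        })

    # sort src_ip ja3 hash_pair_likelihood (ascending)
    rows.sort(key=lambda r: (r["src_ip"], r["ja3"], r["hash_pair_likelihood"]))

    # streamstats sum(hash_pair_likelihood) BY src_ip,ja3, then base-10 logs
    running = defaultdict(float)
    for r in rows:
        key = (r["src_ip"], r["ja3"])
        running[key] += r["hash_pair_likelihood"]
        r["cumulative_likelihood"] = running[key]
        r["log_cumulative_like"] = math.log10(r["cumulative_likelihood"])
        r["log_hash_pair_like"] = math.log10(r["hash_pair_likelihood"])
    return rows


# Hypothetical counts: a 7-day baseline plus one day of new data.
week = {("192.168.70.10", "ja3_a", "ja3s_x"): 97,
        ("192.168.70.10", "ja3_b", "ja3s_x"): 50}
today = {("192.168.70.10", "ja3_a", "ja3s_x"): 1,
         ("192.168.70.10", "ja3_a", "ja3s_y"): 2}

baseline = build_baseline(merge_counts(week, today))
for row in baseline:
    print(row["ja3"], row["ja3s"], row["count"],
          round(row["log_hash_pair_like"], 3))
```

In this toy data the ja3_a/ja3s_y pair accounts for 2 of 100 sessions for that client hash, so it gets the most negative log_hash_pair_like and would sort to the top of the results in step 2.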

Next steps

With the second search, you look up the id generated previously in your lookup table to home in on potentially anomalous results. In the example below, anomalous results are returned in the top ten results.

[Screenshot: example results from the lookup search]
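The lookup step can also be sketched in Python to show why rare pairs rise to the top. This is a simplified illustration, not the Splunk implementation: the function names, baseline values, and events are hypothetical, and returning events with no baseline match as a separate list is a choice made for this sketch only.

```python
import hashlib


def make_id(src_ip, ja3, ja3s):
    # Same idea as eval id=md5(src_ip+ja3+ja3s): hash the concatenated strings.
    return hashlib.md5((src_ip + ja3 + ja3s).encode("utf-8")).hexdigest()


def score_events(events, baseline):
    """Attach baseline fields to each event, least likely known pairs first."""
    matched, unseen = [], []
    for event in events:
        key = make_id(event["src_ip"], event["ja3"], event["ja3s"])
        if key in baseline:
            matched.append({**event, **baseline[key]})
        else:
            unseen.append(event)
    # sort log_hash_pair_like (ascending, so the rarest pairs come first)
    matched.sort(key=lambda e: e["log_hash_pair_like"])
    return matched, unseen


# Hypothetical baseline keyed by id, standing in for the lookup table.
baseline = {
    make_id("192.168.70.10", "ja3_a", "ja3s_x"):
        {"count": 98, "log_hash_pair_like": -0.009},
    make_id("192.168.70.10", "ja3_a", "ja3s_y"):
        {"count": 2, "log_hash_pair_like": -1.699},
}
events = [
    {"src_ip": "192.168.70.10", "ja3": "ja3_a", "ja3s": "ja3s_x"},
    {"src_ip": "192.168.70.10", "ja3": "ja3_a", "ja3s": "ja3s_y"},
    {"src_ip": "192.168.70.10", "ja3": "ja3_c", "ja3s": "ja3s_z"},
]

matched, unseen = score_events(events, baseline)
print([e["ja3s"] for e in matched], len(unseen))
```

Because the expensive counting has already been done when the lookup table was built, each event here needs only one hash and one dictionary lookup.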

Results from this search are about as effective as those from the anomaly probability search, and the queries take an equivalent amount of time to complete. In general day-to-day usage, however, the secondary lookup query is approximately 100x faster.

An allow list could also be a necessity when tested with more extensive networks so that anomalous activity is consistently identified within the top 30 events.
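One possible shape for such an allow list is sketched below. The source does not say what the allow list should key on, so matching on the (ja3, ja3s) pair is an assumption, as are the function name and the sample values.

```python
def apply_allow_list(events, allowed_pairs):
    """Drop events whose (ja3, ja3s) pair is explicitly allowed.

    Keying on the hash pair is an assumption made for this sketch.
    """
    return [e for e in events if (e["ja3"], e["ja3s"]) not in allowed_pairs]


# Hypothetical allow list and scored events.
allowed_pairs = {("ja3_a", "ja3s_x")}
events = [
    {"src_ip": "192.168.70.10", "ja3": "ja3_a", "ja3s": "ja3s_x"},
    {"src_ip": "192.168.70.10", "ja3": "ja3_a", "ja3s": "ja3s_y"},
]

remaining = apply_allow_list(events, allowed_pairs)
print(remaining)
```

Filtering known-good pairs before sorting leaves fewer rows competing for the top of the results.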

Finally, you might be interested in other processes associated with the Detecting software supply chain attacks use case.

Article information

Author: Prof. An Powlowski
