Unveiling the Hidden Enchantment in malicious.h5: A Detailed Analysis
Challenge: Investigate a mysterious magical artifact (malicious.h5) exhibiting unusual behavior to uncover its secrets. The flag format is HTB{FlagGoesHere}.
Workflow: We will systematically inspect the malicious.h5 file, leveraging tools like h5py and online model visualizers like Netron to understand its structure and identify any hidden elements or malicious code.
Step-by-Step Analysis:
- Initial Inspection with h5py:
  - Purpose: Start by understanding the basic structure of the malicious.h5 file. H5 files are hierarchical, and h5py allows us to navigate this structure programmatically. We want to see what groups and datasets are present.
  - Action: Use a Python one-liner with h5py to print the names of all groups and datasets within the file.

        python -c "import h5py; f = h5py.File('malicious.h5', 'r'); f.visititems(lambda name, obj: print(name))"

  - Observation: Running this command reveals a typical structure for a Keras/TensorFlow model, with groups like model_weights and layers like conv2d_1, batch_normalization_1, etc. However, amidst these standard layers, we notice an unusual layer named hyperDense. This non-standard name immediately raises suspicion.
- Visualizing the Model with Netron:
  - Purpose: A visual representation of the model architecture often provides a quicker and more intuitive understanding than text output alone. Netron is a web-based tool for visualizing neural network models.
  - Action: Upload the malicious.h5 file to https://netron.app/.
  - Observation (Crucial Insight): Netron renders the model graph. Navigating through the layers, we locate the hyperDense layer. Inspecting its properties in Netron reveals that:
    - It is a Lambda layer. This is significant because Lambda layers in Keras allow for arbitrary code execution during model loading or inference.
    - It has two associated Lambda functions: one for the main function and one for the output shape function.
    - Crucially, both Lambda function configurations contain base64 encoded strings under the "code" parameter. This is a major red flag, strongly suggesting hidden code within the model.

    (Screenshot placeholder: Netron with the hyperDense layer selected, highlighting the base64 encoded code in the Lambda function configuration.)
- Examining the hyperDense Layer Configuration Programmatically:
  - Purpose: While Netron visually identified the base64 encoded code, we need to extract it programmatically for further analysis. We'll use h5py again to access the model_config attribute of the H5 file, which contains the model's JSON configuration, including the Lambda layer details.
  - Action: Use a Python script to read the model_config attribute and parse the JSON to extract the base64 encoded code from the hyperDense layer's Lambda function configuration.

        import h5py
        import json

        def extract_lambda_code(file_path):
            """Return the base64 'code' strings of the hyperDense layer, or (None, None)."""
            with h5py.File(file_path, 'r') as f:
                if 'model_config' not in f.attrs:
                    return None, None
                model_config = f.attrs['model_config']
                # Depending on the h5py version the attribute is bytes or str
                if isinstance(model_config, bytes):
                    model_config_str = model_config.decode('utf-8', errors='ignore')
                else:
                    model_config_str = str(model_config)
            model_config_json = json.loads(model_config_str)

            for layer in model_config_json['config']['layers']:
                layer_config = layer['config']
                if layer_config['name'] == 'hyperDense':
                    # Assumed layout: 'function' and 'output_shape' sit next to 'name'
                    # in the layer config, each as {"config": {"code": <base64>}}
                    lambda_code_b64 = layer_config['function']['config']['code']
                    output_shape_code_b64 = layer_config['output_shape']['config']['code']
                    print("Hyperdense function code (base64):")
                    print(lambda_code_b64)
                    print("\nOutput shape function code (base64):")
                    print(output_shape_code_b64)
                    return lambda_code_b64, output_shape_code_b64
            return None, None

        lambda_code_b64, output_shape_code_b64 = extract_lambda_code('malicious.h5')

  - Observation: Running this script extracts the base64 encoded strings for both the hyperDense function and its output shape function, confirming programmatically what we saw in Netron.
- Decoding the Base64 Encoded Lambda Function Code:
  - Purpose: Now that we have the base64 encoded code, the next step is to decode it to understand what it does. We expect a marshalled Python code object (bytecode plus its constants), since that is how Keras serializes the function of a Lambda layer.
  - Action: Use Python's base64 module to decode the extracted string. Decoding the resulting bytes as latin-1 maps every byte to a character, so it never fails on binary data; UTF-8 can be tried as well, but it will not decode arbitrary bytes cleanly.

        import base64

        # Base64 string extracted from the hyperDense layer's function entry
        lambda_code_b64 = (
            "4wEAAAAAAAAAAAAAAAQAAAADAAAA8zYAAACXAGcAZAGiAXQBAAAAAAAAAAAAAGQCpgEAAKsBAAAA"
            "AAAAAAB8AGYDZAMZAAAAAAAAAAAAUwApBE4pGulIAAAA6VQAAADpQgAAAOl7AAAA6WsAAADpMwAA"
            "AOlyAAAA6TQAAADpUwAAAOlfAAAA6UwAAAByCQAAAOl5AAAAcgcAAAByCAAAAHILAAAA6TEAAADp"
            "bgAAAOlqAAAAcgcAAADpYwAAAOl0AAAAcg4AAADpMAAAAHIPAAAA6X0AAAD6JnByaW50KCdZb3Vy"
            "IG1vZGVsIGhhcyBiZWVuIGhpamFja2VkIScp6f////8pAdoEZXZhbCkB2gF4cwEAAAAg+h88aXB5"
            "dGhvbi1pbnp1dC02OS0zMjhhYjc5ODJiNGY++gg8bGFtYmRhPnIaAAAADgAAAHM0AAAAgADwAgEJ"
            "SAHwAAEJSAHwAAEJSAHlCAzQDTXRCDbUCDbYCAnwCQUPBvAKAAcJ9AsFDwqAAPMAAAAA"
        )
        decoded_lambda_code_bytes = base64.b64decode(lambda_code_b64)
        # latin-1 is a one-to-one byte-to-character mapping
        decoded_lambda_code_ascii = decoded_lambda_code_bytes.decode('latin-1')
        print("Decoded Lambda function (latin-1):")
        print(decoded_lambda_code_ascii)

  - Observation (Flag Discovery): The latin-1 output shows readable characters interspersed with binary data. Read naively, the printable characters spell HTB{k3r4S_Lryrrr1njrctr0r}, but that is not the flag. The constants are stored as a tuple of integer code points, and wherever a value repeats, marshal writes a back-reference (the byte r followed by a four-byte index) instead of the value itself. Resolving those references (index 7 is 3, 8 is r, 9 is 4, 11 is _, 14 is 1, 15 is n) gives the flag: HTB{k3r4S_L4y3r_1nj3ct10n}. The output also contains the string print('Your model has been hijacked!') and the name eval. This confirms the malicious nature of the hyperDense layer.
- Flag Confirmation and Malicious Code Analysis:
  - Purpose: Verify the extracted flag and understand the malicious code's intent.
  - Action: Manually check that HTB{k3r4S_L4y3r_1nj3ct10n} is accepted as the flag for the challenge (the extra r characters in the raw latin-1 dump are marshal back-reference markers, not flag characters). Analyze the decoded payload further, looking for recognizable strings and Python bytecode structures, to solidify the understanding of the hijack mechanism. The constants include print('Your model has been hijacked!') and the referenced names include eval, so the Lambda evaluates that string whenever the layer runs. The model is therefore designed to execute embedded code and display a message indicating compromise.
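The raw latin-1 dump is easy to misread, because marshal replaces repeated constants with back-references. The following is a minimal sketch of how those references can be resolved without calling marshal.loads, which only accepts payloads from a matching interpreter version. It assumes the header layout used by CPython 3.11 and later (five 32-bit fields before the bytecode), which the field values in this payload are consistent with, and it handles only the object types that occur before the end of the constants tuple. The helper name read_consts is hypothetical.

```python
import base64
import struct

# Base64 payload copied from the hyperDense function entry shown earlier.
PAYLOAD_B64 = (
    "4wEAAAAAAAAAAAAAAAQAAAADAAAA8zYAAACXAGcAZAGiAXQBAAAAAAAAAAAAAGQCpgEAAKsBAAAA"
    "AAAAAAB8AGYDZAMZAAAAAAAAAAAAUwApBE4pGulIAAAA6VQAAADpQgAAAOl7AAAA6WsAAADpMwAA"
    "AOlyAAAA6TQAAADpUwAAAOlfAAAA6UwAAAByCQAAAOl5AAAAcgcAAAByCAAAAHILAAAA6TEAAADp"
    "bgAAAOlqAAAAcgcAAADpYwAAAOl0AAAAcg4AAADpMAAAAHIPAAAA6X0AAAD6JnByaW50KCdZb3Vy"
    "IG1vZGVsIGhhcyBiZWVuIGhpamFja2VkIScp6f////8pAdoEZXZhbCkB2gF4cwEAAAAg+h88aXB5"
    "dGhvbi1pbnp1dC02OS0zMjhhYjc5ODJiNGY++gg8bGFtYmRhPnIaAAAADgAAAHM0AAAAgADwAgEJ"
    "SAHwAAEJSAHwAAEJSAHlCAzQDTXRCDbUCDbYCAnwCQUPBvAKAAcJ9AsFDwqAAPMAAAAA"
)

FLAG_REF = 0x80  # high bit of a type byte: remember this object for later references


def read_consts(data):
    """Parse just enough of a marshalled code object to return its co_consts tuple."""
    pos = 0
    refs = []  # objects whose type byte carried FLAG_REF, in stream order

    def read_u32():
        nonlocal pos
        (value,) = struct.unpack_from("<I", data, pos)
        pos += 4
        return value

    def read_object():
        nonlocal pos
        tag = data[pos]
        pos += 1
        kind, remember = chr(tag & 0x7F), bool(tag & FLAG_REF)
        if kind == "r":  # back-reference: the next 4 bytes index into refs
            return refs[read_u32()]
        if kind == ")":  # small tuple: 1-byte length, slot reserved before the items
            count = data[pos]
            pos += 1
            slot = len(refs)
            if remember:
                refs.append(None)
            value = tuple(read_object() for _ in range(count))
            if remember:
                refs[slot] = value
            return value
        if kind == "N":  # None
            value = None
        elif kind == "i":  # signed 32-bit integer
            (value,) = struct.unpack_from("<i", data, pos)
            pos += 4
        elif kind == "s":  # bytes with a 4-byte length
            size = read_u32()
            value = data[pos:pos + size]
            pos += size
        elif kind in ("z", "Z"):  # short ASCII string with a 1-byte length
            size = data[pos]
            pos += 1
            value = data[pos:pos + size].decode("ascii")
            pos += size
        else:
            raise ValueError(f"unsupported marshal type {kind!r} at offset {pos - 1}")
        if remember:
            refs.append(value)
        return value

    tag = data[pos]
    pos += 1
    if chr(tag & 0x7F) != "c":
        raise ValueError("payload does not start with a code object")
    if tag & FLAG_REF:
        refs.append(None)  # the code object itself takes the first reference slot
    pos += 5 * 4  # skip argcount, posonlyargcount, kwonlyargcount, stacksize, flags
    read_object()  # co_code (raw bytecode), not needed here
    return read_object()  # co_consts


consts = read_consts(base64.b64decode(PAYLOAD_B64))
flag = "".join(chr(code_point) for code_point in consts[1])
print(flag)       # the tuple of code points, joined into a string
print(consts[2])  # the string that the lambda passes to eval
```

Under these assumptions the sketch prints HTB{k3r4S_L4y3r_1nj3ct10n} followed by print('Your model has been hijacked!'). Parsing the bytes by hand, rather than unmarshalling them, also avoids constructing a code object from untrusted input.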
Conclusion:
Through a combination of structural inspection using h5py and visual analysis with Netron, we identified a suspicious hyperDense Lambda layer in the malicious.h5 file. Netron proved invaluable in quickly pinpointing the base64 encoded code within the Lambda layer's configuration. Decoding this base64 string and resolving the marshal back-references among its constants revealed the hidden flag, HTB{k3r4S_L4y3r_1nj3ct10n}, and confirmed the presence of malicious code designed to hijack the model upon loading. This challenge highlights the security risks associated with loading untrusted machine learning models and the potential for embedding malicious payloads within model files.
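One practical consequence is that a model's JSON configuration can be checked for Lambda layers before the model is ever loaded. The sketch below is one hedged way to do that with the standard library only. The function name find_lambda_layers and the sample configuration are hypothetical, and the sample assumes the same layout the extraction script relies on (a "code" entry nested under each serialized function); real files may differ.

```python
import json


def find_lambda_layers(model_config_json):
    """List (layer name, keys holding serialized code) for every Lambda layer."""
    model_config = json.loads(model_config_json)
    findings = []
    for layer in model_config.get("config", {}).get("layers", []):
        if layer.get("class_name") != "Lambda":
            continue
        layer_config = layer.get("config", {})
        # Collect the config keys whose value embeds a serialized function
        code_keys = sorted(
            key
            for key, value in layer_config.items()
            if isinstance(value, dict)
            and isinstance(value.get("config"), dict)
            and "code" in value["config"]
        )
        findings.append((layer_config.get("name"), code_keys))
    return findings


# Hypothetical, heavily trimmed configuration used for illustration only.
SAMPLE_CONFIG = json.dumps({
    "class_name": "Functional",
    "config": {
        "layers": [
            {"class_name": "Dense", "config": {"name": "dense_1", "units": 8}},
            {
                "class_name": "Lambda",
                "config": {
                    "name": "hyperDense",
                    "function": {"config": {"code": "<base64 placeholder>"}},
                    "output_shape": {"config": {"code": "<base64 placeholder>"}},
                },
            },
        ]
    },
})

findings = find_lambda_layers(SAMPLE_CONFIG)
print(findings)
```

In practice the JSON string would come from the model_config attribute read with h5py, as in the extraction step. Any non-empty result is a reason to inspect the file statically instead of loading it.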
Flag: HTB{k3r4S_L4y3r_1nj3ct10n} (with the marshal back-references resolved)