I am writing a tool that hooks every printf call in a program and dumps the program's call stack. I don't care about printf calls made inside external libraries, so I wrap the program's printf@plt stub with DynamoRIO. I use ./api/samples/callstack.cpp as a template and changed its module_load_event function to the following:
static void
module_load_event(void *drcontext, const module_data_t *mod, bool loaded)
{
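    size_t modoffs;
    /* Only the module that defines "main" (the application itself) is wrapped;
     * the returned modoffs is not used. */
    drsym_error_t sym_res = drsym_lookup_symbol(
        mod->full_path, "main", &modoffs, DRSYM_DEMANGLE);
    if (sym_res == DRSYM_SUCCESS) {
        /* [printf@plt offset] is a placeholder for the stub's offset in this module. */
        app_pc towrap = mod->start + [printf@plt offset];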
        bool ok = drwrap_wrap(towrap, wrap_pre, NULL);
        DR_ASSERT(ok);
        dr_fprintf(STDERR, "wrapping %s!%s\n", mod->full_path,
                   "printf@plt");
    }
}
The value of [printf@plt offset] is read from the objdump disassembly of the program.
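For reference, here is a minimal sketch of how that offset could be extracted from `objdump -d` output instead of being copied by hand. The helper name `parse_objdump_symbol` is hypothetical and not part of DynamoRIO. It assumes the usual symbol header format, such as `0000000000001030 <printf@plt>:`, and a position-independent executable, where the addresses objdump prints are relative to the load base, so that `mod->start + offset` is the right address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a DynamoRIO API): parse one line of `objdump -d`
 * output. If the line is the header of symbol `name`, for example
 * "0000000000001030 <printf@plt>:", store the address in *offset and
 * return 1. Return 0 for any other line. */
static int
parse_objdump_symbol(const char *line, const char *name, uint64_t *offset)
{
    char *end;
    unsigned long long addr;
    size_t len;

    assert(line != NULL && name != NULL && offset != NULL);

    addr = strtoull(line, &end, 16);
    len = strlen(name);

    if (end == line)
        return 0; /* line does not start with a hex address */
    if (end[0] != ' ' || end[1] != '<')
        return 0; /* not a "<symbol>:" header (e.g. an instruction line) */
    if (strncmp(end + 2, name, len) != 0)
        return 0; /* header of a different symbol */
    if (end[2 + len] != '>' || end[3 + len] != ':')
        return 0; /* name is only a prefix, e.g. "<printf@plt-0x10>" */

    *offset = (uint64_t)addr;
    return 1;
}
```

Each line of `objdump -d ./program` would be passed to this helper with the name `"printf@plt"` until it returns 1. For a non-PIE executable the printed address is absolute, so it would have to be used directly rather than added to `mod->start`.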
Then I wrote the following program to test the stack-walk tool:
#include <stdio.h>
#include <stdlib.h>
void recursive_call(int depth) {
    if (depth <= 0) {
        printf("Recursion end\n");
        return;
    }
    printf("Recursion depth: %d %p\n", depth, malloc(depth));
    recursive_call(depth - 1);
}
int main() {
    printf("Start\n");
    recursive_call(5);
    return 0;
}
When the program is compiled with gcc, the tool prints the correct stack trace:
printf@plt called from:
  gcc-clang-example!<unknown> // printf@plt is not an actual function symbol, so it prints <unknown>
  gcc-clang-example!recursive_call
  gcc-clang-example!main
  libc.so.6!__libc_start_call_main
  libc.so.6!__libc_start_main_alias_2
  gcc-clang-example!<unknown>
However, when the program is compiled with clang, the trace is cut short and contains bogus frames:
printf@plt called from:
  clang-example!<unknown>
  <unknown module> @0x0000000500000005
  <unknown module> @0x00007ffec5126140
The detailed trace log is attached.