If you're trying to sideload traced_perf onto an older Android build (and can run it as root), I'd consider using the "tracebox" target, which bundles all the daemons into a single binary that you run the same way as the "perfetto" command-line interface. For traced_perf to be included in tracebox, the build config needs enable_perfetto_traced_perf=true, as you say. The gn variables are a bit clumsy to work with, but the following should work:
1) pull the newest "master" branch of the perfetto git repo
2) tools/install-build-deps --android
3) tools/build_all_configs.py --android # (will generate a bunch of predefined gn configurations in out/, won't actually build anything yet)
4) ninja -C out/android_release_incl_heapprofd_arm64 tracebox # (this has the right gn vars for enable_perfetto_traced_perf)
5) adb push out/android_release_incl_heapprofd_arm64/tracebox /data/local/tests/tracebox
6) adb shell su root /data/local/tests/tracebox -c /data/local/tmp/cfg --txt -o /data/local/tmp/trace # (assuming you've pushed the config to that path)
Example config for two hw counters: https://pastebin.com/raw/H1CC6Ub7
For local counter recording I'd recommend using "period" with a reasonable value, as in this example, instead of "frequency", since the latter will give you less predictable sampling intervals. (If you're also interested in callstacks, be aware that standalone traced_perf won't handle interpreted ART Java frames, since it doesn't link in the necessary dexfile libraries.)
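In case the pastebin link is unavailable, here is a rough sketch of what a two-counter config using "period" might look like, written out by a small shell snippet. This is not a copy of the linked example. The field names (perf_event_config, timebase, counter, period) are from my reading of perfetto's PerfEventConfig proto, and the counter choices, period values, buffer size, duration, and the local file name perf_counters.cfg are all placeholder assumptions to adjust for your case.

```shell
#!/bin/sh
# Sketch only: writes a text-format trace config with two hardware counters,
# each sampled every fixed number of events ("period") rather than by "frequency".
# Counter names, period values, buffer size, and duration are assumed placeholders.
CFG=perf_counters.cfg

cat > "$CFG" <<'EOF'
duration_ms: 10000

buffers {
  size_kb: 10240
  fill_policy: RING_BUFFER
}

# First counter: CPU cycles, one sample per 1,000,000 cycles (assumed value).
data_sources {
  config {
    name: "linux.perf"
    perf_event_config {
      timebase {
        counter: HW_CPU_CYCLES
        period: 1000000
      }
    }
  }
}

# Second counter: retired instructions, same fixed period (assumed value).
data_sources {
  config {
    name: "linux.perf"
    perf_event_config {
      timebase {
        counter: HW_INSTRUCTIONS
        period: 1000000
      }
    }
  }
}
EOF

echo "wrote $CFG"
```

You would then push it to the path used in step 6, e.g. adb push perf_counters.cfg /data/local/tmp/cfg. If the config is rejected on your build, check the field names against the proto definitions in the checkout you built from.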
Do tell if this works or if you have other questions. I understand that the documentation surrounding this feature is lacking.