I think response time per API, as you're thinking about it, is not terribly useful once it's inside your service boundary.
Instead, think of it as a budget: your overall SLA is 500ms. When Service A calls Service B, it passes along something like "x-time-remaining: 351ms". When B calls C, that becomes "x-time-remaining: 125ms", and so forth, the numbers being actual calculated values based on elapsed time. If the time remaining reaches 0, you can abandon the request, trigger an alarm, or otherwise figure out where your time is being spent. There are huge classes of requests that can simply be abandoned once the remaining budget hits zero - why bother finishing work nobody will wait for? Your behavior here will obviously depend on your time distributions...
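A minimal sketch of the idea in Python - the service names, the `call_downstream` helper, and the header name are all illustrative, not a real framework:

```python
import time

def call_downstream(handler, deadline):
    """Invoke a downstream handler, propagating the remaining budget.

    `deadline` is an absolute monotonic-clock time. If the budget is
    already exhausted, abandon the request instead of doing work
    nobody will wait for.
    """
    remaining_ms = (deadline - time.monotonic()) * 1000
    if remaining_ms <= 0:
        # Budget hit zero: abandon rather than finish useless work.
        raise TimeoutError("x-time-remaining exhausted; abandoning request")
    # In a real service this would travel as an HTTP header, e.g.
    # headers["x-time-remaining"] = str(int(remaining_ms))
    return handler(remaining_ms)

def service_c(budget_ms):
    # The leaf service just reports what was left of the budget.
    return f"C ran with {budget_ms:.0f}ms remaining"

def service_b(budget_ms):
    # B does some work, then forwards the shrunken budget to C.
    deadline = time.monotonic() + budget_ms / 1000
    time.sleep(0.01)  # simulate 10ms of work in B
    return call_downstream(service_c, deadline)

# Service A: start the clock with the overall 500ms SLA.
sla_deadline = time.monotonic() + 0.5
print(call_downstream(service_b, sla_deadline))
```

Passing an absolute deadline (rather than a relative duration) means each hop recomputes what's left, so time spent in B automatically shrinks what C is allowed to use.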
Latency measurements are important at each stage, but only really matter close to the customer.
Cheers,
Chris