Akka HTTP path matching fundamentally broken? Surely not?


Alan Burlison

Mar 17, 2017, 7:25:09 AM
to Akka User List
I'm sure I must be missing something here because I can't believe path
matching in Akka HTTP could be broken in the way it seems to be, because
it would be unusable for anything other than toy applications if it was:

I'm composing route handling from a top-level handler and sub-handlers
like this:

pathPrefix("root") {
  concat(
    pathPrefix("service1") { service1.route },
    pathPrefix("service2") { service2.route }
  )
}

where service1.route etc returns the sub-route for the associated
sub-tree. That works fine with a path of say "/root/service1", but it
*also* matches "/rootnotroot/service1", because pathPrefix() just
matches any arbitrary string prefix and not a full path segment. And if
I use path() instead of pathPrefix() it tries to match the entire
remaining path. What I'm looking for is something along the lines of
segment() where that fully matches just the next path segment and leaves
the remaining path to be matched by inner routes, but there doesn't seem
to be such a thing.

What am I missing?

Thanks,

--
Alan Burlison
--

Alan Burlison

Mar 17, 2017, 7:47:59 AM
to Akka User List
> pathPrefix("root") {

I can bodge around this with:

pathPrefix("^root$".r)

but that's unspeakably vile.

--
Alan Burlison
--

Akka Team

Mar 17, 2017, 8:12:51 AM
to Akka User List
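Did you read the docs about the various path directives and how they differ?
http://doc.akka.io/docs/akka-http/10.0.4/scala/http/routing-dsl/directives/path-directives/index.html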

--
Johan
Akka Team

Alan Burlison

Mar 17, 2017, 8:25:52 AM
to akka...@googlegroups.com
On 17/03/2017 12:12, Akka Team wrote:

> Did you read the docs about the various path directives and how they differ?
> http://doc.akka.io/docs/akka-http/10.0.4/scala/http/routing-dsl/directives/path-directives/index.html

Yes, over and over, because as I said I was sure I must be missing
something.

I actually think it might be a bug - I'm in the middle of trying to
figure out exactly where but it looks like the URI handling under the
DSL splits the URI into segments and then matches the routing directives
against it in turn. In the case of a string it looks like it is
comparing the string to the path segment with startsWith instead of
equals, so it is checking if string is a _prefix_ of the next segment
rather than the _entirety_ of the next path segment.

If someone wanted to match the next segment against a string prefix then
they could use a RE, e.g. "foo(.*)".r would match "foobar", "foobaz" etc
and extract "bar" and "baz".

Currently a string of "foo" will *also* match "foobar", "foobaz" but
won't extract the "bar" or "baz" suffix, which seems almost certainly
not what you'd want.
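
The difference can be seen by applying the matchers directly to a Uri.Path, without going through a route. This is a minimal sketch, assuming Akka HTTP 10.0.x and its implicit String and Regex conversions to PathMatcher; the object and value names are made up for illustration.

```scala
import akka.http.scaladsl.model.Uri.Path
import akka.http.scaladsl.server.{PathMatcher0, PathMatcher1}
import akka.http.scaladsl.server.PathMatcher.{Matched, Unmatched}
import akka.http.scaladsl.server.Directives._

object PrefixMatchDemo {
  // A plain string becomes a matcher that only checks a prefix of the segment.
  val literal: PathMatcher0 = "foo"

  // A regex with one capture group extracts that group from the segment.
  val withSuffix: PathMatcher1[String] = "foo(.*)".r

  // Both succeed on "foobar": the literal leaves "bar" in the unmatched path,
  // while the regex consumes the segment and hands "bar" to the caller.
  val literalResult = literal(Path("foobar"))
  val regexResult   = withSuffix(Path("foobar"))

  // A segment that does not start with "foo" is rejected by both.
  val literalMiss = literal(Path("bar"))
}
```

The literal match carries no extraction, so the "bar" suffix is only visible through the unmatched path.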

--
Alan Burlison
--

Alan Burlison

Mar 17, 2017, 9:25:58 AM
to akka...@googlegroups.com
> I actually think it might be a bug - I'm in the middle of trying to figure
> out exactly where

PathMatcher.scala, line 145:

def apply[L: Tuple](prefix: Path, extractions: L): PathMatcher[L] =
  if (prefix.isEmpty) provide(extractions)
  else new PathMatcher[L] {
    def apply(path: Path) =
      if (path startsWith prefix) Matched(path dropChars prefix.charCount, extractions)(ev)
      else Unmatched
  }

I believe "startsWith" should be "==", otherwise it is matching the
prefix of a segment which is in turn the prefix of a path, not a
segment which is a prefix of the path.

From looking at the examples of how the pathPrefix directive is used,
it commonly takes a series of path matchers with "/" separators, where
segments that need to be matched with no extractions or conditionals
are represented by strings. However strings currently do *not* match
entire path segments, they match *prefixes* of path segments and there
appears to be no way to do exact matching of fixed path segments other
than the regexp hack I outlined in my previous email. You could I
suppose say that the current behaviour is as intended, in which case
I'd suggest it is surprising and unintuitive - if I put a literal
"foo" I expect it to match "foo" and not "foobar". It's even more
surprising because there is a mechanism (REs) that's explicitly for
matching "foobar", "foobaz" and extracting the variable part of the
segment, which you'd almost certainly want to do anyway for subsequent
routing logic.

If I have a match against "foo" followed by a slash, followed by "bar"
and I provide it with a path of "foofoo/bar" and try and synthesise a
useful error message from the failure with extractUnmatchedPath I get
a string "foo/bar", which is confusingly close to the real path of
"/foo/bar". I can imagine the WTF? complaints from users that will
cause...
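
A sketch of the kind of route that produces such a message, assuming the Akka HTTP 10.0.x Scala DSL. The object name and the response texts are hypothetical. For a request to "/foofoo/bar" the outer pathPrefix("foo") succeeds, the inner pathPrefix("bar") is rejected because the remaining path does not start with a slash, and the fallback then sees the leftover path described above.

```scala
import akka.http.scaladsl.model.StatusCodes
import akka.http.scaladsl.server.Route
import akka.http.scaladsl.server.Directives._

object UnmatchedPathDemo {
  val route: Route =
    pathPrefix("foo") {
      concat(
        // Requires a slash and then a segment starting with "bar".
        pathPrefix("bar") {
          complete("matched foo/bar")
        },
        // Fallback: report whatever part of the path is still unmatched here.
        extractUnmatchedPath { rest =>
          complete(StatusCodes.NotFound -> s"No route for remaining path '$rest'")
        }
      )
    }
}
```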

Even if this *is* working as intended and won't be fixed (which I
personally believe would be a bad choice), would it be possible to add
a matcher DSL item that specifically matched a complete path segment?
There's already 'Segment', could that be extended to take a segment to
fully match against, e.g. 'Segment("foo")' ?

--
Alan Burlison
--

Daniel Stoner

Mar 17, 2017, 5:25:44 PM
to Akka User List
I think perhaps you're making the same mistake I did when I first picked up Akka HTTP, coming from a background with Jersey.
What I wanted to do was nest paths inside each other, thinking that:
path("v1") {
  path("orders") {whatever}
  path("customers") {whatever}
}

would be the perfect syntax for setting up the two paths:
/v1/orders/...
/v1/customers/...
in a really easy to comprehend and sensible manner, i.e. each sub-path prefix only matching an individual segment, as you describe.

In practice it appears to work more as though each nested path operates not on the remaining unmatched path but on the whole URL. In a sense the above code is saying: match paths that look like both START - /v1/ - END and START - /orders/ - END. Obviously an impossible situation.

So what we do is simply enumerate the paths we want to match (rather than nest them) and use the PathMatcher DSL to avoid any real overhead in the code. In the end it's just as readable (if not a little more so, since you're explicitly enumerating your matching possibilities rather than nesting them).

What does this look like? (In Java)
public static final String PATH_ORDERS = "orders";
public static final String PATH_VERSION = "v2";
private static final PathMatcher1<UUID> PATH_PARAM_UUID = PathMatchers.uuidSegment();

route(
  path(
    PathMatchers.segment(PATH_VERSION).slash(PATH_ORDERS),
    () -> put(...-> createOrder())
  ),
  path(
    PathMatchers.segment(PATH_VERSION).slash(PATH_ORDERS).slash(PATH_PARAM_UUID),
    orderId -> route(
      post(...-> changeOrderStatus(orderId)),
      get(...-> getOrder(orderId))
    )
  )
)


The code above is representative only and written against an older version of Akka HTTP, but it is much the same in the current one. The ...-> syntax stands in for what methods like getOrder are doing (they just return Routes which narrow things down further by GET/POST/similar). Sure, we could have made it a lot nicer by defining and reusing:
PathMatcher0 VERSION = PathMatchers.segment(PATH_VERSION)
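
For readers using the Scala DSL, the same enumerated style might look roughly like this. It is a sketch assuming Akka HTTP 10.0.x; the object name and the response strings are hypothetical stand-ins for createOrder, changeOrderStatus and getOrder, which are not shown in the thread.

```scala
import akka.http.scaladsl.server.{PathMatcher0, Route}
import akka.http.scaladsl.server.Directives._

object EnumeratedRoutes {
  // Reusable matcher for the common "/v2/orders" part of every path.
  val ordersBase: PathMatcher0 = "v2" / "orders"

  val route: Route =
    concat(
      // Matches exactly /v2/orders
      path(ordersBase) {
        put { complete("order created") }
      },
      // Matches exactly /v2/orders/<uuid> and extracts the UUID
      path(ordersBase / JavaUUID) { orderId =>
        concat(
          post { complete(s"status of order $orderId changed") },
          get { complete(s"order $orderId") }
        )
      }
    )
}
```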

One practice we found that worked: do the splitting by HTTP method (GET/POST/PUT) down at the lowest level. Often you have a URL like /v1/orders that accepts GET, PUT and POST, and then you can nest all 3 options under one path matcher.

Can you do a fully nested style like v1/ { orders, customers }? Probably, using the wildcards like you suggest, but I could never get it to work like that (I tried a lot) and I'm not sure if it's designed for that use case. Frankly, I ended up preferring the enumerated style and found it easier to manage than Jersey's nested API styling.

If you check out some of the Akka seed projects using Akka HTTP, they probably show better examples in your language of preference too :)

Activator is amazing for finding fully fledged examples of how to use things.

Alan Burlison

Mar 17, 2017, 7:40:48 PM
to akka...@googlegroups.com
On 17/03/2017 21:25, Daniel Stoner wrote:

> What I wanted to do - was nest paths inside of each other thinking:
> path("v1") {
> path("orders") {whatever}
> path("customers") {whatever}
> }

Yes, exactly so.

> In practice it appears that it works more as though each nested path does
> not operate on the remaining unmatched path - but actually the whole url.
> In a sense the above code is saying match paths that both look like: START
> - /v1/ - END AND START - /orders/ - END. Obviously an impossible situation.

I believe that's why you need to use "~" or concat, so the alternatives
are tried in order.
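
A minimal sketch of the nested form, assuming the Akka HTTP 10.0.x Scala DSL; the object name and response strings are hypothetical. The outer directive is pathPrefix rather than path, because path requires the whole remaining path to be consumed, whereas pathPrefix leaves the rest for the inner routes.

```scala
import akka.http.scaladsl.server.Route
import akka.http.scaladsl.server.Directives._

object NestedRoutes {
  val route: Route =
    // Consumes "/v1" and leaves the rest of the path for the inner routes.
    pathPrefix("v1") {
      // The alternatives are tried in order until one of them accepts.
      concat(
        path("orders") { complete("orders") },       // exactly /v1/orders
        path("customers") { complete("customers") }  // exactly /v1/customers
      )
    }
}
```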

> So what we do is simply enumerate the paths we want to match for (rather
> than nest them) and use PathMatcher DSL to enable us to avoid any real
> overhead to the code for this. In the end it's just as readable (If not a
> little bit more so since your enumerating explicitly your matching
> possibilities rather than nesting them).

The problem is I have hundreds of paths to match, listing each
individual path in full in a linear fashion is just unworkable, I need
to use a tree.

> One practices we found that worked - do the splitting by HTTP method
> (GET/POST/PUT) down at the lowest level. Often you have a URL like
> /v1/orders that accepts GET, PUT and POST and then you can nest all 3
> options under one path matcher.

Yes, that's what I've done.

I really believe the current behaviour is the wrong choice, although
I've found hints in the documentation that it is deliberate - I have no
idea why it was considered to be a good design, if indeed it was.

I'm trying to figure out how to write my own implementation of
Segment("foo") to allow complete matching of a path segment, but it's
not easy as the DSL implementation is pretty "dense" code.

--
Alan Burlison
--

Daniel Stoner

Mar 17, 2017, 9:16:18 PM
to Akka User List
I guess you have to remember that Akka-HTTP originates from Spray - and so those choices were likely already made. (I'm sure there is a fully plausible performance threading reason that is beyond me too hehe).

Well, I know how I'd do nesting in Java at least, if it's any help.

Implementing a custom directive is easy! The RequestContext which is passed into every layer of your route (presumably it's an implicit value in Scala?) can be extended and passed around with the additional info you might need, such as a slowly growing list of path segment strings. You can then signal that you're at a leaf node of your tree by calling end or the like, and return a single Path with the full linear evaluation of that point in the tree recursion.

So how I'd do it is implement my own directive called Segment, maybe a little like this:

public abstract class SegmentDirectives extends AllDirectives {

  public Route segment(String segment, Route inner) {
    return this.mapRequestContext(innerCtx -> {
      // You could put some logic here: you control the RequestContext which gets passed to the child,
      // and hence whether the lower-level routes get invoked or not, based on what the full path is.
      return new RequestContext(new MySuperNewRequestContextWhichObviouslyImplementsRequestContextInterface(innerCtx.delegate(), segment));
    }, () -> inner);
  }
}


Your new RequestContext impl could then just keep a List<String> - that it made available as a PathMatcher when you chose to 'finish' your tree like:
public Route endSegment(Route innerRoute) {
  return this.mapRequestContext(innerCtx -> {
    if (innerCtx instanceof MySuperNewRequest.....) {
      return path(((MySuperNewRequest) innerCtx).getPathMatcherForBuiltPaths, innerRoute);
    }
  });
}

Then simply do a route a little like:
segment("v1", { segment("orders", {end({whatever})}), segment("customers", {end({whatever})}) })

Well, obviously the 'this is how I'd do it' bit is a lie - I wouldn't do it. I'd probably ask myself why I had hundreds of APIs and wanted to list them all in one mega file using a nested tree that presumably is either going to flip-flop all over the classes in the project, or move further and further to the right of the screen as it becomes deeper.

I know the reality of software development is that you generally get stuck with tough situations like that from historical decisions, so fair enough if it's really required. At the end of the day though, even writing 300 linear APIs reusing PathMatcher variables preconfigured for the most common situations can end up being about the same amount of code as the nested equivalent. I know it doesn't feel like it initially, but keep faith! :)

For context, our largest service has around 10 classes which implement Route, and each class has about 6 APIs in it. All of these are pulled in using Guice multi-binding, meaning I end up with 10 beautifully crafted, readable classes called things like V1OrdersRoute, V2OrdersRoute and V1CustomersRoute. They all have the same OAUTH2 authentication protections and error logging applied when the multi-binding is injected and connected up to the Route flow on the HTTP server, ensuring no-one goes without good security protocols or basic access logging. In my HTTPServer I simply put @Inject private MultiBinder<Route> allMyRoutes and attach it into a tree with the above stated requirements.


Alan Burlison

Mar 20, 2017, 10:39:04 AM
to akka...@googlegroups.com
On 18 March 2017 at 01:16, Daniel Stoner <daniel...@ocado.com> wrote:

> I guess you have to remember that Akka-HTTP originates from Spray - and so
> those choices were likely already made. (I'm sure there is a fully plausible
> performance threading reason that is beyond me too hehe).

Yes, I suspect it is too late to change the behaviour, no matter how
odd it seems to be - although I haven't seen any comment from the Akka
HTTP people yet.

> Implementing a custom directive is easy!

I don't think a directive is the right way to do this, it needs to be
added to the existing PathMatcher DSL so it can be used with the path
and pathPrefix directives, and I'm not sure of how to do that.

--
Alan Burlison
--

Alan Burlison

Mar 20, 2017, 11:37:59 AM
to akka...@googlegroups.com
> I don't think a directive is the right way to do this, it needs to be
> added to the existing PathMatcher DSL so it can be used with the path
> and pathPrefix directives, and I'm not sure of how to do that.

This seems to work:

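import akka.http.scaladsl.model.Uri.Path
import akka.http.scaladsl.server.PathMatcher
import akka.http.scaladsl.server.PathMatcher.{Matched, Unmatched}

// Matches only if the next path segment is exactly `segment`
def seg(segment: String) =
  new PathMatcher[Unit] {
    def apply(path: Path) =
      if (!path.isEmpty && path.head == segment) Matched(path.tail, ()) else Unmatched
  }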

Then you can say:

pathPrefix(seg("someseg"))
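
A variant of the same idea written with a pattern match, so that an empty remaining path is rejected instead of calling head on it. This is a sketch assuming Akka HTTP 10.0.x; exactSegment and ExactSegment are hypothetical names and not part of the Akka HTTP API.

```scala
import akka.http.scaladsl.model.Uri.Path
import akka.http.scaladsl.server.{PathMatcher, PathMatcher0}
import akka.http.scaladsl.server.PathMatcher.{Matched, Unmatched}

object ExactSegment {
  // Hypothetical helper: succeeds only if the next segment equals `name` exactly.
  def exactSegment(name: String): PathMatcher0 =
    new PathMatcher[Unit] {
      def apply(path: Path): PathMatcher.Matching[Unit] = path match {
        // The whole segment must equal `name`; the rest stays unmatched.
        case Path.Segment(`name`, rest) => Matched(rest, ())
        // Anything else (longer segment, leading slash, empty path) fails.
        case _                          => Unmatched
      }
    }
}
```

It is used the same way, e.g. pathPrefix(ExactSegment.exactSegment("root")) { ... }, since pathPrefix consumes the leading slash before the matcher runs.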

If that gets a nod of approval from the Akka HTTP folks I'll figure
out how to log an RFE.

--
Alan Burlison
--

Johannes Rudolph

Mar 20, 2017, 11:57:51 AM
to Akka User List
Hi Alan,

On Friday, March 17, 2017 at 12:25:09 PM UTC+1, Alan Burlison wrote:
> pathPrefix("root") {
>    concat(
>       pathPrefix("service1") { service1.route },
>       pathPrefix("service2") { service2.route }
>    )
> }
>
> That works fine with a path of say "/root/service1", but it
> *also* matches "/rootnotroot/service1", because pathPrefix() just
> matches any arbitrary string prefix and not a full path segment.

No, it doesn't match `/rootnotroot/service1` (or at least it should not). You are right that `pathPrefix(string)` only matches a prefix of the string, but it also matches a leading slash. So, if you are using the recommended `pathPrefix("xyz") { pathPrefix("abc") { path("innerMost") } }` pattern, then the problem you describe is never a problem, as the only path that is matched is `/xyz/abc/innerMost`.

The only place where it would matter is if you complete a route from a place where not the whole path is expected to be matched, i.e. there's still something left in `ctx.unmatchedPath`. Is that what you are doing?

Johannes

Alan Burlison

Mar 20, 2017, 12:21:13 PM
to akka...@googlegroups.com
On 20 March 2017 at 15:57, 'Johannes Rudolph' via Akka User List
<akka...@googlegroups.com> wrote:

> The only place where it would matter is if you complete a route from a place
> where not the whole path is expected to be matched, i.e. there's still
> something left in `ctx.unmatchedPath`. Is that what you are doing?

Yes, you are correct, I've played around with it some more and it's
only an issue if there is a trailing end to the path that includes the
last matched segment as a prefix and you don't add a
pathEndOrSingleSlash. However I think there is still the confusing
"foofoo/bar" case that I mentioned in another mail when matched
against "foo" and "bar".
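
A sketch of closing off the matched path with pathEndOrSingleSlash, assuming the Akka HTTP 10.0.x Scala DSL; the object name and response string are hypothetical.

```scala
import akka.http.scaladsl.server.Route
import akka.http.scaladsl.server.Directives._

object ClosedRoute {
  val route: Route =
    pathPrefix("root") {
      pathPrefix("service1") {
        // Without this directive, "/root/service1extra" would also reach
        // complete(...), because "service1" only has to be a prefix of the
        // segment. With it, only "/root/service1" and "/root/service1/" do.
        pathEndOrSingleSlash {
          complete("service1")
        }
      }
    }
}
```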

The "seg" PathMatcher conditional I posted earlier seems to work fine
and is "fail fast" which makes dealing with deep hierarchies a bit
easier, and possibly faster in the error case although I haven't
tested that. I think it would be a simple addition to make to the
PathMatcher DSL but that's not my call :-)

--
Alan Burlison
--