Thanks for jumping in so quickly and getting right down to some awesome questions!
We should be able to get back to you on these questions early next week.
--
First of all, congratulations on the NuGet v3.0 progress! For those new to the game, read up on http://blog.nuget.org/20140715/nuget-3.0-ctp1.html. I've been exploring some of the new API endpoints, and here's my understanding of how it all works, for future reference and to get our brains aligned.
- http://preview.nuget.org/ver3-ctp1/intercept.json - queried by the client to discover which endpoints are available for various purposes (a sketch of this lookup flow follows this list)
- http://preview.nuget.org/ver3-ctp1/packageregistrations/1/ - the base URL for resolving individual packages. For example Newtonsoft.Json lives at http://preview.nuget.org/ver3-ctp1/packageregistrations/1/newtonsoft.json.json and lists all versions and properties for that specific package ID
- https://api-search.nuget.org/search/query - searches are performed against this endpoint (I'd like to see some docs on how to query it and whether continuations will be used here :-))
- Several views on the data:
- http://preview.nuget.org/ver3-ctp1/islatest/segment_index.json - lists all segments of "islatest"
- http://preview.nuget.org/ver3-ctp1/islateststable/segment_index.json - lists all segments of "islateststable"
- http://preview.nuget.org/ver3-ctp1/allversions/segment_index.json - lists all segments of all packages
- Segments tell the client the range of packages (lowest to highest, alphabetically by ID) that can be found at the various segment URLs, for example http://nugetprod0.blob.core.windows.net/ver3-ctp1/allversions/segment_0.json
- A segment contains references to the package information documents (see Newtonsoft.Json above) as well as the id, version and description of each package
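To make the flow above a bit more concrete, here's a minimal sketch (Python, using the requests library) of how I imagine a client resolves a single package ID. The shape of intercept.json and which of its entries to use are assumptions on my part:

    import requests

    INTERCEPT_URL = "http://preview.nuget.org/ver3-ctp1/intercept.json"

    def resolve_package(package_id):
        # 1. Fetch intercept.json to discover which endpoints are available.
        intercept = requests.get(INTERCEPT_URL).json()

        # 2. A real client would read the registrations base URL out of the intercept
        #    document; it is hard-coded here because the key names aren't documented yet.
        registrations_base = "http://preview.nuget.org/ver3-ctp1/packageregistrations/1/"

        # 3. The per-package document lives at <base><lowercased id>.json and lists all
        #    versions and properties for that package ID.
        return requests.get(registrations_base + package_id.lower() + ".json").json()

    info = resolve_package("Newtonsoft.Json")  # fetches newtonsoft.json.json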
Do correct me if I'm wrong on the above, but this is the baseline for discussing how the new API seems to work. Again, congratulations on this: it's all pretty straightforward to consume and generate. However, I have some remarks up for discussion... I'll break them up into several bullets; if I have to split them into separate topics, let me know and I'll happily do that.
- Number of requests the client has to make on average
It seems to me that for every action the client performs, at least two HTTP requests have to be made: one for intercept.json and one for an endpoint listed in there. With search probably being the most widely used feature, I'd guess two requests for search plus one (or two, if you count redirects) for the download will be the average in the client (3-4 in total). Package restore will be quicker: intercept, packageregistration, package download.
Enumerating packages gets a bit more funky: intercept, segment index, segment, then one package registration request per package. That's a lot of traffic if I just want to list some packages somewhere without searching (a rough count follows below).
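To make that back-of-the-envelope count explicit (the packages-per-segment figure is made up; I don't know the real segment size):

    def enumeration_request_count(package_count, packages_per_segment=1000):
        # Assumed flow from above: intercept.json, then the segment index, then every
        # segment that covers the packages, then one package registration per package.
        intercept = 1
        segment_index = 1
        segments = -(-package_count // packages_per_segment)  # ceiling division
        registrations = package_count
        return intercept + segment_index + segments + registrations

    # Listing 50 packages without search: 1 + 1 + 1 + 50 = 53 requests.
    print(enumeration_request_count(50))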
- Authentication
Different URLs and lots of opportunity to split requests over domains and all. I love that from a scalability point of view!
But... what about authentication? Basic authentication will be a mess here if these endpoints are indeed split across multiple domains, at least basic authentication where the RFC is followed (https://www.ietf.org/rfc/rfc2617.txt; in short, the canonical root URL plus the auth realm identify a protected resource). A deviation could be to only use the realm and not the canonical URL, but then what about servers and clients that do follow the standard?
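A small illustration of the concern, with placeholder domains and credentials. The point is that a standards-following client scopes Basic credentials to the canonical root URL plus realm, so they don't automatically travel to a second domain:

    import requests
    from requests.auth import HTTPBasicAuth

    creds = HTTPBasicAuth("user", "secret")  # placeholder credentials

    # Same protection space the user authenticated against: fine.
    requests.get("https://feed.example.com/ver3/packageregistrations/1/foo.json", auth=creds)

    # Segment hosted on another domain (as with nugetprod0.blob.core.windows.net above).
    # A client following RFC 2617 has no reason to send the same credentials here, and
    # blindly sending them anyway would leak them outside the original protection space.
    requests.get("https://segments.example.net/ver3/allversions/segment_0.json")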
- Dynamic generation of segment files could be hard
Imagine having 100,000 packages. That's quite a few segment index and segment files to write out, but feasible to do every now and then. An open question from me: imagine a 100,001st package, AAAAA-1.0, is added. Will this have to become the first entry in the segment index? Or can it be appended to the segment index, saying lowest = AAAAA-1.0 and highest = whatever comes next?
If the answer is that it has to be first, then it's pretty crazy to have to regenerate the segment index and the relevant segments over and over again for every added package. In other words: is a full database dump really required on every mutation, or can it somehow be generated incrementally? (See the sketch below.)
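To illustrate the two options I have in mind (the segment index shape and the segment size here are my own invention, purely to show the trade-off):

    SEGMENT_SIZE = 1000  # made-up segment size

    def rebuild_segments(package_ids):
        # Full regeneration: chunk the sorted ID list into fixed-size, strictly
        # alphabetical segments. Adding AAAAA-1.0 shifts everything, so this has
        # to run on every mutation.
        ids = sorted(package_ids)
        chunks = [ids[n:n + SEGMENT_SIZE] for n in range(0, len(ids), SEGMENT_SIZE)]
        return [
            {"lowest": chunk[0], "highest": chunk[-1], "url": "segment_%d.json" % i}
            for i, chunk in enumerate(chunks)
        ]

    def append_segment(segment_index, new_id):
        # Incremental append: add a one-package segment at the end. Cheap to produce,
        # but the ranges in the index are no longer in alphabetical order, so clients
        # would have to scan every entry instead of relying on the ordering.
        n = len(segment_index)
        segment_index.append({"lowest": new_id, "highest": new_id, "url": "segment_%d.json" % n})
        return segment_index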
In addition to that, MyGet and, I'm guessing, ProGet and Artifactory as well, have the notion of upstream package sources. Imagine a feed that aggregates three other feeds, each of which can be a mixture of v2 and v3 endpoints. Generating the segment index and the segments themselves would mean these three package sources have to be queried for ALL the packages they have. That's relatively easy to do against a v3 endpoint, but on v2 it means fetching every page of the OData feed. Just for fun I tried this against NuGet.org's v2 endpoint, and I can tell you it's quite the background job to run; it would bring eventual consistency of such an aggregate feed to a whole new level :-)
If appending is feasible, then this "fetch from upstream" job becomes much more manageable, as it means only one full crawl followed by incremental crawls (for example based on the LastModified date of packages). So again: is appending feasible, or does AAAAA-1.0 have to end up in the first few segments?
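And a sketch of what the "one full crawl, then incremental crawls" approach would look like for an aggregating feed. Whether a usable last-modified timestamp is available per package, and how to filter on it against a v2 OData feed, is exactly the open part here:

    from datetime import datetime, timezone

    last_crawl = {}  # upstream source URL -> time of the previous successful crawl

    def crawl_upstream(upstream_url, fetch_packages_since):
        # fetch_packages_since(url, since) is a hypothetical helper that asks the
        # upstream source for packages changed after `since`: filtering the OData
        # feed on a v2 endpoint, or diffing segments on a v3 endpoint.
        since = last_crawl.get(upstream_url, datetime.min.replace(tzinfo=timezone.utc))
        changed = fetch_packages_since(upstream_url, since)
        last_crawl[upstream_url] = datetime.now(timezone.utc)
        return changed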
One more to add :-)
- Segments for individual packages
Looking through some of our feeds, I found a couple using their MyGet feed heavily for CI purposes. The package, which I'll call Steak as I'm in a BBQ mood, has over 6,000 versions. Most are prerelease, some are stable. What happens if someone queries http://preview.nuget.org/ver3-ctp1/packageregistrations/1/steak.json? Does this mean a JSON document with over 6,000 entries will be served, or will this be segmented as well?
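Purely hypothetical, but to show what I mean by "segmented as well": if the registration document for a huge package only contained links to version pages, a client would have to do something like this (the "pages" and "versions" keys are invented for illustration):

    import requests

    def all_versions(registration_url):
        doc = requests.get(registration_url).json()
        if "pages" not in doc:
            # Single document with every version inline, 6,000+ entries and all.
            return doc.get("versions", [])
        # Segmented variant: follow each page link and collect the versions.
        versions = []
        for page_url in doc["pages"]:
            versions.extend(requests.get(page_url).json().get("versions", []))
        return versions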