Any plans to improve API or add functional features in 3.0?


Alex Papadimoulis

Aug 21, 2014, 11:25:27 AM
to nuget-e...@googlegroups.com
I've had a chance to review NuGet 3.0, and as far as I can tell, it's basically just NuGet 2.x wrapped in JSON-LD.

Considering the massive rewrite efforts, it seems like this would be an opportune time to improve the API and/or add much-needed functionality. I could compile a very long list, but as some examples...

For an API improvement: the whole isLatestVersion vs. isAbsoluteLatestVersion thing is a mess. Those attributes are not actually package metadata but repository metadata, because they are mutable and depend on other data in the repository. (Push a 2.0.0, and the 1.0.0 entry's isLatestVersion silently flips to false, so any cached copy of that "package metadata" is immediately stale.) This has caused, and will continue to cause, lots of quirks and bugs.

For a functional improvement: provide first-class ownership, and allow namespaces to be owned.

Like I said, any ecosystem user could compile a long list... but if these are "off the table" then there's no point.

Jeff Handley

Aug 21, 2014, 3:19:41 PM
to Alex Papadimoulis, nuget-e...@googlegroups.com

Accurate Assessment and Some Aspirations

Alex, your observation is absolutely accurate, for now.  We spent several weeks trying to find the best way to introduce API v3 into the client while still supporting API v2 package sources.  Here were some of the aspirations:

1. Create new, clean-room API v3 client libraries that consume our JSON-LD endpoints directly and natively.

2. Unlock many new scenarios for data navigation, with our resources exposed as linked data that can be arbitrarily navigated.

3. Refactor the VS client and command line to understand API v3 as the primary format, with API v2 supported side-by-side.

4. Factor the service endpoints and client libraries so that they are fully usable by ecosystem partners who want to provide value-add services over nuget.org and/or their own package sources.

Spinning Our Wheels

Unfortunately, with these aspirations (especially #3), we were unable to make much progress, because the current client codebase is so tightly coupled to API v2 that finding the right place to toggle between API v2 and API v3 proved infeasible.  The client code uses a combination of interfaces and LINQ to the point that decoupling it from the data structures exposed by API v2 just wasn't happening.  Trust me, we are all very sad and frustrated about this!

To be clear, we weren't spinning our wheels because Linked Data or JSON-LD is hard.  Quite the opposite!  Consuming and navigating JSON-LD is super clean and easy; it's far less code and much simpler than the interfaces and LINQ we're using today.  We were stuck because we couldn't find a good way to have the client codebase understand both JSON-LD and LINQ/OData.  If we didn't have to worry about backwards compatibility with API v2 sources, we would have shipped NuGet 3.0 long ago.  But since the client is not limited to connecting to nuget.org, we know we must support API v2 connectivity for a long time to come.
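
To give a feel for what "clean and easy" means, here is a minimal sketch of navigating a JSON-LD service index.  The URL and the @type value are illustrative, not a spec; a real v3 source advertises its own resources:

    import json
    import urllib.request

    def get_json(url):
        with urllib.request.urlopen(url) as response:
            return json.load(response)

    # The service index is the single entry point; everything else is
    # discovered by following @id links instead of hard-coding URL shapes.
    index = get_json("https://api.example.org/v3/index.json")

    # Pick a resource by its @type and follow the link it advertises.
    search_url = next(resource["@id"] for resource in index["resources"]
                      if resource["@type"] == "SearchQueryService")

    for hit in get_json(search_url + "?q=json&take=5")["data"]:
        print(hit["id"], hit["version"])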


Shipping is a Feature

In an effort to get something out the door and make some progress, we took the approach of the "shim": we hijack the API v2 requests, translate them into (non-optimized) JSON-LD requests that mimic the existing API v2 calls, and then translate the response back into OData-formatted XML to return to the caller.  Is it gross?  Absolutely!  Did it work?  It sure did!  In fact, it doesn't just work; it's significantly faster, even without optimized JSON-LD!
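
In rough code form, the shim amounts to something like this.  Every URL and JSON shape below is a made-up placeholder to show the shape of the idea, not our actual code:

    import json
    import urllib.request
    from urllib.parse import parse_qs, urlsplit

    # Intercept a v2 OData request, answer it with a JSON-LD lookup, and
    # re-serialize the result as OData-style XML for the old client.
    def handle_v2_request(odata_url):
        # Recognize one known v2 call pattern, e.g. FindPackagesById()?id='Foo'.
        query = parse_qs(urlsplit(odata_url).query)
        package_id = query["id"][0].strip("'")

        # Translate it into a (non-optimized) JSON-LD request that mimics it.
        v3_url = f"https://api.example.org/v3/package/{package_id}/index.json"
        with urllib.request.urlopen(v3_url) as response:
            document = json.load(response)

        # Translate the response back into OData-formatted XML for the caller.
        entries = "".join(f"<entry><title>{v}</title></entry>"
                          for v in document["versions"])
        return f"<feed>{entries}</feed>"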


Why We Dislike API v2

You may ask what we gained by this, and the answer is simple: reliability and scalability.  With the OData endpoint on the server, we were handling every possible OData request that came in, with little preference or prejudice between mainline and unrecognized scenarios.  Regardless of the scenario the user was in, the request came through the pipeline and resulted in one of two things:

1. A call down to SQL Server through Entity Framework, or

2. An HTTP call over to our Search Service, for requests that we managed to identify as mainline search queries by recognizing expression patterns.

Both of these options require compute AND external resources from the web server.  Either one introduces latency and vulnerability, and we were relying on both being available.  By definition, our uptime is limited to the intersection of the uptime of those two services, and we've been displeased with those results.
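
(To put numbers on that: two dependencies that are each independently 99.9% available can only guarantee roughly 0.999 × 0.999 ≈ 99.8% combined availability, and every additional dependency in the request path compounds the loss.)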


Realizing API v3 Benefits Today – Using API v2

We recognized something interesting from what we'd accomplished with the Search Service request hijacking, though: it was not just possible but relatively straightforward to take requests that we understood and translate them into completely different types of calls, switching from SQL/Entity Framework to an HTTP request to our external Search Service.  If we could do that on the server, where the pipeline handles every possible OData request that can be made to us, then we could certainly repeat the approach on the client, where the list of scenarios and possible requests is not just finite but very limited.

With this in mind, we created the V3 shim, and we are hijacking the V2 requests and making completely different calls to the server.  We've moved the SQL/Entity Framework hijack up from the server to the client, and done so for all requests the client makes instead of just search requests.  And where do those V3 requests go to avoid compute and external resources?  They go directly to our CDN, which runs over Azure Blob Storage, with all responses served from static files generated by our backend processes.  This provides the ultimate availability and reliability: if the CDN that serves NUPKG files is up, then the entire API is up.  And if the CDN that serves NUPKG files is down, what good was the API anyway?  Fortunately, we've never experienced an interruption on the CDN or with Azure Blob Storage.  This approach yields virtually 100% uptime without having to completely rewrite the entire client codebase (which has taken years to get to where it is).
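
Concretely, "serving the API" reduces to file reads.  A sketch, with a made-up URL layout:

    import json
    import urllib.request

    # With pre-generated static files, answering a client request is just a
    # blob fetch from the CDN: no SQL, no search service, no per-request
    # compute. The URL layout below is hypothetical.
    url = "https://cdn.example.org/v3/registrations/newtonsoft.json/index.json"
    with urllib.request.urlopen(url) as response:
        registration = json.load(response)  # every version, pre-rendered by the backend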


It’s Just CQRS and It’s OK

At the end of the day, this is just a clever extension of the CQRS approach we're taking all-up.  It's just that the first round of presentations we're producing from the query model is there to support the old client codebase instead of the new client codebase that will understand JSON-LD natively.  This isn't just OK; it's proof that the CQRS pattern gives us the flexibility we need for various client scenarios.
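
On the command side of that pattern, a backend job re-renders the affected presentation documents whenever a package changes.  A sketch, with hypothetical helpers standing in for our actual jobs:

    import json

    # CQRS write path (hypothetical names throughout): when a package is
    # pushed, regenerate the static "presentation" documents that the query
    # side serves from the CDN.
    def on_package_pushed(package, storage):
        path = f"v3/registrations/{package['id'].lower()}/index.json"
        document = {"id": package["id"], "versions": sorted(package["versions"])}
        # `storage` stands in for an Azure Blob Storage client; writing the
        # blob is all it takes to "publish" the updated response.
        storage.write(path, json.dumps(document))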


Strategy and Tactics

We very much want the clean-room client libraries and the super-pretty server resources, but we’ve recognized that elegance is a far lower priority than function.  We need to get NuGet 3.0 out the door so that our users can benefit from higher availability.  We are hoping to squeeze in a few features as well, but new features are useless if we can still only achieve 99.9% availability.


We absolutely want to unlock the kinds of scenarios you're talking about; that is the strategy.  In fact, the last two changes you mentioned (namespaces and IsLatest) have been frequent topics for us.  But we needed to be tactical for a spell, setting the longer-term strategy aside while we deliver the benefits of the CQRS pattern to existing client codebases.  With this, we're continuing our NuGet 3.0 work in the mindset of "making it work" before "making it pretty."


Alex Papadimoulis

Aug 22, 2014, 3:03:31 PM
to nuget-e...@googlegroups.com, apapad...@inedo.com, jeff.h...@microsoft.com
Thank you for the very detailed reply; that gives some interesting insight into things.

So it sounds like the client is going to support both V3 and V2 APIs?  In theory that's great and all, but I've seen the tangled mess that is the current code, and it seems like you'll just be trying to sculpt this big ball of mud.

Why not just have the client support v3, and v3 only? That seems pretty reasonable... I'm sure some would grumble, but I'm pretty sure everyone hates the v2 API anyway, so it's a move in the right direction.

If there's some decree from on high, then just do something stupid like embedding nuget-v2.8.2.exe as a resource inside nuget.exe and calling out to that when v2 is absolutely needed.  Then you get really clean code separation: you can gut all the rubbish code, reduce risk, and just maintain two separate client branches as needed.

I mean, is there anything API/package-related in the v2 codebase that's really worth keeping?  We wrote proget.exe in a few hours, and a Visual Studio extension in days.  The hard part is fighting the Visual Studio quirks... and I'm sure those lessons are all easily transferable to a new codebase.


>> we’ve never experienced an interruption on the CDN or with Azure Blob Storage

Well, except that one time someone forgot to renew the certificate ;-)

Jeff Handley

Aug 22, 2014, 3:18:57 PM
to Alex Papadimoulis, nuget-e...@googlegroups.com

There are too many v2 package sources out there right now, and it'll take some time for them to be replaced with API v3.  We're expecting a couple of years for it to make its way around, as everyone's implementation will now be a bit different: we're applying CQRS, but others might not.


And I was wondering if someone would point out the storage interruption; by luck, it didn't affect us, and we stayed up that whole day.  But the point is still the same: Storage is reliable!
