Core Refactor & Performance Modifications

24 views
Skip to first unread message

andr...@gmail.com

unread,
May 14, 2019, 9:18:09 PM5/14/19
to git-tfs-dev
Hi There,

As the TFVC userbase continues to drop off, my company has committed to TFS for the foreseeable future. Because of that, I decided to fork the repository with the intention of completely reiterating and improving maintainability and performance for myself. If there are gains to be had, It be foolish to not share them with the community.

While I intend on keeping almost all of the core functionality, my plan was to start by restructuring the project into a cleaner core domain with fewer cross-cutting concerns between types. My next goal will be to improve performance by introducing some asynchronous & multithreaded functionality.I'm unsure how many are still using Git-Tfs but I imagine there are still some that would benefit from a code cleanup and some better performance. In my case, it's an essential tool that I use every day. 

As I understand, this project is no longer in development. What I am asking in particular is how open you would be a to a complete reiteration of the Git-Tfs project that keeps the original functionality intact while wholly improving performance, structure, cleanliness and maintainability. I have some significant changes that I plan on making in my own fork but if you're interested, I'd like to offer my contributions in either pull requests, becoming the primary contributor, or obtaining your blessing to create a clone project that leverages a lot of the existing work. 

If you are interested you can find my fork at https://github.com/armunro/git-tfs

Regardless, I have to thank you for all of your and your community's hard work; you've kept be from having to use TFS for 4 years now. A gift that I cannot possibly repay :)

Cheers, 
 
AM

Matt Burke

unread,
May 15, 2019, 8:24:07 AM5/15/19
to git-t...@googlegroups.com
Hi AM,

That’s great! There are a lot of things that could be improved in git-tfs, and it would be great if you have the time and energy to put into it. 

My primary concern about git-tfs/git-tfs is stability: I’d like it to be something that folks can clone and use without a lot of trouble. Do you think it would be easier/faster to iterate on  improvements in your fork, or do you think it would help to have the visibility of the main fork?

One additional thing that would be worth considering as an improvement is to start using git-fast-import when converting TFVC changesets into Git commits. It’s probably one of the single best things you could do for performance. 

--
You received this message because you are subscribed to the Google Groups "git-tfs-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to git-tfs-dev...@googlegroups.com.
To post to this group, send email to git-t...@googlegroups.com.
Visit this group at https://groups.google.com/group/git-tfs-dev.
To view this discussion on the web visit https://groups.google.com/d/msgid/git-tfs-dev/407035f1-276f-4c32-82f7-a24956cbef4f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrew Munro

unread,
May 15, 2019, 1:36:56 PM5/15/19
to git-t...@googlegroups.com
That's great to hear that we can keep a good thing going.  

When I originally forked, my plan was to rip out the stuff I didn't need like VS2012, 2013, cake, etc. However, since I have your support to make some larger changes, I'm willing to support these tools in an effort to keep with the spirit of the project: Accessibility and stability. I will be discarding my changes so far as most of them removed tools that I will now need.

I think before we decide the best way to share my changes, we should probably establish some of the changes I had in mind. You can then decide if you are comfortable with the direction that I plan to work towards

Re-architecture
  1. All core objects will be refactored to separate assembly (GitTfs.Core). The main reason is to decouple the core functionality from the command line. However, this could pave the way for a GitTfs NuGet package that could facilitate others integrating GitTfs in their own apps if the demand arises. 

  2. Namespaces within GitTfs.Core will be defined more clearly, creating a separation between Git and TFS components. The types that bridge the two will remain at the highest level, referencing only components in deeper namespaces
  3. Util classes will be consolidated into the various namespaces of which types they operate on. This will hopefully create stronger boundaries between types resulting in better test isolation 
  4. Where possible reduce the amount of state fields/properties on core types.
Code Cleanup
  1. I saw that general project goal was to improve duck-typing throughout a few key areas. I follow traditional clean code and clean architecture in practice as well as the SOLID principles so self documenting code is on the forefront of my concerns. Type names will undergo some significant changes to increase readability
Maintainability & Contributions
  1. I'm not sure if the .SLN is intended to build in the IDE,but I was not able to build out of the box from a fresh clone. TFS binaries for 2012 and 2013 were missing and did not exist in NuGet from what I could tell. I'd like to simplify the build process and eliminate any additional steps that developers will need to do to contribute. This will hopefully encourage others to step in and keep things moving forward.
  2. Introduce a basic .editorconfig with accepted styles
Performance
  1. Investigate and possible implement fast-clone as suggested. Thank you for the tip!
  2. Look into possible async/await and multithreaded operations 
  3. Fix a bug I recently discovered, (which inspired me pick up this project)  where large log files cause universally poor performance. My log file was 171mb because I have scheduled git-tfs fetches on all of my repos. The performance of System.Diagnostic.Trace.* methods, appear to be dependent on the size of the log file. even the help command is affected.
Wishlist
  1. I'd like to look at the possibility of removing NLog altogether and replacing it with a lightweight, custom output console/file appender. The formatting will remain the same so it will have a consistent L&F but we'll loose the Nlog dependancy.
  2. Investigate options for fetch/clone progress indicator
  3. Better remote tracking. We had issues getting "commits ahead/behind" from TFS remote. This sometimes work but we found we had to fake a git remote to get this working. 
  4. Any outstanding items preventing V1 release

As you can see, most of these changes are not going to be trivial. If I am successful, the visibility of the main fork will get these improvements to people quickly. That being said, I am also very conscious that this constitutes a significant effort on your part and do not wish to undermine your goals or evolve the project into something that you and the community do not want.

If you are willing to offer me access to the main fork and implement the above changes I would certainly be happy to work there as it is probably easier for me. Otherwise I will re-fork and start there. 

Look forward to hearing from you. Thanks.

AM

You received this message because you are subscribed to a topic in the Google Groups "git-tfs-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/git-tfs-dev/CQQcpYse6Is/unsubscribe.
To unsubscribe from this group and all its topics, send an email to git-tfs-dev...@googlegroups.com.

To post to this group, send email to git-t...@googlegroups.com.
Visit this group at https://groups.google.com/group/git-tfs-dev.

Matt Burke

unread,
May 17, 2019, 7:08:06 AM5/17/19
to git-t...@googlegroups.com
That's great to hear that we can keep a good thing going.  

:+1: Yes! I'd love to see some of these things done.
 
When I originally forked, my plan was to rip out the stuff I didn't need like VS2012, 2013, cake, etc. However, since I have your support to make some larger changes, I'm willing to support these tools in an effort to keep with the spirit of the project: Accessibility and stability. I will be discarding my changes so far as most of them removed tools that I will now need.

Tooling like cake and nlog are very much OK to replace. They were convenient at a particular time, but aren't inherently required. With cake especially, it would be worth replacing some or all of the things it did. IIRC it helps with the release process and with running automated tests. I think it would make sense to start with other changes first to build momentum, but it's fair to replace tooling with things you'll be able to work with more easily.
 
I think before we decide the best way to share my changes, we should probably establish some of the changes I had in mind. You can then decide if you are comfortable with the direction that I plan to work towards

Re-architecture
  1. All core objects will be refactored to separate assembly (GitTfs.Core). The main reason is to decouple the core functionality from the command line. However, this could pave the way for a GitTfs NuGet package that could facilitate others integrating GitTfs in their own apps if the demand arises. 

  2. Namespaces within GitTfs.Core will be defined more clearly, creating a separation between Git and TFS components. The types that bridge the two will remain at the highest level, referencing only components in deeper namespaces
  3. Util classes will be consolidated into the various namespaces of which types they operate on. This will hopefully create stronger boundaries between types resulting in better test isolation 
  4. Where possible reduce the amount of state fields/properties on core types.
+1 That all looks good.
 
Code Cleanup
  1. I saw that general project goal was to improve duck-typing throughout a few key areas. I follow traditional clean code and clean architecture in practice as well as the SOLID principles so self documenting code is on the forefront of my concerns. Type names will undergo some significant changes to increase readability
I don't follow this point. Do you have an example or two of classes that you'd like to change? It's probably all good, just a jargon misunderstanding on my part. ;)

If you're talking about the mirroring of TFS classes in GitTfs interfaces and types, that's intentional so that a single git-tfs.exe can work with any VS version.

Maintainability & Contributions
  1. I'm not sure if the .SLN is intended to build in the IDE,but I was not able to build out of the box from a fresh clone. TFS binaries for 2012 and 2013 were missing and did not exist in NuGet from what I could tell. I'd like to simplify the build process and eliminate any additional steps that developers will need to do to contribute. This will hopefully encourage others to step in and keep things moving forward.
One of the things I tried to do with git-tfs was to build a single exe that would work with any (or almost any) version of VS. It drives me crazy when dev tools require a different installation depending on which version of VS you have. So, git-tfs has all of its core stuff in one assembly (git-tfs.exe) and one assembly per supported VS version. Some code works with  One of GitTfs's VS assemblies will be loaded at runtime. I'm quite happy with this at a high level. The implementation seems to work well enough, but the particulars aren't super important. I'm interested in preserving the "only one git-tfs package" property. One of the side effects of the current layout is that you can't build the entire SLN if you don't have all of the supported versions of TFS installed. This is definitely a pain point, and I'd be interested in getting the dev side sorted out so that that's not the case. It could be that all of the VS projects are disabled by default, or they have conditional build rules so that they're skipped if their deps aren't installed, or whatever. Like I said earlier, my main goal in this area would be to keep distributing just one git-tfs package regardless of the target env.

  1. Introduce a basic .editorconfig with accepted styles
Performance
  1. Investigate and possible implement fast-clone as suggested. Thank you for the tip!
You're welcome! This might not result in immediate end-user performance improvements. But it will make things better on large repositories. For example, git fast-import will produce a single packfile for each session instead of loose objects. Git itself will perform better when the repository has more packfiles and fewer loose objects in .git/objects.
  1. Look into possible async/await and multithreaded operations 
+1  
  1. Fix a bug I recently discovered, (which inspired me pick up this project)  where large log files cause universally poor performance. My log file was 171mb because I have scheduled git-tfs fetches on all of my repos. The performance of System.Diagnostic.Trace.* methods, appear to be dependent on the size of the log file. even the help command is affected.
What!? Yes, please fix this!
 
Wishlist
  1. I'd like to look at the possibility of removing NLog altogether and replacing it with a lightweight, custom output console/file appender. The formatting will remain the same so it will have a consistent L&F but we'll loose the Nlog dependancy.
  2. Investigate options for fetch/clone progress indicator
  3. Better remote tracking. We had issues getting "commits ahead/behind" from TFS remote. This sometimes work but we found we had to fake a git remote to get this working. 
  4. Any outstanding items preventing V1 release
As you can see, most of these changes are not going to be trivial. If I am successful, the visibility of the main fork will get these improvements to people quickly. That being said, I am also very conscious that this constitutes a significant effort on your part and do not wish to undermine your goals or evolve the project into something that you and the community do not want.

If you are willing to offer me access to the main fork and implement the above changes I would certainly be happy to work there as it is probably easier for me. Otherwise I will re-fork and start there. 

It probably makes sense to make the changes on the main project. What do you think about starting with one or two pull requests and we'll take it from there? Feel free to @-mention me, or ping me on gitter, or via email.
 

andr...@gmail.com

unread,
May 19, 2019, 12:09:02 PM5/19/19
to git-tfs-dev
Hey - Thanks for getting back.
 
Tooling like cake and nlog are very much OK to replace. They were convenient at a particular time, but aren't inherently required. With cake especially, it would be worth replacing some or all of the things it did. IIRC it helps with the release process and with running automated tests. I think it would make sense to start with other changes first to build momentum, but it's fair to replace tooling with things you'll be able to work with more easily..

I was able to get the build working relatively easily so I will keep it for now and see if it makes sense to keep it once everything else is done. I've also created an issue in my fork to address the output performance. Something small and lightweight but custom will suit this application nicely.

I don't follow this point. Do you have an example or two of classes that you'd like to change? It's probably all good, just a jargon misunderstanding on my part. ;)
If you're talking about the mirroring of TFS classes in GitTfs interfaces and types, that's intentional so that a single git-tfs.exe can work with any VS version.

I'll be keeping the version specific TFS stuff as-is because I don't feel like refactoring any of the StructureMap stuff yet. I'm more talking about classes like Bouncer.cs, Janitor.cs, that don't reveal their nature from the name. Refactoring to enforce that classes only have a single responsibility (SRP) is generally quite difficult after the fact but it's high on my list of priorities. From my own experiences, projects become much easier to maintain when there is a clearer division of responsibility.

You're welcome! This might not result in immediate end-user performance improvements. But it will make things better on large repositories. For example, git fast-import will produce a single packfile for each session instead of loose objects. Git itself will perform better when the repository has more packfiles and fewer loose objects in .git/objects.

This is very interesting because our primary repository at work is quite large and we've had some difficulty with size. A full clone can take days so we've had to keep a local cache that we update every month to avoid the 2-3 day checkout when we're on-boarding someone. Once the above refactors are done, this will be one of the first things I look into adding.

One of the things I tried to do with git-tfs was to build a single exe that would work with any (or almost any) version of VS. It drives me crazy when dev tools require a different installation depending on which version of VS you have. So, git-tfs has all of its core stuff in one assembly (git-tfs.exe) and one assembly per supported VS version. Some code works with  One of GitTfs's VS assemblies will be loaded at runtime. I'm quite happy with this at a high level. The implementation seems to work well enough, but the particulars aren't super important. I'm interested in preserving the "only one git-tfs package" property. One of the side effects of the current layout is that you can't build the entire SLN if you don't have all of the supported versions of TFS installed. This is definitely a pain point, and I'd be interested in getting the dev side sorted out so that that's not the case. It could be that all of the VS projects are disabled by default, or they have conditional build rules so that they're skipped if their deps aren't installed, or whatever. Like I said earlier, my main goal in this area would be to keep distributing just one git-tfs package regardless of the target env.

Yeah I'm not sure what happened here when I cloned the repo. I primarily use Rider instead of VS so I didn't have any TFS client libraries installed. I think it's extremely important to maintain that compatibility so I'll ensure that no support is lost. It looks like the VS 2015 assembly builds fine because the packages are coming from NuGet. The 2012 and 2013 do not out of the box but that's because I don't have those versions.

Interestingly, I added the same NuGet TFS  client libraries that 2015 was using to 2012 and 2013 and I was able to get it building completely. The MS documentation suggests that Microsoft.TeamFoundation.ExendedClient contains support for 2012 and 2013 so I'm still not sure if VS is even required to be installed. If that is the case, the version specific assemblies may be redundant but I must admit I haven't tested it yet so it could be a trap. Is this something that you've tried?

In the end I decided to re-fork and make my commits there. If you're interested in tracking commits, feel free to check it out. What's up there now is a rough cut but a core domain model starting to emerge so I'm happy with the direction so far. I apologize for the size of the commits, the first part of the refactor is usually pure chaos due to the size of the re-namespacing commits. Once that is out of the way, commits will be more issue based.

Thanks again for all of the feedback
AM
To unsubscribe from this group and stop receiving emails from it, send an email to git-t...@googlegroups.com.

--
You received this message because you are subscribed to a topic in the Google Groups "git-tfs-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/git-tfs-dev/CQQcpYse6Is/unsubscribe.
To unsubscribe from this group and all its topics, send an email to git-t...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "git-tfs-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to git-t...@googlegroups.com.

Philippe Miossec

unread,
May 20, 2019, 7:12:09 AM5/20/19
to git-t...@googlegroups.com
That's an interesting discussion but could you open an issue in the git-tfs repository.
I prefer to have the discussion there that way it will be easier to find for everyone ( It should not be a problem as you have a github account).

I'm sorry to be able to ask you that but I prefer that the issue will be created by you to be much clearer for everyone.
Just create the issue with the first post if you want, I will recreate the discussion history after that (or you could do it if you want...)

I keep my answer for there but for short:

- I should have made a new release since a long time ago (there are 2 PRs to merge before)
- VS2012 and 2013 project requires the install of libs (done with a cake task) and these projects are mandatory for projects using this version of TFS and using checkin policies (but I planned to delete these projects after the release and no more support this case with the future releases)
- cake is more used to do the release (to prevent to have to install this libraries locally) and could be simplified once VS2012 & 13 projects will be deleted.

Thanks a lot.



To unsubscribe from this group and stop receiving emails from it, send an email to git-tfs-dev...@googlegroups.com.

To post to this group, send email to git-t...@googlegroups.com.
Visit this group at https://groups.google.com/group/git-tfs-dev.

andr...@gmail.com

unread,
May 20, 2019, 11:25:29 AM5/20/19
to git-tfs-dev
Greetings Philippe - That sounds great, I created the issue. I'll let you get the formatting this time so I can get an idea if how you prefer it. I'll be sure to submit any further modifications in this way. Thanks for your help.

Philippe Miossec

unread,
May 20, 2019, 11:36:52 AM5/20/19
to git-tfs-dev
The discussion continues here: https://github.com/git-tfs/git-tfs/issues/1276
Reply all
Reply to author
Forward
0 new messages