[Proposal] `mix xref` improvements to detect problematic dependencies

182 views
Skip to first unread message

Marc-André Lafortune

unread,
Jun 1, 2021, 3:33:41 AM6/1/21
to elixir-lang-core
I played around with `mix xref graph` to figure out problematic compile-time dependencies. I am happy to report I vanquished them!

A few improvements to `mix xref graph` that came to mind:

1) I define a problematic dependency as:
`A` has compile-time dependency on `B` *and* `B` has any kind of dependency on `C`.

There is currently no easy way that I know of to check for these problematic dependencies.

I coded up a `filter-leaves <integer>` options that removes "leaves", i.e. files that have no dependencies (repeat n times).

This way, `mix xref graph --label compile --filter-leaves 1` will give me a list of all these cases. In my example above, if `B` has no dependencies on `C` or other file, it is filtered out, and thus the dependency `A -> B` never shows.

This made it much quicker to figure out which were the deps I needed to fix.

It also provided me a way to add a CI check so that nobody re-introduces such a problem.

2) The option `--only-nodes` can be nice, but often I wanted the reverse. E.g.:

```
$ mix xref graph --label compile

lib/scripts/stripe_life_plans.ex
├── lib/bobby/atomtype.ex (compile)
├── lib/bobby/communication/behaviour.ex (compile)
├── lib/bobby/env.ex (compile)
# ...
lib/scripts/update_mide_and_alfe_accounts.ex
├── lib/bobby/atomtype.ex (compile)
├── lib/bobby/communication/behaviour.ex (compile)
├── lib/bobby/env.ex (compile)
# ...
(same pattern repeats many times over)
```

Option `--only-nodes` gives me what I am *not* interested in:

```
$ mix xref graph --label compile --only-nodes
lib/scripts/stripe_life_plans.ex
lib/scripts/update_mide_and_alfe_accounts.ex
# ...
```

What I want is the list of sinks, not sources, so something like `--only-sinks`, `--only-deps` or `--skip-nodes`, I'm not quite sure what a good API here:

```
$ mix xref graph --label compile --only-deps
lib/bobby/atomtype.ex
lib/bobby/communication/behaviour.ex
lib/bobby/env.ex
```

I got around it with the hack `| grep '^\W' | cut -c 5- | sort | uniq`, but I think such an option could be helpful

3) The `--source` and `--sink` options currently accept only a single file, but I see no good reason for that. I had cases where I wanted to specify multiple files (although didn't need it after --filter-leaves)

4) There are no meaningful exit codes. For example, `mix xref graph --format cycles --label compile` should have an easy way to have exit code 1 if cycles are found. I'm not too sure of the API for this either (`--test`?) For my CI task, I modified my `grep` with `-c` option and inverted the result, quite hacky.

Which of these points seem worthwhile, and what would a decent API be like for these?

José Valim

unread,
Jun 1, 2021, 4:28:48 AM6/1/21
to elixir-l...@googlegroups.com
Those all look great!

1. Correct! We have recently changed --label compile so it considers transient dependencies, before it would only consider direct ones. I can think of two options here. One potential option here is to introduce a new label, called "--label transient-compile". Would that be feasible implementation wise? If we go this route, I would probably rename --only-direct to --label direct-compile (of course, in a separate PR which I can also tackle).

2. The option `--only-nodes` can be nice, but often I wanted the reverse

--only-nodes is really only useful with the --sink option. I think we can have two options here:

a. Make --only-nodes show sinks unless the sink option is given
b. Introduce a new option called --only-sinks (and then perhaps rename --only-nodes to --only-sources

3. Sounds good to me! Our option parser even supports the :keep annotation, so this should hopefully be straightforward!

4. Which commands would you like to fail? Is there any command besides the cycle one?

Awesome work! I am really looking forward to these PRs!

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/e94a635e-393e-4f04-b1c1-7277ef492819n%40googlegroups.com.

Marc-André Lafortune

unread,
Jun 2, 2021, 1:17:58 PM6/2/21
to elixir-lang-core
On Tuesday, 1 June 2021 at 04:28:48 UTC-4 José Valim wrote:
Those all look great!

1. Correct! We have recently changed --label compile so it considers transient dependencies, before it would only consider direct ones. I can think of two options here. One potential option here is to introduce a new label, called "--label transient-compile". Would that be feasible implementation wise?

I'm not completely sure what you are thinking about, in particular with that name as `--label compile` currently does show "transient compile" deps, unless one uses `--only-direct`.
Would `--label transient-compile` filter out all dependencies that are either not compile time, or those that are compile-time but towards a sink with no dependency?
E.g.:  A => B (runtime) => C (compile) => D (anything): clearly `B => C` needs to be selected, but would `A => B` also be selected (unless `--only-direct` is specified)?
 
If we go this route, I would probably rename --only-direct to --label direct-compile (of course, in a separate PR which I can also tackle).

Still confused. What I read is renaming `--label compile --only-direct` to `--label direct-compile`, having a new option `--label transient-compile` to show transient-but-not-dead-end-deps, and `--label compile` would remain for transient (dead-end or not) deps?

2. The option `--only-nodes` can be nice, but often I wanted the reverse

--only-nodes is really only useful with the --sink option. I think we can have two options here:

a. Make --only-nodes show sinks unless the sink option is given
b. Introduce a new option called --only-sinks (and then perhaps rename --only-nodes to --only-sources

I lack experience to make an informed recommendation.

3. Sounds good to me! Our option parser even supports the :keep annotation, so this should hopefully be straightforward!

Should be 👍

4. Which commands would you like to fail? Is there any command besides the cycle one?

I'd say all formats too. For instance what I want in our CI is not about cycles, but stricter: no non-dead-end-compile deps at all.
`xref graph` is like a `grep`, so maybe a `--count` would output the number of dependencies found instead of the normal result, and have an exit code in consequence? With grep, no result == failure though, whereas here no result == success, could be confusing.

So maybe just `--test` that sets the result to a failure if anything is found. Or no option but that could lead to hard to figure out incompatibilities in some people's scripts.
 

José Valim

unread,
Jun 2, 2021, 1:31:42 PM6/2/21
to elixir-l...@googlegroups.com
Would `--label transient-compile` filter out all dependencies that are either not compile time, or those that are compile-time but towards a sink with no dependency?
E.g.:  A => B (runtime) => C (compile) => D (anything): clearly `B => C` needs to be selected, but would `A => B` also be selected (unless `--only-direct` is specified)?

For the graph above, I believe today label=compile will show A => B => C. But it will only include D if it is compile time (or if something after D is compile time). So it includes direct dependencies and transitive dependencies. In this case, it is transitive because we are including a runtime dependency since there is a compile time node after. So yes, I agree my suggestion was bad/confusing. :)

In your case, you only want to show the graph at all if there is a dependency after a compile-time node, so it is more like "compile-connected"? Does that make sense?
 
About 2, let's go with --only-sinks option (and I will rename later on --only-nodes to --only-sources).

About 4, what if we call it --fail-if-any? Or alternatively, --at-most=0 (and you can set it to any positive integer)?

Marc-André Lafortune

unread,
Jun 2, 2021, 1:47:23 PM6/2/21
to elixir-l...@googlegroups.com
On Wed, Jun 2, 2021 at 1:31 PM José Valim <jose....@dashbit.co> wrote:
Would `--label transient-compile` filter out all dependencies that are either not compile time, or those that are compile-time but towards a sink with no dependency?
E.g.:  A => B (runtime) => C (compile) => D (anything): clearly `B => C` needs to be selected, but would `A => B` also be selected (unless `--only-direct` is specified)?

For the graph above, I believe today label=compile will show A => B => C. But it will only include D if it is compile time (or if something after D is compile time). So it includes direct dependencies and transitive dependencies. In this case, it is transitive because we are including a runtime dependency since there is a compile time node after. So yes, I agree my suggestion was bad/confusing. :)

In your case, you only want to show the graph at all if there is a dependency after a compile-time node, so it is more like "compile-connected"? Does that make sense?

` --label "compile-connected"` makes sense 👍

 About 2, let's go with --only-sinks option (and I will rename later on --only-nodes to --only-sources).

👍
 
About 4, what if we call it --fail-if-any? Or alternatively, --at-most=0 (and you can set it to any positive integer)?

`at-most` is short and sweet 👍

I'll prepare some PRs

Christopher Keele

unread,
Jun 3, 2021, 1:54:58 PM6/3/21
to elixir-lang-core
> > About 4, what if we call it --fail-if-any? Or alternatively, --at-most=0 (and you can set it to any positive integer)?
> `at-most` is short and sweet 👍

I think it's important for this flag name to communicate that it will cause a failure. at-most sounds like it would simply limit graph depth, rather than return a non-zero status code if the graph is over a certain depth.

Maybe --failure-depth=2 or --fail-at=3?

Marc-André Lafortune

unread,
Jun 3, 2021, 2:33:01 PM6/3/21
to elixir-lang-core
On Thursday, 3 June 2021 at 13:54:58 UTC-4 christ...@gmail.com wrote:
> > About 4, what if we call it --fail-if-any? Or alternatively, --at-most=0 (and you can set it to any positive integer)?
> `at-most` is short and sweet 👍

I think it's important for this flag name to communicate that it will cause a failure. at-most sounds like it would simply limit graph depth, rather than return a non-zero status code if the graph is over a certain depth.

Maybe --failure-depth=2 or --fail-at=3?

Agreed, `--fail-at=1` or `--fail-above=0` seem clearer

José Valim

unread,
Jun 3, 2021, 2:38:59 PM6/3/21
to elixir-l...@googlegroups.com
I like --fail-above. --fail-at may read like it should be exactly 1.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.

Christopher Keele

unread,
Jun 3, 2021, 2:44:43 PM6/3/21
to elixir-l...@googlegroups.com
> I like --fail-above. --fail-at may read like it should be exactly 1.

--fail-above is really nice. I couldn't think of a concise term for --fail-at-or-over, and --fail-over is the only term I could come up with which is obviously cognitively overloaded already.

You received this message because you are subscribed to a topic in the Google Groups "elixir-lang-core" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elixir-lang-core/Hmg5y8S3v4E/unsubscribe.
To unsubscribe from this group and all its topics, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4K%3DFzgrxfSy5L8y0EtOvJBKqJx1OUvyKhYfX9SFH3Eq8g%40mail.gmail.com.
Reply all
Reply to author
Forward
0 new messages