Checkstyle should suggest corrections when violations are detected

949 views
Skip to first unread message

Vladimir Sitnikov

unread,
Jul 1, 2014, 2:44:03 AM7/1/14
to checksty...@googlegroups.com
Hi,

I'm opening the discussion as per Roman's recommendation in https://github.com/checkstyle/checkstyle/pull/164
When dealing with checkstyle violations, the hardest part is to figure out how to correct the violation.

Different projects have different checkstyle configurations, thus it is almost impossible to pass the checkstyle from the first try.
I believe, checkstyle should not only report violations but provide suggestions to fix the violations automatically.

Consider java IDE. If you miss an import statement, IDE will suggest "would you like to add x.y.Z import?" an so on.

Here's perfect illustration for lots of checkstyle violations: http://xkcd.com/1168/

The most painful for me is ImportOrderCheck. I propose the improvement for the check in https://github.com/checkstyle/checkstyle/pull/164
Integration with IDE is out of scope of the particular pull request, however I believe it makes sense to teach checkstyle provide the suggested fixes that can be picked by IDE plugins.

Consider ImportOrderCheckTest.testDefault
Given the following input
package com.puppycrawl.tools.checkstyle.imports;

import java.awt.Button;
import java.awt.Frame;
import java.awt.Dialog;
import java.awt.event.ActionEvent;
import static java.awt.Button.ABORT
;
import javax.swing.JComponent;
import javax.swing.JTable;
import java.io.File;
import static java.io.File.createTempFile;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import static javax.swing.WindowConstants.*;

public class InputImportOrder {
}

ImportOrder check would complain by default as follows:
5: Wrong order for 'java.awt.Dialog' import.
9: Wrong order for 'javax.swing.JComponent' import.
11: Wrong order for 'java.io.File' import.
13: Wrong order for 'java.io.IOException' import.

Whaaat? How on earth one should fix that?

I would rather see this message instead (this warning is enabled by setting printDesiredOrder=true):
Suggested import order is as follows: import java.awt.Button;
import java.awt.Dialog;
import java.awt.Frame;
import java.awt.event.ActionEvent;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import javax.swing.JComponent;
import javax.swing.JTable;
import static java.awt.Button.ABORT;
import static java.io.File.createTempFile;
import static javax.swing.WindowConstants.*;

 This gives clear view of the import order desired by checkstyle. If integrated with IDE plugins, the fix can be applied automatically without leaving IDE and/or guessing on the violations list.

Roman Ivanov

unread,
Jul 3, 2014, 1:50:01 AM7/3/14
to checksty...@googlegroups.com
Hi Vladimir,

thanks for a post, sorry for late reply to you.
Your ideas are in right direction.
 
Some comments on your post:

> provide suggestions to fix the violations automatically ......
> If integrated with IDE plugins, the fix can be applied automatically

Checkstyle is library , it is not a IDE plugin, so automation of fix we should ask to implement IDE's plugins.

--------------------


So lets focus on what Checkstyle could do to ease developer's life to fix its violations:
1) accumulate violations in one violation message
2) somehow show what is expected

"1)" point is kind of reasonable.
"2)" point is not that obvious from reporting point of view:
a) message could not be multi-line string, in other case all checkstyle reports and viewers will be broken. Good example from life codding is Exception stack trace in log file and how log viewers manage to show it correctly - it doers not work .
b) Reporting huge amount of imports as one line - is unreadable. Beware that list of 30-40 imports is well spread in real code.

~/java/git/checkstyle/checkstyle [master ↓·1|✔] $ grep -R -c 'import ' . | grep -e '\.java\:[2-9][0-9]'
./contrib/bcel/src/checkstyle/com/puppycrawl/tools/checkstyle/bcel/ClassFileSetCheck.java:23
./contrib/bcel/src/checkstyle/com/puppycrawl/tools/checkstyle/bcel/ReferenceVisitor.java:21
./src/test/java/com/puppycrawl/tools/checkstyle/checks/imports/AvoidStaticImportTest.java:20
./src/main/java/com/puppycrawl/tools/checkstyle/Checker.java:28
./src/main/java/com/puppycrawl/tools/checkstyle/gui/ParseTreeInfoPanel.java:25
./src/main/java/com/puppycrawl/tools/checkstyle/gui/FileDrop.java:21
./src/main/java/com/puppycrawl/tools/checkstyle/gui/JTreeTable.java:26
./src/main/java/com/puppycrawl/tools/checkstyle/TreeWalker.java:28

So printDesiredOrder is not a good name and concept, as Checkstyle should bind message to certain token in file, and checkstyle does not know where message will be printed (console, DB, HTML, in-memory container, IDE hint , ...... ).

We need to look at this as "accumulation of order violation", report on first import token and now we need to think how to report even in single line as user could read this. .... I open for proposals.

I could make a hint to similar problem in CustomOrderDeclarationCheck, in that check we report a order violation on each element but we print what type of element is expected.
Example:  "49:5: Method definition in wrong order. Expected 'GetterSetter(.*)' then 'Method(.*)' ".

truth is somewhere there ..... 


PS and out-of-context:
I still remember about your other PRs, just do not have time to review them attentively, sorry.

thanks,
Roman Ivanov

Vladimir Sitnikov

unread,
Jul 3, 2014, 4:38:44 AM7/3/14
to checksty...@googlegroups.com

Checkstyle is library , it is not a IDE plugin, so automation of fix we should ask to implement IDE's plugins.

Sorry, I am not following you.
How should IDE plugin figure out the corrective action?
From my point of view, that is checkstyle who should provide the actions (i.e. in terms of "replace bytes from X to Y with Z" or some similar _dumb_ actions)
 

So lets focus on what Checkstyle could do to ease developer's life to fix its violations:
1) accumulate violations in one violation message
This depends on the output media.
If the result is used in IDE, no need to combine the warnings.
If the result is used in the day to day "mvn verify", then it would be better to have some summary.
 
2) somehow show what is expected
a) message could not be multi-line string, in other case all checkstyle reports and viewers will be broken. Good example from life codding is Exception stack trace in log file and how log viewers manage to show it correctly - it doers not work .
You know, from the API perspective, not only multiline messages, but more attributes are required.
For instance, "start/end position", so automatic tools can identify what section of the source file to replace without parsing "checkstyle warning message text".

b) Reporting huge amount of imports as one line - is unreadable. Beware that list of 30-40 imports is well spread in real code.

I expect the output be copy&past-able. I do understand multiline messages impose problems for grep, but the most frequent use-case is to copy and paste the right imports just into the source code.
I see no problem in printing all the 100-200 import lines. You know, it gets printed _only_ in case there are violations of the import order.
 

So printDesiredOrder is not a good name
Ok, it is indeed one of the hardest problems: http://martinfowler.com/bliki/TwoHardThings.html

and concept
I do not agree here.
 
as Checkstyle should bind message to certain token in file
I see no standard way to bind message to "all the imports in the file", thus I use the line number of the very first encountered import.
I believe it is the best checkstyle can afford at this point.
 
and checkstyle does not know where message will be printed (console, DB, HTML, in-memory container, IDE hint , ...... ).
What is the problem with it?
If the log gets printed to DB, then the multiline-suggestion will can be put into its separate column provided checkstyle's LocalizedMessage allow getArgument(int index) and getArgumentCount().
That is a separate issue, but I find it very surprising that LocalizedMessage provides getKey and at the same time it does not provide access to the arguments.
So your point of "does not know where message will be printed" is indeed the argument to go and fix com.puppycrawl.tools.checkstyle.api.LocalizedMessage.

 
We need to look at this as "accumulation of order violation", report on first import token and now we need to think how to report even in single line as user could read this. .... I open for proposals.
Once again:
1) In brief mode, just a single warning: "the order of imports is broken" might be enough. It would be good if checkstyle could figure out the least amount of required edits to make the order right.
For instance, instead of "wrong import of orders at line 4" it could figure out that "just moving line 5 to line 1 makes the imports perfect".
In other words, even if checkstyle provide recommendations on per-line basis this would be a huge improvement.
This however requires some rocket-science (http://en.wikipedia.org/wiki/Longest_common_subsequence_problem ? or may be some advanced diff algorithm), _and_ it might be misleading if multiple lines are to be moved at the same time in different directions. As you move a single line, the whole numbering gets messed up and I am not sure 10 warnings of "move line X to Y" are good for the end user. 
You know, diff problem is hard and I do not think we'll get the ultimate algorithm at the first try (e.g. even git has multiple diff algorithms: http://bryanpendleton.blogspot.ru/2010/05/patience-diff.html)

2) As the number of violations exceeds 1, it is much easier just go ahead and rewrite _all_ the imports with the correct ones. For instance, I am always using "optimize imports" in IDE to reorder them, kill unused imports, etc. Why shouldn't I just copy and paste the imports? It makes more sense than manual reordering line by line. I considered 

3) It is a good question how multiple _different_ checks should play together. Say, "invalid import order" and "unused imports" might have their own opinion on the "right" import section. I would suggest making progress a bit at a time, and implement "per check" improvement first, and consider more complex cases later.

I could make a hint to similar problem in CustomOrderDeclarationCheck, in that check we report a order violation on each element but we print what type of element is expected.
Example:  "49:5: Method definition in wrong order. Expected 'GetterSetter(.*)' then 'Method(.*)' ".
Yes, this is better than just "Method definition in wrong order". But even "Expected 'GetterSetter(.*)' then 'Method(.*)'" is less than perfect, since there might be two possibilities:
a) The particular method at line 49 was added in the wrong order. Then, it this violation message is fine, and it can be improved if something like "please move the method to the line Y" is added as well.
b) The particular method was not changed for ages, but another method was added in the wrong place. Then the violation is misleading. You know, it would be better if violation is reported against the guilty code, not against the random victim.

 
PS and out-of-context:
I still remember about your other PRs, just do not have time to review them attentively, sorry.
Good to know they are not lost in the way.

Vladimir 

Roman Ivanov

unread,
Jul 3, 2014, 10:10:48 PM7/3/14
to checksty...@googlegroups.com
Hi Vladimir,

> How should IDE plugin figure out the corrective action?

By parsing of error message - there is no other way. Do not expect any changes to checkstyle API to do it in other way. 

> From my point of view, that is checkstyle who should provide the actions

Message is all that you will have.

> If the result is used in the day to day "mvn verify", then it would be better to have some summary.

if you use maven plugin, please start discussion at maven plugin project.

> You know, from the API perspective, not only multiline messages, but more attributes are required.

Do not expect any changes there(api) in next year or so.

> I expect the output be copy&past-able....
> I see no problem in printing all the 

It looks like you use checkstyle maven plugin, and you disappointed on how it report the problem. But maven plugin is only one of numerous listeners of Checkstyle events. If you want convenience only for maven plugin that does not help to any other listener - keep that changes in maven plugin only.

> I see no standard way to bind message to "all the imports in the file"

Yes, so we could report on first or last import token.

...................

DO NOT COMMENT ANY MY ANSWER ABOVE - DISCUSSION IS CLOSED FOR THEM.

So now more constructive items:
1) for now I do not see how report accumulative expected order, instead of just using multiline String.
I do not like this solution, you might open new discussion and get approval form other mentors to do this.

2) I would stay with single line reporting, but instead of useless message that order is wrong will print for each real import line what import is expected.
Example:
"5: Wrong order for 'java.awt.Dialog' import, expected 'java.awt.Frame'."

That will give ability to automate that violation resolving in IDEs: Eclipse plugin or Idea Plugin or ......, by parsing of error message only.
That will give you ability to do grep in maven plugin output and put expected set of imports to file.
Message finally become useful and user will know what to do, boring manual management of import will force him to use formatters.

PS:
I always thought that order of imports was a problem from beginning of 2000, when IDE did not collapse imports at all. Engineer should not care about this.
In recent time it is more requirements for famous open source libraries to make a code shiny as will be examined by millions of developers.

thanks,
Roman Ivanov

Vladimir Sitnikov

unread,
Jul 5, 2014, 3:34:31 AM7/5/14
to checksty...@googlegroups.com
Roman,

Please DO clarify your opinion on "very surprising that LocalizedMessage provides getKey and at the same time it does not provide access to the arguments".
Did I get you right the only way to deal with checkstyle messages is to parse messages' java.lang.String?

I am not going to support the discussion further without explicit answer from checkstyle maintainers.

Vladimir

Roman Ivanov

unread,
Jul 6, 2014, 1:30:54 AM7/6/14
to Vladimir Sitnikov, checksty...@googlegroups.com
Hi Vladimir,

Yes, absents of getter for Arguments at LocalizedMessage - is done with idea that message key will be identifier of problem, but arguments are non-named and un-sorted list of arguments, and Checkstyle is not ready to expose that as api, even order of arguments could be changed any time (example: rephrase in error message).

To make Arguments useful for IDE Plugins to provide auto-fix, Checkstyle need to name each argument, and make a format of message as kind of contract to out-side application. That was exactly that we try to avoid for now, as we know that our messages sometimes are far from ideal and want to keep freedom to change them in any way without any notification to clients.  
 
I understand that parsing of values from String is inefficient, and Checkstyle could simplify plugins live a lot to let them get list of arguments as ready to use types/values. 

Right now I do not see any requests for this feature. All famous IDE plugins are inactive in development. I do not want to complicate Checkstyle's development (or change api) if nobody is ready to use it. All plugins could start from parsing a String, as we notice a persistent development in that area we could start thinking about API changes.

======

I am not sure how this question depend on your discussion for ImportsOrder improvement in messages. Lets finish with it.

thanks,
Roman Ivanov

Vladimir Sitnikov

unread,
Jul 6, 2014, 2:05:58 AM7/6/14
to checksty...@googlegroups.com, sitnikov...@gmail.com
 could be changed any time (example: rephrase in error message).
No need to change argument indices. You just improve resource.properties text and you are fine.
If the message is indeed new, a new message.id might be even better in some cases.

 I understand that parsing of values from String is inefficient, and Checkstyle could simplify plugins live a lot to let them get list of arguments as ready to use types/values. 
Ok, go ahead and parse mondrian then. Good luck.
 
Right now I do not see any requests for this feature. All famous IDE plugins are inactive in development. I do not want to complicate Checkstyle's development (or change api) if nobody is ready to use it. All plugins could start from parsing a String, as we notice a persistent development in that area we could start thinking about API changes.
I believe it is a chicken and egg problem. You do not provide the API, and nobody wants parsing error strings in multiple/unknown languages.

I am not sure how this question depend on your discussion for ImportsOrder improvement in messages. Lets finish with it.
This "feature" highlights the way "corrective actions/multiline arguments" can/cannot be used. From my point of view, expose of arguments is a natural thing to integrate multiple different destinations. I do understand, arguments-identified-by-positions are worse than arguments-identified-by-string-keys, but I believe, positional arguments are good enough so checkstyle can stick with MessageFormat and checkstyle consumers still get meaningful info.

As I see now, you deny even the very simple yet reasonable improvements to the API and suggest no alternatives.

I stop contributing to checkstyle since the weather here does not suit me.

Regards,
Vladimir

Roman Ivanov

unread,
Jul 6, 2014, 3:10:51 PM7/6/14
to checksty...@googlegroups.com, sitnikov...@gmail.com
Hi Vladimir,

> No need to change argument indices

Example:
1 variant: Block WHILE is incorrectly intended, Expected indentation is 8 spaces.
2 variant: Expected indentation is 8 spaces for WHILE block. 

> Ok, go ahead and parse mondrian then. Good luck

Most of the problem could be resolved even without parsing, but nodoby need this, there is no even draft implementation that is going to use this.

> I believe it is a chicken and egg problem.

Ok, let me be honest with you - All Checkstyle plugins are dead, they could not resolve even more painful problems.  
As soon as somebody show me intention of development, I will agree with API changes. 
Right now there are no problems to implement auto-resolving (there is only minor inconvenience), no desire from anybody except for our team. But our focus is to resolve the most acute problems of Checkstyle users - Java8, comments, problem with incorrect parsing.
All Chckstyle users have a format of message without arguments, and message with arguments - is a real problem catch arguments from there ??? no :). Real problem is that nobody what to spend few weeks to implement this and then support that :).

> This "feature" highlights the way "corrective actions/multiline arguments" can/cannot be used. 

Just explain me how you will resolve that problem in Check messaging by adding arguments to LocalizedMessage class  - I do not see a connection there at all.

> As I see now, you deny even the very simple yet reasonable improvements to the API and suggest no alternatives.

I need to see a good reason to extend API. API is nor for resolving custom problems, in other words problems of Vladimir.
From history of Checkstyle development there were few cases then somebody unexpectedly appear in mail-list and demand extension of API .
But when it comes to describing their problem or show sources of project to recheck that Checkstyle is not misused - all developers vanished. 
That prove that there was no project, it was temporal desire to resolve problem by means of Checkstyle.
Right now I see the same situation - you demand smth and you do not show any evidence that this is on demand, and there stable team that is going to use that feature.

For alternatives: I already told you that parsing of String by already known format is not a big deal. Absence of Agrs in API does not block anybody.
Show me implementation that works for few Checks and I will extend API.  

> I stop contributing to checkstyle since the weather here does not suit me.

Do not expect we will follow any of your requests, you are new to Checkstyle and looks like you do not see a full picture of how Checkstyle is used.
Checkstyle will never address problems of individual person or non existing or close-source team.
You need to prove your ideas and show me evidence that API changes you will use and where. 

Right now I do not see from you what you want fix -  ImportOrder or add Agrs to LocalizedMessage. That two problems are not related to each other at all.

Finally, If you still think that you are right :) and do not what to prove that to us, you are welcome to make a fork from us and enjoy any API changes right-away. 
I did enough for Checkstyle to let others to easily fork it and sync source code easily. That will allow you to do implementation that you thinks is right, use that at your project or work, and show me it after completion and testing.

thanks,
Roman Ivanov

Vladimir Sitnikov

unread,
Jul 6, 2014, 4:19:01 PM7/6/14
to checksty...@googlegroups.com, sitnikov...@gmail.com

> No need to change argument indices

Example:
1 variant: Block WHILE is incorrectly intended, Expected indentation is 8 spaces.
2 variant: Expected indentation is 8 spaces for WHILE block. 

That's perfectly fine. It does not require reordering of the arguments. Just alter the message and that is it.
1 variant is "Block {0} is incorrectly intended, Expected indentation is {1} spaces"
2 variant is "Expected indentation is {1} spaces for {0} block."
MessageFormat puts the right value to the right place.
In both cases "indentation" is the first argument, and the "block name" is the second.
You've just proved that arguments-identified-by-positions survive this kind of refactoring.


But our focus is to resolve the most acute problems of Checkstyle users - Java8, comments, problem with incorrect parsing.
I feel your pain.
 
is a real problem catch arguments from there ??? no :)
I would not say you are wrong, but "expose the arguments" is very small, observable and easily reviewable change.
It should not take ages to integrate such a simple change.


Just explain me how you will resolve that problem in Check messaging by adding arguments to LocalizedMessage class  - I do not see a connection there at all.

If LocalizedMessage.getArguments is there, a check (e.g. ImportOrderCheck) can _safely_ emit multiline warnings.
For instance, it will emit "import.desired.order" message with first argument equal to the all the import statement in the required order with required separators, the second argument equal to the starting byte/character of the imports in the file and the third argument equal to the last byte/character of the imports in the file.
Then, the consumer tools might adapt accordingly.

For instance, IDE might just show: "oh, checkstyle suggests some import reordering, hit X to apply it", in the console the full message can be printed, etc.
Alternatively, the tool might just ignore "suggestions" and print just warnings to the user. I guess someone might not want suggestions get printed.
 
> As I see now, you deny even the very simple yet reasonable improvements to the API and suggest no alternatives.
I need to see a good reason to extend API. API is nor for resolving custom problems, in other words problems of Vladimir.
Roman, even log4j has better notion of LoggingEvent than checkstyle's LocalizedMessage: https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/spi/LoggingEvent.html#getProperties()
Why on earth LocalizedMessage.getKey() says something about IDE (!) and at the same time LocalizedMessage.getArguments() is considered "problems of Vladimir"?
 
From history of Checkstyle development there were few cases then somebody unexpectedly appear in mail-list and demand extension of API .
But when it comes to describing their problem or show sources of project to recheck that Checkstyle is not misused - all developers vanished.
Here I am.
I show you the source code of the checkstyle usage, and you do not flag it as misuse.
I show you the patch that conforms to the checkstyle style and includes the tests, yet you reject the feature for unknown reason.
Don't you find this case is a bit different?

 
That prove that there was no project, it was temporal desire to resolve problem by means of Checkstyle.
Right now I see the same situation - you demand smth and you do not show any evidence that this is on demand, and there stable team that is going to use that feature.
Please, read carefully and have a look into optiq's import order configuration https://github.com/julianhyde/optiq/blob/retired/src/main/config/checkstyle/checker.xml#L130 .
I did provide it a couple of times yet saw no feedback on "misuse of the checkstyle" or whatever.
 
For alternatives: I already told you that parsing of String by already known format is not a big deal.
The format is not known since no-one knows which language is used in the String.
I am not going to support parsing of 100500 languages just to get the argument.
 
Absence of Agrs in API does not block anybody.
Show me implementation that works for few Checks and I will extend API.  

I do not follow you. Which implementation are you talking about? 

That two problems are not related to each other at all.
The relation is as follows: you have hard time trying to squeeze all the values into a single line.
You say it is ultimate evil if a multiline argument is used even once.
I say it makes no harm provided the API consumers can tell where are the arguments.
If the consumer does not care where are the arguments, it might just pass the whole string. If the consumer cares where are the arguments (e.g. IDE plugin tries to figure out starting position and ending position of the error), then it can use getArguments API and get ready to use args.

Regards,
Vladimir 

Roman Ivanov

unread,
Jul 7, 2014, 12:12:35 AM7/7/14
to Vladimir Sitnikov, checksty...@googlegroups.com
Hi Vladimir,

> In both cases "indentation" is the first argument, and the "block name" is the second.

:) as I told before - arguments amount and indexes will not be part of API for near future. Our team will change message and arguments(amount, order , .... ) in any time.

> I would not say you are wrong, but "expose the arguments" is very small

Change is small but very demanding for future Checkstyle support. You are not going to support Checkstyle, so we(checkstyle's team) decide what change is appropriate and what will support in future.

Then, the consumer tools might adapt accordingly

The worse idea I could imagine. But now I understand why you need update for API :) . I would rather agree to make multiline message.
That idea will never be approved, non of mentors will help you.

You force me to think a bit more about LocalizedMessage update, and here is list of requirements I need before that update will happen:
1) owned of IDE plugin should send us request for this
2) owner of IDE plugin should provide first implementation of auto-fix feature for Checks that does not have any arguments in their message, and show me that his intention to make full support for all Checks are real.

For now I do not see any reason that any Check update could require that change in API.

I show you the source code of the checkstyle usage

you have never send me code that required API change. All discussion about ImportOrder is not a reason to change API.

I did provide it a couple of times yet 

Just a configuration, now thing new.

I am not going to support parsing of 100500 languages just to get the argument.

First of all, multi language support in checkstyle is not enforced, when new Check appear nobody do multi-language translation. So all users already have mix of English and Other Language in  messages. I am no proud of this, but this is how it is now, sorry.
Send me link to project where you have a problem with parsing of messages from Checkstyle.

Which implementation are you talking about? 

Send me links to open source project (with exact file), that require API changes I need to take a look at it.

> you have hard time trying to squeeze all the values into a single line.

I have never have a problem with that.

> I say it makes no harm provided the API consumers can tell where are the arguments

I will NOT put any line of code in API part of Checkstyle, If know that could not follow that contract (api == contract).

Summary:

if you are not representing any IDE Checkstyle plugin development group - just forget about your API changes request. I will not do that. End of discussion.

to improve existing ImportOrderCheck I see only one approach right now - print in message expected import line, as I described you above. That will resolve your problem and will be useful for all others users.

if you do not like weather in Checkstyle - do fork and enjoy your design.

thanks,
Roman Ivanov
Reply all
Reply to author
Forward
0 new messages