[PATCH] checkpatch: Add optional --codespell dictionary to find more typos


Joe Perches

Mar 5, 2015, 1:52:42 PM
to Andrew Morton, Andy Whitcroft, Kees Cook, Masanari Iida, Lucas De Marchi, code...@googlegroups.com, LKML
If a codespell dictionary exists, use it when requested.
The default is off; maybe it could be turned on later.

codespell's dictionary format allows multiple possible corrections;
ignore that for now and only use the first suggestion.

Also add \b to the spelling test so that consecutive misspelled words
are found properly.

Signed-off-by: Joe Perches <j...@perches.com>
---
scripts/checkpatch.pl | 38 ++++++++++++++++++++++++++++++++++----
1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index c061a63..6b79beb 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -47,6 +47,8 @@ my $ignore_perl_version = 0;
my $minimum_perl_version = 5.10.0;
my $min_conf_desc_length = 4;
my $spelling_file = "$D/spelling.txt";
+my $codespell = 0;
+my $codespellfile = "/usr/local/share/codespell/dictionary.txt";

sub help {
my ($exitcode) = @_;
@@ -88,6 +90,9 @@ Options:
file. It's your fault if there's no backup or git
--ignore-perl-version override checking of perl version. expect
runtime errors.
+ --codespell Use the codespell dictionary for spelling/typos
+ (default:/usr/local/share/codespell/dictionary.txt)
+ --codespellfile Use this codespell dictionary
-h, --help, --version display this help and exit

When FILE is - read standard input.
@@ -146,6 +151,8 @@ GetOptions(
'ignore-perl-version!' => \$ignore_perl_version,
'debug=s' => \%debug,
'test-only=s' => \$tst_only,
+ 'codespell!' => \$codespell,
+ 'codespellfile=s' => \$codespellfile,
'h|help' => \$help,
'version' => \$help
) or help(1);
@@ -449,7 +456,6 @@ my $misspellings;
my %spelling_fix;

if (open(my $spelling, '<', $spelling_file)) {
- my @spelling_list;
while (<$spelling>) {
my $line = $_;

@@ -461,15 +467,39 @@ if (open(my $spelling, '<', $spelling_file)) {

my ($suspect, $fix) = split(/\|\|/, $line);

- push(@spelling_list, $suspect);
$spelling_fix{$suspect} = $fix;
}
close($spelling);
- $misspellings = join("|", @spelling_list);
} else {
warn "No typos will be found - file '$spelling_file': $!\n";
}

+if ($codespell) {
+ if (open(my $spelling, '<', $codespellfile)) {
+ while (<$spelling>) {
+ my $line = $_;
+
+ $line =~ s/\s*\n?$//g;
+ $line =~ s/^\s*//g;
+
+ next if ($line =~ m/^\s*#/);
+ next if ($line =~ m/^\s*$/);
+ next if ($line =~ m/, disabled/i);
+
+ $line =~ s/,.*$//;
+
+ my ($suspect, $fix) = split(/->/, $line);
+
+ $spelling_fix{$suspect} = $fix;
+ }
+ close($spelling);
+ } else {
+ warn "No codespell typos will be found - file '$codespellfile': $!\n";
+ }
+}
+
+$misspellings = join("|", sort keys %spelling_fix) if keys %spelling_fix;
+
sub build_types {
my $mods = "(?x: \n" . join("|\n ", @modifierList) . "\n)";
my $all = "(?x: \n" . join("|\n ", @typeList) . "\n)";
@@ -2305,7 +2335,7 @@ sub process {
# Check for various typo / spelling mistakes
if (defined($misspellings) &&
($in_commit_log || $line =~ /^(?:\+|Subject:)/i)) {
- while ($rawline =~ /(?:^|[^a-z@])($misspellings)(?:$|[^a-z@])/gi) {
+ while ($rawline =~ /(?:^|[^a-z@])($misspellings)(?:\b|$|[^a-z@])/gi) {
my $typo = $1;
my $typo_fix = $spelling_fix{lc($typo)};
$typo_fix = ucfirst($typo_fix) if ($typo =~ /^[A-Z]/);
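
The effect of the added \b can be reproduced outside checkpatch. The following is a minimal sketch, not part of the patch: the two-word list and both helper names are made up for illustration. Without \b, the trailing [^a-z@] consumes the separator after a typo, so the leading (?:^|[^a-z@]) cannot match in front of the very next word. With \b the match ends at a zero-width boundary and the separator stays available.

```perl
use strict;
use warnings;

# Hypothetical two-entry word list standing in for the real dictionaries.
my $misspellings = join("|", qw(recieve teh));

# Spelling test as it was before the patch (no \b in the trailing group).
sub count_typos_old {
    my ($text) = @_;
    my $count = 0;
    $count++ while ($text =~ /(?:^|[^a-z@])($misspellings)(?:$|[^a-z@])/gi);
    return $count;
}

# Spelling test as changed by the patch (\b tried first in the trailing group).
sub count_typos_new {
    my ($text) = @_;
    my $count = 0;
    $count++ while ($text =~ /(?:^|[^a-z@])($misspellings)(?:\b|$|[^a-z@])/gi);
    return $count;
}

my $sample = "teh recieve";    # two consecutive misspelled words
print count_typos_old($sample), "\n";    # prints 1
print count_typos_new($sample), "\n";    # prints 2
```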


Lucas De Marchi

Mar 5, 2015, 2:13:57 PM
to Joe Perches, Andrew Morton, Andy Whitcroft, Kees Cook, Masanari Iida, code...@googlegroups.com, LKML
On Thu, Mar 5, 2015 at 3:52 PM, Joe Perches <j...@perches.com> wrote:
> If a codespell dictionary exists, use it when requested.
> The default is off; maybe it could be turned on later.
>
> codespell's dictionary format allows multiple possible corrections;
> ignore that for now and only use the first suggestion.

Most of them were added specifically to avoid wrong suggestions in
the kernel code base (admittedly a long time ago)...
Wouldn't it be better to output the entire list of suggestions, as
the codespell tool does?
I'm not a Perl guru, but couldn't we have a single option like
--codespell / --codespell=FILE?

Other than that,

Acked-by: Lucas De Marchi <lucas.d...@gmail.com>

--
Lucas De Marchi

Joe Perches

Mar 5, 2015, 2:21:42 PM
to Lucas De Marchi, Andrew Morton, Andy Whitcroft, Kees Cook, Masanari Iida, code...@googlegroups.com, LKML
On Thu, 2015-03-05 at 16:13 -0300, Lucas De Marchi wrote:
> On Thu, Mar 5, 2015 at 3:52 PM, Joe Perches <j...@perches.com> wrote:
> > If a codespell dictionary exists, use it when requested.
> > The default is off; maybe it could be turned on later.
> >
> > codespell's dictionary format allows multiple possible corrections;
> > ignore that for now and only use the first suggestion.
>
> Most of them were added specifically to avoid wrong suggestions in
> the kernel code base (admittedly a long time ago)...
> Wouldn't it be better to output the entire list of suggestions, as
> the codespell tool does?

If you want to write the code, be my guest.
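
For reference, keeping every suggestion would mostly be a parsing change. The following is a rough sketch under stated assumptions, not code from the patch: parse_codespell_line is a hypothetical helper, the sample entries are invented rather than copied from a real dictionary, and the message wording is illustrative only. It skips the same lines the patch skips (comments, blanks, ", disabled" entries) but stores all comma-separated fixes instead of the first one.

```perl
use strict;
use warnings;

# Parse one "typo->fix1, fix2, ..." line.
# Returns (typo, [fixes]), or an empty list for lines that are skipped.
sub parse_codespell_line {
    my ($line) = @_;

    $line =~ s/^\s+|\s+$//g;              # trim both ends, including newline

    return if $line eq '';                # blank line
    return if $line =~ m/^#/;             # comment
    return if $line =~ m/, disabled/i;    # entry marked as disabled

    my ($suspect, $rest) = split(/->/, $line, 2);
    return unless defined $rest;          # not a "typo->fix" line

    my @fixes;
    for my $fix (split(/,/, $rest)) {
        $fix =~ s/^\s+|\s+$//g;
        push(@fixes, $fix) if $fix ne '';
    }
    return ($suspect, \@fixes);
}

# Hypothetical sample entries.
my %spelling_fixes;
for my $entry ("teh->the\n", "# a comment\n", "acces->access, aces,\n",
               "thru->through, disabled\n") {
    my ($suspect, $fixes) = parse_codespell_line($entry);
    $spelling_fixes{$suspect} = $fixes if defined $suspect;
}

for my $typo (sort keys %spelling_fixes) {
    print "'$typo' may be misspelled - perhaps: ",
          join(", ", @{$spelling_fixes{$typo}}), "\n";
}
```

The warning code in process() would then need to join the list rather than apply ucfirst() to a single string, which this sketch does not attempt.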

> I'm not a Perl guru, but couldn't we have a single option like
> --codespell / --codespell=FILE?

Not easily. There's no easy way to turn off codespell like that.
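
To illustrate the trade-off, here is a minimal sketch of what a single option with an optional value could look like in Getopt::Long, using the ':s' specifier. This is an assumption about one possible approach, not something proposed in the thread; codespell_dict_from_args is a hypothetical helper. An option declared with ':s' cannot also be declared negatable with '!', so there is no --no-codespell form here and "off" simply means the option is absent.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Default path taken from the patch.
my $default_dict = "/usr/local/share/codespell/dictionary.txt";

# Returns the dictionary path to use, or undef when codespell is off.
sub codespell_dict_from_args {
    my (@args) = @_;
    my $codespell;    # undef: option absent; '': given without a value

    GetOptionsFromArray(\@args, 'codespell:s' => \$codespell)
        or die "bad options\n";

    return unless defined $codespell;
    return $codespell eq '' ? $default_dict : $codespell;
}

my $dict = codespell_dict_from_args('--codespell=/tmp/dictionary.txt', 'my.patch');
print "using $dict\n";
```

One caveat, as far as I understand Getopt::Long's handling of optional arguments: a bare --codespell followed by a file name (rather than by another option) would take that file name as its value, so the --codespell=FILE spelling would have to be used consistently. Two separate options, as in the patch, avoid that ambiguity.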

