[PATCH v2 6/9] gen_compile_commands: move directory walk to a generator function

3 views
Skip to first unread message

Masahiro Yamada

unread,
Aug 21, 2020, 3:02:40 PM8/21/20
to linux-...@vger.kernel.org, Nathan Huckleberry, Nick Desaulniers, Tom Roeder, clang-bu...@googlegroups.com, Masahiro Yamada, linux-...@vger.kernel.org
Currently, this script walks under the specified directory (default to
the current directory), then parses all .cmd files found.

Split it into a separate helper function because the next commit will
add more helpers to pick up .cmd files associated with given file(s).

There is no point to build and return a huge list at once. I used a
generator so it works in the for-loop with less memory.

Signed-off-by: Masahiro Yamada <masa...@kernel.org>
---

Changes in v2:
- New patch

scripts/gen_compile_commands.py | 44 ++++++++++++++++++++++++---------
1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 3ed958b64658..6dec7e2c4098 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -33,6 +33,7 @@ def parse_arguments():
log_level: A logging level to filter log output.
directory: The work directory where the objects were built
output: Where to write the compile-commands JSON file.
+ paths: The list of directories to handle to find .cmd files
"""
usage = 'Creates a compile_commands.json database from kernel .cmd files'
parser = argparse.ArgumentParser(description=usage)
@@ -56,7 +57,28 @@ def parse_arguments():

return (args.log_level,
os.path.abspath(args.directory),
- args.output)
+ args.output,
+ [args.directory])
+
+
+def cmdfiles_in_dir(directory):
+ """Generate the iterator of .cmd files found under the directory.
+
+ Walk under the given directory, and yield every .cmd file found.
+
+ Args:
+ directory: The directory to search for .cmd files.
+
+ Yields:
+ The path to a .cmd file.
+ """
+
+ filename_matcher = re.compile(_FILENAME_PATTERN)
+
+ for dirpath, _, filenames in os.walk(directory):
+ for filename in filenames:
+ if filename_matcher.match(filename):
+ yield os.path.join(dirpath, filename)


def process_line(root_directory, command_prefix, file_path):
@@ -94,31 +116,29 @@ def process_line(root_directory, command_prefix, file_path):

def main():
"""Walks through the directory and finds and parses .cmd files."""
- log_level, directory, output = parse_arguments()
+ log_level, directory, output, paths = parse_arguments()

level = getattr(logging, log_level)
logging.basicConfig(format='%(levelname)s: %(message)s', level=level)

- filename_matcher = re.compile(_FILENAME_PATTERN)
line_matcher = re.compile(_LINE_PATTERN)

compile_commands = []
- for dirpath, _, filenames in os.walk(directory):
- for filename in filenames:
- if not filename_matcher.match(filename):
- continue
- filepath = os.path.join(dirpath, filename)

- with open(filepath, 'rt') as f:
+ for path in paths:
+ cmdfiles = cmdfiles_in_dir(path)
+
+ for cmdfile in cmdfiles:
+ with open(cmdfile, 'rt') as f:
result = line_matcher.match(f.readline())
if result:
try:
- entry = process_line(directory,
- result.group(1), result.group(2))
+ entry = process_line(directory, result.group(1),
+ result.group(2))
compile_commands.append(entry)
except ValueError as err:
logging.info('Could not add line from %s: %s',
- filepath, err)
+ cmdfile, err)

with open(output, 'wt') as f:
json.dump(compile_commands, f, indent=2, sort_keys=True)
--
2.25.1

Nick Desaulniers

unread,
Aug 21, 2020, 8:41:40 PM8/21/20
to Masahiro Yamada, Linux Kbuild mailing list, Nathan Huckleberry, Tom Roeder, clang-built-linux, LKML
On Fri, Aug 21, 2020 at 12:02 PM Masahiro Yamada <masa...@kernel.org> wrote:
>
> Currently, this script walks under the specified directory (default to
> the current directory), then parses all .cmd files found.
>
> Split it into a separate helper function because the next commit will
> add more helpers to pick up .cmd files associated with given file(s).
>
> There is no point to build and return a huge list at once. I used a
> generator so it works in the for-loop with less memory.
>
> Signed-off-by: Masahiro Yamada <masa...@kernel.org>
> ---
>
> Changes in v2:
> - New patch
>
> scripts/gen_compile_commands.py | 44 ++++++++++++++++++++++++---------
> 1 file changed, 32 insertions(+), 12 deletions(-)
>
> diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
> index 3ed958b64658..6dec7e2c4098 100755
> --- a/scripts/gen_compile_commands.py
> +++ b/scripts/gen_compile_commands.py
> @@ -33,6 +33,7 @@ def parse_arguments():
> log_level: A logging level to filter log output.
> directory: The work directory where the objects were built
> output: Where to write the compile-commands JSON file.
> + paths: The list of directories to handle to find .cmd files

Punctuation: please add a period.
If `cmdfiles` is never referenced again, please make this:

for cmdfile in cmdfiles_in_dir(path):

With those 2 changes feel free to add my
Reviewed-by: Nick Desaulniers <ndesau...@google.com>

> + with open(cmdfile, 'rt') as f:
> result = line_matcher.match(f.readline())
> if result:
> try:
> - entry = process_line(directory,
> - result.group(1), result.group(2))
> + entry = process_line(directory, result.group(1),
> + result.group(2))
> compile_commands.append(entry)
> except ValueError as err:
> logging.info('Could not add line from %s: %s',
> - filepath, err)
> + cmdfile, err)
>
> with open(output, 'wt') as f:
> json.dump(compile_commands, f, indent=2, sort_keys=True)
> --
> 2.25.1
>


--
Thanks,
~Nick Desaulniers

Masahiro Yamada

unread,
Aug 22, 2020, 12:36:27 AM8/22/20
to Nick Desaulniers, Linux Kbuild mailing list, Nathan Huckleberry, Tom Roeder, clang-built-linux, LKML
On Sat, Aug 22, 2020 at 9:41 AM 'Nick Desaulniers' via Clang Built
Linux <clang-bu...@googlegroups.com> wrote:

> > + for path in paths:
> > + cmdfiles = cmdfiles_in_dir(path)
> > +
> > + for cmdfile in cmdfiles:
>
> If `cmdfiles` is never referenced again, please make this:
>
> for cmdfile in cmdfiles_in_dir(path):


I intentionally wrote in this way
because in the next commit,
cmdfiles will be selectively set.





> With those 2 changes feel free to add my
> Reviewed-by: Nick Desaulniers <ndesau...@google.com>
>
> > + with open(cmdfile, 'rt') as f:
> > result = line_matcher.match(f.readline())
> > if result:
> > try:
> > - entry = process_line(directory,
> > - result.group(1), result.group(2))
> > + entry = process_line(directory, result.group(1),
> > + result.group(2))
> > compile_commands.append(entry)
> > except ValueError as err:
> > logging.info('Could not add line from %s: %s',
> > - filepath, err)
> > + cmdfile, err)
> >
> > with open(output, 'wt') as f:
> > json.dump(compile_commands, f, indent=2, sort_keys=True)
> > --
> > 2.25.1
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-li...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/CAKwvOdn9ZfvC4dzuVnxc_a52JFn_q1ewOWwZZD5b9%3DizeEayKQ%40mail.gmail.com.



--
Best Regards
Masahiro Yamada

Masahiro Yamada

unread,
Aug 22, 2020, 10:56:58 AM8/22/20
to linux-...@vger.kernel.org, Nathan Huckleberry, Nick Desaulniers, Tom Roeder, clang-bu...@googlegroups.com, Masahiro Yamada, linux-...@vger.kernel.org
Currently, this script walks under the specified directory (default to
the current directory), then parses all .cmd files found.

Split it into a separate helper function because the next commit will
add more helpers to pick up .cmd files associated with given file(s).

There is no point to build and return a huge list at once. I used a
generator so it works in the for-loop with less memory.

Signed-off-by: Masahiro Yamada <masa...@kernel.org>
Reviewed-by: Nick Desaulniers <ndesau...@google.com>
---

Changes in v3:
- Add the missing punctuation in the comment

Changes in v2:
- New patch

scripts/gen_compile_commands.py | 44 ++++++++++++++++++++++++---------
1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/scripts/gen_compile_commands.py b/scripts/gen_compile_commands.py
index 71a0630ae188..e45f17be8817 100755
--- a/scripts/gen_compile_commands.py
+++ b/scripts/gen_compile_commands.py
@@ -33,6 +33,7 @@ def parse_arguments():
log_level: A logging level to filter log output.
directory: The work directory where the objects were built.
output: Where to write the compile-commands JSON file.
+ paths: The list of directories to handle to find .cmd files.
@@ -95,31 +117,29 @@ def process_line(root_directory, command_prefix, file_path):

def main():
"""Walks through the directory and finds and parses .cmd files."""
- log_level, directory, output = parse_arguments()
+ log_level, directory, output, paths = parse_arguments()

level = getattr(logging, log_level)
logging.basicConfig(format='%(levelname)s: %(message)s', level=level)

- filename_matcher = re.compile(_FILENAME_PATTERN)
line_matcher = re.compile(_LINE_PATTERN)

compile_commands = []
- for dirpath, _, filenames in os.walk(directory):
- for filename in filenames:
- if not filename_matcher.match(filename):
- continue
- filepath = os.path.join(dirpath, filename)

- with open(filepath, 'rt') as f:
+ for path in paths:
+ cmdfiles = cmdfiles_in_dir(path)
+
+ for cmdfile in cmdfiles:
Reply all
Reply to author
Forward
0 new messages