The numpydoc validator is available at
https://github.com/numpy/numpydoc/blob/master/numpydoc/validate.py and can be run with
python -m numpydoc --validate
However, it has some limitations out of the box: it does not offer package-wide validation or any form of .rst parsing. Rather, it accepts the name of a single object and uses
importlib to fetch that object's docstring. As a result, most projects maintain their own validation scripts that wrap
numpydoc.validate and make repeated calls to it. For example, scikit-learn has a script that enumerates their functions/classes/methods (using
pkgutil), filters that list of objects, calls
numpydoc.validate on each, ignores certain error codes, and pretty prints the results:
https://github.com/scikit-learn/scikit-learn/blob/master/maint_tools/test_docstrings.py
And over at pandas, they use
numpydoc.validate alongside some custom validation, all rolled into their CI:
https://github.com/pandas-dev/pandas/blob/master/scripts/validate_docstrings.py
https://github.com/pandas-dev/pandas/blob/master/scripts/tests/test_validate_docstrings.py
Note that pandas' version parses .rst files directly to enumerate the objects to be validated, as that is how the validation script was written before it migrated from pandas to numpydoc. To clarify, I have not contributed code to
numpydoc.validate but I did participate in its migration (on GitHub and via email) and adapted/tested both versions for use with SciPy. One issue that cropped up was that the .rst parsing of the original script assumed
autosummary, while many projects (like SciPy) use
autodoc. For that and other reasons, .rst parsing was removed entirely from
numpydoc.validate, at least for now. I see that SymPy has considered migrating from
autodoc to
autosummary (#18594), which could make it easy to mimic some of the sophisticated things pandas is doing with their docstring validation, including CI.
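
To make the wrapper pattern concrete, here is a rough sketch of the two ways of enumerating objects mentioned above (walking the package with pkgutil, or reading autosummary entries out of .rst text), followed by repeated validation calls with some error codes ignored. This is not the scikit-learn or pandas code. The helper names (iter_public_objects, autosummary_entries, validate_names) are made up for illustration, and the .rst handling is deliberately simplistic. The only numpydoc call assumed is numpydoc.validate.validate(name), which returns a dict whose "errors" entry is a list of (code, message) pairs.

```python
import importlib
import inspect
import pkgutil
import re


def iter_public_objects(package_name):
    """Yield dotted names of public functions/classes defined in a package."""
    package = importlib.import_module(package_name)
    modules = [package]
    path = getattr(package, "__path__", [])
    # walk_packages imports subpackages so it can list their submodules.
    for info in pkgutil.walk_packages(path, prefix=package_name + ".",
                                      onerror=lambda name: None):
        try:
            modules.append(importlib.import_module(info.name))
        except Exception:
            continue  # skip modules that cannot be imported
    for module in modules:
        for name, obj in inspect.getmembers(module):
            if name.startswith("_"):
                continue
            if not (inspect.isfunction(obj) or inspect.isclass(obj)):
                continue
            # Only report objects where they are defined, not re-exports.
            if getattr(obj, "__module__", None) == module.__name__:
                yield f"{module.__name__}.{name}"


def autosummary_entries(rst_text):
    """Return dotted names listed under ``.. autosummary::`` directives."""
    names, current_module, in_block = [], None, False
    for line in rst_text.splitlines():
        stripped = line.strip()
        match = re.match(r"\.\. (?:currentmodule|module):: (\S+)", stripped)
        if match:
            current_module, in_block = match.group(1), False
        elif stripped.startswith(".. autosummary::"):
            in_block = True
        elif not in_block or not stripped:
            continue  # outside a directive, or a blank separator line
        elif not line[:1].isspace():
            in_block = False  # a dedented line ends the directive body
        elif not stripped.startswith(":"):  # skip options like :toctree:
            name = stripped.split()[0]
            names.append(f"{current_module}.{name}" if current_module else name)
    return names


def validate_names(names, validate_fn=None, ignore=()):
    """Call the validator on each name; return {name: [(code, msg), ...]}."""
    if validate_fn is None:
        # Assumes numpydoc is installed; imported lazily so the
        # enumeration helpers stay usable without it.
        from numpydoc.validate import validate as validate_fn
    results = {}
    for name in names:
        errors = [(code, msg) for code, msg in validate_fn(name)["errors"]
                  if code not in ignore]
        if errors:
            results[name] = errors
    return results
```

With numpydoc installed, something like validate_names(iter_public_objects("sympy.core"), ignore={"SA01"}) would give a per-object error list to work through. Passing the validator in as an argument also makes it easy to add project-specific checks on top.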
Regardless of any .rst parsing, though, the validation itself is still done through
importlib. This dependency makes
numpydoc.validate somewhat clunky to use like a linter, as that would require building from source before each validation. I certainly don't want to imply that
numpydoc.validate is the perfect tool for all workflows and, in fact, an overenthusiasm for tooling can easily generate technical debt and distract from more valuable work (like actually writing docstrings). My proposed use of
numpydoc.validate was just as another tool in the toolbox; a convenient way to populate my tasklist. For example, here is a quick list of the SymPy objects that have custom sections or don't follow the section order specified in the SymPy docstring guide (i.e. Explanation, Examples, Parameters, See Also, References):
https://gist.github.com/brandondavid/02868ca74600897d5d61c43c43e2654a

--Brandon