Organisation of the API reference docs

Oscar Benjamin

Jun 8, 2022, 6:01:56 PM
to sympy
Hi all,

I was looking at the SymPy docs earlier today and I found that it was
quite difficult to find the docs for the polys module. I'm interested
to know what people think about how the API docs are organised because
I see at least some problems but I'd like to know more from a user
perspective what people think about it.

You can see the dev version of the docs here:
https://docs.sympy.org/dev/index.html

That's how the docs are expected to look after the next release and it
does feel a lot nicer with the theme change compared to the 1.10 docs:
https://docs.sympy.org/latest/index.html

The problem I have though is really to do with the API reference page:
https://docs.sympy.org/dev/reference/index.html#reference

I remember spending some time with Aaron and Joannah coming up with
how to organise this but it seemed difficult at the time to come up
with a good "hierarchy" to break down the different modules in SymPy.
From memory we just came up with something and thought that we could
improve it later based on feedback or something but that structure
remained.

The API reference page has a heading called "Topics". The text
underneath mentions "polynomials". When I click the "Topics" header I
get here:
https://docs.sympy.org/dev/reference/public/topics/index.html#topics

It's quite hard to see polynomials there and I missed it the first few
times I looked at that page. SymPy's polynomials module is one of the
more well developed parts of SymPy and is extremely useful. It is also
a really core part of SymPy that is used all the time as part of any
simple operation as well as having extremely useful functions for end
users. Putting polys on the same level as "category theory" or
"cryptography" seems totally wrong to me given that those are both
modules that are completely unused internally and are probably not
used externally for anything significant.

I think our goal at the time to break down the list of submodules into
a small number of categories for the API reference was a good idea but
I don't think we got the categories right. For example polys should be
at least on the same level as matrices right up there with a top link
in the main page.

Does anyone have any idea what would be a better structure here? Which
topics should be listed right up top and which should be in
subheadings like "topics" and so on?

For some sort of metric, here's a quick count of the number of lines of
code in each submodule of SymPy. According to git ls-files there are
843064 lines of code in total in SymPy. Here's the breakdown by
submodule (showing that polys makes up about 10% of the whole
codebase!):

111849 sympy/integrals
89184 sympy/polys
67999 sympy/physics
56055 sympy/core
46909 sympy/printing
43640 sympy/solvers
40034 sympy/functions
32076 sympy/matrices
31561 sympy/utilities
25289 sympy/combinatorics
23812 sympy/parsing
22257 sympy/stats
18966 sympy/tensor
18025 sympy/simplify
15448 sympy/geometry
11481 sympy/assumptions
10979 sympy/series
10662 sympy/ntheory
10582 sympy/plotting
10346 sympy/sets
7061 sympy/concrete
6865 sympy/logic
6779 sympy/vector
6181 sympy/codegen
4731 sympy/categories
4299 sympy/holonomic
4121 sympy/testing
3958 sympy/crypto
3664 sympy/calculus
3048 sympy/diffgeom
2194 sympy/liealgebras
1916 sympy/external
1678 sympy/discrete
1448 sympy/algebras
1423 sympy/interactive
1242 sympy/strategies
1237 sympy/multipledispatch
752 sympy/unify
531 sympy/__init__.py
478 sympy/benchmarks
111 sympy/abc.py
105 sympy/sandbox
74 sympy/conftest.py
21 sympy/this.py
1 sympy/release.py
1 sympy/galgebra.py
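
The thread does not show the exact command behind this table, and git ls-files has no option of its own for counting lines. As a minimal sketch of one way to get a similar breakdown, the Python below lists tracked files with git ls-files, counts newline characters in each file, and sums the counts per second-level path. The helper names (group_key, tally, count_lines) and the depth parameter are made up for illustration, and the totals will only match the figures above if the original count treated every tracked file the same way.

```python
import subprocess
from collections import Counter
from pathlib import Path


def group_key(path, depth=2):
    """Map 'sympy/polys/rings.py' to 'sympy/polys'; 'sympy/abc.py' stays as is."""
    return "/".join(path.split("/")[:depth])


def tally(line_counts, depth=2):
    """Sum (path, line_count) pairs per group, returned largest first."""
    totals = Counter()
    for path, n in line_counts:
        totals[group_key(path, depth)] += n
    return totals.most_common()


def count_lines(repo=".", prefix="sympy"):
    """Yield (path, line_count) for each file tracked by git under prefix.

    Assumes file names are valid UTF-8 and that every tracked file exists
    in the working tree. All files are counted, not only Python sources.
    """
    out = subprocess.run(
        ["git", "ls-files", "-z", prefix],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    for path in filter(None, out.split("\0")):
        data = (Path(repo) / path).read_bytes()
        yield path, data.count(b"\n")
```

Run from the root of a SymPy checkout, `for name, n in tally(count_lines()): print(n, name)` prints one line per submodule in the same "count path" layout as the table.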

Of course lines of code might not be a reasonable metric for what's
relevant to users. I think there is something wrong though if the
largest modules like integrals or polys don't even have a top-level
mention or are hard to navigate to. Another problem is the fact that
many of these headings don't make much sense e.g. all the limits code
is in the series module and I'm not sure that would be obvious to
anyone from outside. The above list gives similar lines of code for
the logic module and the holonomic module but the logic module is used
all the time and I've never heard of anyone using holonomic...

Ideally the reference API docs tree would make it easy to peruse the
overall features of SymPy as a library and give some impression of
what it does. The old list that just showed every submodule in
alphabetical order wasn't much use because it gave undue prominence to
things that have their own top-level submodule for no particular
reason. I don't think we've really succeeded in coming up with a good
organisation of the topics and features of SymPy here though.

Does anyone have any thoughts about how to do this better? What topics
should be more prominent? Do the headings on the reference API page
even make sense?

--
Oscar

Aaron Meurer

Jun 8, 2022, 6:49:57 PM
to sy...@googlegroups.com
Polys might make more sense under "basics" with the current organization. However, even with "basics", I sometimes have a hard time realizing that I should look there. Maybe the name "basics" isn't very good.

I never really considered polys a top-level thing because despite its importance for the core, it's not something that most end-users should care about. 

Some very simple improvements I see that can be made are

- Make sure that the pages in the left sidebar are always in alphabetical order. Right now the stuff under "basics" isn't. That way if you know what you are looking for it's easy to find.
- Keep the section titles as short as possible. It's hard to peruse the polys sections in the left sidebar because they are so long. 
- Make sure each page has a top-level heading. I discovered that the way Furo creates sections for the left sidebar is it makes one section per top-level header. That means that a page like https://docs.sympy.org/dev/modules/codegen.html#code-printers-sympy-printing gets all of its headings in the left sidebar and nothing in the right sidebar. This can be fixed by adding a top-level heading like

===============
Code Generation
===============

- Clean up some of the redundant "index" pages. I think that pages like the one Oscar linked to (https://docs.sympy.org/dev/reference/public/topics/index.html) are mostly useless now that we have the sidebars in the theme. Maybe others will disagree here. At the very least, if that page is going to exist, it should have some text instead of just being a table of contents.
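
As a rough sketch of how pages without a single top-level heading could be found automatically, the Python below scans reStructuredText source for section titles and reports whether exactly one title uses the first (top-level) adornment style. The function names are hypothetical, the heading detection is a simplification of the real docutils rules (it ignores literal blocks and tables, for example), and this is not part of SymPy's docs tooling.

```python
import re

# A line made of one punctuation character repeated at least three times.
ADORNMENT = re.compile(r'^([!-/:-@\[-`{-~])\1{2,}\s*$')


def find_headings(text):
    """Return (title, style) pairs, where style is (char, has_overline)."""
    lines = text.splitlines()
    headings = []
    i = 0
    while i < len(lines) - 1:
        title, under = lines[i], lines[i + 1]
        match = ADORNMENT.match(under)
        is_title = (
            title.strip()
            and not ADORNMENT.match(title)
            and match
            and len(under.rstrip()) >= len(title.rstrip())
        )
        if is_title:
            # An identical adornment line directly above counts as an overline.
            over = i > 0 and lines[i - 1].rstrip() == under.rstrip()
            headings.append((title.strip(), (match.group(1), over)))
            i += 2
        else:
            i += 1
    return headings


def has_single_title(text):
    """True if exactly one heading uses the first-seen (top-level) style."""
    headings = find_headings(text)
    if not headings:
        return False
    top_style = headings[0][1]
    return sum(1 for _, style in headings if style == top_style) == 1
```

For a page where every section uses the same underline style, has_single_title returns False. Adding an overlined title such as the "Code Generation" heading above makes it return True.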

I'm curious how you get this output from git ls-files. I don't see any flags to count lines or restrict to a directory level.

> 111849 sympy/integrals

I'm pretty sure this is all because of RUBI, which has several huge files with patterns.

I think we can easily move things around. None of the page URLs reference the categories, so the URLs won't break if we rename a category or move something to another category. The only URLs that might break are the top-level index pages, which, as I mentioned earlier, would be better to either remove entirely or make more useful. I think it comes back to the discussion we had about organizing the reference documentation distinctly from the actual submodules, and having a separate docs page that exactly matches the submodules and also includes documentation for private functions.

Aaron Meurer



--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAHVvXxQJkbPWrm3qhs6E9RU3OG7hoVMOx11H5q4RtO1Lx0ZVhQ%40mail.gmail.com.

Jeremy Monat

Jun 8, 2022, 6:55:41 PM
to sy...@googlegroups.com
Google searches seem like "a reasonable metric for what's relevant to users." I posted the top 1000 Google search terms for a year's worth of searches.

Jeremy


Oscar Benjamin

Jun 8, 2022, 7:29:59 PM
to sympy
On Wed, 8 Jun 2022 at 23:55, Jeremy Monat <jem...@gmail.com> wrote:
>
> Google searches seem like "a reasonable metric for what's relevant to users." I posted the top 1000 Google search terms for a year's worth of searches.

I agree to some extent but here we are talking about the reference API
rather than guides etc which means we can organise it to educate users
about the features that SymPy has rather than just try to draw the
right clicks. I actually think that a lot of the time I know what
users want more than they do, or in other words they are searching for
the wrong solution to their problem. We can try to approach this in
different ways but I'm interested to hear from people who are making
good use of SymPy whether they feel that something is missing or not
prominent enough in the docs.

Another part of this is just making it clear to users what SymPy can
do. The organisation of the docs should make that clear. SymPy has
significant capabilities to solve problems involving polynomials.
Those capabilities exceed many other parts of SymPy by *miles* but yet
are very much unemphasised by the docs.

--
Oscar

Aaron Meurer

Jun 8, 2022, 8:16:34 PM
to sy...@googlegroups.com
On Wed, Jun 8, 2022 at 5:29 PM Oscar Benjamin <oscar.j....@gmail.com> wrote:
> On Wed, 8 Jun 2022 at 23:55, Jeremy Monat <jem...@gmail.com> wrote:
> >
> > Google searches seem like "a reasonable metric for what's relevant to users." I posted the top 1000 Google search terms for a year's worth of searches.
>
> I agree to some extent but here we are talking about the reference API
> rather than guides etc which means we can organise it to educate users
> about the features that SymPy has rather than just try to draw the
> right clicks. I actually think that a lot of the time I know what
> users want more than they do, or in other words they are searching for
> the wrong solution to their problem. We can try to approach this in
> different ways but I'm interested to hear from people who are making
> good use of SymPy whether they feel that something is missing or not
> prominent enough in the docs.

That's a good point. People who already know what they are looking for will usually use search (either Google or the built-in search in the docs). So the organization should optimize for people who are "just browsing", so to speak. We already somewhat do this by emphasizing some important features at the top-level like code generation. 

Although I must say I personally have been making use of the new sidebar to get what I want, because I know where to look and it's much faster than search, which can be quite unreliable. As an aside, just to give an idea of how bad search results can be: the other day I saw someone post a live video on Twitter of them coding with SymPy. They searched for "sympy avoid piecewise results from integrate", which led them to http://omz-software.com/pythonista/sympy/modules/integrals/integrals.html, the third result on the page. This is someone else's hosting of the SymPy 0.7.4.1 documentation, a version from 2013!

A real challenge is that while there are a couple of big things that are more common, like solving, there is a long tail of SymPy features that are each used by some people, with none used significantly more than the others. We want to make them visible, but listing them all at the same level puts us back to the same organization we had before.

Aaron Meurer
 
