python cgi webapp using graphviz "can't find" twopi

Jon Crump

Nov 11, 2021, 4:03:02 PM
to pygraphviz-discuss
I'm posting this simultaneously on SO because I've no confidence that the problem I'm having has anything to do with pygraphviz, but perhaps the assembled wise here will have some insight.

I'm developing this web application locally on my new MacBook Pro (macOS 11.6) using the local Apache server, which I've configured to run .py files in the relevant directories as CGI programs.

The relevant working parts are these:

* the graphviz binaries are installed: /opt/homebrew/bin/twopi:  
twopi - graphviz version 2.49.2 (20211016.1639)
* pygraphviz 1.7
* Python 3.9.7: /opt/homebrew/opt/python@3.9/bin/python3.9
* rdflib version : '6.03a'

The aim of this application, driven by a Python CGI script, is to retrieve some data from the local file system and some RDF data from an AllegroGraph instance on the web using Python's `requests` module, and then to lay out and display a graph visualization in a web page using Graphviz and Python's pygraphviz module.

The JavaScript makes a GET request like this:
```javascript
function graphMe(charter){
    $.ajax({
        type: "get",
        url: "cartametallon.py",
        data: {"graphMe": charter},
        dataType: 'json',
        success: deploySVG,
        error: function(jqXHR, textStatus, errorThrown) {
            console.log(jqXHR.responseText, textStatus, errorThrown);
        }
    });
}
```

The python cgi script fields this request using the cgi module like this:

```python
import cgi
import cgitb
import json
import sys
import traceback

# Report unhandled errors as plain text rather than HTML.
cgitb.enable(format="text")
form = cgi.FieldStorage()

try:
    if 'graphMe' in form:
        charter = form.getvalue('graphMe')
        uri = "<http://chartex.org/graphid/" + charter + ">"

        # visualizeDocumentGraph is defined elsewhere in cartametallon.py.
        print ("Content-Type: application/json\r\n\r\n")
        print (json.dumps(visualizeDocumentGraph(uri)))

except Exception:
    # Send the traceback back to the browser as plain text.
    print ("Content-Type: text/plain\n")
    print("Exception in user code:")
    print("~"*20, __file__.split('/')[-1], "~"*20)
    traceback.print_exc(file=sys.stdout)
    print("~"*60)
```

The `visualizeDocumentGraph` function assembles graph data and metadata from several sources and stores it in a `dict`, which should then be returned to the referring page as a JSON object. One of the things stored in this object is an SVG string of the graph as laid out by Graphviz's `twopi` layout engine. I've verified that each element of this Python function works as expected, and when run at the command line it returns the expected object; however, the response to the `jQuery.ajax()` request looks like this:

```python
Content-Type: text/plain

Exception in user code:
~~~~~~~~~~~~~~~~~~~~ cartametallon.py ~~~~~~~~~~~~~~~~~~~~
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1344, in _get_prog
    runprog = self._which(prog)
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1800, in _which
    raise ValueError(f"No prog {name} in path.")
ValueError: No prog twopi in path.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "~/Sites/cartametallon/cartametallon.py", line 661, in <module>
    print (json.dumps(visualizeDocumentGraph(uri)))
  File "~/Sites/cartametallon/cartametallon.py", line 292, in visualizeDocumentGraph
    dgsvg = makedot(g).draw(format='svg', prog='twopi')
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1596, in draw
    data = self._run_prog(prog, args)
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1360, in _run_prog
    runprog = r'"%s"' % self._get_prog(prog)
  File "/opt/homebrew/lib/python3.9/site-packages/pygraphviz/agraph.py", line 1346, in _get_prog
    raise ValueError(f"Program {prog} not found in path.")
ValueError: Program twopi not found in path.
```

This is the puzzle, then: in the CGI interaction it appears that the `twopi` program can't be found, but run on its own, my Python script has no trouble. The `twopi` binary is installed at `/opt/homebrew/bin/twopi` and is readily accessible via the `PATH` environment variable:
```
% echo $PATH
~/opt/anaconda3/condabin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
```

It's clear that my Python script knows about it too. Not only does it execute it successfully when the script runs on its own, it knows explicitly where to find it:

```
>>> os.get_exec_path()
['~/opt/anaconda3/condabin', '/opt/homebrew/bin', '/opt/homebrew/sbin', '/usr/local/bin', '/usr/bin', '/bin', '/usr/sbin', '/sbin']
```

I can't get my Python script to return the JSON I need for the web page, argh! This is made all the more maddening by the fact that what I'm trying to do is refactor and update an existing application that runs just fine at https://neolography.com/chartex/. This working program was recently transferred to a new web host, and that required a little tinkering to get it working again. The graph output is not as good as on my old host because A2 Hosting insisted on installing an ancient version of Graphviz (don't ask). So, the relevant working parts of the working program are these:

* twopi - graphviz version 2.30.1 (20201013.1554)
* pygraphviz 1.5
* Python 2.7.18 (default, Jul  8 2021, 01:00:23)  
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux2  
(this python had to be running in a virtual env so that I could install this:)
* RDFLib Version: 5.0.0



Jon Crump

Nov 13, 2021, 10:28:34 PM
to pygraphviz-discuss
It took me days, but I finally came to understand that the `PATH` environment variable I see when I run a script at the command line is quite different from the same variable in the context of an executing CGI program.

```bash
% echo $PATH
~/opt/anaconda3/condabin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
```
and

```python
>>> os.environ["PATH"]
'~/opt/anaconda3/condabin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin'
```

are not the same as the `os.environ["PATH"]` that the CGI script sees when Apache runs it.

Worse, I was conflating the `$PATH` variable with Python's `sys.path`, which has quite a different purpose: it lists where Python looks for modules, not where the system looks for executables. My CGI program, using pygraphviz, was trying to execute `twopi`, a binary in `/opt/homebrew/bin`, and in the context of an executing CGI program, the `PATH` variable available to it looks like this:

```python
"/usr/bin:/bin:/usr/sbin:/sbin:"
```
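One way to see this difference directly is to have the CGI process report its own environment. The following is a minimal sketch, not part of the original script: `describe_exec_env` is a hypothetical helper name, and it relies only on the standard library. Saved as its own `.py` file in a CGI-enabled directory and requested through the browser, it should show the values the web server actually hands to the script; run from a terminal, it shows the shell's values for comparison.

```python
import os
import shutil
import sys


def describe_exec_env(program="twopi"):
    """Gather what decides whether `program` can be launched from here."""
    # os.getuid is POSIX-only, so fall back to None elsewhere.
    getuid = getattr(os, "getuid", lambda: None)
    return {
        # Numeric id of the user the process runs as.
        "uid": getuid(),
        # Directories searched for executables such as twopi.
        "PATH": os.environ.get("PATH", ""),
        # Absolute path of the program on that PATH, or None if absent.
        "which": shutil.which(program),
        # Directories searched for Python modules; unrelated to PATH.
        "sys.path": list(sys.path),
    }


if __name__ == "__main__":
    # Minimal CGI response so the report can be read in a browser.
    print("Content-Type: text/plain\n")
    for key, value in describe_exec_env().items():
        print(f"{key}: {value}")
```

If `which` comes back as `None` under the web server but not in the terminal, the executable search path is the culprit rather than pygraphviz or `sys.path`.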

It's easy enough to add the necessary path to that variable within the cgi program like this:
```python
os.environ["PATH"] = f"{os.environ['PATH']}:/opt/homebrew/bin"
```
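A slightly more defensive variant of that one-liner is sketched below. The helper name `add_to_path` is hypothetical. It differs from the line above in two ways: it skips the append if the directory is already listed, and it drops empty entries. The second point is relevant here because the CGI `PATH` shown earlier ends in a colon, and on POSIX systems an empty `PATH` entry is treated as the current directory.

```python
import os


def add_to_path(directory, environ=None):
    """Append `directory` to PATH unless it is already listed."""
    if environ is None:
        environ = os.environ
    # Drop empty entries, such as the one left by a trailing separator.
    entries = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.append(directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]


# Call once, before pygraphviz is asked to draw anything.
add_to_path("/opt/homebrew/bin")
```

Appending rather than prepending keeps the system directories first, so a Homebrew binary cannot shadow a system one of the same name; prepending would be the choice if the opposite priority is wanted.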

And that solved my problem. But I'm still uneasy: this seems like a hacky approach. I still don't fully understand why the `PATH` that Apache gives the CGI program is different from the `$PATH` available to the Python interpreter or to a script run at the command line. I gather that the `PATH` available to a CGI program is different because Apache executes it as a different user (`_www`).

It would be good to know if there's a more canonical way to solve this problem. Any suggestions as to documentation that would clarify my understanding of this issue would be gratefully received.
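One alternative that avoids changing the process-wide `PATH` is to locate the Graphviz executable explicitly and run it directly. Whether this counts as more canonical is a judgment call. The sketch below is illustrative only: `EXTRA_DIRS`, `find_program`, and `render_svg` are hypothetical names, the directory list is an assumption about where Graphviz might be installed, and the approach bypasses pygraphviz's `draw()` by feeding DOT source (for example, the string returned by an `AGraph`'s `to_string()`) to the layout program on standard input.

```python
import os
import shutil
import subprocess

# Assumption: places Graphviz might be installed; adjust for the host.
EXTRA_DIRS = ["/opt/homebrew/bin", "/usr/local/bin"]


def find_program(name, extra_dirs=EXTRA_DIRS):
    """Return the path of `name`, searching PATH and then extra_dirs."""
    current = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    search = os.pathsep.join(current + list(extra_dirs))
    path = shutil.which(name, path=search)
    if path is None:
        raise FileNotFoundError(f"{name} not found in {search}")
    return path


def render_svg(dot_source, prog="twopi", extra_dirs=EXTRA_DIRS):
    """Lay out DOT source with a Graphviz program and return SVG text."""
    result = subprocess.run(
        [find_program(prog, extra_dirs), "-Tsvg"],
        input=dot_source,      # Graphviz reads the graph from stdin
        capture_output=True,   # and writes the SVG to stdout
        text=True,
        check=True,            # raise CalledProcessError on failure
    )
    return result.stdout
```

With this in place, a call such as `render_svg(graph.to_string())` would stand in for `graph.draw(format='svg', prog='twopi')`, and a missing binary surfaces as a `FileNotFoundError` that names the directories searched.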

