.NET and Python Integration Problem and PDF Library (Need Help and Suggestions)

Ravi Kumar

Dec 18, 2007, 8:34:16 AM

to pytho...@python.org

Hi.
First, let me explain the problem so that nothing gets mixed up. :)

============== PROBLEM ================
I have to integrate a small component into a .NET project. The project
is a web-based application; its user interface is a set of web pages
for various actions.
A backend component of the project needs PDF file manipulation. The
manipulation is limited to splitting pages, removing certain pages,
joining pages, finding certain information (title, author, number of
pages, orientation of each page, etc.), and changing page orientation
(rotating pages clockwise and anticlockwise by a given number of
degrees, left/right/flip, etc.).
The .NET guys are failing here, so I proposed doing that component in
Python. The problem is how to integrate Python into a .NET web
application. I am looking at IronPython, and that is the only option
I can see. Also, I am not an expert in Python; I am moving to it from
Perl and PHP. So right now my application architecture would be a bit
inexperienced and not enterprise ready, while every other component in
the project is being built to enterprise grade.
Now, I want to integrate a Python implementation of the PDF work that
would be called from .NET (C#) processes, which supply the required
parameters. Since the whole application will run on MS Windows Server,
I do not have the freedom of library choice that I would have on
Linux.

I am also looking for the best PDF library, one that has no or few
dependencies, all of which can be installed successfully. I will also
need an XML library to write the logs and the instructions and
information for the next component, which handles printing.

=====================================

So I would appreciate any advice, even a one-liner. Please suggest
whatever comes to mind. The high-priority items right now are:

--
-=Ravi=-

Ravi Kumar

Dec 18, 2007, 8:42:59 AM

to pytho...@python.org

Continuing my last mail [pressing tab+space sent it too early :( ]

Things on high priorities right now are:

- How to call Python from .NET
- Any optimizations that would keep the IronPython interpreter calls
from overburdening the application, if that is a problem at all
- Pointers to good resources
- Any steps that would make the components enterprise grade in this
kind of situation
- Your opinion on the available PDF libraries and which are the best.
Also which library to use on a Windows server platform (there is a
limit on installing long chains of libraries that pull in other deep
dependencies). A pure Python PDF library would be good, but which
one?
- Which XML library is pure Python?

More questions will follow :)
I hope people will reply.

--
-=Ravi=-

Joshua Kugler

Dec 18, 2007, 2:01:11 PM

to pytho...@python.org

Ravi Kumar wrote:
> - your opinion with available PDF Libraries, that are best among. Also
> which library to use for Windows server platform (there is limitation
> on installing long chain libraries that include other deep
> dependencies too). A pure python PDF library would be good, but which
> one.
> -Which XML Library is pure python based.

Never done Python/.NET integration, so I can't help you there.

PDF library: ReportLab. But that is mostly for generating PDFs. For
reading, splitting, etc., you may have to look at their commercial
offering. And I do believe it is all pure Python.

XML: ElementTree (http://effbot.org/zone/element-index.htm) should be
all you need.
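
As a rough illustration of the ElementTree suggestion, the sketch below
writes a small XML job record of the kind the original post describes
(a log plus instructions for the printing component). The element names
(`job`, `source`, `pages`, `page`, `log`, `entry`) and the
`build_job_xml` helper are made up for this example; nothing in the
thread defines a schema. It uses `xml.etree.ElementTree`, the copy of
ElementTree that ships with Python since 2.5, and is written for
current CPython 3.

```python
import xml.etree.ElementTree as ET


def build_job_xml(source, pages, messages):
    """Return an XML string describing one processed PDF.

    source   -- file name of the PDF (hypothetical field)
    pages    -- list of (page_number, rotation_in_degrees) tuples
    messages -- list of log lines for the next component
    """
    job = ET.Element("job")
    ET.SubElement(job, "source").text = source

    # One <page> element per page, with its number and rotation as attributes.
    pages_el = ET.SubElement(job, "pages", count=str(len(pages)))
    for number, rotation in pages:
        ET.SubElement(pages_el, "page", number=str(number), rotation=str(rotation))

    # Free-text log entries for whoever consumes the file next.
    log_el = ET.SubElement(job, "log")
    for line in messages:
        ET.SubElement(log_el, "entry").text = line

    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_job_xml("report.pdf", [(1, 0), (2, 90)], ["rotated page 2"]))
```

The consumer can read the record back with `ET.fromstring()` and
`find()`, so both sides need only the standard library.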

Hope that helps some.

j

Boris Borcic

Dec 18, 2007, 7:40:11 PM

to pytho...@python.org

Ravi Kumar wrote:
> - your opinion with available PDF Libraries, that are best among. Also
> which library to use for Windows server platform (there is limitation
> on installing long chain libraries that include other deep
> dependencies too). A pure python PDF library would be good, but which
> one.

Everybody will tell you "reportlab", but AFAIK the open-source kit does not
provide manipulation of existing PDF file contents - "only" creation. Besides,
it's targeted at CPython and it isn't 100% clear it runs perfectly on other
implementations; so have a look at iTextSharp - it might better fit your needs
if you decide to use IronPython.

Waldemar Osuch

Dec 18, 2007, 8:38:23 PM

On Dec 18, 6:42 am, "Ravi Kumar" <ra2...@gmail.com> wrote:
> In continuation of last mail [since pressing tab+space sent the mail :( ]
> Things on high priorities right now are:
> - How to integrate Python calling from .NET

If I had to do something like this I would host a Python web
server listening on some port and let the .NET application talk to it
using HTTP requests, preferably, or SOAP if I really had to.
It could be a Paste, CherryPy or Twisted based server.
Google for instructions on how to package it with py2exe
as a Windows service.
The service could be hosted on the same server as .NET or on a
separate box.
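
A minimal sketch of the local-HTTP-service idea follows. It uses the
standard-library `http.server` module rather than Paste, CherryPy or
Twisted, purely to stay dependency-free, and it targets current
CPython 3. The `/rotate` path, the `file` and `degrees` parameters and
the JSON reply are invented for illustration; the handler only echoes
the request instead of touching a PDF, and `call_rotate` stands in for
the request the .NET side would send.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode, urlparse


class PdfJobHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/rotate":
            self.send_error(404, "unknown operation")
            return
        query = parse_qs(url.query)
        # A real service would do the PDF work here; the sketch echoes the job.
        reply = {
            "operation": "rotate",
            "file": query.get("file", [""])[0],
            "degrees": int(query.get("degrees", ["0"])[0]),
        }
        body = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_server(port=0):
    """Serve on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), PdfJobHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_rotate(port, file_name, degrees):
    """Stand-in for the HTTP request the .NET application would make."""
    query = urlencode({"file": file_name, "degrees": degrees})
    url = "http://127.0.0.1:%d/rotate?%s" % (port, query)
    # Ignore any system proxy settings for the loopback call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    server = start_server()
    print(call_rotate(server.server_address[1], "report.pdf", 90))
    server.shutdown()
    server.server_close()
```

Binding to 127.0.0.1 keeps the service private to the machine, which
matches the "same server as .NET" variant of the suggestion.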

> - Any suggestions for optimizations that would prevent overburden to
> application due to IronPython interpretation calling, if any, or does
> such things happen.
> - Pointers to good resources
> - Any step in such kind of situation, so to make it Enterprise Grade
> application components

I do not have an Enterprise Grade Seal of Approval, but a similar setup
has been running successfully at my workplace for the last 2 years.

> - your opinion with available PDF Libraries, that are best among. Also
> which library to use for Windows server platform (there is limitation
> on installing long chain libraries that include other deep
> dependencies too). A pure python PDF library would be good, but which
> one.

http://pybrary.net/pyPdf/
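
For a sense of what the requested operations (removing and rotating
pages, reading title/author/page count) look like in code, here is a
hedged sketch. It is written against `pypdf`, the maintained successor
of pyPdf, on current CPython; the pyPdf release linked above used
different class and method names, so treat the `PdfReader`/`PdfWriter`
calls as an assumption about the installed version. The helper names
(`pages_to_keep`, `normalize_rotation`, `rewrite_pdf`) are made up for
the example. The two planning helpers are plain Python; only
`rewrite_pdf` needs the library.

```python
def pages_to_keep(total_pages, remove):
    """Return 0-based indexes of surviving pages, given 1-based numbers to remove."""
    unwanted = set(remove)
    return [i for i in range(total_pages) if (i + 1) not in unwanted]


def normalize_rotation(degrees):
    """Map any multiple of 90 (negative = anticlockwise) onto 0, 90, 180 or 270."""
    if degrees % 90 != 0:
        raise ValueError("PDF pages can only be rotated in 90 degree steps")
    return degrees % 360


def rewrite_pdf(src_path, dst_path, remove=(), rotate=None):
    """Copy src_path to dst_path, dropping and rotating pages.

    rotate maps 1-based page numbers to degrees. Requires the third-party
    pypdf package (an assumption; the thread links the older pyPdf).
    """
    from pypdf import PdfReader, PdfWriter  # lazy import: helpers work without it

    rotate = rotate or {}
    reader = PdfReader(src_path)
    writer = PdfWriter()
    kept = pages_to_keep(len(reader.pages), remove)
    for index in kept:
        page = reader.pages[index]
        angle = normalize_rotation(rotate.get(index + 1, 0))
        if angle:
            page.rotate(angle)  # clockwise rotation
        writer.add_page(page)
    with open(dst_path, "wb") as out:
        writer.write(out)

    info = reader.metadata  # may be None when the file has no document info
    return {
        "title": info.title if info else None,
        "author": info.author if info else None,
        "pages_in": len(reader.pages),
        "pages_out": len(kept),
    }
```

A call such as `rewrite_pdf("in.pdf", "out.pdf", remove=[2], rotate={1: -90})`
would drop page 2 and turn page 1 a quarter turn anticlockwise; the
file names are placeholders.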

> -Which XML Library is pure python based.
>

ElementTree

Ravi Kumar

Dec 19, 2007, 1:36:39 AM

to pytho...@python.org

> Joshua:

> PDF library: ReportLab. But that is most generation of PDFs. For reading,
> splitting, etc, you may have to look at their commercial offering

I will have to research it. I think it is time to list the best
available libs and find each one's dependencies.

> On Dec 19, 2007 6:10 AM, Boris Borcic <bbo...@gmail.com> wrote:
> Everybody will tell you "reportlab", but AFAIK the open-source kit does not
> provide manipulation of existing pdf file contents - "only" creation. Besides
> it's targeted at CPython and it isn't 100% clear it runs perfectly on other
> implementations; so have a look at itextsharp - it might better fit your needs
> if you decide to use IronPython.
>

iTextSharp looks very promising, but I am unable to find any good
documentation. I think I will have to buy the book so that I have the
API docs right beside me for reference.

One thing I am stuck on right now: IronPython 1.0.2467 on .NET
2.0.50727.42 is installed on my Ubuntu machine. Can CPython 2.5
libraries be called from it without any problem?
In the ipy shell:
>>> sys.path
['/home/rskumar', '/usr/lib/ironpython/Lib', '/usr/lib/python2.4',
'/usr/lib/python2.4/site-packages']
>>>

So it uses the Python 2.4 libs. I am afraid of mixing the 2.4 and 2.5
libs. I would need to load the CPython 2.5 libs by adding them in
IronPython's site.py file. Maybe I should experiment and let you all
know how it goes.
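
One way to run that experiment without editing `site.py` is to extend
`sys.path` at startup, as sketched below. The `/usr/lib/python2.5` path
is a guess at where Ubuntu keeps the CPython 2.5 library, and
`add_library_dir` is a made-up helper. This only changes where modules
are looked up; it says nothing about whether a given 2.5 module
actually runs under IronPython 1.0, and modules that depend on C
extensions would not be expected to load there.

```python
import os
import sys


def add_library_dir(path, first=False):
    """Add path to sys.path if it exists and is not already listed.

    Returns True when the path was added.
    """
    if not os.path.isdir(path) or path in sys.path:
        return False
    if first:
        sys.path.insert(0, path)   # searched before everything else
    else:
        sys.path.append(path)      # searched only after the existing entries
    return True


if __name__ == "__main__":
    # Hypothetical location of the CPython 2.5 standard library.
    added = add_library_dir("/usr/lib/python2.5")
    print("added" if added else "not added")
```

Appending (the default here) lets the existing 2.4 entries win on name
clashes, while `first=True` gives the new directory priority; which
order is safer for mixed 2.4/2.5 libraries is exactly what the
experiment would have to show.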

--
-=Ravi=-

Ravi Kumar

Dec 19, 2007, 1:43:01 AM

to pytho...@python.org

>
> If I had to do something like this I would host a Python web
> server listening on some port and let .NET application talk to me
> using HTTP requests preferably or SOAP if I really had to.
> It could be a Paster or Cherrypy or Twisted based server.
> Google for instruction on how to package it using py2exe
> into a windows service.
> The service could be hosted on the same server as .NET or a separate
> box

That would be the best solution for me too, but the restriction is that
I can't run an extra service for it. I need tight integration and no
remote service calls, which is why it seems a bit tricky to me, and why
I am working on integrating the Python implementation into .NET itself.
So wherever the .NET application has to deal with PDFs, it will
invoke the IronPython engine, pass the required configuration and
parameters, and get back the result, exceptions, etc.
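
The call pattern described here (the host passes parameters in and
gets a result or an error back) can be sketched on the Python side as a
single entry function. Everything below is hypothetical: `run_job`, the
operation names and the dictionary layout are invented, and the C#
hosting code that would call it is not shown. The point is only that
errors come back as data instead of as an exception crossing the
language boundary.

```python
import traceback


def rotate(params):
    """Validate a rotate request and return the normalized plan."""
    degrees = int(params["degrees"])
    if degrees % 90 != 0:
        raise ValueError("degrees must be a multiple of 90")
    # A real implementation would rotate pages here; the sketch only reports the plan.
    return {"file": params["file"], "degrees": degrees % 360}


# Table of supported operations (names invented for the example).
OPERATIONS = {"rotate": rotate}


def run_job(operation, params):
    """Run one operation and always return a dict with ok/result/error keys."""
    try:
        handler = OPERATIONS[operation]
    except KeyError:
        return {"ok": False, "result": None,
                "error": "unknown operation: %s" % operation}
    try:
        return {"ok": True, "result": handler(params), "error": None}
    except Exception as exc:
        # Report the failure as data, with the traceback kept for the logs.
        return {
            "ok": False,
            "result": None,
            "error": "%s: %s" % (type(exc).__name__, exc),
            "trace": traceback.format_exc(),
        }


if __name__ == "__main__":
    print(run_job("rotate", {"file": "a.pdf", "degrees": "-90"}))
    print(run_job("rotate", {"file": "a.pdf", "degrees": 45})["error"])
```

Keeping one entry point and one return shape means the host only has to
understand a single contract, whatever the number of PDF operations
added later.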

>
> http://pybrary.net/pyPdf/

Noted for my reference. Thanks :)

> > -Which XML Library is pure python based.
> >
>
> ElementTree

So ElementTree solves one part. Thanks, friend.

--
-=Ravi=-

Lua...@gmail.com

Dec 20, 2007, 9:27:38 AM

This isn't a "python" reply, but for .NET PDF manipulation, you might
look at http://www.pdfbox.org/userguide/dot_net.html