PyPI parser eases path to browsing Python package APIs

A new project from Read the Docs aims to automatically generate API documentation from code uploaded to the Python Package Index

Read the Docs, a popular community-supported service for creating easy-to-navigate online documentation for software projects, has unveiled Pydoc, a new service that automatically generates API reference documentation for packages uploaded to the Python Package Index (PyPI).

"There are a specific set of use cases that API reference documentation support," write the project's maintainers in their introductory post to the service, "and the Python community doesn’t support them well." Languages like Go and Ruby already have similar API documentation sites, and now Python users can enjoy the same.

Do you read me?

Pydoc takes packages hosted on PyPI and parses their source code with the Sphinx documentation generation engine. If the source code has inline documentation formatted according to Python's docstring convention, Pydoc generates a browsable, tree-formatted index of all the APIs in the package. APIs with no documentation are listed by function signature alone, with no comments.
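
To make the idea concrete, here is a minimal sketch of docstring-driven API listing using only Python's standard `inspect` module. It is not Pydoc's or Sphinx's actual code; the `outline_api` helper and the throwaway `demo` module are hypothetical names invented for illustration, and the sketch covers only plain functions.

```python
import inspect
import types


def outline_api(module):
    """Return (name, signature, summary) for each public function in a module.

    Hypothetical helper: an illustration of the general idea, not Pydoc's code.
    """
    entries = []
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip names that are private by convention
        if obj.__module__ != module.__name__:
            continue  # skip functions merely imported from elsewhere
        signature = str(inspect.signature(obj))
        doc = inspect.getdoc(obj)  # None when there is no docstring
        summary = doc.splitlines()[0] if doc else ""
        entries.append((name, signature, summary))
    return entries


# Build a small stand-in module so the example runs on its own.
demo = types.ModuleType("demo")
exec(
    'def area(width, height=1):\n'
    '    """Return the area of a rectangle."""\n'
    '    return width * height\n'
    '\n'
    'def undocumented(x):\n'
    '    return x\n'
    '\n'
    'def _private():\n'
    '    pass\n',
    demo.__dict__,
)

for name, signature, summary in outline_api(demo):
    print(f"{name}{signature}  {summary}")
```

The documented function is listed with its signature and the first line of its docstring, the undocumented one with its signature only, and the underscore-prefixed one is left out.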

Read the Docs has typically focused on high-level user documentation for a project, although it has also included API documentation whenever it's provided. But Read the Docs generates material at the behest of the project owners and not automatically from the project's source.

Pydoc's approach differs in two respects. One, it's built automatically from PyPI packages; anything hosted on PyPI with proper docstrings will have its documentation assembled without need for user intervention. Two, it provides a place for seasoned programmers to look up information about a package's API without having to paw through the source code. The source code is still available in its own repository -- or by simply obtaining the PyPI package and unpacking it -- but with Pydoc, the API is easier to read without distractions.

Rough and tumble

Not everything works yet. "This is a very beta release," state the developers in their introductory post. How the service integrates with PyPI and how it handles edge cases like private methods in functions are both still works in progress.

What's more, the current techniques used for generating docs don't scale well. The code has to be imported into a running instance of the Python interpreter to be parsed and traversed.

For smaller libraries, such as the pytz time zone library, this approach isn't so bad. But for anything more sprawling and complex, such as NumPy, where a lot of the functionality is in C libraries, it's impractical. To that end, Pydoc's developers are looking at building custom parsers as a long-term solution, but there's no timetable for when that'll be available.
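
What a parser-based alternative could look like can be sketched with Python's standard `ast` module, which reads source text without importing or running it. This is an assumption about the general technique, not a description of what Pydoc's developers plan to build; `outline_source`, `SAMPLE`, and the `missing_c_extension` import are hypothetical, and the sketch lists positional parameter names only, without defaults.

```python
import ast


def outline_source(source):
    """List (name, signature, summary) for top-level public functions.

    Hypothetical helper: the source is parsed, never imported or executed.
    """
    tree = ast.parse(source)
    entries = []
    for node in tree.body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if node.name.startswith("_"):
            continue  # skip names that are private by convention
        params = ", ".join(arg.arg for arg in node.args.args)
        doc = ast.get_docstring(node)  # None when there is no docstring
        summary = doc.splitlines()[0] if doc else ""
        entries.append((node.name, f"({params})", summary))
    return entries


# The import below would fail if this text were executed; parsing ignores it.
SAMPLE = '''
import missing_c_extension

def convert(value, zone):
    """Convert a value to the given zone."""
    return value

def helper(x):
    return x
'''

for name, signature, summary in outline_source(SAMPLE):
    print(f"{name}{signature}  {summary}")
```

Because nothing is imported, an unavailable dependency does not stop the listing. The trade-off is that a parser like this sees only Python source, so functions implemented in C would need separate handling.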

Copyright © 2016 IDG Communications, Inc.