
mechanize

Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize.

An example:

import re
from mechanize import Browser

b = Browser()
b.open("http://www.example.com/")
# follow second link with element text matching regular expression
response = b.follow_link(text_regex=re.compile(r"cheese\s*shop"), nr=1)
assert b.viewing_html()
print b.title()
print response.geturl()
print response.info()  # headers
print response.read()  # body
response.close()

b.select_form(name="order")
# Browser passes through unknown attributes (including methods)
# to the selected HTMLForm (from ClientForm).
b["cheeses"] = ["mozzarella", "caerphilly"]  # (the method here is __setitem__)
response2 = b.submit()  # submit current form

response3 = b.back()  # back to cheese shop
# the history mechanism uses cached requests and responses
assert response3 is response
# we can still use the response, even though we closed it:
response3.seek(0)
response3.read()
response4 = b.reload()
assert response4 is not response3

for form in b.forms():
    print form
# .links() optionally accepts the keyword args of .follow_/.find_link()
for link in b.links(url_regex=re.compile(r"python\.org")):  # escape the dot so it matches literally
    print link
    b.follow_link(link)  # takes EITHER Link instance OR keyword args
    b.back()
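
The comment in the example about Browser passing unknown attributes through to the selected form can be pictured with a small stand-alone sketch. This is not mechanize's actual code: MiniForm, MiniBrowser and filled_in are hypothetical names invented for illustration. It only shows the general delegation technique, assuming new-style classes (all classes in Python 3), where implicit special-method lookups such as b[...] = ... bypass __getattr__ and so need explicit forwarding methods.

```python
class MiniForm(object):
    """Hypothetical stand-in for a form with named controls."""

    def __init__(self, name):
        self.name = name
        self._values = {}

    def __setitem__(self, control, value):
        self._values[control] = value

    def __getitem__(self, control):
        return self._values[control]

    def filled_in(self):
        # Hypothetical helper: list the controls that have been set.
        return sorted(self._values.items())


class MiniBrowser(object):
    """Hypothetical browser that forwards unknown attributes to a form."""

    def __init__(self, forms):
        self._forms = forms
        self.form = None

    def select_form(self, name):
        for form in self._forms:
            if form.name == name:
                self.form = form
                return
        raise LookupError("no form named %r" % name)

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        form = self.__dict__.get("form")
        if form is None:
            raise AttributeError(attr)
        return getattr(form, attr)

    # Special methods are looked up on the class, not through
    # __getattr__, so item access is forwarded explicitly.
    def __setitem__(self, control, value):
        self.form[control] = value

    def __getitem__(self, control):
        return self.form[control]


b = MiniBrowser([MiniForm("search"), MiniForm("order")])
b.select_form(name="order")
b["cheeses"] = ["mozzarella", "caerphilly"]  # forwarded via __setitem__
selected = b.filled_in()                     # forwarded via __getattr__
```

Forwarding through __getattr__ keeps the browser class small, at the cost of making the set of available methods depend on which form is currently selected.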

Full documentation is in the docstrings.
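
The history behaviour shown in the example (back() returning the very same response object, which can still be read after close()) can be illustrated with a minimal sketch. It is an assumption-laden toy, not how mechanize implements its history: CachedResponse and History are hypothetical classes, and the response body is simply held in memory.

```python
import io


class CachedResponse(object):
    """Hypothetical response whose body stays readable after close()."""

    def __init__(self, url, body):
        self.url = url
        self._buffer = io.BytesIO(body)  # whole body kept in memory
        self.closed = False

    def read(self):
        return self._buffer.read()

    def seek(self, pos):
        self._buffer.seek(pos)

    def close(self):
        # Mark as closed but keep the buffer so history can reuse it.
        self.closed = True


class History(object):
    """Hypothetical stack of visited responses."""

    def __init__(self):
        self._stack = []

    def push(self, response):
        self._stack.append(response)

    def back(self):
        # Drop the current page and return the cached previous one.
        self._stack.pop()
        return self._stack[-1]


history = History()
first = CachedResponse("http://www.example.com/shop", b"cheese shop")
history.push(first)
first.read()
first.close()
history.push(CachedResponse("http://www.example.com/order", b"order placed"))

previous = history.back()  # no new request is made
previous.seek(0)           # rewind the cached body
body = previous.read()
```

Because the previous response is returned from a cache rather than fetched again, going back is cheap; a reload, by contrast, would have to build a fresh response object.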

Thanks to Ian Bicking, for persuading me that a UserAgent class would be useful.

Todo

Download

All documentation (including this web page) is included in the distribution.

This is an alpha release: interfaces may change, and there will be bugs.

Development release.

For installation instructions, see the INSTALL file included in the distribution.

See also

Richard Jones' webunit (this is not the same as Steven Purcell's code of the same name). webunit and mechanize are quite similar. On the minus side, webunit is missing things like browser history, high-level forms and links handling, thorough cookie handling, refresh redirection, adding of the Referer header, observance of robots.txt and easy extensibility. On the plus side, webunit has a bunch of utility functions bound up in its WebFetcher class, which look useful for writing tests (though they'd be easy to duplicate using mechanize). In general, webunit has more of a frameworky emphasis, with aims limited to writing tests, whereas mechanize and the modules it depends on try hard to be general-purpose libraries.

There are many related links in the General FAQ page, too.

FAQs

John J. Lee, January 2005.