Class Mechanize::HTTP::Agent
In: lib/mechanize/http/agent.rb
Parent: Object

An HTTP (and local disk access) user agent. This class is an implementation detail and is subject to change at any time.

Headers

Attributes

conditional_requests  [RW]  Disables If-Modified-Since conditional requests (enabled by default)
gzip_enabled  [RW]  Is gzip compression of requests enabled?
request_headers  [RW]  A hash of request headers to be used for every request
user_agent  [R]  The User-Agent header to send

History

Attributes

history  [RW]  History of requests made

Hooks

Attributes

content_encoding_hooks  [R]  A list of hooks to call to handle the content-encoding of a request.
post_connect_hooks  [R]  A list of hooks to call after retrieving a response. Hooks are called with the agent and the response returned.
pre_connect_hooks  [R]  A list of hooks to call before making a request. Hooks are called with the agent and the request to be performed.
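
The hook lists can be pictured as plain arrays of callables. The following is a minimal sketch in plain Ruby, not Mechanize's implementation: HookRunner is a hypothetical stand-in for the agent, and a Hash stands in for the request object. It only illustrates the calling convention described above, where each pre-connect hook receives the agent and the request.

```ruby
# Hypothetical stand-in for an agent; this is not Mechanize's class.
class HookRunner
  attr_reader :pre_connect_hooks

  def initialize
    @pre_connect_hooks = []
  end

  # Call every registered hook, in order, with the agent (self) and the request.
  def pre_connect(request)
    @pre_connect_hooks.each { |hook| hook.call(self, request) }
  end
end

runner = HookRunner.new
seen = []

# First hook mutates the request; second hook observes the result.
runner.pre_connect_hooks << lambda { |_agent, request| request["X-Trace"] = "1" }
runner.pre_connect_hooks << lambda { |_agent, request| seen << request.keys.sort }

request = { "User-Agent" => "example" } # a Hash stands in for a request object
runner.pre_connect(request)
```

Because hooks run in registration order, a later hook sees any changes an earlier hook made to the request.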

HTTP Authentication

Redirection

Attributes

follow_meta_refresh  [RW]  Follow HTML meta refresh and HTTP Refresh. If set to :anywhere, meta refresh tags outside of the head element will be followed.
follow_meta_refresh_self  [RW]  Follow an HTML meta refresh that has no "url=" in the content attribute.

Defaults to false to prevent infinite refresh loops.

redirect_ok  [RW]  Controls how this agent deals with redirects. The following values are allowed:
  :all, true:  All 3xx redirects are followed (default)
  :permanent:  Only 301 Moved Permanently redirects are followed
  false:  No redirects are followed
redirection_limit  [RW]  Maximum number of redirects to follow
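
The redirect_ok values listed above amount to a small decision table. A minimal sketch, assuming a hypothetical helper named follow_redirect? (it is not part of Mechanize) that takes the setting and a numeric HTTP status:

```ruby
# Hypothetical helper: decide whether a response status should be followed
# as a redirect for a given redirect_ok setting.
def follow_redirect?(redirect_ok, status)
  # Only 3xx statuses are candidates at all.
  return false unless (300..399).cover?(status)

  case redirect_ok
  when true, :all then true          # follow every 3xx redirect
  when :permanent then status == 301 # follow only 301 Moved Permanently
  else false                         # false (or anything else): follow none
  end
end
```

For instance, under this sketch a 302 response is followed with :all but not with :permanent.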

Robots

Attributes

robots  [R]  When true, this agent will consult the site's robots.txt for each access.

SSL

Attributes

pass  [RW]  OpenSSL key password

Timeouts

Attributes

keep_alive  [RW]  Set to false to disable HTTP/1.1 keep-alive requests
open_timeout  [RW]  Length of time to wait until a connection is opened in seconds
read_timeout  [RW]  Length of time to attempt to read data from the server

Attributes

cookie_jar  [RW]  The cookies for this agent
max_file_buffer  [RW]  Responses larger than this will be written to a Tempfile instead of stored in memory. Setting this to nil disables creation of Tempfiles.

Utility

Attributes

context  [RW]  The context parses responses into pages
ignore_bad_chunking  [RW]  When set to true, mechanize will ignore an EOF during chunked transfer encoding so long as at least one byte was received. Be careful when enabling this, as it may cause data loss.
scheme_handlers  [RW]  Handlers for various URI schemes

Public Class methods

Creates a new Mechanize HTTP user agent. The user agent is an implementation detail of mechanize and its API may change at any time.

Public Instance methods

Adds credentials user, pass for uri. If realm is set the credentials are used only for that realm. If realm is not set the credentials become the default for any realm on that URI.

domain and realm are exclusive as NTLM does not follow RFC 2617. If domain is given it is only used for NTLM authentication.
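
The realm fallback described above can be sketched as a two-level lookup. This is an illustration only: CredentialStore and its methods are hypothetical, URIs are compared as plain strings, and the NTLM domain case is left out.

```ruby
# Hypothetical credential store; not Mechanize's implementation.
class CredentialStore
  def initialize
    @creds = {} # uri string => { realm (or nil for the default) => [user, pass] }
  end

  # Store credentials for a URI. A nil realm marks them as the default
  # for any realm on that URI.
  def add(uri, user, pass, realm = nil)
    (@creds[uri] ||= {})[realm] = [user, pass]
  end

  # Prefer realm-specific credentials, then fall back to the URI's default.
  def lookup(uri, realm)
    by_realm = @creds.fetch(uri, {})
    by_realm[realm] || by_realm[nil]
  end
end

store = CredentialStore.new
store.add("http://example.test/", "default_user", "secret")
store.add("http://example.test/", "admin_user", "hunter2", "Admin Area")
```

With these entries, a challenge for the "Admin Area" realm gets the admin credentials, while any other realm on the same URI gets the default pair.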

Retrieves uri and parses it into a page or other object according to PluggableParser. If the URI is an HTTP or HTTPS scheme URI the given HTTP method is used to retrieve it, along with the HTTP headers, request params and HTTP referer.

redirects tracks the number of redirects experienced when retrieving the page. If it is over the redirection_limit an error will be raised.
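
The redirect counter can be sketched as a recursive fetch that passes an incremented count along with each hop. The names below (resolve, RedirectLimitExceeded, and the table of fake redirects) are assumptions for illustration and do not come from Mechanize.

```ruby
# Hypothetical error class for this sketch.
class RedirectLimitExceeded < StandardError; end

# table maps a URL to the URL it redirects to; a URL missing from the
# table is treated as a final page. redirects counts hops taken so far.
def resolve(url, table, redirection_limit, redirects = 0)
  if redirects > redirection_limit
    raise RedirectLimitExceeded, "more than #{redirection_limit} redirects"
  end

  target = table[url]
  return url unless target # no further redirect: this is the final page

  resolve(target, table, redirection_limit, redirects + 1)
end
```

The check uses "over the limit" as the failure condition, matching the description above, so a limit of 2 still allows exactly two hops.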

URI for a proxy connection

Retry non-idempotent requests?

Retry non-idempotent requests

Headers

Public Instance methods

History

Public Instance methods

Equivalent to the browser back button. Returns the most recent page visited.

Returns the latest page loaded by the agent

Returns a visited page for the url passed in, otherwise nil

Hooks

Public Instance methods

Invokes hooks added to post_connect_hooks after a response is returned and the response body is handled.

Yields the context, the uri for the request, the response and the response body.

Invokes hooks added to pre_connect_hooks before a request is made. Yields the agent and the request that will be performed to each hook.

Request

Response

Robots

Public Instance methods

Tests if this agent is allowed to access url, consulting the site's robots.txt.

Returns an error object if there is an error in fetching or parsing robots.txt of the site url.

Raises the error if there is an error in fetching or parsing robots.txt of the site url.

Removes robots.txt cache for the site url.

SSL

Public Instance methods

Path to an OpenSSL CA certificate file

Sets the path to an OpenSSL CA certificate file

The SSL certificate store used for validating connections

Sets the SSL certificate store used for validating connections

Sets the client certificate to given X509 certificate. If a path is given the certificate will be loaded and set.

An OpenSSL private key or the path to a private key

Sets the client's private key

SSL version to use

Sets the SSL version to use

A callback for additional certificate verification. See OpenSSL::SSL::SSLContext#verify_callback

The callback can be used for debugging or to ignore errors by always returning true. Specifying nil uses the default method that was valid when the SSLContext was created.
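
As a point of reference, this is roughly what such a callback looks like when set directly on an OpenSSL::SSL::SSLContext from Ruby's standard library. It is a sketch of the debugging use case only: the failures array is an assumption for illustration, and how the agent passes the callback through to its connections is not shown.

```ruby
require "openssl"

ctx = OpenSSL::SSL::SSLContext.new
ctx.verify_mode = OpenSSL::SSL::VERIFY_PEER

failures = [] # collects error strings for inspection (illustrative only)

# OpenSSL calls this once per certificate in the chain with the result of
# its own check and the store context. Returning preverify_ok keeps the
# default verdict; always returning true would ignore verification errors.
ctx.verify_callback = lambda do |preverify_ok, store_ctx|
  failures << store_ctx.error_string unless preverify_ok
  preverify_ok
end
```

Returning the value OpenSSL already computed keeps verification strict while still recording why a certificate was rejected.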

How to verify SSL connections. Defaults to VERIFY_PEER

Sets the mode for verifying SSL connections

Timeouts

Public Instance methods

Reset connections that have not been used in this many seconds

Sets the connection idle timeout for persistent connections

Utility

Public Instance methods

Creates a new output IO by reading input_io in read_size chunks. If the output is over the max_file_buffer size a Tempfile with name is created.

If a block is provided, each chunk of input_io is yielded for further processing.
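
The buffering behaviour described above can be sketched in plain Ruby as follows. The helper name buffer_io and its argument order are assumptions, and the real method may differ in detail; the sketch only shows the idea of starting in memory and switching to a Tempfile once the size passes max_file_buffer.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper: copy input_io into a new IO in read_size chunks.
# Output starts as an in-memory StringIO and moves to a Tempfile once it
# grows past max_file_buffer. A nil max_file_buffer never creates a Tempfile.
def buffer_io(input_io, read_size, max_file_buffer, name = "buffer-sketch")
  out = StringIO.new

  while (chunk = input_io.read(read_size))
    yield chunk if block_given? # let the caller process each chunk
    out.write(chunk)

    next unless max_file_buffer && out.is_a?(StringIO) && out.size > max_file_buffer

    # Over the limit: move what has been buffered so far onto disk.
    tmp = Tempfile.new(name)
    tmp.binmode
    tmp.write(out.string)
    out = tmp
  end

  out.rewind
  out
end

chunks = []
small = buffer_io(StringIO.new("abc"), 2, 10) { |c| chunks << c }
large = buffer_io(StringIO.new("a" * 100), 16, 32)
```

Here small stays in memory, while large ends up backed by a Tempfile because its 100 bytes exceed the 32-byte limit.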

Sets the proxy address, port, user, and password. addr should be a host, with no "http://"; port may be a port number, service name or port number string.