EZproxy can record large amounts of varied data about your remote patrons' use of EZproxy. You can use this data to assess remote users' access to EZproxy, identify security issues, and troubleshoot problems that may arise with your instance of EZproxy.
The LogFormat directive and its fields let you customize which data EZproxy records in ezproxy.log. The same fields also work with the LogSPU directive, so you can use them to customize the information in spu.log as well.
LogFormat is a position-independent config.txt/ezproxy.cfg directive that specifies which information EZproxy records about user activity and the format in which it records it. User requests are logged to the file ezproxy.log or to the filename specified by LogFile. By default, EZproxy records this information in the Common Log Format, which is recognized by many web server log file analysis packages.
The following table describes the special fields that are available to customize the log format. When logging URL information, EZproxy records only the remote database URL (e.g. http://www.somedb.com/) and not the corresponding EZproxy URL that the user sees (e.g. http://ezproxy.yourlib.org:2050/ or http://www.somedb.com.ezproxy.yourlib.org/). Not all fields are available in older versions of EZproxy, so if a specific field returns -, you may need to update to the current release to use that field.
Field | Value |
---|---|
%a | IP address of the host accessing EZproxy |
%b | Number of bytes transferred |
%{expression}e | Evaluate expression as an EZproxy Expression |
%h | The IP address of the host accessing EZproxy |
%{header}i | Specified header from the browser request; any HTTP header field can be substituted for header; the following rows (until %l) are EZproxy-specific and commonly used headers within EZproxy logs |
%{ezproxy-dbvar #}i | Replace # with a digit 0 through 9 to have the DbVar associated with this database inserted |
%{ezproxy-groups}i | Plus-sign-delimited list of groups to which the user has access (e.g. General+Restricted) |
%{ezproxy-protocol}i | Protocol of the remote server accessed (i.e. http, https, or ftp) |
%{ezproxy-session}i | EZproxy identifier for the user's current session |
%{ezproxy-spuaccess}i | When used with LogSPU, inserts "proxy" if the remote user's access to the URL will be proxied; "local" if the user is being redirected due to ExcludeIP; or "unknown" if the URL is not known to EZproxy and Option RedirectUnknown appears in config.txt. |
%{ezproxy-url #}i | Specific portion of the destination URL; # is a number that specifies which portion to insert. For example, in the URL http://www.somedb.com/abc/def, %{ezproxy-url1}i would return abc, %{ezproxy-url2}i def, and %{ezproxy-url}i a blank string. |
%{ezproxy-usrvar #}i | Replace # with a digit 0 through 9 to have the corresponding UsrVar associated with this user inserted |
%{referer}i | The URL of the website the user was on prior to accessing EZproxy, if that website has sent a referring URL header (you might want to log this so you can see how your users are arriving at EZproxy) |
%{user-agent}i | The browser the user is using |
%l | Remote username obtained by identd (if identd is not used, a - will be inserted) |
%m | Method of request (e.g. GET, POST) |
%r | Complete request (e.g. GET http://www.somedb.com HTTP/1.0) |
%s | HTTP numeric status of request |
%t | Date/time of request; may also appear as %{format}t to specify a strftime time format. |
%T | Time in seconds to process the request |
%u | Username used to log into EZproxy if Option LogUser appears in config.txt; session identifier if Option LogSession appears in config.txt; - otherwise. To log both username and session, add only Option LogUser to config.txt, then use %u for the username and %{ezproxy-session}i for the session identifier. |
%U | URL requested (e.g. http://www.somedb.com/) |
%v | Virtual web server's hostname (e.g. www.somedb.com) |
%% | Single percent sign (%) |
\n | Newline character |
\t | Tab character |
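Because a LogFormat string fully determines the layout of each log line, the same string can in principle drive a parser. The following Python sketch illustrates the idea; it is not part of EZproxy. The helper names `compile_logformat` and `parse_line` are hypothetical, and the sketch assumes that field values are separated by unambiguous literal text (spaces, quotation marks, tabs) and that a bare %t is written in square brackets, as in the default format.

```python
import re

# Matches one LogFormat field: %x, %{argument}x, or the literal %%.
FIELD = re.compile(r'%(?:\{([^}]*)\})?([A-Za-z%])')

def compile_logformat(fmt):
    """Build (regex, field_names) from a LogFormat string (illustrative only)."""
    # Turn the \t and \n escapes from the format into real characters.
    fmt = fmt.replace(r"\t", "\t").replace(r"\n", "\n")
    parts, names, pos = [], [], 0
    for m in FIELD.finditer(fmt):
        parts.append(re.escape(fmt[pos:m.start()]))  # literal text between fields
        arg, letter = m.group(1), m.group(2)
        if letter == "%":
            parts.append("%")                  # %% is a literal percent sign
        elif letter == "t" and arg is None:
            parts.append(r"(\[[^\]]*\])")      # assumed bracketed timestamp
            names.append(m.group(0))
        else:
            parts.append(r"(.*?)")             # any other field: shortest match
            names.append(m.group(0))
        pos = m.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile("".join(parts)), names

def parse_line(fmt, line):
    """Return {field: value} for one log line, or None if it does not match."""
    regex, names = compile_logformat(fmt)
    m = regex.fullmatch(line.rstrip("\n"))
    return dict(zip(names, m.groups())) if m else None
```

For example, `parse_line('%h %l %u %t "%r" %s %b', line)` returns a dictionary keyed by the field specifiers themselves. Fields whose values contain spaces, such as %{user-agent}i, would need to be wrapped in quotation marks in the LogFormat string for this approach to split lines reliably.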
When considering the use of Option LogUser, Option LogSession, and %{ezproxy-session}i, carefully weigh the data-gathering possibilities against the potential privacy issues of being able to bundle together the browsing history of your patrons.
The LogFormat field %s records the HTTP numeric status of the request. This field can return standard HTTP status codes, but may also record special status codes under the circumstances specified in the table below.
Code | Meaning |
---|---|
597 | Recorded on access attempts after the IntruderReject threshold has been exceeded. |
598 | Attempt made to access an unauthorized EZproxy administration function |
599 | Starting point URL referenced a host that EZproxy is not configured to proxy |
900-905 & 907 | Error occurred receiving the request from the remote user's browser |
906 | Error occurred forwarding the request from the user's browser to the remote server |
950 | Error occurred interpreting an administration request from the remote user |
997 | Shibboleth login failure due to metadata misconfiguration (recorded as 999 in early beta releases of Shibboleth support) |
998 | Access denied based on a DenyIfRequestHeader directive in config.txt |
999 | Access attempted from an IP address that is in a RejectIP address range |
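When reviewing logs, it can help to translate these special codes back into their meanings. The Python sketch below encodes the table above as a lookup; the names `SPECIAL_STATUS`, `BROWSER_RECEIVE_ERRORS`, and `describe_status` are hypothetical and exist only for this illustration.

```python
# EZproxy-specific status codes, taken from the table above.
SPECIAL_STATUS = {
    597: "access attempt after the IntruderReject threshold was exceeded",
    598: "attempt to access an unauthorized EZproxy administration function",
    599: "starting point URL referenced a host EZproxy is not configured to proxy",
    906: "error forwarding the browser's request to the remote server",
    950: "error interpreting an administration request from the remote user",
    997: "Shibboleth login failure due to metadata misconfiguration",
    998: "access denied by a DenyIfRequestHeader directive",
    999: "access attempted from an IP address in a RejectIP range",
}

# Codes 900-905 and 907 all share one meaning.
BROWSER_RECEIVE_ERRORS = set(range(900, 906)) | {907}

def describe_status(code):
    """Return the EZproxy-specific meaning of a status code, or None for other codes."""
    if code in BROWSER_RECEIVE_ERRORS:
        return "error receiving the request from the remote user's browser"
    return SPECIAL_STATUS.get(code)
```

A return value of None means the code is not one of the special codes listed here and should be read as a standard HTTP status.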
By default, EZproxy uses this LogFormat command:
LogFormat %h %l %u %t "%r" %s %b
This format records the following information, in this order:
%h
: the IP address of the host accessing EZproxy
%l
: the remote username obtained by identd, if identd is used; if identd is not used, you will see a -
%u
: the username or session identifier, based on other config.txt options; if no related directives are included in your config.txt file, you will see a -
%t
: the date and time the request was made
"%r"
: the complete HTTP request sent to the remote server; this field is enclosed in quotation marks so it is parsed as one piece of data even though it contains spaces, since spaces generally signal that a new field of data is beginning
%s
: the HTTP numeric status of the request
%b
: the number of bytes transferred

With this default, EZproxy records lines of data such as the following:
132.174.1.1 - - [14/Mar/2014:09:39:18 -0700] "GET http://www.somedb.com:80/index.html HTTP/1.0" 200 1234
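As an illustration of how a line in the default format breaks down into typed values, here is a minimal Python sketch. The regular expression and the function name `parse_default_line` are assumptions made for this example rather than anything EZproxy provides, and the sketch assumes the username field contains no spaces and that a missing byte count is logged as -.

```python
import re
from datetime import datetime

# One named group per field of the default format: %h %l %u %t "%r" %s %b
DEFAULT_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_default_line(line):
    """Parse one default-format log line into a dict, or return None if it does not match."""
    m = DEFAULT_LINE.match(line)
    if m is None:
        return None
    record = m.groupdict()
    # Common Log Format timestamp, e.g. 14/Mar/2014:09:39:18 -0700
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record

sample = ('132.174.1.1 - - [14/Mar/2014:09:39:18 -0700] '
          '"GET http://www.somedb.com:80/index.html HTTP/1.0" 200 1234')
record = parse_default_line(sample)
```

Converting the timestamp and byte count to native types makes later questions, such as which hours are busiest or how much data a host transferred, simple to answer with ordinary arithmetic.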
To collect different information in your ezproxy.log file, edit config.txt and list the fields you want, in the order and format you want, after the LogFormat directive. For a complete list of fields available for customization, see the table of special fields above.
The following table provides scenarios in which data gathered from EZproxy logs could be of use, the config.txt directive statement that would record useful information in response to that scenario, and a discussion of why this directive statement would provide you the needed information.
What you want to know | config.txt Directive | Why This Works |
---|---|---|
User and Use Data: You want to know more about your remote users, such as their location, how much data they transfer in a typical proxied session, when they are most active, and what resources they use. | | The default configuration specified by the LogFormat directive provides a selection of general information for assessing your remote users' activity on EZproxy: %h provides their IP address, which you can tie to a geographic location; %t reveals the dates and times of their requests, so you can determine when they are most active; %r shows which resources they request, so you can assess your electronic resource collections; %b shows how many bytes are transferred, revealing how much information they access when they use your resources remotely. |
Most Active User Groups: You want to know which groups within your user community (e.g. student, graduate student, faculty) access resources remotely most frequently. You would like to use this data to market remote database access to groups that under-utilize this service. | | The default configuration is retained so EZproxy continues to record basic usage information in your log. Two options exist to gather user group information, depending on your configuration and authentication settings. If you have Groups defined in your config.txt file, you can add the first config.txt statement to append the list of resource groups an individual user can access to every line of data in your EZproxy log. If you have UsrVar defined in your user.txt file, you can add the second config.txt statement, substituting the user variable, 0 through 9, that you have used in your user.txt file to assign a particular tag to a selection of users. The Groups and UsrVar pages provide more detail about each of these configuration options. |
Security: You are concerned about security and want to determine whether there has been any suspicious usage from locations where your users would not likely be. You want to confirm that no one is attempting to hack into your users' accounts or downloading unusually large amounts of data. | | The default format plus the Option LogSession directive provides information that you can compare against the data in the Audit logs, per the Audit Most and Location directives, to assess security. %u combined with the Option LogSession directive provides a session identifier for each user login. Audit Most plus the Location directive records the IP address and the corresponding location, as defined in the GeoLiteCity file, along with the session ID for each audit event logged. You can investigate any suspicious session in the EZproxy log by searching for its session ID in the Audit logs from the admin page. You can then determine whether the session originated from a location where your users would not likely be. |
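To make the user-group scenario concrete, the following Python sketch tallies requests per group. It rests on several assumptions that are not stated in this document: that your LogFormat is the default format with %{ezproxy-groups}i appended as the final field, that group names contain no spaces, and that a - in that position means no group information was recorded. The function name `count_requests_by_group` is hypothetical.

```python
from collections import Counter

def count_requests_by_group(lines):
    """Count logged requests per group, reading the last field of each line."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if not parts:
            continue                      # skip blank lines
        groups_field = parts[-1]          # assumed position of %{ezproxy-groups}i
        if groups_field == "-":
            continue                      # assumed marker for "no groups recorded"
        for group in groups_field.split("+"):
            if group:
                counts[group] += 1        # a request counts once for each group listed
    return counts
```

Calling `count_requests_by_group(open("ezproxy.log"))` and then `.most_common()` on the result would list groups from most to least active. Because a user who belongs to several groups contributes to each of them, the totals can exceed the number of log lines.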
Related directives: DbVar, LogFile, LogFilter, LogSPU, Option LogSession, Option LogUser, UsrVar