Copyright 2003

Overview of Web Traffic Analysis Software

ABSTRACT

This paper examines how web traffic analysis software can be used to help understand who is visiting a website, where they came from, and what types of information they requested. This software can also be used to model behavioral data about visitors, such as when visitors leave a site or what their buying patterns are. This information can be used to internally justify a company's investment in the web, help increase the return on that investment, or help the company better manage its web site. Altogether, this software allows conclusions to be drawn from the volume of individual requests made to a server.

INTRODUCTION

As web sites become an integral part of an organization's operations and external communications, it is becoming increasingly important to understand, model, and utilize web site traffic and visitors' online behavior effectively. Web log files show what documents are being viewed, when, and by whom, and they also provide information regarding server load, unsuccessful requests, and valuable marketing information. An analysis of log files can help an organization better understand its visitors and the actions they take on a web site. This information can be used to internally justify a company's investment in the web, help increase the return on that investment, or help the company manage its web site.

In addition to knowing what you can learn from a log file, it is equally important to understand what you can't learn from one: in particular, what types of data are not captured in log files, what types of data are inherently incomplete, and what types of incorrect inferences can be made from log files.

WHAT IS RECORDED IN A LOG FILE

Every request from a client browser is recorded in the server's log files. For a busy server this can result in hundreds or thousands of entries being recorded per hour. Exactly what is recorded depends on the server and how it is configured. Below is an actual example of a log file entry, followed by an explanation of the information it typically contains.
T59982.nsuok.edu - - [13/Jan/2003:13:39:12 -0500] 
  "GET /athletics/ HTTP/1.1" 200 9980 "http://www.nsuok.edu" 
  "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
Information                     What it stands for
T59982.nsuok.edu                Address of the computer requesting the file
[13/Jan/2003:13:39:12 -0500]    Date and time the request was made
/athletics/                     URL of the file requested
HTTP/1.1                        Protocol used to request the file
200                             Status code (successful GET)
9980                            Number of bytes
http://www.nsuok.edu            Referring URL
Mozilla/4.0                     Type of browser
Windows 98                      Operating system
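
The fields described above can be pulled out of a log line with a regular expression. The following is a minimal sketch in Python, assuming the server writes entries in the same layout as the example entry; the field names (host, path, status, and so on) are labels chosen here for illustration and do not appear in the log itself.

```python
import re
from datetime import datetime

# Pattern for entries laid out like the example above.
# The group names are illustrative labels, not part of the log format.
ENTRY = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_entry(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = ENTRY.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # Servers write "-" when no bytes were sent; treat that as zero.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

line = ('T59982.nsuok.edu - - [13/Jan/2003:13:39:12 -0500] '
        '"GET /athletics/ HTTP/1.1" 200 9980 "http://www.nsuok.edu" '
        '"Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"')
entry = parse_entry(line)
print(entry["host"], entry["path"], entry["status"], entry["size"])
```

Lines that do not match the pattern return None, so a caller can count and skip malformed entries rather than stop on them.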

USAGE STATISTICS THAT CAN BE DETERMINED

The data contained in a log file can be analyzed in various ways. This information can provide the following statistics.

INFERENCES THAT CAN BE MADE

Advanced web traffic analysis software can even provide behavioral data about visitors. By examining log files more closely, this software can help: 1) identify when visitors are leaving your website, 2) understand visitors' buying patterns and content interests, 3) sort visitors by demographics and browsing behaviors, 4) quantify the mix of visitors, including the number of new, repeat, and unique visitors, and 5) optimize marketing dollars. Listed below are other statistics that can be determined.
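
A log file records individual requests, not visits, so a count of visits is itself an inference. The sketch below shows one common way to make it, assuming that requests from the same address belong to one visit unless they are separated by more than 30 minutes of inactivity. The 30-minute timeout and the sample data are assumptions made for illustration; this is not necessarily how any particular product counts visits.

```python
from datetime import datetime, timedelta

# Assumed idle timeout: a longer gap between requests starts a new visit.
VISIT_TIMEOUT = timedelta(minutes=30)

def count_visits(requests):
    """Count visits from (host, timestamp) pairs.

    Consecutive requests from the same host belong to one visit unless
    they are separated by more than VISIT_TIMEOUT.
    """
    last_seen = {}   # host -> time of that host's most recent request
    visits = 0
    for host, when in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(host)
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[host] = when
    return visits

# Hypothetical sample data: two hosts, three visits in total.
sample = [
    ("a.example.edu", datetime(2003, 1, 13, 13, 0)),
    ("a.example.edu", datetime(2003, 1, 13, 13, 10)),  # same visit
    ("b.example.net", datetime(2003, 1, 13, 13, 5)),
    ("a.example.edu", datetime(2003, 1, 13, 15, 0)),   # new visit after a long gap
]
print(count_visits(sample))
```

Because several people may share one address (for example, behind a proxy), a count produced this way is an estimate rather than an exact number of visitors.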

USAGE STATISTICS THAT CANNOT BE DETERMINED

While many statistics can be compiled by examining log files, there are still some types of data and inferences that cannot be derived from them. Today many large-scale caches are used to help reduce response and download times. This means that if the browser finds the file at any intermediary cache, the request will not be recorded in the log of the server where the original document is located, so the logs undercount how often a document is actually viewed. Similarly, if a site is mirrored, then the log files from all mirror sites must be added together.
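
Adding together the figures from mirrored sites can be sketched as follows. This is a minimal illustration, assuming each mirror's log has already been reduced to a count of requests per URL; the mirror names and counts are hypothetical.

```python
from collections import Counter

def combine_counts(per_site_counts):
    """Add up per-URL request counts from several mirrors of one site."""
    total = Counter()
    for counts in per_site_counts:
        total.update(counts)   # update() adds counts for matching URLs
    return total

# Hypothetical request counts per URL, one Counter per mirror.
mirror_a = Counter({"/": 120, "/athletics/": 40})
mirror_b = Counter({"/": 80, "/schedules/": 15})

combined = combine_counts([mirror_a, mirror_b])
print(combined["/"], combined["/athletics/"], combined["/schedules/"])
```

Even the combined totals remain a lower bound, since requests answered by an intermediary cache never reach any of the servers.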


WEB TRAFFIC ANALYSIS SOFTWARE

There are numerous applications available to help analyze log files. Below is a partial listing of web traffic analysis software. These software packages are very competitively priced. For example, a single-domain license of WUSAGE 8.0 costs $75. A web-hosting license of WUSAGE 8.0, which reports on unlimited virtual domains located at a single physical location, costs $295.
NetTracker
http://www.sane.com/products/NetTracker

WUSAGE
http://boutell.com/wusage

WebTrends
http://www.webtrends.com

Webalizer

EXAMPLES OF WEB TRAFFIC ANALYSIS SOFTWARE

Below are some actual examples of usage statistics. The charts and tables were created using WUSAGE 8.

Top 10 Browsers (User Agents).
Sorted by Access Count

Rank Product %
1 Microsoft Internet Explorer 6.0 49.12
2 Microsoft Internet Explorer 5.0 38.05
3 Netscape 4.0 7.78
4 Netscape 5.0 1.86
5 Microsoft Internet Explorer 4.0 1.00
6 Netscape 3.0 0.49
7 Googlebot/2.1 0.35
8 MSProxy/2.0 0.17
9 Wget/1.8.1 0.16
10 Microsoft URL Control - 6.00.8862 0.09



Screen Depth (Number of Colors).
Note: computers reporting 16 colors may in some cases be grayscale devices. Most 16-color computers are Windows machines temporarily running in safe mode.

# Color Depth Computers %
1 Black and White (1 bit) 31,505 0.12
2 4 Gray Shades (2 bit) 73 0.00
3 16 Colors (4 bit) 22,685 0.09
4 256 Colors (8 bit) 907,830 3.51
5 65,536 Colors (16 bit) 12,353,149 47.80
6 Millions of Colors (24 bit) 3,354,058 12.98
7 Millions of Colors (32 bit) 9,171,476 35.49



Top 10 Visitor Domains.
Sorted by Access Count

Rank Domain Accesses % Bytes % Visits %
1 edu 256,979 49.11 2,977,112,827 49.85 61,630 44.41
2 net 110,722 21.16 1,343,720,945 22.50 38,731 27.91
3 unknown 72,732 13.90 655,430,928 10.98 19,228 13.85
4 com 72,043 13.77 864,868,021 14.48 15,319 11.04
5 gov 3,110 0.59 38,923,391 0.65 1,216 0.88
6 n_america 1,650 0.32 21,540,140 0.36 679 0.49
7 asia 1,550 0.30 13,199,266 0.22 344 0.25
8 mil 1,457 0.28 20,541,590 0.34 487 0.35
9 org 1,165 0.22 11,433,053 0.19 489 0.35
10 europe 1,020 0.19 13,184,515 0.22 286 0.21
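
A breakdown like the one above starts from the address recorded in each log entry. The sketch below groups addresses by their last label (edu, net, com, and so on) and reports bare IP addresses as "unknown". It is a simplified illustration under those assumptions; it does not reproduce the regional groupings (such as n_america or europe) shown in the table, and the sample host names other than the one from the log example are hypothetical.

```python
import ipaddress
from collections import Counter

def top_level_domain(host):
    """Return the last label of a host name, or "unknown" if there is none."""
    try:
        ipaddress.ip_address(host)
        return "unknown"      # a bare IP address was never resolved to a name
    except ValueError:
        pass                  # not an IP address, so treat it as a host name
    name = host.rstrip(".")
    if "." not in name:
        return "unknown"
    return name.rsplit(".", 1)[-1].lower()

hosts = ["T59982.nsuok.edu", "dialup-12.example.net", "192.0.2.44", "www.example.com"]
counts = Counter(top_level_domain(h) for h in hosts)
print(counts)
```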



Top 10 Search Keywords.
Sorted by Access Count

Keywords used to reach this site via search engines, such as Altavista and Infoseek.

Rank Search Keyword(s) Accesses %
1 Northeastern State University 4,416 7.50
2 Northeastern 2,371 4.03
3 State 1,814 3.08
4 nsuok 1,416 2.41
5 Academic schedules 1,034 1.76
6 Tahlequah 842 1.43
7 Broken Arrow 798 1.36
8 training evaluation 768 1.30
9 shareware 713 1.21
10 information technology 658 1.12
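
Search keywords are recovered from the referring URL, because a search engine typically places the visitor's query in the URL of its results page. The following is a minimal sketch, assuming the query is carried in one of a few common parameter names; the parameter list and the example URLs are assumptions, and real search engines differ.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed query-parameter names; real search engines differ and change over time.
QUERY_PARAMS = ("q", "query", "p", "qt")

def search_keywords(referrer):
    """Return the search phrase carried in a referring URL, or None."""
    params = parse_qs(urlsplit(referrer).query)
    for name in QUERY_PARAMS:
        if name in params:
            return params[name][0].strip().lower()
    return None

# Hypothetical referring URLs.
print(search_keywords("http://search.example.com/find?q=Northeastern+State+University"))
print(search_keywords("http://www.nsuok.edu"))
```

Referrers that carry no recognized parameter return None, so ordinary links from other pages are not mistaken for searches.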



Most Downloaded File Types.
Sorted by Access Count

Rank File type Files K Bytes Transferred
1 gif 1,522,220 1,671,971
2 jpg 449,425 5,660,605
3 html 230,096 8,521,768
4 js 57,849 1,289,407
5 pdf 32,906 1,049,564
6 htm 6,543 90,479
7 txt 1,182 482
8 class 266 2,494
9 css 185 236
10 com/ 5 165
Total Files & K Bytes Transferred 2,300,677 18,287,166



Screen Resolution.
Computers offering at least 640x480 pixels, but less than 800x600 pixels, are reported as 640x480, and so on.

# Resolution Computers %
1 640x480 867,629 3.36
2 800x600 9,208,487 35.62
3 1024x768 12,830,967 49.64
4 1280x1024 2,421,307 9.37
5 1600x1200 and above 520,560 2.01
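
The bucketing rule stated above the table can be sketched as follows. This is an illustration of that rule only, assuming a screen is assigned to the largest listed resolution that it meets or exceeds in both width and height; how screens smaller than 640x480 are handled is not stated in the report, so the sketch returns None for them.

```python
# Buckets from the table above, smallest first.
BUCKETS = [(640, 480), (800, 600), (1024, 768), (1280, 1024), (1600, 1200)]

def resolution_bucket(width, height):
    """Return the largest bucket that fits within width x height, or None."""
    chosen = None
    for bucket_w, bucket_h in BUCKETS:
        if width >= bucket_w and height >= bucket_h:
            chosen = (bucket_w, bucket_h)
    return chosen

print(resolution_bucket(720, 540))     # at least 640x480, less than 800x600
print(resolution_bucket(1920, 1200))   # falls in "1600x1200 and above"
```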

CONCLUSIONS

Unlike other marketing venues, visitors to a web site are typically anonymous. Web traffic analysis software, however, can help marketers better understand who is visiting a web site and why. This software can help identify the types of companies that are visiting a website, where they came from, and what types of information they requested. Advanced web traffic analysis software can even provide behavioral data about the visitors, for example, when visitors are leaving a site and what their buying patterns are. Altogether, these software packages allow conclusions to be drawn from the volume of individual requests made to a server.


