This post was forwarded to me by Adam McCarthy, our lead server admin:
I have seen two threads on the forums thus far dealing with stats, and since
there is a lot of general information to cover I will be posting the same
response and clarification to both threads, as I feel a lot of the
information is beneficial to both topics.
Prior to my joining the MonsterCommerce team some time back, we used a
program called DeepMetrix LiveStats for web analytics. This program had a
very nice set of visual displays, but had many horrible stability issues. As
I am sure some of our veteran customers recall, LiveStats would go down
almost predictably every other week and typically required us to get the
actual development team from DeepMetrix working on correcting various issues with the database for stats, or replacing/patching the actual binaries
almost every time. This led to issues where the stats service would be down
for days while waiting on repairs, inconsistent reporting, etc, etc.
During this time, I was charged with evaluating other web stats solutions,
and out of the large group of options a technical group decision was reached
on Urchin. Urchin has proved to be a fairly robust solution; we have not had
any crashes per se, and performance is fairly reliable...with a few minor
issues/points of concern, which I will address further in this post.
One of the primary issues is that customers want to have access to tracking
of variables in their web site URLs, for example to get an accurate gauge of the popularity of products, or the progression of consumer browsing, etc.
Unfortunately, this is not something that our standard stats offering will be able to provide, for several reasons.
The first of these reasons is that the added overhead required to process the
URI stems for parameters of page requests is so high that it literally adds
almost exponentially to the total processing time of all sites on the stats
server.
This ability does exist, and it has actually led to some serious delays in stats processing, which in turn makes other customers unhappy (especially those who don't seem to need more information than is already present).
We are currently in the process of removing these filters to parse URI stems (for the sites where we enabled them) to improve overall performance.
The second reason we do not process log information at this granular a level is that configuring the regular expression filters it takes to catch and isolate the particular parameter or variable that our customers want is a manual process, and the filters would have to be continually modified and/or tweaked to catch exactly what information each customer wanted. Aside from the amount of time and effort this takes, it is literally something we would have to update or change every time a customer's product line was modified in any way.
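To give a sense of what catching a URL parameter involves, here is a minimal Python sketch that counts how often each value of one query-string parameter shows up in a list of requested URLs. It is only an illustration and not how Urchin filters are configured: it uses Python's standard URL parsing instead of regular expression filters, and the parameter name "productid" and the sample URLs are made up for the example.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs


def count_param_values(urls, param):
    """Count how often each value of `param` appears across the requested URLs."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query  # the text after '?', e.g. "productid=42"
        for value in parse_qs(query).get(param, []):
            counts[value] += 1
    return counts


# Hypothetical requests; "productid" is an illustrative parameter name.
requests = [
    "/product.asp?productid=42",
    "/product.asp?productid=42&ref=home",
    "/product.asp?productid=7",
    "/cart.asp",
]
popularity = count_param_values(requests, "productid")
print(popularity.most_common())
```

Even in this toy form, the parameter name has to be known ahead of time for each store, which is the manual per-customer upkeep described above.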
Currently we do not provide geographic data as a default option, but when the next major release of Urchin is made available (version 5), the processing time is expected to improve so much that we should be able to enable this feature.
For approximately $10.00 per month you really can't expect the equivalent of your own dedicated private stats server with fully customizable reporting and display options, filters, etc. These services are available to use with your MC store... it simply costs more.
What I can tell everyone is that there are several workaround options. We
rely on log file data for processing all of our sites' bandwidth usage and
have just reached a 90% completion point on a fully accurate
monthly analysis of our customers' disk and bandwidth usage compared to
their plan type. This is relevant to the stats discussion in that now we are
able to isolate customers who are exceeding their standard plan types, or
even the tiered options, and require them to upgrade accordingly. The
biggest help is that many customers who are exceeding the usage for their
tiered options are basically being told they have to go with dedicated
servers, which at the same time will be running their own personal Urchin
installation to analyze just the stats for their sites on their dedicated
machine.
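The comparison of usage against plan type can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the actual tooling: the plan names, the byte allowances, and the (site, bytes) record layout are all hypothetical.

```python
from collections import defaultdict

# Hypothetical monthly bandwidth allowances per plan, in bytes.
PLAN_LIMITS = {"basic": 5 * 1024**3, "tiered": 20 * 1024**3}


def sites_over_plan(records, site_plans, limits=PLAN_LIMITS):
    """Return {site: bytes_used} for sites whose usage exceeds their plan.

    records    -- iterable of (site, bytes_sent) pairs taken from log lines
    site_plans -- mapping of site name to plan name
    """
    usage = defaultdict(int)
    for site, bytes_sent in records:
        usage[site] += bytes_sent  # total bytes served per site
    return {
        site: used
        for site, used in usage.items()
        if used > limits[site_plans[site]]
    }


# Made-up sample data: shop-a served 6 GiB on a 5 GiB plan, shop-b 1 GiB.
records = [
    ("shop-a", 3 * 1024**3),
    ("shop-a", 3 * 1024**3),
    ("shop-b", 1 * 1024**3),
]
plans = {"shop-a": "basic", "shop-b": "basic"}
over = sites_over_plan(records, plans)
print(over)
```

Sites that show up in the result are the ones that would be asked to upgrade.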
This does a couple of things....most apparent is that the load on the
servers of all our plan types is decreased to optimal performance levels
based on the plan type/allocation we have designed and implemented.
Additionally, these sites pushing massive traffic typically have gigantic
log files (anywhere from 1 to 2 gigabytes in size *daily*). By removing the
overhead of processing these extremely large log files (compared to our
typical customer's log files, that is), the processing time for Urchin on the
shared stats servers is boosted. A less apparent benefit of this course of
action is that the customers with dedicated servers have full control over
how their log files are processed, including parameters/variables, and
whatever kind of filters they want to apply....allowing for the more
specific types of reporting some customers are asking for.
As I mentioned earlier, this is a relatively new plan of action (implemented
as of this month). This means overall improvements to stats performance, in
terms of update generation times, should be seen from this
month on. Also, a new group of stats servers is being brought online to make
sure overall log analysis efforts are spread amongst many machines
(basically you will not have to worry about this so long as you go to the
Site Statistics link in your MC admin interface to get to your stats login).
One of the best outcomes of the option to have a stats service running on a
customer's dedicated store server is the level of tuning and control this
provides....so I would like a lot of the customers who want more frequent
updates and more granular reporting to keep this option in mind. The
additional charge for stats service on your dedicated server (should you
choose to go this route) is $35.00 monthly. If you want to see what a deal
this is, then check out Urchin.com and you will see that the one-time
charge for purchasing a dedicated Urchin setup is about $500.00.
Moving on, another option is that we can make your log file directory
available to you via a virtual directory (i.e. http://www.site.com/logs), and you (the customer) can feel free to download the logs and process them on your end in any manner you would like. There are three caveats to this option. First, the log file directory will only be available to you in read-only format (with no exceptions, as we need those logs for tracking bandwidth and disk usage). Second, we delete each previous month's logs at the beginning of each current month (i.e. in the first week of July all logs for June will be deleted and are not recoverable). Third, and most importantly, downloading your log files *will* count towards your total bandwidth usage figures.
If you would like to have the accessible log file option enabled for your account, please submit a ticket specifically requesting this and we will take care of it. There is no charge for this option, and it does not preclude still getting stats service with Urchin.
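For anyone who does download their logs, the following Python sketch shows one way to count hits per page. It assumes the logs are in W3C extended log format, where a "#Fields:" line names the columns; this post does not state the format, so check your own files first. The sample lines and dates are invented.

```python
from collections import Counter


def hits_per_page(lines):
    """Count requests per cs-uri-stem in W3C extended log format lines."""
    fields = []
    hits = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive lists the column names for the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip blanks, other directives, and rows before #Fields
        row = dict(zip(fields, line.split()))
        stem = row.get("cs-uri-stem")
        if stem:
            hits[stem] += 1
    return hits


# Invented sample log; real files will have more columns and rows.
sample_log = """#Version: 1.0
#Fields: date time cs-uri-stem cs-uri-query sc-status sc-bytes
2000-06-01 00:00:01 /default.asp - 200 5120
2000-06-01 00:00:05 /product.asp productid=42 200 8192
2000-06-01 00:00:09 /product.asp productid=7 200 8192
""".splitlines()

page_hits = hits_per_page(sample_log)
print(page_hits.most_common())
```

Because the previous month's logs are deleted at the start of each month, any processing like this has to run on copies downloaded before then.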
While we are on the track of processing your own files...please keep in mind
that if you have personally found a really good solution that seems to be
better than our current implementation, please do not hesitate to let us
know! MonsterCommerce as a company knows that our relationship with our
customers is a partnership, and your feedback/suggestions are taken very
seriously....plus I can honestly tell you our technical staff has not had a
chance to evaluate every option out there and we would appreciate the advice
of anyone willing to share it.
This brings us to the final option for stats....and that is to go with a 3rd
party solution such as WebTrends Live/WebTrends Reporting Service where
snippets of code can be placed into your store pages and information sent
for processing based on this to WebTrends, where reports are generated. I
have spoken with WebTrends technical and sales staff and I can tell you
honestly, if WebTrends cannot track whatever degree of information about your web site you want...then it just isn't trackable. Keep in mind, as I
mentioned about price earlier, that should you choose to go with an outside
service such as WebTrends, you will be dealing with them, not us. While
we will do everything we can to integrate whatever code is needed for
implementation, whatever technical support or billing issues you have
would be handled through this third party. Also, plain and simple...for the fine
tuned information levels some people are mentioning they want, you will be
paying quite a lot more. Basic reporting via WebTrends starts at $35.00 per
month for 50,000 page views processed, plus $0.65 for each
additional 1,000 page views processed. To put this in perspective, for some
of our sites with 100,000 page views a month, WebTrends service would cost
about $67.00 per month, and this cost would only grow as your traffic did
(for one of our busier sites with 1,000,000 average page views per *day* the
WebTrends processing service would cost $18,560 per month - these are actual quotes I received when investigating this option). Again, I cannot stress
enough that a lot of this boils down to getting what you pay for.
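The pricing above works out as a base fee plus an overage charge, which the following Python sketch reproduces. The function name is made up, and billing partial blocks of 1,000 page views as full blocks is an assumption; only the $35.00, 50,000 and $0.65 figures come from the quote.

```python
import math

BASE_FEE = 35.00          # USD per month, covers the first 50,000 page views
INCLUDED_VIEWS = 50_000
OVERAGE_PER_1000 = 0.65   # USD for each additional 1,000 page views


def webtrends_monthly_cost(page_views):
    """Estimate the monthly charge in USD for a given number of page views."""
    extra = max(0, page_views - INCLUDED_VIEWS)
    blocks = math.ceil(extra / 1000)  # assumption: partial blocks billed in full
    return round(BASE_FEE + blocks * OVERAGE_PER_1000, 2)


print(webtrends_monthly_cost(50_000))
print(webtrends_monthly_cost(100_000))
```

At 100,000 page views this gives $67.50, in line with the figure of about $67.00 above. The $18,560 quote corresponds to roughly 28.55 million page views a month under this formula, so the exact number depends on the monthly total used for the quote.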
So, in summary, the stats service we provide is very basic. Stats are
collected daily, with the previous day's reports being available to our
customers each day, a service we charge $10.00 per month for. The stats
service will continue to be basic in regards to information provided and
frequency of updates because the price point is set for a shared basic
option (so just general information as opposed to very detailed reports).
There are 3 courses of action if this is not acceptable to you....these
being getting a dedicated server with your own standalone Urchin
installation providing more frequent updates and more control over reporting
information, processing the logs on your own end with the log files for your
site being made publicly available via a virtual directory link (must
request this option via ticketing system), or going with an outside solution
whose implementation will not impact the overall server performance (i.e. we
will not install the processing server on the same machine as the shared
sites, code inside of your web pages is preferred). Finally, we (the
technical staff) are aware of some performance problems and improvements
needed and are focusing on these efforts as a priority, with noticeable
results expected to be seen inside of the next month and then again once the
next release of Urchin is available.
Wow, I wrote a book...sorry for the length, but I did want to cover all
issues related to stats and basically inform you (our customers) of what is
currently going on since typically the technical operation staff are kind of
behind the scenes. What I can tell everyone who has managed to suffer
through the entire length of this post is that my team is just as concerned
about the satisfaction levels of our customers as any other department or
person in MonsterCommerce is and that I can assure you our goals are to work towards continual improvement and to constantly do our part to maintain or increase our customer satisfaction levels.
Adam McCarthy
Senior Systems and Security Administrator MonsterCommerce
adam@monstercommerce.com