Google’s Configuration Code
I blogged a while ago about the strings hidden inside the various Google Talk executables and what could be inferred from them. It looks like more Google internals have now been revealed, this time by the following logging information from a Google error page on a cache query:
pacemaker-alarm-delay-in-ms-overall-sum 2341989
pacemaker-alarm-delay-in-ms-total-count 7776761
cpu-utilization 1.28
cpu-speed 2800000000
timedout-queries_total 14227
num-docinfo_total 10680907
avg-latency-ms_total 3545152552
num-docinfo_total 10680907
num-docinfo-disk_total 2200918
queries_total 1229799558
e_supplemental=150000 --pagerank_cutoff_decrease_per_round=100 --pagerank_cutoff_increase_per_round=500 --parents=12,13,14,15,16,17,18,19,20,21,22,23 --pass_country_to_leaves --phil_max_doc_activation=0.5 --port_base=32311 --production --rewrite_noncompositional_compounds --rpc_resolve_unreachable_servers --scale_prvec4_to_prvec --sections_to_retrieve=body+url+compactanchors --servlets=ascorer --supplemental_tier_section=body+url+compactanchors --threaded_logging --nouse_compressed_urls --use_domain_match --nouse_experimental_indyrank --use_experimental_spamscore --use_gwd --use_query_classifier --use_spamscore --using_borg
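The counters at the top of the dump invite a little arithmetic. Here is a minimal Python sketch that turns them into ratios. The pairings are my own guesses, not anything Google documents: I am assuming the pacemaker sum divides by the pacemaker count, that `avg-latency-ms_total` is a running sum of per-query latencies to be divided by `queries_total`, and that `cpu-speed` is in Hz.

```python
# Counter values copied verbatim from the error page above.
counters = {
    "pacemaker-alarm-delay-in-ms-overall-sum": 2341989,
    "pacemaker-alarm-delay-in-ms-total-count": 7776761,
    "cpu-speed": 2800000000,
    "timedout-queries_total": 14227,
    "num-docinfo_total": 10680907,
    "avg-latency-ms_total": 3545152552,
    "num-docinfo-disk_total": 2200918,
    "queries_total": 1229799558,
}


def ratio(numerator, denominator):
    """Divide one counter by another (the pairing itself is an assumption)."""
    return counters[numerator] / counters[denominator]


# Assumed pairings: each "sum"/"total" is divided by its apparent count.
avg_alarm_delay_ms = ratio("pacemaker-alarm-delay-in-ms-overall-sum",
                           "pacemaker-alarm-delay-in-ms-total-count")
avg_latency_ms = ratio("avg-latency-ms_total", "queries_total")
timeout_fraction = ratio("timedout-queries_total", "queries_total")
docinfo_on_disk = ratio("num-docinfo-disk_total", "num-docinfo_total")
cpu_ghz = counters["cpu-speed"] / 1e9  # assumes the unit is Hz

print(f"average pacemaker alarm delay: {avg_alarm_delay_ms:.3f} ms")
print(f"average latency per query:     {avg_latency_ms:.2f} ms")
print(f"fraction of queries timed out: {timeout_fraction:.2e}")
print(f"docinfo entries on disk:       {docinfo_on_disk:.1%}")
print(f"clock speed:                   {cpu_ghz:.1f} GHz")
```

If those pairings are right, the machine averages a bit under 3 ms per query, times out on roughly one query in 86,000, and keeps about a fifth of its docinfo entries on disk. If they are wrong, the numbers mean nothing, so treat them as a guess.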
It’s hard to say exactly what everything in this dump means, but a few things stand out: there are “cooking” parameters that appear to limit how the PageRank cutoff moves from round to round, presumably to control convergence over time; the server is a 2.8 GHz machine; and there is a bunch of timing information. Thanks to the parameters section, though, we can try to infer:
- Country of origin and/or language is interpreted per domain and passed on to the pages in the domain tree
- Production code is being used
- Pages are parsed into discrete units of meaning, such as sentences
- Some product-of-vectors optimization is being done (to aid in PageRank computation, etc.)
- The body, the links (anchors), and the URL are the sections most interesting to Google
- The Borg are involved!!
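To make guesses like these easier to check, the flag string can be broken into a dictionary. Below is a rough Python sketch run on an excerpt of the flags. `parse_flags` is a hypothetical helper of my own, and it assumes that a `no` prefix switches a boolean flag off, which is the convention in Google's open-sourced gflags library. Whether this particular binary follows that convention is a guess. The truncated first flag is left out of the excerpt.

```python
# Excerpt of the flags from the error page, written with "--" prefixes.
DUMP = (
    "--pagerank_cutoff_decrease_per_round=100 "
    "--pagerank_cutoff_increase_per_round=500 "
    "--parents=12,13,14,15,16,17,18,19,20,21,22,23 "
    "--pass_country_to_leaves --production "
    "--sections_to_retrieve=body+url+compactanchors "
    "--nouse_compressed_urls --nouse_experimental_indyrank "
    "--use_spamscore --using_borg"
)


def parse_flags(text):
    """Hypothetical helper: split a flag string into a {name: value} dict."""
    flags = {}
    for token in text.split():
        # Strip leading hyphens, and the en dash that blog software
        # tends to substitute for a double hyphen.
        name = token.lstrip("-\u2013")
        if "=" in name:
            key, value = name.split("=", 1)
            flags[key] = value  # values are kept as strings
        elif name.startswith("no"):
            # Assumption: "--nofoo" means boolean flag "foo" is off.
            # A flag whose real name starts with "no" would be misread.
            flags[name[2:]] = False
        else:
            flags[name] = True
    return flags


flags = parse_flags(DUMP)
print(flags["sections_to_retrieve"].split("+"))
print(len(flags["parents"].split(",")), "parents")
print("compressed URLs:", flags["use_compressed_urls"])
print("spamscore:", flags["use_spamscore"])
```

Read this way, compressed URLs and the experimental indyrank are switched off while spamscore is on, and the server lists twelve parents.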
We’ll probably only learn more about the internals of Google over time, by slowly watching how their product works. Any suggestions? Leave a comment.
This entry was posted on Tuesday, July 4th, 2006 at 11:45 pm.
