performance - How can Google be so fast?


What technologies and programming decisions make Google able to serve a query so fast?

Every time I search for something (one of several times per day), it amazes me how they serve the results in about one second or less. What sort of configuration and algorithms could they have in place that accomplishes this?

Side note: it is kind of overwhelming to think that even if I put together a desktop application and used it on my own machine, it probably would not be half as fast as Google. Keep on learning, I say.


Here are some of the great answers and pointers provided:

Latency is killed by disk accesses. Hence it's reasonable to believe that all data used to answer queries is kept in memory. This implies thousands of servers, each replicating one of many shards. Therefore the critical path for search is unlikely to hit any of their flagship distributed systems technologies: GFS, MapReduce or BigTable. Crudely speaking, these will be used to process crawler results.
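To make the idea of memory-resident shards concrete, here is a minimal sketch, not a description of Google's actual system. It assumes the index is split by term, with a stable hash choosing the shard; `NUM_SHARDS`, `shard_for`, `InMemoryShard` and `build_shards` are hypothetical names invented for this illustration.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example


def shard_for(term, num_shards=NUM_SHARDS):
    """Map a term to a shard with a hash that is stable across processes."""
    return zlib.crc32(term.encode("utf-8")) % num_shards


class InMemoryShard:
    """One shard of an inverted index, held entirely in RAM (no disk access)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, term, doc_id):
        self.postings[term].add(doc_id)

    def lookup(self, term):
        # .get avoids creating empty entries for unknown terms
        return set(self.postings.get(term, set()))


def build_shards(docs, num_shards=NUM_SHARDS):
    """Index every document, routing each term to the shard that owns it."""
    shards = [InMemoryShard() for _ in range(num_shards)]
    for doc_id, text in docs.items():
        for term in text.lower().split():
            shards[shard_for(term, num_shards)].add(term, doc_id)
    return shards


docs = {1: "fast search", 2: "fast disk"}
shards = build_shards(docs)
hits = shards[shard_for("fast")].lookup("fast")  # both documents contain "fast"
```

In a real deployment each shard would live on its own set of replica servers; here they are simply objects in one process, so a lookup costs a hash plus a dictionary access.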

The handy thing about search is that there's no need to have either strongly consistent results or completely up-to-date data, so Google is not prevented from responding to a query just because a more up-to-date search result has become available.

So a possible architecture is quite simple: front-end servers process the query, normalising it (possibly by stripping out stop words etc.), then distributing it to whatever subset of replicas owns that part of the query space. (An alternative architecture is to split the data up by web pages, so that one member of every replica set needs to be contacted for every query.) Many, many replicas are probably queried, and the quickest responses win. Each replica has an index mapping queries (or individual query terms) to documents, which it can use to look up results in memory very quickly. If different results come back from different sources, the front-end server can rank them as it spits out the HTML.
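The front-end flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: replicas are simulated in-process with an artificial `delay`, and `STOP_WORDS`, `normalise`, `Replica` and `query_front_end` are hypothetical names, not anything Google is known to use.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

STOP_WORDS = {"the", "a", "an", "of", "to", "is"}  # illustrative list only


def normalise(query):
    """Lower-case the query and strip stop words."""
    return [t for t in query.lower().split() if t not in STOP_WORDS]


class Replica:
    """A simulated replica holding an in-memory term -> document-ids index."""

    def __init__(self, name, index, delay):
        self.name = name
        self.index = index
        self.delay = delay  # stands in for network and processing latency

    def search(self, terms):
        time.sleep(self.delay)
        sets = [self.index.get(t, set()) for t in terms]
        docs = set.intersection(*sets) if sets else set()
        return self.name, docs


def query_front_end(query, replicas):
    """Send the query to every replica and return the quickest response."""
    terms = normalise(query)
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    try:
        futures = [pool.submit(r.search, terms) for r in replicas]
        first_done = next(as_completed(futures))
        return first_done.result()
    finally:
        pool.shutdown(wait=False)  # do not block on the slower replicas


index = {"google": {1, 2}, "fast": {2, 3}}
replicas = [Replica("slow", index, 0.3), Replica("quick", index, 0.01)]
winner, docs = query_front_end("Is the Google fast", replicas)
```

Taking only the first response trades a little wasted work on the slower replicas for lower latency, which is acceptable here because, as noted above, the results do not have to be strictly consistent.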

Note that this is probably a long way from what Google actually does - they will have engineered the life out of this system, so there may be more caches in strange areas, weird indexes and some kind of funky load-balancing scheme, amongst other possible differences.

