How do you evaluate the efficiency of an algorithm if the problem space is underspecified?
There was a post on here that posed the following question:
You have a two-dimensional plane of (x, y) coordinates. A bunch of random points are chosen. You need to select the largest possible set of chosen points, such that no two points share an x coordinate and no two points share a y coordinate.
This is all the information provided.
There were two possible solutions presented.
One suggested using a maximum flow algorithm, such that each selected point maps to a path linking (source → x → y → sink). It runs in O(v³) time, where v is the number of selected vertices (i.e. chosen points).
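For concreteness, here is a minimal sketch of that formulation (not from the original post; the names are my own). With unit capacities, the (source → x → y → sink) network is just a maximum bipartite matching between the distinct x values and the distinct y values, with one edge per chosen point; this version uses a simple augmenting-path (Kuhn-style) matching rather than a general max-flow solver.

```python
# Minimal sketch of the flow/matching formulation (illustrative only).
# Each distinct x value can be matched to at most one distinct y value,
# via the chosen points, which is exactly the constraint
# "no two selected points share an x or a y".

def max_selectable_points(points):
    """points: iterable of (x, y) tuples. Returns the size of the largest
    subset in which no two points share an x or a y coordinate."""
    adj = {}                               # x -> list of y values it can use
    for x, y in points:
        adj.setdefault(x, []).append(y)

    match_of_y = {}                        # y -> x currently matched to it

    def try_augment(x, visited):
        # Depth-first search for an augmenting path starting at x.
        for y in adj[x]:
            if y in visited:
                continue
            visited.add(y)
            if y not in match_of_y or try_augment(match_of_y[y], visited):
                match_of_y[y] = x
                return True
        return False

    return sum(try_augment(x, set()) for x in adj)

# Of these four points, at most three can be selected: (1,2), (2,1), (3,3).
print(max_selectable_points([(1, 1), (1, 2), (2, 1), (3, 3)]))  # -> 3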
Another (mine) suggested using the Hungarian algorithm. Create an n×n matrix of 1s, and set every chosen (x, y) coordinate to 0. The Hungarian algorithm will give the lowest-cost assignment, and the answer is the number of selected coordinates that are equal to 0. It runs in O(n³) time, where n is the greater of the number of rows or the number of columns.
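Here is a minimal sketch of that matrix idea, assuming SciPy is available and using scipy.optimize.linear_sum_assignment as a stand-in for a hand-rolled Hungarian implementation (it solves the same assignment problem, though SciPy's solver is a modified Jonker-Volgenant variant rather than the classical Hungarian algorithm).

```python
# Minimal sketch of the cost-matrix idea (assumes SciPy is installed).

import numpy as np
from scipy.optimize import linear_sum_assignment

def max_selectable_points_assignment(points, n):
    """points: iterable of (x, y) pairs with 0 <= x, y < n."""
    cost = np.ones((n, n))
    for x, y in points:
        cost[x, y] = 0                     # chosen coordinates cost nothing
    rows, cols = linear_sum_assignment(cost)
    # Count how many assigned cells landed on chosen (cost-0) coordinates.
    return int((cost[rows, cols] == 0).sum())

print(max_selectable_points_assignment([(1, 1), (1, 2), (2, 1), (3, 3)], n=4))  # -> 3
```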
My reasoning was that, in the vast majority of cases, the Hungarian algorithm was going to be faster; v is equal to n in the case where there is one chosen point for each row or column, and substantially greater in the case where there are more than that: given a 50×50 matrix with half the coordinates chosen, v is 1,250 and n is 50.
The counterargument was that there are cases, such as a 10⁹×10⁹ matrix with 2 points selected, where v is 2 and n is 1,000,000,000. In that case, the Hungarian algorithm takes a ridiculously long time to run, while the maximum flow algorithm is blindingly fast.
Here is my question: given that the problem doesn't provide any information regarding the size of the matrix or the probability that a given point is chosen (so you can't know for sure), how do you decide which algorithm is, in general, the better choice for the problem?
You can't; it's imponderable.
You can define "better in general" by defining what inputs you will see "in general". For example, whip up a probability model of the inputs, compute the expected value of v as a function of n, and choose the algorithm with the best expected runtime under that model. There may be arbitrary choices made in the construction of the model, so different models will give different answers. One model might choose coordinates at random; another model might look at the actual use-case of the program you're thinking of writing, and at the distribution of inputs it will encounter.
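As a toy illustration of that (my own assumptions, not part of the answer above): suppose the model is that each of the n×n cells is chosen independently with probability p. Then the expected number of chosen points is v = p·n², and you can compare the two nominal cost functions directly, ignoring constant factors.

```python
# Toy probability model (illustrative assumptions): each of the n*n cells
# is chosen independently with probability p, so the expected number of
# chosen points is v = p * n**2. Compare the nominal v**3 and n**3 costs.

def preferred_algorithm(n, p):
    expected_v = p * n * n
    return "max-flow" if expected_v ** 3 < n ** 3 else "Hungarian"

print(preferred_algorithm(50, 0.5))        # dense 50x50 grid            -> Hungarian
print(preferred_algorithm(10**9, 2e-18))   # ~2 points in a 10^9 grid    -> max-flow
```

Of course the constant factors, and the model itself, are exactly the arbitrary choices being pointed out here.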
You can alternatively talk about which has the best worst case (across all possible inputs satisfying the constraints), which has the virtue of being easy to define, and the flaw that it's not guaranteed to tell you much about the performance of an actual program. For instance, heapsort is faster than quicksort in the worst case, but slower in the average case. Which is faster? That depends on whether you care about the average case or the worst case. If you don't care which case, then you're not allowed to care which "is faster".
This is analogous to trying to answer the question "what is the probability that the next person I see will have an above (mean) average number of legs?".
We might implicitly assume that the next person we meet is selected at random with a uniform distribution over the human population (and hence the answer is "slightly less than one", since the mean average is less than the modal average, and the vast majority of people are at the mode).
Or we might assume that our next meeting with a person is randomly selected with a uniform distribution over the set of meetings between two people, in which case the answer is still "slightly less than one", but I reckon not the exact same value as the first: one- and zero-legged people quite possibly congregate with "their own kind" with more frequency than their numbers in the population would suggest. Or possibly they congregate less, I don't know; I just don't see why it should be the same once you take account of veterans' associations and so on.
Or we might use our own knowledge: if you live with a one-legged person, your answer might be "very slightly above 0".
Which of the three answers is "correct" depends precisely on the context that the question forbids us from talking about. So we can't talk about which is correct.