4. Discussion

While some of the findings did agree with our hypotheses, there were a number of unexpected results.

In general, the variances for task time were much larger than expected. Because of this high degree of variance, many of the results were not statistically significant. Despite this, the results across all dependent variables were mostly consistent.

The broad range of times can be attributed primarily to the choice of tasks. It seems that the range of difficulty of the six tasks was slightly greater than expected. Further study would have been necessary to ensure not only that the tasks were reasonable, but also that the difficulty of locating each target among distractors was similar, a detail which was regretfully omitted.

The velocity slider was predicted to have the greatest range of times, mostly because of its relative unfamiliarity. Surprisingly, however, the scrollbar was not only the slowest of all three interface types, but it also exhibited the largest range of times. While the scrollbar was not expected to be the most efficient method, it was certainly not expected to perform worse than the velocity slider. This result may be explained by the fact that the velocity slider involved much less manipulation of the interface than the scrollbar. The time to find an image may be significantly affected by the degree of manipulation the interface requires: the scrollbar demanded almost constant focus on the interface, whereas the velocity slider needed only one manipulation at the beginning of the visual search and fine-tuning adjustments thereafter. In addition, the practice tasks may not have been sufficient to acquaint the subjects with the peculiarities of each interface, especially the velocity slider, so its observed times may have been inflated.

The fact that task times were generally longer for the 5x5 grid suggests that in some contexts, particularly this one, a larger overview is not always advantageous. While previous experiments have shown that there exists an optimum thumbnail density for a given image database, our data suggest that this optimum is highly sensitive to the similarity of the images in the database.

The results of the button-based browser imply that a simpler interface may be better. Certainly, the other interfaces were more dynamic and sophisticated, but the button browser was faster, more satisfying, and produced the smallest error rates in all tests. Nevertheless, given sufficient training, it remains possible that a user could browse with the velocity slider at a rate comparable to, or perhaps faster than, the standard button method. Further research is necessary.

While most of the observed error rates were not statistically significant, they correlated fairly well with task time and the other measurements. The interfaces that produced the largest error rates also produced the slowest times, and vice versa.

The satisfaction and predictability measurements were closely correlated. As expected, the velocity slider was disliked the most and the buttons were liked the most, although a few subjects did prefer the velocity slider over the others. In general, satisfaction decreased slightly as the number of thumbnails increased, which is consistent with the task time observations: intuitively, users prefer rapid browsing to slow browsing, and the results reflect this. However, the measurement of subjective difficulty was not as closely correlated with the other satisfaction-type observations. This can be explained by the fact that the question, as asked, was ambiguous: some subjects answered strictly in reference to their difficulty in recognizing the target image after encountering it, while others answered in reference to the interface.

Finally, there were a few problems observed that may have affected the results:
First, one unforeseen bias in the experiment was that when the target image appeared on the first screen, the subject would often recognize it and thus never really use the widget. Indeed, a popular strategy among subjects was to first examine all the images on the first screen and only then manipulate the interface. As a result, for all tasks where the image appeared on the first screen, the time measurement did not accurately reflect the effect of the interface on the visual search. The overall bias is probably quite small, since such events occurred uniformly across tasks; however, they effectively reduce the number of accurate observations. The magnitude of the effect would seem to be directly related to the probability of the target image appearing among the first n images, where n is the number of images visible at once. Assuming the target image is uniformly distributed, this probability is the ratio n/k, where k is the size of the entire database. For our experiment, the ratio was about 5% in the 3x3 treatments and about 14% in the 5x5 treatments.

Second, the implementation of the scrollbar and the velocity slider may have had an adverse effect on the various measurements. Specifically, scrolling was not smooth but occurred in steps of one column at a time. Since the illusion of motion was not sufficiently convincing, a number of subjects were initially confused; the high similarity of the images in the database may have compounded this scrolling problem. For example, many subjects at first thought that at each step of scrolling the images did not move one column at a time, but instead moved n spaces, where n was the width of the grid. Because of this, subjects would scan some images more than once, which may have inflated the observed task-completion times.