|
Bush himself conceded a certain conservatism in the paper's vision, noting: “Thus science may implement the ways in which man produces, stores, and consults the record of the race. It might be striking to outline the instrumentalities of the future more spectacularly, rather than to stick closely to methods and elements now known and undergoing rapid development, as has been done here. Technical difficulties of all sorts have been ignored, certainly, but also ignored are means as yet unknown which may come any day to accelerate technical progress as violently as did the advent of the thermionic tube. In order that the picture may not be too commonplace, by reason of sticking to present-day patterns, it may be well to mention one such possibility, not to prophesy but merely to suggest, for prophecy based on extension of the known has substance, while prophecy founded on the unknown is only a doubly involved guess.” He does this just before suggesting brain-technology interfaces, anticipating his 1945 readers’ concern that he was drifting into science fiction. We now know that the publication, if not the writing, of the article coincided approximately with the detonation of the atomic bomb at Trinity Site, the product of a project that Bush shepherded and that would prove that science fiction and fact were increasingly converging, so he can perhaps be forgiven his exuberance. Still, another aspect that is largely missing from the memex vision is the ability to construct linkages between records automatically, based on the content itself. Associative “trails” are created manually by scientists in the memex universe, much as content is associated by hyperlinks in HTML pages. We now know how greatly access to knowledge can be accelerated by searching by content, rather than by title alone or by crude metadata. 
Content mining is essential to creating associative trails that accelerate knowledge discovery beyond what individuals can achieve by knitting text and images together through their own understanding. Where is this heading? A strong possibility is that data and metadata are becoming the newest application. Statistical machine translation is one example of how data is perhaps more critical than the supporting algorithms. In this approach, which dates to the 1990s and is used by Google in its translation engine, massive collections of existing translations are mined to create useful (if not perfect) passage-level and phrase-level translations. Speaker-independent speech recognition is another area benefiting from large data sets. Directly answering questions, rather than supplying search results that are related to potential answers, is another area where data and the metadata extracted from it are essential to getting sufficient coverage and depth to make the answers useful. Automated associative content discovery is an arena that has not yet found a killer app, but it is critically dependent on content mining to create those associations through the generation of high-quality metadata about the meaning of online content. And, likely, even the realization of Bush’s direct brain interfaces will rely on large-scale data to train systems to translate the noisy and complex signals in our heads into actions, gestures, and control signals. |