Google presented a mathematics-heavy paper in February that studied a new way of ordering search results to improve their accuracy: an endogenous method that favours factual correctness and reflects it in the ranking. The paper is titled Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources.
Google researchers say in the study: “The quality of web sources has been traditionally evaluated using exogenous signals such as the hyper-link structure of the graph. We propose a new approach that relies on endogenous signals, namely, the correctness of factual information provided by the source. A source that has few false facts is considered to be trustworthy.”
This trustworthiness score is the core of the new study Google has commissioned to find a better way of organising Google searches along fact-based lines rather than by a page's standing in the web's hyperlink graph. Search results compiled in this way would allow pages with few or no factual inaccuracies to show up higher and earlier in search results lists.
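To make the idea concrete, here is a minimal sketch in Python of an endogenous trust score, assuming we already know which of a source's stated facts are correct. The function names `trust_score` and `rank_by_trust`, the smoothing prior and the toy data are all illustrative assumptions, not the model in Google's paper, which is considerably more involved.

```python
def trust_score(fact_checks, prior_correct=1.0, prior_total=2.0):
    """Return the smoothed fraction of a source's facts that are correct.

    fact_checks: list of booleans, True where a stated fact was judged correct.
    The prior is an assumption of this sketch; it keeps sources with very
    few facts from receiving an extreme score of 0.0 or 1.0.
    """
    correct = sum(fact_checks)
    return (correct + prior_correct) / (len(fact_checks) + prior_total)


def rank_by_trust(sources):
    """Order source names from most to least trustworthy."""
    return sorted(sources, key=lambda name: trust_score(sources[name]), reverse=True)


# Hypothetical toy data: each source maps to the verdicts on its facts.
sources = {
    "reference-site": [True] * 18 + [False] * 2,
    "gossip-site": [True] * 5 + [False] * 15,
    "tiny-blog": [True],
}

ranking = rank_by_trust(sources)
print(ranking)
```

Note that nothing in this score depends on how many other pages link to a source, which is the sense in which the signal is endogenous rather than exogenous.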
As yet there is nothing more concrete policy-wise than Google's intention to study this subject and its existing commitment through associated projects like Knowledge Vault. But the ramifications of Google applying such technology to searches are huge.
In one recent Google trial with a random sampling of pages, researchers found that only 20 of 85 factually correct sites were ranked highly under Google’s current scheme. A switch to an endogenous trustworthiness scoring methodology could, theoretically, put better and more reliable information in the path of the millions of people who use Google every day. That would have massive implications not only for SEO, but also for civil society and media literacy.
The Knowledge Vault project is integral to Google's study of this new search method. It has compiled close to 2 billion facts so far, collecting them from the Web and then verifying them against existing sources. Google claims that for 271 million of these facts, the probability of actual correctness is well over 90 percent.
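The kind of thresholding behind that last figure can be sketched in a few lines of Python. The example facts, their probabilities and the helper name `confident_facts` are made up for illustration; only the 90 percent cut-off and the two totals come from the figures quoted above, and "close to 2 billion" is rounded to exactly 2 billion.

```python
def confident_facts(facts, threshold=0.9):
    """Keep only facts whose estimated probability of being correct exceeds the threshold."""
    return [fact for fact in facts if fact["probability"] > threshold]


# Hypothetical extracted facts, each with an estimated probability of correctness.
facts = [
    {"claim": "Paris is the capital of France", "probability": 0.97},
    {"claim": "Lyon is the capital of France", "probability": 0.04},
    {"claim": "The Seine flows through Paris", "probability": 0.62},
]

kept = confident_facts(facts)
print([fact["claim"] for fact in kept])

# Share of the quoted totals that clears the bar (2 billion is an approximation).
high_confidence_share = 271_000_000 / 2_000_000_000
print(f"{high_confidence_share:.1%}")
```

On those quoted totals, a little under 14 percent of the compiled facts clear the 90 percent bar.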
In subsequent analysis, gossip sites and Web forums fare very poorly in the Knowledge Vault assessments despite their high popularity along exogenous lines, replete as they are with opinion and speculation as well as outright mistakes and deception. It's possible to see commercial sites reacting to this new fact-based search algorithm with deep concern. A lower search presence could cost them valuable advertising money as some advertisers migrate to the higher-ranked sites favoured by the new factual accuracy algorithm.
Certainly the adoption of a trustworthiness search model could drive some advertisers away from many currently popular sites, though Google would have a fair argument in suggesting that this was the market correcting its own investment choices in light of better data and analysis. But as many entertainment and social media sites are sought out less for their factual accuracy than for their community and their informatively salacious exchanges, the impact on these sites may not be particularly severe. People wanting to use these sites will be able to filter their searches, just as they can currently, to find exactly what they want. And some of these entertainment sites may well work hard to increase the proportion of accurate facts on their pages to earn higher search listings.
The greater problem, however, begins when looking at contrarian positions on science, such as those of the anti-vaccination movement and climate change sceptics. Yes, the new algorithmic process could certainly rank more factually accurate pages higher, eroding the entrenched position of the rampant politicised disinformation on the internet. But this assumes that everyone searching the internet has the same interpretation of accuracy as Google and academia, and that they are entirely rational actors in this theatre.
This assumption of rationality could very well backfire, feeding the mass of conspiracy theories flooding the internet and further entrenching those already mired in misinformation. Google and its supporters could well argue that the solidification of a partisan minority is inevitable, and that it is the majority of undecided but concerned citizens they are aiding with a more trustworthy, accurate search system. And it does seem reasonable to hold search to the same criteria as science, which treats testable reality as the ultimate arbiter of objective truth.
But despite my tacit support, there are some potentially very rough edges to this new search proposal that should concern us all. Of perhaps even greater concern to the journalistic community is the impact of such searches on news analysis pieces, which are fundamental to community discussion of social and political events. It could unintentionally limit the reach of op-ed pieces and analysis by downranking sites' discussion of breaking events that are not yet well studied empirically, or that remain for a time, until further studies take place, the subject of pronounced academic debate. One example is the apparently faster-than-light neutrinos reported in 2011 by the OPERA experiment, a result that, though solidly ensconced in empirical science, took months of further experiments to resolve.
Adding to the problems here is the very nature of journalism. Journalistic analysis is often speculative, and its discussion often begins with the limited facts available at the time. It is also hard to believe that all measures of truth and accuracy can be easily quantified by Google's fact-based algorithms. And when discussing future events, where the facts are usually incompletely known, it is easy to see that journalism itself could be harmed by the relegation of much of its work to lower positions in the proposed search system's order.
A distinction between news and analysis would likely need to be an essential part of the successful adoption of any such trustworthiness search system. But it remains to be seen whether Google's software is even technologically capable of such subtle distinctions. It's also important to remember that Google is a corporation, not a public service, and when push comes to shove the search engine giant exists to make money for its shareholders, not to help us out.
If and when Google decides to implement the new ranking system, it will do so for its own, not entirely understood, agenda, one that may or may not coincide with offering an improved service for users. And since Google is not a democracy, we can’t vote on how it amends its search algorithms. Indeed, its decisions are closely guarded commercial secrets, leaving us stuck with whatever its executives decide.
However, Google's new brainchild is, as yet, only a research project. So we can hope that if it ever does see the light of day, it is sufficiently nuanced to reflect the development cycle of a fact, and the necessary and not entirely empirical discussion of events that must take place for us to remain a truly well-informed and democratic society.
If not, I suspect that many Google customers may be seeking a search engine alternative in coming years.