Output

The output of a truth discovery algorithm in this library is a Result object. The most important attributes of Result objects are trust and belief, which are dictionaries containing the trust and belief scores for each source and claim. See the example below for their format.

An important method is get_most_believed_values, which returns (a generator of) the values with highest belief score for a given variable.

See the Result class for full documentation on the available attributes and methods. The example below shows the format of the trust and belief dictionaries, after running an algorithm on the first example dataset from Input Data.

>>> results = AverageLog().run(mydata)
>>> results.trust
{'source 1': 0.2637321368777009, 'source 2': 0.52146101879096, 'source 3':
0.7681758936725634, 'source 4': 1.0}
>>> results.belief["x"]
{4: 0.14915492164635274, 3: 1.0}
>>> results.belief["y"]
{7: 0.4440695965138328, 6: 0.5655545941885707}
>>> results.belief["z"]
{5: 0.7293600806789093, 8: 0.5655545941885707}
>>> results.time_taken
0.006222724914550781
>>> results.iterations
20
>>> list(results.get_most_believed_values("x"))
[3]
>>> list(results.get_most_believed_values("y"))
[6]
>>> list(results.get_most_believed_values("z"))
[5]
>>> results.filter(sources=["source 1", "source 3"]).trust
{'source 1': 0.2637321368777009, 'source 3': 0.7681758936725634}
>>> results.filter(variables=["x"]).belief
{'x': {4: 0.14915492164635274, 3: 1.0}}
>>> results.get_trust_stats()
(0.6383422623353061, 0.2746120273343826)

Difference between two set of results

It is possible to compare two sets of results using the ResultDiff class.

>>> from truthdiscovery import ResultDiff
>>> diff = ResultDiff(res1, res2)

A ResultDiff object has attributes trust, belief, time_taken and iterations, as the Result objects do. The format of each attribute is the same as in Result, but gives the increase in trust/belief/time taken/number of iterations between the first and second set of results (the numbers are negative in the case of a decrease).

This is useful for comparing results after making a small change to the input dataset: for example to study the effects on trust scores of a source making an additional claim, or the effects on belief when adding a new variable.