Inferring monopartite projections of bipartite networks: an entropy-based approach

Whenever directly detecting a relationship between two entities is infeasible, either because it is impractical (as in social systems) or impossible (as in financial systems), a procedure to infer the presence of a connection from the available information must be devised.

This information is often encoded in a bipartite network, a representation that captures co-behaviors of the entities under analysis (co-authorship, co-occurrences, etc.). The method presented in this paper rests on the assumption that a large number of shared attributes signals a significant similarity between nodes, hinting at an otherwise undetectable connection between them.

More specifically, to infer the presence of a link between any two nodes, our method counts their shared attributes and compares this number with a statistical benchmark: if it is found to be significantly large, the two nodes are linked. Since many node pairs are tested at once, significance is assessed with a multiple hypothesis testing procedure, the False Discovery Rate, a flexible tool that overcomes many of the limitations affecting similar criteria (e.g., the Bonferroni correction).
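The steps above can be illustrated with a minimal sketch. It is not the module's own code: all function names are hypothetical, and the matrix `link_probs` (the probability of each node-attribute link under the null model) is assumed to be already available, e.g. from a fitted BiCM. Under a null model with independent links, the number of attributes shared by two nodes follows a Poisson-binomial distribution, and the False Discovery Rate step is written here as the standard Benjamini-Hochberg procedure.

```python
import numpy as np

def cooccurrences(biadj):
    """Count the attributes shared by every pair of row nodes."""
    biadj = np.asarray(biadj, dtype=int)
    return biadj @ biadj.T

def poisson_binomial_sf(probs, observed):
    """P(X >= observed), X being a sum of independent Bernoulli(probs) trials."""
    pmf = np.array([1.0])
    for q in probs:
        # Convolve the running distribution with one more Bernoulli(q) trial.
        pmf = np.convolve(pmf, [1.0 - q, q])
    return float(pmf[observed:].sum())

def fdr_threshold(pvalues, alpha=0.05):
    """Benjamini-Hochberg: largest sorted p_(k) with p_(k) <= k * alpha / m."""
    p = np.sort(np.asarray(pvalues, dtype=float))
    m = p.size
    passed = np.nonzero(p <= alpha * np.arange(1, m + 1) / m)[0]
    return float(p[passed[-1]]) if passed.size else 0.0

def validated_projection(biadj, link_probs, alpha=0.05):
    """Return the node pairs whose shared-attribute count is significant.

    link_probs is an ASSUMED input: link probabilities from the null model.
    """
    biadj = np.asarray(biadj, dtype=int)
    link_probs = np.asarray(link_probs, dtype=float)
    n = biadj.shape[0]
    shared = cooccurrences(biadj)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Nodes i and j share attribute a with probability p_ia * p_ja.
    pvals = np.array([
        poisson_binomial_sf(link_probs[i] * link_probs[j], shared[i, j])
        for i, j in pairs
    ])
    threshold = fdr_threshold(pvals, alpha)
    return [pair for pair, pv in zip(pairs, pvals) if pv <= threshold]
```

Unlike the Bonferroni correction, which compares every p-value with the single cutoff alpha / m, the threshold here grows with the rank of the p-value, so the test is less conservative when many pairs are genuinely similar.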

Remarkably, our algorithm can be employed to define a novel recommendation system. An interesting problem concerning bipartite customer-goods networks, in fact, is devising a procedure for recommending products to users. Within our framework, such an issue can be addressed quite naturally by selecting those goods that have been “co-purchased” a significantly large number of times, thus providing suggestions tailored to each user’s basket of preferences (and not just recommending the most popular items).
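One possible reading of this idea is sketched below. The function `recommend` and its inputs are hypothetical: `validated_links` stands for the pairs of goods whose co-purchases were found to be statistically significant, and ranking candidates by the number of links to the basket is a choice made for this illustration, not something prescribed by the paper.

```python
def recommend(basket, validated_links):
    """Suggest goods linked to items in the basket but not yet purchased."""
    basket = set(basket)
    scores = {}
    for a, b in validated_links:
        # A link contributes only if exactly one endpoint is already owned.
        if a in basket and b not in basket:
            scores[b] = scores.get(b, 0) + 1
        elif b in basket and a not in basket:
            scores[a] = scores.get(a, 0) + 1
    # Most-linked goods first; ties are broken alphabetically.
    return sorted(scores, key=lambda good: (-scores[good], good))


links = [("bread", "butter"), ("bread", "jam"), ("butter", "jam"), ("milk", "tea")]
suggestions = recommend({"bread", "butter"}, links)
```

Because only validated links are considered, a very popular item is suggested only if it is significantly co-purchased with something already in the user's basket.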

References

Saracco, F., Straka, M.J., Di Clemente, R., Gabrielli, A., Caldarelli, G., & Squartini, T. Inferring monopartite projections of bipartite networks: an entropy-based approach. New Journal of Physics 19, 053022 (2017). doi:10.1088/1367-2630/aa6b38

The module contains a Python implementation of the Bipartite Configuration Model (BiCM), which can be used as a statistical null model for undirected and binary bipartite networks. Special thanks to Mika Straka.