Thanks to everybody for reading and for the good suggestions on how to improve the model. I will keep working on this, so if you have ideas, please leave a comment on this post. My plan for the next weeks/months is to:
- Research the Wikipedia ontology and the SPARQL language further. I only have a shallow knowledge of this field and the data extraction is tricky… For example, to make things more difficult, the field name is not consistent: in certain cases a different field is used instead of “influenced” or “influenced by”. Any patient volunteer willing to help write advanced queries is welcome!
- Add the time dimension. The information is readily available in the Wikipedia infoboxes. This will give a better view of how ideas are propagated and how long they persist.
- Improve the influence concept. At the moment the model relies on a fairly subjective and simple notion of influence. I need to add weights and longer chains of influence (e.g.: Cicero => Rousseau => Kant => … => …). This part will be tricky as we are not talking about hard disciplines (try telling a writer that his work is deeply influenced by XYZ actors and you will see either anger or tears…)
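
To make the last point a bit more concrete, here is a minimal sketch of what weighted chains of influence could look like. It is only an illustration, not the model used in this post: the edge weights are invented, the function names (`build_graph`, `influence_chains`) are hypothetical, and multiplying the weights along a chain is just one possible way to score it.

```python
from collections import defaultdict

# Hypothetical weighted edges: (influencer, influenced, weight in (0, 1]).
# The names follow the chain mentioned above; the weights are made up.
EDGES = [
    ("Cicero", "Rousseau", 0.5),
    ("Rousseau", "Kant", 0.8),
    ("Cicero", "Kant", 0.1),
]


def build_graph(edges):
    """Turn a list of (source, target, weight) into an adjacency dict."""
    graph = defaultdict(dict)
    for source, target, weight in edges:
        graph[source][target] = weight
    return graph


def influence_chains(graph, start, max_len=4):
    """Yield (path, strength) for every chain that starts at `start`.

    A chain never visits the same person twice and has at most `max_len`
    edges. Its strength is the product of the edge weights (an assumption
    made for this sketch), so longer chains are naturally weaker.
    """
    stack = [([start], 1.0)]
    while stack:
        path, strength = stack.pop()
        if len(path) > 1:
            yield path, strength
        if len(path) - 1 >= max_len:
            continue
        for successor, weight in graph.get(path[-1], {}).items():
            if successor not in path:
                stack.append((path + [successor], strength * weight))


if __name__ == "__main__":
    graph = build_graph(EDGES)
    for path, strength in influence_chains(graph, "Cicero"):
        print(" => ".join(path), round(strength, 3))
```

With a scheme like this, the indirect chain Cicero => Rousseau => Kant can end up scoring higher than the direct Cicero => Kant link, which is exactly the kind of effect the current model cannot show.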