
May 24, 2016

Computational Personality?

The field of artificial intelligence (AI) has undergone a dramatic evolution in recent years. The impressive advances in this field have inspired several leaders in the scientific and technological community - including Stephen Hawking and Elon Musk - to raise concerns about a potential domination of machines over humans.

While many people still think about AI as robots with human-like characteristics, this field is much broader and includes a number of diverse tools and applications, from Siri to self-driving cars to autonomous weapons. Among the key innovations in the AI field, IBM’s Watson computer system is certainly one of the most popular.

Developed within IBM’s DeepQA project led by principal investigator David Ferrucci, Watson can answer questions posed in natural language, and also features advanced cognitive abilities such as information retrieval, knowledge representation, automatic reasoning, and “open domain question answering”.

Thanks to these advanced functions, Watson was able to compete at the human champion level in real time on the American TV quiz show Jeopardy!. This impressive result has opened up several potential business applications of so-called “cognitive computing”, e.g. targeting big data analytics problems in health, pharma, and other business sectors. But psychology, too, may be one of the next frontiers of the cognitive computing revolution.


For example, Watson Personality Insights is a service designed to automatically generate psychological profiles on the basis of unstructured text extracted from emails, tweets, blog posts, articles and forums. In addition to a description of your personality, needs and values, the program provides an automated analysis of the “Big Five” personality traits: openness, conscientiousness, extroversion, agreeableness, and neuroticism. All these data can then be visualized in a graphic representation. According to IBM’s documentation, to give a reliable estimate of personality, the Watson program requires at least 3,500 words, but preferably 6,000 words. Furthermore, the content of the text should ideally reflect personal experiences, thoughts and responses. The psychological model behind the service is based on studies showing that the frequency with which we use certain categories of words can provide clues to personality, thinking style, social connections, and emotional stress variations.
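
To make the word-category idea concrete, here is a minimal Python sketch that computes how often a text uses words from a few categories. It is only an illustration of the general principle, not IBM's method: the CATEGORIES dictionary and the category_frequencies function are hypothetical names invented for this example, and real psycholinguistic dictionaries contain thousands of entries.

```python
import re
from collections import Counter

# Hypothetical toy categories, assumed for illustration only.
CATEGORIES = {
    "first_person": {"i", "me", "my", "mine", "myself"},
    "positive_emotion": {"happy", "love", "great", "good"},
    "negative_emotion": {"sad", "angry", "worried", "afraid"},
}


def category_frequencies(text):
    """Return the share of words in `text` that fall into each category."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = Counter()
    for word in words:
        for name, vocabulary in CATEGORIES.items():
            if word in vocabulary:
                counts[name] += 1
    # Relative frequency per category; 0.0 for an empty text.
    return {
        name: (counts[name] / total if total else 0.0)
        for name in CATEGORIES
    }


if __name__ == "__main__":
    sample = "I love my dog but I am worried"
    for name, share in category_frequencies(sample).items():
        print(f"{name}: {share:.3f}")
```

A profile built this way would only be a starting point: relative frequencies like these would still have to be related to validated personality scores before they could say anything about traits.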

Clearly, many psychologists (and non-psychologists, too) may have several doubts about the reliability and accuracy of this service. Furthermore, for some people, collecting social media data to identify psychological traits may lead to Orwellian scenarios. Although these concerns are understandable, they may be mitigated by the important positive applications and benefits that this technology may bring about for individuals, organizations and society.

Using Big Data in Cyberpsychology

Thanks to the pervasive diffusion of social media and the increasing affordability of smartphones and wearable sensors, psychologists can gather and analyse massive quantities of data concerning people’s behaviours and moods in naturalistic situations.


The availability of “big data” presents psychologists with unprecedented professional and scientific opportunities, but also with new challenges. On the business side, for example, a growing number of tech companies are hiring psychologists to help make sense of huge data sets collected online from their current and prospective customers.

The job description of a “data psychologist” requires not only perfect mastery of advanced statistics, but also the ability to identify the kinds of behaviours that are most useful to track and analyse, in order to improve products and business strategies. Psychological research, too, may be revolutionized by the emerging field of big data. Until recently, online research methods were mostly represented by web experiments and online survey studies.

Examples of topic areas included cognitive psychology and social psychology, but also health psychology and forensic psychology (for an updated list of psychological experiments on the Internet see this useful resource by the Hanover College Psychology Department).

However, the emergence of advanced cloud-based data analytics has provided psychologists with powerful new ways of studying human behaviour using digital footprints. An interesting example is CrowdSignal, a crowdfunded mobile data collection campaign that aims to build the largest set of longitudinal mobile and sensor data recorded from smartphones and smartwatches available to the community. As reported on the project’s website, the final dataset will include geo-location, sensor, system and network logs, user interactions, social connections, and communications, as well as user-provided ground truth labels and survey feedback, collected from a demographically diverse pool of Android users across the United States.

A further interesting service that exemplifies the scientific potential of social data analytics is the “Apply Magic Sauce Prediction API” developed by the Psychometrics Centre of the University of Cambridge. According to the Cambridge researchers, this algorithm can predict users’ personality traits based on Facebook interactions (i.e., Facebook Likes). To test the validity of the tool, the team compared the predictions generated by computer algorithms with the personality judgments made by humans. The results, which were reported in the Proceedings of the National Academy of Sciences (Youyou et al., 2015, PNAS, 112/4, pp. 1036–1040), showed that the computers’ judgments of people’s personalities based on their digital behaviors were more accurate than judgments made by their close others or acquaintances.


However, the emergence of “big data psychology” also presents big challenges. For example, the advantages of this approach for business and research should be weighed against the ethical, privacy and legal implications that are unavoidably linked to the collection of digital footprints. On the methodological side, it is also important to consider that quantity (of data) is not synonymous with quality (of data interpretation).

In order to create meaningful and accurate models from behavioural logs, one needs to consider the role played by contextual variables, as well as the possible data errors and spurious correlations introduced by high dimensionality.
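
The risk of spurious correlations in high-dimensional data can be illustrated with a small simulation. The Python sketch below is an assumption-laden toy example, not an analysis of any real behavioural log: it generates a random "trait" score for 30 imaginary people together with up to 500 random "features" that are unrelated to it by construction, then reports the strongest correlation found as more features are examined. The function names and sample sizes are chosen purely for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def max_abs_correlation(target, features):
    """Strongest absolute correlation between the target and any feature."""
    return max(abs(pearson(target, feature)) for feature in features)


def simulate(n_people=30, n_features=500, seed=0):
    """Draw a random target and random features that are independent of it."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_people)]
    features = [
        [rng.gauss(0, 1) for _ in range(n_people)]
        for _ in range(n_features)
    ]
    return target, features


if __name__ == "__main__":
    target, features = simulate()
    # Examine progressively larger subsets of the same pure-noise features.
    for k in (5, 50, 500):
        strongest = max_abs_correlation(target, features[:k])
        print(f"{k:>3} features: strongest |r| = {strongest:.2f}")
```

Because every feature here is pure noise, any strong correlation the script reports is spurious. The strongest value can only stay the same or increase as more features are examined, which is why models built on many behavioural variables and few participants need validation on independent data.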