Frank van der Most
In January this year, the ACUMEN (Academic Careers Understood through MEasurements and Norms) project finished the last of three feedback workshops which it had organized to test the main deliverables of the project. The workshops addressed different sets of actors with different forms of the same question: ‘What do you think?’ In all three cases, ACUMEN met with quite some criticism accompanied by equal amounts of enthusiasm. My impression is that the project hits a nerve in academia: there is a need for new ways to evaluate individual researchers. Moreover, the two main concepts that the project developed over the course of the past three years seem to be good candidates to fill the gap: the ACUMEN Portfolio and the Good Evaluation Practices guidelines. Let me introduce them to you, briefly explain what happened during the workshops and end with a schism that still needs to be resolved.
About four years ago, ACUMEN was conceived as a project that addressed important omissions in evaluation practices. The ACUMEN team saw an increasing divergence between the criteria that are being used in academic evaluations and the wider socio-economic functions of science. Or, in other words, researchers are mainly evaluated on paper-output and citation scores, whereas 1) there is much more to research than publications and 2) they perform other important tasks besides research: teaching and activities related to the societal relevance and benefit of their research. Another important criticism is that the World Wide Web has offered new means to do research, to teach and to relate research to societal concerns, which are also overlooked by evaluations. Finally, many of the quantitative indicators that have been developed to measure research activity and output are not suitable to evaluate individuals’ work.
The solution part 1: the Portfolio
ACUMEN’s solution to these problems still involves evidence that the individual researcher needs to present. The project proposes a two-sided solution to the above-mentioned problems with existing evaluations.
On the one side, the ACUMEN Portfolio (hereafter simply Portfolio) allows researchers to present themselves in a comprehensive way. It “will enable a researcher to propose an extended set of materials and criteria for evaluation in relation to the relevant scientific and social mission of her research.” (ACUMEN 2010, p. B3). This way it will empower the evaluated to address relevant items that evaluations tend to overlook.
In the course of the project, the Portfolio was developed in depth. In its present form it consists of a narrative and three sub-portfolios: one for Expertise, one for Output and one for Influence. (See the figure.) Each of these three addresses research, teaching and societal concerns with sets of so-called factors and sub-factors. In turn, each of these can be filled with one or more types of ‘evidence’, which can range from self-made claims (admittedly, not a very strong form of evidence, but sometimes that is all there is), to references (to papers, websites, collaborators, etc.), to numerical indicators (which may take some work to produce and calculate). For example, the Expertise sub-portfolio includes the factor ‘Organizational expertise’, which has the sub-factors ‘Management’, ‘Advising’, ‘Project Leadership’, ‘Collaboration’, and ‘Administration and committee work’. Under ‘Advising’ the researcher or portfolio owner is asked to list “Visits to other institutions (universities or other) and the type of advice given (list top 3)”.
Whereas a CV may contain all such visits, the portfolio asks only for a top 3 as evidence for this expertise sub-factor. Although the sub-portfolios are defined, the lists of factors, sub-factors and evidence all in principle include the option ‘Other’ to add something that the project partners have overlooked. And if someone prefers to list a top 4 rather than a top 3, that is fine too. The idea, however, is to limit the number in order to save time. For a number of sub-factors in the Expertise sub-portfolio, the researcher is invited not merely to list evidence, but also to write a few sentences to summarize and explain, for example, his/her theoretical expertise. This should be backed with references to papers or other evidence, but the point is that space for interpretation and explanation is available, which brings me to the narrative.
The narrative gives an overall-but-selective view of one’s own qualities in the light of the application for which one wants to use it. The researcher may highlight a particular achievement or discovery when he/she is applying for project funding, present a view on future research and teaching ambitions for a tenure-track job application, or give an interpretation of recent progress (or lack thereof) in an annual job appraisal or promotion application. It is a narrative, so in principle it could go in all directions, but the idea is that the researcher backs his/her interpretation with references to evidence presented in the sub-portfolios. The narrative is envisaged to be no longer than 500 words.
Besides the narrative, an important quality of the Portfolio is that it is modular (ACUMEN 2010, p. B6). The researcher can leave elements out, depending on whether he/she has something to report or not and on what is important for the evaluation purpose. Certain factors or sub-factors can be left out, but also entire sub-portfolios. If it is built, the Portfolio is imagined to be an on-line service, in which a researcher can gradually fill out elements in the course of time (or all at once, of course). When an evaluation comes up, he/she can then make a selection and add a specific narrative. The researcher also controls who gets to see which parts of the portfolio.
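To make the modular idea concrete, here is a minimal sketch in Python of how such a selection step might look. It is not the ACUMEN design: the nested-dictionary layout, the function name `select_for_evaluation`, the `keep` argument and the placeholder evidence strings are all assumptions made for illustration. Only the names of the sub-portfolios, the factor, the sub-factors and the 500-word limit come from the description above.

```python
from copy import deepcopy

MAX_NARRATIVE_WORDS = 500  # the length the text envisages for the narrative

# Hypothetical layout: sub-portfolio -> factor -> sub-factor -> list of evidence.
# The evidence strings are placeholders, not real entries.
example_portfolio = {
    "Expertise": {
        "Organizational expertise": {
            "Advising": ["<visit 1>", "<visit 2>", "<visit 3>"],
            "Management": ["<self-made claim>"],
        },
    },
    "Output": {},
    "Influence": {},
}


def select_for_evaluation(portfolio, keep, narrative):
    """Return a trimmed copy of `portfolio` plus a purpose-specific narrative.

    `keep` maps sub-portfolio -> factor -> set of sub-factor names.
    A value of None at either level means "keep everything below".
    Anything not named in `keep` is left out (assumed behaviour).
    """
    if len(narrative.split()) > MAX_NARRATIVE_WORDS:
        raise ValueError("narrative is longer than 500 words")

    selected = {}
    for sub_portfolio, factors in keep.items():
        source = portfolio.get(sub_portfolio, {})
        if factors is None:
            # Keep the whole sub-portfolio.
            selected[sub_portfolio] = deepcopy(source)
            continue
        selected[sub_portfolio] = {}
        for factor, sub_factors in factors.items():
            available = source.get(factor, {})
            if sub_factors is None:
                chosen = available
            else:
                chosen = {k: v for k, v in available.items() if k in sub_factors}
            selected[sub_portfolio][factor] = deepcopy(chosen)

    return {"narrative": narrative, "sub_portfolios": selected}


# A job application that only shows the 'Advising' evidence:
application = select_for_evaluation(
    example_portfolio,
    keep={"Expertise": {"Organizational expertise": {"Advising"}}},
    narrative="A short, application-specific narrative.",
)
```

The stored portfolio is never modified; each evaluation gets its own copy, which matches the idea that the researcher fills the portfolio gradually and selects from it per evaluation.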
The solution part 2: guidelines
The Portfolio is accompanied by Good Evaluation Practices guidelines (GEP guidelines), which are the other side of the proposed solution. The Portfolio requires some help and explanation on how to fill it out and in particular how to calculate the numerical indicators in the Influence sub-portfolio. The GEP guidelines exist not only for that purpose but also to inform evaluators. Evaluators need to assess the portfolio and here too, the numerical indicators require a lot of attention because of differences between disciplines and differences between the data sources on which the indicators are based. This is true for the bibliometric indicators, which are based on Web of Science, Scopus or Google Scholar. It is equally true for the alt-metric (or web-metric) indicators, i.e. the indicators for on-line activity, output and its influence. These are provided by Google, Bing, Academia.edu, ResearchGate.net, Twitter, WordPress, Blogger, YouTube, TED talks, SlideShare, and others. Both in the case of bibliometric indicators and of alt-metric indicators, there are ‘technicalities’ to take into account, but one should also be aware of how citations, downloads, in-links, ‘likes’, and ‘followers’ come into existence.
The guidelines acknowledge that neither group of indicators is without problems, that there are differences between indicators within both groups, and that the bibliometric indicators are more reliable than the alt-metric indicators. Thus the alt-metric indicators should be used as a complement to the bibliometric indicators.
The Utrecht workshop. Testing the main concepts on PhD students
The first workshop was a test of concept. It was part of the ‘Crafting your career’ event, held in October 2013 in Utrecht, the Netherlands, and organized by the Rathenau Institute and the CWTS research group of the University of Leiden. The participants were predominantly PhD students and early-career postdoctoral students. Those interested sent in their CV in advance, and ACUMEN team-members would give feedback and suggestions based on the outlines of the Portfolio design. In return, the participants gave their opinion about the main concepts of the narrative, and the three sub-portfolios.
The response was overwhelming. Out of a total of 170 participants, 30 sent in their CV, and it took five team members to give them individual feedback during so-called speed-dating sessions. The idea for a narrative received a lot of applause. Many students said they would consider adding one to their CV. One or two actually had one already. Other than that, most participants liked the idea of the sub-portfolios but they would use them mostly to enhance their CV with a selection of elements (factors or sub-factors). They saw the CV as the main thing and did not think they would replace it with a portfolio.
With these and many more detailed responses, we enhanced and fine-tuned the Portfolio design in a collaborative effort. Mike Thelwall (of the Statistical Cybermetrics Research Group at the University of Wolverhampton) introduced the ‘Academic age’, a concept that he had seen in the new Research Assessment Exercise in the UK. In the ACUMEN version, the start of the PhD is taken as the start of the academic life. Years can be subtracted from this academic lifetime in order to compensate for time spent on raising children or time lost due to long-term health problems, military service, non-academic jobs and other so-called ‘special allowances’. This way, indicators such as the total production of papers and books can be put in perspective.
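As an illustration of the arithmetic, the academic age could be computed roughly as follows. This is only a sketch under stated assumptions: the function names, the conversion of days to years by dividing by 365.25, and the treatment of allowances as a single number of years are mine, not rules from the ACUMEN guidelines.

```python
from datetime import date


def academic_age(phd_start, today, allowance_years=0.0):
    """Years since the start of the PhD, minus 'special allowances'.

    `allowance_years` is the total time to subtract (child care, long-term
    illness, military service, non-academic jobs, ...). How each allowance
    is counted is an assumption here; actual rules may differ per country.
    """
    if allowance_years < 0:
        raise ValueError("allowances cannot be negative")
    elapsed_years = (today - phd_start).days / 365.25
    # Never report a negative academic age.
    return max(elapsed_years - allowance_years, 0.0)


def per_academic_year(total, age_in_years):
    """Put a raw total (e.g. number of papers) in perspective."""
    if age_in_years <= 0:
        raise ValueError("academic age must be positive")
    return total / age_in_years


# Hypothetical researcher: PhD started in January 2004, two years of allowances.
age = academic_age(date(2004, 1, 1), date(2014, 1, 1), allowance_years=2.0)
papers_per_year = per_academic_year(16, age)
```

With these made-up numbers, 16 papers over roughly eight academic years is about two papers per academic year, rather than 1.6 per calendar year since the start of the PhD.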
The Madrid workshop. Try-out on researchers
In contrast to the Utrecht workshop, the Madrid workshop was fully dedicated to testing the Portfolio. It was held in December last year and 35 researchers from physics, library and information science, history, biochemistry, medicine, food science and technology, and veterinary sciences gave their input. In terms of seniority, they ranged from PhD student to senior researcher and professor. During the workshop, the participants received a fully detailed paper design of the portfolio and were asked to point out which elements made sense, which did not, and why.
Here too, the general idea was received with enthusiasm. The biggest source for discussion was the set of rules for the academic-age calculation. Different national laws governing academic employment require different ways of compensating for different sorts of allowances, which means that the calculation rules should be flexible.
Participants differed in their assessment of the usefulness of the three sub-portfolios and this seemed to differ per discipline. For example the researchers in medicine thought that Output and Influence were the most valuable parts whereas Expertise was deemed interesting. The physicists on the other hand seemed more interested in the Expertise sub-portfolio.
The many detailed comments, and the indications that participants would use certain elements and discard others, showed that the draft Portfolio design already captured a high level of diversity between tasks, disciplines, levels of seniority and countries, but that it could still be expanded. Moreover, what is deemed relevant will change over time, so regular updating of portfolio elements will be necessary, perhaps even to the level of sub-portfolios in the long run.
The Brussels workshop: scenarios for evaluators
The third testing ground was in Brussels, where we met around 30 policy makers and management staff from research councils, ministries, universities and organizations in data management and standardization thereof (EuroCRIS, CASRAI, VIVO, CRIStin). For this audience we not only presented the Portfolio design, but also developed fictional use cases in which two personas (an early-career postdoc in philosophy and a mid-career researcher in environmental engineering) applied for a job and a project grant.
These use cases and overarching issues were extensively discussed in three break-out groups. Again, many details were discussed and preferences for certain parts articulated. Because the personas’ portfolios had actually been filled out, there were also interesting observations and criticisms on these presentations. Probably, if and when the Portfolio tool is used, the researchers being evaluated will need to learn how best to present their case. Besides job applications, project applications and self-evaluation, participants also saw potential use of the Portfolio for profiling (searching for people with particular qualities) and networking. With a few notable exceptions, most participants in my break-out group were not that interested in the Narrative or the Output sub-portfolio, but particularly in Expertise and Influence. This shows that not only do researchers from different disciplines disagree about what the most relevant sub-portfolios are; evaluators and evaluated also have different views. This brings me to the schism that I mentioned in the introduction.
The schism: who selects the relevant elements from the Portfolio?
An interesting difference in presentation with the previous workshops was that here we stressed the possibilities for the evaluators (rather than the evaluated researchers) to determine which elements of the Portfolio should be filled out. In my mind, it remains an unresolved issue. On the one hand, we want to empower researchers to present themselves and their work ‘in full’, allowing them to present what they feel fits best for a particular evaluation. On the other hand, evaluators also know what they want to know about an applicant or application, and what they do not want to know. They can use web services to restrict the input from applicants.
Empowering the evaluated too much will probably cause the evaluators to discard the Portfolio system. Empowering the evaluators may cause a lot of protest. The evaluated may not have many alternatives, but often they know how to tweak situations to suit their needs. Moreover, although evaluators know what they want to know, often it is not enough: after the desk selection and peer review, more candidates and applications remain than can be awarded.
Interestingly, since around 2010 when ACUMEN was conceived, evaluators have developed on-line tools for applications. Research councils may have had them for a while, but universities have now also started using them for job applications. On the other hand, by my observations, LinkedIn and SlideShare saw increasing use among scientists during the past four years and new web services saw the light of day: researchgate.net, mendeley.net and academia.edu. Together, these websites allow researchers to present themselves and their work. So, web services are being set up for both evaluator and evaluated.
If that is the case, then consider for a moment a web-based evaluation system in which both the evaluators and the evaluated have the opportunity to indicate which elements of the portfolio are important for specific evaluation purposes. Then not only can the evaluators indicate requirements, but the evaluated can also articulate which important elements are overlooked by the evaluators. In such a system, all those engaged in a particular evaluation could see which elements the evaluator requests and what the evaluated as a group indicate is lacking. To return to the earlier example, 50 out of 65 applicants may have filled out the sub-factor ‘Advising’ in the Expertise sub-portfolio, indicating that the evaluator should consider it. If a great majority of the applicants points to an omission, the evaluator could consider taking that advice. It may help solve the problem of selecting applicants after the initial selection criteria proved insufficiently selective.
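The bookkeeping behind such a system could be as simple as the following Python sketch. No such system exists in the project; the function name `overlooked_elements`, the ‘Sub-portfolio/Sub-factor’ labels, the ‘Output/Publications’ element and the 50% threshold are all hypothetical. The figures 50 and 65 repeat the hypothetical example from the text.

```python
from collections import Counter


def overlooked_elements(requested, submissions, threshold=0.5):
    """Find portfolio elements that applicants filled out but the evaluator
    did not request, and the share of applicants that did so.

    requested:   set of element names the evaluator asked for
    submissions: list with, per applicant, the set of elements filled out
    threshold:   minimum share of applicants needed to report an element
    """
    if not submissions:
        return {}
    counts = Counter()
    for filled in submissions:
        # Count only elements outside the evaluator's own request.
        counts.update(set(filled) - set(requested))
    total = len(submissions)
    return {
        element: count / total
        for element, count in counts.items()
        if count / total >= threshold
    }


# Hypothetical call for applications: the evaluator only asks for publications.
requested = {"Output/Publications"}
submissions = (
    [{"Output/Publications", "Expertise/Advising"}] * 50
    + [{"Output/Publications"}] * 15
)
signals = overlooked_elements(requested, submissions)
```

Here `signals` reports that about 77% of the applicants (50 of 65) filled out ‘Advising’ although it was not requested, which is the kind of signal an evaluator could choose to act on.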
The ACUMEN project is funded under contract agreement 266632 of the European Commission’s Seventh Framework Program. I am grateful to Cristina Chaminade for comments on the draft version of this text.
ACUMEN (2010) Description of work (Annex I to the grant application, funded under grant agreement 266632). S.l. : ACUMEN consortium.
More information can be found at the ACUMEN website http://research-acumen.eu/