Chapter 4 Only good intentions?
Only good intentions? The impact of a national research assessment on the productivity of sociologists in Italy
Abstract
This chapter investigates the impact of a national research assessment (VQR 2004-2010) on sociological publications in Italy. We reconstructed all publications by Italian sociologists indexed in Scopus between 2006 and 2015, i.e., before and after the assessment. We also tracked academic tenure and promotions during the assessment. Our results showed that the effect of institutional signals on productivity was minimal, while certain individual patterns were more important. Our findings suggest that by opting for informed peer review rather than bibliometric indicators, and by not providing strict signals on the importance of prestigious publications, the assessment did not stimulate a virtuous, rational adaptive response.
Keywords: Research Assessment; ANVUR; Sociologists; Italy; VQR 2004-2010
4.1 Introduction
The dominant “publish or perish” culture in academia is often associated with regular and pervasive quantitative assessments of research productivity, which are now performed worldwide. This is because the “meritocratic culture” of academia requires systematic and impartial assessment as a means to legitimize funding allocation on a competitive basis (Jappelli, Nappi, and Torrini 2017; Abramo and D’Angelo 2015). Outcomes of these assessments often even determine individual promotions and careers (Edwards and Roy 2017; Nederhof 2006; Abramo, D’Angelo, and Rosati 2014).
Previous research suggests that these assessments can have serious implications, though it is difficult to measure the effect of institutional impulses on scientists’ behaviour (Rijcke et al. 2016). Here, context matters, and developing comprehensive and robust evaluation indicators is always difficult, also considering that research is rarely pursued individually or in complete isolation (Waltman 2018). Furthermore, once introduced, the performative nature of quantitative evaluation may trigger gamification, which in turn risks distorting academic freedom and encouraging short-termism in research investment and portfolios (e.g., Rijcke et al. (2016), Hicks et al. (2015)), even nurturing an excessively competitive environment (e.g., Edwards and Roy (2017)).
In this respect, Rijcke et al. (2016) suggest that metrics-based evaluation has four negative implications. First, metrics could affect scientists’ strategic behavior, leading them to confuse goals with means and stimulating gamification. This occurs when evaluation itself becomes scientists’ goal, and research activity is planned so as to produce publishable outcomes and maximize numbers. This would increase scientists’ risk aversion and disincentivize long-term, challenging research projects (e.g., Butler (2003); see also a recent re-examination by Besselaar, Heyman, and Sandström (2017), which casts some doubt on her results and conclusions). Secondly, metrics could embody bias against interdisciplinary, cutting-edge research, especially when combined with peer review. Furthermore, indicators could stimulate scientists to reduce task complexity and standardize collaboration (Whitley 2007), thus reducing the diversity of approaches, methods and standards, including imposing rigid constraints on publication formats. Finally, metrics could affect academic and research institutions by shifting resource allocation towards more productive institutions capable of creating cumulative advantages (e.g., Abramo and D’Angelo (2015)). There is a need for empirical analyses of research assessments, especially in contexts where productivity, quantification and indicators are not intrinsic to the pre-existing reward and incentive structure of academia.
This chapter aims to analyze the impact of a recent assessment on an interesting subset of Italian academics, i.e., sociologists. The chapter is organized as follows: Section 4.2 presents our research background and the motivation of our study. Section 4.3 presents the methodology and data, while Section 4.4 presents the results. Finally, Section 4.5 summarizes the main findings and discusses certain limitations of the study.
4.2 Background
Like other European countries (Jonkers and Zacharewicz 2016), Italy has recently made progress towards national research evaluation. After some preliminary exercises (i.e., the VTR), in October 2006 the Italian government established a public independent agency, the “Italian National Agency for the Evaluation of University and Research Institutes” (henceforth ANVUR) (ANVUR 2006), but the institutive law was published only four years later with DPR 76/2010, and the agency started to operate only in February 2011 with the appointment of its Board of Directors. ANVUR was directed to assess the research productivity of universities and research institutes. Its mandate was to establish more competitive criteria for allocating university budgets and to link promotion and careers to research productivity and scientific merit. Besides assessing research output, it also evaluates teaching, administrative performance, social impact and student competence, covering all areas of science, from the hard sciences to musical conservatories (Benedetto et al. 2017b). ANVUR members include well-known Italian scientists from different scientific fields appointed by MIUR (the Italian Ministry of University and Research) (ANVUR 2013; Ancaiani et al. 2015; Benedetto et al. 2017b).
The first national assessment by ANVUR was the VQR 2004-2010, which started with a call for participation published on 7th November 2011 (basically overlapping with the introduction of the National Scientific Habilitation (ASN) as a requirement for the promotion and career of all academics (Abramo and D’Angelo 2015; Marzolla 2016)). About 185,000 research products were evaluated in 14 research areas. In STEM and the hard sciences, nearly 90% of products were journal articles. In the humanities and social sciences, journal articles covered only 26% of the total number of products, the rest being mostly books and book chapters. Academics working in research institutes were required to submit six research products, while those working in universities, having also teaching duties, were required to submit three products for evaluation. An exception was made for younger researchers, based on the time of their employment in academia (ANVUR 2013; Ancaiani et al. 2015).
ANVUR selected 450 experts organized in a number of GEVs (“Groups of Experts in Evaluation”), which mapped the structure of the disciplinary sectors that dominate the organization of Italian academia. They were asked to define the details and methodology of evaluation, including deciding whether to use a combination of peer review and bibliometric analysis or exclusively peer review to assess scientists’ production. Under pressure from GEVs and many academics in the humanities, ANVUR decided to follow a mix of quantitative metrics, such as the number of publications, citations and h-index, in the hard sciences and qualitative evaluation, such as peer review, in the humanities and social sciences. As regards the latter, VQR 2004-2010 fully complied with so-called informed peer review. Scientists under evaluation were required to pre-select and submit their best published research products, while VQR peer reviewers could identify each author and check the impact of the journal or the prestige of the book series in which products were published via online sources. This also occurred when products were articles previously published in peer-reviewed, authoritative journals: they were peer-reviewed by GEV experts after publication. Each product was categorized as Excellent, Good, Acceptable or Limited (ANVUR 2013; Ancaiani et al. 2015; Bertocchi et al. 2015).
ANVUR’s decision to use a mixture of peer review and bibliometrics sparked a heated debate in the press and media. After the publication of the evaluation results, members of ANVUR found a certain degree of agreement between peer review and bibliometric analysis (e.g., Bertocchi et al. (2015)). However, other studies re-examined the data and found inconsistencies (Baccini and De Nicolao 2016), while some concluded that “bibliometric indicators could be used only to assess hard sciences” (Abramo, D’Angelo, and Caprasecca 2009a).
Furthermore, supporters of bibliometric evaluation pointed to the excessive cost of peer review, insisted on the subjective bias of peer reviewers and argued for the higher fidelity of quantitative exercises that measured each scientist’s productivity without adding selection distortion (Abramo and D’Angelo 2011a). Indeed, VQR 2004-2010 was extremely costly and time consuming. It had a total cost of €70,654,852, with €5,940,000 for peer review and €250,000 spent on bibliometric data, which informed the peer review evaluation in a subset of the 14 scientific areas (see Geuna and Piolatto (2016) for further detail). It is worth noting that the UK’s REF, which is the reference point of the VQR, had a total cost of 106 million euros in 2014 (€17,712,000 added to the cost of the RAE), which is one of the reasons behind criticism of it (Harzing 2018). Note that the total cost of VQR 2004-2010 was approximately the entire budget of PRIN funds in 2015 (i.e., the only funding scheme by MIUR for public research in Italy), which was about €90,000,000 and for which all academics and researchers in Italy competed.
Bertocchi et al. (2015), who were members of the panel that evaluated the research outputs of scientists in scientific area 13 (Economics, Management and Statistics), randomly selected 590 of the 5,681 papers reviewed by VQR 2004-2010 experts. The GEV in this field opted for informed peer review, while peer reviewers used bibliometric indicators and journal impact indices to assess the quality of research products. They found substantial agreement between bibliometrics and peer review. However, they pointed to the potential ambiguity of these non-blind processes, as it is impossible to disentangle whether the close agreement was due to reviewers’ opinion or to the prestige of the outlet where articles were published.
Besides the interesting mixture of metrics and criteria, the case of Italy is interesting also for other reasons. Firstly, while the ANVUR assessment aimed to evaluate academic and research structures, rather than individual scientists, the Minister, the agency and the press also explicitly conveyed a strong message regarding the importance of productivity and quantitative indicators (Turri 2014). Secondly, ANVUR was also involved in developing common standards for the national habilitation of all new associate and full professors, which linked promotion and resources to research productivity. These standards generated widespread debate within the scientific community, were opposed by many academics, especially from the humanities, and in some cases generated contrasting outcomes (e.g., Baccini and De Nicolao (2016)). However, all these initiatives marked the beginning of a cultural shift in the institutional setting of Italian academia, with scientists who were unfamiliar with international standards learning for the first time about the h-index, WoS (Web of Science) and Scopus (Akbaritabar, Casnici, and Squazzoni 2018).
In addition, the case of sociologists offers important insights. While the impact of the ANVUR assessment has been extensively examined for the hard sciences and for economics, management and statistics (Bertocchi et al. 2015; Abramo, D’Angelo, and Caprasecca 2009a), the case of sociologists has not yet been considered. On the one hand, ANVUR followed the dominant opinion of academics in considering bibliometric indicators unreliable for assessing the productivity of social scientists (Benedetto et al. 2017a). On the other hand, sociologists are part of a community that includes not only humanities academics, who are predominantly qualitative, anti-bibliometric and publish preferably in national journals, but also quantitative social scientists, whose research standards are closer to those of hard scientists, who are familiar with bibliometric indicators and who publish preferably in international journals. This co-existence of different epistemic communities in the same discipline could make research assessment even more problematic and its impact even more worth investigating (Akbaritabar, Casnici, and Squazzoni 2018).
Another point is that ANVUR’s decision, under pressure from GEVs, to allow the same experts being evaluated to classify the quality of journals in their field, independent of the internationally recognized prestige of these outlets, added a further layer of ambiguity to the policy signal. It is worth noting here that most Italian journals with a national academic readership were categorized on a level with highly prestigious international ones, though their rigor of internal peer review, impact factors and citation rates were not comparable. This may have led rational, adaptive scientists to minimize the risk of delaying publication by targeting national outlets, which were less competitive than international ones. It is also worth considering that the rewards and penalties of the assessment were addressed to universities and departments rather than to teams or individual scientists. This may have created incentives for less intrinsically motivated, productive scientists to minimize their effort.
4.3 Data and measures
Data were collected from Scopus in September 2016 and included all records published by Italian sociologists between 2006 and 2015 (compared to the data used in Chapter 3, here we excluded postdocs, the majority of whom did not have any publications in 2006). This period covered five full years before and after ANVUR, whose call for participation was originally published on 7th November 2011. Looking at five years before and after the call allowed us to trace pre-existing behavior and examine scientists’ reactions to institutional policies without compromising our analysis with other exogenous factors (Jonkers and Zacharewicz 2016; Hicks 2012; Benedetto et al. 2017b).
We started from available institutional data on the MIUR website and gathered a list of all sociologists currently registered in Italian universities and research institutes, i.e., a total of 1,029 scholars. We reconstructed each academic’s level (i.e., assistant, associate or full professor), the “scientific disciplinary sector” to which they were formally assigned, gender, affiliation, department and, finally, first and last names. Promotions and careers were reconstructed by comparing data from 2010 to 2016. This was coded as a dichotomous variable in the dataset: “academic level changed” or “unchanged”.
As regards publication records, Scopus includes approximately 8,698 journals in the social sciences (Scopus 2017). These include all the most prestigious sociology journals published by Italian publishers or edited by Italian sociologists, such as Sociologica, Sociologia, Rassegna Italiana di Sociologia, Salute e Società, Studi Emigrazione, Stato e Mercato, Italian Sociological Review, Journal of Modern Italian Studies, Etnografia e Ricerca Qualitativa and Polis. This confirmed that the most prestigious Italian journals were represented in the dataset. However, in VQR 2004-2010, ANVUR categorized the more prestigious Italian journals (a total of 50 journals, 50% of which were indexed in Scopus, last checked on 1st August 2018) as “Fascia” journals, with three levels: Fascia A, Fascia B and Fascia C (ANVUR 2012). To complete our dataset, we then coded all articles in the sample as published in Fascia (either A, B or C) or non-Fascia journals.
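To make this coding step concrete, the following is a minimal pandas sketch; the column names, IDs and ISSNs are purely illustrative, since the chapter does not specify the data layout:

```python
import pandas as pd

# Hypothetical layout: one row per sociologist with academic level in 2010 and
# 2016, and one row per paper with its journal ISSN (all values illustrative).
scholars = pd.DataFrame({
    "id": [1, 2, 3],
    "level_2010": ["assistant", "associate", "full"],
    "level_2016": ["associate", "associate", "full"],
})
papers = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "journal_issn": ["1111-1111", "2222-2222", "1111-1111", "3333-3333"],
})

# Dichotomous promotion variable: "changed" vs "unchanged" between 2010 and 2016.
scholars["academic_level"] = (
    scholars["level_2010"] != scholars["level_2016"]
).map({True: "changed", False: "unchanged"})

# Fascia coding: flag articles appearing in journals on the ANVUR Fascia list
# (A, B or C); the set below is a placeholder, not the real list.
fascia_issns = {"1111-1111"}
papers["fascia"] = papers["journal_issn"].isin(fascia_issns)
```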
Following Akbaritabar, Casnici, and Squazzoni (2018), in order to measure scientists’ output and examine the effect of certain institutional and structural factors, we used a productivity indicator developed by Abramo and D’Angelo (2011b) from a microeconomic stance. This function takes time as the input of academic work and publications, and the impact of publications as the output. This index, called FSS (Fractional Scientific Strength), is shown in Eq. (1) (a computational sketch follows the symbol definitions below):
\(FSS = \frac{1}{t}\sum_{i=1}^N \frac{c_i}{\bar{c}}f_i\) (1)
- \(t\): The time window between each researcher’s first and last publication, divided into the two periods before and after ANVUR
- \(N\): Number of sociologist’s publications
- \(i\): Each sociologist’s publication
- \(c_i\): Number of citations that each publication \(i\) collected
- \(\bar{c}\): Average number of citations of all other records published in the same year
- \(f_i\): Inverse of the number of authors (fractional contribution of each author to paper)
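Given these definitions, Eq. (1) can be computed directly from a researcher’s publication records. The following Python sketch, with hypothetical column names (it is not the authors’ original code), mirrors the formula:

```python
import pandas as pd

def fss(papers: pd.DataFrame, t: float) -> float:
    """Fractional Scientific Strength (Eq. 1) for a single researcher.

    papers columns (illustrative names):
      citations          -- c_i, citations received by paper i
      year_avg_citations -- c-bar, mean citations of records published the same year
      n_authors          -- used for f_i = 1 / n_authors (fractional contribution)
    t -- length of the activity window (first to last publication, per period)
    """
    normalized = papers["citations"] / papers["year_avg_citations"]
    fractional = 1.0 / papers["n_authors"]
    return (normalized * fractional).sum() / t

# Worked example: three papers over a 5-year window.
papers = pd.DataFrame({
    "citations": [10, 0, 4],
    "year_avg_citations": [5.0, 5.0, 2.0],
    "n_authors": [2, 1, 4],
})
print(fss(papers, t=5.0))  # ((10/5)/2 + (0/5)/1 + (4/2)/4) / 5 = 0.3
```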
To measure scientists’ level of international collaboration (Katz and Martin 1997; Akbaritabar, Casnici, and Squazzoni 2018), we used an “internationalization index”, which considered the co-authors’ affiliations and countries (e.g., Leydesdorff, Park, and Wagner (2014)) and divided the number of authors with non-Italian affiliations \(a_{fi}\) by the total number of authors of each paper \(a_i\). We aggregated this value by averaging the internationalization scores of all \(N\) publications, as shown in Eq. (2):
\(IntScore = \frac{1}{N}\sum_{i=1}^N \frac{a_{fi}}{a_i}\) (2)
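Eq. (2) lends itself to an equally direct computation. The sketch below, again with hypothetical column names, is one plausible implementation:

```python
import pandas as pd

def int_score(papers: pd.DataFrame) -> float:
    """Internationalization index (Eq. 2): average share of co-authors with a
    non-Italian affiliation over a researcher's N papers."""
    return (papers["n_foreign_authors"] / papers["n_authors"]).mean()

# Example: one all-Italian paper and one with 2 foreign co-authors out of 3.
papers = pd.DataFrame({"n_foreign_authors": [0, 2], "n_authors": [2, 3]})
print(int_score(papers))  # (0/2 + 2/3) / 2 = 0.333...
```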
It is worth noting that, in order to control for the possible confounding factor of career promotion, i.e., academics promoted between 2010 and 2016, we added two categorical variables, which indicated each sociologist’s FSS level in 2010 and the number of papers they had published up to 2010. We categorized the FSS and the number of papers in 2010 as “low”, “medium” and “high” for use in the statistical models, so as to see whether they had a significant association with sociologists’ status on other variables. This was to check whether recently promoted sociologists followed different publication patterns and reacted differently to the ANVUR assessment.
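For illustration, such a categorization could be obtained with a tercile split; since the chapter does not specify the exact cut points used, the sketch below is one plausible choice rather than the original procedure:

```python
import pandas as pd

# Hypothetical 2010 values; the exact cut points behind "low"/"medium"/"high"
# are not given in the chapter, so a tercile split is only one plausible choice.
df = pd.DataFrame({
    "fss_2010": [0.0, 0.001, 0.02, 0.05, 0.12, 0.5, 0.9],
    "n_papers_2010": [0, 1, 1, 2, 3, 8, 20],
})
df["fss_level_2010"] = pd.qcut(df["fss_2010"], q=3, labels=["low", "medium", "high"])
# rank(method="first") breaks ties so that qcut does not fail on duplicate bin edges
df["papers_level_2010"] = pd.qcut(
    df["n_papers_2010"].rank(method="first"), q=3, labels=["low", "medium", "high"]
)
```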
4.3.1 Crossed membership repeated measurement model
In order to examine the importance of institutional embeddedness and compare individuals at different institutional levels before and after ANVUR, we followed Akbaritabar, Casnici, and Squazzoni (2018) in assuming that each scientist in our database was nested in different clusters, which possibly influenced their productivity. We considered two cluster levels: (1) the department, as the first level of organizational embeddedness, and (2) the university, which is relevant for power, careers and strategic relationships, sometimes also for research and collaboration (e.g., Abramo, D’Angelo, and Rosati (2016b)). This was to understand whether differences at these levels could be associated with different reactions to institutional policies. Sociologists in similar departments in different universities could be exposed to similar cultural contexts, and sociologists in different departments of the same university could be exposed to similar organizational facilities and constraints. In order to accommodate this complexity, we used a crossed-membership random effects structure (Baayen, Davidson, and Bates 2008).
In order to model the impact of ANVUR across different departments and universities, we followed previous research (Snijders and Bosker 1999; Faraway 2005; Zuur et al. 2009) and used a repeated-measures mixed effects model with crossed membership in departments and universities. This was to create “between” (different individuals in each of the conditions) and “within” (same individuals before and after ANVUR) group measures that could help us understand any possible heterogeneity in individual responses. We then kept the same random effects structure for each model so that important variables were compared without excessive complications or computational inefficiencies. In the case of the total number of publications, given the count nature of this variable and to check the robustness of our results, we ran separate models with a negative binomial distribution (while keeping the random effects structure similar to our other models); a sketch of the general model structure follows below.
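A minimal sketch of such a model, assuming a long-format panel with hypothetical column names (fss, post_anvur, id, university, department) and a hypothetical input file, could look as follows; statsmodels approximates crossed random intercepts through variance components:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Long-format panel assumed: one row per sociologist and period (names illustrative).
df = pd.read_csv("sociologists_panel.csv")  # hypothetical file

# statsmodels approximates crossed random intercepts by placing all rows in a
# single group and declaring each crossing factor as a variance component; the
# lme4 equivalent in R would be:
#   fss ~ post_anvur + (1 | id) + (1 | university) + (1 | department)
df["all"] = 1
model = smf.mixedlm(
    "fss ~ post_anvur",
    data=df,
    groups="all",
    vc_formula={
        "id": "0 + C(id)",
        "university": "0 + C(university)",
        "department": "0 + C(department)",
    },
)
print(model.fit().summary())
```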
4.4 Results
It is important to first note that, despite the vast coverage of publications in Scopus (Scopus 2017), our dataset covered only a fraction of the productivity of Italian sociologists (limited even more when focusing only on the five years before and after ANVUR). Indeed, only 57.53% of the 1,029 sociologists had at least one publication record indexed in Scopus. Most missing records were probably published in less prestigious national outlets, including book series by national publishers (Akbaritabar, Casnici, and Squazzoni 2018).
Figure 4.1 shows that the distribution of publications was highly skewed, with a few sociologists publishing a considerable share of the total number of publications. This is in line with previous studies, which showed that productivity is cumulative and non-linear (e.g., Nygaard (2015); Ramsden (1994); Coile (1977)). Table 4.1 presents a descriptive view (mean and median) of the total number of publications and FSS of sociologists compared across geographical regions, universities of different ranks, sectors and departments.
Figure 4.1: Distribution of the total number of publications 2006-2015 (Scopus data) (X = number of publications, Y = percentage of authors with that many publications)
Table 4.1: Mean and median FSS and number of publications before and after ANVUR, by geographical region, university rank, sector and department.

| Main category | Subcategory | Mean FSS (before) | Median FSS (before) | Mean pubs (before) | Median pubs (before) | Mean FSS (after) | Median FSS (after) | Mean pubs (after) | Median pubs (after) |
|---|---|---|---|---|---|---|---|---|---|
| Geo Region | center | 0.048 | 0.001 | 1.729 | 1.0 | 0.006 | 0.000 | 2.466 | 1.0 |
| Geo Region | isolated | 0.011 | 0.013 | 1.364 | 1.0 | 0.003 | 0.000 | 2.259 | 1.0 |
| Geo Region | north | 0.097 | 0.013 | 2.312 | 1.5 | 0.023 | 0.002 | 4.350 | 3.0 |
| Geo Region | south | 0.033 | 0.000 | 1.605 | 1.0 | 0.004 | 0.000 | 2.405 | 2.0 |
| University Rank | high | 0.043 | 0.007 | 1.940 | 1.0 | 0.009 | 0.001 | 3.278 | 2.0 |
| University Rank | low | 0.099 | 0.003 | 2.100 | 1.0 | 0.011 | 0.000 | 3.294 | 2.0 |
| University Rank | medium | 0.112 | 0.007 | 2.163 | 1.0 | 0.026 | 0.001 | 3.874 | 2.0 |
| Sector | SPS/07 | 0.048 | 0.003 | 1.977 | 1.0 | 0.013 | 0.001 | 3.284 | 2.0 |
| Sector | SPS/08 | 0.071 | 0.006 | 1.915 | 1.0 | 0.016 | 0.000 | 3.552 | 2.0 |
| Sector | SPS/09 | 0.123 | 0.012 | 2.308 | 1.0 | 0.019 | 0.002 | 4.186 | 2.5 |
| Sector | SPS/10 | 0.100 | 0.025 | 2.133 | 2.0 | 0.018 | 0.000 | 3.235 | 2.0 |
| Sector | SPS/11 | 0.152 | 0.023 | 2.625 | 1.5 | 0.010 | 0.004 | 2.800 | 2.0 |
| Sector | SPS/12 | 0.022 | 0.000 | 1.812 | 1.5 | 0.010 | 0.000 | 3.100 | 2.0 |
| Department | Economics | 0.131 | 0.004 | 2.538 | 1.0 | 0.034 | 0.000 | 3.913 | 2.0 |
| Department | Engineering | 0.068 | 0.006 | 1.333 | 1.0 | 0.006 | 0.001 | 3.300 | 2.0 |
| Department | Humanities | 0.074 | 0.005 | 2.121 | 1.0 | 0.013 | 0.001 | 3.528 | 2.0 |
| Department | Medicine | 0.003 | 0.000 | 1.667 | 1.0 | 0.001 | 0.000 | 3.333 | 3.0 |
| Department | Other | 0.000 | 0.000 | 1.250 | 1.0 | 0.002 | 0.000 | 2.500 | 2.0 |
| Department | Psychology | 0.012 | 0.014 | 1.667 | 1.0 | 0.008 | 0.003 | 4.000 | 2.0 |
| Department | Social Sciences | 0.071 | 0.010 | 2.020 | 1.0 | 0.016 | 0.001 | 3.465 | 2.0 |
Following Abramo, Cicero, and D’Angelo (2013), we restricted our analysis to sociologists with FSS values higher than 0. We estimated multilevel repeated-measures models with the crossed membership structure for each relevant variable: FSS, internationalization, number of coauthors, number of papers and overall citations. Results showed that FSS decreased significantly after ANVUR (-0.053***, CI = [-0.068, -0.038]). We checked whether this was merely the effect of lower citations for newer papers published after ANVUR. Here, we found that citations were lower after ANVUR, but the effect was not statistically significant (-0.009, CI = [-0.025, 0.007]). Unlike previous studies, which suggested that the internationalization of scientific collaborations has recently increased (e.g., Leydesdorff, Park, and Wagner (2014); Akbaritabar, Casnici, and Squazzoni (2018)), we found that the internationalization of Italian sociologists actually decreased after ANVUR (-0.039***, CI = [-0.052, -0.025]), as did the number of coauthors (-0.013**, CI = [-0.025, -0.00004]). Only the number of articles published by Italian sociologists increased (0.780***, CI = [0.670, 0.890]). Note that these are baseline (null) models, which had the same random effects structure as those described in the Methods section but did not include any fixed effects. In this way, we explored the general trends after ANVUR (see Table 4.5 in the Appendix for further detail).
We then looked at the effect of “within” and “between” groups on FSS and the number of papers, including the academic level and its change from 2010 to 2016 and each subject’s scientific disciplinary sector.
Table 4.2 shows the results with FSS as a dependent variable. In general, the FSS of sociologists did not increase after ANVUR. When considering between-group effects (see rows without × post-ANVUR), we found certain specific differences, such as SPS/09 and SPS/11 having higher FSS than SPS/07, and associate and full professors having significantly higher FSS than assistant professors. More importantly, however, sociologists with a higher FSS in 2010 were the same ones who were more productive in 2016. Similarly, sociologists who had a high or medium number of papers published in 2010 were more prolific also between 2010 and 2016. Moreover, we found that sociologists who had been promoted between 2010 and 2016 were not significantly more productive than those who were not.
When considering within-group effects (see × post-ANVUR rows), we did not find any significant positive impact of ANVUR, but a few significant negative impacts. The average FSS gap between sociologists who had a high FSS in 2010 and those who had a low FSS in 2010 decreased significantly after ANVUR. The same held for sociologists who were more prolific in terms of publications in 2010 compared to less prolific ones. These results contradict previous research showing that more prolific authors tended to have more recognition in terms of citations (Hâncean and Perc 2016).
Table 4.2: Mixed effects models with FSS as dependent variable.

| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | |
|---|---|---|---|---|---|---|
| Constant | 0.04 (0.01)*** | 0.03 (0.01)* | 0.07 (0.01)*** | 0.00 (0.01) | 0.01 (0.01) | |
| SPS/08 | 0.02 (0.02) | |||||
| SPS/09 | 0.08 (0.02)*** | |||||
| SPS/10 | 0.04 (0.03) | |||||
| SPS/11 | 0.10 (0.04)** | |||||
| SPS/12 | -0.02 (0.03) | |||||
| Post-ANVUR | -0.03 (0.01)** | -0.01 (0.01) | -0.05 (0.01)*** | 0.00 (0.01) | -0.00 (0.01) | |
| SPS/08 × post-ANVUR | -0.01 (0.02) | |||||
| SPS/09 × post-ANVUR | -0.07 (0.02)** | |||||
| SPS/10 × post-ANVUR | -0.04 (0.03) | |||||
| SPS/11 × post-ANVUR | -0.10 (0.04)* | |||||
| SPS/12 × post-ANVUR | 0.02 (0.04) | |||||
| Associate professors | 0.06 (0.02)*** | |||||
| Full professors | 0.07 (0.02)*** | |||||
| Associate prof. × post-ANVUR | -0.05 (0.02)** | |||||
| Full prof. × post-ANVUR | -0.07 (0.02)*** | |||||
| Level changed from 2010 | -0.00 (0.01) | |||||
| Level changed × post-ANVUR | -0.00 (0.02) | |||||
| Medium FSS in 2010 | 0.03 (0.01)* | |||||
| High FSS in 2010 | 0.53 (0.02)*** | |||||
| Medium FSS in 2010 × post-ANVUR | -0.01 (0.02) | |||||
| High FSS in 2010 × post-ANVUR | -0.40 (0.02)*** | |||||
| Medium n.o papers in 2010 | 0.11 (0.02)*** | |||||
| High n.o papers in 2010 | 0.39 (0.03)*** | |||||
| Medium n.o papers in 2010 × post-ANVUR | -0.08 (0.02)*** | |||||
| High n.o papers in 2010 × post-ANVUR | -0.26 (0.04)*** | |||||
| AIC | -1150.73 | -1192.83 | -1189.58 | -877.86 | -544.48 | |
| BIC | -1076.36 | -1146.35 | -1152.40 | -837.48 | -504.10 | |
| Log Likelihood | 591.36 | 606.42 | 602.79 | 448.93 | 282.24 | |
| Num. obs. | 771 | 771 | 771 | 419 | 419 | |
| Num. groups: id | 587 | 587 | 587 | 235 | 235 | |
| Num. groups: university | 66 | 66 | 66 | 49 | 49 | |
| Num. groups: department | 7 | 7 | 7 | 7 | 7 | |
| Var: id (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| Var: university (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| Var: department (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| Var: Residual | 0.01 | 0.01 | 0.01 | 0.00 | 0.01 | |
| ***p < 0.001; **p < 0.01; *p < 0.05 | | | | | | |
We then considered the total number of publications, rather than the FSS, as a dependent variable. Considering between-group effects (rows of Table 4.3 without × post-ANVUR), we found no significant differences in the total number of articles between sociologists from different disciplinary sectors. Those who were promoted did not publish significantly more. Sociologists with a high FSS in 2010 had more papers in 2016. Confirming previous research on productivity persistence (Hâncean and Perc 2016), we found that sociologists with a medium and high number of papers in 2010 were the same ones with more articles published in 2016.
When considering within-group effects (rows of Table 4.3 where × post-ANVUR is indicated), we found some traces of strategic signaling: sociologists who were promoted between 2010 and 2016 tended to publish more after ANVUR, but the effect was not statistically significant (Leahey, Keith, and Crockett 2010; Long 1992; Grant and Ward 1991). We found that those with a high FSS in 2010 kept a positive gap over the less prolific group even in 2016, while the gap between those with a medium or high number of papers in 2010 and the low group decreased in 2016.
Since the total number of publications has a count nature, and to check the robustness of the results presented here (which are based on the total number of publications scaled to 0-1), we ran negative binomial models using count data (see Table 4.8 in the Appendix). In the case of sociologists with a medium and high number of papers in 2010, the results were the reverse of those reported here, meaning that the productivity gap between them and those with a low number of papers in 2010 decreased by 2016, which calls for further investigation.
Table 4.3: Mixed effects models with the total number of publications (scaled to 0-1) as dependent variable.

| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | |
|---|---|---|---|---|---|---|
| Constant | 0.05 (0.01)*** | 0.02 (0.01) | 0.05 (0.01)*** | 0.03 (0.01)* | 0.00 (0.01) | |
| SPS/08 | -0.02 (0.01) | |||||
| SPS/09 | 0.01 (0.02) | |||||
| SPS/10 | -0.01 (0.02) | |||||
| SPS/11 | 0.04 (0.03) | |||||
| SPS/12 | -0.01 (0.02) | |||||
| Post-ANVUR | -0.00 (0.01) | 0.02 (0.01)* | -0.00 (0.01) | 0.01 (0.01) | 0.06 (0.01)*** | |
| SPS/08 × post-ANVUR | 0.03 (0.01) | |||||
| SPS/09 × post-ANVUR | 0.01 (0.02) | |||||
| SPS/10 × post-ANVUR | 0.01 (0.03) | |||||
| SPS/11 × post-ANVUR | -0.05 (0.03) | |||||
| SPS/12 × post-ANVUR | 0.00 (0.03) | |||||
| Associate professors | 0.05 (0.01)*** | |||||
| Full professors | 0.05 (0.01)*** | |||||
| Associate prof. × post-ANVUR | -0.03 (0.01)* | |||||
| Full prof. × post-ANVUR | -0.02 (0.01) | |||||
| Level changed from 2010 | -0.02 (0.01) | |||||
| Level changed × post-ANVUR | 0.02 (0.01) | |||||
| Medium FSS in 2010 | 0.03 (0.01) | |||||
| High FSS in 2010 | 0.25 (0.02)*** | |||||
| Medium FSS in 2010 × post-ANVUR | 0.03 (0.02) | |||||
| High FSS in 2010 × post-ANVUR | -0.02 (0.03) | |||||
| Medium n.o papers in 2010 | 0.12 (0.01)*** | |||||
| High n.o papers in 2010 | 0.42 (0.02)*** | |||||
| Medium n.o papers in 2010 × post-ANVUR | -0.04 (0.01)*** | |||||
| High n.o papers in 2010 × post-ANVUR | -0.20 (0.02)*** | |||||
| AIC | -1390.24 | -1445.51 | -1447.63 | -724.55 | -870.19 | |
| BIC | -1315.88 | -1399.03 | -1410.44 | -684.17 | -829.81 | |
| Log Likelihood | 711.12 | 732.75 | 731.81 | 372.28 | 445.09 | |
| Num. obs. | 771 | 771 | 771 | 419 | 419 | |
| Num. groups: id | 587 | 587 | 587 | 235 | 235 | |
| Num. groups: university | 66 | 66 | 66 | 49 | 49 | |
| Num. groups: department | 7 | 7 | 7 | 7 | 7 | |
| Var: id (Intercept) | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | |
| Var: university (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| Var: department (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| Var: Residual | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| ***p < 0.001; **p < 0.01; *p < 0.05 | | | | | | |
As previously mentioned, ANVUR ranked Italian journals, assigning categories such as Fascia (whether A, B or C) and non-Fascia (ANVUR 2012). Of the total of 2,607 papers in our sample, 409 (15.69%) were Fascia articles, while 2,198 (84.31%) were non-Fascia articles. By adding these categories to a separate set of models, we attempted to verify whether sociologists published differently in these journals. We hypothesized that, under assessment, sociologists could be induced to target national journals more, given that these were considered influential in the evaluation (by being Fascia) but were less competitive targets for easier publication.
However, Table 4.4 shows that Italian sociologists preferably published in non-Fascia journals even after ANVUR, although the statistical effect was not significant (0.01, CI = [-0.01, 0.02]). Furthermore, we found that after ANVUR Italian sociologists published less in journals categorized as Fascia by VQR 2004-2010 (-0.09, CI = [-0.13, -0.04]) (for more elaborate analyses on Fascia and non-Fascia journals, see Tables 4.6 and 4.7 in the Appendix). Considering that our sample covered only journals and book series of the highest quality (i.e., international and national journals indexed in Scopus, including 50% of Fascia journals), it is reasonable to suppose that other scientists either targeted even lower-quality journals and book series not indexed in Scopus or did not publish at all.
Table 4.4: Mixed effects models of publications in Fascia and non-Fascia journals after ANVUR.

| | Model 1 | Model 2 | |
|---|---|---|---|
| Constant | 0.13 (0.02)*** | 0.05 (0.01)*** | |
| Pub in Fascia journals post-ANVUR | -0.09 (0.02)*** | ||
| Pub in non-Fascia journals post-ANVUR | 0.01 (0.01) | ||
| AIC | -236.60 | -1277.24 | |
| BIC | -215.84 | -1250.24 | |
| Log Likelihood | 124.30 | 644.62 | |
| Num. obs. | 235 | 665 | |
| Num. groups: id | 214 | 508 | |
| Num. groups: university | 49 | 64 | |
| Num. groups: department | 7 | 7 | |
| Var: id (Intercept) | 0.01 | 0.00 | |
| Var: university (Intercept) | 0.00 | 0.00 | |
| Var: department (Intercept) | 0.00 | 0.00 | |
| Var: Residual | 0.01 | 0.00 | |
| ***p < 0.001; **p < 0.01; *p < 0.05 | | | |
4.5 Discussion and conclusions
This chapter aims to provide a quantitative analysis of the impact of ANVUR and VQR 2004-2010 research assessment on Italian sociologists’ publication strategies. We considered sociologists an interesting case as this community is characterized by the co-existence of different sub-communities, some closer to the humanities, while others are more aligned to international standards of hard sciences. Considering that the assessment followed informed peer review in a context that did not have any consensual evaluation standards and was not familiar with these assessments, it was interesting to examine scientists’ reactions (Abramo and D’Angelo 2011a). Indeed, besides the pros and cons of this assessment (e.g., Franceschini and Maisano (2017); Baccini and De Nicolao (2016); Abramo and D’Angelo (2017)), evaluating the research output of scientists in similar periods can provide a picture of the interplay between institutional pressure and endogenous patterns of behavior. This is why, independent of the details and specifics, the VQR assessment by ANVUR could be considered the first policy experiment on “nudging” Italian scientists towards international standards of research (e.g., Ancaiani et al. (2015); Bertocchi et al. (2015); Benedetto et al. (2017b); Benedetto et al. (2017a)).
However, it is also worth noting that the VQR assessment embodied a certain level of institutional ambiguity (Boffo and Moscati 1998). The methods and procedures used by ANVUR for sociologists included redundant, sometimes even contradictory, signals, e.g., discriminating publication outlets according to their quality in principle while at the same time restricting the Fascia list (top quality) only to certain prestigious national journals. This would suggest that certain endogenous properties of the Italian academic system, such as the presence of a well-established system of national and local publication outlets, perhaps controlled by close colleagues, could have discouraged international research without penalizing promotion and careers (e.g., Abramo, D’Angelo, and Murgia (2017)). Furthermore, considering that scientists have their own personal career agendas, also influenced by local contexts (e.g., department or university hiring programs), and that there are fixed costs and endogenous forces that constrain productivity, it is not surprising that institutional pressures were interpreted differently by different individuals. This lack of coherence between performance evaluation and promotion and career advancement (Abramo, D’Angelo, and Rosati 2014) has even led some observers to question whether the Italian academic system is prepared to reward any “performance based” scheme (Bertocchi et al. 2015). Although the assessment certainly had positive implications, i.e., establishing a competitive resource allocation scheme for university structures, its signaling function on the importance of high-quality research to social scientists was limited, with adaptive responses that probably reflected the rigidities and constraints of short-term adjustments.
Our findings suggest that ANVUR, and specifically the VQR 2004-2010 research assessment, did not stimulate the quantity or quality of research output among Italian sociologists. All observed differences were mainly due to promotion and to individual strategies and motivation, only partially malleable by institutional stimuli. Those who were more productive before were also more productive later. It is probable that the ambiguity of institutional signals about the quality of publication outlets, the artificial definition of Fascia journals, which was often irrespective of the “objective” prestige of outlets, and the mismatch between VQR 2004-2010 and the overlapping national habilitation procedures were exploited strategically by academics (Jonkers and Zacharewicz 2016).
Secondly, it is worth noting that while attempts at using VQR 2004-2010 to link research assessment to resource allocation at the university level have been successfully pursued by MIUR, positive and negative rewards have not affected individual academics or groups. Here, universities exploit their relative degree of autonomy in promotion, careers and funding allocations. Thus, comparative research on similar research assessments in different EU contexts (e.g., Sandström and Van den Besselaar (2018)), in which universities and departments are embedded in different institutional regimes (e.g., more or less autonomy and centralized/decentralized university systems) could improve our understanding of the complex interplay between institutional policies and individual behavior (Provasi, Squazzoni, and Tosio 2012).
In this respect, it is worth noting that initiatives aligning assessment and promotions at different academic organizational levels, based on more transparent evaluation procedures, could help reduce such institutional ambiguity. Systematic research comparing trajectories and adaptations in different fields could be used to understand the interplay of pre-existing endogenous forces and the different malleability of specific communities in more detail, possibly contextualizing evaluation methods (e.g., bibliometrics, informed peer review and the time scale of evaluations) on context-specific, if not individual, characteristics (see the case of gender in Marini and Meschitti (2018); see also Abramo, D’Angelo, and Rosati (2014); Jonkers and Zacharewicz (2016); Abramo, D’Angelo, and Rosati (2015)).
Finally, our work has certain limitations that should be considered. First, while Scopus covers all the most prestigious international and national outlets, including international book series, most Italian sociologists seem to address their work to a plethora of local outlets that were not included in our sample (Bertocchi et al. 2015). By including datasets that cover national publications in more detail (e.g., Google Scholar or the MIUR database), we could verify whether Italian sociologists reacted to the institutional pressure of ANVUR by increasing publications in these outlets, also considering that most of them do not follow strict peer review policies and accept/elicit submissions through their editorial boards. However, considering that our findings revealed (low-profile) adaptive responses by the most productive sociologists, it would not be surprising if they also generalized to the part of sociologists’ productivity not covered here. Furthermore, it is worth mentioning that Scopus is increasing its coverage step by step, with implications for the number and type of products, which in turn affects the observed level of scientific production as reflected in our sample and similar ones.
Secondly, our analysis of the effects of competition for promotion and career on publication patterns is incomplete. Indeed, we were only able to reconstruct who was eventually promoted during the assessment (Abramo, D’Angelo, and Rosati 2014), while we did not have any data on candidates who applied for the same positions but were not promoted. This means that the effect of competition for promotion and career may be over-estimated in our analysis. To test this, we could record all candidates for all positions as traceable on the MIUR website. However, this would require extensive and significantly time-consuming work, as data collection in this case could not be automated.
Finally, we used a crossed-membership random effects structure, which enabled us to study the organizational context along with other fixed effects while ruling out individual subjects’ random noise (see Baayen, Davidson, and Bates (2008) for a discussion). However, there are other methodological possibilities for studying the effect of policy changes on scientists’ behavior, specifically difference-in-differences and regression discontinuity designs (e.g., see Seeber et al. (2017) for an example), which we did not use here. These analytical strategies might provide further insights into ANVUR’s effects.
4.6 Appendix
This appendix includes further data and analyses that complement the findings presented in the chapter. In particular, it provides details on the repeated-measures mixed effects models with crossed membership structure, which were our baseline, and on the models that considered publications in Fascia and non-Fascia journals as dependent variables.
4.6.1 Appendix A: Baseline models
Table 4.5 shows the results of our baseline models, which were run to detect general trends in our sample. These models included the same random effects (random intercepts) structure as the other models, but did not include any fixed effects (independent variables). In other words, these models only present the direction and significance of the change in each dependent variable before and after ANVUR.
Table 4.5: Baseline (null) models for each dependent variable before and after ANVUR.

| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | |
|---|---|---|---|---|---|---|
| Constant | 0.07 (0.01)*** | 0.05 (0.01)*** | 0.06 (0.01)*** | 0.05 (0.01)*** | 0.20 (0.07)** | |
| FSS Post-ANVUR | -0.05 (0.01)*** | |||||
| Citations Post-ANVUR | -0.01 (0.01) | |||||
| Internationalization Post-ANVUR | -0.04 (0.01)*** | |||||
| N.O. authors Post-ANVUR | -0.01 (0.01)* | |||||
| N.O. of papers Post-ANVUR | 0.78 (0.06)*** | |||||
| AIC | -1207.65 | -1143.31 | -1295.47 | -1438.08 | 3171.07 | |
| BIC | -1179.76 | -1115.42 | -1267.59 | -1410.19 | 3198.95 | |
| Log Likelihood | 609.82 | 577.65 | 653.74 | 725.04 | -1579.53 | |
| Num. obs. | 771 | 771 | 771 | 771 | 771 | |
| Num. groups: id | 587 | 587 | 587 | 587 | 587 | |
| Num. groups: university | 66 | 66 | 66 | 66 | 66 | |
| Num. groups: department | 7 | 7 | 7 | 7 | 7 | |
| Var: id (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.33 | |
| Var: university (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.07 | |
| Var: department (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| Var: Residual | 0.01 | 0.01 | 0.01 | 0.01 | ||
| ***p < 0.001; **p < 0.01; *p < 0.05 | | | | | | |
4.6.2 Appendix B: papers published in Fascia journals
Table 4.6 shows the results of models that considered the number of publications in Fascia journals as the dependent variable against various independent variables, e.g., academic level, change in academic level, FSS and number of papers in 2010. Results show that after ANVUR the number of papers in Fascia journals increased significantly only for sociologists who were more prolific and more recognized by the community in 2010 (i.e., with medium and high FSS). For those who were promoted from 2010 to 2016, the number of papers published in Fascia journals increased, but only minimally and not statistically significantly (-0.037, CI = [-0.033, 0.107]).
Table 4.6: Mixed effects models of publications in Fascia journals.

| | Model 1 | Model 2 | Model 3 | |
|---|---|---|---|---|
| Constant | 0.03 (0.03) | 0.16 (0.03)*** | -0.00 (0.03) | |
| Associate professors | 0.15 (0.05)** | |||
| Full professors | 0.21 (0.05)*** | |||
| Post-ANVUR | -0.01 (0.03) | -0.12 (0.03)*** | 0.04 (0.03) | |
| Associate prof. × post-ANVUR | -0.14 (0.05)* | |||
| Full prof. × post-ANVUR | -0.14 (0.05)** | |||
| Level changed from 2010 | -0.08 (0.04) | |||
| Level changed × post-ANVUR | 0.08 (0.05) | |||
| Medium n.o papers in 2010 | 0.15 (0.04)*** | |||
| High n.o papers in 2010 | 0.95 (0.07)*** | |||
| Medium n.o papers in 2010 × post-ANVUR | -0.13 (0.05)** | |||
| High n.o papers in 2010 × post-ANVUR | -0.75 (0.08)*** | |||
| AIC | -232.55 | -225.53 | -129.11 | |
| BIC | -197.96 | -197.85 | -101.57 | |
| Log Likelihood | 126.28 | 120.76 | 74.55 | |
| Num. obs. | 235 | 235 | 116 | |
| Num. groups: id | 214 | 214 | 95 | |
| Num. groups: university | 49 | 49 | 33 | |
| Num. groups: department | 7 | 7 | 6 | |
| Var: id (Intercept) | 0.00 | 0.01 | 0.00 | |
| Var: university (Intercept) | 0.00 | 0.00 | 0.00 | |
| Var: department (Intercept) | 0.00 | 0.00 | 0.00 | |
| Var: Residual | 0.01 | 0.01 | 0.01 | |
| ***p < 0.001; **p < 0.01; *p < 0.05 | | | | |
4.6.3 Appendix C: papers published in non-Fascia journals
Table 4.7 shows the results of models that considered the number of publications in non-Fascia journals as the dependent variable against various independent variables, e.g., scientific disciplinary sector, academic level, change in academic level, FSS and number of papers in 2010. Results showed that the number of papers published in non-Fascia journals after ANVUR decreased significantly for prolific authors with a high number of papers in 2010, for associate professors and for those belonging to sector SPS/11.
Table 4.7: Mixed effects models of publications in non-Fascia journals.

| | Model 1 | Model 2 | Model 3 | |
|---|---|---|---|---|
| Constant | 0.02 (0.01) | 0.05 (0.01)*** | 0.00 (0.01) | |
| Associate professors | 0.04 (0.02)** | |||
| Full professors | 0.04 (0.01)** | |||
| Post-ANVUR | 0.02 (0.01)* | -0.00 (0.01) | 0.05 (0.01)*** | |
| Associate prof. × post-ANVUR | -0.03 (0.02) | |||
| Full prof. × post-ANVUR | -0.02 (0.02) | |||
| Level changed from 2010 | -0.02 (0.01) | |||
| Level changed × post-ANVUR | 0.02 (0.01) | |||
| Medium n.o papers in 2010 | 0.12 (0.01)*** | |||
| High n.o papers in 2010 | 0.39 (0.03)*** | |||
| Medium n.o papers in 2010 × post-ANVUR | -0.04 (0.01)** | |||
| High n.o papers in 2010 × post-ANVUR | -0.17 (0.03)*** | |||
| AIC | -1252.14 | -1261.22 | -738.08 | |
| BIC | -1207.14 | -1225.22 | -699.03 | |
| Log Likelihood | 636.07 | 638.61 | 379.04 | |
| Num. obs. | 665 | 665 | 367 | |
| Num. groups: id | 508 | 508 | 210 | |
| Num. groups: university | 64 | 64 | 47 | |
| Num. groups: department | 7 | 7 | 7 | |
| Var: id (Intercept) | 0.00 | 0.00 | 0.00 | |
| Var: university (Intercept) | 0.00 | 0.00 | 0.00 | |
| Var: department (Intercept) | 0.00 | 0.00 | 0.00 | |
| Var: Residual | 0.00 | 0.00 | 0.00 | |
| ***p < 0.001; **p < 0.01; *p < 0.05 | | | | |
4.6.4 Appendix D: Negative Binomial models
In order to check the robustness of our results with alternative modelling strategies, we ran negative binomial models with the total number of publications, which has a count nature and a highly skewed distribution, as the dependent variable (see Table 4.8). In the case of sociologists with a medium and high number of papers in 2010, the results were the reverse of those reported in Table 4.3, meaning that the productivity gap between them and those with a low number of papers in 2010 decreased by 2016, which calls for further investigation.
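As a rough illustration of this robustness check, a negative binomial regression on the raw counts could be sketched as below; note that this simplified version omits the crossed random effects (which, in R, lme4::glmer.nb would retain) and uses hypothetical column and file names:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical long-format panel with raw publication counts per period.
df = pd.read_csv("sociologists_panel.csv")  # illustrative file name

# Plain negative binomial GLM on the counts; unlike the chapter's models this
# sketch drops the crossed random effects for simplicity.
nb = smf.glm(
    "n_papers ~ post_anvur * papers_level_2010",
    data=df,
    family=sm.families.NegativeBinomial(alpha=1.0),  # alpha fixed for illustration
).fit()
print(nb.summary())
```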
Table 4.8: Negative binomial models with the total number of publications (count) as dependent variable.

| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | |
|---|---|---|---|---|---|---|
| Constant | 0.28 (0.11)** | 0.00 (0.12) | 0.23 (0.09)** | 0.24 (0.12)* | -0.06 (0.09) | |
| SPS/08 | -0.22 (0.14) | |||||
| SPS/09 | -0.02 (0.16) | |||||
| SPS/10 | -0.11 (0.24) | |||||
| SPS/11 | 0.21 (0.30) | |||||
| SPS/12 | -0.02 (0.25) | |||||
| Post-ANVUR | 0.65 (0.09)*** | 0.83 (0.11)*** | 0.70 (0.07)*** | 0.67 (0.14)*** | 1.22 (0.10)*** | |
| SPS/08 × post-ANVUR | 0.30 (0.14)* | |||||
| SPS/09 × post-ANVUR | 0.18 (0.16) | |||||
| SPS/10 × post-ANVUR | 0.12 (0.24) | |||||
| SPS/11 × post-ANVUR | -0.30 (0.29) | |||||
| SPS/12 × post-ANVUR | 0.01 (0.26) | |||||
| Associate professors | 0.31 (0.14)* | |||||
| Full professors | 0.38 (0.14)** | |||||
| Associate prof. × post-ANVUR | -0.08 (0.14) | |||||
| Full prof. × post-ANVUR | -0.02 (0.14) | |||||
| Level changed from 2010 | -0.09 (0.12) | |||||
| Level changed × post-ANVUR | 0.19 (0.11) | |||||
| Medium FSS in 2010 | 0.24 (0.14) | |||||
| High FSS in 2010 | 1.28 (0.18)*** | |||||
| Medium FSS in 2010 × post-ANVUR | 0.36 (0.16)* | |||||
| High FSS in 2010 × post-ANVUR | 0.22 (0.18) | |||||
| Medium n.o papers in 2010 | 1.01 (0.12)*** | |||||
| High n.o papers in 2010 | 1.93 (0.17)*** | |||||
| Medium n.o papers in 2010 × post-ANVUR | -0.34 (0.13)** | |||||
| High n.o papers in 2010 × post-ANVUR | -0.63 (0.16)*** | |||||
| AIC | 3181.54 | 3160.01 | 3171.43 | 1691.15 | 1630.59 | |
| BIC | 3255.90 | 3206.48 | 3208.61 | 1731.53 | 1670.97 | |
| Log Likelihood | -1574.77 | -1570.00 | -1577.71 | -835.58 | -805.30 | |
| Num. obs. | 771 | 771 | 771 | 419 | 419 | |
| Num. groups: id | 587 | 587 | 587 | 235 | 235 | |
| Num. groups: university | 66 | 66 | 66 | 49 | 49 | |
| Num. groups: department | 7 | 7 | 7 | 7 | 7 | |
| Var: id (Intercept) | 0.33 | 0.31 | 0.33 | 0.21 | 0.14 | |
| Var: university (Intercept) | 0.07 | 0.07 | 0.07 | 0.00 | 0.01 | |
| Var: department (Intercept) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| ***p < 0.001; **p < 0.01; *p < 0.05 | | | | | | |
References
Jappelli, Tullio, Carmela Anna Nappi, and Roberto Torrini. 2017. “Gender Effects in Research Evaluation.” Research Policy 46 (5): 911–24.
Abramo, Giovanni, and Ciriaco Andrea D’Angelo. 2015. “An Assessment of the First ‘Scientific Habilitation’ for University Appointments in Italy.” Economia Politica 32 (3): 329–57.
Edwards, Marc A, and Siddhartha Roy. 2017. “Academic Research in the 21st Century: Maintaining Scientific Integrity in a Climate of Perverse Incentives and Hypercompetition.” Environmental Engineering Science 34 (1): 51–61.
Nederhof, Anton J. 2006. “Bibliometric Monitoring of Research Performance in the Social Sciences and the Humanities: A Review.” Scientometrics 66 (1): 81–100.
Abramo, Giovanni, Ciriaco Andrea D’Angelo, and Francesco Rosati. 2014. “Career Advancement and Scientific Performance in Universities.” Scientometrics 98 (2): 891–907.
Rijcke, Sarah de, Paul F Wouters, Alex D Rushforth, Thomas P Franssen, and Björn Hammarfelt. 2016. “Evaluation Practices and Effects of Indicator Use—a Literature Review.” Research Evaluation 25 (2): 161–69.
Waltman, Ludo. 2018. “Responsible Metrics: One Size Doesn’t Fit All - Last Accessed 20 April 2018.” https://www.cwts.nl/blog?article=n-r2s294&title=responsible-metrics-one-size-doesnt-fit-all.
Hicks, Diana, Paul Wouters, Ludo Waltman, Sarah De Rijcke, and Ismael Rafols. 2015. “The Leiden Manifesto for Research Metrics.” Nature 520 (7548): 429.
Butler, Linda. 2003. “Explaining Australia’s Increased Share of ISI Publications—the Effects of a Funding Formula Based on Publication Counts.” Research Policy 32 (1): 143–55.
Besselaar, Peter van den, Ulf Heyman, and Ulf Sandström. 2017. “Perverse Effects of Output-Based Research Funding? Butler’s Australian Case Revisited.” Journal of Informetrics 11 (3): 905–18.
Whitley, Richard. 2007. “Changing Governance of the Public Sciences.” In The Changing Governance of the Sciences, 3–27. Springer.
Jonkers, K., and T. Zacharewicz. 2016. “Research Performance Based Funding Systems: A Comparative Assessment.” Publications Office of the European Union, Luxembourg, EUR 27837 EN. https://doi.org/10.2791/70120.
ANVUR. 2006. “National Agency for the Evaluation of the University and Research Systems - Last Accessed 20 July 2018.” http://www.anvur.it/wp-content/uploads/2014/07/LEGGE%2024%20novembre%202006%20art%202.pdf.
Benedetto, Sergio, Daniele Checchi, Andrea Graziosi, and Marco Malgarini. 2017b. “Comments on the Paper ‘Critical Remarks on the Italian Assessment Exercise’, Journal of Informetrics, 11 (2017), pp. 337–357.” Journal of Informetrics 11 (2): 622–24.
ANVUR. 2013. “ANVUR Rapporto Finale VQR - Last Accessed 20 July 2018.” http://www.anvur.org/rapporto/.
Ancaiani, Alessio, Alberto F Anfossi, Anna Barbara, Sergio Benedetto, Brigida Blasi, Valentina Carletti, Tindaro Cicero, et al. 2015. “Evaluating Scientific Research in Italy: The 2004–10 Research Evaluation Exercise.” Research Evaluation 24 (3): 242–55.
Marzolla, Moreno. 2016. “Assessing Evaluation Procedures for Individual Researchers: The Case of the Italian National Scientific Qualification.” Journal of Informetrics 10 (2): 408–38.
Bertocchi, Graziella, Alfonso Gambardella, Tullio Jappelli, Carmela A Nappi, and Franco Peracchi. 2015. “Bibliometric Evaluation Vs. Informed Peer Review: Evidence from Italy.” Research Policy 44 (2): 451–66.
Baccini, Alberto, and Giuseppe De Nicolao. 2016. “Do They Agree? Bibliometric Evaluation Versus Informed Peer Review in the Italian Research Assessment Exercise.” Scientometrics 108 (3): 1651–71.
Abramo, Giovanni, Ciriaco Andrea D’Angelo, and Alessandro Caprasecca. 2009a. “Allocative Efficiency in Public Research Funding: Can Bibliometrics Help?” Research Policy 38 (1): 206–15.
Abramo, Giovanni, and Ciriaco Andrea D’Angelo. 2011a. “Evaluating Research: From Informed Peer Review to Bibliometrics.” Scientometrics 87 (3): 499–514.
Geuna, Aldo, and Matteo Piolatto. 2016. “Research Assessment in the UK and Italy: Costly and Difficult, but Probably Worth It (at Least for a While).” Research Policy 45 (1): 260–71.
Harzing, Anne-Wil. 2018. “Running the REF on a Rainy Sunday Afternoon: Can We Exchange Peer Review for Metrics?” In 23rd International Conference on Science and Technology Indicators (STI 2018), September 12-14, 2018, Leiden, the Netherlands. Centre for Science and Technology Studies (CWTS).
Turri, Matteo. 2014. “The New Italian Agency for the Evaluation of the University System (ANVUR): A Need for Governance or Legitimacy?” Quality in Higher Education 20 (1): 64–82.
Akbaritabar, Aliakbar, Niccolò Casnici, and Flaminio Squazzoni. 2018. “The Conundrum of Research Productivity: A Study on Sociologists in Italy.” Scientometrics 114 (3): 859–82.
Benedetto, Sergio, Daniele Checchi, Andrea Graziosi, and Marco Malgarini. 2017a. “Comments on the Correspondence ‘On Tit for Tat: Franceschini and Maisano Versus ANVUR Regarding the Italian Research Assessment Exercise VQR 2011-2014’, J. Informetr., 11 (2017), 783-787.” Journal of Informetrics 11: 838–40.
Hicks, Diana. 2012. “Performance-Based University Research Funding Systems.” Research Policy 41 (2): 251–61.
Scopus. 2017. “Scopus Content Coverage Guide, Updated August 2017 - Last Accessed 7 December 2017.” https://www.elsevier.com/__data/assets/pdf_file/0007/69451/0597-Scopus-Content-Coverage-Guide-US-LETTER-v4-HI-singles-no-ticks.pdf.
ANVUR. 2012. “Documento Di Lavoro Sulla Classificazione Delle Riviste Scientifiche Italiane Dell’area Scienze Politiche E Sociali Del Gruppo Di Esperti Della Valutazione Dell’area 14 (Gev 14) - Last Accessed 1 August 2018.” http://www.anvur.it/wp-content/uploads/2012/04/gev14_allegato.pdf.
Abramo, Giovanni, and Ciriaco Andrea D’Angelo. 2011b. “National-Scale Research Performance Assessment at the Individual Level.” Scientometrics 86 (2): 347–64. https://doi.org/10.1007/s11192-010-0297-2.
Katz, J. S., and B. R. Martin. 1997. “What Is Research Collaboration?” Research Policy 26 (1): 1–18.
Leydesdorff, L., H. W. Park, and C. Wagner. 2014. “International Coauthorship Relations in the Social Sciences Citation Index: Is Internationalization Leading the Network?” Journal of the Association for Information Science and Technology 65 (10): 2111–26.
Abramo, Giovanni, Ciriaco Andrea D’Angelo, and Francesco Rosati. 2016b. “The North–South Divide in the Italian Higher Education System.” Scientometrics 109 (3): 2093–2117. https://doi.org/10.1007/s11192-016-2141-9.
Baayen, R Harald, Douglas J Davidson, and Douglas M Bates. 2008. “Mixed-Effects Modeling with Crossed Random Effects for Subjects and Items.” Journal of Memory and Language 59 (4): 390–412.
Snijders, Tom A. B., and Roel Bosker. 1999. “Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling.” London: Sage.
Faraway, Julian J. 2005. “Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models.” Boca Raton, FL: CRC Press.
Zuur, Alain F., Elena N. Ieno, Neil J. Walker, Anatoly A. Saveliev, and Graham M. Smith. 2009. “Mixed Effects Models and Extensions in Ecology with R.” New York, NY: Springer Science and Business Media.
Nygaard, Lynn P. 2015. “Publishing and Perishing: An Academic Literacies Framework for Investigating Research Productivity.” Studies in Higher Education, 1–14.
Ramsden, Paul. 1994. “Describing and Explaining Research Productivity.” Higher Education 28 (2): 207–26.
Coile, Russell C. 1977. “Lotka’s Frequency Distribution of Scientific Productivity.” Journal of the American Society for Information Science 28 (6): 366–70.
Abramo, Giovanni, Tindaro Cicero, and Ciriaco Andrea D’Angelo. 2013. “Individual Research Performance: A Proposal for Comparing Apples to Oranges.” Journal of Informetrics 7 (2): 528–39. https://doi.org/10.1016/j.joi.2013.01.013.
Hâncean, Marian-Gabriel, and Matjaž Perc. 2016. “Homophily in Coauthorship Networks of East European Sociologists.” Scientific Reports 6.
Leahey, Erin, Bruce Keith, and Jason Crockett. 2010. “Specialization and Promotion in an Academic Discipline.” Research in Social Stratification and Mobility 28 (2): 135–55.
Long, J Scott. 1992. “Measures of Sex Differences in Scientific Productivity.” Social Forces 71 (1): 159–78.
Grant, Linda, and Kathryn B Ward. 1991. “Gender and Publishing in Sociology.” Gender & Society 5 (2): 207–23.
Franceschini, Fiorenzo, and Domenico Maisano. 2017. “Critical Remarks on the Italian Research Assessment Exercise VQR 2011–2014.” Journal of Informetrics 11 (2): 337–57.
Abramo, Giovanni, and Ciriaco Andrea D’Angelo. 2017. “On Tit for Tat: Franceschini and Maisano Versus ANVUR Regarding the Italian Research Assessment Exercise VQR 2011–2014.” Journal of Informetrics 11 (3): 783–87.
Boffo, Stefano, and Roberto Moscati. 1998. “Evaluation in the Italian Higher Education System: Many Tribes, Many Territories... Many Godfathers.” European Journal of Education 33 (3): 349–60.
Abramo, Giovanni, Ciriaco Andrea D’Angelo, and Gianluca Murgia. 2017. “The Relationship Among Research Productivity, Research Collaboration, and Their Determinants.” Journal of Informetrics 11 (4): 1016–30.
Sandström, Ulf, and Peter Van den Besselaar. 2018. “Funding, Evaluation, and the Performance of National Research Systems.” Journal of Informetrics 12 (1): 365–84.
Provasi, Giancarlo, Flaminio Squazzoni, and Beatrice Tosio. 2012. “Did They Sell Their Soul to the Devil? Some Comparative Case-Studies on Academic Entrepreneurs in the Life Sciences in Europe.” Higher Education 64 (6): 805–29.
Marini, Giulio, and Viviana Meschitti. 2018. “The Trench Warfare of Gender Discrimination: Evidence from Academic Promotions to Full Professor in Italy.” Scientometrics 115 (2): 989–1006.
Abramo, Giovanni, Ciriaco Andrea D’Angelo, and Francesco Rosati. 2015. “The Determinants of Academic Career Advancement: Evidence from Italy.” Science and Public Policy 42 (6): 761–74.
Seeber, Marco, Mattia Cattaneo, Michele Meoli, and Paolo Malighetti. 2017. “Self-Citations as Strategic Response to the Use of Metrics for Career Decisions.” Research Policy.
A slightly different version of this chapter with the same title, coauthored with Giangiacomo Bravo and Flaminio Squazzoni, has been published as Akbaritabar, A., Bravo, G., & Squazzoni, F. (2021). The impact of a national research assessment on the publications of sociologists in Italy. Science and Public Policy, scab013. https://doi.org/10.1093/scipol/scab013.↩
Note that the previous Valutazione Triennale della Ricerca (Triennial Research Evaluation), carried out in 2006 for the period 2001-2003, had limited practical consequences in comparison to the UK RAE, which originally inspired it, and to VQR 2004-2010 (Jonkers and Zacharewicz 2016).↩
Science, technology, engineering, and mathematics↩
As examples of these pressures, we can point to debates in Italian publications, the blogosphere and the media, such as Cassese (2013), sociologia (2014) and AIS (2014).↩
The rest was spent on the GEVs, the leading group, peer review and other expenses, which are detailed in Geuna and Piolatto (2016).↩
Research projects of national interest (Progetti di Ricerca di Interesse Nazionale)↩
These are the scientific disciplinary sectors established by MIUR: General sociology (SPS/07), Sociology of culture and communication (SPS/08), Economic sociology (SPS/09), Environmental sociology (SPS/10), Political sociology (SPS/11) and Sociology of law and social change (SPS/12).↩
To compare MIUR and Scopus records, we not only automatically checked the correspondence between MIUR data and the Scopus publication lists with multiple criteria and step-by-step procedures, but also had a group of three independent assistants manually cross-check each conflicting or missing case. As emphasized by De Stefano et al. (2013), this is a time-consuming and hard task, but it is one of the few ways to minimize mistakes, which are sometimes due to surname changes (e.g., marriage or divorce) and homonyms.↩
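For illustration, the following is a minimal sketch of one automatic matching criterion of the kind described above (exact surname plus first-name initial). The data frames and column names (miur, scopus, surname, name, author_id) are hypothetical; the actual procedure combined multiple criteria, and the manual cross-checks cannot, of course, be reproduced in code.

```python
# Sketch of one automatic matching criterion (normalized surname plus
# first-name initial). Conflicting or missing cases were checked manually.
# Data frames and column names are hypothetical.
import unicodedata
import pandas as pd

def normalize(s: str) -> str:
    """Lowercase and strip accents so trivial spelling variants match."""
    s = unicodedata.normalize("NFKD", s)
    return "".join(c for c in s if not unicodedata.combining(c)).lower().strip()

def with_key(df: pd.DataFrame) -> pd.DataFrame:
    """Add a matching key: normalized surname + first-name initial."""
    out = df.copy()
    out["key"] = out["surname"].map(normalize) + "|" + out["name"].str[0].map(normalize)
    return out

def match_records(miur: pd.DataFrame, scopus: pd.DataFrame) -> pd.DataFrame:
    merged = with_key(miur).merge(
        with_key(scopus)[["key", "author_id"]], on="key", how="left"
    )
    # Rows left without an author_id would go to manual cross-checking.
    return merged
```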
If an author did not have any publications indexed either before or after ANVUR, we coded this as NA (not available).↩
Note that we experimented with different variable and parameter configurations (Abramo, Cicero, and D’Angelo 2013; Akbaritabar, Casnici, and Squazzoni 2018). For instance, we normalized individual data against all publications of each scientific disciplinary sector in the same year. We excluded the year of publication and included the impact factor of journals (here following Abramo, D’Angelo, and Di Costa (2008) and Abramo, D’Angelo, and Rosati (2014)). We eventually decided to follow Abramo and D’Angelo (2014b) by using only papers with more than zero citations. We did not divide publications by product type (i.e., research article, monograph or book chapter), as there was not sufficient variation within each product type in each year to build a reliable baseline for the \(\bar{c}\) calculation. Note that the distribution of publications was highly skewed (see details in the Results Section).↩
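As an illustration of the normalization just described, the following minimal sketch computes a sector- and year-normalized citation score, assuming a hypothetical data frame pubs with sector, year and citations columns; in line with this note, papers with zero citations are excluded before computing the baseline \(\bar{c}\).

```python
# Sketch of the sector- and year-level citation normalization: each paper's
# citations are divided by the mean citations (c-bar) of all papers in the
# same scientific disciplinary sector and year. Column names are hypothetical.
import pandas as pd

def normalized_citations(pubs: pd.DataFrame) -> pd.Series:
    cited = pubs[pubs["citations"] > 0].copy()  # only papers with >0 citations
    c_bar = cited.groupby(["sector", "year"])["citations"].transform("mean")
    return cited["citations"] / c_bar           # sector- and year-normalized score
```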
In the nested structure of the repeated-measures mixed effects models, each individual was assigned to their own cluster(s), with a random starting point (intercept) relative to the cluster assignment. The random structure is the same for all models; the fixed effects entered in each model are shown in Tables ?? - ??.↩
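Schematically, and assuming the random structure described in this note (a random intercept for each individual, crossed with department and university memberships), the models can be written as

\[
y_{it} = \beta_0 + \mathbf{x}_{it}\boldsymbol{\beta} + u_i + v_{d(i)} + w_{k(i)} + \varepsilon_{it},
\]

where \(y_{it}\) is the productivity of individual \(i\) in year \(t\), \(\mathbf{x}_{it}\) collects the fixed effects of the given model, and \(u_i\), \(v_{d(i)}\) and \(w_{k(i)}\) are random intercepts for the individual, their department \(d(i)\) and their university \(k(i)\).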
In all the following statistics, the first number is the coefficient, while the numbers in parentheses are the lower and upper bounds of the 95% confidence interval.↩
Note that the random structure in all five models presented in Tables ?? and ?? is the same (i.e., each individual has crossed membership in his/her department and university, as described in the Methods section). In Model 1, we added the scientific disciplinary sector as a fixed effect, in Model 2 the academic level, in Model 3 the academic status change, in Model 4 the individual FSS in 2010 (categorized as low, medium and high), and in Model 5 the number of papers published in 2010 (categorized as low, medium and high). Furthermore, note that the number of observations in Models 4 and 5 differed from Models 1-3 because certain scholars were hired after 2010 and did not have any publications recorded in 2010. This holds for 357 academics, so that only 235 individuals in the sample were actively publishing in 2010. Then, consider that 437 individuals (out of a total of 1,029) did not have any publications recorded in Scopus.↩
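For concreteness, the following minimal sketch shows how a model of this kind (here, Model 2, with academic level as a fixed effect) could be fitted, approximating the crossed individual/department/university random intercepts with variance components; the data frame df and its columns (fss, period, academic_level, individual, department, university) are hypothetical, and this is a sketch rather than the exact estimation routine used here.

```python
# Sketch of a mixed effects model with crossed random intercepts,
# approximated via statsmodels variance components. The data frame and
# column names are hypothetical; the original analysis may differ.
import pandas as pd
import statsmodels.formula.api as smf

def fit_model2(df: pd.DataFrame):
    df = df.copy()
    df["all"] = 1  # single top-level group, so the random factors are crossed
    model = smf.mixedlm(
        "fss ~ period + academic_level",  # fixed effects (cf. Model 2)
        data=df,
        groups="all",
        re_formula="0",                   # no random intercept for the dummy group
        vc_formula={                      # crossed random intercepts
            "individual": "0 + C(individual)",
            "department": "0 + C(department)",
            "university": "0 + C(university)",
        },
    )
    return model.fit()
```

An equivalent specification in R would use lme4’s lmer with `(1 | individual) + (1 | department) + (1 | university)`.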
Note that the random structure of the two models presented in Table ?? is the same as in the models in Tables ?? and ?? (i.e., each individual has crossed membership in his/her department and university). However, the dependent variables in these two models were the numbers of papers published in Fascia and non-Fascia journals. In this case, the models did not include any fixed effects (we show more detailed models with fixed effects in Tables ?? and ?? in the Appendix Section). This explains why the number of observations differed. Interestingly, some academics did not have any articles published in Fascia journals either before or after ANVUR.↩