Bibliometric evidence for empirical trade-offs
in national funding strategies
Duane Shelton and Loet Leydesdorff
ISSI 2011 Durban
Outline





Modeling of Input-Output Relations
Best Models from Correlations and Regression
Trade-offs in Allocation of R&D investments
Validation by Forecasts from Extrapolations,
Regressions, and Individual Country Models
Conclusions
Some prior work


Leydesdorff. A series starting in 1990 with
regression of papers on GERD. Most recently, a
2009 publication with Wagner on which GERD
components best encourage papers.
Shelton. Started in 2006 by modeling paper share as a
function of GERD share to account for the US decline.
Recently, a 2010 presentation with Foland used
GERD components to account for the European
Paradox.
Output dependent variables (DVs)

Papers and Paper Share



Science Citation Index
Scopus
Patents and Patent Share




Triadic
USPTO
PCT
The full paper covers all; here we will focus on
those in red.
Input variables (IVs) from OECD


Overall GERD (Gross Expenditures on R&D)
GERD source components:
    Government
    Industry
    Abroad (funding from abroad)
    Other
GERD spending components:
    HERD (higher education sector)
    BERD (business sector)
    Non-Profit (other than universities)
    GOVERD (government labs)
Number of researchers
Shares provide the best national
comparisons





Some indicators are nearly zero-sum: countries
compete for a nearly fixed number of slots for paper
publications and patent grants. (Paper submissions
and patent applications are unbounded.)
The slots do rise slowly with time, and this
complicates national comparisons.
Thus, in analyzing relative positions of nations, their
share of most outputs is a more relevant indicator.
Modeling of the inputs that cause these output
shares is also best done in shares.
Of course, once a model is built for shares, it can
easily be used to calculate absolutes.
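The conversion between absolutes and shares can be sketched as follows. This is a minimal illustration: the function names and the paper counts are hypothetical and are not data from this study.

```python
def to_shares(absolutes):
    """Convert absolute counts per country into percentage shares of the total."""
    total = sum(absolutes.values())
    return {country: 100.0 * value / total for country, value in absolutes.items()}


def to_absolute(share_percent, total):
    """Convert a percentage share back into an absolute count, given the total."""
    return share_percent / 100.0 * total


# Hypothetical paper counts, for illustration only (not data from this study).
papers = {"A": 300_000, "B": 150_000, "C": 50_000}
shares = to_shares(papers)
recovered_a = to_absolute(shares["A"], sum(papers.values()))

print(shares)
print(recovered_a)
```

Because the total number of slots changes slowly, a share forecast multiplied by a forecast of the total gives the absolute count.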
The size of nations is a confounding factor



All inputs and outputs depend on the size of the
country, making all country-wise correlations
high and obscuring which
variables are most important.
One might divide all variables by some measure
of size, but stepwise multiple linear regression
can also tease out which IVs are best for
predicting the output DV.
IVs are added one by one, in order of which
most improves the model for the DV.
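A forward stepwise procedure of this kind might look like the sketch below. It is a simplification under stated assumptions: statistics packages usually admit a variable based on an F-test or p-value, whereas this sketch uses a minimum gain in R^2 (the hypothetical `min_gain` parameter). The data are synthetic, generated so that all inputs scale with country size, and are not data from this study.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus a constant."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def forward_stepwise(X, y, names, min_gain=0.001):
    """Add IVs one at a time, each time picking the one that raises R^2 the most.

    Stops when the best remaining IV improves R^2 by less than min_gain
    (an assumed stopping rule, simpler than the usual F-to-enter test).
    """
    chosen, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {j: r_squared(X[:, chosen + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break
        chosen.append(j_best)
        remaining.remove(j_best)
        best_r2 = scores[j_best]
    return [names[j] for j in chosen], best_r2


# Synthetic data: every input tracks country size, so all correlate highly.
rng = np.random.default_rng(0)
size = rng.uniform(1, 30, 40)
gov = size + rng.normal(0, 1.0, 40)
ind = size + rng.normal(0, 3.0, 40)
herd = size + rng.normal(0, 2.0, 40)
y = 0.85 * gov + 0.3 + rng.normal(0, 0.2, 40)  # output driven by gov only

X = np.column_stack([gov, ind, herd])
selected, r2 = forward_stepwise(X, y, ["Government", "Industry", "HERD"])
print(selected, round(r2, 4))
```

Even though all three synthetic inputs correlate strongly with the output through size, the procedure enters the true driver first.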
Step-wise regression of 2007 SCI
paper share (ps07) vs. three IVs
[Figure: callouts for the three IVs (Government GERD share, HERD share, overall GERD share) and for the fit of the regression line]
Correlations: Papers vs. Inputs
Red indicates strongest correlation of pair; it will dominate a 2 IV model

                                     SCI     SCI     Scopus  Scopus
Group                 Input          1999    2007    1999    2007
Capital vs. Labor     GERD           0.982   0.977   0.977   0.938
Capital vs. Labor     Researchers    0.894   0.838   0.842   0.920
Funding Components    Industry       0.973   0.959   0.968   0.920
Funding Components    Government     0.989   0.989   0.986   0.944
Spending Components   HERD           0.976   0.983   0.977   0.928
Spending Components   BERD           0.980   0.968   0.975   0.927
Regressions of SCI paper share in 2007

IV1          IV2           Coeff1   Coeff2   Const    p1      p2      R2
GERD         Researchers   0.819    -0.027   0.536    0.000   0.697   95.5%
GERD                       0.800             0.492    0.000           95.5%
Government   Industry      0.774    0.067    0.330    0.000   0.351   97.9%
Government                 0.846             0.316    0.000           97.9%
HERD                       0.979             -0.048   0.000           96.6%
Government   HERD          0.527    0.383    0.127    0.000   0.000   98.8%

For example the best single IV model is:
Papers07 = 0.846 Governments07 + 0.316
Step-wise regression of 2007 triadic
patent share (Patents07) vs. three IVs
[Figure: callouts for the three IVs (Industry GERD share, Government GERD share, BERD share) and for the fit of the regression line]
Fit is OK, but not as good as paper models
Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B         Std. Error          Beta                        t        Sig.
1  (Constant)    .438      .450                                            .974     .337
   Gov           -.973     .251                -.843                       -3.878   .000
   Ind           1.778     .224                1.725                       7.934    .000

a. Dependent Variable: Patents07
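As a quick consistency check on the coefficient table for Patents07, each t statistic should equal the unstandardized coefficient B divided by its standard error. The sketch below recomputes t from the reported B and Std. Error values; small differences are expected because the table reports rounded numbers.

```python
# (B, Std. Error, reported t) as given in the Patents07 coefficient table.
rows = {
    "(Constant)": (0.438, 0.450, 0.974),
    "Gov": (-0.973, 0.251, -3.878),
    "Ind": (1.778, 0.224, 7.934),
}

# t = B / Std. Error; compare with the reported value.
recomputed = {name: b / se for name, (b, se, _) in rows.items()}

for name, (b, se, t_reported) in rows.items():
    print(f"{name:<11} t = {recomputed[name]:7.3f}  (reported {t_reported:7.3f})")
```

The recomputed values agree with the reported ones to within rounding, which supports reading the Gov coefficient as negative and significant.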
Shelton, R. D. & Leydesdorff, L. (in preparation).
Publish or Patent: Bibliometric evidence for empirical
trade-offs in national funding strategies
Correlations: Patents vs. Inputs
Red indicates strongest correlation of pair; it will dominate 2 IV model

                                     Triadic Triadic USPTO   USPTO
Group                 Input          1999    2007    1999    2007
Capital vs. Labor     GERD           0.924   0.895   0.947   0.830
Capital vs. Labor     Researchers    0.847   0.680   0.664   0.428
Funding Components    Industry       0.934   0.913   0.970   0.861
Funding Components    Government     0.881   0.818   0.834   0.628
Spending Components   HERD           0.949   0.890   0.910   0.791
Spending Components   BERD           0.921   0.905   0.966   0.852
Regressions for 2007 triadic patent share

IV1        IV2           Coeff1   Coeff2   Const    p1      p2      R2
GERD       Researchers   1.34     -0.46    0.327    0.000   0.014   83.3%
Industry   Government    1.78     -0.973   0.438    0.000   0.000   88.6%
Industry   BERD          4.32     3.46     0.201    0.004   0.021   85.9%
Industry   NonProfit     2.04     -0.653   -0.584   0.000   0.000   98.3%
Industry                 0.941             0.058    0.000           83.4%
BERD                     0.953             0.078    0.000           81.8%
For example the best single IV model is:
Patents07 = 0.941 Industrys07 + 0.058
Regressions show a trade-off
in allocations





To maximize papers, a country should maximize its
government funding of R&D, rather than industry funding.
To maximize patents, a country should do the opposite:
maximize its industrial funding of R&D, which can be
encouraged by government.
Similarly, spending in the higher education sector seems
to encourage papers, while spending in the business
sector does more to encourage patents.
Thus these allocations are simply a choice between
longer- and shorter-term benefits of R&D.
Not surprising, but the regressions provide some quantitative
confirmation of this logic.
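The trade-off can be made concrete by plugging a hypothetical country into two of the fitted 2007 models reported earlier: the single-IV paper model (Papers07 = 0.846 Government + 0.316) and the two-IV patent model (Patents07 = 1.778 Industry - 0.973 Government + 0.438). The funding shares below are invented for illustration, and the coefficients are cross-sectional fits, so this is a sketch of the arithmetic rather than a causal prediction.

```python
def paper_share(gov_share):
    """Single-IV paper model from the 2007 SCI regressions (shares in percent)."""
    return 0.846 * gov_share + 0.316


def patent_share(ind_share, gov_share):
    """Two-IV triadic patent model from the 2007 regressions (shares in percent)."""
    return 1.778 * ind_share - 0.973 * gov_share + 0.438


# Hypothetical funding shares (percent of the OECD+ total), for illustration only.
scenarios = {
    "base": {"gov": 5.0, "ind": 5.0},
    "more_gov": {"gov": 6.0, "ind": 5.0},
    "more_ind": {"gov": 5.0, "ind": 6.0},
}

results = {
    name: (paper_share(s["gov"]), patent_share(s["ind"], s["gov"]))
    for name, s in scenarios.items()
}

for name, (papers, patents) in results.items():
    print(f"{name:<9} papers = {papers:.3f}%  patents = {patents:.3f}%")
```

In this sketch, one extra point of government funding share raises the modeled paper share but lowers the modeled patent share, while one extra point of industry funding share raises only the patent share.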
Summary of models for paper share

Simple extrapolations of trends in output paper share m_i provide
a reality check for models based on input resource drivers.

The Shelton Model based on GERD share works well for big
countries. It accounts for the decline in US and EU due to the
rise of China's share of GERD, w_i.
m_i = k_i w_i

The Shelton-Leydesdorff Model based on government share
accounts for the EU increase in efficiency in the 1990s, and the
long-term US decline.
m_i = k_i' w_i' + c'

Adding a second IV, HERD spending share w_i'', works even
better. This accounts for the EU passing the US in 1995.
m_i = k_i' w_i' + k_i'' w_i'' + c''
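The three model forms can be written as small functions. This is a sketch under assumptions: calibrating k_i as the ratio of paper share to GERD share in a base year is an assumed procedure, the base-year numbers are hypothetical, and the cross-country 2007 regression coefficients (0.846 and 0.316; 0.527, 0.383 and 0.127) stand in for the individual country constants.

```python
def calibrate_k(paper_share, input_share):
    """Assumed calibration: k_i = m_i / w_i from one base-year observation."""
    return paper_share / input_share


def shelton(k, w):
    """Shelton Model: m_i = k_i * w_i, with w_i the GERD share."""
    return k * w


def shelton_leydesdorff(k_gov, w_gov, c):
    """Shelton-Leydesdorff Model: m_i = k_i' * w_i' + c', with w_i' the government share."""
    return k_gov * w_gov + c


def two_iv(k_gov, w_gov, k_herd, w_herd, c):
    """Two-IV model: m_i = k_i' * w_i' + k_i'' * w_i'' + c'', adding HERD share w_i''."""
    return k_gov * w_gov + k_herd * w_herd + c


# Hypothetical base year: 30% paper share on a 35% GERD share.
k = calibrate_k(30.0, 35.0)
m_shelton = shelton(k, 32.0)  # projected paper share if GERD share falls to 32%

# Cross-country 2007 coefficients used as stand-ins for country-specific constants.
m_sl = shelton_leydesdorff(0.846, 10.0, 0.316)
m_two = two_iv(0.527, 10.0, 0.383, 10.0, 0.127)

print(round(m_shelton, 3), round(m_sl, 3), round(m_two, 3))
```

With a fixed k_i, the Shelton form makes paper share fall in proportion to GERD share, which is how a rising competitor lowers the share of the others.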
Validation of paper share models




Like any theory, models need to be tested to see
how well they account for new phenomena.
Scattergrams can show how well regression models
fit a year’s data, or perhaps a new data point. They
don’t forecast the future so well.
Once key IVs are identified by statistics, individual
country models can be built and tested by
“forecasting the past.”
Simple extrapolation of output DVs serves as a
reality check
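One way to "forecast the past" with a simple extrapolation is to fit a trend on the early years only and compare the forecast with a held-out later year. The sketch below assumes a straight-line trend and uses an invented share series; neither is taken from this study.

```python
import numpy as np


def extrapolate(years, shares, target_year):
    """Fit a straight line to (year, share) and evaluate it at target_year.

    Years are centered on the first year to keep the fit well conditioned.
    """
    x = np.asarray(years, dtype=float) - years[0]
    slope, intercept = np.polyfit(x, np.asarray(shares, dtype=float), 1)
    return slope * (target_year - years[0]) + intercept


# Hypothetical paper-share series (percent), for illustration only.
years = [2000, 2001, 2002, 2003, 2004, 2005]
shares = [30.0, 29.5, 29.0, 28.5, 28.0, 27.5]

# "Forecast the past": fit on 2000-2003, then predict the known 2005 value.
forecast_2005 = extrapolate(years[:4], shares[:4], 2005)
error = forecast_2005 - shares[-1]

print(f"forecast = {forecast_2005:.2f}%, actual = {shares[-1]:.2f}%, error = {error:+.2f}")
```

The same holdout comparison applies to the input-driven models: calibrate on data up to a cutoff year, forecast forward, and measure the error against what actually happened.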
Extrapolation of SCI paper shares
This model forecasts that the PRC will not pass the US until
about 2020, and the EU27 until after 2025
Extrapolation of paper share in the
Scopus database
This can be compared to a recent similar forecast by the UK Royal Society.
Scattergram of paper share vs.
government funding share
[Figure: scattergram. x-axis: % Share of Government Funding 2007; y-axis: % Share of Publications 2007 (OECD+ countries). Labeled points: EU27, EU15, USA, UK, China, Japan, Germany, Canada, France, Spain, Russian Federation.]
Same scattergram focused on smaller
countries
[Figure: scattergram. x-axis: % Share of Government Funding; y-axis: % Share of Publications (OECD+ countries; without US). Labeled points: China, Japan, United Kingdom, Germany, France, Canada, Italy, Spain, Australia, Korea, Russian Federation, Chinese Taipei, Sweden, Turkey. Fitted line: y = 0.9299x + 0.1737, R2 = 0.8946.]
Scattergram of patent share vs.
industrial funding share
[Figure: scattergram. x-axis: % Industrial Funding of R&D; y-axis: % Triadic Patents (OECD+). Labeled points: United States, Japan, EU27, EU15, Germany, France, Korea, United Kingdom, China. Fitted line: Patents07 = 1.0811 Industry07 + 0.0104, R2 = 0.8696.]
Same scattergram focused on
smaller countries
[Figure: scattergram. x-axis: % Share of Industrial Funding of R&D; y-axis: % Share of Triadic Patents. Labeled points: Germany, France, Korea, United Kingdom, Sweden, Canada, Israel, Australia, Chinese Taipei, China. Fitted line: Patents07 = 0.6353 Industry07 + 0.2316, R2 = 0.394.]
Performance of Shelton Model in
forecasting from 2005 to 2010
[Figure: line chart. x-axis: years 2005 to 2017; y-axis: Percentage of World Share. Series: US Forecast, EU15 Forecast, PRC Forecast, US Actual, EU15 Actual, PRC Actual.]
Based on forecasts of GERD and its share from 2005 data. Accuracy of US
and EU is not bad. PRC is growing slower than forecast.
Performance of Shelton-Leydesdorff model:
forecasting from 2005 to 2010
[Figure: Paper Forecasts from Shelton-Leydesdorff Model. Line chart. x-axis: years 2005 to 2010; y-axis: Percent of WoS. Series: US Actual, EU15 Actual, PRC Actual, US Geo Forecast, EU15 Geo Forecast, PRC Geo Forecast.]
Uses 5-year average of rates of Gov increase. EU and PRC fit well, but US is worse than
forecast, because its rate of Gov increase has plummeted to near zero. (Individual models used.)
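The input side of such a forecast might be sketched as below, assuming that "Geo" denotes a geometric (constant-rate) projection and that the rate is the mean of the year-over-year growth rates across the last five intervals. Both readings are assumptions, and the funding series is invented for illustration.

```python
def average_growth_rate(values):
    """Mean of the year-over-year growth rates across consecutive values."""
    rates = [values[i + 1] / values[i] - 1.0 for i in range(len(values) - 1)]
    return sum(rates) / len(rates)


def geometric_forecast(last_value, rate, years_ahead):
    """Project forward at a constant annual growth rate."""
    return last_value * (1.0 + rate) ** years_ahead


# Hypothetical government R&D funding over six years (five growth intervals).
gov_funding = [100.0, 110.0, 121.0, 133.1, 146.41, 161.051]

rate = average_growth_rate(gov_funding)
forecast = geometric_forecast(gov_funding[-1], rate, years_ahead=2)

print(f"average rate = {rate:.3f}, two-year forecast = {forecast:.3f}")
```

To feed a share-based model, each country's projected funding would then be divided by the sum of the projections for all countries, so a country whose growth rate drops toward zero loses share even if its funding does not fall.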
Conclusions




Regressions show that investment choices are
complementary: some are best for papers and some
for patents
Models based on these resource inputs have some
success in forecasting
But a take-away for the professors in the audience:
just using HERD share to predict paper share is
surprisingly accurate
Thus if nations want to excel in papers, they should
just give money to professors!
Paper share ≈ HERD share!
[Figure: Scatterplot of Papers07 vs HERD. x-axis: HERD; y-axis: Papers07. Fitted line: ps07 = 0.027 + 0.930 HERD, R2 = 98.6%, p = 0.000.]
Performance of HERD as
predictor of paper share
[Figure: Paper Share Compared to HERD Share. Line chart. x-axis: years 1995 to 2007; y-axis: Percent. Series: US, EU, PRC (paper share) and US hs, EU hs, PRC hs (HERD share).]
Forget statistics: Simply predicting paper share with HERD share works
well for the US and EU. It also predicts that the EU should lead the US.