Table 1 Annotations of the genetic components in the corpus.

From: Stringent response of Escherichia coli: revisiting the bibliome using literature mining

| Class | Concept | Number of Annotations | Number of Documents | % Frequency (Eq. 1) ψ | Mean (Eq. 2) | Std (Eq. 3) | VMR (Eq. 4) |
|---|---|---|---|---|---|---|---|
| Genes | relA | 3163 | 138 | 71.50 | 22.92 | 27.23 | 33.14 |
| | spoT | 1315 | 88 | 45.60 | 14.94 | 27.42 | 52.07 |
| | lac | 354 | 63 | 32.64 | 5.620 | 19.42 | 72.20 |
| | lacZ | 534 | 50 | 25.91 | 10.68 | 17.16 | 28.90 |
| | thi | 91 | 47 | 24.35 | 1.940 | 0.050 | 4.000 |
| | rel | 523 | 47 | 24.35 | 11.13 | 20.68 | 36.36 |
| | recA | 82 | 39 | 20.21 | 2.100 | 1.810 | 0.5000 |
| | rpsL | 95 | 36 | 18.65 | 2.640 | 3.530 | 4.500 |
| | thr | 84 | 36 | 18.65 | 2.330 | 3.760 | 4.500 |
| | rpsG | 103 | 34 | 17.62 | 3.030 | 7.250 | 16.33 |
| | leu | 98 | 34 | 17.62 | 2.880 | 6.800 | 18.00 |
| | rpoS | 205 | 33 | 17.10 | 6.210 | 10.83 | 16.67 |
| | kan | 308 | 33 | 17.10 | 9.330 | 16.61 | 28.44 |
| | glnV | 42 | 31 | 16.06 | 1.350 | 0.7400 | 0 |
| | rpoB | 389 | 30 | 15.54 | 12.97 | 17.60 | 24.08 |
| | ptsG | 240 | 30 | 15.54 | 8.000 | 21.54 | 55.13 |
| | trp | 144 | 25 | 12.95 | 5.760 | 14.73 | 39.20 |
| | carA | 60 | 20 | 10.36 | 3.000 | 3.810 | 3.000 |
| | hsdR | 23 | 19 | 9.840 | 1.210 | 0.5600 | 0 |
| DNAs | DNA | 1839 | 137 | 70.98 | 13.42 | 16.31 | 19.69 |
| | plasmid DNA | 193 | 36 | 18.65 | 5.360 | 12.31 | 28.80 |
| | chromosomal DNA | 63 | 24 | 12.44 | 2.630 | 2.440 | 2.000 |
| | cDNA | 125 | 23 | 11.92 | 5.430 | 5.820 | 5.000 |
| RNAs | RNA | 4193 | 140 | 72.54 | 29.95 | 38.21 | 49.79 |
| | uncharged tRNA | 1168 | 117 | 60.62 | 9.980 | 19.64 | 40.11 |
| | rRNA | 1116 | 97 | 50.26 | 11.51 | 25.97 | 56.82 |
| | a mRNA | 999 | 91 | 47.15 | 10.98 | 19.52 | 36.10 |
| | rrnA | 911 | 87 | 45.08 | 10.47 | 22.51 | 48.40 |
| | stable RNA | 430 | 87 | 45.08 | 4.940 | 8.030 | 16.00 |
| | a charged tRNA | 140 | 43 | 22.28 | 3.260 | 4.200 | 5.330 |
| | rrnB | 301 | 26 | 13.47 | 11.58 | 19.30 | 32.82 |
| | rrn | 321 | 26 | 13.47 | 12.35 | 30.42 | 75.00 |
| | 16s-rRNAs | 156 | 25 | 12.95 | 6.240 | 9.090 | 13.50 |


  1. Individual genetic components (i.e. genes, DNAs and RNAs) were evaluated considering the number of documents where these entities were annotated and their number of annotations in the corpus. Statistical measurements are detailed in the Methods and Materials section.
  2. ψ A 10% annotation-frequency threshold was applied to each genetic component category; the complete lists of annotated entities are provided in Additional file 5.
  3. VMR: variance-to-mean ratio
  4. Std: standard deviation
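
The four statistics reported in the table can be illustrated with a minimal sketch. Eqs. 1 to 4 are defined in the Methods and Materials section and are not reproduced here, so the definitions below are assumptions: frequency as the share of corpus documents annotated with the concept, mean as annotations per annotated document, Std as the population standard deviation of the per-document counts, and VMR as variance divided by mean. The function name `annotation_stats` and the corpus size of 193 documents are hypothetical; the latter is inferred from the table (138 documents correspond to 71.50%) rather than stated in it.

```python
from statistics import pstdev


def annotation_stats(counts_per_doc, corpus_size):
    """Summarise one concept's annotations.

    counts_per_doc: annotation counts, one entry per document in which
                    the concept was annotated (hypothetical input layout).
    corpus_size:    total number of documents in the corpus.
    """
    n_docs = len(counts_per_doc)
    n_annotations = sum(counts_per_doc)
    frequency = 100.0 * n_docs / corpus_size   # assumed form of Eq. 1
    mean = n_annotations / n_docs              # assumed form of Eq. 2
    std = pstdev(counts_per_doc)               # assumed form of Eq. 3 (population Std)
    vmr = std ** 2 / mean                      # assumed form of Eq. 4
    return {
        "annotations": n_annotations,
        "documents": n_docs,
        "frequency": frequency,
        "mean": mean,
        "std": std,
        "vmr": vmr,
    }


# Toy per-document counts, not taken from the corpus.
toy = annotation_stats([1, 2, 3, 6], corpus_size=10)

# Frequency and mean for relA, recomputed from the table's totals
# under the inferred corpus size of 193 documents.
rela_frequency = 100.0 * 138 / 193
rela_mean = 3163 / 138
print(f"{rela_frequency:.2f} {rela_mean:.2f}")
```

Under these assumptions the relA frequency and mean agree with the tabulated 71.50 and 22.92. The tabulated Std and VMR cannot be recomputed without the per-document counts, and the exact variance convention used in the article may differ from the one assumed here.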