Table 2 Annotations of the gene products in the corpus.

From: Stringent response of Escherichia coli: revisiting the bibliome using literature mining

| Class | Concept | Number of Annotations | Number of Documents | % Frequency (Eq. 1)ψ | Mean (Eq. 2) | Std (Eq. 3) | VMR (Eq. 4) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Proteins | Ribosome | 1643 | 128 | 66.32 | 12.84 | 23.57 | 44.08 |
|  | Rel | 1021 | 62 | 32.12 | 16.50 | 36.60 | 81.00 |
|  | LacZ | 543 | 53 | 27.46 | 10.30 | 17.44 | 28.90 |
|  | Sigma 38 factor | 392 | 42 | 21.76 | 9.330 | 15.40 | 25.00 |
|  | Sigma factor | 112 | 35 | 18.13 | 3.200 | 5.870 | 8.330 |
|  | UvrD | 56 | 35 | 18.13 | 1.600 | 1.300 | 1.000 |
|  | RpoB | 252 | 35 | 18.13 | 7.200 | 11.50 | 17.29 |
|  | RecA | 99 | 31 | 16.06 | 3.190 | 4.260 | 5.330 |
|  | EF-Tu | 223 | 26 | 13.47 | 8.580 | 17.32 | 36.13 |
|  | Der | 51 | 25 | 12.95 | 2.040 | 2.140 | 2.000 |
|  | Sigma 70 factor | 134 | 21 | 10.88 | 6.380 | 11.19 | 20.17 |
| Transcription factors | Fis | 888 | 18 | 9.330 | 49.33 | 86.88 | 150.9 |
|  | Fur | 56 | 13 | 6.740 | 4.310 | 9.260 | 20.25 |
|  | CRP | 279 | 12 | 6.220 | 23.25 | 36.28 | 56.35 |
|  | DnaA | 121 | 11 | 5.700 | 11.00 | 23.00 | 48.09 |
|  | H-NS | 73 | 11 | 5.700 | 6.640 | 10.73 | 16.67 |
|  | LexA | 101 | 10 | 5.180 | 10.10 | 18.32 | 32.40 |
|  | IHF | 54 | 9 | 4.660 | 6.000 | 5.250 | 4.170 |
| Enzymes | RelA | 4138 | 152 | 78.76 | 27.22 | 31.16 | 35.59 |
|  | RNAP | 1873 | 117 | 60.62 | 16.01 | 28.08 | 49.00 |
|  | SpoT | 1024 | 60 | 31.09 | 17.07 | 42.19 | 103.8 |
|  | EcoRI | 215 | 53 | 27.46 | 4.060 | 4.970 | 4.000 |
|  | β-galactosidase | 294 | 47 | 24.35 | 6.260 | 6.550 | 6.000 |
|  | BamHI | 149 | 43 | 22.28 | 3.470 | 5.870 | 8.330 |
|  | HindIII | 114 | 41 | 21.24 | 2.780 | 2.160 | 2.000 |
|  | RNase | 109 | 36 | 18.65 | 3.030 | 4.280 | 5.330 |
|  | YbcS | 50 | 23 | 11.92 | 2.170 | 2.620 | 2.000 |
|  | Reverse transcriptase | 34 | 21 | 10.88 | 1.620 | 1.050 | 1.000 |
|  | tRNA synthetase | 54 | 20 | 10.36 | 2.700 | 2.630 | 2.000 |
|  | Endonuclease I | 29 | 20 | 10.36 | 1.450 | 1.400 | 1.000 |

  1. Individual gene products (i.e. enzymes, transcription factors and other proteins) were evaluated considering the number of documents where these entities were annotated and their number of annotations in the corpus. Statistical measurements are detailed in the Methods and Materials section.
  2. ψ A threshold of 10% annotation frequency was set for enzymes and other proteins, and a threshold of 5% for transcription factors. Complete lists of all annotated entities are provided in Additional file 6.
  3. VMR: variance-to-mean ratio
  4. Std: standard deviation
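
The column statistics can be illustrated with a short Python sketch. It is not the authors' implementation: Eqs. 1 to 4 are defined in the Methods and Materials section, not here, so the code assumes the usual definitions (frequency as the share of corpus documents annotated with a concept, mean and standard deviation of annotations per annotated document, and VMR as variance divided by mean). The corpus size of 193 documents is inferred from the table (128 / 193 = 66.32%), the function names are hypothetical, and the choice of population rather than sample standard deviation is an assumption. Recomputing Std² / Mean from the rounded table entries does not reproduce the VMR column exactly (for Ribosome, 23.57² / 12.84 ≈ 43.27 against the tabulated 44.08), so the authors' convention may differ in detail.

```python
from math import sqrt

# Assumption: corpus size inferred from the table (128 documents = 66.32%).
CORPUS_SIZE = 193


def percent_frequency(n_documents, corpus_size=CORPUS_SIZE):
    """Share of corpus documents in which the concept is annotated, in percent."""
    return 100.0 * n_documents / corpus_size


def mean_annotations(n_annotations, n_documents):
    """Average number of annotations per document that mentions the concept."""
    return n_annotations / n_documents


def annotation_stats(counts_per_document, corpus_size=CORPUS_SIZE):
    """Summarise one concept from its per-document annotation counts.

    counts_per_document holds one count for each document that mentions the
    concept. The population standard deviation is used here as an assumption.
    """
    n_documents = len(counts_per_document)
    n_annotations = sum(counts_per_document)
    mean = mean_annotations(n_annotations, n_documents)
    variance = sum((c - mean) ** 2 for c in counts_per_document) / n_documents
    return {
        "annotations": n_annotations,
        "documents": n_documents,
        "frequency": percent_frequency(n_documents, corpus_size),
        "mean": mean,
        "std": sqrt(variance),
        "vmr": variance / mean,  # variance-to-mean ratio
    }


if __name__ == "__main__":
    # Frequency and mean can be checked against the Ribosome row of the table.
    print(round(percent_frequency(128), 2))       # 66.32
    print(round(mean_annotations(1643, 128), 2))  # 12.84

    # Hypothetical per-document counts, for illustration only.
    print(annotation_stats([1, 2, 3], corpus_size=10))
```

Only the frequency and mean columns can be checked from the table alone; the standard deviation and VMR require the per-document annotation counts, which the table does not list.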