Table 2 Annotations of the gene products in the corpus.

From: Stringent response of Escherichia coli: revisiting the bibliome using literature mining

| Class | Concept | Number of Annotations | Number of Documents | % Frequency (Eq. 1) ψ | Mean (Eq. 2) | Std (Eq. 3) | VMR (Eq. 4) |
|---|---|---|---|---|---|---|---|
| Proteins | Ribosome | 1643 | 128 | 66.32 | 12.84 | 23.57 | 44.08 |
| | Rel | 1021 | 62 | 32.12 | 16.50 | 36.60 | 81.00 |
| | LacZ | 543 | 53 | 27.46 | 10.30 | 17.44 | 28.90 |
| | Sigma 38 factor | 392 | 42 | 21.76 | 9.330 | 15.40 | 25.00 |
| | Sigma factor | 112 | 35 | 18.13 | 3.200 | 5.870 | 8.330 |
| | UvrD | 56 | 35 | 18.13 | 1.600 | 1.300 | 1.000 |
| | RpoB | 252 | 35 | 18.13 | 7.200 | 11.50 | 17.29 |
| | RecA | 99 | 31 | 16.06 | 3.190 | 4.260 | 5.330 |
| | EF-Tu | 223 | 26 | 13.47 | 8.580 | 17.32 | 36.13 |
| | Der | 51 | 25 | 12.95 | 2.040 | 2.140 | 2.000 |
| | Sigma 70 factor | 134 | 21 | 10.88 | 6.380 | 11.19 | 20.17 |
| Transcription factors | Fis | 888 | 18 | 9.330 | 49.33 | 86.88 | 150.9 |
| | Fur | 56 | 13 | 6.740 | 4.310 | 9.260 | 20.25 |
| | CRP | 279 | 12 | 6.220 | 23.25 | 36.28 | 56.35 |
| | DnaA | 121 | 11 | 5.700 | 11.00 | 23.00 | 48.09 |
| | H-NS | 73 | 11 | 5.700 | 6.640 | 10.73 | 16.67 |
| | LexA | 101 | 10 | 5.180 | 10.10 | 18.32 | 32.40 |
| | IHF | 54 | 9 | 4.660 | 6.000 | 5.250 | 4.170 |
| Enzymes | RelA | 4138 | 152 | 78.76 | 27.22 | 31.16 | 35.59 |
| | RNAP | 1873 | 117 | 60.62 | 16.01 | 28.08 | 49.00 |
| | SpoT | 1024 | 60 | 31.09 | 17.07 | 42.19 | 103.8 |
| | EcoRI | 215 | 53 | 27.46 | 4.060 | 4.970 | 4.000 |
| | β-galactosidase | 294 | 47 | 24.35 | 6.260 | 6.550 | 6.000 |
| | BamHI | 149 | 43 | 22.28 | 3.470 | 5.870 | 8.330 |
| | HindIII | 114 | 41 | 21.24 | 2.780 | 2.160 | 2.000 |
| | RNase | 109 | 36 | 18.65 | 3.030 | 4.280 | 5.330 |
| | YbcS | 50 | 23 | 11.92 | 2.170 | 2.620 | 2.000 |
| | Reverse transcriptase | 34 | 21 | 10.88 | 1.620 | 1.050 | 1.000 |
| | tRNA synthetase | 54 | 20 | 10.36 | 2.700 | 2.630 | 2.000 |
| | Endonuclease I | 29 | 20 | 10.36 | 1.450 | 1.400 | 1.000 |

  1. Individual gene products (i.e. enzymes, transcription factors and other proteins) were evaluated considering the number of documents where these entities were annotated and their number of annotations in the corpus. Statistical measurements are detailed in the Methods and Materials section.
  2. ψ An annotation-frequency threshold of 10% was set for enzymes and other proteins, and a threshold of 5% for transcription factors. Lists of all annotated entities are provided in Additional file 6.
  3. VMR: variance-to-mean ratio
  4. Std: standard deviation
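
The statistics named in the table header (Eqs. 1 to 4) are not reproduced in this excerpt, so the following is only a minimal sketch of how such per-concept measures could be computed, not the authors' implementation. It assumes that frequency is the share of corpus documents in which a concept is annotated, that the mean and standard deviation are taken over the documents that contain the concept, and that the population variance is used. The function names, the `counts_per_doc` input, and the corpus size of 193 documents (inferred from the table's ratio of documents to % frequency, not stated here) are all assumptions made for illustration.

```python
import math
from statistics import mean, pvariance


def annotation_stats(counts_per_doc, corpus_size):
    """Summarise one concept.

    counts_per_doc: hypothetical input, the number of annotations of the
    concept in each document that mentions it at least once.
    corpus_size: total number of documents in the corpus.
    """
    n_docs = len(counts_per_doc)
    avg = mean(counts_per_doc)
    var = pvariance(counts_per_doc)  # population variance (an assumption)
    return {
        "annotations": sum(counts_per_doc),
        "documents": n_docs,
        "frequency_pct": 100.0 * n_docs / corpus_size,  # assumed form of Eq. 1
        "mean": avg,                                    # assumed form of Eq. 2
        "std": math.sqrt(var),                          # assumed form of Eq. 3
        "vmr": var / avg,                               # variance-to-mean ratio
    }


def summary_columns(n_annotations, n_documents, corpus_size):
    """Recompute frequency and mean from the two count columns alone."""
    frequency_pct = 100.0 * n_documents / corpus_size
    mean_per_doc = n_annotations / n_documents
    return frequency_pct, mean_per_doc
```

Under these assumptions, `summary_columns(1643, 128, 193)` rounds to the 66.32 and 12.84 reported for Ribosome. The standard deviation and VMR cannot be recomputed from the table, because they depend on the per-document counts.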