Table 1 Dataset characteristics

From: Compression algorithm for colored de Bruijn graphs

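| Source | Type | C | M | n. \(k\)-mers \(\times 10^6\) (% single color), \(k=23\) | n. \(k\)-mers \(\times 10^6\) (% single color), \(k=31\) | n. simplitigs \(\times 10^6\), \(k=23\) | n. simplitigs \(\times 10^6\), \(k=31\) |
|---|---|---|---|---|---|---|---|
| E. coli | assemblies | 100 | 542,545 | 27 (30%) | 31 (31%) | 0.5 | 0.5 |
| E. coli | assemblies | 10 | 826 | 13 (38%) | 14 (41%) | 0.2 | 0.2 |
| Fungi | assemblies | 20 | 13,227 | 394 (93%) | 409 (93%) | 1.8 | 1.7 |
| Gut | metagenome reads | 9 | 511 | 2,236 (67%) | 2,477 (70%) | 76 | 95 |
| Human | RNA-seq reads | 19 | 9,654 | 120 (71%) | 103 (75%) | 7.2 | 10 |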

  1. C is the number of colors and M is the number of color classes.