Table 7 Index and query statistics of pangenome query tools, as with Table 1 but with MEMO pivot as GRCh38 instead of T2T-CHM13

From: Mem-based pangenome indexing for k-mer queries

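| Method | Index (HPRC): Size (GB) | Index (HPRC): Pivot | Index (HPRC): Query Length | Index (HPRC): Query Type | Query (HLA Locus): Time | Query (HLA Locus): Memory (GB) |
| --- | --- | --- | --- | --- | --- | --- |
| PanKmer | 23.29 | any | 31-mer only | 1, –, 3, – | 1:54:43.45 | 8.04 |
| KMC3-M | 1,267.20 | any | re-index | 1, 2, 3, – | 0:49:15.49 | 13.99 |
| KMC3-C | 18.05 | any | re-index | 1, –, 3, 4* | 0:00:38.59 | 17.68 |
| MEMO-M | 1.94 | re-index | any | 1, 2, 3, – | 0:00:59.60 | 3.51 |
| MEMO-C | 1.67 | re-index | any | 1, –, 3, – | 0:00:08.26 | 3.51 |
| MEMO-DC | 0.76 | re-index | any | –, –, –, 4 | 0:00:06.16 | 3.26 |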

The pangenome includes 88 human autosomal haplotypes from HPRC and GRCh38. Regions of Ns on GRCh38 are masked. Index query types: (1) global presence/absence; (2) member presence/absence; (3) conservation; (4) decile conservation. Query type 4* indicates that a KMC3 decile index gives no relative size reduction. The decile conservation index reports counts rounded down to the nearest decile. Query time and memory are the elapsed runtime and peak memory usage of a conservation query on the HLA locus (chr6:28,510,120–33,480,577) anchored to GRCh38. The HLA region annotated in GRCh38 is larger than that of T2T-CHM13, which explains the higher memory footprint compared with Table 1. Time is expressed in hours:minutes:seconds.
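The rounding used by a decile conservation index can be illustrated with a minimal sketch. It assumes that conservation is the number of genomes containing a k-mer and that a decile is that count expressed as a fraction of the total, rounded down to a multiple of 10%. The function name `decile_floor` and the genome total of 89 are hypothetical and are not taken from MEMO.

```python
def decile_floor(count: int, total: int) -> int:
    """Return the conservation decile (0 to 10), rounded down.

    count: number of genomes containing the k-mer (assumed definition).
    total: number of genomes in the pangenome (illustrative value below).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total")
    # Integer arithmetic avoids floating-point error at decile boundaries.
    return (count * 10) // total


# A k-mer found in 57 of 89 genomes (about 64%) falls in decile 6.
print(decile_floor(57, 89))
```

Storing one of eleven decile values rather than an exact count is one plausible reason the MEMO-DC index in the table is smaller than MEMO-C, at the cost of supporting only query type 4.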