plink_bed_reader API
- class plink_bed_reader.BEDMode(value)
Bases: Enum
Enum with the possible modes of a BED file
- INDIVIDUAL_MAJOR = 2
- SNP_MAJOR = 1
- class plink_bed_reader.PLINKBEDReader(bed_file_path: str, offset: int = 0, count: int | None = None, mode: BEDMode | None = None, fam_file_path: str | None = None, bim_file_path: str | None = None)
Bases: object
Reads PLINK BED files (individual-major or SNP-major) and returns the genotypes as a NumPy array (uint8). The file is read in chunks to reduce memory usage, and random access is supported. Following the PLINK format specification (https://www.cog-genomics.org/plink/1.9/formats#bed), the genotypes are encoded as follows:
0 = homozygous 1/1 (usually minor)
1 = heterozygous
2 = missing
3 = homozygous 2/2 (usually major)
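As an illustration of working with the genotype codes listed above, the following minimal sketch computes the missing rate and the frequency of allele 2 from a flat sequence of codes. It does not use the library itself: `summarize_genotypes` and the constant names are hypothetical helpers introduced here, and the input is a plain list standing in for one row or column of the returned array.

```python
from collections import Counter

# Genotype codes as listed in the class description (names are illustrative).
HOM_A1, HET, MISSING, HOM_A2 = 0, 1, 2, 3


def summarize_genotypes(codes):
    """Return (missing_rate, allele2_frequency) for a flat sequence of codes."""
    counts = Counter(int(c) for c in codes)
    total = sum(counts.values())
    called = total - counts[MISSING]
    # A heterozygote carries one copy of allele 2, a 2/2 homozygote carries two.
    allele2_copies = counts[HET] + 2 * counts[HOM_A2]
    missing_rate = counts[MISSING] / total if total else 0.0
    # Frequency is computed over called genotypes only (two alleles each).
    allele2_freq = allele2_copies / (2 * called) if called else float("nan")
    return missing_rate, allele2_freq


# Hypothetical genotypes for six samples at one SNP.
missing_rate, allele2_freq = summarize_genotypes([0, 1, 3, 2, 3, 1])
```

Missing genotypes are excluded from the frequency denominator, so a SNP with many missing calls does not have its allele frequency biased toward zero.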
- __init__(bed_file_path: str, offset: int = 0, count: int | None = None, mode: BEDMode | None = None, fam_file_path: str | None = None, bim_file_path: str | None = None)
- Parameters:
bed_file_path (str) – Path to the BED file, with or without the extension. Both SNP-major and individual-major files are accepted.
offset (int, optional) – Number of samples or SNPs to skip at the beginning of the file, depending on the major mode.
count (int, optional) – Number of samples or SNPs to read from the file, depending on the major mode.
mode (BEDMode, optional) – Major mode of the file. The mode is always inferred from the file; if a mode is provided, it is only used as a sanity check against the inferred one.
fam_file_path (str, optional) – Path to the FAM file. If not provided, it will be inferred from the BED file.
bim_file_path (str, optional) – Path to the BIM file. If not provided, it will be inferred from the BED file.
- property major_mode
Major mode of the file
- property sample_count
Number of samples
- property snp_count
Number of SNPs
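The `mode` parameter is described as a sanity check against the mode inferred from the file. The sketch below shows one way such a check could work, based on the magic number of the PLINK 1 BED format (bytes 0x6c 0x1b, followed by 0x01 for SNP-major or 0x00 for individual-major). It is not the library's actual implementation: `infer_bed_mode` is a hypothetical function, and the `BEDMode` class here is a local stand-in that only mirrors the documented enum values.

```python
from enum import Enum


class BEDMode(Enum):
    """Local stand-in mirroring the documented enum values."""
    INDIVIDUAL_MAJOR = 2
    SNP_MAJOR = 1


def infer_bed_mode(header: bytes, expected=None):
    """Infer the major mode from the first three bytes of a BED file.

    If `expected` is given, it is compared with the inferred mode and a
    ValueError is raised on mismatch.
    """
    if len(header) < 3 or header[:2] != b"\x6c\x1b":
        raise ValueError("not a PLINK 1 BED file: bad magic number")
    if header[2] == 0x01:
        mode = BEDMode.SNP_MAJOR
    elif header[2] == 0x00:
        mode = BEDMode.INDIVIDUAL_MAJOR
    else:
        raise ValueError(f"unknown mode byte: {header[2]:#04x}")
    if expected is not None and expected is not mode:
        raise ValueError(f"expected {expected.name}, file is {mode.name}")
    return mode


# In practice the header would come from open(path, "rb").read(3).
snp_major = infer_bed_mode(b"\x6c\x1b\x01")
```

Passing `expected=BEDMode.INDIVIDUAL_MAJOR` with the header above would raise a `ValueError`, which is the kind of early failure a sanity check is meant to provide.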