getAlignment

Class: BioMap

Construct alignment represented in BioMap object

Syntax

Alignment = getAlignment(BioObj, StartPos, EndPos)
Alignment = getAlignment(BioObj, StartPos, EndPos, R)
Alignment = getAlignment(..., 'ParameterName', ParameterValue)
[Alignment, Indices] = getAlignment(...)

Description

Alignment = getAlignment(BioObj, StartPos, EndPos) returns Alignment, a character array containing the read sequences from BioObj, a BioMap object, that align within a specific region of the reference sequence. The region is defined by StartPos and EndPos, two positive integers such that StartPos is less than EndPos, and both are smaller than the length of the reference sequence.

Alignment = getAlignment(BioObj, StartPos, EndPos, R) reconstructs the alignment against the reference specified by R.

Alignment = getAlignment(..., 'ParameterName', ParameterValue) accepts one or more comma-separated parameter name/value pairs. Specify ParameterName inside single quotes.

[Alignment, Indices] = getAlignment(...) also returns Indices, a vector of indices specifying the read sequences that align within the specified region of the reference sequence.

Input Arguments

BioObj

Object of the BioMap class.

StartPos

Positive integer that defines the start of a region of the reference sequence. StartPos must be less than EndPos, and smaller than the total length of the reference sequence.

EndPos

Positive integer that defines the end of a region of the reference sequence. EndPos must be greater than StartPos, and smaller than the total length of the reference sequence.

R

Positive integer indexing the SequenceDictionary property of BioObj, or a character vector or string specifying the actual name of the reference.

Name-Value Arguments

OffsetPad

Logical value that specifies whether to add padding blanks at the beginning of each aligned sequence to represent the offset of its start position with respect to the reference. Choices are true or false (default).
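
As a minimal sketch of how this name-value argument is passed, the following calls request the same region with and without offset padding. It assumes the ex1.sam file used in the Examples section is on the MATLAB path; the variable names Padded and Unpadded are illustrative only. Comparing the sizes of the two results shows how the padding affects the width of the returned character array.

```matlab
% Construct a BioMap object from a SAM file (assumed to be on the path)
BMObj1 = BioMap('ex1.sam');
% Same region, with and without offset padding
Padded   = getAlignment(BMObj1, 10, 25, 'OffsetPad', true);
Unpadded = getAlignment(BMObj1, 10, 25, 'OffsetPad', false);
% Compare the dimensions of the two character arrays
size(Padded)
size(Unpadded)
```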

Output Arguments

Alignment

Character array containing the aligned read sequences from BioObj that align within a specific region of the reference sequence. Each row of the character array contains one aligned sequence, that is, the sequence positions that fall within the specified region of the reference sequence. Each aligned sequence can include gaps.

Indices

Vector of indices specifying the read sequences from BioObj that align within a specific region of the reference sequence.

Examples

Construct a BioMap object, and then reconstruct the alignment between positions 10 and 25 of the reference sequence:

% Construct a BioMap object from a SAM file 
BMObj1 = BioMap('ex1.sam');
% Construct the alignment between positions 10 and 25 of the
% reference sequence. 
Alignment = getAlignment(BMObj1, 10, 25)
Alignment =

CTCATTGTAAATGTGT
CTCATTGTAAATGTGT
CTCATTGTAAATGTGT
CTCATTGTAATTTTTT
CTCATTGTAAATGTGT
   ATTGTAAATGTGT
   ATTGTAAATGTGT
     TGTAAATGTGT
        AAATGTGT
            GTGT
            GTGT
              GT
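
Reconstruct the same region against an explicitly selected reference, and also return the indices of the reads that contribute to the alignment. This is a sketch that assumes the first entry of the SequenceDictionary property of the object is the reference of interest; getHeader is used here only to look up the names of the selected reads.

```matlab
% Construct a BioMap object from a SAM file
BMObj1 = BioMap('ex1.sam');
% R = 1 indexes the first reference in SequenceDictionary
% (assumed here to be the reference of interest)
[Alignment, Indices] = getAlignment(BMObj1, 10, 25, 1);
% Retrieve the headers of the reads that align within the region
Headers = getHeader(BMObj1, Indices);
```

Each row of Alignment corresponds to one element of Indices, so Indices can be passed to other BioMap methods to retrieve further information about the same reads.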

Algorithms

getAlignment assumes the reference sequence has no gaps. Therefore, positions in reads corresponding to insertions (I) and padding (P) do not appear in the alignment.

Because soft clipped positions (S) are not associated with positions that align to the reference sequence, they do not appear in the alignment.

A skipped position (N) appears as a . (period) in the alignment.

Hard clipped positions (H) do not appear in the sequences or the alignment.
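
The rules above can be illustrated with a small stand-alone sketch that applies them to a single read. This is not the code used by getAlignment; the read, the CIGAR string, and all variable names are made up for illustration, and showing a deletion (D) as a hyphen is an assumption based on the statement that aligned sequences can include gaps.

```matlab
% Illustrative only: apply the CIGAR rules described above to one read
seq   = 'ACGTTGCA';     % read bases as stored (soft-clipped bases included)
cigar = '2S3M1I2N2M';   % made-up CIGAR string

% Split the CIGAR string into (length, operation) pairs
tokens  = regexp(cigar, '(\d+)([MIDNSHP=X])', 'tokens');
aligned = '';
pos     = 1;            % next unread base in seq

for k = 1:numel(tokens)
    n  = str2double(tokens{k}{1});
    op = tokens{k}{2};
    switch op
        case {'M', '=', 'X'}   % aligned bases: copy from the read
            aligned = [aligned, seq(pos:pos+n-1)]; %#ok<AGROW>
            pos = pos + n;
        case {'I', 'S'}        % insertion or soft clip: consume bases, emit nothing
            pos = pos + n;
        case 'D'               % deletion: assumed to appear as a gap
            aligned = [aligned, repmat('-', 1, n)]; %#ok<AGROW>
        case 'N'               % skipped position: appears as a period
            aligned = [aligned, repmat('.', 1, n)]; %#ok<AGROW>
        case {'H', 'P'}        % hard clip or padding: nothing to consume or emit
    end
end

disp(aligned)   % displays GTT..CA
```

In this example the two soft-clipped bases (AC) and the inserted base (G) are dropped, and the two skipped positions appear as periods.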