sequence_similarity_ops

Ops for computing quantities based on sequence similarity.

smooth_substitution_matrix_similarities_dot

 smooth_substitution_matrix_similarities_dot (x:torch.Tensor,
                                              subs_mat:torch.Tensor,
                                              use_scoredist:bool=False,
                                              expected_value:Optional[float]=None)

TODO.


smooth_substitution_matrix_similarities_cdist

 smooth_substitution_matrix_similarities_cdist (x:torch.Tensor,
                                                subs_mat:torch.Tensor,
                                                p:float=1.0)

TODO.


smooth_hamming_similarities_dot

 smooth_hamming_similarities_dot (x:torch.Tensor)

Smooth extension of the normalized Hamming similarity between all pairs of sequences in x. x must have shape (…, N, L, R), and the result has shape (…, N, N).
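As an illustration of what a dot-product formulation could look like, here is a minimal sketch. It assumes each position of x is a one-hot (or probability) vector over R residues, so that the dot product of two sequences counts matching positions. The name `hamming_similarities_dot_sketch` is hypothetical and this is not necessarily how the library computes the result.

```python
import torch


def hamming_similarities_dot_sketch(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: x has shape (..., N, L, R); returns (..., N, N)."""
    L = x.shape[-2]
    # Sum over positions l and residues r of x[i, l, r] * x[j, l, r],
    # then divide by the sequence length to normalize to [0, 1].
    return torch.einsum("...ilr,...jlr->...ij", x, x) / L


# Three toy sequences of length 4 over an alphabet of 4 residues
idx = torch.tensor([[0, 1, 2, 3], [0, 1, 2, 0], [3, 3, 3, 3]])
x = torch.nn.functional.one_hot(idx, num_classes=4).float()
sims = hamming_similarities_dot_sketch(x)
# sims[0, 1] is 0.75 because the first two sequences agree at 3 of 4 positions
```

For one-hot input this reduces to the fraction of matching positions; for soft (probabilistic) input it varies smoothly, which is what makes the quantity differentiable.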


smooth_hamming_similarities_cdist

 smooth_hamming_similarities_cdist (x:torch.Tensor, p:float=1.0)

Smooth extension of the normalized Hamming similarity between all pairs of sequences in x. x must have shape (…, N, L, R), and the result has shape (…, N, N).
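A distance-based formulation can be sketched in a similar way. The example below is an assumption-laden illustration for p=1 only: for one-hot rows, each mismatching position contributes 2 to the L1 distance between the flattened sequences, so dividing by 2L gives the normalized Hamming distance. The name `hamming_similarities_cdist_sketch` is hypothetical, and the normalization the library uses for other values of p is not covered here.

```python
import torch


def hamming_similarities_cdist_sketch(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch for p=1: x has shape (N, L, R); returns (N, N)."""
    L = x.shape[-2]
    flat = x.flatten(start_dim=-2)  # (N, L * R)
    # L1 distance between flattened one-hot sequences is 2 * (number of mismatches)
    dist = torch.cdist(flat, flat, p=1.0)
    return 1.0 - dist / (2 * L)


idx = torch.tensor([[0, 1, 2, 3], [0, 1, 2, 0], [3, 3, 3, 3]])
x = torch.nn.functional.one_hot(idx, num_classes=4).float()
sims_cdist = hamming_similarities_cdist_sketch(x)
# sims_cdist[0, 1] is 0.75: one mismatch out of four positions
```

On one-hot input this agrees with the fraction of matching positions, so it should coincide with a dot-product formulation there; the two can differ on soft inputs.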


soft_best_hits

 soft_best_hits (similarities:torch.Tensor, reciprocal:bool=False,
                 group_slices:collections.abc.Sequence[slice],
                 tau:Union[float,torch.Tensor]=0.1)

Soft best hits graphs from pairwise similarities; pass reciprocal=True to obtain soft reciprocal best hits. similarities must have shape (…, N, N). The main diagonal is excluded by setting its entries to minus infinity before the softmax.
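To make the diagonal masking and softmax step concrete, here is a rough sketch under several assumptions that the text above does not confirm: that tau acts as a softmax temperature, that each row is normalized separately over the columns of each group in group_slices, and that reciprocity is obtained by multiplying the result with its transpose. The name `soft_best_hits_sketch` is hypothetical.

```python
import torch


def soft_best_hits_sketch(similarities, group_slices, tau=0.1, reciprocal=False):
    """Hypothetical sketch: similarities has shape (..., N, N)."""
    n = similarities.shape[-1]
    eye = torch.eye(n, dtype=torch.bool, device=similarities.device)
    # Exclude self-matches: -inf on the diagonal gets zero weight under softmax
    masked = similarities.masked_fill(eye, float("-inf"))
    out = torch.zeros_like(similarities)
    for sl in group_slices:
        # Assumption: each row is softmax-normalized over the columns of one group
        block = torch.softmax(masked[..., :, sl] / tau, dim=-1)
        # A group of size 1 yields an all -inf row (NaN after softmax); map it to 0
        out[..., :, sl] = torch.nan_to_num(block, nan=0.0)
    if reciprocal:
        # Assumption: an edge is reciprocal to the extent both directions select it
        out = out * out.transpose(-1, -2)
    return out


torch.manual_seed(0)
x = torch.randn(6, 3)
similarities = -torch.cdist(x, x, p=1)  # larger value means more similar
groups = [slice(0, 3), slice(3, 6)]
soft = soft_best_hits_sketch(similarities, groups, tau=0.1)
soft_rbh = soft_best_hits_sketch(similarities, groups, tau=0.1, reciprocal=True)
```

Under these assumptions, smaller values of tau push each row toward a one-hot choice of its best hit in every group, while larger values spread the weight more evenly.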


hard_best_hits

 hard_best_hits (similarities:torch.Tensor, reciprocal:bool=False,
                 group_slices:collections.abc.Sequence[slice])

Hard best hits graphs from pairwise similarities; pass reciprocal=True to obtain hard reciprocal best hits. similarities must have shape (…, N, N). The main diagonal is excluded by setting its entries to minus infinity before the argmax.
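The hard variant can be illustrated with an argmax in place of a softmax. The sketch below assumes that each row picks a single best hit among the columns of each group, and that reciprocal hits are those selected in both directions; neither detail is confirmed by the text above. The name `hard_best_hits_sketch` is hypothetical.

```python
import torch


def hard_best_hits_sketch(similarities, group_slices, reciprocal=False):
    """Hypothetical sketch: similarities has shape (..., N, N)."""
    n = similarities.shape[-1]
    eye = torch.eye(n, dtype=torch.bool, device=similarities.device)
    # Exclude self-matches before taking the argmax
    masked = similarities.masked_fill(eye, float("-inf"))
    out = torch.zeros_like(similarities)
    for sl in group_slices:
        block = masked[..., :, sl]
        best = block.argmax(dim=-1, keepdim=True)
        # One-hot encode the best column of this group for every row
        out[..., :, sl] = torch.zeros_like(block).scatter_(-1, best, 1.0)
    # Guard against groups of size 1, where the only candidate is the sequence itself
    out = out.masked_fill(eye, 0.0)
    if reciprocal:
        # Keep an edge only if both endpoints select each other
        out = out * out.transpose(-1, -2)
    return out


similarities = torch.tensor([
    [1.0, 0.2, 0.9, 0.1],
    [0.2, 1.0, 0.3, 0.8],
    [0.9, 0.3, 1.0, 0.4],
    [0.1, 0.8, 0.4, 1.0],
])
groups = [slice(0, 2), slice(2, 4)]
hard = hard_best_hits_sketch(similarities, groups)
hard_rbh = hard_best_hits_sketch(similarities, groups, reciprocal=True)
```

In this toy example every best hit happens to be mutual, so the reciprocal graph equals the non-reciprocal one.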

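import torch


def test_soft_reciprocal_best_hits_bounds():
    # Five contiguous groups of sequences covering indices 0..19
    group_slices = [
        slice(0, 4), slice(4, 7), slice(7, 10), slice(10, 18), slice(18, 20)
    ]
    x = torch.randn(20, 6)
    # Pairwise L1 distances stand in for similarities; only the output range is tested
    similarities = torch.cdist(x, x, p=1)
    similarities /= similarities.max()  # rescale to [0, 1]
    srbh = soft_best_hits(similarities, group_slices=group_slices, reciprocal=True)

    # The output keeps the (N, N) shape and every entry lies in [0, 1]
    assert srbh.shape == similarities.shape
    assert torch.all(torch.logical_and(srbh <= 1, srbh >= 0))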


test_soft_reciprocal_best_hits_bounds()