This function identifies valleys whose locations are outliers relative to the same valley in other samples, then imputes each outlier valley from the closest samples with a similar density distribution. If no neighbor sample is detected, the valley keeps its original location.

detect_impute_outlier_valley(
  valley_location_res,
  adt_marker_select,
  cell_x_adt,
  cell_x_feature,
  scale = 3,
  method = "MAD",
  nearest_neighbor_n = 3,
  nearest_neighbor_threshold = 0.75
)

Arguments

valley_location_res

Matrix of valley landmark locations with rows being samples and columns being the valleys.

adt_marker_select

The ADT marker whose valley needs to be imputed. Neighbor samples are those whose density distribution for this marker is close to that of the target sample.

cell_x_adt

Matrix of ADT raw counts in cells (rows) by ADT markers (columns) format.

cell_x_feature

Matrix of cells (rows) by cell features (columns) such as cell type, sample, and batch-related information.

scale

Scale level used to define an outlier. A larger value means only more severe outliers are flagged. The default is 3.

method

Outlier detection method, either "MAD" (Median Absolute Deviation) or "IQR" (InterQuartile Range). The default is "MAD".

nearest_neighbor_n

Number of top nearest neighbor samples to detect.

nearest_neighbor_threshold

Threshold used to decide whether a sample is called a neighbor. The default is 0.75.
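
The interaction between scale and method can be illustrated with a minimal sketch. It is not the package's implementation: flag_outlier_valley() is a hypothetical helper, and the exact rules (distance from the median in units of the MAD, or quartile fences widened by scale times the IQR) are assumptions made for illustration only.

```r
# Hypothetical helper (not part of the package): flag outlying valley
# locations in one column of a valley-location matrix.
flag_outlier_valley <- function(valley, scale = 3, method = c("MAD", "IQR")) {
  method <- match.arg(method)
  if (method == "MAD") {
    # Assumed rule: distance from the median exceeds `scale` times the MAD.
    center <- stats::median(valley, na.rm = TRUE)
    spread <- stats::mad(valley, na.rm = TRUE)
    abs(valley - center) > scale * spread
  } else {
    # Assumed rule: value lies outside quartile fences widened by `scale` * IQR.
    q <- stats::quantile(valley, c(0.25, 0.75), na.rm = TRUE)
    iqr <- q[[2]] - q[[1]]
    valley < q[[1]] - scale * iqr | valley > q[[2]] + scale * iqr
  }
}

# Toy valley locations for five samples; the last one is far from the rest.
valley <- c(s1 = 1.0, s2 = 1.1, s3 = 0.9, s4 = 1.2, s5 = 4.0)
flagged <- flag_outlier_valley(valley, scale = 3, method = "MAD")
names(valley)[flagged]  # "s5"
```

Under either assumed rule, raising scale widens the accepted range, so fewer valleys are flagged.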

Examples

if (FALSE) {
detect_impute_outlier_valley(valley_location_res, adt_marker_select, cell_x_adt, cell_x_feature)
}
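
The imputation step described at the top of this page can be sketched on toy data. This is only an illustration under stated assumptions: impute_valley_from_neighbors() and dist_to_target are hypothetical names, the distances are made up rather than computed from density distributions, nearest_neighbor_threshold is treated as an upper bound on distance, and the median of the neighbors' valleys is used as the imputed value. The package may differ on all of these points.

```r
# Hypothetical helper (not part of the package): impute one outlying valley
# from its nearest non-outlier neighbor samples.
impute_valley_from_neighbors <- function(valley, flagged, target, dist_to_target,
                                         nearest_neighbor_n = 3,
                                         nearest_neighbor_threshold = 0.75) {
  # Candidates: samples that are not flagged as outliers and are not the target.
  candidates <- setdiff(names(valley)[!flagged], target)
  # Assumed semantics: keep candidates closer than the threshold, nearest first.
  d <- sort(dist_to_target[candidates])
  d <- d[d < nearest_neighbor_threshold]
  neighbors <- names(utils::head(d, nearest_neighbor_n))
  # No neighbor found: the valley remains at its original location.
  if (length(neighbors) == 0) return(valley[[target]])
  # Assumed summary: median of the neighbors' valley locations.
  stats::median(valley[neighbors])
}

# Toy inputs: sample s5 has an outlying valley; distances are made up.
valley <- c(s1 = 1.0, s2 = 1.1, s3 = 0.9, s4 = 1.2, s5 = 4.0)
flagged <- c(s1 = FALSE, s2 = FALSE, s3 = FALSE, s4 = FALSE, s5 = TRUE)
dist_to_target <- c(s1 = 0.2, s2 = 0.5, s3 = 0.9, s4 = 0.6, s5 = 0)

# Neighbors s1, s2, and s4 pass the threshold, so s5 is imputed from them.
imputed <- impute_valley_from_neighbors(valley, flagged, "s5", dist_to_target)

# With a very strict threshold no neighbor qualifies and s5 stays unchanged.
kept <- impute_valley_from_neighbors(valley, flagged, "s5", dist_to_target,
                                     nearest_neighbor_threshold = 0.1)
```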