This discussion was converted from issue #146 on September 16, 2024 20:03.
I want to implement the series_outliers method in Python and used the following code:
import pandas as pd
import numpy as np
from scipy.stats import norm

# Load the data into a DataFrame
data = {
    'series': [67.95675, 58.63898, 33.59188, 4906.018, 5.372538, 702.1194, 0.037261, 11161.05, 1.403496, 100.116]
}
df = pd.DataFrame(data)

# Function to calculate the outlier score based on custom percentiles
def custom_percentile_outliers(series, p_low=10, p_high=90):
    # Calculate custom percentiles
    percentile_low = np.percentile(series, p_low)
    percentile_high = np.percentile(series, p_high)
    # Standard-normal quantiles of the two percentiles
    # (z_low was missing from the pasted code; this definition reproduces the results below)
    z_low = norm.ppf(p_low / 100)
    z_high = norm.ppf(p_high / 100)
    # Calculate normalization factor
    normalization_factor = (2 * z_high - z_low) / (2 * z_high - 2.704)
    # Calculate outliers score
    return series.apply(
        lambda x: (x - percentile_high) / (percentile_high - percentile_low) * normalization_factor
        if x > percentile_high
        else ((x - percentile_low) / (percentile_high - percentile_low) * normalization_factor
              if x < percentile_low else 0))

# Apply the custom percentile outlier scoring function
df['outliers'] = custom_percentile_outliers(df['series'], p_low=10, p_high=90)

# Display the DataFrame with outliers
print(df)
I am getting the following results for the series:
         series   outliers
0     67.956750   0.000000
1     58.638980   0.000000
2     33.591880   0.000000
3   4906.018000   0.000000
4      5.372538   0.000000
5    702.119400   0.000000
6      0.037261   0.006067
7  11161.050000 -27.776847
8      1.403496   0.000000
9    100.116000   0.000000
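The sign of the two non-zero scores can be traced to the normalization factor itself. The sketch below recomputes the intermediate values for the same data. It uses only the standard library: `statistics.NormalDist().inv_cdf` stands in for `norm.ppf`, and the hypothetical helper `linear_percentile` mimics the default linear interpolation of `np.percentile`. The definition of `z_low` is an assumption, since it does not appear in the posted code. This is a diagnostic of the posted formula, not a statement of how series_outliers computes its scores.

```python
from statistics import NormalDist

values = [67.95675, 58.63898, 33.59188, 4906.018, 5.372538,
          702.1194, 0.037261, 11161.05, 1.403496, 100.116]

def linear_percentile(data, p):
    """Percentile with linear interpolation between the two nearest ranks."""
    ordered = sorted(data)
    pos = (len(ordered) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

p_low, p_high = 10, 90
percentile_low = linear_percentile(values, p_low)
percentile_high = linear_percentile(values, p_high)

# Standard-normal quantiles; z_low is an assumed definition
z_low = NormalDist().inv_cdf(p_low / 100)
z_high = NormalDist().inv_cdf(p_high / 100)

# Same expression as in the posted code
denominator = 2 * z_high - 2.704
normalization_factor = (2 * z_high - z_low) / denominator

print(f"p10 = {percentile_low:.4f}, p90 = {percentile_high:.4f}")
print(f"denominator = {denominator:.4f}")
print(f"normalization_factor = {normalization_factor:.3f}")
```

Because 2 * z_high is smaller than 2.704 for the 90th percentile, the denominator is negative and so is the factor. That is why the value above the 90th percentile gets a negative score and the value below the 10th percentile gets a positive one. The check also shows that 4906.018 lies below the interpolated 90th percentile, so it scores 0 with this code.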
With the series_outliers function, however, I get the results below. [image: series_outliers results]
I referred to #136 on GitHub and also tried implementing and manually calculating the scores with the help of the solution given on Stack Overflow, "How does Kusto series_outliers() calculate anomaly scores?"
I am probably going wrong in the normalization factor calculation. It would be great if someone could help.